Abstract
We prove that a set of density one satisfies the local-global conjecture for integral Apollonian gaskets. That is, for a fixed integral, primitive Apollonian gasket, almost every (in the sense of density) admissible (passing local obstructions) integer is the curvature of some circle in the gasket.
1 Introduction
1.1 The local-global conjecture
Let \(\mathcal{G}\) be an Apollonian gasket, see Fig. 1. The number b(C) shown inside a circle \(C\in\mathcal{G}\) is its curvature, that is, the reciprocal of its radius (the bounding circle has negative orientation). Soddy [46] first observed the existence of integral gaskets \(\mathcal{G}\), meaning ones for which \(b(C)\in \mathbb {Z}\) for all \(C\in\mathcal{G}\). Let
$$\mathcal{K}=\mathcal{K}_{\mathcal{G}}:=\bigl\{b(C): C\in\mathcal{G}\bigr\}$$
be the set of all curvatures in \(\mathcal{G}\). We call a gasket primitive if \(\gcd(\mathcal{K})=1\). From now on, we restrict our attention to a fixed primitive integral Apollonian gasket \(\mathcal{G}\).
Graham, Lagarias, Mallows, Wilks, and Yan [26, 34] initiated a detailed study of Diophantine properties of the curvatures in the gasket, with two separate families of problems (see also e.g. [23, 33, 43]): studying the curvatures with multiplicity (that is, studying circles), or without multiplicity (studying the integers which arise). In the present paper, we are concerned with the latter.
In particular, the following striking local-to-global conjecture for the set of curvatures is given in [26, p. 37], [23]. Call an integer n admissible if it passes all local (congruence) obstructions, that is, if for every modulus q≥1 it is congruent modulo q to some curvature in the gasket.
Conjecture 1.1
(Local-Global Conjecture)
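Fix a primitive, integral Apollonian gasket \(\mathcal{G}\). Then every sufficiently large admissible number is the curvature of a circle in \(\mathcal{G}\). That is, if n is admissible and \(n\gg1\), then n is the curvature of some circle in \(\mathcal{G}\).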
The purpose of this paper is to prove the following
Theorem 1.2
Almost every admissible number is the curvature of a circle in the gasket. Quantitatively, the number of exceptions up to N is bounded by \(O(N^{1-\eta})\), where η>0 is effectively computable.
Admissibility is completely explained in Fuchs’s thesis [22], and is a condition restricting to certain residue classes modulo 24, cf. Lemma 2.3. E.g. for the gasket in Fig. 1, iff
Thus contains one of every four numbers (six admissible residue classes out of 24), and Theorem 1.2 can be restated in this case as
In general, the local obstructions are easily determined (see Remark 2.4) from the so-called root quadruple
which is the column vector of the four smallest curvatures in the gasket. For the gasket in Fig. 1, \(v_{0}=(-11,21,24,28)^{t}\).
The history of this problem is as follows. The first progress towards the Conjecture was already made in [26], who showed that
Sarnak [42] improved this to
and then Fuchs [22] showed
Finally Bourgain and Fuchs [4] settled the so-called “Positive Density Conjecture,” that
1.2 Methods
Our main approach is through the Hardy-Littlewood circle method, combining two new ingredients. The first, applied to the major arcs, is effective bisector counting in infinite volume hyperbolic 3-folds, recently achieved by I. Vinogradov [49], as well as the uniform spectral gap over congruence towers of such, see the Appendix by Péter Varjú. The second ingredient is the minor arcs analysis, inspired by that given recently by the first-named author in [3], where it was proved that the prime curvatures in a gasket constitute a positive proportion of the primes. (Obviously Theorem 1.2 implies that 100 % of the admissible prime curvatures appear.)
1.3 Plan for the paper
A more detailed outline of the proof, as well as the setup of some relevant exponential sums, is given in Sect. 3. Before we can do this, we need to recall the Apollonian group and some of its subgroups in Sect. 2. After the outline in Sect. 3, we use Sect. 4 to collect some background from the spectral and representation theory of infinite volume hyperbolic quotients. Then some lemmata are reserved for Sect. 5, the major arcs are estimated in Sect. 6, and the minor arcs are dealt with in Sects. 7–9. The Appendix, by Péter Varjú, extracts the spectral gap property for the Apollonian group from that of its arithmetic subgroups.
1.4 Notation
We use the following standard notation. Set \(e(x)=e^{2\pi i x}\) and \(e_{q}(x)=e(\frac{x}{q})\). We use f≪g and f=O(g) interchangeably; moreover f≍g means f≪g≪f. Unless otherwise specified, the implied constants may depend at most on the gasket (or equivalently on the root quadruple \(v_{0}\)), which is treated as fixed. The symbol \(\mathbf{1}_{\{\cdot\}}\) is the indicator function of the event {⋅}. The greatest common divisor of n and m is written (n,m), their least common multiple is [n,m], and ω(n) denotes the number of distinct prime factors of n. The cardinality of a finite set S is denoted |S| or #S. The transpose of a matrix g is written \(g^{t}\). The prime symbol ′ in \(\sum_{r(q)}'\) means the range of \(r(\operatorname {mod}q)\) is restricted to (r,q)=1. Finally, \(p^{j}\parallel q\) denotes \(p^{j}\mid q\) and \(p^{j+1}\nmid q\).
2 Preliminaries I: the Apollonian group and its subgroups
2.1 Descartes theorem and consequences
Descartes’ Circle Theorem states that a quadruple v of (oriented) curvatures of four mutually tangent circles lies on the cone
$$F(v)=0, \tag{2.1}$$
where F is the Descartes quadratic form:
$$F(a,b,c,d)=2\bigl(a^{2}+b^{2}+c^{2}+d^{2}\bigr)-(a+b+c+d)^{2}. \tag{2.2}$$
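Note that F has signature (3,1) over \(\mathbb {R}\), and let
$$G:=\operatorname {SO}_{F}(\mathbb {R})=\bigl\{g\in \operatorname {SL}(4,\mathbb {R}): F(gv)=F(v)\text{ for all }v\in \mathbb {R}^{4}\bigr\}$$
be the real special orthogonal group preserving F.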
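It follows immediately that for b, c and d fixed, there are two solutions a, a′ to (2.1), and
$$a+a'=2(b+c+d). \tag{2.3}$$
Hence we observe that a can be changed into a′ by a reflection, that is,
$$(a',b,c,d)^{t}=S_{1}\cdot(a,b,c,d)^{t},$$
where the reflections
$$S_{1}=\begin{pmatrix}-1&2&2&2\\0&1&0&0\\0&0&1&0\\0&0&0&1\end{pmatrix},\quad S_{2}=\begin{pmatrix}1&0&0&0\\2&-1&2&2\\0&0&1&0\\0&0&0&1\end{pmatrix},\quad S_{3}=\begin{pmatrix}1&0&0&0\\0&1&0&0\\2&2&-1&2\\0&0&0&1\end{pmatrix},\quad S_{4}=\begin{pmatrix}1&0&0&0\\0&1&0&0\\0&0&1&0\\2&2&2&-1\end{pmatrix}$$
generate the so-called Apollonian group
$$\mathcal {A}=\langle S_{1},S_{2},S_{3},S_{4}\rangle.$$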
It is a Coxeter group, free except for the relations \(S_{j}^{2}=I\), 1≤j≤4. We immediately pass to the index two subgroup
$$\varGamma :=\mathcal {A}\cap \operatorname {SO}_{F}$$
of orientation preserving transformations, that is, even words in the generators. Then Γ is freely generated by \(S_{1}S_{2}\), \(S_{2}S_{3}\) and \(S_{3}S_{4}\). It is known that Γ is Zariski dense in G but thin, that is, of infinite index in \(G(\mathbb {Z})\); equivalently, the Haar measure of Γ∖G is infinite.
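The Descartes form and the reflections can be checked numerically. The following is a minimal Python sketch, not part of the paper: the function names are illustrative, and the pruning in `curvatures_up_to` assumes that the starting quadruple is a root quadruple, so that reflecting away from it only produces larger curvatures.

```python
def descartes_form(v):
    """F(a,b,c,d) = 2(a^2+b^2+c^2+d^2) - (a+b+c+d)^2."""
    a, b, c, d = v
    return 2 * (a * a + b * b + c * c + d * d) - (a + b + c + d) ** 2


def reflect(v, j):
    """Apply the reflection S_{j+1}: entry j becomes 2*(sum of the others) - v[j]."""
    w = list(v)
    w[j] = 2 * (sum(v) - v[j]) - v[j]
    return tuple(w)


def curvatures_up_to(v0, bound):
    """Curvatures <= bound reachable from the root quadruple v0.

    Assumption: v0 is a root quadruple, so along words with no immediate
    repetition each new curvature exceeds the ones it is tangent to, and
    branches exceeding the bound can be discarded.
    """
    found = set(v0)
    stack = [(tuple(v0), None)]
    while stack:
        v, last = stack.pop()
        for j in range(4):
            if j == last:          # S_j S_j = I, skip backtracking
                continue
            w = reflect(v, j)
            if w[j] <= bound:
                found.add(w[j])
                stack.append((w, j))
    return found


v0 = (-11, 21, 24, 28)             # root quadruple of the gasket in Fig. 1
small = sorted(curvatures_up_to(v0, 100))
```

Each reflection changes a single entry and keeps the quadruple on the cone F=0, which is why the whole orbit consists of integer quadruples once the root quadruple is integral.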
2.2 Arithmetic subgroups
Now we review the arguments from [26, 42] which lead to (1.3) and (1.4), as our setup depends critically on them.
Recall that for any fixed gasket, there is a root quadruple \(v_{0}\) of the four smallest curvatures in the gasket, cf. (1.2). It follows from (2.1) and (2.3) that the set of all curvatures can be realized as the orbit of the root quadruple \(v_{0}\) under \(\mathcal {A}\). Let
$$\mathscr{O}:=\varGamma \cdot v_{0}$$
be the orbit of \(v_{0}\) under Γ. Then the set of all curvatures certainly contains
$$\bigcup_{j=1}^{4}\bigl\langle e_{j},\mathscr{O}\bigr\rangle=\bigcup_{j=1}^{4}\bigl\{\langle e_{j},\gamma v_{0}\rangle:\gamma \in \varGamma \bigr\}, \tag{2.4}$$
where \(e_{1}=(1,0,0,0)^{t},\ldots,e_{4}=(0,0,0,1)^{t}\) constitute the standard basis for \(\mathbb {R}^{4}\), and the inner product above is the standard one. Recall we are treating the curvatures as a set, that is, without multiplicities.
It was observed in [26] that Γ contains unipotent elements, and hence one can use these to furnish an injection of affine space in the otherwise intractable orbit, as follows. Note first that
and after conjugation by
we have
Recall the spin homomorphism \(\rho: \operatorname {SL}_{2}\to \operatorname {SO}(2,1)\), embedded for our purposes in \(\operatorname {SL}_{4}\), given explicitly by
In fact \(\operatorname {SL}_{2}\) is a double cover of \(\operatorname {SO}(2,1)\) under ρ, with kernel ±I. It is clear from inspection that
Since , for each \(n\in \mathbb {Z}\), Γ contains the element
(Of course this can be seen directly from (2.5); these transformations will be more enlightening below.)
Thus if v is a quadruple in the orbit of \(v_{0}\) under Γ, then this orbit also contains \(C_{1}^{n}\cdot v \) for all n. From (2.4), we then have that the set of curvatures contains
The circles thus generated are all tangent to two fixed circles, which explains the square curvatures in Fig. 2. Of course (2.7) immediately implies (1.3).
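The unipotent mechanism can be illustrated concretely. The sketch below is a hedged illustration only: it assumes that the unipotent element is the product S_4 S_3 (the display defining \(C_{1}\) is not reproduced in this copy), and the helper names are hypothetical. It checks that (C−I)^3=0 and that the fourth entry of \(C^{n}v_{0}\) is a quadratic polynomial in n, which is the source of the square-root lower bound.

```python
def identity():
    return [[int(i == j) for j in range(4)] for i in range(4)]


def reflection(j):
    """Matrix of S_{j+1}: row j is (2,2,2,2) with -1 on the diagonal."""
    S = identity()
    S[j] = [2] * 4
    S[j][j] = -1
    return S


def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(A, v):
    return [sum(A[i][k] * v[k] for k in range(4)) for i in range(4)]


# Assumed choice of unipotent element: C = S_4 * S_3.
C = matmul(reflection(3), reflection(2))
I4 = identity()
N = [[C[i][j] - I4[i][j] for j in range(4)] for i in range(4)]
N2 = matmul(N, N)
N3 = matmul(N2, N)                 # vanishes if C is unipotent

v0 = [-11, 21, 24, 28]
v, fourth = v0, []
for n in range(6):
    fourth.append(v[3])            # <e_4, C^n v0>
    v = apply(C, v)

# Constant second differences mean the sequence is quadratic in n.
second_diffs = [fourth[n + 2] - 2 * fourth[n + 1] + fourth[n] for n in range(4)]
```

Since the curvatures \(\langle e_{4},C^{n}v_{0}\rangle\) grow quadratically in n, roughly \(N^{1/2}\) of them lie below N.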
Observe further that
is another unipotent element, with
and
Since \(T_{1}\) and \(T_{2}\) generate Λ(2), the principal 2-congruence subgroup of \(\operatorname {PSL}(2,\mathbb {Z})\), we see that the Apollonian group Γ contains the subgroup
In particular, whenever (2x,y)=1, there is an element
and thus Ξ contains the element
Write
Then again by (2.4), we have shown the following
Lemma 2.1
([42])
Let \(x,y\in \mathbb {Z}\) with (2x,y)=1, and take any element γ∈Γ with corresponding quadruple
Then the number
is the curvature of some circle in the gasket, where we have defined
Note from (2.1) that \(B_{\gamma }\) is integral.
Observe that, by construction, the value of \(a_{\gamma }\) is unchanged under the orbit of the group (2.8), and the circles whose curvatures are generated by (2.12) are all tangent to the circle corresponding to \(a_{\gamma }\). It is classical (see [2]) that the number of distinct primitive values up to N assumed by a positive-definite binary quadratic form is of order at least \(N(\log N)^{-1/2}\), proving (1.4).
To fix notation, we define the binary quadratic form appearing in (2.12) and its shift by
so that
Note from (2.13) and (2.1) that the discriminant of \(f_{\gamma }\) is
When convenient, we will drop the subscripts γ in all the above.
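Lemma 2.1 can be tested numerically on the gasket of Fig. 1. The sketch below rests on an assumption that should be stated loudly: the displays defining the coefficients are not reproduced in this copy, so the choice A=a+b, B=(a+b+d−c)/2, C=a+d with f(x,y)=Ax²+2Bxy+Cy² is a reconstruction, selected because it has discriminant −4a² as stated in the text and returns d at (x,y)=(0,1). All function names are illustrative.

```python
from math import gcd


def reflect(v, j):
    w = list(v)
    w[j] = 2 * (sum(v) - v[j]) - v[j]
    return tuple(w)


def curvatures_up_to(v0, bound):
    """Curvatures <= bound, assuming v0 is a root quadruple (see pruning)."""
    found = set(v0)
    stack = [(tuple(v0), None)]
    while stack:
        v, last = stack.pop()
        for j in range(4):
            if j == last:
                continue
            w = reflect(v, j)
            if w[j] <= bound:      # larger curvatures only have larger descendants
                found.add(w[j])
                stack.append((w, j))
    return found


def shifted_form(v):
    """Assumed coefficients (A, B, C) and the shifted form f(x, y) - a."""
    a, b, c, d = v
    A, B, C = a + b, (a + b + d - c) // 2, a + d

    def value(x, y):
        return A * x * x + 2 * B * x * y + C * y * y - a

    return (A, B, C), value


v0 = (-11, 21, 24, 28)
(A, B, C), shifted = shifted_form(v0)
bound = 2000
known = curvatures_up_to(v0, bound)
values = {shifted(2 * x, y)
          for x in range(-8, 9) for y in range(-8, 9)
          if gcd(2 * x, y) == 1}
represented = {n for n in values if n <= bound}
```

In this example every value in `represented` also occurs in `known`, consistent with the lemma; the check is only a finite experiment and does not replace the argument via the subgroup Ξ.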
2.3 Congruence subgroups
For each q≥1, define the “principal” q-congruence subgroup
These groups all have infinite index in \(G(\mathbb {Z})\), but finite index in Γ. The quotients Γ/Γ(q) have been determined completely by Fuchs [22] by proving an explicit Strong Approximation theorem (see [37]), Goursat’s Lemma, and other ingredients, as we explain below. Since G does not itself have the Strong Approximation Property, we pass to its connected spin double cover \(\operatorname {SL}_{2}(\mathbb {C})\). We will need the covering map explicitly later, so record it here.
First change variables from the Descartes form F to
Then there is a homomorphism \(\iota_{0}: \operatorname {SL}(2,\mathbb {C})\to \operatorname {SO}_{\tilde{F}}(\mathbb {R})\), sending
to
To map from \(\operatorname {SO}_{\tilde{F}}\) to \(\operatorname {SO}_{F}\), we apply a conjugation, see [26, (4.1)]. Let
be the composition of this conjugation with \(\iota_{0}\). Let \(\tilde{\varGamma }\) be the preimage of Γ under ι.
Lemma 2.2
The group \(\tilde{\varGamma }\) is generated by
With this explicit realization of \(\tilde{\varGamma }\) (and hence Γ), Fuchs was able to explicitly determine the images of \(\tilde{\varGamma }\) in \(\operatorname {SL}(2,\mathbb {Z}[i]/(q))\), and hence understand the quotients Γ/Γ(q) for all q.
Lemma 2.3
([22])
(1) The quotient groups Γ/Γ(q) are multiplicative, that is, if q factors as
$$q=p_{1}^{\ell_{1}}\cdots p_{r}^{\ell_{r}}, $$
then
$$\varGamma /\varGamma (q)\cong \varGamma /\varGamma \bigl(p_{1}^{\ell_{1}}\bigr)\times\cdots\times \varGamma /\varGamma \bigl(p_{r}^{\ell_{r}}\bigr). $$
(2) If (q,6)=1 then
$$ \varGamma /\varGamma (q)\cong \operatorname {SO}_{F}(\mathbb {Z}/q\mathbb {Z}). \tag{2.19}$$
(3) If \(q=2^{\ell}\), ℓ≥3, then Γ/Γ(q) is the full preimage of Γ/Γ(8) under the projection \(\operatorname {SO}_{F}(\mathbb {Z}/q\mathbb {Z})\to \operatorname {SO}_{F}(\mathbb {Z}/8\mathbb {Z})\). That is, the powers of 2 stabilize at 8. Similarly, the powers of 3 stabilize at 3, meaning that for \(q=3^{\ell}\), ℓ≥1, the quotient Γ/Γ(q) is the preimage of Γ/Γ(3) under the corresponding projection map.
Remark 2.4
This of course explains all local obstructions, cf. (1.1). The admissible numbers are precisely those residue classes \((\operatorname {mod}24)\) which appear as some entry in the orbit of \(v_{0}\) under Γ/Γ(24).
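The recipe of Remark 2.4 is easy to carry out by machine. The following Python sketch (illustrative names, not from the paper) computes the orbit of \(v_{0}\) modulo 24 under even words in the reflections, using the products \(S_{i}S_{j}\) as generators, and collects the residues that occur as entries.

```python
def reflect_mod(v, j, q):
    """Reflection S_{j+1} acting on a quadruple of residues mod q."""
    w = list(v)
    w[j] = (2 * (sum(v) - v[j]) - v[j]) % q
    return tuple(w)


def admissible_residues(v0, q=24):
    """Residues mod q occurring as entries in the orbit of v0 under even words."""
    start = tuple(x % q for x in v0)
    seen = {start}
    frontier = [start]
    while frontier:
        v = frontier.pop()
        for i in range(4):
            for j in range(4):
                if i == j:
                    continue
                w = reflect_mod(reflect_mod(v, j, q), i, q)   # S_i S_j v
                if w not in seen:
                    seen.add(w)
                    frontier.append(w)
    return sorted({x for v in seen for x in v})


classes = admissible_residues((-11, 21, 24, 28))
```

For the gasket of Fig. 1 this returns six residue classes out of 24, matching the count stated in Sect. 1.1.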
3 Setup and Outline of the Proof
In this section, we introduce the main exponential sum and give an outline of the rest of the argument. Recall the fixed gasket, with its set of curvatures and root quadruple \(v_{0}\). Let Γ be the Apollonian subgroup with subgroup Ξ, see (2.8). Let δ≈1.3 be the Hausdorff dimension of the gasket; see Sect. 4 for the important role played by this geometric invariant. Recall also from (2.12) that for any γ∈Γ and ξ∈Ξ,
Our approach, mimicking [8, 9], is to exploit the bilinear (or multilinear) structure above.
We first give an informal description of the main ensemble from which we will form an exponential sum. Let N be our main growing parameter. We construct our ensemble by decomposing a ball in Γ of norm N into two balls, a small one in all of Γ of norm T, and a larger one of norm \(X^{2}\) in Ξ, corresponding to x,y≍X. Specifically, we take
See (9.8) and (9.11) where these numbers are used.
We further need the technical condition that in the T-ball, the value of \(a_{\gamma }=\langle e_{1},\gamma v_{0}\rangle\) (see (2.11)) is of order T. This is used crucially in (7.8) and (5.41).
Finally, for technical reasons (see Lemma 5.2 below), we need to further split the T-ball into two: a small ball of norm \(T_{1}\), and a big ball of norm \(T_{2}\). Write
where \(\mathcal {C}\) is a large constant depending only on the spectral gap for Γ; it is determined in (5.11). We now make formal the above discussion.
3.1 Introducing the main exponential sum
Let \(N, X, T, T_{1}\), and \(T_{2}\) be as in (3.1) and (3.2). Define the family
From Lax-Phillips [35] (or see (4.10)), we have the bound
From (2.15), we can identify \(\gamma \in \frak {F}\) with a shifted binary quadratic form \(\frak {f}_{\gamma }\) of discriminant \(-4a_{\gamma }^{2}\) via
Recall from (2.12) that whenever (2x,y)=1, the above is a curvature in the gasket. We sometimes drop γ, writing simply \(\frak {f}\in \frak {F}\); then the latter can also be thought of as a family of shifted quadratic forms. Note also that the decomposition \(\gamma =\gamma _{1}\gamma _{2}\) in (3.3) need not be unique, so some forms may appear with multiplicity.
One final technicality is to smooth the sum on x,y≍X. To this end, we fix a smooth, nonnegative function ϒ, supported in [1,2] and having unit mass, \(\int_{\mathbb {R}}\varUpsilon (x)dx=1\).
Our main object of study is then the representation number
and the corresponding exponential sum, its Fourier transform
Clearly \(\mathcal {R}_{N}(n)\neq0\) implies that n is the curvature of some circle in the gasket. Note also from (3.4) that the total mass satisfies
The condition (2x,y)=1 will be a technical nuisance, and can be freed by a standard use of the Möbius inversion formula. To this end, we introduce another parameter
a small power of N, with \(\frak {u}>0\) depending only on the spectral gap of Γ; it is determined in (6.3). Then by truncating Möbius inversion, define
with corresponding “representation function” \(\mathcal {R}_{N}^{U}\) (which could be negative).
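The role of the truncation can be seen in a small experiment. The sketch below is illustrative only: since the display (3.9) is not reproduced in this copy, it assumes the standard form of the device, namely detecting (2x,y)=1 through the sum of μ(u) over divisors u of (2x,y), and truncating to u<U. The function names are hypothetical.

```python
from math import gcd


def mobius(n):
    """Moebius function by trial division (n >= 1)."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:         # square factor
                return 0
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def coprime_indicator(x, y):
    """Sum of mu(u) over all u dividing gcd(2x, y); equals 1 iff gcd is 1."""
    g = gcd(2 * x, y)
    return sum(mobius(u) for u in range(1, g + 1) if g % u == 0)


def truncated_indicator(x, y, U):
    """Same sum restricted to u < U (assumed shape of the truncation)."""
    g = gcd(2 * x, y)
    return sum(mobius(u) for u in range(1, min(g, U - 1) + 1) if g % u == 0)


example = truncated_indicator(15, 30, 6)   # gcd(30, 30) = 30, divisors below 6: 1, 2, 3, 5
```

The truncated sum agrees with the true indicator whenever (2x,y)<U, but for larger common divisors it can take other values, including negative ones, which is why the truncated representation function may be negative.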
3.2 Reduction to the circle method
We are now in position to outline the argument in the rest of the paper. Recall the notion of admissible numbers from Sect. 1.1. We first reduce our main Theorem 1.2 to the following
Theorem 3.1
There exists an η>0 and a function \(\frak {S}(n)\) with the following properties. For \(\frac {1}{2}N<n<N\), the singular series \(\frak {S}(n)\) is nonnegative, vanishes only when n is not admissible, and is otherwise \(\gg_{\varepsilon}N^{-\varepsilon}\) for any ε>0. Moreover, for \(\frac {1}{2}N<n<N\) and admissible,
except for a set of cardinality \(\ll N^{1-\eta}\).
Proof of Theorem 1.2 assuming Theorem 3.1
We first show that the difference between \(\mathcal {R}_{N}\) and \(\mathcal {R}_{N}^{U}\) is small in ℓ 1. Using (3.4) we have
for any ε>0. Recall from (3.8) that U is a fixed power of N, so the above saves a power from the total mass (3.7).
Now let Z be the “exceptional” set of admissible n<N for which \(\mathcal {R}_{N}(n)=0\). Furthermore, let W be the set of admissible n<N for which (3.10) is satisfied. Then
Note also from Theorem 3.1 that \(|Z\cap W^{c}|\le|W^{c}|\ll N^{1-\eta}\). Hence by (3.1) and (3.8),
which is a power savings since ε>0 is arbitrary. This completes the proof. □
To establish (3.10), we decompose \(\mathcal {R}_{N}^{U}\) into “major” and “minor” arcs, reducing Theorem 3.1 to the following
Theorem 3.2
There exists an η>0 and a decomposition
with the following properties. For \(\frac {1}{2}N<n<N\) with n admissible, we have
except for a set of cardinality \(\ll N^{1-\eta}\). The singular series \(\frak {S}(n)\) is the same as in Theorem 3.1. Moreover,
Proof of Theorem 3.1 assuming Theorem 3.2
We restrict our attention to the set of admissible n<N so that (3.13) holds (the remainder having sufficiently small cardinality). Let Z denote the subset of these n for which \(\mathcal {R}_{N}^{U}(n)<\frac {1}{2}\mathcal {M}_{N}^{U}(n)\); hence for n∈Z,
Then by (3.14),
whence the claim follows, since ε>0 is arbitrary. □
3.3 Decomposition into major and minor arcs
Next we explain the decomposition (3.12). Let M be a parameter controlling the depth of approximation in Dirichlet’s theorem: for any irrational θ∈[0,1], there exists some q<M and (r,q)=1 so that |θ−r/q|<1/(qM). We will eventually set
see (7.7) where this value is used. (Note that M is a bit bigger than \(N^{1/2}=XT^{1/2}\).)
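Dirichlet's theorem is constructive in the sense that a suitable fraction can be found by direct search. The following sketch is a naive illustration (hypothetical function name, floating-point arithmetic, integer M): it looks for the first q<M with |qθ−r|<1/M and then reduces r/q to lowest terms, which preserves the inequality |θ−r/q|<1/(qM).

```python
from math import gcd, sqrt


def dirichlet_approximation(theta, M):
    """Return (r, q) with q < M, gcd(r, q) = 1 and |theta - r/q| < 1/(q*M), or None.

    Brute-force search; for an irrational theta in [0, 1] such a pair exists,
    but floating-point input is only an approximation of that situation.
    """
    for q in range(1, M):
        r = round(q * theta)
        if abs(q * theta - r) < 1 / M:
            g = gcd(q, r)
            return r // g, q // g
    return None


theta = sqrt(2) - 1
M = 100
r, q = dirichlet_approximation(theta, M)
```

In the circle method only the existence of such r/q matters; the size of q then decides whether θ is treated as a major or a minor arc point.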
Writing θ=r/q+β, we introduce parameters
small powers of N as determined in (6.2), so that the “major arcs” correspond to \(q<Q_{0}\) and \(|\beta|<K_{0}/N\). In fact, we need a smooth version of this decomposition.
To this end, recall the “hat” function and its Fourier transform
Localize \(\frak {t}\) to the width \(K_{0}/N\), periodize it to the circle, and put this spike on each fraction in the major arcs:
By construction, \(\frak {T}\) lives on the circle \(\mathbb {R}/\mathbb {Z}\) and is supported within \(K_{0}/N\) of fractions r/q with small denominator, \(q<Q_{0}\), as desired.
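The reason for using a hat function is that its Fourier transform is nonnegative. The sketch below assumes the usual triangle function t(x)=max(0,1−|x|), which is consistent with the identity |1−t(x)|=|x| on [−1,1] used later, and compares a numerical Fourier transform (with the convention \(e(x)=e^{2\pi i x}\)) against the closed form (sin(πy)/(πy))². The function names and the midpoint-rule integration are illustrative choices.

```python
from math import sin, cos, pi


def hat(x):
    """Triangle function supported on [-1, 1] (assumed form of the hat function)."""
    return max(0.0, 1.0 - abs(x))


def hat_transform_closed_form(y):
    """(sin(pi y) / (pi y))^2, with value 1 at y = 0."""
    if y == 0:
        return 1.0
    return (sin(pi * y) / (pi * y)) ** 2


def hat_transform_numeric(y, steps=20000):
    """Midpoint-rule approximation of the integral of hat(x) e(-xy) over [-1, 1].

    The hat function is even, so only the cosine part contributes.
    """
    h = 2.0 / steps
    total = 0.0
    for k in range(steps):
        x = -1.0 + (k + 0.5) * h
        total += hat(x) * cos(2 * pi * x * y)
    return total * h


samples = [0.0, 0.3, 1.0, 2.5]
errors = [abs(hat_transform_numeric(y) - hat_transform_closed_form(y)) for y in samples]
```

Nonnegativity of the transform is what allows the smoothed major-arc weight to be used without introducing sign cancellations on the Fourier side.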
Then define the “main term”
and “error term”
so that (3.12) obviously holds.
Since \(\mathcal {R}_{N}^{U}\) could be negative, the same holds for \(\mathcal {M}_{N}^{U}\). Hence we will establish (3.13) by first proving a related result for
and then showing that \(\mathcal {M}_{N}\) and \(\mathcal {M}_{N}^{U}\) cannot differ by too much for too many values of n. This is the same (but in reverse) as the transfer from \(\mathcal {R}_{N}\) to \(\mathcal {R}_{N}^{U}\) in (3.11). See Theorem 6.1 for the lower bound on \(\mathcal {M}_{N}\), and Theorem 6.2 for the transfer.
To prove (3.14), we apply Parseval and decompose dyadically:
where we have dissected the circle into the following regions (using that \(|1-\frak {t}(x)|=|x|\) on [−1,1]):
Bounds of the quality (3.14) are given for (3.22) and (3.23) in Sect. 7, see Theorem 7.3. Our estimation of (3.24) decomposes further into two cases, whether Q<X or X≤Q<M, and are handled separately in Sect. 8 and Sect. 9; see Theorems 8.5 and 9.5, respectively.
We point out again that our averaging on n in the minor arcs makes this quite crude as far as individual n’s (the subject of Conjecture 1.1) are concerned.
3.4 The rest of the paper
The only section not yet described is Sect. 5, where we furnish some lemmata which are useful in the sequel. These decompose into two categories: one set of lemmata is related to some infinite-volume counting problems, for which the background in Sect. 4 is indispensable. The other lemma is of a classical flavor, corresponding to a local analysis for the shifted binary form \(\frak {f}\); this studies a certain exponential sum which is dealt with via Gauss and Kloosterman/Salié sums.
This completes our outline of the rest of the paper.
4 Preliminaries II: automorphic forms and representations
4.1 Spectral theory
Recall the general spectral theory in our present context. We abuse notation (in this section only), passing from \(G=\operatorname {SO}_{F}(\mathbb {R})\) to its spin double cover \(G= \operatorname {SL}(2,\mathbb {C})\). Let Γ<G be a geometrically finite discrete group. (The Apollonian group is such, being a Schottky group, see Fig. 3.) Then Γ acts discontinuously on the upper half space \(\mathbb {H}^{3}\), and any Γ orbit has a limit set \(\varLambda_{\varGamma }\) in the boundary \(\partial \mathbb {H}^{3}\cong S^{2} \) of some Hausdorff dimension δ=δ(Γ)∈[0,2]. We assume that Γ is non-elementary (not virtually Abelian), so δ>0, and moreover that Γ is not a lattice, that is, the quotient \(\varGamma \backslash \mathbb {H}^{3}\) has infinite hyperbolic volume; then δ<2. The hyperbolic Laplacian Δ acts on the space \(L^{2}(\varGamma \backslash \mathbb {H}^{3})\) of functions automorphic under Γ and square integrable on the quotient; we choose the Laplacian to be positive definite. The spectrum is controlled via the following, see [35, 38, 47].
Theorem 4.1
(Patterson, Sullivan, Lax-Phillips)
The spectrum above 1 is purely continuous, and the spectrum below 1 is purely discrete. The latter is empty unless δ>1, in which case, ordering the eigenvalues by
the base eigenvalue \(\lambda _{0}\) is given by
Remark 4.2
In our application to the Apollonian group, the limit set is precisely the underlying gasket, see Fig. 3. It has dimension
Corresponding to \(\lambda _{0}\) is the Patterson-Sullivan base eigenfunction, \(\varphi_{0}\), which can be realized explicitly as the integral of a Poisson kernel against the so-called Patterson-Sullivan measure μ. Roughly speaking, μ is the weak∗ limit as \(s\to \delta ^{+}\) of the measures
where d(⋅,⋅) is the hyperbolic distance, and \(\frak {o}\) is any fixed point in \(\mathbb {H}^{3}\).
4.2 Spectral gap
We assume henceforth that Γ moreover satisfies \(\varGamma < \operatorname {SL}(2, \mathcal {O})\), where \(\mathcal {O}=\mathbb {Z}[i]\). Then we have a tower of congruence subgroups: for any integer q≥1, define Γ(q) to be the kernel of the projection map \(\varGamma \to \operatorname {SL}(2, \mathcal {O}/\frak {q})\), with \(\frak {q}=(q)\) the principal ideal. As in (4.1), write
for the discrete spectrum of \(\varGamma (q)\backslash \mathbb {H}^{3}\). The groups Γ(q), while of infinite covolume, have finite index in Γ, and hence
But the second eigenvalues \(\lambda _{1}(q)\) could a priori encroach on the base. The fact that this does not happen is the spectral gap property for Γ.
Theorem 4.3
Given Γ as above, there exists some ε=ε(Γ)>0 such that for all q≥1,
This is proved in the Appendix by Péter Varjú.
4.3 Representation theory and mixing rates
By the Duality Theorem of Gelfand, Graev, and Piatetski-Shapiro [24], the spectral decomposition above is equivalent to the decomposition into irreducibles of the right regular representation acting on \(L^{2}(\varGamma \backslash G)\). That is, we identify \(\mathbb {H}^{3}\cong G/K\), with \(K=\operatorname {SU}(2)\) a maximal compact subgroup, and lift functions from \(\mathbb {H}^{3}\) to (right K-invariant) functions on G. Corresponding to (4.1) is the decomposition
Here \(V_{\mathrm{temp}}\) contains the tempered spectrum (for \(\operatorname {SL}_{2}(\mathbb {C})\), every non-spherical irreducible representation is tempered), and each \(V_{\lambda _{j}}\) is an infinite dimensional vector space, isomorphic as a G-representation to a complementary series representation with parameter \(s_{j}\in(1,2)\) determined by \(\lambda _{j}=s_{j}(2-s_{j})\). Obviously, a similar decomposition holds for \(L^{2}(\varGamma (q)\backslash G)\), corresponding to (4.4).
We also have the following well-known general fact about mixing rates of matrix coefficients, see e.g. [20]. First we recall the relevant Sobolev norm. Let (π,V) be a unitary G-representation, and let \(\{X_{j}\}\) denote an orthonormal basis of the Lie algebra \(\frak {k}\) of K with respect to an Ad-invariant scalar product. For a smooth vector \(v\in V^{\infty}\), define the (second order) Sobolev norm \(\mathcal {S}\) of v by
Theorem 4.4
([33, Prop. 5.3])
Let Θ>1 and (π,V) be a unitary representation of G which does not weakly contain any complementary series representation with parameter s>Θ. Then for any smooth vectors \(v,w\in V^{\infty}\),
Here ∥⋅∥ is the standard Frobenius matrix norm.
4.4 Effective bisector counting
The next ingredient which we require is the recent work by Vinogradov [49] on effective bisector counting for such infinite volume quotients. Recall the following sub(semi)groups of G:
We have the Cartan decomposition \(G=KA^{+}K\), unique up to the normalizer M of A in K. We require it in the following more precise form. Identify K/M with the sphere \(S^{2}\cong \partial \mathbb {H}^{3}\). Then for every g∈G not in K, there is a unique decomposition
with \(s_{1},s_{2}\in K/M\), \(a\in A^{+}\) and m∈M, corresponding to
see, e.g., [49, (3.4)]. The following theorem follows easily from [49, Theorem 2.2].
Theorem 4.5
([49])
Let \(\varPhi ,\varPsi \subset S^{2}\) be spherical caps and let \(\mathcal {I}\subset \mathbb {R}/\mathbb {Z}\) be an interval. Then under the above hypotheses on Γ (in particular δ>1), and using the decomposition (4.9), we have
as T→∞. Here \(c_{\delta }>0\), ∥⋅∥ is the Frobenius norm, ℓ is Lebesgue measure, μ is Patterson-Sullivan measure (cf. (4.3)), and
depends only on the spectral gap for Γ. The implied constant does not depend on Φ,Ψ, or \(\mathcal {I}\).
This generalizes from \(\operatorname {SL}(2,\mathbb {R})\) to \(\operatorname {SL}(2,\mathbb {C})\) the main result of [12], which is itself a generalization (with weaker exponents) to our infinite volume setting of [25, Theorem 4].
5 Some lemmata
5.1 Infinite volume counting statements
Equipped with the tools of Sect. 4, we isolate here some consequences which will be needed in the sequel. We return to the notation \(G=\operatorname {SO}_{F}\), with F the Descartes form (2.2), \(\varGamma =\mathcal {A}\cap G\), the orientation preserving Apollonian subgroup, and Γ(q) its principal congruence subgroups. Moreover, we import all the notation from the previous section.
First we use the spectral gap to see that summing over a coset of a congruence group can be reduced to summing over the original group.
Lemma 5.1
Fix γ 1∈Γ, q≥1, and any “congruence” group \(\tilde{\varGamma }(q)\) satisfying
Then as Y→∞,
where \(\varTheta _{0}<\delta \) depends only on the spectral gap for Γ. The implied constant above does not depend on q or \(\gamma _{1}\). The same holds with \(\gamma _{1}\gamma \) in (5.2) replaced by \(\gamma \gamma _{1}\).
This simple lemma follows from a more-or-less standard argument. We give a sketch below, since a slightly more complicated result will be needed later, cf. Lemma 5.3, but with essentially no new ideas. After proving the lemma below, we will use the argument as a template for the more complicated statement.
Sketch of Proof
Denote the left hand side of (5.2) by \(\mathcal {N}_{q}\), and let \(\mathcal {N}_{1}/[\varGamma :\tilde{\varGamma }(q)]\) be the first term of (5.3). For g∈G, let
and define
so that
By construction, \(F_{q}\) is a function on \(\tilde{\varGamma }(q)\backslash G\times \tilde{\varGamma }(q)\backslash G\), and we smooth \(F_{q}\) in both copies of \(\tilde{\varGamma }(q)\backslash G\), as follows. Let ψ≥0 be a smooth bump function supported in a ball of radius η>0 (to be chosen later) about the origin in G with \(\int_{G}\psi=1\), and automorphize it to
Then clearly \(\varPsi_{q}\) is a bump function in \(\tilde{\varGamma }(q)\backslash G\) with \(\int_{\tilde{\varGamma }(q)\backslash G}\varPsi_{q}=1\). Let
Smooth the variables g and h in \(F_{q}\) by considering
First we estimate the error from smoothing:
where we have increased γ to run over all of Γ. The analysis splits into three ranges.
(1) If γ is such that
$$ \|\gamma _{1}\gamma \|>Y(1+10\eta), \tag{5.7}$$
then both \(f(\gamma _{1}g^{-1}\gamma h)\) and \(f(\gamma _{1}\gamma )\) vanish.
(2) In the range
$$ \|\gamma _{1}\gamma \|<Y(1-10\eta), \tag{5.8}$$
both \(f(\gamma _{1}g^{-1}\gamma h)\) and \(f(\gamma _{1}\gamma )\) are 1, so their difference vanishes.
(3) In the intermediate range, we apply [35], bounding the count by
$$ \ll Y^{\delta }\eta+ Y^{\delta -\varepsilon }, \tag{5.9}$$
where ε>0 depends on the spectral gap for Γ.
Thus it remains to analyze \(\mathcal {H}_{q}\).
Use a simple change of variables (see [12, Lemma 3.7]) to express \(\mathcal {H}_{q}\) via matrix coefficients:
Decompose the matrix coefficient into its projection onto the base irreducible \(V_{\lambda _{0}}\) in (4.7) and an orthogonal term, and bound the remainder by the mixing rate (4.8) using the uniform spectral gap ε>0 in (4.6). The functions ψ are bump functions in six real dimensions, so can be chosen to have second-order Sobolev norms bounded by \(\ll\eta^{-5}\). Of course the projection onto the base representation is just \([\varGamma :\tilde{\varGamma }(q)]^{-1}\) times the same projection at level one, cf. (4.5). Running the above argument in reverse at level one (see [12, Proposition 4.18]) gives:
Optimizing η and renaming \(\varTheta _{0}<\delta \) in terms of the spectral gap ε gives the claim. □
Next we exploit the previous lemma and the product structure of the family \(\frak {F}\) in (3.3) to save a small power of q in the following modular restriction. Such a bound is needed at several places in Sect. 8.
Lemma 5.2
Let \(\varTheta _{0}\) be as in (5.3). Define \(\mathcal {C}\) in (3.2) by
hence determining \(T_{1}\) and \(T_{2}\). There exists some \(\eta_{0}>0\) depending only on the spectral gap of Γ so that for any 1≤q<N and any \(r(\operatorname {mod}q)\),
The implied constant is independent of r.
Proof
Dropping the condition \(\langle e_{1},\gamma _{1}\gamma _{2}v_{0}\rangle>T/100\) in (3.3), bound the left hand side of (5.12) by
We decompose the argument into two ranges of q.
Case 1: q small
In this range, we fix \(\gamma _{1}\), and follow a standard argument for \(\gamma _{2}\). Let \(\tilde{\varGamma }(q)<\varGamma \) denote the stabilizer of \(v_{0} (\operatorname {mod}q)\), that is
Clearly (5.1) is satisfied, and it is elementary that
cf. (2.19). Decompose \(\gamma _{2}=\gamma _{2}'\gamma _{2}''\) with \(\gamma _{2}''\in\tilde{\varGamma }(q)\) and \(\gamma _{2}'\in \varGamma /\tilde{\varGamma }(q)\). Then by (5.3) and [35], we have
Hence we have saved a whole power of q, as long as
Case 2: \(q\ge T_{2}^{{\delta -\varTheta _{0}\over2}}\)
Then by (5.11) and (3.2), q is actually a very large power of \(T_{1}\),
In this range, we exploit Hilbert’s Nullstellensatz and effective versions of Bezout’s theorem; see a related argument in [7, Proof of Proposition 4.1].
Fixing \(\gamma _{2}\) in (5.13) (with \(\ll T_{2}^{\delta }\) choices), we set
and play now with \(\gamma _{1}\). Let S be the set of \(\gamma _{1}\)’s in question (and we now drop the subscript 1):
This congruence restriction is to a modulus much bigger than the parameter, so we
Claim
There is an integer vector \(v_{*}\neq0\) and an integer \(z_{*}\) such that
holds for all γ∈S. That is, the modular condition can be lifted to an exact equality.
First we assume the Claim and complete the proof of (5.12). Let \(q_{0}\) be a prime of size \(\asymp T_{1}^{(\delta -\varTheta _{0})/2}\), say, such that \(v_{*}\not \equiv0(\operatorname {mod}q_{0})\); then
by the argument in Case 1. Recall we assumed that q<N. Since \(q_{0}\) above is a small power of N, the above saves a tiny power of q, as desired.
It remains to establish the Claim. For each γ∈S, consider the condition
First massage the equation into one with no trivial solutions. Since v is a primitive vector, after a linear change of variables we may assume that \((v_{1},q)=1\). Then multiply through by \(\bar{v}_{1}\), where \(v_{1}\bar{v}_{1}\equiv 1(\operatorname {mod}q)\), getting
Now, for variables \(V=(V_{2},V_{3},V_{4})\) and Z, and each γ∈S, consider the (linear) polynomials \(P_{\gamma }\in \mathbb {Z}[V,Z]\):
and the affine variety
If this variety \(\mathcal {V}(\mathbb {C})\) is non-empty, then there is clearly a rational solution, \((V^{*},Z^{*})\in \mathcal {V}(\mathbb {Q})\). Hence we have found a rational solution to (5.18), namely \(v^{*}=(1,V_{2}^{*},V_{3}^{*},V_{4}^{*})\neq0\) and \(z^{*}=Z^{*}\). Since (5.18) is homogeneous, we may clear denominators, getting an integral solution, \(v_{*},z_{*}\).
Thus we henceforth assume by contradiction that the variety \(\mathcal {V}(\mathbb {C})\) is empty. Then by Hilbert’s Nullstellensatz, there are polynomials \(Q_{\gamma }\in \mathbb {Z}[V,Z]\) and an integer \(\frak {d}\ge1\) so that
for all \((V,Z)\in \mathbb {C}^{4}\). Moreover, Hermann’s method [29] (see [36, Theorem IV]) gives effective bounds on the heights of \(Q_{\gamma }\) and \(\frak {d}\) in the above Bezout equation. Recall the height of a polynomial is the logarithm of its largest coefficient (in absolute value); thus the polynomials \(P_{\gamma }\) are linear in four variables with height \(\le\log T_{1}\). Then \(Q_{\gamma }\) and \(\frak {d}\) can be found so that
(Much better bounds are known, see e.g. [1, Theorem 5.1], but these suffice for our purposes.)
On the other hand, reducing (5.20) modulo q and evaluating at
we have
by (5.19). But then since \(\frak {d}\ge1\), we in fact have \(\frak {d}\ge q\), which is incompatible with (5.21) and (5.17). This furnishes our desired contradiction, completing the proof. □
Next we need a slight generalization of Lemma 5.1, which will be used in the major arcs analysis, see (6.6).
Lemma 5.3
Let \(1<K\le T_{2}^{1/10}\), fix |β|<K/N, and fix x,y≍X. Then for any γ 0∈Γ, any q≥1, and any group \(\tilde{\varGamma }(q)\) satisfying (5.1), we have
where Θ<δ depends only on the spectral gap for Γ, and the implied constant does not depend on q, γ 0, β, x or y.
Proof
The proof follows with minor changes that of Lemma 5.1, so we give a sketch; see also [12, Sect. 4].
According to the construction (3.3) of \(\frak {F}\), the γ’s in question satisfy \(\gamma =\gamma _{1}\gamma _{2}\in \gamma _{0}\tilde{\varGamma }(q)\), and hence we can write
with \(\gamma _{2}'\in\tilde{\varGamma }(q)\). Then \(\gamma _{2}'=\gamma _{0}^{-1}\gamma _{1}\gamma _{2}\), and using (2.15), we can write the left hand side of (5.22) as
Now we fix \(\gamma_{1}\) and mimic the proof of Lemma 5.1 in \(\gamma _{2}'\).
Replace (5.4) by
Then (5.5)–(5.7) remain essentially unchanged, save cosmetic changes such as replacing (5.6) by \(F_{q}(\gamma _{1}\gamma _{0}^{-1},e)\). Then in the estimation of the difference \(|\mathcal {N}_{q}-\mathcal {H}_{q}|\) by splitting the sum on \(\gamma _{2}'\) into ranges, the argument now proceeds as follows.
(1) The range (5.7) should be replaced by
$$\begin{aligned} &\big\|\gamma _{1}\gamma _{0}^{-1}\gamma _{2}' \big\|<T_{2}(1-10\eta),\quad\text{or}\quad \big\|\gamma _{1}\gamma _{0}^{-1} \gamma _{2}'\big\|>2T_{2}(1+10\eta), \\ &\quad\text{or}\quad\bigl\langle e_{1},\gamma _{1}\gamma _{0}^{-1} \gamma _{2}'\,v_{0}\bigr\rangle<\frac{T}{100}(1-10 \eta). \end{aligned}$$
(2) The range (5.8) should be replaced by the range
$$\begin{aligned} &T_{2}(1+10\eta)<\big\|\gamma _{1}\gamma _{0}^{-1} \gamma _{2}'\big\|<2T_{2}(1-10\eta),\quad \text{and}\\ &\bigl \langle e_{1},\gamma _{1}\gamma _{0}^{-1} \gamma _{2}'\,v_{0}\bigr\rangle>\frac{T}{100}(1+10 \eta), \end{aligned}$$
in which f is differentiable. Here instead of the difference \(|f(\gamma _{1}\gamma _{0}^{-1}\cdot g\gamma _{2}'h)-f(\gamma _{1}\gamma _{0}^{-1}\gamma _{2}')|\) vanishing, it is now bounded by
$$\ll\eta K, $$
for a net contribution to the error of \(\ll\eta KT^{\delta}\).
(3) In the remaining range, (5.9) remains unchanged, using \(|f|\le1\).
The error in (5.10) is then replaced by
Optimizing η and renaming Θ gives the bound \(O(T_{2}^{\varTheta }K^{10/11})\), which is better than claimed in the power of K. Rename Θ once more using (3.2) and (5.11), giving (5.22). □
The following is our last counting lemma, showing a certain equidistribution among the values of \(\frak {f}_{\gamma }(2x,y)\) at the scale N/K. This bound is used in the major arcs, see the proof of Theorem 6.1.
Lemma 5.4
Fix N/2<n<N, \(1<K\le T_{2}^{1/10}\), and x,y≍X. Then
where Θ<δ only depends on the spectral gap for Γ. The implied constant is independent of x,y, and n.
Sketch
The proof is an explicit calculation nearly identical to the one given in [12, Sect. 5]; we give only a sketch here. Write the left hand side of (5.23) as
Fix \(\gamma_{1}\) and express the condition on \(\gamma_{2}\) as \(\gamma_{2}\in R\subset G\), where R is the region
Lift \(G=\operatorname {SO}_{F}(\mathbb {R})\) to its spin cover \(\tilde{G}= \operatorname {SL}_{2}(\mathbb {C})\) via the map ι of (2.18). Let \(\tilde{R}\subset\tilde{G}\) be the corresponding pullback region, and decompose \(\tilde{G}\) into Cartan KAK coordinates according to (4.9). Note that ι is quadratic in the entries, so, e.g., the condition
explaining the factor \(\|a(g)\|^{2}\) appearing in (4.10).
Then chop \(\tilde{R}\) into spherical caps and apply Theorem 4.5. The same argument as in [12, Sect. 5] then leads to (5.23), after renaming Θ; we suppress the details. □
5.2 Local analysis statements
In this subsection, we study a certain exponential sum which arises in a crucial way in our estimates. Fix \(\frak {f}\in \frak {F}\), and write \(\frak {f}=f-a\) with
according to (2.14). Let \(q_{0}\ge1\), fix r with \((r,q_{0})=1\), and fix \(n,m\in \mathbb {Z}\). (The notation is meant to be consistent with its later use; there will be another parameter q, and \(q_{0}\) will be a divisor of q.) Define the exponential sum
This sum appears naturally in many places in the minor arcs analysis, see e.g. (7.4) and (9.2). Our first lemma is completely standard, see, e.g. [30, Sect. 12.3].
Lemma 5.5
With the above conditions,
Remark 5.6
Being a sum in two variables, one might expect square-root cancellation in each, giving a savings of \(q_{0}^{-1}\); indeed this is what we obtain, modulo some coprimality conditions, see (5.29). For some of our applications, saving just one square-root is plenty, and we can ignore the coprimality; hence the cleaner statement in (5.26).
Proof
Write \(\mathcal {S}_{f}\) for \(\mathcal {S}_{f}(q_{0},r;n,m)\). Note first that \(\mathcal {S}_{f}\) is multiplicative in \(q_{0}\), so we study the case that \(q_{0}=p^{j}\) is a prime power. Assume for simplicity \((q_{0},2)=1\); similar calculations are needed to handle the 2-adic case.
First we re-express \(\mathcal {S}_{f}\) in a more convenient form. By Descartes theorem (2.1), primitivity of the gasket, and (2.13), we have that (A,B,C)=1; assume henceforth that \((C,q_{0})=1\), say. Write \(\bar{x}\) for the multiplicative inverse of x (the modulus will be clear from context). Recall throughout that \((r,q_{0})=1\).
Looking at the terms in the summand of \(\mathcal {S}_{f}\), we have
where we used (2.16). Hence we have
and the ℓ sum is just a classical Gauss sum. It can be evaluated explicitly, see e.g. [30, Eq. (3.38)]. Let
Then the Gauss sum on ℓ is \(\varepsilon _{q_{0}}\sqrt{q_{0}} ({rC\over q_{0}} )\), where \(({\cdot\over q_{0}})\) is the Legendre symbol. Thus we have
Let
so that \(a^{2}/q_{0}=a_{1}/q_{1}\) in lowest terms. Break the sum on \(0\le k<q_{0}\) according to \(k= k_{1}+q_{1}\tilde{k}\), with \(0\le k_{1}<q_{1}\) and \(0\le\tilde{k}< \tilde{q}_{0}\). Then
The last sum vanishes unless \(n-B\bar{C}m\equiv0\) \((\operatorname {mod}\tilde{q}_{0})\), in which case it is \(\tilde{q}_{0}\). In the latter case, define L by
Then we have
The Gauss sum in brackets is again evaluated as \(\varepsilon _{q_{1}} q_{1}^{1/2} ( { a_{1} r\bar{C} \over q_{1} } ) \), so we have
The claim then follows trivially. □
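As a numerical sanity check on the Gauss sum evaluation used in this proof, the following Python sketch compares a brute-force quadratic Gauss sum with the closed form \(\varepsilon_{p}\sqrt{p}\,({a\over p})\) in the simplest case of an odd prime modulus p. The function names are illustrative and do not come from the paper, and the convention \(\varepsilon_{p}=1\) for \(p\equiv1\ (\operatorname{mod}4)\), \(\varepsilon_{p}=i\) for \(p\equiv3\ (\operatorname{mod}4)\) is the standard one, assumed here to agree with the ε of the proof.

```python
import cmath
import math


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def gauss_sum(a, p):
    """Brute-force sum over l mod p of exp(2*pi*i*a*l^2/p)."""
    return sum(cmath.exp(2j * math.pi * ((a * l * l) % p) / p) for l in range(p))


def gauss_sum_closed_form(a, p):
    """Closed form eps_p * sqrt(p) * (a/p), for p an odd prime and p not dividing a."""
    eps = 1 if p % 4 == 1 else 1j
    return eps * math.sqrt(p) * legendre(a, p)


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13):
        for a in range(1, p):
            diff = abs(gauss_sum(a, p) - gauss_sum_closed_form(a, p))
            print(p, a, f"{diff:.2e}")
```

In particular the sum has absolute value exactly \(\sqrt{p}\), which is the single square-root saving referred to in Remark 5.6; the prime power case treated in the proof is not covered by this sketch.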
Next we introduce a certain average of a pair of such sums. Let \(f,q_{0},r,n\), and m be as before, and fix \(q\equiv0\ (\operatorname {mod}q_{0})\) and \((u_{0},q_{0})=1\). Let \(\frak {f}'\in \frak {F}\) be another shifted form \(\frak {f}'=f'-a'\), with
Also let \(n',m'\in \mathbb {Z}\). Then define
This sum also appears naturally in the minor arcs analysis, see (8.2) and (9.4).
Lemma 5.7
With the above notation, we have the estimate
Remark 5.8
Treating all gcd’s above as 1 and pretending \(q=q_{0}\), the trivial bound here (after having saved essentially a whole q from each of the two \(\mathcal {S}_{f}\) sums) is 1/q, since the r sum is unnormalized. So (5.31) saves an extra \(q^{1/4}\) in the r sum. (In fact we could have saved the expected \(q^{1/2}\), but this does not improve our final estimates.)
Proof
Observe that \(\mathcal {S}\) is multiplicative in q, so we again consider the prime power case \(q=p^{j}\), \(p\neq2\); then \(q_{0}\) is also a prime power, since \(q_{0}\mid q\). As before, we may assume \((C,q_{0})=(C',q_{0})=1\).
Recall \(a_{1}\), \(\tilde{q}_{0}\), and L given in (5.27) and (5.28), and let \(a_{1}'\), \(\tilde{q}_{0}'\) and L′ be defined similarly. Inputting the analysis from (5.29) into both \(\mathcal {S}_{f}\) and \(\mathcal {S}_{f'}\), we have
The term in brackets [⋅] is a Kloosterman- or Salié-type sum, for which we have an elementary bound [32] to the power 3/4:
giving the claim. (There is no improvement in our use of this estimate from appealing to Weil’s bound instead of Kloosterman’s; any power gain suffices.) □
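To illustrate the size of such sums, the sketch below computes the classical Kloosterman sum \(S(a,b;p)=\sum_{x}e_{p}(ax+b\bar{x})\) by brute force for a prime p. This is a simpler stand-in for the bracketed Kloosterman- or Salié-type sum of the proof, whose exact shape is not reproduced here; the function name is hypothetical. The comparison values are Weil's bound \(2\sqrt{p}\) and the weaker exponent 3/4 that suffices for the argument.

```python
import cmath
import math


def kloosterman(a, b, p):
    """Classical Kloosterman sum S(a, b; p) for a prime p, by direct summation."""
    total = 0j
    for x in range(1, p):
        x_inv = pow(x, -1, p)  # modular inverse (Python 3.8+)
        total += cmath.exp(2j * math.pi * ((a * x + b * x_inv) % p) / p)
    return total


if __name__ == "__main__":
    p = 101
    for a, b in [(1, 1), (2, 3), (5, 7)]:
        s = kloosterman(a, b, p)
        # The sum is real; compare its size with sqrt(p)-type and p^(3/4)-type bounds.
        print(a, b, round(s.real, 6), 2 * math.sqrt(p), p ** 0.75)
```

The trivial bound is p−1, so any exponent below 1 is a power gain; this matches the remark that appealing to Weil's bound instead of Kloosterman's gives no improvement in the final estimate.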
In the case a=a′, (5.31) only saves one power of q, and in Sect. 9 we will need slightly more; see the proof of (9.10). We get a bit more cancellation in the special case f(m,−n)≠f′(m′,−n′) below.
Lemma 5.9
Assuming a=a′ and f(m,−n)≠f′(m′,−n′), we have the estimate
Proof
Assume first that q (and hence \(q_{0}\)) is a prime power, continuing to omit the prime 2. Returning to the definition of \(\mathcal {S}\) in (5.30), it is clear in the case a=a′ that
Hence we again apply Kloosterman’s 3/4th bound to (5.32), getting
which is valid now without the assumption that \(q_{0}\) is a prime power. (Here \(a_{1}\) satisfies \(a^{2}=a_{1}(a^{2},p^{j})\) as in (5.27), and L is given in (5.28), so both depend on \(p^{j}\).)
Break the primes dividing \(q_{0}\) into two sets, \(\mathcal {P}_{1}\) and \(\mathcal {P}_{2}\), defining \(\mathcal {P}_{1}\) to be the set of those primes p for which
and \(\mathcal {P}_{2}\) the rest. For the latter, the gcd \((p^{j},\ldots)\) in (5.34) is at most \(p^{j/2}\), so we clearly have
For \(p\in \mathcal {P}_{1}\), we multiply both sides of (5.35) by
giving
Using (5.28) that
and subtracting a from both sides of (5.37), we have shown that
Let
By assumption Z≠0. Moreover (5.38) implies that
and hence
Combining (5.39) and (5.36) in (5.34) gives the claim. □
Finally we need some savings in the case a=a′ and f(m,−n)=f′(m′,−n′). This will no longer come from \(\mathcal {S}\) itself, but from the following supplementary lemmata.
Lemma 5.10
Fix an equivalence class \(\mathcal {K}\) of primitive binary quadratic forms of discriminant \(-4a^{2}\). We claim that the number of equivalent forms \(f\in \mathcal {K}\) with \(\frak {f}=f-a\in \frak {F}\) is bounded, that is,
Proof
From (2.13), (3.3), and (2.16), we have that \(f(m,n)=Am^{2}+2Bmn+Cn^{2}\) has coefficients of size
and \(AC-B^{2}=a^{2}\), with \(a\asymp T\). It follows that \(AC\asymp T^{2}\), and hence
Now suppose we have \(\frak {f}=f-a\) and \(\frak {f}'=f'-a\) with f as above and f′ having coefficients A′,B′,C′. If f and f′ are equivalent then there is an element so that
The first line can be rewritten as
so that
Similarly,
and hence |i|≪1. In a similar fashion, we see that |h| and |j| are also bounded, thus the number of equivalent forms in \(\mathcal {K}\) is bounded, as claimed. □
Lemma 5.11
For a fixed large integer z, the number of inequivalent classes \(\mathcal {K}\) of primitive quadratic forms of determinant \(-4a^{2}\) which represent z is
Proof
If \(f\in \mathcal {K}\) represents z, say f(m,n)=z, then, setting w=(m,n), f represents \(z_{1}:=z/w^{2}\) primitively. We see from (5.42) that f is then in the same class as \(f_{1}(m,n)=z_{1}m^{2}+2Bmn+Cn^{2}\), with
Moreover, by a unipotent change of variables preserving \(z_{1}\), we can force B into the range \([0,z_{1})\), that is, B is determined mod \(z_{1}\). So the number of inequivalent such \(f_{1}\) is equal to
where \(p^{f}\,\|\,2a\). If \(2f\ge e\), then the number of local solutions is at most \(p^{e/2}\). Otherwise, write \(B=B_{1}p^{f}\); then there are at most 2 solutions to \(B_{1}^{2}\equiv-1(\operatorname {mod}p^{e-2f})\), and there are \(p^{f}\) values for B once \(B_{1}\) is determined. Hence the number of local solutions is at most \(2\cdot\min(p^{e/2},p^{f})\), so the number of solutions to (5.44) is at most
The number of divisors \(z_{1}\) of z is \(\ll_{\varepsilon}z^{\varepsilon}\), completing the proof. □
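The local count in this proof can be checked by brute force. The sketch below assumes that the congruence being counted is \(B^{2}\equiv-a^{2}\ (\operatorname{mod}p^{e})\), which is how the surrounding text reads but is not displayed here, and it restricts to odd primes p, where \(p^{f}\,\|\,2a\) is the same as \(p^{f}\,\|\,a\). All function names are illustrative.

```python
def count_local_solutions(a, p, e):
    """Number of B mod p**e with B*B = -a*a (mod p**e), by exhaustive search."""
    q = p ** e
    return sum(1 for B in range(q) if (B * B + a * a) % q == 0)


def p_adic_valuation(n, p):
    """Largest f with p**f dividing the nonzero integer n."""
    f = 0
    while n % p == 0:
        n //= p
        f += 1
    return f


def local_bound(a, p, e):
    """The bound 2 * min(p^(e/2), p^f) from the proof, with p^f exactly dividing a."""
    f = p_adic_valuation(a, p)
    return 2 * min(p ** (e / 2), p ** f)


if __name__ == "__main__":
    for a, p, e in [(1, 5, 1), (5, 5, 3), (3, 3, 2), (9, 3, 4)]:
        print(a, p, e, count_local_solutions(a, p, e), local_bound(a, p, e))
```

The case \((a,p,e)=(5,5,3)\) attains the bound: writing \(B=5B_{1}\) leaves \(B_{1}^{2}\equiv-1\ (\operatorname{mod}5)\), which has 2 solutions, each giving 5 values of B modulo 125.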
Lemma 5.12
Fix (A,B,C)=1 and \(d\mid AC-B^{2}\). Then there are integers k,ℓ with (k,ℓ,d)=1 so that, whenever \(Am^{2}+2Bmn+Cn^{2}\equiv0\ (d)\), we have
Proof
We will work locally, then lift to a global solution. Let \(p^{e}\,\|\,d\).
Case 1: If (p,A)=1, then \(Am^{2}+2Bmn+Cn^{2}\equiv0\ (p^{e})\) implies
$$(m+\bar{A} Bn)^{2}-\bar{A} ^{2}B^{2}n^{2}+ \bar{A}Cn^{2}\equiv (m+\bar{A} Bn)^{2}\equiv 0 \bigl(p^{e}\bigr). $$
In this case, we set \(k_{p}:=1\) and \(\ell_{p}:={\bar{A} B}\).
Case 2: If (p,A)>1, then by primitivity, (p,C)=1. As before, we have \((n+\bar{C} Bm)^{2}\equiv 0(p^{e}) \), and we choose \(k_{p}:={\bar{C} B}\), \(\ell_{p}:=1\).
By the Chinese Remainder Theorem, there are integers k and ℓ so that \(k\equiv k_{p}(\operatorname {mod}p^{e})\), and similarly with ℓ. By construction, we have (k,ℓ,d)=1, as claimed. □
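The construction in this proof is explicit enough to run. The following Python sketch builds k and ℓ prime by prime exactly as in Cases 1 and 2, glues them with the Chinese Remainder Theorem, and then checks by exhaustive search that \(Am^{2}+2Bmn+Cn^{2}\equiv0\ (d)\) forces \((km+\ell n)^{2}\equiv0\ (d)\). Reading the conclusion of the lemma in this squared form is an assumption taken from the two cases of the proof; the helper names are hypothetical.

```python
from math import gcd


def prime_power_factors(d):
    """List of (p, e) with p**e exactly dividing d, by trial division."""
    out, p = [], 2
    while p * p <= d:
        if d % p == 0:
            e = 0
            while d % p == 0:
                d //= p
                e += 1
            out.append((p, e))
        p += 1
    if d > 1:
        out.append((d, 1))
    return out


def crt(residues, moduli):
    """Solve x = r_i (mod m_i) for pairwise coprime moduli."""
    x, M = 0, 1
    for r, m in zip(residues, moduli):
        t = ((r - x) * pow(M, -1, m)) % m  # modular inverse needs Python 3.8+
        x += M * t
        M *= m
    return x % M


def linear_form(A, B, C, d):
    """Return (k, l) built as in Cases 1 and 2, assuming (A,B,C)=1 and d | AC-B^2."""
    assert gcd(gcd(A, B), C) == 1 and (A * C - B * B) % d == 0
    ks, ls, mods = [], [], []
    for p, e in prime_power_factors(d):
        q = p ** e
        if A % p != 0:   # Case 1: A invertible mod p^e
            k_p, l_p = 1, (pow(A, -1, q) * B) % q
        else:            # Case 2: then p | B, so p does not divide C
            k_p, l_p = (pow(C, -1, q) * B) % q, 1
        ks.append(k_p)
        ls.append(l_p)
        mods.append(q)
    return crt(ks, mods), crt(ls, mods)


def check(A, B, C, d):
    """Exhaustively verify coprimality and the implication for all m, n mod d."""
    k, l = linear_form(A, B, C, d)
    if gcd(gcd(k, l), d) != 1:
        return False
    return all((k * m + l * n) ** 2 % d == 0
               for m in range(d) for n in range(d)
               if (A * m * m + 2 * B * m * n + C * n * n) % d == 0)
```

For example, with (A,B,C)=(2,1,5) and d=9 the construction gives k=1 and ℓ=5, reflecting the identity \(2(m+5n)^{2}\equiv2m^{2}+2mn+5n^{2}\ (\operatorname{mod}9)\).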
Lemma 5.13
Given large M, (A,B,C)=1 and \(d\mid AC-B^{2}\),
Proof
As in Lemma 5.12, A,B,C and d determine k,ℓ so that
But then there is a \(d_{1}\mid d\), with \(d\mid d_{1}^{2}\), so that \(mk+n\ell\equiv0\ (d_{1})\). Let \(w=(\ell,d_{1})\); then \(mk\equiv0\ (w)\) implies \(m\equiv0\ (w)\) since (k,ℓ,d)=1. There are at most 1+M/w such m up to M. With m fixed, n is uniquely determined mod \(d_{1}/w\). Hence we get the bound
as claimed. □
Finally we collect the above lemmata into our desired estimate, essential in the proof of (9.12).
Proposition 5.14
For large M and \(\frak {f}=f-a\in \frak {F}\) fixed,
for any ε>0.
Proof
Once f,m,n, and \(\frak {f}'=f'-a\in \frak {F}\) are determined, it is elementary that there are \(\ll_{\varepsilon}M^{\varepsilon}\) values of m′,n′ with f(m,−n)=f′(m′,−n′). Decomposing f′ into classes and applying (5.40), (5.43), and (5.46), in succession, we have
from which the claim follows since a≪T. □
6 Major arcs
We return to the setting and notation of Sect. 3 with the goal of establishing (3.13). Thanks to the counting lemmata in Sect. 5.1, we can now define the major arcs parameters \(Q_{0}\) and \(K_{0}\) from (3.16). First recall the two numbers \(\varTheta<\delta\) appearing in (5.22), (5.23), and define
to be the larger of the two. Then set
We may now also set the parameter U from (3.8) to be
where \(0<\eta_{0}<1\) is the number which appears in Lemma 5.2.
Let \(\mathcal {M}_{N}^{(U)}(n)\) denote either \(\mathcal {M}_{N}(n)\) or \(\mathcal {M}_{N}^{U}(n)\) from (3.21), (3.19), respectively. Putting (3.18) and (3.6) (resp. (3.9)) into (3.21) (resp. (3.19)), making a change of variables θ=r/q+β, and unfolding the integral from \(\sum_{m}\int_{0}^{1}\) to \(\int_{\mathbb {R}}\) gives
where in the last sum, u ranges over \(u\mid(2x,y)\) (resp. with the additional condition \(u<U\)). Here we have defined
using (2.15).
As in (5.14), let \(\tilde{\varGamma }(q)\) be the stabilizer of \(v_{0}(\operatorname {mod}q)\). Decompose the sum on \(\gamma \in \frak {F}\) in (6.5) as a sum on \({\gamma _{0}\in \varGamma /\tilde{\varGamma }(q)}\) and \({\gamma \in \frak {F}\cap \gamma _{0}\tilde{\varGamma }(q)}\). Applying Lemma 5.3 to the latter sum, using the definition of Θ 1 in (6.1), and recalling the estimate (5.15) gives
where
Clearly we have thus split \(\frak {M}\) into “modular” and “Archimedean” components. It is now a simple matter to prove the following
Theorem 6.1
For \(\frac {1}{2}N<n<N\), there exists a function \(\frak {S}(n)\) as in Theorem 3.1 so that
Proof
First we discuss the modular component. Write \(\frak {S}_{Q_{0}}\) as
where \(c_{q}\) is the Ramanujan sum, \(c_{q}(m)=\sum_{r(q)}'e_{q}(rm)\). By (2.19), the analysis now reduces to a classical estimate for the singular series. We may use the transitivity of the \(\gamma_{0}\) sum to replace \(\langle w_{x,y},\gamma_{0}v_{0}\rangle\) by \(\langle e_{4},\gamma_{0}v_{0}\rangle\), extend the sum on q to all natural numbers, and use multiplicativity to write the sum as an Euler product. Then the resulting singular series
vanishes only on non-admissible numbers, and can easily be seen to satisfy
for any ε>0. See, e.g. [8, Sect. 4.3].
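The Ramanujan sum entering the modular component, \(c_{q}(m)=\sum_{r(q)}'e_{q}(rm)\), is easy to experiment with. The sketch below evaluates it directly from this definition and compares it with the classical closed form \(c_{q}(m)=\sum_{d\mid(q,m)}\mu(q/d)\,d\); the closed form is a standard identity rather than something stated in the paper, and the function names are illustrative.

```python
import cmath
import math
from math import gcd


def ramanujan_sum(q, m):
    """c_q(m): sum of exp(2*pi*i*r*m/q) over r mod q with gcd(r, q) = 1."""
    total = sum(cmath.exp(2j * math.pi * r * m / q)
                for r in range(1, q + 1) if gcd(r, q) == 1)
    return round(total.real)  # the sum is always an integer


def mobius(n):
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def ramanujan_closed_form(q, m):
    """Sum of mu(q/d) * d over the divisors d of gcd(q, m)."""
    g = gcd(q, m)
    return sum(mobius(q // d) * d for d in range(1, g + 1) if g % d == 0)


if __name__ == "__main__":
    for q in (1, 2, 6, 12):
        print(q, [ramanujan_sum(q, m) for m in range(q)])
```

The multiplicativity of \(c_{q}(m)\) in q, visible in the closed form, is what allows the sum over q to be written as an Euler product.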
Next we handle the Archimedean component. By our choice of \(\frak {t}\) in (3.17), specifically that \(\widehat{\frak {t}}>0\) and \(\widehat{\frak {t}}(y)>2/5\) for |y|<1/2, we have
using Lemma 5.4.
Putting everything into (6.6) and then into (6.4) gives (6.7), using (6.2) and (3.1). □
Next we derive from the above that the same bound holds for \(\mathcal {M}_{N}^{U}\) (most of the time).
Theorem 6.2
There is an η>0 such that the bound (6.7) holds with \(\mathcal {M}_{N}\) replaced by \(\mathcal {M}_{N}^{U}\), except on a set of cardinality \(\ll N^{1-\eta}\).
Proof
Putting (6.6) into (6.4) gives
using (6.8) and (6.2). The rest of the argument is identical to that leading to (3.11). □
This establishes (3.13), and hence completes our Major Arcs analysis; the rest of the paper is devoted to proving (3.14).
7 Minor arcs I: case \(q<Q_{0}\)
We keep all the notation of Sect. 3, our goal in this section being to bound (3.22) and (3.23). First we return to (3.9) and reverse orders of summation, writing
where \(\frak {f}=f-a\) according to (2.14), and we have set
If u is even, then we have
If u is odd, we have
From now on, we focus exclusively on the case u is even, the other case being handled similarly. We first massage \(\widehat{\mathcal {R}}_{f,u}\) further.
Since f is homogeneous quadratic, we have
Hence expressing \(\theta =\frac{r}{q}+\beta \), we will need to write u 2/q as a reduced fraction; to this end, introduce the notation
so that \(u^{2}/q=u_{0}/q_{0}\) in lowest terms, \((u_{0},q_{0})=1\).
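Concretely, this reduction divides both \(u^{2}\) and q by their greatest common divisor. A minimal Python sketch, with a hypothetical function name and assuming that (7.3) defines \(u_{0}\) and \(q_{0}\) in exactly this way:

```python
from math import gcd


def reduce_u2_over_q(u, q):
    """Return (u0, q0) with u*u/q == u0/q0 and gcd(u0, q0) == 1."""
    g = gcd(u * u, q)
    return (u * u) // g, q // g


if __name__ == "__main__":
    # u = 6, q = 8: 36/8 reduces to 9/2.
    print(reduce_u2_over_q(6, 8))
```

Note that \(q_{0}\) divides q, consistent with the earlier remark that \(q_{0}\) will be a divisor of q.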
Lemma 7.1
Recalling the notation (5.25), we have
where we have set
Proof
Returning to (7.2), we have
Apply Poisson summation to the bracketed term above:
Inserting this in the above, the claim follows immediately. □
We are now in position to prove the following
Proposition 7.2
With the above notation,
Proof
By (non)stationary phase (see, e.g., [30, §8.3]), the integral in (7.5) has negligible contribution unless
so the n,m sum can be restricted to
Here we used \(|\beta|\ll(qM)^{-1}\) with M given by (3.15). In this range, stationary phase gives
using (2.16) and (3.4) that \(|\operatorname{discr}(f)|=4|B^{2}-AC|=4 a^{2}\gg T^{2}\).
Putting (7.7), (7.8) and (5.26) into (7.4), we have
from which the claim follows, using (7.3). □
Finally, we prove the desired estimates of the strength (3.14).
Theorem 7.3
Recall the integrals \(\mathcal {I}_{Q_{0},K_{0}},\ \mathcal {I}_{Q_{0}}\) from (3.22), (3.23). There is an η>0 so that
as N→∞.
Proof
We first handle \(\mathcal {I}_{Q_{0},K_{0}}\). Returning to (7.1) and applying (7.6) gives
Inserting this into (3.22) and using (6.2), (6.3) gives
Next we handle
which is again a power savings. □
8 Minor arcs II: case \(Q_{0}\le Q<X\)
Keeping all the notation from the last section, we now turn our attention to the integrals \(\mathcal {I}_{Q}\) in (3.24). It is no longer sufficient just to get cancellation in \(\widehat{\mathcal {R}}_{f,u}\) alone, as in (7.6); we must use the fact that \(\mathcal {I}_{Q}\) is an \(L^{2}\)-norm.
To this end, recall the notation (7.3), and put (7.4) into (7.1), applying Cauchy-Schwarz in the u-variable:
Recall from (2.14) that \(\frak {f}=f-a\). Insert (8.1) into (3.24) and open the square, setting \(\frak {f}'=f'-a'\). This gives
Note that again the sum has split into “modular” and “Archimedean” pieces (collected in brackets, respectively), with the former being exactly equal to \(\mathcal {S}\) in (5.30).
Decompose (8.2) as
where, once \(\frak {f}\) is fixed, we collect \(\frak {f}'\) according to whether a′=a (the “diagonal” case) and the off-diagonal a′≠a.
Lemma 8.1
Assume Q<X. For □∈{=,≠}, we have
Proof
Apply (5.31) and (7.7), (7.8) to (8.2), giving
where we used (7.3). The claim then follows immediately from (3.15) and Q<X. □
We treat \(\mathcal {I}_{Q}^{(=)}, \ \mathcal {I}_{Q}^{(\neq)}\) separately, starting with the former; we give bounds of the quality claimed in (3.14).
Proposition 8.2
There is an η>0 such that
as N→∞.
Proof
From (8.4), we have
Recalling that \(a=a_{\gamma}=\langle e_{1},\gamma v_{0}\rangle\), replace the condition a′=a with \(a'\equiv a(\operatorname {mod}\lfloor Q_{0}\rfloor)\), and apply (5.12):
Then (6.3) and (3.1) imply the claimed power savings. □
Next we turn our attention to \(\mathcal {I}_{Q}^{(\neq)}\), the off-diagonal contribution. We decompose this sum further according to whether gcd(a,a′) is large or not. To this end, introduce a parameter H, which we will eventually set to
where, as in (6.3), the constant η 0>0 comes from Lemma 5.2. Write
corresponding to whether (a,a′)>H or (a,a′)≤H, respectively. We deal first with the large gcd.
Proposition 8.3
There is an η>0 such that
as N→∞.
Proof
Writing (a,a′)=h>H, \(\tilde{q}_{1}=(a^{2},q)\), \(\tilde{q}_{1}'=((a')^{2},q)\), and using (a−a′,q)≤q in (8.4), we have
where we used \([n,m]>(nm)^{1/2}\). Apply (5.12) to the innermost sum, getting
By (8.6) and (6.3), this is a power savings, as claimed. □
Finally, we handle small gcd.
Proposition 8.4
There is an η>0 such that
as N→∞.
Proof
First note that
Write g=(a,q) and g′=(a′,q), and let h=(g,g′); observe then that h∣(a,a′) and h≪Q. Hence we can write \(g=hg_{1}\) and \(g'=hg_{1}'\) so that \((g_{1},g_{1}')=1\). Note also that h∣(a−a′,q), so we can write \((a-a',q)=h\tilde{g}\); thus \(g_{1},g_{1}'\), and \(\tilde{g}\) are pairwise coprime, implying
Then we have
To the last sum, we again apply Lemma 5.2, giving
since \(Q\ge Q_{0}\). By (8.6) and (6.3), this is again a power savings, as claimed. □
Putting together (8.3), (8.5), (8.7), (8.8), and (8.9), we have proved the following
Theorem 8.5
For \(Q_{0}\le Q<X\), there is some η>0 such that
as N→∞.
9 Minor arcs III: case X≤Q<M
In this section, we continue our analysis of \(\mathcal {I}_{Q}\) from (3.24), but now we need different methods to handle the very large Q situation. In particular, the range of x,y in (7.2) is now such that we have incomplete sums, so our first step is to complete them.
To this end, recall the notation (7.3) and introduce
so that, using (5.25), an elementary calculation gives
Put (9.2) into (7.1) and apply Cauchy-Schwarz in the u-variable:
As before, open the square, setting \(\frak {f}'=f'-a'\), and insert the result into (3.24):
Yet again the sum has split into modular and Archimedean components with the former being exactly equal to \(\mathcal {S}\) in (5.30). As before, decompose \(\mathcal {I}_{Q}\) according to the diagonal (a=a′) and off-diagonal terms:
Lemma 9.1
Assume Q≥X. For □∈{=,≠}, we have
Proof
Consider the sum \(\lambda_{f}\) in (9.1). Since \(x,y\asymp X/u\), \(|\beta|<1/(qM)\), \(X\le Q\), and using (3.15), we have that
Hence there is a contribution only if \(nx/q_{0},\,my/q_{0}\ll1\), that is, we may restrict to the range
In this range, we give \(\lambda_{f}\) the trivial bound of \(X^{2}/u^{2}\). Putting this analysis into (9.4), the claim follows. □
We handle the off-diagonal term first.
Proposition 9.2
Assuming X≤Q<M, there is some η>0 such that
as N→∞.
Proof
Since (5.31) is such a large savings in q>X, we can afford to lose in the much smaller variable T. Hence put (5.31) into (9.6), estimating (a−a′,q)≤|a−a′| (since a≠a′):
where we used (7.3), Q<M, and (3.15). Using (3.1) we have that
so together with (6.3), this is clearly a substantial power savings. □
Lastly, we deal with the diagonal term. We no longer save enough from a=a′ alone. But recall that here more cancellation can be gotten from (5.33) in the special case that \(\frak {f}(m,-n)\neq \frak {f}'(m',-n')\). Hence we return to (9.6) and, once n,m, and \(\frak {f}\) are determined, separate n′,m′, and \(\frak {f}'\) into cases corresponding to whether \(\frak {f}(m,-n)=\frak {f}'(m',-n')\) or not. Accordingly, write
We now estimate \(\mathcal {I}_{Q}^{(=,\neq)}\) using the extra cancellation in (5.33).
Proposition 9.3
Assuming Q<XT, there is some η>0 such that
as N→∞.
Proof
Returning to (9.6), apply (5.33):
where we used that \(\frak {f}(m,n)\ll T (UQ/X)^{2}\) and Q<XT. From (3.1), we have
so we have again a power savings, as claimed. □
Lastly, we turn to the case \(\mathcal {I}_{Q}^{(=,=)}\), with \(\frak {f}(m,-n)=\frak {f}'(m',-n')\). We exploit this condition to get savings using (5.47).
Proposition 9.4
Assuming Q<XT, there is some η>0 such that
as N→∞.
Proof
Returning to (9.6), apply (5.31), and (5.47):
From (4.2), this is a power savings. □
Combining (9.5), (9.7), (9.9), (9.10), and (9.12), we have the following
Theorem 9.5
If X≤Q<M, then there is some η>0 so that
as N→∞.
Finally, Theorems 7.3, 8.5, and 9.5 together complete the proof of (3.14), and hence Theorem 1.2 is proved.
References
Berenstein, C.A., Yger, A.: Effective Bezout identities in \(\mathbb{Q}[z_{1},\ldots,z_{n}]\). Acta Math. 166(1–2), 69–120 (1991)
Bernays, P.: Über die Darstellung von positiven, ganzen Zahlen durch die primitiven, binären quadratischen Formen einer nicht quadratischen Diskriminante. PhD thesis, Georg-August-Universität, Göttingen, Germany (1912)
Bourgain, J.: Integral Apollonian circle packings and prime curvatures. J. Anal. Math. 118(1), 221–249 (2012)
Bourgain, J., Fuchs, E.: A proof of the positive density conjecture for integer Apollonian circle packings. J. Am. Math. Soc. 24(4), 945–967 (2011)
Bourgain, J., Gamburd, A.: Expansion and random walks in \(\mathrm{SL}_{d}(\mathbb{Z}/p^{n} \mathbb{Z})\). I. J. Eur. Math. Soc. 10(4), 987–1011 (2008)
Bourgain, J., Gamburd, A.: Uniform expansion bounds for Cayley graphs of \(\mathrm{SL}_{2}(\mathbb{F}_{p})\). Ann. Math. (2) 167(2), 625–642 (2008)
Bourgain, J., Gamburd, A.: Expansion and random walks in \(\mathrm{SL}_{d}(\mathbb {Z}/p^{n}\mathbb{Z})\). II. J. Eur. Math. Soc. 11(5), 1057–1103 (2009). With an appendix by Bourgain
Bourgain, J., Kontorovich, A.: On representations of integers in thin subgroups of SL\((2,{{\bf{Z}}})\). Geom. Funct. Anal. 20(5), 1144–1174 (2010)
Bourgain, J., Kontorovich, A.: On Zaremba’s conjecture (2011). Preprint arXiv:1107.3776
Bourgain, J., Varjú, P.P.: Expansion in \(\mathrm{SL}_{n}(\mathbb{Z}/q\mathbb{Z})\), q arbitrary. Invent. Math. 188(1), 151–173 (2012)
Bourgain, J., Gamburd, A., Sarnak, P.: Affine linear sieve, expanders, and sum-product. Invent. Math. 179(3), 559–644 (2010)
Bourgain, J., Kontorovich, A., Sarnak, P.: Sector estimates for hyperbolic isometries. Geom. Funct. Anal. 20(5), 1175–1200 (2010)
Bourgain, J., Gamburd, A., Sarnak, P.: Generalization of Selberg’s 3/16 theorem and affine sieve. Acta Math. 207, 255–290 (2011)
Breuillard, E., Green, B., Tao, T.: Approximate subgroups of linear groups. Geom. Funct. Anal. 21(4), 774–819 (2011)
Brooks, R.: The spectral geometry of a tower of coverings. J. Differ. Geom. 23(1), 97–107 (1986)
Brooks, R.: The spectral geometry of Riemannian surfaces. In: Monastyrsky, M.I. (ed.) Topology in Molecular Biology. Springer, Berlin (2007)
Burger, M.: Grandes valeurs propres du Laplacien et graphes. In: Séminaire de Théorie Spectrale et Géométrie, No. 4, Année 1985–1986, pp. 95–100. Univ. Grenoble I (1986)
Burger, M.: Petites valeurs propres du Laplacien et topologie de Fell. PhD thesis, EPFL (1986)
Burger, M.: Spectre du Laplacien, graphes et topologie de Fell. Comment. Math. Helv. 63(2), 226–252 (1988)
Cowling, M., Haagerup, U., Howe, R.: Almost \(L^{2}\) matrix coefficients. J. Reine Angew. Math. 387, 97–110 (1988)
Diaconis, P., Saloff-Coste, L.: Comparison techniques for random walk on finite groups. Ann. Probab. 21(4), 2131–2156 (1993)
Fuchs, E.: Arithmetic properties of Apollonian circle packings. Princeton University Thesis (2010)
Fuchs, E., Sanden, K.: Some experiments with integral Apollonian circle packings. Exp. Math. 20(4), 380–399 (2011)
Gelfand, I.M., Graev, M.I., Pjateckii-Shapiro, I.I.: Teoriya Predstavlenii i Avtomorfnye Funktsii. Generalized Functions, vol. 6. Nauka, Moscow (1966)
Good, A.: Local Analysis of Selberg’s Trace Formula. Lecture Notes in Mathematics, vol. 1040. Springer, Berlin (1983)
Graham, R.L., Lagarias, J.C., Mallows, C.L., Wilks, A.R., Yan, C.H.: Apollonian circle packings: number theory. J. Number Theory 100(1), 1–45 (2003)
Graham, R.L., Lagarias, J.C., Mallows, C.L., Wilks, A.R., Yan, C.H.: Apollonian circle packings: geometry and group theory. I. The Apollonian group. Discrete Comput. Geom. 34(4), 547–585 (2005)
Helfgott, H.A.: Growth and generation in \(\mathrm{SL}_{2}(\mathbb{Z}/p\mathbb{Z})\). Ann. Math. (2) 167(2), 601–623 (2008)
Hermann, G.: Die Frage der endlich vielen Schritte in der Theorie der Polynomideale. Math. Ann. 95(1), 736–788 (1926)
Iwaniec, H., Kowalski, E.: Analytic Number Theory. American Mathematical Society Colloquium Publications, vol. 53. American Mathematical Society, Providence (2004)
Kassabov, M., Lubotzky, A., Nikolov, N.: Finite simple groups as expanders. Proc. Natl. Acad. Sci. USA 103(16), 6116–6119 (2006)
Kloosterman, H.D.: On the representation of numbers in the form \(ax^{2}+by^{2}+cz^{2}+dt^{2}\). Acta Math. 49(3–4), 407–464 (1927)
Kontorovich, A., Oh, H.: Apollonian circle packings and closed horospheres on hyperbolic 3-manifolds. J. Am. Math. Soc. 24(3), 603–648 (2011)
Lagarias, J.C., Mallows, C.L., Wilks, A.R.: Beyond the Descartes circle theorem. Am. Math. Mon. 109(4), 338–361 (2002)
Lax, P.D., Phillips, R.S.: The asymptotic distribution of lattice points in Euclidean and non-Euclidean space. J. Funct. Anal. 46, 280–350 (1982)
Masser, D.W., Wüstholz, G.: Fields of large transcendence degree generated by values of elliptic functions. Invent. Math. 72(3), 407–464 (1983)
Matthews, C., Vaserstein, L., Weisfeiler, B.: Congruence properties of Zariski-dense subgroups. Proc. Lond. Math. Soc. 48, 514–532 (1984)
Patterson, S.J.: The limit set of a Fuchsian group. Acta Math. 136, 241–273 (1976)
Pyber, L., Szabó, E.: Growth in finite simple groups of Lie type of bounded rank (2010). Preprint arXiv:1005.1858
Salehi Golsefidy, A., Varjú, P.: Expansion in perfect groups. Geom. Funct. Anal. 22(6), 1832–1891 (2012)
Sarnak, P.: Some Applications of Modular Forms. Cambridge Tracts in Mathematics, vol. 99. Cambridge University Press, Cambridge (1990)
Sarnak, P.: Letter to J. Lagarias. web.math.princeton.edu/sarnak/AppolonianPackings.pdf (2007)
Sarnak, P.: Integral Apollonian packings. Am. Math. Mon. 118(4), 291–306 (2011)
Selberg, A.: On the estimation of Fourier coefficients of modular forms. Proc. Symp. Pure Math. VII, 1–15 (1965)
Shalom, Y.: Bounded generation and Kazhdan’s property (T). Publ. Math. Inst. Hautes Études Sci. 90, 145–168 (1999)
Soddy, F.: The bowl of integers and the hexlet. Nature 139, 77–79 (1937)
Sullivan, D.: Entropy, Hausdorff measures old and new, and limit sets of geometrically finite Kleinian groups. Acta Math. 153(3–4), 259–277 (1984)
Varjú, P.P.: Expansion in \(\mathrm{SL}_{d}(\mathcal{O}_{K}/I)\), I square-free. J. Eur. Math. Soc. 14(1), 273–305 (2012)
Vinogradov, I.: Effective bisector estimate with application to Apollonian circle packings. IMRN (2013). Princeton University Thesis (2012). arXiv:1204.5498v1
Acknowledgements
The authors are grateful to Peter Sarnak for illuminating discussions, and many detailed comments improving the exposition of an earlier version of this paper. We thank Tim Browning, Sam Chow, Hee Oh, Xin Zhang, and the referee for numerous corrections and suggestions.
Additional information
Bourgain is partially supported by NSF grant DMS-0808042.
Kontorovich is partially supported by NSF grants DMS-1209373, DMS-1064214 and DMS-1001252.
Varjú is partially supported by the Simons Foundation and the European Research Council (Advanced Research Grant 267259).
Appendix: Spectral gap for the Apollonian group (by Péter P. Varjú)
P.P. Varjú University of Cambridge, Cambridge CB3 0WA, UK e-mail: pv270@dpmms.cam.ac.uk
In recent years some spectacular advances were made on estimating spectral gaps (to be defined below) of infinite co-volume subgroups of \(\operatorname {SL}(d,\mathbb {Z})\). Bourgain and Gamburd [6] proved uniform spectral gap estimates for Zariski-dense subgroups of \(\operatorname {SL}(2,\mathbb {Z})\) under the additional assumption that the modulus q is prime. One of the crucial ideas in their paper is the application of Helfgott’s triple-product theorem [28]. The result in [6] was generalized in a series of papers [5, 7, 10, 11, 48] and [40]. Some of these require the generalization of [28] obtained independently by Breuillard, Green and Tao [14] and Pyber and Szabó [39].
In particular, Bourgain and Varjú [10, Theorem 1] proved the spectral gap for Zariski-dense subgroups of \(\operatorname {SL}(d, \mathbb {Z})\) without any restriction on the modulus q. Salehi Golsefidy and Varjú [40, Theorem 1] obtained the result for Zariski-dense subgroups of perfect arithmetic groups, but only for square-free q. Unfortunately, these results do not cover Theorem 4.3: the first is not applicable to the Apollonian group, and the second is restricted to square-free moduli.
In this appendix, we present an approach which differs from those discussed above. This is much simpler and probably would give better numerical results, but we do not pursue explicit bounds. However, our method depends on special properties of the Apollonian group and does not apply to general Zariski-dense subgroups.
Recall from Sect. 2 that the preimage of the Apollonian group under the homomorphism
is generated by the matrices
We describe an automorphism of \(\operatorname {SL}(2,\mathbb {Z}[i])\) which transforms the above generators to matrices that will be more convenient to work with. Set . A simple calculation shows that the images of the matrices (A.1) under the map \(g\mapsto A^{-1}gA\) are
We put
These are the image of (A.1) under the product of two isomorphism: first conjugation by A and then multiplication of the off-diagonal elements by −i and i. We denote by \(\bar{\varGamma }\) the group generated by \(\bar{S}=\{\pm \gamma _{1}^{\pm1},\pm \gamma _{2}^{\pm1},\pm \gamma _{3}^{\pm1}\}\). This is isomorphic to the group denoted by the same symbol in the paper.
First we recall two different notions of spectral gap. The notion, “geometric” spectral gap, has already been explained in Sect. 4.2. Recall that for an integer q, \(\bar{\varGamma}(q)\) denotes the kernel of the projection map \(\bar{\varGamma}\to \operatorname {SL}(2,\mathbb {Z}[i]/(q))\). We consider the Laplace Beltrami operator Δ on the hyperbolic orbifolds \(\bar{\varGamma}(q)\backslash \mathbb {H}^{3}\). We denote by λ 0(q)≤λ 1(q) the two smallest eigenvalues of Δ on \(\bar{\varGamma}(q)\backslash \mathbb {H}^{3}\). The geometric spectral gap is an inequality of the form λ 1(q)>λ 0(q)+ε for some ε>0 independent of q.
The other notion, “combinatorial” spectral gap, is defined as follows. Let G be a finite group, and S a symmetric set of generators. Let \(T_{G,S}\) be the Markov operator on the space \(L^{2}(G)\) defined by
for \(f\in L^{2}(G)\) and \(g\in G\). We denote by
the eigenvalues of \(T_{G,S}\) in increasing order.
The operator \(Id-T_{\bar{\varGamma}/\bar{\varGamma}(q)}\) is a discrete analogue of the Laplacian Δ on \(\bar{\varGamma}(q)\backslash \mathbb {H}^{3}\). So by combinatorial spectral gap we mean the inequality
for some ε>0 independent of q. To simplify notation, we will write \(\lambda _{1}'(q)=\lambda _{1}'(\bar{\varGamma}/\bar{\varGamma}(q),\bar{S})\).
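To make the combinatorial notion concrete, the following is a minimal sketch on a toy example, not the Apollonian group: the cyclic group \(\mathbb {Z}/n\mathbb {Z}\) with the symmetric generating set {+1, −1}. It assumes the Markov operator averages f over the neighbours g+s, s∈S, and reads \(\lambda_{1}'\) as the largest eigenvalue below the trivial eigenvalue 1. The names `markov_operator`, `character` and `spectral_gap` are hypothetical helpers introduced for illustration.

```python
import math

def markov_operator(n, generators):
    """Markov operator on functions f: Z/nZ -> R (stored as lists of length n)."""
    def apply(f):
        return [sum(f[(g + s) % n] for s in generators) / len(generators)
                for g in range(n)]
    return apply

def character(n, k):
    """Real part of the k-th character of Z/nZ; an eigenfunction of the operator."""
    return [math.cos(2 * math.pi * k * g / n) for g in range(n)]

def spectral_gap(n):
    """1 minus the largest nontrivial eigenvalue cos(2*pi/n) for S = {+1, -1}."""
    return 1 - max(math.cos(2 * math.pi * k / n) for k in range(1, n))

n = 12
S = [1, -1]  # symmetric generating set
T = markov_operator(n, S)

# Each cosine character is an eigenfunction with eigenvalue cos(2*pi*k/n).
for k in range(n):
    f = character(n, k)
    Tf = T(f)
    eig = math.cos(2 * math.pi * k / n)
    assert all(abs(Tf[g] - eig * f[g]) < 1e-12 for g in range(n))

gaps = [spectral_gap(m) for m in (12, 120, 1200)]
```

In this toy family the gap tends to 0 as n grows, so cyclic groups have no uniform combinatorial spectral gap; a uniform bound over all q is exactly what makes the statement for the Apollonian group nontrivial.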
The relation between the two notions is not just an analogy. It was shown by Brooks [15, Theorem 1] and Burger [17–19] that they are equivalent for the fundamental groups of a family of covers of a compact manifold. The orbifolds \(\bar{\varGamma}(q)\backslash \mathbb {H}^{3}\) are not compact; they even have infinite volume. However, the equivalence can be extended to cover our example, see [13, Theorems 1.2 and 2.1].
We show that the congruence subgroups \(\bar{\varGamma}(q)\) of the Apollonian group have combinatorial spectral gap, which implies Theorem 4.3 in light of [13, Theorems 1.2 and 2.1].
Theorem A.1
Let \(\bar{\varGamma}\) be the Apollonian group and \(\lambda '_{1}(q)\) be as above. There is an absolute constant c>0 such that \(\lambda _{1}'(q)<1-c\) for all q. That is, the Apollonian group has combinatorial spectral gap.
Denote by \(\varGamma_{1}\) and \(\varGamma_{2}\) the groups generated by \(\{\gamma_{1},\gamma_{2}\}\) and \(\{\gamma_{1},\gamma_{3}\}\), respectively. Denote by \({\bf G}_{1}\) and \({\bf G}_{2}\) the Zariski-closures of \(\varGamma_{1}\) and \(\varGamma_{2}\) in \(\operatorname {Res}_{\mathbb {R}|\mathbb {C}} \operatorname {SL}(2,\mathbb {C})\), i.e. in \(\operatorname {SL}(2,\mathbb {C})\) considered as an algebraic group over \(\mathbb {R}\).
As we will see later, \({\bf G}_{1}\) and \({\bf G}_{2}\) are isomorphic to \(\operatorname {SL}(2,\mathbb {R})\). Moreover, \(\varGamma_{1}\) and \(\varGamma_{2}\) are lattices inside them. This feature of the Apollonian group was pointed out by Sarnak [42]. We exploit it heavily in our approach.
Due to a result going back to Selberg [44], \(\varGamma_{1}\) and \(\varGamma_{2}\) have geometric spectral gaps with respect to the congruence subgroups. From here we can deduce the combinatorial spectral gap using Brooks [15, Theorem 1] (see also [16, Theorem 1], where the non-compact case is considered).
We transfer the combinatorial spectral gap property of \(\varGamma_{1}\) and \(\varGamma_{2}\) to the Apollonian group \(\bar{\varGamma}\) and conclude Theorem A.1. This is done in the following two Lemmata:
Lemma A.2
Let G be a finite group and S⊂G a finite symmetric generating set. Let \(G_{1},G_{2},\ldots,G_{k}\) be subgroups of G such that for every g∈G there are \(g_{1}\in G_{1},\ldots,g_{k}\in G_{k}\) such that \(g=g_{1}\cdots g_{k}\). Then
The above Lemma and its proof below are closely related to the well-known fact that if G is generated by S in k steps then one has \(\lambda '_{1}(G,S)\le1-1/(|S|k^{2})\). This can be found for example in [21, Corollary 1 on page 2138]. After circulating an earlier version of this appendix, it was pointed out to me that an idea similar to Lemma A.2 has been used by Sarnak [41, Sect. 2.4], by Shalom [45], and also by Kassabov, Lubotzky and Nikolov [31].
Lemma A.3
Let q≥2 be an integer. Then for every \(g\in\bar{\varGamma}/\bar{\varGamma}(q)\), there are \(g_{1},\ldots, g_{10^{13}}\in \varGamma _{1}/\varGamma _{1}(q)\) and \(h_{1},\ldots, h_{10^{13}}\in \varGamma _{2}/\varGamma _{2}(q)\) such that \(g=g_{1}h_{1}\cdots g_{10^{13}}h_{10^{13}}\).
Lemma A.3 enables us to apply Lemma A.2 with \(k=2\cdot 10^{13}\) and \(G_{i}=\varGamma_{1}/\varGamma_{1}(q)\) for odd i and \(G_{i}=\varGamma_{2}/\varGamma_{2}(q)\) for even i. Now [44] and [16, Theorem 1] provide us with lower bounds on
Therefore Theorem A.1 is proved once the two Lemmata are proved.
Before we proceed with the proofs, we make two remarks. First, we note that instead of [44] we could just as well use [10, Theorem 1]. Second, we suggest that the constant \(10^{13}\) in Lemma A.3 is not optimal. In particular, the argument we present would give 72 if the statement is checked for \(q=2^{7}\cdot 3\), e.g. by a computer program. Certainly there is further room for improvement, but we make no effort to optimize the constants.
Proof of Lemma A.2
Denote by π the regular representation of G, i.e. we write
for \(f\in L^{2}(G)\) and \(g,g_{0}\in G\). Let \(T_{G,S}\) be the Markov operator defined above. Let \(f_{0}\in L^{2}(G)\) be an eigenfunction with \(\|f_{0}\|_{2}=1\) corresponding to \(\lambda _{1}'(G,S)\). It is orthogonal to the constants and
Since f 0 is orthogonal to the constant, we have
Thus there is g 0∈G such that 〈π(g 0)f 0,f 0〉≤0 and hence \(\|\pi(g_{0})f_{0}-f_{0}\|_{2}\ge\sqrt{2}\).
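The choice of \(g_{0}\) can be spelled out as follows. This is a sketch assuming that \(f_{0}\) is real-valued and that π acts by translations, so that \(\sum_{g\in G}\pi(g)f_{0}\) is the constant function with value \(\sum_{h\in G}f_{0}(h)=0\).

```latex
\begin{align*}
\sum_{g\in G}\langle \pi(g)f_0,f_0\rangle
  &=\Bigl\langle \sum_{g\in G}\pi(g)f_0,\,f_0\Bigr\rangle=0
  \quad\Longrightarrow\quad
  \langle \pi(g_0)f_0,f_0\rangle\le 0 \ \text{ for some } g_0\in G,\\
\|\pi(g_0)f_0-f_0\|_2^2
  &=\|\pi(g_0)f_0\|_2^2+\|f_0\|_2^2-2\langle \pi(g_0)f_0,f_0\rangle
  =2-2\langle \pi(g_0)f_0,f_0\rangle\ \ge\ 2 .
\end{align*}
```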
By the hypothesis of the lemma, there are \(g_{i}\in G_{i}\) for \(1\le i\le k\) such that \(g_{0}=g_{1}\cdots g_{k}\). By the triangle inequality, there is some \(1\le i_{0}\le k\) such that
Since π is unitary, we have \(\|f_{0}-\pi(g_{i_{0}})f_{0}\|_{2}\ge\sqrt{2}/k\).
We write f 0=f 1+f 2 such that f 1 is invariant under the elements of \(G_{i_{0}}\) in the regular representation π and f 2 is orthogonal to the space of functions invariant under \(G_{i_{0}}\). Then
Thus \(\|f_{2}\|_{2}\ge1/(\sqrt{2}k)\).
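The two estimates leading to this bound can be written out in full. The sketch below uses only that π is a unitary representation (so \(\pi(g_{1}\cdots g_{k})=\pi(g_{1})\cdots\pi(g_{k})\)) and that \(f_{1}\) is invariant under \(G_{i_{0}}\); the index \(i_{0}\) is the one for which the i-th term of the telescoping sum is largest.

```latex
\begin{align*}
\sqrt{2}\le\|f_0-\pi(g_1\cdots g_k)f_0\|_2
  &\le\sum_{i=1}^{k}\bigl\|\pi(g_1\cdots g_{i-1})\bigl(f_0-\pi(g_i)f_0\bigr)\bigr\|_2
   =\sum_{i=1}^{k}\|f_0-\pi(g_i)f_0\|_2,\\
\frac{\sqrt{2}}{k}\le\|f_0-\pi(g_{i_0})f_0\|_2
  &=\|f_2-\pi(g_{i_0})f_2\|_2\le 2\|f_2\|_2
  \quad\Longrightarrow\quad \|f_2\|_2\ge\frac{1}{\sqrt{2}\,k}.
\end{align*}
```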
Now we can write
Since
we have
We combine (A.3), (A.4) and the estimate on \(\|f_{2}\|_{2}\) and get
which was to be proved. □
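For orientation, here is one way the estimates of this proof can be chained. It is a sketch under assumptions that are not visible in the text above: that \(S\cap G_{i_{0}}\) generates \(G_{i_{0}}\), that \(T_{G,S}=\frac{1}{|S|}\sum_{s\in S}\pi(s)\), and that \(\lambda_{1}'\) denotes the largest nontrivial eigenvalue. The constants in the numbered displays (A.3) and (A.4) may be arranged differently.

```latex
\begin{align*}
\lambda_1'(G,S)=\langle T_{G,S}f_0,f_0\rangle
 &\le \frac{|S\setminus G_{i_0}|}{|S|}
   +\frac{|S\cap G_{i_0}|}{|S|}
    \Bigl(\|f_1\|_2^2+\lambda_1'(G_{i_0},S\cap G_{i_0})\,\|f_2\|_2^2\Bigr)\\
 &= 1-\frac{|S\cap G_{i_0}|}{|S|}
    \bigl(1-\lambda_1'(G_{i_0},S\cap G_{i_0})\bigr)\|f_2\|_2^2\\
 &\le 1-\frac{|S\cap G_{i_0}|}{|S|}\cdot
    \frac{1-\lambda_1'(G_{i_0},S\cap G_{i_0})}{2k^2}.
\end{align*}
```

The first line bounds \(\langle\pi(s)f_{0},f_{0}\rangle\) by 1 for \(s\notin G_{i_{0}}\), the second uses \(\|f_{1}\|_{2}^{2}+\|f_{2}\|_{2}^{2}=1\), and the third uses \(\|f_{2}\|_{2}\ge 1/(\sqrt{2}k)\).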
Now we turn to the proof of Lemma A.3. It will be convenient to write
First we consider the case when q is the power of a prime; the general case will be easy to deduce from this.
Lemma A.4
Let p be a prime and m a positive integer. Then \(A_{10^{13}}(p^{m})=\bar{\varGamma}/\bar{\varGamma}(p^{m})\).
We use different methods when p is 2 or 3 compared to when it is larger. First we consider the latter situation.
Proof of Lemma A.4 for p≥5
It is well-known and easy to check that the group generated by γ 1 and γ 2 is
Thus \(\varGamma _{1}/\varGamma _{1}(p^{m})= \operatorname {SL}(2,\mathbb {Z}/p^{m}\mathbb {Z})\) for p≠2.
By simple calculation:
Since p≠2 we can divide by 2 in the ring \(\mathbb {Z}/p^{m}\mathbb {Z}\), hence for (a,p)=1, the matrices in the above calculation are in \(\varGamma_{1}/\varGamma_{1}(p^{m})\) except for \(\gamma_{3}\). Therefore
Using this, we want to show that
for all \(a\in \mathbb {Z}/p^{m}\mathbb {Z}\). To do this, we need to show that for every element \(x\in \mathbb {Z}/p^{m}\mathbb {Z}\), we can find elements \(a_{1},\ldots, a_{k}\in \mathbb {Z}/p^{m}\mathbb {Z}\) for some \(0\le k\le 4\), such that \(a_{1},\ldots,a_{k}\) are not divisible by p and \(x=a_{1}^{2}+\cdots+a_{k}^{2}\). If m=1, this simply follows from the fact that any positive integer is a sum of at most 4 squares, and the \(a_{i}\) cannot be divisible by p since \(0<a_{i}\le x\le p\) and at least one of the inequalities is strict.
Suppose that m>1, \(x\in \mathbb {Z}/p^{m}\mathbb {Z}\) and \(a_{1}^{2}+\cdots+a_{k}^{2}\equiv x\operatorname {mod}p\) with none of \(a_{1},\ldots,a_{k}\) divisible by p. Then by Hensel’s lemma (recall that p≠2), there is an \(a_{1}'\in \mathbb {Z}/p^{m}\mathbb {Z}\) such that
This proves the claim for arbitrary m≥1.
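The claim about sums of squares is easy to test by brute force for small prime powers. The sketch below is an illustration only, not part of the proof; `unit_square_sums` is a hypothetical helper that collects all residues modulo \(p^{m}\) expressible as a sum of at most four squares of elements not divisible by p (the empty sum gives 0).

```python
def unit_square_sums(p, m, max_terms=4):
    """Residues mod p**m that are sums of at most max_terms squares of units."""
    q = p ** m
    # Squares of the elements of Z/qZ that are not divisible by p.
    squares = {a * a % q for a in range(1, q) if a % p != 0}
    reachable = {0}   # the empty sum (k = 0)
    current = {0}     # sums of exactly j unit squares, starting with j = 0
    for _ in range(max_terms):
        current = {(s + t) % q for s in current for t in squares}
        reachable |= current
    return reachable

# Every residue should be reachable for primes p >= 5.
for p in (5, 7, 11):
    for m in (1, 2):
        assert unit_square_sums(p, m) == set(range(p ** m))
```

For instance, modulo 5 the unit squares are 1 and 4, and already sums of at most two of them cover all residues.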
Multiplying (A.6) by a suitable unipotent element of \(\varGamma_{1}/\varGamma_{1}(p^{m})\), we can get
for \(a\in \mathbb {Z}[i]/(p^{m})\). We can prove the same for the upper triangular unipotents by a very similar argument.
Again, by simple calculation:
This shows that
for all \(a',b',c',d'\in \mathbb {Z}[i]/(p^{m})\), a′d′−b′c′=1, provided c′ is not divisible by a prime above p.
Thus, \(A_{36}(p^{m})\) contains more than half of the group \(\bar{\varGamma}/\bar{\varGamma}(p^{m})\), hence
□
Proof of Lemma A.4 for p=2 and 3
We give the proof for p=2 and then explain the differences for p=3.
We prove by induction the following statement. For every m≥7 and \(g\in\bar{\varGamma}(2^{7})/\bar{\varGamma}(2^{m})\), there are \(g_{1},g_{2},g_{3}\in\varGamma_{1}(2^{2})/\varGamma_{1}(2^{m})\) such that
For m=7 this is clear since we can take \(g_{1}=g_{2}=g_{3}=1\). Now assume that m>7 and the statement holds for m−1. In this proof, we denote by 1 the multiplicative unit (identity matrix) and by 0 the matrix with all entries 0. Let \(g\in\bar{\varGamma}(2^{7})/\bar{\varGamma}(2^{m})\) be arbitrary. By the induction hypothesis, there are \(h_{1},h_{2},h_{3}\in\varGamma_{1}(2^{2})/\varGamma_{1}(2^{m})\) such that
where x can be considered as an element of \(\operatorname {Mat}(2,\mathbb {Z}[i]/(2))\), i.e. a 2×2 matrix with elements in \(\mathbb {Z}[i]/(2)\). Since \(g,h_{1},h_{2},h_{3}\) have determinant 1 and are congruent to the unit element mod 2, x has trace 0.
Now we look for suitable \(x_{1},x_{2},x_{3}\in \operatorname {Mat}(2,\mathbb {Z})\) such that
Moreover, we ensure that \(x_{i}\equiv0\quad \operatorname {mod}2^{m-4}\) and that \(\operatorname {Tr}(x_{i})\equiv0\quad \operatorname {mod}2^{m}\) for all i=1,2,3. Since m≥8, this implies that \(h_{i}+x_{i}\equiv1\quad \operatorname {mod}4\) and \(\det(h_{i}+x_{i})\equiv1 \quad \operatorname {mod}2^{m}\), hence \(h_{i}+x_{i}\in\varGamma_{1}(2^{2})/\varGamma_{1}(2^{m})\). Recall (A.5) from the previous proof. If the matrices \(x_{i}\) satisfy the claimed properties then
The matrices \(x_{1},x_{2},x_{3}\) can be chosen to be suitable linear combinations of the matrices in the following calculations, and this finishes the induction:
We have now shown that
The index of \(\bar{\varGamma}(2^{7})/\bar{\varGamma}(2^{m})\) in \(\bar{\varGamma}/\bar{\varGamma}(2^{m})\) is at most
This shows that
Now we turn to the case p=3. By the same argument, one can show that for every m≥1 and \(g\in\bar{\varGamma}(3)/\bar{\varGamma}(3^{m})\), there are \(g_{1},g_{2},g_{3}\in\varGamma_{1}/\varGamma_{1}(3^{m})\) such that
The only significant difference is that one needs to use the following identities:
Using this claim, one can finish the proof as above. □
Proof of Lemma A.3
Let q be an integer and \(q=p_{1}^{m_{1}}\cdots p_{n}^{m_{n}}\) where the \(p_{i}\) are primes. We prove that
Let \(x\in A_{10^{13}}(p_{1}^{m_{1}})\times\cdots\times A_{10^{13}}(p_{n}^{m_{n}})\) be arbitrary. By definition, for each k, we can find elements \(g_{1}^{(k)},\ldots, g_{10^{13}}^{(k)}\in \varGamma _{1}/\varGamma _{1}(q)\) and \(h_{1}^{(k)},\ldots, h_{10^{13}}^{(k)}\in \varGamma _{2}/\varGamma _{2}(q)\) such that
Since \(\varGamma_{1}/\varGamma_{1}(p^{m})\) and \(\varGamma_{2}/\varGamma_{2}(p^{m})\) are the direct product of local factors, we can find elements \(g_{1},\ldots, g_{10^{13}}\in \varGamma _{1}/\varGamma _{1}(p^{m})\) and \(h_{1},\ldots, h_{10^{13}}\in \varGamma _{2}/\varGamma _{2}(p^{m})\) such that
for each i and k. Thus
Using Lemma A.4 we get
Obviously
hence all these containments must be equalities. □
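The gluing of local factors used in this proof is the Chinese remainder theorem applied entrywise. The following is a minimal sketch of that mechanism, under simplifying assumptions: matrices have integer entries (rather than entries in \(\mathbb {Z}[i]\)) and are stored as 4-tuples (a, b, c, d), and the functions `crt`, `glue` and `matmul` are hypothetical helpers, not notation from the paper.

```python
from math import prod

def crt(residues, moduli):
    """Return x mod prod(moduli) with x = residues[i] mod moduli[i].

    The moduli are assumed to be pairwise coprime.
    """
    q = prod(moduli)
    x = 0
    for r, n in zip(residues, moduli):
        cofactor = q // n
        # cofactor is invertible mod n because the moduli are coprime.
        x += r * cofactor * pow(cofactor, -1, n)
    return x % q

def glue(local_matrices, moduli):
    """Glue 2x2 matrices given modulo each prime power into one matrix mod q."""
    return tuple(crt([mat[j] for mat in local_matrices], moduli)
                 for j in range(4))

def matmul(x, y, q):
    """Product of two 2x2 matrices (as 4-tuples) modulo q."""
    a, b, c, d = x
    e, f, g, h = y
    return ((a * e + b * g) % q, (a * f + b * h) % q,
            (c * e + d * g) % q, (c * f + d * h) % q)

moduli = [2 ** 7, 3, 5 ** 2]
q = prod(moduli)
local_g = [(1, 2, 0, 1), (1, 1, 0, 1), (2, 1, 1, 1)]
local_h = [(1, 0, 2, 1), (1, 0, 1, 1), (1, 3, 0, 1)]

g = glue(local_g, moduli)
h = glue(local_h, moduli)
gh = matmul(g, h, q)

# Reducing the glued product recovers each local product.
for i, n in enumerate(moduli):
    assert tuple(e % n for e in gh) == matmul(local_g[i], local_h[i], n)
```

Because multiplication commutes with reduction modulo each prime power, a product of glued elements realizes the prescribed local products simultaneously, which is the step the proof relies on.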
Bourgain, J., Kontorovich, A. On the local-global conjecture for integral Apollonian gaskets. Invent. math. 196, 589–650 (2014). https://doi.org/10.1007/s00222-013-0475-y