On the local-global conjecture for integral Apollonian gaskets

Bourgain, Jean; Kontorovich, Alex

doi:10.1007/s00222-013-0475-y

With an appendix by Péter P. Varjú

Published: 10 July 2013

Inventiones mathematicae, Volume 196, pages 589–650 (2014)

Abstract

We prove that a set of density one satisfies the local-global conjecture for integral Apollonian gaskets. That is, for a fixed integral, primitive Apollonian gasket, almost every (in the sense of density) admissible (passing local obstructions) integer is the curvature of some circle in the gasket.

1 Introduction

1.1 The local-global conjecture

Let $\mathscr{G}$ be an Apollonian gasket, see Fig. 1. The number b(C) shown inside a circle $C\in \mathscr{G}$ is its curvature, that is, the reciprocal of its radius (the bounding circle has negative orientation). Soddy [46] first observed the existence of integral gaskets $\mathscr{G}$, meaning ones for which $b(C)\in \mathbb {Z}$ for all $C\in \mathscr{G}$. Let

$$ \mathscr{B}=\mathscr{B}_{\mathscr{G}}:=\bigl\{b(C):C\in \mathscr{G}\bigr\} $$

be the set of all curvatures in $\mathscr{G}$. We call a gasket primitive if $\gcd(\mathscr{B})=1$. From now on, we restrict our attention to a fixed primitive integral Apollonian gasket $\mathscr{G}$.

Graham, Lagarias, Mallows, Wilks, and Yan [26, 34] initiated a detailed study of Diophantine properties of $\mathscr{B}$, with two separate families of problems (see also e.g. [23, 33, 43]): studying $\mathscr{B}$ with multiplicity (that is, studying circles), or without multiplicity (studying the integers which arise). In the present paper, we are concerned with the latter.

In particular, the following striking local-to-global conjecture for $\mathscr{B}$ is given in [26, p. 37], [23]. Let $\mathscr{A}=\mathscr{A}_{\mathscr{G}}$ denote the admissible integers, that is, those passing all local (congruence) obstructions:

$$ \mathscr{A}:=\bigl\{n\in \mathbb {Z}: n\in \mathscr{B}\ (\operatorname {mod}q)\ \text{for all } q\ge1\bigr\}. $$

Conjecture 1.1

(Local-Global Conjecture)

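Fix a primitive, integral Apollonian gasket $\mathscr{G}$, with curvature set $\mathscr{B}$ and admissible set $\mathscr{A}$. Then every sufficiently large admissible number is the curvature of a circle in $\mathscr{G}$. That is, if $n\in \mathscr{A}$ and n≫1, then $n\in \mathscr{B}$.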

The purpose of this paper is to prove the following

Theorem 1.2

Almost every admissible number is the curvature of a circle in the gasket $\mathscr{G}$. Quantitatively, the number of exceptions up to N is bounded by $O(N^{1-\eta })$, where η>0 is effectively computable.

Admissibility is completely explained in Fuchs’s thesis [22], and is a condition restricting to certain residue classes modulo 24, cf. Lemma 2.3. E.g. for the gasket in Fig. 1, an integer n is admissible ($n\in \mathscr{A}$) iff

$$ n\equiv0, 4, 12, 13, 16,\ \text{or}\ 21\quad (\operatorname {mod}24). $$

(1.1)

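Thus the admissible set $\mathscr{A}$ contains one of every four numbers (six admissible residue classes out of 24), and Theorem 1.2 can be restated in this case as

$$ \#\bigl(\mathscr{B}\cap [1,N]\bigr)=\frac{N}{4}+O\bigl(N^{1-\eta }\bigr), $$

where $\mathscr{B}$ is the set of curvatures of the gasket.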

In general, the local obstructions are easily determined (see Remark 2.4) from the so-called root quadruple

$$ v_{0}=v_{0}(\mathscr{G}), $$

(1.2)

which is the column vector of the four smallest curvatures in the gasket $\mathscr{G}$. For the gasket in Fig. 1, $v_{0}=(-11,21,24,28)^{t}$.
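As a purely numerical illustration of Theorem 1.2 (it plays no role in the proof), the following Python sketch enumerates the curvatures of the gasket of Fig. 1 up to a bound and lists the admissible integers that are missed. It assumes the classical fact, recalled in Sect. 2.1, that replacing one curvature a of a Descartes quadruple by 2(sum of the other three) − a gives the other circle tangent to the same three. The function names `gasket_curvatures` and `exceptions` are ours and purely illustrative.

```python
# Admissible residues modulo 24 for the gasket with root quadruple (-11, 21, 24, 28), see (1.1).
ADMISSIBLE_MOD_24 = {0, 4, 12, 13, 16, 21}


def gasket_curvatures(root, bound):
    """Set of curvatures <= bound of the gasket with the given root quadruple."""
    curvatures = set(root)
    seen = {tuple(root)}
    stack = [tuple(root)]
    while stack:
        quad = stack.pop()
        total = sum(quad)
        for j in range(4):
            # Swap the j-th circle for the other circle tangent to the remaining three.
            new = 2 * (total - quad[j]) - quad[j]
            # Only move "outward" (to larger curvature) and stay below the bound.
            if quad[j] < new <= bound:
                child = quad[:j] + (new,) + quad[j + 1:]
                if child not in seen:
                    seen.add(child)
                    curvatures.add(new)
                    stack.append(child)
    return curvatures


def exceptions(root, bound):
    """Admissible integers in [1, bound] that are not curvatures in the gasket."""
    found = gasket_curvatures(root, bound)
    return [n for n in range(1, bound + 1)
            if n % 24 in ADMISSIBLE_MOD_24 and n not in found]
```

For instance, `exceptions((-11, 21, 24, 28), 100)` begins with 4, 12, 13, 16, since no circle inside the bounding circle has curvature below 21. The theorem asserts that the length of this list grows like a power of the bound strictly smaller than one.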

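The history of this problem is as follows. The first progress towards the Conjecture was already made in [26], where it was shown that the curvature set $\mathscr{B}$ satisfies

$$ \#\bigl(\mathscr{B}\cap [1,N]\bigr)\gg N^{1/2}. $$

(1.3)

Sarnak [42] improved this to

$$ \#\bigl(\mathscr{B}\cap [1,N]\bigr)\gg \frac{N}{(\log N)^{1/2}}, $$

(1.4)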

and then Fuchs [22] showed

$$ \#\bigl(\mathscr{B}\cap [1,N]\bigr)\gg \frac{N}{(\log N)^{0.150\ldots}}. $$

Finally Bourgain and Fuchs [4] settled the so-called “Positive Density Conjecture,” that

$$ \#\bigl(\mathscr{B}\cap [1,N]\bigr)\gg N. $$

1.2 Methods

Our main approach is through the Hardy-Littlewood circle method, combining two new ingredients. The first, applied to the major arcs, is effective bisector counting in infinite volume hyperbolic 3-folds, recently achieved by I. Vinogradov [49], as well as the uniform spectral gap over congruence towers of such, see the Appendix by Péter Varjú. The second ingredient is the minor arcs analysis, inspired by that given recently by the first-named author in [3], where it was proved that the prime curvatures in a gasket constitute a positive proportion of the primes. (Obviously Theorem 1.2 implies that 100 % of the admissible prime curvatures appear.)

1.3 Plan for the paper

A more detailed outline of the proof, as well as the setup of some relevant exponential sums, is given in Sect. 3. Before we can do this, we need to recall the Apollonian group and some of its subgroups in Sect. 2. After the outline in Sect. 3, we use Sect. 4 to collect some background from the spectral and representation theory of infinite volume hyperbolic quotients. Then some lemmata are reserved for Sect. 5, the major arcs are estimated in Sect. 6, and the minor arcs are dealt with in Sects. 7–9. The Appendix, by Péter Varjú, extracts the spectral gap property for the Apollonian group from that of its arithmetic subgroups.

1.4 Notation

We use the following standard notation. Set $e(x)=e^{2\pi ix}$ and $e_{q}(x)=e(\frac{x}{q})$. We use f≪g and f=O(g) interchangeably; moreover f≍g means f≪g≪f. Unless otherwise specified, the implied constants may depend at most on the gasket $\mathscr{G}$ (or equivalently on the root quadruple $v_{0}$), which is treated as fixed. The symbol $\boldsymbol {1}_{\{\cdot \}}$ is the indicator function of the event {⋅}. The greatest common divisor of n and m is written (n,m), their least common multiple is [n,m], and ω(n) denotes the number of distinct prime factors of n. The cardinality of a finite set S is denoted |S| or #S. The transpose of a matrix g is written $g^{t}$. The prime symbol ′ in $\sum_{r(q)}'$ means the range of $r(\operatorname {mod}q)$ is restricted to (r,q)=1. Finally, $p^{j}\|q$ denotes $p^{j}\mid q$ and $p^{j+1}\nmid q$.

2 Preliminaries I: the Apollonian group and its subgroups

2.1 Descartes theorem and consequences

Descartes’ Circle Theorem states that a quadruple v of (oriented) curvatures of four mutually tangent circles lies on the cone

$$ F(v)=0, $$

(2.1)

where F is the Descartes quadratic form:

$$ F(a,b,c,d) = 2\bigl(a^{2}+b^{2}+c^{2}+d^{2} \bigr) -(a+b+c+d)^{2} . $$

(2.2)

Note that F has signature (3,1) over $\mathbb {R}$, and let

$$G:=\operatorname {SO}_{F}(\mathbb {R})=\bigl\{g\in \operatorname {SL}(4,\mathbb {R}):F(g v)=F(v),\text{ for all }v\in \mathbb {R}^{4}\bigr\} $$

be the real special orthogonal group preserving F.

It follows immediately that for b,c and d fixed, there are two solutions a,a′ to (2.1), and

$$a+a'=2(b+c+d). $$
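The two roots can be made explicit: viewing (2.1) as a quadratic in a, the quadratic formula gives a, a′ = (b+c+d) ± 2√(bc+cd+db). The following minimal Python sketch evaluates the Descartes form and these roots. The function names are ours, and the integrality check assumes integer input.

```python
from math import isqrt


def descartes_form(a, b, c, d):
    """The Descartes quadratic form F of (2.2)."""
    return 2 * (a * a + b * b + c * c + d * d) - (a + b + c + d) ** 2


def fourth_curvatures(b, c, d):
    """Both integer solutions a <= a' of F(a, b, c, d) = 0 for fixed b, c, d."""
    disc = b * c + c * d + d * b
    root = isqrt(disc)  # raises ValueError if disc < 0
    if root * root != disc:
        raise ValueError("bc + cd + db is not a perfect square")
    s = b + c + d
    return s - 2 * root, s + 2 * root
```

With the three curvatures 21, 24, 28 of Fig. 1 this returns −11 (the bounding circle) and 157, whose sum is 2(21+24+28)=146, as predicted.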

Hence we observe that a can be changed into a′ by a reflection, that is,

$$(a,b,c,d)^{t}=S_{1}\cdot\bigl(a',b,c,d \bigr)^{t}, $$

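where the reflections

$$ S_{1}= \left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c} -1 & 2 & 2 & 2 \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{array} \right ) ,\quad S_{2}= \left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c} 1 & & & \\ 2 & -1 & 2 & 2 \\ & & 1 & \\ & & & 1 \end{array} \right ) ,\quad S_{3}= \left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c} 1 & & & \\ & 1 & & \\ 2 & 2 & -1 & 2 \\ & & & 1 \end{array} \right ) ,\quad S_{4}= \left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c} 1 & & & \\ & 1 & & \\ & & 1 & \\ 2 & 2 & 2 & -1 \end{array} \right ) $$

generate the so-called Apollonian group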

$$ \mathcal {A}= \langle S_{1},S_{2},S_{3},S_{4} \rangle . $$

(2.3)

It is a Coxeter group, free except for the relations $S_{j}^{2}=I$, 1≤j≤4. We immediately pass to the index two subgroup

$$\varGamma :=\mathcal {A}\cap \operatorname {SO}_{F} $$

of orientation preserving transformations, that is, even words in the generators. Then Γ is freely generated by $S_{1}S_{2}$, $S_{2}S_{3}$ and $S_{3}S_{4}$. It is known that Γ is Zariski dense in G but thin, that is, of infinite index in $G(\mathbb {Z})$; equivalently, the Haar measure of Γ∖G is infinite.
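The basic properties of the generators are easy to confirm by machine. The sketch below assumes the standard form of the reflections (the identity matrix with row j replaced by (2,2,2,2), except for −1 on the diagonal), which is the form dictated by the relation a+a′=2(b+c+d). Indices are zero-based, and all helper names are ours.

```python
def reflection(j):
    """Matrix S_{j+1}: the identity with row j replaced by (2, 2, 2, 2) and -1 at (j, j)."""
    S = [[int(r == c) for c in range(4)] for r in range(4)]
    S[j] = [2, 2, 2, 2]
    S[j][j] = -1
    return S


def matmul(A, B):
    """Product of two 4x4 integer matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(M, v):
    """Matrix M acting on the column vector v."""
    return [sum(M[i][k] * v[k] for k in range(4)) for i in range(4)]


def descartes_form(v):
    """The Descartes quadratic form F of (2.2), evaluated on a quadruple."""
    return 2 * sum(x * x for x in v) - sum(v) ** 2
```

One can check in this way that each generator is an involution preserving F on all of Z⁴ (not only on the cone F=0), and that the even word S₄S₃ agrees with the matrix C₁ displayed in (2.5).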

2.2 Arithmetic subgroups

Now we review the arguments from [26, 42] which lead to (1.3) and (1.4), as our setup depends critically on them.

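Recall that for any fixed gasket $\mathscr{G}$, there is a root quadruple $v_{0}$ of the four smallest curvatures in $\mathscr{G}$, cf. (1.2). It follows from (2.1) and (2.3) that the set $\mathscr{B}$ of all curvatures can be realized as the orbit of the root quadruple $v_{0}$ under $\mathcal {A}$. Let

$$ \mathscr{O}=\mathscr{O}_{\mathscr{G}}:=\varGamma \cdot v_{0} $$

be the orbit of $v_{0}$ under Γ. Then the set of all curvatures certainly contains

$$ \mathscr{B}\supset \bigcup_{j=1}^{4}\langle e_{j},\mathscr{O}\rangle =\bigcup_{j=1}^{4}\bigcup_{\gamma \in \varGamma }\langle e_{j},\gamma \cdot v_{0}\rangle , $$

(2.4)

where $e_{1}=(1,0,0,0)^{t},\ldots ,e_{4}=(0,0,0,1)^{t}$ constitute the standard basis for $\mathbb {R}^{4}$, and the inner product above is the standard one. Recall we are treating $\mathscr{B}$ as a set, that is, without multiplicities.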

It was observed in [26] that Γ contains unipotent elements, and hence one can use these to furnish an injection of affine space in the otherwise intractable orbit $\mathscr{O}=\varGamma \cdot v_{0}$, as follows. Note first that

$$ C_{1}:= S_{4}S_{3} = \left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c} 1 & & & \\ & 1 & & \\ 2 & 2 & -1 & 2 \\ 6 & 6 & -2 & 3 \end{array} \right ) \in \varGamma , $$

(2.5)

and after conjugation by

$$J:= \left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c} 1 & & & \\ -1 & 1 & & \\ -1 & 1 & -2 & 1 \\ -1 & & & 1 \end{array} \right ) , $$

we have

$$\tilde{C}_{1}:= J^{-1}\cdot C_{1}\cdot J = \left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c} 1 & & & \\ & 1 & & \\ & 2 & 1 & \\ & 4 & 4 & 1 \end{array} \right ) . $$

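Recall the spin homomorphism $\rho: \operatorname {SL}_{2}\to \operatorname {SO}(2,1)$, embedded for our purposes in $\operatorname {SL}_{4}$, given explicitly by

$$ \rho: \left ( \begin{array}{c@{\quad}c} \alpha &\beta \\ \gamma &\delta \end{array} \right ) \mapsto \frac{1}{\alpha \delta -\beta \gamma } \left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c} 1 & & & \\ & \alpha ^{2} & 2\alpha \gamma & \gamma ^{2} \\ & \alpha \beta & \alpha \delta +\beta \gamma & \gamma \delta \\ & \beta ^{2} & 2\beta \delta & \delta ^{2} \end{array} \right ) . $$

(2.6)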

In fact $\operatorname {SL}_{2}$ is a double cover of $\operatorname {SO}(2,1)$ under ρ, with kernel ±I. It is clear from inspection that

$$\rho: \left ( \begin{array}{c@{\quad}c} 1&2\\ 0&1 \end{array} \right )=:T_{1} \mapsto \tilde{C}_{1} . $$

Since $C_{1}\in \varGamma $, for each $n\in \mathbb {Z}$, Γ contains the element

$$C_{1}^{n}= J\cdot\rho\bigl(T_{1}^{n} \bigr)\cdot J^{-1} = \left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 4 n^2-2 n & 4 n^2-2 n & 1-2 n & 2 n \\ 4 n^2+2 n & 4 n^2+2 n & -2 n & 2 n+1 \end{array} \right ). $$

(Of course this can be seen directly from (2.5); these transformations will be more enlightening below.)
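These matrix identities can be confirmed mechanically. In the sketch below, `rho` is the symmetric-square formula that sends T₁ and T₂ to the matrices C̃₁ and C̃₂ of the text; we treat this explicit form as an assumption, reconstructed from those two values. To stay within the integers, the conjugation is checked in the inverse-free form C₁ⁿ·J = J·ρ(T₁ⁿ). All function names are ours.

```python
def matmul(A, B):
    """Product of two integer matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matpow(A, n):
    """A**n for an integer n >= 0, by repeated multiplication."""
    result = [[int(i == j) for j in range(len(A))] for i in range(len(A))]
    for _ in range(n):
        result = matmul(result, A)
    return result


C1 = [[1, 0, 0, 0], [0, 1, 0, 0], [2, 2, -1, 2], [6, 6, -2, 3]]   # (2.5)
J = [[1, 0, 0, 0], [-1, 1, 0, 0], [-1, 1, -2, 1], [-1, 0, 0, 1]]


def rho(alpha, beta, gamma, delta):
    """Assumed explicit spin map on a determinant-one matrix [[alpha, beta], [gamma, delta]]."""
    return [[1, 0, 0, 0],
            [0, alpha * alpha, 2 * alpha * gamma, gamma * gamma],
            [0, alpha * beta, alpha * delta + beta * gamma, gamma * delta],
            [0, beta * beta, 2 * beta * delta, delta * delta]]


def c1_power(n):
    """Closed form for C1**n stated in the text (valid for every integer n)."""
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [4 * n * n - 2 * n, 4 * n * n - 2 * n, 1 - 2 * n, 2 * n],
            [4 * n * n + 2 * n, 4 * n * n + 2 * n, -2 * n, 2 * n + 1]]
```

Here T₁ⁿ is the matrix with rows (1, 2n) and (0, 1), so the relevant call is `rho(1, 2 * n, 0, 1)`.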

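Thus if $v=(a,b,c,d)^{t}$ is a quadruple in the orbit $\mathscr{O}=\varGamma \cdot v_{0}$, then $\mathscr{O}$ also contains $C_{1}^{n}\cdot v $ for all n. From (2.4), we then have that the set $\mathscr{B}$ of curvatures contains

$$ \langle e_{4},C_{1}^{n}\cdot v\rangle =4(a+b)n^{2}+2(a+b-c+d)n+d . $$

(2.7)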

The circles thus generated are all tangent to two fixed circles, which explains the square curvatures in Fig. 2. Of course (2.7) immediately implies (1.3).

Observe further that

$$C_{2} := S_{2}S_{3} = \left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c} 1 & & & \\ 6 & 3 & -2 & 6 \\ 2 & 2 & -1 & 2 \\ & & & 1 \end{array} \right ) $$

is another unipotent element, with

$$\tilde{C}_{2}:= J^{-1}\cdot C_{2}\cdot J = \left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c} 1 & & & \\ & 1 & 4 & 4 \\ & & 1 & 2 \\ & & & 1 \end{array} \right ) , $$

and

$$\rho: \left ( \begin{array}{c@{\quad}c} 1&0\\ 2&1 \end{array} \right ) =:T_{2} \mapsto \tilde{C}_{2} . $$

Since T ₁ and T ₂ generate Λ(2), the principal 2-congruence subgroup of $\operatorname {PSL}(2,\mathbb {Z})$, we see that the Apollonian group Γ contains the subgroup

$$ \varXi:= \langle C_{1},C_{2}\rangle = J\cdot \rho \bigl( \varLambda (2) \bigr) \cdot J^{-1} < \varGamma . $$

(2.8)

In particular, whenever (2x,y)=1, there is an element

$$\left ( \begin{array}{c@{\quad}c} *&{2x}\\ *&{y} \end{array} \right ) \in \varLambda (2) , $$

and thus Ξ contains the element

$$\begin{aligned} \xi_{x,y} :=& J\cdot \rho \left ( \begin{array}{c@{\quad}c} *&{2x}\\ *&{y} \end{array} \right ) \cdot J^{-1} \\ =& \left ( \begin{array}{c@{\quad}c@{\quad}c@{\quad}c} 1 & 0 & 0 & 0 \\ *&*&*&*\\ *&*&*&*\\ 4x ^2+2xy +y ^2-1 & 4x ^2+2xy &-2xy & 2xy+ y ^2 \end{array} \right ) . \end{aligned}$$

(2.9)

Write

$$\begin{aligned} w_{x,y} =&\xi_{x,y}^{t}\cdot e_{4} \\ =& \bigl(4x^{2}+2xy+y^{2}-1,4x^{2}+2xy,-2xy,2xy+y^{2} \bigr)^{t}. \end{aligned}$$

(2.10)

Then again by (2.4), we have shown the following

Lemma 2.1

([42])

Let $x,y\in \mathbb {Z}$ with (2x,y)=1, and take any element γ∈Γ with corresponding quadruple

$$ v_{\gamma }=(a_{\gamma },b_{\gamma },c_{\gamma },d_{\gamma })^{t}=\gamma \cdot v_{0} . $$

(2.11)

Then the number

$$ \langle e_{4}, \xi_{x,y}\cdot \gamma \cdot v_{0}\rangle = \langle w_{x,y},\gamma \cdot v_{0}\rangle =4A_{\gamma }x^{2} +4B_{\gamma }xy +C_{\gamma }y^{2} -a _{\gamma } $$

(2.12)

is the curvature of some circle in the gasket $\mathscr{G}$, where we have defined

$$\begin{aligned} A_{\gamma } :=& a_{\gamma }+b_{\gamma }, \\ B_{\gamma } :=& {a _{\gamma } + b _{\gamma } - c _{\gamma } + d _{\gamma }\over2}, \\ C_{\gamma } :=& a_{\gamma } + d_{\gamma } . \end{aligned}$$

(2.13)

Note from (2.1) that $B_{\gamma }$ is integral.

Observe that, by construction, the value of $a_{\gamma }$ is unchanged under the orbit of the group (2.8), and the circles whose curvatures are generated by (2.12) are all tangent to the circle corresponding to $a_{\gamma }$. It is classical (see [2]) that the number of distinct primitive values up to N assumed by a positive-definite binary quadratic form is of order at least $N(\log N)^{-1/2}$, proving (1.4).
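Lemma 2.1 can be tested numerically for γ=I and the root quadruple of Fig. 1, where (2.12) reads 40x²+28xy+17y²+11. The sketch below compares these values with a brute-force list of curvatures; the enumeration assumes only the reflection rule of Sect. 2.1, and the function names and the search box are our own choices.

```python
from math import gcd


def gasket_curvatures(root, bound):
    """Set of curvatures <= bound of the gasket with the given root quadruple."""
    curvatures = set(root)
    seen = {tuple(root)}
    stack = [tuple(root)]
    while stack:
        quad = stack.pop()
        total = sum(quad)
        for j in range(4):
            # Reflection S_j: a -> 2 * (sum of the other three) - a.
            new = 2 * (total - quad[j]) - quad[j]
            if quad[j] < new <= bound:
                child = quad[:j] + (new,) + quad[j + 1:]
                if child not in seen:
                    seen.add(child)
                    curvatures.add(new)
                    stack.append(child)
    return curvatures


def shifted_form_values(quad, bound, box):
    """Values <= bound of 4A x^2 + 4B xy + C y^2 - a over |x|, |y| <= box with (2x, y) = 1."""
    a, b, c, d = quad
    A, two_B, C = a + b, a + b - c + d, a + d   # see (2.13); two_B equals 2B
    values = set()
    for x in range(-box, box + 1):
        for y in range(-box, box + 1):
            if gcd(2 * x, y) == 1:
                n = 4 * A * x * x + 2 * two_B * x * y + C * y * y - a
                if n <= bound:
                    values.add(n)
    return values
```

For example (x,y)=(0,1), (1,−1), (1,1) give the curvatures 28, 40, 96 of circles tangent to the bounding circle.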

To fix notation, we define the binary quadratic form appearing in (2.12) and its shift by

$$ f_{\gamma }(x,y) := A_{\gamma }x^{2}+2B_{\gamma } xy+C_{\gamma }y^{2} , \qquad \frak {f}_{\gamma }(x,y):= f_{\gamma }(x,y) -a_{\gamma } , $$

(2.14)

so that

$$ \langle w_{x,y}, \gamma \cdot v_{0}\rangle = \frak {f}_{\gamma }(2x,y) . $$

(2.15)

Note from (2.13) and (2.1) that the discriminant of f _γ is

$$ \Delta _{\gamma } = 4\bigl(B_{\gamma }^{2}-A_{\gamma }C_{\gamma } \bigr) = -4 a_{\gamma }^2 . $$

(2.16)

When convenient, we will drop the subscripts γ in all the above.
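The identity (2.16) is equivalent to F(v)=0, since expanding gives 4(B²−AC+a²)=F(a,b,c,d). A short Python check on a few quadruples of the orbit, with helper names of our own choosing:

```python
def form_coefficients(quad):
    """The coefficients (A, B, C) of (2.13) for a Descartes quadruple (a, b, c, d)."""
    a, b, c, d = quad
    two_B = a + b - c + d
    if two_B % 2:
        raise ValueError("a + b - c + d is odd, so B is not integral")
    return a + b, two_B // 2, a + d


def discriminant(quad):
    """Discriminant 4(B^2 - AC) of the binary form f of (2.14)."""
    A, B, C = form_coefficients(quad)
    return 4 * (B * B - A * C)
```

For the root quadruple (−11,21,24,28) one finds (A,B,C)=(10,7,17) and discriminant −484=−4·11².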

2.3 Congruence subgroups

For each q≥1, define the “principal” q-congruence subgroup

$$ \varGamma (q) := \bigl\{ \gamma \in \varGamma : \gamma \equiv I(\operatorname {mod}q) \bigr\} . $$

(2.17)

These groups all have infinite index in $G(\mathbb {Z})$, but finite index in Γ. The quotients Γ/Γ(q) were determined completely by Fuchs [22], using an explicit Strong Approximation theorem (see [37]), Goursat’s Lemma, and other ingredients, as we explain below. Since G does not itself have the Strong Approximation Property, we pass to its connected spin double cover $\operatorname {SL}_{2}(\mathbb {C})$. We will need the covering map explicitly later, so we record it here.

First change variables from the Descartes form F to

$$\tilde{F}(x,y,z,w):=xw+y^{2}+z^{2}. $$

Then there is a homomorphism $\iota_{0}: \operatorname {SL}(2,\mathbb {C})\to \operatorname {SO}_{\tilde{F}}(\mathbb {R})$, sending

$$g= \left ( \begin{array}{c@{\quad}c} {a+\alpha i}&{b+\beta i}\\ {c+\gamma i}&{d+\delta i} \end{array} \right ) \in \operatorname {SL}(2,\mathbb {C}) $$

to

$$\begin{aligned} &{1\over|\det(g)|^{2}} \\ &{}\times \left ( \begin{array}{c@{\ \ }c@{\ \ }c@{\ \ }c} a^2+\alpha^2 & 2 (a c+\alpha \gamma) & 2 (c \alpha- a \gamma) & -c^2-\gamma^2 \\ a b+\alpha \beta & b c+a d+\beta \gamma+\alpha \delta & d \alpha +c \beta-b \gamma-a \delta & -c d-\gamma \delta \\ a \beta-b \alpha & -d \alpha+c \beta-b \gamma+a \delta & -b c+a d-\beta \gamma+\alpha \delta & d \gamma-c \delta \\ -b^2-\beta^2 & -2 (b d+\beta \delta) & 2( b \delta-d \beta) & d^2+\delta^2 \end{array} \right ) . \end{aligned}$$

To map from $\operatorname {SO}_{\tilde{F}}$ to $\operatorname {SO}_{F}$, we apply a conjugation, see [26, (4.1)]. Let

$$ \iota: \operatorname {SL}(2,\mathbb {C})\to \operatorname {SO}_{F}(\mathbb {R}) $$

(2.18)

be the composition of this conjugation with ι ₀. Let $\tilde{\varGamma }$ be the preimage of Γ under ι.

Lemma 2.2

([22, 27])

The group $\tilde{\varGamma }$ is generated by

$$\pm \left ( \begin{array}{c@{\quad}c} 1&{4i}\\ &1 \end{array} \right ),\qquad \pm \left ( \begin{array}{c@{\quad}c} {-2}&i\\ i& \end{array} \right ),\qquad \pm \left ( \begin{array}{c@{\quad}c}{2+2i}&{4+3i}\\ {-i}&{-2i} \end{array} \right ). $$

With this explicit realization of $\tilde{\varGamma }$ (and hence Γ), Fuchs was able to explicitly determine the images of $\tilde{\varGamma }$ in $\operatorname {SL}(2,\mathbb {Z}[i]/(q))$, and hence understand the quotients Γ/Γ(q) for all q.
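As a sanity check on the explicit covering map, the following Python sketch transcribes the 4×4 matrix of ι₀ entry by entry and verifies that the images of the three generators of Lemma 2.2 preserve the form xw+y²+z². Matrix entries are passed as Python integers or complex numbers with integer parts; the determinant is assumed to be 1, so the prefactor is omitted. The function names are ours.

```python
from itertools import product


def iota0(g):
    """Image of g = ((p, q), (r, s)) in SL(2, Z[i]) under the map iota_0 (det g = 1 assumed)."""
    (p, q), (r, s) = g
    a, al = int(p.real), int(p.imag)
    b, be = int(q.real), int(q.imag)
    c, ga = int(r.real), int(r.imag)
    d, de = int(s.real), int(s.imag)
    return [
        [a*a + al*al, 2*(a*c + al*ga), 2*(c*al - a*ga), -c*c - ga*ga],
        [a*b + al*be, b*c + a*d + be*ga + al*de, d*al + c*be - b*ga - a*de, -c*d - ga*de],
        [a*be - b*al, -d*al + c*be - b*ga + a*de, -b*c + a*d - be*ga + al*de, d*ga - c*de],
        [-b*b - be*be, -2*(b*d + be*de), 2*(b*de - d*be), d*d + de*de],
    ]


def f_tilde(v):
    """The quadratic form xw + y^2 + z^2."""
    x, y, z, w = v
    return x * w + y * y + z * z


def preserves_form(M):
    """Check f_tilde(M v) == f_tilde(v) on a small integer box, which determines the form."""
    for v in product(range(-2, 3), repeat=4):
        Mv = [sum(M[i][k] * v[k] for k in range(4)) for i in range(4)]
        if f_tilde(Mv) != f_tilde(v):
            return False
    return True


# Generators of Lemma 2.2 (up to sign).
GENERATORS = [((1, 4j), (0, 1)), ((-2, 1j), (1j, 0)), ((2 + 2j, 4 + 3j), (-1j, -2j))]
```

Only form preservation is tested here; the check makes no claim about the order in which ι₀ multiplies products.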

Lemma 2.3

[22]

(1) The quotient groups Γ/Γ(q) are multiplicative, that is, if q factors as
$$q=p_{1}^{\ell_{1}}\cdots p_{r}^{\ell_{r}}, $$
then
$$\varGamma /\varGamma (q)\cong \varGamma /\varGamma \bigl(p_{1}^{\ell_{1}}\bigr)\times\cdots\times \varGamma /\varGamma \bigl(p_{r}^{\ell_{r}}\bigr). $$
(2) If (q,6)=1 then
$$ \varGamma /\varGamma (q)\cong \operatorname {SO}_{F}(\mathbb {Z}/q\mathbb {Z}). $$
(2.19)
(3) If $q=2^{\ell }$, ℓ≥3, then Γ/Γ(q) is the full preimage of Γ/Γ(8) under the projection $\operatorname {SO}_{F}(\mathbb {Z}/q\mathbb {Z})\to \operatorname {SO}_{F}(\mathbb {Z}/8\mathbb {Z})$. That is, the powers of 2 stabilize at 8. Similarly, the powers of 3 stabilize at 3, meaning that for $q=3^{\ell }$, ℓ≥1, the quotient Γ/Γ(q) is the preimage of Γ/Γ(3) under the corresponding projection map.

Remark 2.4

This of course explains all local obstructions, cf. (1.1). The admissible numbers are precisely those residue classes $(\operatorname {mod}24)$ which appear as some entry in the orbit of v ₀ under Γ/Γ(24).
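The recipe of this remark is a finite computation. The sketch below runs a graph search over quadruples modulo a given modulus, using the four reflections a ↦ 2(sum of the other three) − a. For simplicity it uses the full Apollonian group rather than Γ; this gives the same set of entries, since a circle belongs to quadruples reached by words of either parity. The function name is ours.

```python
def orbit_residues(root, modulus):
    """Residues appearing as entries in the orbit of the root quadruple modulo `modulus`."""
    start = tuple(c % modulus for c in root)
    seen = {start}
    stack = [start]
    while stack:
        quad = stack.pop()
        total = sum(quad)
        for j in range(4):
            # Reflection in the j-th coordinate, reduced modulo `modulus`.
            new = (2 * (total - quad[j]) - quad[j]) % modulus
            child = quad[:j] + (new,) + quad[j + 1:]
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return {c for quad in seen for c in quad}
```

For the gasket of Fig. 1, `orbit_residues((-11, 21, 24, 28), 24)` recovers the six classes of (1.1); modulo 8 the entries are 0, 4, 5 and modulo 3 they are 0, 1.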

3 Setup and Outline of the Proof

In this section, we introduce the main exponential sum and give an outline of the rest of the argument. Recall the fixed gasket $\mathscr{G}$ having curvatures $\mathscr{B}$ and root quadruple $v_{0}$. Let Γ be the Apollonian subgroup with subgroup Ξ, see (2.8). Let δ≈1.3 be the Hausdorff dimension of the gasket $\mathscr{G}$; see Sect. 4 for the important role played by this geometric invariant. Recall also from (2.12) that for any γ∈Γ and ξ∈Ξ,

$$ \langle e_{4},\xi \,\gamma \,v_{0}\rangle \in \mathscr{B}. $$

Our approach, mimicking [8, 9], is to exploit the bilinear (or multilinear) structure above.

We first give an informal description of the main ensemble from which we will form an exponential sum. Let N be our main growing parameter. We construct our ensemble by decomposing a ball in Γ of norm N into two balls, a small one in all of Γ of norm T, and a larger one of norm $X^{2}$ in Ξ, corresponding to x,y≍X. Specifically, we take

$$ T=N^{1/100}\quad\text{and}\quad X=N^{99/200},\quad \text{so that}\ TX^{2}=N. $$

(3.1)

See (9.8) and (9.11) where these numbers are used.

We further need the technical condition that in the T-ball, the value of $a_{\gamma }=\langle e_{1},\gamma v_{0}\rangle $ (see (2.11)) is of order T. This is used crucially in (7.8) and (5.41).

Finally, for technical reasons (see Lemma 5.2 below), we need to further split the T-ball into two: a small ball of norm $T_{1}$, and a big ball of norm $T_{2}$. Write

$$ T=T_{1}T_{2},\quad T_{2}=T_{1}^{\mathcal {C}}, $$

(3.2)

where $\mathcal {C}$ is a large constant depending only on the spectral gap for Γ; it is determined in (5.11). We now make formal the above discussion.

3.1 Introducing the main exponential sum

Let N,X,T,T ₁, and T ₂ be as in (3.1) and (3.2). Define the family

$$ \frak {F}=\frak {F}_{T}:= \left \{ \gamma =\gamma _{1} \gamma _{2}:\ \begin{array}{c} \gamma _{1},\gamma _{2}\in \varGamma ,\\ T_{1}<\|\gamma _{1}\|<2T_{1},\\ T_{2}<\|\gamma _{2}\|<2T_{2},\\ \langle e_{1},\gamma _{1}\,\gamma _{2}\,v_{0}\rangle>T/100 \end{array} \right \} . $$

(3.3)

From Lax-Phillips [35] (or see (4.10)), we have the bound

$$ \#\frak {F}_{T}\ll T^{\delta }. $$

(3.4)

From (2.15), we can identify $\gamma \in \frak {F}$ with a shifted binary quadratic form $\frak {f}_{\gamma }$ of discriminant $-4a_{\gamma }^{2}$ via

$$\frak {f}_{\gamma }(2x,y)=\langle w_{x,y},\gamma \, v_{0} \rangle. $$

Recall from (2.12) that whenever (2x,y)=1, the above is a curvature in the gasket. We sometimes drop γ, writing simply $\frak {f}\in \frak {F}$; then the latter can also be thought of as a family of shifted quadratic forms. Note also that the decomposition γ=γ ₁ γ ₂ in (3.3) need not be unique, so some forms may appear with multiplicity.

One final technicality is to smooth the sum on x,y≍X. To this end, we fix a smooth, nonnegative function ϒ, supported in [1,2] and having unit mass, $\int_{\mathbb {R}}\varUpsilon (x)dx=1$.

Our main object of study is then the representation number

$$ \mathcal {R}_{N}(n):=\sum_{\frak {f}\in \frak {F}_{T}} \sum_{(2x,y)=1} \varUpsilon \biggl(\frac{2x}{X} \biggr) \varUpsilon \biggl(\frac{y}{X} \biggr) \boldsymbol {1}_{\{n=\frak {f}(2x,y)\}} , $$

(3.5)

and the corresponding exponential sum, its Fourier transform

$$ \widehat{\mathcal {R}_{N}}(\theta ):=\sum _{\frak {f}\in \frak {F}}\sum_{(2x,y)=1} \varUpsilon \biggl( \frac{2x}{X} \biggr) \varUpsilon \biggl(\frac{y}{X} \biggr) e\bigl(\theta \, \frak {f}(2x,y)\bigr) . $$

(3.6)

Clearly $\mathcal {R}_{N}(n)\neq0$ implies that n is a curvature in the gasket, $n\in \mathscr{B}$. Note also from (3.4) that the total mass satisfies

$$ \widehat{\mathcal {R}_{N}}(0)\ll T^{\delta }X^{2}. $$

(3.7)

The condition (2x,y)=1 will be a technical nuisance, and can be freed by a standard use of the Möbius inversion formula. To this end, we introduce another parameter

$$ U=N^{\frak {u}}, $$

(3.8)

a small power of N, with $\frak {u}>0$ depending only on the spectral gap of Γ; it is determined in (6.3). Then by truncating Möbius inversion, define

$$ \widehat{\mathcal {R}_{N}^{U}}(\theta ) := \sum _{\frak {f}\in \frak {F}}\sum_{x,y\in \mathbb {Z}} \varUpsilon \biggl(\frac{2x}{X} \biggr) \varUpsilon \biggl(\frac{y}{X} \biggr) e\bigl(\theta \, \frak {f}(2x,y)\bigr) \sum_{u\mid(2x,y)\atop u<U}\mu(u) , $$

(3.9)

with corresponding “representation function” $\mathcal {R}_{N}^{U}$ (which could be negative).
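The mechanism behind (3.9) is the identity Σ_{u∣m} μ(u) = 1 if m=1 and 0 otherwise, applied to m=(2x,y); truncating to u<U changes the sum only when m has a divisor at least U. A small self-contained Python illustration, with function names of our own choosing:

```python
def mobius(n):
    """Moebius function mu(n) for n >= 1, by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0          # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result          # one remaining prime factor
    return result


def divisor_mobius_sum(m, U=None):
    """Sum of mu(u) over the divisors u of m, restricted to u < U when U is given."""
    return sum(mobius(u) for u in range(1, m + 1)
               if m % u == 0 and (U is None or u < U))
```

For example, with m=30 and U=10 the truncated sum runs over u=1,2,3,5,6 and equals −1, which illustrates why the truncated representation function can be negative.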

3.2 Reduction to the circle method

We are now in position to outline the argument in the rest of the paper. Recall that $\mathscr{A}$ is the set of admissible numbers. We first reduce our main Theorem 1.2 to the following

Theorem 3.1

There exists an η>0 and a function $\frak {S}(n)$ with the following properties. For $\frac {1}{2}N<n<N$, the singular series $\frak {S}(n)$ is nonnegative, vanishes only when n is not admissible, $n\notin \mathscr{A}$, and is otherwise $\gg_{\varepsilon }N^{-\varepsilon }$ for any ε>0. Moreover, for $\frac {1}{2}N<n<N$ and admissible,

$$ \mathcal {R}_{N}^{U}(n)\gg \frak {S}(n) T^{\delta -1}, $$

(3.10)

except for a set of cardinality $\ll N^{1-\eta }$.

Proof of Theorem 1.2 assuming Theorem 3.1

We first show that the difference between $\mathcal {R}_{N}$ and $\mathcal {R}_{N}^{U}$ is small in $\ell ^{1}$. Using (3.4) we have

$$\begin{aligned} &\sum_{n<N}\big|\mathcal {R}_{N}(n)- \mathcal {R}_{N}^{U}(n)\big|\\ &\quad= \sum_{n<N} \biggl \vert \sum_{\frak {f}\in \frak {F}}\sum _{x,y\in \mathbb {Z}} \varUpsilon \biggl(\frac{2x}{X} \biggr) \varUpsilon \biggl(\frac{y}{X} \biggr) \boldsymbol {1}_{\{n=\frak {f}(2x,y)\}} \sum_{u\mid(2x,y)\atop u\ge U}\mu(u) \biggr \vert \\ &\quad\ll \sum_{\frak {f}\in \frak {F}} \sum_{u \ge U} \sum _{y\ll X\atop y\equiv0(\operatorname {mod}u)} \sum_{x\ll X\atop2x\equiv0(\operatorname {mod}u)} 1 \\ &\quad\ll T^{\delta }{X^{2}\over U} , \end{aligned}$$

where the last step uses $\sum_{u\ge U}(X/u)^{2}\ll X^{2}/U$. Recall from (3.8) that U is a fixed power of N, so the above saves a power from the total mass (3.7).

Now let Z be the “exceptional” set of admissible n<N for which $\mathcal {R}_{N}(n)=0$. Furthermore, let W be the set of admissible n<N for which (3.10) is satisfied. Then

$$\begin{aligned} T^{\delta }{X^{2}\over U} \gg & \sum _{n<N} \big|\mathcal {R}_{N}^{U}(n)- \mathcal {R}_{N}(n)\big| \ge \sum_{n\in Z\cap W} \big| \mathcal {R}_{N}^{U}(n)-\mathcal {R}_{N}(n)\big| \\ \gg_{\varepsilon }& | Z\cap W| \cdot T^{\delta -1} N^{-\varepsilon } . \end{aligned}$$

Note also from Theorem 3.1 that $|Z\cap W^{c}|\le |W^{c}|\ll N^{1-\eta }$. Hence by (3.1) and (3.8),

$$ |Z| = \big| Z\cap W^{c}\big| + | Z\cap W| \ll_{\varepsilon } N^{1-\eta} + {N^{1+\varepsilon }\over U} , $$

(3.11)

which is a power savings since ε>0 is arbitrary. This completes the proof. □

To establish (3.10), we decompose $\mathcal {R}_{N}^{U}$ into “major” and “minor” arcs, reducing Theorem 3.1 to the following

Theorem 3.2

There exists an η>0 and a decomposition

$$ \mathcal {R}_{N}^{U}(n)=\mathcal {M}_{N}^{U}(n)+ \mathcal {E}_{N}^{U}(n) $$

(3.12)

with the following properties. For $\frac {1}{2}N<n<N$ and admissible, , we have

$$ \mathcal {M}_{N}^{U}(n)\gg \frak {S}(n) T^{\delta -1}, $$

(3.13)

except for a set of cardinality $\ll N^{1-\eta }$. The singular series $\frak {S}(n)$ is the same as in Theorem 3.1. Moreover,

$$ \sum_{n<N}\big|\mathcal {E}_{N}^{U}(n)\big|^{2} \ll N\, T^{2(\delta -1)} N^{-\eta} . $$

(3.14)

Proof of Theorem 3.1 assuming Theorem 3.2

We restrict our attention to the set of admissible n<N so that (3.13) holds (the remainder having sufficiently small cardinality). Let Z denote the subset of these n for which $\mathcal {R}_{N}^{U}(n)<\frac {1}{2}\mathcal {M}_{N}^{U}(n)$; hence for n∈Z,

$$1\ll {|\mathcal {E}_{N}^{U}(n)|\over N^{-\varepsilon }T^{\delta -1}} . $$

Then by (3.14),

$$\begin{aligned} |Z| \ll_{\varepsilon }& \sum_{n<N} {|\mathcal {E}_{N}^{U}(n)|^{2} \over N^{-\varepsilon }T^{2(\delta -1)}} \ll N^{1-\eta+\varepsilon } , \end{aligned}$$

whence the claim follows, since ε>0 is arbitrary. □

3.3 Decomposition into major and minor arcs

Next we explain the decomposition (3.12). Let M be a parameter controlling the depth of approximation in Dirichlet’s theorem: for any irrational θ∈[0,1], there exists some q<M and (r,q)=1 so that |θ−r/q|<1/(qM). We will eventually set

$$ M=XT, $$

(3.15)

see (7.7) where this value is used. (Note that M is a bit bigger than $N^{1/2}=XT^{1/2}$.)
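The bookkeeping of exponents in (3.1) and (3.15) can be checked with exact rational arithmetic. The sketch below records each parameter as a power of N; the function name and dictionary keys are ours.

```python
from fractions import Fraction


def parameter_exponents():
    """Exponents of N for the parameters fixed in (3.1) and (3.15)."""
    t = Fraction(1, 100)     # T = N^(1/100)
    x = Fraction(99, 200)    # X = N^(99/200)
    return {
        "T": t,
        "X": x,
        "T*X^2": t + 2 * x,       # should equal 1, i.e. T X^2 = N
        "M": x + t,               # M = X T
        "X*T^(1/2)": x + t / 2,   # should equal 1/2, i.e. N^(1/2)
    }
```

In particular M = N^{101/200}, which exceeds N^{1/2} by the factor T^{1/2} = N^{1/200}.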

Writing θ=r/q+β, we introduce parameters

$$ Q_{0},\quad K_{0}, $$

(3.16)

small powers of N as determined in (6.2), so that the “major arcs” correspond to q<Q ₀ and |β|<K ₀/N. In fact, we need a smooth version of this decomposition.

To this end, recall the “hat” function and its Fourier transform

$$ \frak {t}(x):=\min(1+x,1-x)^{+},\qquad \widehat{\frak {t}}(y)= \biggl({\sin(\pi y)\over\pi y} \biggr)^{2}. $$

(3.17)
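The transform pair in (3.17) can be confirmed numerically. The sketch below assumes the convention that the transform of 𝔱 is ∫𝔱(x)e(−xy)dx, which reduces to a cosine integral because 𝔱 is even, and approximates it by the midpoint rule. The function names and the number of steps are our own choices.

```python
import math


def hat(x):
    """The hat function t(x) = max(0, min(1 + x, 1 - x)), supported on [-1, 1]."""
    return max(0.0, min(1.0 + x, 1.0 - x))


def hat_transform_numeric(y, steps=20000):
    """Midpoint-rule approximation of the integral of hat(x) * cos(2 pi x y) over [-1, 1]."""
    h = 2.0 / steps
    total = 0.0
    for k in range(steps):
        x = -1.0 + (k + 0.5) * h
        total += hat(x) * math.cos(2.0 * math.pi * x * y)
    return total * h


def hat_transform_exact(y):
    """The closed form (sin(pi y) / (pi y))^2, with value 1 at y = 0."""
    if y == 0:
        return 1.0
    return (math.sin(math.pi * y) / (math.pi * y)) ** 2
```

The nonnegativity of the transform, visible in the closed form, is what makes 𝔱 convenient as a smooth cutoff.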

Localize $\frak {t}$ to the width K ₀/N, periodize it to the circle, and put this spike on each fraction in the major arcs:

$$ \frak {T}(\theta ) = \frak {T}_{N,Q_{0},K_{0}}(\theta ):=\sum _{q<Q_{0}}\sum_{(r,q)=1}\sum _{m\in \mathbb {Z}}\frak {t}\biggl({N\over K_{0}} \biggl(\theta +m-\frac{r}{q} \biggr) \biggr) . $$

(3.18)

By construction, $\frak {T}$ lives on the circle $\mathbb {R}/\mathbb {Z}$ and is supported within K ₀/N of fractions r/q with small denominator, q<Q ₀, as desired.
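To make (3.18) concrete, here is a small exact evaluation of the major arc weight using rational arithmetic. It assumes K₀<N, so that only the shifts m∈{−1,0,1} can contribute for θ reduced to [0,1). The function names and the sample parameters in the note below are illustrative only and are not the values chosen in (6.2).

```python
from fractions import Fraction
from math import gcd


def hat(x):
    """The hat function t(x) = max(0, min(1 + x, 1 - x))."""
    return max(0, min(1 + x, 1 - x))


def major_arc_weight(theta, N, Q0, K0):
    """The weight of (3.18): a spike of width K0/N at each r/q with q < Q0 and (r, q) = 1."""
    theta = Fraction(theta) % 1
    total = Fraction(0)
    for q in range(1, Q0):
        for r in range(q):
            if gcd(r, q) == 1:          # for q = 1 this selects r = 0
                for m in (-1, 0, 1):    # periodization; enough since K0 < N
                    total += hat(Fraction(N, K0) * (theta + m - Fraction(r, q)))
    return total
```

With N=10⁶, Q₀=5, K₀=10 the weight equals 1 at θ=1/2, equals 1/2 at θ=1/3+5·10⁻⁶ (halfway down the spike), and vanishes at θ=1/10.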

Then define the “main term”

$$ \mathcal {M}_{N}^{U}(n):= \int _{0}^{1} \frak {T}(\theta ) \widehat{ \mathcal {R}_{N}^{U}} (\theta ) e(-n\theta ) d\theta , $$

(3.19)

and “error term”

$$ \mathcal {E}_{N}^{U}(n):= \int _{0}^{1} \bigl(1-\frak {T}(\theta )\bigr) \widehat{ \mathcal {R}_{N}^{U}} (\theta ) e(-n\theta ) d\theta , $$

(3.20)

so that (3.12) obviously holds.

Since $\mathcal {R}_{N}^{U}$ could be negative, the same holds for $\mathcal {M}_{N}^{U}$. Hence we will establish (3.13) by first proving a related result for

$$ \mathcal {M}_{N}(n):= \int_{0}^{1} \frak {T}(\theta ) \widehat{ \mathcal {R}_{N}} (\theta ) e(-n\theta ) d\theta , $$

(3.21)

and then showing that $\mathcal {M}_{N}$ and $\mathcal {M}_{N}^{U}$ cannot differ by too much for too many values of n. This is the same (but in reverse) as the transfer from $\mathcal {R}_{N}$ to $\mathcal {R}_{N}^{U}$ in (3.11). See Theorem 6.1 for the lower bound on $\mathcal {M}_{N}$, and Theorem 6.2 for the transfer.

To prove (3.14), we apply Parseval and decompose dyadically:

$$\begin{aligned} \sum_{n}\big|\mathcal {E}_{N}^{U}(n)\big|^{2} =& \int_{0}^{1} \big|1-\frak {T}(\theta )\big|^{2} \bigl \vert \widehat{\mathcal {R}_{N}^{U}}(\theta ) \bigr \vert ^{2} d\theta \\ \ll& \mathcal {I}_{Q_{0},K_{0}} + \mathcal {I}_{Q_{0}} + \sum _{Q_{0}\le Q<M\atop\text{dyadic}} \mathcal {I}_{Q} , \end{aligned}$$

where we have dissected the circle into the following regions (using that $|1-\frak {t}(x)|=|x|$ on [−1,1]):

$$\begin{aligned} \mathcal {I}_{Q_{0},K_{0}} :=& \int _{\theta =\frac{r}{q}+\beta \atop q<Q_{0},(r,q)=1,|\beta |<K_{0}/N} \biggl \vert \beta \frac{N}{K_{0}}\biggr \vert ^{2} \bigl \vert \widehat{ \mathcal {R}_{N}^{U}}(\theta ) \bigr \vert ^{2} d\theta , \end{aligned}$$

(3.22)

$$\begin{aligned} \mathcal {I}_{Q_{0}} :=& \int _{\theta =\frac{r}{q}+\beta \atop q<Q_{0},(r,q)=1,K_{0}/N<|\beta |<1/(qM)} \bigl \vert \widehat{ \mathcal {R}_{N}^{U}}(\theta ) \bigr \vert ^{2} d\theta , \end{aligned}$$

(3.23)

$$\begin{aligned} \mathcal {I}_{Q} :=& \int _{\theta =\frac{r}{q}+\beta \atop Q\le q<2Q,(r,q)=1,|\beta |<1/(qM)} \bigl \vert \widehat{ \mathcal {R}_{N}^{U}}(\theta ) \bigr \vert ^{2} d\theta . \end{aligned}$$

(3.24)

Bounds of the quality (3.14) are given for (3.22) and (3.23) in Sect. 7, see Theorem 7.3. Our estimation of (3.24) decomposes further into two cases, according to whether Q<X or X≤Q<M; these are handled separately in Sect. 8 and Sect. 9, see Theorems 8.5 and 9.5, respectively.

We point out again that our averaging on n in the minor arcs makes this quite crude as far as individual n’s (the subject of Conjecture 1.1) are concerned.

3.4 The rest of the paper

The only section not yet described is Sect. 5, where we furnish some lemmata which are useful in the sequel. These decompose into two categories: one set of lemmata is related to some infinite-volume counting problems, for which the background in Sect. 4 is indispensable. The other lemma is of a classical flavor, corresponding to a local analysis for the shifted binary form $\frak {f}$; this studies a certain exponential sum which is dealt with via Gauss and Kloosterman/Salié sums.

This completes our outline of the rest of the paper.

4 Preliminaries II: automorphic forms and representations

4.1 Spectral theory

Recall the general spectral theory in our present context. We abuse notation (in this section only), passing from $G=\operatorname {SO}_{F}(\mathbb {R})$ to its spin double cover $G= \operatorname {SL}(2,\mathbb {C})$. Let Γ<G be a geometrically finite discrete group. (The Apollonian group is such, being a Schottky group, see Fig. 3.) Then Γ acts discontinuously on the upper half space $\mathbb {H}^{3}$, and any Γ orbit has a limit set Λ _Γ in the boundary $\partial \mathbb {H}^{3}\cong S^{2} $ of some Hausdorff dimension δ=δ(Γ)∈[0,2]. We assume that Γ is non-elementary (not virtually Abelian), so δ>0, and moreover that Γ is not a lattice, that is, the quotient $\varGamma \backslash \mathbb {H}^{3}$ has infinite hyperbolic volume; then δ<2. The hyperbolic Laplacian Δ acts on the space $L^{2}(\varGamma \backslash \mathbb {H}^{3})$ of functions automorphic under Γ and square integrable on the quotient; we choose the Laplacian to be positive definite. The spectrum is controlled via the following, see [35, 38, 47].

Theorem 4.1

(Patterson, Sullivan, Lax-Phillips)

The spectrum above 1 is purely continuous, and the spectrum below 1 is purely discrete. The latter is empty unless δ>1, in which case, ordering the eigenvalues by

$$ 0<\lambda _{0}<\lambda _{1} \le \cdots\le \lambda _{max}<1, $$

(4.1)

the base eigenvalue λ ₀ is given by

$$\lambda _{0}=\delta (2-\delta ). $$

Remark 4.2

In our application to the Apollonian group, the limit set is precisely the underlying gasket, see Fig. 3. It has dimension

$$ \delta \approx1.3\ldots>1. $$

(4.2)

Corresponding to $\lambda _{0}$ is the Patterson-Sullivan base eigenfunction, $\varphi _{0}$, which can be realized explicitly as the integral of a Poisson kernel against the so-called Patterson-Sullivan measure μ. Roughly speaking, μ is the weak-∗ limit as $s\to \delta ^{+}$ of the measures

$$ \mu_{s}(x):= { \sum_{\gamma \in \varGamma }\exp\bigl(-s\,d(\frak {o},\gamma \cdot \frak {o})\bigr)\,{\bf1}_{x=\gamma \frak {o}} \over \sum_{\gamma \in \varGamma }\exp\bigl(-s\,d(\frak {o},\gamma \cdot \frak {o})\bigr) } , $$

(4.3)

where d(⋅,⋅) is the hyperbolic distance, and $\frak {o}$ is any fixed point in $\mathbb {H}^{3}$.

4.2 Spectral gap

We assume henceforth that Γ moreover satisfies $\varGamma < \operatorname {SL}(2, \mathcal {O})$, where $\mathcal {O}=\mathbb {Z}[i]$. Then we have a tower of congruence subgroups: for any integer q≥1, define Γ(q) to be the kernel of the projection map $\varGamma \to \operatorname {SL}(2, \mathcal {O}/\frak {q})$, with $\frak {q}=(q)$ the principal ideal. As in (4.1), write

$$ 0<\lambda _{0}(q)<\lambda _{1}(q) \le \cdots\le \lambda _{max(q)}(q)<1, $$

(4.4)

for the discrete spectrum of $\varGamma (q)\backslash \mathbb {H}^{3}$. The groups Γ(q), while of infinite covolume, have finite index in Γ, and hence

$$ \lambda _{0}(q)=\lambda _{0}=\delta (2-\delta ). $$

(4.5)

But the second eigenvalues λ ₁(q) could a priori encroach on the base. The fact that this does not happen is the spectral gap property for Γ.

Theorem 4.3

Given Γ as above, there exists some ε=ε(Γ)>0 such that for all q≥1,

$$ \lambda _{1}(q)\ge \lambda _{0}+\varepsilon . $$

(4.6)

This is proved in the Appendix by Péter Varjú.

4.3 Representation theory and mixing rates

By the Duality Theorem of Gelfand, Graev, and Piatetski-Shapiro [24], the spectral decomposition above is equivalent to the decomposition into irreducibles of the right regular representation acting on L ²(Γ∖G). That is, we identify $\mathbb {H}^{3}\cong G/K$, with $K=\operatorname {SU}(2)$ a maximal compact subgroup, and lift functions from $\mathbb {H}^{3}$ to (right K-invariant) functions on G. Corresponding to (4.1) is the decomposition

$$ L^{2}(\varGamma \backslash G)= V_{\lambda _{0}}\oplus V_{\lambda _{1}}\oplus \cdots \oplus V_{\lambda _{max}}\oplus V_{temp} . $$

(4.7)

Here V _temp contains the tempered spectrum (for $\operatorname {SL}_{2}(\mathbb {C})$, every non-spherical irreducible representation is tempered), and each $V_{\lambda _{j}}$ is an infinite dimensional vector space, isomorphic as a G-representation to a complementary series representation with parameter s _j∈(1,2) determined by λ _j=s _j(2−s _j). Obviously, a similar decomposition holds for L ²(Γ(q)∖G), corresponding to (4.4).
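As a numerical illustration of the correspondence λ=s(2−s) between eigenvalues below 1 and parameters s∈(1,2), the following minimal sketch converts in both directions. The function names are ours, and the approximate value δ≈1.30568 for the Apollonian group is an assumption taken from the literature rather than from this section.

```python
import math

def eigenvalue_from_parameter(s: float) -> float:
    """Laplace eigenvalue lambda = s(2 - s) attached to a parameter s on H^3."""
    return s * (2.0 - s)

def parameter_from_eigenvalue(lam: float) -> float:
    """Invert lambda = s(2 - s) on the branch s in (1, 2); needs 0 < lam < 1."""
    if not 0.0 < lam < 1.0:
        raise ValueError("expected an eigenvalue strictly between 0 and 1")
    return 1.0 + math.sqrt(1.0 - lam)

# Assumed numerical value of delta (not stated in this section).
DELTA_APPROX = 1.30568

# Base eigenvalue lambda_0 = delta(2 - delta), cf. (4.5).
lambda_0 = eigenvalue_from_parameter(DELTA_APPROX)
```

With this value of δ the base eigenvalue is roughly 0.9066, and inverting recovers δ up to rounding.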

We also have the following well-known general fact about mixing rates of matrix coefficients, see e.g. [20]. First we recall the relevant Sobolev norm. Let (π,V) be a unitary G-representation, and let {X _j} denote an orthonormal basis of the Lie algebra $\frak {k}$ of K with respect to an Ad-invariant scalar product. For a smooth vector v∈V ^∞, define the (second order) Sobolev norm $\mathcal {S}$ of v by

$$\mathcal {S}v := \|v\|_{2} +\sum_{j} \big\|d \pi(X_{j}).v\big\|_{2} +\sum_{j} \sum_{j'} \big\|d\pi(X_{j})d \pi(X_{j'}).v\big\|_{2} . $$

Theorem 4.4

([33, Prop. 5.3])

Let Θ>1 and (π,V) be a unitary representation of G which does not weakly contain any complementary series representation with parameter s>Θ. Then for any smooth vectors v,w∈V ^∞,

$$ \bigl \vert \bigl\langle\pi(g).v,w\bigr\rangle \bigr \vert \ll \|g\|^{-2(2-\varTheta )} \cdot \mathcal {S}v\cdot \mathcal {S}w . $$

(4.8)

Here ∥⋅∥ is the standard Frobenius matrix norm.

4.4 Effective bisector counting

The next ingredient which we require is the recent work by Vinogradov [49] on effective bisector counting for such infinite volume quotients. Recall the following sub(semi)groups of G:

$$A=\left \{a_{t}:= \left ( \begin{array}{c@{\quad}c} {e^{t/2}}\\ &{e^{-t/2}} \end{array} \right ) :t\in \mathbb {R}\right \},\qquad A^{+}= \{a_{t}:t\ge0 \}, $$

$$M=\left \{ \left ( \begin{array}{c@{\quad}c} {e^{2\pi i\theta }}\\ &{e^{-2\pi i\theta }} \end{array} \right ):\theta \in \mathbb {R}/ \mathbb {Z}\right \},\qquad K=\operatorname {SU}(2) . $$

We have the Cartan decomposition G=KA ⁺ K, unique up to the centralizer M of A in K. We require it in the following more precise form. Identify K/M with the sphere $S^{2}\cong \partial \mathbb {H}^{3}$. Then for every g∈G not in K, there is a unique decomposition

$$ g=s_{1}(g)\cdot a(g)\cdot m(g) \cdot s_{2}(g)^{-1} $$

(4.9)

with s ₁,s ₂∈K/M, a∈A ⁺ and m∈M, corresponding to

$$G=K/M\times A^{+}\times M\times M\backslash K, $$

see, e.g., [49, (3.4)]. The following theorem follows easily from [49, Theorem 2.2].

Theorem 4.5

([49])

Let Φ,Ψ⊂S ² be spherical caps and let $\mathcal {I}\subset \mathbb {R}/\mathbb {Z}$ be an interval. Then under the above hypotheses on Γ (in particular δ>1), and using the decomposition (4.9), we have

$$ \sum_{\gamma \in \varGamma } \boldsymbol {1}{ \left \{ \begin{array}{c} s_{1}(\gamma )\in\varPhi \\ s_{2}(\gamma )\in\varPsi \\ \|a(\gamma )\|^{2}<T \\ m(\gamma )\in \mathcal {I}\end{array} \right \} } = c_{\delta }\cdot \mu(\varPhi) \mu( \varPsi) \ell(\mathcal {I}) T^{\delta } + O \bigl( T^{ \varTheta } \bigr) , $$

(4.10)

as T→∞. Here c _δ>0, ∥⋅∥ is the Frobenius norm, ℓ is Lebesgue measure, μ is Patterson-Sullivan measure (cf. (4.3)), and

$$ \varTheta <\delta $$

(4.11)

depends only on the spectral gap for Γ. The implied constant does not depend on Φ,Ψ, or $\mathcal {I}$.

This generalizes from $\operatorname {SL}(2,\mathbb {R})$ to $\operatorname {SL}(2,\mathbb {C})$ the main result of [12], which is itself a generalization (with weaker exponents) to our infinite volume setting of [25, Theorem 4].
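To make the quantity ∥a(γ)∥² in (4.10) concrete: since K is unitary, the Frobenius norm of g equals that of its A ⁺ component in (4.9), and $\|a_{t}\|^{2}=e^{t}+e^{-t}$. The sketch below, assuming our own helper names and an arbitrary choice of angles for the elements of K, recovers t from the norm alone.

```python
import cmath
import math

def matmul(x, y):
    """Product of two 2x2 complex matrices given as nested tuples."""
    return tuple(
        tuple(sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def su2(theta, phi, psi):
    """An element of K = SU(2); the angle parametrization is illustrative."""
    a = cmath.exp(1j * phi) * math.cos(theta)
    b = cmath.exp(1j * psi) * math.sin(theta)
    return ((a, -b.conjugate()), (b, a.conjugate()))

def a_t(t):
    """The diagonal element a_t = diag(e^{t/2}, e^{-t/2}) of A."""
    return ((math.exp(t / 2), 0.0), (0.0, math.exp(-t / 2)))

def frobenius_sq(g):
    """Squared Frobenius norm: sum of squared absolute values of the entries."""
    return sum(abs(z) ** 2 for row in g for z in row)

def cartan_parameter(g):
    """Solve e^t + e^{-t} = ||g||^2 for t >= 0, i.e. t = arccosh(||g||^2 / 2)."""
    return math.acosh(max(1.0, frobenius_sq(g) / 2.0))

# Build g = k_1 a_t k_2 with t = 3 and read t back off from the norm.
g = matmul(matmul(su2(0.3, 1.1, -0.7), a_t(3.0)), su2(1.2, 0.4, 2.5))
t_recovered = cartan_parameter(g)
```

Only the A ⁺ component is recovered here; extracting s ₁, s ₂ and m would require a full singular value decomposition.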

5 Some lemmata

5.1 Infinite volume counting statements

Equipped with the tools of Sect. 4, we isolate here some consequences which will be needed in the sequel. We return to the notation $G=\operatorname {SO}_{F}$, with F the Descartes form (2.2), $\varGamma =\mathcal {A}\cap G$, the orientation preserving Apollonian subgroup, and Γ(q) its principal congruence subgroups. Moreover, we import all the notation from the previous section.

First we use the spectral gap to see that summing over a coset of a congruence group can be reduced to summing over the original group.

Lemma 5.1

Fix γ ₁∈Γ, q≥1, and any “congruence” group $\tilde{\varGamma }(q)$ satisfying

$$ \varGamma (q)<\tilde{\varGamma }(q)<\varGamma . $$

(5.1)

Then as Y→∞,

$$\begin{aligned} & \#\bigl\{ \gamma \in\tilde{\varGamma }(q) : \| \gamma _{1}\gamma \|<Y \bigr\} \end{aligned}$$

(5.2)

$$\begin{aligned} &\quad = {1\over[\varGamma :\tilde{\varGamma }(q)]} \cdot \#\bigl\{ \gamma \in \varGamma : \|\gamma \|<Y \bigr\} + O\bigl(Y^{\varTheta _{0}}\bigr) , \end{aligned}$$

(5.3)

where Θ ₀<δ depends only on the spectral gap for Γ. The implied constant above does not depend on q or γ ₁. The same holds with γ ₁ γ in (5.2) replaced by γγ ₁.

This simple lemma follows from a more-or-less standard argument. We sketch it below because a slightly more complicated result, Lemma 5.3, will be needed later and requires essentially no new ideas; the sketch will serve as a template for that statement.

Sketch of Proof

Denote the left hand side (5.2) by $\mathcal {N}_{q}$, and let $\mathcal {N}_{1}/[\varGamma :\tilde{\varGamma }(q)]$ be the first term of (5.3). For g∈G, let

$$ f(g)=f_{Y}(g):=\boldsymbol {1}_{\{\|g\|<Y\}}, $$

(5.4)

and define

$$ F_{q}(g,h):= \sum_{\gamma \in\tilde{\varGamma }(q)} f\bigl(g^{-1}\gamma h\bigr), $$

(5.5)

so that

$$ \mathcal {N}_{q}=F_{q}\bigl(\gamma _{1}^{-1},e \bigr). $$

(5.6)

By construction, F _q is a function on $\tilde{\varGamma }(q)\backslash G\times \tilde{\varGamma }(q)\backslash G$, and we smooth F _q in both copies of $\tilde{\varGamma }(q)\backslash G$, as follows. Let ψ≥0 be a smooth bump function supported in a ball of radius η>0 (to be chosen later) about the origin in G with ∫_G ψ=1, and automorphize it to

$$\varPsi_{q}(g):=\sum_{\gamma \in\tilde{\varGamma }(q)}\psi(\gamma g) . $$

Then clearly Ψ _q is a bump function in $\tilde{\varGamma }(q)\backslash G$ with $\int_{\tilde{\varGamma }(q)\backslash G}\varPsi_{q}=1$. Let

$$\varPsi_{q,\gamma _{1}}(g):=\varPsi_{q}(g\gamma _{1}) . $$

Smooth the variables g and h in F _q by considering

$$\begin{aligned} \mathcal {H}_{q} :=& \langle F_{q},\varPsi_{q,\gamma _{1}} \otimes\varPsi_{q}\rangle = \int_{\tilde{\varGamma }(q)\backslash G} \int _{\tilde{\varGamma }(q)\backslash G} F_{q}(g,h)\varPsi_{q,\gamma _{1}}(g) \varPsi_{q}(h) dg\, dh \\ =& \sum_{\gamma \in\tilde{\varGamma }(q)} \int_{\tilde{\varGamma }(q)\backslash G} \int _{\tilde{\varGamma }(q)\backslash G} f\bigl(\gamma _{1}g^{-1}\gamma h\bigr) \varPsi_{q}(g)\varPsi_{q}(h) dg\, dh . \end{aligned}$$

First we estimate the error from smoothing:

$$\begin{aligned} \mathcal {E} =& |\mathcal {N}_{q}-\mathcal {H}_{q}| \\ \le& \sum_{\gamma \in \varGamma } \int_{\tilde{\varGamma }(q)\backslash G} \int_{\tilde{\varGamma }(q)\backslash G} \big|f\bigl(\gamma _{1}g^{-1}\gamma h\bigr) -f(\gamma _{1}\gamma ) \big| \varPsi_{q}(g)\varPsi_{q}(h) dg \, dh , \end{aligned}$$

where we have increased γ to run over all of Γ. The analysis splits into three ranges.

(1)
If γ is such that
$$ \|\gamma _{1}\gamma \|>Y(1+10\eta), $$
(5.7)
then both f(γ ₁ g ⁻¹ γh) and f(γ ₁ γ) vanish.
(2)
In the range
$$ \|\gamma _{1}\gamma \|<Y(1-10\eta), $$
(5.8)
both f(γ ₁ g ⁻¹ γh) and f(γ ₁ γ) are 1, so their difference vanishes.
(3)
In the intermediate range, we apply [35], bounding the count by
$$ \ll Y^{\delta }\eta+ Y^{\delta -\varepsilon }, $$
(5.9)
where ε>0 depends on the spectral gap for Γ.

Thus it remains to analyze $\mathcal {H}_{q}$.

Use a simple change of variables (see [12, Lemma 3.7]) to express $\mathcal {H}_{q}$ via matrix coefficients:

$$\mathcal {H}_{q} = \int_{G} f(g) \bigl\langle \pi(g) \varPsi_{q}, \varPsi_{q,\gamma _{1}} \bigr\rangle_{\tilde{\varGamma }(q)\backslash G} dg. $$

Decompose the matrix coefficient into its projection onto the base irreducible $V_{\lambda _{0}}$ in (4.7) and an orthogonal term, and bound the remainder by the mixing rate (4.8) using the uniform spectral gap ε>0 in (4.6). The functions ψ are bump functions in six real dimensions, so can be chosen to have second-order Sobolev norms bounded by ≪η ⁻⁵. Of course the projection onto the base representation is just $[\varGamma :\tilde{\varGamma }(q)]^{-1}$ times the same projection at level one, cf. (4.5). Running the above argument in reverse at level one (see [12, Proposition 4.18]) gives:

$$ \mathcal {N}_{q}= {1\over[\varGamma :\tilde{\varGamma }(q)]}\cdot \mathcal {N}_{1} + O\bigl(\eta Y^{\delta }+Y^{\delta -\varepsilon }\bigr) + O \bigl(Y^{\delta -\varepsilon } \eta^{-10}\bigr) . $$

(5.10)

Optimizing η and renaming Θ ₀<δ in terms of the spectral gap ε gives the claim. □
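The optimization of η in (5.10) can be made explicit. Writing $\eta=Y^{-x}$, the three error terms have exponents δ−x, δ−ε and δ−ε+10x, and the largest of these is minimized at x=ε/11. The sketch below checks this with exact rational arithmetic; the sample values of δ and ε are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def error_exponent(delta, eps, x):
    """Largest exponent of Y among the error terms of (5.10) when eta = Y^{-x}."""
    return max(delta - x, delta - eps, delta - eps + 10 * x)

def optimal_x(eps):
    """Balance eta * Y^delta against Y^{delta - eps} * eta^{-10}: x = eps / 11."""
    return Fraction(eps) / 11

# Hypothetical sample values, for illustration only.
delta, eps = Fraction(13, 10), Fraction(1, 10)
x_star = optimal_x(eps)
theta_0 = error_exponent(delta, eps, x_star)  # equals delta - eps/11
```

Under these assumptions the resulting exponent is Θ ₀=δ−ε/11, which is strictly less than δ.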

Next we exploit the previous lemma and the product structure of the family $\frak {F}$ in (3.3) to save a small power of q in the following modular restriction. Such a bound is needed at several places in Sect. 8.

Lemma 5.2

Let Θ ₀ be as in (5.3). Define $\mathcal {C}$ in (3.2) by

$$ \mathcal {C}:={10^{30}\over \delta -\varTheta _{0}}, $$

(5.11)

hence determining T ₁ and T ₂. There exists some η ₀>0 depending only on the spectral gap of Γ so that for any 1≤q<N and any $r(\operatorname {mod}q)$,

$$ \sum_ {\gamma \in \frak {F}} \boldsymbol {1}_{\{ \langle e_{1},\gamma v_{0}\rangle \equiv r(\operatorname {mod}q)\}} \ll {1\over q^{\eta_{0}}} T^{\delta } . $$

(5.12)

The implied constant is independent of r.

Proof

Dropping the condition 〈e ₁,γ ₁ γ ₂ v ₀〉>T/100 in (3.3), bound the left hand side of (5.12) by

$$ \sum_ {\gamma _{1}\in \varGamma \atop\|\gamma _{1}\|\asymp T_{1}} \sum _ {\gamma _{2}\in \varGamma \atop\|\gamma _{2}\|\asymp T_{2}} \boldsymbol {1}_{\{ \langle e_{1},\gamma _{1}\gamma _{2} v_{0}\rangle \equiv r(\operatorname {mod}q)\}}. $$

(5.13)

We decompose the argument into two ranges of q.

Case 1: q small

In this range, we fix γ ₁, and follow a standard argument for γ ₂. Let $\tilde{\varGamma }(q)<\varGamma $ denote the stabilizer of $v_{0} (\operatorname {mod}q)$, that is

$$ \tilde{\varGamma }(q):=\bigl\{\gamma \in \varGamma :\gamma v_{0}\equiv v_{0}(\operatorname {mod}q)\bigr\}. $$

(5.14)

Clearly (5.1) is satisfied, and it is elementary that

$$ \bigl[\varGamma :\tilde{\varGamma }(q)\bigr]\asymp q^{2}, $$

(5.15)

cf. (2.19). Decompose $\gamma _{2}=\gamma _{2}'\gamma _{2}''$ with $\gamma _{2}''\in\tilde{\varGamma }(q)$ and $\gamma _{2}'\in \varGamma /\tilde{\varGamma }(q)$. Then by (5.3) and [35], we have

$$\begin{aligned} (5.13) =& \sum_ {\gamma _{1}\in \varGamma \atop\|\gamma _{1}\|\asymp T_{1}} \sum _{\gamma _{2}'\in \varGamma /\tilde{\varGamma }(q)} \boldsymbol {1}_{\{ \langle e_{1},\gamma _{1}\gamma _{2}' v_{0}\rangle \equiv r(\operatorname {mod}q)\}} \sum_ {\gamma _{2}''\in\tilde{\varGamma }(q)\atop\|\gamma _{2}'\gamma _{2}''\|\asymp T_{2}} 1 \\ \ll& T_{1}^{\delta } q \biggl( \frac{1}{q^{2}}\ T_{2}^{\delta } + T_{2}^{\varTheta _{0}} \biggr) . \end{aligned}$$

Hence we have saved a whole power of q, as long as

$$ q<T_{2}^{(\delta -\varTheta _{0})/2} . $$

(5.16)

Case 2: $q\ge T_{2}^{{\delta -\varTheta _{0}\over2}}$

Then by (5.11) and (3.2), q is actually a very large power of T ₁,

$$ q\ge T_{1}^{10^{29}} . $$

(5.17)

In this range, we exploit Hilbert’s Nullstellensatz and effective versions of Bezout’s theorem; see a related argument in [7, Proof of Proposition 4.1].

Fixing γ ₂ in (5.13) (with $\ll T_{2}^{\delta }$ choices), we set

$$v:=\gamma _{2}v_{0}, $$

and play now with γ ₁. Let S be the set of γ ₁’s in question (and we now drop the subscript 1):

$$S=S_{v,q}(T_{1}):=\bigl\{\gamma \in \varGamma :\|\gamma \|\asymp T_{1}, \langle e_{1},\gamma v\rangle\equiv r(\operatorname {mod}q)\bigr\}. $$

This congruence restriction is to a modulus much bigger than the parameter, so we make the following claim.

Claim

There is an integer vector v _∗≠0 and an integer z _∗ such that

$$ \langle e_{1} , \gamma v_{*} \rangle=z_{*} $$

(5.18)

holds for all γ∈S. That is, the modular condition can be lifted to an exact equality.

First we assume the Claim and complete the proof of (5.12). Let q ₀ be a prime of size $\asymp T_{1}^{(\delta -\varTheta _{0})/2}$, say, such that $v_{*}\not \equiv0(\operatorname {mod}q_{0})$; then

$$\begin{aligned} |S| \ll& \# \bigl\{ \gamma \in \varGamma : \|\gamma \|<T_{1},\ \langle e_{1},\gamma v_{*}\rangle\equiv z_{*}( \operatorname {mod}q_{0}) \bigr\} \\ \ll& q_{0} \biggl( \frac{1}{q_{0}^{2}} T_{1}^{\delta } + T_{1}^{\varTheta _{0}} \biggr) \ll \frac{1}{q_{0}} T_{1}^{\delta } , \end{aligned}$$

by the argument in Case 1. Recall we assumed that q<N. Since q ₀ above is a small power of N, the above saves a tiny power of q, as desired.

It remains to establish the Claim. For each γ∈S, consider the condition

$$\langle e_{1},\gamma \, v\rangle = \sum_{1\le j\le4} \gamma _{1,j}\, v_{j}\equiv r(\operatorname {mod}q). $$

First massage the equation into one with no trivial solutions. Since v is a primitive vector, after a linear change of variables we may assume that (v ₁,q)=1. Then multiply through by $\bar{v}_{1}$, where $v_{1}\bar{v}_{1}\equiv 1(\operatorname {mod}q)$, getting

$$ \gamma _{1,1} + \sum_{2\le j\le4} \gamma _{1,j}\, v_{j}\bar{v}_{1}\equiv r \bar{v}_{1}(\operatorname {mod}q). $$

(5.19)

Now, for variables V=(V ₂,V ₃,V ₄) and Z, and each γ∈S, consider the (linear) polynomials $P_{\gamma }\in \mathbb {Z}[V,Z]$:

$$P_{\gamma }(V,Z):= \gamma _{1,1} + \sum_{2\le j\le4} \gamma _{1,j}\, V_{j}-Z, $$

and the affine variety

$$\mathcal {V}:=\bigcap_{\gamma \in S}\{P_{\gamma }=0\}. $$

If this variety $\mathcal {V}(\mathbb {C})$ is non-empty, then there is clearly a rational solution, $(V^{*},Z^{*})\in \mathcal {V}(\mathbb {Q})$. Hence we have found a rational solution to (5.18), namely $v^{*}=(1,V_{2}^{*},V_{3}^{*},V_{4}^{*})\neq0$ and z ^∗=Z ^∗. Since (5.18) is homogeneous, we may clear denominators, getting an integral solution, v _∗,z _∗.

Thus we henceforth assume by contradiction that the variety $\mathcal {V}(\mathbb {C})$ is empty. Then by Hilbert’s Nullstellensatz, there are polynomials $Q_{\gamma }\in \mathbb {Z}[V,Z]$ and an integer $\frak {d}\ge1$ so that

$$ \sum_{\gamma \in S}P_{\gamma }(V,Z)Q_{\gamma }(V,Z)=\frak {d}, $$

(5.20)

for all $(V,Z)\in \mathbb {C}^{4}$. Moreover, Hermann’s method [29] (see [36, Theorem IV]) gives effective bounds on the heights of Q _γ and $\frak {d}$ in the above Bezout equation. Recall the height of a polynomial is the logarithm of its largest coefficient (in absolute value); thus the polynomials P _γ are linear in four variables with height ≤logT ₁. Then Q _γ and $\frak {d}$ can be found so that

$$ \frak {d}\le e^{8^{4\cdot2^{4-1}-1}(\log T_{1}+8\log8)} \ll T_{1}^{10^{28}} . $$

(5.21)

(Much better bounds are known, see e.g. [1, Theorem 5.1], but these suffice for our purposes.)
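The exponent bookkeeping behind (5.17) and (5.21) is simple integer arithmetic and can be checked directly. The sketch below assumes, as we read (3.2), that $T_{2}=T_{1}^{\mathcal {C}}$; the sample values of δ and Θ ₀ are hypothetical, and only their difference enters.

```python
from fractions import Fraction

# Exponent of T_1 in (5.21): 8^(4 * 2^(4-1) - 1) = 8^31.
HERMANN_EXPONENT = 8 ** (4 * 2 ** (4 - 1) - 1)

def case2_exponent(delta, theta_0):
    """Exponent e with T_2^{(delta - theta_0)/2} = T_1^e, assuming T_2 = T_1^C."""
    gap = Fraction(delta) - Fraction(theta_0)
    big_c = Fraction(10 ** 30) / gap  # the constant C of (5.11)
    return big_c * gap / 2

# Hypothetical sample values; the result does not depend on them.
lower = case2_exponent(Fraction(13, 10), Fraction(129, 100))
```

The lower bound on q in Case 2 thus has exponent 5·10^29, while the upper bound on $\frak {d}$ has exponent 8^31, which is less than 10^28.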

On the other hand, reducing (5.20) modulo q and evaluating at

$$V_{0}=( v_{2}\bar{v}_{1}, v_{3}\bar{v}_{1}, v_{4}\bar{v}_{1}), \qquad Z_{0}=r\bar{v}_{1}, $$

we have

$$\sum_{\gamma \in S}P_{\gamma }(V_{0},Z_{0}) Q_{\gamma }(V_{0},Z_{0})\equiv0\equiv \frak {d}(\operatorname {mod}q), $$

by (5.19). But then since $\frak {d}\ge1$, we in fact have $\frak {d}\ge q$, which is incompatible with (5.21) and (5.17). This furnishes our desired contradiction, completing the proof. □
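The mechanism of the contradiction can be seen in a toy case. Take two polynomials of the same shape as the P _γ but with a single variable V, namely P ₁=1+2V−Z and P ₂=4+2V−Z (hypothetical coefficients chosen by us). They have no common complex zero, and P ₂−P ₁=3 plays the role of (5.20) with $\frak {d}=3$. A common zero modulo q can then exist only if q divides 3, which the brute-force search below illustrates.

```python
def common_root_exists(polys, q):
    """Search for (V, Z) mod q with c0 + c1*V - Z = 0 (mod q) for every (c0, c1)."""
    return any(
        all((c0 + c1 * v - z) % q == 0 for c0, c1 in polys)
        for v in range(q)
        for z in range(q)
    )

# P_1 = 1 + 2V - Z and P_2 = 4 + 2V - Z, encoded by (constant term, V-coefficient).
POLYS = [(1, 2), (4, 2)]

# Moduli up to 30 for which both congruences are simultaneously solvable.
solvable_moduli = [q for q in range(1, 31) if common_root_exists(POLYS, q)]
```

In the proof the same divisibility forces $\frak {d}\ge q$, and the height bound (5.21) is what rules this out for q as large as in (5.17).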

Next we need a slight generalization of Lemma 5.1, which will be used in the major arcs analysis, see (6.6).

Lemma 5.3

Let $1<K\le T_{2}^{1/10}$, fix |β|<K/N, and fix x,y≍X. Then for any γ ₀∈Γ, any q≥1, and any group $\tilde{\varGamma }(q)$ satisfying (5.1), we have

$$\begin{aligned} \sum_ {\gamma \in \frak {F}\cap\{ \gamma _{0}\tilde{\varGamma }(q)\}} e \bigl( \beta \, \frak {f}_{\gamma }(2x,y) \bigr) =& {1\over[\varGamma :\tilde{\varGamma }(q)]} \sum_ {\gamma \in \frak {F}} e \bigl( \beta \, \frak {f}_{\gamma }(2x,y) \bigr) \\ & {} + O\bigl( T^{\varTheta } K\bigr) , \end{aligned}$$

(5.22)

where Θ<δ depends only on the spectral gap for Γ, and the implied constant does not depend on q, γ ₀, β, x or y.

Proof

The proof follows with minor changes that of Lemma 5.1, so we give a sketch; see also [12, Sect. 4].

According to the construction (3.3) of $\frak {F}$, the γ’s in question satisfy $\gamma =\gamma _{1}\gamma _{2}\in \gamma _{0}\tilde{\varGamma }(q)$, and hence we can write

$$\gamma _{2}=\gamma _{1}^{-1}\gamma _{0} \gamma _{2}', $$

with $\gamma _{2}'\in\tilde{\varGamma }(q)$. Then $\gamma _{2}'=\gamma _{0}^{-1}\gamma _{1}\gamma _{2}$, and using (2.15), we can write the left hand side of (5.22) as

$$\sum_{\gamma _{1}\in \varGamma \atop T_{1}<\|\gamma _{1}\|<2T_{1}} \sum_{\gamma _{2}'\in\tilde{\varGamma }(q)\atop T_{2}<\|\gamma _{1}^{-1}\gamma _{0}\gamma _{2}'\|<2T_{2}} \boldsymbol {1}_{\{\langle e_{1},\gamma _{0}\gamma _{2}'\,v_{0}\rangle>T/100\}}\ e \bigl( \beta \, \bigl\langle w_{x,y}, \gamma _{0}\gamma _{2}'\,v_{0}\bigr\rangle \bigr) . $$

Now we fix γ ₁ and mimic the proof of Lemma 5.1 in $\gamma _{2}'$.

Replace (5.4) by

$$f(g):= \boldsymbol {1}_{\{T_{2}<\|\gamma _{1}^{-1}g\|<2T_{2}\}} \boldsymbol {1}_{\{\langle e_{1},g\,v_{0}\rangle>T/100\}}\ e \bigl( \beta \, \langle w_{x,y}, g\,v_{0}\rangle \bigr) . $$

Then (5.5)–(5.7) remain essentially unchanged, save cosmetic changes such as replacing (5.6) by $F_{q}(\gamma _{1}\gamma _{0}^{-1},e)$. Then in the estimation of the difference $|\mathcal {N}_{q}-\mathcal {H}_{q}|$ by splitting the sum on $\gamma _{2}'$ into ranges, the argument now proceeds as follows.

(1)
The range (5.7) should be replaced by
$$\begin{aligned} &\big\|\gamma _{1}\gamma _{0}^{-1}\gamma _{2}' \big\|<T_{2}(1-10\eta),\quad\text{or}\quad \big\|\gamma _{1}\gamma _{0}^{-1} \gamma _{2}'\big\|>2T_{2}(1+10\eta), \\ &\quad\text{or}\quad\bigl\langle e_{1},\gamma _{1}\gamma _{0}^{-1} \gamma _{2}'\,v_{0}\bigr\rangle<\frac{T}{100}(1-10 \eta). \end{aligned}$$
(2)
The range (5.8) should be replaced by the range
$$\begin{aligned} &T_{2}(1+10\eta)<\big\|\gamma _{1}\gamma _{0}^{-1} \gamma _{2}'\big\|<2T_{2}(1-10\eta),\quad \text{and}\\ &\bigl \langle e_{1},\gamma _{1}\gamma _{0}^{-1} \gamma _{2}'\,v_{0}\bigr\rangle>\frac{T}{100}(1+10 \eta), \end{aligned}$$
in which f is differentiable. Here instead of the difference $|f(\gamma _{1}\gamma _{0}^{-1}\cdot g\gamma _{2}'h)-f(\gamma _{1}\gamma _{0}^{-1}\gamma _{2}')|$ vanishing, it is now bounded by
$$\ll\eta K, $$
for a net contribution to the error of ≪ηKT ^δ.
(3)
In the remaining range, (5.9) remains unchanged, using |f|≤1.

The error in (5.10) is then replaced by

$$O\bigl(\eta\, K\, T_{2}^{\delta } + T_{2}^{\delta -\varepsilon } \eta^{-10}\bigr). $$

Optimizing η and renaming Θ gives the bound $O(T_{2}^{\varTheta }K^{10/11})$, which is better than claimed in the power of K. Rename Θ once more using (3.2) and (5.11), giving (5.22). □

The following is our last counting lemma, showing a certain equidistribution among the values of $\frak {f}_{\gamma }(2x,y)$ at the scale N/K. This bound is used in the major arcs, see the proof of Theorem 6.1.

Lemma 5.4

Fix N/2<n<N, $1<K\le T_{2}^{1/10}$, and x,y≍X. Then

$$ \sum_ {\gamma \in \frak {F}} \boldsymbol {1}_{ \{ | \frak {f}_{\gamma }(2x,y)- n| < \frac{N}{K} \} } \gg {T^{\delta }\over K} + T^{\varTheta } , $$

(5.23)

where Θ<δ only depends on the spectral gap for Γ. The implied constant is independent of x,y, and n.

Sketch

The proof is an explicit calculation nearly identical to the one given in [12, Sect. 5]; we give only a sketch here. Write the left hand side of (5.23) as

$$\sum_{\gamma _{1}\in \varGamma \atop T_{1}<\|\gamma _{1}\|<2T_{1}} \sum_{\gamma _{2}\in \varGamma \atop T_{2}<\|\gamma _{2}\|<2T_{2}} \boldsymbol {1}_{\{ \langle e_{1},\gamma _{1}\gamma _{2}v_{0}\rangle > T/100 \}} \boldsymbol {1}_{\{ |\langle w_{x,y},\gamma _{1}\gamma _{2}v_{0}\rangle-n|<N/K \}} . $$

Fix γ ₁ and express the condition on γ ₂ as γ ₂∈R⊂G, where R is the region

$$R= R_{\gamma _{1},x,y,n}:= \left \{ g\in G :\ \begin{array}{c} T_{2}<\|g\|<2T_{2} \\ \langle \gamma _{1}^{t}e_{1},g\, v_{0}\rangle>T/100 \\ |\langle \gamma _{1}^{t}w_{x,y}, g\, v_{0} \rangle -n| < \frac{N}{K} \end{array} \right \} . $$

Lift $G=\operatorname {SO}_{F}(\mathbb {R})$ to its spin cover $\tilde{G}= \operatorname {SL}_{2}(\mathbb {C})$ via the map ι of (2.18). Let $\tilde{R}\subset\tilde{G}$ be the corresponding pullback region, and decompose $\tilde{G}$ into Cartan KAK coordinates according to (4.9). Note that ι is quadratic in the entries, so, e.g., the condition

$$ \|g\|^{2}\asymp T\quad \mbox{gives}\ \big\|\iota(g)\big\|\asymp T, $$

(5.24)

explaining the factor ∥a(g)∥² appearing in (4.10).

Then chop $\tilde{R}$ into spherical caps and apply Theorem 4.5. The same argument as in [12, Sect. 5] then leads to (5.23), after renaming Θ; we suppress the details. □

5.2 Local analysis statements

In this subsection, we study a certain exponential sum which arises in a crucial way in our estimates. Fix $\frak {f}\in \frak {F}$, and write $\frak {f}=f-a$ with

$$f(x,y)=Ax^{2}+2Bxy+Cy^{2} $$

according to (2.14). Let q ₀≥1, fix r with (r,q ₀)=1, and fix $n,m\in \mathbb {Z}$. (The notation is meant to be consistent with its later use; there will be another parameter q, and q ₀ will be a divisor of q.) Define the exponential sum

$$ \mathcal {S}_{f}(q_{0},r;n,m) := {1\over q_{0}^{2}} \sum_{k(q_{0})} \sum _{\ell( q_{0})} e_{q_{0}} \bigl( r f(k,\ell) +nk+m\ell \bigr) . $$

(5.25)

This sum appears naturally in many places in the minor arcs analysis, see e.g. (7.4) and (9.2). Our first lemma is completely standard, see, e.g. [30, Sect. 12.3].

Lemma 5.5

With the above conditions,

$$ \big| \mathcal {S}_{f}(q_{0},r;n,m) \big| \le q_{0}^{-1/2} . $$

(5.26)

Remark 5.6

Since $\mathcal {S}_{f}$ is a sum in two variables, one might expect square-root cancellation in each, giving a savings of $q_{0}^{-1}$; indeed this is what we obtain, modulo some coprimality conditions, see (5.29). For some of our applications, saving just one square-root is plenty, and we can ignore the coprimality; hence the cleaner statement in (5.26).

Proof

Write $\mathcal {S}_{f}$ for $\mathcal {S}_{f}(q_{0},r;n,m)$. Note first that $\mathcal {S}_{f}$ is multiplicative in q ₀, so we study the case q ₀=p ^j is a prime power. Assume for simplicity (q ₀,2)=1; similar calculations are needed to handle the 2-adic case.

First we re-express $\mathcal {S}_{f}$ in a more convenient form. By Descartes theorem (2.1), primitivity of the gasket, and (2.13), we have that (A,B,C)=1; assume henceforth that (C,q ₀)=1, say. Write $\bar{x}$ for the multiplicative inverse of x (the modulus will be clear from context). Recall throughout that (r,q ₀)=1.

Looking at the terms in the summand of $\mathcal {S}_{f}$, we have

$$\begin{aligned} & r f(k,\ell) + nk+m\ell \quad(\operatorname {mod}q_{0}) \\ &\quad\equiv r \bigl(Ak^{2}+2Bk\ell+C\ell^{2}\bigr) + nk+m\ell \\ &\quad\equiv rC(\ell +B\bar{C}k )^{2} + r\bar{C}k^{2} \bigl(AC - B^{2} \bigr) + nk+m\ell \\ &\quad\equiv rC(\ell +B\bar{C}k )^{2} + a^{2} r\bar{C} k^{2} + nk+m\ell \\ &\quad\equiv rC(\ell +B\bar{C}k + \overline{2rC} m )^{2} -\overline{4rC} m^{2} + a^{2} r\bar{C} k^{2} + k(n - B\bar{C} m) , \end{aligned}$$

where we used (2.16). Hence we have

$$\begin{aligned} \mathcal {S}_{f} =& {1\over q_{0}^{2}} e_{q_{0}} \bigl( -\overline{4rC} m^{2} \bigr) \sum _{k( q_{0})} e_{q_{0}} \bigl( a^{2} r\bar{C} k^{2} + k(n - B\bar{C} m) \bigr) \\ &{} \times \sum_{\ell( q_{0})} e_{q_{0}} \bigl( rC(\ell +B\bar{C}k + \overline{2rC} m )^{2} \bigr) , \end{aligned}$$

and the ℓ sum is just a classical Gauss sum. It can be evaluated explicitly, see e.g. [30, Eq. (3.38)]. Let

$$\varepsilon _{q_{0}}:= \begin{cases} 1&\mbox{if}\ q_{0}\equiv1(\operatorname {mod}4)\\ i&\mbox{if}\ q_{0}\equiv3(\operatorname {mod}4). \end{cases} $$

Then the Gauss sum on ℓ is $\varepsilon _{q_{0}}\sqrt{q_{0}} ({rC\over q_{0}} )$, where $({\cdot\over q_{0}})$ is the Legendre symbol. Thus we have

$$\begin{aligned} \mathcal {S}_{f} =& {\varepsilon _{q_{0}}\over q_{0}^{3/2}} \biggl( {rC\over q_{0}} \biggr) e_{q_{0}} \bigl( -\overline{4rC} m^{2} \bigr) \sum_{k( q_{0})} e_{q_{0}} \bigl( a^{2} r\bar{C} k^{2} + k(n - B\bar{C} m) \bigr) . \end{aligned}$$
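The completion of the square used above is a congruence between two integer expressions modulo q ₀ and can be verified by brute force. In the sketch below the form (A,B,C)=(2,1,5), for which AC−B ²=3², and the modulus q ₀=21 are hypothetical choices satisfying (C,q ₀)=1 and q ₀ odd; modular inverses are computed with Python's three-argument `pow`.

```python
def phase_original(A, B, C, r, n, m, k, l, q):
    """r f(k, l) + n k + m l modulo q, with f(k, l) = A k^2 + 2 B k l + C l^2."""
    return (r * (A * k * k + 2 * B * k * l + C * l * l) + n * k + m * l) % q

def phase_completed(A, B, C, r, n, m, k, l, q):
    """The same phase after completing the square in l (last line of the display)."""
    a_sq = A * C - B * B              # equals a^2 by (2.16)
    c_bar = pow(C, -1, q)             # inverse of C modulo q
    inv_2rc = pow(2 * r * C, -1, q)   # inverse of 2rC modulo q
    inv_4rc = pow(4 * r * C, -1, q)   # inverse of 4rC modulo q
    shifted = l + B * c_bar * k + inv_2rc * m
    return (
        r * C * shifted * shifted
        - inv_4rc * m * m
        + a_sq * r * c_bar * k * k
        + k * (n - B * c_bar * m)
    ) % q

def identity_holds(A, B, C, r, n, m, q):
    """Compare both phases for every pair (k, l) modulo q."""
    return all(
        phase_original(A, B, C, r, n, m, k, l, q)
        == phase_completed(A, B, C, r, n, m, k, l, q)
        for k in range(q)
        for l in range(q)
    )
```

For example, `identity_holds(2, 1, 5, 4, 3, 7, 21)` compares the two expressions over all 441 residue pairs.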

Let

$$ \tilde{q}_{0}:=\bigl(a^{2},q_{0} \bigr), \qquad q_{1}:=q_{0}/\tilde{q}_{0},\quad \text{and }\quad a_{1}:=a^{2}/\tilde{q}_{0}, $$

(5.27)

so that a ²/q ₀=a ₁/q ₁ in lowest terms. Break the sum on 0≤k<q ₀ according to $k= k_{1}+q_{1}\tilde{k}$, with 0≤k ₁<q ₁ and $0\le\tilde{k}< \tilde{q}_{0}$. Then

$$\begin{aligned} \mathcal {S}_{f} =& {\varepsilon _{q_{0}}\over q_{0}^{3/2}} \biggl( {rC\over q_{0}} \biggr) e_{q_{0}} \bigl( -\overline{4rC} m^{2} \bigr) \\ & {} \times \sum_{k_{1}( q_{1})} e_{q_{1}} \bigl( a_{1} r\bar{C} (k_{1})^{2} \bigr) e_{q_{0}} \bigl( {k_{1} } (n - B\bar{C} m) \bigr) \\ & {} \times \sum_{\tilde{k}( \tilde{q}_{0})} e_{\tilde{q}_{0}} \bigl( {\tilde{k}} (n - B\bar{C} m) \bigr) . \end{aligned}$$

The last sum vanishes unless $n-B\bar{C}m\equiv0$ $(\operatorname {mod}\tilde{q}_{0})$, in which case it is $\tilde{q}_{0}$. In the latter case, define L by

$$ L:=(Cn-Bm )/ \tilde{q}_{0} . $$

(5.28)

Then we have

$$\begin{aligned} \mathcal {S}_{f} =& \boldsymbol {1}_{ nC\equiv mB( \tilde{q}_{0}) } {\varepsilon _{q_{0}}\over q_{0}^{3/2}} \biggl({rC\over q_{0}} \biggr) e_{q_{0}} \bigl( -\overline{4rC} m^{2} \bigr) \\ & {} \times e_{q_{1}} \bigl( - \overline{4a_{1}rC} L^{2} \bigr) \biggl[ \sum_{k_{1}( q_{1})} e_{q_{1}} \bigl( a_{1} r\bar{C} ( k_{1} + \overline{2a_{1}r} L )^{2} \bigr) \biggr] \tilde{q}_{0} . \end{aligned}$$

The Gauss sum in brackets is again evaluated as $\varepsilon _{q_{1}} q_{1}^{1/2} ( { a_{1} r\bar{C} \over q_{1} } ) $, so we have

$$\begin{aligned} \mathcal {S}_{f}(q_{0},r;n,m) =& \boldsymbol {1}_{ nC\equiv mB(\tilde{q}_{0}) } { \varepsilon _{q_{0}}\varepsilon _{q_{1}} \tilde{q}_{0} ^{1/2} \over q_{0} } e_{q_{0}} \bigl( - \overline{4rC} m^{2} \bigr) \\ & {} \times e_{q_{1}} \bigl( - \overline{4a_{1}rC} L^{2} \bigr) \biggl({rC\over q_{0}} \biggr) \biggl( { a_{1} r\bar{C} \over q_{1} } \biggr) . \end{aligned}$$

(5.29)

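The claim then follows trivially: since $\tilde{q}_{0}\le q_{0}$, (5.29) gives $|\mathcal {S}_{f}|\le \tilde{q}_{0}^{1/2}/q_{0}\le q_{0}^{-1/2}$. □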
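Lemma 5.5 is easy to test numerically for small moduli. The sketch below evaluates (5.25) directly and compares its absolute value with the magnitude read off from (5.29), namely $\tilde{q}_{0}^{1/2}/q_{0}$ when nC≡mB modulo $\tilde{q}_{0}$ and zero otherwise. The form (A,B,C)=(2,1,5) with AC−B ²=3² and the moduli used are hypothetical choices of ours with q ₀ odd and (C,q ₀)=1.

```python
import cmath
import math

def S_f(A, B, C, q0, r, n, m):
    """Direct evaluation of the normalized exponential sum (5.25)."""
    total = 0j
    for k in range(q0):
        for l in range(q0):
            phase = (r * (A * k * k + 2 * B * k * l + C * l * l) + n * k + m * l) % q0
            total += cmath.exp(2j * math.pi * phase / q0)
    return total / q0 ** 2

def predicted_abs(A, B, C, q0, n, m):
    """Magnitude suggested by (5.29), for odd q0 coprime to C."""
    q0_tilde = math.gcd(A * C - B * B, q0)  # (a^2, q0), cf. (5.27)
    if (n * C - m * B) % q0_tilde != 0:
        return 0.0
    return math.sqrt(q0_tilde) / q0

def check_modulus(A, B, C, q0, tol=1e-9):
    """Check the exact magnitude and the bound (5.26) for all r, n, m modulo q0."""
    for r in range(1, q0):
        if math.gcd(r, q0) != 1:
            continue
        for n in range(q0):
            for m in range(q0):
                value = abs(S_f(A, B, C, q0, r, n, m))
                if abs(value - predicted_abs(A, B, C, q0, n, m)) > tol:
                    return False
                if value > q0 ** -0.5 + tol:
                    return False
    return True
```

For q ₀=9 and this form, $\tilde{q}_{0}=q_{0}$, so the bound (5.26) is attained whenever the congruence condition holds.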

Next we introduce a certain average of a pair of such sums. Let f,q ₀,r,n, and m be as before, and fix q≡0 $(\operatorname {mod}q_{0})$ and (u ₀,q ₀)=1. Let $\frak {f}'\in \frak {F}$ be another shifted form $\frak {f}'=f'-a'$, with

$$f'(x,y)=A'x^{2}+2B'xy+C'y^{2}. $$

Also let $n',m'\in \mathbb {Z}$. Then define

$$\begin{aligned} \mathcal {S} =& \mathcal {S}\bigl(q,q_{0},f,f',n,m,n',m';u_{0} \bigr) \\ :=& \sideset{} {'} \sum_{r(q)} \mathcal {S}_{f}(q_{0},ru_{0};n,m) \overline{ \mathcal {S}_{f'}\bigl(q_{0},ru_{0};n',m' \bigr)} e_{q}\bigl(r \bigl(a'- a\bigr)\bigr) . \end{aligned}$$

(5.30)

This sum also appears naturally in the minor arcs analysis, see (8.2) and (9.4).

Lemma 5.7

With the above notation, we have the estimate

$$ |\mathcal {S}| \ll ({q/ q_{0}} )^{2} { \{(a^{2},q_{0}) \cdot ((a')^{2},q_{0})\} ^{1/2} \over q^{5/4} } \bigl(a-a',q\bigr)^{1/4} . $$

(5.31)

Remark 5.8

Treating all gcd’s above as 1 and pretending q=q ₀, the trivial bound here (after having saved essentially a whole q from each of the two $\mathcal {S}_{f}$ sums) is 1/q, since the r sum is unnormalized. So (5.31) saves an extra q ^1/4 in the r sum. (In fact we could have saved the expected q ^1/2, but this does not improve our final estimates.)

Proof

Observe that $\mathcal {S}$ is multiplicative in q, so we again consider the prime power case q=p ^j, p≠2; then q ₀ is also a prime power, since q ₀∣q. As before, we may assume (C,q ₀)=(C′,q ₀)=1.

Recall a ₁, $\tilde{q}_{0}$, and L given in (5.27) and (5.28), and let $a_{1}'$, $\tilde{q}_{0}'$ and L′ be defined similarly. Inputting the analysis from (5.29) into both $\mathcal {S}_{f}$ and $\mathcal {S}_{f'}$, we have

$$\begin{aligned} \mathcal {S} =& \boldsymbol {1}_{ nC\equiv mB(\tilde{q}_{0}) \atop n'C'\equiv m'B'(\tilde{q}'_{0}) } { \varepsilon _{q_{1}} \bar{\varepsilon }_{q_{1}'} (\tilde{q}_{0}\tilde{q}'_{0})^{1/2} \over q_{0}^{2} } \biggl({CC'\over q_{0}} \biggr) \biggl( { a_{1} u_{0}\bar{C} \over q_{1} } \biggr) \biggl( { a_{1}' u_{0}\bar{C}' \over q_{1}' } \biggr) \\ &{} \times \biggl[ \sideset{} {'} \sum _{r(q)} \biggl( { r\over q_{1} } \biggr) \biggl( { r\over q_{1}' } \biggr) e_{q}\bigl(r \bigl\{a'- a\bigr\}\bigr) \\ &{} \times e_{q_{0}} \biggl( \overline{ 4 r u_{0} } \biggl\{ \overline{C'} \bigl(m'\bigr)^{2} -\overline{C} m^{2} + \overline{a_{1}'C'} \bigl(L' \bigr)^{2} \tilde{q}'_{0} - \overline{a_{1}C} L^{2} \tilde{q}_{0} \biggr\} \biggr) \biggr] . \end{aligned}$$

(5.32)

The term in brackets [⋅] is a Kloosterman- or Salié-type sum, for which we have an elementary bound [32] to the power 3/4:

$$\begin{aligned} |\mathcal {S}| \ll& { (\tilde{q}_{0} \tilde{q}'_{0}) ^{1/2} \over q_{0}^{2} } q^{3/4} \bigl(a-a',q\bigr)^{1/4} , \end{aligned}$$

giving the claim. (There is no improvement in our use of this estimate from appealing to Weil’s bound instead of Kloosterman’s; any power gain suffices.) □
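For orientation, the classical Kloosterman sum $\sum'_{r(q)} e_{q}(ar+b\bar{r})$ can be computed directly for small q. The sketch below does so and compares with Weil's bound 2√p for prime modulus p; it treats only the classical sum, not the bracketed sum in (5.32), which also carries the quadratic characters, and the function name is ours.

```python
import cmath
import math

def kloosterman(a, b, q):
    """Classical Kloosterman sum over r coprime to q (intended for q >= 2)."""
    total = 0j
    for r in range(1, q):
        if math.gcd(r, q) == 1:
            r_bar = pow(r, -1, q)  # inverse of r modulo q
            total += cmath.exp(2j * math.pi * ((a * r + b * r_bar) % q) / q)
    return total

def weil_bound_holds(p, tol=1e-9):
    """Check |K(a, b; p)| <= 2 sqrt(p) for all nonzero a, b modulo a prime p."""
    return all(
        abs(kloosterman(a, b, p)) <= 2 * math.sqrt(p) + tol
        for a in range(1, p)
        for b in range(1, p)
    )
```

Any bound of the shape $q^{1-\eta}$ with η>0 would serve in the proof above, which is why the weaker exponent 3/4 is enough.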

In the case a=a′, (5.31) only saves one power of q, and in Sect. 9 we will need slightly more; see the proof of (9.10). We get a bit more cancellation in the special case f(m,−n)≠f′(m′,−n′) below.

Lemma 5.9

Assuming a=a′ and f(m,−n)≠f′(m′,−n′), we have the estimate

$$ |\mathcal {S}| \ll (q/q_{0})^{5} { (a^{2},q_{0}) \over q^{9/8} } \cdot \big|f(m,-n)-f' \bigl(m',-n'\bigr)\big|^{1/2} . $$

(5.33)

Proof

Assume first that q (and hence q ₀) is a prime power, continuing to omit the prime 2. Returning to the definition of $\mathcal {S}$ in (5.30), it is clear in the case a=a′ that

$$\sum_{r(q)}'=(q/q_{0})\sum _{r(q_{0})}'. $$

Hence we again apply Kloosterman’s 3/4th bound to (5.32), getting

$$\begin{aligned} |\mathcal {S}| \ll& \boldsymbol {1}_{ nC\equiv mB(\tilde{q}_{0}) \atop n'C'\equiv m'B'(\tilde{q}'_{0}) } (q/q_{0})^{9/2} { (a^{2},q_{0}) \over q^{5/4} } \\ & \times \prod_{p^{j}\| q_{0}} \bigl( p^{j} , \bar{4} \bigl\{ \overline{C'} \bigl(m' \bigr)^{2} -\overline{C} m^{2} + \overline{a_{1}} \bigl(a^{2},p^{j} \bigr) \bigl( \overline{C'} \bigl(L' \bigr)^{2} - \overline{C} L^{2} \bigr) \bigr\} \bigr)^{1/4} , \end{aligned}$$

(5.34)

which is valid now without the assumption that q ₀ is a prime power. (Here a ₁ satisfies a ²=a ₁(a ²,p ^j) as in (5.27), and L is given in (5.28), so both depend on p ^j.)

Break the primes dividing q ₀ into two sets, $\mathcal {P}_{1}$ and $\mathcal {P}_{2}$, defining $\mathcal {P}_{1}$ to be the set of those primes p for which

$$\begin{aligned} \overline{C} m^{2} + \overline{C} L^{2} \overline{a_{1}} \bigl(a^{2},p^{j}\bigr) \equiv \overline{C'} \bigl(m'\bigr)^{2} + \overline{C'} \bigl(L' \bigr)^{2} \overline{a_{1}} \bigl(a^{2},p^{j}\bigr)\quad \bigl(\operatorname {mod}p^{\lceil j/2\rceil} \bigr) , \end{aligned}$$

(5.35)

and $\mathcal {P}_{2}$ the rest. For the latter, the gcd in (p ^j,…) of (5.34) is at most p ^j/2, so we clearly have

$$ \prod_{p^{j}\| q_{0}\atop p\in \mathcal {P}_{2}}\bigl(p^{j}, \ldots\bigr)^{1/4} \le \prod_{p^{j}\| q_{0} }p^{j/8} = q_{0}^{1/8} . $$

(5.36)

For $p\in \mathcal {P}_{1}$, we multiply both sides of (5.35) by

$$a^{2}=AC-B^{2}=A'C'-\bigl(B' \bigr)^{2}=a_{1} \bigl(a^{2},p^{j}\bigr) , $$

giving

$$\begin{aligned} & \bigl(AC-B^{2}\bigr) \overline{C} m^{2} + \overline{C} L^{2} \bigl(a^{2},p^{j} \bigr)^{2} \\ &\quad \equiv \bigl(A'C'- \bigl(B'\bigr)^{2}\bigr)\overline{C'} \bigl(m'\bigr)^{2} + \overline{C'} \bigl(L'\bigr)^{2} \bigl(a^{2},p^{j} \bigr)^{2} \quad\bigl(\operatorname {mod}p^{\lceil j/2\rceil}\bigr) . \end{aligned}$$

(5.37)

Using (5.28) in the form

$$nC- mB = \bigl(a^{2},p^{j}\bigr) L , \qquad n'C'- m'B' = \bigl(a^{2},p^{j}\bigr) L' $$

and subtracting a from both sides of (5.37), we have shown that

$$ f'\bigl(m',-n'\bigr) \equiv f(m,-n) \quad\bigl(\operatorname {mod}p^{\lceil j/2\rceil}\bigr) . $$

(5.38)
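The passage from (5.37) to (5.38) rests on the integer identity $(AC-B^{2})m^{2}+(nC-mB)^{2}=C\cdot f(m,-n)$, after which one multiplies by $\overline{C}$. The following sketch verifies the identity on a grid of integers; the function names are ours.

```python
def form_value(A, B, C, x, y):
    """The binary quadratic form f(x, y) = A x^2 + 2 B x y + C y^2."""
    return A * x * x + 2 * B * x * y + C * y * y

def left_side(A, B, C, m, n):
    """(AC - B^2) m^2 + (nC - mB)^2, the numerator appearing in (5.37)."""
    return (A * C - B * B) * m * m + (n * C - m * B) ** 2

def identity_on_grid(bound):
    """Check left_side == C * f(m, -n) for all integers in [-bound, bound]."""
    rng = range(-bound, bound + 1)
    return all(
        left_side(A, B, C, m, n) == C * form_value(A, B, C, m, -n)
        for A in rng for B in rng for C in rng for m in rng for n in rng
    )
```

Since the identity holds over the integers, it holds modulo every prime power, which is all that is used.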

Let

$$Z=\big|f(m,-n)-f'\bigl(m',-n'\bigr)\big| . $$

By assumption Z≠0. Moreover (5.38) implies that

$$\biggl( \prod_{p\in \mathcal {P}_{1}}p^{\lceil j/2\rceil} \biggr) \mid Z , $$

and hence

$$ \prod_{p^{j}\|q_{0}\atop p\in \mathcal {P}_{1}}p^{j/4} \le Z^{1/2} . $$

(5.39)

Combining (5.39) and (5.36) in (5.34) gives the claim. □

Finally we need some savings in the case a=a′ and f(m,−n)=f′(m′,−n′). This will no longer come from $\mathcal {S}$ itself, but from the following supplementary lemmata.

Lemma 5.10

Fix an equivalence class $\mathcal {K}$ of primitive binary quadratic forms of discriminant $-4a^{2}$. We claim that the number of equivalent forms $f\in \mathcal {K}$ with $\frak {f}=f-a\in \frak {F}$ is bounded, that is,

$$ \#\{\frak {f}\in \frak {F}:f\in \mathcal {K}\}=O(1). $$

(5.40)

Proof

From (2.13), (3.3), and (2.16), we have that $f(m,n)=Am^{2}+2Bmn+Cn^{2}$ has coefficients of size

$$A,B,C\ll T, $$

and $AC-B^{2}=a^{2}$, with a≍T. It follows that $AC\asymp T^{2}$, and hence

$$ A,C\asymp T. $$

(5.41)

Now suppose we have $\frak {f}=f-a$ and $\frak {f}'=f'-a$ with f as above and f′ having coefficients A′,B′,C′. If f and f′ are equivalent then there is an integral matrix $\bigl(\begin{smallmatrix} g & h\\ i & j\end{smallmatrix}\bigr)$ of determinant ±1 so that

$$\begin{aligned} A' =& g^{2} A + 2 gi B + i^{2} C, \\ B' =& gh A + (gj+hi) B + ij C , \\ C' =& h^{2} A + 2 hj B + j^{2} C. \end{aligned}$$

(5.42)

The first line can be rewritten as

$$A' = C ( i + g B /C )^{2} + g^{2} {a^{2}\over C} , $$

so that

$$g^{2} \le A' {C\over a^{2}} \ll 1 . $$

Similarly,

$$( i + g B /C )^{2} \le {A' \over C} \ll 1 , $$

and hence |i|≪1. In a similar fashion, we see that |h| and |j| are also bounded, thus the number of equivalent forms in $\mathcal {K}$ is bounded, as claimed. □
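The two algebraic facts behind this proof can be checked mechanically. The following Python sketch (the names `transform` and `completed_square` are illustrative and do not come from the paper) applies the substitution behind (5.42), and evaluates the completed square for A′ in exact rational arithmetic, in the form with $g^{2}(AC-B^{2})/C$. It is a sanity check under these stated conventions, not part of the argument.

```python
from fractions import Fraction

def transform(A, B, C, g, h, i, j):
    """Coefficients (A', B', C') of f(g*m + h*n, i*m + j*n), as in (5.42)."""
    A2 = g * g * A + 2 * g * i * B + i * i * C
    B2 = g * h * A + (g * j + h * i) * B + i * j * C
    C2 = h * h * A + 2 * h * j * B + j * j * C
    return A2, B2, C2

def completed_square(A, B, C, g, i):
    """C*(i + g*B/C)^2 + g^2*(A*C - B^2)/C, computed exactly."""
    C = Fraction(C)
    return C * (i + g * B / C) ** 2 + g * g * (A * C - B * B) / C

# Example: f = 5m^2 + 4mn + 8n^2 has AC - B^2 = 36, so a = 6.
A, B, C = 5, 2, 8
A2, B2, C2 = transform(A, B, C, 2, 1, 1, 1)   # matrix of determinant 1
print(A2, B2, C2, A2 * C2 - B2 * B2)           # AC - B^2 is preserved
print(completed_square(A, B, C, 2, 1) == A2)   # the identity for A'
```

Since both terms of the completed square are non-negative, each of them is at most A′, which is what bounds g and i.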

Lemma 5.11

For a fixed large integer z, the number of inequivalent classes $\mathcal {K}$ of primitive quadratic forms of discriminant $-4a^{2}$ which represent z is

$$ \ll_{\varepsilon }\ z^{\varepsilon }\cdot \bigl(z,4a^{2}\bigr)^{1/2} , \quad \textit{for any}\ \varepsilon >0. $$

(5.43)

Proof

If $f\in \mathcal {K}$ represents z, say f(m,n)=z, then, setting w=(m,n), f represents $z_{1}:=z/w^{2}$ primitively. We see from (5.42) that f is then in the same class as $f_{1}(m,n)=z_{1}m^{2}+2Bmn+Cn^{2}$, with

$$-4a^{2}=z_{1}C-B^{2} . $$

Moreover, by a unipotent change of variables preserving $z_{1}$, we can force B into the range $[0,z_{1})$, that is, B is determined mod $z_{1}$. So the number of inequivalent such $f_{1}$ is equal to

$$ \#\bigl\{B(\operatorname {mod}z_{1}):B^{2} \equiv-4a^{2}(z_{1})\bigr\} = \prod _{p^{e}\mid\mid z_{1}} \#\bigl\{B^{2}\equiv-p^{2f} \bigl(p^{e}\bigr)\bigr\} , $$

(5.44)

where $p^{f}\mid\mid 2a$. If 2f≥e, then the number of local solutions is at most $p^{e/2}$. Otherwise, write $B=B_{1}p^{f}$; then there are at most 2 solutions to $B_{1}^{2}\equiv-1(\operatorname {mod}p^{e-2f})$, and there are $p^{f}$ values for B once $B_{1}$ is determined. Hence the number of local solutions is at most $2\cdot\min(p^{e/2},p^{f})$, so the number of solutions to (5.44) is at most

$$2^{\omega (z)}\bigl(z_{1},4a^{2}\bigr)^{1/2} \ll_{\varepsilon }z^{\varepsilon }\bigl(z,4a^{2}\bigr)^{1/2}. $$

The number of divisors $z_{1}$ of z is $\ll_{\varepsilon }z^{\varepsilon }$, completing the proof. □
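The local count in this proof is easy to test by brute force. The sketch below, in Python with illustrative helper names that are not from the paper, counts the residues B mod $z_{1}$ with $B^{2}\equiv-4a^{2}$ and compares the count with $2^{\omega(z_{1})}(z_{1},4a^{2})^{1/2}$. The comparison is made after squaring so that everything stays in integer arithmetic.

```python
from math import gcd

def count_roots(z1, a):
    """Number of B mod z1 with B^2 ≡ -4a^2 (mod z1), by brute force."""
    target = (-4 * a * a) % z1
    return sum(1 for B in range(z1) if (B * B) % z1 == target)

def omega(n):
    """Number of distinct prime factors of n."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            count += 1
            while n % p == 0:
                n //= p
        p += 1
    return count + (1 if n > 1 else 0)

def within_bound(z1, a):
    """True if count <= 2^omega(z1) * gcd(z1, 4a^2)^(1/2), checked after squaring."""
    return count_roots(z1, a) ** 2 <= 4 ** omega(z1) * gcd(z1, 4 * a * a)

# A case where the gcd factor matters: z1 = 32 and a = 2 give four roots.
print(count_roots(32, 2), within_bound(32, 2))
```

The example with $z_{1}=32$, a=2 illustrates why the factor $(z,4a^{2})^{1/2}$ cannot be dropped: the congruence has more than two roots as soon as $z_{1}$ shares a large factor with $4a^{2}$.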

Lemma 5.12

Fix (A,B,C)=1 and $d\mid AC-B^{2}$. Then there are integers k,ℓ with (k,ℓ,d)=1 so that, whenever $Am^{2}+2Bmn+Cn^{2}\equiv0(d)$, we have

$$ (mk+n\ell)^{2}\equiv0(d). $$

(5.45)

Proof

We will work locally, then lift to a global solution. Let $p^{e}\mid\mid d$.

Case 1:
If (p,A)=1, then $Am^{2}+2Bmn+Cn^{2}\equiv0(p^{e})$ implies
$$(m+\bar{A} Bn)^{2}-\bar{A} ^{2}B^{2}n^{2}+ \bar{A}Cn^{2}\equiv (m+\bar{A} Bn)^{2}\equiv 0 \bigl(p^{e}\bigr), $$
since $p^{e}\mid AC-B^{2}$. In this case, we set $k_{p}:=1$ and $\ell_{p}:={\bar{A} B}$.
Case 2:
If (p,A)>1, then $p\mid B$ (because $p\mid AC-B^{2}$), so by primitivity (p,C)=1. As before, we have $(n+\bar{C} Bm)^{2}\equiv 0(p^{e}) $, and we choose $k_{p}:={\bar{C} B}$, $\ell_{p}:=1$.

By the Chinese Remainder Theorem, there are integers k and ℓ so that $k\equiv k_{p}(\operatorname {mod}p^{e})$, and similarly with ℓ. By construction, we have (k,ℓ,d)=1, as claimed. □
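The construction in this proof is effective, and a short program makes the two local cases and the gluing step concrete. The following Python sketch uses helper names (`prime_powers`, `crt`, `find_k_ell`) that are illustrative and not from the paper; it assumes positive A, C and Python 3.8 or later for the modular inverse `pow(x, -1, n)`.

```python
from math import gcd

def prime_powers(d):
    """Factor d into a list of (p, e) with p^e exactly dividing d."""
    out, p = [], 2
    while p * p <= d:
        if d % p == 0:
            e = 0
            while d % p == 0:
                d //= p
                e += 1
            out.append((p, e))
        p += 1
    if d > 1:
        out.append((d, 1))
    return out

def crt(residues, moduli):
    """Combine x ≡ r_i (mod n_i) for pairwise coprime n_i."""
    x, n = 0, 1
    for r, m in zip(residues, moduli):
        # Solve x + n*t ≡ r (mod m) for t.
        t = ((r - x) * pow(n, -1, m)) % m
        x, n = x + n * t, n * m
    return x % n

def find_k_ell(A, B, C, d):
    """Return (k, ell) following the two local cases of the proof."""
    assert gcd(gcd(A, B), C) == 1 and (A * C - B * B) % d == 0
    ks, ells, mods = [], [], []
    for p, e in prime_powers(d):
        pe = p ** e
        if A % p != 0:                      # Case 1: A is invertible mod p^e
            ks.append(1)
            ells.append(pow(A, -1, pe) * B % pe)
        else:                               # Case 2: C is invertible mod p^e
            ks.append(pow(C, -1, pe) * B % pe)
            ells.append(1)
        mods.append(pe)
    return crt(ks, mods), crt(ells, mods)

# Example with both cases present: AC - B^2 = 30 for (A, B, C) = (6, 6, 11).
print(find_k_ell(6, 6, 11, 30))
```

For (A,B,C)=(6,6,11) the primes 2 and 3 fall under Case 2 and the prime 5 under Case 1, so the output combines both kinds of local data.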

Lemma 5.13

Given large M, (A,B,C)=1 and $d\mid AC-B^{2}$,

$$\begin{aligned} &\#\bigl\{m,n<M:Am^{2}+2Bmn+Cn^{2} \equiv0(d)\bigr\} \\ &\quad\ll_{\varepsilon } d^{\varepsilon } \biggl({M^{2}\over d^{1/2}}+M \biggr) . \end{aligned}$$

(5.46)

Proof

As in Lemma 5.12, A,B,C and d determine k,ℓ so that

$$\sum_{m,n<M}\boldsymbol {1}_{\{Am^{2}+2Bmn+Cn^{2}\equiv0(d)\}} \le \sum _{m,n<M}\boldsymbol {1}_{\{(mk+n\ell)^{2}\equiv0(d)\}} . $$

But then there is a $d_{1}\mid d$, with $d\mid d_{1}^{2}$, so that $mk+n\ell\equiv0(d_{1})$. Let $w=(\ell,d_{1})$; then mk≡0(w) implies m≡0(w) since (k,ℓ,d)=1. There are at most 1+M/w such m up to M. With m fixed, n is uniquely determined mod $d_{1}/w$. Hence we get the bound

$$\begin{aligned} (5.46) \le& \sum_{d_{1}\mid d\atop d\mid d_{1}^{2}} \sum _{w\mid d_{1}} \sum_{m,n<M} \boldsymbol {1}_{\{m\equiv0(\operatorname {mod}w)\}}\boldsymbol {1}_{\{ n\equiv-\overline{\frac{\ell}{w}}\frac{m}{w} k(\operatorname {mod}\frac{d_{1}}{w})\}} \\ \ll& \sum_{d_{1}\mid d\atop d\mid d_{1}^{2}} \sum_{w\mid d_{1}} \biggl( {M\over w} +1 \biggr) \biggl( {wM\over d_{1}} +1 \biggr) \ll_{\varepsilon } d^{\varepsilon } \biggl( {M^{2}\over d^{1/2}} +M \biggr) , \end{aligned}$$

as claimed. □
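A small numerical experiment illustrates the shape of (5.46). The Python sketch below assumes the convention 0≤m,n<M for the range of summation (the paper does not spell this out here) and uses illustrative function names. It counts the zeros of the form mod d directly and prints the quantity $M^{2}/d^{1/2}+M$ for comparison, ignoring the $d^{\varepsilon}$ factor.

```python
def count_zeros(A, B, C, d, M):
    """#{0 <= m, n < M : A m^2 + 2 B m n + C n^2 ≡ 0 (mod d)} by brute force."""
    return sum(
        1
        for m in range(M)
        for n in range(M)
        if (A * m * m + 2 * B * m * n + C * n * n) % d == 0
    )

def main_term(d, M):
    """The quantity M^2 / d^(1/2) + M from (5.46), without the d^eps factor."""
    return M * M / d ** 0.5 + M

# f = 2m^2 + 2mn + 5n^2 has AC - B^2 = 9; take d = 9.
for M in (9, 90):
    print(M, count_zeros(2, 1, 5, 9, M), main_term(9, M))
```

In this example the congruence mod 9 is equivalent to m≡n (mod 3), so the count is exactly $M^{2}/3=M^{2}/d^{1/2}$ whenever 3 divides M. This shows that the exponent 1/2 on d in (5.46) cannot be improved in general.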

Finally we collect the above lemmata into our desired estimate, essential in the proof of (9.12).

Proposition 5.14

For large M and $\frak {f}=f-a\in \frak {F}$ fixed,

$$ \# \left \{ \begin{array}{c} \frak {f}'\in \frak {F}\\ m,n,m',n'<M \end{array} \biggm{|} \begin{array}{c} a'=a\\ \frak {f}(m,-n)=\frak {f}'(m',-n') \end{array} \right \} \ll_{\varepsilon } (TM)^{\varepsilon } \bigl( M^{2} + TM \bigr) , $$

(5.47)

for any ε>0.

Proof

Once f,m,n, and $\frak {f}'=f'-a\in \frak {F}$ are determined, it is elementary that there are $\ll_{\varepsilon }M^{\varepsilon }$ values of m′,n′ with f(m,−n)=f′(m′,−n′). Decomposing f′ into classes and applying (5.40), (5.43), and (5.46), in succession, we have

$$\begin{aligned} & \sum_{m,n<M} \sum _{\frak {f}'\in \frak {F}\atop a'=a} \sum_{m',n'<M} \boldsymbol {1}_{\{f(m,-n)=f'(m',-n')\}} \\ &\quad\ll_{\varepsilon } \sum_{m,n<M} \sum _{\frak {f}'\in \frak {F}\atop a'=a} \boldsymbol {1}_{\{f'\text{ represents }f(m,-n)\}} M^{\varepsilon } \\ &\quad\ll M^{\varepsilon } \sum_{m,n<M} \sum _{\text{classes }\mathcal {K}\atop\text{representing }f(m,-n)} \sum_{\frak {f}'\in \frak {F}\atop a'=a,f'\in \mathcal {K}} 1 \\ &\quad\ll_{\varepsilon } (TM)^{\varepsilon } \sum_{m,n<M} \bigl(f(m,-n),4a^{2}\bigr)^{1/2} \\ &\quad\ll (TM)^{\varepsilon } \sum_{d\mid4a^{2}} d^{1/2} \sum_{m,n<M} \boldsymbol {1}_{\{f(m,-n)\equiv0(d)\}} \\ &\quad\ll (TM)^{\varepsilon } \sum_{d\mid4a^{2}} d^{1/2} \biggl( {M^{2}\over d^{1/2}} + M \biggr) \\ &\quad\ll (TM)^{\varepsilon } \bigl( M^{2} + M a \bigr) , \end{aligned}$$

from which the claim follows since a≪T. □

6 Major arcs

We return to the setting and notation of Sect. 3 with the goal of establishing (3.13). Thanks to the counting lemmata in Sect. 5.1, we can now define the major arcs parameters $Q_{0}$ and $K_{0}$ from (3.16). First recall the two numbers Θ<δ appearing in (5.22), (5.23), and define

$$ 1<\varTheta _{1}<\delta $$

(6.1)

to be the larger of the two. Then set

$$ Q_{0}= T ^{(\delta -\varTheta _{1})/20}, \quad K_{0}=Q_{0}^{2}. $$

(6.2)

We may now also set the parameter U from (3.8) to be

$$ U=Q_{0}{}^{(\eta_{0})^{2}/100}, $$

(6.3)

where $0<\eta_{0}<1$ is the number which appears in Lemma 5.2.

Let $\mathcal {M}_{N}^{(U)}(n)$ denote either $\mathcal {M}_{N}(n)$ or $\mathcal {M}_{N}^{U}(n)$ from (3.21), (3.19), respectively. Putting (3.18) and (3.6) (resp. (3.9)) into (3.21) (resp. (3.19)), making a change of variables θ=r/q+β, and unfolding the integral from $\sum_{m}\int_{0}^{1}$ to $\int_{\mathbb {R}}$ gives

$$ \mathcal {M}_{N}^{(U)}(n) = \sum _{x,y\in \mathbb {Z}} \varUpsilon \biggl(\frac{2x}{X} \biggr) \varUpsilon \biggl(\frac{y}{X} \biggr) \cdot \frak {M}(n) \cdot \sum_{u} \mu(u) , $$

(6.4)

where in the last sum, u ranges over u∣(2x,y) (resp. and u<U). Here we have defined

$$\begin{aligned} \frak {M}(n) =& \frak {M}_{x,y}(n) \\ :=& \sum_{q<Q_{0}} \sideset{} {'}\sum_{r(q)} \sum _{\gamma \in \frak {F}} e_{q}\bigl(r\bigl(\langle w_{x,y},\gamma v_{0}\rangle-n\bigr)\bigr) \\ &{}\times \int _{\mathbb {R}} \frak {t}\biggl( \frac{N}{K_{0}} \beta \biggr) e \bigl(\beta \bigl(\frak {f}_{\gamma }(2x,y)-n\bigr)\bigr) d\beta , \end{aligned}$$

(6.5)

using (2.15).

As in (5.14), let $\tilde{\varGamma }(q)$ be the stabilizer of $v_{0}(\operatorname {mod}q)$. Decompose the sum on $\gamma \in \frak {F}$ in (6.5) as a sum on ${\gamma _{0}\in \varGamma /\tilde{\varGamma }(q)}$ and ${\gamma \in \frak {F}\cap \gamma _{0}\tilde{\varGamma }(q)}$. Applying Lemma 5.3 to the latter sum, using the definition of Θ ₁ in (6.1), and recalling the estimate (5.15) gives

$$ \frak {M}(n) = \frak {S}_{Q_{0}}(n)\cdot \frak {W}(n) + O \biggl( {T^{\varTheta _{1}}\over N}K_{0}^{2}Q_{0}^{4} \biggr) , $$

(6.6)

where

$$\begin{aligned} \frak {S}_{Q_{0}}(n) :=& \sum_{q<Q_{0}} \sideset{} {'}\sum_{r(q)} \sum _{\gamma _{0}\in \varGamma /\tilde{\varGamma }(q)} {e_{q}(r(\langle w_{x,y},\gamma _{0} v_{0}\rangle-n)) \over [\varGamma :\tilde{\varGamma }(q)] } , \\ \frak {W}(n) :=& {K_{0}\over N}\, \sum _{\frak {f}\in \frak {F}} \widehat{\frak {t}}\biggl( \bigl(\frak {f}(2x,y)-n\bigr) \frac{K_{0}}{N} \biggr) . \end{aligned}$$

Clearly we have thus split $\frak {M}$ into “modular” and “Archimedean” components. It is now a simple matter to prove the following

Theorem 6.1

For $\frac {1}{2}N<n<N$, there exists a function $\frak {S}(n)$ as in Theorem 3.1 so that

$$ \mathcal {M}_{N}(n)\gg \frak {S}(n)T^{\delta -1} . $$

(6.7)

Proof

First we discuss the modular component. Write $\frak {S}_{Q_{0}}$ as

$$\frak {S}_{Q_{0}}(n) = \sum_{q<Q_{0}} {1\over [\varGamma :\tilde{\varGamma }(q)] } \sum_{\gamma _{0}\in \varGamma /\tilde{\varGamma }(q)} c_{q} \bigl(\langle w_{x,y},\gamma _{0} v_{0}\rangle-n \bigr) , $$

where $c_{q}$ is the Ramanujan sum, $c_{q}(m)=\sum_{r(q)}'e_{q}(rm)$. By (2.19), the analysis now reduces to a classical estimate for the singular series. We may use the transitivity of the $\gamma _{0}$ sum to replace $\langle w_{x,y},\gamma _{0} v_{0}\rangle$ by $\langle e_{4},\gamma _{0} v_{0}\rangle$, extend the sum on q to all natural numbers, and use multiplicativity to write the sum as an Euler product. Then the resulting singular series

$$\frak {S}(n) := \prod_{p} \biggl[ 1+ \sum _{k\ge1} {1\over[\varGamma :\varGamma _{0}(p^{k})]} \sum _{\gamma _{0}\in \varGamma /\varGamma _{0}(p^{k})} c_{p^{k}} \bigl( \langle e_{4}, \gamma _{0}\, v_{0} \rangle - n\bigr) \biggr] $$

vanishes only on non-admissible numbers, and can easily be seen to satisfy

$$ N^{-\varepsilon }\ll_{\varepsilon } \frak {S}(n)\ll_{\varepsilon } N^{\varepsilon }, $$

(6.8)

for any ε>0. See, e.g. [8, Sect. 4.3].
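The Ramanujan sum used above has the classical closed form $c_{q}(m)=\sum_{d\mid(q,m)}\mu(q/d)\,d$ and is multiplicative in q, which is what permits the passage to an Euler product. The following Python sketch (function names are illustrative, not from the paper) compares the defining exponential sum with the closed form for small q.

```python
import cmath
from math import gcd

def ramanujan_sum(q, m):
    """c_q(m): sum of e(r*m/q) over r mod q with (r, q) = 1, rounded to an integer."""
    total = sum(
        cmath.exp(2j * cmath.pi * r * m / q)
        for r in range(1, q + 1)
        if gcd(r, q) == 1
    )
    return round(total.real)

def mobius(n):
    """Möbius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            result = -result
        p += 1
    return -result if n > 1 else result

def ramanujan_closed_form(q, m):
    """Sum over d | gcd(q, m) of mu(q / d) * d."""
    g = gcd(q, m)
    return sum(mobius(q // d) * d for d in range(1, g + 1) if g % d == 0)

# The two evaluations agree, e.g. for q = 12 and m = 0, ..., 11.
print([ramanujan_sum(12, m) for m in range(12)])
print([ramanujan_closed_form(12, m) for m in range(12)])
```

The floating-point evaluation is rounded to the nearest integer, which is safe only for small q; the closed form is exact.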

Next we handle the Archimedean component. By our choice of $\frak {t}$ in (3.17), specifically that $\widehat{\frak {t}}>0$ and $\widehat{\frak {t}}(y)>2/5$ for |y|<1/2, we have

$$\frak {W}(n)\gg{K_{0}\over N} \sum_{\frak {f}\in \frak {F}} \boldsymbol {1}_{\{ |\frak {f}(2x,y)-n| < \frac{N}{2K_{0}} \}} \gg {T^{\delta }\over N}+{T^{\varTheta _{1}}K_{0}\over N} , $$

using Lemma 5.4.

Putting everything into (6.6) and then into (6.4) gives (6.7), using (6.2) and (3.1). □

Next we derive from the above that the same bound holds for $\mathcal {M}_{N}^{U}$ (most of the time).

Theorem 6.2

There is an η>0 such that the bound (6.7) holds with $\mathcal {M}_{N}$ replaced by $\mathcal {M}_{N}^{U}$, except on a set of cardinality $\ll N^{1-\eta}$.

Proof

Putting (6.6) into (6.4) gives

$$\begin{aligned} & \sum_{n<N} \big|\mathcal {M}_{N}(n)- \mathcal {M}_{N}^{U}(n)\big| \\ &\quad\ll \sum _{x,y\asymp X} \sum_{n<N}\big|\frak {M}(n)\big| \sum _{u\mid(2x,y)\atop u\ge U} 1 \\ &\quad\ll_{\varepsilon } \sum_{y<X} \sum _{u\mid y\atop u\ge U} \sum_{x<X\atop2x\equiv0(\operatorname {mod}u)} \biggl\{ N^{\varepsilon } \sum_{\frak {f}\in \frak {F}} {K_{0}\over N} \biggl[ \sum_{n<N} \widehat{\frak {t}}\biggl( \bigl(\frak {f}(2x,y)-n \bigr) \frac{K_{0}}{N} \biggr) \biggr] \\ &\qquad{} + K_{0}^{2}Q_{0}^{4}T^{\varTheta _{1}} \biggr\} \\ &\quad\ll N^{\varepsilon } X\frac{X}{U} T^{\delta } , \end{aligned}$$

using (6.8) and (6.2). The rest of the argument is identical to that leading to (3.11). □

This establishes (3.13), and hence completes our Major Arcs analysis; the rest of the paper is devoted to proving (3.14).

7 Minor arcs I: case q<Q₀

We keep all the notation of Sect. 3, our goal in this section being to bound (3.22) and (3.23). First we return to (3.9) and reverse orders of summation, writing

$$ \widehat{\mathcal {R}_{N}^{U}} (\theta ) = \sum _{u< U} \mu(u) \sum_{\frak {f}\in \frak {F}} e(-a\theta ) \widehat{\mathcal {R}}_{f,u} (\theta ) , $$

(7.1)

where $\frak {f}=f-a$ according to (2.14), and we have set

$$\widehat{\mathcal {R}}_{f,u}(\theta ) := \sum_{ 2x\equiv0( u)} \sum _{ y\equiv0( u)} \varUpsilon \biggl(\frac{2x}{X} \biggr) \varUpsilon \biggl(\frac{y}{X} \biggr) e \bigl( \theta f(2 x, y) \bigr) . $$

If u is even, then we have

$$ \widehat{\mathcal {R}}_{f,u}(\theta ) = \sum _{ x,y\in \mathbb {Z}} \varUpsilon \biggl(\frac{x u}{X} \biggr) \varUpsilon \biggl( \frac{y u}{X} \biggr) e \bigl( \theta f( x u , y u) \bigr) . $$

(7.2)

If u is odd, we have

$$\widehat{\mathcal {R}}_{f,u}(\theta ) = \sum_{ x,y\in \mathbb {Z}} \varUpsilon \biggl(\frac{2x u}{X} \biggr) \varUpsilon \biggl(\frac{y u}{X} \biggr) e \bigl( \theta f(2 x u , y u) \bigr) . $$
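The two reparametrizations can be confirmed on a finite box. In the Python sketch below (illustrative names, with a hard cutoff `bound` standing in for the smooth weight ϒ), the pairs (2x, y) subject to the congruences 2x≡0(u), y≡0(u) are compared with the pairs (xu, yu) for even u and (2xu, yu) for odd u.

```python
def lattice_direct(u, bound):
    """Pairs (2x, y) with 2x ≡ 0 and y ≡ 0 (mod u), |2x| <= bound, |y| <= bound."""
    return {
        (2 * x, y)
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
        if abs(2 * x) <= bound and (2 * x) % u == 0 and y % u == 0
    }

def lattice_reparam(u, bound):
    """The same pairs written as (x*u, y*u) for even u, or (2*x*u, y*u) for odd u."""
    step = u if u % 2 == 0 else 2 * u
    return {
        (step * x, u * y)
        for x in range(-bound, bound + 1)
        for y in range(-bound, bound + 1)
        if abs(step * x) <= bound and abs(u * y) <= bound
    }

for u in range(1, 9):
    print(u, lattice_direct(u, 24) == lattice_reparam(u, 24))
```

The only difference between the two parities is the spacing of the first coordinate: u for even u and 2u for odd u.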

From now on, we focus exclusively on the case u is even, the other case being handled similarly. We first massage $\widehat{\mathcal {R}}_{f,u}$ further.

Since f is homogeneous quadratic, we have

$$f( x u , y u) = u^{2} f(x,y) . $$

Hence expressing $\theta =\frac{r}{q}+\beta $, we will need to write u ²/q as a reduced fraction; to this end, introduce the notation

$$ \tilde{q}:=\bigl(u^{2},q\bigr)\qquad u_{0}:=u^{2}/\tilde{q} ,\qquad q_{0}:=q/\tilde{q} , $$

(7.3)

so that $u^{2}/q=u_{0}/q_{0}$ in lowest terms, $(u_{0},q_{0})=1$.
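The notation (7.3) amounts to reducing a fraction, as the following Python sketch shows (the function name is illustrative and not from the paper). It also checks the inequality $q_{0}\ge q/u^{2}$, which follows from $\tilde{q}\le u^{2}$ and is the form in which (7.3) is typically used to compare $q_{0}$ with q.

```python
from fractions import Fraction
from math import gcd

def reduce_u2_over_q(u, q):
    """Return (q_tilde, u0, q0) with q_tilde = (u^2, q), u0 = u^2/q_tilde, q0 = q/q_tilde."""
    q_tilde = gcd(u * u, q)
    return q_tilde, u * u // q_tilde, q // q_tilde

q_tilde, u0, q0 = reduce_u2_over_q(6, 8)
print(q_tilde, u0, q0)                        # 36/8 reduces to 9/2
print(Fraction(36, 8) == Fraction(u0, q0))    # same rational number
print(q0 * 6 * 6 >= 8)                        # q0 >= q / u^2
```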

Lemma 7.1

Recalling the notation (5.25), we have

$$ \widehat{\mathcal {R}}_{f,u } \biggl(\frac{r}{q}+\beta \biggr) = {1\over u ^{2}} \sum_{n,m\in \mathbb {Z}} \mathcal {J}_{f} \biggl(X,\beta ;{n\over u q_{0}}, {m\over u q_{0}} \biggr) \mathcal {S}_{f}(q_{0},ru _{0};n,m) , $$

(7.4)

where we have set

$$\begin{aligned} &\mathcal {J}_{f} \biggl(X,\beta ;{n\over uq_{0}}, {m\over uq_{0}} \biggr) \\ &\quad := \iint\limits _{x,y\in \mathbb {R}} \varUpsilon \biggl( {x\over X} \biggr) \varUpsilon \biggl({y\over X} \biggr) e \biggl( \beta f(x,y) -{n\over uq_{0}}x-{m\over uq_{0}}y \biggr) dx dy . \end{aligned}$$

(7.5)

Proof

Returning to (7.2), we have

$$\begin{aligned} \widehat{\mathcal {R}}_{f,u} \biggl(\frac{r}{q}+\beta \biggr) =& \sum _{x,y\in \mathbb {Z}} \varUpsilon \biggl({u x\over X} \biggr) \varUpsilon \biggl({u y\over X} \biggr) e_{q_{0}} \bigl( r u_{0} f( x,y) \bigr) e \bigl( \beta u^{2} f(x,y) \bigr) \\ =& \sum_{k(q_{0})} \sum_{\ell( q_{0})} e_{q_{0}} \bigl( r u_{0} f(k,\ell) \bigr) \\ & {}\times \biggl[ \sum_{x\in \mathbb {Z}\atop x\equiv k(q_{0})} \sum _{y\in \mathbb {Z}\atop y\equiv\ell(q_{0})} \varUpsilon \biggl({u x\over X} \biggr) \varUpsilon \biggl( {u y\over X} \biggr) e \bigl( \beta u^{2} f(x,y) \bigr) \biggr] . \end{aligned}$$

Apply Poisson summation to the bracketed term above:

$$\begin{aligned} [ \cdot ] =& \sum_{x,y\in \mathbb {Z}} \varUpsilon \biggl( {u (q_{0}x+k)\over X} \biggr) \varUpsilon \biggl({u (q_{0}y+\ell)\over X} \biggr) e \bigl( \beta u^{2} f(q_{0}x+k,q_{0}y+\ell) \bigr) \\ =& \sum_{n,m\in \mathbb {Z}}\ \ \iint\limits _{x,y\in \mathbb {R}} \varUpsilon \biggl({u (q_{0}x+k)\over X} \biggr) \varUpsilon \biggl( {u (q_{0}y+\ell)\over X} \biggr)\\ &{}\times e \bigl( \beta u^{2} f(q_{0}x+k,q_{0}y+\ell) \bigr) \\ &{} \times e(-nx-my) dx dy \\ =& {1\over u^{2}q_{0}^{2}} \sum_{n,m\in \mathbb {Z}} e_{q_{0}}(nk+m\ell) \mathcal {J}_{f} \biggl(X,\beta ; {n\over u q_{0}},{m\over u q_{0}} \biggr). \end{aligned}$$

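Here the last equality is the change of variables $x\mapsto u(q_{0}x+k)$, $y\mapsto u(q_{0}y+\ell)$, which uses the homogeneity of f and produces the factor $e_{q_{0}}(nk+m\ell)$. Inserting this in the above, the claim follows immediately. □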

We are now in position to prove the following

Proposition 7.2

With the above notation,

$$ \biggl \vert \widehat{\mathcal {R}}_{f,u } \biggl(\frac{r}{q}+\beta \biggr) \biggr \vert \ll u \bigl(\sqrt{q}|\beta |T\bigr)^{-1} . $$

(7.6)

Proof

By (non)stationary phase (see, e.g., [30, §8.3]), the integral in (7.5) has negligible contribution unless

$${|n|\over u q_{0}}, {|m|\over u q_{0}} \ll |\beta |\cdot|\nabla f|\ll| \beta |\cdot TX, $$

so the n,m sum can be restricted to

$$ |n|,|m|\ll |\beta |\cdot TX\cdot u q_{0} \ll u . $$

(7.7)

Here we used |β|≪(qM)⁻¹ with M given by (3.15). In this range, stationary phase gives

$$\begin{aligned} \biggl \vert \mathcal {J}_{f} \biggl(X,\beta ; {n\over u q_{0}},{m\over u q_{0}} \biggr)\biggr \vert \ll&\min \biggl(X^{2},{1\over|\beta |\cdot|\operatorname {discr}(f)|^{1/2}} \biggr) \\ \ll&\min \biggl(X^{2},{1\over|\beta | T} \biggr) , \end{aligned}$$

(7.8)

using (2.16) and (3.4) that $|\operatorname{discr}(f)|=4|B^{2}-AC|=4 a^{2}\gg T^{2}$.

Putting (7.7), (7.8) and (5.26) into (7.4), we have

$$\biggl \vert \widehat{\mathcal {R}}_{f,u } \biggl(\frac{r}{q}+\beta \biggr) \biggr \vert \ll {1\over u ^{2}} \sum_{|n|,|m|\ll u} {1\over|\beta |T}\cdot {1\over\sqrt{q_{0}}}, $$

from which the claim follows, using (7.3). □

Finally, we prove the desired estimates of the strength (3.14).

Theorem 7.3

Recall the integrals $\mathcal {I}_{Q_{0},K_{0}},\ \mathcal {I}_{Q_{0}}$ from (3.22), (3.23). There is an η>0 so that

$$\mathcal {I}_{Q_{0},K_{0}}, \ \mathcal {I}_{Q_{0}}\ll N\, T^{2(\delta -1)}\, N^{-\eta} , $$

as N→∞.

Proof

We first handle $\mathcal {I}_{Q_{0},K_{0}}$. Returning to (7.1) and applying (7.6) gives

$$\biggl \vert \widehat{\mathcal {R}_{N}^{U}} \biggl(\frac{r}{q}+\beta \biggr) \biggr \vert \ll \sum_{u< U} \sum _{\frak {f}\in \frak {F}} u \bigl(\sqrt{q}|\beta |T\bigr)^{-1} \ll U^{2} T^{\delta -1} \bigl(\sqrt{q}|\beta |\bigr)^{-1} . $$

Inserting this into (3.22) and using (6.2), (6.3) gives

$$\begin{aligned} \mathcal {I}_{Q_{0},K_{0}} \ll& \sum_{q<Q_{0}} \sideset{} {'}\sum_{r(q)} \int_{|\beta |<K_{0}/N} \biggl \vert \beta \frac{N}{K_{0}} \biggr \vert ^{2} U^{4} T^{2(\delta -1)} {1\over q|\beta |^{2}} d\beta \\ \ll& Q_{0} \frac{N}{K_{0}} U^{4} T^{2(\delta -1)} \ll N T^{2(\delta -1)} N^{-\eta} . \end{aligned}$$

Next we handle

$$\begin{aligned} \mathcal {I}_{Q_{0}} \ll& \sum_{q<Q_{0}} \sideset{} {'} \sum_{r(q)} \int _{{K_{0}\over N}<|\beta |<\frac{1}{qM}} U^{4} T^{2(\delta -1)} {1\over q|\beta |^{2}} d\beta \\ \ll& Q_{0} U^{4} T^{2(\delta -1)} \biggl( { N\over K_{0}}+{Q_{0}M} \biggr) \\ \ll& N T^{2(\delta -1)} { Q_{0} U^{4} \over K_{0}} , \end{aligned}$$

which is again a power savings. □

8 Minor arcs II: case Q₀≤Q<X

Keeping all the notation from the last section, we now turn our attention to the integrals $\mathcal {I}_{Q}$ in (3.24). It is no longer sufficient just to get cancellation in $\widehat{\mathcal {R}}_{f,u}$ alone, as in (7.6); we must use the fact that $\mathcal {I}_{Q}$ is an $L^{2}$-norm.

To this end, recall the notation (7.3), and put (7.4) into (7.1), applying Cauchy-Schwarz in the u-variable:

$$\begin{aligned} \biggl \vert \widehat{\mathcal {R}_{N}^{U}} \biggl(\frac{r}{q}+\beta \biggr)\biggr \vert ^{2} \ll& U \sum _{u< U} \bigg| \sum_{\frak {f}\in \frak {F}} e_{q}(-r a) e(-a\beta ) \\ &{} \times {1\over u ^{2}} \sum _{n,m\in \mathbb {Z}} \mathcal {J}_{f} \biggl(X,\beta ;{n\over u q_{0}}, {m\over u q_{0}} \biggr) \mathcal {S}_{f}(q_{0},ru _{0};n,m) \bigg|^{2} . \end{aligned}$$

(8.1)

Recall from (2.14) that $\frak {f}=f-a$. Insert (8.1) into (3.24) and open the square, setting $\frak {f}'=f'-a'$. This gives

$$\begin{aligned} \mathcal {I}_{Q} \ll& U \sum_{u< U} {1\over u ^{4}} \sum_{q\asymp Q} \sideset{} {'} \sum_{r(q)} \int _{|\beta |<\frac{1}{qM}} \bigg| \sum_{\frak {f}\in \frak {F}} e_{q}(-r a) e(-a\beta ) \\ &{} \times \sum_{n,m\in \mathbb {Z}} \mathcal {J}_{f} \biggl(X,\beta ;{n\over u q_{0}},{m\over u q_{0}} \biggr) \mathcal {S}_{f}(q_{0},ru _{0};n,m) \bigg|^{2} d \beta \\ = & U \sum_{u< U} {1\over u ^{4}} \sum_{n,m,n',m'\in \mathbb {Z}}\ \sum _{\frak {f},\frak {f}'\in \frak {F}}\ \sum_{q\asymp Q}\biggl[ \sideset{} {'} \sum _{r(q)} \mathcal {S}_{f}(q_{0},ru _{0};n,m) \\ &{} \times \overline{\mathcal {S}_{f'}\bigl(q_{0},ru _{0};n',m'\bigr)} e_{q}\bigl(r \bigl(a'-a\bigr)\bigr) \biggr] \\ &{} \times \biggl[ \int_{|\beta |<\frac{1}{qM}} \mathcal {J}_{f} \biggl(X,\beta ;{n\over u q_{0}},{m\over u q_{0}} \biggr) \\ &{}\times \overline{ \mathcal {J}_{f'} \biggl(X,\beta ;{n'\over u q_{0}},{m'\over u q_{0}} \biggr)} e\bigl(\beta \bigl(a'-a\bigr)\bigr) d\beta \biggr] . \end{aligned}$$

(8.2)

Note that again the sum has split into “modular” and “Archimedean” pieces (collected in brackets, respectively), with the former being exactly equal to $\mathcal {S}$ in (5.30).

Decompose (8.2) as

$$ \mathcal {I}_{Q}\ll \mathcal {I}_{Q}^{(=)}+ \mathcal {I}_{Q}^{(\neq)}, $$

(8.3)

where, once $\frak {f}$ is fixed, we collect $\frak {f}'$ according to whether a′=a (the “diagonal” case) or a′≠a (the “off-diagonal” case).

Lemma 8.1

Assume Q<X. For □∈{=,≠}, we have

$$ \mathcal {I}_{Q}^{(\square)} \ll U^{6} {X^{2}\over T} \sum_{\frak {f}\in \frak {F}} \sum _{\frak {f}'\in \frak {F}\atop a'\square a} \sum_{q\asymp Q} { \{(a^{2},q)\cdot((a')^{2},q)\}^{1/2}(a-a',q)^{1/4} \over q^{5/4} } . $$

(8.4)

Proof

Apply (5.31) and (7.7), (7.8) to (8.2), giving

$$\begin{aligned} \mathcal {I}_{Q}^{(\square)} \ll& U \sum_{u<U} {1\over u^{4}} \\ &{} \times\sum_{|n|,|m|,|n'|,|m'|\ll u}\ \sum _{\frak {f},\frak {f}'\in \frak {F}\atop a'\square a}\ \sum_{q\asymp Q} {u^{4} \{(a^{2},q)\cdot((a')^{2},q)\}^{1/2}(a-a',q)^{1/4} \over q^{5/4} } \\ &{} \times \int_{|\beta |<1/(qM)} \min \biggl( X^{2}, {1\over|\beta |T} \biggr)^{2} d\beta , \end{aligned}$$

where we used (7.3). The claim then follows immediately from (3.15) and Q<X. □

We treat $\mathcal {I}_{Q}^{(=)}, \ \mathcal {I}_{Q}^{(\neq)}$ separately, starting with the former; we give bounds of the quality claimed in (3.14).

Proposition 8.2

There is an η>0 such that

$$ \mathcal {I}_{Q}^{(=)}\ll N \, T^{2(\delta -1)}N^{-\eta}, $$

(8.5)

as N→∞.

Proof

From (8.4), we have

$$\begin{aligned} \mathcal {I}_{Q}^{(=)} \ll& U^{6} {X^{2}\over T} \sum_{\frak {f}\in \frak {F}} \sum_{\frak {f}'\in \frak {F}\atop a'=a} \sum_{q\asymp Q} {(a^{2},q)\over q} \\ \ll& {U^{6} X^{2}\over QT} \sum_{\frak {f}\in \frak {F}} \sum _{\tilde{q}_{1}\mid a^{2}\atop\tilde{q}_{1}\ll Q} \tilde{q}_{1} \sum _{q\asymp Q\atop q\equiv0(\tilde{q}_{1})} \sum_{\frak {f}'\in \frak {F}\atop a'=a} 1 \\ \ll_{\varepsilon }& {U^{6} X^{2}\over T} \sum_{\frak {f}\in \frak {F}} T^{\varepsilon } \sum_{\frak {f}'\in \frak {F}\atop a'=a} 1 . \end{aligned}$$

Recalling that $a=a_{\gamma }=\langle e_{1},\gamma v_{0}\rangle$, replace the condition a′=a with $a'\equiv a(\operatorname {mod}\lfloor Q_{0}\rfloor)$, and apply (5.12):

$$\mathcal {I}_{Q}^{(=)} \ll_{\varepsilon } {U^{6} X^{2}\over T} T^{\delta } T^{\varepsilon } {T^{\delta }\over Q_{0}^{\eta_{0}}} . $$

Then (6.3) and (3.1) imply the claimed power savings. □

Next we turn our attention to $\mathcal {I}_{Q}^{(\neq)}$, the off-diagonal contribution. We decompose this sum further according to whether gcd(a,a′) is large or not. To this end, introduce a parameter H, which we will eventually set to

$$ H=U^{10/\eta_{0}}=Q_{0}{}^{\eta_{0}/10}, $$

(8.6)

where, as in (6.3), the constant $\eta_{0}>0$ comes from Lemma 5.2. Write

$$ \mathcal {I}_{Q}^{(\neq)}=\mathcal {I}_{Q}^{(\neq,>)}+ \mathcal {I}_{Q}^{(\neq,\le)}, $$

(8.7)

corresponding to whether (a,a′)>H or (a,a′)≤H, respectively. We deal first with the large gcd.

Proposition 8.3

There is an η>0 such that

$$ \mathcal {I}_{Q}^{(\neq,>)}\ll N \, T^{2(\delta -1)}N^{-\eta}, $$

(8.8)

as N→∞.

Proof

Writing (a,a′)=h>H, $\tilde{q}_{1}=(a^{2},q)$, $\tilde{q}_{1}'=((a')^{2},q)$, and using (a−a′,q)≤q in (8.4), we have

$$\begin{aligned} \mathcal {I}_{Q}^{(\neq,>)} \ll& U^{6} {X^{2}\over T} \sum_{\frak {f}\in \frak {F}} \sum_{\frak {f}'\in \frak {F}\atop a'\neq a,(a,a')>H} \sum_{q\asymp Q} { \{(a^{2},q)\cdot((a')^{2},q)\}^{1/2}(a-a',q)^{1/4} \over q^{5/4} } \\ \ll& U^{6} {X^{2}\over T} \sum _{\frak {f}\in \frak {F}} \sum_{h\mid a\atop h>H} \sum _{\frak {f}'\in \frak {F}\atop a'\equiv0(\operatorname {mod}h)} \sum_{\tilde{q}_{1}\mid a^{2}\atop\tilde{q}_{1}\ll Q} \sum _{\tilde{q}_{1}'\mid(a')^{2}\atop[\tilde{q}_{1},\tilde{q}_{1}']\ll Q} \bigl(\tilde{q}_{1} \tilde{q}_{1}'\bigr)^{1/2} \sum _{q\asymp Q\atop q\equiv0([\tilde{q}_{1},\tilde{q}_{1}'])} {1\over Q} \\ \ll_{\varepsilon }& U^{6} {X^{2}\over T} T^{\varepsilon } \sum_{\frak {f}\in \frak {F}} \sum_{h\mid a\atop h>H} \sum_{\frak {f}'\in \frak {F}\atop a'\equiv0(\operatorname {mod}h)} 1 , \end{aligned}$$

where we used $[n,m]\ge(nm)^{1/2}$. Apply (5.12) to the innermost sum, getting

$$\begin{aligned} \mathcal {I}_{Q}^{(\neq,>)} \ll_{\varepsilon }& U^{6} {X^{2}\over T} T^{\varepsilon } T^{\delta } {1\over H^{\eta_{0}}} T^{\delta } . \end{aligned}$$

By (8.6) and (6.3), this is a power savings, as claimed. □

Finally, we handle small gcd.

Proposition 8.4

There is an η>0 such that

$$ \mathcal {I}_{Q}^{(\neq,\le)}\ll N \, T^{2(\delta -1)}N^{-\eta}, $$

(8.9)

as N→∞.

Proof

First note that

$$\begin{aligned} \mathcal {I}_{Q}^{(\neq,\le)} \ll& U^{6} {X^{2}\over T} \sum_{\frak {f}\in \frak {F}} \sum_{\frak {f}'\in \frak {F}\atop a'\neq a,(a,a')\le H} \sum_{q\asymp Q} { \{(a^{2},q)\cdot((a')^{2},q)\}^{1/2}(a-a',q)^{1/4} \over q^{5/4} } \\ \ll& U^{6} {X^{2}\over T} {1\over Q^{5/4}} \sum _{\frak {f}\in \frak {F}} \sum_{\frak {f}'\in \frak {F}\atop a'\neq a, (a,a')\le H} \sum _{q\asymp Q} (a,q) \bigl(a',q\bigr) \bigl(a-a',q\bigr)^{1/4} . \end{aligned}$$

Write g=(a,q) and g′=(a′,q), and let h=(g,g′); observe then that h∣(a,a′) and h≪Q. Hence we can write $g=hg_{1}$ and $g'=hg_{1}'$ so that $(g_{1},g_{1}')=1$. Note also that h∣(a−a′,q), so we can write $(a-a',q)=h\tilde{g}$; thus $g_{1},g_{1}'$, and $\tilde{g}$ are pairwise coprime, implying

$$\bigl[hg_{1},hg_{1}',h\tilde{g}\bigr] \ge g_{1}g_{1}'\tilde{g} . $$
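The coprimality claim and the resulting lower bound for the least common multiple can be tested exhaustively on small inputs. The Python sketch below uses illustrative names (`decompose`, `lcm`) that are not from the paper, and assumes positive a, a′, q with a≠a′.

```python
from math import gcd

def lcm(*values):
    """Least common multiple of positive integers."""
    out = 1
    for v in values:
        out = out * v // gcd(out, v)
    return out

def decompose(a, a_prime, q):
    """Return (h, g1, g1_prime, g_tilde) as in the proof, for a != a_prime."""
    g, g_prime = gcd(a, q), gcd(a_prime, q)
    h = gcd(g, g_prime)
    g_tilde, rem = divmod(gcd(a - a_prime, q), h)
    assert rem == 0  # h divides (a - a', q)
    return h, g // h, g_prime // h, g_tilde

h, g1, g1p, gt = decompose(12, 4, 24)
print(h, g1, g1p, gt)
print(lcm(h * g1, h * g1p, h * gt) >= g1 * g1p * gt)
```

The inequality is the only input needed to count the q≍Q divisible by all three of $hg_{1}$, $hg_{1}'$, and $h\tilde{g}$: there are at most $Q/(g_{1}g_{1}'\tilde{g})$ of them, up to a constant.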

Then we have

$$\begin{aligned} \mathcal {I}_{Q}^{(\neq,\le)} \ll& U^{6} {X^{2}\over T} {1\over Q^{5/4}} \sum_{\frak {f}\in \frak {F}} \sum _{\frak {f}'\in \frak {F}\atop a'\neq a, (a,a')\le H} \sum_{h\mid(a,a')\atop h\le H} \sum _{g_{1}\mid a\atop g_{1}\ll Q} \sum_{g_{1}'\mid a'\atop g_{1}'\ll Q} \\ &{}\times \sum_{\tilde{g}\mid(a-a')\atop[hg_{1},hg_{1}',h\tilde{g}]\ll Q} (hg_{1}) \bigl(hg_{1}'\bigr) (h\tilde{g})^{1/4} \sum _{q\asymp Q\atop q\equiv0([hg_{1},hg_{1}',h\tilde{g}])} 1 \\ \ll_{\varepsilon }& U^{6} {X^{2}\over T} {H^{9/4}\over Q^{5/4}} \sum_{\frak {f},\frak {f}'\in \frak {F}} T^{\varepsilon } \sum_{g_{1}\mid a\atop g_{1}\ll Q} \sum _{g_{1}'\mid a'\atop g_{1}'\ll Q} \sum _{\tilde{g}\mid(a-a')\atop\tilde{g}\ll Q}g_{1}\, g_{1}'\, \tilde{g}^{1/4} {Q\over g_{1}g_{1}'\tilde{g}} \\ \ll& U^{6} {X^{2}\over T} {H^{9/4}\over Q^{1/4}} \sum _{\frak {f}\in \frak {F}} \sum_{\tilde{g}\ll Q} {1\over\tilde{g}^{3/4}} T^{\varepsilon } \sum_{\frak {f}'\in \frak {F}\atop a'\equiv a(\operatorname {mod}\tilde{g})}1. \end{aligned}$$

To the last sum, we again apply Lemma 5.2, giving

$$\begin{aligned} \mathcal {I}_{Q}^{(\neq,\le)} \ll_{\varepsilon }& U^{6} {X^{2}\over T} {H^{9/4}\over Q^{1/4}} T^{\delta } \sum _{\tilde{g}\ll Q} {1\over\tilde{g}^{3/4}} T^{\varepsilon } {1\over\tilde{g}^{\eta_{0}}} T^{\delta } \ll U^{6} {X^{2}\over T} {H^{9/4}\over Q_{0}^{\eta_{0}}} T^{\delta } T^{\varepsilon } T^{\delta } , \end{aligned}$$

since Q≥Q ₀. By (8.6) and (6.3), this is again a power savings, as claimed. □
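The claims of power savings in Propositions 8.2, 8.3, and 8.4 come down to exponent arithmetic in the parameters (6.3) and (8.6). The Python sketch below records that arithmetic under stated assumptions: the prefactor $X^{2}T^{2\delta}/T$ is taken to equal the target size $X^{2}T\cdot T^{2(\delta-1)}$, the $T^{\varepsilon}$ factors are ignored, and the sample values of $\eta_{0}$ are hypothetical, since Lemma 5.2 only asserts $0<\eta_{0}<1$. It computes the exponent of $Q_{0}$ in the leftover factor of each bound.

```python
from fractions import Fraction

def leftover_exponents(eta0):
    """Exponents of Q_0 in the factors left over in Propositions 8.2-8.4.

    Uses U = Q_0^(eta0^2/100) from (6.3) and H = U^(10/eta0) from (8.6).
    """
    eta0 = Fraction(eta0)
    u = eta0 ** 2 / 100          # U = Q_0^u
    h = u * 10 / eta0            # H = Q_0^h
    return {
        "H": h,
        "8.2: U^6 / Q_0^eta0": 6 * u - eta0,
        "8.3: U^6 / H^eta0": 6 * u - eta0 * h,
        "8.4: U^6 H^(9/4) / Q_0^eta0": 6 * u + Fraction(9, 4) * h - eta0,
    }

for name, value in leftover_exponents(Fraction(1, 2)).items():
    print(name, value)
```

For every $0<\eta_{0}<1$ the entry for H equals $\eta_{0}/10$, confirming the second equality in (8.6), and the three remaining exponents are negative, which is the content of "power savings" since $Q_{0}$ is a fixed positive power of T.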

Putting together (8.3), (8.5), (8.7), (8.8), and (8.9), we have proved the following

Theorem 8.5

For $Q_{0}\le Q<X$, there is some η>0 such that

$$\mathcal {I}_{Q}\ll N\, T^{2(\delta -1)}\, N^{-\eta}, $$

as N→∞.

9 Minor arcs III: case X≤Q<M

In this section, we continue our analysis of $\mathcal {I}_{Q}$ from (3.24), but now we need different methods to handle the very large Q situation. In particular, the range of x,y in (7.2) is now such that we have incomplete sums, so our first step is to complete them.

To this end, recall the notation (7.3) and introduce

$$\begin{aligned} \lambda _{f} \biggl(X,\beta ;\frac{n}{q_{0}}, \frac{m}{q_{0}},u \biggr) :=& \sum_{x,y\in \mathbb {Z}} \varUpsilon \biggl(\frac{ux}{X} \biggr) \varUpsilon \biggl(\frac{uy}{X} \biggr) e \biggl( -{n\over q_{0}}x-{m\over q_{0}}y \biggr) \\ &{}\times e \bigl( \beta u^{2} f(x,y) \bigr) , \end{aligned}$$

(9.1)

so that, using (5.25), an elementary calculation gives

$$ \widehat{\mathcal {R}}_{f,u} \biggl(\frac{r}{q}+\beta \biggr) = \sum _{n(q_{0})} \sum_{m(q_{0})} \lambda _{f} \biggl(X,\beta ;\frac{n}{q_{0}},\frac{m}{q_{0}},u \biggr) \mathcal {S}_{f}(q_{0},ru_{0};n,m) . $$

(9.2)

Put (9.2) into (7.1) and apply Cauchy-Schwarz in the u-variable:

$$\begin{aligned} \biggl \vert \widehat{\mathcal {R}_{N}^{U}} \biggl(\frac{r}{q}+\beta \biggr)\biggr \vert ^{2} \ll& U \sum _{u< U} \bigg| \sum_{\frak {f}\in \frak {F}} e_{q}(-r a) e(-a\beta ) \\ &{} \times \sum_{0\le n,m<q_{0}} \lambda _{f} \biggl(X,\beta ;\frac{n}{q_{0}},\frac{m}{q_{0}},u \biggr) \mathcal {S}_{f}(q_{0},ru_{0};n,m) \bigg|^{2} . \end{aligned}$$

(9.3)

As before, open the square, setting $\frak {f}'=f'-a'$, and insert the result into (3.24):

$$\begin{aligned} \mathcal {I}_{Q} \ll& U \sum_{u< U} \sum_{q\asymp Q} \sum_{n,m,n',m'<q_{0}}\ \sum_{\frak {f},\frak {f}'\in \frak {F}}\biggl[ \sideset{} {'} \sum _{r(q)} \mathcal {S}_{f}(q_{0},ru _{0};n,m) \\ &{} \times \overline{\mathcal {S}_{f'}\bigl(q_{0},ru _{0};n',m'\bigr)} e_{q}\bigl(r \bigl(a'-a\bigr)\bigr) \biggr] \\ &{} \times \biggl[ \int_{|\beta |<1/(qM)} \lambda _{f} \biggl(X,\beta ;\frac{n}{q_{0}},\frac{m}{q_{0}},u \biggr) \overline{\lambda _{f'} \biggl(X,\beta ; \frac{n'}{q_{0}},\frac {m'}{q_{0}},u \biggr)} \\ &{}\times e\bigl(\beta \bigl(a'-a \bigr)\bigr) d\beta \biggr] . \end{aligned}$$

(9.4)

Yet again the sum has split into modular and Archimedean components with the former being exactly equal to $\mathcal {S}$ in (5.30). As before, decompose $\mathcal {I}_{Q}$ according to the diagonal (a=a′) and off-diagonal terms:

$$ \mathcal {I}_{Q}\ll \mathcal {I}_{Q}^{(=)}+ \mathcal {I}_{Q}^{(\neq)}. $$

(9.5)

Lemma 9.1

Assume Q≥X. For □∈{=,≠}, we have

$$ \mathcal {I}_{Q}^{(\square)} \ll {U X^{3} \over QT} \sum_{u<U} {1\over u^{4}} \sum_{q\asymp Q}\ \sum _{n,m,n',m'\ll{U Q\over X}} \sum_{\frak {f}\in \frak {F}} \sum _{\frak {f}'\in \frak {F}\atop a'\square a} |\mathcal {S}| . $$

(9.6)

Proof

Consider the sum $\lambda _{f}$ in (9.1). Since x,y≍X/u, |β|<1/(qM), X≤Q, and using (3.15), we have that

$$\big|\beta u^{2} f(x,y)\big|\ll\frac{1}{QM} u^{2} T \biggl(\frac{X}{u} \biggr)^{2}= \frac{X}{Q}\le1. $$

Hence there is a contribution only if $nx/q_{0},\,my/q_{0}\ll1$, that is, we may restrict to the range

$$n,m\ll u q_{0}/X . $$

In this range, we give $\lambda _{f}$ the trivial bound of $X^{2}/u^{2}$. Putting this analysis into (9.4), the claim follows. □

We handle the off-diagonal term first.

Proposition 9.2

Assuming X≤Q<M, there is some η>0 such that

$$ \mathcal {I}_{Q}^{(\neq)} \ll N\, T^{2(\delta -1)}\, N^{-\eta} , $$

(9.7)

as N→∞.

Proof

Since (5.31) is such a large savings in q>X, we can afford to lose in the much smaller variable T. Hence put (5.31) into (9.6), estimating (a−a′,q)≤|a−a′| (since a≠a′):

$$\begin{aligned} \mathcal {I}_{Q}^{(\neq)} \ll& {U X^{3} \over QT} \sum _{u<U} {1\over u^{4}} \sum _{q\asymp Q}\ \sum_{n,m,n',m'\ll{UQ\over X}} \sum _{\frak {f},\frak {f}'\in \frak {F}} u^{4} { a \cdot a' \over q^{5/4} } \big|a-a'\big|^{1/4} \\ \ll& {U^{6} X^{3} \over T} \biggl( {Q\over X} \biggr)^{4} T^{2\delta } { T^{2}\over Q^{5/4}} T^{1/4} \\ \ll& U^{6} X^{7/4} T^{2\delta } T^{4} = X^{2}T\, T^{2(\delta -1)} \, \bigl( U^{6} X^{-1/4} T^{5} \bigr) , \end{aligned}$$

where we used (7.3), Q<M, and (3.15). Using (3.1) we have that

$$ X^{-1/4}T^{5}=N^{-59/800}, $$

(9.8)

so together with (6.3), this is clearly a substantial power savings. □
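The exponent in (9.8) can be reproduced with exact rational arithmetic. The Python sketch below assumes that (3.1) sets $T=N^{1/100}$ and $X=N^{99/200}$; these values are an assumption made here, chosen because they satisfy $N=TX^{2}$ and give the exponent stated in (9.8). The names `T_EXP`, `X_EXP`, and `n_exponent` are illustrative.

```python
from fractions import Fraction

# Assumed reading of (3.1): T = N^T_EXP and X = N^X_EXP (hypothetical values).
T_EXP = Fraction(1, 100)
X_EXP = Fraction(99, 200)

def n_exponent(x_power, t_power):
    """Exponent of N in X^x_power * T^t_power under the assumed (3.1)."""
    return Fraction(x_power) * X_EXP + Fraction(t_power) * T_EXP

print(n_exponent(2, 1))                  # X^2 * T, expected to be N^1
print(n_exponent(Fraction(-1, 4), 5))    # the quantity in (9.8)
```

Under the same assumption, the extra factor $U^{6}$ costs only a small power of N, because U is a small power of $Q_{0}$ by (6.3) and $Q_{0}$ is a small power of T by (6.2).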

Lastly, we deal with the diagonal term. We no longer save enough from a=a′ alone. But recall that here more cancellation can be gotten from (5.33) in the special case that $\frak {f}(m,-n)\neq \frak {f}'(m',-n')$. Hence we return to (9.6) and, once n,m, and $\frak {f}$ are determined, separate n′,m′, and $\frak {f}'$ into cases corresponding to whether $\frak {f}(m,-n)=\frak {f}'(m',-n')$ or not. Accordingly, write

$$ \mathcal {I}_{Q}^{(=)}\ =\ \mathcal {I}_{Q}^{(=,=)} + \mathcal {I}_{Q}^{(=,\neq)} . $$

(9.9)

We now estimate $\mathcal {I}_{Q}^{(=,\neq)}$ using the extra cancellation in (5.33).

Proposition 9.3

Assuming Q<XT, there is some η>0 such that

$$ \mathcal {I}_{Q}^{(=,\neq)} \ll N\, T^{2(\delta -1)}\, N^{-\eta} , $$

(9.10)

as N→∞.

Proof

Returning to (9.6), apply (5.33):

$$\begin{aligned} \mathcal {I}_{Q}^{(=,\neq)} \ll& {U X^{3} \over QT} \sum _{u<U} {1\over u^{4}} \sum _{\frak {f}\in \frak {F}} \sum_{\frak {f}'\in \frak {F}\atop a'= a} \sum _{q\asymp Q}\ \sum_{n,m\ll{U Q\over X}} \sum _{n',m'\ll{U Q\over X}\atop \frak {f}(m,-n)\neq \frak {f}'(m',-n')} |\mathcal {S}| \\ \ll& {U X^{3} \over QT} \sum_{u<U} {1\over u^{4}} \sum_{\frak {f},\frak {f}'\in \frak {F}} \sum _{\tilde{q}_{1}\mid a^{2}\atop\tilde{q}_{1}\ll Q} \sum_{q\asymp Q\atop q\equiv0(\tilde{q}_{1})} \sum _{n,m,n',m'\ll{U Q\over X}} u^{10} { \tilde{q}_{1} \over Q^{9/8} } \\ &{}\times\biggl(T \biggl({UQ\over X} \biggr)^{2} \biggr)^{1/2} \\ \ll_{\varepsilon }& {U^{8} X^{3} \over T} T^{2\delta }\, T^{\varepsilon } \biggl( {U Q\over X} \biggr)^{4} { 1\over Q^{9/8} } T^{1/2} {UQ\over X} \\ \ll& X^{2} T \ T^{2(\delta -1)}\, \bigl( X^{-1/8} T^{35/8} U^{13} T^{\varepsilon } \bigr) , \end{aligned}$$

where we used that $\frak {f}(m,n)\ll T (UQ/X)^{2}$ and Q<XT. From (3.1), we have

$$ X^{-1/8} T^{35/8} = N^{-29/1600} , $$

(9.11)

so we have again a power savings, as claimed. □
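The last step of the display in this proof is pure exponent bookkeeping, which can be checked mechanically. The sketch below tracks exponents of the symbols U, X, T, Q and a formal symbol `Td` standing for $T^{\delta}$; the factor $T^{\varepsilon}$ is ignored, and Q is replaced by its upper bound XT (legitimate here because Q appears with a non-negative exponent). All function names are hypothetical helpers.

```python
from fractions import Fraction as Fr

def times(*factors):
    """Multiply monomials given as {symbol: exponent} dictionaries."""
    out = {}
    for f in factors:
        for sym, e in f.items():
            out[sym] = out.get(sym, Fr(0)) + Fr(e)
    return {s: e for s, e in out.items() if e != 0}

def power(f, r):
    """Raise a monomial to the power r."""
    return {s: Fr(e) * Fr(r) for s, e in f.items()}

def substitute_Q(f):
    """Replace Q by the upper bound X*T; only valid for a non-negative Q-exponent."""
    q = f.get("Q", Fr(0))
    assert q >= 0
    rest = {s: e for s, e in f.items() if s != "Q"}
    return times(rest, {"X": q, "T": q})

UQ_over_X = {"U": 1, "Q": 1, "X": -1}

# Third line of the display: U^8 X^3 T^-1 * T^(2 delta) * (UQ/X)^4 * Q^(-9/8) * T^(1/2) * (UQ/X)
third_line = times({"U": 8, "X": 3, "T": -1}, {"Td": 2}, power(UQ_over_X, 4),
                   {"Q": Fr(-9, 8)}, {"T": Fr(1, 2)}, UQ_over_X)

# Fourth line: X^2 T * T^(2(delta-1)) * X^(-1/8) T^(35/8) U^13
fourth_line = times({"X": 2, "T": 1}, {"Td": 2, "T": -2},
                    {"X": Fr(-1, 8), "T": Fr(35, 8), "U": 13})

print(substitute_Q(third_line) == fourth_line)
```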

Lastly, we turn to the case $\mathcal {I}_{Q}^{(=,=)}$, with $\frak {f}(m,-n)=\frak {f}'(m',-n')$. We exploit this condition to get savings using (5.47).

Proposition 9.4

Assuming Q<XT, there is some η>0 such that

$$ \mathcal {I}_{Q}^{(=,=)} \ll N\, T^{2(\delta -1)}\, N^{-\eta} , $$

(9.12)

as N→∞.

Proof

Returning to (9.6), apply (5.31) and (5.47):

$$\begin{aligned} \mathcal {I}_{Q}^{(=,=)} \ll& {U X^{3} \over QT} \sum _{u<U} {1\over u^{4}} \sum _{q\asymp Q}\ \sum_{n,m\ll{UQ\over X}} \sum _{\frak {f}\in \frak {F}} \sum_{\frak {f}'\in \frak {F}\atop a'= a} \sum _{n',m'\ll UQ/X\atop \frak {f}(m,-n)=\frak {f}'(m',-n')} u^{4} {(a^{2},q)\over q^{5/4}}q^{1/4} \\ \ll& {U X^{3} \over Q^{2}T} \sum_{u<U} \sum _{\frak {f}\in \frak {F}} \sum_{\tilde{q}_{1}\mid a^{2}\atop\tilde{q}_{1}\ll Q} \tilde{q}_{1} \sum_{q\asymp Q\atop q\equiv0(\tilde{q}_{1})}\ \biggl[ \sum _{n,m\ll{UQ\over X}} \sum_{\frak {f}'\in \frak {F}\atop a'= a} \sum _{n',m'\ll UQ/X\atop \frak {f}(m,-n)=\frak {f}'(m',-n')} 1 \biggr] \\ \ll_{\varepsilon }& N^{\varepsilon } {U X^{3} \over Q^{2}T} U T^{\delta } Q \biggl[ \biggl( {UQ\over X} \biggr)^{2} + T {UQ\over X} \biggr] \\ \ll_{\varepsilon }& N^{\varepsilon } U^{4} X^{2} T^{\delta } \ll X^{2}T\, T^{2(\delta -1)}\, \bigl( T^{1-\delta } U^{4} N^{\varepsilon } \bigr) . \end{aligned}$$

From (4.2), this is a power savings. □

Combining (9.5), (9.7), (9.9), (9.10), and (9.12), we have the following

Theorem 9.5

If X≤Q<M, then there is some η>0 so that

$$\mathcal {I}_{Q}\ll N\, T^{2(\delta -1)} \, N^{-\eta}, $$

as N→∞.

Finally, Theorems 7.3, 8.5, and 9.5 together complete the proof of (3.14), and hence Theorem 1.2 is proved.

References

[1] Berenstein, C.A., Yger, A.: Effective Bezout identities in $Q[z_{1},\ldots,z_{n}]$. Acta Math. 166(1–2), 69–120 (1991)
[2] Bernays, P.: Über die Darstellung von positiven, ganzen Zahlen durch die primitiven, binären quadratischen Formen einer nicht quadratischen Diskriminante. PhD thesis, Georg-August-Universität, Göttingen, Germany (1912)
[3] Bourgain, J.: Integral Apollonian circle packings and prime curvatures. J. Anal. Math. 118(1), 221–249 (2012)
[4] Bourgain, J., Fuchs, E.: A proof of the positive density conjecture for integer Apollonian circle packings. J. Am. Math. Soc. 24(4), 945–967 (2011)
[5] Bourgain, J., Gamburd, A.: Expansion and random walks in $\mathrm{SL}_{d}(\mathbb{Z}/p^{n} \mathbb{Z})$. I. J. Eur. Math. Soc. 10(4), 987–1011 (2008)
[6] Bourgain, J., Gamburd, A.: Uniform expansion bounds for Cayley graphs of $\mathrm{SL}_{2}(\mathbb{F}_{p})$. Ann. Math. (2) 167(2), 625–642 (2008)
[7] Bourgain, J., Gamburd, A.: Expansion and random walks in $\mathrm{SL}_{d}(\mathbb {Z}/p^{n}\mathbb{Z})$. II. J. Eur. Math. Soc. 11(5), 1057–1103 (2009). With an appendix by Bourgain
[8] Bourgain, J., Kontorovich, A.: On representations of integers in thin subgroups of $\mathrm{SL}(2,\mathbf{Z})$. Geom. Funct. Anal. 20(5), 1144–1174 (2010)
[9] Bourgain, J., Kontorovich, A.: On Zaremba’s conjecture (2011). Preprint arXiv:1107.3776
[10] Bourgain, J., Varjú, P.P.: Expansion in $\mathrm{SL}_{n}(\mathbf{Z}/q\mathbf{Z})$, q arbitrary. Invent. Math. 188(1), 151–173 (2012)
[11] Bourgain, J., Gamburd, A., Sarnak, P.: Affine linear sieve, expanders, and sum-product. Invent. Math. 179(3), 559–644 (2010)
[12] Bourgain, J., Kontorovich, A., Sarnak, P.: Sector estimates for hyperbolic isometries. Geom. Funct. Anal. 20(5), 1175–1200 (2010)
[13] Bourgain, J., Gamburd, A., Sarnak, P.: Generalization of Selberg’s 3/16 theorem and affine sieve. Acta Math. 207, 255–290 (2011)
[14] Breuillard, E., Green, B., Tao, T.: Approximate subgroups of linear groups. Geom. Funct. Anal. 21(4), 774–819 (2011)
[15] Brooks, R.: The spectral geometry of a tower of coverings. J. Differ. Geom. 23(1), 97–107 (1986)
[16] Brooks, R.: The spectral geometry of Riemannian surfaces. In: Monastyrsky, M.I. (ed.) Topology in Molecular Biology. Springer, Berlin (2007)
[17] Burger, M.: Grandes valeurs propres du Laplacien et graphes. In: Séminaire de Théorie Spectrale et Géométrie, No. 4, Année 1985–1986, pp. 95–100. Univ. Grenoble I (1986)
[18] Burger, M.: Petites valeurs propres du Laplacien et topologie de Fell. PhD thesis, EPFL (1986)
[19] Burger, M.: Spectre du Laplacien, graphes et topologie de Fell. Comment. Math. Helv. 63(2), 226–252 (1988)
[20] Cowling, M., Haagerup, U., Howe, R.: Almost $L^{2}$ matrix coefficients. J. Reine Angew. Math. 387, 97–110 (1988)
[21] Diaconis, P., Saloff-Coste, L.: Comparison techniques for random walk on finite groups. Ann. Probab. 21(4), 2131–2156 (1993)
[22] Fuchs, E.: Arithmetic properties of Apollonian circle packings. Princeton University Thesis (2010)
[23] Fuchs, E., Sanden, K.: Some experiments with integral Apollonian circle packings. Exp. Math. 20(4), 380–399 (2011)
[24] Gelfand, I.M., Graev, M.I., Pjateckii-Shapiro, I.I.: Teoriya Predstavlenii i Avtomorfnye Funktsii. Generalized Functions, vol. 6. Nauka, Moscow (1966)
[25] Good, A.: Local Analysis of Selberg’s Trace Formula. Lecture Notes in Mathematics, vol. 1040. Springer, Berlin (1983)
[26] Graham, R.L., Lagarias, J.C., Mallows, C.L., Wilks, A.R., Yan, C.H.: Apollonian circle packings: number theory. J. Number Theory 100(1), 1–45 (2003)
[27] Graham, R.L., Lagarias, J.C., Mallows, C.L., Wilks, A.R., Yan, C.H.: Apollonian circle packings: geometry and group theory. I. The Apollonian group. Discrete Comput. Geom. 34(4), 547–585 (2005)
[28] Helfgott, H.A.: Growth and generation in $\mathrm{SL}_{2}(\mathbb{Z}/p\mathbb{Z})$. Ann. Math. (2) 167(2), 601–623 (2008)
[29] Hermann, G.: Die Frage der endlich vielen Schritte in der Theorie der Polynomideale. Math. Ann. 95(1), 736–788 (1926)
[30] Iwaniec, H., Kowalski, E.: Analytic Number Theory. American Mathematical Society Colloquium Publications, vol. 53. American Mathematical Society, Providence (2004)
[31] Kassabov, M., Lubotzky, A., Nikolov, N.: Finite simple groups as expanders. Proc. Natl. Acad. Sci. USA 103(16), 6116–6119 (2006)
[32] Kloosterman, H.D.: On the representation of numbers in the form $ax^{2}+by^{2}+cz^{2}+dt^{2}$. Acta Math. 49(3–4), 407–464 (1927)
[33] Kontorovich, A., Oh, H.: Apollonian circle packings and closed horospheres on hyperbolic 3-manifolds. J. Am. Math. Soc. 24(3), 603–648 (2011)
[34] Lagarias, J.C., Mallows, C.L., Wilks, A.R.: Beyond the Descartes circle theorem. Am. Math. Mon. 109(4), 338–361 (2002)
[35] Lax, P.D., Phillips, R.S.: The asymptotic distribution of lattice points in Euclidean and non-Euclidean space. J. Funct. Anal. 46, 280–350 (1982)
[36] Masser, D.W., Wüstholz, G.: Fields of large transcendence degree generated by values of elliptic functions. Invent. Math. 72(3), 407–464 (1983)
[37] Matthews, C., Vaserstein, L., Weisfeiler, B.: Congruence properties of Zariski-dense subgroups. Proc. Lond. Math. Soc. 48, 514–532 (1984)
[38] Patterson, S.J.: The limit set of a Fuchsian group. Acta Math. 136, 241–273 (1976)
[39] Pyber, L., Szabó, E.: Growth in finite simple groups of Lie type of bounded rank (2010). Preprint arXiv:1005.1858
[40] Salehi Golsefidy, A., Varjú, P.: Expansion in perfect groups. Geom. Funct. Anal. 22(6), 1832–1891 (2012)
[41] Sarnak, P.: Some Applications of Modular Forms. Cambridge Tracts in Mathematics, vol. 99. Cambridge University Press, Cambridge (1990)
[42] Sarnak, P.: Letter to J. Lagarias. web.math.princeton.edu/sarnak/AppolonianPackings.pdf (2007)
[43] Sarnak, P.: Integral Apollonian packings. Am. Math. Mon. 118(4), 291–306 (2011)
[44] Selberg, A.: On the estimation of Fourier coefficients of modular forms. Proc. Symp. Pure Math. VII, 1–15 (1965)
[45] Shalom, Y.: Bounded generation and Kazhdan’s property (T). Publ. Math. Inst. Hautes Études Sci. 90, 145–168 (1999)
[46] Soddy, F.: The bowl of integers and the hexlet. Nature 139, 77–79 (1937)
[47] Sullivan, D.: Entropy, Hausdorff measures old and new, and limit sets of geometrically finite Kleinian groups. Acta Math. 153(3–4), 259–277 (1984)
[48] Varjú, P.P.: Expansion in $\mathrm{SL}_{d}(\mathcal{O}_{K}/I)$, I square-free. J. Eur. Math. Soc. 14(1), 273–305 (2012)
[49] Vinogradov, I.: Effective bisector estimate with application to Apollonian circle packings. IMRN (2013). Princeton University Thesis (2012). arXiv:1204.5498v1


Acknowledgements

The authors are grateful to Peter Sarnak for illuminating discussions, and many detailed comments improving the exposition of an earlier version of this paper. We thank Tim Browning, Sam Chow, Hee Oh, Xin Zhang, and the referee for numerous corrections and suggestions.

Author information

Authors and Affiliations

IAS, Princeton, NJ, 08540, USA
Jean Bourgain
Yale University, New Haven, CT, 06511, USA
Alex Kontorovich


Corresponding author

Correspondence to Alex Kontorovich.

Additional information

Bourgain is partially supported by NSF grant DMS-0808042.

Kontorovich is partially supported by NSF grants DMS-1209373, DMS-1064214 and DMS-1001252.

Varjú is partially supported by the Simons Foundation and the European Research Council (Advanced Research Grant 267259).

Appendix: Spectral gap for the Apollonian group (by Péter P. Varjú)

P.P. Varjú University of Cambridge, Cambridge CB3 0WA, UK e-mail: pv270@dpmms.cam.ac.uk

In recent years some spectacular advances were made on estimating spectral gaps (to be defined below) of infinite co-volume subgroups of $\operatorname {SL}(d,\mathbb {Z})$. Bourgain and Gamburd [6] proved uniform spectral gap estimates for Zariski-dense subgroups of $\operatorname {SL}(2,\mathbb {Z})$ under the additional assumption that the modulus q is prime. One of the crucial ideas in their paper is the application of Helfgott’s triple-product theorem [28]. The result in [6] was generalized in a series of papers [5, 7, 10, 11, 48] and [40]. Some of these require the generalization of [28] obtained independently by Breuillard, Green and Tao [14] and Pyber and Szabó [39].

In particular, Bourgain and Varjú [10, Theorem 1] proved the spectral gap for Zariski-dense subgroups of $\operatorname {SL}(d, \mathbb {Z})$ without any restriction on the modulus q. Salehi Golsefidy and Varjú [40, Theorem 1] obtained the result for Zariski-dense subgroups of perfect arithmetic groups, but only for square-free q. Unfortunately, these results do not cover Theorem 4.3: the first is not applicable to the Apollonian group, and the second is restricted to square-free moduli.

In this appendix, we present an approach which differs from those discussed above. This is much simpler and probably would give better numerical results, but we do not pursue explicit bounds. However, our method depends on special properties of the Apollonian group and does not apply to general Zariski-dense subgroups.

Recall from Sect. 2 that the preimage of the Apollonian group under the homomorphism

$$\iota: \operatorname {SL}(2,\mathbb {C})\to \operatorname {SO}_F(\mathbb {R}) $$

is generated by the matrices

$$ \pm \left ( \begin{array}{c@{\quad}c}{1}&{4i}\\ {0}&{1} \end{array} \right ),\qquad\pm \left ( \begin{array}{c@{\quad}c} {2}&{-i}\\ {-i}&{0} \end{array} \right ), \qquad\pm \left ( \begin{array}{c@{\quad}c} {2+2i}&{4+3i}\\ {-i}&{-2i} \end{array} \right ). $$

(A.1)

We describe an automorphism of $\operatorname {SL}(2,\mathbb {Z}[i])$ which transforms the above generators to matrices that will be more convenient to work with. Set $A=\left ( \begin{array}{c@{\quad}c} {1}&{i}\\ {0}&{1} \end{array} \right )$. A simple calculation shows that the images of the matrices (A.1) under the map $g\mapsto A^{-1}gA$ are

$$\pm \left ( \begin{array}{c@{\quad}c} {1}&{4i}\\ {0}&{1} \end{array} \right ),\qquad\pm \left ( \begin{array}{c@{\quad}c} {1}&{0}\\ {-i}&{1} \end{array} \right ), \qquad\pm \left ( \begin{array}{c@{\quad}c} {1+2i}&{4i}\\ {-i}&{1-2i} \end{array} \right ). $$

We put

$$ \gamma _1=\left ( \begin{array}{c@{\quad}c} {1}&{4}\\ {0}&{1} \end{array} \right ),\qquad \gamma _2=\left ( \begin{array}{c@{\quad}c} {1}&{0}\\ {1}&{1} \end{array} \right ), \qquad \gamma _3=\left ( \begin{array}{c@{\quad}c} {1+2i}&{4}\\ {1}&{1-2i} \end{array} \right ). $$

(A.2)

These are the images of (A.1) under the composition of two isomorphisms: first conjugation by A, and then multiplication of the upper-right and lower-left entries by −i and i, respectively. We denote by $\bar{\varGamma }$ the group generated by $\bar{S}=\{\pm \gamma _{1}^{\pm1},\pm \gamma _{2}^{\pm1},\pm \gamma _{3}^{\pm1}\}$. This is isomorphic to the group denoted by the same symbol in the paper.
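The passage from (A.1) to (A.2) can be checked by direct multiplication. The sketch below assumes the conjugating matrix is the upper unipotent matrix with entries 1, i, 0, 1; this choice is an assumption made for illustration, picked because it reproduces the displayed matrices. Python's built-in complex numbers are exact for the small Gaussian integers involved. The helper names `mul`, `conjugate` and `twist` are hypothetical.

```python
def mul(p, q):
    """Product of two 2x2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = p
    (e, f), (g, h) = q
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

# Assumed conjugating matrix and its inverse (illustrative choice).
A = ((1, 1j), (0, 1))
A_INV = ((1, -1j), (0, 1))

def conjugate(g):
    """g -> A^{-1} g A."""
    return mul(A_INV, mul(g, A))

def twist(g):
    """Multiply the upper-right entry by -i and the lower-left entry by i."""
    (a, b), (c, d) = g
    return ((a, -1j * b), (1j * c, d))

GENS_A1 = [((1, 4j), (0, 1)),
           ((2, -1j), (-1j, 0)),
           ((2 + 2j, 4 + 3j), (-1j, -2j))]
EXPECTED_CONJUGATES = [((1, 4j), (0, 1)),
                       ((1, 0), (-1j, 1)),
                       ((1 + 2j, 4j), (-1j, 1 - 2j))]
EXPECTED_A2 = [((1, 4), (0, 1)),
               ((1, 0), (1, 1)),
               ((1 + 2j, 4), (1, 1 - 2j))]

conjugates = [conjugate(g) for g in GENS_A1]
final = [twist(g) for g in conjugates]
print(conjugates == EXPECTED_CONJUGATES, final == EXPECTED_A2)
```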

First we recall two different notions of spectral gap. The first notion, the “geometric” spectral gap, has already been explained in Sect. 4.2. Recall that for an integer q, $\bar{\varGamma}(q)$ denotes the kernel of the projection map $\bar{\varGamma}\to \operatorname {SL}(2,\mathbb {Z}[i]/(q))$. We consider the Laplace–Beltrami operator Δ on the hyperbolic orbifolds $\bar{\varGamma}(q)\backslash \mathbb {H}^{3}$. We denote by λ ₀(q)≤λ ₁(q) the two smallest eigenvalues of Δ on $\bar{\varGamma}(q)\backslash \mathbb {H}^{3}$. The geometric spectral gap is an inequality of the form λ ₁(q)>λ ₀(q)+ε for some ε>0 independent of q.

The other notion, the “combinatorial” spectral gap, is defined as follows. Let G be a finite group, and S a symmetric set of generators. Let $T_{G,S}$ be the Markov operator on the space $L^{2}(G)$ defined by

$$T_{G,S}f(g)=\frac{1}{|S|}\sum_{\gamma \in S} f(\gamma g) $$

for f∈L ²(G) and g∈G. We denote by

$$\lambda _n'(G,S)\le\cdots\le \lambda _1'(G,S) \le \lambda _0'(G,S)=1 $$

the eigenvalues of $T_{G,S}$ in increasing order.
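To make the definition concrete, the following sketch evaluates the Markov operator on a toy example that is not taken from the paper: the cyclic group $\mathbb{Z}/12$ (written additively) with the symmetric generating set {+1, −1}. For an abelian group the characters are eigenfunctions, so the spectrum can be written down and compared with a direct application of the operator. All names are illustrative.

```python
import cmath
import math

def markov(f, n, S):
    """(T f)(g) = (1/|S|) * sum over s in S of f(s + g), on G = Z/n."""
    return [sum(f[(s + g) % n] for s in S) / len(S) for g in range(n)]

def character(r, n):
    """The character g -> exp(2*pi*i*r*g/n) of Z/n."""
    return [cmath.exp(2 * math.pi * 1j * r * g / n) for g in range(n)]

def eigenvalue(r, n, S):
    """Average of the character over S; real because S is symmetric."""
    return sum(math.cos(2 * math.pi * r * s / n) for s in S) / len(S)

def is_eigenfunction(r, n, S, tol=1e-12):
    chi = character(r, n)
    lam = eigenvalue(r, n, S)
    return all(abs(t - lam * c) < tol for t, c in zip(markov(chi, n, S), chi))

n, S = 12, [1, -1]
eigenvalues = sorted(eigenvalue(r, n, S) for r in range(n))
print(eigenvalues[-1], eigenvalues[-2])  # the two largest eigenvalues
```

In this toy case the largest eigenvalue is 1 (constant functions) and the next one is cos(2π/12), so the gap shrinks as the cyclic group grows; a uniform gap over a family of quotients is exactly what rules this behaviour out.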

The operator $Id-T_{\bar{\varGamma}/\bar{\varGamma}(q)}$ is a discrete analogue of the Laplacian Δ on $\bar{\varGamma}(q)\backslash \mathbb {H}^{3}$. So by combinatorial spectral gap we mean the inequality

$$\lambda _1'\bigl(\bar{\varGamma}/\bar{\varGamma}(q),\bar{S} \bigr)<1-\varepsilon $$

for some ε>0 independent of q. To simplify notation, we will write $\lambda _{1}'(q)=\lambda _{1}'(\bar{\varGamma}/\bar{\varGamma}(q),\bar{S})$.

The relation between the two notions is not just an analogy. It was shown by Brooks [15, Theorem 1] and Burger [17–19] that they are equivalent for the fundamental groups of a family of covers of a compact manifold. The orbifolds $\bar{\varGamma}(q)\backslash \mathbb {H}^{3}$ are not compact, and they even have infinite volume; however, the equivalence can be extended to cover our example, see [13, Theorems 1.2 and 2.1].

We show that the congruence subgroups $\bar{\varGamma}(q)$ of the Apollonian group have combinatorial spectral gap, which implies Theorem 4.3 in light of [13, Theorems 1.2 and 2.1].

Theorem A.1

Let $\bar{\varGamma}$ be the Apollonian group and $\lambda '_{1}(q)$ be as above. There is an absolute constant c>0 such that $\lambda _{1}'(q)<1-c$ for all q. That is, the Apollonian group has combinatorial spectral gap.

Denote by Γ ₁ and Γ ₂ the groups generated by {γ ₁,γ ₂} and {γ ₁,γ ₃}, respectively. Denote by ${\bf G}_{1}$ and ${\bf G}_{2}$ the Zariski-closures of Γ ₁ and Γ ₂ in $\operatorname {Res}_{\mathbb {R}|\mathbb {C}} \operatorname {SL}(2,\mathbb {C})$, i.e. in $\operatorname {SL}(2,\mathbb {C})$ considered as an algebraic group over $\mathbb {R}$.

As we will see later, ${\bf G}_{1}$ and ${\bf G}_{2}$ are isomorphic to $\operatorname {SL}(2,\mathbb {R})$. Moreover Γ ₁ and Γ ₂ are lattices inside them. This feature of the Apollonian group was pointed out by Sarnak [42]. We exploit it heavily in our approach.

Due to a result going back to Selberg [44], Γ ₁ and Γ ₂ have geometric spectral gaps with respect to the congruence subgroups. From here we can deduce the combinatorial spectral gap using Brooks [15, Theorem 1] (see also [16, Theorem 1], where the non-compact case is considered.)

We transfer the combinatorial spectral gap property of Γ ₁ and Γ ₂ to the Apollonian group $\bar{\varGamma}$ and conclude Theorem A.1. This is done in the following two lemmata:

Lemma A.2

Let G be a finite group and S⊂G a finite symmetric generating set. Let G ₁,G ₂,…,G _k be subgroups of G such that for every g∈G there are g ₁∈G ₁,…,g _k∈G _k such that g=g ₁⋯g _k. Then

$$1-\lambda _1'(G,S)\ge\min_{1\le i\le k} \biggl \{\frac{|S\cap G_{i}|}{|S|}\cdot \frac{1-\lambda _1'(G_{i},S\cap G_{i})}{2k^2} \biggr\}. $$

The above lemma and its proof below are closely related to the well-known fact that if G is generated by S in k steps then one has $\lambda '_{1}(G,S)\le1-1/(|S|k^{2})$. This can be found, for example, in [21, Corollary 1 on page 2138]. After circulating an earlier version of this appendix, it was pointed out to me that an idea similar to Lemma A.2 has been used by Sarnak [41, Sect. 2.4], by Shalom [45], and also by Kassabov, Lubotzky and Nikolov [31].
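The inequality of Lemma A.2 can be tested numerically on a small abelian example, which is a toy case and not one of the groups in the appendix: $G=\mathbb{Z}/5\times\mathbb{Z}/7$ with S consisting of plus or minus the two coordinate generators, $G_{1}=\mathbb{Z}/5\times\{0\}$, $G_{2}=\{0\}\times\mathbb{Z}/7$ and k=2. For abelian groups the eigenvalues of the Markov operator are averages of characters over S, so no linear algebra library is needed. The function `lambda1` is a hypothetical helper.

```python
import math
from itertools import product

def lambda1(moduli, S):
    """Second-largest eigenvalue of T_{G,S} for G = product of Z/m, m in moduli.

    Each character r gives the eigenvalue (1/|S|) * sum of chi_r(s) over s in S,
    which is real because S is symmetric.
    """
    best = -1.0
    for r in product(*(range(m) for m in moduli)):
        if not any(r):
            continue  # trivial character, eigenvalue 1
        val = sum(
            math.cos(2 * math.pi * sum(ri * si / m for ri, si, m in zip(r, s, moduli)))
            for s in S
        ) / len(S)
        best = max(best, val)
    return best

moduli = (5, 7)
S = [(1, 0), (-1, 0), (0, 1), (0, -1)]
k = 2

gap_G = 1 - lambda1(moduli, S)
# Subgroups G_1, G_2 with S ∩ G_i = {±generator}, so |S ∩ G_i| / |S| = 2/4.
gap_1 = 1 - lambda1((5,), [(1,), (-1,)])
gap_2 = 1 - lambda1((7,), [(1,), (-1,)])
bound = min(0.5 * gap_1 / (2 * k**2), 0.5 * gap_2 / (2 * k**2))
print(gap_G, bound)
```

In this example the true gap is considerably larger than the bound, which is consistent with the remark that no effort is made to optimize constants.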

Lemma A.3

Let q≥2 be an integer. Then for every $g\in\bar{\varGamma}/\bar{\varGamma}(q)$, there are $g_{1},\ldots, g_{10^{13}}\in \varGamma _{1}/\varGamma _{1}(q)$ and $h_{1},\ldots, h_{10^{13}}\in \varGamma _{2}/\varGamma _{2}(q)$ such that $g=g_{1}h_{1}\cdots g_{10^{13}}h_{10^{13}}$.

Lemma A.3 enables us to apply Lemma A.2 with k=2⋅10¹³ and G _i=Γ ₁/Γ ₁(q) for odd i and G _i=Γ ₂/Γ ₂(q) for even i. Now [44] and [16, Theorem 1] provide us with lower bounds on

$$\begin{aligned} &1-\lambda _1'\bigl(\varGamma _1/ \varGamma _1(q),\bigl\{\pm \gamma _1^{\pm1},\pm \gamma _2^{\pm1}\bigr\}\bigr) \quad\mbox{and}\\ & 1- \lambda _1'\bigl(\varGamma _2/\varGamma _2(q), \bigl\{\pm \gamma _1^{\pm1},\pm \gamma _3^{\pm1} \bigr\}\bigr). \end{aligned}$$

Therefore Theorem A.1 is proved once the two Lemmata are proved.

Before we proceed with the proofs, we make two remarks. First, we note that instead of [44] we could just as well use [10, Theorem 1]. Second, we note that the constant 10¹³ in Lemma A.3 is not optimal. In particular, the argument we present would give 72 if the statement is checked for q=2⁷⋅3, e.g. by a computer program. Certainly there is further room for improvement, but we make no effort to optimize the constants.

Proof of Lemma A.2

Denote by π the regular representation of G, i.e. we write

$$\pi(g_0)f(g)=f\bigl(g_0^{-1}g\bigr) $$

for f∈L ²(G) and g,g ₀∈G. Let $T_{G,S}$ be the Markov operator defined above. Let f ₀∈L ²(G) be an eigenfunction with ∥f ₀∥₂=1 corresponding to $\lambda _{1}'(G,S)$. It is orthogonal to the constants and

$$\langle T_{G,S}f_0,f_0\rangle= \lambda _1'(G,S). $$

Since f ₀ is orthogonal to the constant, we have

$$\sum_{g\in G}\bigl\langle\pi(g)f_0,f_0 \bigr\rangle=\big|\langle f_0,1\rangle\big|^2=0. $$

Thus there is g ₀∈G such that 〈π(g ₀)f ₀,f ₀〉≤0 and hence $\|\pi(g_{0})f_{0}-f_{0}\|_{2}\ge\sqrt{2}$.

By the hypothesis of the lemma, there are g _i∈G _i for 1≤i≤k such that g ₀=g ₁⋯g _k. By the triangle inequality, there is some 1≤i ₀≤k such that

$$\big\|\pi(g_1\cdots g_{i_0-1})f_0- \pi(g_1\cdots g_{i_0})f_0\big\|_2\ge \sqrt{2}/k. $$

Since π is unitary, we have $\|f_{0}-\pi(g_{i_{0}})f_{0}\|_{2}\ge\sqrt{2}/k$.

We write f ₀=f ₁+f ₂ such that f ₁ is invariant under the elements of $G_{i_{0}}$ in the regular representation π and f ₂ is orthogonal to the space of functions invariant under $G_{i_{0}}$. Then

$$\sqrt{2}/k\le\big\|f_0-\pi(g_{i_0})f_0 \big\|_2 =\big\|f_2-\pi(g_{i_0})f_2 \big\|_2\le2\|f_2\|_2. $$

Thus $\|f_{2}\|_{2}\ge1/(\sqrt{2}\,k)$.

Now we can write

$$\begin{aligned} \langle T_{G,S\cap G_{i_0}}f_0,f_0\rangle =& \|f_1\|_2^2+\langle T_{G,S\cap G_{i_0}}f_2,f_2 \rangle \\ \le&\|f_1\|_2^2+\lambda _1'(G_{i_0},S \cap G_{i_0})\|f_2\|_2^2 \\ =& 1-\bigl(1-\lambda _1'(G_{i_0},S\cap G_{i_0})\bigr)\|f_2\|_2^2. \end{aligned}$$

(A.3)

Since

$$T_{G,S}=\frac{|S\cap G_{i_0}|}{|S|}T_{G,S\cap G_{i_0}} +\frac{|S\backslash G_{i_0}|}{|S|}T_{G,S\backslash G_{i_0}}, $$

we have

$$ \langle T_{G,S}f_0,f_0 \rangle\le1- \frac{|S\cap G_{i_0}|}{|S|}\bigl(1-\langle T_{G,S\cap G_{i_0}}f_0,f_0 \rangle\bigr). $$

(A.4)

We combine (A.3), (A.4) and the estimate on ∥f ₂∥₂ and get

$$\langle T_{G,S}f_0,f_0\rangle\le1- \frac{|S\cap G_{i_0}|}{|S|}\cdot \frac{1-\lambda _1'(G_{i_0},S\cap G_{i_0})}{2k^2} $$

which was to be proved. □

Now we turn to the proof of Lemma A.3. It will be convenient to write

$$A_k(q)=\bigl\{g_1h_1\cdots g_kh_k:g_1,\ldots g_k\in \varGamma _1/\varGamma _1(q), h_1,\ldots h_k\in \varGamma _2/\varGamma _2(q)\bigr\}. $$

First we consider the case when q is the power of a prime; the general case will be easy to deduce from this.

Lemma A.4

Let p be a prime and m a positive integer. Then $A_{10^{13}}(p^{m})=\bar{\varGamma}/\bar{\varGamma}(p^{m})$.

We use different methods when p is 2 or 3 compared to when it is larger. First we consider the latter situation.

Proof of Lemma A.4 for p≥5

It is well-known and easy to check that the group generated by γ ₁ and γ ₂ is

$$ \varGamma _1=\left \{\left ( \begin{array}{c@{\quad}c} {a}&{b}\\ {c}&{d} \end{array} \right )\in \operatorname {SL}(2,\mathbb {Z}): b\equiv0\quad \operatorname {mod}4 \right \}. $$

(A.5)

Thus $\varGamma _{1}/\varGamma _{1}(p^{m})= \operatorname {SL}(2,\mathbb {Z}/p^{m}\mathbb {Z})$ for p≠2.
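For small moduli this surjectivity statement can be confirmed by brute force. The sketch below generates the subgroup of 2×2 matrices mod n spanned by the reductions of γ₁ and γ₂ and compares its size with $|\operatorname{SL}(2,\mathbb{Z}/p^{m}\mathbb{Z})|=p^{3m-2}(p^{2}-1)$. In a finite group, products of the generators alone already give the whole generated subgroup, so inverses are not added explicitly. The function names are hypothetical.

```python
from collections import deque

def mul(g, h, n):
    """Product of 2x2 matrices stored as (a, b, c, d), entries reduced mod n."""
    a, b, c, d = g
    e, f, r, s = h
    return ((a * e + b * r) % n, (a * f + b * s) % n,
            (c * e + d * r) % n, (c * f + d * s) % n)

def generated(n):
    """Subgroup of SL(2, Z/n) generated by gamma_1 = [[1,4],[0,1]] and gamma_2 = [[1,0],[1,1]]."""
    gens = [(1, 4 % n, 0, 1), (1, 0, 1, 1)]
    identity = (1, 0, 0, 1)
    seen, queue = {identity}, deque([identity])
    while queue:  # breadth-first search over the Cayley graph
        g = queue.popleft()
        for s in gens:
            h = mul(g, s, n)
            if h not in seen:
                seen.add(h)
                queue.append(h)
    return seen

def sl2_order(p, m):
    """Order of SL(2, Z/p^m) for a prime p."""
    return p ** (3 * m - 2) * (p * p - 1)

for p, m in [(3, 1), (3, 2), (5, 1), (5, 2), (7, 1)]:
    print(p, m, len(generated(p ** m)) == sl2_order(p, m))
```

For comparison, `generated(4)` has only 4 elements while `sl2_order(2, 2)` is 48, which illustrates why p=2 is excluded.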

By simple calculation:

$$\left ( \begin{array}{c@{\quad}c} {a^{-1}}&{0}\\ {0}&{a} \end{array} \right ) \left ( \begin{array}{c@{\quad}c} {\frac{1}{2}}&{0}\\ {\frac{-1}{8}}&{2} \end{array} \right ) \gamma _3^2 \left ( \begin{array}{c@{\quad}c} {1}&{0}\\ {\frac{1}{8}}&{1} \end{array} \right ) \gamma _3^{-1} \left ( \begin{array}{c@{\quad}c} {a}&{0}\\ {0}&{a^{-1}} \end{array} \right ) =\left ( \begin{array}{c@{\quad}c} {1}&{0}\\ {\frac{-3ia^2}{2}}&{1} \end{array} \right ). $$

Since p≠2 we can divide by 2 in the ring $\mathbb {Z}/p^{m}\mathbb {Z}$, hence for (a,p)=1, the matrices in the above calculation are in Γ ₁/Γ ₁(p ^m) except for γ ₃. Therefore

$$\left ( \begin{array}{c@{\quad}c} {1}&{0}\\ {\frac{-3ia^2}{2}}&{1} \end{array} \right )\in A_3 \bigl(p^m\bigr). $$
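The matrix identity behind this membership can be verified with exact arithmetic over the Gaussian rationals $\mathbb{Q}(i)$; since the only denominators are 2, 8 and a, the same identity then holds in $\mathbb{Z}[i]/(p^{m})$ for odd p and (a,p)=1. The sketch below represents a Gaussian rational as a pair of fractions; the helper names are illustrative.

```python
from fractions import Fraction as Fr

def g(re, im=0):
    """Gaussian rational re + im*i as a pair of exact fractions."""
    return (Fr(re), Fr(im))

def zadd(z, w):
    return (z[0] + w[0], z[1] + w[1])

def zmul(z, w):
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def mat_mul(P, Q):
    return tuple(
        tuple(zadd(zmul(P[r][0], Q[0][c]), zmul(P[r][1], Q[1][c])) for c in range(2))
        for r in range(2)
    )

def chain(*mats):
    out = mats[0]
    for m in mats[1:]:
        out = mat_mul(out, m)
    return out

GAMMA3 = ((g(1, 2), g(4)), (g(1), g(1, -2)))
GAMMA3_INV = ((g(1, -2), g(-4)), (g(-1), g(1, 2)))
GAMMA3_SQ = mat_mul(GAMMA3, GAMMA3)

def lhs(a):
    """Left-hand side of the displayed identity for a nonzero rational a."""
    a = Fr(a)
    return chain(
        ((g(1 / a), g(0)), (g(0), g(a))),
        ((g(Fr(1, 2)), g(0)), (g(Fr(-1, 8)), g(2))),
        GAMMA3_SQ,
        ((g(1), g(0)), (g(Fr(1, 8)), g(1))),
        GAMMA3_INV,
        ((g(a), g(0)), (g(0), g(1 / a))),
    )

def rhs(a):
    """Lower unipotent matrix with lower-left entry -3*i*a^2/2."""
    a = Fr(a)
    return ((g(1), g(0)), (g(0, Fr(-3, 2) * a * a), g(1)))

print(all(lhs(a) == rhs(a) for a in (1, 2, 3, Fr(5, 7))))
```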

Using this, we want to show that

$$ \left ( \begin{array}{c@{\quad}c} {1}&{0}\\ {a i}&{1} \end{array} \right )\in A_{12}\bigl(p^m\bigr) $$

(A.6)

for all $a\in \mathbb {Z}/p^{m}\mathbb {Z}$. To do this, we need to show that for every element $x\in \mathbb {Z}/p^{m}\mathbb {Z}$, we can find elements $a_{1},\ldots, a_{k}\in \mathbb {Z}/p^{m}\mathbb {Z}$ for some 0≤k≤4, such that a ₁,…,a _k are not divisible by p and $x=a_{1}^{2}+\cdots+a_{k}^{2}$. If m=1, this simply follows from the fact that any positive integer is a sum of at most 4 squares, and the a _i cannot be divisible by p since 0<a _i≤x≤p and at least one of the inequalities is strict.
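The case m=1 is easy to confirm by exhaustive search for small primes. The sketch below looks for a representation of each representative x in 1,…,p (with x=p standing for the residue 0) as a sum of at most four positive squares and checks that no term is divisible by p. The function name and the chosen list of primes are illustrative.

```python
import math
from itertools import combinations_with_replacement

def nonzero_squares(x, max_terms=4):
    """Return positive integers a_1 <= ... <= a_k, k <= max_terms, with
    a_1^2 + ... + a_k^2 == x, or None if there is no such tuple."""
    root = math.isqrt(x)
    for k in range(1, max_terms + 1):
        for combo in combinations_with_replacement(range(1, root + 1), k):
            if sum(a * a for a in combo) == x:
                return combo
    return None

def check_prime(p):
    """Every x in 1..p is a sum of at most 4 squares of integers prime to p."""
    for x in range(1, p + 1):
        rep = nonzero_squares(x)
        if rep is None or any(a % p == 0 for a in rep):
            return False
    return True

print([check_prime(p) for p in (5, 7, 11, 13, 101)])
```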

Suppose that m>1, $x\in \mathbb {Z}/p^{m}\mathbb {Z}$ and $a_{1}^{2}+\cdots+a_{k}^{2}\equiv x\operatorname {mod}p$ with none of a ₁…a _k divisible by p. Then by Hensel’s lemma (recall that p≠2), there is an $a_{1}'\in \mathbb {Z}/p^{m}\mathbb {Z}$ such that

$$\bigl(a_1'\bigr)^2=a_1^2+ \bigl(x-a_1^2-\cdots-a_k^2 \bigr). $$

This proves the claim for arbitrary m≥1.
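The lifting step can be sketched as a Newton iteration for square roots modulo odd prime powers. The code below is one standard way to realise Hensel's lemma, not necessarily the form the author has in mind; the function name `hensel_sqrt` and the sample values p=5, m=3, x=57 are assumptions chosen for illustration. It relies on Python 3.8+ for `pow(x, -1, mod)`.

```python
def hensel_sqrt(y, a, p, m):
    """Lift a root a of t^2 = y (mod p), with p odd and p not dividing a,
    to a root modulo p**m, gaining one power of p per Newton step."""
    assert p % 2 == 1 and a % p != 0 and (a * a - y) % p == 0
    mod = p
    for _ in range(m - 1):
        mod *= p
        # Newton step: a <- a - (a^2 - y) / (2a), computed modulo the new modulus.
        a = (a - (a * a - y) * pow(2 * a, -1, mod)) % mod
    return a

# Example: p = 5, m = 3, target x = 57. Modulo 5 we have 57 = 2 = 1^2 + 1^2.
p, m, x = 5, 3, 57
a1, a2 = 1, 1
# Adjust only a1: we need (a1')^2 = a1^2 + (x - a1^2 - a2^2) modulo p^m.
y = a1 * a1 + (x - a1 * a1 - a2 * a2)
a1_lifted = hensel_sqrt(y, a1, p, m)
print((a1_lifted * a1_lifted + a2 * a2) % p**m == x % p**m)
```

The lifted value stays congruent to the original a₁ modulo p, so it is still not divisible by p.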

Multiplying (A.6) by a suitable unipotent element of Γ ₁/Γ ₁(p ^m), we can get

$$\left ( \begin{array}{c@{\quad}c} {1}&{0}\\ {a}&{1} \end{array} \right )\in A_{12} \bigl(p^m\bigr) $$

for $a\in \mathbb {Z}[i]/(p^{m})$. We can prove the same for the upper triangular unipotents by a very similar argument.

Again, by simple calculation:

$$\left ( \begin{array}{c@{\quad}c} {1}&{a}\\ {0}&{1} \end{array} \right ) \left ( \begin{array}{c@{\quad}c} {1}&{0}\\ {b}&{1} \end{array} \right ) \left ( \begin{array}{c@{\quad}c} {1}&{c}\\ {0}&{1} \end{array} \right ) =\left ( \begin{array}{c@{\quad}c} {1+ab}&{a+c+abc}\\ {b}&{1+bc} \end{array} \right ). $$

This shows that

$$\left ( \begin{array}{c@{\quad}c} {a'}&{b'}\\ {c'}&{d'} \end{array} \right )\in A_{36} \bigl(p^m\bigr) $$

for all $a',b',c',d'\in \mathbb {Z}[i]/(p^{m})$, a′d′−b′c′=1, provided c′ is not divisible by a prime above p.
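The factorisation into three unipotent matrices is explicit: with b=c′, one takes a=(a′−1)/b and c=(d′−1)/b, and the determinant condition then forces the upper-right entry. The sketch below carries this out over $\mathbb{Z}/n\mathbb{Z}$ rather than $\mathbb{Z}[i]/(p^{m})$ to keep the code short; the same formulas apply in any commutative ring in which c′ is a unit. Function names and sample matrices are illustrative.

```python
def factor_unipotent(mat, n):
    """Given (a', b', c', d') with a'd' - b'c' = 1 (mod n) and c' a unit mod n,
    return (a, b, c) such that
    [[1, a], [0, 1]] * [[1, 0], [b, 1]] * [[1, c], [0, 1]] = mat (mod n)."""
    ap, bp, cp, dp = mat
    assert (ap * dp - bp * cp) % n == 1
    b = cp % n
    b_inv = pow(b, -1, n)  # raises ValueError if c' is not a unit
    a = (ap - 1) * b_inv % n
    c = (dp - 1) * b_inv % n
    return a, b, c

def product3(a, b, c, n):
    """The product of the three unipotent matrices, from the displayed formula."""
    return ((1 + a * b) % n, (a + c + a * b * c) % n, b % n, (1 + b * c) % n)

n = 5 ** 3
for mat in [(2, 3, 1, 2), (7, 2, 3, 1)]:
    a, b, c = factor_unipotent(mat, n)
    print(product3(a, b, c, n) == mat)
```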

Thus, A ₃₆(p ^m) contains more than half of the group $\bar{\varGamma}/\bar{\varGamma}(p^{m})$, hence

$$A_{72}\bigl(p^m\bigr)=\bar{\varGamma}/\bar{\varGamma} \bigl(p^m\bigr). $$

□
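The last step uses the standard fact that a subset B of a finite group G with |B|>|G|/2 satisfies B·B=G: for any g, the sets $gB^{-1}$ and B are too large to be disjoint. A small exhaustive check on the symmetric group S₃, chosen only as a toy non-abelian example, is sketched below with hypothetical helper names.

```python
from itertools import combinations, permutations

def compose(s, t):
    """Product of permutations given as tuples: (s*t)(i) = s[t[i]]."""
    return tuple(s[i] for i in t)

G = list(permutations(range(3)))  # the six elements of S_3

def product_set(B):
    return {compose(x, y) for x in B for y in B}

# Every subset with more than |G|/2 elements squares to the whole group.
large_subsets_ok = all(
    product_set(B) == set(G)
    for r in range(len(G) // 2 + 1, len(G) + 1)
    for B in combinations(G, r)
)

# The bound is sharp: a subgroup of index 2 squares to itself.
A3 = [(0, 1, 2), (1, 2, 0), (2, 0, 1)]
print(large_subsets_ok, product_set(A3) == set(A3))
```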

Proof of Lemma A.4 for p=2 and 3

We give the proof for p=2 and then explain the differences for p=3.

We prove by induction the following statement. For every m≥7 and $g\in\bar{\varGamma}(2^{7})/\bar{\varGamma}(2^{m})$, there are g ₁,g ₂,g ₃∈Γ ₁(2²)/Γ ₁(2^m) such that

$$g=g_1\gamma _3g_2\gamma _3^{-1} \gamma _3^2 g_3\gamma _3^{-2}. $$

For m=7 this is clear since we can take g ₁=g ₂=g ₃=1. Now assume that m>7 and the statement holds for m−1. In this proof, we denote by 1 the multiplicative unit (identity matrix) and by 0 the matrix with all entries 0. Let $g\in\bar{\varGamma}(2^{7})/\bar{\varGamma}(2^{m})$ be arbitrary. By the induction hypothesis, there are h ₁,h ₂,h ₃∈Γ ₁(2²)/Γ ₁(2^m) such that

$$g-h_1\gamma _3h_2\gamma _3^{-1} \gamma _3^2 h_3\gamma _3^{-2}=2^{m-1}x, $$

where x can be considered as an element of $\operatorname {Mat}(2,\mathbb {Z}[i]/(2))$, i.e. a 2×2 matrix with elements in $\mathbb {Z}[i]/(2)$. Since g,h ₁,h ₂,h ₃ have determinant 1 and are congruent to the unit element mod 2, x has trace 0.

Now we look for suitable $x_{1},x_{2},x_{3}\in \operatorname {Mat}(2,\mathbb {Z})$ such that

$$x_1+\gamma _3x_2\gamma _3^{-1}+ \gamma _3^2 x_3\gamma _3^{-2} \equiv2^{m-1}x\quad \operatorname {mod}2^m. $$

Moreover, we ensure that $x_{i}\equiv0\quad \operatorname {mod}2^{m-4}$ and that $\operatorname {Tr}(x_{i})\equiv0\quad \operatorname {mod}2^{m}$ for all i=1,2,3. Since m≥8, this implies that $h_{i}+x_{i}\equiv1\quad \operatorname {mod}4$ and $\det(h_{i}+x_{i})\equiv1 \quad \operatorname {mod}2^{m}$, hence h _i+x _i∈Γ ₁(2²)/Γ ₁(2^m). Recall (A.5) from the previous proof. If the matrices x _i satisfy the claimed properties then

$$\begin{aligned} &(h_1+x_1)\gamma _3(h_2+x_2) \gamma _3^{-1}\gamma _3^2 (h_3+x_3) \gamma _3^{-2} \\ &\quad\equiv h_1\gamma _3h_2 \gamma _3^{-1}\gamma _3^2 h_3 \gamma _3^{-2}+ x_1+\gamma _3x_2 \gamma _3^{-1}+\gamma _3^2 x_3 \gamma _3^{-2}\equiv g\quad \operatorname {mod}2^m. \end{aligned}$$

The matrices x ₁,x ₂,x ₃ can be chosen to be a suitable linear combination of the matrices in the following calculations, and this finishes the induction:

$$2^{m-1}\left ( \begin{array}{c@{\quad}c} {0}&{1}\\ {0}&{0} \end{array} \right )+ \gamma _3 0\gamma _3^{-1}+\gamma _3^2 0 \gamma _3^{-2} \equiv2^{m-1}\left ( \begin{array}{c@{\quad}c} {0}&{1}\\ {0}&{0} \end{array} \right )\quad \operatorname {mod}2^m, $$

$$2^{m-1}\left ( \begin{array}{c@{\quad}c} {0}&{0}\\ {1}&{0} \end{array} \right )+ \gamma _3 0\gamma _3^{-1}+\gamma _3^2 0 \gamma _3^{-2} \equiv2^{m-1} \left ( \begin{array}{c@{\quad}c} {0}&{0}\\ {1}&{0} \end{array} \right )\quad \operatorname {mod}2^m, $$

$$2^{m-1}\left ( \begin{array}{c@{\quad}c} {1}&{0}\\ {0}&{-1} \end{array} \right )+ \gamma _3 0\gamma _3^{-1}+\gamma _3^2 0 \gamma _3^{-2} \equiv2^{m-1}\left ( \begin{array}{c@{\quad}c} {1}&{0}\\ {0}&{-1} \end{array} \right )\quad \operatorname {mod}2^m, $$

$$\begin{aligned} &2^{m-2}\left ( \begin{array}{c@{\quad}c} {1}&{3}\\ {1}&{-1} \end{array} \right )+ \gamma _3 2^{m-2}\left ( \begin{array}{c@{\quad}c} {0}&{1}\\ {0}&{0} \end{array} \right )\gamma _3^{-1}+\gamma _3^2 0 \gamma _3^{-2} \\ &\quad\equiv2^{m-1}\left ( \begin{array}{c@{\quad}c} {-i}&{0}\\ {0}&{i} \end{array} \right )\quad \operatorname {mod}2^m, \\ &2^{m-3}\left ( \begin{array}{c@{\quad}c} {-4}&{0}\\ {3}&{4} \end{array} \right )+ \gamma _3 2^{m-3}\left ( \begin{array}{c@{\quad}c} {0}&{0}\\ {1}&{0} \end{array} \right )\gamma _3^{-1}+\gamma _3^2 0 \gamma _3^{-2}\\ &\quad \equiv2^{m-1}\left ( \begin{array}{c@{\quad}c} {0}&{0}\\ {i}&{0} \end{array} \right )\quad \operatorname {mod}2^m, \\ &2^{m-4}\left ( \begin{array}{c@{\quad}c} {2}&{15}\\ {4}&{-2} \end{array} \right )+ \gamma _3 0\gamma _3^{-1}+\gamma _3^2 2^{m-4}\left ( \begin{array}{c@{\quad}c} {0}&{1}\\ {0}&{0} \end{array} \right ) \gamma _3^{-2}\\ &\quad \equiv2^{m-1} \left ( \begin{array}{c@{\quad}c} {-i}&{i}\\ {0}&{i} \end{array} \right )\quad \operatorname {mod}2^m. \end{aligned}$$
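The three congruences involving i can be checked by machine for any given m. The sketch below stores a Gaussian integer as a pair (re, im), uses γ₃ from (A.2) together with its inverse, and compares both sides modulo 2^m for a range of m. The encoding of each identity as a tuple in `CASES` and all helper names are illustrative choices.

```python
def M(a, b, c, d):
    """2x2 matrix over Z[i]; each entry is an int or a pair (re, im)."""
    pair = lambda z: z if isinstance(z, tuple) else (z, 0)
    return ((pair(a), pair(b)), (pair(c), pair(d)))

def zmul(z, w):
    return (z[0] * w[0] - z[1] * w[1], z[0] * w[1] + z[1] * w[0])

def mmul(P, Q):
    def entry(r, c):
        x, y = zmul(P[r][0], Q[0][c]), zmul(P[r][1], Q[1][c])
        return (x[0] + y[0], x[1] + y[1])
    return tuple(tuple(entry(r, c) for c in range(2)) for r in range(2))

def madd(P, Q):
    return tuple(tuple((p[0] + q[0], p[1] + q[1]) for p, q in zip(rp, rq))
                 for rp, rq in zip(P, Q))

def scaled(P, scale, n):
    """Entries of scale * P reduced mod n."""
    return tuple(tuple(((scale * z[0]) % n, (scale * z[1]) % n) for z in row) for row in P)

G3 = M((1, 2), 4, 1, (1, -2))
G3_INV = M((1, -2), -4, -1, (1, 2))
E12, E21 = M(0, 1, 0, 0), M(0, 0, 1, 0)

def conj(P, k):
    """gamma_3^k * P * gamma_3^(-k)."""
    for _ in range(k):
        P = mmul(mmul(G3, P), G3_INV)
    return P

# (j, x1, e, k, target) encodes
# 2^(m-j) * x1 + gamma_3^k (2^(m-j) * e) gamma_3^(-k) = 2^(m-1) * target (mod 2^m).
CASES = [
    (2, M(1, 3, 1, -1), E12, 1, M((0, -1), 0, 0, (0, 1))),
    (3, M(-4, 0, 3, 4), E21, 1, M(0, 0, (0, 1), 0)),
    (4, M(2, 15, 4, -2), E12, 2, M((0, -1), (0, 1), 0, (0, 1))),
]

def identities_hold(m):
    n = 2 ** m
    return all(
        scaled(madd(x1, conj(e, k)), 2 ** (m - j), n) == scaled(target, 2 ** (m - 1), n)
        for j, x1, e, k, target in CASES
    )

print(all(identities_hold(m) for m in range(8, 16)))
```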

We have now shown that

$$A_3\bigl(2^{m}\bigr)\supseteq\bar{\varGamma} \bigl(2^7\bigr)/\bar{\varGamma}\bigl(2^m\bigr). $$

The index of $\bar{\varGamma}(2^{7})/\bar{\varGamma}(2^{m})$ in $\bar{\varGamma}/\bar{\varGamma}(2^{m})$ is at most

$$\big| \operatorname {SL}\bigl(2,\mathbb {Z}[i]/\bigl(2^7\bigr)\bigr)\big|=48\cdot64^6. $$

This shows that

$$A_{10^{13}}\bigl(2^{m}\bigr)=\bar{\varGamma}/\bar{\varGamma} \bigl(2^m\bigr). $$
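The order of $\operatorname{SL}(2,\mathbb{Z}[i]/(2^{k}))$ can be computed from the general formula $|R|^{3}(1-q^{-2})$ for a finite local ring R with residue field of size q; here $|R|=4^{k}$ and q=2, which gives $3\cdot 4^{3k}/4$, that is $48\cdot 64^{6}$ for k=7. The sketch below checks this formula against a brute-force count for k=1 and k=2 and then evaluates it at k=7. The comparison with 10¹³ at the end is only a consistency check on the size of the constant, not a reconstruction of the author's exact counting. Function names are hypothetical.

```python
from collections import Counter

def sl2_order_bruteforce(n):
    """Number of 2x2 matrices over Z[i]/(n) with determinant 1."""
    ring = [(x, y) for x in range(n) for y in range(n)]

    def mul(z, w):
        return ((z[0] * w[0] - z[1] * w[1]) % n, (z[0] * w[1] + z[1] * w[0]) % n)

    # products[z] = number of pairs (b, c) with b * c = z
    products = Counter(mul(b, c) for b in ring for c in ring)
    total = 0
    for a in ring:
        for d in ring:
            re, im = mul(a, d)
            # det = a*d - b*c = 1  <=>  b*c = a*d - 1
            total += products[((re - 1) % n, im)]
    return total

def sl2_order_formula(k):
    """|SL(2, Z[i]/(2^k))| = |R|^3 * (1 - 1/4) with |R| = 4^k."""
    return 3 * 4 ** (3 * k) // 4

print(sl2_order_bruteforce(2) == sl2_order_formula(1),
      sl2_order_bruteforce(4) == sl2_order_formula(2),
      sl2_order_formula(7) == 48 * 64 ** 6,
      3 * sl2_order_formula(7) < 10 ** 13)
```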

Now we turn to the case p=3. By the same argument, one can show that for every m≥1 and $g\in\bar{\varGamma}(3)/\bar{\varGamma}(3^{m})$, there are g ₁,g ₂,g ₃∈Γ ₁/Γ ₁(3^m) such that

$$g=g_1\gamma _3g_2\gamma _3^{-1} \gamma _3^2 g_3\gamma _3^{-2}. $$

The only significant difference is that one needs to use the following identities:

$$\begin{aligned} &3^{m-1}\left ( \begin{array}{c@{\quad}c} {1}&{3}\\ {1}&{-1} \end{array} \right )+ \gamma _33^{m-1}\left ( \begin{array}{c@{\quad}c} {0}&{1}\\ {0}&{0} \end{array} \right )\gamma _3^{-1}+\gamma _3^2 0 \gamma _3^{-2}\\ &\quad \equiv3^{m-1}\left ( \begin{array}{c@{\quad}c} {i}&{i}\\ {0}&{-i} \end{array} \right )\quad \operatorname {mod}3^m, \\ &3^{m-1}\left ( \begin{array}{c@{\quad}c} {-4}&{16}\\ {3}&{4} \end{array} \right )+ \gamma _33^{m-1} \left ( \begin{array}{c@{\quad}c} {0}&{0}\\ {1}&{0} \end{array} \right )\gamma _3^{-1}+\gamma _3^2 0 \gamma _3^{-2}\\ &\quad \equiv3^{m-1}\left ( \begin{array}{c@{\quad}c} {i}&{0}\\ {-i}&{-i} \end{array} \right )\quad \operatorname {mod}3^m, \\ &3^{m-1}\left ( \begin{array}{c@{\quad}c} {2}&{15}\\ {4}&{-2} \end{array} \right )+ \gamma _3 0\gamma _3^{-1}+\gamma _3^23^{m-1} \left ( \begin{array}{c@{\quad}c} {0}&{1}\\ {0}&{0} \end{array} \right )\gamma _3^{-2}\\ &\quad\equiv3^{m-1}\left ( \begin{array}{c@{\quad}c} {i}&{-i}\\ {0}&{-i} \end{array} \right ) \quad \operatorname {mod}3^m. \end{aligned}$$

Using this claim, one can finish the proof as above. □

Proof of Lemma A.3

Let $q$ be an integer and $q=p_{1}^{m_{1}}\cdots p_{n}^{m_{n}}$, where the $p_{i}$ are distinct primes. We prove that

$$A_{10^{13}}(q)=A_{10^{13}}\bigl(p_1^{m_1} \bigr)\times\cdots\times A_{10^{13}}\bigl(p_n^{m_n} \bigr). $$

Let $x\in A_{10^{13}}(p_{1}^{m_{1}})\times\cdots\times A_{10^{13}}(p_{n}^{m_{n}})$ be arbitrary. By definition, for each k, we can find elements $g_{1}^{(k)},\ldots, g_{10^{13}}^{(k)}\in \varGamma _{1}/\varGamma _{1}(q)$ and $h_{1}^{(k)},\ldots, h_{10^{13}}^{(k)}\in \varGamma _{2}/\varGamma _{2}(q)$ such that

$$x\equiv g_1^{(k)}h_1^{(k)}\cdots g_{10^{13}}^{(k)}h_{10^{13}}^{(k)}\quad \operatorname {mod}p_k^{m_k}. $$

Since $\varGamma _{1}/\varGamma _{1}(q)$ and $\varGamma _{2}/\varGamma _{2}(q)$ are direct products of their local factors, we can find elements $g_{1},\ldots, g_{10^{13}}\in \varGamma _{1}/\varGamma _{1}(q)$ and $h_{1},\ldots, h_{10^{13}}\in \varGamma _{2}/\varGamma _{2}(q)$ such that

$$g_i\equiv g_i^{(k)}\quad \operatorname {mod}p_k^{m_k}\quad\mbox{and}\quad h_i\equiv h_i^{(k)}\quad \operatorname {mod}p_k^{m_k} $$

for each i and k. Thus

$$x= g_1h_1\cdots g_{10^{13}}h_{10^{13}} \in A_{10^{13}}(q). $$
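The gluing step can be illustrated with the Chinese remainder theorem. The sketch below is a simplified toy model, not the construction in the proof. It works with integer matrices rather than matrices over $\mathbb {Z}[i]$, uses a product of length one, and picks the moduli $8$ and $9$ and the sample matrices arbitrarily. It does not verify that $\varGamma _{i}/\varGamma _{i}(q)$ is a product of local factors; that is the input from the text. All names are hypothetical.

```python
from math import prod


def crt(residues, moduli):
    """Return r mod prod(moduli) with r = residues[k] mod moduli[k] (moduli pairwise coprime)."""
    q = prod(moduli)
    r = 0
    for a, m in zip(residues, moduli):
        n = q // m
        r += a * n * pow(n, -1, m)  # pow(n, -1, m) is the inverse of n modulo m
    return r % q


def glue(local_mats, moduli):
    """Glue 2x2 matrices given modulo each prime power into one matrix modulo q, entrywise."""
    return [[crt([M[i][j] for M in local_mats], moduli) for j in range(2)]
            for i in range(2)]


def matmul(A, B, q):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) % q for j in range(2)]
            for i in range(2)]


def reduce_mod(A, m):
    return [[x % m for x in row] for row in A]


moduli = [8, 9]                      # toy stand-ins for p_1^{m_1}, p_2^{m_2}
q = prod(moduli)
g_local = [[[1, 2], [0, 1]], [[1, 0], [3, 1]]]   # g^{(1)} mod 8, g^{(2)} mod 9
h_local = [[[1, 0], [2, 1]], [[1, 1], [0, 1]]]   # h^{(1)} mod 8, h^{(2)} mod 9

g = glue(g_local, moduli)
h = glue(h_local, moduli)
x = matmul(g, h, q)

# The glued product reduces to each local product, as in the displayed congruences.
for k, m in enumerate(moduli):
    assert reduce_mod(x, m) == matmul(g_local[k], h_local[k], m)
```

The final loop checks that a single product modulo $q$ reduces to the prescribed local product modulo each prime power.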

Using Lemma A.4 we get

$$\begin{aligned} \bar{\varGamma}/\bar{\varGamma}(q) \supset& A_{10^{13}}(q) \supset A_{10^{13}}\bigl(p_1^{m_1}\bigr)\times\cdots\times A_{10^{13}}\bigl(p_n^{m_n}\bigr) \\ =&\bar{\varGamma}/ \bar{\varGamma}\bigl(p_1^{m_1}\bigr)\times\cdots\times \bar{\varGamma }/\bar{\varGamma}\bigl(p_{n}^{m_n}\bigr). \end{aligned}$$

Obviously

$$\bar{\varGamma}/\bar{\varGamma}(q)\subset\bar{\varGamma}/\bar{\varGamma } \bigl(p_1^{m_1}\bigr)\times\cdots\times \bar{\varGamma}/ \bar{\varGamma}\bigl(p_{n}^{m_n}\bigr) $$

hence all these containments must be equalities. □

About this article

Cite this article

Bourgain, J., Kontorovich, A. On the local-global conjecture for integral Apollonian gaskets. Invent. math. 196, 589–650 (2014). https://doi.org/10.1007/s00222-013-0475-y

Received: 24 May 2012
Accepted: 27 May 2013
Published: 10 July 2013
Issue Date: June 2014
DOI: https://doi.org/10.1007/s00222-013-0475-y
