1 Introduction

A fundamental problem in number theory is to find integer solutions of diophantine equations, that is, equations of the form

$$\displaystyle{P(m_{1},\ldots,m_{n}) =\lambda }$$

where P is a polynomial with integer coefficients. The approaches fall into two broad categories, algebraic and analytic, the latter being especially useful when the number of variables is large with respect to the degree of the polynomial.

If the polynomial P is also positive and homogeneous of (even) degree d, then for each natural number \(\lambda\), there is a finite solution set

$$\displaystyle{ Z_{P,\lambda } =\{ m \in \mathbf{Z}^{n}:\ P(m) =\lambda \}. }$$
(8.1)

One may view these sets as the sets of lattice points on the level surfaces \(\{P =\lambda \}\), and by homogeneity they can be projected onto the unit level surface \(S_{P}:=\{ P = 1\}\) via the dilations \(m \rightarrow \lambda ^{-1/d}m\). We will study the rate of equi-distribution of the images \(Z'_{P,\lambda }\) of the solution sets on the unit level surface \(S_{P}\) as \(\lambda \rightarrow \infty \). Of course one needs some further conditions on the polynomial P in order for the diophantine equation \(P(m) =\lambda\) to have solutions at all. For example, if \(P(m) = m_{1}^{8} + (m_{2}^{2} +\ldots +m_{n}^{2})^{4}\), then even for large n there is only a sparse set of \(\lambda\)’s (namely those which can be written as a sum of an 8-th and a 4-th power) for which there are solutions, and even for those values of \(\lambda\) one cannot have equi-distribution, as the first coordinate \(m_{1}\) can take very few values. A natural condition on the polynomial P, introduced by Birch [4], is that of P being non-singular in the sense that

$$\displaystyle{\nabla P(z) = (\partial _{1}P(z),\ldots,\partial _{n}P(z))\neq 0,\ \ \ \ \mbox{ for all}\ \ \ \ z \in \mathbf{C}^{n},\ \ z\neq 0.}$$

Also, there are local or congruence obstructions. For example, the polynomial \(P(m) = m_{1}^{d} + p(m_{2}^{d} +\ldots +m_{n}^{d})\) is non-singular, but the equation \(P(m) =\lambda\) can only have an integer solution if \(\lambda\) is congruent to a d-th power modulo p. Nevertheless, it is implicit in the work of Birch [4] that if P is non-singular and if the number of variables n is large enough with respect to the degree d, then there is an infinite arithmetic progression \(\varLambda\) depending on P, which can be explicitly determined, such that for each \(\lambda \in \varLambda\) the equation \(P(m) =\lambda\) has the expected number of solutions \(\approx \lambda ^{n/d-1}\); in fact the number of solutions can be asymptotically determined. We will refer to such a set \(\varLambda\) as a set of regular values of the polynomial P.

As mentioned earlier, one of the problems we will be interested in is the asymptotic distribution of the images of the solution sets \(\,Z'_{P,\lambda } =\{\lambda ^{-1/d}m;\ P(m) =\lambda \}\,\) as \(\lambda \rightarrow \infty \) (\(\lambda \in \varLambda\)), on the unit level surface \(S_{P}\). First, one can show that there is a natural measure \(\sigma _{P}\) on the surface \(S_{P}\), such that the sets \(Z'_{P,\lambda }\) become weakly equi-distributed with respect to the measure \(d\sigma _{P}\). That is, for any smooth function \(\phi\) one has that

$$\displaystyle{ \frac{1} {N_{\lambda }}\sum _{x\in Z'_{P,\lambda }}\phi (x)\ \rightarrow \ \int _{S_{P}}\phi (x)\,d\sigma _{P}(x),\ \ \ \ \mbox{ as}\ \ \ \ \lambda \rightarrow \infty,\ \ \lambda \in \varLambda,}$$

where \(N_{\lambda }\) is the number of solutions of the equation \(P(m) =\lambda\). To get quantitative information on the rate of equi-distribution, we define below the discrepancy of a finite set \(Z \subset S_{P}\) with respect to caps. For a unit vector \(\xi \in \mathbf{R}^{n}\) and a positive number a, define the cap

$$\displaystyle{C_{a,\xi }:=\{ x \in S_{P}:\ x\cdot \xi \geq a\},}$$

where \(x\cdot \xi\) is the dot product of the vectors x and ξ. Note that \(C_{a,\xi }\) is the intersection of the surface \(S_{P}\) with the half-space defined by \(x\cdot \xi \geq a\), and we will refer to ξ as the direction of the cap. The associated discrepancy of a finite set \(Z \subset S_{P}\), consisting of N points, with respect to caps of a given direction ξ is given by

$$\displaystyle{ D(Z,\xi ) =\sup _{a>0}\ \vert \,\vert Z \cap C_{a,\xi }\vert - N\,\sigma _{P}(C_{a,\xi })\vert, }$$
(8.2)

where | A | denotes the size of a set A.
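For the sphere \(P(m) = \vert m\vert ^{2}\) in dimension n = 4, the definition (8.2) can be evaluated by brute force. The following is only an illustrative sketch, under two assumptions that are not stated in the text: that \(\sigma _{P}\) is normalized to be a probability measure on \(S^{3}\), so that a cap of height a has measure \((\arccos a - a\sqrt{1 - a^{2}})/\pi\), and that the supremum over a > 0 may be computed from the limit a → 0+ together with the jump points of the counting function. All function names are hypothetical.

```python
import math
from itertools import product

def projected_lattice_points(n, lam):
    """Points lam^{-1/2} m with m in Z^n and |m|^2 = lam (brute-force search)."""
    r = math.isqrt(lam)
    scale = 1.0 / math.sqrt(lam)
    return [tuple(c * scale for c in m)
            for m in product(range(-r, r + 1), repeat=n)
            if sum(c * c for c in m) == lam]

def cap_measure_s3(a):
    """Normalized measure of the cap {x in S^3 : x.xi >= a}, for 0 <= a <= 1."""
    a = min(max(a, 0.0), 1.0)
    return (math.acos(a) - a * math.sqrt(1.0 - a * a)) / math.pi

def cap_discrepancy(points, xi, cap_measure, tol=1e-12):
    """sup over a > 0 of | #(Z in C_{a,xi}) - N * sigma(C_{a,xi}) |."""
    n_pts = len(points)
    heights = [sum(x * y for x, y in zip(p, xi)) for p in points]
    positive = [t for t in heights if t > tol]
    # Limit a -> 0+: the cap tends to the open half-sphere {x.xi > 0}.
    best = abs(len(positive) - n_pts * cap_measure(0.0))
    # Between jumps the count is constant and the measure is monotone,
    # so it suffices to look at both sides of every jump point.
    for t in positive:
        expected = n_pts * cap_measure(t)
        count_ge = sum(1 for s in positive if s >= t - tol)
        count_gt = sum(1 for s in positive if s > t + tol)
        best = max(best, abs(count_ge - expected), abs(count_gt - expected))
    return best

if __name__ == "__main__":
    Z = projected_lattice_points(4, 5)
    v = (1.0, math.sqrt(2), math.sqrt(3), math.sqrt(5))
    norm = math.sqrt(sum(c * c for c in v))
    xi = tuple(c / norm for c in v)
    print(len(Z), cap_discrepancy(Z, xi, cap_measure_s3))
```

The quadratic-time counting inside the loop keeps the sketch short; for large \(N_{\lambda }\) one would sort the heights once instead.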

It turns out that for the solution sets \(Z'_{P,\lambda }\) the discrepancy depends heavily on the direction of the cap. To see this, consider the polynomial \(P(m) = m_{1}^{2} +\ldots +m_{n}^{2}\), so that one is interested in the distribution of lattice points on spheres, projected back to the unit sphere. It is well-known that for n ≥ 5 the size of the solution sets is \(N_{\lambda } \approx \lambda ^{\frac{n} {2} -1}\). If \(\xi = (0,\ldots,0,1)\), then for certain values of a the boundary of the cap contains as many as \(\approx \lambda ^{\frac{n-3} {2} }\) points from the set \(Z'_{P,\lambda }\). Indeed, after scaling back by a factor of \(\lambda ^{1/2}\), the boundary of the cap is given by the equation \(m_{1}^{2} +\ldots +m_{n-1}^{2} =\mu \,\) for some μ depending on λ and a. Thus the discrepancy cannot be smaller than \(\,\lambda ^{\frac{n-3} {2} } \approx N_{\lambda }^{1- \frac{1} {n-2} }\). In contrast, we will show that if the direction of the cap points away from rational points as much as possible, then one can obtain much better bounds on the discrepancy. To be more precise, let us call a point \(\,\alpha \in \mathbf{R}^{n-1}\,\) diophantine if for every \(\varepsilon > 0\) there exists a constant \(C_{\varepsilon } > 0\) such that for all \(q \in \mathbf{N}\)

$$\displaystyle{ \|q\alpha \| =\min _{m\in \mathbf{Z}^{n-1}}\,\vert q\alpha - m\vert \geq C_{\varepsilon }\,q^{- \frac{1} {n-1} -\varepsilon }. }$$
(8.3)
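Condition (8.3) can be probed numerically for a given point. The sketch below is an illustration only; it assumes that \(\vert \cdot \vert\) in (8.3) is the Euclidean norm, and the helper names are hypothetical. It computes \(\|q\alpha \|\) and the smallest value of \(\|q\alpha \|\,q^{1/(n-1)}\) over a finite range of q, which for a diophantine point should decay no faster than \(q^{-\varepsilon }\). A finite search can of course only suggest, not prove, that a point is diophantine.

```python
import math

def dist_to_lattice(q, alpha):
    """||q*alpha||: Euclidean distance from q*alpha to the nearest point of Z^(n-1)."""
    return math.sqrt(sum((q * a - round(q * a)) ** 2 for a in alpha))

def diophantine_ratio(alpha, q_max):
    """min over 1 <= q <= q_max of ||q*alpha|| * q^(1/(n-1)), where n-1 = len(alpha)."""
    k = len(alpha)
    return min(dist_to_lattice(q, alpha) * q ** (1.0 / k)
               for q in range(1, q_max + 1))

if __name__ == "__main__":
    # A point with rational coordinates fails (8.3): the ratio vanishes.
    print(diophantine_ratio((0.5, 0.25), 100))
    # A point with quadratic irrational coordinates, for comparison.
    print(diophantine_ratio((math.sqrt(2), math.sqrt(3)), 100))
```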

Correspondingly, a point \(\xi \in S^{n-1}\) is called diophantine if for every 1 ≤ i ≤ n for which \(\xi _{i}\neq 0\), the point \(\alpha ^{i} \in \mathbf{R}^{n-1}\) is diophantine, where the coordinates of \(\alpha ^{i}\) are obtained by dividing each coordinate of ξ by \(\xi _{i}\) and deleting the i-th coordinate. It is easy to see that the complement of the set of diophantine points has measure 0 in \(\mathbf{R}^{n-1}\) and hence in \(S^{n-1}\) as well, see [Lemma 3, Sec. 2.2]. We will show, see also [13], that in dimensions n ≥ 4 the discrepancy is bounded from above by

$$\displaystyle{ D(Z'_{P,\lambda },\xi ) \leq C_{\xi,\varepsilon }\,N_{\lambda }^{\frac{1} {2} + \frac{1} {2(n-2)} +\varepsilon } }$$
(8.4)

for all \(\varepsilon > 0\), when ξ is diophantine. This is especially significant in large dimensions, as it is known from the works of Beck and Schmidt [3, 15], see also [12], that for any set of N points on the unit sphere \(S^{n-1}\), the \(L^{2}\) average of the discrepancy with respect to spherical caps is at least \(N^{\frac{1} {2} - \frac{1} {2(n-1)} }\). For general non-singular, positive and homogeneous polynomials P, the same observation shows that for rational directions (for example, when \(\xi = (0,\ldots,0,1)\)), the discrepancy is at least \(N_{\lambda }^{1- \frac{1} {n-d} }\), while we will show that in diophantine directions it is bounded by \(N_{\lambda }^{1-\gamma _{d}}\) with \(\gamma _{d} = \frac{1} {(d-1)2^{d+1}}\), in large enough dimensions.

We will also study the equi-distribution of the solutions when mapped to the flat torus \(\mathbf{T}^{n} = \mathbf{R}^{n}/\mathbf{Z}^{n}\). Let \(\alpha = (\alpha _{1},\ldots,\alpha _{n}) \in \mathbf{R}^{n}\) and consider the map \(T_{\alpha }: \mathbf{Z}^{n} \rightarrow \mathbf{T}^{n}\), defined by \(\,T_{\alpha }(m) = (m_{1}\alpha _{1},\ldots,m_{n}\alpha _{n})\pmod 1\,\). Then the images of the solution sets take the form

$$\displaystyle{\varOmega _{\lambda,\alpha } =\{ (m_{1}\alpha _{1},\ldots,m_{n}\alpha _{n});\ P(m_{1},\ldots,m_{n}) =\lambda \}\subseteq \mathbf{T}^{n}.}$$

It is clear that if one of the coordinates of the point \(\alpha\) is rational then the corresponding coordinate of the points in the image set can take only finitely many different values and the sets \(\varOmega _{\lambda,\alpha }\) cannot become equi-distributed as \(\lambda \rightarrow \infty \). It turns out that this is the only obstruction for non-singular polynomials P in sufficiently many variables. Indeed we will see that if \(\alpha \in (\mathbf{R}\setminus \mathbf{Q})^{n}\), then for any \(\phi \in C^{\infty }(\mathbf{T}^{n})\) we have that

$$\displaystyle{ N_{\lambda }^{-1}\sum _{ P(m)=\lambda }\phi (m_{1}\alpha _{1},\ldots,m_{n}\alpha _{n}) \rightarrow \int _{\mathbf{T}^{n}}\phi (x)\,\mathit{dx}, }$$
(8.5)

as \(\lambda \rightarrow \infty \) through regular values of the polynomial P. To obtain quantitative bounds on the rate of equi-distribution, we will assume that each coordinate of the point \(\alpha\) is diophantine, that is, \(\,\|q\alpha _{i}\| \geq C_{\varepsilon }q^{-1-\varepsilon }\,\) for all \(\varepsilon > 0\) and for all \(q \in \mathbf{N}\), with an appropriate constant \(C_{\varepsilon } > 0\). Identify the torus with the set \([-\frac{1} {2}, \frac{1} {2})^{n}\) and let \(\,K \subseteq (-\frac{1} {2}, \frac{1} {2})^{n}\,\) be a compact, convex set with nonempty interior. The discrepancy of the image set \(\varOmega _{\lambda,\alpha }\) with respect to the convex body K is defined by

$$\displaystyle{D(K,\alpha,\lambda ) =\sum _{P(m)=\lambda }\chi _{K}(m_{1}\alpha _{1},\ldots,m_{n}\alpha _{n}) - N_{\lambda }\,vol_{n}(K),}$$

where \(\chi _{K}\) is the indicator function of the set K. We will show that for diophantine points \(\alpha\) one has the upper bound

$$\displaystyle{ \vert D(K,\alpha,\lambda )\vert \,\leq \, C_{P}\,\lambda ^{\frac{n} {d} -1-\gamma _{d}}, }$$
(8.6)

for some constant \(\gamma _{d} > 0\) depending only on the degree d.
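The quantity \(D(K,\alpha,\lambda )\) is easy to compute directly in small cases. The sketch below is illustrative only: it takes \(P(m) = \vert m\vert ^{2}\) and the box \(K = [-r,r]^{n}\) with r < 1/2 as a hypothetical choice of convex body, and it makes no claim about whether the chosen n and λ fall in the range covered by the bound (8.6). The function names are not from the text.

```python
import math
from itertools import product

def lattice_points_on_sphere(n, lam):
    """All m in Z^n with m_1^2 + ... + m_n^2 = lam (brute-force search)."""
    r = math.isqrt(lam)
    return [m for m in product(range(-r, r + 1), repeat=n)
            if sum(c * c for c in m) == lam]

def to_torus(m, alpha):
    """(m_1 alpha_1, ..., m_n alpha_n) reduced mod 1 into [-1/2, 1/2)^n."""
    return tuple(((mj * aj + 0.5) % 1.0) - 0.5 for mj, aj in zip(m, alpha))

def box_discrepancy(n, lam, alpha, half_width):
    """D(K, alpha, lam) for the box K = [-half_width, half_width]^n."""
    pts = lattice_points_on_sphere(n, lam)
    inside = sum(1 for m in pts
                 if all(abs(c) <= half_width for c in to_torus(m, alpha)))
    return inside - len(pts) * (2.0 * half_width) ** n

if __name__ == "__main__":
    alpha = (math.sqrt(2), math.sqrt(3), math.sqrt(5), math.sqrt(7))
    print(box_discrepancy(4, 45, alpha, 0.25))
```

With a rational (here zero) vector α every image point is the origin, so the discrepancy is as large as it can be; this is the obstruction described above.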

Let us remark that the above is a special case of a more general phenomenon; namely, if (X, μ) is a probability measure space, and if \(T = (T_{1},\ldots,T_{n})\) is a commuting, fully ergodic family of measure-preserving transformations, then the images of the solution sets

$$\displaystyle{\varOmega _{\lambda,x}:=\{ T_{1}^{m_{1} }\ldots T_{n}^{m_{n} }(x);\ P(m_{1},\ldots,m_{n}) =\lambda \}\subseteq X,}$$

become equi-distributed as \(\lambda \rightarrow \infty,\ (\lambda \in \varLambda )\), for almost every x ∈ X [11]. To prove such results one needs to estimate certain maximal operators associated to averages over the solution sets \(P(m) =\lambda\). However, as in this generality one cannot hope for quantitative bounds on the rate of equi-distribution, we will not discuss such results below.

Crucial to all these results is the structure of the Fourier transform of the indicator function of the set of lattice points on the level surface \(\{P =\lambda \}\). This is an exponential sum of the form

$$\displaystyle{ \hat{\omega }_{\lambda }(\xi ) =\sum _{m\in \mathbf{Z}^{n},\,P(m)=\lambda }e^{-2\pi i\,m\cdot \xi }. }$$
(8.7)
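For small λ the exponential sum (8.7) can be evaluated directly from its definition, which is a useful sanity check on the formulas that follow. The sketch below assumes \(P(m) = \vert m\vert ^{2}\) and uses brute-force enumeration; the function names are hypothetical. It illustrates that \(\hat{\omega }_{\lambda }\) is 1-periodic in each coordinate of ξ, and that it is real-valued because the solution set is symmetric under m → −m.

```python
import cmath
import math
from itertools import product

def lattice_points_on_sphere(n, lam):
    """All m in Z^n with m_1^2 + ... + m_n^2 = lam (brute-force search)."""
    r = math.isqrt(lam)
    return [m for m in product(range(-r, r + 1), repeat=n)
            if sum(c * c for c in m) == lam]

def omega_hat(n, lam, xi):
    """Sum of exp(-2 pi i m.xi) over all m in Z^n with |m|^2 = lam."""
    return sum(cmath.exp(-2j * math.pi * sum(mj * xj for mj, xj in zip(m, xi)))
               for m in lattice_points_on_sphere(n, lam))

if __name__ == "__main__":
    print(omega_hat(4, 5, (0.0, 0.0, 0.0, 0.0)))   # the number of solutions
    print(omega_hat(4, 5, (0.3, 0.1, 0.7, 0.2)))   # a generic frequency
```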

Note that \(\hat{\omega }_{\lambda }(0) = N_{\lambda }\), that is the number of solutions to the equation \(P(m) =\lambda\), a quantity which has been extensively studied in analytic number theory. Indeed for the special case \(P(m) = m_{1}^{2} +\ldots +m_{n}^{2}\) asymptotic formulae for the number of solutions were obtained by Hardy and Littlewood, by developing the so-called “circle method” of exponential sums. Their methods were later further extended by Birch and Davenport [4, 5], to treat higher degree non-diagonal forms; in fact they have shown that

$$\displaystyle{ N_{\lambda } =\hat{\omega } _{\lambda }(0) = c_{P}\lambda ^{\frac{n} {d} -1}\sum _{q=1}^{\infty }K(q,0,\lambda ) + O(\lambda ^{\frac{n} {d} -1-\delta }), }$$
(8.8)

for some \(\delta > 0\). The expression \(K(\lambda ) =\sum _{ q=1}^{\infty }K(q,0,\lambda )\) is called the singular series, and it captures all the local arithmetic information about the polynomial P. Without recalling the precise definition of the terms \(K(q,0,\lambda )\) (see Sect. 8.3.1.4), it is enough to note here that for regular values \(\lambda \in \varLambda\) the singular series \(K(\lambda )\) is bounded below by a fixed constant \(A_{P} > 0\). It turns out that one can derive similar asymptotic formulas for the exponential sums \(\hat{\omega }_{\lambda }(\xi )\), which are uniform in the phase variable ξ. Namely, we will show that

$$\displaystyle{ \hat{\omega }_{\lambda }(\xi ) = c_{P}\,\lambda ^{\frac{n} {d} -1}\sum _{q=1}^{\infty }m_{q,\lambda }(\xi ) + \mathcal{E}_{\lambda }(\xi ), }$$
(8.9)

where

$$\displaystyle{\sup _{\xi }\vert \mathcal{E}_{\lambda }(\xi )\vert \leq C\,\lambda ^{(\frac{n} {d} -1)(1-\gamma )}.}$$

Moreover

$$\displaystyle{m_{q,\lambda }(\xi ) =\sum _{l\in \mathbf{Z}^{n}}K(q,l,\lambda )\,\psi (q\xi - l)\,\,\tilde{\sigma }_{P}(\lambda ^{\frac{1} {d} }(\xi -l/q)),}$$

where ψ is a smooth cut-off function supported near the origin, and \(\tilde{\sigma }_{P}\) is the Euclidean Fourier transform of the surface measure \(\sigma _{P}\)

$$\displaystyle{\tilde{\sigma }_{P}(\xi ) =\int _{S_{P}}\,e^{-2\pi i\,x\cdot \xi }\ d\sigma _{ P}(x).}$$

To describe the meaning of this formula, note first that for ξ near the origin

$$\displaystyle{m_{1,\lambda }(\xi ) =\psi (\xi )\,\tilde{\sigma }_{P}(\lambda ^{\frac{1} {d} }(\xi )),}$$

since \(K(1,0,\lambda ) = 1\). The term \(\ c_{P}\,\lambda ^{\frac{n} {d} -1}\tilde{\sigma }_{P}(\lambda ^{\frac{1} {d} }\xi )\ \) can be interpreted as the Fourier transform of a smooth density supported on the level surface \(\{P =\lambda \}\). Thus the first term in the approximation formula may be viewed as an approximation near the origin via the Fourier transform of a surface carried measure. Notice also that all other terms are similar, involving the Fourier transform \(\ \tilde{\sigma }_{P}(\lambda ^{\frac{1} {d} }(\xi -l/q))\ \), and may be viewed as higher order approximations near the rational points \(l/q\). In fact, if ξ is near a rational point \(l/q\), then the sum expressing \(m_{q,\lambda }(\xi )\) has only one nonzero term, taken at \(l = [q\xi ]\), the nearest integer point to \(q\xi\).

Let us sketch below how this formula will allow us to compare the discrete and the continuous case and to estimate the rate of equi-distribution of the solution sets in terms of the discrepancy.

Let \(\chi _{a}\) be the indicator function of the interval [a, b] (b being a fixed number depending on P). Then, by taking the inverse Fourier transform \(\ \chi _{a} =\int \,\hat{\chi } _{a}(t)e^{2\pi it\cdot }dt\), and by making a change of variables \(\,t \rightarrow \lambda ^{-1/d}t\,\), one may write

$$\displaystyle{\vert Z'_{P,\lambda } \cap C_{a,\xi }\vert =\sum _{P(m)=\lambda }\chi _{a}(\lambda ^{-\frac{1} {d} }m\cdot \xi ) =\int _{\mathbf{R}}\lambda ^{\frac{1} {d} }\hat{\chi }_{a}(t\lambda ^{\frac{1} {d} })\,\hat{\omega }_{\lambda }(t\xi )\,\mathit{dt},}$$

and also

$$\displaystyle{\sigma _{P}(C_{a,\xi }) =\int _{S_{P}}\chi _{a}(x\cdot \xi )\,d\sigma _{P}(x) =\int _{\mathbf{R}}\hat{\chi }_{a}(t)\,\tilde{\sigma }_{P}(t\xi )\,\mathit{dt}.}$$

Substituting the asymptotic formula (8.9) into this expression one may study the contribution of each term separately

$$\displaystyle{I_{q,\lambda }(\xi ):=\int _{\mathbf{R}}\lambda ^{\frac{1} {d} }\hat{\chi }_{a}(t\lambda ^{\frac{1} {d} })\,m_{q,\lambda }(t\xi )\,\mathit{dt}.}$$

A crucial point is that if \(\vert t\vert \ll q^{-1}\) then \(\psi (qt\xi - l) = 0\) unless l = 0; moreover \(\psi (qt\xi ) = 1\), hence

$$\displaystyle{m_{q,\lambda }(t\xi ) = K(q,0,\lambda )\ \tilde{\sigma }_{P}(\lambda ^{\frac{1} {d} }t\xi ).}$$

Writing

$$\displaystyle{I_{q,\lambda }(\xi ) =\int _{\vert t\vert \ll q^{-1}}\ \ \ \ +\int _{\vert t\vert \gg q^{-1}}\ \ \ \ \ \ = I_{q,\lambda }^{1}(\xi ) + I_{ q,\lambda }^{2}(\xi ),}$$

one has, after a change of variables \(t:=\lambda ^{1/d}t\), that

$$\displaystyle{I_{q,\lambda }^{1}(\xi ) = K(q,0,\lambda )\int _{ \vert t\vert \ll \lambda ^{\frac{1} {d} }q^{-1}}\hat{\chi }_{a}(t)\tilde{\sigma }_{P}(t\xi )\,\mathit{dt}.}$$

At this point one exploits the cancelation in the normalized exponential sums K(q, l, λ) and oscillatory integrals \(\tilde{\sigma }_{P}(\xi )\), expressed in estimates roughly of the form

$$\displaystyle\begin{array}{rcl} & \vert K(q,l,\lambda )\vert \ll q^{-cn} & {}\\ & \vert \tilde{\sigma }_{P}(\xi )\vert \ll (1 + \vert \xi \vert )^{-c'n}.& {}\\ \end{array}$$

Then, for \(\vert \xi \vert = 1\), one can extend the integral to the whole real line by making a small error. This gives

$$\displaystyle{I_{q,\lambda }^{1}(\xi ) \approx K(q,0,\lambda )\int _{\mathbf{ R}}\hat{\chi }_{a}(t)\tilde{\sigma }_{P}(t\xi )\,\mathit{dt} = K(q,0,\lambda )\,\sigma _{P}(C_{a,\xi }).}$$

Thus by formula (8.8), we have that

$$\displaystyle{ c_{P}\lambda ^{\frac{n} {d} -1}\sum _{q=1}^{\infty }I_{q,\lambda }^{1}(\xi ) \approx N_{\lambda }\,\sigma _{P}(C_{a,\xi }). }$$
(8.10)

To get upper bounds for the discrepancy one needs to estimate the total contribution of the remaining terms \(I_{q,\lambda }^{2}(\xi )\), exploiting the diophantine properties of the point ξ. In fact, by making a change of variables \(t:= qt\), and noting that the only nonzero term of the sum expressing \(\,m_{q,\lambda }(\frac{t} {q}\,\xi )\,\) is taken at \(l = [t\xi ]\), one may write

$$\displaystyle{I_{q,\lambda }^{2}(\xi ) = \frac{\lambda ^{\frac{1} {d} }} {q} \int _{\vert t\vert \gg 1}\hat{\chi }_{a}\left (t\frac{\lambda ^{\frac{1} {d} }} {q} \right )\,K(q,[t\xi ],\lambda )\,\psi (\{t\xi \})\,\tilde{\sigma }_{P}\left (\frac{\lambda ^{\frac{1} {d} }} {q} \{t\xi \}\right )\,\mathit{dt},}$$

where \(\{t\xi \} = t\xi - [t\xi ]\) denotes the fractional part of the point \(t\xi\). At diophantine points it is not hard to show that on average \(\ \vert \{t\xi \}\vert =\| t\xi \| \geq c_{\varepsilon }\,t^{-\varepsilon }\ \) (see Lemma 6 in Sect. 8.2.2), thus using the cancelation estimates for \(K(q,l,\lambda )\) and \(\tilde{\sigma }_{P}(\xi )\) again, the terms \(I_{q,\lambda }^{2}(\xi )\) add up only to a small error.

The organization of the rest of this chapter is as follows. In the next section we will derive the asymptotic expansion (8.9) for the polynomial \(P(m) = m_{1}^{2} +\ldots +m_{n}^{2}\), and prove upper bounds on the discrepancy of lattice points on spheres. Next, we will extend our approach to general non-singular forms, using the Birch-Davenport method of exponential sums. Finally, in the last section we will study the equi-distribution of the images of the solution sets \(\{P(m) =\lambda \}\) modulo 1, when mapped to the flat torus \(\mathbf{T}^{n}\) via the map \(T_{\alpha }\).

As for our notation, we will think of the polynomial P, and hence the parameters n and d, as being fixed, and write f = O(g) or alternatively f ≪ g if | f(m) | ≤ Cg(m) for all m ∈ N with a constant C > 0 depending only on the polynomial P or the parameters n, d. We will also write f ≫ g if g ≪ f, and f ≈ g if both f ≪ g and f ≫ g. If the implicit constants in our estimates depend on additional parameters \(\varepsilon,\delta,\ldots\) then we may write \(\,f = O_{\varepsilon,\delta,\ldots }(g)\) or \(f \ll _{\varepsilon,\delta,\ldots }g\). The Fourier transform of a function f defined on \(\mathbf{Z}^{n}\) will be denoted by \(\hat{f}\), while, somewhat unconventionally, we will denote the Euclidean Fourier transform of a function \(\phi\) defined on \(\mathbf{R}^{n}\) by \(\tilde{\phi }\). This is to avoid confusion, as we will often move between the discrete and continuous settings.

2 The Discrepancy of Lattice Points on Spheres

The uniformity of the distribution of lattice points on spheres has been extensively studied and proved in dimension at least 4, see [7], and later in dimension 3 [6] using difficult estimates for the Fourier coefficients of modular forms. These methods, however, do not take into consideration the direction of the caps, and hence the bounds obtained are subject to the limitations described in the introduction, arising from caps whose direction has rational coordinates.

We will assume that the direction ξ of the caps is diophantine in the sense that \(\xi ^{i} =\xi /\xi _{i}\) (with the i-th coordinate deleted, as in the introduction) satisfies condition (8.3) for each 1 ≤ i ≤ n such that \(\xi _{i}\neq 0\). In this case, when \(\,Z =\{\lambda ^{-1/2}m;\ \vert m\vert ^{2} =\lambda \}\,\), we will obtain the following upper bound on the discrepancy defined in (8.2), which we denote by \(D_{n}(\xi,\lambda ) = D(Z,\xi )\); see also [13].

Theorem 1.

Let n ≥ 4 and let \(\xi \in S^{n-1}\) be a diophantine point. Then for every \(\varepsilon > 0\), one has

$$\displaystyle{ \vert D_{n}(\xi,\lambda )\vert \leq C_{\xi,\varepsilon }\,\lambda ^{\frac{n-1} {4} +\varepsilon } }$$
(8.11)

We note that for n ≥ 4, and if n = 4 assuming that 4 does not divide λ, one has that \(N_{\lambda } \gg \lambda ^{\frac{n} {2} -1}\), thus (8.11) implies that

$$\displaystyle{\vert D_{n}(\xi,\lambda )\vert \leq C_{\xi,\varepsilon }\,N_{\lambda }^{\frac{1} {2} + \frac{1} {2(n-2)} +\varepsilon }.}$$
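The passage from (8.11) to the bound in terms of \(N_{\lambda }\) is a short exponent computation, spelled out here for convenience. Since \(N_{\lambda } \gg \lambda ^{\frac{n-2} {2} }\), one has \(\lambda \ll N_{\lambda }^{\frac{2} {n-2} }\), and therefore

$$\displaystyle{\lambda ^{\frac{n-1} {4} +\varepsilon } \ll N_{\lambda }^{\frac{n-1} {2(n-2)} + \frac{2\varepsilon } {n-2} },\qquad \frac{n-1} {2(n-2)} = \frac{1} {2} + \frac{1} {2(n-2)},}$$

which gives the stated exponent after renaming \(2\varepsilon /(n-2)\) as \(\varepsilon\).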

In dimension n = 4, the best previous estimate for the normalized discrepancy \(D(\xi,\lambda )/N_{\lambda }\), given in [7], was of the order of \(\lambda ^{-1/5+\varepsilon }\), while we get the improvement \(\lambda ^{-1/4+\varepsilon }\). In case n = 4 and \(\lambda = 4^{k}\) there are only 24 lattice points of length \(\lambda ^{1/2}\); estimates for the discrepancy become trivial in such degenerate cases.
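The degenerate case of powers of 4 in dimension 4 is easy to confirm by direct enumeration. The following sketch uses a plain brute-force search (a hypothetical helper, not a method from the text) and compares the count for powers of 4 with the count for a generic odd value.

```python
import math
from itertools import product

def count_representations(n, lam):
    """Number of m in Z^n with m_1^2 + ... + m_n^2 = lam (brute-force search)."""
    r = math.isqrt(lam)
    return sum(1 for m in product(range(-r, r + 1), repeat=n)
               if sum(c * c for c in m) == lam)

if __name__ == "__main__":
    for k in (1, 2, 3):
        # The count stays at 24 for every power of 4.
        print(4 ** k, count_representations(4, 4 ** k))
    # For comparison, an odd value of comparable size has many more points.
    print(63, count_representations(4, 63))
```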

2.1 The Fourier Transform of Lattice Points on Spheres

Our first task will be to derive the asymptotic formula (8.9) for the special case when \(P(m) = \vert m\vert ^{2} = m_{1}^{2} +\ldots +m_{n}^{2}\). As we have mentioned, this can be viewed as an extension of the asymptotic formula for the number of representations of a positive integer λ as a sum of n squares, and as such our main tool will be the Hardy-Littlewood method of exponential sums. Because of the quadratic nature of the problem, there are special tools available in this case, most notably the transformation properties of certain theta functions. Also, we will use the so-called Kloosterman refinement, mainly to include the case n = 4. For a fixed \(\lambda \in \mathbf{N}\) and \(\xi \in \mathbf{T}^{n}\), set \(\delta =\lambda ^{-1}\) and write

$$\displaystyle{ e^{-2\pi }\hat{\omega }_{ \lambda }(\xi ) =\sum _{\vert m\vert ^{2}=\lambda }e^{-2\pi \delta \vert m\vert ^{2} }e^{2\pi im\cdot \xi } =\sum _{ \vert m\vert ^{2}=\lambda }w(m), }$$
(8.12)

where the weight function \(w(m) = e^{-2\pi \delta \vert m\vert ^{2} }e^{2\pi im\cdot \xi }\) is bounded and absolutely summable. Using the fact that \(\,\int _{0}^{1}e^{2\pi i(\vert m\vert ^{2}-\lambda )\alpha }\,d\alpha = 1\,\) if \(\vert m\vert ^{2} =\lambda\) and is equal to 0 otherwise, one may write

$$\displaystyle{\hat{\omega }_{\lambda }(\xi ) = e^{2\pi }\int _{ 0}^{1}S(\alpha,\xi )\,e^{-2\pi i\alpha \lambda }\,d\alpha,}$$

where

$$\displaystyle{ S(\alpha,\xi ) =\sum _{m\in \mathbf{Z}^{n}}e^{2\pi i\vert m\vert ^{2}\alpha }\ w(m) =\sum _{m\in \mathbf{Z}^{n}}e^{2\pi i\,((\alpha +i\delta )\vert m\vert ^{2}+m\cdot \xi ) } }$$
(8.13)

is a theta function. It is well-known, at least when ξ = 0, that it is concentrated near rational points \(a/q\) with small denominator. To exploit this, one dissects the interval [0, 1] into small neighborhoods of the set of rational points \(\,\mathcal{R}_{N} =\{ a/q;\ (a,q) = 1,\ q \leq N\}\,\) for some specific choice of the parameter N. It is easy to see, using Dirichlet’s principle, that one can choose intervals \(I_{a/q}\) around the rational points \(a/q\) of length \(\vert I_{a/q}\vert \approx 1/(Nq)\). This suggests that

$$\displaystyle{\hat{\omega }_{\lambda }(\xi ) \approx c\,\sum _{q\leq N}\sum _{(a,q)=1}e^{-2\pi i\lambda \frac{a} {q} }\int _{- \frac{1} {Nq} }^{ \frac{1} {Nq} }S\left (\frac{a} {q}+\tau,\xi \right )\,e^{-2\pi i\,\lambda \tau }\ d\tau.}$$
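The orthogonality relation behind the integral over α has an exact finite analogue that is convenient for experiments. The sketch below is an illustration under stated simplifications, not the method of the text: the integral over [0, 1] is replaced by an average over the points j/M with a modulus M larger than any value \(\vert m\vert ^{2}\) that can occur, and the Gaussian damping weight is replaced by the truncation \(\vert m_{j}\vert \leq \lambda ^{1/2}\). The names are hypothetical.

```python
import cmath
import math

def count_by_orthogonality(n, lam):
    """Count m in Z^n with |m|^2 = lam via a discrete version of the circle integral.

    With f(alpha) = sum_{|k| <= K} exp(2 pi i k^2 alpha), K = floor(sqrt(lam)), and
    M = n K^2 + 1, the average of f(j/M)^n exp(-2 pi i lam j / M) over j = 0..M-1
    picks out exactly the tuples with k_1^2 + ... + k_n^2 = lam.
    """
    k_max = math.isqrt(lam)
    modulus = n * k_max * k_max + 1
    total = 0j
    for j in range(modulus):
        # Reduce exponents mod M in integer arithmetic to limit rounding error.
        f = sum(cmath.exp(2j * math.pi * ((k * k * j) % modulus) / modulus)
                for k in range(-k_max, k_max + 1))
        total += f ** n * cmath.exp(-2j * math.pi * ((lam * j) % modulus) / modulus)
    return round((total / modulus).real)

if __name__ == "__main__":
    print([count_by_orthogonality(4, lam) for lam in range(1, 9)])
```

The choice M = nK² + 1 rules out aliasing, because two admissible values of \(\vert m\vert ^{2}\) and λ can never differ by a nonzero multiple of M.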

The idea behind the Kloosterman refinement is to make a specific choice of this partition (the so-called Farey dissection) and to estimate carefully the errors arising from the fact that the lengths of the intervals corresponding to a fixed denominator are not quite the same. We will use the following general result.

Theorem A (Heath-Brown [9]).

Let \(P: \mathbf{Z}^{n} \rightarrow \mathbf{Z}\) be a polynomial with integral coefficients, let λ, N be natural numbers and let \(w \in L^{1}(\mathbf{Z}^{n})\). Then one has

$$\displaystyle{ \sum _{P(m)=\lambda }w(m) =\sum _{q\leq N}\int _{- \frac{1} {qN} }^{ \frac{1} {qN} }e^{-2\pi i\lambda \tau }S_{0}(q,\tau )d\tau \, + E_{1}(\lambda ) }$$
(8.14)

where

$$\displaystyle{ \vert E_{1}(\lambda )\vert \leq C\,\,N^{-2}\sum _{ q\leq N}\sum _{\vert u\vert \leq q/2}(1 + \vert u\vert )^{-1}\max _{ \tau \approx \frac{1} {qN} }\vert S_{u}(q,\tau )\vert }$$
(8.15)

Here C > 0 is an absolute constant and

$$\displaystyle{ S_{u}(q,\tau ) =\sum _{(a,q)=1}e^{2\pi i\frac{\bar{a}u-a\lambda } {q} }S(a/q\,+\tau )\,,\ \ S(\alpha ) =\sum _{m\in \mathbf{Z}^{n}}e^{2\pi i\alpha P(m)}w(m),\ \ \ \ }$$
(8.16)

where \(\ a\bar{a} \equiv 1\pmod q\) .

This is proved in [9] for the case λ = 0 and for a non-negative weight function w; however, the proof extends without any changes to all \(\lambda \in \mathbf{N}\) and \(w \in L^{1}(\mathbf{Z}^{n})\). Let us postpone the proof of the above result to the end of this section and see how it translates to our situation.

By (8.13) we have that

$$\displaystyle{S(a/q+\tau ) =\sum _{m\in \mathbf{Z}^{n}}e^{2\pi i\frac{a} {q}\vert m\vert ^{2} }\,e^{2\pi i\,m\cdot \xi }\,h_{\tau,\delta }(m),}$$

with \(h_{\tau,\delta }(x) = e^{2\pi i(\tau +i\delta )\vert x\vert ^{2} }\). Writing \(m:= qm_{1} + s\), where \(m_{1} \in \mathbf{Z}^{n}\), \(s \in \left (\mathbf{Z}/q\mathbf{Z}\right )^{n}\), and applying Poisson summation, we have

$$\displaystyle\begin{array}{rcl} S(a/q+\tau )& =& \sum _{s\in (\mathbf{Z}/q\mathbf{Z})^{n}}e^{2\pi i\frac{a} {q}\vert s\vert ^{2} }\sum _{m_{1}\in \mathbf{Z}^{n}}e^{2\pi i\,(qm_{1}+s)\cdot \xi }\,h_{\tau,\delta }(qm_{1} + s) \\ & =& \sum _{s\in (\mathbf{Z}/q\mathbf{Z})^{n}}e^{2\pi i\frac{a} {q}\vert s\vert ^{2} }\sum _{l\in \mathbf{Z}^{n}}\int _{\mathbf{R}^{n}}e^{2\pi i\,(qx+s)\cdot \xi }\,h_{\tau,\delta }(qx + s)\,e^{-2\pi i\,x\cdot l}\,\mathit{dx} \\ & =& \sum _{l\in \mathbf{Z}^{n}}q^{-n}\sum _{ s\in (\mathbf{Z}/q\mathbf{Z})^{n}}e^{2\pi i\frac{a\vert s\vert ^{2}+l\cdot s} {q} }\ \int _{\mathbf{R}^{n}}h_{\tau,\delta }(y)\,e^{2\pi iy\cdot (\xi -\frac{l} {q})}\,\mathit{dy} \\ & =& \sum _{l\in \mathbf{Z}^{n}}G(a,q,l)\ \tilde{h}_{\tau,\delta }(l/q-\xi ). {}\end{array}$$
(8.17)

Here G(a, q, l) is a normalized Gaussian sum:

$$\displaystyle{ G(a,q,l) = q^{-n}\sum _{ s\in (\mathbf{Z}/q\mathbf{Z})^{n}}e^{2\pi i\,\frac{a\vert s\vert ^{2}-s\cdot l} {q} }. }$$
(8.18)
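The normalized Gaussian sum (8.18) factors over the n coordinates of s, which makes it cheap to evaluate. The sketch below (with hypothetical function names) uses this factorization and illustrates the classical fact that for odd q and (a, q) = 1 each one-dimensional factor has absolute value \(q^{-1/2}\), so that \(\vert G(a,q,l)\vert = q^{-n/2}\).

```python
import cmath
import math

def gauss_sum_1d(a, q, l):
    """q^{-1} * sum over s mod q of exp(2 pi i (a s^2 - s l) / q)."""
    return sum(cmath.exp(2j * math.pi * ((a * s * s - s * l) % q) / q)
               for s in range(q)) / q

def normalized_gauss_sum(a, q, l):
    """G(a, q, l) as in (8.18), computed as a product over the coordinates of l."""
    result = 1 + 0j
    for lj in l:
        result *= gauss_sum_1d(a, q, lj)
    return result

if __name__ == "__main__":
    l = (1, 2, 0, 5)
    for q in (3, 5, 7, 9, 15):
        g = normalized_gauss_sum(2, q, l)
        print(q, abs(g), q ** (-len(l) / 2))
```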

The function \(h_{\tau,\delta }(x)\) is of the form \(e^{-\pi z\vert x\vert ^{2} }\) with \(z = 2(\delta - i\tau )\), hence, after a change of variables \(x:= z^{1/2}x\), its Fourier transform can be evaluated explicitly,

$$\displaystyle{ \tilde{h}_{\tau,\delta }(l/q-\xi ) = (2(\delta -i\tau ))^{-\frac{n} {2} }\ e^{- \frac{\pi \vert q\xi -l\vert ^{2}} {2q^{2}(\delta -i\tau )} }. }$$
(8.19)

Let us first estimate the error terms \(S_{u}(q,\tau )\) in formula (8.15). Note that on the range when \(\vert \tau \vert \approx 1/\,qN \approx 1/\,q\lambda ^{1/2}\), one has \(\ Re\,\left ( \frac{1} {q^{2}(\delta -i\tau )}\right ) = \frac{\delta } {q^{2}(\delta ^{2}+\tau ^{2})} \geq c\ \), for some absolute constant c > 0. Thus

$$\displaystyle{ \vert \tilde{h}_{\tau,\delta }(\xi -l/q)\vert \,\leq \, C\,q^{\frac{n} {2} }\,\lambda ^{ \frac{n} {4} }\,e^{-c\vert q\xi -l\vert ^{2} }. }$$
(8.20)

Also, by (8.17)

$$\displaystyle{S_{u}(q,\tau ) =\sum _{l\in \mathbf{Z}^{n}}K(q,l,\lambda;u)\ \tilde{h}_{\tau,\delta }(\xi -l/q),}$$

where

$$\displaystyle{ K(q,l,\lambda;u) =\sum _{(a,q)=1}e^{2\pi i\frac{\bar{a}u-a\lambda } {q} }\,G(a,q,l). }$$
(8.21)
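For small q the sums (8.21) can be computed directly, which is helpful for experimenting with their size. The sketch below is illustrative, with hypothetical function names; it evaluates the Gaussian sums of (8.18) coordinate by coordinate and uses Python's built-in `pow(a, -1, q)` (available from Python 3.8) for the inverse \(\bar{a}\) modulo q. For odd q the triangle inequality gives only \(\vert K\vert \leq \varphi (q)\,q^{-n/2}\), where \(\varphi (q)\) is the number of residues a coprime to q; any saving beyond this reflects cancelation in the sum over a.

```python
import cmath
import math

def normalized_gauss_sum(a, q, l):
    """G(a, q, l) of (8.18), as a product of one-dimensional sums."""
    result = 1 + 0j
    for lj in l:
        result *= sum(cmath.exp(2j * math.pi * ((a * s * s - s * lj) % q) / q)
                      for s in range(q)) / q
    return result

def kloosterman_type_sum(q, l, lam, u):
    """K(q, l, lam; u) of (8.21): sum over a coprime to q."""
    total = 0j
    for a in range(1, q + 1):
        if math.gcd(a, q) != 1:
            continue
        a_bar = pow(a, -1, q) if q > 1 else 0   # inverse of a modulo q
        phase = cmath.exp(2j * math.pi * ((a_bar * u - a * lam) % q) / q)
        total += phase * normalized_gauss_sum(a, q, l)
    return total

if __name__ == "__main__":
    l, lam, u = (1, 2, 0, 5), 5, 3
    for q in (3, 7, 11, 13):
        trivial = (q - 1) * q ** (-len(l) / 2)   # phi(q) = q - 1 for prime q
        print(q, abs(kloosterman_type_sum(q, l, lam, u)), trivial)
```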

These exponential sums have been extensively studied in number theory, and various estimates are known in the literature, going back to the original work of Kloosterman. We will use the following estimate, which we take for granted for now; for the sake of completeness we will include a proof later.

Theorem B.

Let K(q,l,λ;u) be the exponential sum defined in  (8.21) . Then one has for every \(\varepsilon > 0\),

$$\displaystyle{ \vert K(q,l,\lambda;u)\vert \leq C_{n,\varepsilon }\,q^{-\frac{n-1} {2} +\varepsilon }\,(\lambda,q_{1})^{\frac{1} {2} }\,2^{\frac{r} {2} }, }$$
(8.22)

where \(q = q_{1}2^{r}\) with \(q_{1}\) odd, and \((\lambda,q_{1})\) denotes the greatest common divisor of λ and \(q_{1}\).

We remark that using only standard estimates for the normalized Gaussian sums (8.18) would yield the weaker bound \(O(q^{-\frac{n} {2} +1})\), thus the extra cancelation in the sum over (a, q) = 1 is crucial. By this and estimate (8.20) we have

$$\displaystyle{ \max _{\vert \tau \vert \approx \frac{1} {qN} }\vert S_{u}(q,\tau )\vert \leq C_{\varepsilon }\,q^{\frac{1} {2} +\varepsilon }\,\lambda ^{\frac{n} {4} }\,(\lambda,q_{1})^{\frac{1} {2} }2^{\frac{r} {2} }. }$$
(8.23)

The factors \((\lambda,q_{1})^{\frac{1} {2} }2^{\frac{r} {2} }\) are at most \(\lambda ^{\varepsilon }\) on average for \(q \leq \lambda ^{\frac{1} {2} }\), hence they do not play any role in our estimates. Indeed, it is easy to see that

Lemma 2.

Let \(\beta \in \mathbf{R}\). Then for every \(\varepsilon > 0\), one has

$$\displaystyle{\sum _{ q\leq \lambda ^{\frac{1} {2} }}q^{\beta }\,(\lambda,q_{1})^{\frac{1} {2} }\,2^{\frac{r} {2} } \leq \, C_{\beta,\varepsilon }\,\lambda ^{\frac{\beta +1} {2} +\varepsilon }}$$

Proof.

Let \(1 \leq \mu \leq \lambda ^{1/2}\). First, we show that

$$\displaystyle{\sum _{q\leq \mu }(\lambda,q_{1})^{\frac{1} {2} }\,2^{\frac{r} {2} } \leq C_{\varepsilon }\,\lambda ^{\varepsilon }\,\mu }$$

To see this, write \(d = (\lambda,q_{1})\) and \(q_{1} = dt\). Then d divides λ and \(d\,2^{r}t \leq \mu\), hence the left side is majorized by

$$\displaystyle{\sum _{d\vert \lambda }\sum _{r\in \mathbf{N}}\,d^{\frac{1} {2} }\,2^{\frac{r} {2} }\, \frac{\mu } {d2^{r}} \leq C_{\varepsilon }\,\lambda ^{\varepsilon }\,\mu }$$

By partial summation, we have

$$\displaystyle{C_{\varepsilon }\,\lambda ^{\varepsilon }\,(\lambda ^{ \frac{\beta }{ 2} } +\sum _{\mu \leq \lambda ^{\frac{1} {2} }}\mu \,\mu ^{\beta -1}\,) \leq C_{\varepsilon }\,\lambda ^{\frac{\beta +1} {2} +\varepsilon }.}$$

 □ 
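The weights \((\lambda,q_{1})^{1/2}2^{r/2}\) appearing in Lemma 2 can be tabulated to see that they are indeed small on average. The sketch below, with hypothetical helper names, computes the weighted sum directly; comparing it with \(\mu ^{\beta +1}\) for a few values of λ gives a feeling for the \(\lambda ^{\varepsilon }\) loss, although a finite experiment of course proves nothing.

```python
import math

def odd_part_and_two_power(q):
    """Write q = q1 * 2^r with q1 odd and return (q1, r)."""
    r = 0
    while q % 2 == 0:
        q //= 2
        r += 1
    return q, r

def gcd_weight(lam, q):
    """The factor (lam, q1)^{1/2} * 2^{r/2} for q = q1 * 2^r."""
    q1, r = odd_part_and_two_power(q)
    return math.sqrt(math.gcd(lam, q1)) * 2 ** (r / 2)

def weighted_sum(lam, mu, beta=0.0):
    """Sum over 1 <= q <= mu of q^beta * (lam, q1)^{1/2} * 2^{r/2}."""
    return sum(q ** beta * gcd_weight(lam, q) for q in range(1, mu + 1))

if __name__ == "__main__":
    for lam in (10 ** 4, 3 ** 8, 2 ** 12 * 3 ** 2):
        mu = math.isqrt(lam)
        print(lam, weighted_sum(lam, mu) / mu)
```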

Going back to the error term \(E_{1}(\lambda )\) defined in (8.15), we have by estimate (8.23) and Lemma 2

$$\displaystyle{ \vert E_{1}(\lambda )\vert \leq C_{n,\varepsilon }\,\lambda ^{\frac{n-1} {4} +\varepsilon }, }$$
(8.24)

for all \(\varepsilon > 0\).

The main term in (8.14) takes the form

$$\displaystyle\begin{array}{rcl} M(\lambda ):& =& \sum _{q\leq N}\int _{- \frac{1} {qN} }^{ \frac{1} {qN} }e^{-2\pi i\lambda \tau }S_{0}(q,\tau )\,d\tau \, \\ & =& \sum _{q\leq N}\sum _{l\in \mathbf{Z}^{n}}K(q,l,\lambda;0)\,\int _{- \frac{1} {qN} }^{ \frac{1} {qN} }e^{-2\pi i\lambda \tau }\,\tilde{h}_{\tau,\delta }(\xi -l/q)\,d\tau.{}\end{array}$$
(8.25)

We will now perform a number of transformations to obtain the asymptotic formula (8.9) described in the introduction. First we insert the functions \(\psi (q\xi - l)\), to restrict the summation in l to at most one non-zero term. Then we extend the integral to the whole real line and identify it with the Fourier transform of the normalized measure on the unit sphere.

First, let ψ(ξ) be a smooth cut-off function which is constant 1 on \([-\frac{1} {8}, \frac{1} {8}]^{n}\) and is equal to 0 for \(\xi \notin [-\frac{1} {4}, \frac{1} {4}]^{n}\). Then by (8.19), one estimates

$$\displaystyle{\sum _{l\in \mathbf{Z}^{n}}(1 -\psi (q\xi - l))\,\vert \tilde{h}_{\tau,\delta }(\xi -l/q)\vert \,\leq C_{n}\,(\tau ^{2} +\delta ^{2})^{-\frac{n} {4} }\,\,e^{-\frac{c\delta } {q^{2}(\tau ^{2}+\delta ^{2})} }\, \ll \,\lambda ^{\frac{n} {4} }\,q^{\frac{n} {2} },}$$

where the last inequality follows from the fact that \(e^{-u} \ll \, u^{-\frac{n} {4} }\) taking the special value \(u = \frac{\delta } {q^{2 } (\tau ^{2 } +\delta ^{2 } )}\). Thus, by (8.15), the total error accumulated by inserting the cut-off functions in (8.25) is bounded by

$$\displaystyle{ \vert E_{2}(\lambda )\vert \leq C_{\varepsilon }\,\lambda ^{\frac{n} {4} -\frac{1} {2} }\sum _{q\leq N}q^{-\frac{1} {2} +\epsilon }(\lambda,q)^{\frac{1} {2} } \leq C_{\varepsilon }\,\lambda ^{\frac{n-1} {4} +\epsilon }, }$$
(8.26)

and the main term takes the form

$$\displaystyle{ M_{2}(\lambda ):=\sum _{q\leq N}\sum _{l\in \mathbf{Z}^{n}}K(q,l,\lambda;0)\,\psi (q\xi - l)\int _{- \frac{1} {qN} }^{ \frac{1} {qN} }e^{-2\pi i\lambda \tau }\,\tilde{h}_{\tau,\delta }(\xi -l/q)\,d\tau. }$$
(8.27)

At this point, the integration can be extended to the whole real line, exploiting the fact that now there is at most one nonzero term in the l-sum. For \(\,\vert \tau \vert \geq \frac{1} {qN} \geq \delta \,\) one has \(\,\vert \tilde{h}_{\tau,\delta }(\xi -l/q)\vert \,\ll \,\vert \tau \vert ^{-\frac{n} {2} }\,\), thus the total error obtained in (8.27) by extending the integration is

$$\displaystyle{ \vert E_{3}(\lambda )\vert \leq \, C_{\varepsilon }\,\sum _{q\leq N}q^{-\frac{1} {2} +\epsilon }(\lambda,q)^{\frac{1} {2} }\int _{\vert \tau \vert \geq \frac{1} {qN} }\tau ^{-\frac{n} {2} }\,d\tau \leq C_{\varepsilon }\,\lambda ^{\frac{n-1} {4} +\epsilon }. }$$
(8.28)

Finally, we identify the integrals, and show that

Lemma 3.

$$\displaystyle{ I_{\lambda }(\xi ):= e^{2\pi }\int _{ \mathbf{R}}e^{-2\pi i\lambda \tau }\,\tilde{h}_{\tau,\delta }(\xi )\,d\tau \, =\,\lambda ^{\frac{n} {2} -1}\,\tilde{\sigma }(\lambda ^{\frac{1} {2} }\,\xi ), }$$
(8.29)

where σ is one-half of the surface area measure on the unit sphere in \(\mathbf{R}^{n}\).

Proof.

By using (8.19) and making a change of variables: t = λ τ to take out the dependence on λ, one has that

$$\displaystyle{I_{\lambda }(\xi ) = e^{2\pi }\lambda ^{\frac{n} {2} -1}\int _{\mathbf{R}}e^{-2\pi it}(2(1 - it))^{-\frac{n} {2} }e^{- \frac{\pi \lambda \vert \xi \vert ^{2}} {2(1-it)} }\,\mathit{dt}.}$$

Let \(\eta:=\lambda ^{1/2}\xi\); then our task is to show that

$$\displaystyle{J(\eta ):= e^{2\pi }\int _{ \mathbf{R}}e^{-2\pi it}(2(1 - it))^{-\frac{n} {2} }e^{- \frac{\pi \vert \eta \vert ^{2}} {2(1-it)} }\,\mathit{dt}\, =\,\tilde{\sigma } (\eta ).}$$

We now insert an extra convergence factor \(\,e^{-\pi \gamma \ t^{2} }\,\) into the integral defining J(η). Denoting the resulting integral by \(J^{\gamma }\) we have \(J^{\gamma } \rightarrow J\) as γ → 0; moreover for any test function ϕ in the Schwartz space

$$\displaystyle{\int _{\mathbf{R}^{n}}\hat{\phi }(\eta )J(\eta )\,d\eta \, =\,\lim _{\gamma \rightarrow 0}\int _{\mathbf{R}^{n}}\hat{\phi }(\eta )J^{\gamma }(\eta )\,d\eta.}$$

Also,

$$\displaystyle{ \int _{\mathbf{R}^{n}}\hat{\phi }(\eta )J^{\gamma }(\eta )\,d\eta \, =\,\int _{\mathbf{R}^{n}}\phi (x)J^{\gamma }(x)\,\mathit{dx}. }$$
(8.30)

Note, that by (8.19) we have that \(\,\tilde{h}_{t,1}(\eta ) = (2(1 - it))^{-n/2}e^{- \frac{\pi \vert \eta \vert ^{2}} {2(1-it)} }\,\), thus

$$\displaystyle{J^{\gamma }(x) = e^{2\pi }\int _{ \mathbf{R}}e^{-2\pi it}e^{-2\pi \vert x\vert ^{2}(1-it) }e^{-\pi \gamma t^{2} }\,\mathit{dt}\, =\,\gamma ^{-\frac{1} {2} }e^{-\pi (1-\vert x\vert ^{2})^{2}/\gamma }e^{2\pi (1-\vert x\vert ^{2}) }.}$$

Inserting this into (8.30), and letting γ → 0, we obtain

$$\displaystyle{\int _{\mathbf{R}^{n}}\hat{\phi }(\eta )J(\eta )\,d\eta \, =\,\int _{\mathbf{R}^{n}}\phi (x)\,d\sigma (x),}$$

and thus \(J(\eta ) =\tilde{\sigma } (\eta )\), as we wanted to prove. Note that

$$\displaystyle{\tilde{\sigma }(0) = J(0) = e^{2\pi }\int _{\mathbf{R}}e^{-2\pi it} \frac{dt} {(2(1 - it))^{n/2}} = \frac{\pi ^{n/2}} {\varGamma (n/2)}.}$$

This identifies σ as one-half of the surface area measure of the unit sphere. □ 
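The two facts used at the end of this proof can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes that, in the radial variable r = |x|, the function \(J^{\gamma }\) behaves like the Gaussian \(\gamma ^{-1/2}e^{-\pi (1-r^{2})^{2}/\gamma }\) (up to a factor equal to 1 on the sphere), and the function names `half_sphere_area` and `radial_mass` are illustrative.

```python
import math


def half_sphere_area(n):
    """Return pi^(n/2) / Gamma(n/2), the value of sigma~(0) stated in the text."""
    return math.pi ** (n / 2) / math.gamma(n / 2)


def radial_mass(gamma, r_max=2.0, steps=200_000):
    """Midpoint-rule approximation of the integral over 0 < r < r_max of
    gamma^(-1/2) * exp(-pi * (1 - r^2)^2 / gamma) dr.

    As gamma -> 0 the Gaussian concentrates at r = 1, and since
    d(r^2) = 2 r dr the total mass should approach 1/2.
    """
    dr = r_max / steps
    total = 0.0
    for i in range(steps):
        r = (i + 0.5) * dr
        total += math.exp(-math.pi * (1.0 - r * r) ** 2 / gamma)
    return total * dr / math.sqrt(gamma)


if __name__ == "__main__":
    # Full surface area of S^2 is 4*pi, so the half-area constant is 2*pi.
    print(half_sphere_area(3), 2 * math.pi)
    # Radial mass for a small gamma, to be compared with 1/2.
    print(radial_mass(1e-4))
```

The factor 1/2 in the radial mass is the numerical counterpart of the statement that σ is one-half of the surface area measure.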

Substituting the above formula (8.29) into the expression (8.27), the main term finally takes the form

$$\displaystyle{ M_{3}(\lambda ):=\lambda ^{n/2-1}\,\sum _{ q\leq N}\sum _{l\in \mathbf{Z}^{n}}K(q,l,\lambda;0)\,\psi (q\xi - l)\,\tilde{\sigma }(\lambda ^{1/2}(\xi -l/q)). }$$
(8.31)

Note that all error terms, (8.15), (8.24), (8.26), and (8.28), that we obtained in the process of transforming the main term into the above expression are of magnitude \(O_{\varepsilon }(\lambda ^{\frac{n-1} {4} +\varepsilon })\). Summarizing, we have proved

Theorem 4.

Let n ≥ 4. Then one has

$$\displaystyle{\hat{\omega }_{\lambda }(\xi ) =\,\lambda ^{\frac{n} {2} -1}\,\sum _{ q\leq \lambda ^{\frac{1} {2} }}m_{q,\lambda }(\xi ) + \mathcal{E}_{\lambda }(\xi ),}$$

where

$$\displaystyle{ \vert \mathcal{E}_{\lambda }(\xi )\vert \leq C_{\varepsilon }\,\lambda ^{\frac{n-1} {4} +\varepsilon } }$$
(8.32)

holds uniformly in ξ for every \(\varepsilon > 0\) . Moreover

$$\displaystyle{ m_{q,\lambda }(\xi ) =\sum _{l\in \mathbf{Z}^{n}}K(q,l,\lambda )\,\psi (q\xi - l)\,\,\tilde{\sigma }(\lambda ^{\frac{1} {2} }(\xi -l/q)) }$$
(8.33)

where

$$\displaystyle{K(q,l,\lambda ) = q^{-n}\sum _{ (a,q)=1}\sum _{s\in (\mathbf{Z}/q\mathbf{Z})^{n}}e^{2\pi i\,\frac{a(\vert s\vert ^{2}-\lambda )+s\cdot l} {q} }.}$$

Here \(\tilde{\sigma }\) denotes the Fourier transform of the surface-area measure σ on \(S^{n-1}\), and ψ is a smooth cut-off function supported on \([-\frac{1} {4}, \frac{1} {4}]^{n}\) which is constant 1 on \([-\frac{1} {8}, \frac{1} {8}]^{n}\).
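For small moduli the coefficients K(q, l, λ) can be evaluated directly from the formula above. The following is a minimal sketch under the assumption that the sum over s is computed coordinate by coordinate (it factors into n one-dimensional quadratic sums); the helper names `gauss_sum_1d` and `K` are illustrative and not taken from the text.

```python
import cmath
import math


def gauss_sum_1d(a, l, q):
    """Sum over s mod q of exp(2*pi*i*(a*s^2 + s*l)/q), with l an integer."""
    return sum(
        cmath.exp(2j * math.pi * ((a * s * s + s * l) % q) / q)
        for s in range(q)
    )


def K(q, l, lam):
    """Evaluate K(q, l, lambda) for an integer vector l = (l_1, ..., l_n).

    The sum over s in (Z/qZ)^n factors into a product of one-dimensional
    sums, one for each coordinate of l.
    """
    n = len(l)
    total = 0j
    for a in range(1, q + 1):
        if math.gcd(a, q) != 1:
            continue
        # Contribution of the term -a*lambda in the exponent.
        term = cmath.exp(-2j * math.pi * ((a * lam) % q) / q)
        for lj in l:
            term *= gauss_sum_1d(a, lj, q)
        total += term
    return total / q ** n


if __name__ == "__main__":
    for q in range(1, 9):
        print(q, abs(K(q, (0, 0, 0, 0), 7)))
```

Each one-dimensional sum has absolute value at most \(\sqrt{2q}\), so the printed values are at most \(2^{n/2}\varphi (q)\,q^{-n/2}\), in line with the elementary bound (8.62).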

2.2 Some Properties of Diophantine Points

We will derive here a few elementary properties of diophantine points, needed later in our estimates on the discrepancy. Crucial among them is the fact that if \(\xi \in S^{n-1}\) is a diophantine point, then \(\,\|t\xi \| \gg T^{-\varepsilon }\,\) on average for 1 ≤ t ≤ T, where \(\|\xi \|\) denotes the distance of a point \(\xi \in \mathbf{R}^{n}\) to the nearest lattice point. To start, let us call a point \(\alpha \in \mathbf{R}^{n}\) of type \(\varepsilon\) if it satisfies condition (8.3) with a given \(\varepsilon > 0\).

Lemma 5.

For every ε > 0 the set of points \(\alpha \in [0,1]^{n-1}\) of type \(\varepsilon\) has measure 1.

Proof.

If a point \(\,\alpha \in \mathbf{R}^{n-1}\,\) is not of type ε then there are infinitely many positive integers q such that \(\,\|q\alpha \| \leq q^{- \frac{1} {n-1} -\epsilon }\,\). This means that there exists an \(m \in \mathbf{Z}^{n-1}\) such that \(\,\vert \alpha - m/q\vert \leq q^{- \frac{n} {n-1} -\epsilon }\,\). However, the sum of the volumes of all such neighborhoods around the points \(m/q \in [0,1]^{n-1}\) is bounded by

$$\displaystyle{\sum _{q=1}^{\infty }\,q^{n-1}q^{-n-\epsilon }\leq C_{\varepsilon },}$$

thus the set of points which belong to infinitely many of such neighborhoods has measure 0. □ 

This shows that the set of points \(\alpha \in \mathbf{R}^{n-1}\) which are not diophantine has measure 0. Indeed α is diophantine if it is of type \(\varepsilon _{k} = (1/2)^{k}\) for all k ∈ N. Next we show that \(\|q\alpha \| \approx 1\) on average.

Lemma 6.

Let \(\alpha \in [0,1]^{n-1}\) be diophantine, Q > 1 and 1 ≤ k < n − 1. Then for every \(\varepsilon > 0\), we have

$$\displaystyle{ \sum _{q\leq Q}\|q\alpha \|^{-k} \leq C_{\varepsilon }\ Q^{1+\varepsilon } }$$
(8.34)

Proof.

Let \(\varepsilon > 0\). Consider the set of points \(\{q\alpha \} \in [-1/2,1/2]^{n-1}\), for 1 ≤ q ≤ Q. If \(q_{1}\neq q_{2}\) then

$$\displaystyle{\vert \{q_{1}\alpha \} -\{ q_{2}\alpha \}\vert \geq \| (q_{1} - q_{2})\alpha \| \geq C_{\varepsilon }\,Q^{- \frac{1} {n-1} - \frac{\varepsilon }{ n} },}$$

thus the number of points in a dyadic annulus \(2^{-j} \leq \| q\alpha \| < 2^{-j+1}\) is bounded by \(2^{-(n-1)j}\,Q^{1+\varepsilon }\) and the sum in (8.34) is convergent for 1 ≤ k < n − 1. □ 
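The averaged bound (8.34) can be explored numerically. The following is a minimal sketch, assuming the Euclidean distance to the nearest lattice point and the sample vector \(\alpha = (\sqrt{2},\sqrt{3})\), which is chosen only for illustration (so n − 1 = 2 and k = 1); the function names are hypothetical.

```python
import math


def dist_to_lattice(x):
    """Euclidean distance from the point x to the nearest integer lattice point."""
    return math.sqrt(sum((xi - round(xi)) ** 2 for xi in x))


def negative_moment_sum(alpha, Q, k):
    """Sum over 1 <= q <= Q of ||q * alpha||^(-k)."""
    return sum(
        dist_to_lattice([q * a for a in alpha]) ** (-k)
        for q in range(1, Q + 1)
    )


if __name__ == "__main__":
    alpha = (math.sqrt(2), math.sqrt(3))
    for Q in (100, 1000, 10000):
        s = negative_moment_sum(alpha, Q, 1)
        # The ratio s / Q should grow at most like a small power of Q.
        print(Q, s, s / Q)
```

Since \(\|q\alpha \| \leq \sqrt{2}/2\) in the plane, every term is at least \(\sqrt{2}\), so the sum is always at least of order Q; the content of the lemma is the matching upper bound up to a factor \(Q^{\varepsilon }\).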

Lemma 7.

Let \(\xi \in S^{n-1}\) be diophantine, and assume that \(\max _{j}\vert \xi _{j}\vert = \vert \xi _{n}\vert\). Let t ≥ 1, \(\alpha = (\alpha _{1},\ldots,\alpha _{n-1})\), \(\alpha _{j} =\xi _{j}/\xi _{n}\) and \(q = [t\xi _{n}]\). Then one has

$$\displaystyle{\|t\xi \| \geq \frac{1} {n}\,\|q\alpha \|}$$

Proof.

Note that

$$\displaystyle{t\xi _{j} = t\xi _{n}\alpha _{j} = [t\xi _{n}]\alpha _{j} \pm \| t\xi _{n}\|\alpha _{j}}$$

hence

$$\displaystyle{\vert q\alpha _{j} - m_{j}\vert \leq \vert t\xi _{j} - m_{j}\vert +\| t\xi _{n}\|.}$$

Thus taking \(m_{j} = [t\xi _{j}]\), we have

$$\displaystyle{\|q\alpha _{j}\| \leq \| t\xi _{j}\| +\| t\xi _{n}\|.}$$

Summing for 1 ≤ j ≤ n − 1 proves the lemma. □ 
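The inequality of Lemma 7 is easy to test on examples. The following is a minimal sketch, assuming that [x] denotes the nearest integer to x, that \(\|\cdot \|\) is the Euclidean distance to the nearest lattice point, and that the last coordinate of ξ is the largest in absolute value; the function names are illustrative.

```python
import math


def lattice_dist(v):
    """Euclidean distance from v to the nearest integer lattice point."""
    return math.sqrt(sum((x - round(x)) ** 2 for x in v))


def lemma7_sides(xi, t):
    """Return (||t*xi||, ||q*alpha|| / n) with q = [t*xi_n], alpha_j = xi_j / xi_n.

    Assumes |xi_n| is the largest coordinate of xi in absolute value.
    """
    n = len(xi)
    q = round(t * xi[-1])
    alpha = [x / xi[-1] for x in xi[:-1]]
    lhs = lattice_dist([t * x for x in xi])
    rhs = lattice_dist([q * a for a in alpha]) / n
    return lhs, rhs


if __name__ == "__main__":
    norm = math.sqrt(1 + 2 + 5)
    xi = (1 / norm, math.sqrt(2) / norm, math.sqrt(5) / norm)
    for t in (1.0, 3.7, 12.25, 100.5):
        print(t, lemma7_sides(xi, t))
```

In every printed pair the first entry should be at least as large as the second, which is the statement of the lemma.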

Lemma 8.

Suppose \(\xi \in S^{n-1}\) is diophantine, and let t ≥ 1 and T ≥ 1. Then for every \(\varepsilon > 0\), one has

$$\displaystyle{ \|t\xi \| \geq C_{\varepsilon }\,t^{- \frac{1} {n-1} -\varepsilon } }$$
(8.35)

Moreover, for 1 ≤ k < n − 1

$$\displaystyle{ \int _{1}^{T}\|t\xi \|^{-k}\,\mathit{dt} \leq C_{\varepsilon }\,T^{1+\varepsilon } }$$
(8.36)

Proof.

By permuting the coordinates of ξ, one can assume that \(\max _{j}\vert \xi _{j}\vert = \vert \xi _{n}\vert\). Inequality (8.35) follows immediately from Lemma 7 and the definition of a diophantine point. Similarly (8.36) is reduced to (8.34) by observing that for a fixed q, the set of t’s for which \(q = [t\xi _{n}]\) is an interval of length at most \(1/\xi _{n} \leq \sqrt{n}\). □ 

2.3 Upper Bounds on Discrepancy

We have developed all the necessary tools to prove Theorem 8.11, our main result in this section. The argument will follow the broad outline given at the end of the introduction; in addition we will use the standard stationary phase estimate on the Fourier transform of the surface area measure on the unit sphere \(S^{n-1}\), see for example [17]

$$\displaystyle{ \vert \tilde{\sigma }(\xi )\vert \,\ll \, (1 + \vert \xi \vert )^{-\frac{n-1} {2} } }$$
(8.37)

Now, for a given a > 0 let \(\chi _{a}\) denote the indicator function of the interval [a, 1 + a]. The discrepancy may be written as

$$\displaystyle{ D_{n}(\xi,\lambda ) =\,\sum _{\vert m\vert ^{2}=\lambda }\chi _{a}(\lambda ^{-\frac{1} {2} }\,m\cdot \xi ) - N_{\lambda }\int _{S^{n-1}}\chi _{a}(x\cdot \xi )\,d\sigma (x). }$$
(8.38)

The function \(\chi _{a}\) can be replaced with a smooth function \(\phi _{a,\delta }\) at the cost of a small error in the discrepancy. Indeed, let 0 ≤ ϕ(t) ≤ 1 be a smooth function supported in [−1, 1], such that ∫ ϕ = 1. For a given \(\delta > 0\) let \(\,\phi _{a,\delta }^{\pm } =\chi _{a\pm \delta }{\ast}\phi _{\delta }\,\), where \(\,\phi _{\delta }(t) =\delta ^{-1}\phi (t\,\delta ^{-1})\,\) and define the smoothed discrepancy as

$$\displaystyle{ D_{n}(\phi _{a,\delta }^{\pm },\xi,\lambda ) =\,\sum _{ \vert m\vert ^{2}=\lambda }\phi _{a,\delta }^{\pm }(\lambda ^{-\frac{1} {2} }\,m\cdot \xi ) - N_{\lambda }\int _{S^{n-1}}\phi _{a,\delta }^{\pm }(x\cdot \xi )\,d\sigma (x). }$$
(8.39)

Lemma 9.

One has

$$\displaystyle{ \vert D_{n}(\xi,\lambda )\vert \leq \,\max \, (\vert D_{n}(\phi _{a,\delta }^{+},\xi,\lambda )\vert,\vert D_{ n}(\phi _{a,\delta }^{-},\xi,\lambda )\vert ) +\, O(\delta N_{\lambda }). }$$
(8.40)

Proof.

Note that \(\ \phi _{a,\delta }^{-}(t) \leq \chi _{a}(t) \leq \phi _{a,\delta }^{+}(t)\ \) thus

$$\displaystyle{\sum _{\vert m\vert ^{2}=\lambda }\phi _{a,\delta }^{-}(\lambda ^{-\frac{1} {2} }\,m\cdot \xi ) \leq \sum _{\vert m\vert ^{2}=\lambda }\chi _{a}(\lambda ^{-\frac{1} {2} }\,m\cdot \xi ) \leq \sum _{\vert m\vert ^{2}=\lambda }\phi _{a,\delta }^{+}(\lambda ^{-\frac{1} {2} }\,m\cdot \xi )}$$

and

$$\displaystyle{N_{\lambda }\int _{S^{n-1}}\phi _{a,\delta }^{+}(x\cdot \xi )\,d\sigma (x) \geq N_{\lambda }\int _{ S^{n-1}}\chi _{a}(x\cdot \xi )\,d\sigma (x) \geq N_{\lambda }\int _{S^{n-1}}\phi _{a,\delta }^{-}(x\cdot \xi )\,d\sigma (x)}$$

Subtracting the above inequalities, (8.40) follows from the fact that

$$\displaystyle{\int _{S^{n-1}}(\phi _{a,\delta }^{+} -\phi _{ a,\delta }^{-})\,(x\cdot \xi )\,d\sigma (x)\, \ll \,\delta }$$

 □ 

In what follows, we take \(\delta =\lambda ^{-n}\) and write \(\phi _{a,\delta }\) for \(\phi _{a,\delta }^{\pm }\), as our estimates work the same way for both choices of the sign. By taking the inverse Fourier transform of \(\phi _{a,\delta }(t)\) one has

$$\displaystyle{ \sum _{\vert m\vert ^{2}=\lambda }\phi _{a,\delta }\,(\lambda ^{-1/2}\,m\cdot \xi ) =\int _{\mathbf{ R}}\lambda ^{\frac{1} {2} }\tilde{\phi }_{a,\delta }(t\lambda ^{\frac{1} {2} })\,\hat{\omega }_{\lambda }(t\xi )\,\mathit{dt} }$$
(8.41)

also

$$\displaystyle{ \int _{S^{n-1}}\phi _{a,\delta }\,(x\cdot \xi )\,d\sigma (x) =\int _{\mathbf{R}}\tilde{\phi }_{a,\delta }(t)\,\tilde{\sigma }(t\xi )\,\mathit{dt} }$$
(8.42)

We substitute the asymptotic formula (8.9) into (8.41) and study the contribution of each term separately. Accordingly, let

$$\displaystyle{ I_{q,\lambda }:=\int _{\mathbf{R}}\lambda ^{\frac{1} {2} }\tilde{\phi }_{a,\delta }(t\lambda ^{\frac{1} {2} })\,m_{q,\lambda }(t\xi )\,\mathit{dt}, }$$
(8.43)

and

$$\displaystyle{ E_{\lambda } =\int _{\mathbf{R}}\lambda ^{\frac{1} {2} }\tilde{\phi }_{a,\delta }(t\lambda ^{\frac{1} {2} })\,\mathcal{E}_{\lambda }(t\xi )\,\mathit{dt}. }$$
(8.44)

To estimate the error term in (8.44) note that

$$\displaystyle{\int _{\mathbf{R}}\lambda ^{\frac{1} {2} }\vert \tilde{\phi }_{a,\delta }(t\lambda ^{\frac{1} {2} })\vert \,\mathit{dt} \leq C\,\int _{\mathbf{R}}(1 + \vert t\vert )^{-1}(1 +\delta \vert t\vert )^{-1}\,\mathit{dt} \leq C\,\log \,\lambda.}$$

Thus by (8.32) one has for every \(\varepsilon > 0\)

$$\displaystyle{ \vert E_{\lambda }\vert \leq C_{\varepsilon }\,\lambda ^{\frac{n-1} {4} +\varepsilon }. }$$
(8.45)

Next, decompose the integral in (8.43) as

$$\displaystyle{ I_{q,\lambda } =\int _{\vert t\vert <1/\,8q}\ \ \ \ +\int _{\vert t\vert \geq 1/\,8q}\ \ \ \ \ \ = I_{q,\lambda }^{1} + I_{ q,\lambda }^{2}. }$$
(8.46)

Here an important observation is that if |t| < 1∕8q then \(\psi (qt\xi - l) = 0\) unless l = 0; moreover \(\psi (tq\xi ) = 1\) since \(\vert tq\xi _{j}\vert < 1/8\) for each j. Hence

$$\displaystyle{m_{q,\lambda }(t\xi ) = K(q,0,\lambda )\,\,\tilde{\sigma }(\lambda ^{\frac{1} {2} }\,t\xi ).}$$

Thus by (8.43) and the change of variables \(t:= t\lambda ^{1/2}\)

$$\displaystyle{ I_{q,\lambda }^{1} = K(q,0,\lambda )\int _{ \vert t\vert <\lambda ^{\frac{1} {2} }/\,8q}\tilde{\phi }_{a,\delta }(t)\,\tilde{\sigma }(t\xi )\,\mathit{dt}. }$$
(8.47)

Lemma 10.

One has for every \(\varepsilon > 0\)

$$\displaystyle{ \vert \,\gamma _{n}\lambda ^{\frac{n} {2} -1}\sum _{ q\leq \lambda ^{\frac{1} {2} }}I_{q,\lambda }^{1}\, -\, N_{\lambda }\int _{ S^{n-1}}\phi _{a,\delta }\,(x\cdot \xi )\,d\sigma (x)\,\vert \,\leq C_{\varepsilon }\,\lambda ^{\frac{n-1} {4} +\varepsilon } }$$
(8.48)

Proof.

Using (8.37), one has

$$\displaystyle{ \int _{ \vert t\vert \geq \lambda ^{\frac{1} {2} }/\,8q}\vert \tilde{\phi }_{a,\delta }(t)\,\tilde{\sigma }(t\xi )\,\vert \,\mathit{dt}\, \leq \, C_{\varepsilon }\,\lambda ^{-\frac{n-1} {4} +\varepsilon }\,q^{\frac{n-1} {2} } }$$
(8.49)

Thus by (8.42) and (8.47)

$$\displaystyle{\vert \,I_{q,\lambda }^{1} - K(q,0,\lambda )\int _{ S^{n-1}}\phi _{a,\delta }\,(x\cdot \xi )\,d\sigma (x)\,\vert \,\leq \, C_{\varepsilon }\,\lambda ^{-\frac{n-1} {4} +\varepsilon }\,q^{\frac{n-1} {2} }\,\vert K(q,0,\lambda )\vert }$$

Substituting ξ = 0 in (8.33) one has

$$\displaystyle{ \vert \,N_{\lambda } -\gamma _{n}\lambda ^{\frac{n} {2} -1}\sum _{ q\leq \lambda ^{\frac{1} {2} }}K(q,0,\lambda )\vert \,\leq \, C_{\varepsilon }\,\lambda ^{\frac{n-1} {4} +\varepsilon } }$$
(8.50)

Using (8.22) and (8.50), the left side of (8.48) is estimated by

$$\displaystyle{ C_{\varepsilon }\,\left (\,\lambda ^{\frac{n-1} {4} +\varepsilon }\, +\lambda ^{\frac{n-3} {4} +\varepsilon }\,\sum _{ q\leq \lambda ^{\frac{1} {2} }}q^{\varepsilon }\,(\lambda,q_{1})^{\frac{1} {2} }\,2^{\frac{r} {2} }\,\right )\, \leq \, C_{\varepsilon }\,\lambda ^{\frac{n-1} {4} +\varepsilon } }$$
(8.51)

 □ 

To estimate the remaining error terms one needs to exploit the diophantine properties of the direction ξ.

Lemma 11.

Let \(\xi \in S^{n-1}\) be diophantine. Then for every \(\varepsilon > 0\), we have

$$\displaystyle{ \sum _{ q\leq \lambda ^{\frac{1} {2} }}\vert I_{q,\lambda }^{2}\vert \,\leq \, C_{\xi,\varepsilon }\,\lambda ^{-\frac{n-3} {4} +\varepsilon } }$$
(8.52)

Proof.

First, note that \(\psi (q\xi - l) = 0\) unless l = [qξ], that is, the closest lattice point to the point \(q\xi \in \mathbf{R}^{n}\). Using the notation {qξ} = qξ − [qξ] one may write

$$\displaystyle{m_{q,\lambda }(t\xi ) = K(q,[qt\xi ],\lambda )\,\psi (\{qt\xi \})\,\,\tilde{\sigma }\left (\frac{\lambda ^{\frac{1} {2} }} {q}\,\{qt\xi \}\right )}$$

By making a change of variables t: = qt, it follows from estimates (8.22) and (8.37) that

$$\displaystyle{ \vert I_{q,\lambda }^{2}\vert \,\leq \, C_{\varepsilon }\,(\lambda ^{\frac{1} {2} }/q)^{-\frac{n-3} {2} }\,\,q^{-\frac{n-1} {2} +\varepsilon }\,(\lambda,q_{1})^{\frac{1} {2} }\,2^{\frac{r} {2} }\,J_{\lambda }, }$$
(8.53)

where

$$\displaystyle{J_{\lambda } =\int _{\vert t\vert \geq 1/8}\vert \tilde{\phi }_{a,\delta }(t\,\lambda ^{\frac{1} {2} }/q)\vert \,\,\|t\xi \|^{-\frac{n-1} {2} }\,\mathit{dt},}$$

and \(\|t\xi \|\) denotes the distance of the point tξ to the nearest lattice point. For \(q \leq \lambda ^{1/2}\) one has

$$\displaystyle{ \vert \tilde{\phi }_{a,\delta }(t\,\lambda ^{\frac{1} {2} }/q)\vert \,\leq C\,(\lambda ^{\frac{1} {2} }/q)^{-1}\,\vert t\vert ^{-1}\,(1 +\delta \vert t\vert )^{-1} }$$
(8.54)

To estimate the integral \(J_{\lambda }\) we use (8.54), and integrate over dyadic intervals \(2^{j} \leq \vert t\vert < 2^{j+1}\) (j ≥ −3). For a fixed j we have

$$\displaystyle{ \int _{2^{j}}^{2^{j+1} }t^{-1}(1 +\delta t)^{-1}\,\|t\xi \|^{-\frac{n-1} {2} }\,\mathit{dt}\, \leq C_{\varepsilon }\,2^{j\varepsilon }\,(1 +\delta 2^{j})^{-1} }$$
(8.55)

Summing over j this gives: \(J_{\lambda } \leq C_{\varepsilon }\,(\lambda ^{\frac{1} {2} }/q)^{-1}\lambda ^{\varepsilon }\). Substituting into (8.53) one estimates

$$\displaystyle{ \vert I_{q,\lambda }^{2}\vert \,\leq \, C_{\varepsilon }\,\lambda ^{-\frac{n-1} {4} +\varepsilon }\,q^{\varepsilon }\,(\lambda,q_{1})^{\frac{1} {2} }\,2^{\frac{r} {2} } }$$
(8.56)

Summing over \(q \leq \lambda ^{1/2}\), and using Lemma 2, estimate (8.52) follows. □ 

Theorem 1 follows immediately from Lemmas 9–11, and estimate (8.45).

2.4 The Kloosterman Refinement

For the sake of completeness we include below the proofs of Theorems A and B. The present form of Theorem A was given by Heath-Brown [9] in his study of non-singular cubic forms, the idea going back to Kloosterman. Theorem B follows from the multiplicative properties of Kloosterman sums and Weil’s estimates [14].

To start let w be an absolutely summable weight function, P be an integral polynomial, N a fixed positive integer, and write

$$\displaystyle{ I:=\sum _{P(m)=\lambda }w(m) =\int _{ -1/(N+1)}^{1-1/(N+1)}S(\alpha )\,d\alpha, }$$
(8.57)

with \(\,S(\alpha ) =\sum _{m\in \mathbf{Z}^{n}}e^{2\pi i\,\alpha P'(m)}w(m)\,\), P′(m) = P(m) − λ. Breaking up the interval [−1∕(N + 1), 1 − 1∕(N + 1)] according to the Farey dissection of order N (see [8], Ch. 3.8), we have

$$\displaystyle{I =\sum _{q\leq N}\sum _{(a,q)=1}\int S(a/q\,+\beta )\,d\beta.}$$

Here for fixed a the inner integral is over the interval

$$\displaystyle{\left [\frac{a + a'} {q + q'} -\frac{a} {q}\,,\, \frac{a + a''} {q + q''} -\frac{a} {q}\right ],}$$

where a′∕q′, a∕q, a″∕q″ are consecutive Farey fractions. Since qa′ − q′a = −1 and qa″ − q″a = 1 the range of β is given by

$$\displaystyle{-(q + q')^{-1} \leq q\beta \leq (q + q'')^{-1}.}$$

Since for consecutive Farey fractions, we have q + q′, q + q″ ≥ N, one may write I as

$$\displaystyle{ \sum _{q\leq N}\int _{-1/qN}^{1/qN}\sum _{ a}S(a/q+\beta )\,d\beta, }$$
(8.58)

where the inner sum is restricted to 1 ≤ a ≤ q, (a, q) = 1, and

$$\displaystyle{ q' \leq \frac{1} {q\vert \beta \vert }- q\ \ (\beta < 0),\ \ \ \ q'' \leq \frac{1} {q\beta } - q\ \ (\beta > 0). }$$
(8.59)

The numbers q′, q″ are completely specified by a as \(q' \equiv -q'' \equiv a^{-1}\pmod q\,\) and N − q < q′, q″ ≤ N, thus (8.59) eventually restricts the summation in a. The point is that if \(\vert \beta \vert \leq q^{-1}(q + N)^{-1}\), then (8.59) places no restriction on a, and otherwise \(\bar{a} = a^{-1}\pmod q\) must lie in one of two intervals \(J(q,\beta ) \subseteq (0,q)\). Then one estimates

$$\displaystyle\begin{array}{rcl} \sum _{\bar{a}\in J(q,\beta )}S(a/q+\beta )& =& \sum _{(s,q)=1}S(\bar{s}/q+\beta )\sum _{t\in J(q,\beta )}\frac{1} {q}\sum _{\vert u\vert \leq q/2}e^{2\pi i\frac{u(s-t)} {q} } \\ & =& \frac{1} {q}\sum _{\vert u\vert \leq q/2}S_{u}(q,\beta )\sum _{t\in J(q,\beta )}e^{-2\pi i\frac{ut} {q} } \\ & \ll & \sum _{\vert u\vert \leq q/2}(1 + \vert u\vert )^{-1}\vert S_{ u}(q,\beta )\vert, {}\end{array}$$
(8.60)

where

$$\displaystyle{S_{u}(q,\beta ) =\sum _{(s,q)=1}e^{2\pi i\frac{us} {q} }S(\bar{s}/q+\beta ),}$$

using the estimate

$$\displaystyle{\frac{1} {q}\sum _{t\in J(q,\beta )}e^{-2\pi i\frac{ut} {q} } \ll (1 + \vert u\vert )^{-1}.}$$

Since

$$\displaystyle{(qN)^{-1} - q^{-1}(q + N)^{-1} = N^{-1}(q + N)^{-1} \leq N^{-2}}$$

and

$$\displaystyle{q^{-1}(q + N)^{-1} \geq (2qN)^{-1},}$$

the total contribution to (8.58) arising from the ranges \(\vert \beta \vert \geq q^{-1}(q + N)^{-1}\) is

$$\displaystyle{ \ll N^{-2}\sum _{ q\leq N}\sum _{\vert u\vert \leq q/2}(1 + \vert u\vert )^{-1}\max _{\frac{ 1} {2} \leq qN\vert \beta \vert \leq 1}\vert S_{u}(q,\beta )\vert. }$$
(8.61)

The remaining range for β gives

$$\displaystyle{\sum _{q\leq N}\int _{-1/q(q+N)}^{1/q(q+N)}S_{ 0}(q,\beta )\,d\beta.}$$

If one integrates for | β | ≤ 1∕qN instead, the resulting error is again of the form of (8.61). Thus summarizing the above estimates, we have

$$\displaystyle{I =\sum _{q\leq N}\int _{-1/qN}^{1/qN}S_{ 0}(q,\beta )\,d\beta +O\Big(N^{-2}\sum _{ q\leq N}\sum _{\vert u\vert \leq q/2}(1+\vert u\vert )^{-1}\max _{\frac{ 1} {2} \leq qN\vert \beta \vert \leq 1}\vert S_{u}(q,\beta )\vert \Big),}$$

and Theorem A follows.
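The two facts about consecutive Farey fractions used in this argument, the determinant relation between neighbours and the lower bound on the sum of their denominators, can be verified mechanically for small orders. The following is a minimal sketch using the standard next-term recurrence for the Farey sequence; the function names `farey` and `check_neighbours` are illustrative.

```python
def farey(N):
    """Farey fractions of order N in [0, 1], as (a, q) pairs in increasing order."""
    a, b, c, d = 0, 1, 1, N
    seq = [(a, b)]
    while c <= N:
        # Standard recurrence producing the next fraction from the last two.
        k = (N + b) // d
        a, b, c, d = c, d, k * c - a, k * d - b
        seq.append((a, b))
    return seq


def check_neighbours(N):
    """Check that neighbours a1/q1 < a2/q2 satisfy a2*q1 - a1*q2 = 1 and q1 + q2 > N."""
    seq = farey(N)
    for (a1, q1), (a2, q2) in zip(seq, seq[1:]):
        if a2 * q1 - a1 * q2 != 1:
            return False
        if q1 + q2 <= N:
            return False
    return True


if __name__ == "__main__":
    print(farey(5))
    print(all(check_neighbours(N) for N in range(1, 31)))
```

The second property is what allows every arc around a∕q to be contained in the range |β| ≤ 1∕qN used in (8.58).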

From the standard estimate for the Gaussian sums \(G(a,l,q) \ll q^{n/2}\), it is immediate that

$$\displaystyle{ \vert K(q,l,\lambda;u)\vert \,\ll \, q^{-n/2\,+1}. }$$
(8.62)

Also, G(a, l, q) is a product of one dimensional sums, thus for q odd, by completing the square in the exponent, it may be written in the form (see also [14], Ch. 4)

$$\displaystyle{G(a,l,q) = q^{-n}\,\epsilon _{ q}^{n}\,\left (\frac{q} {a}\right )^{n}\,e^{-2\pi i\frac{\bar{4}\bar{a}\,\vert l\vert ^{2}} {q} }\,G(1,0,q)^{n},}$$

where \(\left (\frac{q} {a}\right )\) denotes the Jacobi symbol , ε q is a 4th root of unity, and \(\bar{a}\) denotes the multiplicative inverse of a mod q. Substituting this into (8.21) we have

$$\displaystyle{ K(q,l,\lambda;u) =\epsilon _{ q}^{n}\,q^{-n}G(1,0,q)^{n}\sum _{ (a,q)=1}\left (\frac{q} {a}\right )^{n}e^{2\pi i\frac{a\lambda +\bar{4}\bar{a}(u-\vert l\vert ^{2})} {q} }. }$$
(8.63)

The sum in (8.63) is a Kloosterman sum or a Salié sum depending on whether n is even or odd. Weil’s estimates [14, Ch. 4] imply

$$\displaystyle{ \vert K(q,l,\lambda;u)\vert \leq C_{\varepsilon }\,q^{-\frac{n-1} {2} +\varepsilon }\,(\lambda,q)^{\frac{1} {2} }. }$$
(8.64)

Estimate (8.22) follows by writing \(q = q_{1}q_{2}\), with \(q_{1}\) odd and \(q_{2} = 2^{r}\), applying (8.64) to \(q_{1}\), (8.62) to \(q_{2} = 2^{r}\) and using the multiplicative property

$$\displaystyle{ K(q,l,\lambda;u) = K(q_{1},l\,\bar{q_{2}},\lambda;u\,\bar{q_{2}}^{2})\,K(q_{ 2},l\,\bar{q_{1}},\lambda;u\,\bar{q_{1}}^{2}), }$$
(8.65)

where \(q_{1}\bar{q_{1}} \equiv 1\) (mod q 2), and \(q_{2}\bar{q_{2}} \equiv 1\) (mod q 1). Property (8.65) is well-known, and is an easy computation using the Chinese Remainder Theorem . This finishes the proof of Theorem B.
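The two classical inputs of this proof can be checked numerically for small moduli: the quadratic Gauss sum modulo an odd q has absolute value \(\sqrt{q}\), and the Kloosterman sum modulo a prime p satisfies Weil's bound \(2\sqrt{p}\). The following is a minimal sketch with illustrative function names; it treats only the one-dimensional sums and requires Python 3.8 or later for the modular inverse via `pow`.

```python
import cmath
import math


def quadratic_gauss_sum(q):
    """Sum over s mod q of exp(2*pi*i*s^2/q)."""
    return sum(cmath.exp(2j * math.pi * ((s * s) % q) / q) for s in range(q))


def kloosterman(m, n, p):
    """S(m, n; p): sum over invertible x mod p of exp(2*pi*i*(m*x + n*x^{-1})/p)."""
    total = 0j
    for x in range(1, p):
        x_inv = pow(x, -1, p)  # modular inverse (p is assumed prime)
        total += cmath.exp(2j * math.pi * ((m * x + n * x_inv) % p) / p)
    return total


if __name__ == "__main__":
    for q in (3, 5, 7, 9, 15):
        print(q, abs(quadratic_gauss_sum(q)), math.sqrt(q))
    p = 13
    worst = max(abs(kloosterman(m, n, p))
                for m in range(1, p) for n in range(1, p))
    print(worst, 2 * math.sqrt(p))
```

The square-root cancellation visible in the second experiment is what improves the elementary bound (8.62) to (8.64).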

3 The Discrepancy of Lattice Points on Hypersurfaces

We will study now the uniformity of distribution of lattice points on a homogeneous, non-singular, hypersurface. We will show that if the dimension of the underlying Euclidean space is large enough with respect to the degree of the hypersurface, then there are non-trivial upper bounds on the discrepancy with respect to caps.

The analysis will be similar to what we have carried out for spheres, however in this generality we will use the Birch-Davenport method of exponential sums, which will allow us to develop uniform asymptotic formulae for the Fourier transform of the set of lattice points on the hypersurface.

To formulate our main result in this section, let P(m) be a positive, homogeneous polynomial of degree d with integer coefficients, and for λ ∈ N, define the hypersurface

$$\displaystyle{S_{\lambda } =\{ x \in \mathbf{R}^{n};\ P(x) =\lambda \}}$$

We will write S for \(S_{1}\), the unit level surface. Recall that the polynomial is called non-singular if for all \(z \in \mathbf{C}^{n}\setminus \{0\}\)

$$\displaystyle{ \nabla P(z) = (\partial _{1}P(z),\ldots,\partial _{n}P(z))\neq 0 }$$
(8.66)

Our main result in this section is the following upper bound on the discrepancy of the set of solutions \(\,Z'_{P,\lambda } =\{\lambda ^{-1/d}m;\ P(m) =\lambda \}\,\) with respect to the family of caps \(C_{a,\xi }\) corresponding to a given diophantine direction ξ, defined in (8.2). Similar but somewhat weaker results have been obtained in [13].

Theorem 12.

Let \(n > d(d-1)2^{d+1}\), and let P(m) be a positive, homogeneous non-singular polynomial of degree d with integer coefficients. If \(\xi \in S^{n-1}\) is diophantine, then we have

$$\displaystyle{ \vert D_{P}(\xi,\lambda )\vert \leq C_{\xi,\varepsilon }\,\lambda ^{(\frac{n} {d} -1)(1-\gamma _{d})}, }$$
(8.67)

with \(\gamma _{d} = \frac{1} {(d-1)2^{d+1}}\) .

To see why this upper bound is non-trivial, note that as P is positive, we have that \(P(x) \approx \vert x\vert ^{d}\), thus on average for L ≤ λ < 2L, the surface \(S_{\lambda }\) contains \(\approx \lambda ^{n/d-1}\) lattice points. Indeed there are \(\approx L^{n/d}\) lattice points m in the region L ≤ P(m) < 2L, and they lie on L hypersurfaces. As we have mentioned, because of congruence obstructions, one cannot have that \(\,\vert \mathbf{Z}^{n} \cap S_{\lambda }\vert \approx \lambda ^{n/d\,-1}\,\) for all large λ, but it can be shown that this holds for all λ ∈ Λ, for an infinite arithmetic progression Λ ⊆ N. Such a set Λ will be called a set of regular values. Thus one has

Corollary 13.

Let \(n > d(d-1)2^{d+1}\), P(m) be a positive, homogeneous non-singular polynomial of degree d with integer coefficients, and let Λ be a set of regular values for P. If \(\xi \in S^{n-1}\) is diophantine, then we have

$$\displaystyle{ \vert D_{P}(\xi,\lambda )\vert \leq C_{\xi,\varepsilon }\,N_{\lambda }^{1-\gamma _{d} }, }$$
(8.68)

for each λ ∈Λ, with \(\gamma _{d} = \frac{1} {(d-1)2^{d+1}}\) , where N λ denotes the number of lattice points on the surface S λ .

3.1 The Fourier Transform of the Set of Lattice Points on Hypersurfaces

We will now generalize the asymptotic formula (8.9) describing the structure of the Fourier transform of lattice points on spheres, using the Birch-Davenport [4, 5, 16] version of the Hardy-Littlewood method of exponential sums. This method was developed to count solutions of (systems of) diophantine equations, when the number of variables is large enough with respect to the degrees of the polynomials, and it is one of the most far-reaching applications of analytic tools in the area of diophantine equations. In spite of this there are very few accessible descriptions of this method, so perhaps it is of interest to discuss it in detail in the case of a single non-singular homogeneous polynomial.

3.1.1 Minor Arcs Estimates

To start, let ϕ be a smooth cut-off function which is constant 1 on the unit level surface S = {P = 1}, and let \(N =\lambda ^{1/d}\). Then

$$\displaystyle{ \hat{\omega }_{\lambda }(\xi ) =\sum _{P(m)=\lambda }e^{2\pi i\,m\cdot \xi }\phi (m/N) =\int _{ 0}^{1}S(\alpha,\xi )e^{-2\pi i\alpha \lambda }\,d\alpha, }$$
(8.69)

where

$$\displaystyle{ S(\alpha,\xi ) =\sum _{m\in \mathbf{Z}^{n}}e^{2\pi i(\alpha P(m)+m\cdot \xi )}\phi (m/N) }$$
(8.70)

As is usual in the circle method, we will now define a family of small intervals, which we will call major arcs, on which the exponential sum S(α, ξ) is concentrated. Let 0 < θ ≤ 1 be a parameter, and for a given pair of natural numbers a, q such that (a, q) = 1, define the corresponding major arc centered at a∕q by

$$\displaystyle{L_{a,q}(\theta ) =\{\alpha:\ 2\vert \alpha - a/q\vert < q^{-1}N^{-d+(d-1)\theta }\},}$$

moreover let

$$\displaystyle{L(\theta ) =\bigcup _{q\leq N^{(d-1)\theta },\,(a,q)=1}L_{a,q}(\theta ).}$$

If \(\alpha \notin L(\theta )\), then we say α is in a minor arc. The following properties of the major arcs are immediate from their definition.

Proposition 14.

  (i) If \(\theta _{1} <\theta _{2}\) then \(L(\theta _{1}) \subseteq L(\theta _{2})\).

  (ii) If \(\ \theta < \frac{d} {2(d-1)}\ \) then the intervals \(L_{a,q}(\theta )\) are disjoint for different values of a and q.

  (iii) \(\vert L(\theta )\vert \leq N^{-d+2(d-1)\theta }\).
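These properties can be checked by listing the major arcs explicitly for small parameters. The following is a minimal sketch, with illustrative function names, restricted to centres a∕q with 1 ≤ a ≤ q; it compares the total length with the direct count \(\sum _{q}\varphi (q)\,q^{-1}N^{-d+(d-1)\theta } \leq N^{-d+2(d-1)\theta }\).

```python
from math import gcd


def major_arcs(N, d, theta):
    """Intervals L_{a,q}(theta) as (left, right) pairs, 1 <= a <= q, (a, q) = 1."""
    q_max = int(N ** ((d - 1) * theta))
    width = N ** (-d + (d - 1) * theta)  # equals q times the length of L_{a,q}
    arcs = []
    for q in range(1, q_max + 1):
        half = width / (2 * q)
        for a in range(1, q + 1):
            if gcd(a, q) == 1:
                centre = a / q
                arcs.append((centre - half, centre + half))
    return arcs


def pairwise_disjoint(arcs):
    """True if no two of the intervals overlap."""
    ordered = sorted(arcs)
    return all(r1 <= l2 for (_, r1), (l2, _) in zip(ordered, ordered[1:]))


def total_length(arcs):
    """Sum of the lengths of the intervals."""
    return sum(right - left for left, right in arcs)


if __name__ == "__main__":
    N, d, theta = 10, 4, 0.3
    arcs = major_arcs(N, d, theta)
    print(len(arcs), pairwise_disjoint(arcs))
    print(total_length(arcs), N ** (-d + 2 * (d - 1) * theta))
```

The disjointness reflects the fact that distinct fractions with denominators at most \(N^{(d-1)\theta }\) are separated by at least \(N^{-2(d-1)\theta }\), which exceeds the arc lengths when θ < d∕2(d − 1).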

We will now derive standard Weyl-type estimates , following [4], for the exponential sum S(α, ξ), when α is in a minor arc. It will be useful to introduce the notations

$$\displaystyle{D_{h}P(m) = P(m) - P(m + h),\ \ \ \ \varDelta _{h}\phi (m) =\phi (m)\bar{\phi }(m + h),}$$

and inductively

$$\displaystyle{D_{h^{1},\ldots,h^{k}}P = D_{h^{1}}(D_{h^{2},\ldots,h^{k}}P),\ \ \ \ \varDelta _{h^{1},\ldots,h^{k}}\phi =\varDelta _{h^{1}}(\varDelta _{h^{2},\ldots,h^{k}}\phi ).}$$

Note that the above expressions are independent of the order of the vectors \(h^{1},\ldots,h^{k}\). We will also use repeatedly the expression

$$\displaystyle{\vert \sum _{m}\phi (m)\vert ^{2} =\sum _{ m,h}\phi (m)\bar{\phi }(m + h) =\sum _{m,h}\varDelta _{h}\phi (m).}$$

Writing \(\phi _{N}(m) =\phi (m/N)\), and taking averages, we have

$$\displaystyle\begin{array}{rcl} \vert N^{-n}S(\alpha,\xi )\vert ^{2}& =& N^{-2n}\sum _{ h^{1},m}e^{2\pi i\,(\alpha D_{h^{1}}P(m)-h^{1}\cdot \xi )}\varDelta _{ h^{1}}\phi _{N}(m) {}\\ & \leq & N^{-n}\sum _{ h^{1}}\vert N^{-n}\sum _{ m}e^{2\pi i\,\alpha D_{h^{1}}P(m)}\varDelta _{ h^{1}}\phi _{N}(m)\vert {}\\ \end{array}$$

Note that the summation is restricted to \(\vert h^{1}\vert \ll N\) and |m| ≪ N. Applying the Cauchy-Schwarz inequality d − 2 times, one has

$$\displaystyle{ \vert N^{-n}S(\alpha,\xi )\vert ^{2^{d-1} } \ll \, N^{-n(d-1)}\sum _{ h^{1},\ldots,h^{d-1}}N^{-n}\vert \sum _{ m}e^{2\pi i\alpha D_{h^{1},\ldots,h^{d-1}}P(m)}\varDelta _{ h^{1},\ldots,h^{d-1}}\phi _{N}(m)\vert }$$
(8.71)

Note that the implicit constant in (8.71) depends only on the dimension n and the degree d, and the summation again is restricted to \(\vert h^{i}\vert \ll N\) and |m| ≪ N. The point is that after taking d − 1 “derivatives”, the polynomial \(D_{h^{1},\ldots,h^{d-1}}P\) becomes linear, i.e. it is of the form

$$\displaystyle{ D_{h^{1},\ldots,h^{d-1}}P(m) =\sum _{ j=1}^{n}m_{ j}\,\varPhi _{j}(h^{1},\ldots,h^{d-1}), }$$
(8.72)

where the coefficients \(\varPhi _{j}: \mathbf{Z}^{n(d-1)} \rightarrow \mathbf{Z}\) are multi-linear forms. In fact writing the homogeneous polynomial P as

$$\displaystyle{P(m) =\sum _{1\leq j_{1},\ldots,j_{d}\leq n}\ a_{j_{1},\ldots,j_{d}}\ m_{j_{1}}\ldots m_{j_{d}},}$$

so that the coefficients \(a_{j_{1},\ldots,j_{d}}\) are independent of the order of the indices \(j_{1},\ldots,j_{d}\), it is not hard to see that

$$\displaystyle{ \varPhi _{j}(h^{1},\ldots,h^{d-1}) = d!\sum _{ j_{1},\ldots,j_{d-1}}a_{j_{1},\ldots,j_{d-1},j}\ \ h_{j_{1}}^{1}\ldots h_{ j_{d-1}}^{d-1}. }$$
(8.73)
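The linearization after d − 1 differencing steps can be observed on a concrete example. The following is a minimal sketch for the sample quartic \(P(m) = m_{1}^{4} + m_{2}^{4}\) (so d = 4, n = 2), which is an assumption made only for illustration; the helper names are hypothetical. With the convention \(D_{h}P(m) = P(m) - P(m + h)\), the result is affine in m, and its linear coefficients agree with d! times the multilinear form up to the sign \((-1)^{d-1}\), while the term independent of m does not affect the absolute value of the exponential sum.

```python
def diff(P, h):
    """Return the function D_h P, where D_h P(m) = P(m) - P(m + h)."""
    def DP(m):
        shifted = tuple(mi + hi for mi, hi in zip(m, h))
        return P(m) - P(shifted)
    return DP


def iterated_diff(P, hs):
    """Return D_{h^1,...,h^k} P for the list hs = [h^1, ..., h^k]."""
    for h in reversed(hs):
        P = diff(P, h)
    return P


def quartic(m):
    """Sample positive homogeneous polynomial of degree 4 in two variables."""
    return m[0] ** 4 + m[1] ** 4


hs = [(1, 2), (3, -1), (2, 5)]
f = iterated_diff(quartic, hs)

# Coefficients of m_1 and m_2 in the affine function f.
slope_1 = f((1, 0)) - f((0, 0))
slope_2 = f((0, 1)) - f((0, 0))

if __name__ == "__main__":
    # Expected: -4! * 1*3*2 = -144 and -4! * 2*(-1)*5 = 240.
    print(slope_1, slope_2)
```

Since the arithmetic is exact integer arithmetic, the slopes can be compared directly with the values predicted by (8.73).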

For simplicity let us introduce the notations

$$\displaystyle\begin{array}{rcl} \underline{h}&:=& (h^{1},\ldots,h^{d-1}), {}\\ \varPsi _{N,\underline{h}}(x)&:=& \varDelta _{h^{1},\ldots,h^{d-1}}\phi _{N}(x), {}\\ \varPhi (\underline{h})&:=& (\varPhi _{1}(\underline{h}),\ldots,\varPhi _{n}(\underline{h})). {}\\ \end{array}$$

Now, by (8.72) the inner sum in (8.71) is the Fourier transform of the function \(\varPsi _{N,\underline{h}}\) at \(\xi =\alpha \varPhi (\underline{h})\). To estimate it, note that

$$\displaystyle{\left \vert \left ( \frac{d} {\mathit{dx}}\right )^{k}\varPsi _{ N,\underline{h}}(x)\right \vert \ll N^{-k},\ \ \ \ \mathit{for\ all}\ \ k \in \mathbf{N},}$$

(where the implicit constant depends only on n, d and k), and that the function \(\varPsi _{N,\underline{h}}\) is supported on | x | ≪ N. Thus integrating by parts k-times we have that

$$\displaystyle{\vert \tilde{\varPsi }_{N,\underline{h}}(\xi )\vert \ll N^{n}\ (1 + N\vert \xi \vert )^{-k},}$$

where \(\tilde{\varPsi }_{N,\underline{h}}\) denotes the Fourier transform of \(\varPsi _{N,\underline{h}}(x)\) considered as function on R n. Thus by Poisson summation

$$\displaystyle{\vert \hat{\varPsi }_{N,\underline{h}}(\xi )\vert \leq \sum _{l\in \mathbf{Z}^{n}}\vert \tilde{\varPsi }_{N,\underline{h}}(\xi -l)\vert \ll N^{n}\,(1 + N\|\xi \|)^{-k}.}$$

Here we used the notation \(\|\xi \|=\max _{j}\|\xi _{j}\|\), for a point \(\xi = (\xi _{1},\ldots,\xi _{n})\), where \(\|\xi _{j}\|\) denotes the distance of the j-th coordinate \(\xi _{j}\) from the nearest integer. Plugging this into inequality (8.71) we have

$$\displaystyle{ \vert N^{-n}S(\alpha,\xi )\vert ^{2^{d-1} }\, \ll \, N^{-n(d-1)}\sum _{ \underline{h}\in \mathbf{Z}^{n(d-1)},\ \vert \underline{h}\vert \ll N}(1 + N\,\|\alpha \varPhi (\underline{h})\|)^{-k}, }$$
(8.74)

for all k ∈ N. We will now fix k = n + 1, and use the multi-linearity of the forms \(\varPhi _{j}(h^{1},\ldots,h^{d-1})\), to estimate the right side of inequality (8.74) by the number of \(\,(h^{1},\ldots,h^{d-1}) \in \mathbf{Z}^{n(d-1)}\), \(\vert h^{j}\vert \ll N\), such that \(\,\|\alpha \varPhi (\underline{h})\| \leq N^{-1}\,\). More generally, for given parameters τ, η, let us introduce the quantities

$$\displaystyle{ \mathcal{R}(N^{\tau },N^{-\eta },\alpha ) = \vert \{\underline{h} \in \mathbf{Z}^{n(d-1)};\ \vert \underline{h}\vert \ll N^{\tau },\ \|\alpha \varPhi _{ j}(\underline{h})\| \leq N^{-\eta },\ 1 \leq j \leq n\}\vert. }$$
(8.75)

Lemma 15.

$$\displaystyle{ (N^{-n}\vert S(\alpha,\xi )\vert )^{2^{d-1} } \ll N^{-n(d-1)}\mathcal{R}(N,N^{-1},\alpha ). }$$
(8.76)

Proof.

Consider the points \(\,\{\alpha \,\varPhi (\underline{h})\}\, \in [-\frac{1} {2}, \frac{1} {2}]^{n}\,\), where { } denotes the fractional part, and divide the cube \([-\frac{1} {2}, \frac{1} {2}]^{n}\,\) into \(N^{n}\) cubes \(B_{\underline{s}}\) of size \(\frac{1} {N}\). Now if \(B_{\underline{0}} = [- \frac{1} {2N}, \frac{1} {2N}]^{n}\), then for each fixed \(\underline{h}' = (h^{1},\ldots,h^{d-2})\), the cube \(B_{\underline{0}}\) will contain at least as many points of the form \(\,\{\alpha \,\varPhi (\underline{h}',h^{d-1})\}\,\), as any of the other cubes \(B_{\underline{s}}\). Indeed, this follows immediately from the linearity of the forms \(\varPhi _{j}\) in the variable \(h^{d-1}\). Since the centers of the cubes \(B_{\underline{s}}\) are \(\,N^{-1}\underline{s} = (\frac{s_{1}} {N},\ldots, \frac{s_{n}} {N} )\,\) with \(-N/2 \leq s_{j} < N/2\), the right side of (8.74) is bounded by

$$\displaystyle{N^{-n(d-1)}\sum _{ -\frac{N} {2} \leq s_{1},\ldots,s_{n}<\frac{N} {2} }\ (1 + \vert \underline{s}\vert )^{-n-1}\ \mathcal{R}(N,N^{-1},\alpha )\, \ll \, N^{-n(d-1)}\mathcal{R}(N,N^{-1},\alpha ).}$$

 □ 

Next, we will use the fact that the quantities \(\,\mathcal{R}(N^{\tau },N^{-\eta },\alpha )\,\) can be compared to each other for different values of the parameters τ, η. In fact, we will need the following

Lemma 16.

Let \(\ 0 <\theta < \frac{1} {d-1}\,\) . Then we have

$$\displaystyle{ N^{-n(d-1)}\,\mathcal{R}\,(N,N^{-1},\alpha )\, \ll \, N^{-n(d-1)\theta }\,\mathcal{R}\,(N^{\theta },N^{-d+(d-1)\theta },\alpha ). }$$
(8.77)

This is based on the following result

Lemma 17 (Davenport [5]).

Let \(L_{1}(u),\ldots,L_{n}(u)\) be n real linear forms in n variables \(u_{1},\ldots,u_{n}\), say

$$\displaystyle{L_{j}(u) =\sum _{k}\lambda _{jk}\,u_{k},}$$

which are symmetric in the sense that \(\lambda _{jk} =\lambda _{kj}\). Let \(1 < K_{1} < K_{2}\) and for 0 < r ≤ 1 let U(r) denote the number of integer solutions of the system

$$\displaystyle{ \vert u_{k}\vert < rK_{1},\ \ \ \ \ \ \|L_{j}(u)\| < rK_{2}^{-1}. }$$
(8.78)

Then for all 0 < r ≤ 1 we have

$$\displaystyle{ U(1) \ll r^{-n}U(r). }$$
(8.79)

This is Lemma 3.3 in [5] and is an application of the geometry of numbers. Let us remark here only that the solutions of (8.78) can be viewed as lattice points \(\,(u,v) \in \mathbf{Z}^{2n}\,\) which are inside the convex symmetric body rB, where

$$\displaystyle{B =\{ (u,v) \in \mathbf{R}^{2n};\ \vert u_{ k}\vert < K_{1},\ \vert v_{j} - L_{j}(u)\vert < K_{2}^{-1},\ 1 \leq k,j \leq n\}.}$$

Proof (of Lemma 16).

We will apply Lemma 17 in each of the variables \(h^{1},\ldots,h^{d-1}\). Fix \(\,\underline{h}' = (h^{2},\ldots,h^{d-1})\,\), write \(u = h^{1}\) and \(\,L_{j}(u) =\alpha \,\varPhi _{j}(u,\underline{h}')\,\). From (8.73) it is clear that the linear forms \(L_{j}(u)\) are symmetric, thus we can apply Lemma 17 with \(K_{1} = K_{2} = N\) and \(r = N^{\theta -1}\) for each \(\underline{h}'\). Summing over \(\underline{h}'\) gives

$$\displaystyle{\mathcal{R}(N,N^{-1},\alpha )\,\ll \,N^{n(1-\theta )}\,\vert \{\underline{h} \in \mathbf{Z}^{n(d-1)};\ \vert h^{1}\vert \ll N^{\theta },\ \vert \underline{h}'\vert \ll N,\ \|\alpha \,\varPhi (\underline{h})\| < N^{\theta -2}\}\vert.}$$

Next, set \(u = h^{2}\), fix the remaining variables and apply Lemma 17 with \(K_{1} = N,\ K_{2} = N^{2-\theta }\) and \(r = N^{\theta -1}\). Continuing this procedure for all the variables \(h^{1},\ldots,h^{d-1}\), we eventually have

$$\displaystyle{\mathcal{R}(N,N^{-1},\alpha )\, \ll \, N^{n(d-1)(1-\theta )}\mathcal{R}\,(N^{\theta },N^{-d+(d-1)\theta },\alpha ),}$$

which is the same as (8.77). □ 
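The bookkeeping behind this iteration may be easier to follow when written out. The following is only a sketch; the index i, which counts how many of the variables \(h^{1},\ldots,h^{i}\) have already been shrunk, is our notation and does not appear in the argument above. At the i-th step Lemma 17 is applied with

$$\displaystyle{K_{1} = N,\quad K_{2} = N^{1+(i-1)(1-\theta )},\quad r = N^{\theta -1},\qquad \mbox{ so that}\qquad rK_{1} = N^{\theta },\quad rK_{2}^{-1} = N^{-1-i(1-\theta )},}$$

and each step costs a factor \(r^{-n} = N^{n(1-\theta )}\). For i = 1 and i = 2 these are the choices made in the proof. After i = d − 1 steps all variables satisfy \(\vert h^{i}\vert \ll N^{\theta }\), and

$$\displaystyle{N^{-1-(d-1)(1-\theta )} = N^{-d+(d-1)\theta },\qquad \left (N^{n(1-\theta )}\right )^{d-1} = N^{n(d-1)(1-\theta )},}$$

which are the exponents appearing in (8.77).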

Note that if there is a point \(\,\underline{h} \in \mathbf{Z}^{n(d-1)}\,\), \(\,\vert \underline{h}\vert \ll N^{\theta }\,\), such that \(\,\|\alpha \varPhi _{j}(\underline{h})\| < N^{-d+(d-1)\theta }\,\) and \(\,\varPhi _{j}(\underline{h})\neq 0\), then setting \(\,q = \vert \varPhi _{j}(\underline{h})\vert \,\), we have that

$$\displaystyle{\left \vert \alpha -\frac{a} {q}\right \vert < \frac{1} {q}\,N^{-d+(d-1)\theta }}$$

for some \(a \in \mathbf{Z}\) such that (a, q) = 1. Thus α ∈ L(θ) by the definition of the major arcs; hence if α is in a minor arc, we have

$$\displaystyle{ \mathcal{R}\,(N^{\theta },N^{-d+(d-1)\theta },\alpha ) = \vert \{\underline{h} \in \mathbf{Z}^{n(d-1)};\ \vert \underline{h}\vert \ll N^{\theta },\ \varPhi _{1}(\underline{h}) =\ldots =\varPhi _{n}(\underline{h}) = 0\}\vert, }$$
(8.80)

which is the number of lattice points \(\,\underline{h} \in \mathbf{Z}^{n(d-1)}\,\) of size \(\vert \underline{h}\vert \ll N^{\theta }\,\) on the variety

$$\displaystyle{S_{\varPhi }:=\{\underline{ z} \in \mathbf{C}^{n(d-1)};\ \varPhi _{ 1}(\underline{z}) =\ldots =\varPhi _{n}(\underline{z}) = 0\}.}$$

By (8.73) it is easy to see that \(\varPhi _{j}(h,\ldots,h) = (d - 1)!\,\partial _{j}P(h)\), thus if we set

$$\displaystyle{\boldsymbol{\bigtriangleup }:=\{ (h,\ldots,h);\ h \in \mathbf{C}^{n}\} \subseteq \mathbf{C}^{n(d-1)},}$$

then

$$\displaystyle{S_{\varPhi } \cap \boldsymbol{\bigtriangleup } =\{ h \in \mathbf{C}^{n};\ \partial _{ 1}P(h) =\ldots = \partial _{n}P(h) = 0\} =\{ 0\},}$$

by our assumption that the polynomial P is non-singular. Then by basic facts from algebraic geometry it follows that

$$\displaystyle{D:=\dim \, S_{\varPhi } \leq n(d - 1) - n.}$$

The dimension of the algebraic set \(S_{\varPhi }\) is defined algebraically; however, it is well known (see [10], Ch. 7) that if it has dimension D, then every bounded part of it can be covered by \(O(\rho ^{-D})\) balls of diameter ρ for any 0 < ρ < 1. Combining this with the fact that \(S_{\varPhi }\) is homogeneous, we have

$$\displaystyle{ \vert \{\underline{h} \in \mathbf{Z}^{n(d-1)} \cap S_{\varPhi };\ \vert \underline{h}\vert \ll N^{\theta }\}\vert \ll \vert \{\underline{h}' \in (N^{-\theta }\mathbf{Z})^{n(d-1)} \cap S_{\varPhi };\ \vert \underline{h}'\vert \ll 1\}\vert \ll N^{D\,\theta }. }$$
(8.81)

Then by (8.76), (8.77) and (8.81), we have the following estimate on the minor arcs.

Lemma 18.

Let 0 < θ < 1. If \(\alpha \notin L(\theta )\) , then we have uniformly in ξ

$$\displaystyle{ \vert S(\alpha,\xi )\vert \,\ll \, N^{\ n-n\theta \,2^{-(d-1)} }. }$$
(8.82)

We will also need a variant of the above estimate when the cut-off function ϕ is replaced by the indicator function χ of a cube of side length ≈ 1 centered near the origin. The estimate below is proved in [4]; however, it follows easily from (8.82). Indeed, choose a cut-off function ϕ such that χϕ = χ, and let \(P_{1}(m) = P(m) + m\cdot \xi\). Then by Plancherel’s identity

$$\displaystyle\begin{array}{rcl} \sum _{m\in \mathbf{Z}^{n}}e^{2\pi i\alpha P_{1}(m)}\,\phi (m/N)\chi (m/N)\,& =& \\ =\int _{\mathbf{T}^{n}}\left (\sum _{m\in \mathbf{Z}^{n}}e^{2\pi i(\alpha P_{1}(m)-m\cdot \xi )}\phi (m/N)\right )\,(N^{n}\hat{\chi }(N\xi ))\,d\xi \,& \ll & \,N^{n- \frac{n\theta } {2^{d-1}} }\,(\log \,N)^{n}.{}\end{array}$$
(8.83)

Here \(\mathbf{T}^{n}\) is the flat torus, and the above estimate follows using (8.82) for the first term of the integral uniformly in ξ, and the fact that \(\|N^{n}\hat{\chi }(N\xi )\|_{L^{1}(\mathbf{T}^{n})}\, \ll \, (\log \,N)^{n}\).

Corollary 19.

Let 1 ≤ a < q be natural numbers such that (a,q) = 1. Then for the exponential sum

$$\displaystyle{G(a,q,l) = q^{-n}\sum _{ s\in (\mathbf{Z}/q\mathbf{Z})^{n}}e^{2\pi i\,\frac{aP(s)-l\cdot s} {q} },}$$

one has

$$\displaystyle{ \vert G(a,q,l)\vert \,\ll \, q^{- \frac{n} {(d-1)2^{d}} }(\log \,q)^{n}. }$$
(8.84)

Proof.

Set N = q, α = a∕q, ξ = l∕q, θ = 1∕(2(d − 1)) and notice that \(\alpha \notin L(\theta )\). Indeed, for \(q_{1} \leq q^{(d-1)\theta }\) we have

$$\displaystyle{\left \vert \frac{a} {q} -\frac{a_{1}} {q_{1}} \right \vert \geq \frac{1} {q_{1}q} \geq \frac{1} {q_{1}}q^{-d+(d-1)\theta }.}$$

Then (8.84) follows from (8.83), choosing χ to be the indicator function of \([0,1)^{n}\), and identifying \((\mathbf{Z}/q\mathbf{Z})^{n}\) with \([0,q)^{n} \cap \mathbf{Z}^{n}\). □ 
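For small moduli the sums G(a, q, l) can be evaluated directly, which gives a quick way to get a feeling for their size. The sketch below is only an illustration: the diagonal quartic form \(P(m) = m_{1}^{4} + \ldots + m_{n}^{4}\) and the names `P` and `G` are our assumptions, not part of the text, and a brute-force computation of this kind says nothing about the uniform bound (8.84).

```python
import cmath
from itertools import product
from math import gcd


def P(m):
    # Illustrative choice (an assumption): the diagonal quartic form.
    return sum(x**4 for x in m)


def G(a, q, l):
    """Normalised sum q^{-n} * sum_{s mod q} exp(2*pi*i*(a*P(s) - l.s)/q)."""
    n = len(l)
    total = 0j
    for s in product(range(q), repeat=n):
        # Reduce the phase mod q in integer arithmetic to avoid rounding drift.
        phase = (a * P(s) - sum(li * si for li, si in zip(l, s))) % q
        total += cmath.exp(2j * cmath.pi * phase / q)
    return total / q**n


if __name__ == "__main__":
    n = 3
    for q in (3, 5, 7, 11):
        worst = max(abs(G(a, q, (0,) * n)) for a in range(1, q) if gcd(a, q) == 1)
        print(q, round(worst, 4))
```

Since this particular P is diagonal, the n-dimensional sum factors into one-dimensional sums, which provides a simple consistency check of the implementation.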

Corollary 20.

If \(\vert \alpha \vert < N^{-d/2}\) then one has

$$\displaystyle{\vert S(\alpha,\xi )\vert \,\ll \, N^{n}\,(N^{d}\vert \alpha \vert )^{- \frac{n} {(d-1)2^{d-1}} }.}$$

Proof.

Choose θ such that \(\vert \alpha \vert = N^{-d+(d-1)\theta }\), that is \(\,(N^{d}\vert \alpha \vert )^{ \frac{1} {d-1} } = N^{\theta }\,\). The major arcs \(L_{a,q}(\theta )\) are disjoint since (d − 1)θ < d∕2; moreover α is an endpoint of the interval \(L_{0,1}(\theta )\), hence \(\alpha \notin L_{a,q}(\theta )\). By (8.82) this gives

$$\displaystyle{\vert S(\alpha,\xi )\vert \,\ll \, N^{n-n2^{-(d-1)}\theta } = N^{n}\,(N^{d}\vert \alpha \vert )^{- \frac{n} {(d-1)2^{(d-1)}} }.}$$

 □ 

3.1.2 Approximations on the Major Arcs

We will now derive an asymptotic expansion for the Fourier transform of the lattice points on the hypersurface \(\,S_{\lambda } =\{ P =\lambda \}\,\) along the same lines as in Sect. 8.2. Throughout this section we will assume that n is sufficiently large, in particular that \(\,n > n_{d}:= d(d - 1)2^{d+1}\,\), set \(\,\gamma _{d}:=\, \frac{1} {(d-1)2^{d+1}} \,\), and for simplicity of notation introduce the quantity \(D:= (d - 1)2^{d-1}\).

Going back to the integral defined in (8.69), for a given θ, write

$$\displaystyle{ \hat{\omega }_{\lambda }(\xi ) =\int _{\alpha \in L(\theta )}S(\alpha,\xi )\,d\alpha +\int _{\alpha \notin L(\theta )}S(\alpha,\xi )\,d\alpha \, =\, A_{\lambda }(\xi ) + E_{\lambda }^{1}(\xi ). }$$
(8.85)

It follows from our assumptions on n, that there is a \(\theta < \frac{1} {2(d-1)}\), such that

$$\displaystyle{n\theta 2^{-(d-1)} > d + n\gamma _{ d}}$$

thus (8.82) implies that \(\,S(\alpha,\xi ) \ll N^{n-d-n\gamma _{d}}\,\) for \(\alpha \notin L(\theta )\,\). Thus we have the estimate, uniformly in ξ

$$\displaystyle{ \vert E_{\lambda }^{1}(\xi )\vert \,\ll \, N^{n-d-n\gamma _{d} }. }$$
(8.86)
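To see where the threshold on n comes from, one may let θ approach its upper limit \(\frac{1} {2(d-1)}\). The following short check uses only the definition of \(\gamma _{d}\):

$$\displaystyle{\lim _{\theta \rightarrow \frac{1} {2(d-1)} }n\theta \,2^{-(d-1)} = \frac{n} {(d-1)2^{d}} = 2n\gamma _{d},\qquad 2n\gamma _{d} > d + n\gamma _{d}\ \Longleftrightarrow \ n\gamma _{d} > d\ \Longleftrightarrow \ n > d(d - 1)2^{d+1}.}$$

Since the last inequality is strict, it persists for all θ sufficiently close to, but smaller than, \(\frac{1} {2(d-1)}\).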

We will fix a \(\,\theta < \frac{1} {2(d-1)}\,\) so that (8.86) holds, and will do a number of transformations on the main term \(A_{\lambda }(\xi )\) which are similar to the ones we have used in the special case of the spheres. For a given \(\,\alpha \in L_{a,q}(\theta )\,\) for some (a, q) = 1, \(q \leq N^{(d-1)\theta }\), write α = a∕q + β, with \(\vert \beta \vert \leq N^{-d+(d-1)\theta }\), and \(m = qm_{1} + s\) with \(m_{1} \in \mathbf{Z}^{n}\), \(s \in (\mathbf{Z}/q\mathbf{Z})^{n}\). Applying Poisson summation as in (8.17), we have

$$\displaystyle\begin{array}{rcl} S(a/q+\beta,\xi )& =& \sum _{m\in \mathbf{Z}^{n}}e^{2\pi i\frac{a} {q}P(m)}e^{2\pi im\cdot \xi }H_{\beta,N}(m) \\ & =& \sum _{l\in \mathbf{Z}^{n}}G(a,q,l)\,\tilde{H}_{\beta,N}(l/q-\xi ),{}\end{array}$$
(8.87)

where \(\tilde{H}_{\beta,N}\,\) is the Fourier transform of the function \(\,H_{\beta,N}(x) = e^{2\pi i\beta P(x)}\phi (x/N)\,\), and G(a, q, l) is the exponential sum estimated in (8.84). Thus we have

$$\displaystyle{ A_{\lambda }(\xi ) =\sum _{q\leq N^{(d-1)\theta }}\sum _{(a,q)=1}\,\sum _{l\in \mathbf{Z}^{n}}G(a,q,l)\ J_{\lambda }(\xi -l/q), }$$
(8.88)

where

$$\displaystyle{J_{\lambda }(\xi -l/q) =\int _{\vert \beta \vert \leq N^{-d+(d-1)\theta }}\tilde{H}(l/q-\xi,\beta )e^{-2\pi i\lambda \beta }\,d\beta }$$

We shall approximate the functions \(A_{\lambda }(\xi )\) with functions \(B_{\lambda }(\xi )\) where the cut-off function \(\psi (q\xi - l)\) has been inserted in (8.88), that is, let

$$\displaystyle{B_{\lambda }(\xi ) =\sum _{a,q}\sum _{l\in \mathbf{Z}^{n}}G(a,q,l)\,\psi (q\xi - l)\,J_{\lambda }(\xi -l/q).}$$

Next, we extend the integration in β and define

$$\displaystyle{M_{\lambda }(\xi ) =\sum _{a,q}\sum _{l\in \mathbf{Z}^{n}}G(a,q,l)\,\psi (q\xi - l)\,I_{\lambda }(\xi -l/q)}$$

with

$$\displaystyle{ I_{\lambda }(\xi -l/q) =\int _{\mathbf{R}}\tilde{H}(\xi -l/q,\beta )e^{-2\pi i\lambda \beta }\,d\beta. }$$
(8.89)

A crucial point is to identify the integrals \(I_{\lambda }(\eta )\); in fact we will show that

$$\displaystyle{I_{\lambda }(\eta ) =\tilde{\sigma } _{\lambda }(\eta ).}$$

First we estimate the errors obtained.

Lemma 21.

If \(\,0 <\theta < \frac{1} {2(d-1)}\,\) then one has uniformly in ξ

$$\displaystyle{\vert A_{\lambda }(\xi ) - B_{\lambda }(\xi )\vert \,\ll \, N^{n-d-n\gamma _{d} }.}$$

Proof.

If we set

$$\displaystyle{\mu _{\beta }(\xi ) =\sum _{l}G(a,q,l)\,(1 -\psi (q\xi - l))\,\tilde{H}_{N,\beta }(\xi -l/q),}$$

then it is enough to show that \(\,\vert \mu _{\beta }(\xi )\vert \ll N^{n-d-n\gamma _{d}}\,\) uniformly for \(\,\vert \beta \vert \leq N^{-d+(d-1)\theta }\,\) and \(\xi \in \mathbf{T}^{n}\). Let \(\eta =\xi -l/q\), and estimate \(\tilde{H}_{N,\beta }(\eta )\) by partial integration:

$$\displaystyle\begin{array}{rcl} \vert \tilde{H}_{N,\beta }(\eta )\vert & \leq & N^{n}\left \vert \int _{\mathbf{ R}^{n}}e^{2\pi i\,N^{d}\beta \,P(x) }\phi (x)\ e^{2\pi iNx\cdot \eta }\,\mathit{dx}\right \vert {}\\ &\ll & N^{n}\vert N\eta \vert ^{-K}\,\left \vert \int _{\mathbf{ R}^{n}}(d/dx)^{K}\,\left (e^{2\pi i\,N^{d}\beta \,P(x) }\phi (x)\right )\ e^{2\pi iNx\cdot \eta }\,\mathit{dx}\right \vert {}\\ &\ll & N^{n}\vert N\eta \vert ^{-K}\,(1 + N^{d}\vert \beta \vert )^{K}. {}\\ \end{array}$$

Now, on the support of \(1 -\psi (q\xi - l)\) we have that

$$\displaystyle{N\vert \eta \vert = N\vert \xi - l/q\vert \gg N^{1-(d-1)\theta },}$$

hence for \(\vert \beta \vert \leq N^{-d+(d-1)\theta }\) and \(\theta < \frac{1} {2(d-1)}\), choosing \(0 <\tau < \frac{1} {2} - (d - 1)\theta\) we have

$$\displaystyle{\vert \mu _{\beta }(\xi )\vert \ll N^{n}(N/q)^{-\tau K}\,\sum _{ l\in \mathbf{Z}^{n}}(1 + \vert q\xi - l\vert )^{-\tau K} \ll N^{n-\tau K(1-(d-1)\theta )}.}$$

The Lemma follows by choosing K sufficiently large. □ 

In order to estimate the error obtained by extending the integration in β, we will need the following

Lemma 22.

For given η, L > 0 let

$$\displaystyle{I(L,\eta ) =\int e^{2\pi iL(P(x)+x\cdot \eta )}\phi (x)\,\mathit{dx}.}$$

Then one has

$$\displaystyle{ I(L,\eta )\, \ll \, (1 + L)^{-\frac{n} {D} }, }$$
(8.90)

with D = (d − 1)2 d−1 .

Proof.

The estimate is obvious for L < 1, so let L ≥ 1. If | η | ≥ C with a large enough constant C, then the gradient of the phase satisfies \(L\,\vert P'(x) +\eta \vert \geq L\) on the support of ϕ, and (8.90) follows by partial integration.

Suppose | η | ≤ C and introduce the parameters θ, N, α such that \(L = N^{(d-1)\theta }\), \(\alpha = N^{-d}L\). Note that if \(\theta < 2^{d-1}/n\), then we have \(\,N > L^{\frac{2n} {D} }\). Changing variables y = Nx yields

$$\displaystyle{I(L,\eta ) = N^{-n}\int e^{2\pi i\alpha \,(P(y)+N^{d-1}y\cdot \eta ) }\phi (y/N)\,\mathit{dy}.}$$

We compare the integral to a corresponding exponential sum

$$\displaystyle{N^{-n}S(\alpha,\eta ) = N^{-n}\sum _{ m\in \mathbf{Z}^{n}}e^{2\pi i\alpha \,(P(m)+N^{d-1}\,m\cdot \eta ) }\,\phi (m/N).}$$

If y = m + z where \(m \in \mathbf{Z}^{n}\) and \(z \in [0,1]^{n}\), then it is easy to see that

$$\displaystyle{\vert e^{2\pi i\alpha \,(P(y)+N^{d-1}y\cdot \eta ) } - e^{2\pi i\alpha \,(P(m)+N^{d-1}m\cdot \eta ) }\vert \,\ll \, N^{-1+(d-1)\theta },}$$

since \(\vert \alpha \vert = N^{-d+(d-1)\theta }\) and | η | ≤ C. Thus

$$\displaystyle{\vert I(L,\eta ) - N^{-n}S(\alpha,\eta )\vert \,\ll \, N^{-1+2(d-1)\theta } \ll N^{-\frac{1} {2} } \leq L^{-\frac{n} {D} }.}$$

Also, by Corollary 20

$$\displaystyle{\vert N^{-n}\,S(\alpha,\eta )\vert \,\ll \,\vert N^{d}\alpha \vert ^{-\frac{n} {D} }\, =\, L^{-\frac{n} {D} }}$$

and (8.90) follows. □ 

We remark that a better uniform estimate can be obtained by using real variable methods, exploiting the fact that \(P(x) \approx \vert x\vert ^{d}\). However, we have chosen to estimate the integral using exponential sums, as this method also works for indefinite forms P. Now it is easy to prove the following.

Lemma 23.

We have, uniformly in ξ

$$\displaystyle{\vert B_{\lambda }(\xi ) - M_{\lambda }(\xi )\vert \,\ll \, N^{n-d-n\gamma _{d} }}$$

Proof.

One has by (8.90)

$$\displaystyle{\int _{\vert \beta \vert \geq N^{-d+(d-1)\theta }}\vert \tilde{H}(\xi -l/q)\vert \,d\beta \, \ll \, N^{n-\frac{n} {D} }\, \ll \, N^{n-d-n\gamma _{d}}.}$$

The factors \(\psi (q\xi - l)\) restrict the sum in l to at most one non-zero term; moreover by (8.84) we have \(\,\vert G(a,q,l)\vert \ll q^{-\frac{n} {D}+\varepsilon }\, \ll \, q^{-3}\,\), say. Thus

$$\displaystyle{\vert B_{\lambda }(\xi ) - M_{\lambda }(\xi )\vert \,\ll \, (\sum _{q\leq N^{(d-1)\theta }}\sum _{(a,q)=1}q^{-3})\,N^{n-d-n\gamma _{d} }\, \ll \, N^{n-d-n\gamma _{d} }.}$$

 □ 

Summarizing, we have the asymptotic formula

$$\displaystyle{\hat{\omega }_{\lambda }(\xi ) = M_{\lambda }(\xi ) + E_{\lambda }(\xi ),}$$

where

$$\displaystyle{M_{\lambda }(\xi ) =\sum _{q\leq N^{(d-1)\theta }}\sum _{(a,q)=1}\sum _{l\in \mathbf{Z}^{n}}G(a,q,l)\,\psi (q\xi - l)\,I_{\lambda }(\xi -l/q),}$$

and

$$\displaystyle{\vert E_{\lambda }(\xi )\vert \,\ll \, N^{n-d-n\gamma _{d} },}$$

uniformly in \(\xi \in \mathbf{T}^{n}\).

3.1.3 The Singular Integral

We will now identify the integrals \(I_{\lambda }(\eta )\) with the Fourier transform of a certain natural measure supported on the surface \(S_{\lambda } =\{ P =\lambda \}\). Note that by the assumption that the polynomial P is non-singular and positive, \(S_{\lambda }\) is a smooth, compact hypersurface in \(\mathbf{R}^{n}\).

There is a unique (n − 1)-form \(d\sigma _{P}(x)\) on \(\mathbf{R}^{n}\setminus \{0\}\) such that

$$\displaystyle{ dP \wedge d\sigma _{P} = dx_{1} \wedge \ldots \wedge dx_{n}, }$$
(8.91)

called the Gelfand-Leray form (see [1, 2], Sec. 7.1). To see this, suppose that, say, \(\partial _{1}P(x)\neq 0\) on some open set U. By the change of coordinates \(y_{1} = P(x)\), \(y_{j} = x_{j}\) for 2 ≤ j ≤ n, Eq. (8.91) takes the form

$$\displaystyle{\mathit{dy}_{1} \wedge d\sigma _{P}(y) = \partial _{1}H(y)\ \mathit{dy}_{1} \wedge \ldots \wedge \mathit{dy}_{n}}$$

where \(x_{1} = H(y)\), \(x_{j} = y_{j}\) is the inverse map. Thus the form \(d\sigma _{P}(y) = \partial _{1}H(y)\,\mathit{dy}_{2} \wedge \ldots \wedge \mathit{dy}_{n}\ \) satisfies (8.91).

We define the measure \(\sigma _{\lambda }\) as the restriction of the (n − 1)-form \(d\sigma _{P}\) to the level surface \(S_{\lambda }\). This measure is absolutely continuous with respect to the Euclidean surface area measure \(dS_{\lambda }\); more precisely, one has

Proposition 24.

$$\displaystyle{ d\sigma _{\lambda }(x) = \frac{dS_{\lambda }(x)} {\vert P^{{\prime}}(x)\vert }, }$$
(8.92)

where dS λ denotes the Euclidean surface area measure on the level surface {P = λ}.

Proof.

Choose local coordinates y as before; in the coordinates y the level surface \(S_{\lambda }\) and the surface area measure \(dS_{\lambda }\) take the form

$$\displaystyle{S_{\lambda } =\{ x_{1} = H(\lambda,y_{2},\ldots,y_{n}),\ x_{j} = y_{j};\ 2 \leq j \leq n\},}$$

and

$$\displaystyle{\mathit{dS}_{\lambda }(y) = (1 +\sum _{ j=2}^{n}\partial _{ j}^{2}H(\lambda,y))^{1/2}\ \mathit{dy}_{ 2} \wedge \ldots \wedge dy_{n}.}$$

Using the identity \(P(H(y),y_{2},\ldots,y_{n}) = y_{1}\), one has

$$\displaystyle{\partial _{1}P(x)\partial _{1}H(y) = 1\,,\ \ \partial _{1}P(x)\partial _{j}H(y) + \partial _{j}P(x) = 0,}$$

This implies that

$$\displaystyle{\partial _{1}H(y) = (1 +\sum _{ j=2}^{n}\partial _{ j}^{2}H(y))^{1/2} \cdot \vert P^{{\prime}}(x)\vert ^{-1},}$$

and (8.92) follows by taking y 1 = λ. □ 
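As a sanity check of (8.92), consider the simplest case \(P(x) = \vert x\vert ^{2}\), so that d = 2; this choice is purely illustrative and is not needed in the sequel. Here \(\vert P'(x)\vert = 2\vert x\vert = 2\lambda ^{1/2}\) on \(S_{\lambda }\), and writing \(\omega _{n-1}\) for the area of the unit sphere in \(\mathbf{R}^{n}\) (our notation), one finds

$$\displaystyle{d\sigma _{\lambda }(x) = \frac{dS_{\lambda }(x)} {2\lambda ^{1/2}},\qquad \sigma _{\lambda }(S_{\lambda }) = \frac{\omega _{n-1}\,\lambda ^{\frac{n-1} {2} }} {2\lambda ^{1/2}} = \frac{\omega _{n-1}} {2} \,\lambda ^{\frac{n} {2} -1}.}$$

The total mass thus scales like \(\lambda ^{n/d\,-1}\), as one expects from the homogeneity of P.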

A crucial observation is that the measure \(d\sigma _{\lambda }\), considered as a distribution on \(\mathbf{R}^{n}\), has a simple oscillatory integral representation.

Lemma 25.

Let P(x) be a non-singular, homogeneous polynomial, and let λ be a real number. Then in the sense of distributions

$$\displaystyle{ \sigma _{\lambda }(x) =\int _{\mathbf{R}}e^{2\pi i\,(P(x)-\lambda )t}\,\mathit{dt}. }$$
(8.93)

This means that for any smooth cut-off function χ(t) and test function ϕ(x) one has

$$\displaystyle{ \lim _{\epsilon \rightarrow 0}\int \int e^{2\pi i(P(x)-\lambda )t}\chi (\epsilon t)\phi (x)\,\mathit{dx}\,\mathit{dt}\, =\,\int \phi (x)\,d\sigma _{\lambda }(x). }$$
(8.94)

Proof.

Let U be an open set on which \(\partial _{1}P\neq 0\); by a partition of unity we can assume that \(\ \mathop{\mathrm{supp}}\,\phi \subseteq U\,\). Changing variables \(y_{1} = P(x)\), \(y_{j} = x_{j}\), the left side of (8.94) becomes

$$\displaystyle{\lim _{\epsilon \rightarrow 0}\int \int e^{2\pi i(y_{1}-\lambda )t}\chi (\epsilon t)\tilde{\phi }(y)\vert \partial _{ 1}H(y)\vert \,\mathit{dy}\mathit{dt} =\int \tilde{\phi } (\lambda,y')\vert \partial _{1}H(\lambda,y')\vert dy',}$$

where \(y' = (y_{2},\ldots,y_{n})\).

The last equality can be seen by integrating in t and in y 1 first, and using the Fourier inversion formula:

$$\displaystyle{\lim _{\epsilon \rightarrow 0}\int \int e^{2\pi i(y_{1}-\lambda )t}\chi (\epsilon t)g(y_{ 1})\,\mathit{dy}_{1}dt = g(\lambda ).}$$

On the other hand \(\ S_{\lambda } \cap U =\{ x_{1} = H(\lambda,y_{2},\ldots,y_{n}),\,x_{j} = y_{j}\}\ \), and \(\sigma _{\lambda }(y) = \vert \partial _{1}H(\lambda,y')\vert \,\mathit{dy}'\ \) in the parameters \(y' = (y_{2},\ldots,y_{n})\). □ 
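The one-dimensional Fourier inversion step used in this proof can be illustrated numerically. The sketch below makes one explicit assumption that is not in the text: the cut-off is taken to be the Gaussian \(\chi (t) = e^{-\pi t^{2}}\), for which the t-integral is available in closed form, \(\int e^{2\pi i(y-\lambda )t}\chi (\epsilon t)\,\mathit{dt} =\epsilon ^{-1}e^{-\pi (y-\lambda )^{2}/\epsilon ^{2}}\). What remains is an integral of g against an approximate identity concentrated at λ; the function name `smoothed_delta_integral` is ours.

```python
import math


def smoothed_delta_integral(g, lam, eps, steps=4000):
    """Approximate int g(y) * eps^{-1} * exp(-pi*((y-lam)/eps)^2) dy.

    The Gaussian kernel is the closed-form t-integral for the cut-off
    chi(t) = exp(-pi t^2) (an assumption made for this illustration).
    The kernel is negligible outside |y - lam| <= 6*eps, so we apply the
    midpoint rule on that window only.
    """
    half_width = 6.0 * eps
    h = 2.0 * half_width / steps
    total = 0.0
    for i in range(steps):
        y = lam - half_width + (i + 0.5) * h
        kernel = math.exp(-math.pi * ((y - lam) / eps) ** 2) / eps
        total += g(y) * kernel * h
    return total


if __name__ == "__main__":
    lam = 0.3
    for eps in (0.5, 0.1, 0.01):
        approx = smoothed_delta_integral(math.cos, lam, eps)
        print(eps, approx, abs(approx - math.cos(lam)))
```

As ε decreases, the computed values approach g(λ), which is the content of the inversion formula for this particular choice of χ.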

Now it is easy to identify the integrals \(\,I_{\lambda }(\eta )\,\) defined in (8.89). Indeed by (8.94), we have

$$\displaystyle\begin{array}{rcl} I_{\lambda }(\eta )& =& \int _{\mathbf{R}^{n}}\int _{\mathbf{R}}e^{-2\pi i\,(P(x)-\lambda )\beta }\,e^{2\pi ix\cdot \eta }\phi (x/N)\,d\beta \,\mathit{dx} {}\\ & =& \int _{\mathbf{R}^{n}}\,\sigma _{\lambda }(x)e^{2\pi ix\cdot \eta }\phi (x/N)\,\mathit{dx} =\tilde{\sigma } _{\lambda }(\eta ). {}\\ \end{array}$$

Also, by homogeneity, \(\,\tilde{\sigma }_{\lambda }(\eta ) =\lambda ^{n/d\,-1}\tilde{\sigma }(\lambda ^{1/d}\eta )\,\), where σ is the Gelfand-Leray measure restricted to the unit level surface S = { P = 1}. Thus we have shown

Theorem 26.

Let d ≥ 2, \(n \geq d(d - 1)2^{d+1}\), and let P be a positive, homogeneous, non-singular polynomial of degree d. Then we have

$$\displaystyle{ \hat{\omega }_{\lambda }(\xi ) = M_{\lambda }(\xi ) + E_{\lambda }(\xi ), }$$
(8.95)

where

$$\displaystyle{ M_{\lambda }(\xi ) =\lambda ^{\frac{n} {d} -1}\sum _{q\leq N^{(d-1)\theta }}\sum _{(a,q)=1}\sum _{l\in \mathbf{Z}^{n}}G(a,q,l)\,\psi (q\xi - l)\,\tilde{\sigma }(\lambda ^{\frac{1} {d} }(\xi -l/q)),\quad }$$
(8.96)

and

$$\displaystyle{ \vert E_{\lambda }(\xi )\vert \,\ll \, N^{n-d-n\gamma _{d} } }$$
(8.97)

uniformly in \(\xi \in \mathbf{T}^{n}\), where \(\gamma _{d} = \frac{1} {(d-1)2^{d+1}}\).

Let us remark that following the error estimates carefully, in fact it was shown that

$$\displaystyle{\vert E_{\lambda }(\xi )\vert \,\ll \, N^{n-d-\frac{n} {D}+2} = N^{n-d-n\gamma '_{d}}}$$

with some constant \(\gamma '_{d} >\gamma _{d}\) for \(n > d(d - 1)2^{d+1}\). This will be utilized in our estimates on the discrepancy, to swallow certain small factors of size \(N^{\varepsilon }\).

We will also need an estimate on the decay of the Fourier transform of the measure σ, later in our upper bounds on the discrepancy.

Lemma 27.

One has

$$\displaystyle{\vert \tilde{\sigma }(\xi )\vert \,\ll \, (1 + \vert \xi \vert )^{-\frac{n} {D}+1}}$$

Proof.

Suppose | ξ | > 1, and choose a cut-off ϕ such that ϕσ = σ. Then by (8.94), we have

$$\displaystyle\begin{array}{rcl} \tilde{\sigma }(\xi )& =& \int e^{-2\pi i\,x\cdot \xi }\phi (x)\,d\sigma (x) {}\\ & =& \lim _{\delta \rightarrow 0}\int \int e^{-2\pi i\,x\cdot \xi }e^{2\pi i(P(x)-1)t}\phi (x)\chi (\delta t)\,\mathit{dx}dt {}\\ \end{array}$$

We decompose the range of integration into two parts

$$\displaystyle{\tilde{\sigma }(\xi ) =\int _{\vert t\vert \geq c\vert \xi \vert }\int _{\mathbf{R}^{n}} +\int _{\vert t\vert \leq c\vert \xi \vert }\int _{\mathbf{R}^{n}} = I_{1} + I_{2}}$$

Note that if \(\vert t\vert \leq c\vert \xi \vert\), with a sufficiently small constant c > 0, then one has for the gradient of the phase

$$\displaystyle{\vert (tP(x) - x\cdot \xi )'\vert = \vert tP'(x) -\xi \vert \geq \vert \xi \vert /2,}$$

thus integrating by parts K times yields

$$\displaystyle{\vert I_{2}\vert \leq C_{K}\,(1 + \vert \xi \vert )^{-K+1}.}$$

For \(\vert t\vert \geq c\vert \xi \vert\) we have by (8.90)

$$\displaystyle{\vert \int e^{2\pi i(tP(x)-x\cdot \xi )}\phi (x)\,\mathit{dx}\vert \,\ll \,\vert t\vert ^{-\frac{n} {D} },}$$

hence

$$\displaystyle{I_{1}\, \ll \,\int _{\vert t\vert \geq c\vert \xi \vert }\vert t\vert ^{-\frac{n} {D} }\,\mathit{dt}\, \ll \,\vert \xi \vert ^{-\frac{n} {D}+1},}$$

with \(D = (d - 1)2^{d-1}\). □ 

3.1.4 The Singular Series

In order to get nontrivial upper bounds on the discrepancy for the set of lattice points on hypersurfaces, one needs to ensure that there are many lattice points on the surface. We will do this, by showing the existence of a regular set of values Λ corresponding to a non-singular polynomial P. Most of what we discuss below is standard, for example it is implicit in [4], so we only include the details for the sake of completeness.

Recall that we have a fixed θ slightly smaller than \(\frac{1} {2(d-1)}\), so that the asymptotic expansion (8.96) holds with an error term of size \(O(N^{n-d-n\gamma _{d}})\), where \(N =\lambda ^{1/d}\) and \(\,\gamma _{d} = \frac{1} {(d-1)2^{d+1}} \,\). Taking ξ = 0 this means that

$$\displaystyle{\hat{\omega }_{\lambda }(0)\, =\,\lambda ^{\frac{n} {d} -1}\sum _{q\leq N^{(d-1)\theta }}K(q,0,\lambda ) + O(N^{n-d-n\gamma _{d} }),}$$

where

$$\displaystyle{K(q,l,\lambda ) =\sum _{(a,q)=1}e^{-2\pi i\frac{a\lambda } {q} }\,G(a,q,l) = q^{-n}\sum _{ (a,q)=1}\ \sum _{s\in (\mathbf{Z}/q\mathbf{Z})^{n}}e^{2\pi i\frac{a(P(s)-\lambda )-s\cdot l} {q} }.}$$

To exploit the multiplicativity of the terms K(q, 0, λ) we need to extend the summation for all q ∈ N, and estimate the error obtained. This can be done by using (8.84) which yields

$$\displaystyle{\vert K(q,0,\lambda )\vert \,\ll \, (\log \,q)^{n}q^{-\frac{n} {D}+1},}$$

thus for a sufficiently small \(\varepsilon > 0\)

$$\displaystyle{\sum _{q\geq N^{(d-1)\theta }}\vert K(q,0,\lambda )\vert \,\ll _{\varepsilon }\,N^{-(d-1)\theta ( \frac{n} {D}-2-\varepsilon )} \ll N^{-n\gamma _{d}}}$$

if \(n > d(d - 1)2^{d+1}\), by our choice of the parameters D and \(\gamma _{d}\). Indeed, we have that \(\frac{n} {D} - 2 > 2n\gamma _{d}\), thus choosing θ sufficiently close to (but smaller than) \(\frac{1} {2(d-1)}\), the above estimate holds. It is well known, and easy to see from the Chinese Remainder Theorem, that \(K(q_{1},0,\lambda )K(q_{2},0,\lambda ) = K(q_{1}q_{2},0,\lambda )\) for \(q_{1}\) and \(q_{2}\) relatively prime, which implies that

$$\displaystyle{\sum _{q=1}^{\infty }K(q,0,\lambda ) =\prod _{ p\ \mathit{prime}}(\sum _{r=0}^{\infty }K(p^{r},0,\lambda )) =:\prod _{ p\ \mathit{prime}}K_{p}(\lambda ),}$$

where the last equality is used to define the arithmetic factors \(\,K_{p}(\lambda ) =\sum \limits _{ r=0}^{\infty }K(p^{r},0,\lambda )\,\). Note that K(1, 0, λ) = 1 and by estimate (8.84) we have that \(\ K_{p}(\lambda ) = 1 + O(p^{-\frac{n} {D}+2}) = 1 + O(p^{-2})\ \). Thus choosing R = R P sufficiently large, we have that

$$\displaystyle{ 1/2 \leq \prod _{p>R\ p\ \mathit{prime}}\vert K_{p}(\lambda )\vert \leq 2 }$$
(8.98)
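The multiplicativity of K(q, 0, λ) used above is easy to test numerically for small moduli. The following is a minimal brute-force sketch; the quartic form chosen for `P` and the helper name `K` are illustrative assumptions, and the computation is exponential in n, so it is only meant for tiny examples.

```python
import cmath
from itertools import product
from math import gcd


def P(m):
    # Illustrative form (an assumption): P(m) = m_1^4 + ... + m_n^4.
    return sum(x**4 for x in m)


def K(q, lam, n):
    """K(q,0,lam) = q^{-n} sum_{(a,q)=1} sum_{s mod q} e^{2 pi i a (P(s)-lam)/q}."""
    total = 0j
    for a in range(1, q + 1):
        if gcd(a, q) != 1:
            continue
        for s in product(range(q), repeat=n):
            # Integer reduction mod q keeps the phase exact before exponentiating.
            total += cmath.exp(2j * cmath.pi * ((a * (P(s) - lam)) % q) / q)
    return total / q**n


if __name__ == "__main__":
    lam, n = 5, 2
    print(K(3, lam, n) * K(4, lam, n))
    print(K(12, lam, n))
```

The two printed values should agree up to rounding, reflecting the Chinese Remainder Theorem argument for the coprime moduli 3 and 4.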

An important and well-known fact, which we will explain below, is that the arithmetic factors \(K_{p}(\lambda )\) can be interpreted as the density of solutions of the equation P(m) = λ among the p-adic integers (see [4]). Thus the main term in the asymptotic formula (8.8) is the product of the densities of the solutions in the p-adic integers and the density of solutions among the real numbers, and is an instance of the so-called local-global principle.

To see this, define

$$\displaystyle{r(p^{K},\lambda ):= \vert \{m \in (\mathbf{Z}/p^{K}\mathbf{Z})^{n}:\ P(m) \equiv \lambda \ (\mathit{mod}\ p^{K})\}\vert.}$$

One has

Proposition 28.

$$\displaystyle{\sum _{r=0}^{K}K(p^{r},0,\lambda ) = p^{-K(n-1)}r(p^{K},\lambda ).}$$

Proof.

Note that

$$\displaystyle{r(p^{K},\lambda ) =\sum _{ m\ (mod\ p^{K})}p^{-K}\,\sum _{ b=1}^{p^{K} }e^{2\pi i(P(m)-\lambda ) \frac{b} {p^{K}} },}$$

since the inner sum is equal to \(p^{K}\) or 0 according to whether \(P(m) \equiv \lambda \ (mod\ p^{K})\) or not. Next one writes \(b = ap^{K-r}\), where (a, p) = 1, \(1 \leq a < p^{r}\), for r = 0, 1, …, K, and collects the terms corresponding to a fixed r, which turn out to be \(p^{K(n-1)}K(p^{r},0,\lambda )\). □ 

Let us remark that this implies \(\,K_{p}(\lambda ) =\lim _{K\rightarrow \infty }p^{-K(n-1)}r(p^{K},\lambda )\,\), which can be viewed as the density of the solutions among the p-adic integers.
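The identity between the partial sums of \(K(p^{r},0,\lambda )\) and the normalised solution count, with the normalisation \(p^{-K(n-1)}\) that also appears in Lemma 29, can be checked by brute force. The sketch below assumes, purely for illustration, the quadratic form \(P(m) = m_{1}^{2} + \ldots + m_{n}^{2}\); the helper names `K`, `r` and `check` are ours.

```python
import cmath
from itertools import product
from math import gcd


def P(m):
    # Illustrative form (an assumption): the sum of squares, d = 2.
    return sum(x * x for x in m)


def K(q, lam, n):
    """K(q,0,lam) = q^{-n} sum_{(a,q)=1} sum_{s mod q} e^{2 pi i a (P(s)-lam)/q}."""
    total = 0j
    for a in range(1, q + 1):
        if gcd(a, q) != 1:
            continue
        for s in product(range(q), repeat=n):
            total += cmath.exp(2j * cmath.pi * ((a * (P(s) - lam)) % q) / q)
    return total / q**n


def r(q, lam, n):
    """Number of m in (Z/qZ)^n with P(m) congruent to lam mod q."""
    return sum(1 for m in product(range(q), repeat=n) if (P(m) - lam) % q == 0)


def check(p, K_exp, lam, n):
    """Return both sides: sum_{j<=K_exp} K(p^j,0,lam) and p^{-K_exp(n-1)} r(p^K_exp, lam)."""
    lhs = sum(K(p**j, lam, n) for j in range(K_exp + 1))
    rhs = r(p**K_exp, lam, n) / p ** (K_exp * (n - 1))
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = check(3, 2, 1, 2)
    print(lhs.real, rhs)
```

The left side is computed as a complex number; its imaginary part should vanish up to rounding, and its real part should match the normalised count on the right.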

To count the number of solutions modulo p K, one uses the p-adic version of Newton’s method.

Lemma 29.

Let p be a prime and let λ, k, l be natural numbers such that l > 2k. Suppose there is an \(m_{0} \in \mathbf{Z}^{n}\) for which

$$\displaystyle{P(m_{0}) \equiv \lambda \ (mod\ p^{l}),}$$

and suppose, moreover, that \(p^{k}\) is the highest power of p which divides all the partial derivatives \(\partial _{j}P(m_{0})\).

Then for K ≥ l, one has \(p^{-K(n-1)}r_{P}(p^{K},\lambda ) \geq p^{-l(n-1)}\) .

Proof.

For K = l this is obvious. Suppose it is true for K, and consider all the solutions \(m_{1}\ (mod\ p^{K+1})\) of the form \(m_{1} = m + p^{K-k}s\), where s is taken (mod p). Then

$$\displaystyle{P(m + p^{K-k}s)-\lambda = P(m) -\lambda +p^{K-k}P'(m) \cdot s = 0\ \ (mod\ p^{K+1}),}$$

which yields a + b ⋅ s = 0 (mod p), where \(ap^{K} = P(m)-\lambda\) and \(bp^{k} = P'(m)\). Then \(b_{j}\neq 0\) (mod p) for some j, hence there are \(p^{n-1}\) solutions of this form. All the solutions obtained are different mod \(p^{K+1}\), and \(m_{1}\) satisfies the hypothesis of the lemma. □ 

We remark that in the case l = 1, k = 0 the above argument shows that there are exactly \(p^{(K-1)(n-1)}\) solutions m for which \(m \equiv m_{0}\) (mod p) and \(P(m) \equiv \lambda\) (mod \(p^{K}\)). It is now not hard to establish the existence of a set of regular values for the polynomial P.
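The lifting step can be checked by brute force in the non-singular case k = 0, where a solution \(m_{0}\) modulo p with \(P'(m_{0})\not \equiv 0\) (mod p) should have exactly \(p^{(K-1)(n-1)}\) lifts modulo \(p^{K}\). The sketch below is an illustration under assumptions of our own choosing: P is the sum of squares, p = 5, \(m_{0} = (1,0,0)\) and λ = 1; the function name `lifts` is hypothetical.

```python
from itertools import product


def P(m):
    # Illustrative form (an assumption): the sum of squares, so P'(m) = 2m.
    return sum(x * x for x in m)


def lifts(m0, p, K, lam):
    """Count m mod p^K with m = m0 (mod p) and P(m) = lam (mod p^K)."""
    q = p**K
    # Each coordinate runs over the residues mod p^K lying above m0's residue mod p.
    coords = [range(c % p, q, p) for c in m0]
    return sum(1 for m in product(*coords) if (P(m) - lam) % q == 0)


if __name__ == "__main__":
    m0, p, lam = (1, 0, 0), 5, 1
    n = len(m0)
    for K in (1, 2, 3):
        print(K, lifts(m0, p, K, lam), p ** ((K - 1) * (n - 1)))
```

Here \(P'(m_{0}) = (2,0,0)\) is non-zero modulo 5, so the two columns printed for each K should coincide.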

Lemma 30.

Let P(m) be a homogeneous non-singular polynomial of degree d ≥ 2. Then there exists an infinite arithmetic progression Λ and constants \(0 < c_{P} < C_{P}\), such that for all λ ∈ Λ

$$\displaystyle{c_{P} \leq K(\lambda ) \leq C_{P}}$$

Proof.

Let \(\lambda _{0} = P(m_{0})\neq 0\) for some fixed \(m_{0}\neq 0\). Let \(p_{1},\ldots,p_{J}\) be the set of primes less than R. Let k be an integer such that \(p_{j}^{k}\) does not divide \(d\lambda _{0}\) for all j ≤ J, where d is the degree of P(m). By the homogeneity relation \(P'(m_{0})\cdot m_{0} = d\lambda _{0}\) it follows that \(p_{j}^{k}\) does not divide some partial derivative \(\partial _{i}P(m_{0})\). Fix l such that l > 2k and define the arithmetic progression

\(\varLambda =\{\lambda _{0} + k\prod _{j=1}^{J}p_{j}^{l}:\ k \geq k_{P}\}\). Then we claim that Λ is a set of regular values. Indeed, by Proposition 28 and Lemma 29 one has for λ ∈ Λ

$$\displaystyle{K_{p_{j}}(\lambda ) =\lim _{N\rightarrow \infty }p_{j}^{-N(n-1)}r_{ P}(p_{j}^{N},\lambda ) \geq p_{ j}^{-l(n-1)}.}$$

This together with (8.98) ensures that the singular series K(λ) remains bounded from below, and the error term becomes negligible by choosing k = k P large enough. □ 

Let us remark that along the same lines it can be shown, that all large numbers are regular values of P(m), if for each prime p < R and each residue class s (mod p), there is a solution of the equations P(m) = s (mod p) such that P′(m) ≠ 0 (mod p). This is the case for example for \(P(m) =\sum _{j}m_{j}^{d}\).

3.2 Upper Bounds for the Discrepancy

We will prove Theorem 12 by extending the arguments given in Sect. 8.2 to the case of a general homogeneous non-singular hypersurface. Our main tool again will be the asymptotic expansion (8.95)

$$\displaystyle{\hat{\omega }_{\lambda }(\xi )\, =\,\lambda ^{\frac{n} {d} -1}\,\sum _{q\leq N^{(d-1)\theta }}m_{q,\lambda }(\xi ) + E_{\lambda }(\xi ),}$$

where

$$\displaystyle{m_{q,\lambda }(\xi ) =\sum _{l\in \mathbf{Z}^{n}}K(q,l,\lambda )\,\psi (q\xi - l)\,\,\tilde{\sigma }(\lambda ^{\frac{1} {d} }(\xi -l/q)).}$$

Note that \(\,0 <\theta < \frac{1} {2(d-1)}\,\) and \(N =\lambda ^{1/d}\,\). Moreover we will need the decay estimates

$$\displaystyle{ \vert \tilde{\sigma }(\xi )\vert \,\ll \, (1 + \vert \xi \vert )^{-\frac{n} {D}+1} }$$
(8.99)
$$\displaystyle{ \vert K(q,l,\lambda )\vert \,\ll _{\varepsilon }\,q^{-\frac{n} {D}+1+\varepsilon } }$$
(8.100)

Recall that the discrepancy of the set \(\,Z'_{P,\lambda } =\{\lambda ^{-1/d}m;\ P(m) =\lambda \}\,\) with respect to caps \(\,C_{a,\xi } =\{ x \in S_{P}:\ x\cdot \xi \geq a\}\,\) may be written as

$$\displaystyle{D_{P}(\xi,\lambda ) =\sum _{P(m)=\lambda }\chi _{a}(\lambda ^{-1/d}m\cdot \xi )\, -\, N_{\lambda }\int _{ S_{P}}\chi _{a}(x\cdot \xi )\,d\sigma (x),}$$

where \(N_{\lambda }\) is the number of solutions of the diophantine equation P(m) = λ, and \(\chi _{a}\) is the indicator function of an interval [a, b], b being a fixed constant such that | x ⋅ ξ | ≤ b for all \(x \in S_{P}\) and \(\xi \in S^{n-1}\).

We turn to the proof of Theorem 12. As before, it will be enough to estimate the “smoothed” discrepancy

$$\displaystyle{D_{P}(\phi _{a,\delta },\xi,\lambda ) =\sum _{P(m)=\lambda }\phi _{a,\delta }(\lambda ^{-\frac{1} {d} }m\cdot \xi ) - N_{\lambda }\int _{S_{ P}}\phi _{a,\delta }(x\cdot \xi )\,d\sigma (x),}$$

for, say, \(\delta =\lambda ^{-n}\). Taking the inverse Fourier transform of the functions \(\phi _{a,\delta }\), we have

$$\displaystyle{ \sum _{P(m)=\lambda }\phi _{a,\delta }\,(\lambda ^{-1/d}\,m\cdot \xi ) =\int _{\mathbf{ R}}\lambda ^{\frac{1} {d} }\tilde{\phi }_{a,\delta }(t\lambda ^{\frac{1} {d} })\,\hat{\omega }_{\lambda }(t\xi )\,\mathit{dt} }$$
(8.101)

also

$$\displaystyle{ \int _{S_{P}}\phi _{a,\delta }\,(x\cdot \xi )\,d\sigma (x) =\int _{\mathbf{R}}\tilde{\phi }_{a,\delta }(t)\,\tilde{\sigma }(t\xi )\,\mathit{dt}. }$$
(8.102)

Moreover, as in (8.43) and (8.44), set

$$\displaystyle{I_{q,\lambda }:=\int _{\mathbf{R}}\lambda ^{\frac{1} {d} }\tilde{\phi }_{a,\delta }(t\lambda ^{\frac{1} {d} })\,m_{q,\lambda }(t\xi )\,\mathit{dt},}$$

and

$$\displaystyle{\mathcal{E}_{\lambda }:=\int _{\mathbf{R}}\lambda ^{\frac{1} {d} }\tilde{\phi }_{a,\delta }(t\lambda ^{\frac{1} {d} })\,E_{\lambda }(t\xi )\,\mathit{dt}.}$$

First, we estimate the error term using (8.97):

$$\displaystyle\begin{array}{rcl} \vert \mathcal{E}_{\lambda }\vert \,& \ll & \,N^{n-d-n\gamma _{d} }\int _{\mathbf{R}}(1 + \vert t\vert )^{-1}(1 +\delta \vert t\vert )^{-1}\,\mathit{dt}{}\end{array}$$
(8.103)
$$\displaystyle\begin{array}{rcl} & \ll & \lambda ^{\frac{n} {d} -1-\frac{n\gamma _{d}} {d} }\,(\log \,\lambda )\, \ll \,\lambda ^{(\frac{n} {d} -1)(1-\gamma _{d})}.{}\end{array}$$
(8.104)

Next, we decompose the integral \(I_{q,\lambda }\) as in (8.46), and observe that for \(\vert t\vert < 1/(8q)\)

$$\displaystyle{m_{q,\lambda }(t\xi ) = K(q,0,\lambda )\,\tilde{\sigma }(\lambda ^{1/d}\,t\xi ).}$$

Lemma 31.

We have

$$\displaystyle{ \vert \lambda ^{\frac{n} {d} -1}\sum _{q\leq N^{(d-1)\theta }}I_{q,\lambda }^{1}\, -\, N_{\lambda }\int _{ S_{P}}\phi _{a,\delta }\,(x\cdot \xi )\,d\sigma (x)\,\vert \,\ll \,\lambda ^{(\frac{n} {d} -1)(1-\gamma _{d})}. }$$
(8.105)

Proof.

By the above observation and a change of variables \(\,t =\lambda ^{1/d}t\,\), we have

$$\displaystyle{\sum _{q\leq N^{(d-1)\theta }}I_{q,\lambda }^{1}\, =\,\sum _{ q\leq N^{(d-1)\theta }}K(q,0,\lambda )\,\int _{\vert t\vert <N/8q}\tilde{\phi }_{a,\delta }(t)\,\tilde{\sigma }(t\xi )\,\mathit{dt}.}$$

We extend the integration to the whole real line in order to exploit (8.102); the error thus incurred is bounded by

$$\displaystyle{\int _{\vert t\vert \geq N/8q}\vert \tilde{\phi }_{a,\delta }(t)\vert \,\vert \tilde{\sigma }(t\xi )\vert \,\mathit{dt} \ll \int _{\vert t\vert \geq N/8q}(1 + \vert t\vert )^{-\frac{n} {D} }\,\mathit{dt}\, \ll \, N^{-\frac{n} {D}+1}q^{\frac{n} {D}-1}.}$$

Thus

$$\displaystyle\begin{array}{rcl} & & \left \vert \,\lambda ^{\frac{n} {d} -1}\sum _{q\leq N^{(d-1)\theta }}I_{q,\lambda }^{1}\, -\,\sum _{ q\leq N^{(d-1)\theta }}K(q,0,\lambda )\,\int _{S_{P}}\phi _{a,\delta }\,(x\cdot \xi )\,d\sigma (x)\,\right \vert \\ & &\qquad \qquad \quad \,\ll _{\varepsilon }\,N^{-\frac{n} {D}+1}\,\sum _{q\leq N^{(d-1)\theta }}\,q^{-\frac{n} {D}+1+\varepsilon }q^{\frac{n} {D}-1}\, \ll \, N^{-n\gamma _{d}},\qquad {}\end{array}$$
(8.106)

using the facts that \(\,(d - 1)\theta < \frac{1} {2}\,\) and \(\, \frac{n} {D} - 2 > n\gamma _{d}\), choosing \(\varepsilon > 0\) sufficiently small. □ 

Lemma 32.

One has

$$\displaystyle{ \sum _{q\leq N^{(d-1)\theta }}\vert I_{q,\lambda }^{2}\vert \,\ll \, N^{-n\gamma _{d} }. }$$
(8.107)

Proof.

Since \(\psi (q\xi - l) = 0\) unless \(l = [q\xi ]\), the nearest lattice point to the point \(q\xi\), we have that

$$\displaystyle{m_{q,\lambda }(t\xi ) = K(q,[qt\xi ],\lambda )\,\psi (\{qt\xi \})\,\,\tilde{\sigma }\left (\frac{N} {q} \,\{qt\xi \}\right ).}$$

By making a change of variables \(t:= tq\), it follows from (8.99) and (8.100) that

$$\displaystyle{\vert I_{q,\lambda }^{2}\vert \,\ll _{\varepsilon }\,N^{-\frac{n} {D}+2}\,q^{-1+\varepsilon }\,J_{\lambda }}$$

where

$$\displaystyle{J_{\lambda } =\int _{\vert t\vert \geq 1/8}\vert \tilde{\phi }_{a,\delta }(tN/q)\vert \,\|t\xi \|^{-\frac{n} {D}+1}\,\mathit{dt}.}$$

Note that for \(q \leq N^{(d-1)\theta } < N^{1/2}\)

$$\displaystyle{\vert \tilde{\phi }_{a,\delta }(tN/q)\vert \,\ll \, \frac{q} {N}\ \vert t\vert ^{-1}(1 + \vert \delta t\vert )^{-1}.}$$

By a dyadic decomposition of the range of integration, using (8.36), we have

$$\displaystyle{\vert J_{\lambda }\vert \,\ll _{\varepsilon }\,\frac{q} {N}\sum _{j\geq -3}2^{\varepsilon j}\,(1 +\delta 2^{j})^{-1}\, \ll \, q\,N^{-1+\varepsilon '},}$$

with \(\varepsilon ' = nd\varepsilon\). Choosing \(\varepsilon > 0\) sufficiently small, this implies

$$\displaystyle{ \sum _{q\leq N^{(d-1)\theta }}\vert I_{q,\lambda }^{2}\vert \,\ll _{\varepsilon }\,\sum _{ q\leq N^{1/2}}q^{\varepsilon }N^{-\frac{n} {D}+1+\varepsilon '}\, \ll \, N^{-\frac{n} {D}+2} \ll N^{-n\gamma _{d}}. }$$
(8.108)

 □ 

Finally, we remark that Theorem 12 follows immediately from estimates (8.103)–(8.107). □ 

3.3 The Distribution of the Solutions Modulo 1

We will study the distribution of the images of the solutions of a diophantine equation \(P(m) =\lambda\) on the flat torus \(\mathbf{T}^{n} = \mathbf{R}^{n}/\mathbf{Z}^{n}\), via the map \(T_{\alpha }:\ (m_{1},\ldots,m_{n}) \rightarrow (m_{1}\alpha _{1},\ldots,m_{n}\alpha _{n})\pmod 1\), where \(\alpha = (\alpha _{1},\ldots,\alpha _{n}) \in \mathbf{R}^{n}\) is a given point. We will assume, as before, that P is a positive, homogeneous, non-singular polynomial of degree d, and that \(n \geq n_{d}\) is large enough with respect to the degree. Note that if one of the coordinates \(\alpha _{i}\) is rational, say equal to \(a/q\), then \(m_{i}\alpha _{i}\) can take at most q different values modulo 1, so the images of the solution sets

$$\displaystyle{ \varOmega _{\lambda,\alpha }:=\{ (m_{1}\alpha _{1},\ldots,m_{n}\alpha _{n}):\ P(m_{1},\ldots,m_{n}) =\lambda \}\subseteq \mathbf{T}^{n} }$$
(8.109)

cannot become equi-distributed on the torus as \(\lambda \rightarrow \infty\), even if one restricts to regular values only. In the opposite case, we have

Theorem 33.

Let \(\alpha = (\alpha _{1},\ldots,\alpha _{n})\) be a point such that \(\alpha _{i}\) is irrational for all \(1 \leq i \leq n\), and let ϕ be a smooth function on \(\mathbf{T}^{n}\). If Λ is a set of regular values of the form P, then one has

$$\displaystyle{ \lim _{\lambda \rightarrow \infty,\,\lambda \in \varLambda }N_{\lambda }^{-1}\sum _{ P(m)=\lambda }\phi (m_{1}\alpha _{1},\ldots,m_{n}\alpha _{n})\, =\,\int _{\mathbf{T}^{n}}\phi (x)\,\mathit{dx}, }$$
(8.110)

where \(N_{\lambda }\) is the number of solutions of the equation \(P(m) =\lambda\).

Proof.

For simplicity, let us introduce the notation \(\,m\circ \alpha = (m_{1}\alpha _{1},\ldots,m_{n}\alpha _{n})\). By using the inverse Fourier transform \(\,\phi (\beta ) =\sum _{l\in \mathbf{Z}^{n}}\hat{\phi }(l)e^{2\pi i\,\beta \cdot \,l}\,\), we have

$$\displaystyle\begin{array}{rcl} \sum _{P(m)=\lambda }\phi (m\circ \alpha )& =& \sum _{l\in \mathbf{Z}^{n}}\hat{\phi }(l)\sum _{P(m)=\lambda }e^{2\pi i(m_{1}l_{1}\alpha _{1}+\ldots +m_{n}l_{n}\alpha _{n})} \\ & =& \sum _{l\in \mathbf{Z}^{n}}\hat{\phi }(l)\,\hat{\omega }_{\lambda }(l\circ \alpha ) = N_{\lambda }\hat{\phi }(0) + T_{\lambda }(\alpha ),{}\end{array}$$
(8.111)

where

$$\displaystyle{ T_{\lambda }(\alpha ) =\sum _{l\in \mathbf{Z}^{n},\,l\neq 0}\hat{\phi }(l)\,\hat{\omega }_{\lambda }(l\circ \alpha ). }$$
(8.112)

Substituting the asymptotic expansion (8.95) into the above expression we have

$$\displaystyle{T_{\lambda }(\alpha ) =\sum _{q\leq N^{(d-1)\theta }}\sum _{l\neq 0}m_{q,\lambda }(l\circ \alpha )\hat{\phi }(l)\, +\,\sum _{l\neq 0}E_{\lambda }(l\circ \alpha )\hat{\phi }(l).}$$

Using the fact that \(\,\vert \hat{\phi }(l)\vert \leq C_{M}(1 + \vert l\vert )^{-M}\,\) for all \(M \in \mathbf{N}\), estimate (8.97) implies

$$\displaystyle{ \sum _{l\neq 0}\vert E_{\lambda }(l\circ \alpha )\hat{\phi }(l)\vert \,\ll \, N^{n-d-n\gamma _{d} }\,\|\hat{\phi }\|_{l^{1}} \ll N^{n-d-n\gamma _{d} }, }$$
(8.113)

where \(N =\lambda ^{1/d}\) and \(\gamma _{d} > 0\) is a constant depending on d. Also, by (8.100) one has

$$\displaystyle{ \vert m_{q,\lambda }(l\circ \alpha )\vert \,\ll _{\varepsilon }\,N^{n-d}q^{-\frac{n} {D}+1+\varepsilon }\hat{\sigma }\left (\frac{N} {q} \,\|ql\circ \alpha \|\right ). }$$
(8.114)

Since \(\alpha \in (\mathbf{R}\setminus \mathbf{Q})^{n}\) by our assumption and l ≠ 0, we have that \(\,\|ql\circ \alpha \| > 0\,\), thus

$$\displaystyle{m_{q,\lambda }(l\circ \alpha ) \rightarrow 0\ \ \ \ \mbox{ as}\ \ \ \ \lambda \rightarrow \infty.}$$

Let \(\varepsilon > 0\) be fixed; then by (8.114) one estimates crudely

$$\displaystyle{ \sum _{q\geq N^{\varepsilon }}\sum _{l\neq 0}\vert m_{q,\lambda }(l\circ \alpha )\,\hat{\phi }(l)\vert \,\ll \, N^{n-d}\sum _{ q\geq N^{\varepsilon }}q^{-\frac{n} {D}+1}\, \ll \, N^{n-d-n\varepsilon '}. }$$
(8.115)

Also, for a fixed \(q \leq N^{\varepsilon }\)

$$\displaystyle{ \sum _{\vert l\vert \geq N^{\varepsilon }}\vert m_{q,\lambda }(l\circ \alpha )\,\hat{\phi }(l)\vert \,\ll \, N^{n-d-\varepsilon }, }$$
(8.116)

by using the decay estimate \(\,\vert \hat{\phi }(l)\vert \ll (1 + \vert l\vert )^{-2n}\,\).

Since for regular values λ ∈ Λ the number of solutions is \(\,N_{\lambda } \approx \lambda ^{\frac{n} {d} -1} = N^{n-d}\,\), (8.110) follows from (8.114)–(8.116). □ 
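The proof reduces (8.110) to the decay of the normalized exponential sums \(N_{\lambda }^{-1}\hat{\omega }_{\lambda }(l\circ \alpha )\) for each fixed l ≠ 0. The following sketch lets one inspect these sums numerically for \(P(m) = m_{1}^{2} +\ldots +m_{4}^{2}\) and \(\alpha = (\sqrt{2},\sqrt{3},\sqrt{5},\sqrt{7})\). It is an illustration only: four variables is well outside the range \(n \geq n_{d}\) of the theorem, the choice of α and of the values of λ is arbitrary, and the helper names are hypothetical.

```python
import cmath
import itertools
import math


def lattice_points_on_sphere(lam, n=4):
    """All m in Z^n with m_1^2 + ... + m_n^2 = lam (brute force)."""
    r = math.isqrt(lam)
    points = []
    for head in itertools.product(range(-r, r + 1), repeat=n - 1):
        rest = lam - sum(x * x for x in head)
        if rest < 0:
            continue
        s = math.isqrt(rest)
        if s * s == rest:
            # The last coordinate is +s or -s (a single point when s = 0).
            points.append(head + (s,))
            if s != 0:
                points.append(head + (-s,))
    return points


def normalized_weyl_sum(points, l, alpha):
    """N_lambda^{-1} * sum over the points m of exp(2 pi i m . (l o alpha))."""
    total = sum(
        cmath.exp(2j * math.pi
                  * sum(mi * li * ai for mi, li, ai in zip(m, l, alpha)))
        for m in points
    )
    return total / len(points)


alpha = tuple(math.sqrt(p) for p in (2, 3, 5, 7))
for lam in (101, 401):
    pts = lattice_points_on_sphere(lam)
    print(lam, len(pts), abs(normalized_weyl_sum(pts, (1, 0, 0, 0), alpha)))
```

The printed point counts can be cross-checked against Jacobi's four-square formula, which gives 8(1 + p) representations for an odd prime p. The moduli in the last column are the quantities that the theorem asserts tend to 0 in its range of validity.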

Let \(\alpha = (\alpha _{1},\ldots,\alpha _{n})\) be a point such that each of its coordinates \(\alpha _{i}\) is diophantine in the sense that \(\,\|l\alpha _{i}\| \geq C_{\varepsilon }\vert l\vert ^{-1-\varepsilon }\,\) for all \(l \in \mathbf{Z}\setminus \{0\}\) and every \(\varepsilon > 0\). We will call such points α diophantine, and we can extend this definition to points \(\alpha \in \mathbf{T}^{n}\) by declaring α diophantine if and only if α + m is such for any \(m \in \mathbf{Z}^{n}\). Note that this condition on α is different from the notion used in Sects. 8.2 and 8.3; nevertheless (8.3) implies that the set of diophantine points of the torus has measure 1. Also, it is immediate from the definition that for any \(\,l = (l_{1},\ldots,l_{n}) \in \mathbf{Z}^{n}\), l ≠ 0, we have that

$$\displaystyle{ \|l\circ \alpha \|\geq C_{\varepsilon }\,\vert l\vert ^{-1-\varepsilon }. }$$
(8.117)
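As a quick sanity check of the one-dimensional condition \(\|l\alpha _{i}\| \geq C_{\varepsilon }\vert l\vert ^{-1-\varepsilon }\), the sketch below computes the smallest value of \(\vert l\vert ^{1+\varepsilon }\|l\alpha \|\) over a finite range of l. The example \(\alpha = \sqrt{2}\) is an assumption made here for illustration: quadratic irrationals are badly approximable, so they satisfy the condition even with ε = 0. The function names are hypothetical, and floating-point arithmetic limits the range of l that can be trusted.

```python
import math


def dist_to_nearest_integer(x):
    """||x||: distance from the real number x to the nearest integer."""
    return abs(x - round(x))


def min_scaled_distance(alpha, max_l, eps=0.0):
    """min over 1 <= l <= max_l of l^(1+eps) * ||l * alpha||."""
    return min(
        l ** (1 + eps) * dist_to_nearest_integer(l * alpha)
        for l in range(1, max_l + 1)
    )


for max_l in (10, 1000, 100000):
    print(max_l, min_scaled_distance(math.sqrt(2), max_l))
```

For \(\sqrt{2}\) the minimum is attained at l = 2 and stays bounded away from 0 as the range grows, whereas for a rational α the quantity vanishes as soon as l reaches the denominator.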

For diophantine points α we will derive quantitative estimates on the discrepancy of the sets \(\varOmega _{\lambda,\alpha }\) with respect to both smooth functions and compact, convex bodies. To be more precise, for a smooth function \(\,\phi \in C^{\infty }(\mathbf{T}^{n})\,\) define the associated discrepancy as

$$\displaystyle{ D(\phi,\alpha,\lambda ):=\sum _{P(m)=\lambda }\phi (m\circ \alpha ) - N_{\lambda }\int _{T^{n}}\phi (x)\,\mathit{dx}. }$$
(8.118)

Theorem 34.

Let \(\alpha \in \mathbf{T}^{n}\) be a diophantine point, and let \(\phi \in C^{\infty }(\mathbf{T}^{n})\). Then for \(n > n_{d} = d(d - 1)2^{d+1}\), one has

$$\displaystyle{ \vert D(\phi,\alpha,\lambda )\vert \,\ll \,\lambda ^{\frac{n} {d} -1-n\eta _{d}}, }$$
(8.119)

with a constant \(\eta _{d} > 0\) depending only on the degree d.

Proof.

We will argue as in the proof of Theorem 33, using condition (8.117) and the decay estimates (8.99) and (8.100). To start, observe that by (8.111)–(8.112)

$$\displaystyle{\vert D(\phi,\alpha,\lambda )\vert = \vert T_{\lambda }(\alpha )\vert \leq \sum _{q\leq N^{(d-1)\theta }}\sum _{l\neq 0}\vert m_{q,\lambda }(l\circ \alpha )\vert \,\vert \hat{\phi }(l)\vert \, +\, O(N^{n-d-n\gamma _{d} }).}$$

Since α is assumed to be diophantine we have for all \(\varepsilon > 0\)

$$\displaystyle\begin{array}{rcl} \vert D(\phi,\alpha,\lambda )\vert \,& \ll & \,N^{n-d}\sum _{ q\leq N^{(d-1)\theta }}\sum _{l\neq 0}q^{-\frac{n} {D}+1}\left (1+\frac{N} {q} \,\|ql\circ \alpha \|\right )^{-\frac{n} {D}+1}\vert \hat{\phi }(l)\vert \\ &\ll _{\varepsilon }& \,N^{n-d}\sum _{ q\leq N^{(d-1)\theta }}\sum _{l\neq 0}q^{-\frac{n} {D}+1}\left (1 + \frac{N} {q^{2+\varepsilon }\vert l\vert ^{1+\varepsilon }}\right )^{-\frac{n} {D}+1}(1 + \vert l\vert )^{-2n}.{}\end{array}$$
(8.120)

Now the parameter θ in the asymptotic formula (8.95) was chosen such that \((d - 1)\theta < 1/2\); accordingly we will set \(\,\varepsilon = (1 - 2(d - 1)\theta )/4\). This will ensure that

$$\displaystyle{ \frac{N} {q^{2+\varepsilon }\vert l\vert ^{1+\varepsilon }}\, \geq \, N^{\varepsilon },}$$

for \(1 \leq q \leq N^{(d-1)\theta }\) and \(\,0 < \vert l\vert < N^{\varepsilon }\,\), thus by (8.120)

$$\displaystyle{\sum _{q\leq N^{(d-1)\theta }}\sum _{0<\vert l\vert <N^{\varepsilon }}\vert m_{q,\lambda }(l\circ \alpha )\vert \,\vert \hat{\phi }(l)\vert \,\ll \, N^{n-d-\varepsilon (n/D-1)}\, \ll \, N^{n-d-n\eta _{d} },}$$

with, say, \(\eta _{d} = (1 - 2(d - 1)\theta )/8D\). The rest of the sum is estimated crudely by

$$\displaystyle{N^{n-d}\sum _{ q\leq N^{(d-1)\theta }}\sum _{\vert l\vert \geq N^{\varepsilon }}q^{-\frac{n} {D}+1}(1 + \vert l\vert )^{-2n}\, \ll \, N^{n-d-\varepsilon n}.}$$

This finishes the proof of Theorem 34. □ 
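The choice \(\varepsilon = (1 - 2(d - 1)\theta )/4\) in the proof can be checked by exact arithmetic. The sketch below evaluates the exponent of N in \(N/(q^{2+\varepsilon }\vert l\vert ^{1+\varepsilon })\) at the worst case \(q = N^{(d-1)\theta }\), \(\vert l\vert = N^{\varepsilon }\), and returns how much it exceeds ε. The name `diophantine_margin` and the shorthand `kappa` for \((d - 1)\theta\) are introduced here for illustration only.

```python
from fractions import Fraction


def diophantine_margin(kappa):
    """Worst-case exponent of N in N / (q^(2+eps) |l|^(1+eps)), minus eps.

    kappa stands for (d - 1) * theta < 1/2; the worst case is q = N^kappa
    and |l| = N^eps, with eps = (1 - 2 * kappa) / 4 as in the proof.
    """
    kappa = Fraction(kappa)
    if not 0 <= kappa < Fraction(1, 2):
        raise ValueError("need 0 <= (d - 1) * theta < 1/2")
    eps = (1 - 2 * kappa) / 4
    exponent = 1 - kappa * (2 + eps) - eps * (1 + eps)
    return exponent - eps


for kappa in (Fraction(0), Fraction(1, 4), Fraction(49, 100)):
    print(kappa, diophantine_margin(kappa))
```

Substituting \((d - 1)\theta = (1 - 4\varepsilon )/2\) shows that the margin equals \(\frac{3}{2}\varepsilon +\varepsilon ^{2}\), which is positive throughout the admissible range; the printed values are instances of this identity.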

Finally, we will study the discrepancy of the image sets \(\varOmega _{\lambda,\alpha }\) with respect to compact, convex bodies \(K \subseteq (-\frac{1} {2}, \frac{1} {2})^{n}\), when the flat torus \(\mathbf{T}^{n}\) is identified as a set with \([-\frac{1} {2}, \frac{1} {2})^{n}\). Let us remark that in this case one cannot hope for better upper bounds than \(O(\lambda ^{\frac{n} {d} -1-\frac{1} {d} })\). Indeed, consider the discrepancy with respect to the family of cubes \(K_{c} = [-c,c]^{n}\). The number of solutions of the equation \(P(m) =\lambda\) is \(\approx \lambda ^{\frac{n} {d} -1}\), but (as \(P(m) \approx \vert m\vert ^{d}\)) each coordinate can take \(\ll \lambda ^{1/d}\) values; thus the number of solutions \(m = (m_{1},\ldots,m_{n})\) with \(m_{1}\) being fixed is at least \(\lambda ^{\frac{n} {d} -1-\frac{1} {d} }\), for some value of \(m_{1}\). Fix such an \(m_{1}\) and let \(c_{1} = m_{1}\alpha _{1}\pmod 1\). This means that the boundary of the cube \(K_{c_{1}}\) contains at least \(\lambda ^{\frac{n} {d} -1-\frac{1} {d} }\) points of the set \(\varOmega _{\lambda,\alpha }\), so the discrepancy changes by at least this much as c passes through \(c_{1}\), and thus one cannot have a better uniform upper bound on it. We will prove a similar upper bound, of the form \(O(\lambda ^{\frac{n} {d} -1-\eta _{d}})\) with a constant \(\eta _{d} > 0\) depending only on the degree d, which is uniform over a large family of convex bodies.
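The pigeonhole step in this remark can be seen numerically. The sketch below groups the solutions of \(m_{1}^{2} +\ldots +m_{4}^{2} =\lambda\) by their first coordinate and compares the largest group with the average over the at most \(2\sqrt{\lambda } + 1\) admissible values of \(m_{1}\). The quadratic form, the value λ = 101 and the function name are assumptions chosen to keep the brute-force search small.

```python
import itertools
import math
from collections import Counter


def count_by_first_coordinate(lam, n=4):
    """Map m_1 -> number of m in Z^n with |m|^2 = lam and that first coordinate."""
    r = math.isqrt(lam)
    counts = Counter()
    for m in itertools.product(range(-r, r + 1), repeat=n):
        if sum(x * x for x in m) == lam:
            counts[m[0]] += 1
    return counts


lam = 101
counts = count_by_first_coordinate(lam)
total = sum(counts.values())
m1, largest = counts.most_common(1)[0]
# Pigeonhole: the solutions are shared among at most 2 * sqrt(lam) + 1 values
# of m_1, so the largest group is at least the average printed last.
print(total, m1, largest, total / (2 * math.isqrt(lam) + 1))
```

All solutions in the largest group share the same value of \(m_{1}\alpha _{1}\) modulo 1, so they cross the boundary of the cube \(K_{c}\) simultaneously as c varies, which is the source of the lower bound.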

We will use the fact that if \(K \subseteq (-\frac{1} {2}, \frac{1} {2})^{n}\) is a closed convex set with non-empty interior then there exist convex sets \(K_{1}\) and \(K_{2}\) such that for sufficiently small \(\delta > 0\)

$$\displaystyle{B(K_{1},\delta ) \subseteq K \subseteq B(K_{2},\delta ) \subseteq (-1,1)^{n},}$$

where B(K, δ) is the set of points whose distance to the set K is at most δ. To make our estimates uniform for a large family of convex bodies, define the quantity \(\delta _{K}\) as the largest \(\delta > 0\) for which there exists a point x such that \(x + B_{\delta } \subseteq K\) and also \(K + B_{\delta } \subseteq [-\frac{1} {2}, \frac{1} {2}]^{n}\), where \(B_{\delta }\) is the closed ball of radius δ centered at the origin.

Lemma 35.

Let \(K \subseteq (-\frac{1} {2}, \frac{1} {2})^{n}\,\) be a closed convex body, and let x be a point in the interior of K. For given \(0 <\delta <\delta _{K}/10\), \(C_{0} = 2/\delta _{K}\), and \(\lambda _{1} =\lambda _{ 2}^{-1} = 1 - C_{0}\delta\), define the convex bodies \(K_{1} = x +\lambda _{1}K\), \(K_{2} = x +\lambda _{2}K\).

If ϕ ≥ 0 is a smooth cut-off function supported in \((-1,1)^{n}\) such that \(\int \phi = 1\), then we have

$$\displaystyle{ \chi _{K_{1}} {\ast}\phi _{\delta }\,\leq \,\chi _{K}\, \leq \,\chi _{K_{2}} {\ast}\phi _{\delta }, }$$
(8.121)

where \(\chi _{K}\) stands for the indicator function of a set K, and \(\phi _{\delta }(x) =\delta ^{-n}\phi (x/\delta )\).

Proof.

From the definition it is immediate that \(\,K_{1} \subseteq K \subseteq K_{2} \subseteq (-\frac{1} {2}, \frac{1} {2})^{n}\). By translation invariance we may assume that the base point x is the origin, and then it is enough to show that \(B(K_{1},\delta ) \subseteq K\) and \(B(K,\delta ) \subseteq K_{2}\). Since \(K =\lambda _{1}K_{2}\) in this case, both claims can be shown in the same way. Indeed, assume for contradiction that there are \(y \in K_{1}\) and \(z\notin K\) such that \(\vert y - z\vert \leq \delta\). Then by the Hahn-Banach Theorem there is a unit vector v for which

$$\displaystyle{v \cdot y+\delta \geq v \cdot z >\max _{x\in K}v \cdot x \geq \lambda _{1}^{-1}\,v \cdot y,}$$

since \(\lambda _{1}^{-1}y \in K\). Also, by our assumption \(B_{\delta _{K}} \subseteq K\), so that \(\max _{x\in K}v \cdot x \geq \delta _{K}\), hence

$$\displaystyle{v \cdot y \geq v \cdot z-\delta >\delta _{K}-\delta \geq \delta _{K}/2.}$$

This implies

$$\displaystyle{\lambda _{1}\delta \geq (1 -\lambda _{1})\,v \cdot y \geq C_{0}\delta \,\delta _{K}/2,}$$

which is a contradiction since \(\lambda _{1} < 1\) and \(C_{0}\delta _{K} \geq 2\). The same argument shows that \(B(K,\delta ) \subseteq K_{2}\) and (8.121) follows. □ 
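For a concrete feel of the constants in Lemma 35, the sketch below treats the special case of a centered cube \(K = [-c,c]^{n}\) with base point x = 0, where \(\delta _{K} =\min (c, \frac{1} {2} - c)\) and the bodies \(K_{1}\), \(K_{2}\) are again centered cubes. It then checks the two inclusions \(B(K_{1},\delta ) \subseteq K\) and \(B(K,\delta ) \subseteq K_{2}\) through the sufficient conditions on half-side lengths. The restriction to cubes and the function name `lemma35_cubes` are assumptions made for illustration.

```python
from fractions import Fraction


def lemma35_cubes(c, delta):
    """Half-side lengths of K_1 and K_2 for the cube K = [-c, c]^n, x = 0."""
    c, delta = Fraction(c), Fraction(delta)
    half = Fraction(1, 2)
    if not 0 < c < half:
        raise ValueError("need 0 < c < 1/2")
    # Inradius of the cube versus its distance to the boundary of the cell.
    delta_K = min(c, half - c)
    if not 0 < delta < delta_K / 10:
        raise ValueError("need 0 < delta < delta_K / 10")
    lam1 = 1 - (2 / delta_K) * delta  # lambda_1 = 1 - C_0 * delta
    return lam1 * c, c / lam1


c, delta = Fraction(1, 4), Fraction(1, 100)
inner, outer = lemma35_cubes(c, delta)
print(inner, outer)
# The delta-neighbourhood of a centered cube lies in the cube whose half-side
# is larger by delta, so these two comparisons imply the two inclusions.
print(inner + delta <= c, c + delta <= outer)
```

With c = 1/4 and δ = 1/100 one gets \(\lambda _{1} = 23/25\), so the inner and outer half-sides are 23/100 and 25/92, and both comparisons hold.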

For a closed, convex body \(K \subseteq (-\frac{1} {2}, \frac{1} {2})^{n}\) and a diophantine point α, define the discrepancy

$$\displaystyle{D(K,\alpha,\lambda ) =\sum _{P(m)=\lambda }\chi _{K}(m_{1}\alpha _{1},\ldots,m_{n}\alpha _{n}) - N_{\lambda }vol_{n}(K),}$$

where \(\chi _{K}\) is the indicator function of K considered as a function on \(\mathbf{T}^{n}\), and \(vol_{n}(K)\) denotes the volume of the body. We have the following uniform estimate on the discrepancy.

Theorem 36.

Let \(n > d(d - 1)2^{d+1}\), let P be a non-singular integral polynomial in n variables, let \(\alpha \in \mathbf{R}^{n}\) be diophantine, and let \(\delta _{0} > 0\). Then for a closed, convex body \(K \subseteq (-\frac{1} {2}, \frac{1} {2})^{n}\) such that \(\delta _{K} \geq \delta _{0}\) we have

$$\displaystyle{ \vert D(K,\alpha,\lambda )\vert \,\ll \, N^{n-d-\eta _{d} }, }$$
(8.122)

where \(\eta _{d} > 0\) is a constant depending only on d, and the implicit constant in (8.122) depends only on the polynomial P, the point α and on \(\delta _{0}\), and is independent of K.

Proof.

Let us use the notation \(\phi _{K,\delta } =\chi _{K} {\ast}\phi _{\delta }\). By (8.121) we have for \(\delta < c\,\delta _{0}\) (c > 0 being sufficiently small)

$$\displaystyle{\sum _{P(m)=\lambda }\phi _{K_{1},\delta }(m\circ \alpha ) - N_{\lambda }\int _{\mathbf{T}^{n}}\phi _{K_{2},\delta } \leq D(K,\alpha,\lambda ) \leq \sum _{P(m)=\lambda }\phi _{K_{2},\delta }(m\circ \alpha ) - N_{\lambda }\int _{\mathbf{T}^{n}}\phi _{K_{1},\delta }.}$$

and also

$$\displaystyle{\int _{\mathbf{T}^{n}}(\phi _{K_{2},\delta } -\phi _{K_{1},\delta })\, \leq \, C\delta \ vol_{n}(K),}$$

with a constant \(C \ll \delta _{0}^{-1}\). Thus

$$\displaystyle{ \vert D(K,\alpha,\lambda )\vert \leq \max _{i=1,2}\vert D(\phi _{K_{i},\delta },\alpha,\lambda )\vert + O(N^{n-d}\delta ). }$$
(8.123)

To estimate the discrepancy with respect to the smooth functions \(\phi _{K_{i},\delta }\) we proceed as before, with the exception that now we have the following estimates on their Fourier transforms

$$\displaystyle{\vert \hat{\phi }_{K_{i},\delta }(l)\vert = \vert \hat{\chi }_{K_{i}}(l)\hat{\phi }(\delta l)\vert \,\ll \, (1 +\delta \vert l\vert )^{-2n},}$$

in particular \(\|\hat{\phi }_{K_{i},\delta }\|_{l^{1}} \ll \delta ^{-n}\). Thus

$$\displaystyle{ \vert \sum _{l\neq 0}E_{\lambda }(l\circ \alpha )\hat{\phi }_{K_{i},\delta }(l)\vert \,\ll \, N^{n-d-n\gamma _{d} }\delta ^{-n}. }$$
(8.124)

For the main terms, we have

$$\displaystyle\begin{array}{rcl} \sum _{0<\vert l\vert <N^{\varepsilon }}\vert m_{q,\lambda }(l\circ \alpha )\,\hat{\phi }_{K_{i},\delta }(l)\vert & \ll & N^{n-d}q^{-\frac{n} {D}+1}\left (1 + \frac{N} {q^{2+\varepsilon }\vert l\vert ^{1+\varepsilon }}\right )^{-\frac{n} {D}+1}(1 +\delta \vert l\vert )^{-2n} \\ & \ll & \,N^{n-d-\varepsilon ( \frac{n} {D}-1)}\,q^{-\frac{n} {D}+1}\delta ^{-n}, {}\end{array}$$
(8.125)

for \(q \leq N^{(d-1)\theta }\), choosing \(\varepsilon = (1 - 2(d - 1)\theta )/4\) as before. Also

$$\displaystyle\begin{array}{rcl} \sum _{\vert l\vert \geq N^{\varepsilon }}\vert m_{q,\lambda }(l\circ \alpha )\,\hat{\phi }_{K_{i},\delta }(l)\vert \,& \ll & \,N^{n-d}q^{-\frac{n} {D}+1}\sum _{\vert l\vert \geq N^{\varepsilon }}(1 +\delta \vert l\vert )^{-2n} \\ & \ll & \,N^{n-d}q^{-\frac{n} {D}+1}\,(1 +\delta N^{\varepsilon })^{-2n}\,N^{\varepsilon n}.{}\end{array}$$
(8.126)

Let \(\delta = N^{- \frac{\varepsilon }{ 4D} }\); then the right side of both (8.125) and (8.126) is \(O(N^{n-d- \frac{\varepsilon }{ 4D} }q^{-\frac{n} {D}+1})\). Summing for \(1 \leq q \leq N^{(d-1)\theta }\) and using (8.123) we obtain the estimate

$$\displaystyle{\vert D(K,\alpha,\lambda )\vert \,\ll \, N^{n-d- \frac{\varepsilon }{4D} }.}$$

Finally note that the exponent \(\eta _{d}:= \frac{\varepsilon } {4D}\,\) depends only on the parameters θ and D, hence ultimately only on the degree d, while the implicit constants in our estimates depend on the parameter \(\delta _{0}\) and not on the body K. This finishes the proof of Theorem 36. □ 
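The balancing of exponents behind the choice \(\delta = N^{- \frac{\varepsilon }{ 4D} }\) can be tabulated with exact arithmetic. The sketch below computes, for given n, D and ε, the savings in the exponent of N relative to \(N^{n-d}\) coming from the right sides of (8.125) and (8.126) and from the smoothing term \(O(N^{n-d}\delta )\) in (8.123). It treats n, D and ε as free parameters, uses \((1 +\delta N^{\varepsilon })^{-2n} \leq (\delta N^{\varepsilon })^{-2n}\) for the second bound, and ignores the factor in q; the function name and the sample values are hypothetical.

```python
from fractions import Fraction


def theorem36_gains(n, D, eps):
    """Exponent savings over N^(n-d) when delta = N^(-eps / (4 D))."""
    n, D, eps = Fraction(n), Fraction(D), Fraction(eps)
    s = eps / (4 * D)                           # delta = N^(-s)
    gain_small_l = eps * (n / D - 1) - n * s    # right side of (8.125)
    gain_large_l = 2 * n * (eps - s) - n * eps  # right side of (8.126)
    gain_smoothing = s                          # the term O(N^(n-d) * delta)
    return gain_small_l, gain_large_l, gain_smoothing


gains = theorem36_gains(n=100, D=10, eps=Fraction(1, 8))
print(gains, min(gains))
```

In this sample the smallest saving is the smoothing term ε/(4D) = 1/320, in line with the exponent \(\eta _{d} = \frac{\varepsilon } {4D}\) of the proof. The error term (8.124) involves \(\gamma _{d}\) and is not covered by this sketch.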

3.4 Some Possible Further Directions

Our estimates on the uniformity of the distribution of solutions to diophantine equations in many variables are by no means exhaustive. In fact even in the case of the sphere, it is not clear if our upper bounds are sharp or even what the sharp bounds should be. A closely related problem is to find lower bounds for the mean square average of the discrepancy of the lattice points on spheres over the family of all spherical caps. It is expected that the lattice points are far from optimally distributed and that essentially larger lower bounds can be obtained than the uniform lower bounds given in [3, 15] and [12]. To obtain nontrivial lower bounds one may exploit the fact that lattice points in small caps are concentrated on lower dimensional spheres.

For higher degree polynomials it is unrealistic to expect sharp bounds in the generality we have discussed. The special case of the polynomial \(P(m) = m_{1}^{d} +\ldots +m_{n}^{d}\) (d even) deserves special attention, as the number of solutions of the equation \(P(m) =\lambda\), the so-called Waring problem, has been studied extensively. In fact much sharper asymptotic formulas have been obtained than the ones which can be derived from the Birch-Davenport method [18]. In general we have only considered positive polynomials of even degree; however, there are natural analogues for indefinite forms. Indeed one may identify the solution set of the diophantine equation \(P(m_{1},\ldots,m_{n}) = 0\) within the box \(\vert m_{i}\vert \leq N\) as the set of lattice points \(\mathbf{Z}_{P}(N) = S \cap \mathbf{Z}^{n} \cap [-N,N]^{n}\), where S = {P = 0} is the zero surface of P. One can shrink this set by a factor of N and study the discrepancy with respect to caps as \(N \rightarrow \infty\).

Let us also remark that the weaker bounds we have obtained for the distribution of the solutions modulo 1 are partly due to the fact that we have allowed very rough convex sets K. It might be true that better upper bounds can be given by assuming some smoothness of the boundary of the convex body; however, it is not even immediately clear how to improve the bounds on the discrepancy with respect to balls.

Finally, as we mentioned in the introduction, the uniformity of distribution of the solutions modulo 1 is a special case of a more general phenomenon. It can be shown [11] that the images of the solution sets {P(m) = λ} become equi-distributed when mapped to a probability measure space X via a fully ergodic commuting family of measure preserving transformations. It would be interesting to see if one can get estimates for the rate of equi-distribution for measure preserving systems other than the flat torus with the coordinate shifts.