1 Introduction

The search for \(L^p(\mathbb {R}^n)\)-improving capabilities of operators in harmonic analysis is a classic line of investigation. It is interesting in many cases to see how much the operator “improves” the \(L^p\) norm (by achieving a higher value of p). In particular, determining these bounds for operators involving integration over a curved submanifold is an active research area (see the early work of Littman [9] and Strichartz [12]). With discrete operators, that is, operators defined over the integer lattice, \(l^p(\mathbb {Z}^n)\) spaces behave differently; due to the nesting properties, “improving” seems to become a trivial consequence of \(l^p\)-inequalities. However, this is not completely the case, as nontrivial quantitative improving estimates can shed light on the behavior of these operators.

In this paper, we consider the discrete spherical averaging operator along the primes, first developed (to the best of our knowledge) in [1]. This is a discrete, prime variant of Stein’s spherical averaging operator, and number theoretic properties therefore come into play in its analysis. For more information and history, see [1]. Inspired by recent interest in \(l^p\)-improving for the integer version of this operator in [4, 5, 7], we prove quantitative \(l^p(\mathbb {Z}^n)\) improving estimates for the discrete spherical averages along the primes in terms of the radius \(\lambda \). These take the form of \(l^p(\mathbb {Z}^n) \rightarrow l^{p'}(\mathbb {Z}^n)\) bounds for any \(1\le p \le 2\), and the decay rate improves as p approaches 1.

To state our main theorems, we require a few definitions. Discrete spherical averages were introduced by Magyar in [10]:

$$\begin{aligned} S_\lambda f({\mathbf {x}}) := \frac{1}{\#\{ {\mathbf {y}} \in {\mathbb {Z}}^n : |{\mathbf {y}}|^2 = \lambda \} } \sum _{|{\mathbf {y}}|^2 = \lambda } f({\mathbf {x}} - {\mathbf {y}}) \end{aligned}$$

where \(|{\mathbf {y}}|^2 = y_1^2 + \dots + y_n^2\). These played a role in Magyar–Stein–Wainger’s proof of sharp \(l^p(\mathbb {Z}^n)\) bounds for the corresponding maximal function [11]

$$\begin{aligned} S_*f ({\mathbf {x}}) := \sup _{\lambda \in {\mathbb {N}}} |S_\lambda f({\mathbf {x}}) |. \end{aligned}$$

Note that we have chosen to define these averages along spheres of radius \(\lambda ^{1/2}\).
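For concreteness, the definition of \(S_\lambda \) can be checked by brute force in small cases. The following is only an illustrative sketch: the function names `sphere_points` and `spherical_average` and the dictionary representation of f are our own assumptions, and the enumeration is far too slow for the dimensions and radii relevant to the theorems.

```python
from itertools import product
from math import isqrt


def sphere_points(n, lam):
    """All y in Z^n with y_1^2 + ... + y_n^2 == lam (brute-force search)."""
    r = isqrt(lam)
    return [y for y in product(range(-r, r + 1), repeat=n)
            if sum(c * c for c in y) == lam]


def spherical_average(f, x, lam):
    """S_lam f(x): the mean of f(x - y) over lattice points y with |y|^2 = lam.

    f is a dict mapping lattice points (tuples) to values; it is treated
    as zero outside its keys (a hypothetical representation, chosen for brevity).
    """
    pts = sphere_points(len(x), lam)
    if not pts:
        raise ValueError("no lattice points on this sphere")
    total = sum(f.get(tuple(a - b for a, b in zip(x, y)), 0.0) for y in pts)
    return total / len(pts)
```

With f the indicator of the origin in \(\mathbb {Z}^2\) and \(\lambda = 5\), the sphere carries eight lattice points, so `spherical_average` returns 1/8 at each of them and 0 elsewhere.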

Recently both Hughes [4] and Kesler–Lacey [7] used Magyar and Magyar–Stein–Wainger’s techniques to find \(l^p\)-improving estimates for these spherical averages, with decay in terms of \(\lambda \), showing that for dimensions 5 and greater, and \(\frac{n}{n-2} \le p \le 2\), there exist constants independent of \(\lambda \), such that for all \(\lambda \in \mathbb {N}\),

$$\begin{aligned} \Vert S_\lambda f\Vert _{l^{p'}(\mathbb {Z}^n)} \lesssim \lambda ^{-\eta _p}\Vert f\Vert _{l^p(\mathbb {Z}^n)} \end{aligned}$$

where \(\eta _p = \frac{n}{2}(\frac{2}{p}-1)\). (At first glance Hughes’s and Kesler–Lacey’s decay rate might look different, but we remind the reader that Kesler–Lacey define their averages along spheres of radius \(\lambda \), not \(\lambda ^{1/2}\), so these decay rates are the same.) This decay rate is also optimal in this range of p. At publication time, these ranges obtained have been updated to the wider \(\frac{n+1}{n-1} \le p \le 2\) in [7] (no \(\varepsilon \) loss) and [4] (with an \(\varepsilon \) loss). Please see both of these papers, as well as the related works [5, 6, 8], for more discussion of these bounds, as well as other related results.

The discrete averaging operators we consider are averages over the prime vectors (or “prime points”) on the algebraic surface

$$\begin{aligned} \mathfrak {f}({\mathbf {x}}) := |{\mathbf {x}} |^k := x_1^k+ \dots + x_n^k = \lambda . \end{aligned}$$
(1.1)

The Waring–Goldbach problem in analytic number theory involves the study of these points. Classic work of Hua [3] established an asymptotic for the number of these points; as long as we restrict to a specific arithmetic progression \(\Gamma _{n,k}\),

$$\begin{aligned} P(\lambda ) \sim {\mathfrak {S}}_{n,k}(\lambda ) \lambda ^{{n/k - 1}}, \end{aligned}$$
(1.2)

where \({\mathfrak {S}}_{n,k}\) is a singular series that for our purposes can be regarded as a constant, and \(P(\lambda )\) denotes the number of prime solutions of (1.1),

$$\begin{aligned} P(\lambda ) = \sum _{\mathfrak {f}({\mathbf {p}}) = \lambda } \log {\mathbf {p}}, \end{aligned}$$

counted with logarithmic weights, where \(\log {\mathbf {x}} = (\log x_1) \cdots (\log x_n)\) and \(\mathbf{p}\) is a vector with all coordinates prime. Weighting each prime point by \(\log {\mathbf {p}}\) is natural in view of the prime number theorem, since it compensates for the density of the primes.
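The weighted count \(P(\lambda )\) can be computed directly for small parameters. The sketch below is a brute-force illustration only; the helper names `primes_up_to`, `prime_points` and `weighted_count` are hypothetical, and no attempt is made to restrict \(\lambda \) to the progression \(\Gamma _{n,k}\).

```python
from itertools import product
from math import log, prod


def primes_up_to(m):
    """Sieve of Eratosthenes: list of all primes <= m."""
    if m < 2:
        return []
    sieve = [True] * (m + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(m ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, m + 1, i):
                sieve[j] = False
    return [i for i, is_p in enumerate(sieve) if is_p]


def prime_points(n, k, lam):
    """Ordered prime vectors p with p_1^k + ... + p_n^k == lam (brute force)."""
    bound = int(round(lam ** (1.0 / k))) + 1  # every coordinate is <= lam^(1/k)
    ps = primes_up_to(bound)
    return [p for p in product(ps, repeat=n) if sum(c ** k for c in p) == lam]


def weighted_count(n, k, lam):
    """P(lam): sum of (log p_1)...(log p_n) over the prime points."""
    return sum(prod(log(c) for c in p) for p in prime_points(n, k, lam))
```

For example, with \(n=3\), \(k=2\) and \(\lambda = 83 = 3^2+5^2+7^2\), the only prime points are the six orderings of \((3,5,7)\), so `weighted_count(3, 2, 83)` equals \(6 \log 3 \log 5 \log 7\).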

The authors in [1] study an ergodic version of this problem by asking quantitative distributional questions about these points, and answer these by proving \(l^p(\mathbb {Z}^n)\) bounds for the discrete spherical maximal function along the primes. We consider these discrete spherical averages defined for \(\mathbf{l}\in \mathbb {Z}^n\)

$$\begin{aligned} A_\lambda f(\mathbf{l}) := \sigma _\lambda \star f(\mathbf{l}) = \frac{1}{P(\lambda )} \sum _{|\mathbf{p}|^k=\lambda } (\log \mathbf{p}) f(\mathbf{p-l}). \end{aligned}$$
(1.3)

where \(\sigma _\lambda \) is a probability measure on \(\mathbb {Z}^n\) defined by

$$\begin{aligned} \sigma _\lambda ({{\mathbf {x}}}) := \frac{1}{P(\lambda )}\mathbf{1}_{\{{{\mathbf {p}}} \in \mathbb {P}^n: |{\mathbf {p}}|^k=\lambda \}}({{\mathbf {x}}}) \log {{\mathbf {x}}}, \end{aligned}$$

and \(\lambda \in \Gamma _{n,k}\), an infinite arithmetic progression of radii where the Hua asymptotic (1.2) holds. Let \(1 \le p \le 2\) and define \(n_0(k) = 2^k+1\) when \(k = 2, 3\) or 4, and

$$\begin{aligned} n_0(k) = k^2 + 3 - \max _{1 \le j \le k-1} \left\lceil \frac{kj - \min (2^j,j^2+j)}{k-j+1} \right\rceil \end{aligned}$$

when \(k \ge 5\). Let \(\lambda \in \Gamma _{n,k}\). We now recall one of the main results in [1] as we will use part of this decomposition. For more details, background, and other subtleties of the ergodic Waring–Goldbach problem, see [1].
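The threshold \(n_0(k)\) is easy to tabulate from its definition. The following sketch simply evaluates the formula above in exact integer arithmetic; the function name `n0` is our own.

```python
def n0(k):
    """Evaluate n_0(k) as defined above, using exact integer arithmetic."""
    if k < 2:
        raise ValueError("k must be at least 2")
    if k in (2, 3, 4):
        return 2 ** k + 1

    def ceil_div(a, b):
        # Ceiling of a / b for integers with b > 0.
        return -(-a // b)

    best = max(ceil_div(k * j - min(2 ** j, j * j + j), k - j + 1)
               for j in range(1, k))
    return k * k + 3 - best
```

For instance, this gives \(n_0(2) = 5\), \(n_0(3) = 9\), \(n_0(4) = 17\) and \(n_0(5) = 25\) (for \(k=5\) the maximum is attained at \(j=3\)).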

For an integer \(q \ge 1\), we write \({\mathbb {Z}}_q = {\mathbb {Z}}/q{\mathbb {Z}}\) and \({\mathbb {U}}_q = {\mathbb {Z}}_q^*\), the group of units. If \({\mathbf {q}} = (q_1, \dots , q_n) \in {\mathbb {Z}}^n\), with \({\mathbf {q}} \ge 1\) (where we mean that \(q_i \ge 1\) for all i), write \({\mathbb {U}}_{{\mathbf {q}}} = {\mathbb {U}}_{q_1} \times \dots \times {\mathbb {U}}_{q_n}\); also we set \(\mathbf {a/q} = (a_1/q_1, \dots , a_n/q_n)\) and \(\mathbf {a}\mathbf {q}= (a_1q_1, \dots , a_nq_n)\) if \({\mathbf {a}} = (a_1, \dots , a_n)\) is another vector. For \(\lambda \in {\mathbb {Z}}\) and \({\mathbf {a}}, {\mathbf {q}} \in {\mathbb {Z}}^n\), with \({\mathbf {q}} \ge 1\), define

$$\begin{aligned} g(a,q; b,r)= & {} \frac{1}{\varphi ([q,r])} \sum _{x \in {\mathbb {U}}_{[q,r]}} e\bigg ( \frac{ax^k}{q} + \frac{bx}{r} \bigg ), \\ {\mathfrak {S}}(\lambda ; {\mathbf {a}}, {\mathbf {q}})= & {} \sum _{q=1}^\infty \sum _{a \in {\mathbb {U}}_q} e( -\lambda a/q ) \prod _{i=1}^ng(a, q; a_i, q_i), \end{aligned}$$

where \(\varphi \) is Euler’s totient function and \([q,r] = {{\,\mathrm{lcm}\,}}[q,r]\). Choose a smooth bump function \(\psi \) such that

$$\begin{aligned} {\mathbf {1}}_{{\mathcal {Q}}}({\mathbf {x}}) \le \psi ({\mathbf {x}}) \le {\mathbf {1}}_{{\mathcal {Q}}}({\mathbf {x}}/2), \end{aligned}$$

where \({\mathbf {1}}_{{\mathcal {Q}}}\) is the indicator function of the cube \({\mathcal {Q}} = [-1,1]^n\). Finally, write \(\psi _t({\mathbf {x}}) = \psi (t{\mathbf {x}})\).
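The normalized exponential sum \(g(a,q;b,r)\) can be evaluated numerically for small moduli. In the sketch below we assume the standard convention \(e(t) = e^{2\pi i t}\), and we pass the degree k as an explicit argument; the function names are our own.

```python
import cmath
from math import gcd


def e(t):
    """Assumed convention: e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def lcm(a, b):
    """Least common multiple of two positive integers."""
    return a * b // gcd(a, b)


def g(a, q, b, r, k):
    """g(a, q; b, r): average of e(a x^k / q + b x / r) over units x mod [q, r]."""
    m = lcm(q, r)
    units = [x for x in range(1, m + 1) if gcd(x, m) == 1]
    # len(units) equals Euler's totient of m.
    return sum(e(a * x ** k / q + b * x / r) for x in units) / len(units)
```

Since g is an average of unimodular terms, \(|g(a,q;b,r)| \le 1\) always holds. As a small check, for \(k=2\) both units modulo 3 square to 1, so `g(1, 3, 0, 1, 2)` equals \(e(1/3)\).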

Theorem 1

(From [1]) Let \(k \ge 2\) and n be large enough. Also, let \(\lambda \in \Gamma _{n,k}\) be large, and suppose that \(\lambda ^{1/k} \le N \lesssim \lambda ^{1/k}\). For any fixed \(B>0\), there exists a \(C = C(B) > 0\) such that one has the decomposition

$$\begin{aligned} \widehat{\sigma _\lambda } ({\varvec{\xi }})= & {} \frac{\lambda ^{n/k-1}}{P(\lambda )} \sum _{ 1 \le \mathbf {q}\le Q} \sum _{\mathbf {a}\in {\mathbb {U}}_{\mathbf {q}}} {\mathfrak {S}}(\lambda ; {\mathbf {a}}, {\mathbf {q}})\psi _{N/Q}(\mathbf {q}{\varvec{\xi }}-\mathbf {a}) \widetilde{d\sigma _{\lambda }}({\varvec{\xi }}-\mathbf {a}/\mathbf {q})\\&+ \widehat{E_\lambda }({\varvec{\xi }}) := \widehat{M_\lambda }({\varvec{\xi }})+ \widehat{E_\lambda }({\varvec{\xi }}), \end{aligned}$$

where \(Q = (\log N)^C\) and \(\widetilde{d\sigma _{\lambda }}\) is the Fourier transform of the continuous k-spherical surface measure.

We recall here that this Theorem was stated for slightly different values of n. However, that statement included an \(l^2(\mathbb {Z}^n)\) bound for a certain dyadic operator that we do not need here. Therefore, Theorem 1 as stated here, and as used later on, actually holds for the larger range \(n \ge n_0(k)\).

Specifically, we prove the following quantitative \(l^p(\mathbb {Z}^n)\)-improving decay rate:

Theorem 2

Let \(n \ge n_0(k)\) and the spherical averages \(A_\lambda \) be defined as above. Then we have the following quantitative \(l^p(\mathbb {Z}^n)\) improving inequality, with decay in \(\lambda \):

For \(k \ge 3\) and \(1<p<2\)

$$\begin{aligned} \Vert A_\lambda f\Vert _{l^{p'}(\mathbb {Z}^n)} \le C_{B,p,n,k}\lambda ^{(1-\frac{n}{k})(\frac{2}{p}-1)}(\log \lambda )^{-B}\Vert f\Vert _{l^p(\mathbb {Z}^n)} \end{aligned}$$

and for \(k=2\) and \(1<p<2\)

$$\begin{aligned} \Vert A_\lambda f\Vert _{l^{p'}(\mathbb {Z}^n)} \le C_{p,n}\lambda ^{(1-\frac{n}{2})(\frac{2}{p}-1)}(\log \lambda )^{n}\Vert f\Vert _{l^p(\mathbb {Z}^n)} \end{aligned}$$

The power decay comes from the trivial bound; for \(k\ge 3\) additionally we gain a logarithmic decay. No such gain is present for \(k=2\) due to the fact that the main term is larger than the error term, see the remarks after the proof of Theorem 2. This logarithmic decay is likely the best possible using these methods; additional decay might be possible due to increased knowledge about the distribution of primes and would likely require the resolution of deep problems in analytic number theory.

We can also say something similar for the discrete dyadic prime spherical maximal function: see Theorem 4. Both of these theorems follow by a straightforward interpolation argument; however, the averages for the main term of the decomposition, \(M_\lambda \), satisfy a better quantitative improving bound for \(k \ge 3\), whose proof requires a careful decomposition of the corresponding multiplier, which is where the main work of this paper lies.

Theorem 3

Let n be as in Theorem 2. Then the averaging operator \(M_\lambda \) (see Theorem 1) satisfies the following improving estimate for \(k \ge 3\) and \(1<p<2\):

$$\begin{aligned} \Vert M_\lambda f\Vert _{l^{p'}(\mathbb {Z}^n)} \le C_{p,n,k}\lambda ^{(\frac{2-n}{k})(\frac{2}{p}-1)}(\log N)^{C(\frac{2}{p}-1)}\Vert f\Vert _{l^p(\mathbb {Z}^n)}. \end{aligned}$$

The main term operator contains both the arithmetic and analytic content of the spherical averages, see e.g. [1, 11] for a discussion of this. Also, the maximal function of the main term satisfies \(l^p\) bounds for all \(p> \frac{n}{n-2}\), independent of the degree k (as long as the degree is large enough). Proving this non-trivial \(l^p\)-improving of these main term operators may lead to insight on how to improve the error term in the decomposition, thus leading to improved \(l^p\) bounds for the spherical maximal operator. Additionally, knowledge of \(l^p\)-improving for the main term will help to compare on a structural level the similarities and differences between the prime and integer spherical maximal functions; for example, might the optimal dependence on \(\lambda \) be independent of the degree k, as in the \(l^p \rightarrow l^p\) bound? Finally, better knowledge of quantitative \(l^p\) improving estimates (for the main term or the entire operator) would likely lead to better quantitative sparse bounds, recently pursued in the integer setting [6], but not yet for this prime variant.

Most of the next and final section of this paper is devoted to proving Theorem 3. We show a certain decay rate for the main term \(M_\lambda \) of the operator, which requires a careful analysis of its corresponding multiplier. This immediately leads to Theorem 3.

We use a boldface script to denote multidimensional vectors in dimension n, where the underlying space they belong to (\(\mathbb {R}^n\), \(\mathbb {Z}^n\), or \(\mathbb {T}^n\)) will be clear from the context. Moreover the notation \(A\lesssim B\) will be used to mean \( A \le CB\) where C is a constant that may depend on p, n, or k, but never on \(\lambda \). The notation used in this paper will be introduced as needed. The next section contains all the proofs and a brief discussion.

2 Proofs and Discussion

Theorem 2 is proved by interpolating the following estimates:

$$\begin{aligned} \Vert A_\lambda f\Vert _{l^\infty (\mathbb {Z}^n)}\lesssim & {} \lambda ^{1-\frac{n}{k}}(\log {\lambda }^{1/k})^n\Vert f\Vert _{l^1(\mathbb {Z}^n)}\end{aligned}$$
(2.1)
$$\begin{aligned} \Vert M_\lambda f\Vert _{l^\infty (\mathbb {Z}^n)}\lesssim & {} \lambda ^{\frac{2-n}{k}}(\log {N})^C\Vert f\Vert _{l^1(\mathbb {Z}^n)} \end{aligned}$$
(2.2)
$$\begin{aligned} \Vert A_\lambda f\Vert _{l^2(\mathbb {Z}^n)}\lesssim & {} \Vert f\Vert _{l^2(\mathbb {Z}^n)}\end{aligned}$$
(2.3)
$$\begin{aligned} \Vert E_\lambda f\Vert _{l^2(\mathbb {Z}^n)}\lesssim & {} (\log {N})^{-B}\Vert f\Vert _{l^2(\mathbb {Z}^n)} \end{aligned}$$
(2.4)

Given these estimates, we prove Theorem 2:

Proof of Theorem 2

Let \(k \ge 3\) and \(1 \le p \le 2\). First, since \(E_\lambda = A_\lambda - M_\lambda \), estimates (2.1) and (2.2) give

$$\begin{aligned} \Vert E_\lambda f\Vert _{l^\infty (\mathbb {Z}^n)} \lesssim \lambda ^{1-\frac{n}{k}}(\log {\lambda }^{1/k})^n\Vert f\Vert _{l^1(\mathbb {Z}^n)} \end{aligned}$$

since \(\frac{2-n}{k}< 1-\frac{n}{k}<0\). Interpolating this with (2.4) we get

$$\begin{aligned} \Vert E_\lambda f\Vert _{l^{p'}(\mathbb {Z}^n)} \lesssim \lambda ^{(1-\frac{n}{k})(\frac{2}{p}-1)}(\log {\lambda }^{1/k})^{n(\frac{2}{p}-1)}(\log N)^{-B(2-\frac{2}{p})}\Vert f\Vert _{l^p(\mathbb {Z}^n)}. \end{aligned}$$

Note that we can choose \(B \ge n\) which simplifies the expression. For the main term, interpolating (2.2) and \( \Vert M_\lambda f\Vert _{l^2(\mathbb {Z}^n)} \lesssim \Vert f\Vert _{l^2(\mathbb {Z}^n)}\) derived from (2.3) gives

$$\begin{aligned} \Vert M_\lambda f\Vert _{l^{p'}(\mathbb {Z}^n)} \lesssim \lambda ^{(\frac{2-n}{k})(\frac{2}{p}-1)}(\log N)^{C(\frac{2}{p}-1)}\Vert f\Vert _{l^p(\mathbb {Z}^n)}. \end{aligned}$$

Putting these together, we get

$$\begin{aligned} \Vert A_\lambda f\Vert _{l^{p'}(\mathbb {Z}^n)} \lesssim \max \{ \lambda ^{(1-\frac{n}{k})(\frac{2}{p}-1)}(\log N)^{-B}, \lambda ^{(\frac{2-n}{k})(\frac{2}{p}-1)}(\log N)^{C(\frac{2}{p}-1)}\}\Vert f\Vert _{l^p(\mathbb {Z}^n)} \end{aligned}$$

which gives Theorem 2.

For \(k=2\), we can do better by simply interpolating (2.1) and (2.3), which gives the trivial interpolation bound of

$$\begin{aligned} \Vert A_\lambda f\Vert _{l^{p'}(\mathbb {Z}^n)} \lesssim \lambda ^{(1-\frac{n}{k})(\frac{2}{p}-1)}(\log \lambda )^{n}\Vert f\Vert _{l^p(\mathbb {Z}^n)} \end{aligned}$$

\(\square \)
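The exponents in the proof above come from interpolating an \(l^1 \rightarrow l^\infty \) bound with an \(l^2 \rightarrow l^2\) bound: writing \(\frac{1}{p} = \frac{1-\theta }{1} + \frac{\theta }{2}\) gives \(\theta = 2 - \frac{2}{p}\) and \(1-\theta = \frac{2}{p}-1\). The following sketch, with function names of our own choosing, computes these weights and the resulting power of \(\lambda \) exactly.

```python
from fractions import Fraction


def interpolation_weights(p):
    """Return (1 - theta, theta), where 1/p = (1 - theta)/1 + theta/2."""
    p = Fraction(p)
    if not 1 <= p <= 2:
        raise ValueError("p must lie in [1, 2]")
    theta = 2 - 2 / p
    return 1 - theta, theta


def lambda_exponent(n, k, p):
    """Power of lambda obtained by interpolating (2.1) with (2.3).

    The l^1 -> l^infty endpoint contributes the exponent 1 - n/k and the
    l^2 -> l^2 endpoint contributes the exponent 0.
    """
    weight_l1, _ = interpolation_weights(p)
    return (1 - Fraction(n, k)) * weight_l1
```

For example, with \(n=9\), \(k=3\) and \(p = 4/3\) the weights are both 1/2 and the power of \(\lambda \) is \((1-3)\cdot \frac{1}{2} = -1\), in agreement with \((1-\frac{n}{k})(\frac{2}{p}-1)\).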

Proof of Theorem 3

Note that the second term in the maximum that appears in the last inequality of the previous proof comes from the main term operator \(M_{\lambda }\). Therefore, once estimate (2.2) is proved, we get Theorem 3. \(\square \)

Remark 1

The decay rate in \(\lambda \) improves as p approaches 1, and at \(p=2\) we get no improvement. This is to be expected: lower values of p should give better decay, since improving is “trivial” in this discrete setting.

Remark 2

Note that these estimates hold for \(n\ge 5\) if \(k=2\). This is in contrast to some of the results in [1] which only hold for \(n \ge 7\). This might provide further evidence that the spherical maximal function along the primes might be bounded for all \(n\ge 5\).

Remark 3

For \(k \ge 3\), we have that \(\frac{2-n}{k} < 1-\frac{n}{k}\). This is due to the fact that the decay from the main term is greater for \(p<2\), as is expected. Any improvements to the estimate (2.4) would automatically improve Theorem 2. For the case \(k=2\), this decomposition from Theorem 1 gives no improvement to the interpolation between the easy estimates (2.1) and (2.3), because in this case the error term actually has better decay.

Remark 4

The decay in Theorem 3 is likely not optimal. To discuss optimality, we will likely have to restrict to a smaller range of p, such as in [7]. It seems a plausible conjecture that once we restrict to a certain range of p, the optimal decay rate is \( \Vert M_\lambda f\Vert _{l^{p'}(\mathbb {Z}^n)} \lesssim \lambda ^{-\frac{n}{k}(\frac{2}{p}-1)}(\log \lambda )^{C(\frac{2}{p}-1)}\Vert f\Vert _{l^p(\mathbb {Z}^n)}\). However, the corresponding conjecture in the integer case is not fully known in terms of the range of p. Any improvements to estimate (2.2) would directly improve this decay rate in our case, but it is likely that improvements to Theorem 3 will come from other techniques.

We now prove (2.1) through (2.4). We use the trivial estimate to get (2.1): due to the Hua asymptotic for \(P(\lambda )\), we have

$$\begin{aligned} |A_\lambda f({\mathbf {l}})| \lesssim \frac{1}{\lambda ^{\frac{n}{k}-1}} \sum _{|\mathbf{p}|^k=\lambda } (\log \mathbf{p}) |f(\mathbf{p-l})|, \end{aligned}$$

which can be bounded trivially in \(l^\infty (\mathbb {Z}^n)\) norm by \(\lambda ^{1-\frac{n}{k}}(\log {\lambda }^{1/k})^n\Vert f\Vert _{l^1(\mathbb {Z}^n)}\). Estimate (2.3) is also easily seen to hold by Young’s inequality, since \(\sigma _\lambda \) is defined to be a probability measure.

We now describe how to get (2.4). Note that from [1] we have that

$$\begin{aligned} \left\| \widehat{E_\lambda } \right\| _{L^\infty ({\mathbb {T}}^n)} \lesssim (\log \lambda )^{-B}, \end{aligned}$$
(2.5)

and moreover, that this actually holds for a larger range, including all \(n \ge 5\) when \(k=2\). With this in mind, using Plancherel’s theorem and properties of the Fourier transform, we have

$$\begin{aligned} \Vert E_\lambda f\Vert _{l^2(\mathbb {Z}^n)}= & {} \Vert \widehat{E_\lambda f}\Vert _{L^2(\mathbb {T}^n)} = \Vert \widehat{E_\lambda }\widehat{f}\Vert _{L^2(\mathbb {T}^n)} \le \Vert \widehat{E_\lambda }\Vert _{L^\infty (\mathbb {T}^n)}\Vert f\Vert _{l^2(\mathbb {Z}^n)} \\\lesssim & {} (\log \lambda )^{-B}\Vert f\Vert _{l^2(\mathbb {Z}^n)} \end{aligned}$$

which gives (2.4) as claimed.

The remainder of the paper is devoted to proving (2.2). We follow the method outlined in [4] with the technology and notation from [1]; we prove a favorable \(L^\infty \) estimate for the kernel of the main term operator, which is the inverse Fourier transform of \(\widehat{M_\lambda }\). This is because

$$\begin{aligned} \Vert M_\lambda f\Vert _{l^\infty (\mathbb {Z}^n)} = \Vert K_\lambda \star f\Vert _{l^\infty (\mathbb {Z}^n)} \lesssim \Vert K_\lambda \Vert _{l^\infty (\mathbb {Z}^n)}\Vert f\Vert _{l^1(\mathbb {Z}^n)} \end{aligned}$$

by Young’s inequality (where \(K_\lambda \) is the kernel of the convolution operator), so if we can show

$$\begin{aligned} \Vert K_\lambda \Vert _{l^\infty (\mathbb {Z}^n)} \lesssim \lambda ^{\frac{2-n}{k}}(\log N)^C \end{aligned}$$
(2.6)

then we will get (2.2).

Recall that we can maneuver the sums in the variable a and q as well as the multidimensional sums in \(\mathbf {a}\) and \(\mathbf {q}\), and note that even though the sum in q in [1] extends to \(\infty \), this was chosen purely for convenience, and that this sum needs only to be taken to N. With this in mind, denote

$$\begin{aligned} G_\lambda (\mathbf {a}, \mathbf {q}) = \prod _{i=1}^ng(a, q; a_i, q_i) \end{aligned}$$

and

$$\begin{aligned} \psi _{N/Q, \mathbf {q}}({\varvec{\xi }}-\mathbf {a}/\mathbf {q}) = \psi _{N/Q}(\mathbf {q}{\varvec{\xi }}-\mathbf {a}) \end{aligned}$$

so that

$$\begin{aligned} \widehat{K_\lambda } := \widehat{M_\lambda }({\varvec{\xi }}) = \sum _{q=1}^N \sum _{a\in {\mathbb {U}}_q} \widehat{K_\lambda ^{a,q}}({\varvec{\xi }}), \end{aligned}$$

where

$$\begin{aligned} \widehat{K_\lambda ^{a,q}}({\varvec{\xi }}) := e\left( -\lambda a/q \right) \sum _{\mathbf {q}\le Q}\sum _{{\mathbf {a}} \in {\mathbb {U}}_{\mathbf {q}}} G_\lambda (\mathbf {a}, \mathbf {q}) \psi _{N/Q, \mathbf {q}}({\varvec{\xi }}-\mathbf {a}/\mathbf {q}) \widetilde{d \sigma _{\lambda }}({\varvec{\xi }}-{\mathbf {a}}/{\mathbf {q}}). \end{aligned}$$

To show (2.6), we start by applying Fourier inversion

$$\begin{aligned} K_\lambda ^{a,q}(\mathbf{x}) = e({-}\lambda \cdot a/q)\int _{\mathbb {T}^n}e({-} \mathbf{x}\cdot {\varvec{\xi }})\sum _{ 1 \le \mathbf {q}\le Q}\sum _{\mathbf {a}\in {\mathbb {U}}_{\mathbf {q}}} G_\lambda (\mathbf {a},\mathbf {q})\psi _{N/Q, \mathbf {q}}({\varvec{\xi }}{-}\mathbf {a}/\mathbf {q}) \widetilde{d\sigma _{\lambda }}({\varvec{\xi }}{-}\mathbf {a}/\mathbf {q}) \end{aligned}$$

and noting that for each fixed \(\xi \), the supports of the \(\psi _{N/Q}(\mathbf {q}{\varvec{\xi }}-\mathbf {a})\) are disjoint in \(\mathbf {q}\) (for \(\mathbf {q}\le Q\)), we get

$$\begin{aligned} K_\lambda ^{a,q}(\mathbf{x}) = e(-\lambda \cdot a/q)\int _{\mathbb {T}^n}e(- \mathbf{x}\cdot {\varvec{\xi }})\sum _{\mathbf {a}\in {\mathbb {U}}_{\mathbf {q}}} G_\lambda (\mathbf {a}, \mathbf {q})\psi _{N/Q, \mathbf {q}}({\varvec{\xi }}-\mathbf {a}/\mathbf {q}) \widetilde{d\sigma _{\lambda }}({\varvec{\xi }}-\mathbf {a}/\mathbf {q}). \end{aligned}$$

Writing out the Gauss sum along with multiplying and dividing by \(e(\frac{a_ix_i}{q_i})\), the integral becomes

$$\begin{aligned}&\int _{\mathbb {T}^n}e(- \mathbf{x}\cdot {\varvec{\xi }})\sum _{\mathbf {a}\in {\mathbb {U}}_{\mathbf {q}}} \prod _{i=1}^n\frac{1}{\varphi ([q,q_i])} \sum _{b \in {\mathbb {U}}_{[q,q_i]}} e\bigg ( \frac{ab^k}{q} + \frac{a_ib}{q_i} \bigg )\\&\quad e\left( \frac{a_ix_i}{q_i}\right) e\left( \frac{-a_ix_i}{q_i}\right) \psi _{N/Q, \mathbf {q}}({\varvec{\xi }}-\mathbf {a}/\mathbf {q}) \widetilde{d\sigma _{\lambda }}({\varvec{\xi }}-\mathbf {a}/\mathbf {q}). \end{aligned}$$

Since we are integrating over \({\varvec{\xi }}\), due to the cutoff function \(\psi \) we can extend the integration to \(\mathbb {R}^n\). We therefore get

$$\begin{aligned} K_\lambda ^{a,q}(\mathbf{x}) = e(-\lambda \cdot a/q) G_x(a,q,\mathbf {q})\int _{\mathbb {R}^n}e(-\mathbf{x}({\varvec{\xi }} - \mathbf {a}/\mathbf {q}))\psi _{N/Q, \mathbf {q}}({\varvec{\xi }}-\mathbf {a}/\mathbf {q}) \widetilde{d\sigma _{\lambda }}({\varvec{\xi }}-\mathbf {a}/\mathbf {q}) \end{aligned}$$

where

$$\begin{aligned} G_x(a,q,\mathbf {q}) = \sum _{\mathbf {a}\in {\mathbb {U}}_{\mathbf {q}}} \prod _{i=1}^n\frac{1}{\varphi ([q,q_i])} \sum _{b \in {\mathbb {U}}_{[q,q_i]}} e\bigg ( \frac{ab^k}{q} + \frac{a_ib}{q_i} \bigg )e\bigg (-\frac{a_ix_i}{q_i}\bigg ) \end{aligned}$$
(2.7)

since we have

$$\begin{aligned} e(-{\varvec{x}}\cdot {\varvec{\xi }})\prod _{i=1}^n e(a_ix_i/q_i) = e(-\sum _{i=1}^n x_i(\xi _i-\frac{a_i}{q_i})) = e(-{\varvec{x}}({\varvec{\xi }}-\mathbf {a}/\mathbf {q})). \end{aligned}$$

Putting this all together, we get

$$\begin{aligned} K_\lambda ^{a,q}(\mathbf{x}) = e(-\lambda \cdot a/q) G_x(a,q,\mathbf {q}) \mathcal {F}(\psi _{N/Q, \mathbf {q}}({\varvec{\xi }}-\mathbf {a}/\mathbf {q}) \widetilde{d\sigma _{\lambda }}({\varvec{\xi }}-\mathbf {a}/\mathbf {q})) \end{aligned}$$

where \(\mathcal {F}\) indicates the Fourier transform.

Using properties of the Fourier transform, we can rewrite

$$\begin{aligned} \mathcal {F}(\psi _{N/Q, \mathbf {q}}({\varvec{\xi }}-\mathbf {a}/\mathbf {q}) \widetilde{d\sigma _{\lambda }}({\varvec{\xi }}-\mathbf {a}/\mathbf {q})) = \widetilde{\psi _{N/Q, \mathbf {q}}}\star d\sigma _{\lambda }({\varvec{x}}) \end{aligned}$$

to get

$$\begin{aligned} K_\lambda ^{a,q}(\mathbf{x}) = e(-\lambda \cdot a/q)G_x(a,q,\mathbf {q})\widetilde{\psi _{N/Q, \mathbf {q}}}\star d\sigma _{\lambda }({\varvec{x}}) \end{aligned}$$
(2.8)

so that

$$\begin{aligned} |K_\lambda ^{a,q}(\mathbf{x})| = |G_x(a,q,\mathbf {q})| |\widetilde{\psi _{N/Q, \mathbf {q}}}\star d\sigma _{\lambda }({\varvec{x}})|. \end{aligned}$$

Next we will bound both the Gauss component (2.7) and the convolution in absolute values. We can rewrite (2.7) as

$$\begin{aligned} G_x(a,q,\mathbf {q})= & {} \prod _{i=1}^n\sum _{a_i\in U_{q_i}}\frac{1}{\varphi ([q,q_i])} \sum _{b \in {\mathbb {U}}_{[q,q_i]}} e\bigg ( \frac{ab^k}{q} + \frac{a_ib}{q_i} \bigg )e\bigg (\frac{-a_ix_i}{q_i}\bigg ) \\= & {} \prod _{i=1}^n\sum _{a_i\in U_{q_i}}g(a, q; a_i, q_i)e\bigg (\frac{-a_ix_i}{q_i}\bigg ). \end{aligned}$$

(To see this, note that if \(a_i\in U_{q_i}\) then there exists an \(n_i\) such that \(n_ia_i = 1\), so \({{\,\mathrm{lcm}\,}}_i(n_i)a_i = 1\), so \(\mathbf {a}\in U_\mathbf {q}\). Conversely, \(\mathbf {a}\in U_\mathbf {q}\) implies \(a_i\in U_{q_i}\) for all i.)

We have the following bound:

Lemma 1

For all \(\varepsilon >0\),

$$\begin{aligned} |\sum _{a_i\in U_{q_i}}g(a, q; a_i, q_i)e\bigg (\frac{-a_ix_i}{q_i}\bigg )| \lesssim q_i^{\varepsilon }. \end{aligned}$$

Proof

From the proof of Lemma 6, part (iii) in [1], we have that

$$\begin{aligned} |\sum _{a_i\in U_{q_i}}g(a, q; a_i, q_i)e\bigg (\frac{-a_ix_i}{q_i}\bigg )| \le \frac{\tau ((q,q_i))}{\varphi (q_i/(q,q_i))}\sum _{d|\frac{q_i}{(q,q_i)}}d = \frac{\tau ((q,q_i))}{\varphi (q_i/(q,q_i))}\sigma (q_i/(q,q_i)). \end{aligned}$$

Now recall the following facts:

  • \(\tau (n) \lesssim n^{\varepsilon '}\)

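  • \(n^{1-\varepsilon ''} \lesssim \varphi (n) \le n\)

  • \(\sigma (n) \lesssim n\log \log n\) for \(n \ge 3\)

for all \(\varepsilon ', \varepsilon '' >0\), where the implied constants may depend on \(\varepsilon '\) and \(\varepsilon ''\). Here \(\tau (n) = \sum _{d|n}1\) and \(\sigma (n) = \sum _{d|n}d\); by Grönwall’s theorem the sharp asymptotic constant in the last bound is \(e^\gamma \), where \(\gamma \) is the Euler–Mascheroni constant.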

Using these facts, we get the bound

$$\begin{aligned} e^\gamma (q,q_i)^{\varepsilon '} \bigg (\frac{q_i}{(q,q_i)}\bigg )^{\varepsilon ''}\log \log \bigg (\frac{q_i}{(q,q_i)}\bigg ) \lesssim q_i^{\varepsilon } \end{aligned}$$

\(\square \)
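The three arithmetic functions used in the proof of Lemma 1 are simple to compute for small arguments, which gives a quick sanity check on the facts recalled there. The sketch below is a naive illustration with function names of our own; it also shows that for small n the ratio \(\sigma (n)/(n\log \log n)\) can exceed \(e^\gamma \) (for instance at \(n=12\)), so the bound on \(\sigma \) is best read with an implied constant.

```python
from math import exp, gcd, isqrt, log

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant (numerical value)


def divisors(n):
    """Sorted list of the positive divisors of n."""
    small = [d for d in range(1, isqrt(n) + 1) if n % d == 0]
    return sorted(set(small + [n // d for d in small]))


def tau(n):
    """Number of divisors of n."""
    return len(divisors(n))


def sigma(n):
    """Sum of the divisors of n."""
    return sum(divisors(n))


def phi(n):
    """Euler's totient function, by direct count."""
    return sum(1 for x in range(1, n + 1) if gcd(x, n) == 1)


def sigma_ratio(n):
    """sigma(n) / (n log log n), defined for n >= 3."""
    return sigma(n) / (n * log(log(n)))
```

For example, \(\tau (12) = 6\), \(\varphi (12) = 4\) and \(\sigma (12) = 28\), and `sigma_ratio(12)` is larger than `exp(EULER_GAMMA)`.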

Therefore

$$\begin{aligned} |G_x(a,q,\mathbf {q})| \le \prod _{i=1}^nq_i^\varepsilon \end{aligned}$$

To bound the convolution in (2.8), we use a variant of the well-known decay of the spherical measure, adapted to the diagonal transformation \(\mathbf {q}\):

$$\begin{aligned} \widetilde{\psi _{N/Q, \mathbf {q}}}\star d\sigma _{\lambda }({\varvec{x}}) \lesssim \frac{QN^{-n}}{\mathbf {q}(1+\frac{|{\varvec{x}}|}{N\mathbf {q}})^M} \end{aligned}$$

where \(M>0\) is any natural number and the implicit constant depends on M (see, for example, equation (5.5.12) in [2]—this also holds for degree k spheres).

Now we show (2.6). Trivially summing over \(a\in U_q\), we have

$$\begin{aligned} \Bigg |\sum _{a\in U_q}K_\lambda ^{a,q}({\varvec{x}})\Bigg | \le q|K_\lambda ^{a,q}({\varvec{x}})| \le q \prod _{i=1}^nq_i^\varepsilon |\widetilde{\psi _{N/Q, \mathbf {q}}}\star d\sigma _{\lambda }({\varvec{x}})| \lesssim qQ^{1+\varepsilon }N^{-n} \end{aligned}$$
(2.9)

and we have (by technically abusing notation in that the constant C below should be \(C' = C(1+\varepsilon )\)),

$$\begin{aligned} \Bigg |\sum _{q=1}^N\sum _{a\in U_q}K_\lambda ^{a,q}({\varvec{x}})\Bigg | \lesssim N^2Q^{1+\varepsilon }N^{-n} = \lambda ^{\frac{2-n}{k}}Q^{1+\varepsilon } = \lambda ^{\frac{2-n}{k}}(\log N)^C \end{aligned}$$

which is (2.6).

Finally, we indicate how to prove the following maximal dyadic version of Theorem 2:

Theorem 4

Let \(n \ge n_1(k)\) (see [1] for a precise definition). Then we have for \(k \ge 3\) and \(1<p<2\)

$$\begin{aligned} \left\| \sup _{\Lambda \le \lambda < 2\Lambda }|A_\lambda f|\right\| _{l^{p'}(\mathbb {Z}^n)} \lesssim C_{B,p,n,k}\Lambda ^{(1-\frac{n}{k})(\frac{2}{p}-1)}(\log \Lambda )^{-B}\Vert f\Vert _{l^p(\mathbb {Z}^n)} \end{aligned}$$

and for \(k=2\) and \(1<p<2\)

$$\begin{aligned} \left\| \sup _{\Lambda \le \lambda < 2\Lambda }|A_\lambda f|\right\| _{l^{p'}(\mathbb {Z}^n)} \le C_{p,n}\Lambda ^{(1-\frac{n}{2})(\frac{2}{p}-1)}(\log \Lambda )^{n}\Vert f\Vert _{l^p(\mathbb {Z}^n)}. \end{aligned}$$

Proof

We can show the analogues of (2.1) through (2.4) and interpolate as in the proof of Theorem 2. The only estimates that require commentary are proving the analogues of (2.2) and (2.4). Instead of (2.4), we need to show

$$\begin{aligned} \left\| \sup _{\Lambda \le \lambda < 2\Lambda }|E_\lambda f|\right\| _{l^2(\mathbb {Z}^n)} \lesssim (\log {\Lambda })^{-B}\Vert f\Vert _{l^2(\mathbb {Z}^n)}, \end{aligned}$$
(2.10)

but this is the content of Theorem 1 of [1], where we need the stronger assumption that \(n\ge n_1(k)\) (namely \(n \ge 7\) for \(k=2\)). Finally we show

$$\begin{aligned} \left\| \sup _{\Lambda \le \lambda < 2\Lambda }|M_\lambda f|\right\| _{l^\infty (\mathbb {Z}^n)} \lesssim \Lambda ^{\frac{2-n}{k}}(\log {N})^C\Vert f\Vert _{l^1(\mathbb {Z}^n)} \end{aligned}$$
(2.11)

which proceeds by a very similar argument as (2.2). One simply uses the fact that

$$\begin{aligned} \left\| \sup _{\Lambda \le \lambda< 2\Lambda }|K_\lambda \star f|\right\| _{l^\infty (\mathbb {Z}^n)}= & {} \sup _{{\varvec{x}} \in \mathbb {Z}^n}\sup _{\Lambda \le \lambda< 2\Lambda }|K_\lambda \star f({\varvec{x}})| = \sup _{\Lambda \le \lambda< 2\Lambda }\sup _{{\varvec{x}} \in \mathbb {Z}^n}|K_\lambda \star f({\varvec{x}})| \\\le & {} \sup _{\Lambda \le \lambda < 2\Lambda }\Vert K_\lambda \Vert _{l^\infty (\mathbb {Z}^n)}\Vert f\Vert _{l^1(\mathbb {Z}^n)}. \end{aligned}$$

\(\square \)