1 Introduction

Herein we are concerned with the rapid approximation of the discrete Fourier transform \({\hat{\mathbf {f}}}\in {\mathbb {C}}^N\) of a given vector \(\mathbf {f}\in {\mathbb {C}}^N\) for large values of N. Though standard Fast Fourier Transform (FFT) algorithms [5, 8, 28] can accomplish this task in \({\mathcal {O}} \left( N \log N \right) \)-time for arbitrary \(N \in {\mathbb {N}}\), this runtime may still be prohibitively expensive when N is extremely large. This is particularly true when the vector \({\hat{\mathbf {f}}}\) is approximately s-sparse (i.e., contains only \(s \ll N\) nonzero entries) as in compressive sensing [11] and certain wideband signal processing applications (see, e.g., [24]). Such applications have therefore motivated the development of discrete sparse Fourier transform (DSFT) techniques [13, 14] which are capable of accurately approximating s-sparse DFT vectors \({\hat{\mathbf {f}}}\in {\mathbb {C}}^N\) in just \(s \cdot \log ^{{\mathcal {O}}(1)} N\)-time. When \(s \ll N\) these methods are significantly faster than standard \({\mathcal {O}} \left( N \log N \right) \)-time FFT methods, effectively achieving sublinear o(N) runtime complexities in such cases.

Currently, the most widely used \(s \cdot \log ^{{\mathcal {O}}(1)} N\)-time DSFT methods [12, 15, 22] are randomized algorithms which accurately compute \({\hat{\mathbf {f}}}\) with high probability when given sampling access to \(\mathbf {f}\). Many existing sparse Fourier transforms which are entirely deterministic [6, 19, 21, 25, 29], on the other hand, are perhaps best described as unequally spaced sparse Fourier transform (USSFT) methods in that they approximately compute \({\hat{\mathbf {f}}}\), with its entries \(\hat{f}_\omega \) indexed by the set \(B := \left( -\left\lceil \frac{N}{2}\right\rceil ,\left\lfloor \frac{N}{2}\right\rfloor \right] \cap {\mathbb {Z}}\), by sampling its associated trigonometric polynomial

$$\begin{aligned} f\left( x\right) =\sum _{\omega \in B}\hat{f}_{\omega } \mathbb {e}^{\mathbb {i}\omega x} \end{aligned}$$

at a collection of \(m \ll N\) specially constructed unequally spaced points \(x_1, \dots , x_m \in [-\pi , \pi ]\). These methods have no probability of failing to recover s-sparse \({\hat{\mathbf {f}}}\), but cannot accurately compute the DFT \({\hat{\mathbf {f}}}\) of an arbitrary given vector \(\mathbf {f}\in {\mathbb {C}}^N\) due to their need for unequally spaced function evaluations from f of the form \(\left\{ f(x_k) \right\} ^m_{k=1}\) (see footnote 1).

This state of affairs has left a gap in the theory of DSFT methods. Existing deterministic sparse Fourier transform algorithms can efficiently compute the s-sparse DFT \({\hat{\mathbf {f}}}\) of a given vector \(\mathbf {f}\in {\mathbb {C}}^N\) only if either (i) N is a power of a small prime [26], or else (ii) \(\hat{f}_\omega = 0\) for all \(\omega \in B\) with \(|\omega | > N/4\) [19, 20]. In this paper this gap is filled by the development of a new entirely deterministic DSFT algorithm which is always guaranteed to accurately approximate any (nearly) s-sparse \({\hat{\mathbf {f}}}\in {\mathbb {C}}^N\) of any size N when given access only to \(\mathbf {f}\in {\mathbb {C}}^N\). In addition, the method used to develop this new deterministic DSFT algorithm is general enough that it can be applied to any fast and noise robust USSFT method of the type mentioned above (be it deterministic or randomized) in order to yield a new fast and robust DSFT algorithm. As a result, we are also able to use the fastest of the currently existing USSFT methods [4, 6, 17, 19, 21, 25, 29] in order to create new publicly available DSFT implementations herein which are both faster and more robust to noise than currently existing noise robust DSFT methods for large N.

More generally, we emphasize that the techniques utilized below free developers of SFT methods to develop more general USSFT methods which utilize samples from the trigonometric polynomial f above at any points \(\left\{ x_k \right\} ^m_{k=1} \subset [-\pi ,\pi ]\) they like when attempting to create better DSFT algorithms in the future. Indeed, the techniques herein provide a relatively simple means of translating any future fast and robust USSFT algorithms into (still fast) DSFT algorithms.

1.1 Theoretical Results

Herein we focus on rapidly producing near best s-term approximations of \({\hat{\mathbf {f}}}\) of the type usually considered in compressive sensing [7]. Let \({\hat{\mathbf {f}}}_{s}^{\mathrm{opt}} \in {\mathbb {C}}^N\) denote an optimal s-term approximation to \({\hat{\mathbf {f}}}\in {\mathbb {C}}^N\). That is, let \({\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\) preserve s of the largest magnitude entries of \({\hat{\mathbf {f}}}\) while setting the rest of its \(N-s\) smallest magnitude entries to 0 (see footnote 2). The following DSFT theorem is proven below (see footnote 3).

Theorem 1

Let \(N\in {\mathbb {N}}\), \(s\in [2,N]\cap {\mathbb {N}}\), \(1\le r \le \frac{N}{36}\), and \(\mathbf {f}\in {\mathbb {C}}^N\). There exists an algorithm that will always deterministically return an s-sparse vector \(\mathbf {v} \in {\mathbb {C}}^{N}\) satisfying

$$\begin{aligned} \left\| {\hat{\mathbf {f}}}-\mathbf {v} \right\| _{2}\le \left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{2}+\frac{33}{\sqrt{s}}\cdot \left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{1}+198\sqrt{s}\left\| \mathbf {f}\right\| _{\infty }N^{-r} \end{aligned}$$
(1)

in just \({\mathcal {O}} \left( \frac{ s^2\cdot r^{\frac{3}{2}} \cdot \log ^{\frac{11}{2}} (N)}{\log (s)} \right) \)-time when given access to \(\mathbf {f}\). If returning an s-sparse vector \(\mathbf {v}\in {\mathbb {C}}^{N}\) that satisfies (1) for each \(\mathbf {f}\) with probability at least \((1-p) \in [2/3,1)\) is sufficient, a Monte Carlo algorithm also exists which will do so in just \( {\mathcal {O}} \left( s\cdot r^{\frac{3}{2}} \cdot \log ^\frac{9}{2}(N)\cdot \log \left( \frac{N}{p}\right) \right) \)-time.

Note the quadratic-in-s runtime dependence of the deterministic algorithm mentioned by Theorem 1. It turns out that there is a close relationship between the sampling points \(\left\{ x_k \right\} ^m_{k=1}\) used by the deterministic USSFT methods [21] employed as part of the proof of Theorem 1 and the construction of explicit (deterministic) RIP matrices (see [1, 18] for details). As a result, reducing the quadratic dependence on s of the \(s^2 \log ^{{\mathcal {O}}(1)} N\)-runtime complexity of the deterministic DSFT algorithms referred to by Theorem 1 while still satisfying the error guarantee (1) is likely at least as difficult as constructing explicit deterministic RIP matrices with fewer than \(s^2 \log ^{{\mathcal {O}}(1)} N\) rows by subsampling rows from an \(N \times N\) DFT matrix. Unfortunately, explicitly constructing RIP matrices of this type is known to be a very difficult problem [11]. This means that constructing an entirely deterministic DSFT algorithm which is both guaranteed to always satisfy (1), and which also always runs in \(s \log ^{{\mathcal {O}}(1)} N\)-time, is also likely to be extremely difficult to achieve at present (see footnote 4).

The remainder of this paper is organized as follows: In Sect. 2 we set up notation and establish necessary background results. Then, in Sect. 3, we describe our method for converting noise robust USSFT methods into DSFT methods. The resulting approach is summarized in Algorithm 1 therein. Next, Theorem 1 is proven in Sect. 4 using the intermediary results of Sects. 2 and 3. An empirical evaluation of several new DSFT algorithms resulting from our proposed approach is then performed in Sect. 5. The paper is finally concluded with a few additional comments in Sect. 6.

2 Notation and Setup

The Fourier series representation of a \(2\pi \hbox {-periodic}\) function \(f:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {C}}\) will be denoted by

$$\begin{aligned} f\left( x\right) =\sum _{\omega \in {\mathbb {Z}}}\widehat{f}_{\omega }\mathbb {e}^{\mathbb {i}\omega x} \end{aligned}$$

with its Fourier coefficients, \(\widehat{f}_{\omega }\), given by

$$\begin{aligned} \widehat{f}_{\omega }=\frac{1}{2\pi }\int _{-\pi }^{\pi }f\left( x\right) \mathbb {e}^{-\mathbb {i}\omega x}~dx. \end{aligned}$$

We let \(\widehat{f}:=\left\{ \widehat{f}_{\omega }\right\} _{\omega \in {\mathbb {Z}}}\) represent the infinite sequence of all Fourier coefficients of f below. Given two \(2\pi \)-periodic functions f and g we define the convolution of f and g at \(x \in {\mathbb {R}}\) to be

$$\begin{aligned} \left( f*g\right) \left( x\right) ~=~\left( g*f\right) \left( x\right) ~:=~ \frac{1}{2\pi }\int _{-\pi }^{\pi }g\left( x-y\right) f\left( y\right) dy. \end{aligned}$$

This definition, coupled with the definition of the Fourier transform, yields the well-known equality

$$\begin{aligned} \widehat{f*g}_{\omega }=\widehat{f}_{\omega }\widehat{g}_{\omega }\ \forall \omega \in {\mathbb {Z}}. \end{aligned}$$

We may also write \(\widehat{f*g}=\widehat{f}\circ \widehat{g}\) where \(\circ \) denotes the Hadamard product.
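Since both sides are easy to evaluate for trigonometric polynomials with finitely many modes, this identity can be checked directly. In the following minimal numpy sketch (illustrative values only), the equispaced Riemann sum computes the convolution integral exactly because the integrand is itself a trigonometric polynomial of degree less than the number of quadrature points:

```python
import numpy as np

N = 7                                   # illustrative degree; B = [-M, M]
M = N // 2
B = np.arange(-M, M + 1)
rng = np.random.default_rng(3)
f_hat = rng.standard_normal(N) + 1j * rng.standard_normal(N)
g_hat = rng.standard_normal(N) + 1j * rng.standard_normal(N)

K = 4 * N                               # K-point Riemann sums are exact here
y = -np.pi + 2 * np.pi * np.arange(K) / K
f = np.exp(1j * np.outer(y, B)) @ f_hat

x = 0.7
g_shift = np.exp(1j * np.outer(x - y, B)) @ g_hat       # g(x - y_k)
conv = np.mean(g_shift * f)             # (1/(2 pi)) integral of g(x-y) f(y) dy
expected = np.sum(f_hat * g_hat * np.exp(1j * B * x))   # sum_w f^_w g^_w e^{iwx}
assert np.allclose(conv, expected)
```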

For any \(N\in {\mathbb {N}}\), define the discrete Fourier transform (DFT) matrix \(F\in {\mathbb {C}}^{N\times N}\) by

$$\begin{aligned} F_{\omega ,j} :=\frac{(-1)^\omega }{N} \mathbb {e}^{-\frac{2\pi \mathbb {i}\cdot \omega \cdot j}{N}}, \end{aligned}$$

and let \(B:=\left( -\left\lceil \frac{N}{2}\right\rceil ,\left\lfloor \frac{N}{2}\right\rfloor \right] \cap {\mathbb {Z}}\) be a set of N integer frequencies centered at 0. Furthermore, let \(\mathbf {f}\in {\mathbb {C}}^N\) denote the vector of equally spaced samples from f whose entries are given by

$$\begin{aligned} f_j := f\left( -\pi +\frac{2\pi j}{N}\right) \end{aligned}$$

for \(j = 0, \dots , N-1\). One can now see that if

$$\begin{aligned} f\left( x\right) =\sum _{\omega \in B}\widehat{f}_{\omega } \mathbb {e}^{\mathbb {i}\omega x}, \end{aligned}$$

then

$$\begin{aligned} F \mathbf {f}=: {\hat{\mathbf {f}}} \end{aligned}$$
(2)

where \({\hat{\mathbf {f}}}\in {\mathbb {C}}^{N}\) denotes, in vector form, the subset of \(\widehat{f}\) with indices in B (see footnote 5). More generally, bolded lower case letters will always represent vectors in \({\mathbb {C}}^{N}\) below.
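To make these conventions concrete, the following short numpy sketch (a minimal check with illustrative values, assuming N odd for simplicity) builds F as defined above and verifies (2):

```python
import numpy as np

N = 9                                   # illustrative; N odd so that N = 2M + 1
M = (N - 1) // 2
B = np.arange(-M, M + 1)                # centered frequency set B
j = np.arange(N)

rng = np.random.default_rng(0)
f_hat = rng.standard_normal(N) + 1j * rng.standard_normal(N)

x = -np.pi + 2 * np.pi * j / N          # equally spaced nodes, f_j = f(x_j)
f = np.exp(1j * np.outer(x, B)) @ f_hat # f(x) = sum_{w in B} f_hat_w e^{iwx}

# F[w, j] = (-1)^w e^{-2 pi i w j / N} / N, rows indexed by w in B
F = ((-1.0) ** B)[:, None] * np.exp(-2j * np.pi * np.outer(B, j) / N) / N

assert np.allclose(F @ f, f_hat)        # (2): F f recovers the coefficients
```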

As mentioned above, \(\widehat{f}:=\left\{ \widehat{f}_{\omega }\right\} _{\omega \in {\mathbb {Z}}}\) is the infinite sequence of all Fourier coefficients of f. For any subset \(S \subseteq {\mathbb {Z}}\) we let \(\widehat{f}\vert _{S}\in {\mathbb {C}}^{{\mathbb {Z}}}\) be the sequence \(\widehat{f}\) restricted to the subset S, so that \(\widehat{f}\vert _{S}\) has terms \(\left( \widehat{f}\vert _{S} \right) _\omega = \widehat{f}_{\omega }\) for all \(\omega \in S\), and \(\left( \widehat{f}\vert _{S} \right) _\omega = 0\) for all \(\omega \in S^{c}:={\mathbb {Z}}\setminus S\). Note that \({\hat{\mathbf {f}}}\) above is exactly \(\widehat{f}\vert _{B}\) excluding its zero terms for all \(\omega \notin B\). Thus, given any subset \(S\subseteq B\), we let \({\hat{\mathbf {f}}}\vert _{S}\in {\mathbb {C}}^{N}\) be the vector \({\hat{\mathbf {f}}}\) restricted to the set S in an analogous fashion. That is, for \(S \subseteq B\) we will have \(\left( {\hat{\mathbf {f}}}\vert _{S} \right) _\omega = {\hat{\mathbf {f}}}_\omega \) for all \(\omega \in S\), and \(\left( {\hat{\mathbf {f}}}\vert _{S} \right) _\omega = 0\) for all \(\omega \in B\setminus S\).

Given the sequence \(\widehat{f}\in {\mathbb {C}}^{{\mathbb {Z}}}\) and \(s\le N\), we denote by \(R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) \) a subset of B containing s of the most energetic frequencies of f; that is

$$\begin{aligned} R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) :=\left\{ \omega _{1},\dots ,\omega _{s}\right\} \subseteq B \subset {\mathbb {Z}} \end{aligned}$$

where the frequencies \(\omega _j \in B\) are ordered such that

$$\begin{aligned} \left| \widehat{f}_{\omega _{1}}\right| \ge \left| \widehat{f}_{\omega _{2}}\right| \ge \cdots \ge \left| \widehat{f}_{\omega _{s}}\right| \ge \cdots \ge \left| \widehat{f}_{\omega _{N}}\right| . \end{aligned}$$

Here, if desired, one may break ties by also requiring, e.g., that \(\omega _j < \omega _k\) for all \(j < k\) with \(\left| \widehat{f}_{\omega _{j}}\right| =\left| \widehat{f}_{\omega _{k}}\right| \). We will then define \(f_{s}^{\mathrm{opt}}:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {C}}\) based on \(R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) \) by

$$\begin{aligned} f_{s}^{\mathrm{opt}}\left( x\right) :=\sum _{\omega \in R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }\widehat{f}_{\omega }\mathbb {e}^{\mathbb {i}\omega x}. \end{aligned}$$

Any such \(2 \pi \)-periodic function \(f_{s}^{\mathrm{opt}}\) will be referred to as an optimal s-term approximation to f. Similarly, we also define both \(\widehat{f}_{s}^{\mathrm{opt}} \in {\mathbb {C}}^{{\mathbb {Z}}}\) and \({\hat{\mathbf {f}}}_{s}^{\mathrm{opt}} \in {\mathbb {C}}^{N}\) to be \(\widehat{f}\vert _{R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }\) and \({\hat{\mathbf {f}}}\vert _{R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }\), respectively.
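In code, forming \(R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) \) and \({\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\) amounts to a partial sort by magnitude. A minimal sketch (with numpy's stable sort standing in for the tie-breaking convention above):

```python
import numpy as np

def s_term_approximation(f_hat: np.ndarray, s: int) -> np.ndarray:
    """Return f_hat restricted to s of its largest-magnitude entries."""
    keep = np.argsort(-np.abs(f_hat), kind="stable")[:s]  # indices of R_s^opt
    out = np.zeros_like(f_hat)
    out[keep] = f_hat[keep]
    return out
```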

2.1 Periodized Gaussians

In the sections that follow the \(2\pi \hbox {-periodic}\) Gaussian \(g:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {R}}^{+}\) defined by

$$\begin{aligned} g\left( x\right) =\frac{1}{c_1}\sum _{n=-\infty }^{\infty }\mathbb {e}^{-\frac{\left( x-2n\pi \right) ^{2}}{2c_1^{2}}} \end{aligned}$$
(3)

with \(c_1 \in {\mathbb {R}}^+\) will play a special role. The following lemmas recall several useful facts concerning both its decay and its Fourier series coefficients.

Lemma 1

The \(2\pi \hbox {-periodic}\) Gaussian \(g:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {R}}^{+}\) has

$$\begin{aligned} g\left( x\right) \le \left( \frac{3}{c_1}+\frac{1}{\sqrt{2\pi }} \right) \mathbb {e}^{-\frac{x^{2}}{2c_{1}^{2}}} \end{aligned}$$

for all \(x \in \left[ -\pi ,\pi \right] \).

Lemma 2

The \(2\pi \hbox {-periodic}\) Gaussian \(g:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {R}}^{+}\) has

$$\begin{aligned} \widehat{g}_{\omega } = \frac{1}{\sqrt{2\pi }}\mathbb {e}^{-\frac{c_1^2 \omega ^2}{2}} \end{aligned}$$

for all \(\omega \in {\mathbb {Z}}\). Thus, \(\widehat{g}=\left\{ \widehat{g}_{\omega }\right\} _{\omega \in {\mathbb {Z}}}\in \ell ^{2}\) decreases monotonically as \(|\omega |\) increases, and also has \(\Vert \widehat{g} \Vert _{\infty } = \frac{1}{\sqrt{2 \pi }}\).

Lemma 3

Choose any \(\tau \in \left( 0, \frac{1}{\sqrt{2\pi }} \right) \), \(\alpha \in \left[ 1, \frac{N}{\sqrt{\ln N}} \right] \), and \(\beta \in \left( 0 , \alpha \sqrt{\frac{\ln \left( 1/\tau \sqrt{2\pi } \right) }{2}} ~\right] \). Let \(c_1 = \frac{\beta \sqrt{\ln N}}{N}\) in the definition of the periodic Gaussian g from (3). Then \(\widehat{g}_{\omega } \in \left[ \tau , \frac{1}{\sqrt{2\pi }} \right] \) for all \(\omega \in {\mathbb {Z}}\) with \(|\omega | \le \Bigl \lceil \frac{N}{\alpha \sqrt{\ln N}}\Bigr \rceil \).

The proofs of Lemmas 1, 2, and 3 are included in Appendix B for the sake of completeness. Intuitively, we will utilize the periodic function g from (3) as a bandpass filter below. Looking at Lemma 3 in this context we can see that its parameter \(\tau \) will control the effect of \(\widehat{g}\) on the frequency passband defined by its parameter \(\alpha \). Deciding on the two parameters \(\tau , \alpha \) then constrains \(\beta \) which, in turn, fixes the periodic Gaussian g by determining its constant coefficient \(c_1\). As we shall see, the parameter \(\beta \) will also determine the speed and accuracy with which we can approximately sample (i.e., evaluate) the function \(f *g\). For this reason it will become important to properly balance these parameters against one another in subsequent sections.
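The following small numpy sketch illustrates this parameter chain with illustrative values of N, \(\tau \), and \(\alpha \) (taking \(\beta \) as large as Lemma 3 allows), and numerically confirms the passband guarantee of Lemma 3:

```python
import numpy as np

N, tau, alpha = 1024, 1.0 / 3.0, 64.0            # illustrative choices
beta = alpha * np.sqrt(np.log(1.0 / (tau * np.sqrt(2 * np.pi))) / 2.0)
c1 = beta * np.sqrt(np.log(N)) / N               # fixes the periodic Gaussian g

passband = int(np.ceil(N / (alpha * np.sqrt(np.log(N)))))
w = np.arange(-passband, passband + 1)
g_hat = np.exp(-c1 ** 2 * w ** 2 / 2.0) / np.sqrt(2 * np.pi)    # Lemma 2

assert np.all((g_hat >= tau) & (g_hat <= 1 / np.sqrt(2 * np.pi)))
```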

2.2 On the Robustness of the SFTs Proposed in [21]

The sparse Fourier transforms presented in [21] include both deterministic and randomized methods for approximately computing the Fourier series coefficients of a given \(2 \pi \)-periodic function f from its evaluations at m points \(\left\{ x_k \right\} ^m_{k=1} \subset [-\pi , \pi ]\). The following results describe how accurate these algorithms will be when they are only given approximate evaluations of f at these points instead. These results are necessary because we will want to execute the SFTs developed in [21] on convolutions of the form \(f *g\) below, but will only be able to approximately compute their values at each of the required points \(x_1, \dots , x_m \in [-\pi ,\pi ]\).

Lemma 4

Let \(s, \epsilon ^{-1} \in {\mathbb {N}} \setminus \{ 1 \}\) with \((s/\epsilon ) \ge 2\), and \(\mathbf {n}\in {\mathbb {C}}^m\) be an arbitrary noise vector. There exists a set of m points \(\left\{ x_k \right\} ^m_{k=1} \subset [-\pi , \pi ]\) such that Algorithm 3 on page 72 of [21], when given access to the corrupted samples \(\left\{ f(x_k) + n_k \right\} ^m_{k=1}\), will identify a subset \(S \subseteq B\) which is guaranteed to contain all \(\omega \in B\) with

$$\begin{aligned} \left| \widehat{f}_{\omega } \right| > 4 \left( \frac{\epsilon \cdot \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}^{\mathrm{opt}}_{(s/\epsilon )} \right\| _1}{s} + \left\| \widehat{f} - \widehat{f}\vert _{B} \right\| _1 + \Vert \mathbf {n}\Vert _\infty \right) . \end{aligned}$$
(4)

Furthermore, every \(\omega \in S\) returned by Algorithm 3 will also have an associated Fourier series coefficient estimate \(z_{\omega } \in {\mathbb {C}}\) which is guaranteed to have

$$\begin{aligned} \left| \widehat{f}_{\omega } - z_{\omega } \right| \le \sqrt{2} \left( \frac{\epsilon \cdot \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}^{\mathrm{opt}}_{(s/\epsilon )} \right\| _1}{s} + \left\| \widehat{f} - \widehat{f}\vert _{B} \right\| _1 + \Vert \mathbf {n}\Vert _\infty \right) . \end{aligned}$$
(5)

Both the number of required samples, m, and Algorithm 3’s operation count are

$$\begin{aligned} {\mathcal {O}} \left( \frac{s^2 \cdot \log ^4 (N)}{\log \left( \frac{s}{\epsilon } \right) \cdot \epsilon ^2} \right) . \end{aligned}$$
(6)

If succeeding with probability \((1-\delta ) \in [2/3,1)\) is sufficient, and \((s/\epsilon ) \ge 2\), the Monte Carlo variant of Algorithm 3 referred to by Corollary 4 on page 74 of [21] may be used. This Monte Carlo variant reads only a randomly chosen subset of the noisy samples utilized by the deterministic algorithm,

$$\begin{aligned} \left\{ f(\tilde{x}_k) + \tilde{n}_k \right\} ^{\tilde{m}}_{k=1} \subseteq \left\{ f(x_k) + n_k \right\} ^m_{k=1}, \end{aligned}$$

yet it still outputs a subset \(S \subseteq B\) which is guaranteed to simultaneously satisfy both of the following properties with probability at least \(1-\delta \):

(i) S will contain all \(\omega \in B\) satisfying (4), and

(ii) all \(\omega \in S\) will have an associated coefficient estimate \(z_{\omega } \in {\mathbb {C}}\) satisfying (5).

Finally, both this Monte Carlo variant’s number of required samples, \(\tilde{m}\), as well as its operation count will also always be

$$\begin{aligned} {\mathcal {O}} \left( \frac{s}{\epsilon } \cdot \log ^3 (N) \cdot \log \left( \frac{N}{\delta } \right) \right) . \end{aligned}$$
(7)

Using the preceding lemma one can easily prove the following noise robust variant of Theorem 7 (and Corollary 4) from §5 of [21]. The proofs of both results are outlined in Appendix C for the sake of completeness.

Theorem 2

Suppose \(f: [-\pi ,\pi ] \rightarrow {\mathbb {C}}\) has \(\widehat{f} \in \ell ^1 \cap \ell ^2\). Let \(s, \epsilon ^{-1} \in {\mathbb {N}} \setminus \{ 1 \}\) with \((s/\epsilon ) \ge 2\), and \(\mathbf {n}\in {\mathbb {C}}^m\) be an arbitrary noise vector. Then, there exists a set of m points \(\left\{ x_k \right\} ^m_{k=1} \subset [-\pi , \pi ]\) together with a simple deterministic algorithm \({\mathcal {A}}: {\mathbb {C}}^m \rightarrow {\mathbb {C}}^{4s}\) such that \({\mathcal {A}} \left( \left\{ f(x_k) + n_k \right\} ^m_{k=1} \right) \) is always guaranteed to output (the nonzero coefficients of) a degree \(\le N/2\) trigonometric polynomial \(y_s: [-\pi , \pi ] \rightarrow {\mathbb {C}}\) satisfying

$$\begin{aligned} \left\| f - y_s \right\| _2 \le \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}_{s}^{\mathrm{opt}} \right\| _2 + \frac{22\epsilon \cdot \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}^{\mathrm{opt}}_{(s/\epsilon )} \right\| _1}{\sqrt{s}} + 22 \sqrt{s} \left( \left\| \widehat{f} - \widehat{f}\vert _{B} \right\| _1 + \Vert \mathbf {n}\Vert _\infty \right) . \end{aligned}$$
(8)

Both the number of required samples, m, and the algorithm’s operation count are always

$$\begin{aligned} {\mathcal {O}} \left( \frac{s^2 \cdot \log ^4 (N)}{\log \left( \frac{s}{\epsilon } \right) \cdot \epsilon ^2} \right) . \end{aligned}$$
(9)

If succeeding with probability \((1-\delta ) \in [2/3,1)\) is sufficient, and \((s/\epsilon ) \ge 2\), a Monte Carlo variant of the deterministic algorithm may be used. This Monte Carlo variant reads only a randomly chosen subset of the noisy samples utilized by the deterministic algorithm,

$$\begin{aligned} \left\{ f(\tilde{x}_k) + \tilde{n}_k \right\} ^{\tilde{m}}_{k=1} \subseteq \left\{ f(x_k) + n_k \right\} ^m_{k=1}, \end{aligned}$$

yet it still outputs (the nonzero coefficients of) a degree \(\le N/2\) trigonometric polynomial, \(y_s: [-\pi , \pi ] \rightarrow {\mathbb {C}}\), that satisfies (8) with probability at least \(1-\delta \). Both its number of required samples, \(\tilde{m}\), as well as its operation count will always be

$$\begin{aligned} {\mathcal {O}} \left( \frac{s}{\epsilon } \cdot \log ^3 (N) \cdot \log \left( \frac{N}{\delta } \right) \right) . \end{aligned}$$
(10)

We now have the necessary prerequisites in order to discuss our general strategy for constructing several new fully discrete SFTs.

3 Description of the Proposed Approach

In this section we assume that we have access to an SFT algorithm \({\mathcal {A}}\) which requires m function evaluations of a \(2 \pi \)-periodic function \(f: [-\pi , \pi ] \rightarrow {\mathbb {C}}\) in order to produce an s-sparse approximation to \(\widehat{f}\). For any non-adaptive SFT algorithm \({\mathcal {A}}\) the m points \(\left\{ x_k \right\} ^m_{k=1} \subset [-\pi , \pi ]\) at which \({\mathcal {A}}\) needs to evaluate f can be determined before \({\mathcal {A}}\) is actually executed. As a result, the function evaluations \(\left\{ f(x_k) \right\} ^m_{k=1}\) required by \({\mathcal {A}}\) can also be computed before \({\mathcal {A}}\) is ever run. Indeed, if the SFT algorithm \({\mathcal {A}}\) is nonadaptive, stable, and robust to noise, it suffices to approximate the function evaluations \(\left\{ f(x_k) \right\} ^m_{k=1}\) required by \({\mathcal {A}}\) before it is executed (see footnote 6). These simple ideas form the basis for the proposed computational approach outlined in Algorithm 1.

Algorithm 1 (pseudocode figure)

The objective of Algorithm 1 is to use a nonadaptive and noise robust SFT algorithm \({\mathcal {A}}\) which requires off-grid function evaluations in order to approximately compute the DFT \({\hat{\mathbf {f}}}= F \mathbf {f}\) of a given vector \(\mathbf {f}\in {\mathbb {C}}^N\). Note that computing \({\hat{\mathbf {f}}}\) is equivalent to computing the Fourier series coefficients of the degree N trigonometric interpolant of \(\mathbf {f}\). Hereafter the \(2 \pi \)-periodic function \(f: [-\pi , \pi ] \rightarrow {\mathbb {C}}\) under consideration will always be this degree N trigonometric interpolant of \(\mathbf {f}\). Our objective then becomes to approximately compute \(\widehat{f}\) using \({\mathcal {A}}\). Unfortunately, our given input vector \(\mathbf {f}\) only contains equally spaced function evaluations of f, and so does not actually contain the function evaluations \(\left\{ f(x_k) \right\} ^m_{k=1}\) required by \({\mathcal {A}}\). As a consequence, we are forced to try to interpolate these required function evaluations \(\left\{ f(x_k) \right\} ^m_{k=1}\) from the available equally spaced function evaluations \(\mathbf {f}\).

Directly interpolating the required function evaluations \(\left\{ f(x_k) \right\} ^m_{k=1}\) from \(\mathbf {f}\) for an arbitrary degree N trigonometric polynomial f using classical techniques appears to be either too inaccurate, or else too slow to work well in our setting (see footnote 7). As a result, Algorithm 1 follows the example of successful nonequispaced fast Fourier transform (NFFT) methods (see, e.g., [2, 9, 10, 23, 30]) and instead uses \(\mathbf {f}\) to rapidly approximate samples from the convolution of the unknown trigonometric polynomial f with (several modulations of) a known filter function g. Thankfully, all of the evaluations \(\left\{ (g*f)(x_k) \right\} ^m_{k=1}\) can be approximated very accurately using only the data in \(\mathbf {f}\) in just \({\mathcal {O}}(m \log N)\)-time when g is chosen carefully enough (see Sect. 3.1 below). The given SFT algorithm \({\mathcal {A}}\) is then used to approximate the Fourier coefficients of \(g*f\) for each modulation of g using these approximate evaluations. Finally, \({\hat{\mathbf {f}}}\) is then approximated using the recovered sparse approximation for each \(\widehat{g*f}\) combined with our a priori knowledge of \(\widehat{g}\).
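The following Python sketch summarizes this pipeline. It is only a schematic rendering of Algorithm 1, not its actual pseudocode: here `sft` stands for any nonadaptive, noise robust USSFT routine \({\mathcal {A}}\) returning (frequency, coefficient estimate) pairs from samples at its points \(x_1, \dots , x_m\), while `g_eval` and `g_hat` evaluate the periodic Gaussian (3) and its Fourier coefficients (Lemma 2); the passband bookkeeping anticipates the modulation identity derived in Sect. 4, and all helper names are placeholders rather than part of any published interface.

```python
import numpy as np

def discrete_sft(f, s, sft, sample_points, g_eval, g_hat, alpha, kappa):
    """Schematic sketch of Algorithm 1 (illustrative names, not the pseudocode)."""
    N = len(f)
    y = -np.pi + 2 * np.pi * np.arange(N) / N             # grid underlying f
    half_band = int(np.ceil(N / (alpha * np.sqrt(np.log(N)))))
    estimates = {}
    for q in range(-(N // 2), N // 2 + 1, 2 * half_band + 1):  # modulations of g
        samples = []
        for x in sample_points:
            jp = int(np.round((x + np.pi) * N / (2 * np.pi)))  # nearest grid index
            idx = (jp + np.arange(-kappa, kappa + 1)) % N      # truncated window
            # line 7: approximate (g~_q * f)(x) by a truncated version of the
            # semi-discrete sum (11), folding e^{-iqy} into the samples of f
            samples.append(np.sum(np.exp(-1j * q * y[idx]) * f[idx]
                                  * g_eval(x - y[idx])) / N)
        # line 9: run the noise robust USSFT on the approximate samples
        for w, z in sft(samples):
            if abs(w + q) <= half_band:              # keep the passband of g~_q
                estimates[w] = z / g_hat(w + q)      # undo the filter's damping
    # keep the s largest-magnitude estimates as the sparse output v
    top = sorted(estimates, key=lambda w: -abs(estimates[w]))[:s]
    v = np.zeros(N, dtype=complex)
    for w in top:
        v[w % N] = estimates[w]                      # frequencies in B stored mod N
    return v
```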

Next, in Sect. 3.1, explicit bounds will be developed which characterize the runtime required in order to accurately approximate arbitrary samples from \(f*g\) using only a few entries from \(\mathbf {f}\). The attentive reader may notice there that the main theorem in that section (Theorem 4) bears some resemblance to state-of-the-art NFFT error bounds (see, e.g., Steidl’s Theorem 3.1 in [30]) in that it utilizes the properties of truncated convolutions with periodized Gaussians in order to obtain error bounds which decay exponentially with the number of truncated convolution terms utilized per function evaluation. It is important to note, however, that the SFT methods considered herein have several crucial complicating constraints which require such NFFT techniques to be substantially overhauled before they may be fruitfully employed in our setting. Chief among these complications is that \(\Omega (N)\)-time NFFT methods for the evaluation of trigonometric polynomials at nonequispaced points, along with their attendant error analysis, effectively assume that \({\hat{\mathbf {f}}}\) is already known (or, at least, that computing it in FFT-time is an acceptable computational cost). In the case of SFTs this is not true since our main objective is exactly to approximate \({\hat{\mathbf {f}}}\) much more quickly than an FFT can by only reading a tiny sublinear-in-N fraction of the entries in \(\mathbf {f}\). As a result, unlike NFFT methods, our analysis needs to focus on rapidly approximating values of \(f *g\) instead of f itself, and to do so with a Gaussian g whose Fourier transform \(\widehat{g}\) still allows the rapid and accurate application of SFT techniques in Sect. 4, where we prove the main result of the paper (Theorem 1).

3.1 Rapidly and Accurately Evaluating \(f*g\)

In this section we will carefully consider the approximation of \(\left( f*g\right) \left( x\right) \) by a severely truncated version of the semi-discrete convolution sum

$$\begin{aligned} \frac{1}{N}\sum ^{N-1}_{j=0}f\left( -\pi +\frac{2\pi j}{N}\right) g\left( x + \pi - \frac{2\pi j}{N} \right) \end{aligned}$$
(11)

for any given value of \(x \in [-\pi , \pi ]\). Our goal is to determine exactly how many terms of this finite sum we actually need in order to obtain an accurate approximation of \(f*g\) at an arbitrary x-value. More specifically, we aim to use as few terms from this sum as absolutely possible in order to ensure, e.g., an approximation error of size \({\mathcal {O}}(N^{-2})\).

Without loss of generality, let us assume that \(N=2M+1\) is odd – this allows us to express B, the set of N Fourier modes about zero, as

$$\begin{aligned} B:=\left( -\left\lceil \frac{N}{2}\right\rceil ,\left\lfloor \frac{N}{2}\right\rfloor \right] \cap {\mathbb {Z}}=\left[ -M,M\right] \cap {\mathbb {Z}}. \end{aligned}$$

In the lemmas and theorems below the function \(f:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {C}}\) will always denote a degree-N trigonometric polynomial of the form

$$\begin{aligned} f\left( x\right) =\sum _{\omega \in B}\widehat{f}_{\omega }\mathbb {e}^{\mathbb {i}\omega x}. \end{aligned}$$

Furthermore, g will always denote the periodic Gaussian as defined above in (3). Finally, we will also make use of the Dirichlet kernel \(D_{M}:{\mathbb {R}}\rightarrow {\mathbb {C}}\), defined by

$$\begin{aligned} D_{M}\left( y\right) =\frac{1}{2\pi }\sum _{n=-M}^{M}\mathbb {e}^{\mathbb {i}ny}=\frac{1}{2\pi }\sum _{n\in B}\mathbb {e}^{\mathbb {i}ny}. \end{aligned}$$

The relationship between trigonometric polynomials such as f and the Dirichlet kernel \(D_{M}\) is the subject of the following lemma.

Lemma 5

Let \(h: [-\pi , \pi ] \rightarrow {\mathbb {C}}\) have \(\widehat{h}_{\omega } = 0\) for all \(\omega \notin B\), and define the set of points \(\left\{ y_{j}\right\} _{j=0}^{2M}=\left\{ -\pi +\frac{2\pi j}{N} \right\} _{j=0}^{2M}\). Then,

$$\begin{aligned} 2\pi \left( h*D_{M}\right) \left( x\right) ~=~h\left( x\right) ~=~\frac{2\pi }{N}\sum _{j=0}^{2M}h\left( y_{j}\right) D_{M}\left( x - y_{j} \right) \end{aligned}$$

holds for all \(x\in \left[ -\pi ,\pi \right] \).

Proof

By the definition of \(D_{M}\), we trivially have \(2\pi \left( \widehat{D_{M}} \right) _{\omega }=\chi _{B}\left( \omega \right) \) for all \(\omega \in {\mathbb {Z}}\). Thus,

$$\begin{aligned} \widehat{h}=2\pi \cdot \widehat{h}\circ \widehat{D_{M}}=2\pi \cdot \widehat{h*D_{M}} \end{aligned}$$

where, as before, \(\circ \) denotes the Hadamard product, and \(*\) denotes convolution. This yields \(h\left( x\right) =2\pi \left( h*D_{M}\right) \left( x\right) \) and so establishes the first equality above. To establish the second equality above, recall from (2) that for any \(\omega \in B\) we will have

$$\begin{aligned} \widehat{h}_{\omega }=\frac{(-1)^\omega }{N}\sum _{j=0}^{2M}h\left( -\pi +\frac{2\pi j}{N}\right) \mathbb {e}^{\frac{-2\pi \mathbb {i}j\omega }{N}}=\frac{1}{N}\sum _{j=0}^{2M}h\left( y_{j}\right) \mathbb {e}^{-\mathbb {i}\omega y_{j}}, \end{aligned}$$

since h is a trigonometric polynomial. Thus, given \(x\in \left[ -\pi ,\pi \right] \) one has

$$\begin{aligned} h\left( x\right)&=\sum _{\omega \in B}\widehat{h}_{\omega }\mathbb {e}^{\mathbb {i}\omega x} =\frac{1}{N}\sum _{j=0}^{2M}\left( h\left( y_{j}\right) \sum _{\omega \in B}\mathbb {e}^{\mathbb {i}\omega \left( x-y_{j} \right) }\right) =\frac{2\pi }{N}\sum _{j=0}^{2M}h\left( y_{j}\right) D_{M}\left( x-y_{j} \right) . \end{aligned}$$

We now have the desired result. \(\square \)

We can now write a formula for \(g*f\) which only depends on N evaluations of f in \([-\pi , \pi ]\).

Lemma 6

Given the set of equally spaced points \(\left\{ y_{j}\right\} ^{2M}_{j = 0}=\left\{ -\pi +\frac{2\pi j}{N} \right\} _{j=0}^{2M}\) one has that

$$\begin{aligned} \left( g*f\right) \left( x\right) =\frac{1}{N}\sum _{j=0}^{2M}f\left( y_{j}\right) \int _{-\pi }^{\pi }g\left( x-u-y_{j}\right) D_{M}\left( u\right) du \end{aligned}$$

for all \(x\in \left[ -\pi ,\pi \right] \).

Proof

By Lemma 5, we have

$$\begin{aligned} \left( g*f\right) \left( x\right)&=\frac{1}{2\pi }\int _{-\pi }^{\pi }g\left( x-y\right) f\left( y\right) dy =\frac{1}{N}\int _{-\pi }^{\pi }g\left( x-y\right) \sum ^{2M}_{j = 0}f\left( y_{j}\right) D_{M}\left( y-y_{j}\right) dy\\&=\frac{1}{N}\sum ^{2M}_{j = 0}f\left( y_{j}\right) \int _{-\pi }^{\pi }g\left( x-u-y_{j}\right) D_{M}\left( u\right) du. \end{aligned}$$

The last equality holds after a change of variables since g and \(D_{M}\) are both \(2\pi \hbox {-periodic}\). \(\square \)

The next two lemmas will help us bound the error produced by discretizing the integral weights present in the finite sum provided by Lemma 6 above. More specifically, they will ultimately allow us to approximate the sum in Lemma 6 by the sum in (11).

Lemma 7

Let \(x\in \left[ -\pi ,\pi \right] \) and \(y_{j} = -\pi +\frac{2\pi j}{N}\) for some \(j = 0, \dots , 2M\). Then,

$$\begin{aligned} \int _{-\pi }^{\pi }g\left( x-u-y_{j}\right) D_{M}\left( u\right) du=\sum _{n\in B}\widehat{g}_{n}\mathbb {e}^{\mathbb {i}n\left( x - y_{j}\right) }. \end{aligned}$$

Proof

Recalling that \(2\pi \left( \widehat{D_{M}} \right) _{\omega }=\chi _{B}\left( \omega \right) \) for all \(\omega \in {\mathbb {Z}}\) we have that

$$\begin{aligned} \int _{-\pi }^{\pi }g\left( x-u-y_{j}\right) D_{M}\left( u\right) du&= 2 \pi \left( D_{M} *g \right) \left( x - y_{j} \right) \\&= \sum _{n\in {\mathbb {Z}}} \widehat{g}_{n} \chi _{B}\left( n\right) \mathbb {e}^{\mathbb {i}n\left( x - y_{j}\right) } = \sum _{n\in B}\widehat{g}_{n}\mathbb {e}^{\mathbb {i}n\left( x - y_{j}\right) }. \end{aligned}$$

\(\square \)

Lemma 8

Denote \(I\left( a\right) :=\int _{-a}^{a}\mathbb {e}^{-x^{2}}dx\) for \(a>0\); then

$$\begin{aligned} \pi \left( 1-\mathbb {e}^{-a^{2}}\right)<I^{2}\left( a\right) <\pi \left( 1-\mathbb {e}^{-2a^{2}}\right) . \end{aligned}$$

Proof

Let \(a>0\) and observe that

$$\begin{aligned} I^{2}\left( a\right) =\int _{-a}^{a}\int _{-a}^{a}\mathbb {e}^{-x^{2}-y^{2}}dx dy >\iint _{\left\{ x^{2}+y^{2}\le a^{2}\right\} }\mathbb {e}^{-\left( x^{2}+y^{2}\right) }dxdy =\pi \left( 1-\mathbb {e}^{-a^{2}}\right) . \end{aligned}$$

The first equality holds by Fubini’s theorem, and the inequality follows simply by integrating a positive function over a disk of radius a as opposed to a square of side length 2a. A similar argument yields the upper bound. \(\square \)
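Since \(I(a)=\sqrt{\pi }\cdot \mathrm {erf}(a)\), these bounds are also easy to confirm numerically; a quick sanity check:

```python
import math

# Lemma 8 reads pi(1 - e^{-a^2}) < pi * erf(a)^2 < pi(1 - e^{-2a^2})
for a in (0.5, 1.0, 2.0, 4.0):
    I2 = math.pi * math.erf(a) ** 2
    assert math.pi * (1 - math.exp(-a * a)) < I2 < math.pi * (1 - math.exp(-2 * a * a))
```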

We are now ready to bound the difference between the integral weights present in the finite sum provided by Lemma 6, and the \(g\left( x - y_j \right) \)-weights present in the sum (11).

Lemma 9

Choose any \(\tau \in \left( 0,\frac{1}{\sqrt{2\pi }}\right) \), \(\alpha \in \left[ 1,\frac{N}{\sqrt{\ln N}}\right] \), and \(\beta \in \left( 0,\alpha \sqrt{\frac{\ln \left( 1/\tau \sqrt{2\pi }\right) }{2}}~\right] \). Let \(c_{1}=\frac{\beta \sqrt{\ln N}}{N}\) in the definition of the periodic Gaussian g so that

$$\begin{aligned} g\left( x\right) =\frac{N}{\beta \sqrt{\ln N}}\sum _{n=-\infty }^{\infty }\mathbb {e}^{-\frac{\left( x-2n\pi \right) ^{2}N^{2}}{2\beta ^{2}\ln N}}. \end{aligned}$$

Then for all \(x\in \left[ -\pi ,\pi \right] \) and \(y_{j} = -\pi +\frac{2\pi j}{N}\),

$$\begin{aligned} \left| g\left( x - y_{j}\right) -\int _{-\pi }^{\pi }g\left( x- u - y_{j}\right) D_{M}\left( u\right) du\right| < \frac{N^{1-\frac{\beta ^{2}}{18}}}{\beta \sqrt{\ln N}}. \end{aligned}$$

Proof

Using Lemma 7 we calculate

$$\begin{aligned} \left| g\left( x \!-\! y_{j}\right) \!-\!\int _{-\pi }^{\pi }g\left( x\!-\!u\!-\!y_{j}\right) D_{M}\left( u\right) du\right|&=\left| g\left( x-y_{j}\right) -\sum _{n\in B}\widehat{g}_{n}\mathbb {e}^{\mathbb {i}n\left( x - y_{j}\right) }\right| \\&=\left| \sum _{n\in B^{c}}\widehat{g}_{n}\mathbb {e}^{\mathbb {i}n\left( x-y_{j}\right) }\right| \\&\le \frac{1}{\sqrt{2\pi }}\sum _{\left| n\right| >M}\mathbb {e}^{-\frac{c_{1}^{2}n^{2}}{2}}\quad (\hbox {Using Lemma}\,2)\\&\le \frac{2}{\sqrt{2\pi }}\int _{M}^{\infty }\mathbb {e}^{-\frac{c_{1}^{2}n^{2}}{2}}dn\\&=\sqrt{\frac{2}{\pi }}\int _{M}^{\infty }\mathbb {e}^{-\frac{\beta ^{2}n^{2}\ln N}{2N^{2}}}dn. \end{aligned}$$

Upon the change of variable \(v=\frac{\beta n\sqrt{\ln N}}{\sqrt{2}N}\), we get that

$$\begin{aligned}&\left| g\left( x-y_{j}\right) -\int _{-\pi }^{\pi }g\left( x-u-y_{j}\right) D_{M}\left( u\right) du\right| \\&\qquad \qquad \le \sqrt{\frac{2}{\pi }}\frac{\sqrt{2}N}{\beta \sqrt{\ln N}}\int _{\frac{\beta M\sqrt{\ln N}}{\sqrt{2}N}}^{\infty }\mathbb {e}^{-v^{2}}dv\\&\qquad \qquad =\frac{2N}{\beta \sqrt{\pi \ln N}}\frac{1}{2}\left( \int _{-\infty }^{\infty }\mathbb {e}^{-v^{2}}dv-\int _{-\frac{\beta M\sqrt{\ln N}}{\sqrt{2}N}}^{\frac{\beta M\sqrt{\ln N}}{\sqrt{2}N}}\mathbb {e}^{-v^{2}}dv\right) \\&\qquad \qquad <\frac{N}{\beta \sqrt{\pi \ln N}}\left( \sqrt{\pi }-\sqrt{\pi \left( 1-\mathbb {e}^{-\frac{\beta ^{2}M^{2}\ln N}{2N^{2}}}\right) }\right) \\&\qquad \qquad =\frac{N}{\beta \sqrt{\ln N}}\left( 1-\sqrt{1-N^{-\frac{\beta ^{2}M^{2}}{2N^{2}}}}\right) \end{aligned}$$

where the last inequality follows from Lemma 8. Noting now that

$$\begin{aligned} y\in \left[ 0,1\right] \implies 1-\sqrt{1-y}\le y, \end{aligned}$$

and that \(\frac{N}{M}=2+\frac{1}{M}\in \left( 2,3\right] \) for all \(M \in {\mathbb {Z}}^+\), we can further see that

$$\begin{aligned} \frac{N}{\beta \sqrt{\ln N}}\left( 1-\sqrt{1-N^{-\frac{\beta ^{2}M^{2}}{2N^{2}}}}\right) \le \frac{N}{\beta \sqrt{\ln N}}N^{-\frac{\beta ^{2}M^{2}}{2N^{2}}} \le \frac{N^{1-\frac{\beta ^{2}}{18}}}{\beta \sqrt{\ln N}} \end{aligned}$$

also always holds. \(\square \)

With the lemmas above we can now prove that (11) can be used to approximate \(\left( g*f\right) \left( x\right) \) for all \(x \in [-\pi , \pi ]\) with controllable error.

Theorem 3

Let \(p \ge 1\). Using the same values of the parameters from Lemma 9 above, one has

$$\begin{aligned} \left| \left( g*f\right) \left( x\right) -\frac{1}{N}\sum ^{2M}_{j = 0}f\left( y_{j}\right) g\left( x - y_{j}\right) \right| \le \frac{\left\| \mathbf {f}\right\| _{p}}{\beta \sqrt{\ln N}}N^{1-\frac{\beta ^{2}}{18} - \frac{1}{p}} \end{aligned}$$

for all \(x\in \left[ -\pi ,\pi \right] \).

Proof

Using Lemmas 6 and 9 followed by Hölder’s inequality, we have

$$\begin{aligned}&\left| \left( g*f\right) \left( x\right) -\frac{1}{N}\sum ^{2M}_{j = 0}f\left( y_{j}\right) g\left( x-y_{j}\right) \right| \\&\qquad \qquad =\left| \frac{1}{N}\sum ^{2M}_{j = 0}f\left( y_{j}\right) \left( g\left( x-y_{j}\right) -\int _{-\pi }^{\pi }g\left( x-u-y_{j}\right) D_{M}\left( u\right) du\right) \right| \\&\qquad \qquad \le \frac{1}{N}\sum ^{2M}_{j = 0}\left| f\left( y_{j}\right) \right| \frac{N^{1-\frac{\beta ^{2}}{18}}}{\beta \sqrt{\ln N}} \le \frac{N^{\frac{-\beta ^{2}}{18}} }{\beta \sqrt{\ln N}}\left\| \mathbf {f}\right\| _{p} N^{1-\frac{1}{p}}. \end{aligned}$$

\(\square \)

To summarize, Theorem 3 tells us that \(\left( g*f\right) \left( x\right) \) can be approximately computed in \({\mathcal {O}}\left( N\right) \)-time for any \(x\in \left[ -\pi ,\pi \right] \) using (11). This linear runtime cost may be reduced significantly, however, if one is willing to accept an additional trade-off between accuracy and the number of terms needed in the sum (11). This trade-off is characterized in the next lemma.
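The following self-contained numpy sketch (illustrative values of N and r, with \(\beta = 6\sqrt{r}\) as chosen in Sect. 4 below) compares the sum (11) against the exact value of \(\left( g*f\right) (x)\) computed via Lemma 2 and the convolution theorem, and confirms the bound of Theorem 3 with \(p = \infty \):

```python
import numpy as np

N, r = 101, 2.0                                 # illustrative values (N odd)
M = (N - 1) // 2
B = np.arange(-M, M + 1)
beta = 6.0 * np.sqrt(r)                         # the choice made in Sect. 4 below
c1 = beta * np.sqrt(np.log(N)) / N

def g_eval(t):
    """Periodized Gaussian (3); |n| <= 3 images suffice for |t| <= 2 pi here."""
    n = np.arange(-3, 4)
    return np.exp(-(t[..., None] - 2 * np.pi * n) ** 2 / (2 * c1 ** 2)).sum(-1) / c1

rng = np.random.default_rng(1)
f_hat = rng.standard_normal(N) + 1j * rng.standard_normal(N)
y = -np.pi + 2 * np.pi * np.arange(N) / N
f_samp = np.exp(1j * np.outer(y, B)) @ f_hat    # the entries of the vector f

g_hat = np.exp(-c1 ** 2 * B ** 2 / 2) / np.sqrt(2 * np.pi)      # Lemma 2
x = 0.3
exact = np.sum(g_hat * f_hat * np.exp(1j * B * x))   # (g*f)(x) via (f*g)^ = f^ o g^
approx = np.sum(f_samp * g_eval(x - y)) / N          # the full sum (11)

bound = np.max(np.abs(f_samp)) * N ** (1 - beta ** 2 / 18) / (beta * np.sqrt(np.log(N)))
assert abs(exact - approx) <= bound                  # Theorem 3 with p = infinity
```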

Lemma 10

Let \(x\in \left[ -\pi ,\pi \right] \), \(p \ge 1\), \(\gamma \in {\mathbb {R}}^+\), and \(\kappa := \lceil \gamma \ln N \rceil + 1\). Set \(j' := \arg \min _j \left| x - y_j \right| \). Using the same values of the other parameters from Lemma 9 above, one has

$$\begin{aligned} \left| \frac{1}{N}\sum ^{2M}_{j = 0}f\left( y_{j}\right) g\left( x - y_{j}\right) - \frac{1}{N}\sum ^{j' + \kappa }_{j = j' - \kappa }f\left( y_{j}\right) g\left( x - y_{j}\right) \right| \le 2 \Vert \mathbf {f}\Vert _p ~N^{-\frac{2 \pi ^2 \gamma ^2}{\beta ^2}} \end{aligned}$$

for all \(\beta \ge 4\) and \(N \ge \beta ^2\).

Proof

Appealing to Lemma 1 and recalling that \(c_{1}=\frac{\beta \sqrt{\ln N}}{N}\) we can see that

$$\begin{aligned} g\left( x\right) \le \left( \frac{3N}{\beta \sqrt{\ln N}}+\frac{1}{\sqrt{2\pi }}\right) \mathbb {e}^{-\frac{x^{2}N^{2}}{2\beta ^{2}\ln N}}. \end{aligned}$$

Using this fact we have that

$$\begin{aligned} g\left( x - y_{j' \pm k}\right)&\le \left( \frac{3N}{\beta \sqrt{\ln N}}+\frac{1}{\sqrt{2\pi }}\right) \mathbb {e}^{-\frac{\left( x - y_{j' \pm k} \right) ^{2}N^{2}}{2\beta ^{2}\ln N}}\\&\le \left( \frac{3N}{\beta \sqrt{\ln N}}+\frac{1}{\sqrt{2\pi }}\right) \mathbb {e}^{-\frac{\left( 2k-1\right) ^{2} \pi ^2}{2\beta ^{2}\ln N}} \end{aligned}$$

for all \(k \in {\mathbb {Z}}_N\). As a result, one can now bound

$$\begin{aligned} \left| \frac{1}{N}\sum ^{2M}_{j = 0}f\left( y_{j}\right) g\left( x - y_{j}\right) - \frac{1}{N}\sum ^{j' + \kappa }_{j = j' - \kappa }f\left( y_{j}\right) g\left( x - y_{j}\right) \right| \end{aligned}$$

above by

$$\begin{aligned} \left( \frac{3}{\beta \sqrt{\ln N}}+\frac{1}{N \sqrt{2\pi }}\right) \sum ^{N - 2 \kappa - 1}_{k = \kappa +1} \left( \left| f\left( y_{j'-k}\right) \right| + \left| f\left( y_{j'+k}\right) \right| \right) \mathbb {e}^{-\frac{\left( 2k-1\right) ^{2} \pi ^2}{2\beta ^{2}\ln N}}, \end{aligned}$$
(12)

where the \(y_j\)-indexes are considered modulo N as appropriate.

Our goal is now to employ Hölder’s inequality on (12). Toward that end, we bound the q-norm of the vector \(\mathbf{h} := \left\{ \mathbb {e}^{-\frac{\left( \kappa + \ell - \frac{1}{2}\right) ^{2} 2 \pi ^2}{\beta ^{2}\ln N}} \right\} ^{N- 2 \kappa - 1}_{\ell = 1}\). Letting \(a := q \left( \frac{4}{\beta ^{2}\ln N} \right) \) we have that

$$\begin{aligned} \Vert \mathbf{h} \Vert _q^{q}&= \sum ^{N- 2 \kappa - 1}_{\ell = 1} \mathbb {e}^{-\frac{\pi ^2}{2} \left( \kappa + \ell - \frac{1}{2}\right) ^{2} a} < \sum ^{\infty }_{\ell = \kappa } \mathbb {e}^{-\frac{\pi ^2}{2} \ell ^{2} a} \le \int ^\infty _{\kappa -1} \mathbb {e}^{-\frac{\pi ^2 x^{2}}{2} a}~dx\\&\le \sqrt{\frac{1}{2 \pi a}} - \frac{1}{ \pi \sqrt{2 a}} \int ^{\pi (\kappa - 1) \sqrt{\frac{a}{2}}}_{-\pi (\kappa - 1) \sqrt{\frac{a}{2}}} \mathbb {e}^{-u^2}~du \le \sqrt{\frac{1}{2 \pi a}} \mathbb {e}^{-\frac{a \pi ^2}{2}(\kappa - 1)^2} \\&\le \frac{\beta }{2}\sqrt{\frac{\ln N}{2 \pi q}} N^{-\frac{2 q \pi ^2 \gamma ^2}{\beta ^2}}, \end{aligned}$$

where we have used Lemma 8 once again. As a result we have that

$$\begin{aligned} \Vert \mathbf{h} \Vert _q \le \left( \frac{\beta ^2\ln N}{8 \pi } \right) ^{\frac{1}{2q}} q^{- \frac{1}{2q}} N^{-\frac{2 \pi ^2 \gamma ^2}{\beta ^2}} \le \left( \frac{\beta ^2\ln N}{8 \pi } \right) ^{\frac{1}{2q}} N^{-\frac{2 \pi ^2 \gamma ^2}{\beta ^2}} \end{aligned}$$

for all \(q \ge 1\). Applying Hölder’s inequality to (12) we can now see that (12) is bounded above by

$$\begin{aligned} 2 \left( \frac{3}{\beta \sqrt{\ln N}}+\frac{1}{N \sqrt{2\pi }}\right) \Vert \mathbf {f}\Vert _p \left( \frac{\beta ^2\ln N}{8 \pi } \right) ^{\frac{1}{2} - \frac{1}{2p}} N^{-\frac{2 \pi ^2 \gamma ^2}{\beta ^2}}. \end{aligned}$$

The result now follows. \(\square \)

We may now finally combine the truncation and estimation errors in Theorem 3 and Lemma 10 above in order to bound the total error one incurs by approximating \(\left( g*f\right) (x)\) via a truncated portion of (11) for any given \(x \in [-\pi , \pi ]\).

Theorem 4

Fix \(x\in \left[ -\pi ,\pi \right] \), \(p\ge 1\) (or \(p = \infty \)), \(\frac{N}{36}\ge r \ge 1\), and \(g: [-\pi , \pi ] \rightarrow {\mathbb {R}}^{+}\) to be the \(2\pi \)-periodic Gaussian (3) with \(c_1 := \frac{6 \sqrt{\ln (N^r)}}{N}\). Set \(j' := \arg \min _j \left| x - y_j \right| \) where \(y_{j} = -\pi +\frac{2\pi j}{N}\) for all \(j = 0, \dots , 2M\). Then,

$$\begin{aligned} \left| \left( g*f\right) (x) - \frac{1}{N}\sum ^{j' + \left\lceil \frac{6r}{\sqrt{2} \pi } \ln N \right\rceil + 1}_{j = j' - \left\lceil \frac{6r}{\sqrt{2} \pi } \ln N \right\rceil - 1}f\left( y_{j}\right) g\left( x - y_{j}\right) \right| \le 3 \frac{\Vert \mathbf {f}\Vert _p}{N^r}. \end{aligned}$$

As a consequence, we can see that \(\left( g*f\right) (x)\) can always be computed to within \({\mathcal {O}} \left( \Vert \mathbf {f}\Vert _{\infty } N^{-r} \right) \)-error in just \({\mathcal {O}}\left( r \log N \right) \)-time for any given \(\mathbf {f}\in {\mathbb {C}}^N\) once the \(\big \{ g\left( x - y_{j}\right) \big \}^{j' + \left\lceil \frac{6r}{\sqrt{2} \pi } \ln N \right\rceil + 1}_{j = j' - \left\lceil \frac{6r}{\sqrt{2} \pi } \ln N \right\rceil - 1}\) have been precomputed.

Proof

Combining Theorem 3 and Lemma 10 we can see that

$$\begin{aligned}&\left| \left( g*f\right) (x) - \frac{1}{N}\sum ^{j' + \left\lceil \frac{6r}{\sqrt{2} \pi } \ln N \right\rceil + 1}_{j = j' - \left\lceil \frac{6r}{\sqrt{2} \pi } \ln N \right\rceil - 1}f\left( y_{j}\right) g\left( x - y_{j}\right) \right| \\&\qquad \qquad \le \Vert \mathbf {f}\Vert _p \left( \frac{1}{\beta \sqrt{\ln N}}N^{1-\frac{\beta ^{2}}{18} - \frac{1}{p}} + 2 ~N^{-\frac{2 \pi ^2 \gamma ^2}{\beta ^2}} \right) \end{aligned}$$

where \(\beta = 6 \sqrt{r} \ge 6\), \(N \ge 36 r = \beta ^2\), and \(\gamma = \frac{6r}{\sqrt{2} \pi } = \frac{\beta \sqrt{r}}{\sqrt{2} \pi }\). \(\square \)
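As a concrete illustration of Theorem 4, the following sketch (again with illustrative values; the exact convolution is computed via Lemma 2 and the convolution theorem, as in the previous sketch) retains only the \(2\kappa + 1\) grid points nearest x and checks the stated \(3 \Vert \mathbf {f}\Vert _{\infty } N^{-r}\) error bound:

```python
import numpy as np

N, r = 101, 2.0                                  # illustrative values (N odd)
M = (N - 1) // 2
B = np.arange(-M, M + 1)
c1 = 6.0 * np.sqrt(r * np.log(N)) / N            # c_1 = 6 sqrt(ln(N^r)) / N

def g_eval(t):
    """Periodized Gaussian (3); |n| <= 3 images suffice for |t| <= 2 pi here."""
    n = np.arange(-3, 4)
    return np.exp(-(t[..., None] - 2 * np.pi * n) ** 2 / (2 * c1 ** 2)).sum(-1) / c1

rng = np.random.default_rng(2)
f_hat = rng.standard_normal(N) + 1j * rng.standard_normal(N)
y = -np.pi + 2 * np.pi * np.arange(N) / N
f_samp = np.exp(1j * np.outer(y, B)) @ f_hat     # the entries of the vector f

x = 0.3
kappa = int(np.ceil(6 * r * np.log(N) / (np.sqrt(2) * np.pi))) + 1
jp = int(np.round((x + np.pi) * N / (2 * np.pi)))        # j' = argmin_j |x - y_j|
idx = (jp + np.arange(-kappa, kappa + 1)) % N            # 2 kappa + 1 grid points

g_hat = np.exp(-c1 ** 2 * B ** 2 / 2) / np.sqrt(2 * np.pi)      # Lemma 2
exact = np.sum(g_hat * f_hat * np.exp(1j * B * x))              # (g*f)(x)
trunc = np.sum(f_samp[idx] * g_eval(x - y[idx])) / N            # truncated (11)
assert abs(exact - trunc) <= 3 * np.max(np.abs(f_samp)) * N ** (-r)
```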

We are now prepared to bound the error of the proposed approach when utilizing the SFTs developed in [21].

4 An Error Guarantee for Algorithm 1 When Using the SFTs Proposed in [21]

Given the \(2\pi \hbox {-periodic}\) Gaussian \(g: [-\pi , \pi ] \rightarrow {\mathbb {R}}^{+}\) (3), consider the periodic modulation of g, \(\tilde{g}_{q}:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {C}}\), for any \(q\in {\mathbb {Z}}\) defined by

$$\begin{aligned} \tilde{g}_{q}\left( x\right) = \mathbb {e}^{-\mathbb {i}qx}g\left( x\right) . \end{aligned}$$

One can see that

$$\begin{aligned} \tilde{g}_{q}\left( x\right)&=\mathbb {e}^{-\mathbb {i}qx}\sum _{\omega =-\infty }^{\infty }\widehat{g}_{\omega } \mathbb {e}^{\mathbb {i}\omega x}=\sum _{\omega =-\infty }^{\infty }\widehat{g}_{\omega } \mathbb {e}^{ \mathbb {i}\left( \omega -q\right) x}=\sum _{\tilde{\omega }=-\infty }^{\infty }\widehat{g}_{\tilde{\omega }+q} \mathbb {e}^{ \mathbb {i}\tilde{\omega }x}, \end{aligned}$$

so that the Fourier series coefficients of \(\tilde{g}_{q}\) are those of g, shifted by q; that is,

$$\begin{aligned} \left( \widehat{\tilde{g}_{q}}\right) _{\omega }=\widehat{g}_{\omega +q}. \end{aligned}$$
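This shift identity is easy to confirm numerically; a minimal quadrature check (illustrative values, with the periodization of g truncated to \(|n| \le 3\)):

```python
import numpy as np

K, q, c1 = 4096, 7, 0.05                          # illustrative values
x = -np.pi + 2 * np.pi * np.arange(K) / K
n = np.arange(-3, 4)
g = np.exp(-(x[:, None] - 2 * np.pi * n) ** 2 / (2 * c1 ** 2)).sum(1) / c1
for w in range(-5, 6):
    # (1/(2 pi)) integral of e^{-iqx} g(x) e^{-iwx} dx, by the trapezoid rule
    coeff = np.mean(np.exp(-1j * (q + w) * x) * g)
    assert np.isclose(coeff, np.exp(-c1 ** 2 * (w + q) ** 2 / 2) / np.sqrt(2 * np.pi))
```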

In line 9 of Algorithm 1, we provide the SFT algorithm of [21] with approximate evaluations of \(\left\{ \left( \tilde{g}_{q}*f\right) \left( x_{k}\right) \right\} _{k=1}^{m},\) namely, \(\left\{ \left( \tilde{g}_{q}*f\right) \left( x_{k}\right) +n_{k}\right\} _{k=1}^{m}\), where, by Theorem 4, the perturbations \(n_{k}\) are bounded, for instance, by

$$\begin{aligned} \left| n_{k}\right| \le 3\frac{\left\| \mathbf {f}\right\| _{\infty }}{N^{r}}\ \forall \ k=1,\dots ,m. \end{aligned}$$

With this in mind, let us apply Lemma 4 to the function \(\tilde{g}_{q}*f\). We have the following lemma.

Lemma 11

Let \(s\in [2,N]\cap {\mathbb {N}}\), and \(\mathbf {n}\in {\mathbb {C}}^{m}\) be the vector containing the total errors incurred by approximating \(\tilde{g}_{q}*f\) via a truncated version of (11), as per Theorem 4. There exists a set of m points \(\left\{ x_{k}\right\} _{k=1}^{m}\subset \left[ -\pi ,\pi \right] \) such that Algorithm 3 on page 72 of [21], when given access to the corrupted samples \(\left\{ \left( \tilde{g}_{q}*f\right) \left( x_{k}\right) +n_{k}\right\} _{k=1}^{m},\) will identify a subset \(S\subseteq B\) which is guaranteed to contain all \(\omega \in B\) with

$$\begin{aligned} \left| \left( \widehat{\tilde{g}_{q}*f}\right) _{\omega }\right| >4\left( \frac{1}{s}\cdot \left\| \widehat{\tilde{g}_{q}*f}-\left( \widehat{\tilde{g}_{q}*f}\right) _{s}^{\mathrm{opt}}\right\| _{1}+3\left\| \mathbf {f}\right\| _{\infty }N^{-r}\right) =:4\tilde{\delta }. \end{aligned}$$

Furthermore, every \(\omega \in S\) returned by Algorithm 3 will also have an associated Fourier series coefficient estimate \(z_{\omega }\in {\mathbb {C}}\) which is guaranteed to have

$$\begin{aligned} \left| \left( \widehat{\tilde{g}_{q}*f}\right) _{\omega }-z_{\omega }\right| \le \sqrt{2}\tilde{\delta }. \end{aligned}$$

Next, we need to guarantee that the estimates of \(\widehat{\tilde{g}_{q}*f}\) returned by Algorithm 3 of [21] will yield good estimates of \(\widehat{f}\) itself. We have the following.

Lemma 12

Let \(s\in [2,N]\cap {\mathbb {N}}\). Given a \(2\pi \)-periodic function \(f:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {C}}\), the periodic Gaussian g, and any of its modulations \(\tilde{g}_{q}\left( x\right) =\mathbb {e}^{-\mathbb {i}qx}g\left( x\right) \), one has

$$\begin{aligned} \left\| \widehat{\tilde{g}_{q}*f}-\left( \widehat{\tilde{g}_{q}*f}\right) _{s}^{\mathrm{opt}}\right\| _{1}\le \frac{1}{2} \left\| \widehat{f}-\widehat{f}_{s}^{\mathrm{opt}}\right\| _{1}. \end{aligned}$$

Proof

Recall the definition of \(R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) \) as the subset of B containing the s most energetic frequencies of \(\widehat{f}\), and observe that

$$\begin{aligned} \frac{1}{2} \left\| \widehat{f}-\widehat{f}_{s}^{\mathrm{opt}}\right\| _{1}= \frac{1}{2} \sum _{\omega \in B\backslash R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }\left| \widehat{f}_{\omega }\right| \ge \sum _{\omega \in B\backslash R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }\left| \left( \widehat{\tilde{g}_{q}}\right) _{\omega }\cdot \widehat{f}_{\omega }\right| \end{aligned}$$

since, by Lemma 2, \(\widehat{g}_{\omega }<\frac{1}{2}\) for all \(\omega \), and consequently, \(\left( \widehat{\tilde{g}_{q}}\right) _{\omega }=\widehat{g}_{\omega +q}<\frac{1}{2}\) for all \(\omega \). Moreover,

$$\begin{aligned} \sum _{\omega \in B\backslash R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }\left| \left( \widehat{\tilde{g}_{q}}\right) _{\omega }\cdot \widehat{f}_{\omega }\right|&\ge \sum _{\omega \in B\backslash R_{s}^{\mathrm{opt}}\left( \widehat{\tilde{g}_{q}*f}\right) }\left| \left( \widehat{\tilde{g}_{q}}\right) _{\omega }\cdot \widehat{f}_{\omega }\right| ~=~\left\| \widehat{\tilde{g}_{q}*f}-\left( \widehat{\tilde{g}_{q}*f}\right) _{s}^{\mathrm{opt}}\right\| _{1}. \end{aligned}$$

\(\square \)
Let us combine the guarantees above into the following lemma.

Lemma 13

Let \(s\in [2,N]\cap {\mathbb {N}}\), and \(\mathbf {n}\in {\mathbb {C}}^{m}\) be the vector containing the total errors incurred by approximating \(\tilde{g}_{q}*f\) via a truncated version of (11), as per Theorem 4. There exists a set of m points \(\left\{ x_{k}\right\} _{k=1}^{m}\subset \left[ -\pi ,\pi \right] \) such that Algorithm 3 on page 72 of [21], when given access to the corrupted samples \(\left\{ \left( \tilde{g}_{q}*f\right) \left( x_{k}\right) +n_{k}\right\} _{k=1}^{m},\) will identify a subset \(S\subseteq B\) which is guaranteed to contain all \(\omega \in B\) with

$$\begin{aligned} \left| \left( \widehat{\tilde{g}_{q}*f}\right) _{\omega }\right| >4\left( \frac{1}{2s}\cdot \left\| \widehat{f}-\widehat{f}_{s}^{\mathrm{opt}}\right\| _{1}+3\left\| \mathbf {f}\right\| _{\infty }N^{-r}\right) =:4\delta . \end{aligned}$$

Furthermore, every \(\omega \in S\) returned by Algorithm 3 will also have an associated Fourier series coefficient estimate \(z_{\omega }\in {\mathbb {C}}\) which is guaranteed to have

$$\begin{aligned} \left| \left( \widehat{\tilde{g}_{q}}\right) _{\omega }\cdot \widehat{f}_{\omega }-z_{\omega }\right| \le \sqrt{2}\delta . \end{aligned}$$

The lemma above implies that for any choice of q in line 4 of Algorithm 1, we are guaranteed to find all \(\omega \in \left[ q-\left\lceil \frac{N}{\alpha \sqrt{\ln N}}\right\rceil ,q+\left\lceil \frac{N}{\alpha \sqrt{\ln N}}\right\rceil \right) \cap B\) with

$$\begin{aligned} \left| \widehat{f}_{\omega }\right| >\max _{\tilde{\omega }}\frac{4\delta }{\left( \widehat{\tilde{g}_{q}}\right) _{\tilde{\omega }}}\ge \frac{4\delta }{\tau } \end{aligned}$$

where \(\alpha \) and \(\tau \) are as defined in Lemma 3. Moreover, the Fourier series coefficient estimates \(z_{\omega }\) returned by Algorithm 3 will satisfy

$$\begin{aligned} \left| \widehat{f}_{\omega }-\frac{z_{\omega }}{\left( \widehat{\tilde{g}_{q}}\right) _{\omega }}\right| \le \max _{\tilde{\omega }}\frac{\sqrt{2}\delta }{\left( \widehat{\tilde{g}_{q}}\right) _{\tilde{\omega }}}\le \frac{\sqrt{2}\delta }{\tau }. \end{aligned}$$

Following Theorem 4, which guarantees a decay of \(N^{-r}\) in the total approximation error, let us set \(\beta =6\sqrt{r}\) for \(1\le r\le \frac{N}{36}\). Recall from Lemma 3 the choice of \(\beta \in \left( 0,\alpha \sqrt{\frac{\ln \left( 1/\tau \sqrt{2\pi }\right) }{2}}\right] \) where \(\tau \) is to be chosen from \(\left( 0,\frac{1}{\sqrt{2\pi }}\right) \). Thus, we must choose \(\alpha \in \left[ 1,\frac{N}{\sqrt{\ln N}}\right] \) so that

$$\begin{aligned} 6\sqrt{r}\le \alpha \sqrt{\frac{\ln \left( 1/\tau \sqrt{2\pi }\right) }{2}}\iff \alpha \ge \frac{6\sqrt{2r}}{\ln \left( 1/\tau \sqrt{2\pi }\right) }. \end{aligned}$$

We may remove the dependence on \(\tau \) simply by setting, e.g., \(\tau =\frac{1}{3}\). Then \(\alpha ={\mathcal {O}}\left( \sqrt{r}\right) \).
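For concreteness, the constant hidden by \(\alpha ={\mathcal {O}}\left( \sqrt{r}\right) \) once \(\tau =1/3\) is fixed can be computed directly (a small illustrative calculation):

```python
import numpy as np

tau = 1.0 / 3.0
for r in (1, 4, 16):
    alpha_min = 6 * np.sqrt(2 * r) / np.log(1 / (tau * np.sqrt(2 * np.pi)))
    print(r, round(alpha_min, 1))                # 47.2, 94.4, 188.9 => ~47.2 sqrt(r)
```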

We are now ready to state the recovery guarantee of Algorithm 1 and its operation count.

Theorem 5

Let \(N\in {\mathbb {N}}\), \(s\in [2,N]\cap {\mathbb {N}}\), and \(1\le r \le \frac{N}{36}\) as in Theorem 4. If Algorithm 3 of [21] is used in Algorithm 1 then Algorithm 1 will always deterministically identify a subset \(S\subseteq B\) and a sparse vector \(\mathbf {v}\vert _{S}\in {\mathbb {C}}^{N}\) satisfying

$$\begin{aligned} \left\| {\hat{\mathbf {f}}}-\mathbf {v}\vert _{S}\right\| _{2}\le \left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{2}+\frac{33}{\sqrt{s}}\cdot \left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{1}+198\sqrt{s}\left\| \mathbf {f}\right\| _{\infty }N^{-r}. \end{aligned}$$
(13)

Algorithm 1’s operation count is then

$$\begin{aligned} {\mathcal {O}} \left( \frac{ s^2\cdot r^{\frac{3}{2}} \cdot \log ^{\frac{11}{2}} (N)}{\log (s)} \right) . \end{aligned}$$

If returning a sparse vector \(\mathbf {v}\vert _{S}\in {\mathbb {C}}^{N}\) that satisfies (13) with probability at least \((1-p) \in [2/3,1)\) is sufficient, a Monte Carlo variant of the deterministic Algorithm 3 in [21] may be used in line 9 of Algorithm 1. In this case Algorithm 1’s operation count is

$$\begin{aligned} {\mathcal {O}} \left( s\cdot r^{\frac{3}{2}} \cdot \log ^\frac{9}{2}(N)\cdot \log \left( \frac{N}{p}\right) \right) . \end{aligned}$$

Proof

Redefine \(\delta \) in the proof of Theorem 7 in [21] as

$$\begin{aligned} \delta =\frac{1}{\tau }\left( \frac{1}{2s}\cdot \left\| \widehat{f}-\widehat{f}_{s}^{\mathrm{opt}}\right\| _{1}+3\left\| \mathbf {f}\right\| _{\infty }N^{-r}\right) = 3\left( \frac{1}{2s}\cdot \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{1}+3\left\| \mathbf {f}\right\| _{\infty }N^{-r}\right) , \end{aligned}$$

and observe that any \(\omega \in B=\left( -\left\lceil \frac{N}{2}\right\rceil ,\left\lfloor \frac{N}{2}\right\rfloor \right] \cap {\mathbb {Z}}\) that is reconstructed by Algorithm 1 will have a Fourier series coefficient estimate \(v_{\omega }\) that satisfies

$$\begin{aligned} \left| v_{\omega }-{\hat{\mathbf {f}}}_{\omega }\right| = \left| v_{\omega }-\widehat{f}_{\omega }\right| \le \sqrt{2}\cdot \delta . \end{aligned}$$

We can thus bound the approximation error by

$$\begin{aligned} \begin{aligned} \left\| {\hat{\mathbf {f}}}-\mathbf {v}\vert _{S}\right\| _{2}&\le \left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}\vert _{S}\right\| _{2}+\left\| {\hat{\mathbf {f}}}\vert _{S}-\mathbf {v}\vert _{S}\right\| _{2}\le \left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}\vert _{S}\right\| _{2}+2\sqrt{s}\cdot \delta \\&=\sqrt{\left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{2}^{2}+\sum _{\omega \in R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) \backslash S}\left| \widehat{f}_{\omega }\right| ^{2}-\sum _{\tilde{\omega }\in S\backslash R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }\left| \widehat{f}_{\tilde{\omega }}\right| ^{2}}+2\sqrt{s}\cdot \delta . \end{aligned} \end{aligned}$$
(14)

In order to make additional progress on (14) we must consider the possible magnitudes of the entries of \(\widehat{f}\) at indices in \(S\backslash R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) \) and \(R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) \backslash S\). Careful analysis (in line with the techniques employed in the proof of Theorem 7 of [21]) indicates that

$$\begin{aligned} \sum _{\omega \in R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) \backslash S}\left| \widehat{f}_{\omega }\right| ^{2}-\sum _{\tilde{\omega }\in S\backslash R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }\left| \widehat{f}_{\tilde{\omega }}\right| ^{2}\le s\cdot \left( 8\sqrt{2}+8\right) ^{2}\cdot \delta ^{2}. \end{aligned}$$

Therefore, in the worst possible case equation (14) will remain bounded by

$$\begin{aligned} \left\| {\hat{\mathbf {f}}}-\mathbf {v}\vert _{S}\right\| _{2}\le \sqrt{\left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{2}^{2}+s\cdot \left( 8\sqrt{2}+8\right) ^{2}\cdot \delta ^{2}}+2\sqrt{s}\cdot \delta \le \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{2}+22\sqrt{s}\cdot \delta . \end{aligned}$$

The error bound stated in (13) follows.

The runtimes follow by observing that \(c_2 = {\mathcal {O}} \left( \alpha \cdot \log ^{\frac{1}{2}} (N)\right) = {\mathcal {O}}\left( r^{\frac{1}{2}}\cdot \log ^{\frac{1}{2}} (N) \right) \) as chosen in line 2 of Algorithm 1, and that for every choice of q in line 4 of Algorithm 1, all of the evaluations \(\left\{ (\tilde{g}_q*f)(x_k) \right\} ^m_{k=1}\) can be approximated very accurately in just \({\mathcal {O}}(m r \log N)\)-time, where the number of samples m is of the order described in Theorem 2. \(\square \)

We are now ready to empirically evaluate Algorithm 1 with several different SFT algorithms \({\mathcal {A}}\) used in its line 9.

5 Numerical Evaluation

In this section we evaluate the performance of three new discrete SFT algorithms resulting from Algorithm 1: DMSFT-4, DMSFT-6,Footnote 8 and CLW-DSFT.Footnote 9 All of them were developed by utilizing different SFT algorithms in line 9 of Algorithm 1. Here DMSFT stands for the Discrete Michigan State Fourier Transform algorithm. Both DMSFT-4 and DMSFT-6 are implementations of Algorithm 1 that use a randomized version of the SFT algorithm GFFT [29] in their line 9.Footnote 10 The only difference between DMSFT-4 and DMSFT-6 is how accurately each one estimates the convolution in line 7 of Algorithm 1: for DMSFT-4 we use \(\kappa = 4\) in the partial discrete convolution in Lemma 10 when approximating \(\tilde{g}_q*f\) at each \(x_k\), while for DMSFT-6 we always use \(\kappa = 6\). CLW-DSFT stands for the Christlieb Lawlor Wang Discrete Sparse Fourier Transform algorithm. It is an implementation of Algorithm 1 that uses the SFT developed in [6] in its line 9, with \(\kappa \) varying between 12 and 20 for its line 7 convolution estimates (depending on each input vector’s Fourier sparsity, etc.). DMSFT-4, DMSFT-6, and CLW-DSFT were all implemented in C++ in order to empirically evaluate their runtime and noise robustness characteristics.

We also compare these new implementations’ runtime and robustness characteristics with those of FFTW 3.3.4Footnote 11 and sFFT 2.0.Footnote 12 FFTW is a highly optimized FFT implementation which runs in \({\mathcal {O}}(N\log N)\)-time for input vectors of length N. All standard discrete Fourier transforms in the numerical experiments are performed using FFTW 3.3.4 with an FFTW_MEASURE plan. sFFT 2.0 is a randomized discrete sparse Fourier transform algorithm written in C++ which is both stable and robust to noise. It was developed by Hassanieh et al. in [15]. Note that DMSFT-4, DMSFT-6, CLW-DSFT, and sFFT 2.0 are all randomized algorithms designed to approximate DFTs \({\hat{\mathbf {f}}}\in {\mathbb {C}}^N\) that are approximately s-sparse. This means that all of them take both the sparsity s and the size N of the DFT they aim to recover as input parameters. In contrast, FFTW cannot utilize existing sparsity to its advantage. Finally, all experiments were run on a Linux CentOS machine with a 2.50 GHz CPU and 16 GB of RAM.

5.1 Experiment Setup

For the execution time experiments each trial input vector \(\mathbf {f}\in {\mathbb {C}}^N\) was generated as follows: First, s frequencies were independently selected uniformly at random from \([0, N)\cap {\mathbb {Z}}\), and each of these frequencies was assigned a Fourier coefficient of unit magnitude and uniformly random phase. The remaining frequencies’ Fourier coefficients were then set to zero to form \({\hat{\mathbf {f}}}\in {\mathbb {C}}^N\). Finally, the trial input vector \(\mathbf {f}\) was formed via an inverse DFT.
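For concreteness, a minimal sketch of this trial-signal construction is given below in Python with NumPy (the actual experiments use C++ implementations; the function name, NumPy’s DFT normalization convention, and drawing the s frequencies without replacement so that \({\hat{\mathbf {f}}}\) is exactly s-sparse are all our assumptions):

```python
import numpy as np

def make_trial_vector(N, s, rng=None):
    """Build a trial vector f whose DFT is exactly s-sparse, with
    unit-magnitude Fourier coefficients at uniformly random phases."""
    if rng is None:
        rng = np.random.default_rng()
    f_hat = np.zeros(N, dtype=np.complex128)
    freqs = rng.choice(N, size=s, replace=False)    # s distinct frequencies in [0, N)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=s)  # uniformly random phases
    f_hat[freqs] = np.exp(1j * phases)              # magnitude-1 coefficients
    f = np.fft.ifft(f_hat)                          # trial input vector f
    return f, f_hat, freqs
```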

For each pair of s and N the parameters in each randomized algorithm were chosen so that the probability of correctly recovering all s energetic frequencies was at least 0.9 per trial input. Every data point in the figures below corresponds to an average over 100 runs on 100 different trial input vectors of this kind, as sketched below. It is worth mentioning that the parameter tuning process for DMSFT-4 and DMSFT-6 requires significantly less effort than it does for CLW-DSFT and sFFT 2.0, since the DMSFT variants have only two parameters (whose default values are generally near-optimal).
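The per-data-point averaging then amounts to a loop of the following form (a sketch only: `transform` stands in for any of the compared algorithms, and its `(f, s)` calling convention is our assumption rather than the actual C++ interface):

```python
import time

def average_runtime(transform, N, s, trials=100):
    """Average wall-clock runtime of `transform` over fresh trial vectors."""
    total = 0.0
    for _ in range(trials):
        f, _, _ = make_trial_vector(N, s)  # from the sketch above
        start = time.perf_counter()
        transform(f, s)                    # hypothetical (f, s) interface
        total += time.perf_counter() - start
    return total / trials
```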

5.2 Runtime as Input Vector Size Varies

In Fig. 1 we fixed the sparsity to \(s=50\) and ran numerical experiments on 8 different input vector lengths N: \(2^{16}\), \(2^{18}\), \(\ldots \), \(2^{30}\). We then plotted the running time (averaged over 100 runs) for DMSFT-4, DMSFT-6, CLW-DSFT, sFFT 2.0, and FFTW.

Fig. 1 Runtime comparison with sparsity s fixed at 50

As expected, the runtime slope of all the SFT algorithms (i.e., DMSFT-4, DMSFT-6, CLW-DSFT, and sFFT 2.0) is less than the slope of FFTW as N increases. Although FFTW is fastest for vectors of small size, it becomes the slowest algorithm when the vector size N is greater than \(2^{20}\). Among the randomized algorithms, sFFT 2.0 is the fastest when N is less than \(2^{22}\), but DMSFT-4, DMSFT-6, and CLW-DSFT all outperform sFFT 2.0 with respect to runtime once the input vector’s size is large enough. The CLW-DSFT implementation becomes faster than sFFT 2.0 when N is approximately \(2^{21}\), while DMSFT-4 and DMSFT-6 have better runtime performance than sFFT 2.0 when N is greater than \(2^{23}\).

Fig. 2 Runtime comparison with bandwidth N fixed at \(2^{26}\)

5.3 Runtime as Sparsity Varies

In Fig. 2 we fix the input vector length to \(N = 2^{26}\) and run the numerical experiments for 7 different values of the sparsity s: 50, 100, 200, 400, 1000, 2000, and 4000. As expected, FFTW’s runtime is constant as we increase the sparsity. The runtimes of DMSFT-4, CLW-DSFT, and sFFT 2.0 are all essentially linear in s. Here DMSFT-6 has been excluded for ease of viewing – its runtimes lie directly above those of DMSFT-4 when included in the plot. Looking at Fig. 2 we can see that CLW-DSFT’s runtime increases more rapidly with s than that of DMSFT-4 and sFFT 2.0: CLW-DSFT becomes the slowest of the three once the sparsity reaches roughly 1000. DMSFT-4 and sFFT 2.0 have approximately the same runtime slope as s increases, and both perform well when the sparsity is large. However, DMSFT-4 maintains consistently better runtime performance than sFFT 2.0 for all sparsity values, and is the only algorithm in the plot that is still faster than FFTW when the sparsity is 4000. Indeed, when the sparsity is 4000 the average runtime of DMSFT-4 is 2.68 s and the average runtime of DMSFT-6 is 2.9 s. Both remain faster than FFTW (3.47 s) and sFFT 2.0 (3.96 s) at this large sparsity (though only DMSFT-4 has been included in the plot above).

5.4 Robustness to Noise

In our final set of experiments we test the noise robustness of DMSFT-4, DMSFT-6, CLW-DSFT, sFFT 2.0, and FFTW at different levels of Gaussian noise. Here the size of each input vector is \(N=2^{22}\) and the sparsity is fixed at \(s = 50\). The test signals are generated as before, except that Gaussian noise is added to \(\mathbf {f}\) after it is constructed. More specifically, we first generate \(\mathbf {f}\) and then set \(\mathbf {f}= \mathbf {f}+ \mathbf {n}\), where each entry \(n_j\) of \(\mathbf {n}\) is an i.i.d. mean-zero complex Gaussian random variable. The noise vector \(\mathbf {n}\) is then rescaled to achieve each desired signal-to-noise ratio (SNR) considered in the experiments.Footnote 13
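Under one common convention for SNR in decibels (we assume \(\mathrm{SNR} = 20\log _{10}\left( \Vert \mathbf {f}\Vert _2/\Vert \mathbf {n}\Vert _2\right) \) here; the paper’s precise definition is given in its footnote), the rescaling step can be sketched as follows:

```python
import numpy as np

def add_noise(f, snr_db, rng=None):
    """Add i.i.d. mean-zero complex Gaussian noise, rescaled so that
    20*log10(||f||_2 / ||n||_2) equals the requested SNR in dB."""
    if rng is None:
        rng = np.random.default_rng()
    n = rng.standard_normal(len(f)) + 1j * rng.standard_normal(len(f))
    n *= np.linalg.norm(f) / (np.linalg.norm(n) * 10.0 ** (snr_db / 20.0))
    return f + n
```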

Fig. 3 Robustness to noise (bandwidth \(N = 2^{22}\), sparsity \(s = 50\))

Recall that the randomized algorithms compared herein (DMSFT-4, DMSFT-6, CLW-DSFT, and sFFT 2.0) are all tuned to guarantee exact recovery of s-sparse functions with probability at least 0.9 in all experiments. For our noise robustness experiments this ensures that the correct frequency support, S, is found for at least 90 of the 100 trial signals used to generate each point plotted in Fig. 3. For each of these at least 90 successful trial runs we use the average \(L_1\) error to measure the noise robustness of each algorithm. The average \(L_1\) error is defined as

$$\begin{aligned} \textit{Average }L_1\textit{ Error} = \frac{1}{s}\sum _{\omega \in S} \big |\hat{f}_{\omega } - z_{\omega } \big | \end{aligned}$$

where S is the true frequency support of the input vector \(\mathbf {f}\), the \(\hat{f}_{\omega }\) are the true input Fourier coefficients for all frequencies \(\omega \in S\), and the \(z_{\omega }\) are their recovered approximations from each algorithm. Figure 3 plots this average \(L_1\) error, averaged again over the at least 90 trial signals for which each method correctly identified S.
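Computing this metric from a recovery result might look like the following sketch (the dictionary interface for the recovered coefficients is an illustrative assumption, not the actual C++ API):

```python
def average_l1_error(f_hat, support, estimates):
    """Average L1 error over the true support S, per the definition above.

    f_hat     : length-N array of true Fourier coefficients
    support   : iterable of the s true energetic frequencies (the set S)
    estimates : dict mapping each recovered frequency omega to z_omega
    """
    # Missing frequencies are treated as estimated by zero; in the plotted
    # trials the full support is recovered, so every omega is present.
    return sum(abs(f_hat[w] - estimates.get(w, 0.0)) for w in support) / len(support)
```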

It can be seen in Fig. 3 that DMSFT-4, DMSFT-6, sFFT 2.0, and FFTW are all robust to noise. As expected, FFTW has the best performance in this test. DMSFT-4 and DMSFT-6 are both more robust to noise than sFFT 2.0. As for CLW-DSFT, it cannot guarantee a 0.9 probability of correctly recovering S when the SNR is less than 40, and so it is not plotted for those SNR values. This is due to the base energetic frequency identification methods of [6, 25] being inherently ill conditioned, though the CLW-DSFT results look better when compared to the true \({\hat{\mathbf {f}}}\) with respect to, e.g., earth mover’s distance. That is, frequencies are often estimated incorrectly by CLW-DSFT at higher noise levels, but when they are, the estimates are usually close enough to the true frequencies to remain informative.

6 Conclusion

Let \({\mathcal {A}}\) be a sublinear-time sparse FFT algorithm which utilizes unequally spaced samples from a given periodic function \(f: [-\pi , \pi ] \rightarrow {{\mathbb {C}}}\) in order to rapidly approximate its sequence of Fourier series coefficients \(\hat{f} \in \ell ^2\). In this paper we propose a generic method of transforming any such algorithm \({\mathcal {A}}\) into a sublinear-time sparse DFT algorithm which rapidly approximates \({\hat{\mathbf {f}}}\) from a given input vector \(\mathbf {f}\in {{\mathbb {C}}}^N\). As a result we are able to construct several new sublinear-time sparse DFT algorithms from existing sparse Fourier algorithms which utilize unequally spaced function samples [6, 21, 25, 29]. The best of these new algorithms is shown to outperform existing discrete sparse Fourier transform methods with respect to both runtime and noise robustness for large vector lengths N. In addition, we also present several new theoretical discrete sparse FFT robust recovery guarantees. These include the first known theoretical guarantees for entirely deterministic and discrete sparse DFT algorithms which hold for arbitrary input vectors \(\mathbf {f}\in {{\mathbb {C}}}^N\).