A New Class of Fully Discrete Sparse Fourier Transforms: Faster Stable Implementations with Guarantees

Merhi, Sami; Zhang, Ruochuan; Iwen, Mark A.; Christlieb, Andrew

doi:10.1007/s00041-018-9616-4

A New Class of Fully Discrete Sparse Fourier Transforms: Faster Stable Implementations with Guarantees

Published: 03 May 2018

Volume 25, pages 751–784, (2019)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Journal of Fourier Analysis and Applications Aims and scope Submit manuscript

A New Class of Fully Discrete Sparse Fourier Transforms: Faster Stable Implementations with Guarantees

Download PDF

Sami Merhi¹,
Ruochuan Zhang¹,
Mark A. Iwen^1,2 &
…
Andrew Christlieb²

391 Accesses
18 Citations
Explore all metrics

Abstract

In this paper we consider sparse Fourier transform (SFT) algorithms for approximately computing the best s-term approximation of the discrete Fourier transform (DFT) ${\hat{\mathbf {f}}}\in {{\mathbb {C}}}^N$ of any given input vector $\mathbf {f}\in {{\mathbb {C}}}^N$ in just $\left( s \log N\right) ^{{{\mathcal {O}}}(1)}$-time using only a similarly small number of entries of $\mathbf {f}$. In particular, we present a deterministic SFT algorithm which is guaranteed to always recover a near best s-term approximation of the DFT of any given input vector $\mathbf {f}\in {{\mathbb {C}}}^N$ in ${{\mathcal {O}}} \left( s^2 \log ^{\frac{11}{2}} (N) \right) $-time. Unlike previous deterministic results of this kind, our deterministic result holds for both arbitrary vectors $\mathbf {f}\in {{\mathbb {C}}}^N$ and vector lengths N. In addition to these deterministic SFT results, we also develop several new publicly available randomized SFT implementations for approximately computing ${\hat{\mathbf {f}}}$ from $\mathbf {f}$ using the same general techniques. The best of these new implementations is shown to outperform existing discrete sparse Fourier transform methods with respect to both runtime and noise robustness for large vector lengths N.

Deterministic sparse FFT for M-sparse vectors

Article 05 July 2017

A deterministic sparse FFT for functions with structured Fourier sparsity

Article 09 August 2018

High-dimensional sparse Fourier algorithms

Article 15 July 2020

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

Herein we are concerned with the rapid approximation of the discrete Fourier transform ${\hat{\mathbf {f}}}\in {\mathbb {C}}^N$ of a given vector $\mathbf {f}\in {\mathbb {C}}^N$ for large values of N. Though standard Fast Fourier Transform (FFT) algorithms [5, 8, 28] can accomplish this task in ${\mathcal {O}} \left( N \log N \right) $-time for arbitrary $N \in {\mathbb {N}}$, this runtime complexity may still be unnecessarily computationally taxing when N is extremely large. This is particularly true when the vector ${\hat{\mathbf {f}}}$ is approximately s-sparse (i.e., contains only $s \ll N$ nonzero entries) as in compressive sensing [11] and certain wideband signal processing applications (see, e.g. [24]). Such applications have therefore motivated the development of discrete sparse Fourier transform (DSFT) techniques [13, 14] which are capable of accurately approximating s-sparse DFT vectors ${\hat{\mathbf {f}}}\in {\mathbb {C}}^N$ in just $s \cdot \log ^{{\mathcal {O}}(1)} N$-time. When $s \ll N$ these methods are significantly faster than standard ${\mathcal {O}} \left( N \log N \right) $-time FFT methods, effectively achieving sublinear o(N) runtime complexities in such cases.

Currently, the most widely used $s \cdot \log ^{{\mathcal {O}}(1)} N$-time DSFT methods [12, 15, 22] are randomized algorithms which accurately compute ${\hat{\mathbf {f}}}$ with high probability when given sampling access to $\mathbf {f}$. Many existing sparse Fourier transforms which are entirely deterministic [6, 19, 21, 25, 29], on the other hand, are perhaps best described as unequally spaced sparse Fourier transform (USSFT) methods in that they approximately compute ${\hat{\mathbf {f}}}$, with its entries $\hat{f}_\omega $ indexed by the set $B := \left( -\left\lceil \frac{N}{2}\right\rceil ,\left\lfloor \frac{N}{2}\right\rfloor \right] \cap {\mathbb {Z}}$, by sampling its associated trigonometric polynomial

$$\begin{aligned} f\left( x\right) =\sum _{\omega \in B}\hat{f}_{\omega } \mathbb {e}^{\mathbb {i}\omega x} \end{aligned}$$

at a collection of $m \ll N$ specially constructed unequally spaced points $x_1, \dots , x_m \in [-\pi , \pi ]$. These methods have no probability of failing to recover s-sparse ${\hat{\mathbf {f}}}$, but can not accurately compute the DFT ${\hat{\mathbf {f}}}$ of an arbitrary given vector $\mathbf {f}\in {\mathbb {C}}^N$ due to their need for unequally spaced function evaluations from f of the form $\left\{ f(x_k) \right\} ^m_{k=1}$.^{Footnote 1}

This state of affairs has left a gap in the theory of DSFT methods. Existing deterministic sparse Fourier transform algorithms currently can efficiently compute the s-sparse DFT ${\hat{\mathbf {f}}}$ of a given vector $\mathbf {f}\in {\mathbb {C}}^N$ only if either (i) N is a power of a small prime [26], or else (ii) $\hat{f}_\omega = 0$ for all $\omega \in B$ with $|\omega | > N/4$ [19, 20]. In this paper this gap is filled by the development of a new entirely deterministic DSFT algorithm which is always guaranteed to accurately approximate any (nearly) s-sparse ${\hat{\mathbf {f}}}\in {\mathbb {C}}^N$ of any size N when given access only to $\mathbf {f}\in {\mathbb {C}}^N$. In addition, the method used to develop this new deterministic DSFT algorithm is general enough that it can be applied to any fast and noise robust USSFT method of the type mentioned above (be it deterministic, or randomized) in order to yield a new fast and robust DSFT algorithm. As a result, we are also able to use the fastest of the currently existing USSFT methods [4, 6, 17, 19, 21, 25, 29] in order to create new publicly available DSFT implementations herein which are both faster and more robust to noise than currently existing noise robust DSFT methods for large N.

More generally, we emphasize that the techniques utilized below free developers of SFT methods to develop more general USSFT methods which utilize samples from the trigonometric polynomial f above at any points $\left\{ x_k \right\} ^m_{k=1} \subset [-\pi ,\pi ]$ they like when attempting to create better DSFT algorithms in the future. Indeed, the techniques herein provide a relatively simple means of translating any future fast and robust USSFT algorithms into (still fast) DSFT algorithms.

1.1 Theoretical Results

Herein we focus on rapidly producing near best s-term approximations of ${\hat{\mathbf {f}}}$ of the type usually considered in compressive sensing [7]. Let ${\hat{\mathbf {f}}}_{s}^{\mathrm{opt}} \in {\mathbb {C}}^N$ denote an optimal s-term approximation to ${\hat{\mathbf {f}}}\in {\mathbb {C}}^N$. That is, let ${\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}$ preserve s of the largest magnitude entries of ${\hat{\mathbf {f}}}$ while setting the rest of its $N-s$ smallest magnitude entires to 0.^{Footnote 2} The following DSFT theorem is proven below.^{Footnote 3}

Theorem 1

Let $N\in {\mathbb {N}}$, $s\in [2,N]\cap {\mathbb {N}}$, $1\le r \le \frac{N}{36}$, and $\mathbf {f}\in {\mathbb {C}}^N$. There exists an algorithm that will always deterministically return an s-sparse vector $\mathbf {v} \in {\mathbb {C}}^{N}$ satisfying

$$\begin{aligned} \left\| {\hat{\mathbf {f}}}-\mathbf {v} \right\| _{2}\le \left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{2}+\frac{33}{\sqrt{s}}\cdot \left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{1}+198\sqrt{s}\left\| \mathbf {f}\right\| _{\infty }N^{-r} \end{aligned}$$

(1)

in just ${\mathcal {O}} \left( \frac{ s^2\cdot r^{\frac{3}{2}} \cdot \log ^{\frac{11}{2}} (N)}{\log (s)} \right) $-time when given access to $\mathbf {f}$. If returning an s-sparse vector $\mathbf {v}\in {\mathbb {C}}^{N}$ that satisfies (1) for each $\mathbf {f}$ with probability at least $(1-p) \in [2/3,1)$ is sufficient, a Monte Carlo algorithm also exists which will do so in just $ {\mathcal {O}} \left( s\cdot r^{\frac{3}{2}} \cdot \log ^\frac{9}{2}(N)\cdot \log \left( \frac{N}{p}\right) \right) $-time.

Note the quadratic-in-s runtime dependence of the deterministic algorithm mentioned by Theorem 1. It turns out that there is a close relationship between the sampling points $\left\{ x_k \right\} ^m_{k=1}$ used by the deterministic USSFT methods [21] employed as part of the proof of Theorem 1 and the construction of explicit (deterministic) RIP matrices (see [1, 18] for details). As a result, reducing the quadratic dependence on s of the $s^2 \log ^{{\mathcal {O}}(1)} N$-runtime complexity of the deterministic DSFT algorithms referred to by Theorem 1 while still satisfying the error guarantee (1) is likely at least as difficult as constructing explicit deterministic RIP matrices with fewer than $s^2 \log ^{{\mathcal {O}}(1)} N$ rows by subsampling rows from an $N \times N$ DFT matrix. Unfortunately, explicitly constructing RIP matrices of this type is known to be a very difficult problem [11]. This means that constructing an entirely deterministic DSFT algorithm which is both guaranteed to always satisfy (1), and which also always runs in $s \log ^{{\mathcal {O}}(1)} N$-time, is also likely to be extremely difficult to achieve at present.^{Footnote 4}

The remainder of this paper is organized as follows: In Sect. 2 we set up notation and establish necessary background results. Then, in Sect. 3, we describe our method for converting noise robust USSFT methods into DSFT methods. The resulting approach is summarized in Algorithm 1 therein. Next, Theorem 1 is proven in Sect. 4 using the intermediary results of Sects. 2 and 3. An empirical evaluation of several new DSFT algorithms resulting from our proposed approach is then performed in Sect. 5. The paper is finally concluded with a few additional comments in Sect. 6.

2 Notation and Setup

The Fourier series representation of a $2\pi \hbox {-periodic}$ function $f:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {C}}$ will be denoted by

$$\begin{aligned} f\left( x\right) =\sum _{\omega \in {\mathbb {Z}}}\widehat{f}_{\omega }\mathbb {e}^{\mathbb {i}\omega x} \end{aligned}$$

with its Fourier coefficients, $\widehat{f}_{\omega }$, given by

$$\begin{aligned} \widehat{f}_{\omega }=\frac{1}{2\pi }\int _{-\pi }^{\pi }f\left( x\right) \mathbb {e}^{-\mathbb {i}\omega x}~dx. \end{aligned}$$

We let $\widehat{f}:=\left\{ \widehat{f}_{\omega }\right\} _{\omega \in {\mathbb {Z}}}$ represent the infinite sequence of all Fourier coefficients of f below. Given two $2\pi $-periodic functions f and g we define the convolution of f and g at $x \in {\mathbb {R}}$ to be

$$\begin{aligned} \left( f*g\right) \left( x\right) ~=~\left( g*f\right) \left( x\right) ~:=~ \frac{1}{2\pi }\int _{-\pi }^{\pi }g\left( x-y\right) f\left( y\right) dy. \end{aligned}$$

This definition, coupled with the definition of the Fourier transform, yields the well-known equality

$$\begin{aligned} \widehat{f*g}_{\omega }=\widehat{f}_{\omega }\widehat{g}_{\omega }\ \forall \omega \in {\mathbb {Z}}. \end{aligned}$$

We may also write $\widehat{f*g}=\widehat{f}\circ \widehat{g}$ where $\circ $ denotes the Hadamard product.

For any $N\in {\mathbb {N}}$, define the discrete Fourier transform (DFT) matrix $F\in {\mathbb {C}}^{N\times N}$ by

$$\begin{aligned} F_{\omega ,j} :=\frac{(-1)^\omega }{N} \mathbb {e}^{-\frac{2\pi \mathbb {i}\cdot \omega \cdot j}{N}}, \end{aligned}$$

and let $B:=\left( -\left\lceil \frac{N}{2}\right\rceil ,\left\lfloor \frac{N}{2}\right\rfloor \right] \cap {\mathbb {Z}}$ be a set of N integer frequencies centered at 0. Furthermore, let $\mathbf {f}\in {\mathbb {C}}^N$ denote the vector of equally spaced samples from f whose entries are given by

$$\begin{aligned} f_j := f\left( -\pi +\frac{2\pi j}{N}\right) \end{aligned}$$

for $j = 0, \dots , N-1$. One can now see that if

$$\begin{aligned} f\left( x\right) =\sum _{\omega \in B}\widehat{f}_{\omega } \mathbb {e}^{\mathbb {i}\omega x}, \end{aligned}$$

then

$$\begin{aligned} F \mathbf {f}=: {\hat{\mathbf {f}}} \end{aligned}$$

(2)

where ${\hat{\mathbf {f}}}\in {\mathbb {C}}^{N}$ denotes the subset of $\widehat{f}$ with indices in B, and in vector form.^{Footnote 5} More generally, bolded lower case letters will always represent vectors in ${\mathbb {C}}^{N}$ below.

As mentioned above, $\widehat{f}:=\left\{ \widehat{f}_{\omega }\right\} _{\omega \in {\mathbb {Z}}}$ is the infinite sequence of all Fourier coefficients of f. For any subset $S \subseteq {\mathbb {Z}}$ we let $\widehat{f}\vert _{S}\in {\mathbb {C}}^{{\mathbb {Z}}}$ be the sequence $\widehat{f}$ restricted to the subset S, so that $\widehat{f}\vert _{S}$ has terms $\left( \widehat{f}\vert _{S} \right) _\omega = \widehat{f}_{\omega }$ for all $\omega \in S$, and $\left( \widehat{f}\vert _{S} \right) _\omega = 0$ for all $\omega \in S^{c}:={\mathbb {Z}}\setminus S$. Note that ${\hat{\mathbf {f}}}$ above is exactly $\widehat{f}\vert _{B}$ excluding its zero terms for all $\omega \notin B$. Thus, given any subset $S\subseteq B$, we let ${\hat{\mathbf {f}}}\vert _{S}\in {\mathbb {C}}^{N}$ be the vector ${\hat{\mathbf {f}}}$ restricted to the set S in an analogous fashion. That is, for $S \subseteq B$ we will have $\left( {\hat{\mathbf {f}}}\vert _{S} \right) _\omega = {\hat{\mathbf {f}}}_\omega $ for all $\omega \in S$, and $\left( {\hat{\mathbf {f}}}\vert _{S} \right) _\omega = 0$ for all $\omega \in B\setminus S$.

Given the sequence $\widehat{f}\in {\mathbb {C}}^{{\mathbb {Z}}}$ and $s\le N$, we denote by $R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) $ a subset of B containing s of the most energetic frequencies of f; that is

$$\begin{aligned} R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) :=\left\{ \omega _{1},\dots ,\omega _{s}\right\} \subseteq B \subset {\mathbb {Z}} \end{aligned}$$

where the frequencies $\omega _j \in B$ are ordered such that

$$\begin{aligned} \left| \widehat{f}_{\omega _{1}}\right| \ge \left| \widehat{f}_{\omega _{2}}\right| \ge \cdots \ge \left| \widehat{f}_{\omega _{s}}\right| \ge \cdots \ge \left| \widehat{f}_{\omega _{N}}\right| . \end{aligned}$$

Here, if desired, one may break ties by also requiring, e.g., that $\omega _j < \omega _k$ for all $j < k$ with $\left| \widehat{f}_{\omega _{j}}\right| =\left| \widehat{f}_{\omega _{k}}\right| $. We will then define $f_{s}^{\mathrm{opt}}:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {C}}$ based on $R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) $ by

$$\begin{aligned} f_{s}^{\mathrm{opt}}\left( x\right) :=\sum _{\omega \in R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }\widehat{f}_{\omega }\mathbb {e}^{\mathbb {i}\omega x}. \end{aligned}$$

Any such $2 \pi $-periodic function $f_{s}^{\mathrm{opt}}$ will be referred to as an optimal s-term approximation to f. Similarly, we also define both $\widehat{f}_{s}^{\mathrm{opt}} \in {\mathbb {C}}^{{\mathbb {Z}}}$ and ${\hat{\mathbf {f}}}_{s}^{\mathrm{opt}} \in {\mathbb {C}}^{N}$ to be $\widehat{f}\vert _{R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }$ and ${\hat{\mathbf {f}}}\vert _{R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }$, respectively.

2.1 Periodized Gaussians

In the sections that follow the $2\pi \hbox {-periodic}$ Gaussian $g:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {R}}^{+}$ defined by

$$\begin{aligned} g\left( x\right) =\frac{1}{c_1}\sum _{n=-\infty }^{\infty }\mathbb {e}^{-\frac{\left( x-2n\pi \right) ^{2}}{2c_1^{2}}} \end{aligned}$$

(3)

with $c_1 \in {\mathbb {R}}^+$ will play a special role. The following lemmas recall several useful facts concerning both its decay, and its Fourier series coefficients.

Lemma 1

The $2\pi \hbox {-periodic}$ Gaussian $g:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {R}}^{+}$ has

$$\begin{aligned} g\left( x\right) \le \left( \frac{3}{c_1}+\frac{1}{\sqrt{2\pi }} \right) \mathbb {e}^{-\frac{x^{2}}{2c_{1}^{2}}} \end{aligned}$$

for all $x \in \left[ -\pi ,\pi \right] $.

Lemma 2

The $2\pi \hbox {-periodic}$ Gaussian $g:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {R}}^{+}$ has

$$\begin{aligned} \widehat{g}_{\omega } = \frac{1}{\sqrt{2\pi }}\mathbb {e}^{-\frac{c_1^2 \omega ^2}{2}} \end{aligned}$$

for all $\omega \in {\mathbb {Z}}$. Thus, $\widehat{g}=\left\{ \widehat{g}_{\omega }\right\} _{\omega \in {\mathbb {Z}}}\in \ell ^{2}$ decreases monotonically as $|\omega |$ increases, and also has $\Vert \widehat{g} \Vert _{\infty } = \frac{1}{\sqrt{2 \pi }}$.

Lemma 3

Choose any $\tau \in \left( 0, \frac{1}{\sqrt{2\pi }} \right) $, $\alpha \in \left[ 1, \frac{N}{\sqrt{\ln N}} \right] $, and $\beta \in \left( 0 , \alpha \sqrt{\frac{\ln \left( 1/\tau \sqrt{2\pi } \right) }{2}} ~\right] $. Let $c_1 = \frac{\beta \sqrt{\ln N}}{N}$ in the definition of the periodic Gaussian g from (3). Then $\widehat{g}_{\omega } \in \left[ \tau , \frac{1}{\sqrt{2\pi }} \right] $ for all $\omega \in {\mathbb {Z}}$ with $|\omega | \le \Bigl \lceil \frac{N}{\alpha \sqrt{\ln N}}\Bigr \rceil $.

The proofs of Lemmas 1, 2, and 3 are included in Appendix B for the sake of completeness. Intuitively, we will utilize the periodic function g from (3) as a bandpass filter below. Looking at Lemma 3 in this context we can see that its parameter $\tau $ will control the effect of $\widehat{g}$ on the frequency passband defined by its parameter $\alpha $. Deciding on the two parameters $\tau , \alpha $ then constrains $\beta $ which, in turn, fixes the periodic Gaussian g by determining its constant coefficient $c_1$. As we shall see, the parameter $\beta $ will also determine the speed and accuracy with which we can approximately sample (i.e., evaluate) the function $f *g$. For this reason it will become important to properly balance these parameters against one another in subsequent sections.

2.2 On the Robustness of the SFTs Proposed in [21]

The sparse Fourier transforms presented in [21] include both deterministic and randomized methods for approximately computing the Fourier series coefficients of a given $2 \pi $-periodic function f from its evaluations at m-points $\left\{ x_k \right\} ^m_{k=1} \subset [-\pi , \pi ]$. The following results describe how accurate these algorithms will be when they are only given approximate evaluations of f at these points instead. These results are necessary because we will want to execute the SFTs developed in [21] on convolutions of the form $f *g$ below, but will only be able to approximately compute their values at each of the required points $x_1, \dots , x_m \in [-\pi ,\pi ]$.

Lemma 4

Let $s, \epsilon ^{-1} \in {\mathbb {N}} \setminus \{ 1 \}$ with $(s/\epsilon ) \ge 2$, and $\mathbf {n}\in {\mathbb {C}}^m$ be an arbitrary noise vector. There exists a set of m points $\left\{ x_k \right\} ^m_{k=1} \subset [-\pi , \pi ]$ such that Algorithm 3 on page 72 of [21], when given access to the corrupted samples $\left\{ f(x_k) + n_k \right\} ^m_{k=1}$, will identify a subset $S \subseteq B$ which is guaranteed to contain all $\omega \in B$ with

$$\begin{aligned} \left| \widehat{f}_{\omega } \right| > 4 \left( \frac{\epsilon \cdot \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}^{\mathrm{opt}}_{(s/\epsilon )} \right\| _1}{s} + \left\| \widehat{f} - \widehat{f}\vert _{B} \right\| _1 + \Vert \mathbf {n}\Vert _\infty \right) . \end{aligned}$$

(4)

Furthermore, every $\omega \in S$ returned by Algorithm 3 will also have an associated Fourier series coefficient estimate $z_{\omega } \in {\mathbb {C}}$ which is guaranteed to have

$$\begin{aligned} \left| \widehat{f}_{\omega } - z_{\omega } \right| \le \sqrt{2} \left( \frac{\epsilon \cdot \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}^{\mathrm{opt}}_{(s/\epsilon )} \right\| _1}{s} + \left\| \widehat{f} - \widehat{f}\vert _{B} \right\| _1 + \Vert \mathbf {n}\Vert _\infty \right) . \end{aligned}$$

(5)

Both the number of required samples, m, and Algorithm 3’s operation count are

$$\begin{aligned} {\mathcal {O}} \left( \frac{s^2 \cdot \log ^4 (N)}{\log \left( \frac{s}{\epsilon } \right) \cdot \epsilon ^2} \right) . \end{aligned}$$

(6)

If succeeding with probability $(1-\delta ) \in [2/3,1)$ is sufficient, and $(s/\epsilon ) \ge 2$, the Monte Carlo variant of Algorithm 3 referred to by Corollary 4 on page 74 of [21] may be used. This Monte Carlo variant reads only a randomly chosen subset of the noisy samples utilized by the deterministic algorithm,

$$\begin{aligned} \left\{ f(\tilde{x}_k) + \tilde{n}_k \right\} ^{\tilde{m}}_{k=1} \subseteq \left\{ f(x_k) + n_k \right\} ^m_{k=1}, \end{aligned}$$

yet it still outputs a subset $S \subseteq B$ which is guaranteed to simultaneously satisfy both of the following properties with probability at least $1-\delta $:

(i)
S will contain all $\omega \in B$ satisfying (4), and
(ii)
all $\omega \in S$ will have an associated coefficient estimate $z_{\omega } \in {\mathbb {C}}$ satisfying (5).

Finally, both this Monte Carlo variant’s number of required samples, $\tilde{m}$, as well as its operation count will also always be

$$\begin{aligned} {\mathcal {O}} \left( \frac{s}{\epsilon } \cdot \log ^3 (N) \cdot \log \left( \frac{N}{\delta } \right) \right) . \end{aligned}$$

(7)

Using the preceding lemma one can easily prove the following noise robust variant of Theorem 7 (and Corollary 4) from §5 of [21]. The proofs of both results are outlined in Appendix C for the sake of completeness.

Theorem 2

Suppose $f: [-\pi ,\pi ] \rightarrow {\mathbb {C}}$ has $\widehat{f} \in \ell ^1 \cap \ell ^2$. Let $s, \epsilon ^{-1} \in {\mathbb {N}} \setminus \{ 1 \}$ with $(s/\epsilon ) \ge 2$, and $\mathbf {n}\in {\mathbb {C}}^m$ be an arbitrary noise vector. Then, there exists a set of m points $\left\{ x_k \right\} ^m_{k=1} \subset [-\pi , \pi ]$ together with a simple deterministic algorithm ${\mathcal {A}}: {\mathbb {C}}^m \rightarrow {\mathbb {C}}^{4s}$ such that ${\mathcal {A}} \left( \left\{ f(x_k) + n_k \right\} ^m_{k=1} \right) $ is always guaranteed to output (the nonzero coefficients of) a degree $\le N/2$ trigonometric polynomial $y_s: [-\pi , \pi ] \rightarrow {\mathbb {C}}$ satisfying

$$\begin{aligned} \left\| f - y_s \right\| _2 \le \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}_{s}^{\mathrm{opt}} \right\| _2 + \frac{22\epsilon \cdot \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}^{\mathrm{opt}}_{(s/\epsilon )} \right\| _1}{\sqrt{s}} + 22 \sqrt{s} \left( \left\| \widehat{f} - \widehat{f}\vert _{B} \right\| _1 + \Vert \mathbf {n}\Vert _\infty \right) . \end{aligned}$$

(8)

Both the number of required samples, m, and the algorithm’s operation count are always

$$\begin{aligned} {\mathcal {O}} \left( \frac{s^2 \cdot \log ^4 (N)}{\log \left( \frac{s}{\epsilon } \right) \cdot \epsilon ^2} \right) . \end{aligned}$$

(9)

If succeeding with probability $(1-\delta ) \in [2/3,1)$ is sufficient, and $(s/\epsilon ) \ge 2$, a Monte Carlo variant of the deterministic algorithm may be used. This Monte Carlo variant reads only a randomly chosen subset of the noisy samples utilized by the deterministic algorithm,

$$\begin{aligned} \left\{ f(\tilde{x}_k) + \tilde{n}_k \right\} ^{\tilde{m}}_{k=1} \subseteq \left\{ f(x_k) + n_k \right\} ^m_{k=1}, \end{aligned}$$

yet it still outputs (the nonzero coefficients of) a degree $\le N/2$ trigonometric polynomial, $y_s: [-\pi , \pi ] \rightarrow {\mathbb {C}}$, that satisfies (8) with probability at least $1-\delta $. Both its number of required samples, $\tilde{m}$, as well as its operation count will always be

$$\begin{aligned} {\mathcal {O}} \left( \frac{s}{\epsilon } \cdot \log ^3 (N) \cdot \log \left( \frac{N}{\delta } \right) \right) . \end{aligned}$$

(10)

We now have the necessary prerequisites in order to discuss our general strategy for constructing several new fully discrete SFTs.

3 Description of the Proposed Approach

In this section we assume that we have access to an SFT algorithm ${\mathcal {A}}$ which requires m function evaluations of a $2 \pi $-periodic function $f: [-\pi , \pi ] \rightarrow {\mathbb {C}}$ in order to produce an s-sparse approximation to $\widehat{f}$. For any non-adaptive SFT algorithm ${\mathcal {A}}$ the m points $\left\{ x_k \right\} ^m_{k=1} \subset [-\pi , \pi ]$ at which ${\mathcal {A}}$ needs to evaluate f can be determined before ${\mathcal {A}}$ is actually executed. As a result, the function evaluations $\left\{ f(x_k) \right\} ^m_{k=1}$ required by ${\mathcal {A}}$ can also be evaluated before ${\mathcal {A}}$ is ever run. Indeed, if the SFT algorithm ${\mathcal {A}}$ is nonadaptive, stable, and robust to noise it suffices to approximate the function evaluations $\left\{ f(x_k) \right\} ^m_{k=1}$ required by ${\mathcal {A}}$ before it is executed.^{Footnote 6} These simple ideas form the basis for the proposed computational approach outlined in Algorithm 1.

The objective of Algorithm 1 is to use a nonadaptive and noise robust SFT algorithm ${\mathcal {A}}$ which requires off-grid function evaluations in order to approximately compute the DFT of a given vector $\mathbf {f}\in {\mathbb {C}}^N$, ${\hat{\mathbf {f}}}= F \mathbf {f}$ . Note that computing ${\hat{\mathbf {f}}}$ is equivalent to computing the Fourier series coefficients of the degree N trigonometric interpolant of $\mathbf {f}$. Hereafter the $2 \pi $-periodic function $f: [-\pi , \pi ] \rightarrow {\mathbb {C}}$ under consideration will always be this degree N trigonometric interpolant of $\mathbf {f}$. Our objective then becomes to approximately compute $\widehat{f}$ using ${\mathcal {A}}$. Unfortunately, our given input vector $\mathbf {f}$ only contains equally spaced function evaluations of f, and so does not actually contain the function evaluations $\left\{ f(x_k) \right\} ^m_{k=1}$ required by ${\mathcal {A}}$. As a consequence, we are forced to try to interpolate these required function evaluations $\left\{ f(x_k) \right\} ^m_{k=1}$ from the available equally spaced function evaluations $\mathbf {f}$.

Directly interpolating the required function evaluations $\left\{ f(x_k) \right\} ^m_{k=1}$ from $\mathbf {f}$ for an arbitrary degree N trigonometric polynomial f using classical techniques appears to be either too inaccurate, or else too slow to work well in our setting.^{Footnote 7} As a result, Algorithm 1 follows the example of successful nonequispaced fast Fourier transform (NFFT) methods (see, e.g. [2, 9, 10, 23, 30]) and instead uses $\mathbf {f}$ to rapidly approximate samples from the convolution of the unknown trigonometric polynomial f with (several modulations of) a known filter function g. Thankfully, all of the evaluations $\left\{ (g*f)(x_k) \right\} ^m_{k=1}$ can be approximated very accurately using only the data in $\mathbf {f}$ in just ${\mathcal {O}}(m \log N)$-time when g is chosen carefully enough (see Sect. 3.1 below). The given SFT algorithm ${\mathcal {A}}$ is then used to approximate the Fourier coefficients of $g*f$ for each modulation of g using these approximate evaluations. Finally, ${\hat{\mathbf {f}}}$ is then approximated using the recovered sparse approximation for each $\widehat{g*f}$ combined with our a priori knowledge of $\widehat{g}$.

Next, in Sect. 3.1, explicit bounds will be developed which characterize the runtime required in order to accurately approximate arbitrary samples from $f*g$ using only a few entries from $\mathbf {f}$. The attentive reader may notice there that the main theorem in that section (Theorem 4) bares some resemblance to state of the art NFFT error bounds (see, e.g., Steidl’s Theorem 3.1 in [30]) in that it utilizes the properties of truncated convolutions with periodized Gaussians in order to obtain error bounds which decay exponentially with the number of truncated convolution terms utilized per function evaluation. It is important to note, however, that the SFT methods considered herein have several crucial complicating constraints which require such NFFT techniques to be substantially overhauled before they may be fruitfully employed in our setting. Chief among these complications are that $\Omega (N)$-time NFFT methods for the evaluation of trigonometric polynomials at nonequispaced points along with their attending error analysis effectively assume that ${\hat{\mathbf {f}}}$ is already known (or, at least, that computing it in FFT-time is an acceptable computational cost). In the case of SFTs this is not true since our main objective is exactly to approximate ${\hat{\mathbf {f}}}$ much more quickly than an FFT can by only reading a tiny sublinear-in-N fraction of the entries in $\mathbf {f}$. As a result, unlike NFFT methods our analysis needs to focus on rapidly approximating values from $f *g$ instead of from f itself, and with a Gaussian g whose Fourier transform $\widehat{g}$ still allows the rapid and accurate application of SFT techniques in Sect. 4 as we continue therein to prove the main result of the paper (Theorem 1).

3.1 Rapidly and Accurately Evaluating $f*g$

In this section we will carefully consider the approximation of $\left( f*g\right) \left( x\right) $ by a severely truncated version of the semi-discrete convolution sum

$$\begin{aligned} \frac{1}{N}\sum ^{N-1}_{j=0}f\left( -\pi +\frac{2\pi j}{N}\right) g\left( x + \pi - \frac{2\pi j}{N} \right) \end{aligned}$$

(11)

for any given value of $x \in [-\pi , \pi ]$. Our goal is to determine exactly how many terms of this finite sum we actually need in order to obtain an accurate approximation of $f*g$ at an arbitrary x-value. More specifically, we aim to use as few terms from this sum as absolutely possible in order to ensure, e.g., an approximation error of size ${\mathcal {O}}(N^{-2})$.

Without loss of generality, let us assume that $N=2M+1$ is odd – this allows us to express B, the set of N Fourier modes about zero, as

$$\begin{aligned} B:=\left( -\left\lceil \frac{N}{2}\right\rceil ,\left\lfloor \frac{N}{2}\right\rfloor \right] \cap {\mathbb {Z}}=\left[ -M,M\right] \cap {\mathbb {Z}}. \end{aligned}$$

In the lemmas and theorems below the function $f:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {C}}$ will always denote a degree-N trigonometric polynomial of the form

$$\begin{aligned} f\left( x\right) =\sum _{\omega \in B}\widehat{f}_{\omega }\mathbb {e}^{\mathbb {i}\omega x}. \end{aligned}$$

Furthermore, g will always denote the periodic Gaussian as defined above in (3). Finally, we will also make use of the Dirichlet kernel $D_{M}:{\mathbb {R}}\rightarrow {\mathbb {C}}$, defined by

$$\begin{aligned} D_{M}\left( y\right) =\frac{1}{2\pi }\sum _{n=-M}^{M}\mathbb {e}^{\mathbb {i}ny}=\frac{1}{2\pi }\sum _{n\in B}\mathbb {e}^{\mathbb {i}ny}. \end{aligned}$$

The relationship between trigonometric polynomials such as f and the Dirichlet kernel $D_{M}$ is the subject of the following lemma.

Lemma 5

Let $h: [-\pi , \pi ] \rightarrow {\mathbb {C}}$ have $\widehat{h}_{\omega } = 0$ for all $\omega \notin B$, and define the set of points $\left\{ y_{j}\right\} _{j=0}^{2M}=\left\{ -\pi +\frac{2\pi j}{N} \right\} _{j=0}^{2M}$. Then,

$$\begin{aligned} 2\pi \left( h*D_{M}\right) \left( x\right) ~=~h\left( x\right) ~=~\frac{2\pi }{N}\sum _{j=0}^{2M}h\left( y_{j}\right) D_{M}\left( x - y_{j} \right) \end{aligned}$$

holds for all $x\in \left[ -\pi ,\pi \right] $.

Proof

By the definition of $D_{M}$, we trivially have $2\pi \left( \widehat{D_{M}} \right) _{\omega }=\chi _{B}\left( \omega \right) $$\forall \omega \in {\mathbb {Z}}$. Thus,

$$\begin{aligned} \widehat{h}=2\pi \cdot \widehat{h}\circ \widehat{D_{M}}=2\pi \cdot \widehat{h*D_{M}} \end{aligned}$$

where, as before, $\circ $ denotes the Hadamard product, and $*$ denotes convolution. This yields $h\left( x\right) =2\pi \left( h*D_{M}\right) \left( x\right) $ and so establishes the first equality above. To establish the second equality above, recall from (2) that for any $\omega \in B$ we will have

$$\begin{aligned} \widehat{h}_{\omega }=\frac{(-1)^\omega }{N}\sum _{j=0}^{2M}h\left( -\pi +\frac{2\pi j}{N}\right) \mathbb {e}^{\frac{-2\pi \mathbb {i}j\omega }{N}}=\frac{1}{N}\sum _{j=0}^{2M}h\left( y_{j}\right) \mathbb {e}^{-\mathbb {i}\omega y_{j}}, \end{aligned}$$

since h is a trigonometric polynomial. Thus, given $x\in \left[ -\pi ,\pi \right] $ one has

$$\begin{aligned} h\left( x\right)&=\sum _{\omega \in B}\widehat{h}_{\omega }\mathbb {e}^{\mathbb {i}\omega x} =\frac{1}{N}\sum _{j=0}^{2M}\left( h\left( y_{j}\right) \sum _{\omega \in B}\mathbb {e}^{\mathbb {i}\omega \left( x-y_{j} \right) }\right) =\frac{2\pi }{N}\sum _{j=0}^{2M}h\left( y_{j}\right) D_{M}\left( x-y_{j} \right) . \end{aligned}$$

We now have the desired result. $\square $

We can now write a formula for $g*f$ which only depends on N evaluations of f in $[-\pi , \pi ]$.

Lemma 6

Given the set of equally spaced points $\left\{ y_{j}\right\} ^{2M}_{j = 0}=\left\{ -\pi +\frac{2\pi j}{N} \right\} _{j=0}^{2M}$ one has that

$$\begin{aligned} \left( g*f\right) \left( x\right) =\frac{1}{N}\sum _{j=0}^{2M}f\left( y_{j}\right) \int _{-\pi }^{\pi }g\left( x-u-y_{j}\right) D_{M}\left( u\right) du \end{aligned}$$

for all $x\in \left[ -\pi ,\pi \right] $.

Proof

By Lemma 5, we have

$$\begin{aligned} \left( g*f\right) \left( x\right)&=\frac{1}{2\pi }\int _{-\pi }^{\pi }g\left( x-y\right) f\left( y\right) dy =\frac{1}{N}\int _{-\pi }^{\pi }g\left( x-y\right) \sum ^{2M}_{j = 0}f\left( y_{j}\right) D_{M}\left( y-y_{j}\right) dy\\&=\frac{1}{N}\sum ^{2M}_{j = 0}f\left( y_{j}\right) \int _{-\pi }^{\pi }g\left( x-u-y_{j}\right) D_{M}\left( u\right) du. \end{aligned}$$

The last equality holds after a change of variables since g and $D_{M}$ are both $2\pi \hbox {-periodic}$. $\square $

The next two lemmas will help us bound the error produced by discretizing the integral weights present in the finite sum provided by Lemma 6 above. More specifically, they will ultimately allow us to approximate the sum in Lemma 6 by the sum in (11).

Lemma 7

Let $x\in \left[ -\pi ,\pi \right] $ and $y_{j} = -\pi +\frac{2\pi j}{N}$ for some $j = 0, \dots , 2M$. Then,

$$\begin{aligned} \int _{-\pi }^{\pi }g\left( x-u-y_{j}\right) D_{M}\left( u\right) du=\sum _{n\in B}\widehat{g}_{n}\mathbb {e}^{\mathbb {i}n\left( x - y_{j}\right) }. \end{aligned}$$

Proof

Recalling that $2\pi \left( \widehat{D_{M}} \right) _{\omega }=\chi _{B}\left( \omega \right) $ for all $\omega \in {\mathbb {Z}}$ we have that

$$\begin{aligned} \int _{-\pi }^{\pi }g\left( x-u-y_{j}\right) D_{M}\left( u\right) du&= 2 \pi \left( D_{M} *g \right) \left( x - y_{j} \right) \\&= \sum _{n\in {\mathbb {Z}}} \widehat{g}_{n} \chi _{B}\left( n\right) \mathbb {e}^{\mathbb {i}n\left( x - y_{j}\right) } = \sum _{n\in B}\widehat{g}_{n}\mathbb {e}^{\mathbb {i}n\left( x - y_{j}\right) }. \end{aligned}$$

$\square $

Lemma 8

Denote $I\left( a\right) :=\int _{-a}^{a}\mathbb {e}^{-x^{2}}dx$ for $a>0$; then

$$\begin{aligned} \pi \left( 1-\mathbb {e}^{-a^{2}}\right)<I^{2}\left( a\right) <\pi \left( 1-\mathbb {e}^{-2a^{2}}\right) . \end{aligned}$$

Proof

Let $a>0$ and observe that

$$\begin{aligned} I^{2}\left( a\right) =\int _{-a}^{a}\int _{-a}^{a}\mathbb {e}^{-x^{2}-y^{2}}dx dy >\iint _{\left\{ x^{2}+y^{2}\le a^{2}\right\} }\mathbb {e}^{-\left( x^{2}+y^{2}\right) }dxdy =\pi \left( 1-\mathbb {e}^{-a^{2}}\right) . \end{aligned}$$

The first equality holds by Fubini’s theorem, and the inequality follows simply by integrating a positive function over a disk of radius a as opposed to a square of sidelength 2a. A similar argument yields the upper bound. $\square $

We are now ready to bound the difference between the integral weights present in the finite sum provided by Lemma 6, and the $g\left( x - y_j \right) $-weights present in the sum (11).

Lemma 9

Choose any $\tau \in \left( 0,\frac{1}{\sqrt{2\pi }}\right) $, $\alpha \in \left[ 1,\frac{N}{\sqrt{\ln N}}\right] $, and $\beta \in \left( 0,\alpha \sqrt{\frac{\ln \left( 1/\tau \sqrt{2\pi }\right) }{2}}~\right] $. Let $c_{1}=\frac{\beta \sqrt{\ln N}}{N}$ in the definition of the periodic Gaussian g so that

$$\begin{aligned} g\left( x\right) =\frac{N}{\beta \sqrt{\ln N}}\sum _{n=-\infty }^{\infty }\mathbb {e}^{-\frac{\left( x-2n\pi \right) ^{2}N^{2}}{2\beta ^{2}\ln N}}. \end{aligned}$$

Then for all $x\in \left[ -\pi ,\pi \right] $ and $y_{j} = -\pi +\frac{2\pi j}{N}$,

$$\begin{aligned} \left| g\left( x - y_{j}\right) -\int _{-\pi }^{\pi }g\left( x- u - y_{j}\right) D_{M}\left( u\right) du\right| < \frac{N^{1-\frac{\beta ^{2}}{18}}}{\beta \sqrt{\ln N}}. \end{aligned}$$

Proof

Using Lemma 7 we calculate

$$\begin{aligned} \left| g\left( x \!-\! y_{j}\right) \!-\!\int _{-\pi }^{\pi }g\left( x\!-\!u\!-\!y_{j}\right) D_{M}\left( u\right) du\right|&=\left| g\left( x-y_{j}\right) -\sum _{n\in B}\widehat{g}_{n}\mathbb {e}^{\mathbb {i}n\left( x - y_{j}\right) }\right| \\&=\left| \sum _{n\in B^{c}}\widehat{g}_{n}\mathbb {e}^{\mathbb {i}n\left( x-y_{j}\right) }\right| \\&\le \frac{1}{\sqrt{2\pi }}\sum _{\left| n\right| >M}\mathbb {e}^{-\frac{c_{1}^{2}n^{2}}{2}}\quad (\hbox {Using Lemma}\,2)\\&\le \frac{2}{\sqrt{2\pi }}\int _{M}^{\infty }\mathbb {e}^{-\frac{c_{1}^{2}n^{2}}{2}}dn\\&=\sqrt{\frac{2}{\pi }}\int _{M}^{\infty }\mathbb {e}^{-\frac{\beta ^{2}n^{2}\ln N}{2N^{2}}}dn. \end{aligned}$$

Upon the change of variable $v=\frac{\beta n\sqrt{\ln N}}{\sqrt{2}N}$, we get that

$$\begin{aligned}&\left| g\left( x-y_{j}\right) -\int _{-\pi }^{\pi }g\left( x-u-y_{j}\right) D_{M}\left( u\right) du\right| \\&\qquad \qquad \le \sqrt{\frac{2}{\pi }}\frac{\sqrt{2}N}{\beta \sqrt{\ln N}}\int _{\frac{\beta M\sqrt{\ln N}}{\sqrt{2}N}}^{\infty }\mathbb {e}^{-v^{2}}dv\\&\qquad \qquad =\frac{2N}{\beta \sqrt{\pi \ln N}}\frac{1}{2}\left( \int _{-\infty }^{\infty }\mathbb {e}^{-v^{2}}dv-\int _{-\frac{\beta M\sqrt{\ln N}}{\sqrt{2}N}}^{\frac{\beta M\sqrt{\ln N}}{\sqrt{2}N}}\mathbb {e}^{-v^{2}}dv\right) \\&\qquad \qquad <\frac{N}{\beta \sqrt{\pi \ln N}}\left( \sqrt{\pi }-\sqrt{\pi \left( 1-\mathbb {e}^{-\frac{\beta ^{2}M^{2}\ln N}{2N^{2}}}\right) }\right) \\&\qquad \qquad =\frac{N}{\beta \sqrt{\ln N}}\left( 1-\sqrt{1-N^{-\frac{\beta ^{2}M^{2}}{2N^{2}}}}\right) \end{aligned}$$

where the last inequality follows from Lemma 8. Noting now that

$$\begin{aligned} y\in \left[ 0,1\right] \implies 1-\sqrt{1-y}\le y, \end{aligned}$$

and that $\frac{N}{M}=2+\frac{1}{M}\in \left( 2,3\right] $ for all $M \in {\mathbb {Z}}^+$, we can further see that

$$\begin{aligned} \frac{N}{\beta \sqrt{\ln N}}\left( 1-\sqrt{1-N^{-\frac{\beta ^{2}M^{2}}{2N^{2}}}}\right) \le \frac{N}{\beta \sqrt{\ln N}}N^{-\frac{\beta ^{2}M^{2}}{2N^{2}}} \le \frac{N^{1-\frac{\beta ^{2}}{18}}}{\beta \sqrt{\ln N}} \end{aligned}$$

also always holds. $\square $

With the lemmas above we can now prove that (11) can be used to approximate $\left( g*f\right) \left( x\right) $ for all $x \in [-\pi , \pi ]$ with controllable error.

Theorem 3

Let $p \ge 1$. Using the same values of the parameters from Lemma 9 above, one has

$$\begin{aligned} \left| \left( g*f\right) \left( x\right) -\frac{1}{N}\sum ^{2M}_{j = 0}f\left( y_{j}\right) g\left( x - y_{j}\right) \right| \le \frac{\left\| \mathbf {f}\right\| _{p}}{\beta \sqrt{\ln N}}N^{1-\frac{\beta ^{2}}{18} - \frac{1}{p}} \end{aligned}$$

for all $x\in \left[ -\pi ,\pi \right] $.

Proof

Using Lemmas 6 and 9 followed by Holder’s inequality, we have

$$\begin{aligned}&\left| \left( g*f\right) \left( x\right) -\frac{1}{N}\sum ^{2M}_{j = 0}f\left( y_{j}\right) g\left( x-y_{j}\right) \right| \\&\qquad \qquad =\left| \frac{1}{N}\sum ^{2M}_{j = 0}f\left( y_{j}\right) \left( g\left( x-y_{j}\right) -\int _{-\pi }^{\pi }g\left( x-u-y_{j}\right) D_{M}\left( u\right) du\right) \right| \\&\qquad \qquad \le \frac{1}{N}\sum ^{2M}_{j = 0}\left| f\left( y_{j}\right) \right| \frac{N^{1-\frac{\beta ^{2}}{18}}}{\beta \sqrt{\ln N}} \le \frac{N^{\frac{-\beta ^{2}}{18}} }{\beta \sqrt{\ln N}}\left\| \mathbf {f}\right\| _{p} N^{1-\frac{1}{p}}. \end{aligned}$$

$\square $

To summarize, Theorem 3 tells us that $\left( g*f\right) \left( x\right) $ can be approximately computed in ${\mathcal {O}}\left( N\right) $-time for any $x\in \left[ -\pi ,\pi \right] $ using (11). This linear runtime cost may be reduced significantly, however, if one is willing to accept an additional trade-off between accuracy and the number of terms needed in the sum (11). This trade-off is characterized in the next lemma.

Lemma 10

Let $x\in \left[ -\pi ,\pi \right] $, $p \ge 1$, $\gamma \in {\mathbb {R}}^+$, and $\kappa := \lceil \gamma \ln N \rceil + 1$. Set $j' := \arg \min _j \left| x - y_j \right| $. Using the same values of the other parameters from Lemma 9 above, one has

$$\begin{aligned} \left| \frac{1}{N}\sum ^{2M}_{j = 0}f\left( y_{j}\right) g\left( x - y_{j}\right) - \frac{1}{N}\sum ^{j' + \kappa }_{j = j' - \kappa }f\left( y_{j}\right) g\left( x - y_{j}\right) \right| \le 2 \Vert \mathbf {f}\Vert _p ~N^{-\frac{2 \pi ^2 \gamma ^2}{\beta ^2}} \end{aligned}$$

for all $\beta \ge 4$ and $N \ge \beta ^2$.

Proof

Appealing to Lemma 1 and recalling that $c_{1}=\frac{\beta \sqrt{\ln N}}{N}$ we can see that

$$\begin{aligned} g\left( x\right) \le \left( \frac{3N}{\beta \sqrt{\ln N}}+\frac{1}{\sqrt{2\pi }}\right) \mathbb {e}^{-\frac{x^{2}N^{2}}{2\beta ^{2}\ln N}}. \end{aligned}$$

Using this fact we have that

$$\begin{aligned} g\left( x - y_{j' \pm k}\right)&\le \left( \frac{3N}{\beta \sqrt{\ln N}}+\frac{1}{\sqrt{2\pi }}\right) \mathbb {e}^{-\frac{\left( x - y_{j' \pm k} \right) ^{2}N^{2}}{2\beta ^{2}\ln N}}\\&\le \left( \frac{3N}{\beta \sqrt{\ln N}}+\frac{1}{\sqrt{2\pi }}\right) \mathbb {e}^{-\frac{\left( 2k-1\right) ^{2} \pi ^2}{2\beta ^{2}\ln N}} \end{aligned}$$

for all $k \in {\mathbb {Z}}_N$. As a result, one can now bound

$$\begin{aligned} \left| \frac{1}{N}\sum ^{2M}_{j = 0}f\left( y_{j}\right) g\left( x - y_{j}\right) - \frac{1}{N}\sum ^{j' + \kappa }_{j = j' - \kappa }f\left( y_{j}\right) g\left( x - y_{j}\right) \right| \end{aligned}$$

above by

$$\begin{aligned} \left( \frac{3}{\beta \sqrt{\ln N}}+\frac{1}{N \sqrt{2\pi }}\right) \sum ^{N - 2 \kappa - 1}_{k = \kappa +1} \left( \left| f\left( y_{j'-k}\right) \right| + \left| f\left( y_{j'+k}\right) \right| \right) \mathbb {e}^{-\frac{\left( 2k-1\right) ^{2} \pi ^2}{2\beta ^{2}\ln N}}, \end{aligned}$$

(12)

where the $y_j$-indexes are considered modulo N as appropriate.

Our goal is now to employ Holder’s inequality on (12). Toward that end, we will now bound the q-norm of the vector $\mathbf{h} := \left\{ \mathbb {e}^{-\frac{\left( \kappa + \ell - \frac{1}{2}\right) ^{2} 2 \pi ^2}{\beta ^{2}\ln N}} \right\} ^{N- 2 \kappa - 1}_{\ell = 1}$. Letting $a := q \left( \frac{4}{\beta ^{2}\ln N} \right) $ we have that

$$\begin{aligned} \Vert \mathbf{h} \Vert _q^{q}&= \sum ^{N- 2 \kappa - 1}_{\ell = 1} \mathbb {e}^{-\frac{\pi ^2}{2} \left( \kappa + \ell - \frac{1}{2}\right) ^{2} a} < \sum ^{\infty }_{\ell = \kappa } \mathbb {e}^{-\frac{\pi ^2}{2} \ell ^{2} a} \le \int ^\infty _{\kappa -1} \mathbb {e}^{-\frac{\pi ^2 x^{2}}{2} a}~dx\\&\le \sqrt{\frac{1}{2 \pi a}} - \frac{1}{ \pi \sqrt{2 a}} \int ^{\pi (\kappa - 1) \sqrt{\frac{a}{2}}}_{-\pi (\kappa - 1) \sqrt{\frac{a}{2}}} \mathbb {e}^{-u^2}~du \le \sqrt{\frac{1}{2 \pi a}} \mathbb {e}^{-\frac{a \pi ^2}{2}(\kappa - 1)^2} \\&\le \frac{\beta }{2}\sqrt{\frac{\ln N}{2 \pi q}} N^{-\frac{2 q \pi ^2 \gamma ^2}{\beta ^2}}, \end{aligned}$$

where we have used Lemma 8 once again. As a result we have that

$$\begin{aligned} \Vert \mathbf{h} \Vert _q \le \left( \frac{\beta ^2\ln N}{8 \pi } \right) ^{\frac{1}{2q}} q^{- \frac{1}{2q}} N^{-\frac{2 \pi ^2 \gamma ^2}{\beta ^2}} \le \left( \frac{\beta ^2\ln N}{8 \pi } \right) ^{\frac{1}{2q}} N^{-\frac{2 \pi ^2 \gamma ^2}{\beta ^2}} \end{aligned}$$

for all $q \ge 1$. Applying Holder’s inequality on (12) we can now see that (12) is bounded above by

$$\begin{aligned} 2 \left( \frac{3}{\beta \sqrt{\ln N}}+\frac{1}{N \sqrt{2\pi }}\right) \Vert \mathbf {f}\Vert _p \left( \frac{\beta ^2\ln N}{8 \pi } \right) ^{\frac{1}{2} - \frac{1}{2p}} N^{-\frac{2 \pi ^2 \gamma ^2}{\beta ^2}}. \end{aligned}$$

The result now follows. $\square $

We may now finally combine the truncation and estimation errors in Theorem 3 and Lemma 10 above in order to bound the total error one incurs by approximating $\left( g*f\right) (x)$ via a truncated portion of (11) for any given $x \in [-\pi , \pi ]$.

Theorem 4

Fix $x\in \left[ -\pi ,\pi \right] $, $p\ge 1$ (or $p = \infty $), $\frac{N}{36}\ge r \ge 1$, and $g: [-\pi , \pi ] \rightarrow {\mathbb {R}}^{+}$ to be the $2\pi $-periodic Gaussian (3) with $c_1 := \frac{6 \sqrt{\ln (N^r)}}{N}$. Set $j' := \arg \min _j \left| x - y_j \right| $ where $y_{j} = -\pi +\frac{2\pi j}{N}$ for all $j = 0, \dots , 2M$. Then,

$$\begin{aligned} \left| \left( g*f\right) (x) - \frac{1}{N}\sum ^{j' + \left\lceil \frac{6r}{\sqrt{2} \pi } \ln N \right\rceil + 1}_{j = j' - \left\lceil \frac{6r}{\sqrt{2} \pi } \ln N \right\rceil - 1}f\left( y_{j}\right) g\left( x - y_{j}\right) \right| \le 3 \frac{\Vert \mathbf {f}\Vert _p}{N^r}. \end{aligned}$$

As a consequence, we can see that $\left( g*f\right) (x)$ can always be computed to within ${\mathcal {O}} \left( \Vert \mathbf {f}\Vert _{\infty } N^{-r} \right) $-error in just ${\mathcal {O}}\left( r \log N \right) $-time for any given $\mathbf {f}\in {\mathbb {C}}^N$ once the $\big \{ g\left( x - y_{j}\right) \big \}^{j' + \left\lceil \frac{6r}{\sqrt{2} \pi } \ln N \right\rceil + 1}_{j = j' - \left\lceil \frac{6r}{\sqrt{2} \pi } \ln N \right\rceil - 1}$ have been precomputed.

Proof

Combining Theorem 3 and Lemma 10 we can see that

$$\begin{aligned}&\left| \left( g*f\right) (x) - \frac{1}{N}\sum ^{j' + \left\lceil \frac{6r}{\sqrt{2} \pi } \ln N \right\rceil + 1}_{j = j' - \left\lceil \frac{6r}{\sqrt{2} \pi } \ln N \right\rceil - 1}f\left( y_{j}\right) g\left( x - y_{j}\right) \right| \\&\qquad \qquad \le \Vert \mathbf {f}\Vert _p \left( \frac{1}{\beta \sqrt{\ln N}}N^{1-\frac{\beta ^{2}}{18} - \frac{1}{p}} + 2 ~N^{-\frac{2 \pi ^2 \gamma ^2}{\beta ^2}} \right) \end{aligned}$$

where $\beta = 6 \sqrt{r} \ge 6$, $N \ge 36 r = \beta ^2$, and $\gamma = \frac{6r}{\sqrt{2} \pi } = \frac{\beta \sqrt{r}}{\sqrt{2} \pi }$. $\square $

We are now prepared to bound the error of the proposed approach when utilizing the SFTs developed in [21].

4 An Error Guarantee for Algorithm 1 When Using the SFTs Proposed in [21]

Given the $2\pi \hbox {-periodic}$ Gaussian $g: [-\pi , \pi ] \rightarrow {\mathbb {R}}^{+}$ (3), consider the periodic modulation of g, $\tilde{g}_{q}:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {C}}$, for any $q\in {\mathbb {Z}}$ defined by

$$\begin{aligned} \tilde{g}_{q}\left( x\right) = \mathbb {e}^{-\mathbb {i}qx}g\left( x\right) . \end{aligned}$$

One can see that

$$\begin{aligned} \tilde{g}_{q}\left( x\right)&=\mathbb {e}^{-\mathbb {i}qx}\sum _{\omega =-\infty }^{\infty }\widehat{g}_{\omega } \mathbb {e}^{\mathbb {i}\omega x}=\sum _{\omega =-\infty }^{\infty }\widehat{g}_{\omega } \mathbb {e}^{ \mathbb {i}\left( \omega -q\right) x}=\sum _{\tilde{\omega }=-\infty }^{\infty }\widehat{g}_{\tilde{\omega }+q} \mathbb {e}^{ \mathbb {i}\tilde{\omega }x}, \end{aligned}$$

so that the Fourier series coefficients of $\tilde{g}_{q}$ are those of g, shifted by q; that is,

$$\begin{aligned} \left( \widehat{\tilde{g}_{q}}\right) _{\omega }=\widehat{g}_{\omega +q}. \end{aligned}$$

In line 9 of Algorithm 1, we provide the SFT Algorithm in [21] with the approximate evaluations of $\left\{ \left( \tilde{g}_{q}*f\right) \left( x_{k}\right) \right\} _{k=1}^{m},$ namely, $\left\{ \left( \tilde{g}_{q}*f\right) \left( x_{k}\right) +n_{k}\right\} _{k=1}^{m}$, where, by Theorem 4, the perturbations $n_{k}$ are bounded, for instance, by

$$\begin{aligned} \left| n_{k}\right| \le 3\frac{\left\| f\right\| _{\infty }}{N^{r}}\ \forall \ k=1,\dots ,m. \end{aligned}$$

With this in mind, let us apply Lemma 4 to the function $\tilde{g}_{q}*f$. We have the following lemma.

Lemma 11

Let $s\in [2,N]\cap {\mathbb {N}}$, and $\mathbf {n}\in {\mathbb {C}}^{m}$ be the vector containing the total errors incurred by approximating $\tilde{g}_{q}*f$ via a truncated version of (11), as per Theorem 4. There exists a set of m points $\left\{ x_{k}\right\} _{k=1}^{m}\subset \left[ -\pi ,\pi \right] $ such that Algorithm 3 on page 72 of [21], when given access to the corrupted samples $\left\{ \left( \tilde{g}_{q}*f\right) \left( x_{k}\right) +n_{k}\right\} _{k=1}^{m},$ will identify a subset $S\subseteq B$ which is guaranteed to contain all $\omega \in B$ with

$$\begin{aligned} \left| \left( \widehat{\tilde{g}_{q}*f}\right) _{\omega }\right| >4\left( \frac{1}{s}\cdot \left\| \widehat{\tilde{g}_{q}*f}-\left( \widehat{\tilde{g}_{q}*f}\right) _{s}^{\mathrm{opt}}\right\| _{1}+3\left\| \mathbf {f}\right\| _{\infty }N^{-r}\right) =:4\tilde{\delta }. \end{aligned}$$

Furthermore, every $\omega \in S$ returned by Algorithm 3 will also have an associated Fourier series coefficient estimate $z_{\omega }\in {\mathbb {C}}$ which is guaranteed to have

$$\begin{aligned} \left| \left( \widehat{\tilde{g}_{q}*f}\right) _{\omega }-z_{\omega }\right| \le \sqrt{2}\tilde{\delta }. \end{aligned}$$

Next, we need to guarantee that the estimates of $\widehat{\tilde{g}_{q}*f}$ returned by Algorithm 3 of [21] will yield good estimates of $\widehat{f}$ itself. We have the following.

Lemma 12

Let $s\in [2,N]\cap {\mathbb {N}}$. Given a $2\pi $-periodic function $f:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {C}}$, the periodic Gaussian g, and any of its modulations $\tilde{g}_{q}\left( x\right) =\mathbb {e}^{-\mathbb {i}qx}g\left( x\right) $, one has

$$\begin{aligned} \left\| \widehat{\tilde{g}_{q}*f}-\left( \widehat{\tilde{g}_{q}*f}\right) _{s}^{\mathrm{opt}}\right\| _{1}\le \frac{1}{2} \left\| \widehat{f}-\widehat{f}_{s}^{\mathrm{opt}}\right\| _{1}. \end{aligned}$$

Proof

Recall the definition of $R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) $ as the subset of B containing the s most energetic frequencies of $\widehat{f}$, and observe that

$$\begin{aligned} \frac{1}{2} \left\| \widehat{f}-\widehat{f}_{s}^{\mathrm{opt}}\right\| _{1}= \frac{1}{2} \sum _{\omega \in B\backslash R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }\left| \widehat{f}_{\omega }\right| \ge \sum _{\omega \in B\backslash R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }\left| \left( \widehat{\tilde{g}_{q}}\right) _{\omega }\cdot \widehat{f}_{\omega }\right| \end{aligned}$$

since, by Lemma 2, $\widehat{g}_{\omega }<\frac{1}{2}$ for all $\omega $, and consequently, $\left( \widehat{\tilde{g}_{q}}\right) _{\omega }=\widehat{g}_{\omega +q}<\frac{1}{2}$ for all $\omega $. Moreover,

$$\begin{aligned} \sum _{\omega \in B\backslash R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }\left| \left( \widehat{\tilde{g}_{q}}\right) _{\omega }\cdot \widehat{f}_{\omega }\right|&\ge \sum _{\omega \in B\backslash R_{s}^{\mathrm{opt}}\left( \widehat{\tilde{g}_{q}*f}\right) }\left| \left( \widehat{\tilde{g}_{q}}\right) _{\omega }\cdot \widehat{f}_{\omega }\right| ~=~\left\| \widehat{\tilde{g}_{q}*f}-\left( \widehat{\tilde{g}_{q}*f}\right) _{s}^{\mathrm{opt}}\right\| _{1}. \end{aligned}$$

Let us combine the guarantees above into the following lemma.

Lemma 13

Let $s\in [2,N]\cap {\mathbb {N}}$, and $\mathbf {n}\in {\mathbb {C}}^{m}$ be the vector containing the total errors incurred by approximating $\tilde{g}_{q}*f$ via a truncated version of (11), as per Theorem 4. There exists a set of m points $\left\{ x_{k}\right\} _{k=1}^{m}\subset \left[ -\pi ,\pi \right] $ such that Algorithm 3 on page 72 of [21], when given access to the corrupted samples $\left\{ \left( \tilde{g}_{q}*f\right) \left( x_{k}\right) +n_{k}\right\} _{k=1}^{m},$ will identify a subset $S\subseteq B$ which is guaranteed to contain all $\omega \in B$ with

$$\begin{aligned} \left| \left( \widehat{\tilde{g}_{q}*f}\right) _{\omega }\right| >4\left( \frac{1}{2s}\cdot \left\| \widehat{f}-\widehat{f}_{s}^{\mathrm{opt}}\right\| _{1}+3\left\| \mathbf {f}\right\| _{\infty }N^{-r}\right) =:4\delta . \end{aligned}$$

Furthermore, every $\omega \in S$ returned by Algorithm 3 will also have an associated Fourier series coefficient estimate $z_{\omega }\in {\mathbb {C}}$ which is guaranteed to have

$$\begin{aligned} \left| \left( \widehat{\tilde{g}_{q}}\right) _{\omega }\cdot \widehat{f}_{\omega }-z_{\omega }\right| \le \sqrt{2}\delta . \end{aligned}$$

The lemma above implies that for any choice of q in line 4 of Algorithm 1, we are guaranteed to find all $\omega \in \left[ q-\left\lceil \frac{N}{\alpha \sqrt{\ln N}}\right\rceil ,q+\left\lceil \frac{N}{\alpha \sqrt{\ln N}}\right\rceil \right) \cap B$ with

$$\begin{aligned} \left| \widehat{f}_{\omega }\right| >\max _{\tilde{\omega }}\frac{4\delta }{\left( \widehat{\tilde{g}_{q}}\right) _{\tilde{\omega }}}\ge \frac{4\delta }{\tau } \end{aligned}$$

where $\alpha $ and $\tau $ are as defined in Lemma 3. Moreover, the Fourier series coefficient estimates $z_{\omega }$ returned by Algorithm 3 will satisfy

$$\begin{aligned} \left| \widehat{f}_{\omega }-\frac{z_{\omega }}{\left( \widehat{\tilde{g}_{q}}\right) _{\omega }}\right| \le \max _{\tilde{\omega }}\frac{\sqrt{2}\delta }{\left( \widehat{\tilde{g}_{q}}\right) _{\tilde{\omega }}}\le \frac{\sqrt{2}\delta }{\tau }. \end{aligned}$$

Following Theorem 3, which guarantees a decay of $N^{-r}$ in the total approximation error, let us set $\beta =6\sqrt{r}$ for $1\le r\le \frac{N}{36}$. Recall from Lemma 3 the choice of $\beta $$\in \left( 0,\alpha \sqrt{\frac{\ln \left( 1/\tau \sqrt{2\pi }\right) }{2}}\right] $ where $\tau $ is to be chosen from $\left( 0,\frac{1}{\sqrt{2\pi }}\right) $. Thus, we must choose $\alpha \in \left[ 1,\frac{N}{\sqrt{\ln N}}\right] $ so that

$$\begin{aligned} 6\sqrt{r}\le \alpha \sqrt{\frac{\ln \left( 1/\tau \sqrt{2\pi }\right) }{2}}\iff \alpha \ge \frac{6\sqrt{2r}}{\ln \left( 1/\tau \sqrt{2\pi }\right) }. \end{aligned}$$

We may remove the dependence on $\tau $ simply by setting, e.g., $\tau =\frac{1}{3}$. Then $\alpha ={\mathcal {O}}\left( \sqrt{r}\right) $.

We are now ready to state the recovery guarantee of Algorithm 1 and its operation count.

Theorem 5

Let $N\in {\mathbb {N}}$, $s\in [2,N]\cap {\mathbb {N}}$, and $1\le r \le \frac{N}{36}$ as in Theorem 4. If Algorithm 3 of [21] is used in Algorithm 1 then Algorithm 1 will always deterministically identify a subset $S\subseteq B$ and a sparse vector $\mathbf {v}\vert _{S}\in {\mathbb {C}}^{N}$ satisfying

$$\begin{aligned} \left\| {\hat{\mathbf {f}}}-\mathbf {v}\vert _{S}\right\| _{2}\le \left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{2}+\frac{33}{\sqrt{s}}\cdot \left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{1}+198\sqrt{s}\left\| \mathbf {f}\right\| _{\infty }N^{-r}. \end{aligned}$$

(13)

Algorithm 1’s operation count is then

$$\begin{aligned} {\mathcal {O}} \left( \frac{ s^2\cdot r^{\frac{3}{2}} \cdot \log ^{\frac{11}{2}} (N)}{\log (s)} \right) . \end{aligned}$$

If returning a sparse vector $\mathbf {v}\vert _{S}\in {\mathbb {C}}^{N}$ that satisfies (13) with probability at least $(1-p) \in [2/3,1)$ is sufficient, a Monte Carlo variant of the deterministic Algorithm 3 in [21] may be used in line 9 of Algorithm 1. In this case Algorithm 1’s operation count is

$$\begin{aligned} {\mathcal {O}} \left( s\cdot r^{\frac{3}{2}} \cdot \log ^\frac{9}{2}(N)\cdot \log \left( \frac{N}{p}\right) \right) . \end{aligned}$$

Proof

Redefine $\delta $ in the proof of Theorem 7 in [21] as

$$\begin{aligned} \delta =\frac{1}{\tau }\left( \frac{1}{2s}\cdot \left\| \widehat{f}-\widehat{f}_{s}^{\mathrm{opt}}\right\| _{1}+3\left\| \mathbf {f}\right\| _{\infty }N^{-r}\right) = 3\left( \frac{1}{2s}\cdot \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{1}+3\left\| \mathbf {f}\right\| _{\infty }N^{-r}\right) , \end{aligned}$$

and observe that any $\omega \in B=\left( -\left\lceil \frac{N}{2}\right\rceil ,\left\lfloor \frac{N}{2}\right\rfloor \right] \cap {\mathbb {Z}}$ that is reconstructed by Algorithm 1 will have a Fourier series coefficient estimate $v_{\omega }$ that satisfies

$$\begin{aligned} \left| v_{\omega }-{\hat{\mathbf {f}}}_{\omega }\right| = \left| v_{\omega }-\widehat{f}_{\omega }\right| \le \sqrt{2}\cdot \delta . \end{aligned}$$

We can thus bound the approximation error by

$$\begin{aligned} \begin{aligned} \left\| {\hat{\mathbf {f}}}-\mathbf {v}\vert _{S}\right\| _{2}&\le \left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}\vert _{S}\right\| _{2}+\left\| {\hat{\mathbf {f}}}\vert _{S}-\mathbf {v}\vert _{S}\right\| _{2}\le \left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}\vert _{S}\right\| _{2}+2\sqrt{s}\cdot \delta \\&=\sqrt{\left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{2}^{2}+\sum _{\omega \in R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) \backslash S}\left| \widehat{f}_{\omega }\right| ^{2}-\sum _{\tilde{\omega }\in S\backslash R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }\left| \widehat{f}_{\tilde{\omega }}\right| ^{2}}+2\sqrt{s}\cdot \delta . \end{aligned} \end{aligned}$$

(14)

In order to make additional progress on (14) we must consider the possible magnitudes of $\mathbf {\widehat{f}}$ entries at indices in $S\backslash R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) $ and $R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) \backslash S$. Careful analysis (in line with the techniques employed in the proof of Theorem 7 of [21]) indicates that

$$\begin{aligned} \sum _{\omega \in R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) \backslash S}\left| \widehat{f}_{\omega }\right| ^{2}-\sum _{\tilde{\omega }\in S\backslash R_{s}^{\mathrm{opt}}\left( \widehat{f}\right) }\left| \widehat{f}_{\tilde{\omega }}\right| ^{2}\le s\cdot \left( 8\sqrt{2}+8\right) ^{2}\cdot \delta ^{2}. \end{aligned}$$

Therefore, in the worst possible case equation (14) will remain bounded by

$$\begin{aligned} \left\| {\hat{\mathbf {f}}}-\mathbf {v}\vert _{S}\right\| _{2}\le \sqrt{\left\| {\hat{\mathbf {f}}}-{\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{2}^{2}+s\cdot \left( 8\sqrt{2}+8\right) ^{2}\cdot \delta ^{2}}+2\sqrt{s}\cdot \delta \le \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}\right\| _{2}+22\sqrt{s}\cdot \delta . \end{aligned}$$

The error bound stated in (13) follows.

The runtimes follow by observing that $c_2 = {\mathcal {O}} \left( \alpha \cdot \log ^{\frac{1}{2}} (N)\right) = {\mathcal {O}}\left( r^{\frac{1}{2}}\cdot \log ^{\frac{1}{2}} (N) \right) $ as chosen in line 2 of Algorithm 1, and for every choise of q in line 4 of Algorithm 1, all of the evaluations $\left\{ (\tilde{g}_q*f)(x_k) \right\} ^m_{k=1}$ can be approximated very accurately in just ${\mathcal {O}}(m r \log N)$-time, where the number of samples m is on the orders described in Theorem 2. $\square $

We are now ready to empirically evaluate Algorithm 1 with several different SFT algorithms ${\mathcal {A}}$ used in its line 9.

5 Numerical Evaluation

In this section we evaluate the performance of three new discrete SFT Algorithms resulting from Algorithm 1: DMSFT-4, DMSFT-6,^{Footnote 8} and CLW-DSFT.^{Footnote 9} All of them were developed by utilizing different SFT algorithms in line 9 of Algorithm 1. Here DMSFT stands for the Discrete Michigan State Fourier Transform algorithm. Both DMSFT-4 and DMSFT-6 are implementations of Algorithm 1 that use a randomized version of the SFT algorithm GFFT [29] in their line 9.^{Footnote 10} The only difference between DMSFT-4 and DMSFT-6 is how accurately each one estimates the convolution in line 7 of Algorithm 1: for DMSFT-4 we use $\kappa = 4$ in the partial discrete convolution in Lemma 10 when approximating $\tilde{g}_q*f$ at each $x_k$, while for DMSFT-6 we always use $\kappa = 6$. The CLW-DSFT stands for the Christlieb Lawlor Wang Discrete Sparse Fourier Transform algorithm. It is an implementation of Algorithm 1 that uses the SFT developed in [6] in its line 9, and $\kappa $ varying between 12 and 20 for its line 7 convolution estimates (depending on each input vector’s Fourier sparsity, etc.). All of DMSFT-4, DMSFT-6 and CLW-DSFT were implemented in C++ in order to empirically evaluate their runtime and noise robustness characteristics.

We also compare these new implementations’ runtime and robustness characteristics with FFTW 3.3.4^{Footnote 11} and sFFT 2.0.^{Footnote 12} FFTW is a highly optimized FFT implementation which runs in ${\mathcal {O}}(N\log N)$-time for input vectors of length N. All the standard discrete Fourier transforms in the numerical experiments are performed using FFTW 3.3.4 with FFTW_MEASURE plan. The sFFT 2.0 is a randomized discrete sparse Fourier Transform algorithm written in C++ which is both stable and robust to noise. It was developed by Hassanieh et al. in [15]. Note that DMSFT-4, DMSFT-6, CLW-DSFT, and sFFT 2.0 are all randomized algorithms designed to approximate discrete DFTs that are approximately s-sparse. This means that all of them take both sparsity s and size N of the DFT’s ${\hat{\mathbf {f}}}\in {\mathbb {C}}^N$ they aim to recover as parameters. In contrast, FFTW cannot utilize existing sparsity to its advantage. Finally, all experiments are run on a Linux CentOS machine with 2.50 GHz CPU and 16 GB of RAM.

5.1 Experiment Setup

For the execution time experiments each trial input vector $\mathbf {f}\in {\mathbb {C}}^N$ was generated as follows: First s frequencies were independently selected uniformly at random from $[0, N)\cap {\mathbb {Z}}$, and then each of these frequencies was assigned a uniform random phase with magnitude 1 as its Fourier coefficient. The remaining frequencies’ Fourier coefficients were then set to zero to form ${\hat{\mathbf {f}}}\in {\mathbb {C}}^N$. Finally, the trial input vector $\mathbf {f}$ was then formed via an inverse DFT.

For each pair of s and N the parameters in each randomized algorithm were chosen so that the probability of correctly recovering all s energetic frequencies was at least 0.9 per trial input. Every data point in a figure below corresponds to an average over 100 runs on 100 different trial input vectors of this kind. It is worth mentioning that the parameter tuning process for DMSFT-4 and DMSFT-6 requires significantly less effort than for both CLW-DSFT and sFFT 2.0 since the DMSFT variants only have two parameters (whose default values are generally near-optimal).

5.2 Runtime as Input Vector Size Varies

In Fig. 1 we fixed the sparsity to $s=50$ and ran numerical experiments on 8 different input vector lengths N: $2^{16}$, $2^{18}$, $\ldots $, $2^{30}$. We then plotted the running time (averaged over 100 runs) for DMSFT-4, DMSFT-6, CLW-DSFT, sFFT 2.0, and FFTW.

As expected, the runtime slope of all the SFT algorithms (i.e. DMSFT-4, DMSFT-6, CLW-DSFT, and sFFT 2.0) is less than the slope of FFTW as N increases. Although FFTW is fastest for vectors of small size, it becomes the slowest algorithm when the vector size N is greater than $2^{20}$. Among the randomized algorithms, sFFT 2.0 is the fastest one when N is less than $2^{22}$, but DMSFT-4, DMSFT-6, and CLW-DSFT all outperform sFFT 2.0 with respect to runtime when the input vector’s sizes are large enough. The CLW-DSFT implementation becomes faster than sFFT 2.0 when N is approximately $2^{21}$ while DMSFT-4 and DMSFT-6 have better runtime performance than sFFT 2.0 when N is greater than $2^{23}$.

5.3 Runtime as Sparsity Varies

In Fig. 2 we fix the input vector lengths to $N = 2^{26}$ and run the numerical experiments on 7 different values of sparsity s: 50, 100, 200, 400, 1000, 2000, and 4000. As expected, the FFTW’s runtime is constant as we increase the sparsity. The runtimes of DMSFT-4, CLW-DSFT, and sFFT 2.0 are all essentially linear in s. Here DMSFT-6 has been excluded for ease of viewing/reference – its runtimes lie directly above those of DMSFT-4 when included in the plot. Looking at Fig. 2 we can see the CLW-DSFT’s runtime increases more rapidly with s than that of DMSFT-4 and sFFT 2.0. The runtime of CLW-DSFT becomes the slowest one when sparsity is around 1000. DMSFT-4 and sFFT 2.0 have approximately the same runtime slope as s increases, and they both have good performance when the sparsity is large. However, DMSFT-4 maintains consistently better runtime performance than sFFT 2.0 for all sparsity values, and is the only algorithm in the plot that still faster than FFTW when the sparsity is 4000. Indeed, when the sparsity is 4000 the average runtime of DMSFT-4 is 2.68 s and the average runtime of DMSFT-6 is 2.9 s. Both of them remain faster than FFTW (3.47 s) and sFFT 2.0 (3.96 s) for this large sparsity (though only DMSFT-4 has been included in the plot above).

5.4 Robustness to Noise

In our final set of experiments we test the noise robustness of DMSFT-4, DMSFT-6, CLW-DSFT, sFFT 2.0, and FFTW for different levels of Gaussian noise. Here the size of each input vector is $N=2^{22}$ and sparsity is fixed at $s = 50$. The test signals are then generated as before, except that Gaussian noise is added to $\mathbf {f}$ after it is constructed. More specifically, we first generate $\mathbf {f}$ and then set $\mathbf {f}= \mathbf {f}+ \mathbf {n}$ where each entry of $\mathbf {n}$, $n_j$, is an i.i.d. mean 0 random complex Gaussian value. The noise vector $\mathbf {n}$ is then rescaled to achieve each desired signal-to-noise ratio (SNR) considered in the experiments.^{Footnote 13}

Recall that the the randomized algorithms compared herein (DMSFT-4, DMSFT-6, CLW-DSFT, and sFFT 2.0) are all tuned to guarantee exact recovery of s-sparse functions with probability at least 0.9 in all experiments. For our noise robustness experiments this ensures that the correct frequency support, S, is found for at least 90 of the 100 trial signals used to generate each point plotted in Fig. 3. We use average $L_1$ error to measure the noise robustness of each algorithm for each of these at least 90 trial runs. The average $L_1$ error is defined as

$$\begin{aligned} \textit{Average }L_1\textit{ Error} = \frac{1}{s}\sum _{\omega \in S} \big |\hat{f}_{\omega } - z_{\omega } \big | \end{aligned}$$

where S is the true frequency support of the input vector $\mathbf {f}$, $\hat{f}_{\omega }$ are the true input Fourier coefficients for all frequencies $\omega \in S$, and $z_{\omega }$ are their recovered approximations from each algorithm. Figure 3 graphs the averaged average $L_1$ error over the at least 90 trial signals where each method correctly identified S.

It can be seen in Fig. 3 that DMSFT-4, DMSFT-6, sFFT 2.0, and FFTW are all robust to noise. As expected, FFTW has the best performance in this test. DMSFT-4 and DMSFT-6 are both more robust to noise when compared to sFFT 2.0. As for CLW-DSFT, it cannot guarantee a 0.9 probability of correctly recovering S when the SNR is less than 40 and so is not plotted for those SNR values. This is due to the base energetic frequency identification methods of [6, 25] being inherently ill conditioned, though the CLW-DSFT results look better when compared to the true ${\hat{\mathbf {f}}}$ with respect to, e.g., earth mover’s distance. Frequencies are often estimated incorrectly by CLW-DSFT at higher noise levels, but when they are they are usually at least close enough to the true frequencies to be informative.

6 Conclusion

Let ${\mathcal {A}}$ be a sublinear-time sparse FFT algorithm which utilizes unequally spaced samples from a given periodic function $f: [-\pi , \pi ] \rightarrow {{\mathbb {C}}}$ in order to rapidly approximate its sequence of Fourier series coefficients $\hat{f} \in \ell ^2$. In this paper we propose a generic method of transforming any such algorithm ${\mathcal {A}}$ into a sublinear-time sparse DFT algorithm which rapidly approximates ${\hat{\mathbf {f}}}$ from a given input vector $\mathbf {f}\in {{\mathbb {C}}}^N$. As a result we are able to construct several new sublinear-time sparse DFT algorithms from existing sparse Fourier algorithms which utilize unequally spaced function samples [6, 21, 25, 29]. The best of these new algorithms is shown to outperform existing discrete sparse Fourier transform methods with respect to both runtime and noise robustness for large vector lengths N. In addition, we also present several new theoretical discrete sparse FFT robust recovery guarantees. These include the first known theoretical guarantees for entirely deterministic and discrete sparse DFT algorithms which hold for arbitrary input vectors $\mathbf {f}\in {{\mathbb {C}}}^N$.

Notes

Note that methods which compute the DFT ${\hat{\mathbf {f}}}$ of a given vector $\mathbf {f}$ implicitly assume that $\mathbf {f}$ contains equally spaced samples from the trigonometric polynomial f above.
Note that ${\hat{\mathbf {f}}}_{s}^{\mathrm{opt}}$ may not be unique as there can be ties for the sth largest entry in magnitude of $\mathbf {f}$. This trivial ambiguity turns out not to matter.
Theorem 1 is a slightly simplified version of Theorem 5 proven in Sect. 4.
Of course deterministic algorithms with error guarantees of the type of (1) do exist for more restricted classes of periodic functions f. See, e.g. [3, 4, 27] for some examples. These include USSFT methods developed for periodic functions with structured Fourier support [3] which are of use for, among other things, the fast approximation of functions which exhibit sparsity with respect to other bounded orthonormal basis functions [16].
The interested reader may refer to Appendix A for the proof of (2).
We hasten to point out, moreover, that similar ideas can also be employed for adaptive and noise robust SFT algorithms in order to approximately evaluate f in an “on demand” fashion as well. We leave the details to the interested reader.
Each function evaluation $f(x_k)$ needs to be accurately computed in just ${\mathcal {O}}(\log ^c N)$-time in order to allow us to achieve our overall desired runtime for Algorithm 1.
The code for both DMSFT variants is available at https://sourceforge.net/projects/aafftannarborfa/.
The CLW-DSFT code is available at www.math.msu.edu/~markiwen/Code.html.
Code for GFFT is also available at www.math.msu.edu/~markiwen/Code.html.
This code is available at http://www.fftw.org/.
This code is available at https://groups.csail.mit.edu/netmit/sFFT/.
The SNR is defined as $SNR = 20\log \frac{\parallel \mathbf {f}\parallel _2}{\parallel \mathbf {n}\parallel _2}$, where $\mathbf {f}$ is the length N input vector and $\mathbf {n}$ is the length N noise vector.

References

Bailey, J., Iwen, M.A., Spencer, C.V.: On the design of deterministic matrices for fast recovery of Fourier compressible functions. SIAM J. Matrix Anal. Appl. 33(1), 263–289 (2012)
Article MathSciNet MATH Google Scholar
Beylkin, G.: On the fast Fourier transform of functions with singularities. Appl. Comput. Harmon. Anal. 2(4), 363–381 (1995)
Article MathSciNet MATH Google Scholar
Bittens, S.: Sparse FFT for Functions with Short Frequency Support. University of Göttingen, Göttingen (2016)
MATH Google Scholar
Bittens, S., Zhang, R., Iwen, M.A.: A deterministic sparse FFT for functions with structured Fourier sparsity. arXiv:1705.05256 (2017)
Bluestein, L.: A linear filtering approach to the computation of discrete Fourier transform. IEEE Trans. Audio Electroacoust. 18(4), 451–455 (1970)
Article Google Scholar
Christlieb, A., Lawlor, D., Wang, Y.: A multiscale sub-linear time Fourier algorithm for noisy data. Appl. Comput. Harmon. Anal. 40, 553–574 (2016)
Article MathSciNet MATH Google Scholar
Cohen, A., Dahmen, W., DeVore, R.: Compressed sensing and best k-term approximation. J. Am. Math. Soc. 22(1), 211–231 (2009)
Article MathSciNet MATH Google Scholar
Cooley, J.W., Tukey, J.W.: An algorithm for the machine calculation of complex Fourier series. Math. Comput. 19(90), 297–301 (1965)
Article MathSciNet MATH Google Scholar
Dutt, A., Rokhlin, V.: Fast Fourier transforms for nonequispaced data. SIAM J. Sci. Comput. 14(6), 1368–1393 (1993)
Article MathSciNet MATH Google Scholar
Dutt, A., Rokhlin, V.: Fast Fourier transforms for nonequispaced data. ii. Appl. Comput. Harmon. Anal. 2(1), 85–100 (1995)
Article MathSciNet MATH Google Scholar
Foucart, S., Rauhut, H.: A Mathematical Introduction to Compressive Sensing. Birkhäuser, Basel (2013)
Book MATH Google Scholar
Gilbert, A.C., Muthukrishnan, S., Strauss, M.: Improved time bounds for near-optimal sparse Fourier representations. In: Proceedings of the Optics & Photonics 2005, pp. 59141A–59141A. International Society for Optics and Photonics (2005)
Gilbert, A.C., Strauss, M.J., Tropp, J.A.: A tutorial on fast Fourier sampling. IEEE Signal Process. Mag. 25(2), 57–66 (2008)
Article Google Scholar
Gilbert, A.C., Indyk, P., Iwen, M., Schmidt, L.: Recent developments in the sparse Fourier transform: a compressed Fourier transform for big data. IEEE Signal Process. Mag. 31(5), 91–100 (2014)
Article Google Scholar
Hassanieh, H., Indyk, P., Katabi, D., Price, E.: Simple and practical algorithm for sparse Fourier transform. In: Proceedings of the SODA (2012)
Hu, X., Iwen, M., Kim, H.: Rapidly computing sparse Legendre expansions via sparse Fourier transforms. Numer. Algorithms 74(4), 1029–1059 (2017)
Article MathSciNet MATH Google Scholar
Iwen, M.A.: A deterministic sub-linear time sparse Fourier algorithm via non-adaptive compressed sensing methods. In: Proceedings of the Nineteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 20–29. Society for Industrial and Applied Mathematics (2008)
Iwen, M.A.: Simple deterministically constructible rip matrices with sublinear Fourier sampling requirements. In: Proceedings of the CISS, pp. 870–875 (2009)
Iwen, M.A.: Combinatorial sublinear-time Fourier algorithms. Found. Comput. Math. 10, 303–338 (2010)
Article MathSciNet MATH Google Scholar
Iwen, M.A.: Notes on lemma 6. Preprint at www.math.msu.edu/~markiwen/Papers/Lemma6_FOCM_10.pdf (2012)
Iwen, M.A.: Improved approximation guarantees for sublinear-time Fourier algorithms. Appl. Comput. Harmon. Anal. 34, 57–82 (2013)
Article MathSciNet MATH Google Scholar
Iwen, M., Gilbert, A., Strauss, M., et al.: Empirical evaluation of a sub-linear time sparse DFT algorithm. Commun. Math. Sci. 5(4), 981–998 (2007)
Article MathSciNet MATH Google Scholar
Keiner, J., Kunis, S., Potts, D.: Using NFFT 3–a software library for various nonequispaced fast Fourier transforms. ACM Trans. Math. Softw. 36(4), 19:1–19:30 (2009)
Article MathSciNet MATH Google Scholar
Laska, J., Kirolos, S., Massoud, Y., Baraniuk, R., Gilbert, A., Iwen, M., Strauss, M.: Random sampling for analog-to-information conversion of wideband signals. In: Proceedings of the 2006 IEEE Dallas/CAS Workshop on Design, Applications, Integration and Software, pp. 119–122. IEEE (2006)
Lawlor, D., Wang, Y., Christlieb, A.: Adaptive sub-linear time Fourier algorithms. Adv. Adapt. Data Anal. 5(01), 1350003 (2013)
Article MathSciNet Google Scholar
Morotti, L.: Explicit universal sampling sets in finite vector spaces. Appl. Comput. Harmonic Anal. (2016) https://doi.org/10.1016/j.acha.2016.06.001
Plonka, G., Wannenwetsch, K.: A deterministic sparse FFT algorithm for vectors with small support. Numer. Algorithms 71(4), 889–905 (2016)
Article MathSciNet MATH Google Scholar
Rabiner, L., Schafer, R., Rader, C.: The chirp z-transform algorithm. IEEE Trans. Audio Electroacoust. 17(2), 86–92 (1969)
Article Google Scholar
Segal, I., Iwen, M.: Improved sparse Fourier approximation results: Faster implementations and stronger guarantees. Numer. Algorithms 63, 239–263 (2013)
Article MathSciNet MATH Google Scholar
Steidl, G.: A note on fast Fourier transforms for nonequispaced grids. Adv. Comput. Math. 9, 337–353 (1998)
Article MathSciNet MATH Google Scholar

Download references

Acknowledgements

M.A. Iwen, R. Zhang, and S. Merhi were all supported in part by NSF DMS-1416752. The authors would like to thank Aditya Viswanathan for helpful comments and feedback on the first draft of the paper.

Author information

Authors and Affiliations

Department of Mathematics, Michigan State University, East Lansing, MI, 48824, USA
Sami Merhi, Ruochuan Zhang & Mark A. Iwen
Department of Computational Mathematics, Science, and Engineering (CMSE), Michigan State University, East Lansing, MI, 48824, USA
Mark A. Iwen & Andrew Christlieb

Authors

Sami Merhi
View author publications
You can also search for this author in PubMed Google Scholar
Ruochuan Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Mark A. Iwen
View author publications
You can also search for this author in PubMed Google Scholar
Andrew Christlieb
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Sami Merhi.

Additional information

Communicated by Hans G. Feichtinger.

Appendices

Appendix A: Fourier Basics: Continuous versus Discrete Fourier Transforms for Trigonometric Polynomials

Our objective in this appendix is to provide additional details regarding (2) and its relationship to the continuous sparse Fourier transform methods for periodic functions that we employ herein. Our starting point will be to assume only that we have been provided with a vector of data $\mathbf {f}\in {{\mathbb {C}}}^{N}$. Our goal is to rapidly approximate the matrix vector product ${\hat{\mathbf {f}}}= F \mathbf {f}$ where $F\in {{\mathbb {C}}}^{N\times N}$ is the DFT matrix whose entries are given by

$$\begin{aligned} F_{\omega ,j} :=\frac{(-1)^\omega }{N} \mathbb {e}^{-\frac{2\pi \mathbb {i}\cdot \omega \cdot j}{N}} \end{aligned}$$

for $\omega \in \left( -\left\lceil \frac{N}{2}\right\rceil ,\left\lfloor \frac{N}{2}\right\rfloor \right] \cap {\mathbb {Z}}$ and $j = 0, \dots , N-1$.

Beginning from this starting point, one may choose to regard the given vector of data $\mathbf {f}$ as having been generated by sampling a $2 \pi $-periodic trigonometric polynomial $f: [-\pi , \pi ] \rightarrow {{\mathbb {C}}}$ of the form

$$\begin{aligned} f\left( x\right) =\sum _{\omega \in B}\widehat{f}_{\omega } \mathbb {e}^{\mathbb {i}\omega x} \end{aligned}$$

where $B:=\left( -\left\lceil \frac{N}{2}\right\rceil ,\left\lfloor \frac{N}{2}\right\rfloor \right] \cap {\mathbb {Z}}$. In particular, herein we will assume that $\mathbf {f}$ has its jth entry generated by $f_j := f\left( -\pi +\frac{2\pi j}{N}\right) $ for $j = 0, \dots , N-1$. Note that there is exactly one such f for the given data $\mathbf {f}$ since $|B| = N$ (i.e., f is the unique interpolating polynomial for $\mathbf {f}$ with $\omega \in B$).

Considering f just above, we can now see that for any $k \in \mathbb {Z}$ the associated Fourier series coefficient of f is

$$\begin{aligned} \frac{1}{2\pi }\int _{-\pi }^{\pi }f\left( x\right) \mathbb {e}^{-\mathbb {i}k x}~dx&~=~ \frac{1}{2\pi }\int _{-\pi }^{\pi } \left( \sum _{\omega \in B} \widehat{f}_{\omega } \mathbb {e}^{\mathbb {i}\omega x} \right) \mathbb {e}^{-\mathbb {i}k x}~dx \\&~=~ \sum _{\omega \in B} \widehat{f}_{\omega } \left( \frac{1}{2\pi }\int _{-\pi }^{\pi } \mathbb {e}^{\mathbb {i}(\omega - k) x}~dx \right) \\&~=~ {\left\{ \begin{array}{ll} \widehat{f}_{k}, &{}\quad k \in B \\ 0, &{}\quad \text {otherwise} \end{array}\right. }. \end{aligned}$$

Changing our focus now to the discrete Fourier transform of $\mathbf {f}$ considered as being samples from f we can see that

Note that the line above establishes (2) where the vector ${\hat{\mathbf {f}}}\in {{\mathbb {C}}}^N$ exactly contains the nonzero Fourier series coefficients of f as its entries. As a result we can see that computing the Fourier series coefficients of f is equivalent to computing the matrix vector product ${\hat{\mathbf {f}}}= F \mathbf {f}$ for our given data $\mathbf {f}$.

Appendix B: Proof of Lemmas 1, 2 and 3

We will restate each lemma before its proof for ease of reference.

Lemma 14

(Restatement of Lemma 1) The $2\pi \hbox {-periodic}$ Gaussian $g:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {R}}^{+}$ has

$$\begin{aligned} g\left( x\right) \le \left( \frac{3}{c_1}+\frac{1}{\sqrt{2\pi }} \right) \mathbb {e}^{-\frac{x^{2}}{2c_{1}^{2}}} \end{aligned}$$

for all $x \in \left[ -\pi ,\pi \right] $.

Proof

Observe that

$$\begin{aligned} c_{1}g\left( x\right) =\sum _{n=-\infty }^{\infty }\mathbb {e}^{-\frac{\left( x-2n\pi \right) ^{2}}{2c_{1}^{2}}}&=\mathbb {e}^{-\frac{x^{2}}{2c_{1}^{2}}}+\mathbb {e}^{-\frac{\left( x-2\pi \right) ^{2}}{2c_{1}^{2}}}+\mathbb {e}^{-\frac{\left( x+2\pi \right) ^{2}}{2c_{1}^{2}}}+\sum _{\left| n\right| \ge 2}\mathbb {e}^{-\frac{\left( x-2n\pi \right) ^{2}}{2c_{1}^{2}}}\\&\le 3\mathbb {e}^{-\frac{x^{2}}{2c_{1}^{2}}}+\int _{1}^{\infty }\mathbb {e}^{-\frac{\left( x-2n\pi \right) ^{2}}{2c_{1}^{2}}}dn+\int _{1}^{\infty }\mathbb {e}^{-\frac{\left( x+2n\pi \right) ^{2}}{2c_{1}^{2}}}dn \end{aligned}$$

holds since the series above have monotonically decreasing positive terms, and $x\in \left[ -\pi ,\pi \right] $.

Now, if $x\in \left[ 0,\pi \right] $ and $n\ge 1$, one has

$$\begin{aligned} \mathbb {e}^{-\frac{\left( 2n+1\right) ^{2}\pi ^{2}}{2c_{1}^{2}}}\le \mathbb {e}^{-\frac{\left( x+2n\pi \right) ^{2}}{2c_{1}^{2}}}\le \mathbb {e}^{-\frac{4n^{2}\pi ^{2}}{2c_{1}^{2}}}\le \mathbb {e}^{-\frac{\left( x-2n\pi \right) ^{2}}{2c_{1}^{2}}}\le \mathbb {e}^{-\frac{\left( 2n-1\right) ^{2}\pi ^{2}}{2c_{1}^{2}}}, \end{aligned}$$

which yields

$$\begin{aligned} c_{1}g\left( x\right)&~\le ~3\mathbb {e}^{-\frac{x^{2}}{2c_{1}^{2}}}+2\int _{1}^{\infty }\mathbb {e}^{-\frac{\pi ^{2}\left( 2n-1\right) ^{2}}{2c_{1}^{2}}}dn \\&~=~ 3\mathbb {e}^{-\frac{x^{2}}{2c_{1}^{2}}}+\frac{1}{2}\left( \int _{-\infty }^{\infty }\mathbb {e}^{-\frac{\pi ^{2}m^{2}}{2c_{1}^{2}}}dm-\int _{-1}^{1}\mathbb {e}^{-\frac{\pi ^{2}m^{2}}{2c_{1}^{2}}}dm\right) \\&=3\mathbb {e}^{-\frac{x^{2}}{2c_{1}^{2}}}+\frac{c_{1}}{\sqrt{2\pi }}-\frac{1}{2}\int _{-1}^{1}\mathbb {e}^{-\frac{\pi ^{2}m^{2}}{2c_{1}^{2}}}dm. \end{aligned}$$

Using Lemma 8 to bound the last integral we can now get that

$$\begin{aligned} c_{1}g\left( x\right)&\le 3\mathbb {e}^{-\frac{x^{2}}{2c_{1}^{2}}}+\frac{c_{1}}{\sqrt{2\pi }}-\frac{1}{2}\frac{\sqrt{2}c_{1}}{\pi }\sqrt{\pi \left( 1-\mathbb {e}^{-\frac{\pi ^{2}}{2c_{1}^{2}}}\right) }\\&= 3\mathbb {e}^{-\frac{x^{2}}{2c_{1}^{2}}}+\frac{c_{1}}{\sqrt{2\pi }}\left( 1-\sqrt{\left( 1-\mathbb {e}^{-\frac{\pi ^{2}}{2c_{1}^{2}}}\right) }\right) \\&\le 3\mathbb {e}^{-\frac{x^{2}}{2c_{1}^{2}}}+\frac{c_{1}}{\sqrt{2\pi }}\mathbb {e}^{-\frac{\pi ^{2}}{2c_{1}^{2}}} \le 3\mathbb {e}^{-\frac{x^{2}}{2c_{1}^{2}}}+\frac{c_{1}}{\sqrt{2\pi }}\mathbb {e}^{-\frac{x^{2}}{2c_{1}^{2}}}. \end{aligned}$$

Recalling now that g is even we can see that this inequality will also hold for all $x \in [-\pi ,0]$ as well. $\square $

Lemma 15

(Restatement of Lemma 2) The $2\pi \hbox {-periodic}$ Gaussian $g:\left[ -\pi ,\pi \right] \rightarrow {\mathbb {R}}^{+}$ has

$$\begin{aligned} \widehat{g}_{\omega } = \frac{1}{\sqrt{2\pi }}\mathbb {e}^{-\frac{c_1^2 \omega ^2}{2}} \end{aligned}$$

for all $\omega \in {\mathbb {Z}}$. Thus, $\widehat{g}=\left\{ \widehat{g}_{\omega }\right\} _{\omega \in {\mathbb {Z}}}\in \ell ^{2}$ decreases monotonically as $|\omega |$ increases, and also has $\Vert \widehat{g} \Vert _{\infty } = \frac{1}{\sqrt{2 \pi }}$.

Proof

Starting with the definition of the Fourier transform, we calculate

$$\begin{aligned} \widehat{g}_{\omega }= & {} \frac{1}{c_1}\sum _{n=-\infty }^{\infty }\frac{1}{2\pi }\int _{-\pi }^{\pi }\mathbb {e}^{-\frac{(x-2n\pi )^{2}}{2c_1^{2}}}\mathbb {e}^{-\mathbb {i}\omega x}~dx\\= & {} \frac{1}{c_1}\sum _{n=-\infty }^{\infty }\frac{1}{2\pi }\int _{-\pi }^{\pi }\mathbb {e}^{-\frac{(x-2n\pi )^{2}}{2c_1^{2}}}\mathbb {e}^{-\mathbb {i}\omega (x-2n\pi )}~dx\\= & {} \frac{1}{c_1}\sum _{n=-\infty }^{\infty }\frac{1}{2\pi }\int _{-\pi -2n\pi }^{\pi -2n\pi }\mathbb {e}^{-\frac{u{}^{2}}{2c_1^{2}}}\mathbb {e}^{-\mathbb {i}\omega u}~du\\= & {} \frac{1}{2\pi c_1}\int _{-\infty }^{\infty }\mathbb {e}^{-\frac{u{}^{2}}{2c_1^{2}}} \mathbb {e}^{-\mathbb {i}\omega u}~du\\= & {} \frac{c_1\sqrt{2\pi }}{2\pi c_1} \mathbb {e}^{-\frac{c_1^{2}\omega ^{2}}{2}}\\= & {} \frac{\mathbb {e}^{-\frac{c_1^{2}\omega ^{2}}{2}}}{\sqrt{2\pi }}. \end{aligned}$$

The last two assertions now follow easily. $\square $

Lemma 16

(Restatement of Lemma 3) Choose any $\tau \in \left( 0, \frac{1}{\sqrt{2\pi }} \right) $, $\alpha \in \left[ 1, \frac{N}{\sqrt{\ln N}} \right] $, and $\beta \in \left( 0 , \alpha \sqrt{\frac{\ln \left( 1/\tau \sqrt{2\pi } \right) }{2}} ~\right] $. Let $c_1 = \frac{\beta \sqrt{\ln N}}{N}$ in the definition of the periodic Gaussian g from (3). Then $\widehat{g}_{\omega } \in \left[ \tau , \frac{1}{\sqrt{2\pi }} \right] $ for all $\omega \in {\mathbb {Z}}$ with $|\omega | \le \Bigl \lceil \frac{N}{\alpha \sqrt{\ln N}}\Bigr \rceil $.

Proof

By Lemma 2 above it suffices to show that

$$\begin{aligned} \frac{1}{\sqrt{2\pi }}\mathbb {e}^{-\frac{c_1^2 \left( \bigl \lceil \frac{N}{\alpha \sqrt{\ln N}}\bigr \rceil \right) ^2}{2}} \ge \tau , \end{aligned}$$

which holds if and only if

$$\begin{aligned} c_1^2 \left( \left\lceil \frac{N}{\alpha \sqrt{\ln N}}\right\rceil \right) ^2\le & {} 2 \ln \left( \frac{1}{\tau \sqrt{2\pi }}\right) \\ c_1\le & {} \frac{\sqrt{2 \ln \left( \frac{1}{\tau \sqrt{2\pi }}\right) }}{\left\lceil \frac{N}{\alpha \sqrt{\ln N}}\right\rceil }. \end{aligned}$$

Thus, it is enough to have

$$\begin{aligned} c_1 \le \frac{\sqrt{2 \ln \left( \frac{1}{\tau \sqrt{2\pi }}\right) }}{ \frac{N}{\alpha \sqrt{\ln N}} + 1} ~=~\frac{\alpha \sqrt{2 \ln \left( \frac{1}{\tau \sqrt{2\pi }}\right) \ln N}}{ N + \alpha \sqrt{\ln N}}, \end{aligned}$$

or,

$$\begin{aligned} c_1 = \frac{\beta \sqrt{\ln N}}{N} \le \frac{\alpha \sqrt{2 \ln \left( \frac{1}{\tau \sqrt{2\pi }}\right) \ln N}}{2N} \le \frac{\alpha \sqrt{2 \ln \left( \frac{1}{\tau \sqrt{2\pi }}\right) \ln N}}{ N + \alpha \sqrt{\ln N}}. \end{aligned}$$

This, in turn, is guaranteed by our choice of $\beta $. $\square $

Appendix C: Proof of Lemma 4 and Theorem 2

We will restate Lemma 4 before its proof for ease of reference.

Lemma 17

(Restatement of Lemma 4) Let $s, \epsilon ^{-1} \in {\mathbb {N}} \setminus \{ 1 \}$ with $(s/\epsilon ) \ge 2$, and $\mathbf {n}\in {\mathbb {C}}^m$ be an arbitrary noise vector. There exists a set of m points $\left\{ x_k \right\} ^m_{k=1} \subset [-\pi , \pi ]$ such that Algorithm 3 on page 72 of [21], when given access to the corrupted samples $\left\{ f(x_k) + n_k \right\} ^m_{k=1}$, will identify a subset $S \subseteq B$ which is guaranteed to contain all $\omega \in B$ with

$$\begin{aligned} \left| \widehat{f}_{\omega } \right| > 4 \left( \frac{\epsilon \cdot \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}^{\mathrm{opt}}_{(s/\epsilon )} \right\| _1}{s} + \left\| \widehat{f} - \widehat{f}\vert _{B} \right\| _1 + \Vert \mathbf {n}\Vert _\infty \right) . \end{aligned}$$

(15)

Furthermore, every $\omega \in S$ returned by Algorithm 3 will also have an associate Fourier series coefficient estimate $z_{\omega } \in {\mathbb {C}}$ which is guaranteed to have

$$\begin{aligned} \left| \widehat{f}_{\omega } - z_{\omega } \right| \le \sqrt{2} \left( \frac{\epsilon \cdot \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}^{\mathrm{opt}}_{(s/\epsilon )} \right\| _1}{s} + \left\| \widehat{f} - \widehat{f}\vert _{B} \right\| _1 + \Vert \mathbf {n}\Vert _\infty \right) . \end{aligned}$$

(16)

Both the number of required samples, m, and Algorithm 3’s operation count are

$$\begin{aligned} {\mathcal {O}} \left( \frac{s^2 \cdot \log ^4 (N)}{\log \left( \frac{s}{\epsilon } \right) \cdot \epsilon ^2} \right) . \end{aligned}$$

(17)

If succeeding with probability $(1-\delta ) \in [2/3,1)$ is sufficient, and $(s/\epsilon ) \ge 2$, the Monte Carlo variant of Algorithm 3 referred to by Corollary 4 on page 74 of [21] may be used. This Monte Carlo variant reads only a randomly chosen subset of the noisy samples utilized by the deterministic algorithm,

$$\begin{aligned} \left\{ f(\tilde{x}_k) + \tilde{n}_k \right\} ^{\tilde{m}}_{k=1} \subseteq \left\{ f(x_k) + n_k \right\} ^m_{k=1}, \end{aligned}$$

yet it still outputs a subset $S \subseteq B$ which is guaranteed to simultaneously satisfy both of the following properties with probability at least $1-\delta $:

(i)
S will contain all $\omega \in B$ satisfying (15), and
(ii)
all $\omega \in S$ will have an associated coefficient estimate $z_{\omega } \in {\mathbb {C}}$ satisfying (16).

Finally, both this Monte Carlo variant’s number of required samples, $\tilde{m}$, as well as its operation count will also always be

$$\begin{aligned} {\mathcal {O}} \left( \frac{s}{\epsilon } \cdot \log ^3 (N) \cdot \log \left( \frac{N}{\delta } \right) \right) . \end{aligned}$$

(18)

Proof

The proof of this lemma involves a somewhat tedious and uninspired series of minor modifications to various results from [21]. In what follows we will outline the portions of that paper which need to be changed in order to obtain the stated lemma. Algorithm 3 on page 72 of [21] will provide the basis of our discussion.

In the first paragraph of our lemma we are provided with m-contaminated evaluations of f, $\left\{ f(x_k) + n_k \right\} ^m_{k=1}$, at the set of m points $\left\{ x_k \right\} ^m_{k=1} \subset [-\pi , \pi ]$ required by line 4 of Algorithm 1 on page 67 of [21]. These contaminated evaluations of f will then be used to approximate the vector $\mathcal {G}_{\lambda ,K} \tilde{\psi } \mathbf{A} \in {\mathbb {C}}^m$ in line 4 of Algorithm 3. More specifically, using (18) on page 67 of [21] one can see that each $\left( {\mathcal {G}_{\lambda ,K} \tilde{\psi } \mathbf{A}} \right) _j \in {\mathbb {C}}$ is effectively computed via a DFT

$$\begin{aligned} \left( {\mathcal {G}_{\lambda ,K} \tilde{\psi } \mathbf{A}} \right) _j = \frac{1}{s_j} \sum ^{s_j - 1}_{k=0} f \left( -\pi + \frac{2 \pi k}{s_j} \right) \mathbb {e}^{\frac{-2 \pi \mathbb {i}k h_j}{s_j}} \end{aligned}$$

(19)

for some integers $0 \le h_j < s_j$. Note that we are guaranteed to have noisy evaluations of f at each of these points by assumption. That is, we have $f \left( x_{j,k} \right) + n_{j,k}$ for all $x_{j,k} := -\pi + \frac{2 \pi k}{s_j}$, $k = 0, \dots , s_j - 1$.

We therefore approximate each $\left( {\mathcal {G}_{\lambda ,K} \tilde{\psi } \mathbf{A}} \right) _j$ via an approximate DFT as per (19) by

$$\begin{aligned} E_j := \frac{1}{s_j} \sum ^{s_j - 1}_{k=0} \left( f \left( x_{j,k} \right) + n_{j,k} \right) \mathbb {e}^{\frac{-2 \pi \mathbb {i}k h_j}{s_j}}. \end{aligned}$$

One can now see that

$$\begin{aligned} \left| E_j - \left( {\mathcal {G}_{\lambda ,K} \tilde{\psi } \mathbf{A}} \right) _j \right| = \left| \frac{1}{s_j} \sum ^{s_j - 1}_{k=0} n_{j,k} \mathbb {e}^{\frac{-2 \pi \mathbb {i}k h_j}{s_j}} \right| \le \frac{1}{s_j} \sum ^{s_j - 1}_{k=0} \left| n_{j,k} \right| \le \Vert \mathbf {n}\Vert _{\infty } \end{aligned}$$

(20)

holds for all j. Every entry of both ${\mathcal {E}_{s_1,K} \tilde{\psi } \mathbf{A}}$ and ${\mathcal {G}_{\lambda ,K} \tilde{\psi } \mathbf{A}}$ referred to in Algorithm 3 will therefore be effectively replaced by its corresponding $E_j$ estimate. Thus, the lemma we seek to prove is essentially obtained by simply incorporating the additional error estimate (20) into the analysis of Algorithm 3 in [21] wherever an ${\mathcal {E}_{s_1,K} \tilde{\psi } \mathbf{A}}$ or ${\mathcal {G}_{\lambda ,K} \tilde{\psi } \mathbf{A}}$ currently appears.

To show that lines 6 – 14 of Algorithm 3 will identify all $\omega \in B$ satisfying (15) we can adapt the proof of Lemma 6 on page 72 of [21]. Choose any $\omega \in B$ you like. Lemmas 3 and 5 from [21] together with (20) above ensure that both

$$\begin{aligned} \left| E_j - \widehat{f}_{\omega } \right|&\le \left| E_j - \left( {\mathcal {G}_{\lambda ,K} \tilde{\psi } \mathbf{A}} \right) _j \right| + \left| \left( {\mathcal {G}_{\lambda ,K} \tilde{\psi } \mathbf{A}} \right) _j - \widehat{f}_{\omega } \right| \nonumber \\&\le \Vert \mathbf {n}\Vert _{\infty } + \frac{\epsilon \cdot \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}^{\mathrm{opt}}_{(s/\epsilon )} \right\| _1}{s} + \left\| \widehat{f} - \widehat{f}\vert _{B} \right\| _1 \end{aligned}$$

(21)

and

$$\begin{aligned} \left| E_{j'} - \widehat{f}_{\omega } \right|&\le \left| E_{j'} - \left( {\mathcal {E}_{s_1,K} \tilde{\psi } \mathbf{A}} \right) _{j'} \right| + \left| \left( {\mathcal {E}_{s_1,K} \tilde{\psi } \mathbf{A}} \right) _{j'} - \widehat{f}_{\omega } \right| \nonumber \\&\le \Vert \mathbf {n}\Vert _{\infty } + \frac{\epsilon \cdot \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}^{\mathrm{opt}}_{(s/\epsilon )} \right\| _1}{s} + \left\| \widehat{f} - \widehat{f}\vert _{B} \right\| _1 \end{aligned}$$

(22)

hold for more than half of the j and $j'$-indexes that Algorithm 3 uses to approximate $\widehat{f}_{\omega }$. The rest of the proof of Lemma 6 now follows exactly as in [21] after the $\delta $ at the top of page 73 is redefined to be $\delta := \frac{\epsilon \cdot \left\| {\hat{\mathbf {f}}}- {\hat{\mathbf {f}}}^{\mathrm{opt}}_{(s/\epsilon )} \right\| _1}{s} + \left\| \widehat{f} - \widehat{f}\vert _{B} \right\| _1 + \Vert \mathbf {n}\Vert _\infty $, each $\left( { \mathcal {G}_{\lambda ,K} \tilde{\psi } \mathbf{A}} \right) _j$ entry is replaced by $E_j$, and each $\left( {\mathcal {E}_{s_1,K} \tilde{\psi } \mathbf{A}} \right) _{j'}$ entry is replaced by $E_{j'}$.

Similarly, to show that lines 15 – 18 of Algorithm 3 will produce an estimate $z_{\omega } \in {\mathbb {C}}$ satisfying (16) for every $\omega \in S$ one can simply modify the first few lines of the proof of Theorem 7 in Appendix F of [21]. In particular, one can redefine $\delta $ as above, replace the appearance of each $\left( { \mathcal {G}_{\lambda ,K} \tilde{\psi } \mathbf{A}} \right) _j$ entry by $E_j$, and then use (21). The bounds on the runtime follow from the last paragraph of the proof of Theorem 7 in Appendix F of [21] with no required changes. To finish, we note that the second paragraph of the lemma above follows from a completely analogous modification of the proof of Corollary 4 in Appendix G of [21]. $\square $

1.1 Appendix C.1: Proof of Theorem 2

To get the first paragraph of Theorem 2 one can simply utilize the proof of Theorem 7 exactly as it is written in Appendix F of [21] after redefining $\delta $ as above, and then replacing the appearance of each $\left( \mathcal {G}_{\lambda ,K} \tilde{\psi } \mathbf{A} \right) _j$ entry with its approximation $E_j$. Once this has been done, equation (42) in the proof of Theorem 7 can then be taken as a consequence of Lemma 4 above. In addition, all references to Lemma 6 of [21] in the proof can then also be replaced with appeals to Lemma 4 above. To finish, the proof of Corollary 4 in Appendix G of [21] can now be modified in a completely analogous fashion in order to prove the second paragraph of Theorem 2.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Merhi, S., Zhang, R., Iwen, M.A. et al. A New Class of Fully Discrete Sparse Fourier Transforms: Faster Stable Implementations with Guarantees. J Fourier Anal Appl 25, 751–784 (2019). https://doi.org/10.1007/s00041-018-9616-4

Download citation

Received: 08 June 2017
Revised: 01 March 2018
Published: 03 May 2018
Issue Date: 15 June 2019
DOI: https://doi.org/10.1007/s00041-018-9616-4

Keywords

Mathematics Subject Classification

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

A New Class of Fully Discrete Sparse Fourier Transforms: Faster Stable Implementations with Guarantees

Abstract

Similar content being viewed by others

Deterministic sparse FFT for M-sparse vectors

A deterministic sparse FFT for functions with structured Fourier sparsity

High-dimensional sparse Fourier algorithms

1 Introduction

1.1 Theoretical Results

Theorem 1

2 Notation and Setup

2.1 Periodized Gaussians

Lemma 1

Lemma 2

Lemma 3

2.2 On the Robustness of the SFTs Proposed in [21]

Lemma 4

Theorem 2

3 Description of the Proposed Approach

3.1 Rapidly and Accurately Evaluating \(f*g\)

Lemma 5

Proof

Lemma 6

Proof

Lemma 7

Proof

Lemma 8

Proof

Lemma 9

Proof

Theorem 3

Proof

Lemma 10

Proof

Theorem 4

Proof

4 An Error Guarantee for Algorithm 1 When Using the SFTs Proposed in [21]

Lemma 11

Lemma 12

Proof

Lemma 13

Theorem 5

Proof

5 Numerical Evaluation

5.1 Experiment Setup

5.2 Runtime as Input Vector Size Varies

5.3 Runtime as Sparsity Varies

5.4 Robustness to Noise

6 Conclusion

Notes

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding author

Additional information

Appendices

Appendix A: Fourier Basics: Continuous versus Discrete Fourier Transforms for Trigonometric Polynomials

Appendix B: Proof of Lemmas 1, 2 and 3

Lemma 14

Proof

Lemma 15

Proof

Lemma 16

Proof

Appendix C: Proof of Lemma 4 and Theorem 2

Lemma 17

Proof

1.1 Appendix C.1: Proof of Theorem 2

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Mathematics Subject Classification

Search

Navigation