1 Introduction

1.1 Background on Super-Resolution

Super-resolution consists of estimating a signal x given blurred observations obtained after convolution with a point spread function g, which is assumed to represent the impulse response of the measurement system, such as, for example, a microscope in high density single molecule imaging. Mathematically, x is typically modeled as a superposition of K Dirac masses, i.e., a sparse atomic measure of the form

$$\begin{aligned} x(t) = \sum _{i=1}^K u_i \delta (t-t_i); \quad u_i \in \mathbb {C}, t_i \in [0,1), \end{aligned}$$

while g is a low pass filter. Denoting

$$\begin{aligned} y(t) = \sum _{i=1}^K u_i g(t-t_i) \end{aligned}$$
(1.1)

to be the convolution of x and g, one is usually given information about x either as samples of y, or as Fourier samples of y. This problem has a number of important applications arising, for instance, in geophysics [31], astronomy [49], and medical imaging [26]. The reader is referred to [13] for a more comprehensive list of applications. Super-resolution can be seen as a highly non-trivial “off the grid” extension of the finite dimensional sparse estimation problem in compressed sensing [22, 24] and statistics [11]. In the new setting, instead of estimating a sparse vector in a finite dimensional space, the goal is to estimate a sparse measure over the real line \(\mathbb R\) endowed with its Borel \(\sigma \)-algebra.

Recently, the problem of super-resolution has received a great deal of interest in the signal processing community, triggered by the seminal work of Candès and Fernandez-Granda [12, 13]. They considered the setup where one is given the first few Fourier coefficients of x, i.e., for a cut-off frequency \(m \in \mathbb {Z}^{+}\), we observe \(f(s) \in \mathbb {C}\) where

$$\begin{aligned} f(s) = \int _{0}^{1} e^{\iota 2\pi st} x(dt) = \sum _{i=1}^K u_i e^{\iota 2\pi s t_i}; \quad s \in \left\{ {-m,-m+1,\dots ,m}\right\} . \end{aligned}$$
(1.2)

Note that this corresponds to taking \(g(t) = 2m\ \text {sinc}(2mt)\) in (1.1), and sampling the Fourier transform of y on the regular grid \(\left\{ {-m,-m+1,\dots ,m}\right\} \).

1.1.1 Total Variation and Atomic Norm-Based Approaches

In [13], the authors consider the noiseless setting and propose solving an optimization problem over the space of measures which involves minimizing the total variation (TV) norm amongst all measures consistent with the observations. The resulting minimization problem is an infinite dimensional convex program over Borel measures on \(\mathbb {R}\). It was shown that the dual of this problem can be formulated as a semi-definite program (SDP) in finitely many variables, and can thus be solved in polynomial time. Since then, there have been numerous follow-up works, such as those by Schiebinger et al. [50], Duval and Peyré [18], Denoyelle et al. [17], Bendory et al. [6], Azaïs et al. [3] and many others. For instance, [50] considers the noiseless setting with real-valued samples of y for a more general choice of g (such as a Gaussian), and also assumes x to be non-negative. Their proposed approach again involves TV norm minimization with linear constraints. Bendory et al. [6] consider g to be Gaussian or Cauchy, do not place sign assumptions on x, and also analyze TV norm minimization with linear fidelity constraints for estimating x from noiseless samples of y. The approach adopted in [17, 18] is to solve a least-squares-type minimization procedure with a TV norm based penalty term (also referred to as the Beurling LASSO; see, e.g., [3]) for recovering x from samples of y. The approach in [19] considers a natural finite approximation on the grid to the continuous problem, and studies the limiting behaviour as the grid becomes finer; see also [20].

From a statistical viewpoint, Candès and Fernandez-Granda [13] showed that their approach exactly recovers x in the noiseless case provided \(m \ge 2/\triangle \), where \(\triangle \) denotes the minimum separation between the spike locations. Similar results for other choices of g were shown by Schiebinger et al. [50] (for positive measures and without any minimum separation condition), and by Bendory et al. [6] (for signed measures and with a separation condition). In the noisy setting, the state of affairs is radically different, since it is known (see, e.g., [13, Sect. 3.2], [41, Corollary 1.4]) that some separation between the spike locations is indeed necessary for stable recovery. When sufficient separation is assumed and the noise level is small enough, stable recovery of x is possible (see, e.g., [3, 17, 18, 23]).

More recently, Tang et al. [57, 58] studied approaches based on atomic norm minimization, which can be formulated as an SDP. In [57, Theorem 1.1], the authors considered the signs of the amplitudes \(u_j\) to be generated randomly, with noiseless samples. It was shown that if \(m \ge 2/\triangle \), and if the Fourier samples are obtained at \(n = \Omega (K \log K \log m)\) indices selected uniformly at random from \(\left\{ {-m,\dots ,m}\right\} \), then x is recovered exactly with high probability. In [58, Theorem 1], the authors considered Gaussian noise in the samples and showed that if \(m \ge 2/\triangle \), then the solution returned by the SDP estimates the vector of original Fourier samples (i.e., \((f(-m),\dots ,f(m))^T\)) at a mean square rate \(O(\sigma ^2 K \frac{\log m}{m})\), with \(\sigma ^2\) denoting the noise variance. Moreover, they also show [58, Theorem 2] that the spike locations and amplitudes corresponding to the SDP solution are localized around the true values.

From a computational perspective, the aforementioned approaches all admit a finite dimensional dual problem with an infinite number of linear constraints; this is a semi-infinite program (SIP), for which there is an extensive literature [27]. For the particular case of non-negative x, Boyd et al. [8] proposed an improved Frank–Wolfe algorithm in the primal. In certain instances, e.g., with Fourier samples (such as in [12, 13]), this SIP can also be reformulated as an SDP. From a practical point of view, SDP solvers are notoriously slow for even a moderately large number of variables. The algorithm of [8] is a first order scheme with potential local correction steps, and is practically more viable.

1.1.2 Prony, ESPRIT, MUSIC and Extensions

When one is given the first few Fourier samples of the spike train x (i.e., (1.2)), other approaches can be employed. Prony’s method [16] is a classical method that involves finding the roots of a polynomial whose coefficients form a null vector of a certain Hankel matrix. Prony’s method and its variants have also been recently studied by Potts and Tasche (e.g., [46, 48]), Plonka and Tasche [44], and others. The matrix pencil (MP) method [29] is another classical approach that involves computing the generalized eigenvalues of a suitable matrix pencil. Both methods recover x exactly in the absence of noise provided \(m \ge K\) (so \(2K+1\) samples), and are also computationally feasible. Recently, Moitra [41, Theorem 2.8] showed that the MP method is stable in the presence of noise provided the noise level is not too large and \(m > \frac{1}{\triangle } + 1\). Moitra also showed [41, Corollary 1.4] that such a dependence is necessary, in the sense that if \(m < (1-\epsilon )/\triangle \), then the noise level would have to be \(O(2^{-\epsilon K})\) in order to stably recover x. Very similar in spirit to the MP method for sums-of-exponentials estimation are the Prony-like methods ESPRIT and MUSIC, which can also be used for spike train estimation using the same Fourier domain measurement trick as in [41]. These methods can be interpreted as model reduction methods based on low-rank approximation of Hankel matrices [40]. The ESPRIT method was studied in [45] based on previous results from [5]. Another line of research concerns relationships with the AAK theory of Adamjan, Arov, and Krein, as developed in [44] (see also [43]). The MUSIC method was also studied in great detail in [37, Theorem 4]. The modified Matrix Pencil method is, on the one hand, often considered less computationally expensive than MUSIC and, on the other hand, amenable to an error analysis quite similar to the one in [45].

1.2 Super-Resolution with Multiple Kernels

In this paper, we consider a generalization of the standard super-resolution problem by assuming that the measurement process now involves a superposition of convolutions of several spike trains with different point spread functions. This problem, which we call the “multi-kernel unmixing super-resolution problem” appears naturally in many practical applications such as single molecule spectroscopy [30], spike sorting in neuron activity analysis [9, 33], DNA sequencing [34, 36], spike hunting in galaxy spectra [10, 38] etc.

A problem of particular interest at the National Physical Laboratory, the UK’s national metrology institute, is isotope identification [39, 54], which is of paramount importance in nuclear security. Hand-held radio-isotope identifiers are known to suffer from poor performance [54], and new and precise algorithms have to be devised for this problem. While it is legitimately expected that Radio Isotope Identifier Devices be accurate across all radioactive isotopes, the US Department of Homeland Security requires all future identification systems to meet a minimum identification standard set forth in ANSI N42.34. Effective identification from low resolution information is known to be reliably achievable by trained spectroscopists, whereas automated identification using standard algorithms sometimes fails up to three fourths of the time [55]. For instance, the spectrum of \(^{232}Th\), extracted from [55, p.9], is plotted in Fig. 1. Isotope identification involves the estimation of a set of peak locations in the gamma spectrum, where the signal is blurred by convolution with kernels of different window sizes. Mixtures of different spectra can be particularly difficult to analyse using traditional methods, and precise unmixing algorithms for such generalised super-resolution problems may play an important role in future applications such as reliable isotope identification.

Fig. 1 \(^{232}Th\) spectrum showing escape peaks and annihilation peaks [55, p.9]

Another application of interest is DNA sequencing in the vein of [34]. Sequencing is usually performed using one of the variants of the enzymatic method invented by Frederick Sanger [1]. Sequencing is based on enzymatic reactions, electrophoresis, and some detection technique. Electrophoresis is a technique used to separate the DNA sub-fragments produced as the output of four specific chemical reactions, as described in more detail in [34]. DNA fragments are negatively charged in solution. An efficient color-coding strategy has been developed to permit sizing of all four kinds of DNA sub-fragments by electrophoresis in a single lane of a slab gel or in a capillary. In each of the four chemical reactions, the primers are labeled by one of four different fluorescent dyes. The dyes are then excited by a laser-based confocal fluorescent detection system in a region within the slab gel or capillary. In that process, fluorescence intensities are emitted in four wavelength bands, as shown with different color codes in Fig. 2 below. However, quoting [7], “because the electrophoretic process often fails to separate peaks adequately, some form of deconvolution filter must be applied to the data to resolve overlapping events. This process is complicated by the variability of peak shapes, meaning that conventional deconvolution often fails.” The methods developed in the present paper aim at achieving accurate deconvolution with different peak shapes and might therefore be useful for practical deconvolution problems in DNA sequencing.

Fig. 2 Capture of the result of a sequencing using SnapGene_Viewer [52]

We now provide the mathematical formulation of the problem, describe the main idea of our approach along with our contributions, and discuss related work for this problem.

1.3 Problem Formulation

Say we have L groups of point sources where \(\left\{ {t_{l,i}}\right\} _{i=1}^{K} \subset [0,1)\) and \((u_{l,i})_{i=1}^{K}\) (with \(u_{l,i} \in \mathbb {C}\)) denote the locations and (complex-valued) amplitudes respectively of the sources in the lth group. Our signal of interest is defined as

$$\begin{aligned} x(t) = \sum _{l=1}^L x_l(t) = \sum _{l=1}^L \left( \sum _{j=1}^K \ u_{l,j} \delta (t-t_{l,j})\right) . \end{aligned}$$

Let \(g\in L_1(\mathbb {R})\) be a positive definite function with Fourier transform \(\bar{g}(s) = \int _{\mathbb R} g(t) \exp \left( \iota 2\pi st\right) dt\) for \(s \in \mathbb {R}\). Consider L distinct kernels \(g_l(t) = g(t/\mu _l)\), \(l=1,\ldots ,L\) where \(0< \mu _1< \cdots < \mu _L\). Let \(y(t) = \sum _{l=1}^L (g_l \star x_l)(t) = \sum _{l=1}^L \sum _{j=1}^K u_{l,j} \ g_l(t-t_{l,j})\) where \(\star \) denotes the convolution operator. Let f denote the Fourier transform of y, i.e., \(f(s) = \int _{\mathbb R} y(t) \exp \left( \iota 2\pi st\right) dt\). Denoting the Fourier transform of \(g_l\) by \(\bar{g}_l\), we get

$$\begin{aligned} f(s)&= \sum _{l=1}^L \underbrace{\bar{g}_l(s) \left( \sum _{j=1}^K u_{l,j} \exp \left( \iota 2\pi st_{l,j}\right) \right) }_{f_{l}(s)}. \end{aligned}$$
(1.3)

Assuming black box access to the complex valued function f, our aim is to recover estimates of \(\left\{ {t_{l,i}}\right\} _{i=1}^{K}\) and \((u_{l,i})_{i=1}^{K}\) for each \(l = 1,\dots ,L\) from a few possibly noisy samples \({\tilde{f}}(s) = f(s) + w(s)\) of f. Here, \(w(s) \in \mathbb {C}\) denotes measurement noise at location s. We remark that the choice of having K summands for each group is only for ease of exposition; one can more generally consider \(K_l \le K\) summands for each \(l = 1,\dots ,L\).

Gaussian Kernels For ease of exposition, we will from now on consider g to be a Gaussian, i.e., \(g(t) = \exp (-t^2/2)\) so that \(g_l(t) = \exp (-t^2/(2 \mu _l^2))\), \(l=1,\ldots ,L\). It is well known that g is positive definite [60, Theorem 6.10]; moreover, \(\bar{g}_l(s) = \sqrt{2\pi } \mu _l \exp (-2\pi ^2 s^2 \mu _l^2)\). We emphasize that our restriction to Gaussian kernels is only to minimize the amount of tedious computations in the proof. However, our proof technique can likely be extended to handle more general positive definite g possessing a strictly positive Fourier transform. Examples of such functions are: (a) (Laplace kernel) \(g(t) = \exp (-|{t}|)\), and (b) (Cauchy kernel) \(g(t) = \frac{1}{1+t^2}\).
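As a quick sanity check, the closed form of \(\bar{g}_l\) above can be verified numerically. The following Python sketch (ours, with arbitrary illustrative values of \(\mu \) and s) compares the stated expression against direct quadrature of the Fourier integral.

```python
import numpy as np

def g_bar(s, mu):
    # Closed-form Fourier transform of g_l(t) = exp(-t^2 / (2 mu^2))
    return np.sqrt(2.0 * np.pi) * mu * np.exp(-2.0 * np.pi**2 * s**2 * mu**2)

# Compare against direct quadrature of  int exp(-t^2/(2 mu^2)) exp(i 2 pi s t) dt
mu, s = 0.7, 0.5                       # illustrative values, not from the paper
t = np.linspace(-50, 50, 400_001)
integrand = np.exp(-t**2 / (2 * mu**2)) * np.exp(1j * 2 * np.pi * s * t)
numeric = np.sum(integrand) * (t[1] - t[0])    # simple Riemann-sum quadrature
print(numeric.real, g_bar(s, mu))      # the two values agree to high precision
```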

1.4 Main Idea of Our Work: Fourier Tail Sampling

To explain the main idea, let us consider the noiseless setting \(w(s) = 0\). Our main algorithmic idea stems from observing (1.3), wherein we notice that for s sufficiently large, f(s) is equal to \(f_1(s)\) plus a perturbation term arising from the tails of \(f_2(s),\dots ,f_{L}(s)\). Thus, \(f(s)/\bar{g}_1(s)\) is equal to \(\sum _{j=1}^K u_{1,j} \exp \left( \iota 2\pi st_{1,j}\right) \) (which is a weighted sum of complex exponentials) plus a perturbation term. We control this perturbation by choosing s to be sufficiently large, and recover estimates \({\widehat{t}}_{1,j}, {\widehat{u}}_{1,j}\) (up to a permutation \(\phi _1\)) via the Modified Matrix Pencil (MMP) method of Moitra [41] (outlined as Algorithm 1). Given these, we form the estimate \({\widehat{f}}_1(s)\) of \(f_1(s)\), where

$$\begin{aligned} {\widehat{f}}_1(s) =\bar{g}_1(s) \sum _{j=1}^K {\widehat{u}}_{1,\phi _1(j)} \exp (\iota 2 \pi s {\widehat{t}}_{1,\phi _1(j)}). \end{aligned}$$

Provided the estimates are good enough, we will have \(f(s) - {\widehat{f}}_1(s) \approx \sum _{l=2}^{L} f_l(s)\). Therefore, by applying the above procedure to \(f(s) - {\widehat{f}}_1(s)\), we can hope to recover estimates of the \(t_{2,j}\)’s and \(u_{2,j}\)’s as well. By proceeding recursively, we can thus recover estimates of \(\left\{ {t_{l,i}}\right\} _{i=1}^{K} \subset [0,1)\) and \((u_{l,i})_{i=1}^{K}\) for all \(l=1,\dots ,L\). A delicate issue that needs to be addressed for each intermediate group \(1< l < L\) is the following. While estimating the parameters for group l, the samples \(\frac{f(s) - \sum _{i=1}^{l-1} {\widehat{f}}_i(s)}{\bar{g}_l(s)}\) that we obtain will be perturbed by (a) the tails of \(f_{l+1}(s), \dots , f_{L}(s)\), and (b) the errors in the estimates \({\widehat{f}}_{1}(s),\dots ,{\widehat{f}}_{l-1}(s)\) computed thus far. In particular, going “too deep” into the tail of \(f_l(s)\) would blow up the perturbation term in (b), while not going sufficiently deep would blow up the perturbation term in (a). Therefore, in order to obtain stable estimates of the parameters \(t_{l,j}, u_{l,j}\), we will need to control these perturbation terms by carefully choosing the locations, as well as the number, of sampling points in the tail.
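To make the recursion concrete, the following Python sketch (ours) mirrors the peeling procedure just described in the noiseless case. Here `mmp_estimate` stands in for the MMP routine of Algorithm 1 and is passed in as a callable (one possible variant is sketched in Sect. 2.2), and the offsets \(s_l\) and sample counts \(m_l\) are assumed to have already been chosen as prescribed by the main results (Theorems 1 and 2).

```python
import numpy as np

def unmix_sketch(f, g_bars, offsets, sizes, K, mmp_estimate):
    """Peeling recursion of Sect. 1.4 (noiseless sketch, not the full Algorithm 2).

    f            : callable s -> complex, black-box access to the mixture (1.3)
    g_bars       : Fourier transforms g_bar_1, ..., g_bar_L (callables)
    offsets      : sampling offsets s_1, ..., s_L (with s_L = 0)
    sizes        : sample counts m_1, ..., m_L
    mmp_estimate : callable (samples, s0, m, K) -> (t_hat, u_hat), the MMP routine
    """
    estimates = []
    fitted = lambda s: 0.0            # running sum  \hat f_1 + ... + \hat f_{l-1}

    for g_bar_l, s_l, m_l in zip(g_bars, offsets, sizes):
        grid = s_l + np.arange(-m_l, m_l)
        # residual samples, deconvolved by the l-th kernel (cf. (3.6) and (3.13))
        samples = np.array([(f(s) - fitted(s)) / g_bar_l(s) for s in grid])
        t_hat, u_hat = mmp_estimate(samples, s_l, m_l, K)
        estimates.append((t_hat, u_hat))

        # fold the newly fitted group \hat f_l into the running estimate
        prev = fitted
        fitted = (lambda s, p=prev, g=g_bar_l, t=t_hat, u=u_hat:
                  p(s) + g(s) * np.sum(u * np.exp(1j * 2 * np.pi * s * t)))
    return estimates
```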

Further Remarks Our choice of using the MMP method for estimating the spike locations and amplitudes at each stage is primarily dictated by two reasons. Firstly, it is extremely simple to implement in practice. Secondly, it comes with precise quantitative error bounds (see Theorem 3) for estimating the source parameters—especially for the setting of adversarial bounded noise—which fit seamlessly into the theoretical analysis of our method. Of course, in practice, other methods such as ESPRIT and MUSIC could also be used, and as discussed in Sect. 1.1.2, there also exist some error bounds in the literature for these methods. To our knowledge, error bounds of the form in Theorem 3 do not currently exist for these other methods. In any case, a full discussion in this regard is outside the scope of the paper, and analyzing the performance of our algorithm with methods other than the MMP method is a direction for future work.

1.5 Main Results

Our algorithm based on the aforementioned idea is outlined as Algorithm 2, which we name KrUMMP (Kernel Unmixing via Modified Matrix Pencil). At each stage \(l = 1,\dots ,L\), we choose a “sampling offset” \(s_l\) and obtain the (potentially noisy) Fourier samples at the \(2 m_l\) locations \(s_l +i\) for \(i=-m_l,\dots ,m_l - 1\). Our main result for the noiseless setting (\(w \equiv 0\)) is stated as Theorem 5 in its full generality. We state the following informal version, assuming the spike amplitudes in each group to be \(\asymp 1\). Note that \(d_w:[0,1]^2 \rightarrow [0,1/2]\) is the usual wrap around distance on [0, 1] (see (2.1)).

Theorem 1

(Noiseless case) Denote \(\triangle _l := \min _{i \ne j} d_w(t_{l,i},t_{l,j}) > 0\) for each \(1 \le l \le L\). Let \(0 < \varepsilon _1 \le \varepsilon _2 \le \cdots \le \varepsilon _L\) satisfy \(\varepsilon _l \lesssim \triangle _l/2\) for each l. Moreover, let \(\varepsilon _{L-1} \lesssim \alpha \varepsilon _L^2\) and \(\varepsilon _l \lesssim \beta _l(\varepsilon _{l+1})^{2(1+\gamma _l)}\) hold for \(1 \le l \le L-1\) with \(\alpha ,\beta _l,\gamma _l > 0\) depending on the problem parameters (see (3.25), (3.27)). Finally, in Algorithm 2, let \(m_L \asymp 1/\triangle _L\), \(s_L = 0\), and

$$\begin{aligned} m_l \asymp 1/\triangle _l, \quad s_l \asymp m_l + \frac{1}{(\mu _{l+1}^2 - \mu _l^2)^{1/2}} \log ^{1/2}\left( \frac{K^{3/2}(L-l)\mu _L}{\varepsilon _l\mu _l}\right) ; \quad 1 \le l \le L-1. \end{aligned}$$

Then, for each \(l=1,\dots ,L\), there exists a permutation \(\phi _l: [K] \rightarrow [K]\) such that

$$\begin{aligned} d_w({\widehat{t}}_{l,\phi _l(j)}, t_{l,j})&\le \varepsilon _l, \quad |{{\widehat{u}}_{l,\phi _l(j)} - u_{l,j}}| \lesssim E_l(\varepsilon _l) \varepsilon _l; \quad j=1,\dots ,K. \end{aligned}$$

Here, \(E_L(\varepsilon _L) \lesssim \frac{K}{\triangle _L}\), \(E_l(\varepsilon _l) \lesssim \left( \frac{K}{\triangle _l} + \frac{1}{(\mu _{l+1}^2 - \mu _l^2)^{1/2}} \log ^{1/2}\left( \frac{K^{3/2}(L-l)\mu _L}{\varepsilon _l\mu _l}\right) \right) \) for \(1 \le l \le L-1\).

The conditions on \(\varepsilon _l \in (0,1)\) imply that the estimation errors corresponding to group l should be sufficiently smaller than those of group \(l+1\). This is because the estimation errors arising in stage l percolate to stage \(l+1\), and hence need to be controlled for stable recovery of the source parameters of the \((l+1)\)th group. Note that the conditions on \(s_l\) (sampling offset at stage l) involve upper and lower bounds, for reasons stated in the previous section. If \(\mu _l\) is close to \(\mu _{l+1}\), then \(s_l\) will have to be suitably large in order to distinguish between \(\bar{g}_l\) and \(\bar{g}_{l+1}\), as one would expect intuitively. An interesting feature of the result is that it only depends on the separation within a group (specified by \(\triangle _l\)), and so spikes belonging to different groups are allowed to overlap.

Our second result is for the noisy setting. Say at stage \(1 \le p \le L\) of Algorithm 2, we observe \(\widetilde{f}(s) = f(s) + w_p(s)\) where \(w_p(s) \in \mathbb {C}\) denotes noise at location s. Our main result for this setting is Theorem 6. Denoting \(w_p = (w_p(s_p-m_p),w_p(s_p-m_p+1),\dots ,w_p(s_p+m_p-1))^{T} \in \mathbb {C}^{2m_p}\) to be the noise vector at stage p, we state its informal version below assuming the spike amplitudes in each group to be \(\asymp 1\).

Theorem 2

(Noisy case) Say at stage \(1 \le p \le L\) of Algorithm 2, we observe \(\widetilde{f}(s) = f(s) + w_p(s)\) where \(w_p(s) \in \mathbb {C}\) denotes noise at location s. For each \(1 \le l \le L\), let \(\varepsilon _l,m_l,s_l\) be chosen as specified in Theorem 1. Say \(\Vert {w_L}\Vert _{\infty } \lesssim \mu _L e^{-\mu _L^2/\triangle _L^2} \frac{\varepsilon _L}{\sqrt{K}}\), and also

$$\begin{aligned} \Vert {w_l}\Vert _{\infty } \lesssim \frac{(\varepsilon _l\mu _l)^{1+C(\mu _l,\mu _{l+1},\triangle _l)}}{\sqrt{K} (K^{3/2} L \mu _L)^{C(\mu _l,\mu _{l+1},\triangle _l)}}; \quad 2 \le l \le L-1 \end{aligned}$$

where \(C(\mu _l,\mu _{l+1},\triangle _l) > 0\) depends only on \(\mu _l,\mu _{l+1},\triangle _l\). Then, for each \(l=1,\dots ,L\), there exists a permutation \(\phi _l: [K] \rightarrow [K]\) such that

$$\begin{aligned} d_w({\widehat{t}}_{l,\phi _l(j)}, t_{l,j}) \le \varepsilon _l, \quad |{{\widehat{u}}_{l,\phi _l(j)} - u_{l,j}}| \lesssim E_l(\varepsilon _l) \varepsilon _l; \quad j=1,\dots ,K, \end{aligned}$$

where \(E_l(\cdot )\) is as in Theorem 1.

As remarked earlier in Sect. 1.3, one can more generally consider \(K_l \le K\) summands for the lth group—our algorithm and results remain unchanged.

2 Notation and Preliminaries

Notation Vectors and matrices are denoted by lower and upper case letters respectively. For \(n \in \mathbb {N}\), we denote \([n] = \left\{ {1,\dots ,n}\right\} \). The imaginary unit is denoted by \(\iota = \sqrt{-1}\). The notation \(\log ^{1/2}(\cdot )\) is used to denote \(|{\log (|{\cdot }|)}|^{1/2}\). The \(\ell _p\) (\(1 \le p \le \infty \)) norm of a vector \(x \in \mathbb {R}^n\) is denoted by \(\Vert {x}\Vert _p\) (defined as \((\sum _i |{x_i}|^p)^{1/p}\)). In particular, \(\Vert {x}\Vert _{\infty } := \max _i |{x_i}|\). For a matrix \(A \in \mathbb {R}^{m \times n}\), we will denote its spectral norm (i.e., largest singular value) by \(\Vert {A}\Vert \) and its Frobenius norm by \(\Vert {A}\Vert _F\) (defined as \((\sum _{i,j} A_{i,j}^2)^{1/2}\)). For positive numbers ab, we denote \(a \lesssim b\) to mean that there exists an absolute constant \(C > 0\) such that \(a \le C b\). If \(a \lesssim b\) and \(b \lesssim a\) then we denote \(a \asymp b\). The wrap around distance on [0, 1] is denoted by \(d_w: [0,1]^2 \rightarrow [0,1/2]\) where we recall that

$$\begin{aligned} d_w(t_1,t_2) = \min \left\{ {|{t_1 - t_2}|, 1-|{t_1 - t_2}|}\right\} . \end{aligned}$$
(2.1)
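For concreteness, (2.1) can be implemented in a couple of lines; the following is an illustrative Python snippet (ours), not part of the algorithm.

```python
import numpy as np

def d_w(t1, t2):
    """Wrap around distance on [0, 1], as in (2.1)."""
    d = np.abs(np.asarray(t1) - np.asarray(t2))
    return np.minimum(d, 1.0 - d)

print(d_w(0.05, 0.95))   # prints 0.1, not 0.9
```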

2.1 Matrix Pencil (MP) Method

We now review the classical Matrix Pencil (MP) method for estimating positions of point sources from Fourier samples. Consider the signal \(x(t) := \sum _{j=1}^{K} u_{j} \delta (t - t_{j})\) where \(u_{j} \in \mathbb {C}\), \(t_j \in [0,1)\) are unknown. Let \(f: \mathbb {R}\rightarrow \mathbb {C}\) be the Fourier transform of x so that \(f(s) = \sum _{j=1}^K u_j \exp \left( \iota 2\pi s t_j \right) \). For any given offset \(s_0 \in \mathbb {Z}^{+}\), let \(s = s_0 + i\) where \(i \in \mathbb {Z}\); clearly

$$\begin{aligned} f(s_0 + i) = \sum _{j=1}^K u_j \exp \left( \iota 2\pi (s_0+i)t_j\right) = \sum _{j=1}^K u'_j \exp \left( \iota 2\pi i t_j \right) \end{aligned}$$

where \(u'_j = u_j \exp \left( \iota 2\pi s_0 t_j \right) , \quad j=1,\ldots ,K\). Choose \(i \in \left\{ {-m,-m+1,\ldots ,m-1}\right\} \) to form the \(m \times m\) matrices

$$\begin{aligned} H_0 = \begin{bmatrix} f(s_0) &{}\quad f(s_0+1) &{}\quad \cdots &{}\quad f(s_0+m-1) \\ f(s_0-1) &{}\quad f(s_0) &{}\quad \cdots &{}\quad f(s_0+m-2) \\ \vdots &{}\quad \vdots &{}\quad &{}\quad \vdots \\ f(s_0-m+1) &{}\quad f(s_0-m+2) &{}\quad \cdots &{}\quad f(s_0) \end{bmatrix} \end{aligned}$$
(2.2)

and

$$\begin{aligned} H_1 = \begin{bmatrix} f(s_0-1) &{}\quad f(s_0) &{}\quad \cdots &{}\quad f(s_0+m-2) \\ f(s_0-2) &{}\quad f(s_0-1) &{}\quad \cdots &{}\quad f(s_0+m-3) \\ \vdots &{}\quad \vdots &{}\quad &{}\quad \vdots \\ f(s_0-m) &{}\quad f(s_0-m+1) &{}\quad \cdots &{}\quad f(s_0-1) \end{bmatrix}. \end{aligned}$$
(2.3)

Denoting \(\alpha _j = \exp \left( -\iota 2\pi t_j \right) \) for \(j= 1, \ldots , K\), and the Vandermonde matrix

$$\begin{aligned} V = \begin{bmatrix} 1 &{}\quad 1 &{}\quad \cdots &{}\quad 1 \\ \alpha _1 &{}\quad \alpha _2 &{}\quad \cdots &{}\quad \alpha _K \\ \vdots &{}\quad \vdots &{}\quad &{}\quad \vdots \\ \alpha _1^{m-1} &{}\quad \alpha _2^{m-1} &{}\quad \cdots &{}\quad \alpha _K^{m-1} \end{bmatrix}, \end{aligned}$$
(2.4)

clearly \(H_0=V D_{u'} V^H\) and \(H_1=VD_{u'} D_\alpha V^H\). Here, \(D_{u'} = \text {diag}(u'_1,\dots ,u'_K)\) and \(D_\alpha = \text {diag}(\alpha _1,\dots ,\alpha _K)\) are diagonal matrices. One can readily verify that the K non zero generalized eigenvalues of (\(H_1,H_0\)) are equal to the \(\alpha _j\)’s. Hence, by forming the matrices \(H_0, H_1\), we can recover the unknown \(t_j\)’s exactly from 2K samples of f. Once the \(t_j\)’s are recovered, we can recover the \(u'_j\)’s exactly, as the solution of the linear system

$$\begin{aligned} \begin{bmatrix} f(s_0) \\ f(s_0-1) \\ \vdots \\ f(s_0-m+1) \end{bmatrix}&= \begin{bmatrix} 1 &{}\quad 1 &{}\quad \cdots &{}\quad 1 \\ \alpha _1 &{}\quad \alpha _2 &{}\quad \cdots &{}\quad \alpha _K \\ \vdots &{}\quad \vdots &{}\quad &{}\quad \vdots \\ \alpha _1^{m-1} &{}\quad \alpha _2^{m-1} &{}\quad \cdots &{}\quad \alpha _K^{m-1} \end{bmatrix} \begin{bmatrix} u'_1 \\ \vdots \\ u'_K \end{bmatrix}. \end{aligned}$$

Thereafter, the \(u_j\)’s are found as \(u_j = u'_j/\exp (\iota 2\pi s_0 t_j)\).
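The noiseless procedure above is straightforward to implement. The following Python sketch (our own illustration, not the paper's Algorithm 1) forms \(H_0, H_1\) from the samples, extracts the \(\alpha _j\) as the K dominant eigenvalues of \(H_0^{+}H_1\) (these coincide with the non zero generalized eigenvalues of \((H_1,H_0)\) and are numerically more convenient when \(m > K\)), and then solves the Vandermonde system for the amplitudes. The spike locations, amplitudes, offset and m in the example at the end are arbitrary illustrative values.

```python
import numpy as np

def matrix_pencil(f_samples, s0, m, K):
    """Classical MP recovery from noiseless samples f(s0-m), ..., f(s0+m-1).

    f_samples[j] = f(s0 - m + j), j = 0, ..., 2m-1.
    Returns estimated locations t_j in [0,1) and amplitudes u_j.
    """
    idx = lambda k: k + m                     # sample index s0 + k -> array position
    # H0[r,c] = f(s0 + c - r), H1[r,c] = f(s0 + c - r - 1), as in (2.2), (2.3)
    H0 = np.array([[f_samples[idx(c - r)]     for c in range(m)] for r in range(m)])
    H1 = np.array([[f_samples[idx(c - r - 1)] for c in range(m)] for r in range(m)])

    # The K non-zero generalized eigenvalues of (H1, H0) equal alpha_j = exp(-i 2 pi t_j);
    # pinv keeps the spurious eigenvalues near zero (instead of infinite) when m > K.
    w = np.linalg.eigvals(np.linalg.pinv(H0) @ H1)
    alpha = w[np.argsort(-np.abs(w))[:K]]
    t = np.mod(-np.angle(alpha) / (2 * np.pi), 1.0)

    # Vandermonde system V u' = (f(s0), f(s0-1), ..., f(s0-m+1))^T, then undo the modulation
    V = np.vander(np.exp(-1j * 2 * np.pi * t), m, increasing=True).T   # m x K, as in (2.4)
    u_prime = np.linalg.lstsq(V, f_samples[m:0:-1], rcond=None)[0]
    u = u_prime * np.exp(-1j * 2 * np.pi * s0 * t)
    return t, u

# Example: two spikes, recovered exactly from 2m noiseless Fourier samples
t_true, u_true, s0, m = np.array([0.2, 0.65]), np.array([1.0 + 0.5j, -2.0]), 3, 4
f = lambda s: np.sum(u_true * np.exp(1j * 2 * np.pi * s * t_true))
samples = np.array([f(s0 + k) for k in range(-m, m)])
print(matrix_pencil(samples, s0, m, K=2))
```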

In [29], the authors elaborate further on this approach and, in the noiseless case, propose an equivalent formulation of the Matrix Pencil approach as a standard eigenvalue problem. Using continuity of the eigenvalues with respect to perturbations, they also suggest that this eigenvalue problem provides a good estimator in the noisy case. In the next section, we discuss an alternative approach proposed by Moitra, which involves solving a generalised eigenproblem and for which he provided a precise quantitative perturbation result.

2.2 The Modified Matrix Pencil (MMP) Method

We now consider the noisy version of the setup defined in the previous section. For a collection of K point sources with parameters \(u_j \in \mathbb {C}\), \(t_j \in [0,1]\), we are given noisy samples

$$\begin{aligned} \widetilde{f}(s) = f(s) + \eta _s, \end{aligned}$$
(2.5)

for \(s \in \mathbb {Z}\) and where \(\eta _s \in \mathbb {C}\) denotes noise.

Let us choose \(s = s_0 + i\) for a given offset \(s_0 \in \mathbb {Z}^+\), and \(i \in \left\{ {-m,-m+1,\ldots ,m-1}\right\} \) for a positive integer m. Using \((\widetilde{f}(s_0 + i))_{i=-m}^{m-1}\), let us form the matrices \({\widetilde{H}}_0, {\widetilde{H}}_1 \in \mathbb {C}^{m \times m}\) as in (2.2), (2.3). We now have \({\widetilde{H}}_0 = H_0 + E\), \({\widetilde{H}}_1 = H_1 + F\) where \(H_0, H_1\) are as defined in (2.2), (2.3), and

$$\begin{aligned} E = \begin{bmatrix} \eta _0 &{}\quad \eta _1 &{}\quad \ldots &{}\quad \eta _{m-1} \\ \eta _{-1} &{}\quad \eta _0 &{}\quad \ldots &{}\quad \eta _{m-2} \\ \vdots &{}\quad \vdots &{}\quad &{}\quad \vdots \\ \eta _{-(m-1)} &{}\quad \eta _{-(m-2)} &{}\quad \ldots &{}\quad \eta _0 \end{bmatrix}, \quad F = \begin{bmatrix} \eta _{-1} &{}\quad \eta _0 &{}\quad \ldots &{}\quad \eta _{m-2} \\ \eta _{-2} &{}\quad \eta _{-1} &{}\quad \ldots &{}\quad \eta _{m-3} \\ \vdots &{}\quad \vdots &{}\quad &{}\quad \vdots \\ \eta _{-m} &{}\quad \eta _{-(m-1)} &{}\quad \ldots &{}\quad \eta _{-1} \end{bmatrix}. \end{aligned}$$

represent the perturbation matrices. Algorithm 1, namely the Modified Matrix Pencil (MMP) method [41], outlines how we can recover \({\widehat{t}}_j\), \({\widehat{u}}_j\) for \(j=1,\dots ,K\).

Before proceeding we need to make some definitions. Let \(u_{\max }= \max _j |{u_j}|\) and \(u_{\min }= \min _j |{u_j}|\). We denote the largest and smallest non zero singular values of V by \(\sigma _{\text {max}},\sigma _{\text {min}}\) respectively, and the condition number of V by \(\kappa \) where \(\kappa = \sigma _{\text {max}}/\sigma _{\text {min}}\). Let \(\eta _{\mathrm {max}}:= \max _{i} |{\eta _i}|\). We will define \(\triangle \) as the minimum separation between the locations of the point sources where \(\triangle := \min _{j\ne j'} \ d_w(t_j,t_{j'})\).

Algorithm 1: The Modified Matrix Pencil (MMP) method [41]
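The pseudo-code of Algorithm 1 is not reproduced here. For intuition only, the following Python sketch shows one standard noise-robust matrix pencil variant in the spirit of [41]: project the pencil onto the top-K singular subspaces of \(\widetilde{H}_0\), solve the resulting \(K \times K\) generalized eigenproblem, and recover the amplitudes by least squares. The precise steps of Algorithm 1 may differ from this sketch.

```python
import numpy as np

def mmp_estimate(f_samples, s0, m, K):
    """Truncated-SVD matrix-pencil sketch (one possible MMP-style routine).

    f_samples[j] = noisy sample at s0 - m + j, j = 0, ..., 2m-1, as in (2.5).
    """
    idx = lambda k: k + m
    H0 = np.array([[f_samples[idx(c - r)]     for c in range(m)] for r in range(m)])
    H1 = np.array([[f_samples[idx(c - r - 1)] for c in range(m)] for r in range(m)])

    # Project onto the top-K left/right singular subspaces of H0; this suppresses
    # the perturbations E, F before the generalized eigenvalue computation.
    U, _, Vh = np.linalg.svd(H0)
    Uk, Wk = U[:, :K], Vh[:K, :].conj().T
    A, B = Uk.conj().T @ H1 @ Wk, Uk.conj().T @ H0 @ Wk     # K x K compressed pencil
    alpha = np.linalg.eigvals(np.linalg.solve(B, A))        # approximate alpha_j
    t = np.mod(-np.angle(alpha) / (2 * np.pi), 1.0)

    # Amplitudes by least squares, exactly as in the noiseless MP method
    V = np.vander(np.exp(-1j * 2 * np.pi * t), m, increasing=True).T
    u_prime = np.linalg.lstsq(V, f_samples[m:0:-1], rcond=None)[0]
    u = u_prime * np.exp(-1j * 2 * np.pi * s0 * t)
    return t, u
```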

The following theorem is a more precise version of [41, Theorem 2.8], with the constants computed explicitly. Moreover, the result in [41, Theorem 2.8] was specifically for the case \(s_0 = 0\), and also has inaccuracies in the proof. We outline the corrected version of Moitra’s proof in Appendix A.1 and fill in additional details (using auxiliary results from Appendix B) to arrive at the statement of Theorem 3.

Theorem 3

[41] For \(0 \le \varepsilon < \triangle /2\), say \( m > \frac{1}{\triangle - 2\varepsilon } + 1\). Moreover, for \(C = 10 + \frac{1}{2\sqrt{2}}\), say

$$\begin{aligned} \eta _{\mathrm {max}}\le \varepsilon \frac{u_{\min }\sigma _{\min }^2}{2mC\sqrt{K}} \left( 1+16 \ \kappa ^2 \frac{u_{\max }}{u_{\min }}\right) ^{-1}. \end{aligned}$$
(2.6)

Then, there exists a permutation \(\phi : [K] \mapsto [K]\) such that the output of the MMP method satisfies for each \(i=1,\dots ,K\)

$$\begin{aligned} d_w({\widehat{t}}_{\phi (i)},t_i)&\le \varepsilon , \quad \Vert {{\widehat{u}}_{\phi } - u}\Vert _{\infty } \nonumber \\&\le \left( \frac{2\pi m^{3/2} K u_{\max }+ \frac{u_{\min }\sigma ^2_{\min }}{2 C \sqrt{mK}}\left( 1+16 \ \kappa ^2 \frac{u_{\max }}{u_{\min }}\right) ^{-1}}{(m - \frac{1}{\triangle -2\varepsilon } -1)^{1/2}} + 2\pi u_{\max }s_0 \right) \varepsilon \end{aligned}$$
(2.7)

where \({\widehat{u}}_{\phi }\) is formed by permuting the indices of \({\widehat{u}}\) w.r.t \(\phi \).

The following Corollary of Theorem 3 simplifies the expression for the bound on \(\Vert {{\widehat{u}}_{\phi } - u}\Vert _{\infty }\) in (2.7), and will be useful for our main results later on. The proof is deferred to Appendix A.2.

Corollary 1

For \(0 \le \varepsilon < c \triangle /2\) where \(c \in [0,1)\) is a constant, say \(\frac{2}{\triangle - 2\varepsilon } + 1 < m \le \frac{2}{(1-c)\triangle } + 1\). Denoting \(u_{\mathrm {rel}}= \frac{u_{\max }}{u_{\min }}\), \(C = 10 + \frac{1}{2\sqrt{2}}\), and \(B(u_{\mathrm {rel}},K) = \frac{1}{5C\sqrt{K}} (1 + 48u_{\mathrm {rel}})^{-1}\), say

$$\begin{aligned} {\eta _{\mathrm {max}}\le \varepsilon u_{\min }B(u_{\mathrm {rel}},K).} \end{aligned}$$
(2.8)

Then, there exists a permutation \(\phi : [K] \mapsto [K]\) such that the output of the MMP method satisfies for each \(i=1,\dots ,K\)

$$\begin{aligned} d_w({\widehat{t}}_{\phi (i)},t_i) \le \varepsilon , \quad {\Vert {{\widehat{u}}_{\phi } - u}\Vert _{\infty } < (\tilde{C}(\triangle ,c,K,u_{\mathrm {rel}}) + 2\pi s_0) u_{\max }\varepsilon } \end{aligned}$$
(2.9)

where \(\tilde{C}(\triangle ,c,K,u_{\mathrm {rel}}) = 4\pi K \left( \frac{2}{\triangle (1-c)} + 1\right) + \frac{2}{C\sqrt{K}}\left( u_{\mathrm {rel}}+ 16u_{\mathrm {rel}}^2\right) ^{-1}\), and \({\widehat{u}}_{\phi }\) is formed by permuting the indices of \({\widehat{u}}\) w.r.t \(\phi \).

3 Unmixing Gaussians in Fourier Domain: Noiseless Case

We now turn our attention to the main focus of this paper, namely that of unmixing Gaussians in the Fourier domain. We will in general assume the Fourier samples to be noisy as in (2.5), with f defined in (1.3). In this section, we will focus on the noiseless setting wherein \(\widetilde{f}(s) = f(s)\) for each s. The noisy setting is analyzed in the next section.

Let us begin by noting that when \(L = 1\), i.e., in the case of a single kernel, the problem is solved easily. Indeed, we have from (1.3) that \(f(s) = \bar{g}_1(s) \sum _{j=1}^K u_{1,j} \exp \left( \iota 2\pi st_{1,j} \right) \). Clearly, one can exactly recover \((t_{1,j})_{j=1}^K \in [0,1)\) and \((u_{1,j})_{j=1}^K \in \mathbb {C}\) via the MP method by first obtaining the samples \(f(-m)\), \(f(-m+1)\), ..., \(f(m-1)\), and then working with \(f(s)/\bar{g}_1(s)\).
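As a small illustration, the single kernel recovery just described can be reproduced with the sketches given earlier (`g_bar` from Sect. 1.3 and `matrix_pencil` from Sect. 2.1); the values of \(\mu _1\), the spike locations and the amplitudes below are arbitrary.

```python
import numpy as np

# Single-kernel (L = 1) recovery: deconvolve by g_bar_1, then run the matrix pencil.
# Reuses the illustrative g_bar and matrix_pencil sketches above; all values are arbitrary.
mu1, K, m = 0.05, 2, 5
t_true = np.array([0.15, 0.6])
u_true = np.array([1.0, -0.5 + 0.8j])

def f(s):                                   # noiseless f(s) of (1.3) with L = 1
    return g_bar(s, mu1) * np.sum(u_true * np.exp(1j * 2 * np.pi * s * t_true))

# work with f(s)/g_bar_1(s) for s = -m, ..., m-1, exactly as described above
samples = np.array([f(k) / g_bar(k, mu1) for k in range(-m, m)])
t_hat, u_hat = matrix_pencil(samples, s0=0, m=m, K=K)
print(np.sort(t_hat))                       # recovers {0.15, 0.6} up to numerical precision
```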

The situation for the case \(L \ge 2\) is however more delicate. Before proceeding, we need to make some definitions and assumptions.

  • We will denote \(u_{\max }= \max _{l,j} |{u_{l,j}}|\), \(u_{\min }= \min _{l,j} |{u_{l,j}}|\), and \(u_{\mathrm {rel}}= \frac{u_{\max }}{u_{\min }}\).

  • The sources in the lth group are assumed to have a minimum separation of

    $$\begin{aligned} \triangle _l:= \min _{i \ne j} d_w(t_{l,i},t_{l,j}) > 0. \end{aligned}$$
  • Denoting \(\alpha _{l,j}= \exp \left( -\iota 2\pi t_{l,j} \right) \), \(V_l \in \mathbb {C}^{m_l \times K}\) will denote the Vandermonde matrix

    $$\begin{aligned} \begin{bmatrix} 1 &{}\quad 1 &{}\quad \cdots &{}\quad 1 \\ \alpha _{l,1} &{}\quad \alpha _{l,2} &{}\quad \cdots &{}\quad \alpha _{l,K} \\ \vdots &{}\quad \vdots &{}\quad &{}\quad \vdots \\ \alpha _{l,1}^{m_l-1} &{}\quad \alpha _{l,2}^{m_l-1} &{}\quad \cdots &{}\quad \alpha _{l,K}^{m_l-1} \end{bmatrix} \end{aligned}$$

    for each \(l=1,\dots , L\) analogous to (2.4). \(\sigma _{\max ,l}, \sigma _{\min ,l}\) will denote its largest and smallest non-zero singular values, and \(\kappa _l = \sigma _{\max ,l}/\sigma _{\min ,l}\) its condition number. Recall from Theorem 8 that if \(m_l > \frac{1}{\triangle _l} + 1\), then \(\sigma _{\max ,l}^2 \le m_l+\frac{1}{\triangle _l}-1\) and \(\sigma _{\min ,l}^2 \ge m_l - \frac{1}{\triangle _l}-1\) and thus \(\kappa _l^2 \le \frac{m_l+\frac{1}{\triangle _l}-1}{m_l-\frac{1}{\triangle _l}-1}\).

Algorithm 2: KrUMMP (Kernel Unmixing via Modified Matrix Pencil)

3.1 The Case of Two Kernels

We first consider the case of two Gaussian kernels, as the analysis here is easier to digest than in the general case. Note that f is now of the form

$$\begin{aligned} f(s)&= \bar{g}_1(s) \left( \sum _{j=1}^K u_{1,j} \exp (\iota 2\pi s t_{1,j})\right) + \bar{g}_2(s) \left( \sum _{j=1}^K u_{2,j} \exp (\iota 2\pi s t_{2,j})\right) \end{aligned}$$

where we recall \(\bar{g}_l(s) = \sqrt{2\pi } \ \mu _l \exp \left( -2\pi ^2 s^2 \mu _l^2\right) \); \(l=1,2\). The following theorem provides sufficient conditions on the choice of the sampling parameters for approximate recovery of \(t_{1,j}, t_{2,j}, u_{1,j}, u_{2,j}\) for each \(j=1,\dots ,K\).

Theorem 4

Let \({0< \varepsilon _2 < c\triangle _2/2}\) for a constant \(c \in [0,1)\), \(s_2 = 0\), and \(m_2 \in \mathbb {Z}^{+}\) satisfy \(\frac{2}{\triangle _2 - 2\varepsilon _2} + 1 \le m_2 < \frac{2}{\triangle _2 (1-c)} + 1 \ (\ = M_{{2},\mathrm {up}})\). For \(B(u_{\mathrm {rel}},K)\) as in Corollary 1, let \({0< \varepsilon _1 < c\triangle _1/2}\) also satisfy

$$\begin{aligned} e^{(2\pi ^2 M_{{2},\mathrm {up}}^2 (\mu _2^2 - \mu _1^2))} \left( {2\pi M_{{2},\mathrm {up}}} + \bar{C}_1 + \bar{C}_2 \log ^{1/2}\left( \frac{\bar{C}_3}{\varepsilon _1}\right) \right) {u_{\mathrm {rel}}} \varepsilon _1 \le {\varepsilon _2 \frac{\mu _2 B(u_{\mathrm {rel}},K)}{K \mu _1}}, \end{aligned}$$
(3.1)

where \(\bar{C}_1, \bar{C}_2, \bar{C}_3 > 0\) are constants depending (see (3.11), (3.12)) on \(c,u_{\mathrm {rel}},\mu _1,\mu _2,\triangle _1,K\) and a constant \(\widetilde{c} > 1\). Say \(m_1,s_1 \in \mathbb {Z}^{+}\) are chosen to satisfy

$$\begin{aligned} \frac{2}{\triangle _1 - 2\varepsilon _1} + 1 \le m_1 < \frac{2}{\triangle _1 (1-c)} + 1 \ (\ = M_{{1},\mathrm {up}}), \quad S_1 \le s_1 \le \widetilde{c} S_1, \end{aligned}$$

where \(S_1 = m_1 + \frac{1}{(2\pi ^2(\mu _2^2 - \mu _1^2))^{1/2}} \log ^{1/2}\left( \frac{Ku_{\mathrm {rel}}\mu _2}{\mu _1 B(u_{\mathrm {rel}},K) \varepsilon _1} \right) \). Then, there exist permutations \(\phi _1,\phi _2: [K] \rightarrow [K]\) such that for \(j=1,\dots ,K\),

$$\begin{aligned} d_w({\widehat{t}}_{1,\phi _1(j)}, t_{1,j})&\le \varepsilon _1, \quad |{{\widehat{u}}_{1,\phi _1(j)} - u_{1,j}}|< \left( \bar{C}_1 + \bar{C}_2 \log ^{1/2}\left( \frac{\bar{C}_3}{\varepsilon _1}\right) \right) {u_{\max }} \varepsilon _1, \\ d_w({\widehat{t}}_{2,\phi _2(j)}, t_{2,j})&\le \varepsilon _2, \quad |{{\widehat{u}}_{2,\phi _2(j)} - u_{2,j}}| < \widetilde{C}_2 {u_{\max }} \varepsilon _2, \end{aligned}$$

where

$$\begin{aligned} \widetilde{C}_l&= \widetilde{C}(\triangle _l,c,K,u_{\mathrm {rel}}) = 4\pi K M_{{l},\mathrm {up}}\nonumber \\&\quad + \frac{2}{C\sqrt{K}}\left( u_{\mathrm {rel}}+ 16u_{\mathrm {rel}}^2\right) ^{-1}; \quad l=1,2, \end{aligned}$$
(3.2)

and \(\widetilde{C}(\cdot ), C\) are as defined in Corollary 1.

Interpreting Theorem 4 Before proceeding to the proof, we make some useful observations.

  1. (a)

    We first choose the sampling parameters \(\varepsilon \) (accuracy), m (number of samples), s (offset) for the inner kernel \(\bar{g}_2\) and then the outer kernel, i.e., \(\bar{g}_1\). For group i (\(= 1,2\)), we first choose \(\varepsilon _i\), then \(m_i\) (depending on \(\varepsilon _i\)), and finally the offset \(s_i\) (depending on \(m_i,\varepsilon _i\)).

  2. (b)

    The choice of \(\varepsilon _2\) is free, but the choice of \(\varepsilon _1\) is constrained by \(\varepsilon _2\), as seen from (3.1). In particular, \(\varepsilon _1\) needs to be sufficiently small with respect to \(\varepsilon _2\) so that the perturbation arising due to the estimation errors for group 1 is controlled when we estimate the parameters for group 2.

  3. (c)

    The lower bound on \(s_1\) ensures that we are sufficiently deep in the tail of \(\bar{g}_2\), so that its effect is negligible. The upper bound on \(s_1\) is to control the estimation errors of the source amplitudes for group 1 (see (2.9)). Observe that \(s_2 = 0\) since \(\bar{g}_2\) is the innermost kernel, and so there is no perturbation arising due to the tail of any other inner kernel.

  4. (d)

    In theory, \(\varepsilon _1\) can be chosen to be arbitrarily close to zero; however, this would result in the offset \(s_1\) becoming large. Consequently, this will lead to numerical errors when we divide by \(\bar{g}_1(s_1+i)\); \(i=-m_1,\dots ,m_1-1\) while estimating the source parameters for group 1.

Order Wise Dependencies The Theorem is heavy in notation, so it would help to understand the order wise dependencies of the terms involved. Assume \(u_{\max },u_{\min }\asymp 1\). We have \(\widetilde{C}_1 \asymp K/\triangle _1\) and \(\widetilde{C}_2 \asymp K/\triangle _2\) which leads to

$$\begin{aligned} \bar{C}_1 \asymp \frac{K}{\triangle _1}, \quad \bar{C}_2 \asymp \frac{1}{(\mu _2^2 - \mu _1^2)^{1/2}}, \quad \bar{C}_3 \asymp \frac{K^{3/2}\mu _2}{\mu _1}. \end{aligned}$$
  1. (a)

    For group 2, we have \(\varepsilon _2 \lesssim \triangle _2\), \(m_2 \asymp 1/\triangle _2\), \(s_2 = 0\), and

    $$\begin{aligned} d_w({\widehat{t}}_{2,\phi _2(j)}, t_{2,j}) \le \varepsilon _2, \quad |{{\widehat{u}}_{2,\phi _2(j)} - u_{2,j}}| \lesssim \frac{K}{\triangle _2} \varepsilon _2. \end{aligned}$$
  2. (b)

    For group 1, \(\varepsilon _1 \lesssim \triangle _1\) and (3.1) translates to

    $$\begin{aligned}&\left( \frac{1}{\triangle _2} + \frac{K}{\triangle _1} + \frac{1}{(\mu _2^2 - \mu _1^2)^{1/2}} \log ^{1/2}\left( \frac{K^{3/2}\mu _2}{\mu _1\varepsilon _1} \right) \right) \varepsilon _1 \nonumber \\&\quad \lesssim \varepsilon _2\left( \frac{\mu _2}{\mu _1 K^{3/2}}\right) \exp \left( -\frac{2\pi ^2(\mu _2^2 - \mu _1^2)}{\triangle _2^2}\right) . \end{aligned}$$
    (3.3)

    Moreover, \(m_1 \asymp 1/\triangle _1\) and \(s_1 \asymp \frac{1}{\triangle _1} + (\mu _2^2 - \mu _1^2)^{-1/2}\log ^{1/2}(\frac{K^{3/2} \mu _2}{\mu _1 \varepsilon _1})\). Finally,

    $$\begin{aligned} d_w({\widehat{t}}_{1,\phi _1(j)}, t_{1,j})&\le \varepsilon _1, \quad |{{\widehat{u}}_{1,\phi _1(j)} - u_{1,j}}|\nonumber \\&\lesssim \left( \frac{K}{\triangle _1} + \frac{1}{(\mu _2^2 - \mu _1^2)^{1/2}} \log ^{1/2}\left( \frac{K^{3/2} \mu _2}{\mu _1\varepsilon _1}\right) \right) \varepsilon _1. \end{aligned}$$
    (3.4)

Condition on \(\varepsilon _1,\varepsilon _2\) It is not difficult to verify that a sufficient condition for (3.3) to hold is that for any given \(\theta \in (0,1/2)\), it holds that

$$\begin{aligned} \varepsilon _1 \lesssim \varepsilon _2^{\frac{1}{1-\theta }} C(\mu _1,\mu _2,\triangle _1,\triangle _2,K,\theta ), \end{aligned}$$
(3.5)

where \(C(\mu _1,\mu _2,\triangle _1,\triangle _2,K,\theta ) > 0\) depends only on the indicated parameters. This is outlined in Appendix C for Theorem 5 for the case of L kernels. In other words, \(\varepsilon _1\) would have to be sufficiently small with respect to \(\varepsilon _2\), so that the estimation errors carrying forward from the first group to the estimation of the parameters for the second group, are controlled.

Effect of Separation Between \(\mu _1,\mu _2\) Note that as \(\mu _1 \rightarrow \mu _2\), (3.3) becomes increasingly difficult to satisfy; in particular, \(C(\mu _1,\mu _2,\triangle _1,\triangle _2,K,\theta ) \rightarrow 0\) in (3.5). Hence, we would have to sample sufficiently deep in the tail of \(\bar{g}_1\) in order to distinguish \(\bar{g}_1,\bar{g}_2\), as one would intuitively expect. Next, for fixed \(\mu _2\), as \(\mu _1 \rightarrow 0\) we see that (3.3) becomes easier to satisfy. This is because \(\bar{g}_1(s)\) is now small for all s, and hence the perturbation error arising from stage 1 reduces accordingly. However, notice that \(s_1\) now has to increase correspondingly in order to distinguish between \(\bar{g}_1,\bar{g}_2\) (since \(\bar{g}_1(s) \approx 0\) for all s). Therefore, in order to control the estimation error of the amplitudes (see (3.4)), \(\varepsilon _1\) now has to decrease accordingly. For instance, \(\varepsilon _1 = o(\mu _1^{1/3})\) suffices. On the other hand, for fixed \(\mu _1\), as \(\mu _2 \rightarrow \infty \), satisfying (3.3) becomes more and more difficult. This is because the tail of \(\bar{g}_2\) becomes thinner, and so the deconvolution step at stage 2 blows up the error arising from stage 1.

Proof of Theorem 4

The proof is divided into two steps below.

  • Recovering Source Parameters for First Group For offset parameter \(s_1\in \mathbb Z^{+}\) (the choice of which will be made clear later), we obtain the samples \((f(s_1+i))_{i=-m_1}^{m_1-1}\). Now, for any \(i=-m_1,\ldots ,m_1-1\), we have that

    $$\begin{aligned} \frac{f(s_1+i)}{\bar{g}_1(s_1+i)}&= \sum _{j=1}^K u_{1,j}\ \exp \left( \iota 2\pi (s_1+i)t_{1,j}\right) \nonumber \\&\quad + \frac{\bar{g}_2(s_1+i)}{\bar{g}_1(s_1+i)} \sum _{j=1}^K \ u_{2,j} \exp \left( \iota 2\pi (s_1+i)t_{2,j}\right) \nonumber \\&= \sum _{j=1}^K \underbrace{u_{1,j} \exp \left( \iota 2\pi s_1 t_{1,j}\right) }_{u'_{1,j}} \exp \left( \iota 2\pi i t_{1,j}\right) \nonumber \\&\quad + \underbrace{\frac{\mu _2}{\mu _1}\exp \left( -2\pi ^2(s_1+i)^2(\mu _2^2-\mu _1^2) \right) \sum _{j=1}^K u_{2,j} \exp (\iota 2\pi (s_1+i)t_{2,j})}_{\eta '_{1,i}} \nonumber \\&= \sum _{j=1}^K u'_{1,j} \exp \left( \iota 2\pi i t_{1,j}\right) +\eta '_{1,i}. \end{aligned}$$
    (3.6)

    Here, \(\eta '_{1,i} \in \mathbb {C}\) corresponds to “perturbation” arising from the tail of \(f_2(s)\). Since the stated choice of \(s_1\) implies \(s_1 > m_1\), this means \(\min _{i} (s_1 + i)^2 = (s_1-m_1)^2\), and hence clearly

    $$\begin{aligned} |{\eta '_{1,i}}| \le \frac{\mu _2}{\mu _1}\exp \left( -2\pi ^2(s_1-m_1)^2(\mu _2^2-\mu _1^2)\right) Ku_{\max }, \quad i=-m_1,\ldots ,m_1-1. \end{aligned}$$

    From (3.6), we can see that \(\widetilde{H}_0^{(1)} = V_1 D_{u'_1} V_1^H + E^{(1)}\) and \(\widetilde{H}_1^{(1)} = V_1 D_{u'_1} D_{\alpha _1} V_1^H+ F^{(1)}\). Here, \(D_{u'_1} = \text {diag}(u'_{1,1},\dots ,u'_{1,K})\) and \(D_{\alpha _1} = \text {diag}(\alpha _{1,1},\dots ,\alpha _{1,K})\), while \(E^{(1)}, F^{(1)}\) denote the perturbation matrices consisting of \((\eta '_{1,i})_{i=-m_1}^{m_1-1}\) terms, as in (2.2), (2.3).

    We obtain estimates \({\widehat{t}}_{1,j}\), \(\widehat{u}_{1,j}\), \(j=1,\ldots ,K\) via the MMP method. Invoking Corollary 1, we have for \(\varepsilon _1 < c \triangle _1/2\), and \(\frac{2}{\triangle _1 - 2\varepsilon _1} + 1 \le m_1 < \frac{2}{\triangle _1(1-c)} + 1 (= M_{{1},\mathrm {up}})\) that if \(s_1\) satisfies

    $$\begin{aligned} \frac{\mu _2}{\mu _1}\exp \left( -2\pi ^2(s_1-m_1)^2(\mu _2^2-\mu _1^2)\right) Ku_{\max }\le {\varepsilon _1 u_{\min }B(u_{\mathrm {rel}},K),} \end{aligned}$$
    (3.7)

    then there exists a permutation \(\phi _1: [K] \rightarrow [K]\) such that for each \(j=1,\dots ,K\),

    $$\begin{aligned} d_w({\widehat{t}}_{1,\phi _1(j)}, t_{1,j}) \le \varepsilon _1, \quad |{{\widehat{u}}_{1,\phi _1(j)} - u_{1,j}}| < {\left( \widetilde{C}_1 + 2\pi s_1\right) u_{\max }\varepsilon _1}, \end{aligned}$$
    (3.8)

    where \(\widetilde{C}_1 = \widetilde{C}(\triangle _1,c,K,u_{\mathrm {rel}}) = 4\pi K M_{{1},\mathrm {up}} + \frac{2}{C\sqrt{K}}\left( u_{\mathrm {rel}}+ 16u_{\mathrm {rel}}^2\right) ^{-1}\). Clearly, the condition

    $$\begin{aligned} s_1 \ge m_1 + \frac{1}{(2\pi ^2(\mu _2^2 - \mu _1^2))^{1/2}} {\log ^{1/2}\left( \frac{Ku_{\mathrm {rel}}\mu _2}{\mu _1 \varepsilon _1 B(u_{\mathrm {rel}},K)} \right) } \end{aligned}$$

    implies (3.7). Moreover, since \(s_1 \le \widetilde{c} S_1\) and \(m_1 < M_{{1},\mathrm {up}}\), we obtain

    $$\begin{aligned} s_1 < \widetilde{c} \left( M_{{1},\mathrm {up}} + \frac{1}{(2\pi ^2(\mu _2^2 - \mu _1^2))^{1/2}} {\log ^{1/2}\left( \frac{K u_{\mathrm {rel}}\mu _2}{\mu _1 \varepsilon _1 B(u_{\mathrm {rel}},K)} \right) }\right) . \end{aligned}$$
    (3.9)

    Plugging (3.9) into (3.8) leads to the bound

    $$\begin{aligned} |{{\widehat{u}}_{1,\phi _1(j)} - u_{1,j}}| < \left( \bar{C}_1 + \bar{C}_2 \log ^{1/2}\left( \frac{\bar{C}_3}{\varepsilon _1}\right) \right) {u_{\max }} \varepsilon _1; \quad j=1,\dots ,K, \end{aligned}$$
    (3.10)

    where \(\bar{C}_1, \bar{C}_2, \bar{C}_3 > 0\) are constants defined as follows.

    $$\begin{aligned} \bar{C}_1&= {\widetilde{C}_1 + 2\pi \widetilde{c}M_{{1},\mathrm {up}}}, \end{aligned}$$
    (3.11)
    $$\begin{aligned} \bar{C}_2&= {\frac{2\pi \widetilde{c}}{(2\pi ^2(\mu _2^2 - \mu _1^2))^{1/2}}}, \quad \bar{C}_3 = {\frac{K u_{\mathrm {rel}}\mu _2}{\mu _1 B(u_{\mathrm {rel}},K)}}. \end{aligned}$$
    (3.12)
  • Recovering Source Parameters for Second Group Let \(\widehat{f}_1\) denote the estimate of \(f_1\) obtained using the estimates \((\widehat{u}_{1,j})_{j=1}^K, (\widehat{t}_{1,j})_{j=1}^K\), defined as

    $$\begin{aligned} \widehat{f}_1(s) = \bar{g}_1(s) \sum _{j=1}^K \widehat{u}_{1,j} \exp \left( \iota 2\pi \widehat{t}_{1,j} s\right) . \end{aligned}$$

    For suitable \(s_2, m_2 \in \mathbb Z_+\) (choice to be made clear later), we now obtain samples

    $$\begin{aligned} \frac{f(s_2+i) - \widehat{f}_1(s_2+i)}{\bar{g}_2(s_2+i)}; \quad i=-m_2,\ldots ,m_2-1. \end{aligned}$$

    Let us note that

    $$\begin{aligned}&\frac{f(s_2+i)-\widehat{f}_1(s_2+i)}{\bar{g}_2(s_2+i)} = \frac{\bar{g}_1(s_2+i)}{\bar{g}_2(s_2+i)} \sum _{j=1}^K\left( u_{1,j} \exp \left( \iota 2\pi (s_2+i) t_{1,j} \right) \right. \nonumber \\&\qquad \left. - {\widehat{u}}_{1,\phi _1(j)} \exp \left( \iota 2\pi (s_2+i){\widehat{t}}_{1,\phi _1(j)}\right) \right) \nonumber \\&\qquad +\sum _{j=1}^K u_{2,j} \exp \left( \iota 2\pi (s_2 + i) t_{2,j} \right) \nonumber \\&\quad = \underbrace{\frac{\mu _1}{\mu _2} \exp (2\pi ^2 (s_2+i)^2 (\mu _2^2 - \mu _1^2)) \sum _{j=1}^K\left( u_{1,j} \exp (\iota 2\pi (s_2+i)t_{1,j}) - {\widehat{u}}_{1,\phi _1(j)} \exp \left( \iota 2\pi (s_2+i){\widehat{t}}_{1,\phi _1(j)}\right) \right) }_{\eta '_{2,i}} \nonumber \\&\qquad + \sum _{j=1}^K \underbrace{u_{2,j} \exp (\iota 2\pi s_2 t_{2,j})}_{u'_{2,j}} \exp (\iota 2\pi i t_{2,j}) \nonumber \\&\quad = \sum _{j=1}^K u'_{2,j} \exp (\iota 2\pi i t_{2,j}) + \eta '_{2,i}. \end{aligned}$$
    (3.13)

    Here, \(\eta '_{2,i} \in \mathbb {C}\) corresponds to noise arising from the estimation errors for the parameters in the first group of sources. As a direct consequence of Proposition 3, we have for each \(j=1,\dots ,K\) that

    $$\begin{aligned}&|{u_{1,j} \exp (\iota 2\pi (s_2+i)t_{1,j}) - \widehat{u}_{1,\phi _1(j)} \exp (\iota 2\pi (s_2+i){\widehat{t}}_{1,\phi _1(j)})}| \nonumber \\&\quad \le 2\pi u_{\max }|{s_2 + i}| d_w(t_{1,j}, \widehat{t}_{1,\phi _1(j)}) + |{u_{1,j} - \widehat{u}_{1,\phi _1(j)}}| \nonumber \\&\quad < 2\pi u_{\max }|{s_2+i}| \varepsilon _1 + \left( \bar{C}_1 + \bar{C}_2 \log ^{1/2}\left( \frac{\bar{C}_3}{\varepsilon _1}\right) \right) {u_{\max }}\varepsilon _1, \end{aligned}$$
    (3.14)

    where the last inequality above follows from the bounds on \(|{u_{1,j} - \widehat{u}_{1,\phi _1(j)}}|\) and \(d_w(t_{1,j}, \widehat{t}_{1,\phi _1(j)})\) derived earlier. Now for \(s_2 = 0\), and using the fact that \(|{i}| \le m_2 < \frac{2}{\triangle _2(1-c)} + 1 \ (\ = M_{{2},\mathrm {up}})\), we obtain from (3.14) the following uniform bound on \(|{\eta '_{2,i}}|\).

    $$\begin{aligned} |{\eta '_{2,i}}|&< \frac{\mu _1}{\mu _2} K e^{(2\pi ^2 m_2^2 (\mu _2^2 - \mu _1^2))} \left( {2\pi m_2} + \bar{C}_1 + \bar{C}_2 \log ^{1/2}\left( \frac{\bar{C}_3}{\varepsilon _1}\right) \right) {u_{\max }} \varepsilon _1 \nonumber \\&< \frac{\mu _1}{\mu _2} K e^{(2\pi ^2 M_{{2},\mathrm {up}}^2 (\mu _2^2 - \mu _1^2))} \left( {2\pi M_{{2},\mathrm {up}}} + \bar{C}_1 + \bar{C}_2 \log ^{1/2}\left( \frac{\bar{C}_3}{\varepsilon _1}\right) \right) {u_{\max }} \varepsilon _1. \end{aligned}$$
    (3.15)

    From (3.13), we see that \(\widetilde{H}_0^{(2)} = V_2 D_{u'_2} V_2^H + E^{(2)}\), \(\widetilde{H}_1^{(2)} = V_2 D_{u'_2} D_{\alpha _2}V_2^H +F^{(2)}\). Here, \(D_{u'_2} = \text {diag}(u'_{2,1},\dots ,u'_{2,K})\) and \(D_{\alpha _2} = \text {diag}(\alpha _{2,1},\dots ,\alpha _{2,K})\), while \(E^{(2)}, F^{(2)}\) denote the perturbation matrices consisting of \((\eta '_{2,i})_{i=-m_2}^{m_2-1}\) terms, as in (2.2), (2.3).

    We obtain the estimates \(({\widehat{t}}_{2,j})_{j=1}^K\) and \((\widehat{u}_{2,j})_{j=1}^K\) using the MMP method. Invoking Corollary 1 and assuming \(\varepsilon _2 < c \triangle _2/2\), \(\frac{2}{\triangle _2 - 2\varepsilon _2} + 1 \le m_2 < M_{{2},\mathrm {up}}\) hold, it is sufficient that \(\varepsilon _1\) satisfies the condition

    $$\begin{aligned}&\frac{\mu _1}{\mu _2} K e^{(2\pi ^2 M_{{2},\mathrm {up}}^2 (\mu _2^2 - \mu _1^2))} \left( {2\pi M_{{2},\mathrm {up}}} + \bar{C}_1 + \bar{C}_2 \log ^{1/2}\left( \frac{\bar{C}_3}{\varepsilon _1}\right) \right) {u_{\max }} \varepsilon _1 \\&\quad \le \varepsilon _2 {u_{\min }B(u_{\mathrm {rel}},K)}. \end{aligned}$$

    Indeed, there then exists a permutation \(\phi _2: [K] \rightarrow [K]\) such that for each \(j=1,\dots ,K\),

    $$\begin{aligned} d_w({\widehat{t}}_{2,\phi _2(j)}, t_{2,j}) \le \varepsilon _2, \quad |{{\widehat{u}}_{2,\phi _2(j)} - u_{2,j}}| < \widetilde{C}_2 u_{\max }\varepsilon _2, \end{aligned}$$

    where \({\widetilde{C}_2 = \widetilde{C}(\triangle _2,c,K,u_{\mathrm {rel}}) = 4\pi K M_{{2},\mathrm {up}} + \frac{2}{C\sqrt{K}}\left( u_{\mathrm {rel}}+ 16u_{\mathrm {rel}}^2 \right) ^{-1}}\). This completes the proof.

\(\square \)

3.2 The General Case

We now move to the general case where \(L \ge 1\). The function f is now of the form

$$\begin{aligned} f(s)&= \sum _{l=1}^{L} \bar{g}_l(s) \left( \sum _{j=1}^K u_{l,j} \exp (\iota 2\pi s t_{l,j})\right) \end{aligned}$$

where we recall that \(\bar{g}_l(s) = \sqrt{2\pi } \ \mu _l \exp (-2\pi ^2 s^2 \mu _l^2)\). Before stating our result, it will be helpful to define certain terms for ease of notation, later on.

  1. (1)

    For \(l=1,\dots ,L\),

    $$\begin{aligned} M_{{l},\mathrm {up}} := \frac{2}{\triangle _l(1-c)} + 1, \quad {\widetilde{C}_l := \left[ 4\pi K M_{{l},\mathrm {up}} + \frac{2}{C\sqrt{K}}\left( u_{\mathrm {rel}}+ 16u_{\mathrm {rel}}^2 \right) ^{-1} \right] }. \end{aligned}$$
    (3.16)

    with constants \(c \in (0,1)\) and \(C = 10 + \frac{1}{2\sqrt{2}}\) (from Corollary 1).

  2. (2)

    For \(l=1,\dots ,L-1\), and a constant \(\widetilde{c} > 1\),

    $$\begin{aligned} \bar{C}_{l,1}&:= \widetilde{C}_l + {2\pi \widetilde{c}M_{{l},\mathrm {up}}}, \ \bar{C}_{l,2} := \frac{{2\pi \widetilde{c}}}{(2\pi ^2(\mu _{l+1}^2 - \mu _l^2))^{1/2}}, \end{aligned}$$
    (3.17)
    $$\begin{aligned} D_l&:= {\frac{K u_{\mathrm {rel}}(L-l)\mu _L}{\mu _l B(u_{\mathrm {rel}},K)}}, \quad \bar{C}_{l,3} := \left\{ \begin{array}{rl} D_l \quad ; &{} l = 1 \\ 2 D_l \quad ; &{} l > 1. \end{array} \right. \end{aligned}$$
    (3.18)

    where \(B(u_{\mathrm {rel}},K)\) is as defined in Corollary 1.

  3. (3)

    For \(l = 1,\dots ,L\), define

    $$\begin{aligned} E_l(\varepsilon ) := \left\{ \begin{array}{rl} {\left( \bar{C}_{l,1} + \bar{C}_{l,2} \log ^{1/2}\left( \frac{\bar{C}_{l,3}}{\varepsilon }\right) \right) } \quad ; &{} l < L \\ {\widetilde{C}_L} \quad ; &{} l = L. \end{array} \right. \end{aligned}$$
    (3.19)

    where \(\varepsilon \in (0,1)\).

  4. (4)

    For \(l = 2,\dots ,L-1\), define

    $$\begin{aligned} F_l(\varepsilon )&:= C_{l,1}' + C_{l,2}' \log ^{1/2}\left( \frac{2 D_l}{\varepsilon }\right) ; \end{aligned}$$
    (3.20)
    $$\begin{aligned} \text {where} \quad C_{l,1}'&:= {2\pi }(\widetilde{c} + 1)M_{{l},\mathrm {up}}, \ C_{l,2}' = {\bar{C}_{l,2}}. \end{aligned}$$
    (3.21)

    Here, \(\widetilde{c} > 1\) is the same constant as in (2).

We are now ready to state our main theorem for approximate recovery of the source parameters for each group.

Theorem 5

For a constant \(c \in (0,1)\), let \(0< \varepsilon _L < c\triangle _L/2\), \(\frac{2}{\triangle _L - 2\varepsilon _L} \le m_L < M_{{L},\mathrm {up}}\) and \(s_L = 0\). Moreover, for \(l = L-1, \dots , 1\), say we choose \(\varepsilon _l,m_l,s_l\) as follows.

  1. 1.

    \({0< \varepsilon _l < c\triangle _l/2}\) additionally satisfies the following conditions.

    1. (a)

      \((2\pi M_{{L},\mathrm {up}} + E_{L-1}(\varepsilon _{L-1})) \varepsilon _{L-1} u_{\mathrm {rel}}\le \varepsilon _{L} e^{-2\pi ^2(\mu _{L}^2-\mu _1^2)M_{{L},\mathrm {up}}^2} \frac{\mu _{L} B(u_{\mathrm {rel}},K) }{K (L-1) \mu _{L-1}}\).

    2. (b)

      If \(l < L-1\), then \(\varepsilon _{l} \le \varepsilon _{l+1}\), \(E_l(\varepsilon _{l}) {\varepsilon _l} \le E_{l+1}(\varepsilon _{l+1}) {\varepsilon _{l+1}}\) and

      $$\begin{aligned} {(F_{l+1}(\varepsilon _{l}) + E_{l}(\varepsilon _{l})) \varepsilon _l u_{\mathrm {rel}}\le \varepsilon _{l+1} e^{-(\mu _{l+1}^2-\mu _1^2)\frac{F_{l+1}^2(\varepsilon _{l+1})}{2}} \frac{\mu _{l+1} B(u_{\mathrm {rel}},K)}{2K l \mu _{l}}.}\nonumber \\ \end{aligned}$$
      (3.22)
  2. 2.

    \(\frac{2}{\triangle _l - 2\varepsilon _l} \le m_l < M_{{l},\mathrm {up}}\), and \(S_l \le s_l \le \widetilde{c} S_l\) (for constant \(\widetilde{c} > 1\)) where

    $$\begin{aligned} S_l = m_l + \frac{1}{(2\pi ^2(\mu _{l+1}^2 - \mu _l^2))^{1/2}} \log ^{1/2}\left( {\frac{b_l K u_{\mathrm {rel}}(L-l) \mu _L}{\varepsilon _l B(u_{\mathrm {rel}},K) \mu _l}} \right) , \end{aligned}$$

    with \(b_l = {1}\) if \(l = 1\), and \(b_l = {2}\) otherwise.

Then, for each \(l=1,\dots ,L\), there exists a permutation \(\phi _l: [K] \rightarrow [K]\) such that

$$\begin{aligned} d_w({\widehat{t}}_{l,\phi _l(j)}, t_{l,j})&\le \varepsilon _l, \quad |{{\widehat{u}}_{l,\phi _l(j)} - u_{l,j}}| < E_l(\varepsilon _l) {\varepsilon _l u_{\max }}; \quad j=1,\dots ,K. \end{aligned}$$

Interpreting Theorem 5 Before proceeding to the proof, we make some useful observations.

  1. (a)

    We first choose the sampling parameters (\(\varepsilon ,m,s\)) for the innermost kernel \(\bar{g}_L\), then for \(\bar{g}_{L-1}\), and so on. For the lth group (\(1 \le l \le L\)), we first choose \(\varepsilon _l\) (accuracy), then \(m_l\) (number of samples), and finally \(s_l\) (sampling offset).

  2. (b)

The choice of \(\varepsilon _L \in (0,c\triangle _L/2)\), while free, dictates the choice of \(\varepsilon _1,\dots ,\varepsilon _{L-1}\). To begin with, condition 1a essentially requires \(\varepsilon _{L-1}\) to be sufficiently small with respect to \(\varepsilon _L\). Similarly, for \(l = 1,\dots ,L-2\), the conditions in 1b require \(\varepsilon _l\) to be sufficiently small with respect to \(\varepsilon _{l+1}\). This ensures that, during the estimation of the parameters for the \((l+1)\)th group, the estimation errors carried forward from the previous groups (1 to l) are sufficiently small.

  3. (c)

    For each l (\(< L\)), the lower bound on \(s_l\) is to ensure that we are sufficiently deep in the tails of \(\bar{g}_{l+1}, \bar{g}_{l+2}, \dots , \bar{g}_L\). The upper bound on \(s_l\) is to control the estimation errors of the source amplitudes for group l (see (2.9)).
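As an illustration of this top-down order, the following minimal sketch (in Python) chooses \((\varepsilon _l, m_l, s_l)\) group by group. The shrink rule \(\varepsilon _l = \varepsilon _{l+1}^2\) is only a stand-in for conditions 1a–1b (which still have to be verified, e.g., via (3.25)–(3.27)); \(B(u_{\mathrm {rel}},K)\) is assumed to be available from Corollary 1, and the function name and default constants are ours.

```python
import numpy as np

def choose_parameters(mu, Delta, K, u_rel, B, c=0.5, c_tilde=1.5):
    """Top-down choice of (eps_l, m_l, s_l) in the order of Theorem 5 (1-indexed l).

    The shrink eps_l = eps_{l+1}**2 merely stands in for conditions 1a-1b,
    which must be checked separately; m_l < M_{l,up} must also be verified.
    """
    L = len(mu)
    eps, s = np.zeros(L), np.zeros(L)
    m = np.zeros(L, dtype=int)
    eps[L - 1] = c * Delta[L - 1] / 4                 # some value in (0, c*Delta_L/2)
    m[L - 1] = int(np.ceil(2 / (Delta[L - 1] - 2 * eps[L - 1])))
    s[L - 1] = 0.0                                    # s_L = 0
    for l in range(L - 1, 0, -1):                     # groups L-1, ..., 1
        eps[l - 1] = min(eps[l] ** 2, c * Delta[l - 1] / 4)
        m[l - 1] = int(np.ceil(2 / (Delta[l - 1] - 2 * eps[l - 1])))
        b_l = 1 if l == 1 else 2
        gap = 2 * np.pi**2 * (mu[l]**2 - mu[l - 1]**2)
        S_l = m[l - 1] + np.sqrt(
            np.log(b_l * K * u_rel * (L - l) * mu[-1] / (eps[l - 1] * B * mu[l - 1])) / gap
        )
        s[l - 1] = c_tilde * S_l                      # any value in [S_l, c_tilde*S_l] works
    return eps, m, s
```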

Order Wise Dependencies We now discuss the scaling of the terms involved, assuming \(u_{\max },u_{\min }\asymp 1\).

  1. (i)

    For \(l=1,\dots ,L\), we have \(M_{{l},\mathrm {up}} \asymp \frac{1}{\triangle _l}\), \(\widetilde{C}_l \asymp \frac{K}{\triangle _l}\).

  2. (ii)

    For \(p=1,\dots ,L-1\) we have

    $$\begin{aligned} \bar{C}_{p,1} \asymp \frac{K}{\triangle _p}, \ \bar{C}_{p,2} \asymp \frac{1}{(\mu _{p+1}^2 - \mu _p^2)^{1/2}}, \ D_p \asymp \frac{K^{3/2}(L-p)\mu _L}{\mu _p}, \ \bar{C}_{p,3} \asymp D_p. \end{aligned}$$
  3. (iii)

    For \(l = 1,\dots ,L\), we have

    $$\begin{aligned} E_l(\varepsilon _l) \asymp \left\{ \begin{array}{rl} {\frac{K}{\triangle _l} + \frac{1}{(\mu _{l+1}^2 - \mu _l^2)^{1/2}} \log ^{1/2}\left( \frac{K^{3/2}(L-l)\mu _L}{\varepsilon _l\mu _l}\right) } \quad ; &{} l < L \\ \\ \frac{K}{\triangle _L} \quad ; &{} l = L. \end{array} \right. \end{aligned}$$
    (3.23)
  4. (iv)

    For \(q = 2,\dots ,L-1\), we have

    $$\begin{aligned} F_q(\varepsilon _q) \asymp \frac{1}{\triangle _q} + \frac{1}{(\mu _{q+1}^2 - \mu _q^2)^{1/2}} \log ^{1/2}\left( \frac{K^{3/2} (L-q) \mu _L}{\varepsilon _q\mu _q}\right) . \end{aligned}$$
    (3.24)

Conditions on \(\varepsilon _l\) Theorem 5 places several conditions on \(\varepsilon _l\), which might be difficult to digest at first glance. At a high level, the conditions dictate that the accuracies satisfy \(\varepsilon _1 \le \varepsilon _2 \le \cdots \le \varepsilon _L\). In fact, they require something stronger: for each \(1 \le l \le L-1\), \(\varepsilon _l\) is required to be sufficiently smaller than \(\varepsilon _{l+1}\) (only the choice of \(\varepsilon _L \lesssim \triangle _L/2\) is free). This places the strongest assumption on \(\varepsilon _1\), meaning that the source parameters corresponding to the “outermost kernel” in the Fourier domain have to be estimated with the highest accuracy. Below, we state the conditions on \(\varepsilon _l\) appearing in the theorem up to positive constants; the details are deferred to Appendix C.

  1. 1.

    Condition 1a in Theorem 5 holds if \(\varepsilon _{L-1}, \varepsilon _L\) satisfy

    $$\begin{aligned} \varepsilon _{L-1} \lesssim \alpha (\varepsilon _L)^{\frac{1}{1-\theta }} \end{aligned}$$
    (3.25)

    for any given \(\theta \in (0,1/2)\). Here, \(\alpha > 0\) depends on \(\theta ,K,L,\triangle _{L-1},\triangle _{L},\mu _{L},\mu _{L-1},\mu _{1}\).

  2. 2.

    For \(l < L - 1\), let us look at condition 1b in Theorem 5. The requirement \(E_l(\varepsilon _l) {\varepsilon _l} \le E_{l+1}(\varepsilon _{l+1}) {\varepsilon _{l+1}}\) holds if \(\varepsilon _l,\varepsilon _{l+1}\) satisfy

    $$\begin{aligned} \varepsilon _l \lesssim \lambda _l \log ^{\frac{1}{2(1-\theta )}} \left( \frac{K^{3/2}\mu _{L} {(L-l)}}{\varepsilon _{l+1}\mu _{l+1}}\right) \varepsilon _{l+1}^{\frac{1}{1-\theta }} \end{aligned}$$
    (3.26)

    for any given \(\theta \in (0,1/2)\). Here, \(\lambda _l > 0\) depends on \(\triangle _l,\triangle _{l+1},\mu _{l},\mu _{l+1},\mu _{l+2},L,K,\theta \).

    Furthermore, the condition in (3.22) is satisfied if \(\varepsilon _l, \varepsilon _{l+1}\) satisfy

    $$\begin{aligned} \varepsilon _l \lesssim \beta _l (\varepsilon _{l+1})^{\frac{1+\gamma _l}{1-\theta }} \end{aligned}$$
    (3.27)

    for any given \(\theta \in (0,1/2)\). Here, \(\beta _l > 0\) depends on \(L,K,\triangle _l\), \(\triangle _{l+1},\triangle _{L-1},\triangle _{L}\), \(\mu _{l},\mu _{l+1}\), \(\mu _{l+2}, \mu _{L}\), \(\mu _{L-1},\mu _{1},\theta \), while \(\gamma _l > 0\) depends on \(\mu _1,\mu _{l+1},\mu _{l+2},\triangle _{l+1}\). Note that the dependence on \(\varepsilon _{l+1}\) is stricter in (3.27) as compared to (3.26).

Effect of Separation Between \(\mu _l\) and \(\mu _{l+1}\) for \(l = 1,\dots ,L-1\) The interaction between \(\mu _{L-1}\) and \(\mu _L\) occurs in the same manner as explained for the case of two kernels; the reader is invited to verify this. We analyze the interaction between \(\mu _{l}\) and \(\mu _{l+1}\) below for \(l < L-1\).

  1. 1.

    Consider the scenario where \(\mu _l \rightarrow \mu _{l+1}\) (with other terms fixed). We see that \(s_l\) has to be suitably large now in order to be able to distinguish between \(\bar{g}_l\) and \(\bar{g}_{l+1}\). Moreover, conditions (3.26), (3.27) become stricter in the sense that \(\lambda _l,\beta _l \rightarrow 0\).

  2. 2.

Now say \(\mu _{l+1}\) is fixed, and \(\mu _{l} \rightarrow 0\) (and hence \(\mu _1,\dots ,\mu _{l-1} \rightarrow 0\)). In this case, the conditions \(E_1(\varepsilon _1) {\varepsilon _1} \le \cdots \le E_{l+1}(\varepsilon _{l+1}) {\varepsilon _{l+1}}\) become vacuous, as the estimation errors arising from stages \(1,\dots ,l-1\) themselves approach 0. However, \(s_1,\dots ,s_l\) now increase accordingly in order to distinguish within \(\bar{g}_1,\dots ,\bar{g}_l\). Hence, to control the estimation errors of the amplitudes, i.e., \(E_i(\varepsilon _i)\); \(1 \le i \le l\), the accuracies \(\varepsilon _1,\dots ,\varepsilon _l\) have to be suitably small.

Proof of Theorem 5

The proof is divided into three main steps; a numerical sketch of the per-group deconvolution step that is common to all three is given after the proof.

  • Recovering Source Parameters for First Group For \(i=-m_1,\ldots ,m_1-1\), we have

    $$\begin{aligned} \frac{f(s_1+i)}{\bar{g}_1(s_1+i)}&= \sum _{j=1}^K u_{1,j} \exp \left( \iota 2\pi (s_1+i)t_{1,j}\right) \nonumber \\&\quad + \sum _{l=2}^{L}\frac{\bar{g}_l(s_1+i)}{\bar{g}_1(s_1+i)} \sum _{j=1}^K \ u_{l,j} \exp \left( \iota 2\pi (s_1+i)t_{l,j}\right) \nonumber \\&= \sum _{j=1}^K \underbrace{u_{1,j} \exp \left( \iota 2\pi s_1 t_{1,j}\right) }_{u'_{1,j}} \exp \left( \iota 2\pi i t_{1,j}\right) \nonumber \\&\quad + \underbrace{\sum _{l=2}^{L} \frac{\mu _l}{\mu _1}\exp \left( -2\pi ^2(s_1+i)^2(\mu _l^2-\mu _1^2) \right) \sum _{j=1}^K u_{l,j} \exp (\iota 2\pi (s_1+i)t_{l,j})}_{\eta '_{1,i}} \nonumber \\&= \sum _{j=1}^K u'_{1,j} \exp \left( \iota 2\pi i t_{1,j}\right) + \eta '_{1,i}. \end{aligned}$$
    (3.28)

    Here, \(\eta '_{1,i}\) is the perturbation due to the tail of \(\bar{g}_2,\bar{g}_3,\dots ,\bar{g}_L\). Since the stated choice of \(s_1\) implies \(s_1 > m_1\), this means \(\min _{i} (s_1 + i)^2 = (s_1-m_1)^2\), and hence clearly

    $$\begin{aligned} |{\eta '_{1,i}}|&\le Ku_{\max }\sum _{l=2}^{L} \frac{\mu _l}{\mu _1}\exp \left( -2\pi ^2(s_1-m_1)^2(\mu _l^2-\mu _1^2)\right) \\&\le Ku_{\max }(L-1) \frac{\mu _{L}}{\mu _1}\exp \left( -2\pi ^2(s_1-m_1)^2(\mu _2^2-\mu _1^2)\right) . \end{aligned}$$

    We obtain estimates \({\widehat{t}}_{1,j}\), \(\widehat{u}_{1,j}\), \(j=1,\ldots ,K\) via the MMP method. Invoking Corollary 1, we have for \(\varepsilon _1 < c \triangle _1/2\), and \(\frac{2}{\triangle _1 - 2\varepsilon _1} + 1 \le m_1 < \frac{2}{\triangle _1(1-c)} + 1 (= M_{{1},\mathrm {up}})\) that if \(s_1\) satisfies

    $$\begin{aligned} Ku_{\max }(L-1) \frac{\mu _{L}}{\mu _1}\exp \left( -2\pi ^2(s_1-m_1)^2(\mu _2^2-\mu _1^2)\right) \le \varepsilon _1 {u_{\min }B(u_{\mathrm {rel}},K)} \end{aligned}$$
    (3.29)

    then there exists a permutation \(\phi _1: [K] \rightarrow [K]\) such that for each \(j=1,\dots ,K\),

    $$\begin{aligned} d_w({\widehat{t}}_{1,\phi _1(j)}, t_{1,j}) \le \varepsilon _1, \quad |{{\widehat{u}}_{1,\phi _1(j)} - u_{1,j}}| < \left( \widetilde{C}_1 + 2 \pi s_1\right) {u_{\max }} \varepsilon _1. \end{aligned}$$
    (3.30)

    Clearly, the condition

    $$\begin{aligned} s_1 \ge m_1 + \frac{1}{(2\pi ^2(\mu _2^2 - \mu _1^2))^{1/2}} \log ^{1/2}\left( {\frac{K u_{\mathrm {rel}}(L-1) \mu _L}{\mu _1 \varepsilon _1 B(u_{\mathrm {rel}},K)} } \right) \end{aligned}$$
    (3.31)

    implies (3.29). Moreover, since \(s_1 \le \widetilde{c} S_1\) and \(m_1 < M_{{1},\mathrm {up}}\), we obtain

    $$\begin{aligned} s_1&< \widetilde{c} \left( M_{{1},\mathrm {up}} + \frac{1}{(2\pi ^2(\mu _2^2 - \mu _1^2))^{1/2}} \log ^{1/2}\left( {\frac{K (L-1) u_{\mathrm {rel}}\mu _L}{\mu _1 \varepsilon _1 B(u_{\mathrm {rel}},K)}} \right) \right) . \end{aligned}$$
    (3.32)

    Plugging (3.32) in (3.30), we obtain

    $$\begin{aligned} |{{\widehat{u}}_{1,\phi _1(j)} - u_{1,j}}|&< \left( \bar{C}_{1,1} + \bar{C}_{1,2} \log ^{1/2}\left( \frac{\bar{C}_{1,3}}{\varepsilon _1}\right) \right) \varepsilon _1 {u_{\max }} \nonumber \\&= E_1(\varepsilon _1) {\varepsilon _1 u_{\max }}; \quad j=1,\dots ,K, \end{aligned}$$
    (3.33)

    where \(\bar{C}_{1,1}, \bar{C}_{1,2}, \bar{C}_{1,3} > 0\) are constants defined in (3.17), (3.18), and \(E_p(\cdot )\) is defined in (3.19).

  • Recovering Source Parameters for \(l\text {th}\) (\(1< l < L\)) Group Say we are at the lth iteration for \(1< l < L\), having estimated the source parameters up to the \((l-1)\)th group. Say that for each \(p = 1,\dots ,l-1\) and \(j=1,\dots ,K\) the following holds.

    $$\begin{aligned} d_w(\widehat{t}_{p,\phi _p(j)}, t_{p,j}) \le \varepsilon _p, \quad |{\widehat{u}_{p,\phi _p(j)} - u_{p,j}}| < E_p(\varepsilon _p) {u_{\max }\varepsilon _p}, \end{aligned}$$
    (3.34)

    for some permutations \(\phi _p: [K] \rightarrow [K]\), with

    1. 1.

      \(\varepsilon _1 \le \cdots \le \varepsilon _{l-1}; \quad \quad E_1(\varepsilon _1) {\varepsilon _1} \le \cdots \le E_{l-1}(\varepsilon _{l-1}) {\varepsilon _{l-1}}\);

    2. 2.

      \(\varepsilon _{p} < c \triangle _p/2\) ;

    3. 3.

      \({(F_{q+1}(\varepsilon _{q}) + E_{q}(\varepsilon _{q})) u_{\mathrm {rel}}\varepsilon _q \le \varepsilon _{q+1} e^{-(\mu _{q+1}^2-\mu _1^2)\frac{F_{q+1}^2(\varepsilon _{q+1})}{2}} \frac{\mu _{q+1} B(u_{\mathrm {rel}},K)}{2K q \mu _{q}}}\), \(1 \le q \le l-2\).

    For \(i=-m_l,\ldots ,m_l-1\), we have

    $$\begin{aligned}&\frac{f(s_l + i) - \sum _{p=1}^{l-1} \widehat{f}_p(s_l + i)}{\bar{g}_l(s_l+i)} \\&\quad = \underbrace{\sum _{p=1}^{l-1}\frac{\bar{g}_p(s_l+i)}{\bar{g}_l(s_l + i)} \sum _{j=1}^K[u_{p,j}\exp (\iota 2\pi (s_l+i)t_{p,j}) - \widehat{u}_{p,\phi _p(j)} \exp (\iota 2\pi (s_l+i)\widehat{t}_{p,\phi _p(j)})]}_{\eta '_{l,i,past}} \\&\qquad + \underbrace{\sum _{q=l+1}^{L} \frac{\bar{g}_q(s_l+i)}{\bar{g}_l(s_l + i)} \sum _{j=1}^{K}u_{q,j}\exp (\iota 2\pi (s_l+i)t_{q,j})}_{\eta '_{l,i,fut}} \\&\qquad + \sum _{j=1}^K \underbrace{u_{l,j} \exp (\iota 2\pi s_l t_{l,j})}_{u'_{l,j}} \exp (\iota 2\pi i t_{l,j}) \\&\quad = \sum _{j=1}^K u'_{l,j} \exp (\iota 2\pi i t_{l,j}) + \eta '_{l,i,past} + \eta '_{l,i,fut}. \end{aligned}$$

    Here, \(\eta '_{l,i,past}\) denotes perturbation due to the estimation errors of the source parameters in the past. Moreover, \(\eta '_{l,i,fut}\) denotes perturbation due to the tails of the kernels that are yet to be processed.

    1. (i)

      \(\underline{\textit{Bounding }\eta '_{l,i,past}}\) To begin with, note

      $$\begin{aligned} \eta '_{l,i,past}&= \sum _{p=1}^{l-1}\frac{\mu _p}{\mu _l} \exp (2\pi ^2(\mu _l^2 - \mu _p^2) (s_l + i)^2)\\&\qquad \sum _{j=1}^K[u_{p,j}\exp (\iota 2\pi (s_l+i)t_{p,j}) \\&\quad - \widehat{u}_{p,\phi _p(j)} \exp (\iota 2\pi (s_l+i)\widehat{t}_{p,\phi _p(j)})]. \end{aligned}$$

      Using Proposition 3, we have for each \(p=1,\dots ,l-1\) and \(j=1,\dots ,K\) that

      $$\begin{aligned}&|{u_{p,j} \exp (\iota 2\pi (s_l+i)t_{p,j}) - \widehat{u}_{p,\phi _p(j)} \exp (\iota 2\pi (s_l+i){\widehat{t}}_{p,\phi _p(j)})}| \nonumber \\&\quad \le 2\pi u_{\max }|{s_l + i}| d_w(t_{p,j}, \widehat{t}_{p,\phi _p(j)}) + |{u_{p,j} - \widehat{u}_{p,\phi _p(j)}}| \nonumber \\&\quad < 2\pi u_{\max }|{s_l+i}| \varepsilon _p + E_p(\varepsilon _p) {\varepsilon _p u_{\max }}, \end{aligned}$$
      (3.35)

where the last inequality is due to (3.34). Since \(s_l > m_l\), we have \((s_l + i)^2 < (s_l + m_l)^2\) for all \(i=-m_l,\dots ,m_l-1\). With the help of (3.35), we then readily obtain

      $$\begin{aligned} |{\eta '_{l,i,past}}|&< \sum _{p=1}^{l-1} \left( \left( \frac{\mu _p}{\mu _l} \exp (2\pi ^2(\mu _l^2 - \mu _p^2) (s_l + m_l)^2)\right) \nonumber \right. \nonumber \\&\quad \left. (2\pi u_{\max }(s_l+m_l) \varepsilon _p + E_p(\varepsilon _p) {\varepsilon _p u_{\max }}) K\right) \nonumber \\&\le \left( K \frac{\mu _{l-1}}{\mu _l} e^{2\pi ^2(\mu _l^2 - \mu _1^2)(s_l + m_l)^2} \right) \nonumber \\&\quad \left( \sum _{p=1}^{l-1} 2\pi u_{\max }(s_l+m_l)\varepsilon _p + E_p(\varepsilon _p) {\varepsilon _p u_{\max }} \right) \nonumber \\&\le \left( K(l-1) \frac{\mu _{l-1}}{\mu _l} e^{2\pi ^2(\mu _l^2 - \mu _1^2)(s_l + m_l)^2} \right) \nonumber \\&\quad \left( 2\pi u_{\max }(s_l+m_l)\varepsilon _{l-1} + E_{l-1}(\varepsilon _{l-1}) {\varepsilon _{l-1} u_{\max }} \right) , \end{aligned}$$
      (3.36)

where in the last inequality, we used assumption 1.

    2. (ii)

      \(\underline{\textit{Bounding }\eta '_{l,i,fut}}\) We have

      $$\begin{aligned} \eta '_{l,i,fut}&= \sum _{q=l+1}^{L} \frac{\mu _q}{\mu _l} \exp (-2\pi ^2(s_l+i)^2(\mu _q^2 - \mu _l^2)) \nonumber \\&\quad \left( \sum _{j=1}^K u_{q,j} \exp (\iota 2\pi (s_l+i)t_{q,j})\right) . \end{aligned}$$

Since \(s_l > m_l\), we have \((s_l+i)^2 \ge (s_l-m_l)^2\) for all \(i=-m_l,\dots ,m_l-1\). This, along with the fact that \(\frac{\mu _q}{\mu _l} \le \frac{\mu _L}{\mu _l}\), gives us

      $$\begin{aligned} |{\eta '_{l,i,fut}}| \le \frac{\mu _L}{\mu _l} Ku_{\max }(L-l) \exp (-2\pi ^2(s_l-m_l)^2(\mu _{l+1}^2 - \mu _l^2)). \end{aligned}$$

      It follows that if \(s_l \ge S_l\) where

      $$\begin{aligned} S_l = m_l + \frac{1}{(2\pi ^2(\mu _{l+1}^2 - \mu _l^2))^{1/2}} \log ^{1/2}\left( {\frac{2 K u_{\mathrm {rel}}(L-l) \mu _L}{\varepsilon _l \mu _l B(u_{\mathrm {rel}},K)}} \right) \end{aligned}$$

      then for \(i=-m_l,\dots ,m_l-1\), we have

      $$\begin{aligned} |{\eta '_{l,i,fut}}| < \varepsilon _l {\frac{u_{\min }B(u_{\mathrm {rel}},K)}{2}}. \end{aligned}$$
      (3.37)
    3. (iii)

      \(\underline{\textit{Back to }\eta '_{l,i,past}}\) We will now find conditions which ensure that the same bound as (3.37) holds on \(|{\eta '_{l,i,past}}|\), uniformly for all i. To this end, since \(s_l \le \widetilde{c} S_l\) and \(m_l < M_{{l},\mathrm {up}}\), we obtain

      $$\begin{aligned} s_l < \widetilde{c} \left( M_{{l},\mathrm {up}} + \frac{1}{(2\pi ^2(\mu _{l+1}^2 - \mu _l^2))^{1/2}} \log ^{1/2}\left( {\frac{2 K u_{\mathrm {rel}}(L-l) \mu _L}{\varepsilon _l \mu _l B(u_{\mathrm {rel}},K)}}\right) \right) . \end{aligned}$$
      (3.38)

      This then gives us the bound

      $$\begin{aligned} {2\pi }(s_l + m_l)&< {2\pi } (\widetilde{c} + 1) M_{{l},\mathrm {up}} \\&\quad + \frac{{2\pi } \widetilde{c}}{(2\pi ^2(\mu _{l+1}^2 - \mu _l^2))^{1/2}} \log ^{1/2}\left( {\frac{2 K u_{\mathrm {rel}}(L-l) \mu _L}{\varepsilon _l \mu _l B(u_{\mathrm {rel}},K)}}\right) \\&= C'_{l,1} + C'_{l,2} \log ^{1/2}\left( \frac{2D_{l}}{\varepsilon _l}\right) = F_{l}(\varepsilon _l), \end{aligned}$$

where we recall the definition of \(F_l\) and the constants \(C_{l,1}', C_{l,2}', D_l > 0\) from (3.20), (3.21). Since \(\varepsilon _l< 1 < D_l\) and \(\varepsilon _l \ge \varepsilon _{l-1}\), we have \({2\pi } (s_l + m_l) < F_{l}(\varepsilon _l) \le F_{l}(\varepsilon _{l-1})\). Using this in (3.36), we obtain

      $$\begin{aligned}&|{\eta '_{l,i,past}}| < \left( K(l-1) \frac{\mu _{l-1}}{\mu _l} {e^{(\mu _l^2 - \mu _1^2) \frac{F^2_l(\varepsilon _l)}{2}}} \right) \nonumber \\&\quad {\left( F_l(\varepsilon _{l-1}) + E_{l-1}(\varepsilon _{l-1}) \right) {u_{\max }\varepsilon _{l-1}}} \end{aligned}$$
      (3.39)

      Therefore if \(\varepsilon _{l-1}\) satisfies the condition

      $$\begin{aligned} {(F_l(\varepsilon _{l-1}) + E_{l-1}(\varepsilon _{l-1}) ) u_{\max }\varepsilon _{l-1} \le \varepsilon _l e^{-(\mu _l^2-\mu _1^2) \frac{F^2_l(\varepsilon _l)}{2}} \frac{u_{\min }\mu _l B(u_{\mathrm {rel}},K)}{2 K (l-1) \mu _{l-1}}} \end{aligned}$$

then \(|{\eta '_{l,i,past}}| < \varepsilon _l {\frac{u_{\min }B(u_{\mathrm {rel}},K)}{2}}\). Together with (3.37), this gives

      $$\begin{aligned} |{\eta '_{l,i}}| \le |{\eta '_{l,i,fut}}| + |{\eta '_{l,i,past}}| < \varepsilon _l {u_{\min }B(u_{\mathrm {rel}},K)}. \end{aligned}$$

      We obtain estimates \({\widehat{t}}_{l,j}\), \(\widehat{u}_{l,j}\), \(j=1,\ldots ,K\) via the MMP method. Invoking Corollary 1, if \(\varepsilon _l < c \triangle _l/2\), then for the stated choice of \(m_l\), there exists a permutation \(\phi _l: [K] \rightarrow [K]\) such that for each \(j=1,\dots ,K\),

      $$\begin{aligned} d_w({\widehat{t}}_{l,\phi _l(j)}, t_{l,j})&\le \varepsilon _l; \quad |{{\widehat{u}}_{l,\phi _l(j)} - u_{l,j}}| \nonumber \\&< \left( \widetilde{C}_l + 2\pi s_l \right) \varepsilon _l {u_{\max }} < E_l(\varepsilon _l) {\varepsilon _{l} u_{\max }}. \end{aligned}$$
      (3.40)

      The last inequality follows readily using (3.38).

  • Recovering Source Parameters for Last Group Say that for each \(p = 1,\dots ,L-1\) and \(j=1,\dots ,K\) the following holds.

    $$\begin{aligned} d_w(\widehat{t}_{p,\phi _p(j)}, t_{p,j}) \le \varepsilon _p, \quad |{{\widehat{u}}_{p,\phi _p(j)} - u_{p,j}}| < E_p(\varepsilon _p) {\varepsilon _p u_{\max }}, \end{aligned}$$
    (3.41)

    for some permutations \(\phi _p: [K] \rightarrow [K]\), with

    1. 1.

      \(\varepsilon _1 \le \cdots \le \varepsilon _{L-1}\); \(E_1(\varepsilon _1) {\varepsilon _1} \le \cdots \le E_{L-1}(\varepsilon _{L-1}) {\varepsilon _{L-1}}\);

    2. 2.

      \(\varepsilon _{p} < c \triangle _p/2\);

    3. 3.

      \((F_{q+1}(\varepsilon _{q}) + E_{q}(\varepsilon _{q})) \varepsilon _{q} u_{\mathrm {rel}}\le \varepsilon _{q+1} e^{-(\mu _{q+1}^2-\mu _1^2)\frac{F_{q+1}^2(\varepsilon _{q+1})}{2}} \frac{\mu _{q+1} B(u_{\mathrm {rel}},K)}{2 K q \mu _{q}}\), \(1 \le q \le L-2\).

    We proceed by noting that for each \(i = -m_L,\dots ,m_L-1\)

    $$\begin{aligned}&\frac{f(s_L + i) - \sum _{p=1}^{L-1} \widehat{f}_p(s_L + i)}{\bar{g}_L(s_L + i)} \\&\quad = \underbrace{\sum _{p=1}^{L-1}\frac{\bar{g}_p(s_L+i)}{\bar{g}_L(s_L + i)} \sum _{j=1}^K[u_{p,j}\exp (\iota 2\pi (s_L+i)t_{p,j}) - \widehat{u}_{p,\phi _p(j)} \exp (\iota 2\pi (s_L+i)\widehat{t}_{p,\phi _p(j)})]}_{\eta '_{L,i}} \\&\qquad + \sum _{j=1}^K \underbrace{u_{L,j} \exp (\iota 2\pi s_L t_{L,j})}_{u'_{L,j}} \exp (\iota 2\pi i t_{L,j}) = \sum _{j=1}^K u'_{L,j} \exp (\iota 2\pi i t_{L,j}) + \eta '_{L,i}. \end{aligned}$$

    Using Proposition 3, we have for each \(p=1,\dots ,L-1\) and \(j=1,\dots ,K\) that

    $$\begin{aligned}&|{u_{p,j} \exp (\iota 2\pi (s_L+i)t_{p,j}) - \widehat{u}_{p,\phi _p(j)} \exp (\iota 2\pi (s_L+i){\widehat{t}}_{p,\phi _p(j)})}| \nonumber \\&\quad \le 2\pi u_{\max }|{s_L + i}| d_w(t_{p,j}, \widehat{t}_{p,\phi _p(j)}) + |{u_{p,j} - \widehat{u}_{p,\phi _p(j)}}| \nonumber \\&\quad < 2\pi u_{\max }|{s_L+i}| \varepsilon _p + E_p(\varepsilon _p) {\varepsilon _p u_{\max }}, \end{aligned}$$
    (3.42)

where the last inequality follows from (3.41). Since \(s_L = 0\), we have \((s_L + i)^2 \le m_L^2 < M_{{L},\mathrm {up}}^2\) for all \(i=-m_L,\dots ,m_L-1\). Using (3.42), we then readily obtain

    $$\begin{aligned} |{\eta '_{L,i}}|&< \sum _{p=1}^{L-1} \left( \left( \frac{\mu _p}{\mu _L} e^{2\pi ^2(\mu _L^2 - \mu _p^2) M_{{L},\mathrm {up}}^2}\right) (2\pi u_{\max }M_{{L},\mathrm {up}}\varepsilon _p + E_p(\varepsilon _p) {\varepsilon _p u_{\max }})K\right) \nonumber \\&\le \left( K \frac{\mu _{L-1}}{\mu _L} e^{2\pi ^2(\mu _L^2 - \mu _1^2)M_{{L},\mathrm {up}}^2} \right) {\sum _{p=1}^{L-1}\left( 2\pi u_{\max }M_{{L},\mathrm {up}}\varepsilon _p + E_p(\varepsilon _p) \varepsilon _p u_{\max }\right) } \nonumber \\&\le \left( K(L-1) \frac{\mu _{L-1}}{\mu _L} e^{2\pi ^2(\mu _L^2 - \mu _1^2)M_{{L},\mathrm {up}}^2} \right) \left( 2\pi u_{\max }M_{{L},\mathrm {up}}\varepsilon _{L-1} \right. \nonumber \\&\quad \left. + E_{L-1}(\varepsilon _{L-1}) {\varepsilon _{L-1} u_{\max }} \right) \end{aligned}$$
    (3.43)

where in the last inequality, we used assumption 1.

We obtain estimates \({\widehat{t}}_{L,j}\), \(\widehat{u}_{L,j}\), \(j=1,\ldots ,K\) via the MMP method. Invoking Corollary 1 and assuming \(\varepsilon _L < c \triangle _L/2\), it follows, under the stated conditions on \(m_L\), that it suffices that \(\varepsilon _{L-1}\) satisfies

    $$\begin{aligned} {(2\pi M_{{L},\mathrm {up}} + E_{L-1}(\varepsilon _{L-1})) \varepsilon _{L-1} u_{\max }\le \varepsilon _L e^{-2\pi ^2(\mu _L^2 - \mu _1^2)M_{{L},\mathrm {up}}^2} \frac{u_{\min }\mu _L B(u_{\mathrm {rel}},K)}{K (L-1) \mu _{L-1}}.} \end{aligned}$$

    Indeed, there then exists a permutation \(\phi _L: [K] \rightarrow [K]\) such that for each \(j=1,\dots ,K\),

    $$\begin{aligned} d_w({\widehat{t}}_{L,\phi _L(j)}, t_{L,j})&\le \varepsilon _L; \quad |{{\widehat{u}}_{L,\phi _L(j)} - u_{L,j}}| < \widetilde{C}_L \varepsilon _L {u_{\max }} = E_L(\varepsilon _L) {\varepsilon _{L} u_{\max }}. \end{aligned}$$
    (3.44)

    This completes the proof.

\(\square \)
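The computational core of each of the three steps above is the same deconvolution: subtract the contributions of the groups already estimated, divide by \(\bar{g}_l\), and pass the resulting \(2m_l\) samples to the MMP routine. Below is a minimal sketch (in Python) of this step; it assumes \(\bar{g}_l(s) = \sqrt{2\pi }\mu _l e^{-2\pi ^2\mu _l^2 s^2}\) (the form used in (4.5)–(4.6)) and treats the MMP solver as a black box: `mmp_estimate` is a hypothetical placeholder, not a routine defined in this paper.

```python
import numpy as np

def g_bar(s, mu_l):
    # Fourier-domain Gaussian kernel, in the form used in (4.5)-(4.6)
    return np.sqrt(2 * np.pi) * mu_l * np.exp(-2 * np.pi**2 * mu_l**2 * s**2)

def deconvolved_samples(f, l, s_l, m_l, mu, t_hat, u_hat):
    """Left-hand side of (3.28) and of its analogue for group l (1-indexed).

    f            : vectorized callable returning the Fourier samples f(s)
    t_hat, u_hat : lists with the estimated locations/amplitudes of groups 1..l-1
    Returns the 2*m_l deconvolved samples that are handed to the MMP routine.
    """
    s = s_l + np.arange(-m_l, m_l)
    residual = np.asarray(f(s), dtype=complex)
    for p in range(l - 1):                        # subtract the estimated f_p(s_l + i)
        phases = np.exp(1j * 2 * np.pi * np.outer(s, t_hat[p]))
        residual -= g_bar(s, mu[p]) * (phases @ u_hat[p])
    return residual / g_bar(s, mu[l - 1])

# The output would then be passed to the MMP solver of Algorithm 2, e.g.
# t_l, u_l = mmp_estimate(deconvolved_samples(f, l, s_l, m_l, mu, t_hat, u_hat), K)
```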

4 Unmixing Gaussians in Fourier Domain: Noisy Case

We now analyze the noisy setting where we acquire noisy values of the Fourier samples f(s) at each sampling location (frequency) s. In particular, at stage p (\(1 \le p \le L\)) in Algorithm 2, and frequency s, let \(w_p(s)\) denote the additive observation noise on the clean Fourier sample f(s). Denoting the noisy measurement by \(\widetilde{f}(s)\), this means that at stage p,

$$\begin{aligned} \widetilde{f}(s)&= f(s) + w_p(s) \\&= \sum _{l=1}^{L} \bar{g}_l(s) \left( \sum _{j=1}^K u_{l,j} \exp (\iota 2\pi s t_{l,j})\right) + w_p(s). \end{aligned}$$

In addition to the terms defined at the beginning of Sect. 3.2, we will need an additional term (defined below) which will be used in the statement of our theorem.

$$\begin{aligned} F'_l(\varepsilon )&:= C_{l,1}' + C_{l,2}' \log ^{1/2}\left( \frac{3 D_l}{\varepsilon }\right) ; \quad l = 2,\dots ,L-1, \end{aligned}$$
(4.1)

where \(C_{l,1}'\), \(C_{l,2}'\) are as defined in (3.21).

Theorem 6

For a constant \(c \in (0,1)\), let \(0< \varepsilon _L < c\triangle _L/2\), \(\frac{2}{\triangle _L - 2\varepsilon _L} \le m_L < M_{{L},\mathrm {up}}\) and \(s_L = 0\). Moreover, for \(l = L-1, \dots , 1\), say we choose \(\varepsilon _l,m_l,s_l\) as follows.

  1. 1.

    \(0< \varepsilon _l < c\triangle _l/2\) additionally satisfies the following conditions.

    1. (a)

      \((2\pi M_{{L},\mathrm {up}} + E_{L-1}(\varepsilon _{L-1})) \varepsilon _{L-1} u_{\mathrm {rel}}\le \varepsilon _{L} e^{-2\pi ^2(\mu _{L}^2-\mu _1^2)M_{{L},\mathrm {up}}^2} \frac{\mu _{L} B(u_{\mathrm {rel}},K)}{2 K (L-1) \mu _{L-1}}\).

    2. (b)

      If \(l < L-1\), then \(\varepsilon _{l} \le \varepsilon _{l+1}\), \(E_l(\varepsilon _{l}) \varepsilon _{l} \le E_{l+1}(\varepsilon _{l+1}) \varepsilon _{l+1}\) and

      $$\begin{aligned} (F'_{l+1}(\varepsilon _{l}) + E_{l}(\varepsilon _{l})) \varepsilon _{l} u_{\mathrm {rel}}\le \varepsilon _{l+1} e^{-(\mu _{l+1}^2-\mu _1^2)\frac{F_{l+1}^{'^2}(\varepsilon _{l+1})}{2}} \frac{\mu _{l+1} B(u_{\mathrm {rel}},K)}{3 K l \mu _{l}}. \end{aligned}$$
  2. 2.

    \(\frac{2}{\triangle _l - 2\varepsilon _l} \le m_l < M_{{l},\mathrm {up}}\), and \(S_l \le s_l \le \widetilde{c} S_l\) (for constant \(\widetilde{c} > 1\)) where

    $$\begin{aligned} S_l = m_l + \frac{1}{(2\pi ^2(\mu _{l+1}^2 - \mu _l^2))^{1/2}} \log ^{1/2}\left( {\frac{b_l K (L-l) \mu _L u_{\mathrm {rel}}}{\varepsilon _l \mu _l B(u_{\mathrm {rel}},K)}} \right) , \end{aligned}$$

    with \(b_l = 2\) if \(l = 1\), and \(b_l = 3\) otherwise.

Assume that the noise satisfies the conditions

$$\begin{aligned} \left\| \frac{w_1(\cdot )}{\bar{g}_1(s_1+\cdot )}\right\| _{\infty }&\le \varepsilon _1 {\frac{u_{\min }B(u_{\mathrm {rel}},K)}{2}}, \end{aligned}$$
(4.2)
$$\begin{aligned} \left\| \frac{w_l(\cdot )}{\bar{g}_l(s_l+\cdot )}\right\| _{\infty }&\le \varepsilon _l {\frac{u_{\min }B(u_{\mathrm {rel}},K)}{3}}, \quad l=2,\ldots ,L-1, \quad \text {and} \end{aligned}$$
(4.3)
$$\begin{aligned} \left\| \frac{w_L(\cdot )}{\bar{g}_L(s_L+\cdot )}\right\| _{\infty }&\le \varepsilon _L {\frac{u_{\min }B(u_{\mathrm {rel}},K)}{2}}. \end{aligned}$$
(4.4)

Then, for each \(l=1,\dots ,L\), there exists a permutation \(\phi _l: [K] \rightarrow [K]\) such that

$$\begin{aligned} d_w({\widehat{t}}_{l,\phi _l(j)}, t_{l,j})&\le \varepsilon _l, \quad |{{\widehat{u}}_{l,\phi _l(j)} - u_{l,j}}| < E_l(\varepsilon _l) {\varepsilon _l u_{\max }}; \quad j=1,\dots ,K. \end{aligned}$$

Since the organisation and the arguments of the proof of Theorem 6 are almost identical to those of Theorem 5, we defer this proof to Appendix D, where we pinpoint the main differences between the noiseless and the noisy settings.

Interpreting Theorem 6 Theorem 6 is almost the same as Theorem 5 barring the conditions on the magnitude of external noise (and minor differences in some constants). Specifically, conditions (4.2)–(4.4) state that at stage l, the magnitude of the noise should be small relative to the desired accuracy parameter \(\varepsilon _l\). This is examined in more detail below. For convenience, we will now assume \(u_{\min },u_{\max }\asymp 1\).

  1. 1.

    (Condition on \(\Vert {w_l(\cdot )}\Vert _{\infty }\) for \(1 \le l \le L-1\)) Let us start with the case \(2 \le l \le L-1\). Condition (4.3) states that

    $$\begin{aligned} \left| \frac{w_l(i)}{\bar{g}_l(s_l+i)} \right| \lesssim \frac{\varepsilon _l}{\sqrt{K}}; \quad i=-m_l,\dots ,m_l-1. \end{aligned}$$
    (4.5)

Now, \(|{\frac{w_l(i)}{\bar{g}_l(s_l+i)}}| = \frac{|{w_l(i)}|}{\sqrt{2\pi }\mu _l} e^{2\pi ^2(s_l + i)^2 \mu _l^2}\). Since \(s_l > m_l\), we have \(s_l + i > 0\) for the given range of i. Hence, \((s_l+i)^2< (s_l+m_l)^2 < F_l^{'^2}(\varepsilon _l)\) for each i. Since \(F_l^{'}(\varepsilon ) \asymp F_l(\varepsilon )\), using the order wise dependency from (3.24) we obtain for each i that

    $$\begin{aligned} \frac{|{w_l(i)}|}{\sqrt{2\pi }\mu _l} e^{2\pi ^2(s_l + i)^2 \mu _l^2} \lesssim \frac{|{w_l(i)}|}{\mu _l}\left( \frac{K^{3/2}L \mu _L}{\varepsilon _l\mu _l}\right) ^{C(\mu _l,\mu _{l+1},\triangle _l)}, \end{aligned}$$
    (4.6)

    where \(C(\mu _l,\mu _{l+1},\triangle _l) > 0\) depends only on \(\mu _l,\mu _{l+1},\triangle _l\). Hence from (4.5), (4.6), we see that (4.3) is satisfied if

    $$\begin{aligned} \Vert {w_l}\Vert _{\infty } \lesssim \frac{(\varepsilon _l\mu _l)^{1+C(\mu _l,\mu _{l+1},\triangle _l)}}{\sqrt{K} (K^{3/2} L \mu _L)^{C(\mu _l,\mu _{l+1},\triangle _l)}}; \quad 2 \le l \le L-1. \end{aligned}$$
    (4.7)

    In a similar manner, one can easily show that (4.2) is satisfied if

    $$\begin{aligned} \Vert {w_1}\Vert _{\infty } \lesssim \frac{(\varepsilon _1\mu _1)^{1+C(\mu _1,\mu _{2},\triangle _1)}}{\sqrt{K} (K^{3/2} L \mu _L)^{C(\mu _1,\mu _{2},\triangle _1)}}. \end{aligned}$$
    (4.8)
  2. 2.

(Condition on \(\Vert {w_L}\Vert _{\infty }\)) In this case, \(s_L = 0\) and so \((s_L + i)^2 \le m_L^2\). Therefore, for each \(i=-m_L,\dots ,m_L-1\), we obtain

    $$\begin{aligned} \left| \frac{w_L(i)}{\bar{g}_L(s_L+i)} \right| = \frac{|{w_L(i)}|}{\sqrt{2\pi }\mu _L} e^{2\pi ^2(i)^2 \mu _L^2} \lesssim \frac{|{w_L(i)}|}{\mu _L} e^{2\pi ^2 \mu _L^2/\triangle _L^2}. \end{aligned}$$
    (4.9)

    Hence from (4.9), we see that (4.4) is satisfied if

    $$\begin{aligned} \Vert {w_L}\Vert _{\infty } \lesssim \mu _L e^{-\mu _L^2/\triangle _L^2} \frac{\varepsilon _L}{\sqrt{K}}. \end{aligned}$$
    (4.10)

Conditions (4.8), (4.7) and (4.10) give the noise levels that can be tolerated at stages \(l=1\), \(2 \le l \le L-1\) and \(l = L\), respectively. From the discussion following Theorem 5, we know that the \(\varepsilon _i\)’s become progressively smaller as we move from \(i=L\) to \(i=1\) (with \(\varepsilon _1\) being the smallest). Therefore the condition on \(\Vert {w_1}\Vert _{\infty }\) is the strictest, while the condition on \(\Vert {w_L}\Vert _{\infty }\) is the mildest.

Corollary for the Case \(L = 1\) As noted earlier, the case \(L = 1\) is not interesting in the absence of external noise since we can exactly recover the source parameters. The situation is more interesting in the presence of noise, as shown in the following corollary of Theorem 6.

Corollary 2

For a constant \(c \in (0,1)\), let \(0< \varepsilon _1 < c\triangle _1/2\), \(\frac{2}{\triangle _1 - 2\varepsilon _1} \le m_1 < \frac{2}{\triangle _1(1-c)} + 1\) and \(s_1 = 0\). Moreover, denoting \(C = 10 + \frac{1}{\sqrt{2}}\), assume that the noise satisfies

$$\begin{aligned} \left\| \frac{w_1(\cdot )}{\bar{g}_1(\cdot )}\right\| _{\infty }&\le \varepsilon _1 \frac{u_{\min }}{10 C\sqrt{K}} \left( 1 + 48 {u_{\mathrm {rel}}} \right) ^{-1}. \end{aligned}$$
(4.11)

Then, there exists a permutation \(\phi _1: [K] \rightarrow [K]\) such that for \(j=1,\dots ,K\), we have

$$\begin{aligned} d_w({\widehat{t}}_{1,\phi _1(j)}, t_{1,j})&\le \varepsilon _1, \\ |{{\widehat{u}}_{1,\phi _1(j)} - u_{1,j}}|&< {\left( 4\pi K \left( \frac{2}{\triangle _1(1-c)}+1 \right) + \frac{2}{C\sqrt{K}}\left( u_{\mathrm {rel}}+ 16u_{\mathrm {rel}}^2\right) ^{-1} \right) \varepsilon _1 u_{\max }.} \end{aligned}$$

Assuming \(u_{\min },u_{\max }\asymp 1\), we see from (4.10) that (4.11) is satisfied if \(\Vert {w_1}\Vert _{\infty } \lesssim \mu _1 e^{-\mu _1^2/\triangle _1^2} \frac{\varepsilon _1}{\sqrt{K}}\).

Fig. 3 Scatter plots of the maximum wrap around error (\(d_{w,l,\max }\)) vs. the minimum separation (\(\triangle _l\)) over 400 Monte Carlo trials, with no external noise. This is shown for \(K \in \left\{ {2,3,4,5}\right\} \) with \(L = 4\) and \(C = 0.6\). For each sub-plot, we mention the percentage of trials with \(d_{w,l,\max } \le 0.05\) in parentheses

Fig. 4 Scatter plots of the mean wrap around error (\(d_{w,l,\text {avg}}\)) vs. the minimum separation (\(\triangle _l\)) over 400 Monte Carlo trials, with no external noise. This is shown for \(K \in \left\{ {2,3,4,5}\right\} \) with \(L = 4\) and \(C = 0.6\). For each sub-plot, we mention the percentage of trials with \(d_{w,l,\text {avg}} \le 0.05\) in parentheses

Fig. 5 Scatter plots of the maximum wrap around error (\(d_{w,l,\max }\)) vs. the minimum separation (\(\triangle _l\)) over 400 Monte Carlo trials, with external Gaussian noise (standard deviation \(5 \times 10^{-5}\)). This is shown for \(K \in \left\{ {2,3,4,5}\right\} \) with \(L = 4\) and \(C = 0.6\). For each sub-plot, we mention the percentage of trials with \(d_{w,l,\max } \le 0.05\) in parentheses

Fig. 6 Scatter plots of the mean wrap around error (\(d_{w,l,\text {avg}}\)) vs. the minimum separation (\(\triangle _l\)) over 400 Monte Carlo trials, with external Gaussian noise (standard deviation \(5 \times 10^{-5}\)). This is shown for \(K \in \left\{ {2,3,4,5}\right\} \) with \(L = 4\) and \(C = 0.6\). For each sub-plot, we mention the percentage of trials with \(d_{w,l,\text {avg}} \le 0.05\) in parentheses

5 Experiments

In this section, we present some numerical experiments for our method pertaining to the error in the recovery of the locations of the spikes. Our setup is as follows. We fix \(L = 4\) groups, and consider \(K \in \left\{ {2,3,4,5}\right\} \). We fix the variance parameters of the Gaussian kernels to be \(\mu _L = 0.01\) and \(\mu _l = \mu _{l+1}/2\) for \(l=1,\dots ,L-1\). The minimum separation parameter \(\triangle \) is set to 0.05 and \(m_l\) is fixed to \(1/\triangle + 5\) for each \(l = 1,\dots ,L\). Furthermore, we fix the sampling parameters as \(s_L = 0\), \(\varepsilon _L = 0.01\), and for each \(l=1,\dots ,L-1\) choose

$$\begin{aligned} \varepsilon _l = \varepsilon _{l+1}^2, \quad s_l = m_l + \frac{C}{\sqrt{2\pi ^2 (\mu _{l+1}^2 - \mu _{l}^2)}}\log ^{1/2}\left( \frac{\mu _L}{\mu _l \varepsilon _l}\right) \end{aligned}$$
(5.1)

for a suitably chosen constant \(C > 0\). This is in line with our theory since \(\varepsilon _l\) is smaller than \(\varepsilon _{l+1}\) and \(s_l\) too is of the form specified in Theorem 5.
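For concreteness, the schedule (5.1) with the values stated above can be written out as the following short sketch (in Python); the variable names are ours.

```python
import numpy as np

# Parameter schedule (5.1) with the values used in this section (illustrative).
L, Delta, C = 4, 0.05, 0.6
mu = 0.01 / 2 ** np.arange(L - 1, -1, -1)      # mu_L = 0.01, mu_l = mu_{l+1} / 2
m = np.full(L, round(1 / Delta) + 5)           # m_l = 1/Delta + 5 for every group
eps, s = np.zeros(L), np.zeros(L)
eps[-1], s[-1] = 0.01, 0.0                     # eps_L = 0.01, s_L = 0
for l in range(L - 2, -1, -1):                 # groups L-1, ..., 1 (0-indexed here)
    eps[l] = eps[l + 1] ** 2
    s[l] = m[l] + (C / np.sqrt(2 * np.pi**2 * (mu[l + 1]**2 - mu[l]**2))
                   * np.sqrt(np.log(mu[-1] / (mu[l] * eps[l]))))
```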

In each Monte Carlo run, we choose \(K \in \left\{ {2,3,4,5}\right\} \), and then randomly generate K spikes for each \(l = 1,\dots ,L\). In particular, the amplitudes of the spikes are generated by first uniformly sampling values in \([u_{\min },u_{\max }]\) with \(u_{\min }= 3\) and \(u_{\max }= 10\), and then randomly assigning each one of them a negative sign with probability 1/2. Moreover, the spike locations are sampled uniformly at random in (0, 1), with the minimum separation \(\triangle _l\) of each group ensured to be at least \(\triangle \). Thereafter, each Fourier sample (for every group) is corrupted with zero-mean i.i.d. Gaussian noise. For each group l, upon obtaining the estimated spike locations \(\widehat{t}_{l,j}\) and amplitudes \(\widehat{u}_{l,j}\) for \(j=1,\dots ,K\) (via Algorithm 2), we “match” the estimated set of spikes with the input spikes based on a simple heuristic (sketched below). We first find the estimated spike location that has the smallest wrap around distance from an input spike location—this gives us a match. This pair is then removed, and we repeat the process on the remaining sets of spikes. This finally gives us a permutation \(\phi :[K] \rightarrow [K]\) where \(t_{l,j}\) would ideally be close to \(\widehat{t}_{l,\phi (j)}\). Finally, we evaluate the performance of our algorithm by examining (a) the maximum wrap around distance \(d_{w,l,\max } := \max _{j} d_w(t_{l,j}, \widehat{t}_{l,\phi (j)})\), and (b) the average wrap around distance \(d_{w,l,\text {avg}}:= (1/K) \sum _{j=1}^K d_w(t_{l,j}, \widehat{t}_{l,\phi (j)})\). This is repeated over 400 Monte Carlo trials.
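The matching heuristic and the two error metrics can be summarized by the following sketch (in Python), assuming \(d_w\) is the usual wrap-around distance on [0, 1); the function names are ours.

```python
import numpy as np

def d_w(a, b):
    # wrap-around distance on [0, 1) (assumed form of d_w)
    d = np.abs(a - b) % 1.0
    return np.minimum(d, 1.0 - d)

def match_and_score(t_true, t_est):
    """Greedy matching heuristic described above, plus the two error metrics.

    Repeatedly pairs the closest remaining (true, estimated) locations in
    wrap-around distance, removes the pair, and returns (d_max, d_avg).
    """
    t_true, t_est = np.asarray(t_true, float), np.asarray(t_est, float)
    rem_true, rem_est = list(range(len(t_true))), list(range(len(t_est)))
    errs = []
    while rem_true and rem_est:
        D = d_w(t_true[rem_true][:, None], t_est[rem_est][None, :])
        i, j = np.unravel_index(np.argmin(D), D.shape)
        errs.append(D[i, j])
        rem_true.pop(i)
        rem_est.pop(j)
    errs = np.array(errs)
    return errs.max(), errs.mean()   # d_{w,l,max}, d_{w,l,avg}
```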

Figures 3 and 4 show scatter plots of the above notions of error versus the minimum separation \(\triangle _l\) for each group, in the absence of external noise. Figures 5 and 6 show the same, but with external Gaussian noise (standard deviation \(5 \times 10^{-5}\)). Both sets of plots are for \(C = 0.6\) in (5.1); we found this choice to give the best results. We observe that for \(l=1,2\), the spike locations are recovered near exactly in all the trials and for all choices of K—both in the noiseless and noisy settings. For \(l=3,4\), the performance is reasonably good for \(K=2,3\). In particular, for both the noiseless and noisy settings, \(d_{w,l,\max }\) is less than 0.05 in at least \(82\%\) of the trials while \(d_{w,l,\text {avg}}\) is less than 0.05 in at least \(93\%\) of trials. While the performance is seen to drop as K increases—especially in terms of \(d_{w,l,\max }\)—the performance in terms of \(d_{w,l,\text {avg}}\) is still significantly better than \(d_{w,l,\max }\). In particular, for \(K = 4\), \(d_{w,l,\text {avg}}\) is less than 0.05 in at least \(86\%\) of the trials (in both noiseless and noisy settings), while for \(K = 5\), the same is true in at least \(73\%\) of the trials.

As mentioned earlier, we found the choice \(C = 0.6\) to give the best performance in general. In the noiseless scenario, however, the choice \(C = 1\) results in near exact recovery for all groups, as shown in Appendix E (see Figs. 7, 8). However, in the presence of external Gaussian noise of standard deviation \(5 \times 10^{-5}\), the recovery performance breaks down for groups \(l=2,3,4\) (see Figs. 9, 10). Since we are sampling relatively deeper in the Fourier tail (compared to when \(C = 0.6\)), the deconvolution step blows up the noise significantly, leading to worse recovery performance. Finally, in the noiseless case, we observed that the recovery performance breaks down for larger values of C (i.e., for \(C > 4\)) due to numerical errors creeping into the deconvolution step.

6 Discussion and Concluding Remarks

We now compare with closely related work and conclude with directions for future research.

6.1 Related Work on the Multi-kernel Unmixing Super-Resolution Problem

Despite its natural role in many practical problems, the study of super-resolution in the presence of multiple kernels has not attracted much theoretical activity until recently. In [35], the authors introduce an interesting variant of the atomic norm approach to sparse measure reconstruction. They prove that the resulting convex optimization problem recovers the original measure in the noiseless case, and they provide an interesting error bound in the noisy setting. The assumptions underlying this latter result are a standard separation assumption on the spike locations and a uniform random prior on the Fourier coefficients (when considered to lie in \(\mathbb R/\mathbb Z\)) of the point spread functions. In comparison, our assumptions are quite different. In particular, we do not make any assumption about the randomness of the Fourier coefficients of the point spread function. Moreover, we use Moitra’s Modified MP method instead of the atomic norm penalization considered in [35]. As a main benefit of our approach, we do not need any hyper-parameter tuning when the signal is sufficiently large compared to the noise level.

Another interesting work on multi-kernel super-resolution is the technique developed in [51], where the setting is very close to the one of the present paper. A set of relevant modifications of the LASSO estimator and Matching Pursuit method, combined with post-processing techniques, are proposed in [51] and shown to perform well on real datasets. However, to the best of our knowledge, the practical value of these methods is not rigorously supported by theoretical results.

6.2 Future Directions

In this paper, we provide a simple and intuitive algorithm for multi-kernel super-resolution, together with strong theoretical guarantees for our approach. There are several directions for extending the results in this paper; we list two of them below.

Firstly, our analysis assumes that the kernel variance parameters (i.e., \(\mu _l\)) are known exactly. In general, the analysis can be extended to the case where upper and lower estimates are available for each \(\mu _l\). Hence one could consider estimating the variance terms, and using these estimates with our algorithm. The choice of the method for estimating \((\mu _l)_l\) should be investigated with great care. One possible avenue is to use Lepski’s method [25] and its many recent variants and improvements (see for e.g., [14]).

Next, it is natural to extend the techniques developed here to the multi-dimensional setting, as multivariate signals are of high importance in many applications such as DNA sequencing and mass spectrometry. This could be investigated using, for instance, multivariate Prony-like methods such as in [2, 15, 21, 28, 32, 42, 47]. Another interesting avenue is the one discovered in [4], where refined bounds on the condition number of Vandermonde matrices are devised. It would be of particular interest to understand how such bounds could be employed in our framework in order to accommodate separation conditions below the threshold discovered by Moitra [41].