Support Recovery for Sparse Super-Resolution of Positive Measures

Denoyelle, Quentin; Duval, Vincent; Peyré, Gabriel

doi:10.1007/s00041-016-9502-x

Support Recovery for Sparse Super-Resolution of Positive Measures

Published: 26 September 2016

Volume 23, pages 1153–1194, (2017)
Cite this article

Download PDF

Access provided by Autonomous University of Puebla

Journal of Fourier Analysis and Applications Aims and scope Submit manuscript

Support Recovery for Sparse Super-Resolution of Positive Measures

Download PDF

Quentin Denoyelle^1,2,
Vincent Duval² &
Gabriel Peyré^2,3

825 Accesses
60 Citations
3 Altmetric
Explore all metrics

Abstract

We study sparse spikes super-resolution over the space of Radon measures on $\mathbb {R}$ or $\mathbb {T}$ when the input measure is a finite sum of positive Dirac masses using the BLASSO convex program. We focus on the recovery properties of the support and the amplitudes of the initial measure in the presence of noise as a function of the minimum separation t of the input measure (the minimum distance between two spikes). We show that when ${w}/\lambda $, ${w}/t^{2N-1}$ and $\lambda /t^{2N-1}$ are small enough (where $\lambda $ is the regularization parameter, w the noise and N the number of spikes), which corresponds roughly to a sufficient signal-to-noise ratio and a noise level small enough with respect to the minimum separation, there exists a unique solution to the BLASSO program with exactly the same number of spikes as the original measure. We show that the amplitudes and positions of the spikes of the solution both converge toward those of the input measure when the noise and the regularization parameter drops to zero faster than $t^{2N-1}$.

Exact Support Recovery for Sparse Spikes Deconvolution

Article 09 October 2014

Point Source Super-resolution Via Non-convex $L_1$ Based Methods

Article 28 January 2016

A Theory of Super-Resolution from Short-Time Fourier Transform Measurements

Article 05 April 2017

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

1 Introduction

1.1 Super-Resolution and Sparse Spikes Deconvolution

Super-resolution consists in retrieving fine scale details of a possibly noisy signal from coarse scale information. The importance of recovering the high frequencies of a signal comes from the fact that there is often a physical blur in the acquisition process, such as diffraction in optical systems, wave reflection in seismic imaging or spikes recording from neuronal activity.

In resolution theory, the two-point resolution criterion defines the ability of a system to resolve two points of equal intensities. As a point source produces a diffraction pattern which is centered about the geometrical image point of the point source, it is often admitted that two points are resolved by a system if the central maximum of the intensity diffraction of one point source coincides with the first zero of the intensity diffraction pattern of the other point. This defines a distance that only depends on the system and which is called the Rayleigh Length. In the case of the ideal low-pass filter (meaning that the input signal is convolved with the Dirichlet kernel, see (9) for the exact definition) with cutoff frequency $f_c$, the Rayleigh Length is $1/f_c$. We refer to [11] for more details about resolution theory. Super-resolution in signal processing thus consists in developing techniques that enable to retrieve information below the Rayleigh Length.

Let us introduce in a more formal way the problem which will be the core of this article. Let $X$ be the real line $\mathbb {R}$ or the 1-D torus $\mathbb {T}=\mathbb {R}/\mathbb {Z}$ and $\mathcal {M}(X)$ the Banach space of bounded Radon measures on $X$, which can be seen as the topological dual of the space $\mathscr {C}_{X}$ where $\mathscr {C}_{X}$ is either the space of continuous functions on $\mathbb {R}$ that vanish at infinity when $X=\mathbb {R}$ or the space of continuous functions on $\mathbb {T}$ when $X=\mathbb {T}$. We consider a given integral operator $\Phi :\mathcal {M}(X)\rightarrow \mathcal {H}$, where $\mathcal {H}$ is a separable Hilbert space, whose kernel $\varphi $ is supposed to be a smooth function (see Definition 1 for the technical assumptions made on $\varphi $), i.e.

$$\begin{aligned} \forall m\in \mathcal {M}(X),\quad \Phi m=\int _X\varphi (x)\mathrm {d}m(x). \end{aligned}$$

(1)

$\Phi $ represents the acquisition operator and can for instance account for a blur in the measurements. In the special case of $\varphi (x)=\tilde{\varphi }(\cdot -x)$, $\Phi $ is a convolution operator. We denote by $m_{a_{0},tz_0}=\sum _{i=1}^{N} a_{0,i}\delta _{tz_{0,i}}$ our input sparse spikes train where the $a_{0,i}\in \mathbb {R}_+^*$ are the amplitudes of the Dirac masses at positions $tz_{0,i}\in X$. Let $y_t=\Phi m_{a_{0},tz_0}$ be the noiseless observation. The parameter $t>0$ controls the minimum separation distance between the spikes, and this paper aims at studying the recovery of $m_{a_{0},tz_0}$ from $y_t+w$ (where $w\in \mathcal {H}$ is some noise) when t is small.

1.2 From the LASSO to the BLASSO

1.2.1 LASSO

$\ell ^1$ regularization techniques were first introduced in geophysics (see [6, 18, 21]) for seismic prospecting. Indeed, the density changes in the underground can be modeled as a sparse spikes train. $\ell ^1$ reconstruction property provides solutions with few nonzero coefficients and can be solved efficiently with convex optimization methods. Donoho theoretically studied and justified these techniques in [12]. In statistics, the $\ell ^1$ norm is used in the Lasso method [23] which consists in minimizing a quadratic error subject to a $\ell ^1$ penalization. As the authors remarked it, it retains both the features of subset selection (by setting to zero some coefficients, thanks to the property of the $\ell ^1$ norm to favor sparse solutions) and ridge regression (by shrinking the other coefficients). In signal processing, the basis pursuit method [13] uses the $\ell ^1$ norm to decompose signals into overcomplete dictionaries.

1.2.2 BLASSO

Following recent works (see for instance [1, 3, 5, 8, 14]), the sparse deconvolution method that we consider in this article operates over a continuous domain, i.e. without resorting to some sort of discretization on a grid. The inverse problem is solved over the space of Radon measures which is a non-reflexive Banach space. This continuous “grid free” setting makes the mathematical analysis easier and allows us to make precise statement about the location of the recovered spikes locations.

The technique that we study in this paper consists in solving a convex optimization problem that uses the total variation norm which is the equivalent of the $\ell ^1$ norm for measures. The $\ell ^1$ norm is known to be particularly well fitted for the recovery of sparse signals. Thus the use of the total variation norm favors the emergence of spikes in the solution.

The total variation norm is defined by

$$\begin{aligned} \forall m \in \mathcal {M}(X), \quad |m|(X) \mathop = \limits ^{\mathrm{def}.}\underset{\psi \in \mathscr {C}_{X}}{\sup }\; \left\{ \int _X\psi \mathrm {d}m \;;\; \left\| \psi \right\| _{\text {L}^\infty (X)} \leqslant 1 \right\} . \end{aligned}$$

In particular,

$$\begin{aligned} |m_{a_{0},tz_0}|(X)=\left\| a_{0} \right\| _{1}, \end{aligned}$$

which shows in a way that the total variation norm generalizes the $\ell ^1$ norm to the continuous setting of measures (i.e. no discretization grid is required).

The first method that we are interested in is the classical basis pursuit, defined originally in [13] in a finite dimensional setting, and written here over the space of Radon measures

$$\begin{aligned} \underset{m \in \mathcal {M}(X)}{\min }\; \left\{ |m|(X) \;;\; \Phi m=y_t \right\} . \qquad ({\mathcal {P}_0(y_t)}) \end{aligned}$$

This is the problem studied in [5], in the case where $\Phi $ is an ideal low-pass filter on the torus (i.e. $X=\mathbb {T}$).

When the signal is noisy, i.e. when we observe $y_t+w$ instead, with $w\in \mathcal {H}$, we may rather consider the problem

$$\begin{aligned} \underset{m \in \mathcal {M}(X)}{\min }\; \frac{1}{2} \left\| \Phi m - (y_t+w) \right\| _{\mathcal {H}}^2 + \lambda |m|(X). \qquad ({\mathcal {P}_{\lambda }(y_t+w)}) \end{aligned}$$

Here $\lambda >0$ is a parameter that should adapted to the noise level $\left\| w \right\| _{\mathcal {H}}$. This problem is coined “BLASSO” in [8].

While this is not the focus of this article, we note that there exist algorithms to solve the infinite dimensional convex problems $({\mathcal {P}_0(y_t)})$ and $({\mathcal {P}_{\lambda }(y_t+w)})$, see for instance [3, 5].

1.3 Previous Works

1.3.1 LASSO/BLASSO Performance Analysis

In order to quantify the recovery performance of the methods $\mathcal {P}_0(y_t)$ and $\mathcal {P}_{\lambda }(y_t+w)$, the following two questions arise:

1.
Does the solutions of $\mathcal {P}_0(y_t)$ recover the input measure $m_{a_{0},tz_0}$?
2.
How close is the solution of $\mathcal {P}_{\lambda }(y_t+w)$ to the solution of $\mathcal {P}_0(y_t)$?

When the amplitudes of the spikes are arbitrary complex numbers, the answers to the above questions require a large enough minimum separation distance $\Delta (tz_0)$ between the spikes where

$$\begin{aligned} \Delta (tz_0) \mathop = \limits ^{\mathrm{def}.}\underset{i \ne j}{\min }\; d_X(tz_{0,i},t z_{0,j}). \end{aligned}$$

(2)

where $d_X$ is either the canonical distance on $\mathbb {R}$ i.e.

$$\begin{aligned} \forall x,y\in \mathbb {R}, \quad d_X(x,y)=|x-y|, \end{aligned}$$

(3)

when $X=\mathbb {R}$, or the canonical induced distance on $\mathbb {T}$ i.e.

$$\begin{aligned} \forall x,y\in \mathbb {R}, \quad d_X(x+\mathbb {Z},y+\mathbb {Z})=\min _{k\in \mathbb {Z}} |x-y+k|, \end{aligned}$$

(4)

when $X=\mathbb {T}$. The first question is addressed in [5] where the authors showed, in the case of $\Phi $ being the ideal low-pass filter on the torus [see (9)], i.e. when $\mathcal {H}=\mathbb {T}$, that $m_{a_{0},tz_0}$ is the unique solution of $\mathcal {P}_0(y_t)$ provided that $\Delta (tz_0)\geqslant \frac{C}{f_c}$ where $C>0$ is a universal constant and $f_c$ the cutoff frequency of the ideal low-pass filter. In the same paper, it is shown that $C\leqslant 2$ when $a_{0}\in \mathbb {C}^N$ and $C\leqslant 1.87$ when $a_{0}\in \mathbb {R}^N$. In [14], the authors show that necessarily $C\geqslant \frac{1}{2}$.

The second question receives partial answers in [3, 4, 9, 15]. In [3], it is shown that if the solution of $\mathcal {P}_0(y_t)$ is unique then the measures recovered by $\mathcal {P}_{\lambda }(y_t+w)$ converge in the weak-* sense to the solution of $\mathcal {P}_0(y_t)$ when $\lambda \rightarrow 0$ and $\left\| w \right\| _{\mathcal {H}}/\lambda \rightarrow 0$. In [4], the authors measure the reconstruction error using the $\text {L}^2$ norm of an ideal low-pass filtered version of the recovered measures. In [9, 15], error bounds are given on the locations of the recovered spikes with respect to those of the input measure $m_{a_{0},tz_0}$. However, those works provide little information about the structure of the measures recovered by $\mathcal {P}_{\lambda }(y_t+w)$. That point is addressed in [14] where the authors show that under the Non Degenerate Source Condition (see Sect. 2 for more details), there exists a unique solution to $\mathcal {P}_{\lambda }(y_t+w)$ with the exact same number of spikes as the original measure provided that $\lambda $ and $\left\| w \right\| _{\mathcal {H}}/\lambda $ are small enough. Moreover in that regime, this solution converges to the original measure when the noise drops to zero.

1.3.2 LASSO/BLASSO for Positive Spikes

For positive spikes (i.e. $a_{0,i}>0$), the picture is radically different, since exact recovery of $m_{a_{0},tz_0}$ without noise (i.e. $(w,\lambda )=(0,0)$) holds for all $t>0$, see for instance [8]. Stability constants however explode as $t \rightarrow 0$. A recent work [20] shows however that stable recovery is obtained if the signal-to-noise ratio grows faster than $O(1/t^{2N})$, closely matching optimal lower bounds of $O(1/t^{2N-1})$ obtained by combinatorial methods, as also proved recently [10]. Our main contribution is to show that the same $O(1/t^{2N-1})$ signal-to-noise scaling in fact guarantees a perfect support recovery of the spikes under a certain non-degeneracy condition on the filter. This extends, for positive measures, the initial results of [14] by providing an asymptotic analysis when $t \rightarrow 0$.

1.3.3 MUSIC and Related Methods

There is a large body of literature in signal processing on spectral methods to perform spikes location from low frequency measurements. One of the most popular methods is MUSIC (for multiple signal classification) [22] and we refer to [16] for an overview of its use in signal processing for line spectral estimation. In the noiseless case, exact reconstruction of the initial signal is guaranteed as long as there are enough observations compared to the number of distinct frequencies [19]. Stability to noise is known to hold under a minimum separation distance similar to the one of the BLASSO [19]. However, on sharp contrast with the behavior of the BLASSO, numerical observations (see for instance [7]), as well as a recent work of Demanet and Nguyen, show that this stability continues to hold regardless of the sign of the amplitudes $a_{0,i}$, as soon as the signal-to-noise ratio scales like $O(1/t^{2N-1})$. Note that this matches (when w is a Gaussian white noise) the Cramer-Rao lower bound achievable by any unbiased estimator [2]. This class of methods are thus more efficient than BLASSO for arbitrary measures, but they are restricted to operators $\Phi $ that are convolution with a low-pass filter, which is not the case of our analysis for the BLASSO.

1.4 Contributions

1.4.1 Main Results

From these previous works, one can ask whether exact support estimation by BLASSO for positive spikes is achievable when t tends to 0. Our main result, Theorem 2, shows that this is indeed the case. It states, under some non-degeneracy condition on $\Phi $, that there exists a unique solution to $\mathcal {P}_{\lambda }(y_t+w)$ with the exact same number of spikes as the original measure provided that $\left\| w \right\| _{\mathcal {H}}/\lambda $, $\left\| w \right\| _{\mathcal {H}}/t^{2N-1}$ and $\lambda /t^{2N-1}$ are small enough. Moreover we give an upper bound, in that regime, on the error of the recovered measure with respect to the initial measure. As a by-product, we show that the amplitudes and positions of the spikes of the solution both converge towards those of the input measure when the noise and the regularization parameter tend to zero faster than $t^{2N-1}$.

1.4.2 Extensions

We consider in this article the case where all the spikes locations $t z_0$ cluster near zero. Following for instance [20], it is possible to consider a more general model with several cluster points, where the sign of the Diracs is the same around each of these points. Our analysis, although more difficult to perform, could be extended to this setting, at the price of modifying the definition of $\eta _W$ (see Definition 3) to account for several cluster points.

Lastly, if the kernel $\Phi $ under consideration has some specific scale $\sigma $ (such as the standard deviation of a Gaussian kernel, or $\sigma =1/f_c$ for the ideal low-pass filter in the case of the deconvolution on the torus), then it is possible to state our contribution by replacing t by the dimensionless quantity $\text {SRF} \mathop = \limits ^{\mathrm{def}.}t/\sigma $ (called “super-resolution factor” in [20]). It is then possible to extend our proof so show that the signal-to-noise ratio should obey the scaling $1/\text {SRF}^{2N-1}$.

1.4.3 Roadmap

The exact statement of Theorem 2 (our main contribution), and in particular the definition of the non-degeneracy condition, requires some more background, which is the subject of Sect. 2. The proof of this result can be found in Sects. 3, 4 and 5. It relies on an independent study of the asymptotic behavior of quantities depending on the operator $\Phi _{tz_0}$ (such as its pseudo-inverse) when t tends to zero. This takes place in Sect. 3. Note that a sketch of proof of the main result can be found in Sect. 2.3.

1.5 Notations

1.5.1 Measures

We consider $X=\mathbb {R}$ or $X=\mathbb {T}$ as the space of Dirac masses positions. $X$ equipped with the distance $d_X$ [see (3) and (4)] is a locally compact metric space (compact in the case $X=\mathbb {T}$). We denote by $\mathcal {M}(X)$ the space of bounded Radon measures on $X$. It is the topological dual of the Banach space $\mathscr {C}_{X}$ (endowed with $\left\| \cdot \right\| _{\text {L}^\infty (X)}$) of continuous functions defined on $X$, that furthermore are imposed to vanish at infinity in the case $X=\mathbb {R}$. The two problems that we study, i.e. the Basis Pursuit for measures $\mathcal {P}_0(y_t)$ and the BLASSO $\mathcal {P}_{\lambda }(y_t+w)$, are two convex optimization problems on the space $\mathcal {M}(X)$.

Let us consider $a_{0}\in (\mathbb {R}_+^*)^N$ and $z_0\in \mathbb {R}^N$. When $X=\mathbb {T}$, we make the assumption that $z_0\in (-\frac{1}{4},\frac{1}{4})^N$. We define

$$\begin{aligned} \Delta _0 \mathop = \limits ^{\mathrm{def}.}\Delta (z_0), \end{aligned}$$

(5)

where $\Delta (z_0)$ is introduced in (2). We denote by $\overline{\mathrm {B}}\mathopen {}\left( \bar{x},r\right) $ (resp. $\mathrm {B}\mathopen {}\left( \bar{x},r\right) $) the closed (resp. open) $\ell ^\infty $ ball in $\mathbb {R}^N$ with center $\bar{x}$ and radius r. We define neighborhoods of respectively $z_0$ and $a_{0}$ as

$$\begin{aligned} \overline{\mathcal {B}}_{z_0}\mathop = \limits ^{\mathrm{def}.}\overline{\mathrm {B}}\mathopen {}\left( z_0,\frac{\Delta _0}{4}\right) \quad \text {and} \quad \overline{\mathcal {B}}_{a_{0}}\mathop = \limits ^{\mathrm{def}.}\overline{\mathrm {B}}\mathopen {}\left( a_{0},\frac{\min _i (a_{0,i})}{4}\right) . \end{aligned}$$

(6)

Note that when $X=\mathbb {T}$, if $z\in \overline{\mathcal {B}}_{z_0}$ then $z\in (-\frac{1}{2},\frac{1}{2})^N$ and for all $t\in (0,1]$, $tz\in (-\frac{1}{2},\frac{1}{2})^N$. As a result, depending on the context, we consider $tz_0$ and $tz$ as elements of $\mathbb {T}^N$ by identifying them with their equivalent classes.

Then, the initial measure we want to recover is of the form

$$\begin{aligned} m_{a_{0},tz_0}\mathop = \limits ^{\mathrm{def}.}\sum _{i=1}^N a_{0,i}\delta _{tz_{0,i}}. \end{aligned}$$

Note that for example, in the case $X=\mathbb {T}$, when we write $\delta _{tz_{0,i}}$, $tz_{0,i}$ is considered as an element of $\mathbb {T}$.

We use the parameter $t\in (0,1]$ to make the spikes locations tend to 0 in $X$, so that the minimum separation distance of $m_{a_{0},tz_0}$ : $\Delta (tz_0)\rightarrow 0$ when $t\rightarrow 0$.

1.5.2 Kernels

The admissible kernels $\varphi :X\rightarrow \mathcal {H}$ defining the integral operators $\Phi $ (modeling the blur of acquisition) are functions, defined on $X$ and taking values in a separable Hilbert space $\mathcal {H}$, satisfying some regularity properties listed in the following Definition.

Definition 1

(Admissible kernels) We denote by $\text{ KER }^{k}$, the set of admissible kernels of order k. A function $\varphi :X\rightarrow \mathcal {H}$ belongs to $\text{ KER }^{k}$ if $\varphi \in \mathscr {C}^{k}(X,\mathcal {H})$. When $X=\mathbb {R}$, $\varphi $ must also satisfies the following requirements:

For all $p\in \mathcal {H}$, $\left\langle \varphi (x), p\right\rangle _{\mathcal {H}}\rightarrow 0$ when $|x|\rightarrow +\infty $.
For all $0\leqslant i\leqslant k$, $\underset{x\in \mathcal {H}}{\sup } \left\| \varphi ^{(i)}(x) \right\| _{\mathcal {H}} <+\infty $.

1.5.3 Linear Operators

We consider a linear operator $\Phi : \mathcal {M}(X)\rightarrow \mathcal {H}$ of the form

$$\begin{aligned} \forall m\in \mathcal {M}(X), \quad \Phi m&\mathop = \limits ^{\mathrm{def}.}\int _X\varphi (x)\mathrm {d}m(x), \end{aligned}$$

(7)

where $\varphi : X\rightarrow \mathcal {H}$ belongs to $\text{ KER }^{0}$. $\Phi $ is weak-* to weak continuous. Its adjoint $\Phi ^*: \mathcal {H}\rightarrow \mathscr {C}_{X}$ is given by

$$\begin{aligned} \forall p\in \mathcal {H}, \quad \forall x\in X, \quad (\Phi ^*p)(x) = \left\langle \varphi (x), p\right\rangle _{\mathcal {H}}. \end{aligned}$$

A typical example is a convolution operator, where $\varphi (x)= \tilde{\varphi }(\cdot -x)$, for some continuous function $\tilde{\varphi }:X\rightarrow \mathbb {R}$, $\tilde{\varphi }\in \text {L}^2(X)\cap \mathscr {C}_{X}$, so that $(\Phi m)(v) = \int _X\tilde{\varphi }(v-x)\mathrm {d}m(x)$. In particular, in the case $X=\mathbb {R}$, the Gaussian filter is defined as

$$\begin{aligned} \forall x\in \mathbb {R},\quad \tilde{\varphi }(x)=\tilde{\varphi }_G (x)&\mathop = \limits ^{\mathrm{def}.}e^{-x^2/2}, \end{aligned}$$

(8)

where $\tilde{\varphi }_G$ is the Gaussian kernel. In the case $X=\mathbb {T}$, a typical example of convolution operator is the ideal low pass filter which is defined as

$$\begin{aligned} \forall x\in \mathbb {T},\quad \tilde{\varphi }(x)=\tilde{\varphi }_D (x)&\mathop = \limits ^{\mathrm{def}.}\sum _{k=-f_c}^{f_c} e^{2\mathrm {i}\pi kx}, \end{aligned}$$

(9)

for some $f_c\in \mathbb {N}^*$ called the cutoff frequency. $\tilde{\varphi }_D$ is the Dirichlet kernel with cutoff frequency $f_c$.

Remark 1

This last example is equivalently obtained (as considered for instance in [5]) by using $\mathcal {H}=\mathbb {C}^{2f_c+1}$ (endowed with its canonical inner product) in place of $\mathcal {H}=L^2(\mathbb {T})$, and defining $\varphi $ as $\varphi (x)=(e^{2i\pi k x})_{-f_c\leqslant k \leqslant f_c}$. Note that, to simplify the notation, we consider in this paper real Hilbert spaces $\mathcal {H}$, but our analysis readily extends to complex Hilbert spaces.

Given general $\varphi $ and $\Phi $ as in (7), and given $\bar{x}=(x_1,\ldots ,x_N)\in X^N$, we denote by $\Phi _{\bar{x}}:\mathbb {R}^N\rightarrow \mathcal {H}$ the linear operator such that

$$\begin{aligned} \forall a\in \mathbb {R}^N, \quad \Phi _{\bar{x}}(a) \mathop = \limits ^{\mathrm{def}.}\sum _{i=1}^{N} a_i \varphi (x_i), \end{aligned}$$

and by $\Gamma _{\bar{x}}:(\mathbb {R}^{N}\times \mathbb {R}^N)\rightarrow \mathcal {H}$ the linear operator defined by

$$\begin{aligned} \Gamma _{\bar{x}}\begin{pmatrix}a\\ b\end{pmatrix} \mathop = \limits ^{\mathrm{def}.}\sum _{i=1}^{N}\left( a_i \varphi (x_i) + b_i \varphi '(x_i)\right) . \end{aligned}$$

For $\varphi \in \text{ KER }^{k}$, operators involving the derivatives are defined similarly,

$$\begin{aligned} \forall 0\leqslant i\leqslant k, \quad (\Phi _{\bar{x}})^{(k)} : a\in \mathbb {R}^N \longmapsto (\Phi _{\bar{x}})^{(k)}(a) = \sum _{i=1}^{N} a_i \varphi ^{(i)}(x_i). \end{aligned}$$

We occasionally write $\Phi _{\bar{x}}'$ (resp. $\Phi _{\bar{x}}''$) for $(\Phi _{\bar{x}})^{(1)}$ (resp. for $(\Phi _{\bar{x}})^{(2)}$), and we adopt the following matricial notation

$$\begin{aligned} \Phi _{\bar{x}}=(\varphi (x_1) \ \ldots \ \varphi (x_{N})) \quad \text {and} \quad \Gamma _{\bar{x}}=(\Phi _{\bar{x}} \ \Phi _{\bar{x}}'), \end{aligned}$$

where the “columns” of those “matrices” are elements of $\mathcal {H}$. In particular, the adjoint operator $\Phi _{\bar{x}}^*:\mathcal {H}\rightarrow \mathbb {R}^N$ is given by

$$\begin{aligned} \forall p\in \mathcal {H},\quad \Phi _{\bar{x}}^*p=\left( (\Phi ^*p)(x_i)\right) _{1\leqslant i\leqslant N}. \end{aligned}$$

We denote by $\varphi _{k} \in \mathcal {H}$ the $k^{th}$ derivative of $\varphi $ at 0, i.e.

$$\begin{aligned} \varphi _{k} \mathop = \limits ^{\mathrm{def}.}\varphi ^{(k)}(0). \end{aligned}$$

(10)

In particular, $\varphi _{0} = \varphi (0)$.

Given $k \in \mathbb {N}$, we define

$$\begin{aligned} \Psi _{k}\mathop = \limits ^{\mathrm{def}.}\begin{pmatrix} \varphi _{0}&\quad \! \varphi _{1}&\quad \! \ldots&\quad \! \varphi _{k} \end{pmatrix}. \end{aligned}$$

(11)

If $\Psi _{k}: \mathbb {R}^{k+1}\rightarrow \mathcal {H}$ has full column rank, we define its pseudo-inverse as $\Psi _{k}^+\mathop = \limits ^{\mathrm{def}.}(\Psi _{k}^*\Psi _{k})^{-1}\Psi _{k}^*$. Similarly, we denote $\Gamma _{\bar{x}}^+ \mathop = \limits ^{\mathrm{def}.}(\Gamma _{\bar{x}}^*\Gamma _{\bar{x}})^{-1}\Gamma _{\bar{x}}^*$ provided $\Gamma _{\bar{x}}$ has full column rank.

1.5.4 Linear Algebra

For $z\in \mathbb {R}^N$, we let

$$\begin{aligned} H_{z}\mathop = \limits ^{\mathrm{def}.}\begin{pmatrix} 1 &{}\quad \! \ldots &{}\quad \! 1 &{}\quad \! 0 &{}\quad \!\ldots &{}\quad \! 0\\ z_1 &{}\quad \! \ldots &{}\quad \! z_N &{}\quad \! 1 &{}\quad \! \ldots &{}\quad \! 1\\ \vdots &{}\quad \! &{}\quad \! \vdots &{}\quad \! \vdots &{}\quad \! &{}\quad \! \vdots \\ \frac{(z_1)^{2N-1}}{(2N-1)!} &{}\quad \! \ldots &{}\quad \! \frac{(z_N)^{2N-1}}{(2N-1)!} &{}\quad \! \frac{(z_1)^{2N-2}}{(2N-2)!} &{}\quad \! \ldots &{}\quad \! \frac{(z_N)^{2N-2}}{(2N-2)!} \end{pmatrix} \in \mathbb {R}^{2N \times 2N}, \end{aligned}$$

(12)

so that $H_{z}^{*,-1}$ is the matrix of the Hermite interpolation at points $z_1,\ldots z_N$ when $\mathbb {R}_{2N-1}[X]$ is equiped with the basis $\left( 1,X,\ldots , \frac{X^{2N-1}}{(2N-1)!}\right) $.

For each $N \in \mathbb {N}$, we define

$$\begin{aligned} \delta _{N}&\mathop = \limits ^{\mathrm{def}.}(1,0,\ldots ,0)^T \in \mathbb {R}^N,\end{aligned}$$

(13)

$$\begin{aligned} \mathbbm {1}_N&\mathop = \limits ^{\mathrm{def}.}(1,1,\ldots , 1)^T \in \mathbb {R}^N. \end{aligned}$$

(14)

We use the $\ell ^{\infty }$ norm, $\left|\cdot \right|_\infty $, for vectors of $\mathbb {R}^N$ or $\mathbb {R}^{2N}$, whereas the notation $\left\| \cdot \right\| $ refers to an operator norm (on matrices, or bounded linear operators). $\left\| \cdot \right\| _{\mathcal {H}}$ is the norm on $\mathcal {H}$ associated to the inner product $\left\langle \cdot , \cdot \right\rangle _{\mathcal {H}}$. $\left\| \cdot \right\| _{\text {L}^\infty (X)}$ denotes the $\text {L}^\infty $ norm for functions defined on $X$.

2 Asymptotic Analysis of Support Recovery

This section exposes our main contribution (Theorem 2). The hypotheses of this result require an injectivity condition and a non-degeneracy condition, that are explained in Sects. 2.1 and 2.2.

2.1 Injectivity Hypotheses

We introduce here an injectivity hypothesis which ensures the invertibility of $\Phi _{tz}^* \Phi _{tz}$ and $\Gamma _{tz}^* \Gamma _{tz}$ for $t>0$ small enough.

In the case of the ideal low-pass filter [defined in (9)], $\Gamma _{\bar{x}}$ has full column rank provided that $\bar{x}=(x_1,\ldots , x_N) \in X^N$ has pairwise distinct coordinates (see [14, Sect. 3.6, Proposition 6]). That property is not true for a general operator $\Phi $. However, in this paper we focus on sums of Dirac masses that are clustered around the point $0\in X$, i.e. $\bar{x}=tz$ for $t>0$ and $z\in \mathbb {R}^N$ with pairwise distinct components. The following assumption, which is crucial to our analysis, shall ensure that $\Gamma _{tz}$ has full rank at least for small t.

Definition 2

Let $\varphi : X\rightarrow \mathcal {H}$. For all $k\in \mathbb {N}$, we say that the hypothesis $\mathcal {I}_{k}$ holds if and only if

$$\begin{aligned} \varphi \in \text{ KER }^{k} \text { and } (\varphi _{0}, \ldots ,\varphi _{k}) \text { are linearly independent in } \mathcal {H}. \qquad \qquad \mathcal {I}_{k}\end{aligned}$$

See Definition 1 for the definition of the space $\text{ KER }^{k}$ and Eq. (10) for the definition of $\varphi _{k}$.

If $\mathcal {I}_{k}$ holds, then $\Psi _{k}^* \Psi _{k}$ is a symmetric positive definite matrix, where $\Psi _{k}$ is defined in (11).

To exemplify the meaning of this injectivity hypothesis, Proposition 1 below considers the case $X=\mathbb {T}$ with $\Phi $ a convolution operator.

Proposition 1

Let $\tilde{\varphi } \in \mathscr {C}^{k}(\mathbb {T},\mathbb {R})$ (where $\varphi (x)=\tilde{\varphi }(\cdot -x)$), then $\mathcal {I}_{k}$ holds if and only if $\varphi _{0}$ has at least $k+1$ non-zeros Fourier coefficients. In particular if $\Phi $ is the ideal low-pass filter with cutoff frequency $f_c\in \mathbb {N}^*$, $\mathcal {I}_{k}$ holds if and only if $k\leqslant 2 f_c$.

The proof of this proposition is given in Sect. 1.

As we shall see in Sect. 3, the conditions $\mathcal {I}_{N-1}$ and $\mathcal {I}_{2N-1}$ imply respectively the invertibility of $\Phi _{tz}^* \Phi _{tz}$ and $\Gamma _{tz}^* \Gamma _{tz}$, provided that t is small enough. According to Proposition 1, in the special case of an ideal low-pass filter, these conditions holds if and only if $f_c$ is large enough with respect to the number N of spikes.

2.2 Vanishing Derivatives Precertificate

Following [14, Sect. 4.1, Definition 6], we introduce below the so called “vanishing derivatives pre-certificate” $\eta _{V,t}$, which is a function defined on $X$ that interpolates the spikes positions and signs (here $+1$). Note that $\eta _{V,t}$ can be computed in closed form by solving the linear system (15).

Proposition 2

(Vanishing derivatives precertificate, [14]) If $\Gamma _{tz_0}$ has full column rank, there is a unique solution to the problem

$$\begin{aligned} \inf \left\{ \left\| p \right\| _{\mathcal {H}} \;;\; \forall i=1,\ldots ,N, \; (\Phi ^*p)(t z_{0,i})=1, (\Phi ^*p)'(t z_{0,i})=0 \right\} . \end{aligned}$$

Its solution $p_{V,t}$ is given by

$$\begin{aligned} p_{V,t}=\Gamma _{tz_0}^{+,*} \begin{pmatrix} \mathbbm {1}_N \\ 0\end{pmatrix}, \end{aligned}$$

(15)

and we define the vanishing derivatives precertificate as $\eta _{V,t}\mathop = \limits ^{\mathrm{def}.}\Phi ^*p_{V,t}$.

As shown in [14] (see Sect. 4.2), $\eta _{V,t}$ governs the support recovery properties of the BLASSO $\mathcal {P}_{\lambda }(y_t+w)$. More precisely, if $\eta _{V,t}$ is non-degenerate, i.e.

$$\begin{aligned} \left\{ {\begin{matrix} &{}\forall \,x\in X\setminus \{t z_{0,1},\ldots , t z_{0,N}\}, &{}\quad \left|\eta _{V,t}(x)\right|<1,\\ &{}\forall \,i\in \{1,\ldots ,N\}, &{}\quad \eta _{V,t}''( t z_{0,i} )\ne 0, \end{matrix}}\right. \end{aligned}$$

(16)

then there exists a low noise regime in which the BLASSO recovers exactly the correct number N of spikes, and the error on the locations and amplitudes is proportional to the noise level.

The constants involved in the main result of [14] (Theorem 2) depend on the value of $t>0$. The goal of this paper is to precisely determine this dependency, and to show that this support recovery result extends to the setting where $t \rightarrow 0$, provided that $(\lambda ,w)$ obey some precise scaling with respect to t.

Since our focus is actually on the support recovery properties of the BLASSO when $t\rightarrow 0$, it is natural to look at the limit of $p_{V,t}$ as $t\rightarrow 0$. This leads us to the $(2N-1)$-vanishing derivatives precertificate defined below.

Proposition 3

($(2N-1)$-vanishing derivatives precertificate) If $\mathcal {I}_{2N-1}$ holds, there is a unique solution to the problem

$$\begin{aligned} \inf \left\{ \left\| p \right\| _{\mathcal {H}} \;;\; (\Phi ^*p)(0)=1, (\Phi ^*p)'(0)=0, \ldots , (\Phi ^*p)^{(2N-1)}(0)=0 \right\} . \end{aligned}$$

We denote by $p_{W}$ its solution, given by

$$\begin{aligned} p_{W}=\Psi _{2N-1}^{+,*}\delta _{2N} \end{aligned}$$

(17)

and we define the $(2N-1)$-vanishing derivatives precertificate as $\eta _W\mathop = \limits ^{\mathrm{def}.}\Phi ^*p_{W}$.

The following Proposition, which is a direct consequence of Lemma 1 in the next section, shows that indeed $\eta _{V,t}$ converges toward $\eta _W$.

Proposition 4

If $\mathcal {I}_{2N-1}$ holds and $\varphi \in \text{ KER }^{K}$ (for $K\geqslant 2N-1$), then for $t>0$ small enough $\Gamma _{tz_0}$ has full column rank. Moreover

$$\begin{aligned} \lim _{t\rightarrow 0^+} p_{V,t}&= p_{W} \text{ strongly } \text{ in } \mathcal {H},\\ \lim _{t\rightarrow 0^+} \eta _{V,t}^{(k)}&= \eta _W^{(k)} \text{ in } \text{ the } \text{ sense } \text{ of } \text{ the } \text{ uniform } \text{ convergence } \text{ on } X, \end{aligned}$$

for all $0\leqslant k\leqslant K$.

Figure 1 shows graphically this convergence of $\eta _{V,t}$ toward $\eta _W$ in the case of the deconvolution problem over the 1-D torus with the Dirichlet kernel. Figures 2 and 3 show $\eta _W$ for several values of N. Notice how it becomes flatter at 0 as N increases. This implies that $\eta _{V,t}$ for small t gets closer to degeneracy as N increases. This is reflected in our main contribution (Theorem 2) where the signal-to-noise ratio is required to scale with $t^{2N-1}$.

The behavior of $\eta _{V,t}$ is therefore governed by specific properties of $\eta _W$ for small values of $t>0$. In particular, as stated by Theorem 1 below, the non-degeneracy of $\eta _W$ (as defined in Definition 3 below) implies the non-degeneracy of $\eta _{V,t}$ (as defined in (16)).

Definition 3

Assume that $\mathcal {I}_{2N-1}$ holds and $\varphi \in \text{ KER }^{2N}$. We say that $\eta _W$ is $(2N-1)$-non-degenerate if $\eta _W^{(2N)}(0)\ne 0$ and for all $x\in X\setminus \{0\}$, $|\eta _W(x)| <1$.

Theorem 1

Suppose that $\eta _W$ is $(2N-1)$-non-degenerate (Definition 3). Let $R_W>0$. Then, there exist $C_W>0$, $t_W>0$ such that for all $t\in (0,t_W)$, all $z\in \mathbb {R}^N$ with pairwise distinct coordinates and $\left|z\right|_\infty \leqslant R_W$, and all $\eta \in \mathscr {C}^{2N}(X)\cap \text{ W }^{2N,\infty }(X)$ satisfying for $1\leqslant i \leqslant N$, $\eta (tz_i)=1$ and $\eta '(tz_i)=0$,

$$\begin{aligned}&\left( \forall \ell \in \{0,\ldots 2N\},\quad \left\| \eta ^{(\ell )}-\eta _W^{(\ell )} \right\| _{\text {L}^\infty (X)}\leqslant C_W\right) \\ \Longrightarrow&\left( \forall x\in X\setminus \bigcup _i \{tz_i\}, \quad |\eta (x)|<1 \quad \text {and} \quad \forall 1\leqslant i \leqslant N, \ \eta ''(tz_i)<0\right) . \end{aligned}$$

The proof of this theorem can be found in Sect. 1.

Remark 2

An important consequence of Theorem 1 is that if $\eta _W$ is $(2N-1)$-non-degenerate, then, by Proposition 4, $\eta _{V,t}$ is also non-degenerate for $t>0$ small enough. The function $\eta _{V,t}$ is then the minimal norm certificate for the measure $m_{a_{0},tz_0}$, and the Non-Degenerate Source Condition (see [14]) holds. As a result, for fixed small $t>0$, the BLASSO admits a unique solution in a certain low noise regime corresponding to a large enough signal to noise ratio, with exactly the same number of spikes as the original measure $m_{a_{0},tz_0}$. For more details on that matter, see [14].

A natural question is whether $\eta _W$ is indeed $(2N-1)$-non-degenerate. Proposition 5 below (proved in Sect. 1) gives a partial answer in the case of the deconvolution over $X$.

Proposition 5

Assume that $\Phi $ is a convolution operator (i.e. for all $x\in X$, $\varphi (x)=\tilde{\varphi }(\cdot -x)$ and $\mathcal {H}=L^2(X)$) and $\mathcal {I}_{2N}$ holds. Suppose also, only in the case $X=\mathbb {R}$, that for all $0\leqslant i\leqslant 2N-1$, $\tilde{\varphi }^{(i)}(x)\rightarrow 0$ when $|x|\rightarrow +\infty $. Then $\eta _W^{(2N)}(0)<0$.

Remark 3

Thanks to Proposition 5 and the first part of the proof of Theorem 1 (which is given in Sect. 1), note that the following is true: there exist $C_W>0$, $t_W>0$ such that for all $t\in (0,t_W)$, $z\in \mathbb {R}^N$ with pairwise distinct coordinates and $\left|z\right|_\infty \leqslant R_W$, there exists $r^+>0$ with $r^+>\underset{1\leqslant i\leqslant N}{\max } t_Wz_i$ and $r^-<0$ with $r^-<\underset{1\leqslant i\leqslant N}{\min } t_Wz_i$ such that for all $\eta \in \mathscr {C}^{2N}(X)\cap \text{ W }^{2N,\infty }(X)$ satisfying for all $1\leqslant i\leqslant N$, $\eta (tz_i)=1$ and $\eta '(tz_i)=0$:

$$\begin{aligned}&\left( \forall \ell \in \{0,\ldots , 2N\}, \ \left\| \eta ^{(\ell )}-\eta _W^{(\ell )} \right\| _{\text {L}^\infty (X)}\leqslant C_W\right) \\&\Longrightarrow \left( \forall x\in (r^-,r^+)\setminus \bigcup _i \{tz_i\}, \ |\eta (x)|<1 \quad \text {and} \quad \forall i\in \{1, \ldots , N\}, \ \eta ''(tz_i)<0 \right) . \end{aligned}$$

Whether the other condition in the definition of the $(2N-1)$-non-degeneracy of $\eta _W$ holds (i.e. whether $|\eta _W|<1$ on $X\setminus \{0\}$) should be checked on a case-by-case basis. Since $\eta _W$ depends only on the kernel $\varphi $ and can be computed by simply inverting a linear system, it is easy to check numerically if $\eta _W$ is $(2N-1)$-non-degenerate. Proposition 6 shows that in the special case of $X=\mathbb {R}$ and the Gaussian kernel, $\eta _W$ can be computed in closed form and is indeed $(2N-1)$-non-degenerate.

Proposition 6

Assume that $X=\mathbb {R}$, $\mathcal {H}=L^2(\mathbb {R})$ and $\Phi $ is a convolution operator, i.e. for all $x\in \mathbb {R}$, $\varphi (x)=\tilde{\varphi }(\cdot -x)$, where $\tilde{\varphi }:x\in \mathbb {R}\mapsto e^{-x^2/2}$ is a Gaussian. Then the associated $(2N-1)$-vanishing derivatives precertificate is

$$\begin{aligned} \forall \,x\in \mathbb {R}, \quad \eta _W(x)=e^{-\frac{x^2}{4}}\sum _{k=1}^N \frac{x^{2k}}{2^{2k}k!}. \end{aligned}$$

(18)

In particular, $\eta _W$ is $(2N-1)$-non-degenerate.

The proof of this result can be found in the Appendix 1.4. If we denote by $\eta _{W,\sigma }$, the $(2N-1)$ vanishing derivatives precertificate associated to the filter $\varphi _\sigma :x\in \mathbb {R}\mapsto e^{-\frac{x^2}{2\sigma ^2}}$, then $\eta _{W,\sigma }=\eta _{W,1}(\frac{\cdot }{\sigma })$. That is why we only consider the case of $\sigma =1$ in Proposition 6. Figure 3 shows $\eta _W$ for the Gaussian filter with an increasing number N of spikes.

For the deconvolution over the 1-D torus, we observed numerically (as illustrated in Fig. 2) that $\eta _W$ is $(2N-1)$-non degenerate for the ideal low pass filter for any value N such that $N\leqslant f_c$ (Fig. 4 represents the complementary case where N is fixed but $f_c$ increases). However, for some filters, the associated $\eta _W$ might be degenerate. This is illustrated in Fig. 5 where $\eta _W$ is illustrated for several filters with increasing complexity i.e. we consider low pass filters with a fixed cutoff frequency, with increasing extreme Fourier coefficients (starting with a slowly varying filter). Remark that the last two $\eta _W$ (in red) are degenerate, as they correspond to the filters with the higher complexity (the Fourier coefficients increase the most with the frequency).

2.3 Main Contribution

We now state our main contribution.

Theorem 2

Suppose that $\varphi \in \text{ KER }^{2N+1}$ and that $\eta _W$ is $(2N-1)$-non-degenerate. Then there exist constants $(t_1,t_W,C,C_R,M)$ (depending only on $\varphi $, $a_{0}$ and $z_0$) such that for all $0<t<\min \left( t_1,t_W \right) $, for all $(\lambda ,w)\in \mathrm {B}\mathopen {}\left( 0,C_Rt^{2N-1}\right) $ with $\left\| \frac{w}{\lambda } \right\| _{\mathcal {H}}\leqslant C$,

the problem $\mathcal {P}_{\lambda }(y_t+w)$ admits a unique solution,
that solution has exactly N spikes, and it is of the form $m_{a,tz}$, with $(a,z)=g_t^*(\lambda ,w)$ (where $g_t^*$ is a $\mathscr {C}^{2N}$ function defined on $\mathrm {B}\mathopen {}\left( 0,C_Rt^{2N-1}\right) \subset \mathbb {R}\times \mathcal {H}$),
the following inequality holds
$$\begin{aligned} \left|(a,z)-(a_{0},z_0)\right|_\infty \leqslant M\left( \frac{|\lambda |}{t^{2N-1}}+\frac{\left\| w \right\| _{\mathcal {H}}}{t^{2N-1}} \right) . \end{aligned}$$

Note that the value of constants involved can be found in the proof of the theorem, more precisely, $(t_1,C)$ are defined in (46), $C_R$ is defined in Proposition 9, $t_W$ is defined in Theorem 1 and M is given in Corollary 1.

The proof of Theorem 2 uses results spanning Sects. 3, 4 and 5. Below, we give a sketch of proof to guide the reader through the remaining of the paper. The elements of the proof are divided in three main steps.

Step 1 (Sections 4.1 and 4.2) We start with the first order optimality equation that any solution $m_{a,tz}$, for fixed $t>0$, of $\mathcal {P}_{\lambda }(y_t+w)$ must satisfy i.e.

$$\begin{aligned} \Gamma _{tz}^*\left( \Phi _{tz}a-\Phi _{tz_0}a_{0}-w \right) +\lambda \begin{pmatrix}\mathbbm {1}_N\\ 0 \end{pmatrix}=0. \end{aligned}$$

It is obtained by applying Fermat’s rule to the problem $\mathcal {P}_{\lambda }(y_t+w)$. Since the parameters $(a,z,\lambda ,w)=(a_{0},z_0,0_\mathbb {R},0_{\mathcal {H}})$ are a solution of the equation, the idea is to parametrize, in a neighborhood of $(\lambda ,w)=(0_\mathbb {R},0_{\mathcal {H}})$, the amplitudes and positions $(a,z)$ in terms of $(\lambda ,w)$ by applying the implicit function theorem so that $(a,z,\lambda ,w)$ is a solution of the first order equation. This process is detailed in Sects. 4.1 and 4.2. The rest of the proof consists in proving that the measure $m_{a,tz}$ is the unique solution of the problem $\mathcal {P}_{\lambda }(y_t+w)$. But before we have to deal with the domain of existence of the above parametrization.

Step 2 (Section 4.3) The implicit function theorem only provides the existence of a neighborhood in $(\lambda ,w)$ of $(0_\mathbb {R},0_{\mathcal {H}})$ where the parametrization holds, but we do not know how it size depends on the parameter t. This issue is important because one of our aims is to determine the constraints on t (corresponding roughly to the minimum distance between the spikes of the original measure), on the noise level and on the regularization parameter $\lambda $ so that the recovery of the support is possible. Section 4.3 is devoted to show that the parametrization, which writes $(a,z)=g_t^*(\lambda ,w)$ (see Eq. (38) for the definition of the implicit function $g_t^*$), of the solutions of the first order optimality equation holds in a neighborhood of $(0_{\mathbb {R}},0_{\mathcal {H}})$ and of size proportional to $t^{2N-1}$. This result corresponds to Proposition 9. The proof uses an upper bound of $\mathrm {d}g_t^*$ which is stated in Corollary 1. Proposition 9 relies on asymptotic expansions (of $\Gamma _{tz}$ for example), when $t\rightarrow 0$, gathered and used in the proof of Lemma 5. Section 3 is devoted to these asymptotic expansions and it may be skipped at first reading.

Step 3 (Section 5) Up to now, we have constructed a candidate solution $m_{a,tz}$ (composed of N spikes) where $(a,z)=g_t^*(\lambda ,w)$ is built by parametrizing the solutions of the first order optimality equation. Moreover this parametrization holds for all $(\lambda ,w)$ in a ball of radius proportional to $t^{2N-1}$. It remains to prove that $m_{a,tz}$ is indeed the unique solution of $\mathcal {P}_{\lambda }(y_t+w)$. To prove that $m_{a,tz}$ is a solution, it is equivalent to check that

$$\begin{aligned} 0\in \partial \left( m\mapsto \frac{1}{2}\left\| \Phi m -y_t-w \right\| _{\mathcal {H}}^2+\lambda |m|(X) \right) (m_{a,tz}), \end{aligned}$$

which reformulates into $\eta _{\lambda ,t}\mathop = \limits ^{\mathrm{def}.}\frac{1}{\lambda }\Phi ^*(y_t+w-\Phi m_{a,tz})\in \partial |m_{a,tz}|(X)$. This is done by first showing the convergence of $\eta _{\lambda ,t}$ towards $\eta _W$ when $(t,\lambda ,w)\rightarrow 0$ in a well chosen domain, see Proposition 11, and then using Theorem 1 and the fact that $\eta _W$ is ensured to be $(2N-1)$-non-degenerate (which is one of the hypotheses of Theorem 2) to get the non-degeneracy of $\eta _{\lambda ,t}$ and the conclusion.

2.3.1 Putting All Together

After this sketch, we now give the detailed proof. It uses Proposition 9 (parametrization of the solution of the first order optimality equation on a ball, for the parameter $(\lambda ,w)$, of radius proportional to $t^{2N-1}$), Proposition 11 (convergence of $\eta _{\lambda ,t}$ towards $\eta _W$), Theorem 1 (use of the $(2N-1)$-non-degeneracy of $\eta _W$ to transfer it to $\eta _{\lambda ,t}$), and Proposition 10 (upper bound on the error of $m_{a,tz}$ with respect to $m_{a_{0},tz_0}$).

Proof of Theorem 2

Let us take $t,\lambda ,w$ as in the hypotheses of the Theorem 2. Let $(a,z)=g_t^*(\lambda ,w)$, where $g_t^*$ is the function constructed in Sect. 4. Let us define

$$\begin{aligned} p_{\lambda ,t}\mathop = \limits ^{\mathrm{def}.}\frac{1}{\lambda } \left( \Phi _{tz_0}a_{0}+w-\Phi _{tz}a \right) \quad \text {and} \quad \eta _{\lambda ,t}\mathop = \limits ^{\mathrm{def}.}\Phi ^* p_{\lambda ,t}. \end{aligned}$$

By Proposition 11 combined with Theorem 1 where we take

$$\begin{aligned} R_W\mathop = \limits ^{\mathrm{def}.}\sup \{\left|z\right|_\infty ;z\in \overline{\mathcal {B}}_{z_0}\}, \end{aligned}$$

(19)

we have for $0<t< \min (t_W, t_1)$,

$$\begin{aligned} \forall x\in X\setminus \bigcup _i \{tz_i\}, |\eta _{\lambda ,t}(x)|<1 \quad \text {and} \quad \forall 1\leqslant i \leqslant N, \ \eta _{\lambda ,t}''(tz_i)<0, \end{aligned}$$

(20)

while $\eta _{\lambda ,t}(tz_i)=1={{\mathrm{sign}}}(a_i)$ by definition.

We deduce that $\eta _{\lambda ,t}$ is in the subdifferential of the total variation at $m_{a,tz}$ because

$\eta _{\lambda ,t}\in \mathscr {C}_{X}$,
$\left\| \eta _{\lambda ,t} \right\| _{\text {L}^\infty (X)}\leqslant 1$ thanks to Eq. (20),
$\forall 1\leqslant i\leqslant N$, $\eta _{\lambda ,t}(tz_i)=1={{\mathrm{sign}}}(a_i)$ by definition of $\eta _{\lambda ,t}$ (recall that $(a,z)=g_t^*(\lambda ,w)$).

As a result $m_{a,tz}$ is a solution to $\mathcal {P}_{\lambda }(y_t+w)$ and $p_{\lambda ,t}$ is the unique solution to the dual problem associated to $\mathcal {P}_{\lambda }(y_t+w)$ (see [14, Sect. 2.4] for details on dual certificates and optimality conditions for $\mathcal {P}_{\lambda }(y_t+w)$).

Let m be an other solution of $\mathcal {P}_{\lambda }(y_t+w)$. Then the support of m is included in the saturation points of $\eta _{\lambda ,t}=\Phi ^*p_{\lambda ,t}$ i.e. in $\{tz_1,\ldots ,tz_N\}$. As a result $m=m_{a',tz}$ for some $a'\in \mathbb {R}^N$ and m satisfies the first order optimality equation $\Gamma _{tz}^*\left( \Phi _{tz}a'-\Phi _{tz_0}a_{0}-w \right) +\lambda \begin{pmatrix}\mathbbm {1}_N\\ 0 \end{pmatrix}=0$. Hence $\Phi _{tz}^*\Phi _{tz}a'=\Phi _{tz}^*\Phi _{tz}a$ and since $\Phi _{tz}$ has full rank (by assumption t is chosen sufficiently small, see Lemma 1 for the proof), $\Phi _{tz}^*\Phi _{tz}$ is invertible and $a'=a$. So $m=m_{a,tz}$ and $\mathcal {P}_{\lambda }(y_t+w)$ admits a unique solution: $m_{a,tz}$.

The bound on the error between $(a,z)$ and the amplitudes and positions of the initial measure $(a_{0},z_0)$ is a direct consequence of Proposition 10. $\square $

2.4 Necessary Condition for the Recovery in the Limit $t\rightarrow 0$

Our main contribution, Theorem 2, states that under a non-degeneracy property which involves $\eta _W$, it is possible to perform the recovery of the support of a measure $m_{a_{0},tz_0}$ in the limit $t\rightarrow 0$ when the data are contaminated by some noise, provided that $\max (|\lambda |/t^{2N-1},\left\| w \right\| _{\mathcal {H}}/t^{2N-1},\left\| w \right\| _{\mathcal {H}}/\lambda )\leqslant C$ for some constant $C>0$ depending only on the filter $\varphi $ and $(a_{0},z_0)$. It is natural to ask whether the non-degeneracy condition on $\eta _W$, in order to get the recovery of the support in some low noise regime, is sharp.

The following Theorem shows that the $(2N-1)$-non-degeneracy assumption on $\eta _W$ is almost sharp in the sense that the recovery of the support in a low noise regime leads to $\left\| \eta _W \right\| _{\text {L}^\infty (X)}\leqslant 1$.

Theorem 3

Suppose that $\mathcal {I}_{2N-1}$ holds and $\varphi \in \text{ KER }^{2N+1}$. Suppose also that there exists a sequence $(t_n)_{n\in \mathbb {N}}$ such that $t_n \rightarrow 0$ and satisfying

$$\begin{aligned} \forall n\in \mathbb {N}, \exists (\lambda _n,w_n), \exists (a_n,z_n)\in \mathbb {R}^N\times \mathbb {R}^N, m_{a_n,t_nz_n} \text{ is } \text{ solution } \text{ of } \mathcal {P}_{\lambda _n}(y_{t_n}+w_n), \end{aligned}$$

where $(\lambda _n,w_n)\rightarrow 0$ with $\frac{\left\| w_n \right\| _{\mathcal {H}}}{\lambda _n}\rightarrow 0$. Then

$$\begin{aligned} \left\| \eta _W \right\| _{\text {L}^\infty (X)}\leqslant 1. \end{aligned}$$

(21)

The proof of this result can be found in Appendix 1.5.

The remaining sections of the paper, namely Sects. 3, 4 and 5 are devoted to the proof of Theorem 2.

3 Preliminaries

Our study relies to a large extent on the asymptotic behavior of quantities built upon $\Phi _{tz}$ and $\Gamma _{tz}$ for $t>0$ small, such as $(\Phi _{tz}^* \Phi _{tz})^{-1}$ or $(\Gamma _{tz}^* \Gamma _{tz})^{-1}$. In this section, we gather several preliminary results that enable us to control that behavior.

3.1 Approximate Factorizations

Our asymptotic estimates are based on an approximate factorization of $\Phi _{\bar{x}}$ and $\Gamma _{\bar{x}}$ using Vandermonde and Hermite matrices. It enables us to study the asymptotic behavior of the optimality conditions of $\mathcal {P}_{\lambda }(y_t+w)$ when $t\rightarrow 0^+$. In the following, we consider $z\in \overline{\mathcal {B}}_{z_0}$ [see (6)] and $t\in (0,1]$, so that $H_{tz}$ is always invertible. Moreover, we shall always assume that $\varphi \in \text{ KER }^{2N}$.

Proposition 7

The following expansion holds

$$\begin{aligned} \Gamma _{tz}= \Psi _{2N-1}H_{tz}+ \Lambda _{t,z} D_t, \end{aligned}$$

(22)

where $\Psi _{2N-1}$ is defined in (11), $H_{tz}$ is defined in (12), and where

$$\begin{aligned} \Lambda _{t,z}&\mathop = \limits ^{\mathrm{def}.}\bigg ( \Big ( \int _0^1 (z_i)^{2N}\varphi ^{(2N)}(stz_i)\frac{(1-s)^{2N-1}}{(2N-1)!} \mathrm {d}s \Big )_{1\leqslant i\leqslant N}, \\&\qquad \Big ( \int _0^1 (z_i)^{2N-1}\varphi ^{(2N)}(stz_i)\frac{(1-s)^{2N-2}}{(2N-2)!} \mathrm {d}s \Big )_{1\leqslant i\leqslant N} \bigg ) \\ D_t&\mathop = \limits ^{\mathrm{def}.}{{\mathrm{diag}}}(t^{2N},\ldots ,t^{2N},t^{2N-1}, \ldots , t^{2N-1}). \end{aligned}$$

Proof

This expansion is nothing but the Taylor expansions for $\varphi $ and $\varphi '$:

$$\begin{aligned} \varphi ( tz_i)&= \varphi _{0} +(tz_i)\varphi _{1}+\ldots +\frac{(tz_i)^{2N-1}}{(2N-1)!}\varphi _{2N-1}\nonumber \\&\quad +(tz_i)^{2N}\int _0^1\varphi ^{(2N)}(stz_i)\frac{(1-s)^{2N-1}}{(2N-1)!} \mathrm {d}s ,\end{aligned}$$

(23)

$$\begin{aligned} \varphi '(tz_i)&= \varphi _{1} + (tz_i)\varphi _{2}+\ldots +\frac{(tz_i)^{2N-2}}{(2N-2)!}\varphi _{2N-1}\nonumber \\&\quad +(tz_i)^{2N-1}\int _0^1\varphi ^{(2N)}(stz_i)\frac{(1-s)^{2N-2}}{(2N-2)!} \mathrm {d}s. \end{aligned}$$

(24)

$\square $

The above expansion yields a useful factorization for $\Gamma _{tz}$,

$$\begin{aligned} \Gamma _{tz}= \Psi _{tz}H_{tz} \quad \text {where} \quad \Psi _{tz}&\mathop = \limits ^{\mathrm{def}.}\Psi _{2N-1}+ \Lambda _{t,z} D_t H_{tz}^{-1}. \end{aligned}$$

The rest of this section is devoted to the consequence of that factorization for the asymptotic behavior of $\Gamma _{tz}$ and its related quantities. The main ingredient of this analysis is the factorization of $H_{tz}$ as

$$\begin{aligned} H_{tz}={{\mathrm{diag}}}(1,t,\ldots , t^{2N-1})H_{z}{{\mathrm{diag}}}\left( 1,\ldots , 1,\frac{1}{t},\ldots , \frac{1}{t}\right) . \end{aligned}$$

(25)

Let us emphasize that our Taylor expansions are uniform in $z\in \overline{\mathcal {B}}_{z_0}$. More precisely, given two quantities $f(z,t)$, $g(z,t)$, we say that $f(z,t)=g(z,t) + O\mathopen {}\left( t^k\right) $ if

$$\begin{aligned} \limsup _{t\rightarrow 0^+} \sup _{z\in \overline{\mathcal {B}}_{z_0}} \left|\frac{f(z,t) - g(z,t)}{t^k}\right|<+\infty . \end{aligned}$$

Lemma 1

The following expansion holds for $t\rightarrow 0^+$,

$$\begin{aligned} \Psi _{tz}= \Psi _{2N-1}+ O\mathopen {}\left( t\right) . \end{aligned}$$

(26)

Moreover, if $\mathcal {I}_{2N-1}$ holds then $\Psi _{tz}$ and $\Gamma _{tz}$ have full column rank for $t>0$ small enough and

$$\begin{aligned} (\Psi _{tz}^*\Psi _{tz})^{-1}&=(\Psi _{2N-1}^*\Psi _{2N-1})^{-1}+O\mathopen {}\left( t\right) \end{aligned}$$

(27)

$$\begin{aligned} \Gamma _{tz}^{+,*}\begin{pmatrix}\mathbbm {1}_N\\ 0\end{pmatrix}&= \Psi _{2N-1}^{+,*}\delta _{2N} + O\mathopen {}\left( t\right) . \end{aligned}$$

(28)

Proof

We begin by noticing that

$$\begin{aligned} \Lambda _{t,z}D_tH_{tz}^{-1}&= t^{2N}\Lambda _{t,z}H_{z}^{-1}{{\mathrm{diag}}}(1,1/t,\ldots ,1/t^{2N-1})\\&=\Lambda _{t,z}H_{z}^{-1}{{\mathrm{diag}}}(t^{2N},t^{2N-1},\ldots ,t). \end{aligned}$$

The function $z\mapsto H_{z}^{-1}$ is $\mathscr {C}^{\infty }$ and uniformly bounded on $\overline{\mathcal {B}}_{z_0}$, and $(z,t)\mapsto \Lambda _{t,z}$ is $\mathscr {C}^{0}$ on the compact set $\overline{\mathcal {B}}_{z_0}\times [0,1]$ hence uniformly bounded too. As a result, we get (26).

Assume now that $\mathcal {I}_{2N-1}$ holds. Since $\Psi _{2N-1}^*\Psi _{2N-1}$ is invertible, there is some $R>0$ such that for every A in the closed ball $\overline{\mathrm {B}}\mathopen {}\left( \Psi _{2N-1}^*\Psi _{2N-1},R\right) \subset \mathbb {R}^{(2N-1)\times (2N-1)}$, A is invertible. By the mean value inequality

$$\begin{aligned} \left\| (\Psi _{2N-1}^*\Psi _{2N-1})^{-1}-A^{-1} \right\|&\leqslant \!\!\! \sup _{B\in \overline{\mathrm {B}}\mathopen {}\left( \Psi _{2N-1}^*\Psi _{2N-1},R\right) } \left\| B^{-1}(A-\Psi _{2N-1}^*\Psi _{2N-1})B^{-1} \right\| \\&\leqslant \left( \sup _{B\in \overline{\mathrm {B}}\mathopen {}\left( \Psi _{2N-1}^*\Psi _{2N-1},R\right) }\left\| B^{-1} \right\| \right) ^2 \left\| A-\Psi _{2N-1}^*\Psi _{2N-1} \right\| . \end{aligned}$$

Applying that to $A=\Psi _{tz}^*\Psi _{tz}= \Psi _{2N-1}^*\Psi _{2N-1}+O\mathopen {}\left( t\right) $ (since each term in the product is uniformly bounded), we get (27).

Now, for the last point, we infer from $\Gamma _{tz}=\Psi _{tz}H_{tz}$ and the fact that $H_{tz}$ is invertible that $\Gamma _{tz}^{+,*}=\Psi _{tz}^{+,*}H_{tz}^{-1,*}$. Hence

$$\begin{aligned} \Gamma _{tz}^{+,*} \begin{pmatrix}\mathbbm {1}_N\\ 0\end{pmatrix} = \ \Psi _{tz}^{+,*} \delta _{2N} = \Psi _{tz}(\Psi _{tz}^*\Psi _{tz})^{-1} \delta _{2N} \end{aligned}$$

where $\delta _{2N}$ is defined in (13).

Each factor below being uniformly bounded in $\overline{\mathcal {B}}_{z_0}\times [0,1]$, we get

$$\begin{aligned} \Gamma _{tz}^{+,*} \begin{pmatrix}\mathbbm {1}_N\\ 0\end{pmatrix}&= \left( \Psi _{2N-1}+O\mathopen {}\left( t\right) \right) \left( (\Psi _{2N-1}^*\Psi _{2N-1})^{-1}+O\mathopen {}\left( t\right) \right) \delta _{2N}\\&= \Psi _{2N-1}(\Psi _{2N-1}^*\Psi _{2N-1})^{-1}\delta _{2N} +O\mathopen {}\left( t\right) . \end{aligned}$$

$\square $

3.2 Projectors

In this paragraph, we shall always suppose that $\mathcal {I}_{2N-1}$ holds. Another important quantity in our study is the orthogonal projector $P_{({{\mathrm{Im}}}\Gamma _{tz})^\perp }$ (resp. $P_{({{\mathrm{Im}}}\Psi _{2N-1})^\perp }$) onto $({{\mathrm{Im}}}\Gamma _{tz})^\perp $ (resp. $({{\mathrm{Im}}}\Psi _{2N-1})^\perp $). We define

$$\begin{aligned} \Pi _{tz}&\mathop = \limits ^{\mathrm{def}.}P_{({{\mathrm{Im}}}\Gamma _{tz})^\perp } = \mathrm {Id}_{\mathcal {H}}-\Gamma _{tz}(\Gamma _{tz}^*\Gamma _{tz})^{-1}\Gamma _{tz}^*,\\ \Pi _{2N-1}&\mathop = \limits ^{\mathrm{def}.}P_{({{\mathrm{Im}}}\Psi _{2N-1})^\perp }= \mathrm {Id}_{\mathcal {H}}-\Psi _{2N-1}(\Psi _{2N-1}^*\Psi _{2N-1})^{-1}\Psi _{2N-1}^*. \end{aligned}$$

Observing that $P_{({{\mathrm{Im}}}\Gamma _{tz})^\perp }=P_{({{\mathrm{Im}}}\Psi _{tz})^\perp }$, we immediately obtain from the previous Lemma that $\Pi _{tz}=\Pi _{2N-1}+O\mathopen {}\left( t\right) $.

By construction, $\Pi _{tz}\Phi _{tz}=\Pi _{tz}\Phi _{tz}'=0$, but the following proposition shows that this quantity is also small if we replace $\Phi _{tz}$ with $\Phi _{tz}''$.

Lemma 2

There exists a constant $L_1>0$ (which only depends on $\varphi $, $a_{0}$ and $z_0$) such that

$$\begin{aligned} \left\| \Pi _{tz}\Phi _{tz}'' \right\| _{\mathcal {H}}\leqslant L_1t^{2N-2} \end{aligned}$$

uniformly in $z\in \overline{\mathcal {B}}_{z_0}$.

Proof

Applying a Taylor expansion to $\varphi ^{(2)}$, we write

$$\begin{aligned} \Phi _{tz}''= \Psi _{2N-1}\tilde{V}_{tz} +t^{2N-2}\tilde{\Lambda }_{t,z} \end{aligned}$$

where

$$\begin{aligned} \tilde{V}_{tz}&= \begin{pmatrix} 0 &{}\quad \ldots &{}\quad 0\\ 0&{}\quad \ldots &{}\quad 0\\ 1&{}\quad \ldots &{}\quad 1\\ \vdots &{}\quad &{}\quad \vdots \\ \frac{(tz_1)^{2N-3}}{(2N-3)!}&{}\quad \ldots &{}\quad \frac{(tz_N)^{2N-3}}{(2N-3)!} \end{pmatrix}, \end{aligned}$$

(29)

$$\begin{aligned} \tilde{\Lambda }_{t,z}&= \begin{pmatrix} (z_i)^{2N-2}\int _{0}^1 \varphi ^{(2N)}(stz_i)\frac{(1-s)^{2N-3}}{(2N-3)!} \mathrm {d}s \end{pmatrix}_{1\leqslant i\leqslant N}. \end{aligned}$$

Hence

$$\begin{aligned} \Pi _{tz}\Phi _{tz}''&= \Pi _{tz}(\Psi _{tz}\tilde{V}_{tz} + (\Psi _{N-1}-\Psi _{tz})\tilde{V}_{tz} +t^{2N-2}\tilde{\Lambda }_{t,z})\\&=\Pi _{tz} (-\Lambda _{t,z}D_tH_{tz}^{-1}\tilde{V}_{tz} +t^{2N-2}\tilde{\Lambda }_{t,z}) \text{ since } \Pi _{tz}\Psi _{tz}=0. \end{aligned}$$

Using (25), we see that

$$\begin{aligned} D_tH_{tz}^{-1}\tilde{V}_{tz}&=H_{z}^{-1}{{\mathrm{diag}}}(t^{2N},t^{2N-1},\ldots ,t)\tilde{V}_{tz}\\&=t^{2N-2}H_{z}^{-1}\tilde{V}_{z}, \quad \text{ hence }\\ \Pi _{tz}\Phi _{tz}''&= t^{2N-2}\Pi _{tz} (-\Lambda _{t,z}H_{z}^{-1}\tilde{V}_{z}+\tilde{\Lambda }_{t,z}). \end{aligned}$$

Since $\left\| \Pi _{tz} \right\| \leqslant 1$ and the continuous function $(z,t)\mapsto -\Lambda _{t,z}H_{z}^{-1}\tilde{V}_{z}+\tilde{\Lambda }_{t,z}$ is uniformly bounded on the compact set $\overline{\mathcal {B}}_{z_0}\times [0,1]$, we obtain

$$\begin{aligned} \left\| \Pi _{tz}\Phi _{tz}'' \right\| _{\mathcal {H}}\leqslant \left( \sup _{(z',t')\in \overline{\mathcal {B}}_{z_0}\times [0,1]} \left\| \Lambda _{t',z'}H_{z'}^{-1}\tilde{V}_{z'}+\tilde{\Lambda }_{t',z'} \right\| _{\mathcal {H}}\right) t^{2N-2} \end{aligned}$$

$\square $

We study further the projector $\Pi _{tz}$ when it is not evaluated at the same $z$ as $\Gamma _{tz}$.

Lemma 3

If $\varphi \in \text{ KER }^{2N+1}$, then there is a constant $L_2>0$ (which only depends on $\varphi $, $a_{0}$ and $z_0$) such that for all $z\in \overline{\mathcal {B}}_{z_0}$, all $t\in (0,1]$,

$$\begin{aligned} \left\| \Pi _{tz}\Gamma _{tz_0}\begin{pmatrix}a_{0}\\ 0\end{pmatrix} \right\| _{\mathcal {H}}\leqslant L_2t^{2N}\left|z-z_0\right|_\infty . \end{aligned}$$

Proof

Let us observe that

$$\begin{aligned} \Pi _{tz}\Gamma _{tz_0}&= \Pi _{tz}(\Psi _{2N-1}H_{tz_0}+\Lambda _{t,z_0}D_t)\\&=\Pi _{tz}(\Psi _{tz} H_{tz_0}+ (\Psi _{2N-1}-\Psi _{tz})H_{tz_0}+\Lambda _{t,z_0}D_t)\\&=\Pi _{tz}(-\Lambda _{t,z}D_tH_{tz}^{-1}H_{tz_0}+\Lambda _{t,z_0}D_t) \text{ since } \Pi _{tz}\Psi _{tz}=0. \end{aligned}$$

Observing that

$$\begin{aligned} D_tH_{tz}^{-1}H_{tz_0}= t^{2N}H_{z}^{-1}H_{z_0}{{\mathrm{diag}}}\left( 1, \ldots , 1, 1/t, \ldots , 1/t \right) =H_{z}^{-1}H_{z_0}D_t, \end{aligned}$$

we get

$$\begin{aligned} \Pi _{tz}\Gamma _{tz_0}= \Pi _{tz}\left( \Lambda _{t,z_0}(\mathrm {Id}_{2N}-H_{z}^{-1}H_{z_0}) + (\Lambda _{t,z_0}-\Lambda _{t,z})H_{z}^{-1}H_{z_0}\right) D_t. \end{aligned}$$

For $k\in \{2N-1,2N\}$, the function $(u,s,t)\mapsto u^k\varphi ^{(2N)}(stu)$ is defined and $\mathscr {C}^{1}$ on the compact sets (since $\varphi \in \text{ KER }^{2N+1}$)

$$\begin{aligned} \forall \,1\leqslant i\leqslant N, \quad \left[ {z_0}_i-\frac{\Delta _0}{4},{z_0}_i+\frac{\Delta _0}{4}\right] \times [0,1]\times [0,1], \end{aligned}$$

where $\Delta _0$ is defined in (5). Thus there is a constant $C>0$ (which does not depend on t nor $z\in \overline{\mathcal {B}}_{z_0}$) such that

$$\begin{aligned} \left| \int _0^1 \left( (z_i)^{k}\varphi ^{(2N)}(stz_i)- ({z_0}_i)^{k}\varphi ^{(2N)}(st{z_0}_i)\right) \frac{(1-s)^{k-1}}{(k-1)!} \mathrm {d}s\right|&\leqslant C\left|z_i-{z_0}_i\right|,\\ \text{ hence } \quad \quad \left\| \Lambda _{t,z_0}-\Lambda _{t,z} \right\|&\leqslant C\left|z-z_0\right|_\infty . \end{aligned}$$

As a result, since $\left\| \Pi _{tz} \right\| \leqslant 1$ and $z\mapsto H_{z}^{-1}H_{z_0}$ is bounded on $\overline{\mathcal {B}}_{z_0}$,

$$\begin{aligned} \left\| \Pi _{tz} (\Lambda _{tz_0}-\Lambda _{t,z})H_{z}^{-1}H_{z_0} \right\| _{\mathcal {H}}\leqslant C\sup _{z'\in \overline{\mathcal {B}}_{z_0}}\left\| H_{z'}^{-1}H_{z_0} \right\| \left|z-z_0\right|_\infty . \end{aligned}$$

As for the left term, $\Lambda _{tz_0}$ is bounded uniformly in $t\in [0,1]$, and the mapping $z\mapsto H_{z}^{-1}H_{z_0}$ is $\mathscr {C}^{1}$ on $\overline{\mathcal {B}}_{z_0}$. As a result, there is a constant $\tilde{C}>0$ such that

$$\begin{aligned} \forall z\in \overline{\mathcal {B}}_{z_0},\quad \left\| \mathrm {Id}_N-H_{z}^{-1}H_{z_0} \right\| \leqslant \tilde{C}\left|z-z_0\right|_\infty . \end{aligned}$$

To conclude, we observe that $D_t\begin{pmatrix}a_{0}\\ 0\end{pmatrix}=t^{2N}\begin{pmatrix}a_{0}\\ 0\end{pmatrix}$, and we combine the above inequalities to obtain

$$\begin{aligned} \left\| \Pi _{tz}\Gamma _{tz_0}\begin{pmatrix}a_{0}\\ 0\end{pmatrix} \right\| _{\mathcal {H}}\leqslant \left( C\sup _{z'\in \overline{\mathcal {B}}_{z_0}}\left\| H_{z'}^{-1}H_{z_0} \right\| +\tilde{C}\sup _{t\in [0,1]}\left\| \Lambda _{tz_0} \right\| \right) t^{2N}\left|z-z_0\right|_\infty . \end{aligned}$$

$\square $

3.3 Asymptotics of the Vanishing Derivatives Precertificate

We end this section devoted to the asymptotic behavior of quantities related to $\Gamma _{tz}$ by studying the second derivative of the vanishing derivatives precertificate $\eta _{V,t}$ (see Definition 2, and [14] for more details). Theorem 1 ensures that the second derivatives of $\eta _{V,t}$ do not vanish at $z_{0,i}$, $1\leqslant i\leqslant N$. However, it does not provide any estimation of those second derivatives. That is the purpose of the next proposition.

In view of Sect. 4, it will be useful to study those second derivatives not only for the precertificates that are defined by interpolating the sign at $tz_0$ but more generally for the precertificates that are defined to interpolate the sign at tz for any $z\in \overline{\mathcal {B}}_{z_0}$.

Proposition 8

Assume that $\varphi \in \text{ KER }^{2N+1}$ and that $\mathcal {I}_{2N-1}$ holds. Then

$$\begin{aligned} {\Phi _{tz}''}^*\Gamma _{tz}^{+,*}\begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix}&= t^{2N-2}\eta _W^{(2N)}(0)d_z+O\mathopen {}\left( t^{2N-1}\right) ,\end{aligned}$$

(30)

$$\begin{aligned} \quad \text {where} \quad d_z\in \mathbb {R}^N, \quad d_{z,i}\mathop = \limits ^{\mathrm{def}.}&\frac{2}{(2N)!} \prod _{j\ne i}(z_i-z_j)^2 \ \text{ for }\ 1\leqslant i\leqslant N. \end{aligned}$$

(31)

Proof

We proceed as in the proof of Lemma 2 by writing $\Phi _{tz}''= \Psi _{2N-1}\tilde{V}_{tz} +t^{2N-2}\tilde{\Lambda }_{t,z}$ [see (29)] and $\Psi _{tz}=\Gamma _{tz}H_{tz}^{-1}$. We obtain

$$\begin{aligned} \Phi _{tz}''&=\Psi _{tz}\tilde{V}_{tz}+t^{2N-2}\tilde{\Lambda }_{t,z}-t^{2N-2}\Lambda _{t,z}H_{z}^{-1}\tilde{V}_{z}. \end{aligned}$$

The first term yields

$$\begin{aligned} \tilde{V}_{tz}{}^* \Psi _{tz}^* \Gamma _{tz}^{+,*}\begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix} \!= \tilde{V}_{tz}{}^*\Psi _{tz}^*\Psi _{tz}(\Psi _{tz}^*\Psi _{tz})^{-1}H_{tz}^{-1,*}\begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix} \!= \tilde{V}_{tz}{}^* \delta _{2N} = 0.\qquad \end{aligned}$$

(32)

As for the second term, we take the Taylor expansion a little further (using integration by parts),

$$\begin{aligned} \int _0^1 \varphi ^{(2N)}(stz_i)\frac{(1-s)^{2N-3}}{(2N-3)!} \mathrm {d}s&= \frac{\varphi _{2N}}{(2N-2)!} \\&\quad + tz_i\int _0^1\varphi ^{(2N+1)}(stz_i)\frac{(1-s)^{2N-2}}{(2N-2)!} \mathrm {d}s, \end{aligned}$$

so as to obtain

$$\begin{aligned} \tilde{\Lambda }_{t,z} = \left( \varphi _{2N}, \ldots , \varphi _{2N} \right) {{\mathrm{diag}}}(e_z) + O\mathopen {}\left( t\right) , \ \text{ where } \ e_z\mathop = \limits ^{\mathrm{def}.}\left( \frac{(z_i)^{2N-2}}{(2N-2)!} \right) _{1\leqslant i\leqslant N} \in \mathbb {R}^N \end{aligned}$$

and, as usual, $O\mathopen {}\left( t\right) $ is uniform in $z\in \overline{\mathcal {B}}_{z_0}$. From Lemma 1, we also know that $\Gamma _{tz}^{+,*}\begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix}=p_{W}+O(t)$, hence

$$\begin{aligned} \tilde{\Lambda }_{t,z}^*\Gamma _{tz}^{+,*}\begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix} = {{\mathrm{diag}}}(e_z) \begin{pmatrix}\left\langle \varphi _{2N}, p_{W}\right\rangle _{\mathcal {H}} \\ \vdots \\ \left\langle \varphi _{2N}, p_{W}\right\rangle _{\mathcal {H}} \end{pmatrix} + O\mathopen {}\left( t\right) = \eta _W^{(2N)}(0) e_z+O\mathopen {}\left( t\right) .\qquad \end{aligned}$$

(33)

Now, we proceed with the last term. Similarly, by integration by parts,

$$\begin{aligned} \Lambda _{t,z}&= \begin{pmatrix}\varphi _{2N}&\quad \!\ldots&\quad \!\varphi _{2N}\end{pmatrix} {{\mathrm{diag}}}(f_z) + O(t). \end{aligned}$$

where $f_z$ is defined by

$$\begin{aligned} f_z\mathop = \limits ^{\mathrm{def}.}\left( \frac{(z_1)^{2N}}{(2N)!},\ldots ,\frac{(z_N)^{2N}}{(2N)!}, \frac{(z_1)^{2N-1}}{(2N-1)!},\ldots , \frac{(z_N)^{2N-1}}{(2N-1)!}\right) \in \mathbb {R}^{2N}. \end{aligned}$$

(34)

Hence,

$$\begin{aligned} \Lambda _{t,z}^*\Gamma _{tz}^{+,*}\begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix}&= {{\mathrm{diag}}}(f_z) \begin{pmatrix} \left\langle \varphi _{2N}, p_{W}\right\rangle _{\mathcal {H}}\\ \vdots \\ \left\langle \varphi _{2N}, p_{W}\right\rangle _{\mathcal {H}} \end{pmatrix}+O(t) = \eta _W^{(2N)}(0) f_z+O(t). \end{aligned}$$

To conclude, we study $\tilde{V}_{z}{}^*H_{z}^{-1,*}$ (which is uniformly bounded on $\overline{\mathcal {B}}_{z_0}$). In $\mathbb {R}_{2N-1}[X]$ endowed with the basis $\left( 1,X,\ldots , \frac{X^{2N-1}}{(2N-1)!}\right) $, $H_{z}^{*}$ is the matrix of the linear map which evaluates a polynomial and its derivatives at $\{z_1,\ldots ,z_N\}$. On the other hand $\tilde{V}_{z}{}^*$ represents the evaluation of the second derivative at $\{z_1,\ldots ,z_N\}$. Thus,

$$\begin{aligned} \tilde{V}_{z}{}^*H_{z}^{-1,*} f_z= (P''(z_i))_{1\leqslant i\leqslant N}, \end{aligned}$$

where P is the unique polynomial in $\mathbb {R}_{2N-1}[X]$ which satisfies

$$\begin{aligned} \forall \,i = 1, \ldots , N, \quad P(z_i)=\frac{(z_i)^{2N}}{(2N)!} \quad \text {and} \quad P'(z_i)=\frac{(z_i)^{2N-1}}{(2N-1)!}. \end{aligned}$$

One may check that

$$\begin{aligned} P(X)&=\frac{X^{2N}}{(2N)!}- \frac{1}{(2N)!}\prod _{i=1}^N (X-z_i)^2\\ \quad \text {and} \quad P''(z_i)&=\frac{z_i^{2N-2}}{(2N-2)!}-\frac{2}{(2N)!}\prod _{j\ne i}(z_i-z_j)^2. \end{aligned}$$

As a result,

$$\begin{aligned} -\tilde{V}_{z}{}^*H_{z}^{-1,*}\Lambda _{t,z}^*\Gamma _{tz}^{+,*}\begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix} = \eta _W^{(2N)}(0)( d_z-e_z) +O\mathopen {}\left( t\right) , \end{aligned}$$

(35)

where $O\mathopen {}\left( t\right) $ is uniform in $z\in \overline{\mathcal {B}}_{z_0}$. We obtain the claimed result by summing (32), (33) and (35). $\square $

4 Building a Candidate Solution

Now that the technical issues regarding the asymptotic behavior of $\Gamma _{tz}$ have been settled, we are ready to tackle the study of the BLASSO. In this section, we build a candidate solution for $\mathcal {P}_{\lambda }(y_t+w)$ by relying on its optimality conditions.

4.1 First Order Optimality Conditions

The optimality conditions for $\mathcal {P}_{\lambda }(y_t+w)$ (see [14]) state that any measure of the form $m_{a,tz}=\sum _{i=1}^N a_i \delta _{tz_i}$ is a solution to $\mathcal {P}_{\lambda }(y_t+w)$ if and only if the function defined by $\eta _{\lambda ,t}\mathop = \limits ^{\mathrm{def}.}\Phi ^* p_{\lambda ,t}$ with $p_{\lambda ,t}\mathop = \limits ^{\mathrm{def}.}\frac{1}{\lambda } \left( \Phi _{tz_0}a_{0}+w-\Phi _{tz}a \right) $ satisfies $\left\| \eta _{\lambda ,t} \right\| _{\text {L}^\infty (X)}\leqslant 1$ and $\eta _{\lambda ,t}(tz_i)={{\mathrm{sign}}}(a_i)$ for all $1\leqslant i\leqslant N$.

Observe that we must have $\eta _{\lambda ,t}'(tz_i)=0$ for all $1\leqslant i\leqslant N$. Moreover, in our case, since we assume that $a_{0,i}>0$ we have in fact $\eta _{\lambda ,t}(tz_i)=1$.

In order to build such a function $\eta _{\lambda ,t}$, let us consider the function $f_t$ defined for some fixed $t>0$ on $\left( \mathbb {R}^N \right) ^2\times \mathbb {R}\times \mathcal {H}$ by

$$\begin{aligned} f_t(u,v) \mathop = \limits ^{\mathrm{def}.}\Gamma _{tz}^*\left( \Phi _{tz}a-\Phi _{tz_0}a_{0}-w \right) +\lambda \begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix} \end{aligned}$$

(36)

$$\begin{aligned} \quad \text {where} \quad u=(a,z) \quad \text {and} \quad v=(\lambda ,w). \end{aligned}$$

(37)

Now, let us write $u_0\mathop = \limits ^{\mathrm{def}.}(a_{0},z_0)$. Notice that $m_{a,tz}$ is a solution to $\mathcal {P}_{\lambda }(y_t+w)$ if and only if $f_t(u,v)=0$ and $\left\| \eta _{\lambda ,t} \right\| _{\text {L}^\infty (X)}\leqslant 1$. Our strategy is therefore to construct solutions of $f_t(u,v)=0$ and to prove that $\left\| \eta _{\lambda ,t} \right\| _{\text {L}^\infty (X)}\leqslant 1$ provided $(\lambda ,w)$ and $\eta _W$ satisfy certain properties. More precisely we start by parametrizing the solutions of $f_t(u,v)=0$, in a neighborhood of $(u_0,0)$, using the Implicit Function Theorem.

The following Lemma 4 (whose proof is omitted and corresponds to simple computations) shows that $f_t$ is smooth and gives its derivatives.

Lemma 4

If $\varphi \in \text{ KER }^{k+1}$ for some $k\in \mathbb {N}^*$ then $f_t$ is of class $\mathscr {C}^{k}$ and for all $(u,v)\in \left( \mathbb {R}^N \right) ^2\times (\mathbb {R}\times \mathcal {H})$

$$\begin{aligned} \partial _{u} f_t(u,v)&= \Gamma _{tz}^*\Gamma _{tz}J_{ta}+t\begin{pmatrix} 0 &{}\quad \! {{\mathrm{diag}}}(\Phi _{tz}'^*(\Phi _{tz}a-\Phi _{tz_0}a_{0}- w)) \\ 0 &{}\quad \! {{\mathrm{diag}}}(\Phi _{tz}''^*(\Phi _{tz}a-\Phi _{tz_0}a_{0}- w)) \end{pmatrix}\\ \partial _{v} f_t(u,v)&= \left( \begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix},-\Gamma _{tz}^*\right) \end{aligned}$$

where $J_{ta}\mathop = \limits ^{\mathrm{def}.}\begin{pmatrix}\mathrm {Id}_N \quad \!&{}\quad \! 0 \\ 0 &{}\quad \! t{{\mathrm{diag}}}(a)\end{pmatrix}$.

4.2 Implicit Function Theorem

Suppose that $\mathcal {I}_{2N-1}$ holds and $\varphi \in \text{ KER }^{2N+1}$. By the results of Sect. 3, there exists $0<t_0<1$ such that for $0<t<t_0$ and all $z\in \overline{\mathcal {B}}_{z_0}$, $\Gamma _{tz}^*\Gamma _{tz}$ is invertible. In the following we shall consider a fixed value of such $t_0$ provided by Lemma 5 below which also ensures additional properties.

Now, let $t\in (0,t_0)$ be fixed. By Lemma 4, $f_t$ is $\mathscr {C}^{2N}$, $\partial _{u} f_t(u_0,0)=\Gamma _{tz_0}^*\Gamma _{tz_0}J_{ta_{0}}$ is invertible and $f_t(u_0,0)=0$. Hence by the Implicit Function Theorem, there exists $V_t$ a neighborhood of 0 in $\mathbb {R}\times \mathcal {H}$, $U_t$ a neighborhood of $u_0$ in $\left( \mathbb {R}^N \right) ^2$ and $g_t:V_t\rightarrow U_t$ a $\mathscr {C}^{2N}$ function such that

$$\begin{aligned} \forall (u,v)\in U_t\times V_t, \quad f_t(u,v)=0 \quad \Longleftrightarrow \quad u=g_t(v). \end{aligned}$$

Moreover, denoting $\mathrm {d}g_t$ the differential of $g_t$, we have

$$\begin{aligned} \forall \,v\in V_t, \quad \mathrm {d}g_t(v) = -\left( \partial _{u} f_t(g_t(v),v) \right) ^{-1} \partial _{v} f_t(g_t(v),v). \end{aligned}$$

4.3 Extension of the Implicit Function $g_t$

Our goal is to prove that $m_{a,tz}$ is the solution of the BLASSO, where $(a,z)=u=g_t(v)$. To this end, we shall exhibit additional constraints on $v\in V_t$, such as the scaling of the noise $\left\| w \right\| _{\mathcal {H}}$ or $\lambda $ with respect to t, in order to ensure that $\left\| \eta _{\lambda ,t} \right\| _{\text {L}^\infty (X)}\leqslant 1$. However, the size of the neighborhood $V_t$ provided by the Implicit Function Theorem is a priori unknown, and it might implicitly impose even stronger conditions on $\lambda $ and w as $t\rightarrow 0^+$.

Hence, before studying whether $\left\| \eta _{\lambda ,t} \right\| _{\text {L}^\infty (X)}\leqslant 1$, we show in this section that we may replace $V_t$ with some ball with radius of order $t^{2N-1}$ and still have a parametrization of the form $u=g_t(v)$ satisfying $f_t(g_t(v),v)=0$ where $f_t$ is defined in (36).

Let $V_t^*=\bigcup _{V\in \mathcal {V}} V$, where $\mathcal {V}$ is the collection of all open sets $V\subset \mathbb {R}\times \mathcal {H}$ such that

$0\in V$,
V is star-shaped with respect to 0,
$V\subset \mathrm {B}\mathopen {}\left( 0,C_Tt^{2N-2}\right) $, where $C_T>0$ is a constant defined by Lemma 5 below,
there exists a $\mathscr {C}^{2N}$ function $g:V\rightarrow (\mathbb {R}^N)^2$ such that $g(0)=u_0$ and $f_t(g(v),v)=0$ for all $v\in V$,
$g(V)\subset \mathcal {B}_{a_{0}}\times \mathcal {B}_{z_0}$.

Observe that $\mathcal {V}$ is nonempty (by the Implicit Function Theorem in Sect. 4.2) and stable by union, so that $V_t^*\in \mathcal {V}$. Indeed, all the properties defining $\mathcal {V}$ are easy to check except possibly the last two. Let $V, \tilde{V}\in \mathcal {V}$ and $g,\tilde{g}$ be corresponding functions. The set $ \left\{ v\in V\cap \tilde{V} \;;\; g(v)=\tilde{g}(v) \right\} $ is nonempty (because $g(0)=u_0=\tilde{g}(0)$) and closed in $V\cap \tilde{V}$. Moreover, it is open since for any $v\in V\cap \tilde{V}$, $v\in \mathrm {B}\mathopen {}\left( 0,C_Tt^{2N-2}\right) $ and, by Lemma 5 below, the Implicit Function Theorem applies at (g(v), v), yielding an open neighborhood in which g and $\tilde{g}$ coincide. By connectedness of $V\cap \tilde{V}$, g and $\tilde{g}$ coincide in the whole set $V\cap \tilde{V}$. As a result, the function $g_t^*: V_t^*\rightarrow (\mathbb {R}^N)^2$, defined by

$$\begin{aligned} g_t^*(v)\mathop = \limits ^{\mathrm{def}.}g(v) \text{ if } v\in V, V\in \mathcal {V}, \text{ and } g \text{ is } \text{ a } \text{ corresponding } \text{ function }, \end{aligned}$$

(38)

is well defined. Moreover, $g_t^*$ is $\mathscr {C}^{2N}$ and $g_t^*(V_t^*)\subset \mathcal {B}_{a_{0}}\times \mathcal {B}_{z_0}$.

Before proving that $V_t^*$ contains a ball of radius of order $t^{2N-1}$ and studying the variations of $g_t^*$, we state Lemma 5 mentioned above.

Lemma 5

If $\mathcal {I}_{2N-1}$ holds and $\varphi \in \text{ KER }^{2N+1}$ then there exists $t_0\in (0,1)$ and $C_T>0$ (which only depend on $\varphi $ and $u_0$) such that for all $t\in (0,t_0)$, if

$$\begin{aligned} u=(a,z)\in \overline{\mathcal {B}}_{a_{0}}\times \overline{\mathcal {B}}_{z_0} \quad \text {and} \quad v=(\lambda ,w)\in \mathrm {B}\mathopen {}\left( 0,C_Tt^{2N-2}\right) , \end{aligned}$$

(39)

then the matrix

$$\begin{aligned}&G_{tz}(\lambda ,w)\mathop = \limits ^{\mathrm{def}.}\Psi _{tz}^*\Psi _{tz}+t H_{tz}^{*,-1} F_{t z}J_{ta}^{-1} H_{tz}^{-1}, \end{aligned}$$

(40)

$$\begin{aligned}&\quad \text {where} \quad F_{t z}\mathop = \limits ^{\mathrm{def}.}\begin{pmatrix} 0 &{}\quad \! 0 \\ 0 &{}\quad \! -{{\mathrm{diag}}}\left( \Phi _{tz}''^* q_{tz} \right) \end{pmatrix},\end{aligned}$$

(41)

$$\begin{aligned}&\quad \text {and} \quad q_{tz} \mathop = \limits ^{\mathrm{def}.}\lambda \Gamma _{tz}^{*,+}\begin{pmatrix}\mathbbm {1}_N\\ 0\end{pmatrix}+\Pi _{tz} w+\Pi _{tz} \Gamma _{tz_0}\begin{pmatrix}a_{0}\\ 0\end{pmatrix}, \end{aligned}$$

(42)

is invertible and the norm of its inverse is less than $3\left\| (\Psi _{2N-1}^*\Psi _{2N-1})^{-1} \right\| $.

If, moreover, $f_t(u,v)=0$, then

$$\begin{aligned} \partial _{u} f_t(u,v)=H_{tz}^* G_{tz}(\lambda ,w)H_{tz}J_{ta}\end{aligned}$$

and this is an invertible matrix.

Let us precise that by $(\lambda ,w)\in \mathrm {B}\mathopen {}\left( 0,C_Tt^{2N-2}\right) $, we mean that $\left|\lambda \right|<C_Tt^{2N-2}$ and $\left\| w \right\| _{\mathcal {H}}<C_Tt^{2N-2}$.

Proof

We consider $t_0\in (0,1)$ small enough so that for $0<t<t_0$ and all $z\in \overline{\mathcal {B}}_{z_0}$, $\Gamma _{tz}^*\Gamma _{tz}$ is invertible and

$$\begin{aligned} \left\| (\Psi _{tz}^*\Psi _{tz})^{-1} \right\|&\leqslant 2 \left\| (\Psi _{2N-1}^*\Psi _{2N-1})^{-1} \right\| \text{ by } \text{ Lemma } \text{1 },\end{aligned}$$

(43)

$$\begin{aligned} \left|\lambda \Phi _{tz}''^*\Gamma _{tz}^{*,+}\begin{pmatrix}\mathbbm {1}_N\\ 0\end{pmatrix}\right|_\infty&\leqslant \frac{4(2R_W)^{2N}}{(2N)!} \left|\lambda \eta _W^{(2N)}(0)\right| t^{2N-2} \text{ by } \text{ Proposition } \text{8 }. \end{aligned}$$

(44)

In the last equation, we have used the fact that $\left|\prod _{j\ne i}(z_i-z_j)^2\right|\leqslant (2R_W)^{2N}$ (where $R_W=\sup \{\left|z\right|_\infty ;z\in \overline{\mathcal {B}}_{z_0}\}$ is defined in Eq. (19)).

We also know that for some constants $L_1,L_2>0$ which only depend on $\varphi $ and $u_0$,

$$\begin{aligned} \left|\Phi _{tz}''^*\Pi _{tz} w\right|_\infty&\leqslant \left\| w \right\| _{\mathcal {H}} L_1t^{2N-2} \text{ by } \text{ Lemma } \text{2 },\\ \text{ and } \left|\Phi _{tz}''^*\Pi _{tz} \Gamma _{tz_0}\begin{pmatrix}a_0\\ 0\end{pmatrix}\right|_\infty&\leqslant L_1L_2\frac{\Delta _{0}}{4} t^{4N-2} \text{ by } \text{ Lemmas } \text{2 } \text{ and } \text{3 }, \end{aligned}$$

for all $z\in \overline{\mathcal {B}}_{z_0}=\overline{\mathrm {B}}\mathopen {}\left( z_0,\frac{\Delta _{0}}{4}\right) $, $t\in (0,t_0)$, where $\Delta _{0}$ is defined in (5).

Combining those inequalities with (44), we see that

$$\begin{aligned} \left|\Phi _{tz}''^*q_{tz}\right|_\infty&= \left|\lambda \Phi _{tz}''^*\Gamma _{tz}^{*,+}\begin{pmatrix}\mathbbm {1}_N\\ 0\end{pmatrix} + \Phi _{tz}''^*\Pi _{tz} w + \Phi _{tz}''^*\Pi _{tz} \Gamma _{tz_0}\begin{pmatrix}a_0\\ 0\end{pmatrix} \right|_\infty \\&\leqslant \left|\lambda \right|\frac{4}{(2N)!} \left|\eta _W^{(2N)}(0)\right| t^{2N-2} + \left\| w \right\| _{\mathcal {H}} L_1t^{2N-2} + L_1L_2\frac{\Delta _{0}}{4} t^{4N-2} \end{aligned}$$

On the other hand, since

$$\begin{aligned} H_{tz}^{*,-1}&= {{\mathrm{diag}}}\left( 1,\ldots , 1/t^{2N-1} \right) H_{z}^{*,-1} {{\mathrm{diag}}}\left( 1,\ldots ,1,t,\ldots ,t \right) ,\\ \quad \text {and} \quad F_{t z}&= \begin{pmatrix} 0 &{}\quad \! 0 \\ 0 &{}\quad \! -{{\mathrm{diag}}}\left( \Phi _{tz}''^* q_{tz} \right) \end{pmatrix}, \end{aligned}$$

we get

$$\begin{aligned} H_{tz}^{*,-1} F_{t z}J_{ta}^{-1} H_{tz}^{-1}&=t{{\mathrm{diag}}}\left( {\scriptstyle 1,\ldots ,\frac{1}{t^{2N-1}}}\right) H_{z}^{*,-1} F_{t z}\\&J_{a}^{-1}H_{z}^{-1}{{\mathrm{diag}}}\left( {\scriptstyle 1,\ldots ,\frac{1}{t^{2N-1}}}\right) \end{aligned}$$

so that

$$\begin{aligned} \left\| tH_{tz}^{*,-1} F_{t z}J_{ta}^{-1} H_{tz}^{-1} \right\|&\leqslant \frac{\left|a^{-1}\right|_\infty }{t^{4N-4}} \left\| H_{z}^{*,-1} \right\| \left\| H_{z}^{-1} \right\| \left|\Phi _{tz}''^*q_{tz}\right|_\infty \\&\leqslant C\left( \frac{\left|\lambda \right|}{t^{2N-2}}\frac{4(2R)^{2N}}{(2N)!} \left|\eta _W^{(2N)}(0)\right|\right. \\&{} \quad + \left. \frac{\left\| w \right\| _{\mathcal {H}}}{t^{2N-2}}L_1+ L_2L_1\frac{\Delta _{0}}{4} t^{2}\right) \end{aligned}$$

with $C=\sup _{(a,z)\in \overline{\mathcal {B}}_{a_{0}}\times \overline{\mathcal {B}}_{z_0}}\left\| H_{z}^{*,-1} \right\| \left\| H_{z}^{-1} \right\| \left|a^{-1}\right|_\infty $.

Possibly choosing $t_0$ a bit smaller, we may assume that

$$\begin{aligned} 0\leqslant CL_2L_1\frac{\Delta _{0}}{4} t_0^{2}<\frac{1}{8\left\| (\Psi _{2N-1}^*\Psi _{2N-1})^{-1} \right\| }. \end{aligned}$$

As a consequence, there exists $C_T>0$ such that for all $t\in (0,t_0)$, and all $(a,z)\in \overline{\mathcal {B}}_{a_{0}}\times \overline{\mathcal {B}}_{z_0}$,

$$\begin{aligned}&\left( \max \left( \frac{\left|\lambda \right|}{t^{2N-2}},\frac{\left\| w \right\| _{\mathcal {H}}}{t^{2N-2}}\right) \leqslant C_T\right) \\&\Longrightarrow \left\| tH_{tz}^{*,-1} F_{t z}J_{ta}^{-1} H_{tz}^{-1} \right\| \leqslant \frac{1}{4\left\| (\Psi _{2N-1}^*\Psi _{2N-1})^{-1} \right\| }. \end{aligned}$$

Then, recalling (43) and setting

$$\begin{aligned} r&\mathop = \limits ^{\mathrm{def}.}\left\| t\left( \Psi _{tz}^*\Psi _{tz} \right) ^{-1}H_{tz}^{*,-1} F_{t z}J_{ta}^{-1} H_{tz}^{-1} \right\| \\&\leqslant 2\left\| (\Psi _{2N-1}^*\Psi _{2N-1})^{-1} \right\| \frac{1}{4\left\| (\Psi _{2N-1}^*\Psi _{2N-1})^{-1} \right\| }=\frac{1}{2}, \end{aligned}$$

we see that the matrix (40) is invertible, and

$$\begin{aligned} \left\| \left( \mathrm {Id}_{2N}+t\left( \Psi _{tz}^*\Psi _{tz} \right) ^{-1}H_{tz}^{*,-1} F_{t z}J_{ta}^{-1} H_{tz}^{-1} \right) ^{-1} \right\| \leqslant \sum _{k=0}^{+\infty }r^k = \frac{1}{1-r} \leqslant \frac{3}{2}. \end{aligned}$$

Eventually, using (43) again, we obtain that the norm of the inverse of (40) is less than $3\left\| (\Psi _{2N-1}^*\Psi _{2N-1})^{-1} \right\| $.

Now, if $f_t(u,v)=0$, then $\Phi _{tz}'^*(\Phi _{tz}a-\Phi _{tz_0}a_{0}- w)=0$, so that thanks to Lemma 4 we obtain

$$\begin{aligned} \partial _{u} f_t(u,v)&= \Gamma _{tz}^*\Gamma _{tz}J_{ta}+t\begin{pmatrix} 0 &{}\quad \! 0 \\ 0 &{}\quad \! {{\mathrm{diag}}}(\Phi _{tz}''^*(\Phi _{tz}a-\Phi _{tz_0}a_{0}- w)) \end{pmatrix}. \end{aligned}$$

Moreover,

$$\begin{aligned} \Phi _{tz}a-\Phi _{tz_0}a_{0}- w&=\Gamma _{tz}\begin{pmatrix}a\\ 0\end{pmatrix}- \Gamma _{tz_0}\begin{pmatrix}a_{0}\\ 0\end{pmatrix}-w \\&= \Gamma _{tz}^{*,+}\Gamma _{tz}^*\Gamma _{tz_0}\begin{pmatrix}a_{0}\\ 0\end{pmatrix}+ \Gamma _{tz}^{*,+}\Gamma _{tz}^*w-\lambda \Gamma _{tz}^{*,+}\begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix} \\&\quad \,- \Gamma _{tz_0}\begin{pmatrix}a_{0}\\ 0\end{pmatrix}-w \\&= - \Pi _{tz} \Gamma _{tz_0}\begin{pmatrix}a_{0}\\ 0\end{pmatrix}- \Pi _{tz} w - \lambda \Gamma _{tz}^{*,+}\begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix} = -q_{tz}. \end{aligned}$$

As a result,

$$\begin{aligned} \partial _{u} f_t(u,v)&= H_{tz}^*\Psi _{tz}^*\Psi _{tz}H_{tz}J_{ta}+ t \begin{pmatrix} 0 &{}\quad \! 0 \\ 0 &{}\quad \! -{{\mathrm{diag}}}\left( \Phi _{tz}''^* q_{tz} \right) \end{pmatrix}\\&= H_{tz}^* G_{tz}(\lambda ,w)H_{tz}J_{ta}, \end{aligned}$$

and $\partial _{u} f_t(u,v)$ is invertible. $\square $

We may now study the variations of $g_t^*$.

Corollary 1

If $\mathcal {I}_{2N-1}$ holds and $\varphi \in \text{ KER }^{2N+1}$ then there exists $M>0$ (which only depends on $\varphi $ and $u_0$), such that for $0<t<t_0$, for all $v\in V_t^*$

$$\begin{aligned} \left\| \mathrm {d}g_t^*(v) \right\| \leqslant \frac{M}{t^{2N-1}}. \end{aligned}$$

Proof

Let us recall that by construction, $V_t^*\subset \mathrm {B}\mathopen {}\left( 0,C_Tt^{2N-2}\right) $. Thus, from Lemma 5, we know that for all $v\in V_t^*$, $\partial _{u} f_t(g_t(v),v) = H_{tz}^* G_{tz}(\lambda ,w)H_{tz}J_{ta}$, where $(a,z)=g_t^*(v)$. Since $\mathrm {d}g_t^*(v)=-\left( \partial _{u} f_t(g_t^*(v),v) \right) ^{-1} \partial _{v} f_t(g_t^*(v),v)$, we get

$$\begin{aligned} \mathrm {d}g_t^*(v)&= -J_{ta}^{-1}H_{tz}^{-1} G_{tz}(\lambda ,w)^{-1} H_{tz}^{*,-1} \left( \begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix}, -H_{tz}^*\Psi _{tz}^* \right) \\&= J_{a}^{-1}H_{z}^{-1} {{\mathrm{diag}}}\left( {\scriptstyle 1,\ldots ,\frac{1}{t^{2N-1}}}\right) G_{tz}(\lambda ,w)^{-1} \left( \delta _{2N}, \Psi _{tz}^* \right) , \end{aligned}$$

Using Lemma 5 and the fact that $J_{a}^{-1},H_{z}^{-1},\Psi _{tz}$ are uniformly bounded on $\overline{\mathcal {B}}_{a_{0}}\times \overline{\mathcal {B}}_{z_0}$, we obtain the claimed upper bound of $\left\| \mathrm {d}g_t^*(v) \right\| $ for all $v\in V_t^*$. $\square $

We are now in position to prove that $V_t^*$ contains a ball of radius of order $t^{2N-1}$.

Proposition 9

If $\mathcal {I}_{2N-1}$ holds and $\varphi \in \text{ KER }^{2N+1}$, there exists $C_R>0$ such that for all $t\in (0,t_0)$,

$$\begin{aligned} \mathrm {B}\mathopen {}\left( 0,C_Rt^{2N-1}\right) \subset V_t^* \quad \text {with} \quad C_R\geqslant \min \left( \frac{\Delta _{0}}{4M},\frac{\min _i(a_{0,i})}{4M},\frac{C_T}{t_0}\right) . \end{aligned}$$

Proof

Let $v\in \mathbb {R}\times \mathcal {H}$ with unit norm (i.e. $\max (\lambda ,\left\| w \right\| _{\mathcal {H}})=1$), and define

$$\begin{aligned} R_v\mathop = \limits ^{\mathrm{def}.}\sup \left\{ r\geqslant 0 \;;\; rv\in V_t^* \right\} . \end{aligned}$$

Clearly $0<R_v\leqslant C_Tt^{2N-2}$. Assume that $R_v< C_Tt^{2N-2}$. Then by Corollary 1, $g_t^*$ is uniformly continuous on $V_t^*$, so that the value of $g_t^*(R_vv)$ can be defined as a limit, and $f_t(g_t^*(R_vv),R_vv)=0$.

By contradiction, if $g_t^*(R_vv)\in \mathcal {B}_{a_{0}}\times \mathcal {B}_{z_0}$, then by Lemma 5, we may apply the Implicit Function Theorem to obtain a neighborhood of $(g_t^*(R_vv),R_v)$ in which $g_t^*$ may be extended. This enables us to construct an open set $V\in \mathcal {V}$ (in particular we may ensure that V is star-shaped with respect to 0) such that $V_t^*\subsetneq V$, which contradicts the maximality of $V_t^*$.

Hence, $g_t^*(R_vv)\in \partial (\mathcal {B}_{a_{0}}\times \mathcal {B}_{z_0})=\left( \partial (\mathcal {B}_{a_{0}})\times \overline{\mathcal {B}}_{z_0}\right) \cup \left( \overline{\mathcal {B}}_{a_{0}}\times \partial (\mathcal {B}_{z_0})\right) $. Assume for instance that $g_t^*(R_vv)\in \overline{\mathcal {B}}_{a_{0}}\times \partial (\mathcal {B}_{z_0})$ (the other case being similar). Then, for $(a,z)=g_t^*(R_vv)$,

$$\begin{aligned} \frac{\Delta _{0}}{4}=\left|z-z_0\right|_\infty \leqslant \int _0^1 \left|\mathrm {d}g_t^*(sR_vv)\cdot R_vv\right|_\infty \mathrm {d}s\leqslant \frac{M}{t^{2N-1}}R_v, \end{aligned}$$

which yields $R_v\geqslant \frac{\Delta _{0}}{4M}t^{2N-1}$. Similarly, if $g_t^*(R_vv)\in \partial (\mathcal {B}_{a_{0}})\times \overline{\mathcal {B}}_{z_0}$, we may prove that $R_v\geqslant \frac{\min _i(a_{0,i})}{4M}t^{2N-1}$.

Eventually, we have proved that for all $v\in \mathbb {R}\times \mathcal {H}$ with unit norm,

$$\begin{aligned} R_v\geqslant \min \left( \frac{\Delta _{0}}{4M}t^{2N-1},\frac{\min _i(a_{0,i})}{4M}t^{2N-1},C_Tt^{2N-2}\right) , \end{aligned}$$

and the claimed result follows. $\square $

4.4 Continuity of $g_t^*$ at 0

Before moving to Sect. 5 and to the proof of $\left\| \eta _{\lambda ,t} \right\| _{\text {L}^\infty (X)}\leqslant 1$ (which ensures that $m_{a,tz}$ is a solution to the BLASSO), we give a first order expansion of our candidate solution $u=g_t^*(v)$ for all $v\in B(0,C_Rt^{2N-1})$.

Proposition 10

If $\mathcal {I}_{2N-1}$ holds and $\varphi \in \text{ KER }^{2N+1}$ then for all $t\in (0,t_0)$, $v\in B(0,C_Rt^{2N-1})$,

$$\begin{aligned} \left\| g_t^*(v)-g_t^*(0) \right\| \leqslant M\left( \frac{|\lambda |}{t^{2N-1}}+\frac{\left\| w \right\| _{\mathcal {H}}}{t^{2N-1}} \right) . \end{aligned}$$

(45)

Proof

To show Eq. (45), it suffices to write

$$\begin{aligned} g_t^*(v)=g_t^*(0)+\int _0^1 \mathrm {d}g_t^*(sv)\cdot v\mathrm {d}s, \end{aligned}$$

and use Corollary 1 to conclude. $\square $

5 Convergence of $\eta _{\lambda ,t}$ to $\eta _W$

It remains to prove that $m_{a,tz}$ where $(a,z)=g_t^*(v)$ is indeed a solution to $\mathcal {P}_{\lambda }(y_t+w)$. To show this statement, we prove that $\eta _{\lambda ,t}$ converges towards $\eta _W$ when $(t,\lambda ,w)\rightarrow 0$ in a well chosen domain. This section is devoted to this result. The proof of our main contribution, Theorem 2, which can be found in Sect. 2.3, uses this convergence result and the assumption that $\eta _W$ is $(2N-1)$-non-degenerate to conclude.

Proposition 11

Assume that $\varphi \in \text{ KER }^{2N+1}$ and that $\mathcal {I}_{2N-1}$ holds, and let $C_W>0$ be the constant defined in Theorem 1, $g_t^*$, $t_0>0$ and $C_R>0$ be the function and constants defined in Sect. 4.

Then there exist constants $t_1\in (0,t_0)$ and $C>0$ (which depend only on $\varphi $ and $u_0$) such that for all $t\in \left( 0,t_1 \right) $ and for all $(\lambda ,w)\in \mathrm {B}\mathopen {}\left( 0,C_Rt^{2N-1}\right) $ with $\left\| \frac{w}{\lambda } \right\| _{\mathcal {H}}\leqslant C$, the following inequalities hold

$$\begin{aligned} \forall \ell \in \{0,\ldots , 2N\},\quad \left\| \eta _{\lambda ,t}^{(\ell )}-\eta _W^{(\ell )} \right\| _{\text {L}^\infty (X)}\leqslant C_W, \end{aligned}$$

with $\eta _{\lambda ,t}=\Phi ^*\left( \frac{1}{\lambda } \left( \Phi _{tz_0}a_{0}+w-\Phi _{tz}a \right) \right) $ and $(a,z)=g_t^*(\lambda ,w)$.

Proof

Let $t\in (0,t_0)$, $v\in \mathrm {B}\mathopen {}\left( 0,C_Rt^{2N-1}\right) $, and $(a,z)=u=g_t^*(v)$. Then, using $f_t(u,v)=0$ [see (36)], we get

$$\begin{aligned} p_{\lambda ,t}&\mathop = \limits ^{\mathrm{def}.}\frac{1}{\lambda } \left( \Phi _{tz_0}a_{0}+w-\Phi _{tz}a \right) \\&=\frac{1}{\lambda }\left( \Gamma _{tz_0}\begin{pmatrix}a_{0}\\ 0\end{pmatrix}+w-\Gamma _{tz}\begin{pmatrix}a\\ 0\end{pmatrix} \right) \\&=\Gamma _{tz}^{*,+}\begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix}+\Pi _{tz} \frac{w}{\lambda } + \frac{1}{\lambda }\Pi _{tz}\Gamma _{tz_0}\begin{pmatrix}a_{0}\\ 0\end{pmatrix}. \end{aligned}$$

Hence,

$$\begin{aligned} \left\| p_{\lambda ,t}-p_{W} \right\| _{\mathcal {H}} \leqslant \left\| \Gamma _{tz}^{*,+}\begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix}-p_{W} \right\| _{\mathcal {H}} + \left\| \Pi _{tz} \frac{w}{\lambda } \right\| _{\mathcal {H}} + \left\| \frac{1}{\lambda }\Pi _{tz}\Gamma _{tz_0}\begin{pmatrix}a_{0}\\ 0\end{pmatrix} \right\| _{\mathcal {H}} \end{aligned}$$

From Lemma 1, there exists $C_V>0$ and $t_V>0$ (which only depends on $\varphi ,u_0$) such that for all $t\in (0,t_V)$ and all $z\in \overline{\mathcal {B}}_{z_0}$,

$$\begin{aligned} \left\| \Gamma _{tz}^{*,+}\begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix}-p_{W} \right\| _{\mathcal {H}}\leqslant C_Vt. \end{aligned}$$

Moreover, since $\Pi _{tz}$ is an orthogonal projector,

$$\begin{aligned} \left\| \Pi _{tz} \frac{w}{\lambda } \right\| _{\mathcal {H}}\leqslant \left\| \frac{w}{\lambda } \right\| _{\mathcal {H}}, \end{aligned}$$

and by Lemma 3, there exists $L_2>0$ (which only depends on $\varphi ,u_0$) such that, for all $t\in (0,t_0)$

$$\begin{aligned} \left\| \frac{1}{\lambda }\Pi _{tz}\Gamma _{tz_0}\begin{pmatrix}a_{0}\\ 0\end{pmatrix} \right\| _{\mathcal {H}}&\leqslant \frac{L_2}{|\lambda |}t^{2N} \left|z-z_0\right|_\infty \\&\leqslant \frac{L_2}{|\lambda |}t^{2N} \frac{M}{t^{2N-1}} (|\lambda |+\left\| w \right\| _{\mathcal {H}}) \text{ by } \text{ Proposition } \text{10 } \\&\leqslant L_2Mt \left( 1+\left\| \frac{w}{\lambda } \right\| _{\mathcal {H}} \right) . \end{aligned}$$

Gathering all these upper-bounds, one obtains

$$\begin{aligned} \left\| p_{\lambda ,t}-p_{W} \right\| _{\mathcal {H}}&\leqslant (C_V+L_2M)t + (1 + L_2Mt) \left\| \frac{w}{\lambda } \right\| _{\mathcal {H}}\\&\leqslant (C_V+L_2M)t + (1+ L_2M) \left\| \frac{w}{\lambda } \right\| _{\mathcal {H}} \end{aligned}$$

Now, denoting by $\Phi ^{(\ell )}: \mathcal {M}(X)\rightarrow \mathcal {H}$ the operator $m\mapsto \int _X\varphi ^{(\ell )}(x)\mathrm {d}m(x)$ and by ${\Phi ^{(\ell )}}^*:\mathcal {H}\rightarrow \mathscr {C}_{X}$ its adjoint (so that $\eta _{\lambda ,t}^{(\ell )}={\Phi ^{(\ell )}}^*p_{\lambda ,t}$ and $\eta _W^{(\ell )}={\Phi ^{(\ell )}}^*p_{W}$), we let

$$\begin{aligned} K \mathop = \limits ^{\mathrm{def}.}\underset{0\leqslant l\leqslant 2N}{\max }\underset{x\in X}{\sup }\left\| \varphi ^{(\ell )}(x) \right\| _{\mathcal {H}}, \end{aligned}$$

which satisfies $K<+\infty $ because $\varphi \in \text{ KER }^{2N+1}$. Then, for all $\ell \in \{0,\ldots , 2N\}$,

$$\begin{aligned} \left\| \eta _{\lambda ,t}^{(\ell )}-\eta _W^{(\ell )} \right\| _{\text {L}^\infty (X)}&\leqslant K \left\| p_{\lambda ,t}-p_{W} \right\| _{\mathcal {H}}\\&\leqslant K\left( (C_V+L_2M)t + (1+ L_2M) \left\| \frac{w}{\lambda } \right\| _{\mathcal {H}}\right) . \end{aligned}$$

As a consequence, by taking t smaller than $\min (t_0,t_V,\frac{C_W}{2K(C_V+L_2M)})$ and for all $(\lambda ,w)\in B(0,C_Rt^{2N-1})$ such that

$$\begin{aligned} (1 + L_2M) \left\| \frac{w}{\lambda } \right\| _{\mathcal {H}}\leqslant \frac{C_W}{2K}, \end{aligned}$$

we get

$$\begin{aligned} \left\| \eta _{\lambda ,t}^{(\ell )}-\eta _W^{(\ell )} \right\| _{\text {L}^\infty (X)} \leqslant C_W. \end{aligned}$$

$\square $

Remark 4

The constants involved in Proposition 11 are

$$\begin{aligned} t_1\mathop = \limits ^{\mathrm{def}.}\min \left( t_0,t_V,\frac{C_W}{2K(C_V+L_2M)}\right) \quad \text {and} \quad C \mathop = \limits ^{\mathrm{def}.}\frac{C_W}{2K(1+L_2M)}. \end{aligned}$$

(46)

They only depend on $\varphi $ and $u_0$.

6 Conclusion

In this paper, we have proposed a detailed analysis of the recovery performance of the BLASSO for positive measures. We have shown that if the signal-to-noise ratio is of the order of $1/t^{2N-1}$, then the BLASSO achieves perfect support estimation. This results nicely matches both Cramer-Rao lower-bounds [2], bounds for combinatorial approaches [10] and practical performances of the MUSIC algorithm [7]. It is also close to the upper-bound obtained by [20] for stable recovery on a grid by the LASSO. We have hence shown that these bounds do not only imply stability, they imply exact support recovery for the BLASSO, if $\eta _W$ is $(2N-1)$-non-degenerate. We have observed numerically that this condition holds for the ideal low-pass filter and that this is the case for a large class of low-pass filters. We have showed that this is true for the Gaussian kernel by providing the expression of $\eta _W$.

References

Bhaskar, B.N., Recht, B.: Atomic norm denoising with applications to line spectral estimation. In: 2011 49th Annual Allerton Conference on Communication, Control, and Computing, pp. 261–268 (2011)
Blu, T., Dragotti, P.L., Vetterli, M., Marziliano, P., Coulot, L.: Sparse sampling of signal innovations: theory, algorithms and performance bounds. IEEE Signal Process. Mag. 25(2), 31–40 (2008)
Article Google Scholar
Bredies, K., Pikkarainen, H.K.: Inverse problems in spaces of measures. ESAIM 19(1), 190–218 (2013)
Candès, E.J., Fernandez-Granda, C.: Super-resolution from noisy data. J. Fourier Anal. Appl. 19(6), 1229–1254 (2013)
Article MathSciNet MATH Google Scholar
Candès, E.J., Fernandez-Granda, C.: Towards a mathematical theory of super-resolution. Commun. Pure Appl. Math. 67(6), 906–956 (2014)
Article MathSciNet MATH Google Scholar
Claerbout, J.F., Muir, F.: Robust modeling with erratic data. Geophysics 38(5), 826–844 (1973)
Article Google Scholar
Condat, L., Hirabayashi, A.: Cadzow denoising upgraded: A new projection method for the recovery of dirac pulses from noisy linear measurements. Preprint hal-00759253 (2013)
de Castro, Y., Gamboa, F.: Exact reconstruction using beurling minimal extrapolation. J. Math. Anal. Appl. 395(1), 336–354 (2012)
Article MathSciNet MATH Google Scholar
de Castro, Y., Azaïs, J.-M., Gamboa, F.: Spike detection from inaccurate samplings. Appl. Comput. Harmon. Anal. 38(2), 177–195 (2015)
Article MathSciNet MATH Google Scholar
Demanet, L., Nguyen, N.: The recoverability limit for superresolution via sparsity (2015). arXiv:1502.01385
Den Dekker, A.J., Van den Bos, A., et al.: Resolution: a survey. JOSA A 14(3), 547–557 (1997)
Article Google Scholar
Donoho, D.L.: Super-resolution via sparsity constraints. SIAM J. Math. Anal. 23(5), 1309–1331 (1992)
Donoho, D.L., Chen, S.S., Saunders, M.A.: Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20(1), 33–61 (1998)
Article MathSciNet MATH Google Scholar
Duval, V., Peyré, G.: Exact support recovery for sparse spikes deconvolution. To appear in Foundation and Computational Mathematics (2015)
Fernandez-Granda, C.: Support detection in super-resolution. In: Proceedings of the 10th International Conference on Sampling Theory and Applications, pp. 145–148 (2013)
Krim, H., Viberg, M.: Two decades of array signal processing research: the parametric approach. IEEE Signal Process. Mag. 13(4), 67–94 (1996)
Article Google Scholar
Lang, S.: Algebra. Graduate Texts in Mathematics. Springer, New York (2002)
Google Scholar
Levy, S., Fullagar, P.: Reconstruction of a sparse spike train from a portion of its spectrum and application to high resolution deconvolution. Geophysics 46(9), 1235–1243 (1981)
Article Google Scholar
Liao, W., Fannjiang, A.: MUSIC for single-snapshot spectral estimation: stability and super-resolution (2014). arXiv:1404.1484
Morgenshtern, V.I., Candès, E.J.: Super-resolution of positive sources: the discrete setup (2014). arXiv:1504.00717
Santosa, F., Symes, W.W.: Linear inversion of band-limited reflection seismograms. SIAM J. Sci. Stat. Comput. 7(4), 1307–1330 (1986)
Article MathSciNet MATH Google Scholar
Schmidt, R.O.: Multiple emitter location and signal parameter estimation. IEEE Trans. Antennas Propag. 34(3), 276–280 (1986)
Article Google Scholar
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B 58, 267–288 (1994)
MathSciNet MATH Google Scholar

Download references

Acknowledgments

We would like to thank Laurent Demanet for his initial questions that motivated us to prove the main result of this paper. We would also like to thank Jean-Marie Mirebeau for stimulating discussions about the intriguing structure of the inverse of checkerboard matrices. This work has been partly supported by the European Research Council (ERC Project SIGMA-Vision).

Author information

Authors and Affiliations

CEREMADE, Univ. Paris-Dauphine, Paris, France
Quentin Denoyelle
INRIA - Mokaplan project-team, Paris, France
Quentin Denoyelle, Vincent Duval & Gabriel Peyré
CNRS & CEREMADE, Univ. Paris-Dauphine, Paris, France
Gabriel Peyré

Authors

Quentin Denoyelle
View author publications
You can also search for this author in PubMed Google Scholar
Vincent Duval
View author publications
You can also search for this author in PubMed Google Scholar
Gabriel Peyré
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Gabriel Peyré.

Additional information

Communicated by Massimo Fornasier.

Appendix 1: Proof of the Results of Sect. 2

1.1 Appendix 1.1: Proof of Proposition 1

The functions $(\varphi _{0},\ldots ,\varphi _{k})$ are linearly independent in $\text {L}^2(\mathbb {T})$ if and only if their respective Fourier coefficients are linearly independent in $\ell ^2(\mathbb {Z})$. If $(c_n[\varphi _{0}])_{n\in \mathbb {Z}}$ denotes the Fourier coefficients of $\varphi _{0}$, the Fourier coefficients of $\varphi _{j}$ are given by $\left( (2\mathrm {i}\pi n )^j c_n[\varphi _{0}]\right) _{n\in \mathbb {Z}}$ (with the convention that $0^0=1$).

If $\varphi _{0}$ has $k+1$ nonzeros Fourier coefficients corresponding to pairwise distinct frequencies $(n_0,\ldots n_k)$, those Fourier coefficients are given by the matrix product

$$\begin{aligned} \begin{pmatrix} c_{n_0}[\varphi _{0}] &{}\quad \! &{}\quad \!0 \\ &{}\quad \!\ddots &{}\quad \!\\ 0 &{}\quad \! &{}\quad \!c_{n_k}[\varphi _{0}] \end{pmatrix} \begin{pmatrix} 1 &{}\quad \! (2\mathrm {i}\pi n_0) &{}\quad \! (2\mathrm {i}\pi n_0)^2 &{}\quad \! \ldots &{}\quad \! (2\mathrm {i}\pi n_0)^{k} \\ 1 &{}\quad \! (2\mathrm {i}\pi n_1) &{}\quad \! (2\mathrm {i}\pi n_1)^2 &{}\quad \! \ldots &{}\quad \! (2\mathrm {i}\pi n_1))^{k} \\ \vdots &{}\quad \! \vdots &{}\quad \! \vdots &{}\quad \! \ddots &{}\quad \! \vdots \\ 1 &{}\quad \! (2\mathrm {i}\pi n_k) &{}\quad \! (2\mathrm {i}\pi n_k)^2 &{}\quad \! \ldots &{}\quad \! (2\mathrm {i}\pi n_k)^{k} \\ \end{pmatrix} \end{aligned}$$

Both the diagonal and the Vandermonde matrices are invertible, hence the family of Fourier coefficients of $(\varphi _{0}, \ldots , \varphi _{k})$ is linearly independent.

Conversely, if $\mathcal {I}_{k}$ holds, one can find $k+1$ Fourier coefficients, corresponding to some frequencies $n_0,\ldots n_k$, such that the matrix $(c_{n_\ell }[\varphi _{j}])_{0\leqslant \ell ,j\leqslant k}$ is invertible. From the above factorization, we deduce that each $c_{n_\ell }[\varphi _{0}]$ must be nonzero for $0\leqslant \ell \leqslant k$.

1.2 Appendix 1.2: Proof of Theorem 1

The proof proceeds in two steps. First we show the result locally around 0 in $X$ thanks to $\eta _W^{(2N)}(0)<0$ (because $\eta _W^{(2N)}(0)\ne 0$ and $|\eta _W|<1$ on $X\setminus \{0\}$) and then we extend the result to all $X$ thanks to $|\eta _W|<1$ on $X\setminus \{0\}$.

1.2.1 Locally

Let us prove that there exist $C_W>0$, $t_W>0$ such that for all $t\in (0,t_W)$, $z\in \mathbb {R}^N$ with pairwise distinct coordinates and $\left|z\right|_\infty \leqslant R_W$, there exist $r^+>0$ with $r^+>\underset{1\leqslant i\leqslant N}{\max } t_Wz_i$ and $r^-<0$ with $r^-<\underset{1\leqslant i\leqslant N}{\min } t_Wz_i$ such that for all $\eta \in \mathscr {C}^{2N}(X)\cap \text{ W }^{2N,\infty }(X)$ satisfying for all $1\leqslant i\leqslant N$, $\eta (tz_i)=1$ and $\eta '(tz_i)=0$, the following implication holds,

$$\begin{aligned}{\begin{matrix} &{}\left( \forall \ell \in \{0,\ldots , 2N\}, \left\| \eta ^{(\ell )}-\eta _W^{(\ell )} \right\| _{\text {L}^\infty (X)}\leqslant C_W\right) \\ &{}\Longrightarrow \left( \forall x\in (r^-,r^+)\setminus \bigcup _i \{tz_i\}, \quad |\eta (x)|<1 \quad \text {and} \quad \forall 1\leqslant i \leqslant N, \ \eta ''(tz_i)<0\right) . \end{matrix}} \end{aligned}$$

First, we prove that, provided $C_W>0$, $t_W>0$ are small enough, $\eta '$ has exactly $2N-1$ zeros in $(r^-,r^+)$.

Let $t>0$ and $\eta \in \mathscr {C}^{2N}(X)\cap \text{ W }^{2N,\infty }(X)$ and $z\in \mathbb {R}^N$ with pairwise distinct coordinates and $\left|z\right|_\infty \leqslant R_W$ such that for all $1\leqslant i\leqslant N$, $\eta (tz_i)=1$ and $\eta '(tz_i)=0$. We suppose that $z_1<z_2<\ldots <z_N$. By Rolle’s Theorem, for all $1\leqslant i\leqslant N-1$, there exists $c_i(t)\in (tz_i,tz_{i+1})$ such that $\eta '(c_i(t))=0$. As a result $\eta '$ has at least $2N-1$ zeros in all $(r^-,r^+)$ satisfying the requirements.

Now, let us assume by contradiction that $\eta '$ has strictly more than $2N-1$ zeros for arbitrarily small values of $C_W$, $t_W$ and in all $(r^-,r^+)$ satisfying the requirements. As a result, there are sequences $(t_k)_{k\in \mathbb {N}}$ where $t_k\rightarrow 0$, $(z_k)_{k\in \mathbb {N}}$ (where each $z_k\in \mathbb {R}^N$ has pairwise distinct coordinates and $\left|z_k\right|_\infty \leqslant R_W$), $(r^+_k)_{k\in \mathbb {N}}$ and $(r^-_k)_{k\in \mathbb {N}}$ where for all $k\in \mathbb {N}$, $r^+_k>0$, $r^+_k>\underset{1\leqslant i\leqslant N}{\max } t_k (z_k)_i$, $r^+_k\rightarrow 0$ and $r^-_k<0$, $r^-_k<\underset{1\leqslant i\leqslant N}{\min } t_k(z_k)_i$ and $r^-_k\rightarrow 0$. And there exists $(\eta _k)_{k\in \mathbb {N}}\in \left( \mathscr {C}^{2N}(X)\cap \text{ W }^{2N,\infty }(X) \right) ^{\mathbb {N}}$ such that for all $k\in \mathbb {N}$

$$\begin{aligned}&\forall i\in \{1,\ldots , N-1\}, \ \eta _k(t_k(z_k)_i)=1 \quad \text {and} \quad \eta _k'(t_k(z_k)_i)=0, \\&\forall \ell \in \{0,\ldots , 2N\}, \ \left\| \eta _k^{(\ell )}-\eta _W^{(\ell )} \right\| _{\text {L}^\infty (X)}\leqslant \frac{1}{k}, \end{aligned}$$

and $\eta _k'$ has at least 2N zeros in $(r^-_k,r^+_k)$ (we already know that $\eta _k'$ has at least $2N-1$ zeros in $(r^-_k,r^+_k)$). Thus, for all $k\in \mathbb {N}$, by applying successively Rolle’s Theorem, we obtain that there exists $x_k\in (r^-_k,r^+_k)$ such that $\eta _k^{(2N)}(x_k)=0$. Since $x_k\rightarrow 0$ (because $r^-_k,r^+_k\rightarrow 0$) and $\left\| \eta _k^{(2N)}-\eta _W^{(2N)} \right\| _{\text {L}^\infty (X)}\leqslant \frac{1}{k}$, we deduce that $\eta _W^{(2N)}(0)=0$, which is a contradiction. Hence $\eta '$ has exactly $2N-1$ zeros in some $(r^-,r^+)$.

Using the same argument, we may also prove that for all $i\in \{1,\ldots , N\}$, $\eta ''(tz_i)\ne 0$. Let us now observe that, either for all $1\leqslant i \leqslant N$, $\eta ''(tz_i)>0$ or, for all $1\leqslant i \leqslant N$, $\eta ''(tz_{i+1})<0$. Indeed, assume for instance by contradiction that for some $1\leqslant i\leqslant N-1$, $\eta ''(tz_i)>0$ and $\eta ''(tz_{i+1})<0$. Then, there exists $c_i(t)\in (tz_i,tz_{i+1})$ such that $\eta (c_i(t))=1$. Applying Rolle’s Theorem on respectively $(tz_i,c_i(t))$ and $(c_i(t),tz_{i+1})$, we obtain that $\eta '$ vanishes at least twice in $(tz_i,tz_{i+1})$. It is a contradiction with the fact that $\eta '$ has exactly $2N-1$ zeros in $(r^-,r^+)$ for all $0<t<t_W$.

As a result, there are only two possibilities: either $\eta ''(tz_i)<0$ for all i (and then for all $x\in (r^-,r^+)\setminus \bigcup _i \{tz_i\}$ $\eta (x)<1$), or $\eta ''(tz_i)>0$ for all i (and then for all $x\in (r^-,r^+)\setminus \bigcup _i \{tz_i\}$ $\eta (x)>1$). But since $\eta _W^{(2N)}(0)<0$, there is some $\tilde{x}\in (r^-,r^+)$ and some $\varepsilon _0>0$ such that $\eta _W(\tilde{x}) < 1 - \varepsilon _0$. Choosing $C_W$ small enough so that $C_W<\varepsilon _0/2$, we obtain that $\eta (\tilde{x})<1-\varepsilon _0/2$. As a consequence, for all $x\in (r^-,r^+)\setminus \bigcup _i \{tz_i\}$, $\eta (x)<1$. Finally we can suppose that $\eta _W(x)>-1$ on $(r^-,r^+)$ by imposing $0<C_W<\inf _{X} \eta _W+1$.

To sum up, we have proved the following :

$$\begin{aligned} \forall x\in (r^-,r^+)\setminus \bigcup _i \{tz_i\}, |\eta (x)|<1 \quad \text {and} \quad \forall 1\leqslant i \leqslant N, \ \eta ''(tz_i)<0. \end{aligned}$$

(47)

1.2.2 Globally

As $\eta _W$ is non-degenerate, we have $\sup _{X\setminus (r^-,r^+)}\left|\eta _W\right|<1$. We can assume that $0<C_W<(1-\sup _{X\setminus (r^-,r^+)}\left|\eta _W\right|)/2$ and use $\left\| \eta -\eta _W \right\| _{\text {L}^\infty (X)}\leqslant C_W$ so as to obtain

$$\begin{aligned} \forall x\in X\setminus (r^-,r^+), \quad |\eta (x)|<1. \end{aligned}$$

(48)

Gathering Eqs. (47) and (48), we obtain the claimed result.

1.3 Appendix 1.3: Proof of Proposition 5

The proof of Proposition 5 relies on the study of the structure of the matrices $\Psi _{k}^*\Psi _{k}$. We start by introducing the notion of checkerboard matrices.

Definition 4

Let $n\in \mathbb {N}^*$ and $A=(a_{i,j})_{i,j} \in \mathbb {R}^{n\times n}$. We say that A is a checkerboard matrix if for all (i, j) such that $i+j$ is odd, $a_{i,j}=0$.

For an odd n, such a matrix looks like

$$\begin{aligned} \begin{pmatrix} a_{1,1} &{}\quad \! 0 &{}\quad \! a_{1,3} &{}\quad \! 0 &{}\quad \! \ldots &{}\quad \! 0 &{}\quad \! a_{1,n-2} &{}\quad \! 0 &{}\quad \! a_{1,n} \\ 0 &{}\quad \! a_{2,2} &{}\quad \! 0 &{}\quad \! a_{2,4} &{}\quad \! \ldots &{}\quad \! a_{2,n-3} &{}\quad \! 0 &{}\quad \! a_{2,n-1} &{}\quad \! 0 \\ \vdots &{}\quad \! \vdots &{}\quad \! \vdots &{}\quad \! \vdots &{}\quad \! \ddots &{}\quad \! \vdots &{}\quad \! \vdots &{}\quad \! \vdots &{}\quad \! \vdots \\ 0 &{}\quad \! a_{n-1,2} &{}\quad \! 0 &{}\quad \! a_{n-1,4} &{}\quad \! \ldots &{}\quad \! a_{n-1,n-3} &{}\quad \! 0 &{}\quad \! a_{n-1,n-1} &{}\quad \! 0 \\ a_{n,1} &{}\quad \! 0 &{}\quad \! a_{n,3} &{}\quad \! 0 &{}\quad \! \ldots &{}\quad \! 0 &{}\quad \! a_{n,n-2} &{}\quad \! 0 &{}\quad \! a_{n,n} \end{pmatrix}. \end{aligned}$$

Lemma 6

The set of checkerboard matrices of size $n \in \mathbb {N}^*$ is an algebra over $\mathbb {R}$. As a result the inverse of a checkerboard matrix is also a checkerboard matrix.

Proof

The only difficulty is to show that the product of two checkerboard matrices is also a checkerboard matrix. Let $A=(a_{i,j})_{i,j}$ and $B=(b_{i,j})_{i,j}$ be two such matrices. Let (i, j) such that $i+j$ is odd. Then $\sum _{k=1}^n a_{i,k}b_{k,j} = 0$ because if $i+k$ is even then $(i+j)-(i+k)=j-k$ is odd and $j-k+2k=j+k$ is odd, hence $b_{j,k}=0$. On the contrary, if $i+k$ is odd then $a_{i,k}=0$. So AB is a checkerboard matrix. The last statement holds because the inverse of any matrix is a polynomial in that matrix. $\square $

Now we give a more precise result on the structure of some particular checkerboard matrices.

Lemma 7

Let $A=(a_{i,j})_{1\leqslant i,j\leqslant 2n+1}$ be a symmetric positive-definite checkerboard matrix of size $2n+1$ such that

$$\begin{aligned} a_{i,j}=(-1)^{\frac{i-j}{2}} a_{\frac{i+j}{2},\frac{i+j}{2}} \end{aligned}$$

for all $i,j\in \{1,\ldots , 2n+1\}$ such that $i+j$ is even, and let $(b_{i,j})_{1\leqslant i,j\leqslant 2n+1}$ denote the entries of $A^{-1}$. Then $b_{2n+1,2n+1}$ and $b_{1,2n+1}$ are positive.

Proof

The fact that $b_{2n+1,2n+1}$ is positive, is a direct consequence of the fact that $A^{-1}$ is symmetric positive-definite because so is A.

From the expression of the inverse of a matrix using cofactors, we see that $b_{1,2n+1}=\det (\hat{A}_{1,2n+1})/\det (A)$, where $\hat{A}_{1,2n+1}$ is the matrix obtained from A by removing the first row and the last column. Since A is symmetric positive-definite, $\det (A)>0$, and we need only show that $\det (\hat{A}_{1,2n+1})>0$. But one can see that $\hat{A}_{1,2n+1}$ is a skew symmetric matrix of size 2n, hence

$$\begin{aligned} \det (\hat{A}_{1,2n+1})=\text{ pf }(\hat{A}_{1,2n+1})^2, \end{aligned}$$

where $\text{ pf }(\hat{A}_{1,2n+1})$ is the Pfaffian of the matrix $\hat{A}_{1,2n+1}$. For more details on the Pfaffian (definition and proof of the result used here), see [17]. As a result, $\det (\hat{A}_{1,2n+1}) \geqslant 0$.

Moreover, $\hat{A}_{1,2n+1}$ is invertible. Indeed if we denote the colums of A (resp. the columns $\hat{A}_{1,2n+1}$) by $C_1,\ldots ,C_{2n+1}$ (resp. $\hat{C}_1, \ldots ,\hat{C}_{2n}$), we observe that for $0\leqslant i \leqslant n-1$, $\hat{C}_{2i+1}\in E_1$ and $\hat{C}_{2(i+1)}\in E_2$ where

$$\begin{aligned} E_1&= {{\mathrm{Span}}}\{ e_{2j+1} ; \ 0 \leqslant j \leqslant n-1 \}, \\ E_2&= {{\mathrm{Span}}}\{ e_{2(j+1)} ; \ 0 \leqslant j \leqslant n-1 \}, \end{aligned}$$

with $(e_i)_{1\leqslant i\leqslant 2n}$ is the canonical base of $\mathbb {R}^{2n}$. Since $E_1$ and $E_2$ are in direct sum, if $\hat{A}_{1,2n+1}$ is not invertible it means that either $(\hat{C}_{2i+1})_{0\leqslant i \leqslant n-1}$ or $(\hat{C}_{2(i+1)})_{0\leqslant i \leqslant n-1}$ is linearly dependent. But it cannot be $(\hat{C}_{2(i+1)})_{0\leqslant i \leqslant n-1}$ because it would imply that $(C_{2(i+1)})_{0\leqslant i \leqslant n-1}$ is also linearly dependent, because

$$\begin{aligned} C_{2(i+1)}=\begin{pmatrix} 0 \\ \hat{C}_{2(i+1)} \end{pmatrix}, \end{aligned}$$

which would contradict the invertibility of A.

So it means that $(\hat{C}_{2i+1})_{0\leqslant i \leqslant n-1}$ must be linearly dependent. However from the structure of the matrix A, one can see that for all $0\leqslant i \leqslant n-1$

$$\begin{aligned} \hat{C}_{2i+1}= \begin{pmatrix} 0 \\ (-1)^{i+1} a_{2i+1,1} \\ 0 \\ -(-1)^{i+1} a_{2i+3,1} \\ \vdots \\ 0 \\ (-1)^{i+1} a_{2(i+n)-1,1} \\ 0 \\ -(-1)^{i+1} a_{2(i+n)+1,1} \end{pmatrix} \quad \text {and} \quad \hat{C}_{2(i+1)}= \begin{pmatrix} (-1)^{i+1} a_{2i+1,1} \\ 0 \\ -(-1)^{i+1} a_{2i+3,1} \\ 0 \\ \vdots \\ (-1)^{i+1} a_{2(i+n)-1,1} \\ 0 \\ -(-1)^{i+1} a_{2(i+n)+1,1} \\ 0 \end{pmatrix}. \end{aligned}$$

Thus, a linear combination between the elements of $(\hat{C}_{2i+1})_{0\leqslant i \leqslant n-1}$ gives the same linear combination between the elements of $(\hat{C}_{2(i+1)})_{0\leqslant i \leqslant n-1}$, which contradicts for the same reason as before the invertibility of A.

Hence $\hat{A}_{1,2n+1}$ is invertible and $b_{1,2n+1}>0$. $\square $

The next result describes the structure of the matrices $\Psi _{k}^*\Psi _{k}$ when $\Phi $ is a convolution operator (i.e. for all $x\in \mathbb {R}$, $\varphi (x)=\tilde{\varphi }(\cdot -x)$) and $k\in \mathbb {N}^*$ is odd.

Lemma 8

Suppose that $\Phi $ is a convolution operator. Then for all $k\in \mathbb {N}^*$, $\Psi _{k}^* \Psi _{k}$ is a checkerboard matrix. Moreover if $\mathcal {I}_{2N}$ holds, then the entries indexed by $(1,2N+1)$ and $(2N+1,2N+1)$ of $(\Psi _{2N}^*\Psi _{2N})^{-1}$ are positive.

Proof

Let (i, j) such that $i+j$ is odd and for example $i>j$ ($\Psi _{k}^* \Psi _{k}$ is symmetric so it does not matter). The entry (i, j) of $\Psi _{k}^* \Psi _{k}$ is equal to

$$\begin{aligned} \left\langle \varphi _{i-1} , \varphi _{j-1} \right\rangle _{\mathcal {H}}. \end{aligned}$$

Recall that in the setting of Proposition 5, $\mathcal {H}=L^2(X)$ and $\left\langle f, g\right\rangle _{\mathcal {H}}=\int _X fg$. Successive integrations by parts (we integrate the left term and derive the right term) yield

$$\begin{aligned} \left\langle \varphi _{i-1} , \varphi _{j-1} \right\rangle _{\mathcal {H}} = (-1)^{\frac{i-j+1}{2}} \left\langle \varphi _{\frac{i+j-1}{2}-1} , \varphi _{\frac{i+j+1}{2}-1} \right\rangle _{\mathcal {H}}, \\ \left\langle \varphi _{i-1} , \varphi _{j-1} \right\rangle _{\mathcal {H}} = (-1)^{\frac{i-j-1}{2}} \left\langle \varphi _{\frac{i+j+1}{2}-1} , \varphi _{\frac{i+j-1}{2}-1} \right\rangle _{\mathcal {H}}. \end{aligned}$$

Hence, by the symmetry of the scalar product, we obtain that

$$\begin{aligned} \left\langle \varphi _{i-1} , \varphi _{j-1} \right\rangle _{\mathcal {H}} = -\left\langle \varphi _{i-1} , \varphi _{j-1} \right\rangle _{\mathcal {H}}. \end{aligned}$$

This implies that $\left\langle \varphi _{i-1} , \varphi _{j-1} \right\rangle _{\mathcal {H}}=0$ for odd values of $i+j$, so that $\Psi _{k}^*\Psi _{k}$ is a checkerboard matrix (see Definition 4).

Moreover, similar computations show that when $i+j$ is even

$$\begin{aligned} \left\langle \varphi _{i-1} , \varphi _{j-1} \right\rangle _{\mathcal {H}}=(-1)^{\frac{i-j}{2}}\left\langle \varphi _{\frac{i+j}{2}-1} , \varphi _{\frac{i+j}{2}-1} \right\rangle _{\mathcal {H}}. \end{aligned}$$

If $\mathcal {I}_{2N}$ holds, $\Psi _{2N}^*\Psi _{2N}$ is symmetric positive-definite. As a result $\Psi _{2N}^*\Psi _{2N}$ satisfies the assumptions of Lemma 7, so the entries $(1,2N+1)$ and $(2N+1,2N+1)$ of $(\Psi _{2N}^*\Psi _{2N})^{-1}$ are positive. $\square $

We can now prove that $\eta _W^{(2N)}(0)<0$ when $\Phi $ is a convolution operator and $\mathcal {I}_{2N}$ holds.

Proof of Proposition 5

Since $\eta _W=\Phi ^*p_{W}$, we deduce that

$$\begin{aligned} \eta _W^{(2N)}(0)&= \left\langle \varphi _{2N}, p_{W}\right\rangle _{\mathcal {H}} = \left\langle \varphi _{2N}, \Psi _{2N-1}(\Psi _{2N-1}^*\Psi _{2N-1})^{-1}\delta _{2N} \right\rangle _{\mathcal {H}}. \end{aligned}$$

Consider the symmetric positive definite matrix

$$\begin{aligned} \Psi _{2N}^*\Psi _{2N}= \begin{pmatrix} \Psi _{2N-1}^*\Psi _{2N-1}&{}\quad \! \Psi _{2N-1}^* \varphi _{2N}\\ [ \Psi _{2N-1}^* \varphi _{2N} ]^* &{}\quad \! \left\| \varphi _{2N} \right\| _{\mathcal {H}}^2 \end{pmatrix} \in \mathbb {R}^{2N \times 2N}. \end{aligned}$$

Observe that $\Psi _{2N-1}^*\Psi _{2N-1}$ is invertible, and that

$$\begin{aligned} S \mathop = \limits ^{\mathrm{def}.}\left\| \varphi _{2N} \right\| _{\mathcal {H}}^2 - \left\langle \varphi _{2N} , \Psi _{2N-1}(\Psi _{2N-1}^*\Psi _{2N-1})^{-1}\Psi _{2N-1}^*\varphi _{2N} \right\rangle _{\mathcal {H}} \ne 0. \end{aligned}$$

Indeed,

$$\begin{aligned} S = \left\| (\mathrm {Id}_{\mathcal {H}} - \Pi _{2N-1})\varphi _{2N} \right\| _{\mathcal {H}}^2>0 \end{aligned}$$

since $(\varphi _{0}, \ldots \varphi _{2N})$ has full rank.

Thus, we may apply the block inversion formula, and $(\Psi _{2N}^*\Psi _{2N})^{-1}$ is of the form

$$\begin{aligned}&(\Psi _{2N}^*\Psi _{2N})^{-1}= \frac{1}{\left\| (\mathrm {Id}_{\mathcal {H}} - \Pi _{2N-1})\varphi _{2N} \right\| _{\mathcal {H}}^2} \\&\qquad \times \begin{pmatrix} ***&{} -(\Psi _{2N-1}^*\Psi _{2N-1})^{-1} \Psi _{2N-1}^*\varphi _{2N} \\ -[(\Psi _{2N-1}^*\Psi _{2N-1})^{-1} \Psi _{2N-1}^*\varphi _{2N}]^* &{} 1 \end{pmatrix} \end{aligned}$$

Lemma 8 ensures that the entry $(2N+1,1)$ of $(\Psi _{2N}^*\Psi _{2N})^{-1}$ is (strictly) positive. This precisely means that

$$\begin{aligned} -\frac{ \left\langle \varphi _{2N} , \Psi _{2N-1}(\Psi _{2N-1}^*\Psi _{2N-1})^{-1}\delta _{2N} \right\rangle _{\mathcal {H}} }{ \left\| (\mathrm {Id}_{\mathcal {H}}-\Pi _{2N-1})\varphi _{2N} \right\| _{\mathcal {H}}^2 } >0, \end{aligned}$$

and as a result $\eta _W^{(2N)}(0)=\left\langle \varphi _{2N}, p_{W}\right\rangle _{\mathcal {H}} <0$. $\square $

1.4 Appendix 1.4: Proof of Proposition 6

Let denote $(a_k)_{0\leqslant k\leqslant 2N-1}$ the coefficients of the first column of the matrix $(\Psi _{2N-1}^*\Psi _{2N-1})^{-1}$. Then we know that for all $x\in \mathbb {R}$,

$$\begin{aligned} \eta _W(x)=\sum _{k=0}^{N-1}a_{2k}\varphi \star \varphi ^{(2k)}(x). \end{aligned}$$

Since $\varphi \star \varphi =x\rightarrow \sqrt{\pi }e^{-x^2/4}$, we get that for all $x\in \mathbb {R}$,

$$\begin{aligned} \eta _W(x)=\sqrt{\pi }e^{-x^2/4}\sum _{k=0}^{N-1}a_{2k}\hat{H}_{2k}(x), \end{aligned}$$

(49)

where $\hat{H}_{2k}$ is the Hermite polynomial of order 2k associated to $x\rightarrow e^{-x^2/4}$.

Now let us show (18) recursively on $N\in \mathbb {N}^*$. If $N=1$, we have that $\eta _W(x)=e^{-x^2/4}$ because $\Psi _{2N-1}^*\Psi _{2N-1}=(\sqrt{\pi })$. Suppose that the property is true for some $N\in \mathbb {N}^*$, i.e.,

$$\begin{aligned} \eta _{W,N}(x)=e^{-x^2/4}\sum _{k=0}^{N-1}\frac{x^{2k}}{2^{2k}k!}, \end{aligned}$$

(50)

where we use the notation $\eta _{W,N}$ to recall that this is the function $\eta _W$ for N spikes. Then form for all $x\in \mathbb {R}$,

$$\begin{aligned} \kappa (x)\mathop = \limits ^{\mathrm{def}.}e^{x^2/4}\left( \eta _{W,N+1}(x)-\eta _{W,N}(x) \right) . \end{aligned}$$

$\kappa $ is a polynomial of degree 2N [thanks to (49)], it satisfies $\kappa (0)=0$ and for all $1\leqslant i\leqslant 2N-1$, $\kappa ^{(i)}(0)=0$. As a result for all $x\in \mathbb {R}$,

$$\begin{aligned} \kappa (x)=\lambda x^{2N}, \end{aligned}$$

for some $\lambda \in \mathbb {R}$.

It remains to show that $\lambda =\frac{1}{2^{2N}N!}$. Remark that for all $x\in \mathbb {R}$, $\kappa ^{(2N)}(x)=\lambda (2N)!$ and on the other hand $\kappa ^{(2N)}(0)=-\eta _{W,N}^{(2N)}(0)$ because $\eta _{W,N+1}^{(2N)}(0)=0$ by definition of $\eta _{W,N+1}$. So it is enough to show that,

$$\begin{aligned} \eta _{W,N}^{(2N)}(0)=-\frac{(2N)!}{2^{2N}N!}. \end{aligned}$$

(51)

Thanks to the Leibniz formula applied to (50), we have the following formula,

$$\begin{aligned} \eta _{W,N}^{(2N)}(0)=\sum _{k=0}^{N-1}\left( {\begin{array}{c}2N\\ 2k\end{array}}\right) \hat{H}_{2N-2k}(0)\frac{(2k)!}{2^{2k}k!}. \end{aligned}$$

Since $\hat{H}_{2N-2k}(0)=(-1)^{N-k}\frac{(2N-2k-1)!}{2^{2N-2k-1}(N-1-k)!}$, thanks to Lemma 9 below, we obtain,

$$\begin{aligned} \eta _{W,N}^{(2N)}(0)&=\sum _{k=0}^{N-1}\frac{(2N)!}{(2N-2k)!(2k)!}\cdot (-1)^{N-k}\frac{(2N-2k-1)!}{2^{2N-2k-1}(N-1-k)!}\cdot \frac{(2k)!}{2^{2k}k!}\\&= -\frac{(2N)!}{2^{2N}N!}\sum _{k=0}^{N-1}\frac{N(N-1)!}{(N-1-k)!k!}\cdot \frac{2\cdot (-1)^{N-1-k}}{(2N-2k)}\\&=-\frac{(2N)!}{2^{2N}N!} N \sum _{k=0}^{N-1}\left( {\begin{array}{c}N-1\\ k\end{array}}\right) (-1)^{N-1-k} \int _0^1 x^{N-1-k} \mathrm {d}x \\&=-\frac{(2N)!}{2^{2N}N!} N \underbrace{\int _0^1 (1-x)^{N-1} \mathrm {d}x}_{=1/N}.\\ \end{aligned}$$

This ends the recursive proof and thus we have (18).

To conclude that $\eta _W$ is $(2N-1)$-non-degenerate, it remains to show that for all $x\in \mathbb {R}^*$, $|\eta _W(x)|<1$, since we already know that $\eta _W^{(2N)}(0)<0$ thanks to (51). But we have, thanks to (18), that for all $x\in \mathbb {R}^*$, $0<\eta _W(x)<1$ since $\sum _{k=0}^{N-1}\frac{x^{2k}}{2^{2k}k!}$ is the truncated power series of $e^{x^2/4}$.

Lemma 9

One has for all $k\in \mathbb {N}$,

$$\begin{aligned} \hat{H}_{2k+2}(0)=(-1)^{k+1}\frac{(2k+1)!}{2^{2k+1}k!}. \end{aligned}$$

(52)

Proof

We know that $H_{k+2}(x)=xH_{k+1}(x)-(k+1)H_k(x)$ where $H_k$ is the Hermite polynomial of order k associated to $x\mapsto e^{-x^2/2}$. Thus $H_{2k+2}(0)=-(2k+1)H_{2k}(0)$ and then,

$$\begin{aligned} H_{2k+2}(0)=(-1)^{k+1}\frac{(2k+1)!}{2^k k!}. \end{aligned}$$

(53)

Now, remark that $\hat{H}_{k}(x)=2^{-k/2}H_k(x/\sqrt{2})$, so that together with (53) we get the expected result (52). $\square $

1.5 Appendix 1.5: Proof of Theorem 3

For all $n\in \mathbb {N}$, the solution $m_n\mathop = \limits ^{\mathrm{def}.}m_{a_n,t_nz_n}$ of $\mathcal {P}_{\lambda _n}(y_{t_n}+w_n)$ satisfies the following first order optimality conditions:

$$\begin{aligned} \Gamma _{t_nz_n}^*(\Phi _{t_nz_n}a_n -\Phi _{tz_0}a_{0}-w_n)+\lambda _n \begin{pmatrix}\mathbbm {1}_N\\ 0\end{pmatrix}=0. \end{aligned}$$

(54)

Since $\eta _{\lambda _n,t_n}=\Phi ^*(\frac{1}{\lambda _n}(\Phi _{tz_0}a_{0}+w_n-\Phi _{t_nz_n}a_n))$ satisfies $\left\| \eta _{\lambda _n,t_n} \right\| _{\text {L}^\infty (X)}\leqslant 1$, $\eta _{\lambda _n,t_n}\in {{\mathrm{Im}}}\Phi ^*$ and $\eta _{\lambda _n,t_n}(tz_n)={{\mathrm{sign}}}(a_n)=1$.

As a result, by taking n large enough if necessary, we know that $(a_n,z_n)=g_t^*(\lambda _n,w_n)$ with $g_t^*\in \mathscr {C}^{2N}(V_t^*)$ so that $(a_n,z_n)\rightarrow (a_{0},z_0)$. Therefore all the asymptotic results that we established are true when applied for $(a_n,z_n)$ for n large enough. Observe from (54), that we get for all $n\in \mathbb {N}$:

$$\begin{aligned} p_{\lambda _n,t_n}=\Gamma _{t_nz_n}^{*,+}\begin{pmatrix}\mathbbm {1}_N \\ 0\end{pmatrix}+\Pi _{t_nz_n} \frac{w_n}{\lambda _n} + \frac{1}{\lambda _n}\Pi _{t_nz_n}\Gamma _{tz_0}\begin{pmatrix}a_{0}\\ 0\end{pmatrix}. \end{aligned}$$

As a consequence:

$$\begin{aligned} \left\| p_{\lambda _n,t_n}-p_{W} \right\| _{\mathcal {H}}\rightarrow 0 \quad \text{ when } \quad n\rightarrow +\infty , \end{aligned}$$

so that for all $0\leqslant l\leqslant 2N$,

$$\begin{aligned} \left\| \eta _{\lambda _n,t_n}^{(l)}-\eta _W^{(l)} \right\| _{\text {L}^\infty (X)}\rightarrow 0 \quad \text{ when } \quad n\rightarrow +\infty . \end{aligned}$$

(55)

In particular, since $\left\| \eta _{\lambda _n,t_n}\right\| \leqslant 1$, we deduce (21):

$$\begin{aligned} \left\| \eta _W \right\| _{\text {L}^\infty (X)}\leqslant 1. \end{aligned}$$

Rights and permissions

Reprints and permissions

About this article

Cite this article

Denoyelle, Q., Duval, V. & Peyré, G. Support Recovery for Sparse Super-Resolution of Positive Measures. J Fourier Anal Appl 23, 1153–1194 (2017). https://doi.org/10.1007/s00041-016-9502-x

Download citation

Received: 22 July 2015
Revised: 06 July 2016
Published: 26 September 2016
Issue Date: October 2017
DOI: https://doi.org/10.1007/s00041-016-9502-x

Keywords

Use our pre-submission checklist

Avoid common mistakes on your manuscript.

Support Recovery for Sparse Super-Resolution of Positive Measures

Abstract

Similar content being viewed by others

Exact Support Recovery for Sparse Spikes Deconvolution

Point Source Super-resolution Via Non-convex \(L_1\) Based Methods

A Theory of Super-Resolution from Short-Time Fourier Transform Measurements

1 Introduction

1.1 Super-Resolution and Sparse Spikes Deconvolution

1.2 From the LASSO to the BLASSO

1.2.1 LASSO

1.2.2 BLASSO

1.3 Previous Works

1.3.1 LASSO/BLASSO Performance Analysis

1.3.2 LASSO/BLASSO for Positive Spikes

1.3.3 MUSIC and Related Methods

1.4 Contributions

1.4.1 Main Results

1.4.2 Extensions

1.4.3 Roadmap

1.5 Notations

1.5.1 Measures

1.5.2 Kernels

Definition 1

1.5.3 Linear Operators

Remark 1

1.5.4 Linear Algebra

2 Asymptotic Analysis of Support Recovery

2.1 Injectivity Hypotheses

Definition 2

Proposition 1

2.2 Vanishing Derivatives Precertificate

Proposition 2

Proposition 3

Proposition 4

Definition 3

Theorem 1

Remark 2

Proposition 5

Remark 3

Proposition 6

2.3 Main Contribution

Theorem 2

2.3.1 Putting All Together

Proof of Theorem 2

2.4 Necessary Condition for the Recovery in the Limit \(t\rightarrow 0\)

Theorem 3

3 Preliminaries

3.1 Approximate Factorizations

Proposition 7

Proof

Lemma 1

Proof

3.2 Projectors

Lemma 2

Proof

Lemma 3

Proof

3.3 Asymptotics of the Vanishing Derivatives Precertificate

Proposition 8

Proof

4 Building a Candidate Solution

4.1 First Order Optimality Conditions

Lemma 4

4.2 Implicit Function Theorem

4.3 Extension of the Implicit Function \(g_t\)

Lemma 5

Proof

Corollary 1

Proof

Proposition 9

Proof

4.4 Continuity of \(g_t^*\) at 0

Proposition 10

Proof

5 Convergence of \(\eta _{\lambda ,t}\) to \(\eta _W\)

Proposition 11

Proof

Remark 4

6 Conclusion

References