1 Introduction

1.1 Problem Setup

Affine phase retrieval for sparse signals aims to recover a k-sparse signal \({\varvec{x}}_0 \in {\mathbb F}^n\), \({\mathbb F}\in \{{\mathbb R},{\mathbb C}\}\), from the observed data

$$\begin{aligned} y_j = \vert \langle {\varvec{a}}_j, \varvec{x}_0 \rangle + b_j \vert + w_j, \quad j=1, \ldots , m, \end{aligned}$$

where \({\varvec{a}}_j \in {\mathbb F}^n, \, j=1, \ldots , m\) are given measurement vectors, \({\varvec{b}}:= (b_1, \ldots , b_m)^\textrm{T}\in {\mathbb F}^m\) is the given bias vector, and \({\varvec{w}}:= (w_1, \ldots , w_m)^\textrm{T}\in {\mathbb R}^m\) is the noise vector. Affine phase retrieval arises in several practical applications, such as holography [2, 20, 26, 27] and Fourier phase retrieval [3,4,5, 23], where some side information about the signal is known a priori, before the magnitude-only measurements are captured.

The aim of this paper is to study the following program to recover \({\varvec{x}}_0\) from \({ \varvec{ y}}:= (y_1, \ldots , y_m)^\textrm{T}\in {\mathbb R}^m\):

$$\begin{aligned} \min _{\varvec{x} \in {\mathbb F}^n} \Vert \varvec{x} \Vert _1 \quad \text {s.t.} \ \Vert \vert \varvec{A} {\varvec{x}}+ \varvec{b} \vert - \varvec{y} \Vert _2 \le \epsilon , \end{aligned}$$
(1)

where \(\varvec{A}:= [{\varvec{a}}_1, \ldots , {\varvec{a}}_m] ^* \in {\mathbb F}^{m\times n}\).
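
For concreteness, the following is a minimal numerical sketch of this measurement model in the real case; the dimensions, sparsity level, and noise scale are illustrative choices only, not the sampling regimes required by the theorems below.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 100, 60, 5  # illustrative sizes only

# k-sparse ground truth x0 (real case, F = R)
x0 = np.zeros(n)
support = rng.choice(n, size=k, replace=False)
x0[support] = rng.standard_normal(k)

A = rng.standard_normal((m, n)) / np.sqrt(m)  # rows a_j^T with N(0, 1/m) entries
b = rng.standard_normal(m)                    # known bias vector
w = 1e-3 * rng.standard_normal(m)             # noise vector

y = np.abs(A @ x0 + b) + w                    # y_j = |<a_j, x0> + b_j| + w_j
```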

Particularly, we focus on the following questions:

  • Question 1: Assume that \({\varvec{a}}_j, \, j=1, \ldots , m\), are Gaussian random measurements with \(m = O(k\log ( en/k))\). In the absence of noise, i.e., \({\varvec{w}}=0, \, \epsilon =0\), is \({\varvec{x}}_0\) the unique solution to (1)?

  • Question 2: In the noisy scenario, is the program (1) stable under small perturbations?

For the case where \({\varvec{x}}_0\in {\mathbb C}^n\) is non-sparse, it was shown in [19] that \(m \ge 4n-1\) generic measurements suffice to guarantee the uniqueness of solutions, and several efficient algorithms with linear convergence rates were proposed in [25] to recover non-sparse signals \({\varvec{x}}_0\) from \({ \varvec{ y}}\) under \(m=O(n\log n)\) Gaussian random measurements. However, for the case where \({\varvec{x}}_0\) is sparse, to the best of our knowledge, no such results are available.

1.2 Related Works

1.2.1 Phase Retrieval

The noisy phase retrieval is the problem of recovering a signal \({\varvec{x}}_0 \in {\mathbb F}^n\), \({\mathbb F}\in \{{\mathbb R},{\mathbb C}\}\) from the magnitude-only measurements

$$\begin{aligned} y'_j = \left|\langle {\varvec{a}}_j,{\varvec{x}}_0 \rangle \right| + w_j, \quad j=1, \ldots , m, \end{aligned}$$

where \({\varvec{a}}_j \in {\mathbb F}^n\) are given measurement vectors and \(w_j \in {\mathbb R}\) are noises. It arises naturally in many areas such as X-ray crystallography [21, 22, 28], coherent diffractive imaging [30], and optics [14, 15, 32]. In these settings, optical detectors record only the intensity of a light wave while losing the phase information. Note that \(\left|\langle {\varvec{a}}_j,{\varvec{x}}_0 \rangle \right|^2=\left|\langle {\varvec{a}}_j, e^{ i\theta }{\varvec{x}}_0 \rangle \right|^2\) for any \(\theta \in {\mathbb R}\). Therefore the recovery of \({\varvec{x}}_0\) for the classical phase retrieval is up to a global phase. In the absence of noise, it has been proved that \(m\ge 2n-1\) generic measurements suffice to guarantee the uniqueness of solutions for the real case [1], and \(m \ge 4n-4\) for the complex case [6, 13, 38], respectively. Moreover, several efficient algorithms have been proposed to reconstruct \({\varvec{x}}_0\) from \({ \varvec{ y}}':=[y'_1,\ldots , y'_m]^\textrm{T}\), such as alternating minimization [29], truncated amplitude flow [37], smoothed amplitude flow [7], trust-region [33], and the Wirtinger flow (WF) variants [9, 10, 41].

1.2.2 Sparse Phase Retrieval

For several applications, the underlying signal is naturally sparse or admits a sparse representation after some linear transformation. This leads to the sparse phase retrieval:

$$\begin{aligned} \min _{\varvec{x} \in {\mathbb F}^n } \quad \Vert \varvec{x} \Vert _0 \quad \text {s.t.} \ \Vert \vert \varvec{A} {\varvec{x}}\vert - { \varvec{ y}}' \Vert _2 \le \epsilon , \end{aligned}$$
(2)

where \(\varvec{A}:=[{\varvec{a}}_1, \ldots , {\varvec{a}}_m]^*\). In the absence of noise, it has been established that \(m = 2k\) generic measurements are necessary and sufficient for the unique recovery of all k-sparse signals in the real case, and \(m\ge 4k-2\) are sufficient in the complex case [39]. In the noisy scenario, \(O(k \log ( en/k)) \) measurements suffice for stable sparse phase retrieval [12]. Due to the hardness of the \(\ell _0\)-norm in (2), a computationally tractable approach to recover \({\varvec{x}}_0\) is to solve the following \(\ell _1\) minimization:

$$\begin{aligned} \min _{\varvec{x} \in {\mathbb F}^n} \quad \Vert \varvec{x} \Vert _1 \quad \text {s.t.} \ \Vert \vert \varvec{A} {\varvec{x}}\vert - { \varvec{ y}}' \Vert _2 \le \epsilon . \end{aligned}$$
(3)

For the real case, based on the strong restricted isometry property (SRIP) established by Voroninski and Xu [34], the authors in [18] proved that, if \({\varvec{a}}_1,\ldots ,{\varvec{a}}_m \sim 1/\sqrt{m} \cdot \mathcal N (0, I_n)\) are i.i.d. Gaussian random vectors with \(m\ge O( k\log ( en/k))\), then the solution \({\widehat{\varvec{x}}}\in {\mathbb R}^n\) to (3) satisfies

$$\begin{aligned} \min \left\{ \Vert {\widehat{\varvec{x}}}- {\varvec{x}}_0 \Vert , \Vert {\widehat{\varvec{x}}}+ {\varvec{x}}_0 \Vert \right\} \lesssim \epsilon +\frac{\sigma _k({\varvec{x}}_0)_1}{\sqrt{k}}, \end{aligned}$$

where \(\sigma _k(\varvec{x}_0)_1:= \min _{\vert {\text {supp}}(\varvec{x})\vert \le k} \Vert \varvec{x} - \varvec{x}_0\Vert _1\). Later, this result was extended to the complex case by employing the “phaselift” technique in [36]. Specifically, the authors in [36] showed that, for any k-sparse signal \({\varvec{x}}_0 \in {\mathbb C}^n\), the solution \({\widehat{\varvec{x}}}\in {\mathbb C}^n\) to the program

$$\begin{aligned} \mathop {\textrm{argmin}}\limits _{\varvec{x} \in \mathbb C^n} \quad \Vert \varvec{x} \Vert _1 \quad \text {s.t.} \ \Vert \mathcal {A}({\varvec{x}}) - \mathcal {A}({\varvec{x}}_0) \Vert _2 \le \epsilon \end{aligned}$$

satisfies

$$\begin{aligned} \min _{\theta \in [0, 2\pi )}\Vert {\widehat{\varvec{x}}}- e^{ i\theta }\varvec{x}_0\Vert _2 \lesssim \frac{\epsilon }{\sqrt{m}\Vert \varvec{x}_0\Vert _2}, \end{aligned}$$

provided \({\varvec{a}}_1, \ldots , {\varvec{a}}_m \sim \mathcal N(0,I_n)\) are i.i.d. complex Gaussian random vectors and \(m\ge O( k\log ( en/k))\). Here, \(\mathcal {A}({\varvec{x}}):= (\left|{\varvec{a}}_1^* {\varvec{x}}\right|^2, \ldots , \left|{\varvec{a}}_m^* {\varvec{x}}\right|^2)\).

1.2.3 Affine Phase Retrieval

Affine phase retrieval aims to recover a signal \({\varvec{x}}_0 \in {\mathbb F}^n\), \({\mathbb F}\in \{{\mathbb R},{\mathbb C}\}\), from the measurements

$$\begin{aligned} y_j =\left| \langle {\varvec{a}}_j, \varvec{x}_0 \rangle + b_j\right| , \quad j=1, \ldots , m, \end{aligned}$$

where \({\varvec{a}}_j \in {\mathbb F}^n, \, j=1, \ldots , m\) are measurement vectors and \({\varvec{b}}:= (b_1, \ldots , b_m)^\textrm{T}\in {\mathbb F}^m\) is the bias vector. The problem can be regarded as classical phase retrieval with a priori information, and it arises in many areas, such as holographic phase retrieval [16, 17, 27] and Fourier phase retrieval [3,4,5, 23]. In such scenarios, one needs to employ some additional information about the desired signals to ensure the uniqueness of solutions. Specifically, in holographic optics, a reference signal \(\varvec{r} \in {\mathbb C}^k\), whose structure is a priori known, is included in the diffraction patterns alongside the signal of interest \({\varvec{x}}_0\in {\mathbb C}^n\) [2, 20, 26]. Set \(\varvec{x}_0' = (\varvec{x}_0^\textrm{T}, \varvec{r}^\textrm{T})^\textrm{T}\in \mathbb C^{n+k}\). Then the magnitude-only measurements become

$$\begin{aligned} y_j = \vert \langle {\varvec{a}}'_j, \varvec{x}_0' \rangle \vert = \vert \langle {\varvec{a}}_j, \varvec{x}_0 \rangle + \langle {\varvec{a}}_j'', \varvec{r} \rangle \vert = \vert \langle {\varvec{a}}_j, \varvec{x}_0 \rangle + b_j\vert , \quad j=1,\ldots ,m, \end{aligned}$$

where \({{\varvec{a}}}'_j = ({\varvec{a}}_j^\textrm{T}, {{\varvec{a}}_j''}^\textrm{T})^\textrm{T}\in {\mathbb C}^{n+k}\) are given measurement vectors and \(b_j = \langle {\varvec{a}}_j'', \varvec{r} \rangle \in {\mathbb C}\) are known. Therefore, the holographic phase retrieval can be viewed as the affine phase retrieval.
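A minimal sketch of this reduction follows (all sizes and random data are hypothetical): since the reference r is known, the bias \(b_j = \langle {\varvec{a}}_j'', \varvec{r} \rangle \) can be precomputed, and the holographic magnitudes of the stacked vector coincide with affine magnitudes of the signal block alone.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k, m = 50, 10, 40  # hypothetical sizes

r = rng.standard_normal(k) + 1j * rng.standard_normal(k)  # known reference
A_sig = rng.standard_normal((m, n)) + 1j * rng.standard_normal((m, n))
A_ref = rng.standard_normal((m, k)) + 1j * rng.standard_normal((m, k))

x0 = rng.standard_normal(n) + 1j * rng.standard_normal(n)
b = A_ref.conj() @ r   # known bias b_j = <a''_j, r>

# magnitudes of the stacked signal x0' = (x0, r) ...
y_holo = np.abs(np.concatenate([A_sig, A_ref], axis=1).conj()
                @ np.concatenate([x0, r]))
# ... equal the affine magnitudes |<a_j, x0> + b_j|
y_affine = np.abs(A_sig.conj() @ x0 + b)
assert np.allclose(y_holo, y_affine)
```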

Another application of affine phase retrieval arises in the Fourier phase retrieval problem. The one-dimensional Fourier phase retrieval problem usually does not have a unique solution [35]. In fact, for a given signal of dimension n, besides the trivial ambiguities caused by shift, conjugate reflection, and rotation, there can be up to \(2^{n-2}\) nontrivial solutions. To enforce uniqueness of solutions, one approach is to use additionally known values of some entries [4], which can be recast as affine phase retrieval. More related work on the uniqueness of solutions for Fourier phase retrieval can be found in [11, 31].

1.3 Our Contributions

In this paper, we focus on the recovery of sparse signals from the magnitude of affine measurements. Specifically, we aim to recover a k-sparse signal \({\varvec{x}}_0\in \mathbb F^n\) (\(\mathbb F={\mathbb R}\) or \(\mathbb F={\mathbb C}\)) from the data

$$\begin{aligned} { \varvec{ y}}=\left|\varvec{Ax}_0 + \varvec{b}\right|+{\varvec{w}}, \end{aligned}$$

where \(\varvec{A}:= [{\varvec{a}}_1,\ldots ,{\varvec{a}}_m]^* \in \mathbb F^{m\times n}\) is the measurement matrix, \({\varvec{b}}\in {\mathbb F}^m\) is the bias vector, and \({\varvec{w}}\in {\mathbb R}^m\) is the noise vector. Our aim is to present the performance of the following \(\ell _1\) minimization program:

$$\begin{aligned} \mathop {\textrm{argmin}}\limits _{\varvec{x} \in \mathbb F^n} \Vert \varvec{x} \Vert _1 \quad \text {s.t.} \ \Vert \vert \varvec{A} {\varvec{x}}+ \varvec{b} \vert - \varvec{y} \Vert _2 \le \epsilon . \end{aligned}$$
(4)

We say a triple \((\varvec{A}, {\varvec{b}}, \Delta )\) is instance optimal of order \(k_0\) if

$$\begin{aligned} \Vert \Delta (\left|\varvec{A}{\varvec{x}}+{\varvec{b}}\right|)-{\varvec{x}}\Vert _p \le C\cdot \sigma _{k_0}{({\varvec{x}})}_q \end{aligned}$$
(5)

holds for all \({\varvec{x}}\in {\mathbb F}^n\). Here, \(\Delta : {\mathbb R}^m\rightarrow {\mathbb F}^n\) is a decoder for reconstructing \({\varvec{x}}\), \(\sigma _k(\varvec{x})_q:= \min _{\vert {\text {supp}}(\varvec{z})\vert \le k} \Vert \varvec{z} - \varvec{x}\Vert _q\), and \(C:= C_{k_0, p, q}\) is a constant depending on \(k_0\), p and q.
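
Since the minimizer defining \(\sigma _{k}(\cdot )_q\) simply keeps the k largest-magnitude entries, the best k-term approximation error is directly computable; a small sketch (example values are illustrative):

```python
import numpy as np

def sigma_k(x: np.ndarray, k: int, q: float = 1.0) -> float:
    """Best k-term approximation error sigma_k(x)_q.

    The optimal k-sparse z keeps the k largest-magnitude entries of x,
    so the error is the l_q norm of the n - k smallest entries.
    """
    if k >= x.size:
        return 0.0
    tail = np.sort(np.abs(x))[: x.size - k]
    return float(np.sum(tail ** q) ** (1.0 / q))

x = np.array([5.0, -0.1, 3.0, 0.02, -2.0])
print(sigma_k(x, k=2, q=1.0))  # 2.12: the mass outside the two largest entries
```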

Theorem 1

Assume that there exists a matrix \(\varvec{A} \in {\mathbb F}^{m\times n}\), a vector \({\varvec{b}}\in {\mathbb F}^m\), a decoder \(\Delta : {\mathbb R}^m \rightarrow {\mathbb F}^n\) and positive integers \(k_0, \, p, \, q\) such that (5) holds for all \({\varvec{x}}\in {\mathbb F}^n\). Then \({\varvec{b}}\notin \{\varvec{A}{\varvec{z}}: {\varvec{z}}\in {\mathbb F}^n\}\).

Proof

We assume that \({\varvec{b}}= \varvec{A} {\varvec{z}}_0\) where \({\varvec{z}}_0 \in {\mathbb F}^n\). We next show that there exists \({\varvec{x}}\in {\mathbb F}^n\) for which (5) fails. For the sake of contradiction, we assume that (5) holds. Since \(\sigma _{k_0}{(-{\varvec{x}})}_q=\sigma _{k_0}{({\varvec{x}})}_q\), we have

$$\begin{aligned} \Vert \Delta (\left|\varvec{A}{\varvec{x}}-{\varvec{b}}\right|)+{\varvec{x}}\Vert _p=\Vert \Delta (\left|\varvec{A}(-{\varvec{x}})+{\varvec{b}}\right|)-(-{\varvec{x}})\Vert _p\le C \sigma _{k_0}{({\varvec{x}})}_q. \end{aligned}$$
(6)

Assume that \({\varvec{x}}_0\in {\mathbb F}^n\) is \(k_0\)-sparse, i.e. \(\sigma _{k_0}({\varvec{x}}_0)_q=0\). According to (5) and (6), we obtain that

$$\begin{aligned} \Delta (\left|\varvec{A}{\varvec{x}}_0+{\varvec{b}}\right|)\,\,=\,\, {\varvec{x}}_0,\quad \Delta (\left|\varvec{A}{\varvec{x}}_0-{\varvec{b}}\right|)\,\,=\,\, -{\varvec{x}}_0. \end{aligned}$$
(7)

Taking \({\varvec{x}}=r{\varvec{x}}_0+2{\varvec{z}}_0\) in (6), we have

$$\begin{aligned} \Vert \Delta (\left|\varvec{A}(r{\varvec{x}}_0+2{\varvec{z}}_0)-{\varvec{b}}\right|)+r{\varvec{x}}_0+2{\varvec{z}}_0\Vert _p\le C \sigma _{k_0}{(r{\varvec{x}}_0+2{\varvec{z}}_0)}_q\le C\Vert 2{\varvec{z}}_0\Vert _q, \end{aligned}$$
(8)

where \(r>0\), and the last inequality follows by taking the \(k_0\)-sparse competitor \(r{\varvec{x}}_0\) in the definition of \(\sigma _{k_0}(\cdot )_q\). Observe that

$$\begin{aligned} \Delta (\left|\varvec{A}(r{\varvec{x}}_0+2{\varvec{z}}_0)-{\varvec{b}}\right|)=\Delta (\left|\varvec{A}(r{\varvec{x}}_0)+{\varvec{b}}\right|)=r{\varvec{x}}_0. \end{aligned}$$
(9)

Here, the first equality uses \({\varvec{b}}=\varvec{A}{\varvec{z}}_0\), and the second follows as in (7), since \(r{\varvec{x}}_0\) is \(k_0\)-sparse. Substituting (9) into (8), we obtain that

$$\begin{aligned} \Vert 2r{\varvec{x}}_0+2{\varvec{z}}_0\Vert _p\,\,\le \,\, C\Vert 2{\varvec{z}}_0\Vert _q \end{aligned}$$
(10)

holds for any \(r>0\). Choosing \({\varvec{x}}_0 \ne 0\), we have \(\lim _{r\rightarrow \infty }\Vert 2r{\varvec{x}}_0+2{\varvec{z}}_0\Vert _p=\infty \). Hence, (10) fails once r is large enough, which is a contradiction. \(\square \)

For the case where \(m\le n\) and \(\varvec{A}\) has full rank, we have \({\varvec{b}}\in \{\varvec{A}{\varvec{z}}: {\varvec{z}}\in {\mathbb F}^n\}\). According to Theorem 1, we know that it is impossible to build an instance-optimality result under this setting. This is quite different from standard phase retrieval, for which [18] established the instance-optimality

$$\begin{aligned} \min _{\left|c\right|=1} \Vert \Delta (\left|\varvec{A}{\varvec{x}}\right|)-c {\varvec{x}}\Vert _p \le C\cdot \sigma _{k_0}{({\varvec{x}})}_q, \text { for all } {\varvec{x}}\in {\mathbb F}^n. \end{aligned}$$
(11)


1.3.1 Real Case

Our first result gives an upper bound on the reconstruction error of (4) in the real case, under the assumption that \({\varvec{a}}_1, \ldots ,{\varvec{a}}_m \in {\mathbb R}^n\) are real Gaussian random vectors and \(m\ge O(k\log ( en/k))\). It shows that the \(\ell _1\)-minimization program is stable under small perturbations, even for approximately k-sparse signals. To begin with, we need the following definition of the strong RIP condition, introduced by Voroninski and Xu [34].

Definition 1

(Strong RIP in [34]) The matrix \(\varvec{A}\in {\mathbb R}^{m\times n}\) satisfies the Strong Restricted Isometry Property (SRIP) of order k and constants \(\theta _l,\, \theta _u >0 \) if the following inequality

$$\begin{aligned} \theta _{l}\Vert {\varvec{x}} \Vert ^2 \le \min _{I\subset [m],\left|I\right|\ge m/2} \Vert \varvec{A}_I {\varvec{x}} \Vert ^2\le \max _{I\subset [m],\left|I\right|\ge m/2} \Vert \varvec{A}_I {\varvec{x}} \Vert ^2 \le \theta _{u}\Vert {\varvec{x}} \Vert ^2 \end{aligned}$$

holds for all k-sparse signals \({\varvec{x}}\in {\mathbb R}^n\). Here, \(\varvec{A}_I\) denotes the sub-matrix of \(\varvec{A}\) whose rows with indices in I are kept, \([m]:=\{1,\ldots ,m\}\) and \(\left|I\right|\) denotes the cardinality of I.
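Verifying the SRIP exactly is combinatorial in both the subset I and the k-sparse vector, so it is not directly computable; the sketch below merely samples random pairs (I, x) to produce empirical, non-certified estimates of \(\theta _l\) and \(\theta _u\) for a Gaussian matrix (all sizes illustrative).

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, k, trials = 80, 200, 4, 2000  # illustrative sizes

A = rng.standard_normal((m, n)) / np.sqrt(m)  # entries N(0, 1/m)

lo, hi = np.inf, 0.0
for _ in range(trials):
    S = rng.choice(n, size=k, replace=False)       # random support
    x = np.zeros(n)
    x[S] = rng.standard_normal(k)
    x /= np.linalg.norm(x)                         # unit-norm k-sparse x
    I = rng.choice(m, size=m // 2, replace=False)  # random subset, |I| = m/2
    val = np.linalg.norm(A[I] @ x) ** 2
    lo, hi = min(lo, val), max(hi, val)

print(f"empirical theta_l ~ {lo:.3f}, theta_u ~ {hi:.3f}")
```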

The following result indicates that the matrix \(\begin{bmatrix} \varvec{A}&\varvec{b} \end{bmatrix} \in \mathbb R^{m \times (n+1)}\) satisfies the strong RIP condition with high probability under some mild conditions on \(\varvec{A}\in {\mathbb R}^{m\times n}\) and \({\varvec{b}}\in {\mathbb R}^m\).

Theorem 2

Let \(\varvec{A} \in \mathbb R^{m \times n}\) be a Gaussian random matrix with entries \(a_{k, j}\sim \mathcal N(0,1/m)\). Suppose that the vector \(\varvec{b} \in \mathbb R^m\) satisfies \(\alpha \le \Vert \varvec{b}_I \Vert _2 \le \beta \) for all \(I \subseteq [m]\) with \( \left|I\right| \ge m/2\), where \(\alpha \le \beta \) are two positive constants. Set \(\varvec{A}':= \begin{bmatrix} \varvec{A}&\varvec{b} \end{bmatrix} \in \mathbb R^{m \times (n+1)}\). If \(m \ge Ct(k+1)\log ( en/k)\) with \(t(k+1) \le n\) and \(1 < t \in \mathbb Z\), then there exist constants \(\theta _l',\, \theta _u'\), independent of t, such that the matrix \(\varvec{A}'\) satisfies the strong RIP of order \(tk+1\) and constants \(\theta _l',\, \theta _u'\) with probability at least \(1 - 4\exp (-c'm)\). Here, \(C,\, c'>0\) are constants depending only on \(\alpha \) and \(\beta \).

The following theorem shows that if we add some restrictions on the signal \({\varvec{x}}\), then the instance-optimality result can be established.

Theorem 3

Assume that \(\varvec{A}':= \begin{bmatrix} \varvec{A}&\varvec{b} \end{bmatrix} \in \mathbb R^{m \times (n+1)}\) satisfies the strong RIP of order \((a+1)(k+1)\) with constants \(\theta _u \ge \theta _l>0\). If \(a > \theta _u/\theta _l\), then the following holds: for any vector \({\varvec{x}}_0 \in {\mathbb R}^n\), the solution \({\widehat{\varvec{x}}}\) to (4) with \(\varvec{y} = \vert \varvec{A} \varvec{x}_0 + \varvec{b}\vert + \varvec{w}\) and \(\Vert \varvec{w} \Vert _2 \le \epsilon \) obeys

$$\begin{aligned} \Vert {\widehat{\varvec{x}}}- \varvec{x}_0 \Vert _2 \le K_1\epsilon + K_2 \frac{\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}}, \end{aligned}$$

provided \(K_1\epsilon + K_2 \frac{\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}} <2\). Here,

$$\begin{aligned} K_1 := \frac{2\left( 1+1/\sqrt{a}\right) }{\sqrt{\theta _l}-\sqrt{\theta _u}/\sqrt{a}} >0, \quad K_2 := \sqrt{\theta _u}K_1 + 2. \end{aligned}$$

From Theorem 2, we know that if \(\varvec{A} \in \mathbb R^{m \times n}\) is a Gaussian random matrix with entries \(a_{k, j}\sim \mathcal N(0,1/m)\) and the sampling complexity \(m\ge C (a+1)(k+2)\log ( en/k)\), then with high probability the matrix \(\varvec{A}':= \begin{bmatrix} \varvec{A}&\varvec{b} \end{bmatrix}\) satisfies the strong RIP condition of order \((a+1)(k+1)\) with constants \(\theta _l,\, \theta _u>0\) under some mild conditions on \({\varvec{b}}\). Here, the constants \(\theta _l,\, \theta _u\) are independent of a. Therefore, taking the constant \(a> \theta _u/\theta _l\), the conclusion of Theorem 3 holds with high probability.

In the absence of noise, i.e., \({\varvec{w}}=0, \, \epsilon =0\), Theorem 3 shows that if \({\varvec{a}}_1, \ldots , {\varvec{a}}_m \sim 1/\sqrt{m} \cdot \mathcal N(0, I_n)\) are real Gaussian random vectors and \(m\ge O(k\log ( en/k))\), then all k-sparse signals \({\varvec{x}}_0 \in {\mathbb R}^n\) can be reconstructed exactly by solving the program (4) under some mild conditions on \({\varvec{b}}\). We state this as the following corollary:

Corollary 1

Let \(\varvec{A} \in \mathbb {R}^{m \times n}\) be a Gaussian random matrix with entries \(a_{jk}\sim \mathcal N(0,1/m)\), and \(\varvec{b} \in \mathbb {R}^m\) be a vector satisfying \(\alpha \le \Vert \varvec{b}_I \Vert _2 \le \beta \) for all \(I \subseteq [m]\) with \( \left|I\right|\ge m/2 \), where \(\alpha \le \beta \) are two positive universal constants. If \(m \ge C k\log ( en/k)\), then with probability at least \(1 - 4\exp (-c m)\) it holds: for any k-sparse signal \(\varvec{x}_0 \in \mathbb {R}^n\), the \(\ell _1\) minimization

$$\begin{aligned} \mathop {\textrm{argmin}}\limits _{\varvec{x} \in \mathbb {R}^n} \Vert \varvec{x}\Vert _1 \quad \mathrm {s.t.} \quad \left|\varvec{A} {\varvec{x}}+ \varvec{b}\right| = \varvec{y} \end{aligned}$$

with \(\varvec{y} = \left|\varvec{A} {\varvec{x}}_0 + \varvec{b}\right|\) has a unique solution \(\varvec{x}_0\). Here \(C,\, c>0\) are constants depending only on \(\alpha \) and \(\beta \).

1.3.2 Complex Case

We next turn to consider the estimation performance of (4) for complex-valued signals. Let \(\mathbb {H}^{n \times n}\) be the set of Hermitian matrices in \({\mathbb C}^{n\times n}\), and let \(\Vert \varvec{H}\Vert _{0, 2}\) denote the number of non-zero rows of \(\varvec{H}\). Given \({\varvec{a}}_1, \ldots , {\varvec{a}}_m \in {\mathbb C}^n\) and \(b_1,\ldots ,b_m \in {\mathbb C}\), we define a linear map \(\mathcal A': \mathbb H^{(n+1)\times (n+1)}\rightarrow {\mathbb R}^m\) as follows:

$$\begin{aligned} \mathcal A'(\varvec{H}') = ({\varvec{a}}_1'^* \varvec{H}' {\varvec{a}}_1', \dots , {\varvec{a}}_m'^* \varvec{H}' {\varvec{a}}_m'), \end{aligned}$$
(12)

where \({\varvec{a}}_j':= \Bigl ( \begin{array}{l} {\varvec{a}}_j \\ b_j \end{array} \Bigr ) \in {\mathbb C}^{n+1}\).

Definition 2

We say the linear map \(\mathcal A'\) defined in (12) satisfies the restricted isometry property of order \((r, k)\) with constants \(c,\, C >0\) if the following holds

$$\begin{aligned} c \Vert \varvec{H}' \Vert _F \le \frac{1}{m}\Vert \mathcal A'(\varvec{H}') \Vert _1 \le C \Vert \varvec{H}' \Vert _F \end{aligned}$$
(13)

for all \(\varvec{H}':= \begin{bmatrix} \varvec{H} &{} {\varvec{h}}\\ {\varvec{h}}^* &{} 0 \end{bmatrix} \in \mathbb H^{(n+1)\times (n+1)}\) with \({\text {rank}}(\varvec{H}) \le r\), \(\Vert \varvec{H} \Vert _{0, 2} \le k\) and \(\Vert {\varvec{h}} \Vert _0 \le k\).
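As an illustration, the RIP quantity of Definition 2 can be evaluated numerically for a structured \(\varvec{H}'\) of the type appearing in the proofs below (rank-two, row-sparse \(\varvec{H}\) and sparse \({\varvec{h}}\) built from two sparse vectors); all sizes and random data in this sketch are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, k = 40, 300, 3  # hypothetical sizes

def complex_gaussian(*shape):
    # components ~ N(0, 1/2), matching a ~ (1/sqrt 2) N(0,I) + (i/sqrt 2) N(0,I)
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2)

def sparse_vec():
    v = np.zeros(n, dtype=complex)
    S = rng.choice(n, size=k, replace=False)
    v[S] = rng.standard_normal(k) + 1j * rng.standard_normal(k)
    return v

a = complex_gaussian(m, n)                         # rows a_j
b = rng.standard_normal(m) + 1j * rng.standard_normal(m)
a_prime = np.concatenate([a, b[:, None]], axis=1)  # a'_j = (a_j; b_j)

x_hat, x0 = sparse_vec(), sparse_vec()
H = np.outer(x_hat, x_hat.conj()) - np.outer(x0, x0.conj())  # rank <= 2, row-sparse
h = x_hat - x0                                               # 2k-sparse

H_prime = np.zeros((n + 1, n + 1), dtype=complex)
H_prime[:n, :n], H_prime[:n, n], H_prime[n, :n] = H, h, h.conj()

# entries a'_j^* H' a'_j of A'(H'), then the ratio (1/m)||A'(H')||_1 / ||H'||_F
vals = np.real(np.einsum('mi,ij,mj->m', a_prime.conj(), H_prime, a_prime))
print(np.abs(vals).sum() / m / np.linalg.norm(H_prime, 'fro'))
```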

The following theorem shows that the linear map \({\mathcal A}'\) satisfies the restricted isometry property over low-rank and sparse matrices, provided \({\varvec{a}}_1, \ldots , {\varvec{a}}_m \in {\mathbb C}^n\) are i.i.d. complex Gaussian random vectors and \({\varvec{b}}:= (b_1,\ldots ,b_m)^\textrm{T}\in {\mathbb C}^m\) satisfies some mild conditions.

Theorem 4

Suppose \({\varvec{a}}_1, \ldots , {\varvec{a}}_m \sim 1/\sqrt{2}\cdot \mathcal N(0,I_n) + i/\sqrt{2}\cdot \mathcal N(0,I_n)\) are i.i.d. complex Gaussian random vectors and \({\varvec{b}}\in {\mathbb C}^m\) is an independent sub-gaussian random vector (it may also be deterministic) with sub-gaussian norm \(\Vert {\varvec{b}} \Vert _{\psi _2} \le C\) and \({\mathbb E}\Vert {\varvec{b}} \Vert _1 \ge c_1 m\), \({\mathbb E}\Vert {\varvec{b}} \Vert _2 \le c_2 \sqrt{m} \), where \(C>0,\, c_2 \ge c_1 > 0\) are universal constants. If \(m\ge C' k\log ( en/k)\), then with probability at least \(1-5\exp (-c' m)\), the linear map \({\mathcal A}'\) defined in (12) obeys

$$\begin{aligned} \frac{\theta ^-}{12} \Vert \varvec{H}' \Vert _F \le \frac{1}{m}\Vert \mathcal A'(\varvec{H}') \Vert _1 \le 3\theta ^+ \Vert \varvec{H}' \Vert _F \end{aligned}$$

for all \(\varvec{H}':= \begin{bmatrix} \varvec{H} &{} {\varvec{h}}\\ {\varvec{h}}^* &{} 0 \end{bmatrix} \in \mathbb H^{(n+1)\times (n+1)}\) with \({\text {rank}}(\varvec{H}) \le 2\), \(\Vert \varvec{H} \Vert _{0, 2} \le k\) and \(\Vert {\varvec{h}} \Vert _0 \le k\). Here, \(\theta ^-:=\min (1, c_1/\sqrt{2})\), \(\theta ^+:=\max (\sqrt{6}, c_2)\), and \(C',\, c'>0\) are constants depending only on \(c_1,\, c_2\).

With a slight abuse of notation, we denote \(\mathcal A'({\varvec{x}}'):=\mathcal A'({\varvec{x}}'{\varvec{x}}'^*)\) for any vector \({\varvec{x}}'\in {\mathbb C}^{n+1}\). Then we have

Theorem 5

Assume that the linear map \(\mathcal A'(\cdot )\) satisfies the RIP condition (13) of order (2, 2ak) with constants \(c,\, C>0\). For any k-sparse signal \({\varvec{x}}_0 \in {\mathbb C}^n\), if

$$\begin{aligned} c - C\left( \frac{4}{\sqrt{a}}+\frac{1}{a}\right) >0, \end{aligned}$$

then the solution \({\widehat{\varvec{x}}}\in {\mathbb C}^n\) to

$$\begin{aligned} \mathop {\textrm{argmin}}\limits _{{\varvec{x}}\in {\mathbb C}^n} \quad \Vert {\varvec{x}} \Vert _1 \quad \text{ s.t. } \quad \Vert \mathcal A'({\varvec{x}}')-{\tilde{\varvec{y}}} \Vert \le \epsilon \quad \text{ and } \quad {\varvec{x}}'=({\varvec{x}}^\textrm{T}, 1)^\textrm{T}\end{aligned}$$

with \({\tilde{\varvec{y}}}=\mathcal A'({\varvec{x}}'_0)+{\varvec{w}}\), \(\Vert {\varvec{w}} \Vert \le \epsilon \) and \({\varvec{x}}'_0=({\varvec{x}}_0^\textrm{T}, 1)^\textrm{T}\) obeys

$$\begin{aligned} \min _{\theta \in {\mathbb R}} \left( \Vert {{\widehat{\varvec{x}}}-{e}^{ i\theta }{\varvec{x}}_0}\Vert _2 + \left|1-{e}^{ i\theta } \right|\right) \le \frac{C_0 \epsilon }{\left( \Vert {\varvec{x}}_0 \Vert +1\right) \sqrt{m}}, \end{aligned}$$

where

$$\begin{aligned} C_0:=2\sqrt{2} \cdot \frac{\frac{1}{a} +\frac{4}{\sqrt{a}}+1}{c - C\left( \frac{4}{\sqrt{a}}+\frac{1}{a}\right) }. \end{aligned}$$

Based on Theorem 4, if \({\varvec{a}}_1,\ldots ,{\varvec{a}}_m \in {\mathbb C}^n\) are i.i.d. complex Gaussian random vectors and \(m\ge C' ak\log ( en/ak)\), then with high probability the linear map \(\mathcal A'\) defined in (12) satisfies the RIP condition of order (2, 2ak) with constants \(c=\theta ^-/12\) and \(C=3\theta ^+\) under some mild conditions on \({\varvec{b}}\). For the noiseless case where \({\varvec{w}}=0,\,\epsilon =0\), taking the constant \(a> (8C/c)^2\) and combining with Theorem 5, we obtain the following result.

Corollary 2

Suppose \({\varvec{a}}_1, \ldots , {\varvec{a}}_m \sim 1/\sqrt{2}\cdot \mathcal N(0,I_n)+{i}/\sqrt{2}\cdot \mathcal N(0,I_n)\) are i.i.d. complex Gaussian random vectors and \({\varvec{b}}\in {\mathbb C}^m\) is an independent sub-gaussian random vector (it may also be deterministic) with sub-gaussian norm \(\Vert {\varvec{b}} \Vert _{\psi _2} \le C\) and \( {\mathbb E}\Vert {\varvec{b}} \Vert _1 \ge c_1 m\), \({\mathbb E}\Vert {\varvec{b}} \Vert _2 \le c_2 \sqrt{m}\), where \(C>0,\, c_2 \ge c_1 > 0\) are universal constants. If \(m \ge C'' k\log ( en/k)\), then with probability at least \(1-5\exp (-c'' m)\), the solution to

$$\begin{aligned} \mathop {\textrm{argmin}}\limits _{{\varvec{x}}\in {\mathbb C}^n} \quad \Vert {\varvec{x}} \Vert _1 \quad \text{ s.t. } \quad \left|\varvec{A}{\varvec{x}}+{\varvec{b}}\right|=\left|\varvec{A}{\varvec{x}}_0+{\varvec{b}}\right| \end{aligned}$$

is exactly \({\varvec{x}}_0\). Here, \(C'',\, c'' > 0\) are constants depending only on \(c_1,\,c_2\).

Remark 1

We give an upper bound for \(\min _{\theta \in {\mathbb R}} \left( \Vert {{\widehat{\varvec{x}}}-e^{{i}\theta }{\varvec{x}}_0}\Vert _2 + \left|1-e^{ i\theta }\right|\right) \) in Theorem 5. However, since affine phase retrieval can recover a signal exactly (not just up to a global phase), one may wonder: is there a stable recovery bound for \(\Vert {{\widehat{\varvec{x}}}-{\varvec{x}}_0}\Vert _2\)? We believe that the answer is no, especially in the case where the noise vector satisfies \(\Vert {{\varvec{w}}}\Vert _2 \gtrsim \sqrt{m}\). We leave a proof of this to future work.

1.4 Notations

Throughout the paper, we denote \({\varvec{x}}\sim \mathcal N(0,I_n)\) if \({\varvec{x}}\in {\mathbb R}^n\) is a standard Gaussian random vector. A vector \(\varvec{x}\) is k-sparse if it has at most k nonzero entries. For simplicity, we denote \([m]:= \{1, \dots , m\}\). For any subset \(I \subseteq [m]\), let \(\varvec{A}_I = \begin{bmatrix} {\varvec{a}}_j: j \in I \end{bmatrix}^*\) be the submatrix of \(\varvec{A} = \begin{bmatrix} {\varvec{a}}_1,\ldots , {\varvec{a}}_m \end{bmatrix}^*\) obtained by keeping the rows indexed by I. Denote \(\sigma _k(\varvec{x}_0)_p:= \min _{\left|{\text {supp}}(\varvec{x})\right| \le k} \Vert \varvec{x} - \varvec{x}_0\Vert _p\) as the best k-term approximation error of \(\varvec{x}_0\) with respect to the \(\ell _p\) norm. For a complex number b, we use \(b_{\Re }\) and \(b_{\Im }\) to denote the real and imaginary parts of b, respectively. For any \(A,B\in {\mathbb R}\), we use \( A \lesssim B\) to denote \(A\le C_0 B\), where \(C_0\in {\mathbb R}_+\) is an absolute constant. The notation \(\gtrsim \) is defined similarly. Throughout this paper, c, C and their subscripted (superscripted) forms denote constants whose values may vary with the context.

2 Proof of Theorem 2 and Theorem 3

In this section, we consider the estimation performance of the \(\ell _1\)-minimization program (4) for the real-valued signals. Before proceeding, we need the following lemma which shows that if \(\varvec{A} \in {\mathbb R}^{m\times n}\) is a real Gaussian random matrix with entries \(a_{k,j}\sim \mathcal N(0,1/m)\), then \(\varvec{A}\) satisfies the strong RIP with high probability.

Lemma 1

(Theorem 2.1 in [34]) Suppose that \(t>1\) is a constant and that \(\varvec{A}\in {\mathbb R}^{m\times n}\) is a Gaussian random matrix with entries \(a_{k,j}\sim \mathcal N(0,1/m)\). Let \(m=O(tk \log (e n/k))\), where \(k \in [1,n]\cap {\mathbb Z}\). Then there exist constants \(\theta _l, \theta _u\) with \(0<\theta _l< \theta _u<2\), independent of t, such that \(\varvec{A}\) satisfies the SRIP of order \(t\cdot k\) and constants \(\theta _l,\, \theta _u\) with probability at least \(1-\exp (-cm)\), where \(c>0\) is a universal constant.

2.1 Proof of Theorem 2

Proof

From the definition, it suffices to show there exist constants \(\theta '_l, \theta '_u >0 \) such that the following inequality

$$\begin{aligned} \theta '_{l}\Vert {\varvec{x}}' \Vert ^2 \le \min _{I\subset [m],\left|I\right|\ge m/2} \Vert \varvec{A}'_I {\varvec{x}}' \Vert ^2\le \max _{I\subset [m],\left|I\right|\ge m/2} \Vert \varvec{A}'_I {\varvec{x}}' \Vert ^2 \le \theta '_{u}\Vert {\varvec{x}}' \Vert ^2 \end{aligned}$$
(14)

holds for all \((tk+1)\)-sparse signals \({\varvec{x}}' \in {\mathbb R}^{n+1}\). To this end, we denote \(\varvec{x}' = (\varvec{x}^\textrm{T}, z)^\textrm{T}\), where \(\varvec{x} \in \mathbb R^n\) and \(z \in \mathbb R\). We first consider the case where \(z=0\). From Lemma 1, we know that if \(m \gtrsim t(k+1)\log ( en/(k+1))\) and \(t>1\), then there exist two positive constants \(\theta _l, \theta _u \in (0,2)\) such that

$$\begin{aligned} \theta _l \Vert \varvec{x}\Vert _2^2 \le \min _{I \subseteq [m], \left|I\right|\ge m/2} \Vert \varvec{A}_I \varvec{x}\Vert _2^2 \le \max _{I \subseteq [m], \left|I\right|\ge m/2}\Vert \varvec{A}_I \varvec{x}\Vert _2^2 \le \theta _u \Vert \varvec{x}\Vert _2^2 \end{aligned}$$
(15)

holds for all \((tk+1)\)-sparse vectors \(\varvec{x} \in \mathbb R^n\) with probability at least \(1 - \exp (-cm)\). Here, \(c>0\) is a universal constant. Note that \(\varvec{A}' \varvec{x}' = \varvec{A} \varvec{x}\). We immediately obtain (14) for the case where \(z=0\).

Next, we turn to the case where \(z \ne 0\). A simple calculation shows that

$$\begin{aligned} \Vert \varvec{A}'_I \varvec{x}'\Vert _2^2 = \Vert \varvec{A}_I \varvec{x} + z\varvec{b}_I\Vert _2^2 = \Vert \varvec{A}_I \varvec{x}\Vert _2^2 + 2z\langle \varvec{A}_I \varvec{x}, \varvec{b}_I \rangle + z^2\Vert \varvec{b}_I\Vert _2^2 \end{aligned}$$
(16)

for any \(I \subseteq [m]\). Denote \(\varvec{A} = \begin{bmatrix} {\varvec{a}}_1,\ldots , {\varvec{a}}_m \end{bmatrix}^\textrm{T}\). Note that \(\sqrt{m} {\varvec{a}}_j \sim \mathcal N(0, I_n)\). Taking \(\zeta = \frac{\min (\theta _l,\alpha ^2)}{200\beta }\) in Lemma 5, we obtain that there exists a constant \(C>0\) depending only on \(\theta _l,\, \alpha ,\, \beta \) such that when \(m\ge Ct(k+1)\log (en/k)\), with probability at least \(1-3\exp (-c_1 m)\), it holds

$$\begin{aligned} \left|\langle \varvec{A}_I \varvec{x}, \varvec{b}_I \rangle \right| = \left|\langle \varvec{A} {\varvec{x}}, \varvec{b}_I \rangle \right| \le \frac{\min \{\theta _l,\alpha ^2\}}{200\beta } \Vert \varvec{x}\Vert _2\Vert \varvec{b}\Vert _2 \end{aligned}$$
(17)

for all \((tk+1)\)-sparse vectors \(\varvec{x}\) and all \(I \subseteq [m]\). Here, we view \({\varvec{b}}_I={\varvec{b}}\mathbb {I}_I\in {\mathbb R}^m\) (\({\mathbb I}_I(j)=1\) if \(j\in { I}\) and 0 if \(j\notin { I}\)), and \(c_1 > 0\) is a constant depending only on \(\theta _l,\, \alpha ,\, \beta \). Note that the vector \(\varvec{b}\) satisfies

$$\begin{aligned} \alpha \le \Vert \varvec{b}_I \Vert _2 \le \beta \end{aligned}$$
(18)

for all \(I \subseteq [m]\) with \(\left|I\right|\ge m/2\). Putting (15), (17) and (18) into (16), we obtain that when \(m \ge Ct(k+1)\log (en/k)\), with probability at least \(1-4\exp (-cm)\), the following two inequalities

$$\begin{aligned} \Vert \varvec{A}'_I \varvec{x}'\Vert _2^2 \ge \theta _l \Vert \varvec{x}\Vert _2^2 - 2\left|z\right|\frac{\min \{\theta _l, \alpha ^2\}}{200\beta }\Vert \varvec{x}\Vert _2 \beta + \alpha ^2 z^2 \ge 0.99\min \{\theta _l, \alpha ^2\} \Vert \varvec{x}'\Vert _2^2, \end{aligned}$$

and

$$\begin{aligned} \Vert \varvec{A}'_I \varvec{x}'\Vert _2^2 \le \theta _u \Vert \varvec{x}\Vert _2^2 + 2\left|z\right|\frac{\min \{\theta _l, \alpha ^2\}}{200\beta }\Vert \varvec{x}\Vert _2 \beta + \beta ^2 z^2 \le 1.01 \max \{\theta _u, \beta ^2\} \Vert \varvec{x}'\Vert _2^2 \end{aligned}$$

hold for all \((tk+1)\)-sparse vectors \(\varvec{x}' \in \mathbb R^{n+1}\) and for all \(I \subseteq [m]\) with \(\left|I\right|\ge m/2\). Here, \(c > 0\) is a constant depending only on \(\theta _l,\, \alpha ,\, \beta \). In other words, we have

$$\begin{aligned} \theta '_l \Vert \varvec{x}' \Vert _2^2 \le \min _{I \subseteq [m], \left|I\right|\ge m/2} \Vert \varvec{A}'_I \varvec{x}'\Vert _2^2 \le \max _{I \subseteq [m], \left|I\right|\ge m/2} \Vert \varvec{A}'_I \varvec{x}' \Vert _2^2 \le \theta '_u \Vert \varvec{x}'\Vert _2^2 \end{aligned}$$

for all \((tk+1)\)-sparse vectors \(\varvec{x}'\) with probability at least \(1 - 4\exp (-cm)\). Here, \(\theta '_l = 0.99\min \{\theta _l, \alpha ^2\}\) and \(\theta '_u = 1.01 \max \{\theta _u, \beta ^2\}\). Combining the above two cases and noting that \(\theta _l\), \(\theta _u >0\) are universal constants, we complete the proof. \(\square \)

2.2 Proof of Theorem 3

Proof

Denote \(\varvec{A}' = \begin{bmatrix} \varvec{A}&\varvec{b} \end{bmatrix}\), \({\widehat{\varvec{x}}}' = ({\widehat{\varvec{x}}}^\textrm{T}, 1)^\textrm{T}\) and \({\varvec{x}}'_0 = (\varvec{x}_0^\textrm{T}, 1)^\textrm{T}\). Set

$$\begin{aligned} I:= \{j: (\langle {\varvec{a}}_j, {\widehat{\varvec{x}}} \rangle + b_j)(\langle {\varvec{a}}_j, \varvec{x}_0 \rangle + b_j) \ge 0\}. \end{aligned}$$
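
In words, I collects the measurements on which the affine quantities for \({\widehat{\varvec{x}}}\) and \({\varvec{x}}_0\) share a sign, so that on I the difference of magnitudes equals the magnitude of the difference. A small real-case sketch (all data random and purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 12, 5  # illustrative sizes

A = rng.standard_normal((m, n)) / np.sqrt(m)
b = rng.standard_normal(m)
x_hat, x0 = rng.standard_normal(n), rng.standard_normal(n)

u, v = A @ x_hat + b, A @ x0 + b
I = np.flatnonzero(u * v >= 0)  # sign-agreement set

# on I, | |u_j| - |v_j| | = |u_j - v_j|, the fact behind inequality (26) below
assert np.allclose(np.abs(np.abs(u[I]) - np.abs(v[I])), np.abs(u[I] - v[I]))
```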

We next divide the proof into the following two cases.

Case 1: \(\left|I\right| \ge m/2\). Set \({\varvec{h}}= {\widehat{\varvec{x}}}' - {\varvec{x}}'_0\). For any \(a>1\), we decompose \({\varvec{h}}\) into the sum of \({\varvec{h}}_{T_0}, {\varvec{h}}_{T_1}, \dots \), where \(T_0\) is the index set consisting of the indices of the \(k+1\) largest coordinates of \({\varvec{x}}'_0\) in magnitude, \(T_1\) is the index set corresponding to the \(a(k+1)\) largest coordinates of \({\varvec{h}}_{T_0^c}\) in magnitude, \(T_2\) is the index set corresponding to the \(a(k+1)\) largest coordinates of \({\varvec{h}}_{(T_0 \cup T_1)^c}\) in magnitude, and so on. For simplicity, we denote \(T_{jl}:= T_j \cup T_l\). To prove the theorem, we only need to give an upper bound for \(\Vert {{\varvec{h}}}\Vert _2\). Observe that

$$\begin{aligned} \Vert {{\varvec{h}}}\Vert _2 \le \Vert {{\varvec{h}}_{T_{01}}}\Vert _2 + \Vert {{\varvec{h}}-{\varvec{h}}_{T_{01}}}\Vert _2. \end{aligned}$$
(19)

We claim that the following holds:

$$\begin{aligned} \Vert {{\varvec{h}}-{\varvec{h}}_{T_{01}}}\Vert _2 \le \frac{1}{\sqrt{a}}\Vert {\varvec{h}}_{T_{01}} \Vert _2 + \frac{2\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}} \end{aligned}$$
(20)

and

$$\begin{aligned} \Vert {{\varvec{h}}_{T_{01}}}\Vert _2 \le \frac{2}{\sqrt{\theta _l}- \sqrt{\theta _u}/\sqrt{a}} \cdot \left( \epsilon + \frac{\sqrt{\theta _u} \sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}}\right) . \end{aligned}$$
(21)

Here, \(\theta _l\) and \(\theta _u\) are the strong RIP constants of \(\varvec{A}'\). Putting (20) and (21) into (19), we obtain that

$$\begin{aligned} \Vert {{\varvec{h}}}\Vert _2 \le \frac{2\left( 1+1/\sqrt{a}\right) }{\sqrt{\theta _l}- \sqrt{\theta _u}/\sqrt{a}} \epsilon + \left( \frac{2(1+1/\sqrt{a})\sqrt{\theta _u}}{\sqrt{\theta _l}- \sqrt{\theta _u}/\sqrt{a}} +2\right) \frac{\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}}. \end{aligned}$$

It remains to prove claims (20) and (21). Since \({\widehat{\varvec{x}}}\) is the solution to the \(\ell _1\) minimization program (4), we have

$$\begin{aligned} \Vert {\varvec{x}}'_0 \Vert _1 \ge \Vert {\widehat{\varvec{x}}}' \Vert _1&= \Vert {\varvec{x}}'_0 + {\varvec{h}} \Vert _1 = \Vert ({\varvec{x}}'_0 + {\varvec{h}})_{T_0} \Vert _1 + \Vert ({\varvec{x}}'_0 + {\varvec{h}})_{T_0^c} \Vert _1 \\ {}&\ge \Vert {\varvec{x}}'_{0,T_0} \Vert _1 - \Vert {\varvec{h}}_{T_0} \Vert _1 + \Vert {\varvec{h}}_{T_0^c} \Vert _1 - \Vert {\varvec{x}}'_{0,T_0^c} \Vert _1. \end{aligned}$$

Therefore,

$$\begin{aligned} \Vert {\varvec{h}}_{T_0^c}\Vert _1 \le \Vert {\varvec{h}}_{T_0} \Vert _1 + 2\Vert {\varvec{x}}'_{0,T_0^c} \Vert _1. \end{aligned}$$
(22)

From the definition of \(T_j\), we obtain that, for all \(j \ge 2\),

$$\begin{aligned} \Vert {\varvec{h}}_{T_j} \Vert _2 \le \sqrt{a(k+1)}\Vert {\varvec{h}}_{T_j} \Vert _\infty = \frac{a(k+1)}{\sqrt{a(k+1)}}\Vert {\varvec{h}}_{T_j} \Vert _\infty \le \frac{\Vert {\varvec{h}}_{T_{j-1}} \Vert _1}{\sqrt{a(k+1)}}. \end{aligned}$$

It then gives

$$\begin{aligned} \Vert {\varvec{h}}_{T_{01}^c} \Vert _2 \le \sum _{j \ge 2}\Vert {\varvec{h}}_{T_j} \Vert _2 \le \frac{1}{\sqrt{a(k+1)}}\sum _{j \ge 2}\Vert {\varvec{h}}_{T_{j-1}} \Vert _1 = \frac{1}{\sqrt{a(k+1)}}\Vert {\varvec{h}}_{T_0^c} \Vert _1. \end{aligned}$$
(23)

Putting (22) into (23), we obtain the conclusion of claim (20), namely,

$$\begin{aligned} \begin{aligned} \Vert {\varvec{h}}_{T_{01}^c} \Vert _2&\le \frac{1}{\sqrt{a(k+1)}}\Vert {\varvec{h}}_{T_0^c} \Vert _1 \le \frac{\Vert {\varvec{h}}_{T_0} \Vert _1 + 2\Vert {\varvec{x}}'_{0,T_0^c} \Vert _1}{\sqrt{a(k+1)}} \\ {}&\le \frac{1}{\sqrt{a}} \Vert {\varvec{h}}_{T_0} \Vert _2 + \frac{2\sigma _{k+1}({\varvec{x}}'_0)_1}{\sqrt{a(k+1)}} \le \frac{1}{\sqrt{a}}\Vert {\varvec{h}}_{T_{01}} \Vert _2 + \frac{2\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}}, \end{aligned} \end{aligned}$$
(24)

where the third inequality follows from the Cauchy–Schwarz inequality, and the last one uses the fact \(\sigma _{k+1}({\varvec{x}}'_0)_1 \le \sigma _k({\varvec{x}}_0)_1\), which holds by the definitions of \({\varvec{x}}'_0\) and \(\sigma _k(\cdot )_1\).
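
The bookkeeping above only uses how the blocks \(T_0, T_1, T_2, \dots \) are formed; a small sketch of this decomposition (sizes and data hypothetical):

```python
import numpy as np

def blocks(x0p: np.ndarray, h: np.ndarray, k: int, a: int):
    """Index sets T_0, T_1, T_2, ... used in Case 1.

    T_0 holds the k+1 largest entries of x0' in magnitude; the complement
    is chopped into blocks of a*(k+1) indices, ordered by |h| decreasing.
    """
    T0 = np.argsort(-np.abs(x0p))[: k + 1]
    rest = np.setdiff1d(np.arange(x0p.size), T0)
    rest = rest[np.argsort(-np.abs(h[rest]))]
    step = a * (k + 1)
    return [T0] + [rest[i : i + step] for i in range(0, rest.size, step)]

rng = np.random.default_rng(5)
x0p, h = rng.standard_normal(21), rng.standard_normal(21)
for j, T in enumerate(blocks(x0p, h, k=2, a=2)):
    print(f"T_{j}:", np.sort(T))
```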

We next turn to prove the claim (21). Observe that

$$\begin{aligned} \Vert \varvec{A}'_I {\varvec{h}} \Vert _2 \ge \Vert \varvec{A}'_I {\varvec{h}}_{T_{01}} \Vert _2 - \Vert \varvec{A}'_I {\varvec{h}}_{T_{01}^c} \Vert _2. \end{aligned}$$
(25)

For the left hand side of (25), by the definition of I, we have

$$\begin{aligned} \begin{aligned} \Vert \varvec{A}'_I {\varvec{h}} \Vert _2&= \Vert \vert \varvec{A}'_I {\widehat{\varvec{x}}}'\vert - \vert \varvec{A}'_I {\varvec{x}}'_0\vert \Vert _2 \\ {}&\le \Vert \vert \varvec{A}' {\widehat{\varvec{x}}}'\vert - \vert \varvec{A}' {\varvec{x}}'_0\vert \Vert _2 \\ {}&\le \Vert \vert \varvec{A}' {\widehat{\varvec{x}}}'\vert - { \varvec{ y}}\Vert _2 +\Vert \vert \varvec{A}' {\varvec{x}}'_0\vert - { \varvec{ y}}\Vert _2 \\&\le 2\epsilon . \end{aligned} \end{aligned}$$
(26)

For the first term of the right hand side of (25), since the matrix \(\varvec{A}'\) satisfies strong RIP of order \((a+1)(k+1)\) with constants \(\theta _l,\theta _u >0\), we immediately have

$$\begin{aligned} \Vert \varvec{A}'_I {\varvec{h}}_{T_{01}} \Vert _2 \ge \sqrt{\theta _l} \Vert {{\varvec{h}}_{T_{01}}}\Vert _2. \end{aligned}$$
(27)

To give an upper bound for the term \( \Vert \varvec{A}'_I {\varvec{h}}_{T_{01}^c} \Vert _2\), note that \(\Vert {\varvec{h}}_{T_{01}^c} \Vert _\infty \le \Vert {\varvec{h}}_{T_1} \Vert _1/a(k+1)\). Let \(\theta :=\max \left( \Vert {\varvec{h}}_{T_1} \Vert _1/a(k+1), \Vert {\varvec{h}}_{T_{01}^c} \Vert _1/a(k+1) \right) \). Then by Lemma 2, we can decompose the vector \({\varvec{h}}_{T_{01}^c}\) into the following form:

$$\begin{aligned} {\varvec{h}}_{T_{01}^c} =\sum _{j=1}^N \lambda _j {\varvec{u}}_j, \quad \text {with} \quad 0\le \lambda _j\le 1, \quad \sum _{j=1}^N \lambda _j=1, \end{aligned}$$

where \({\varvec{u}}_j\) are \(a(k+1)\)-sparse vectors satisfying

$$\begin{aligned} \Vert {{\varvec{u}}_j}\Vert _1=\Vert { {\varvec{h}}_{T_{01}^c}}\Vert _1, \quad \Vert {\varvec{u}}_j \Vert _\infty \le \theta . \end{aligned}$$

Therefore, we have

$$\begin{aligned} \Vert {{\varvec{u}}_j}\Vert _2 \le \sqrt{\theta \Vert { {\varvec{h}}_{T_{01}^c}}\Vert _1}. \end{aligned}$$

We notice from (22) that

$$\begin{aligned} \Vert { {\varvec{h}}_{T_{01}^c}}\Vert _1 \le \Vert { {\varvec{h}}_{T_{0}^c}}\Vert _1 \le \Vert {\varvec{h}}_{T_0} \Vert _1 + 2\sigma _k({\varvec{x}}_0)_1. \end{aligned}$$

Thus, if \(\theta = \Vert {\varvec{h}}_{T_1} \Vert _1/a(k+1)\), then we have

$$\begin{aligned} \Vert {{\varvec{u}}_j}\Vert _2 \le \sqrt{\frac{\Vert {\varvec{h}}_{T_1} \Vert _1 \Vert {{\varvec{h}}_{T_{01}^c}}\Vert _1}{a(k+1)}} \le \sqrt{\frac{\Vert {\varvec{h}}_{T_0^c} \Vert _1 \Vert { {\varvec{h}}_{T_{01}^c}}\Vert _1}{a(k+1)}} \le \frac{\Vert {\varvec{h}}_{T_0} \Vert _1 + 2\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}} \le \frac{\Vert {\varvec{h}}_{T_0} \Vert _2}{\sqrt{a}}+\frac{ 2\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}}. \end{aligned}$$

If \(\theta =\Vert {\varvec{h}}_{T_{01}^c} \Vert _1/a(k+1) \), then

$$\begin{aligned} \Vert {{\varvec{u}}_j}\Vert _2 \le \frac{\Vert {\varvec{h}}_{T_{01}^c} \Vert _1}{\sqrt{a(k+1)}} \le \frac{\Vert {\varvec{h}}_{T_0} \Vert _2}{\sqrt{a}}+\frac{2\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}}. \end{aligned}$$

Therefore, for the second term of the right hand side of (25), it follows from the definition of strong RIP that

$$\begin{aligned} \Vert \varvec{A}'_I {\varvec{h}}_{T_{01}^c} \Vert _2 =\Vert {\sum _{j=1}^N \lambda _j \varvec{A}'_I {\varvec{u}}_j }\Vert _2 \le \sqrt{\theta _u} \sum _{j=1}^N \lambda _j \Vert {{\varvec{u}}_j}\Vert _2 \le \sqrt{\theta _u} \left( \frac{\Vert {\varvec{h}}_{T_0} \Vert _2}{\sqrt{a}}+\frac{2\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}}\right) . \end{aligned}$$
(28)

Putting (26), (27) and (28) into (25), we immediately obtain

$$\begin{aligned} 2\epsilon \ge \sqrt{\theta _l} \Vert {{\varvec{h}}_{T_{01}}}\Vert _2 - \sqrt{\theta _u}\left( \frac{\Vert {\varvec{h}}_{T_{01}} \Vert _2}{\sqrt{a}}+\frac{2\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}}\right) , \end{aligned}$$

which gives

$$\begin{aligned} \Vert {{\varvec{h}}_{T_{01}}}\Vert _2 \le \frac{2}{\sqrt{\theta _l}-\sqrt{\theta _u}/\sqrt{a}} \cdot \left( \epsilon + \frac{\sqrt{\theta _u} \sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}}\right) . \end{aligned}$$

Case 2: \(\left|I\right| < m/2\). For this case, denote \({\varvec{h}}^+ = {\widehat{\varvec{x}}}' + {\varvec{x}}'_0\). Replacing \({\varvec{h}}\) and the subset I in Case 1 by \({\varvec{h}}^+\) and \(I^c\), respectively, and applying the same argument, we can obtain

$$\begin{aligned} \Vert {\varvec{h}}^+ \Vert _2 \le \frac{2\left( 1+1/\sqrt{a}\right) }{\sqrt{\theta _l}-\sqrt{\theta _u}/\sqrt{a}} \epsilon + \left( \frac{2(1+1/\sqrt{a})\sqrt{\theta _u}}{\sqrt{\theta _l}-\sqrt{\theta _u}/\sqrt{a}} +2\right) \frac{\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}}. \end{aligned}$$
(29)

However, recall that \({\widehat{\varvec{x}}}' = ({\widehat{\varvec{x}}}^\textrm{T}, 1)^\textrm{T}\) and \({\varvec{x}}'_0 = (\varvec{x}_0^\textrm{T}, 1)^\textrm{T}\). This means \(\Vert {\varvec{h}}^+ \Vert _2 \ge 2\), which contradicts (29) by the assumption on \(\epsilon \) and \(\sigma _k({\varvec{x}}_0)_1\), i.e., \( K_1\epsilon + K_2 \frac{\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}} <2 \). Therefore, Case 2 cannot occur.

Combining the above two cases, we complete our proof. \(\square \)

3 Proof of Theorems 4 and 5

3.1 Proof of Theorem 4

Proof

Without loss of generality, we assume that \(\Vert {\varvec{H}'}\Vert _F=1\). Observe that

$$\begin{aligned} \frac{1}{m}\Vert {\mathcal A}'(\varvec{H}') \Vert _1=\frac{1}{m} \sum _{j=1}^m \left|{\varvec{a}}_j^* \varvec{H} {\varvec{a}}_j+ 2(b_j ( {\varvec{a}}_j^*{\varvec{h}}))_{\Re }\right|:= \frac{1}{m} \sum _{j=1}^m \xi _j. \end{aligned}$$

For any fixed \(\varvec{H} \in \mathbb H^{n\times n}\) and \({\varvec{h}}\in {\mathbb C}^n\), the terms \(\xi _j, j=1,\ldots ,m\) are independent sub-exponential random variables with the maximal sub-exponential norm

$$\begin{aligned} K:=\max _{1\le j\le m}C_1(\Vert {\varvec{H}}\Vert _F+ \Vert b_j \Vert _{\psi _2} \Vert {\varvec{h}} \Vert ) \le C_2 \end{aligned}$$

for some universal constants \(C_1,\, C_2>0\). Here, we use the fact \(\max \left( \Vert {\varvec{H}}\Vert _F,\Vert {\varvec{h}} \Vert \right) \le \Vert {\varvec{H}'}\Vert _F = 1\). For any \(0<\epsilon \le 1\), Bernstein's inequality gives

$$\begin{aligned} {\mathbb P}\Biggl (\Bigl \vert \frac{1}{m} \sum _{j=1}^m \left( \xi _j - {\mathbb E}\xi _j\right) \Bigr \vert \ge \epsilon \Biggr ) \le 2\exp \left( -c\epsilon ^2 m\right) , \end{aligned}$$

where \(c>0\) is a universal constant. According to Lemma 6, we obtain that

$$\begin{aligned} \frac{1}{3} {\mathbb E}\sqrt{ \Vert {\varvec{H}}\Vert _F^2+ \vert b_j\vert ^2 \Vert {\varvec{h}} \Vert ^2 } \le {\mathbb E}\xi _j \le 2 {\mathbb E}\sqrt{3 \Vert {\varvec{H}}\Vert _F^2+ \vert b_j \vert ^2 \Vert {\varvec{h}} \Vert ^2}. \end{aligned}$$

This gives

$$\begin{aligned} \frac{1}{m} \sum _{j=1}^m {\mathbb E}\xi _j \le \frac{2}{m} \sum _{j=1}^m {\mathbb E}\left( \sqrt{3} \Vert {\varvec{H}}\Vert _F+ \vert b_j\vert \Vert {\varvec{h}} \Vert \right) \le 2\sqrt{3} \Vert {\varvec{H}}\Vert _F+2c_2\Vert {\varvec{h}} \Vert \le 2\theta ^+, \end{aligned}$$

where \(\theta ^+:=\max (\sqrt{6}, c_2)\). Here, we use the facts \(\Vert {\varvec{H}'}\Vert _F^2=\Vert {\varvec{H}}\Vert _F^2+2\Vert {\varvec{h}} \Vert ^2=1\), \({\mathbb E}\Vert {\varvec{b}} \Vert _1\le \sqrt{m} {\mathbb E}\Vert {\varvec{b}} \Vert \le c_2m\), and \(\frac{a+b}{\sqrt{2}} \le \sqrt{a^2+b^2} \le a+b\) for any positive numbers \(a,b \in {\mathbb R}\). Similarly, we can obtain

$$\begin{aligned} \frac{1}{m} \sum _{j=1}^m {\mathbb E}\xi _j \ge \frac{1}{3\sqrt{2}}\cdot \frac{1}{m} \sum _{j=1}^m {\mathbb E}\left( \Vert {\varvec{H}}\Vert _F+\vert b_j \vert \Vert {\varvec{h}} \Vert \right) \ge \frac{1}{3\sqrt{2}} \left( \Vert {\varvec{H}}\Vert _F+c_1 \Vert {\varvec{h}} \Vert \right) \ge \frac{\theta ^-}{6}, \end{aligned}$$

where \(\theta ^-:=\min (1, c_1/\sqrt{2})\). Collecting the above estimates, we obtain that, with probability at least \(1-2\exp (-c\epsilon ^2 m)\), the following inequality

$$\begin{aligned} \frac{\theta ^-}{6} -\epsilon \le \frac{1}{m}\Vert {\mathcal A}'(\varvec{H}') \Vert _1\le 2\theta ^+ +\epsilon \end{aligned}$$
(30)

holds for a fixed \(\varvec{H}' \in \mathbb H^{(n+1)\times (n+1)}\). We next show that (30) holds for all \(\varvec{H}' \in \mathcal X\), where

$$\begin{aligned} \mathcal X \!:=\! \left\{ \varvec{H}'\!:=\!\begin{bmatrix} \varvec{H} &{} {\varvec{h}}\\ {\varvec{h}}^* &{} 0 \end{bmatrix} \in \mathbb H^{(n+1)\times (n+1)}: \Vert {\varvec{H}'}\Vert _F=1,\, {\text {rank}}(\varvec{H}) \le 2,\, \Vert \varvec{H} \Vert _{0, 2} \le k,\, \Vert {\varvec{h}} \Vert _0 \le k\right\} . \end{aligned}$$

To this end, we adopt a basic version of a \(\delta \)-net argument. Assume that \(\mathcal N_\delta \) is a \(\delta \)-net of \(\mathcal X\), i.e., for any \(\varvec{H}'=\begin{bmatrix} \varvec{H} &{} {\varvec{h}}\\ {\varvec{h}}^* &{} 0 \end{bmatrix} \in \mathcal X\) there exists an \(\varvec{H}_0':=\begin{bmatrix} \varvec{H}_0 &{} {\varvec{h}}_0 \\ {\varvec{h}}_0^* &{} 0 \end{bmatrix} \in \mathcal N_\delta \) such that \(\Vert {\varvec{H} - \varvec{H}_0}\Vert _F\le \delta \) and \(\Vert {\varvec{h}}-{\varvec{h}}_0 \Vert \le \delta \). Using the same idea as in Lemma 2.1 of [36], we obtain that the covering number of \(\mathcal X\) satisfies

$$\begin{aligned} \left|\mathcal N_\delta \right| \le \left( \frac{9\sqrt{2} en}{\delta k} \right) ^{4k+2}\cdot \left( \begin{array}{l} n \\ k \end{array}\right) \left( 1+\frac{2}{\delta }\right) ^{2k} \le \exp \left( C_3 k\log (en/ \delta k)\right) , \end{aligned}$$

where \(C_3>0\) is a universal constant. Note that \({\varvec{h}}-{\varvec{h}}_0\) has at most 2k nonzero entries. We obtain that if \(m\gtrsim k\log (en/k)\), then with probability at least \(1-3\exp (-c m)\), it holds

$$\begin{aligned} \left|\frac{1}{m}\Vert {\mathcal A}'(\varvec{H}') \Vert _1-\frac{1}{m}\Vert {\mathcal A}'(\varvec{H}_0') \Vert _1\right|&\le \frac{1}{m}\Vert {\mathcal A}'(\varvec{H}'-\varvec{H}'_0) \Vert _1 \\&\le \frac{1}{m}\Vert \mathcal A(\varvec{H}-\varvec{H}_0) \Vert _1 +\frac{2}{m} \sum _{j=1}^m \vert b_j\vert \vert {\varvec{a}}_j^* ({\varvec{h}}-{\varvec{h}}_0)\vert \\ {}&\le \frac{1}{m}\Vert \mathcal A(\varvec{H}-\varvec{H}_0) \Vert _1 \\&\quad + 2\sqrt{\frac{1}{m} \sum _{j=1}^m \vert b_j\vert ^2} \! \sqrt{\frac{1}{m} \sum _{j=1}^m \vert {\varvec{a}}_j^* ({\varvec{h}}-{\varvec{h}}_0)\vert ^2}\\ {}&\le 2.45 \Vert {\varvec{H}-\varvec{H}_0}\Vert _F + 3(c_2+1) \Vert {\varvec{h}}-{\varvec{h}}_0 \Vert \\ {}&\le 3\left( c_2+2\right) \delta , \end{aligned}$$

where the linear map \(\mathcal A(\cdot )\) is defined as \(\mathcal {A}(\varvec{H}):= ({\varvec{a}}_1^*\varvec{H} {\varvec{a}}_1, \dots , {\varvec{a}}_m^*\varvec{H} {\varvec{a}}_m)\), and the fourth inequality follows from the combination of Lemma 3, the fact that \(\frac{1}{m} \sum _{j=1}^m {\varvec{a}}_j {\varvec{a}}_j^* \preceq \frac{3}{2} I_n\) with probability at least \(1-\exp (-c m)\), and

$$\begin{aligned} \frac{1}{m} \sum _{j=1}^m \vert b_j \vert ^2 \le \frac{{\mathbb E}\Vert {\varvec{b}} \Vert ^2}{m}+1\le c_2 +1 \end{aligned}$$

with probability at least \(1-2\exp (-c m)\). Choosing \(\epsilon :=\frac{\theta ^-}{48}\), \(\delta :=\frac{\theta ^-}{48(c_2+2)}\), and taking the union bound, we obtain that the following inequality

$$\begin{aligned} \frac{\theta ^-}{12} \le \frac{1}{m}\Vert {\mathcal A}'(\varvec{H}') \Vert _1\le 3\theta ^+ \quad \text{ for } \text{ all } \quad \varvec{H}'\in \mathcal X \end{aligned}$$

holds with probability at least

$$\begin{aligned} 1-3\exp (-cm)-2 \exp \left( C_3 k \log (en/ \delta k)\right) \cdot \exp (-c \epsilon ^2 m) \ge 1-5 \exp (-c' m), \end{aligned}$$

provided \(m\ge C' k\log (en/k)\), where \(C',\, c'>0\) are constants depending only on \(c_1\) and \(c_2\). \(\square \)

3.2 Proof of Theorem 5

Proof

The proof of this theorem is adapted from that of Theorem 1.3 in [36]. Note that the \(\ell _1\)-minimization problem we consider is

$$\begin{aligned} \mathop {\textrm{argmin}}\limits _{{\varvec{x}}\in {\mathbb C}^n} \quad \Vert {\varvec{x}} \Vert _1 \quad \mathrm {s.t.} \quad \Vert \mathcal A'({\varvec{x}}')-{\tilde{\varvec{y}}} \Vert \le \epsilon \quad \text{ with } \quad {\varvec{x}}'= \begin{pmatrix} {\varvec{x}}\\ 1\end{pmatrix}. \end{aligned}$$
(31)

Here, with some abuse of notation, we set

$$\begin{aligned} \mathcal A'({\varvec{x}}'):=\mathcal A'({\varvec{x}}' {\varvec{x}}'^*) = \left( \left|{\varvec{a}}_1'^* {\varvec{x}}'\right|^2, \ldots , \left|{\varvec{a}}_m'^* {\varvec{x}}'\right|^2\right) \quad \text{ with } \quad {\varvec{a}}_j':=\begin{pmatrix} {\varvec{a}}_j\\ b_j\end{pmatrix},\quad j=1,\ldots ,m. \end{aligned}$$

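A quick numeric check of the lifting identity behind this notation: \({\varvec{a}}'^*({\varvec{x}}'{\varvec{x}}'^*){\varvec{a}}' = |{\varvec{a}}'^*{\varvec{x}}'|^2\), so the lifted measurements are exactly the squared affine magnitudes (with the paper's inner-product convention absorbing the conjugations). All data below are random and illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 30

a = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
b = rng.standard_normal() + 1j * rng.standard_normal()
x = rng.standard_normal(n) + 1j * rng.standard_normal(n)

a_prime = np.concatenate([a, [b]])    # a' = (a; b)
x_prime = np.concatenate([x, [1.0]])  # x' = (x; 1)

# quadratic form a'^* (x' x'^*) a' versus squared magnitude |a'^* x'|^2
lifted = np.real(a_prime.conj() @ np.outer(x_prime, x_prime.conj()) @ a_prime)
direct = np.abs(a_prime.conj() @ x_prime) ** 2
assert np.isclose(lifted, direct)
```
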
Let \({\widehat{\varvec{x}}}\in {\mathbb C}^n\) be a solution to (31). Without loss of generality, we assume \(\langle {\widehat{\varvec{x}}}',{\varvec{x}}_0' \rangle \ge 0\) (otherwise, we can replace \({\varvec{x}}_0'\) by \(e^{\textrm{i}\theta } {\varvec{x}}_0'\) for an appropriate \(\theta \)), where \({\widehat{\varvec{x}}}'= \begin{pmatrix} {\widehat{\varvec{x}}}\\ 1\end{pmatrix}\) and \({\varvec{x}}_0'= \begin{pmatrix} {\varvec{x}}_0 \\ 1\end{pmatrix}\). Set

$$\begin{aligned} \hat{\varvec{X}}':={\widehat{\varvec{x}}}'{\widehat{\varvec{x}}}'^*=\begin{pmatrix} {\widehat{\varvec{x}}}{\widehat{\varvec{x}}}^* &{} {\widehat{\varvec{x}}}\\ {\widehat{\varvec{x}}}^*&{} 1 \end{pmatrix} \end{aligned}$$

and

$$\begin{aligned} \varvec{H}':={\widehat{\varvec{x}}}'{\widehat{\varvec{x}}}'^*-{\varvec{x}}'_0{{\varvec{x}}'_0}^*=\begin{pmatrix} {\widehat{\varvec{x}}}{\widehat{\varvec{x}}}^* -{\varvec{x}}_0{\varvec{x}}_0^* &{} {\widehat{\varvec{x}}}-{\varvec{x}}_0 \\ {\widehat{\varvec{x}}}^* -{\varvec{x}}_0^*&{} 0 \end{pmatrix}:=\begin{pmatrix} \varvec{H} &{} {\varvec{h}}\\ {\varvec{h}}^*&{} 0\end{pmatrix}. \end{aligned}$$

Therefore, it suffices to give an upper bound for \(\Vert {\varvec{H}'}\Vert _F\). Denote \(T_0:={\text {supp}}({\varvec{x}}_0)\) and \(T'_0:=T_0 \cup \left\{ n+1\right\} \). Let \(T_1\) be the index set corresponding to the indices of the ak largest elements of \({\widehat{\varvec{x}}}_{T_0^c}\) in magnitude, let \(T_2\) contain the indices of the next ak largest elements, and so on. Set \(T_{01}:=T_0 \cup T_1\), \(T'_{01}:=T'_0 \cup T_1\), \( \bar{\varvec{h}}:={\varvec{h}}_{T_{01}}\), \( \bar{\varvec{H}}=\varvec{H}_{T_{01}, T_{01}}\), and \( \bar{\varvec{H}}':= \varvec{H}'_{T'_{01}, T'_{01}}\). Noting that

$$\begin{aligned} \Vert {\varvec{H}'}\Vert _F\le \Vert { \bar{\varvec{H}}'}\Vert _F +\Vert {\varvec{H}'- \bar{\varvec{H}}'}\Vert _F, \end{aligned}$$
(32)

we next consider the terms \(\Vert { \bar{\varvec{H}}'}\Vert _F\) and \(\Vert {\varvec{H}'- \bar{\varvec{H}}'}\Vert _F\). We claim that

$$\begin{aligned} \Vert {\varvec{H}'- \bar{\varvec{H}}'}\Vert _F \le \left( \frac{1}{a} +\frac{4}{\sqrt{a}}\right) \Vert { \bar{\varvec{H}}'}\Vert _F \end{aligned}$$
(33)

and

$$\begin{aligned} \Vert { \bar{\varvec{H}}'}\Vert _F \le \frac{1}{c - C\left( \frac{4}{\sqrt{a}}+\frac{1}{a}\right) } \cdot \frac{2\epsilon }{\sqrt{m}}. \end{aligned}$$
(34)

Combining (32), (33) and (34), we obtain that

$$\begin{aligned} \Vert {\varvec{H}'}\Vert _F\le \frac{\frac{1}{a} +\frac{4}{\sqrt{a}}+1}{c - C\left( \frac{4}{\sqrt{a}}+\frac{1}{a}\right) } \cdot \frac{2\epsilon }{\sqrt{m}}. \end{aligned}$$

According to Lemma 4, we immediately have

$$\begin{aligned} \min _{\theta \in {\mathbb R}} \Vert {{\widehat{\varvec{x}}}'-e^{\textrm{i}\theta } {\varvec{x}}'_0}\Vert _2 \le \frac{\sqrt{2}\Vert \varvec{H}' \Vert }{\Vert {\varvec{x}}_0 \Vert +1} \le \frac{\frac{1}{a} +\frac{4}{\sqrt{a}}+1}{c - C\left( \frac{4}{\sqrt{a}}+\frac{1}{a}\right) } \cdot \frac{2\sqrt{2} \epsilon }{\left( \Vert {\varvec{x}}_0 \Vert +1\right) \sqrt{m}}. \end{aligned}$$

By the definition of \({\widehat{\varvec{x}}}'\) and \({\varvec{x}}'_0\), we arrive at the conclusion.

It remains to prove the claims (33) and (34). Note that

$$\begin{aligned} \Vert {\varvec{H}'- \bar{\varvec{H}}'}\Vert _F \le \sum _{i\ge 2, j\ge 2} \Vert {\varvec{H}_{T_i, T_j}}\Vert _F + 2 \sum _{j\ge 2} \Vert {\varvec{H}'_{T'_0, T_j}}\Vert _F+2 \sum _{j\ge 2} \Vert {\varvec{H}'_{T_1, T_j}}\Vert _F . \end{aligned}$$
(35)

We first give an upper bound for the term \(\sum _{i\ge 2, j\ge 2} \Vert {\varvec{H}_{T_i, T_j}}\Vert _F\). Noting that \({\varvec{x}}_0\) is a k-sparse vector and \({\widehat{\varvec{x}}}\in {\mathbb C}^n\) is the solution to (31), we obtain that

$$\begin{aligned} \Vert {\varvec{x}}_0 \Vert _1 \ge \Vert {{\widehat{\varvec{x}}}}\Vert _1=\Vert {{\widehat{\varvec{x}}}_{T_0}}\Vert _1+\Vert {{\widehat{\varvec{x}}}_{T_0^c}}\Vert _1, \end{aligned}$$

which implies \(\Vert {{\widehat{\varvec{x}}}_{T_0^c}}\Vert _1 \le \Vert {{\widehat{\varvec{x}}}_{T_0}-{\varvec{x}}_0}\Vert _1\). Moreover, by the definition of \(T_j\), we know that for all \(j\ge 2\), it holds \( \Vert {\widehat{\varvec{x}}}_{T_j} \Vert _2 \le \frac{\Vert {\widehat{\varvec{x}}}_{T_{j-1}} \Vert _1}{\sqrt{ak}}\). It then implies

$$\begin{aligned} \sum _{j\ge 2} \Vert {{\widehat{\varvec{x}}}_{T_{j}}}\Vert _2 \le \frac{1}{\sqrt{ak}} \sum _{j\ge 2} \Vert {{\widehat{\varvec{x}}}_{T_{j-1}}}\Vert _1 \le \frac{1}{\sqrt{ak}} \Vert {\widehat{\varvec{x}}}_{T_0^c} \Vert _1 \le \frac{1}{\sqrt{a}} \Vert {{\widehat{\varvec{x}}}_{T_0}-{\varvec{x}}_0}\Vert _2. \end{aligned}$$
(36)

Therefore, the first term of (35) can be estimated as

$$\begin{aligned} \begin{aligned} \sum _{i\ge 2, j\ge 2} \Vert {\varvec{H}_{T_i, T_j}}\Vert _F= \sum _{i\ge 2, j\ge 2} \Vert {{\widehat{\varvec{x}}}_{T_i}}\Vert _2 \Vert {{\widehat{\varvec{x}}}_{T_j}}\Vert _2 = \left( \sum _{j\ge 2} \Vert {{\widehat{\varvec{x}}}_{T_{j}}}\Vert _2\right) ^2 \le \frac{1}{ak} \Vert {\widehat{\varvec{x}}}_{T_{0}^c} \Vert _1^2 \\ = \frac{1}{ak} \Vert \varvec{H}_{T_0^c, T_0^c} \Vert _1 \le \frac{1}{ak} \Vert \varvec{H}_{T_0, T_0} \Vert _1 \le \frac{1}{a} \Vert { \bar{\varvec{H}}'}\Vert _F, \end{aligned} \end{aligned}$$
(37)

where the second inequality follows from

$$\begin{aligned} \Vert {\varvec{H}-\varvec{H}_{T_0,T_0}}\Vert _1 =\Vert {{\widehat{\varvec{x}}}{\widehat{\varvec{x}}}^*-({\widehat{\varvec{x}}}{\widehat{\varvec{x}}}^*)_{T_0,T_0}}\Vert _1 \le \Vert {{\varvec{x}}_0{{\varvec{x}}_0}^*}\Vert _1-\Vert {({\widehat{\varvec{x}}}{\widehat{\varvec{x}}}^*)_{T_0,T_0}}\Vert _1 \le \Vert {\varvec{H}_{T_0, T_0}}\Vert _1. \end{aligned}$$

Here, the first inequality comes from \(\Vert {{\widehat{\varvec{x}}}}\Vert _1 \le \Vert {{\varvec{x}}_0}\Vert _1\).

For the second term and the third term of (35), we obtain that

$$\begin{aligned} \begin{aligned} \sum _{j\ge 2} \Vert {\varvec{H}'_{T'_0, T_j}}\Vert _F+\sum _{j\ge 2} \Vert {\varvec{H}'_{T_1, T_j}}\Vert _F&= \Vert {\widehat{\varvec{x}}}'_{T'_0} \Vert \sum _{j\ge 2} \Vert {\widehat{\varvec{x}}}'_{T_j} \Vert + \Vert {\widehat{\varvec{x}}}'_{T_1} \Vert \sum _{j\ge 2} \Vert {\widehat{\varvec{x}}}'_{T_j} \Vert \\ {}&\le \frac{1}{\sqrt{a}} \Vert {{\widehat{\varvec{x}}}'_{T'_0}-{\varvec{x}}'_0}\Vert _2 \left( \Vert {{\widehat{\varvec{x}}}'_{T'_0}}\Vert _2+\Vert {{\widehat{\varvec{x}}}'_{T_1}}\Vert _2\right) \\ {}&\le \frac{\sqrt{2}}{\sqrt{a}} \Vert {{\widehat{\varvec{x}}}'_{T'_{01}}-{\varvec{x}}'_0}\Vert _2 \Vert {{\widehat{\varvec{x}}}'_{T'_{01}}}\Vert _2 \\&\le \frac{2}{\sqrt{a}} \Vert { \bar{\varvec{H}}'}\Vert _F, \end{aligned} \end{aligned}$$
(38)

where the first inequality follows from (36) due to \({\widehat{\varvec{x}}}'_{T_j}={\widehat{\varvec{x}}}_{T_j}\) for all \(j\ge 1\), and the last inequality comes from Lemma 4. Putting (37) and (38) into (35), we obtain that

$$\begin{aligned} \Vert {\varvec{H}'- \bar{\varvec{H}}'}\Vert _F \le \left( \frac{1}{a} +\frac{4}{\sqrt{a}}\right) \Vert { \bar{\varvec{H}}'}\Vert _F. \end{aligned}$$

This proves the claim (33).

Finally, we turn to prove the claim (34). Note that \(\Vert \mathcal A'({\widehat{\varvec{x}}}')-{\tilde{\varvec{y}}} \Vert \le \epsilon \) and \({\tilde{\varvec{y}}}:=\mathcal A'({\varvec{x}}_0')+{\varvec{w}}\) with \(\Vert {\varvec{w}} \Vert \le \epsilon \), which implies

$$\begin{aligned} \Vert {\mathcal A'(\varvec{H}')}\Vert _2 \le \Vert {\mathcal A'({\widehat{\varvec{x}}}')-{\tilde{\varvec{y}}}}\Vert _2+ \Vert {\mathcal A'({\varvec{x}}_0')-{\tilde{\varvec{y}}}}\Vert _2 \le 2\epsilon . \end{aligned}$$

Thus, we have

$$\begin{aligned} \frac{2\epsilon }{\sqrt{m}} \ge \frac{1}{\sqrt{m}} \Vert {\mathcal A'(\varvec{H}')}\Vert _2 \ge \frac{1}{m} \Vert {\mathcal A'(\varvec{H}')}\Vert _1 \ge \frac{1}{m} \Vert {\mathcal A'( \bar{\varvec{H}}')}\Vert _1 - \frac{1}{m} \Vert {\mathcal A'(\varvec{H}'- \bar{\varvec{H}}')}\Vert _1. \end{aligned}$$
(39)

Recall that \( \bar{\varvec{H}}':=\begin{pmatrix} \bar{\varvec{H}}&{} \bar{\varvec{h}}\\ \bar{\varvec{h}}^*&{} 0\end{pmatrix}\) with \({\text {rank}}( \bar{\varvec{H}})\le 2\), \(\Vert \bar{\varvec{H}} \Vert _{0,2}\le (a+1)k\), and \(\Vert \bar{\varvec{h}} \Vert _0 \le (a+1)k\). It then follows from the RIP of \(\mathcal A'\) that

$$\begin{aligned} \Vert {\mathcal A'( \bar{\varvec{H}}')}\Vert _1 \ge c \Vert { \bar{\varvec{H}}'}\Vert _F. \end{aligned}$$
(40)

To prove (34), it suffices to give an upper bound for the term \(\frac{1}{m} \Vert {\mathcal A'(\varvec{H}'- \bar{\varvec{H}}')}\Vert _1\). Observe that

$$\begin{aligned} \varvec{H}'- \bar{\varvec{H}}' = (\varvec{H}'_{T'_0, {T'^c_{01}}}+\varvec{H}'_{{T'^c_{01}},T'_0}) + (\varvec{H}'_{T_1, T'^c_{01}}+\varvec{H}'_{ T'^c_{01},T_1})+\varvec{H}'_{T'^c_{01}, T'^c_{01}}. \end{aligned}$$
(41)

Since

$$\begin{aligned} \varvec{H}'_{T'_0, {T'^c_{01}}}+\varvec{H}'_{ {T'^c_{01}},T'_0} =\sum _{j\ge 2} (\varvec{H}'_{T'_0, T_{j}}+\varvec{H}'_{ T_{j},T'_0}) =\sum _{j\ge 2} \begin{pmatrix} {\widehat{\varvec{x}}}_{T_0}{\widehat{\varvec{x}}}_{T_j}^*+{\widehat{\varvec{x}}}_{T_j}{\widehat{\varvec{x}}}_{T_0}^* &{} {\widehat{\varvec{x}}}_{T_j} \\ {\widehat{\varvec{x}}}_{T_j}^* &{} 0\end{pmatrix}, \end{aligned}$$

then the RIP of \(\mathcal A'\) implies

$$\begin{aligned} \begin{aligned} \frac{1}{m} \Vert {\mathcal A'(\varvec{H}'_{T'_0, {T'^c_{01}}}+\varvec{H}'_{ {T'^c_{01}},T'_0} )}\Vert _1&\le C \sum _{j\ge 2} \left( \Vert {{\widehat{\varvec{x}}}_{T_0}{\widehat{\varvec{x}}}_{T_j}^*+{\widehat{\varvec{x}}}_{T_j}{\widehat{\varvec{x}}}_{T_0}^*}\Vert _F +2\Vert { {\widehat{\varvec{x}}}_{T_j}}\Vert _2\right) \\ {}&\le 2\sqrt{2} C \Vert {{\widehat{\varvec{x}}}'_{T'_0}}\Vert _2 \sum _{j\ge 2} \Vert {{\widehat{\varvec{x}}}_{T_{j}}}\Vert _2 \\ {}&\le \frac{2\sqrt{2}}{\sqrt{a}}C \Vert {{\widehat{\varvec{x}}}'_{T'_0}}\Vert _2 \Vert {{\widehat{\varvec{x}}}'_{T'_{01}}-{\varvec{x}}'_0}\Vert _2. \end{aligned} \end{aligned}$$
(42)

Similarly, we can obtain

$$\begin{aligned} \frac{1}{m} \Vert {\mathcal A'(\varvec{H}'_{T_1, T'^c_{01}}+\varvec{H}'_{ T'^c_{01},T_1})}\Vert _1 \le \frac{2\sqrt{2}}{\sqrt{a}}C \Vert {{\widehat{\varvec{x}}}'_{T_1}}\Vert _2 \Vert {{\widehat{\varvec{x}}}'_{T'_{01}}-{\varvec{x}}'_0}\Vert _2. \end{aligned}$$
(43)

Finally, observe that \(\frac{1}{m} \Vert {\mathcal A' (\varvec{H}'_{T'^c_{01}, T'^c_{01}})}\Vert _1 = \frac{1}{m} \Vert {\mathcal A (\varvec{H}_{T^c_{01}, T^c_{01}})}\Vert _1\). Using the same technique as in [36, Eq. (3.16)], we can obtain

$$\begin{aligned} \frac{1}{m} \Vert {\mathcal A'(\varvec{H}'_{T'^c_{01}, T'^c_{01}})}\Vert _1 \le \frac{C}{a} \Vert { \bar{\varvec{H}}'}\Vert _F. \end{aligned}$$
(44)

Putting (42), (43) and (44) into (41), we have

$$\begin{aligned} \frac{1}{m} \Vert {\mathcal A'(\varvec{H}'- \bar{\varvec{H}}')}\Vert _1 \le \frac{4}{\sqrt{a}}C \Vert {{\widehat{\varvec{x}}}'_{T'_{01}}}\Vert _2 \Vert {{\widehat{\varvec{x}}}'_{T'_{01}}-{\varvec{x}}'_0}\Vert _2+\frac{C}{a} \Vert { \bar{\varvec{H}}'}\Vert _F \le C\left( \frac{4}{\sqrt{a}}+\frac{1}{a}\right) \Vert { \bar{\varvec{H}}'}\Vert _F. \end{aligned}$$
(45)

Combining (39), (40) and (45), we immediately obtain

$$\begin{aligned} \left( c- C\left( \frac{4}{\sqrt{a}}+\frac{1}{a}\right) \right) \Vert { \bar{\varvec{H}}'}\Vert _F \le \frac{2\epsilon }{\sqrt{m}}, \end{aligned}$$

which implies

$$\begin{aligned} \Vert { \bar{\varvec{H}}'}\Vert _F \le \frac{1}{c- C\left( \frac{4}{\sqrt{a}}+\frac{1}{a}\right) } \cdot \frac{2\epsilon }{\sqrt{m}}. \end{aligned}$$

This completes the proof of claim (34). \(\square \)