Abstract
Affine phase retrieval is the problem of recovering signals from magnitude-only measurements together with some a priori information. In this paper, we use \(\ell _1\) minimization to exploit the sparsity of signals for affine phase retrieval, showing that \(O(k\log ( en/k))\) Gaussian random measurements suffice to recover all k-sparse signals by solving a natural \(\ell _1\) minimization program, where n is the dimension of the signals. For the case where the measurements are corrupted by noise, reconstruction error bounds are given for both real-valued and complex-valued signals. Our results demonstrate that the natural \(\ell _1\) minimization program for affine phase retrieval is stable.
1 Introduction
1.1 Problem Setup
Affine phase retrieval for sparse signals aims to recover a k-sparse signal \({\varvec{x}}_0 \in {\mathbb F}^n\), \({\mathbb F}\in \{{\mathbb R},{\mathbb C}\}\), from the observed data
\(y_j = \left|\langle {\varvec{a}}_j, {\varvec{x}}_0 \rangle + b_j\right| + w_j, \quad j=1, \ldots , m,\)
where \({\varvec{a}}_j \in {\mathbb F}^n, \, j=1, \ldots , m\), are given measurement vectors, \({\varvec{b}}:= (b_1, \ldots , b_m)^\textrm{T}\in {\mathbb F}^m\) is the given bias vector, and \({\varvec{w}}:= (w_1, \ldots , w_m)^\textrm{T}\in {\mathbb R}^m\) is the noise vector. Affine phase retrieval arises in several practical applications, such as holography [2, 20, 26, 27] and Fourier phase retrieval [3,4,5, 23], where some side information about the signal is known a priori before capturing the magnitude-only measurements.
The aim of this paper is to study the following program to recover \({\varvec{x}}_0\) from \({ \varvec{ y}}:= (y_1, \ldots , y_m)^\textrm{T}\in {\mathbb R}^m\):
\(\min _{{\varvec{x}}\in {\mathbb F}^n} \Vert {\varvec{x}}\Vert _1 \quad \text {s.t.} \quad \bigl \Vert \left|\varvec{A}{\varvec{x}}+{\varvec{b}}\right| -{ \varvec{ y}}\bigr \Vert _2 \le \epsilon , \qquad \qquad (1)\)
where \(\varvec{A}:= [{\varvec{a}}_1, \ldots , {\varvec{a}}_m] ^* \in {\mathbb F}^{m\times n}\).
Particularly, we focus on the following questions:
-
Question 1: Assume that \({\varvec{a}}_j, \, j=1, \ldots , m\), are Gaussian random measurements with \(m = O(k\log ( en/k))\). In the absence of noise, i.e., \({\varvec{w}}=0, \, \epsilon =0\), is \({\varvec{x}}_0\) the unique solution to (1)?
-
Question 2: In the noisy scenario, is the program (1) stable under small perturbations?
For the case where \({\varvec{x}}_0\in {\mathbb C}^n\) is non-sparse, it was shown in [19] that \(m \ge 4n-1\) generic measurements suffice to guarantee the uniqueness of solutions, and several efficient algorithms with linear convergence rates were proposed in [25] to recover non-sparse signals \({\varvec{x}}_0\) from \({ \varvec{ y}}\) using \(m=O(n\log n)\) Gaussian random measurements. However, for the case where \({\varvec{x}}_0\) is sparse, to the best of our knowledge, no such results exist.
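To make the setup concrete, the following Python sketch generates the model \(\varvec{y} = \left|\varvec{A}{\varvec{x}}_0+{\varvec{b}}\right|\) and attacks a Lagrangian surrogate of (1) with a proximal-gradient (ISTA-type) heuristic. The solver, step size, penalty \(\lambda \) and iteration count are our own illustrative assumptions, not the program analyzed in this paper; the sketch is only meant for experimenting with the measurement model.

```python
import numpy as np

def soft_threshold(z, tau):
    # Proximal operator of tau * ||.||_1 (soft thresholding).
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def affine_pr_ista(A, b, y, lam=1e-3, step=0.1, iters=3000):
    # Heuristic: proximal gradient on
    #   0.5 * || |Ax + b| - y ||_2^2 + lam * ||x||_1,
    # an unconstrained surrogate for the constrained program (1).
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        r = A @ x + b                                # affine part a_j^T x + b_j
        grad = A.T @ (np.sign(r) * (np.abs(r) - y))  # (sub)gradient of the data term
        x = soft_threshold(x - step * grad, step * lam)
    return x

rng = np.random.default_rng(0)
n, m, k = 100, 60, 3
A = rng.normal(size=(m, n)) / np.sqrt(m)             # a_j ~ 1/sqrt(m) * N(0, I_n)
b = np.ones(m) / np.sqrt(m)                          # bias with 1/sqrt(2) <= ||b_I||_2 <= 1
x0 = np.zeros(n)
x0[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)
y = np.abs(A @ x0 + b)                               # noiseless magnitude data

x_hat = affine_pr_ista(A, b, y)
print("relative error:", np.linalg.norm(x_hat - x0) / np.linalg.norm(x0))
```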
1.2 Related Works
1.2.1 Phase Retrieval
The noisy phase retrieval is the problem of recovering a signal \({\varvec{x}}_0 \in {\mathbb F}^n\), \({\mathbb F}\in \{{\mathbb R},{\mathbb C}\}\), from the magnitude-only measurements
\(y_j' = \left|\langle {\varvec{a}}_j, {\varvec{x}}_0 \rangle \right| + w_j, \quad j=1, \ldots , m,\)
where \({\varvec{a}}_j \in {\mathbb F}^n\) are given measurement vectors and \(w_j \in {\mathbb R}\) are noises. It arises naturally in many areas such as X-ray crystallography [21, 22, 28], coherent diffractive imaging [30], and optics [14, 15, 32]. In these settings, optical detectors record only the intensity of a light wave while losing the phase information. Note that \(\left|\langle {\varvec{a}}_j,{\varvec{x}}_0 \rangle \right|^2=\left|\langle {\varvec{a}}_j, e^{ i\theta }{\varvec{x}}_0 \rangle \right|^2\) for any \(\theta \in {\mathbb R}\). Therefore the recovery of \({\varvec{x}}_0\) for the classical phase retrieval is up to a global phase. In the absence of noise, it has been proved that \(m\ge 2n-1\) generic measurements suffice to guarantee the uniqueness of solutions for the real case [1], and \(m \ge 4n-4\) for the complex case [6, 13, 38], respectively. Moreover, several efficient algorithms have been proposed to reconstruct \({\varvec{x}}_0\) from \({ \varvec{ y}}':=[y'_1,\ldots , y'_m]^\textrm{T}\), such as alternating minimization [29], truncated amplitude flow [37], smoothed amplitude flow [7], trust-region [33], and the Wirtinger flow (WF) variants [9, 10, 41].
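This invariance is easy to check numerically; a minimal sketch (vectors and \(\theta \) chosen arbitrarily):

```python
import numpy as np

rng = np.random.default_rng(1)
a = rng.normal(size=4) + 1j * rng.normal(size=4)   # measurement vector
x = rng.normal(size=4) + 1j * rng.normal(size=4)   # signal
theta = 0.7
# |<a, x>|^2 is invariant under x -> e^{i*theta} x, so the phase is lost.
print(np.abs(np.vdot(a, x))**2, np.abs(np.vdot(a, np.exp(1j * theta) * x))**2)
```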
1.2.2 Sparse Phase Retrieval
For several applications, the underlying signal is naturally sparse or admits a sparse representation after some linear transformation. This leads to the sparse phase retrieval problem:
\(\min _{{\varvec{x}}\in {\mathbb F}^n} \Vert {\varvec{x}}\Vert _0 \quad \text {s.t.} \quad \bigl \Vert \left|\varvec{A}{\varvec{x}}\right| -{ \varvec{ y}}'\bigr \Vert _2 \le \epsilon , \qquad \qquad (2)\)
where \(\varvec{A}:=[{\varvec{a}}_1, \ldots , {\varvec{a}}_m]^*\). In the absence of noise, it has been established that \(m = 2k\) generic measurements are necessary and sufficient for the unique recovery of all k-sparse signals in the real case, and \(m\ge 4k-2\) are sufficient in the complex case [39]. In the noisy scenario, \(O(k \log ( en/k)) \) measurements suffice for stable sparse phase retrieval [12]. Due to the computational hardness of the \(\ell _0\)-norm in (2), a tractable approach to recover \({\varvec{x}}_0\) is to solve the following \(\ell _1\) minimization:
\(\min _{{\varvec{x}}\in {\mathbb F}^n} \Vert {\varvec{x}}\Vert _1 \quad \text {s.t.} \quad \bigl \Vert \left|\varvec{A}{\varvec{x}}\right| -{ \varvec{ y}}'\bigr \Vert _2 \le \epsilon . \qquad \qquad (3)\)
For the real case, based on the strong restricted isometry property (SRIP) established by Voroninski and Xu [34], the authors in [18] proved that, if \({\varvec{a}}_1,\ldots ,{\varvec{a}}_m \sim 1/\sqrt{m} \cdot \mathcal N (0, I_n)\) are i.i.d. Gaussian random vectors with \(m\ge O( k\log ( en/k))\), then the solution \({\widehat{\varvec{x}}}\in {\mathbb R}^n\) to (3) satisfies
\(\min \bigl ( \Vert {\widehat{\varvec{x}}}-{\varvec{x}}_0\Vert _2,\, \Vert {\widehat{\varvec{x}}}+{\varvec{x}}_0\Vert _2 \bigr ) \lesssim \epsilon + \frac{\sigma _k({\varvec{x}}_0)_1}{\sqrt{k}},\)
where \(\sigma _k(\varvec{x}_0)_1:= \min _{\vert {\text {supp}}(\varvec{x})\vert \le k} \Vert \varvec{x} - \varvec{x}_0\Vert _1\). Later, this result was extended to the complex case by employing the “phaselift” technique in [36]. Specifically, the authors in [36] showed that, for any k-sparse signal \({\varvec{x}}_0 \in {\mathbb C}^n\), the solution \({\widehat{\varvec{x}}}\in {\mathbb C}^n\) to the corresponding \(\ell _1\) minimization program, with the constraint posed in terms of \(\mathcal {A}({\varvec{x}}):= (\left|{\varvec{a}}_1^* {\varvec{x}}\right|^2, \ldots , \left|{\varvec{a}}_m^* {\varvec{x}}\right|^2)\), satisfies a comparable reconstruction error bound up to a global phase, provided \({\varvec{a}}_1, \ldots , {\varvec{a}}_m \sim \mathcal N(0,I_n)\) are i.i.d. complex Gaussian random vectors and \(m\ge O( k\log ( en/k))\).
1.2.3 Affine Phase Retrieval
The affine phase retrieval aims to recover a signal \({\varvec{x}}_0 \in {\mathbb F}^n\), \({\mathbb F}\in \{{\mathbb R},{\mathbb C}\}\), from the measurements
\(y_j = \left|\langle {\varvec{a}}_j, {\varvec{x}}_0 \rangle + b_j\right|, \quad j=1, \ldots , m,\)
where \({\varvec{a}}_j \in {\mathbb F}^n, \, j=1, \ldots , m\), are measurement vectors and \({\varvec{b}}:= (b_1, \ldots , b_m)^\textrm{T}\in {\mathbb F}^m\) is the bias vector. The problem can be regarded as classic phase retrieval with a priori information, and it arises in many areas, such as holographic phase retrieval [16, 17, 27] and Fourier phase retrieval [3,4,5, 23]. In such scenarios, one needs to employ some additional information about the desired signals to ensure the uniqueness of solutions. Specifically, in holographic optics, a reference signal \(\varvec{r} \in {\mathbb C}^k\), whose structure is a priori known, is included in the diffraction patterns alongside the signal of interest \({\varvec{x}}_0\in {\mathbb C}^n\) [2, 20, 26]. Set \(\varvec{x}_0' = (\varvec{x}_0^\textrm{T}, \varvec{r}^\textrm{T})^\textrm{T}\in \mathbb C^{n+k}\). Then the magnitude-only measurements we obtain are
\(y_j = \left|\langle {{\varvec{a}}}'_j, \varvec{x}_0' \rangle \right| = \left|\langle {\varvec{a}}_j, {\varvec{x}}_0 \rangle + b_j\right|, \quad j=1,\ldots ,m,\)
where \({{\varvec{a}}}'_j = ({\varvec{a}}_j^\textrm{T}, {{\varvec{a}}_j''}^\textrm{T})^\textrm{T}\in {\mathbb C}^{n+k}\) are given measurement vectors and \(b_j = \langle {\varvec{a}}_j'', \varvec{r} \rangle \in {\mathbb C}\) are known. Therefore, the holographic phase retrieval can be viewed as the affine phase retrieval.
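The identity \(\left|\langle {{\varvec{a}}}'_j, \varvec{x}_0' \rangle \right| = \left|\langle {\varvec{a}}_j, {\varvec{x}}_0 \rangle + b_j\right|\) behind this reduction can be verified directly; a small sketch with made-up dimensions:

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 8, 3
x0 = rng.normal(size=n) + 1j * rng.normal(size=n)      # unknown signal
r = rng.normal(size=k) + 1j * rng.normal(size=k)       # known reference
a = rng.normal(size=n) + 1j * rng.normal(size=n)       # part acting on x0
a_ref = rng.normal(size=k) + 1j * rng.normal(size=k)   # part acting on r

# Full phaseless measurement on the concatenated signal (x0, r) ...
lhs = np.abs(np.vdot(np.concatenate([a, a_ref]), np.concatenate([x0, r])))
# ... equals an affine phaseless measurement of x0 with bias b = <a'', r>.
b = np.vdot(a_ref, r)
rhs = np.abs(np.vdot(a, x0) + b)
print(np.isclose(lhs, rhs))   # True
```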
Another application of affine phase retrieval arises in the Fourier phase retrieval problem. The one-dimensional Fourier phase retrieval problem usually does not possess a unique solution [35]. In fact, for a given signal of dimension n, besides the trivial ambiguities caused by shift, conjugate reflection and rotation, there can still be up to \(2^{n-2}\) nontrivial solutions. To enforce uniqueness, one approach is to use additionally known values of some entries [4], which can be recast as affine phase retrieval. More related works on the uniqueness of solutions for Fourier phase retrieval can be found in [11, 31].
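As an illustration of this recasting (our own toy construction, not the setting of [4]): if the trailing entries of a signal are known, each Fourier magnitude splits into an affine phaseless measurement of the unknown entries plus a known bias.

```python
import numpy as np

rng = np.random.default_rng(3)
n, n_known = 16, 4
x = rng.normal(size=n)
x_unknown, x_known = x[:n - n_known], x[n - n_known:]   # suppose the tail is known a priori

F = np.fft.fft(np.eye(n))                # DFT matrix; rows f_j are the Fourier measurements
A = F[:, :n - n_known]                   # columns hitting the unknown entries
b = F[:, n - n_known:] @ x_known         # known bias contributed by the known entries

# |f_j^T x| = |<row of A, x_unknown> + b_j|: Fourier PR becomes affine PR.
print(np.allclose(np.abs(F @ x), np.abs(A @ x_unknown + b)))   # True
```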
1.3 Our Contributions
In this paper, we focus on the recovery of sparse signals from the magnitudes of affine measurements. Specifically, we aim to recover a k-sparse signal \({\varvec{x}}_0\in \mathbb F^n\) (\(\mathbb F={\mathbb R}\) or \(\mathbb F={\mathbb C}\)) from the data
\({\varvec{y}}= \left|\varvec{A}{\varvec{x}}_0+{\varvec{b}}\right| + {\varvec{w}},\)
where \(\varvec{A}:= [{\varvec{a}}_1,\ldots ,{\varvec{a}}_m]^* \in \mathbb F^{m\times n}\) is the measurement matrix, \({\varvec{b}}\in {\mathbb F}^m\) is the bias vector, and \({\varvec{w}}\in {\mathbb R}^m\) is the noise vector. Our aim is to quantify the performance of the following \(\ell _1\) minimization program:
\(\min _{{\varvec{x}}\in {\mathbb F}^n} \Vert {\varvec{x}}\Vert _1 \quad \text {s.t.} \quad \bigl \Vert \left|\varvec{A}{\varvec{x}}+{\varvec{b}}\right| -{\varvec{y}}\bigr \Vert _2 \le \epsilon . \qquad \qquad (4)\)
We say a triple \((\varvec{A}, {\varvec{b}}, \Delta )\) is instance optimal of order \(k_0\) if
\(\Vert \Delta (\left|\varvec{A}{\varvec{x}}+{\varvec{b}}\right|) - {\varvec{x}}\Vert _p \le C\, \sigma _{k_0}({\varvec{x}})_q \qquad \qquad (5)\)
for all \({\varvec{x}}\in {\mathbb F}^n\). Here, \(\Delta : {\mathbb R}^m\rightarrow {\mathbb F}^n\) is a decoder for reconstructing \({\varvec{x}}\), \(\sigma _k(\varvec{x})_q:= \min _{\vert {\text {supp}}(\varvec{z})\vert \le k} \Vert \varvec{z} - \varvec{x}\Vert _q\) and \(C:= C_{k_0, p, q}\) is a constant depending on \(k_0\), p and q.
Theorem 1
Assume that there exist a matrix \(\varvec{A} \in {\mathbb F}^{m\times n}\), a vector \({\varvec{b}}\in {\mathbb F}^m\), a decoder \(\Delta : {\mathbb R}^m \rightarrow {\mathbb F}^n\), and positive integers \(k_0, \, p, \, q\) such that (5) holds for all \({\varvec{x}}\in {\mathbb F}^n\). Then \({\varvec{b}}\notin \{\varvec{A}{\varvec{z}}: {\varvec{z}}\in {\mathbb F}^n\}\).
Proof
We assume that \({\varvec{b}}= \varvec{A} {\varvec{z}}_0\) for some \({\varvec{z}}_0 \in {\mathbb F}^n\) and show that there then exists \({\varvec{x}}\in {\mathbb F}^n\) for which (5) fails. Seeking a contradiction, suppose that (5) holds. Since \(\sigma _{k_0}{(-{\varvec{x}})}_q=\sigma _{k_0}{({\varvec{x}})}_q\), we have
Assume that \({\varvec{x}}_0\in {\mathbb F}^n\) is \(k_0\)-sparse, i.e. \(\sigma _{k_0}({\varvec{x}}_0)_q=0\). According to (5) and (6), we obtain that
Taking \({\varvec{x}}=r{\varvec{x}}_0+2{\varvec{z}}_0\) in (6), we have
where \(r>0\). Observe that
Here, we use the fact that \({\varvec{x}}_0\) is \(k_0\)-sparse. Substituting (8) into (9), we obtain that
holds for any \(r>0\). Note that \(\lim _{r\rightarrow \infty }\Vert 2r{\varvec{x}}_0+2{\varvec{z}}_0\Vert _p=\infty \). Hence, (10) fails once r is large enough, which is a contradiction. \(\square \)
For the case where \(m\le n\) and \(\varvec{A}\) is of full rank, we have \({\varvec{b}}\in \{\varvec{A}{\varvec{z}}: {\varvec{z}}\in {\mathbb F}^n\}\). According to Theorem 1, it is therefore impossible to establish an instance-optimality result in this setting. This is quite different from the earlier results on standard phase retrieval, where the instance-optimality result
\(\min \bigl ( \Vert \Delta (\left|\varvec{A}{\varvec{x}}\right|) - {\varvec{x}}\Vert _2,\, \Vert \Delta (\left|\varvec{A}{\varvec{x}}\right|) + {\varvec{x}}\Vert _2 \bigr ) \le C \frac{\sigma _k({\varvec{x}})_1}{\sqrt{k}} \qquad \qquad (11)\)
is established in [18].
1.3.1 Real Case
Our first result gives an upper bound on the reconstruction error of (4) in the real case, under the assumption that \({\varvec{a}}_1, \ldots ,{\varvec{a}}_m \in {\mathbb R}^n\) are real Gaussian random vectors and \(m\ge O(k\log ( en/k))\). It shows that the \(\ell _1\)-minimization program is stable under small perturbations, even for approximately k-sparse signals. To begin with, we need the following definition of the strong RIP condition, which was introduced by Voroninski and Xu [34].
Definition 1
(Strong RIP in [34]) The matrix \(\varvec{A}\in {\mathbb R}^{m\times n}\) satisfies the Strong Restricted Isometry Property (SRIP) of order k and constants \(\theta _l,\, \theta _u >0 \) if the inequality
\(\theta _l \Vert {\varvec{x}}\Vert _2^2 \le \min _{I\subseteq [m],\, \left|I\right|\ge m/2} \Vert \varvec{A}_I {\varvec{x}}\Vert _2^2 \le \max _{I\subseteq [m],\, \left|I\right|\ge m/2} \Vert \varvec{A}_I {\varvec{x}}\Vert _2^2 \le \theta _u \Vert {\varvec{x}}\Vert _2^2\)
holds for all k-sparse signals \({\varvec{x}}\in {\mathbb R}^n\). Here, \(\varvec{A}_I\) denotes the sub-matrix of \(\varvec{A}\) whose rows with indices in I are kept, \([m]:=\{1,\ldots ,m\}\) and \(\left|I\right|\) denotes the cardinality of I.
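Verifying the SRIP exactly is infeasible (it quantifies over all subsets I with \(\left|I\right| \ge m/2\) and all k-sparse x), but a Monte Carlo probe of the ratio \(\Vert \varvec{A}_I {\varvec{x}}\Vert _2^2/\Vert {\varvec{x}}\Vert _2^2\) gives a feel for plausible constants. A rough sketch; random sampling can only refute a candidate pair \((\theta _l, \theta _u)\), never certify one.

```python
import numpy as np

rng = np.random.default_rng(4)
m, n, k, trials = 80, 40, 3, 500
A = rng.normal(size=(m, n)) / np.sqrt(m)    # entries ~ N(0, 1/m)

ratios = []
for _ in range(trials):
    x = np.zeros(n)
    x[rng.choice(n, size=k, replace=False)] = rng.normal(size=k)
    I = rng.choice(m, size=m // 2, replace=False)   # a random subset with |I| = m/2
    ratios.append(np.linalg.norm(A[I] @ x) ** 2 / np.linalg.norm(x) ** 2)

# Sampled range of ||A_I x||_2^2 / ||x||_2^2; the SRIP squeezes this between
# theta_l and theta_u over *all* (x, I), not just the sampled ones.
print(min(ratios), max(ratios))
```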
The following result indicates that the matrix \(\begin{bmatrix} \varvec{A}&\varvec{b} \end{bmatrix} \in \mathbb R^{m \times (n+1)}\) satisfies strong RIP condition with high probability under some mild conditions on \(\varvec{A}\in {\mathbb R}^{m\times n}\) and \({\varvec{b}}\in {\mathbb R}^m\).
Theorem 2
Let \(\varvec{A} \in \mathbb R^{m \times n}\) be a Gaussian random matrix with entries \(a_{k, j}\sim \mathcal N(0,1/m)\). Suppose that the vector \(\varvec{b} \in \mathbb R^m\) satisfies \(\alpha \le \Vert \varvec{b}_I \Vert _2 \le \beta \) for all \(I \subseteq [m]\) with \( \left|I\right| \ge m/2\), where \(\alpha \le \beta \) are two positive constants. Set \(\varvec{A}':= \begin{bmatrix} \varvec{A}&\varvec{b} \end{bmatrix} \in \mathbb R^{m \times (n+1)}\). If \(m \ge Ct(k+1)\log ( en/k)\) with \(t(k+1) \le n\) and \(1 < t \in \mathbb Z\), then there exist constants \(\theta _l',\, \theta _u'\), independent of t, such that the matrix \(\varvec{A}'\) satisfies the strong RIP of order \(tk+1\) and constants \(\theta _l',\, \theta _u'\) with probability at least \(1 - 4\exp (-c'm)\). Here, \(C,\, c'>0\) are constants depending only on \(\alpha \) and \(\beta \).
The following theorem shows that if we add some restrictions on the signal \({\varvec{x}}\), then the instance-optimality result can be established.
Theorem 3
Assume that \(\varvec{A}':= \begin{bmatrix} \varvec{A}&\varvec{b} \end{bmatrix} \in \mathbb R^{m \times (n+1)}\) satisfies the strong RIP of order \((a+1)(k+1)\) with constants \(\theta _u \ge \theta _l>0\). If \(a > \theta _u/\theta _l\), then the following holds: for any vector \({\varvec{x}}_0 \in {\mathbb R}^n\), the solution \({\widehat{\varvec{x}}}\) to (4) with \(\varvec{y} = \vert \varvec{A} \varvec{x}_0 + \varvec{b}\vert + \varvec{w}\) and \(\Vert \varvec{w} \Vert _2 \le \epsilon \) obeys
\(\Vert {\widehat{\varvec{x}}}-{\varvec{x}}_0\Vert _2 \le K_1\epsilon + K_2 \frac{\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}},\)
provided \(K_1\epsilon + K_2 \frac{\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}} <2\). Here, \(K_1,\, K_2>0\) are constants depending only on a, \(\theta _l\) and \(\theta _u\).
From Theorem 2, we know that if \(\varvec{A} \in \mathbb R^{m \times n}\) is a Gaussian random matrix with entries \(a_{k, j}\sim \mathcal N(0,1/m)\) and the sampling complexity satisfies \(m\ge C (a+1)(k+2)\log ( en/k)\), then with high probability the matrix \(\varvec{A}':= \begin{bmatrix} \varvec{A}&\varvec{b} \end{bmatrix}\) satisfies the strong RIP condition of order \((a+1)(k+1)\) with constants \(\theta _l,\, \theta _u>0\) under some mild conditions on \({\varvec{b}}\). Here, the constants \(\theta _l,\, \theta _u\) are independent of a. Therefore, taking the constant \(a> \theta _u/\theta _l\), the conclusion of Theorem 3 holds with high probability.
In the absence of noise, i.e., \({\varvec{w}}=0, \, \epsilon =0\), Theorem 3 shows that if \({\varvec{a}}_1, \ldots , {\varvec{a}}_m \sim 1/\sqrt{m} \cdot \mathcal N(0, I_n)\) are real Gaussian random vectors and \(m\ge O(k\log ( en/k))\), then all k-sparse signals \({\varvec{x}}_0 \in {\mathbb R}^n\) can be reconstructed exactly by solving the program (4) under some mild conditions on \({\varvec{b}}\) (for example, the normalized bias \({\varvec{b}}= m^{-1/2}(1,\ldots ,1)^\textrm{T}\) satisfies \(\alpha \le \Vert \varvec{b}_I \Vert _2 \le \beta \) with \(\alpha =1/\sqrt{2}\) and \(\beta =1\)). We state this as the following corollary:
Corollary 1
Let \(\varvec{A} \in \mathbb {R}^{m \times n}\) be a Gaussian random matrix with entries \(a_{jk}\sim \mathcal N(0,1/m)\), and \(\varvec{b} \in \mathbb {R}^m\) be a vector satisfying \(\alpha \le \Vert \varvec{b}_I \Vert _2 \le \beta \) for all \(I \subseteq [m]\) with \( \left|I\right|\ge m/2 \), where \(\alpha \le \beta \) are two positive universal constants. If \(m \ge C k\log ( en/k)\), then with probability at least \(1 - 4\exp (-c m)\) the following holds: for any k-sparse signal \(\varvec{x}_0 \in \mathbb {R}^n\), the \(\ell _1\) minimization
\(\min _{{\varvec{x}}\in {\mathbb R}^n} \Vert {\varvec{x}}\Vert _1 \quad \text {s.t.} \quad \left|\varvec{A} {\varvec{x}}+ \varvec{b}\right| = \varvec{y}\)
with \(\varvec{y} = \left|\varvec{A} {\varvec{x}}_0 + \varvec{b}\right|\) has a unique solution \(\varvec{x}_0\). Here \(C,\, c>0\) are constants depending only on \(\alpha \) and \(\beta \).
1.3.2 Complex Case
We next turn to the estimation performance of (4) for complex-valued signals. Let \(\mathbb {H}^{n \times n}\) be the set of Hermitian matrices in \({\mathbb C}^{n\times n}\), and let \(\Vert \varvec{H}\Vert _{0, 2}\) denote the number of non-zero rows of \(\varvec{H}\). Given \({\varvec{a}}_1, \ldots , {\varvec{a}}_m \in {\mathbb C}^n\) and \(b_1,\ldots ,b_m \in {\mathbb C}\), we define the linear map \(\mathcal A': \mathbb H^{(n+1)\times (n+1)}\rightarrow {\mathbb R}^m\) by
\(\mathcal A'(\varvec{H}'):= \bigl ( {\varvec{a}}_1'^* \varvec{H}' {\varvec{a}}_1', \ldots , {\varvec{a}}_m'^* \varvec{H}' {\varvec{a}}_m' \bigr ), \qquad \qquad (12)\)
where \({\varvec{a}}_j':= \Bigl ( \begin{array}{l} {\varvec{a}}_j \\ b_j \end{array} \Bigr ) \in {\mathbb C}^{n+1}\).
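A direct implementation of the lifted map (12) may help in parsing Definition 2. The sketch below (dimensions arbitrary) applies \(\mathcal A'\) to the rank-one lift \({\varvec{x}}'{\varvec{x}}'^*\) of \({\varvec{x}}' = ({\varvec{x}}^\textrm{T}, 1)^\textrm{T}\) and checks that it reproduces the squared phaseless measurements \(\left|{\varvec{a}}_j'^* {\varvec{x}}'\right|^2\):

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 6, 10
a = rng.normal(size=(m, n)) + 1j * rng.normal(size=(m, n))   # measurement vectors a_j
b = rng.normal(size=m) + 1j * rng.normal(size=m)             # biases b_j
a_prime = np.hstack([a, b[:, None]])                         # a'_j = (a_j^T, b_j)^T

def A_prime(H):
    # The lifted linear map (12): H in H^{(n+1)x(n+1)} -> (a'_j^* H a'_j)_{j=1}^m.
    return np.real(np.einsum('ji,ik,jk->j', a_prime.conj(), H, a_prime))

x = rng.normal(size=n) + 1j * rng.normal(size=n)
x_prime = np.append(x, 1.0)                                  # x' = (x^T, 1)^T
H = np.outer(x_prime, x_prime.conj())                        # rank-one lift x' x'^*

# A'(x' x'^*) recovers the squared phaseless affine measurements |a'_j^* x'|^2.
print(np.allclose(A_prime(H), np.abs(a_prime.conj() @ x_prime) ** 2))   # True
```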
Definition 2
We say the linear map \(\mathcal A'\) defined in (12) satisfies the restricted isometry property of order (r, k) with constants \(c,\, C >0\) if
\(c\, \Vert \varvec{H}'\Vert _F \le \frac{1}{m} \Vert \mathcal A'(\varvec{H}')\Vert _1 \le C\, \Vert \varvec{H}'\Vert _F \qquad \qquad (13)\)
for all \(\varvec{H}':= \begin{bmatrix} \varvec{H} &{} {\varvec{h}}\\ {\varvec{h}}^* &{} 0 \end{bmatrix} \in \mathbb H^{(n+1)\times (n+1)}\) with \({\text {rank}}(\varvec{H}) \le r\), \(\Vert \varvec{H} \Vert _{0, 2} \le k\) and \(\Vert {\varvec{h}} \Vert _0 \le k\).
The following theorem shows that the linear map \({\mathcal A}'\) satisfies the restricted isometry property over low-rank and sparse matrices, provided \({\varvec{a}}_1, \ldots , {\varvec{a}}_m \in {\mathbb C}^n\) are i.i.d. complex Gaussian random vectors and \({\varvec{b}}:= (b_1,\ldots ,b_m)^\textrm{T}\in {\mathbb C}^m\) satisfies some mild conditions.
Theorem 4
Suppose \({\varvec{a}}_1, \ldots , {\varvec{a}}_m \sim 1/\sqrt{2}\cdot \mathcal N(0,I_n) + i/\sqrt{2}\cdot \mathcal N(0,I_n)\) are i.i.d. complex Gaussian random vectors and \({\varvec{b}}\in {\mathbb C}^m\) is an independent sub-Gaussian random vector (it may also be deterministic) with sub-Gaussian norm \(\Vert {\varvec{b}} \Vert _{\psi _2} \le C\) and \({\mathbb E}\Vert {\varvec{b}} \Vert _1 \ge c_1 m\), \({\mathbb E}\Vert {\varvec{b}} \Vert _2 \le c_2 \sqrt{m} \), where \(C>0,\, c_2 \ge c_1 > 0\) are universal constants. If \(m\ge C' k\log ( en/k)\), then with probability at least \(1-5\exp (-c' m)\), the linear map \({\mathcal A}'\) defined in (12) obeys
\(\frac{\theta ^-}{12} \Vert \varvec{H}'\Vert _F \le \frac{1}{m} \Vert \mathcal A'(\varvec{H}')\Vert _1 \le 3\theta ^+ \Vert \varvec{H}'\Vert _F\)
for all \(\varvec{H}':= \begin{bmatrix} \varvec{H} &{} {\varvec{h}}\\ {\varvec{h}}^* &{} 0 \end{bmatrix} \in \mathbb H^{(n+1)\times (n+1)}\) with \({\text {rank}}(\varvec{H}) \le 2\), \(\Vert \varvec{H} \Vert _{0, 2} \le k\) and \(\Vert {\varvec{h}} \Vert _0 \le k\). Here, \(\theta ^-:=\min (1, c_1/\sqrt{2})\), \(\theta ^+:=\max (\sqrt{6}, c_2)\), and \(C',\, c'>0\) are constants depending only on \(c_1,\, c_2\).
With a slight abuse of notation, we write \(\mathcal A'({\varvec{x}}'):=\mathcal A'({\varvec{x}}'{\varvec{x}}'^*)\) for any vector \({\varvec{x}}'\in {\mathbb C}^{n+1}\). Then we have the following.
Theorem 5
Assume that the linear map \(\mathcal A'(\cdot )\) satisfies the RIP condition (13) of order (2, 2ak) with constants \(c,\, C>0\). For any k-sparse signal \({\varvec{x}}_0 \in {\mathbb C}^n\), if
then the solution \({\widehat{\varvec{x}}}\in {\mathbb C}^n\) to
\(\min _{{\varvec{x}}\in {\mathbb C}^n} \Vert {\varvec{x}}\Vert _1 \quad \text {s.t.} \quad \bigl \Vert \mathcal A'({\varvec{x}}') - {\tilde{\varvec{y}}} \bigr \Vert \le \epsilon , \quad {\varvec{x}}' = ({\varvec{x}}^\textrm{T}, 1)^\textrm{T},\)
with \({\tilde{\varvec{y}}}=\mathcal A'({\varvec{x}}'_0)+{\varvec{w}}\), \(\Vert {\varvec{w}} \Vert \le \epsilon \) and \({\varvec{x}}'_0=({\varvec{x}}_0^\textrm{T}, 1)^\textrm{T}\) obeys
where
Based on Theorem 4, if \({\varvec{a}}_1,\ldots ,{\varvec{a}}_m \in {\mathbb C}^n\) are i.i.d. complex Gaussian random vectors and \(m\ge C' ak\log ( en/ak)\), then with high probability the linear map \(\mathcal A'\) defined in (12) satisfies RIP condition of order (2, 2ak) with constants \(c=\theta ^-/12\) and \(C=3\theta ^+\) under some mild conditions on \({\varvec{b}}\). For the noiseless case where \({\varvec{w}}=0,\,\epsilon =0\), taking the constant \(a> (8C/c)^2\) and combining with Theorem 5, we can obtain the following result.
Corollary 2
Suppose \({\varvec{a}}_1, \ldots , {\varvec{a}}_m \sim 1/\sqrt{2}\cdot \mathcal N(0,I_n)+{i}/\sqrt{2}\cdot \mathcal N(0,I_n)\) are i.i.d. complex Gaussian random vectors and \({\varvec{b}}\in {\mathbb C}^m\) is an independent sub-Gaussian random vector (it may also be deterministic) with sub-Gaussian norm \(\Vert {\varvec{b}} \Vert _{\psi _2} \le C\) and \( {\mathbb E}\Vert {\varvec{b}} \Vert _1 \ge c_1 m\), \({\mathbb E}\Vert {\varvec{b}} \Vert _2 \le c_2 \sqrt{m}\), where \(C>0,\, c_2 \ge c_1 > 0\) are universal constants. If \(m \ge C'' k\log ( en/k)\), then with probability at least \(1-5\exp (-c'' m)\), the solution to
\(\min _{{\varvec{x}}\in {\mathbb C}^n} \Vert {\varvec{x}}\Vert _1 \quad \text {s.t.} \quad \mathcal A'({\varvec{x}}') = {\tilde{\varvec{y}}}, \quad {\varvec{x}}' = ({\varvec{x}}^\textrm{T}, 1)^\textrm{T},\)
is \({\varvec{x}}_0\) exactly. Here, \(C'',\, c'' > 0\) are constants depending only on \(c_1,\,c_2\).
Remark 1
We give an upper bound for \(\min _{\theta \in {\mathbb R}} \left( \Vert {{\widehat{\varvec{x}}}-e^{{i}\theta }{\varvec{x}}_0}\Vert _2 + \left|1-e^{ i\theta }\right|\right) \) in Theorem 5. However, since affine phase retrieval can recover a signal exactly (not just up to a global phase), one may wonder: is there a stable recovery bound for \(\Vert {{\widehat{\varvec{x}}}-{\varvec{x}}_0}\Vert _2\)? We believe that the answer is no, especially when the noise level satisfies \(\Vert {{\varvec{w}}}\Vert _2 \gtrsim \sqrt{m}\). We leave the proof of this to future work.
1.4 Notations
Throughout the paper, we denote \({\varvec{x}}\sim \mathcal N(0,I_n)\) if \({\varvec{x}}\in {\mathbb R}^n\) is a standard Gaussian random vector. A vector \(\varvec{x}\) is k-sparse if it has at most k nonzero entries. For simplicity, we denote \([m]:= \{1, \dots , m\}\). For any subset \(I \subseteq [m]\), let \(\varvec{A}_I = \begin{bmatrix} {\varvec{a}}_j: j \in I \end{bmatrix}^*\) be the submatrix of \(\varvec{A} = \begin{bmatrix} {\varvec{a}}_1,\ldots , {\varvec{a}}_m \end{bmatrix}^*\) whose rows are indexed by I. Denote by \(\sigma _k(\varvec{x}_0)_p:= \min _{\left|{\text {supp}}(\varvec{x})\right| \le k} \Vert \varvec{x} - \varvec{x}_0\Vert _p\) the best k-term approximation error of \(\varvec{x}_0\) with respect to the \(\ell _p\) norm. For a complex number b, we use \(b_{\Re }\) and \(b_{\Im }\) to denote the real and imaginary parts of b, respectively. For any \(A,B\in {\mathbb R}\), we use \( A \lesssim B\) to denote \(A\le C_0 B\), where \(C_0\in {\mathbb R}_+\) is an absolute constant. The notation \(\gtrsim \) is defined similarly. Throughout this paper, c, C and their subscripted (superscripted) forms denote constants whose values may vary with the context.
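The best k-term approximation error has a closed form: it is the \(\ell _p\) norm of the tail that remains after keeping the k largest-magnitude entries. A two-line sketch:

```python
import numpy as np

def sigma_k(x, k, p=1):
    """Best k-term approximation error sigma_k(x)_p: the l_p norm of x
    with its k largest-magnitude entries removed."""
    tail = np.sort(np.abs(x))[: max(len(x) - k, 0)]   # all but the k largest magnitudes
    return np.linalg.norm(tail, ord=p)

x = np.array([5.0, -0.1, 3.0, 0.02, -0.3])
print(sigma_k(x, 2, p=1))   # 0.1 + 0.02 + 0.3 = 0.42
```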
2 Proof of Theorem 2 and Theorem 3
In this section, we consider the estimation performance of the \(\ell _1\)-minimization program (4) for the real-valued signals. Before proceeding, we need the following lemma which shows that if \(\varvec{A} \in {\mathbb R}^{m\times n}\) is a real Gaussian random matrix with entries \(a_{k,j}\sim \mathcal N(0,1/m)\), then \(\varvec{A}\) satisfies the strong RIP with high probability.
Lemma 1
(Theorem 2.1 in [34]) Suppose that \(t>1\) is a constant and that \(\varvec{A}\in {\mathbb R}^{m\times n}\) is a Gaussian random matrix with entries \(a_{k,j}\sim \mathcal N(0,1/m)\). Let \(m=O(tk \log (e n/k))\), where \(k \in [1,n]\cap {\mathbb Z}\). Then there exist constants \(\theta _l, \theta _u\) with \(0<\theta _l< \theta _u<2\), independent of t, such that \(\varvec{A}\) satisfies the SRIP of order \(t\cdot k\) and constants \(\theta _l,\, \theta _u\) with probability at least \(1-\exp (-cm)\), where \(c>0\) is a universal constant.
2.1 Proof of Theorem 2
Proof
From the definition, it suffices to show that there exist constants \(\theta '_l, \theta '_u >0 \) such that the inequality
\(\theta '_l \Vert {\varvec{x}}'\Vert _2^2 \le \min _{I\subseteq [m],\, \left|I\right|\ge m/2} \Vert \varvec{A}'_I {\varvec{x}}'\Vert _2^2 \le \max _{I\subseteq [m],\, \left|I\right|\ge m/2} \Vert \varvec{A}'_I {\varvec{x}}'\Vert _2^2 \le \theta '_u \Vert {\varvec{x}}'\Vert _2^2 \qquad \qquad (14)\)
holds for all \((tk+1)\)-sparse signals \({\varvec{x}}' \in {\mathbb R}^{n+1}\). To this end, we denote \(\varvec{x}' = (\varvec{x}^\textrm{T}, z)^\textrm{T}\), where \(\varvec{x} \in \mathbb R^n\) and \(z \in \mathbb R\). We first consider the case where \(z=0\). From Lemma 1, we know that if \(m \gtrsim t(k+1)\log ( en/(k+1))\) and \(t>1\), then there exist two positive constants \(\theta _l, \theta _u \in (0,2)\) such that
\(\theta _l \Vert {\varvec{x}}\Vert _2^2 \le \min _{I\subseteq [m],\, \left|I\right|\ge m/2} \Vert \varvec{A}_I {\varvec{x}}\Vert _2^2 \le \max _{I\subseteq [m],\, \left|I\right|\ge m/2} \Vert \varvec{A}_I {\varvec{x}}\Vert _2^2 \le \theta _u \Vert {\varvec{x}}\Vert _2^2 \qquad \qquad (15)\)
holds for all \((tk+1)\)-sparse vectors \(\varvec{x} \in \mathbb R^n\) with probability at least \(1 - \exp (-cm)\). Here, \(c>0\) is a universal constant. Note that \(\varvec{A}' \varvec{x}' = \varvec{A} \varvec{x}\) when \(z=0\). We immediately obtain (14) in this case.
Next, we turn to the case where \(z \ne 0\). A simple calculation shows that
\(\Vert \varvec{A}'_I {\varvec{x}}'\Vert _2^2 = \Vert \varvec{A}_I {\varvec{x}}+ z\, \varvec{b}_I\Vert _2^2 = \Vert \varvec{A}_I {\varvec{x}}\Vert _2^2 + 2z \langle \varvec{A}_I {\varvec{x}}, \varvec{b}_I \rangle + z^2 \Vert \varvec{b}_I\Vert _2^2 \qquad \qquad (16)\)
for any \(I \subseteq [m]\). Denote \(\varvec{A} = \begin{bmatrix} {\varvec{a}}_1,\ldots , {\varvec{a}}_m \end{bmatrix}^\textrm{T}\). Note that \(\sqrt{m} {\varvec{a}}_j \sim \mathcal N(0, I_n)\). Taking \(\zeta = \frac{\min (\theta _l,\alpha ^2)}{200\beta }\) in Lemma 5, we obtain that there exists a constant \(C>0\) depending only on \(\theta _l,\, \alpha ,\, \beta \) such that when \(m\ge Ct(k+1)\log (en/k)\), with probability at least \(1-3\exp (-c_1 m)\), it holds
\(\left|\langle \varvec{A}_I {\varvec{x}},\, \varvec{b}_I \rangle \right| \le \zeta \Vert \varvec{b}_I\Vert _2 \Vert {\varvec{x}}\Vert _2 \qquad \qquad (17)\)
for all \((tk+1)\)-sparse vectors \(\varvec{x}\) and all \(I \subseteq [m]\). Here, we view \({\varvec{b}}_I={\varvec{b}}\mathbb {I}_I\in {\mathbb R}^m\) (\({\mathbb I}_I(j)=1\) if \(j\in { I}\) and 0 if \(j\notin { I}\)), and \(c_1 > 0\) is a constant depending only on \(\theta _l,\, \alpha ,\, \beta \). Note that the vector \(\varvec{b}\) satisfies
\(\alpha \le \Vert \varvec{b}_I \Vert _2 \le \beta \qquad \qquad (18)\)
for all \(I \subseteq [m]\) with \(\left|I\right|\ge m/2\). Putting (15), (17) and (18) into (16), we obtain that when \(m \ge Ct(k+1)\log (en/k)\), with probability at least \(1-4\exp (-cm)\), the following two inequalities
\(\Vert \varvec{A}'_I {\varvec{x}}'\Vert _2^2 \ge 0.99\min \{\theta _l, \alpha ^2\} \Vert {\varvec{x}}'\Vert _2^2\)
and
\(\Vert \varvec{A}'_I {\varvec{x}}'\Vert _2^2 \le 1.01\max \{\theta _u, \beta ^2\} \Vert {\varvec{x}}'\Vert _2^2\)
hold for all \((tk+1)\)-sparse vectors \(\varvec{x}' \in \mathbb R^{n+1}\) and for all \(I \subseteq [m]\) with \(\left|I\right|\ge m/2\). Here, \(c > 0\) is a constant depending only on \(\theta _l,\, \alpha ,\, \beta \). In other words, we have
\(\theta '_l \Vert {\varvec{x}}'\Vert _2^2 \le \min _{\left|I\right|\ge m/2} \Vert \varvec{A}'_I {\varvec{x}}'\Vert _2^2 \le \max _{\left|I\right|\ge m/2} \Vert \varvec{A}'_I {\varvec{x}}'\Vert _2^2 \le \theta '_u \Vert {\varvec{x}}'\Vert _2^2\)
for all \((tk+1)\)-sparse vectors \(\varvec{x}'\) with probability at least \(1 - 4\exp (-cm)\). Here, \(\theta '_l = 0.99\min \{\theta _l, \alpha ^2\}\) and \(\theta '_u = 1.01 \max \{\theta _u, \beta ^2\}\). Combining the above two cases and noting that \(\theta _l\), \(\theta _u >0\) are universal constants, we complete the proof. \(\square \)
2.2 Proof of Theorem 3
Proof
Denote \(\varvec{A}' = \begin{bmatrix} \varvec{A}&\varvec{b} \end{bmatrix}\), \({\widehat{\varvec{x}}}' = ({\widehat{\varvec{x}}}^\textrm{T}, 1)^\textrm{T}\) and \({\varvec{x}}'_0 = (\varvec{x}_0^\textrm{T}, 1)^\textrm{T}\). Set
We next divide the proof into the following two cases.
Case 1: \(\left|I\right| \ge m/2\). Set \({\varvec{h}}= {\widehat{\varvec{x}}}' - {\varvec{x}}'_0\). For any \(a>1\), we decompose \({\varvec{h}}\) into the sum of \({\varvec{h}}_{T_0}, {\varvec{h}}_{T_1}, \dots \), where \(T_0\) is the index set consisting of the indices of the \(k+1\) largest coordinates of \({\varvec{x}}'_0\) in magnitude, \(T_1\) is the index set corresponding to the \(a(k+1)\) largest coordinates of \({\varvec{h}}_{T_0^c}\) in magnitude, \(T_2\) is the index set corresponding to the \(a(k+1)\) largest coordinates of \({\varvec{h}}_{(T_0 \cup T_1)^c}\) in magnitude, and so on. For simplicity, we denote \(T_{jl}:= T_j \cup T_l\). To prove the theorem, we only need to give an upper bound for \(\Vert {{\varvec{h}}}\Vert _2\). Observe that
We claim that the following holds:
and
Here, \(C,\, c,\, \theta _l\) and \(\theta _u\) are positive constants depending only on \(\alpha \) and \(\beta \). Putting (20) and (21) into (19), we obtain that
It remains to prove the claims (20) and (21). Since \({\widehat{\varvec{x}}}\) is the solution to the \(\ell _1\) minimization program (4), we have
Therefore,
From the definition of \(T_j\), we obtain that, for all \(j \ge 2\),
It then gives
Putting (22) into (23), we obtain the conclusion of claim (20), namely,
where the third inequality follows from the Cauchy–Schwarz inequality and the last inequality comes from the fact that \(\sigma _{k+1}({\varvec{x}}'_0)_1 \le \sigma _k({\varvec{x}}_0)_1\), by the definitions of \({\widehat{\varvec{x}}}'\) and \(\sigma _k(\cdot )_1\).
We next turn to prove the claim (21). Observe that
For the left hand side of (25), by the definition of I, we have
For the first term of the right hand side of (25), since the matrix \(\varvec{A}'\) satisfies strong RIP of order \((a+1)(k+1)\) with constants \(\theta _l,\theta _u >0\), we immediately have
To give an upper bound for the term \( \Vert \varvec{A}'_I {\varvec{h}}_{T_{01}^c} \Vert _2\), note that \(\Vert {\varvec{h}}_{T_{01}^c} \Vert _\infty \le \Vert {\varvec{h}}_{T_1} \Vert _1/a(k+1)\). Let \(\theta :=\max \left( \Vert {\varvec{h}}_{T_1} \Vert _1/a(k+1), \Vert {\varvec{h}}_{T_{01}^c} \Vert _1/a(k+1) \right) \). Then by Lemma 2, we can decompose the vector \({\varvec{h}}_{T_{01}^c}\) into the following form:
where \({\varvec{u}}_j\) are \(a(k+1)\)-sparse vectors satisfying
Therefore, we have
We notice from (22) that
Thus, if \(\theta = \Vert {\varvec{h}}_{T_1} \Vert _1/a(k+1)\), then we have
If \(\theta =\Vert {\varvec{h}}_{T_{01}^c} \Vert _1/a(k+1) \), then
Therefore, for the second term of the right hand side of (25), it follows from the definition of strong RIP that
Putting (26), (27) and (28) into (25), we immediately obtain
which gives
Case 2: \(\left|I\right| < m/2\). For this case, denote \({\varvec{h}}^+ = {\widehat{\varvec{x}}}' + {\varvec{x}}'_0\). Replacing \({\varvec{h}}\) and the subset I in Case 1 by \({\varvec{h}}^+\) and \(I^c\) respectively, and applying the same argument, we could obtain
However, recall that \({\widehat{\varvec{x}}}' = ({\widehat{\varvec{x}}}^\textrm{T}, 1)^\textrm{T}\) and \({\varvec{x}}'_0 = (\varvec{x}_0^\textrm{T}, 1)^\textrm{T}\). This means \(\Vert {\varvec{h}}^+ \Vert _2 \ge 2\), which contradicts (29) under the assumption \( K_1\epsilon + K_2 \frac{\sigma _k({\varvec{x}}_0)_1}{\sqrt{a(k+1)}} <2 \) on \(\epsilon \) and \(\sigma _k({\varvec{x}}_0)_1\). Therefore, Case 2 cannot occur.
Combining the above two cases, we complete our proof. \(\square \)
3 Proof of Theorems 4 and 5
3.1 Proof of Theorem 4
Proof
Without loss of generality, we assume that \(\Vert {\varvec{H}'}\Vert _F=1\). Observe that
For any fixed \(\varvec{H} \in \mathbb H^{n\times n}\) and \({\varvec{h}}\in {\mathbb C}^n\), the terms \(\xi _j, j=1,\ldots ,m\) are independent sub-exponential random variables with the maximal sub-exponential norm
for some universal constants \(C_1,\, C_2>0\). Here, we use the fact that \(\max \left( \Vert {\varvec{H}}\Vert _F,\Vert {\varvec{h}} \Vert \right) \le \Vert {\varvec{H}'}\Vert _F = 1\). For any \(0<\epsilon \le 1\), Bernstein's inequality gives
where \(c>0\) is a universal constant. According to Lemma 6, we obtain that
This gives
where \(\theta ^+:=\max (\sqrt{6}, c_2)\). Here, we use the facts that \(\Vert {\varvec{H}'}\Vert _F^2=\Vert {\varvec{H}}\Vert _F^2+2\Vert {\varvec{h}} \Vert ^2=1\), that \({\mathbb E}\Vert {\varvec{b}} \Vert _1\le \sqrt{m}\, {\mathbb E}\Vert {\varvec{b}} \Vert _2 \le c_2m\), and that \(\frac{a+b}{\sqrt{2}} \le \sqrt{a^2+b^2} \le a+b\) for any positive numbers \(a,b \in {\mathbb R}\). Similarly, we can obtain
where \(\theta ^-:=\min (1, c_1/\sqrt{2})\). Collecting the above estimators, we obtain that, with probability at least \(1-2\exp (-c\epsilon ^2 m)\), the following inequality
holds for a fixed \(\varvec{H}' \in \mathbb H^{(n+1)\times (n+1)}\). We next show that (30) holds for all \(\varvec{H}' \in \mathcal X\), where
To this end, we adopt a basic version of a \(\delta \)-net argument. Assume that \(\mathcal N_\delta \) is a \(\delta \)-net of \(\mathcal X\), i.e., for any \(\varvec{H}'=\begin{bmatrix} \varvec{H} &{} {\varvec{h}}\\ {\varvec{h}}^* &{} 0 \end{bmatrix} \in \mathcal X\) there exists a \(\varvec{H}_0':=\begin{bmatrix} \varvec{H}_0 &{} {\varvec{h}}_0 \\ {\varvec{h}}_0^* &{} 0 \end{bmatrix} \in \mathcal N_\delta \) such that \(\Vert {\varvec{H} - \varvec{H}_0}\Vert _F\le \delta \) and \(\Vert {\varvec{h}}-{\varvec{h}}_0 \Vert \le \delta \). Using the same idea of Lemma 2.1 in [36], we obtain that the covering number of \(\mathcal X\) is
where \(C_3>0\) is a universal constant. Note that \({\varvec{h}}-{\varvec{h}}_0\) has at most 2k nonzero entries. We obtain that if \(m\gtrsim k\log (en/k)\), then with probability at least \(1-3\exp (-c m)\), it holds
where the linear map \(\mathcal A(\cdot )\) is defined as \(\mathcal {A}(\varvec{H}):= ({\varvec{a}}_1^*\varvec{H} {\varvec{a}}_1, \dots , {\varvec{a}}_m^*\varvec{H} {\varvec{a}}_m)\), and the fourth inequality follows from the combination of Lemma 3, the fact that \(\Vert \frac{1}{m} \sum _{j=1}^m {\varvec{a}}_j {\varvec{a}}_j^* \Vert \le 3/2\) with probability at least \(1-\exp (-c m)\), and
with probability at least \(1-2\exp (-c m)\). Choosing \(\epsilon :=\frac{1}{48}\), \(\delta :=\frac{\theta ^-}{48(c_2+2)}\), and taking the union bound, we obtain that the following inequality
holds with probability at least
provided \(m\ge C' k\log (en/k)\), where \(C',\, c'>0\) are constants depending only on \(c_1\) and \(c_2\). \(\square \)
3.2 Proof of Theorem 5
Proof
The proof of this theorem is adapted from that of Theorem 1.3 in [36]. Note that the \(\ell _1\)-minimization problem we consider is
Here, with some abuse of notation, we set
Let \({\widehat{\varvec{x}}}\in {\mathbb C}^n\) be a solution to (31). Without loss of generality, we assume \(\langle {\widehat{\varvec{x}}}',{\varvec{x}}_0' \rangle \ge 0\) (Otherwise, we can choose \(e^{\textrm{i}\theta } {\varvec{x}}_0'\) for an appropriate \(\theta \)), where \({\widehat{\varvec{x}}}'= \begin{pmatrix} {\widehat{\varvec{x}}}\\ 1\end{pmatrix}\) and \({\varvec{x}}_0'= \begin{pmatrix} {\varvec{x}}_0 \\ 1\end{pmatrix}\). Set
and
Therefore, it suffices to give an upper bound for \(\Vert {\varvec{H}'}\Vert _F\). Denote \(T_0:={\text {supp}}({\varvec{x}}_0)\) and \(T'_0:=T_0 \cup \left\{ n+1\right\} \). Let \(T_1\) be the index set corresponding to the indices of the ak-largest elements of \({\widehat{\varvec{x}}}_{T_0^c}\) in magnitude, and \(T_2\) contain the indices of the next ak largest elements, and so on. Set \(T_{01}:=T_0 \cup T_1\), \(T'_{01}:=T'_0 \cup T_1\), \( \bar{\varvec{h}}:={\varvec{h}}_{T_{01}}\), \( \bar{\varvec{H}}=\varvec{H}_{T_{01}, T_{01}}\), and \( \bar{\varvec{H}}':= \varvec{H}'_{T'_{01}, T'_{01}}\). Noting that
we next consider the terms \(\Vert { \bar{\varvec{H}}'}\Vert _F\) and \(\Vert {\varvec{H}'- \bar{\varvec{H}}'}\Vert _F\). We claim that
and
Combining (32), (33) and (34), we obtain that
According to Lemma 4, we immediately have
By the definition of \({\widehat{\varvec{x}}}'\) and \({\varvec{x}}'_0\), we arrive at the conclusion.
It remains to prove the claims (33) and (34). Note that
We first give an upper bound for the term \(\sum _{i\ge 2, j\ge 2} \Vert {\varvec{H}'_{T_i, T_j}}\Vert _F\). Noting that \({\varvec{x}}_0\) is a k-sparse vector and \({\widehat{\varvec{x}}}\in {\mathbb C}^n\) is the solution to (31), we obtain that
which implies \(\Vert {{\widehat{\varvec{x}}}_{T_0^c}}\Vert _1 \le \Vert {{\widehat{\varvec{x}}}_{T_0}-{\varvec{x}}_0}\Vert _1\). Moreover, by the definition of \(T_j\), we know that for all \(j\ge 2\), it holds \( \Vert {\widehat{\varvec{x}}}_{T_j} \Vert _2 \le \frac{\Vert {\widehat{\varvec{x}}}_{T_{j-1}} \Vert _1}{\sqrt{ak}}\). It then implies
Therefore, the first term of (35) can be estimated as
where the second inequality follows from
Here, the first inequality comes from \(\Vert {{\widehat{\varvec{x}}}}\Vert _1 \le \Vert {{\varvec{x}}_0}\Vert _1\).
For the second term and the third term of (35), we obtain that
where the first inequality follows from (36) due to \({\widehat{\varvec{x}}}'_{T_j}={\widehat{\varvec{x}}}_{T_j}\) for all \(j\ge 1\), and the last inequality comes from Lemma 4. Putting (37) and (38) into (35), we obtain that
This proves the claim (33).
Finally, we turn to prove the claim (34). Note that \(\Vert \mathcal A'({\widehat{\varvec{x}}}')-{\tilde{\varvec{y}}} \Vert \le \epsilon \) and \({\tilde{\varvec{y}}}:=\mathcal A'({\varvec{x}}_0')+\epsilon \), which implies
Thus, we have
Recall that \( \bar{\varvec{H}}':=\begin{pmatrix} \bar{\varvec{H}}&{} \bar{\varvec{h}}\\ \bar{\varvec{h}}^*&{} 0\end{pmatrix}\) with \({\text {rank}}( \bar{\varvec{H}})\le 2\), \(\Vert \bar{\varvec{H}} \Vert _{0,2}\le (a+1)k\), and \(\Vert \bar{\varvec{h}} \Vert _0 \le (a+1)k\). It then follows from the RIP of \(\mathcal A'\) that
To prove (34), it suffices to give an upper bound for the term \(\frac{1}{m} \Vert {\mathcal A'(\varvec{H}'- \bar{\varvec{H}}')}\Vert _1\). Observe that
Since
then the RIP of \(\mathcal A'\) implies
Similarly, we could obtain
Finally, observe that \(\frac{1}{m} \Vert {\mathcal A' (\varvec{H}'_{T'^c_{01}, T'^c_{01}})}\Vert _1 = \frac{1}{m} \Vert {\mathcal A (\varvec{H}_{T^c_{01}, T^c_{01}})}\Vert _1\). Using the same technique as [36, Eq. (3.16)], we could obtain
Putting (42), (43) and (44) into (41), we have
Combining (39), (40) and (45), we immediately obtain
which implies
This completes the proof of claim (34). \(\square \)
References
Balan, R., Casazza, P., Edidin, D.: On signal reconstruction without phase. Appl. Comput. Harmon. Anal. 20(3), 345–356 (2006)
Barmherzig, D.A., Sun, J., Li, P.N., Lane, T.J., Candès, E.J.: Holographic phase retrieval and reference design. Inverse Probl. 35(9), 094001 (2019)
Beinert, R., Plonka, G.: Ambiguities in one-dimensional discrete phase retrieval from Fourier magnitudes. J. Fourier Anal. Appl. 21(6), 1169–1198 (2015)
Beinert, R., Plonka, G.: Enforcing uniqueness in one-dimensional phase retrieval by additional signal information in time domain. Appl. Comput. Harmon. Anal. 45(3), 505–525 (2018)
Bendory, T., Beinert, R., Eldar, Y.C.: Fourier phase retrieval: uniqueness and algorithms. In: Compressed Sensing and its Applications, pp. 55–91 (2017)
Bandeira, A., Cahill, J., Mixon, D., Nelson, A.: Saving phase: injectivity and stability for phase retrieval. Appl. Comput. Harmon. Anal. 37(1), 106–125 (2014)
Cai, J., Huang, M., Li, D., Wang, Y.: Solving phase retrieval with random initial guess is nearly as good as by spectral initialization. Appl. Comput. Harmon. Anal. 58, 60–84 (2022)
Cai, T.T., Zhang, A.: Sparse representation of a polytope and recovery of sparse signals and low-rank matrices. IEEE Trans. Inf. Theory 60(1), 122–132 (2013)
Candès, E.J., Li, X., Soltanolkotabi, M.: Phase retrieval via Wirtinger flow: theory and algorithms. IEEE Trans. Inf. Theory 61(4), 1985–2007 (2015)
Chen, Y., Candès, E.J.: Solving random quadratic systems of equations is nearly as easy as solving linear systems. Commun. Pure Appl. Math. 70(5), 822–883 (2017)
Edidin, D.: The geometry of ambiguity in one-dimensional phase retrieval. SIAM J. Appl. Algebr. Geom. 3(4), 644–660 (2019)
Eldar, Y.C., Mendelson, S.: Phase retrieval: stability and recovery guarantees. Appl. Comput. Harmon. Anal. 36(3), 473–494 (2014)
Conca, A., Edidin, D., Hering, M., Vinzant, C.: An algebraic characterization of injectivity in phase retrieval. Appl. Comput. Harmon. Anal. 38(2), 346–356 (2015)
Fienup, J.R.: Reconstruction of an object from the modulus of its Fourier transform. Opt. Lett. 3(1), 27–29 (1978)
Fienup, J.R.: Phase retrieval algorithms: a comparison. Appl. Opt. 21(15), 2758–2769 (1982)
Gabor, D.: A new microscopic principle. Nature 161(4098), 777–778 (1948)
Gabor, D.: Microscopy by reconstructed wave-fronts. Proc. R. Soc. Lond. Ser. A 197(1051), 454–487 (1949)
Gao, B., Wang, Y., Xu, Z.: Stable signal recovery from phaseless measurements. J. Fourier Anal. Appl. 22(4), 787–808 (2016)
Gao, B., Sun, Q., Wang, Y., Xu, Z.: Phase retrieval from the magnitudes of affine linear measurements. Adv. Appl. Math. 93, 121–141 (2018)
Guizar-Sicairos, M., Fienup, J.R.: Holography with extended reference by autocorrelation linear differential operation. Opt. Express 15(26), 17592–17612 (2007)
Harrison, R.W.: Phase problem in crystallography. J. Opt. Soc. Am. A 10(5) (1993)
Hauptman, H.A.: The phase problem of X-ray crystallography. Rep. Prog. Phys. 54(11), 1427–1454 (1991)
Huang, K., Eldar, Y.C., Sidiropoulos, N.D.: Phase retrieval from 1D Fourier measurements: convexity, uniqueness, and algorithms. IEEE Trans. Signal Process. 64(23), 6105–6117 (2016)
Huang, M., Xu, Z.: Performance bound of the intensity-based model for noisy phase retrieval. arXiv:2004.08764 (2020)
Huang, M., Xu, Z.: Strong convexity of affine phase retrieval. arXiv:2204.09412 (2022)
Latychevskaia, T.: Iterative phase retrieval for digital holography: tutorial. JOSA A 36(12), 31–40 (2019)
Liebling, M., Blu, T., Cuche, E., Marquet, P., Depeursinge, C., Unser, M.: Local amplitude and phase retrieval method for digital holography applied to microscopy. In: European Conference on Biomedical Optics, Vol. 5143, pp. 210–214 (2003)
Millane, R.P.: Phase retrieval in crystallography and optics. J. Opt. Soc Am. A 7(3), 394–411 (1990)
Netrapalli, P., Jain, P., Sanghavi, S.: Phase retrieval using alternating minimization. IEEE Trans. Signal Process. 63(18), 4814–4826 (2015)
Rodriguez, J.A., Xu, R., Chen, C., Zou, Y., Miao, J.: Oversampling smoothness: an effective algorithm for phase retrieval of noisy diffraction intensities. J. Appl. Crystallogr. 46(2), 312–318 (2013)
Sanz, J.L.C.: Mathematical considerations for the problem of Fourier transform phase retrieval from magnitude. SIAM J. Appl. Math. 45(4), 651–664 (1985)
Shechtman, Y., Eldar, Y.C., Cohen, O., Chapman, H.N., Miao, J., Segev, M.: Phase retrieval with application to optical imaging: a contemporary overview. IEEE Signal Process. Mag. 32(3), 87–109 (2015)
Sun, J., Qu, Q., Wright, J.: A geometric analysis of phase retrieval. Found. Comput. Math. 18(5), 1131–1198 (2018)
Voroninski, V., Xu, Z.: A strong restricted isometry property, with an application to phaseless compressed sensing. Appl. Comput. Harmon. Anal. 40(2), 386–395 (2016)
Walther, A.: The question of phase retrieval in optics. J. Mod. Opt. 10(1), 41–49 (1963)
Xia, Y., Xu, Z.: The recovery of complex sparse signals from few phaseless measurements. Appl. Comput. Harmon. Anal. 50, 1–15 (2021)
Wang, G., Giannakis, G.B., Eldar, Y.C.: Solving systems of random quadratic equations via truncated amplitude flow. IEEE Trans. Inf. Theory 64(2), 773–794 (2018)
Wang, Y., Xu, Z.: Generalized phase retrieval: measurement number, matrix recovery and beyond. Appl. Comput. Harmon. Anal. 47(2), 423–446 (2019)
Wang, Y., Xu, Z.: Phase retrieval for sparse signals. Appl. Comput. Harmon. Anal. 37(3), 531–544 (2014)
Xu, G., Xu, Z.: On the \(\ell _1\)-norm invariant convex \(k\)-sparse decomposition of signals. J. Oper. Res. Soc. China 1(4), 537–541 (2013)
Zhang, H., Zhou, Y., Liang, Y., Chi, Y.: A nonconvex approach for phase retrieval: reshaped Wirtinger flow and incremental algorithms. J. Mach. Learn. Res. 18(1), 5164–5198 (2017)
Acknowledgements
Meng Huang was supported by NSFC grant (12201022) and the Fundamental Research Funds for the Central Universities (Grant No. YWF-22-T-204). Zhiqiang Xu was supported by the National Science Fund for Distinguished Young Scholars (12025108) and NSFC (12021001,12288201).
Communicated by Philippe Jaming.
A Supporting Lemmas
The following lemma shows how to decompose a vector \({\varvec{v}}\in {\mathbb R}^n\) into a convex combination of several k-sparse vectors.
Lemma 2
( [8, 40]) Suppose that \({\varvec{v}}\in {\mathbb R}^n\) satisfies \(\Vert {\varvec{v}} \Vert _\infty \le \theta \) and \(\Vert {{\varvec{v}}}\Vert _1\le k\theta \), where \(\theta >0\) and \(k\in \mathbb Z_+\). Then we have
where the \({\varvec{u}}_j \in {\mathbb R}^n\) are k-sparse vectors with \(\Vert {{\varvec{u}}_j}\Vert _1 \le \Vert {{\varvec{v}}}\Vert _1\) and \(\Vert {\varvec{u}}_j \Vert _\infty \le \theta \).
Lemma 3
( [36]) Let the linear map \(\mathcal {A}(\cdot )\) be defined as
where \({\varvec{a}}_j \sim 1/\sqrt{2}\cdot \mathcal N(0,I_n)+ i/\sqrt{2}\cdot \mathcal N(0,I_n), j=1,\ldots ,m\) are i.i.d. complex Gaussian random vectors. If \(m \gtrsim k\log (e n/k)\), then with probability at least \(1 - 2\exp (-c_0m)\), \(\mathcal {A}\) satisfies
for all \(H \in \mathbb {H}^{n \times n}\) with \(\textrm{rank}(\varvec{H}) \le 2\) and \(\Vert \varvec{H}\Vert _{0, 2} \le k\). Here, \(\Vert \varvec{H}\Vert _{0, 2}\) denotes the number of non-zero rows in \(\varvec{H}\).
Lemma 4
( [24, 36]) For any vectors \({\varvec{u}},{\varvec{v}}\in {\mathbb C}^n\) obeying \(\langle {\varvec{u}},{\varvec{v}} \rangle \ge 0\), we have
Lemma 5
Suppose that \({\varvec{a}}_j \sim \mathcal N(0,I_n), j=1, \dots , m,\) are i.i.d. Gaussian random vectors and \(\varvec{b}\in \mathbb R^m\) is a nonzero vector. For any fixed \(\zeta \in (0, 1)\), if \(m \ge C\zeta ^{-2}k(\log ( en/k)+\log (1/\zeta ))\), then with probability at least \(1 - 3\exp (-c_0 \zeta ^2 m)\) it holds that
\(\Bigl |\sum _{j=1}^m b_j \langle {\varvec{a}}_j, {\varvec{x}}\rangle \Bigr | \le \zeta \sqrt{m}\, \Vert \varvec{b} \Vert _2 \Vert {\varvec{x}}\Vert _2\)
for all k-sparse vectors \(\varvec{x} \in \mathbb R^n\). Here, \(c_0 >0\) is a universal constant.
Proof
Without loss of generality, we assume \(\Vert \varvec{x} \Vert _2 = 1\). For any fixed \(\varvec{x}_0\), the terms \({\varvec{a}}_j^\textrm{T}\varvec{x}_0\) are independent, mean-zero, sub-Gaussian random variables whose maximal sub-Gaussian norm is a positive universal constant. Hoeffding's inequality implies
Here, \(c_1 > 0\) is a universal constant. Taking \(t = \zeta \sqrt{m}\Vert \varvec{b} \Vert _2/2\), we obtain that
holds with probability at least \(1 - 2\exp (-c_1\zeta ^2m/4)\).
Next, we give a uniform bound to (46) for all k-sparse vectors \(\varvec{x}\). Denote
We assume that \(\mathcal N\) is a \(\delta \)-net of \(\mathcal S_{n,k}\), i.e., for any \(\varvec{x} \in \mathcal S_{n,k}\) there exists a vector \(\varvec{x}_0 \in \mathcal N\) such that \(\Vert \varvec{x} - \varvec{x}_0 \Vert _2 \le \delta \). The covering number satisfies \(\vert \mathcal N\vert \le \left( \begin{array}{l} n\\ k \end{array} \right) (1+\frac{2}{\delta })^k\). Note that \(\Vert {\varvec{x}}-{\varvec{x}}_0 \Vert _0 \le 2k\). Therefore, when \(m\gtrsim 2k\), with probability at least \(1 - \exp (-c_2 m)\), we have
where the second inequality follows from the Cauchy-Schwarz inequality and the last inequality comes from the fact \(\Vert \sum _{j=1}^m {\varvec{a}}_j {\varvec{a}}_j^\textrm{T} \Vert _2 \le 4m\) with probability at least \(1 - \exp (-c_2 m)\), where \(c_2>0\) is a universal constant. Choosing \(\delta = \zeta /4\) and taking the union bound over \(\mathcal N\), we obtain that
holds with probability at least
provided \(m \ge C \zeta ^{-2} k (\log ( en/k)+\log (1/\zeta ))\). Here, C and c are positive universal constants. This completes the proof. \(\square \)
Lemma 6
Suppose that \({\varvec{a}}\in {\mathbb C}^n\) is a complex Gaussian random vector and \(b \in {\mathbb C}\) is a complex number. For any Hermitian matrix \(\varvec{H} \in {\mathbb C}^{n\times n}\) with \({\text {rank}}(\varvec{H}) \le 2\) and any vector \({\varvec{h}}\in {\mathbb C}^n\), we have
Proof
Since \(\varvec{H} \in {\mathbb C}^{n\times n}\) is a Hermitian matrix with \({\text {rank}}(\varvec{H}) \le 2\), we can decompose \(\varvec{H}\) into
where \(\lambda _1, \lambda _2 \in \mathbb R\) are eigenvalues of \(\varvec{H}\) and \(\varvec{u}_1, \varvec{u}_2 \in \mathbb C^n\) are the corresponding eigenvectors with \(\Vert \varvec{u}_1 \Vert _2 = \Vert \varvec{u}_2 \Vert _2 = 1, \langle {\varvec{u}}_1,{\varvec{u}}_2 \rangle =0\). For the vector \({\varvec{h}}\in {\mathbb C}^n\), we can write it in the form of
where \(\sigma _1, \sigma _2, \sigma _3 \in {\mathbb C}\), and \({\varvec{u}}_3 \in {\mathbb C}^n\) satisfies \(\langle {\varvec{u}}_3,{\varvec{u}}_1 \rangle =0, \langle {\varvec{u}}_3,{\varvec{u}}_2 \rangle =0\) and \(\Vert {\varvec{u}}_3 \Vert =1\). For simplicity, and without loss of generality, we assume that b is a real number. Therefore, we have
Note that \({\varvec{a}}\in {\mathbb C}^n\) is a complex Gaussian random vector and \({\varvec{u}}_1,\, {\varvec{u}}_2,\, {\varvec{u}}_3\) are orthogonal vectors. Thus, we have
with \(\xi \) being a random variable given by
Here, \(z_1, z_2, z_3, z_4, z_5, z_6 \sim \mathcal N(0, 1/2)\) are independent. By Cauchy-Schwarz inequality, we have
It immediately gives
Let \(z_1 = \rho _1\cos \theta \), \(z_2 = \rho _1\sin \theta \), \(z_3 = \rho _2\cos \phi \) and \(z_4 = \rho _2\sin \phi \), \(z_5 = \rho _3\cos \gamma \) and \(z_6 = \rho _3\sin \gamma \). Through some tedious calculations, we have
where the last inequality follows from the fact that \(\lambda _1^2 + \lambda _2^2 = \Vert \varvec{H} \Vert _F^2\) and \(\sigma _1^2+\sigma _2^2+\sigma _3^2=\Vert {\varvec{h}} \Vert ^2\). Similarly, we could obtain
and
where the first inequality follows from the fact that
and
Putting (48) and (49) into (47), we obtain
Therefore, we have
This completes the proof. \(\square \)
Cite this article
Huang, M., Sun, S. & Xu, Z. Affine Phase Retrieval for Sparse Signals via \(\ell _1\) Minimization. J Fourier Anal Appl 29, 36 (2023). https://doi.org/10.1007/s00041-023-10022-6