1 Introduction

Phase retrieval aims to recover a signal \(\varvec{x}\in \mathbb {C}^n\) from the phaseless measurements:

$$\begin{aligned} b_i = \left| \left\langle \varvec{a}_i, \varvec{x}\right\rangle \right| ,\quad 1\le i \le m, \end{aligned}$$
(1)

where the measuring vectors \(\varvec{a}_i\) lie in \(\mathbb {C}^n\). The measuring matrix \(\varvec{A}= [\varvec{a}_1,\cdots ,\varvec{a}_m]\) is assumed to have full rank. Applications of phase retrieval arise widely across science and engineering, including X-ray crystallography [1], molecular imaging [2], biological imaging [3] and astronomy [4]. To guarantee well-posedness, it has been proved that m should be at least \(2n-1\) in the real-valued case or \(4n-4\) in the complex-valued case [5, 6], which we call the information limit of phase retrieval.

The nonlinear inverse problem in (1) is in general NP-hard [7]. Therefore, without an effective heuristic, it is practically impossible to find the solution in tolerable time when n is large. Efficient and practical approaches to this problem are generally nonconvex and can be roughly categorized into two types: alternating projection algorithms (also referred to as fixed-point algorithms) [8, 9] and gradient-descent methods [10, 11].

The crucial first step for nonconvex methods is an elaborate initialization: one usually needs a good estimate of the true solution to ensure a high probability of successful reconstruction. In fact, many nonconvex gradient-like methods, e.g. the Wirtinger flow method [10] and the amplitude flow method [12], behave like convex methods once the initialization lies close to the true solution. With a more accurate initializer, nonconvex methods can perform well even when m is below the information limit of phase retrieval [13].

Related work is presented in Sect. 2. In Sect. 3 we propose a more accurate initialization method, called the inverse power method. Its effectiveness and superior accuracy are illustrated through numerical experiments in Sect. 4.

2 Related Work

Current initialization methods include the spectral method [8], the truncated spectral method [14] and the null vector method [15]. For notational convenience, the measuring vectors are normalized to unit vectors before the initialization step, namely \(\left\| \varvec{a}_i\right\| =1\).

The spectral method is a linear-algebraic initializer defined by the maximization problem:

$$\begin{aligned} \varvec{x}_\text {spec} = \arg \max \left\{ \left\| {\text {diag}}(\varvec{b})\varvec{A}^*\varvec{x}\right\| ^2:\varvec{x}\in \mathbb {C}^n , \left\| \varvec{x}\right\| = \left\| \varvec{x}_0\right\| \right\} , \end{aligned}$$
(2)

where \({\text {diag}}(\varvec{b})\) is a function that returns a square diagonal matrix with the elements of vector \(\varvec{b}\) on the main diagonal.

The truncated spectral method is a modification of the spectral method. It first selects the measuring vectors whose measurements do not exceed a preset threshold, discarding those with abnormally large inner products with the true solution \(\varvec{x}\). An estimate is then obtained by finding the vector that is most coherent with the selected set of vectors. Specifically, the truncated spectral method solves the maximization problem:

$$\begin{aligned} \varvec{x}_\text {t-spec} = \arg \max \left\{ \left\| {\text {diag}}(\varvec{1}_\tau \odot \varvec{b})\varvec{A}^*\varvec{x}\right\| ^2:\varvec{x}\in \mathbb {C}^n , \left\| \varvec{x}\right\| = \left\| \varvec{x}_0\right\| \right\} , \end{aligned}$$
(3)

where \(\odot \) stands for element-wise multiplication, and \(\varvec{1}_\tau \) is the characteristic function of the set

$$\begin{aligned} \left\{ i:b(i)\le \tau {\left\| \varvec{x}_0\right\| }\right\} \end{aligned}$$
(4)

with some preset threshold value \(\tau \in (0,1)\).
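In code, the selection in (4) is a one-line mask. A minimal Python sketch follows; the default value of \(\tau \) below is only an illustrative choice, not one prescribed by the text:

```python
import numpy as np

def truncation_mask(b, x0_norm, tau=0.7):
    """Indicator vector 1_tau of the set {i : b_i <= tau * ||x_0||} in (4).

    The default tau = 0.7 is an illustrative placeholder; tau should be
    tuned in (0, 1) as the text describes.
    """
    return (np.asarray(b) <= tau * x0_norm).astype(float)
```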

The null vector method is proposed from another viewpoint: orthogonality between vectors. A set of weak vectors, those most orthogonal to \(\varvec{x}\), is selected from the measuring vectors. By seeking the vector most orthogonal to these weak vectors, one obtains an estimate of \(\varvec{x}\). Mathematically, the weak vector set \(I\subset \left\{ 1,\cdots ,m\right\} \) and its complement \(I_c\) are defined such that \(b_i\le b_j\) for all \(i\in I\) and \(j\in I_c\). The null vector method can then be formulated as

$$\begin{aligned} \varvec{x}_\text {null} = \arg \min \left\{ \left\| \varvec{A}_I^*\varvec{x}\right\| ^2:\varvec{x}\in \mathbb {C}^n , \left\| \varvec{x}\right\| = \left\| \varvec{x}_0\right\| \right\} . \end{aligned}$$
(5)

To make use of the readily available power method, Chen converts (5) into an equivalent maximization problem [15]:

$$\begin{aligned} \varvec{x}_\text {null} = \arg \max \left\{ \left\| \varvec{A}_{I_c}^*\varvec{x}\right\| ^2:\varvec{x}\in \mathbb {C}^n , \left\| \varvec{x}\right\| = \left\| \varvec{x}_0\right\| \right\} . \end{aligned}$$
(6)
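As a concrete illustration, (6) can also be computed by diagonalizing \(\varvec{A}_{I_c}\varvec{A}_{I_c}^*\) directly. The Python sketch below assumes a fixed fraction of the measurements is placed in the weak set I, which is a tuning choice not fixed by the text:

```python
import numpy as np

def null_vector_init(A, b, x0_norm, weak_frac=0.5):
    """Sketch of the null vector method (6).

    A is the n x m matrix whose columns are the (unit-norm) measuring
    vectors a_i. The weakest weak_frac of the measurements form the set I;
    the estimate is the leading eigenvector of A_{I_c} A_{I_c}^*.
    weak_frac = 0.5 is only an illustrative choice.
    """
    m = b.size
    order = np.argsort(b)              # guarantees b_i <= b_j for i in I, j in I_c
    Ic = order[int(weak_frac * m):]    # complement I_c: strongest measurements
    M = A[:, Ic] @ A[:, Ic].conj().T
    _, vecs = np.linalg.eigh(M)        # eigenvalues in ascending order
    return x0_norm * vecs[:, -1]       # leading eigenvector, rescaled to ||x_0||
```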

The current initialization methods (2), (3) and (6) all amount to finding a leading eigenvector, which can be computed efficiently by the power method. Algorithm 1 illustrates this for computing (2).

Algorithm 1. Power method for computing (2)
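A minimal Python sketch of the power iteration behind Algorithm 1, applied to (2); the iteration count and the random starting vector are illustrative choices:

```python
import numpy as np

def spectral_init(A, b, x0_norm, n_iter=200, seed=0):
    """Power iteration for (2): leading eigenvector of A diag(b)^2 A^*.

    A is n x m with unit-norm columns a_i. This is a sketch of
    Algorithm 1, not the authors' exact pseudocode.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    M = (A * b**2) @ A.conj().T        # A diag(b^2) A^*, formed once
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    for _ in range(n_iter):
        x = M @ x                      # one power step
        x /= np.linalg.norm(x)
    return x0_norm * x                 # rescale to ||x_0||
```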

3 Algorithm

3.1 Inverse Power Method

Related work shows that the null vector method has the best numerical performance compared with the spectral and truncated spectral methods [15]. However, several drawbacks remain. An important step in the null vector method is setting a threshold for picking out weak vectors, and the problem of how to choose a proper threshold value has not been solved. Moreover, the selected weak vectors are treated indistinguishably; accuracy could be improved if proper weights were placed on these vectors. Based on these considerations, we propose the inverse power method:

$$\begin{aligned} \varvec{x}_\text {IPM} = \arg \min \left\{ \left\| {\text {diag}}(\varvec{b}^{-\gamma /2})\varvec{A}^*\varvec{x}\right\| ^2:\varvec{x}\in \mathbb {C}^n , \left\| \varvec{x}\right\| = \left\| \varvec{x}_0\right\| \right\} , \end{aligned}$$
(7)

where \(\gamma \) is a positive number.

The inverse power method is based on the following facts:

  1.

    By minimizing the function \(f_i(\varvec{x}) = \frac{\varvec{x}^*}{\left\| \varvec{x}\right\| }\varvec{a}_i\varvec{a}_i^*\frac{\varvec{x}}{\left\| \varvec{x}\right\| }\), one can obtain an arbitrary vector in the orthogonal complement space of \(\varvec{a}_i\).

  2.

    Instead of selecting only a proportion of the measuring vectors by comparing \(b_i\), we assign a weight to every measuring vector. Specifically, the inverse power method minimizes a weighted combination of the \(f_i(\varvec{x})\):

    $$\begin{aligned} f(\varvec{x}) = \sum _i^{m}{w_if_i(\varvec{x}) }. \end{aligned}$$
    (8)

    The measurements \(b_i\) can also be regarded as a measure of the orthogonality between \(\varvec{a}_i\) and \(\varvec{x}_0\): for any two measuring vectors \(\varvec{a}_i\) and \(\varvec{a}_j\), if \(b_i\) is smaller than \(b_j\), then \(\varvec{x}_0\) is more orthogonal to \(\varvec{a}_i\) than to \(\varvec{a}_j\). It is therefore reasonable to place more weight on the \(f_i(\varvec{x})\) corresponding to smaller \(b_i\). A natural way to achieve this is to weight each \(f_i(\varvec{x})\) with a simple function of \(b_i\); here we take \(w_i = b_i^{-\gamma }\) with \(\gamma \ge 0\), which makes (8) equivalent to (7). In our numerical experiments, \(\gamma >1\) ensures relatively good performance.

Solving the minimization problem (7) is identical to searching for the eigenvector of \(\varvec{A}{\text {diag}}(\varvec{b}^{-\gamma })\varvec{A}^*\) with the smallest eigenvalue. This can be done efficiently through the proposed inverse power approach presented in Algorithm 2. The core step of the algorithm is solving a linear system of equations, which can be implemented efficiently by the well-known Gauss-Seidel, Jacobi or successive over-relaxation methods.

Algorithm 2. Inverse power method
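The inverse iteration at the heart of Algorithm 2 can be sketched as follows; for brevity, a dense solver stands in for the Gauss-Seidel, Jacobi or successive over-relaxation solvers mentioned above, and the iteration count is an illustrative choice:

```python
import numpy as np

def inverse_power_init(A, b, x0_norm, gamma=3.0, n_iter=100, seed=0):
    """Inverse iteration for (7): smallest eigenvector of
    M = A diag(b^{-gamma}) A^*.

    A sketch of Algorithm 2; A is n x m with unit-norm columns a_i, and
    all b_i are assumed strictly positive.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    M = (A * b**(-gamma)) @ A.conj().T   # A diag(b^{-gamma}) A^*
    x = rng.standard_normal(n) + 1j * rng.standard_normal(n)
    x /= np.linalg.norm(x)
    for _ in range(n_iter):
        x = np.linalg.solve(M, x)        # one inverse-power step: M y = x
        x /= np.linalg.norm(x)
    return x0_norm * x                   # rescale to ||x_0||
```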

3.2 Iterative Inverse Power Method

After we obtain an estimate \(\varvec{x}_\text {IPM}\) by the inverse power algorithm, a similar orthogonality analysis can be made on the residual \(\varvec{x}_\text {res}={\varvec{x}_0-\varvec{x}_\text {IPM}}\). This inspires us to propose an iterative inverse power algorithm to improve the estimate.

Denote \(\varvec{b}_\text {IPM} = \left| \varvec{A}^*\varvec{x}_\text {IPM}\right| \). The residual between the true measurements \(\varvec{b}\) and \(\varvec{b}_\text {IPM}\) is \(\varvec{b}_\text {res} := \varvec{b}-\left| \varvec{A}^*\varvec{x}_\text {IPM}\right| \). Then the proportion of \(\varvec{x}_\text {res}\) in the orthogonal complement space of \(\varvec{a}_i\) can be characterized by \({(b_\text {res})_i}^{-\gamma }\). In other words, we can apply the inverse power algorithm to \(\varvec{b}_\text {res}\) to approximate \(\varvec{x}_\text {res}\), which can be used as a search direction for a better estimate.

The iterative inverse power method is presented in Algorithm 3. The assignment \(\varvec{x}_\text {IPM} \leftarrow {\text {sign}}\left( \varvec{x}_\text {IPM}^*\varvec{x}_s \right) \varvec{x}_\text {IPM}\) applies a phase factor \(e^{i\theta }\) to \(\varvec{x}_\text {IPM}\), where \(\theta = \arg \min _\theta \left\| e^{i\theta }\varvec{x}_\text {IPM}-\varvec{x}_s\right\| \). The step size is restricted to be less than l so that the elements of \(\varvec{b}_\text {res}\) remain positive, which ensures that the weights used in the inverse power method are always positive; negative weights would invalidate the method. The relaxation parameter \(\alpha \) controls the size of each adjustment.

Algorithm 3. Iterative inverse power method
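A Python sketch of the update loop described above. Since Algorithm 3 itself is not reproduced here, the exact handling of the step-size bound l (the `step` parameter), the positivity floor on \(\varvec{b}_\text {res}\), and the relaxed update are reconstructed from the description and should be read as assumptions; the inner inverse power step uses a dense eigensolver for brevity:

```python
import numpy as np

def _smallest_eigvec(A, weights_b, gamma):
    # Smallest eigenvector of A diag(weights_b^{-gamma}) A^*; a dense
    # stand-in for the inverse power iteration of Algorithm 2.
    M = (A * weights_b**(-gamma)) @ A.conj().T
    _, V = np.linalg.eigh(M)
    return V[:, 0]

def iterative_inverse_power(A, b, x0_norm, gamma=3.0, alpha=0.5,
                            step=0.9, n_outer=10):
    """Reconstructed sketch of Algorithm 3 (details are assumptions)."""
    x = x0_norm * _smallest_eigvec(A, b, gamma)     # initial estimate x_IPM
    for _ in range(n_outer):
        b_est = np.abs(A.conj().T @ x)
        # a step size below `step` keeps the residual measurements positive;
        # the small floor is a numerical safeguard, not from the paper
        b_res = np.maximum(b - step * b_est, 1e-12)
        x_res = _smallest_eigvec(A, b_res, gamma)   # approximates x_res
        # phase-align the current estimate with the search direction x_s
        c = x.conj() @ x_res
        if abs(c) > 0:
            x = (c / abs(c)) * x                    # x <- sign(x^* x_s) x
        x = x + alpha * x_res                       # relaxed update
        x *= x0_norm / np.linalg.norm(x)
    return x
```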

4 Numerical Experiments

In this section, we present several numerical experiments comparing the performance of our algorithms with other initialization algorithms. For comparison, we define the relative error between the recovered signal \(\hat{\varvec{x}}\) and the true signal \(\varvec{x}_0\) as:

$$\begin{aligned} \text {RE} :=\frac{{\text {dist}}\left( \hat{\varvec{x}},\varvec{x}_0\right) }{\left\| \varvec{x}_0\right\| }, \end{aligned}$$
(9)

where the function \({\text {dist}}\left( \hat{\varvec{x}}, \varvec{x}_0\right) = \min _\theta \left\| e^{i\theta }\hat{\varvec{x}} - \varvec{x}_0\right\| \).
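The distance has a closed-form optimal phase, \(e^{i\theta } = {\text {sign}}(\hat{\varvec{x}}^*\varvec{x}_0)\). A direct implementation, with the relative error normalized by \(\left\| \varvec{x}_0\right\| \) (the usual convention):

```python
import numpy as np

def dist(x_hat, x0):
    """min over theta of || e^{i theta} x_hat - x0 ||."""
    c = x_hat.conj() @ x0                        # x_hat^* x0
    phase = c / abs(c) if abs(c) > 0 else 1.0    # optimal e^{i theta}
    return np.linalg.norm(phase * x_hat - x0)

def relative_error(x_hat, x0):
    """Relative error between the estimate and the true signal."""
    return dist(x_hat, x0) / np.linalg.norm(x0)
```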

Figure 1 compares the inverse power method with the three initialization schemes mentioned in Sect. 2, showing the relative errors of the returned initialization estimates under different oversampling rates m/n. The measuring vectors are i.i.d. \(\mathcal {N}(\varvec{0},\varvec{I}_n/2)+i\mathcal {N}(\varvec{0},\varvec{I}_n/2)\). All methods perform better as the oversampling rate increases. The spectral and truncated spectral methods exhibit the worst performance overall. The null vector method performs better than both, and at low oversampling rates it achieves performance similar to the inverse power method. However, the inverse power methods outperform the null vector method as the oversampling rate increases. The inverse power methods with \(\gamma =3\) and \(\gamma =3.5\) have the lowest relative error over all oversampling rates. Interestingly, the inverse power method performs almost identically for all \(\gamma \ge 2\) once the oversampling rate is sufficiently large (\(m/n>25\)).

Fig. 1.

Relative error of estimates by different initialization schemes under different oversampling rates m/n. \(n=256\), and m/n varies from 5 to 50. The parameter \(\gamma \) of the inverse power method varies from 1.5 to 3.5.

Figure 2 illustrates the effectiveness of the proposed iterative inverse power method. Overall, it performs considerably better than the single inverse power method, and better accuracy is achieved as the oversampling rate increases. The relaxation parameter \(\alpha \) corresponds to the step size of each update, and different values of \(\alpha \) lead to different rates of convergence: a larger step size yields faster convergence. It takes only about 10 iterations to converge for \(\alpha =0.9\), while for \(\alpha =0.1\) dozens of iterations are needed. However, a larger \(\alpha \) can overshoot the optimal solution, as widely discussed in optimization theory [16]. As shown in Fig. 2b, for \(\alpha =0.9\) the relative error is 0.48, much higher than 0.32 for \(\alpha =0.6\) and 0.31 for \(\alpha =0.3\).

Fig. 2.

Relative error of the iterative inverse power method versus iteration count, for various step sizes \(\alpha \) and under different oversampling rates.

5 Conclusion

This paper develops a new initialization method that efficiently provides a more accurate estimate of the original signal for phase retrieval. Through iterative application of the proposed inverse power method, even better performance is achieved. This work is helpful for solving various imaging problems based on phase retrieval.