1 Introduction

Phase retrieval has significant applications in a variety of diffraction-based imaging systems, including phase microscopy [1], X-ray crystallography [2], astronomical photography [3], radio holography [4], beam correction [5] and adaptive optics [6]. In these fields, the phase plays a more important role than the intensity. However, current technology does not allow direct detection of the phase; only intensity data can be recorded in practice. Phase retrieval is an effective method for reconstructing the missing phase from a set of intensity measurements.

In this study, we concentrate on the 2-D phase retrieval problem of recovering the wavefront from the known object-plane intensity and measured far-field intensities. For this inverse problem, no exact solution method exists in theory, since the amplitude and the phase are not directly related [7]. Fortunately, it has been proven that the solution for the phase is unique, provided the sampling is sufficient [8]. Nevertheless, there is still no general method for reaching this unique solution, since the phase retrieval problem is highly non-linear and non-convex. In practice, one resorts to iterative Fourier algorithms, such as the well-known Gerchberg-Saxton (GS) algorithm, to find an acceptable near-optimal solution. Although the error of the GS iteration has been rigorously proved to be monotonically non-increasing [9], the global minimum is hard to guarantee, as the algorithm often converges to a local minimum and depends heavily on the initial phase guess. To obtain better convergence, researchers have imposed stronger constraints on both the object and the Fourier planes. Examples include applying masks to adjust the illumination patterns [10,11,12], modulating the object phase with a spatial light modulator (SLM) [13], measuring the intensities at different planes [14, 15], and recording more data through multi-wavelength measurement [16]. However, these improved schemes may also suffer from poor initial phases and hence fall into a local minimum.

In addition to improving the iterative algorithms, other researchers have turned to non-iterative approaches. As a deterministic phase retrieval method, algorithms based on the transport of intensity equation (TIE) have attracted increasing attention over the years. Recently, Zuo et al. [17] developed an algorithm that solves the second-order partial differential equation of the TIE more precisely. By establishing algebraic relationships, Gao et al. [18, 19] simplified the TIE to a first-order equation, which greatly improves the adaptability of the method to near-field phase retrieval problems. Based on exact mathematics, Gonsalves [20] found a closed-form algorithm applicable to “small-phase” cases. He also proposed the idea of differential phase retrieval for one-dimensional signals [21]. In 2-D cases, however, his method involves a complex partial differential equation whose solution is still under study [22, 23]. Beyond these classical methods, a few newer theories have also been applied to phase retrieval, such as the wavelet transform [24] and the fractional differential [25].

In this paper, we combine these two ideas of phase retrieval: iterative solutions and deterministic solutions. We find that, under a suitable phase-mask modulation, a good approximation of the unknown far-field phase can be calculated from the two far-field intensities recorded before and after adding the mask. Although the calculated phase is only a rough approximation, it captures many key features of the real far-field phase, especially in regions where the phase changes dramatically. With this phase as the initial input, the GS iteration is greatly improved and is able to converge to a near-optimal solution. We confirm the feasibility of this method through derivation and numerical simulation.

2 Principle

2.1 The phase retrieval problem

As shown in Fig. 1, we take a lens-based diffractive imaging system as the example to illustrate the method. A Gaussian beam \(B(x,y){{\text{e}}^{iW(x,y)}}\) with an unknown wavefront aberration W(x, y) propagates to the object plane. After the lens, the focused image is recorded on the receiving screen by a CCD. A phase mask is then added to the incident beam and a second image is recorded.

Fig. 1 Schematic diagram of the differential phase retrieval

Our task is to recover the phase W(x, y) of the complex-valued object. The amplitude of the incident Gaussian beam is denoted B(x, y) and is known in advance. Let the amplitude and phase of the far field be h(u, v) and α(u, v), respectively. The recorded image is then h²(u, v), the intensity of the Fourier transform of the object \(B(x,y){{\text{e}}^{iW(x,y)}}\), as shown below. For brevity, the coordinates (x, y) and (u, v) are omitted.

$$h{{\text{e}}^{i\alpha }}=\iint {B{{\text{e}}^{iW}}{{\text{e}}^{ - i(ux+vy)}}{\text{d}}x{\text{d}}y}.$$
(1)

Here the amplitudes in the object plane and the Fourier plane are both known, in which case the general GS algorithm can be applied to solve this equation. However, this approach often returns poor results. In this study, we try to improve the situation by finding a feasible initial phase that leads the iteration to a near-optimal solution. By adding a suitable phase mask, such an initial phase can be obtained by differential calculation on the recorded intensities. With this phase as the initial input, the GS algorithm can avoid its two characteristic weaknesses: sensitivity to the initial guess and convergence to local minima.
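For reference, the GS iteration mentioned above can be sketched as follows. This is a minimal illustration rather than the implementation used in this study. It assumes that the object and far-field planes share the same grid and that the full far-field amplitude is available in `np.fft.fft2` ordering. The function name `gs_iterate` and its arguments are hypothetical.

```python
import numpy as np

def gs_iterate(B, h, alpha0, n_iter=100):
    """Gerchberg-Saxton iteration between the object and Fourier planes.

    B      : known object-plane amplitude (2-D array)
    h      : known far-field amplitude, same shape, in np.fft.fft2 ordering
    alpha0 : initial guess of the far-field phase
    Returns the estimated object phase W and far-field phase alpha.
    """
    alpha = alpha0
    for _ in range(n_iter):
        # Fourier-plane constraint: measured amplitude h, current phase.
        obj = np.fft.ifft2(h * np.exp(1j * alpha))
        # Object-plane constraint: known amplitude B, retained phase.
        far = np.fft.fft2(B * np.exp(1j * np.angle(obj)))
        alpha = np.angle(far)
    # Object phase implied by the final far-field phase estimate.
    W = np.angle(np.fft.ifft2(h * np.exp(1j * alpha)))
    return W, alpha
```

The outcome of this loop depends on `alpha0`, which is why the remainder of this section concentrates on constructing a good initial far-field phase.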

2.2 The mathematics

A small phase change in the incident beam induces a differential measurement of the far-field image. The phase change is implemented by adding a phase mask (denoted as M) to the incident beam. After adding the phase mask, the object phase becomes (W + M), and the corresponding far-field becomes l(u, v) in amplitude and β(u, v) in phase.

$$l{{\text{e}}^{i\beta }}=\iint {B{{\text{e}}^{iW}}{{\text{e}}^{iM}}{{\text{e}}^{ - i(ux+vy)}}{\text{d}}x{\text{d}}y}.$$
(2)

Introduce a scalar variable t to perform the differential operation, with M = t∙S, where S is the normalized shape of M, S ∊ [0, 1]. When t is very small, M is close to a zero matrix, so the following first-order approximation holds.

$${{\text{e}}^{iM}} \approx 1+i \cdot M=1+i\left( {t \cdot S} \right).$$
(3)

Substituting Eq. (3) into Eq. (2), we obtain

$$\begin{aligned} l{{\text{e}}^{i\beta }}= & \iint {B{{\text{e}}^{iW}}\left( {1+i \cdot M} \right){{\text{e}}^{ - i(ux+vy)}}{\text{d}}x{\text{d}}y} \\ = & \iint {B{{\text{e}}^{iW}}{{\text{e}}^{ - i(ux+vy)}}{\text{d}}x{\text{d}}y}+it\iint {\left[ {BS} \right]{{\text{e}}^{iW}}{{\text{e}}^{ - i(ux+vy)}}{\text{d}}x{\text{d}}y}. \\ \end{aligned}$$
(4)

The last integral depends on the shape S. Denote it as

$$k{{\text{e}}^{i\gamma }}=\iint {\left[ {BS} \right]{{\text{e}}^{iW}}{{\text{e}}^{ - i(ux+vy)}}{\text{d}}x{\text{d}}y}.$$
(5)

Hence, Eq. (4) is simplified further to

$$l{{\text{e}}^{i\beta }}=h{{\text{e}}^{i\alpha }}+it\left( {k{{\text{e}}^{i\gamma }}} \right).$$
(6)

Taking the squared modulus of both sides gives

$${l^2}={h^2}+{t^2}({k^2})+2t\left( {hk} \right)\sin \left( {\alpha - \gamma } \right).$$
(7)
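Equation (7) can be checked numerically by comparing it with the intensity obtained from the exact mask factor in Eq. (2). The sketch below is only an illustration under assumed test data: the grid size, the Gaussian amplitude, the smooth aberration and the value of t are arbitrary choices, not the settings used in this study.

```python
import numpy as np

n = 64
x = np.linspace(-1.0, 1.0, n)
X, Y = np.meshgrid(x, x)
B = np.exp(-(X**2 + Y**2))                 # Gaussian amplitude (assumed)
W = 0.5 * np.sin(3 * X) * np.cos(2 * Y)    # smooth test aberration (assumed)
S = (np.hypot(X, Y) < 0.2).astype(float)   # narrow cylindrical mask shape
t = 0.01                                   # small mask height

H = np.fft.fft2(B * np.exp(1j * W))            # h e^{i alpha}, Eq. (1)
K = np.fft.fft2(B * S * np.exp(1j * W))        # k e^{i gamma}, Eq. (5)
L = np.fft.fft2(B * np.exp(1j * (W + t * S)))  # l e^{i beta},  Eq. (2)

h, alpha = np.abs(H), np.angle(H)
k, gamma = np.abs(K), np.angle(K)

q_exact = np.abs(L) ** 2
q_model = h**2 + t**2 * k**2 + 2 * t * h * k * np.sin(alpha - gamma)  # Eq. (7)

rel_err = np.max(np.abs(q_exact - q_model)) / np.max(q_exact)
print(f"max error of Eq. (7) relative to peak intensity: {rel_err:.2e}")
```

The residual comes from the terms dropped in the first-order expansion of Eq. (3), so it grows as t is increased.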

In an actual measurement, only the intensities h² and l² are known. Denote them by p and q.

$$\left\{ \begin{aligned} & p={h^2}. \\ & q(t)={l^2}={h^2}+{t^2}\left( {{k^2}} \right)+2t\left( {hk} \right)\sin \left( {\alpha - \gamma } \right). \\ \end{aligned} \right.$$
(8)

In the expression for q(t), t is the only variable; the matrices h, k, α and γ are all independent of t. Calculate the first- and second-order derivatives of q(t) with respect to t.

$$\left\{ \begin{aligned} & \frac{{{\text{d}}q(t)}}{{{\text{d}}t}}=2t\left( {{k^2}} \right)+2\left( {hk} \right)\sin \left( {\alpha - \gamma } \right). \\ & \frac{{{{\text{d}}^2}q(t)}}{{{\text{d}}{t^2}}}=2\left( {{k^2}} \right). \\ \end{aligned} \right.$$
(9)

These equations show that q(t) is a quadratic function of t, which suggests that at least three far-field images are required to determine q(t) and hence the term sin(α − γ). In fact, two far-field images are enough. Calculate the first derivative of q(t) at t = 0.

$${\left. {\frac{{{\text{d}}q(t)}}{{{\text{d}}t}}} \right|_{t=0}}=2\left( {hk} \right)\sin \left( {\alpha - \gamma } \right).$$
(10)

According to the definition of the derivative and noting the small value of t, we obtain

$${\left. {\frac{{{\text{d}}q(t)}}{{{\text{d}}t}}} \right|_{t=0}}{\text{= }}{\left. {\frac{{q(t) - q(0)}}{t}} \right|_{t \to 0}} \approx \frac{{q - p}}{t}.$$
(11)

Thus, the relationship between the amplitude and the phase in far field is finally established.

$$\left( {\alpha - \gamma } \right)=\arcsin \left[ {\frac{{q - p}}{{2t\left( {hk} \right)}}} \right].$$
(12)
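As a worked check of the difference approximation in Eq. (11), the error it introduces can be read directly from Eq. (8). Subtracting p from q(t) and dividing by t gives

$$\frac{{q - p}}{t}=2\left( {hk} \right)\sin \left( {\alpha - \gamma } \right)+t\left( {{k^2}} \right).$$

Within the first-order model of Eq. (3), the forward difference therefore differs from the derivative in Eq. (10) by the single term t·k². Equation (12) is accurate wherever t·k is small compared with 2h|sin(α − γ)|, which is the sense in which t must be small.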

2.3 The unknown γ

Although the relationship is established, two unknowns remain: γ and k. As shown in Eq. (5), both depend on the unknown W, so their values cannot be determined precisely. Fortunately, a favorable approximation of α is available if S is chosen suitably. Equation (5) suggests that a sharp (narrow) S can achieve this. In that case, [B·S] becomes very sharp, with most of its values being zero, so the peripheral part of W contributes nothing to the diffraction. In effect, a sharp S acts as a narrow aperture that filters the object and leads to a very smooth diffraction field. Therefore, the far-field phase γ in Eq. (5) is expected to be very flat. As shown in Fig. 2, a simulation of this process demonstrates that a sharp phase mask indeed produces a flat γ, with most of its values close to 0 or ±π.

Fig. 2 A simulation example showing that a sharp S produces a smooth γ. The wavefront aberration W has a Gaussian random shape and is not shown here. The phase mask M is the cylindrical window function given by Eq. (17) with the parameters t = 1 and d = 0.2

As seen from this example, γ is almost piecewise constant, consisting of regions with the values 0, π and −π. However, γ is still unknown: which regions take the value 0 and which take ±π cannot be determined. A precise γ is unreachable, but an approximation is possible. For the purpose of finding an approximation, γ can be treated as a constant (such as zero), since it is very flat and carries little information on the phase variation. In the Fourier transform, the strongly varying part of the phase dominates and the flat part has little effect. Therefore, it is reasonable to make the assumption

$$\gamma \approx {\text{constant}} \to 0.$$
(13)

Hence, Eq. (12) becomes

$$\alpha =\arcsin \left[ {\frac{{q - p}}{{2t\left( {hk} \right)}}} \right].$$
(14)

2.4 The unknown k

From Eq. (12), the phase α can be solved only when k is known. Here, we discuss how to obtain a feasible estimate of k from the two recorded far-field images. Equation (9) indicates that at least three recorded intensities are needed to determine k precisely and hence α. In fact, however, three implicit conditions help in solving Eq. (12):

(1) Any far-field phase lies in the range [−π, π].

(2) The domain of the arcsin function is [−1, 1] and its range is [−π/2, π/2].

(3) The value of Eq. (12) is mainly determined by (q − p) and is less affected by k, since k appears only in the denominator.

From conditions (1) and (2), a precise solution of α can be obtained as long as the shape of k is known. Then, according to (3), a rough k is also workable. Given the properties of k, such a feasible approximation can indeed be calculated from the recorded far-field intensities. Equation (5) shows that k is determined by the known phase-mask shape S and the unknown wavefront W. In this equation, the mask S filters the object and thereby produces a redistributed diffraction field. A sharp S primarily reduces the scale of the intensity but only slightly changes its distribution. That is, the shape of k still depends on W, but its range now depends on S. Accordingly, the two approximations in Eq. (15) are introduced to obtain a feasible k. Note that other forms of approximation are also possible.

$$\left\{ \begin{gathered} {k_{{\text{shape}}}} \approx \left| {\iint {\left[ {B \cdot 1} \right]{{\text{e}}^{iW}}{{\text{e}}^{ - i(ux+vy)}}{\text{d}}x{\text{d}}y}} \right|=h. \hfill \\ {k_{{\text{range}}}} \approx \left| {\iint {\left[ {B \cdot S} \right]{{\text{e}}^{i0}}{{\text{e}}^{ - i(ux+vy)}}{\text{d}}x{\text{d}}y}} \right|. \hfill \\ \end{gathered} \right.$$
(15)

Based on Eq. (15), a feasible k can be calculated from the known B, S and h. Figure 3 shows simulation results in which this method is used to obtain an approximate k. Although there are visible differences between the approximation and the real k, the two are very close in the central area, where the intensity is highest and matters most for phase retrieval. Since the shape of k is taken to be the same as that of h, the product hk in Eq. (14) is replaced by p = h², which gives the final equation for α, Eq. (16). This equation involves only the known quantities t, p and q and the constants C and ε, where C is a scaling parameter that keeps the argument of the arcsin within the range [−1, 1], and ε is a small constant that makes the equation numerically more robust.

Fig. 3 A feasible approximate k based on the known B, S and h

$$\alpha \approx \arcsin \left[ {\left\{ {\frac{{q - p}}{{2t \cdot p+\varepsilon }}} \right\} \cdot C} \right].$$
(16)

It should be noted that Eq. (16) inevitably produces many wrong values of α, because the range of arcsin is [−π/2, π/2]. That is, original values of α smaller than −π/2 or larger than π/2 are folded into [−π/2, π/2]. This problem comes from the arcsin function itself, and the wrong elements of α can hardly be corrected under the existing conditions. Fortunately, this has only a slight effect on the final phase recovery, because most features of α, especially its vein-like structure, are still retained. With the help of the subsequent GS iteration, the approximate far-field phase, which carries many key features, quickly converges to an accurate phase retrieval solution.
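Equation (16) translates into a few lines of array code. The sketch below is a hedged illustration: the text does not state how C is chosen, so the default used here (scaling the largest magnitude of the bracketed ratio to 1) is an assumption, as are the function name `initial_phase` and the additional clipping step.

```python
import numpy as np

def initial_phase(p, q, t, eps=1.0, C=None):
    """Approximate far-field phase from two intensities, following Eq. (16).

    p, q : far-field intensities recorded without and with the phase mask
    t    : height of the phase mask
    eps  : small constant added to the denominator
    C    : scaling factor; if None, it is chosen so that the largest
           magnitude of the scaled ratio equals 1 (an assumed rule)
    """
    ratio = (q - p) / (2.0 * t * p + eps)
    if C is None:
        peak = np.max(np.abs(ratio))
        C = 1.0 / peak if peak > 0 else 1.0
    # Clip to guard against rounding slightly outside the arcsin domain.
    return np.arcsin(np.clip(C * ratio, -1.0, 1.0))
```

The returned values always lie in [−π/2, π/2], which reflects the folding of α described above.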

3 Method

The above is the theoretical derivation of the proposed method. As shown in Fig. 4, the method consists of two processes: differential phase retrieval and GS iteration. The novelty of this paper lies in the differential phase retrieval process. By relating the two focused far-field images h and l, a favorable phase guess is solved directly, which helps the GS iteration converge efficiently to an accurate retrieval of W.

Fig. 4 Flow chart of the proposed differential phase retrieval method. The differential phase retrieval is used to find a good initial phase guess, and the GS iteration is then used to refine this guess into an accurate one

In practice the method primarily consists of four steps:

1. Measure the beam B and the in-focus far-field image p.

2. Add a phase mask and record the new far-field image q.

3. Solve a close approximation of α from p and q.

4. Take this approximation as the initial phase guess and apply the GS algorithm to recover the wavefront W.

In general phase retrieval cases, the GS algorithm often converges to an unsatisfactory local optimum, which is the so-called initial sensitivity. This problem occurs in many GS-like phase retrieval algorithms. Our method addresses it by finding a favorable initial phase guess, which allows the iteration to converge to a near-global optimal solution.

4 Simulation

4.1 Effectiveness of the method

In this paper, numerical simulations were used to verify the effectiveness of the proposed differential phase retrieval method. Taking the GS algorithm as an example, the advantage of the solved initial phase is demonstrated. The simulation results show that the phase obtained by the proposed method indeed improves the GS iteration.

Here, the wavefront W to be retrieved was designed to be randomly smooth, without loss of generality. The shape of the phase mask S was defined to be cylindrical, with values of 0 or 1, as shown in Fig. 2 and given by Eq. (17). The parameter d (0 < d < 1) describes the width of S: a larger d means a wider phase window and vice versa. In the numerical simulation, the far-field images were generated by FFT with a sampling coefficient η. A larger η gives more densely sampled far-field images. To prevent aliasing in the intensities p and q, η must be no less than 2.

$$S=\left\{ \begin{aligned} & 1,{\text{ }}\sqrt {{x^2}+{y^2}} <{\text{d}}{\left( {\sqrt {{x^2}+{y^2}} } \right)_{\hbox{max} }}. \\ & 0,{\text{ else}}{\text{.}} \\ \end{aligned} \right.$$
(17)
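The mask shape of Eq. (17) can be generated as in the following sketch. The coordinate grid spanning [−1, 1] in both directions and the function name `cylinder_mask` are assumptions made for illustration.

```python
import numpy as np

def cylinder_mask(n, d):
    """Binary cylindrical mask shape S of Eq. (17) on an n-by-n grid.

    d is the relative width (0 < d < 1): S is 1 where the radius is
    smaller than d times the largest radius on the grid, and 0 elsewhere.
    """
    x = np.linspace(-1.0, 1.0, n)   # assumed normalized coordinates
    X, Y = np.meshgrid(x, x)
    r = np.hypot(X, Y)
    return (r < d * r.max()).astype(float)

# The phase mask itself is M = t * S, for example with t = 0.1 and d = 0.2.
M = 0.1 * cylinder_mask(101, 0.2)
```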

The approximation of α obtained by Eq. (16) is denoted αR, and the iterative phase retrieval solution from Fig. 4 is denoted WR. To compare WR with W, the offset root mean squared error (ORMS) is used to assess the difference. The ORMS is calculated as shown below, where N is the data size and the overbar denotes the average.

$${\text{ORMS}}=\sqrt {\frac{{\sum\nolimits_{{m=1}}^{N} {\sum\nolimits_{{n=1}}^{N} {{{\left[ {\left( {{W_{\text{R}}}(m,n) - \overline {{{W_{\text{R}}}}} } \right) - \left( {W(m,n) - \bar {W}} \right)} \right]}^2}} } }}{{{N^2}}}} .$$
(18)
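Equation (18) removes the mean of each phase map before comparing them, so a constant phase offset (piston) between WR and W does not count as an error. A direct transcription, with the hypothetical function name `orms`, might look like this:

```python
import numpy as np

def orms(W_rec, W_true):
    """Offset root mean squared error of Eq. (18) for two N-by-N phase maps."""
    # Subtract the mean of each map so a constant offset is ignored.
    diff = (W_rec - W_rec.mean()) - (W_true - W_true.mean())
    return np.sqrt(np.mean(diff ** 2))
```

For example, `orms(W + 0.7, W)` is zero up to rounding for any phase map `W`.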

As shown in Fig. 5, the designed randomly smooth W ranges from −π/2 to π/2. The parameters d, t, ε, η and N were set to 0.2, 0.1, 1, 4 and 101, respectively. The far-field images were generated with a size of 404 × 404 by applying the FFT to the object with fourfold zero-padding. Of each far-field image, only the central 1/16 of the data (101 × 101) was used, considering that it is impossible to record the entire diffraction pattern in practice.
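The zero-padding and cropping described above can be sketched as follows. This is an illustrative reconstruction under assumptions, not the simulation code of this study: the function name `far_field_intensity` is hypothetical, and the object is simply placed in the middle of the padded grid, which affects only a linear phase ramp and not the intensity.

```python
import numpy as np

def far_field_intensity(B, W, eta=4):
    """Far-field intensity of B*exp(iW) with eta-fold zero-padding.

    B, W : n-by-n object amplitude and phase
    eta  : sampling coefficient (padded grid is eta*n on each side)
    Returns the central n-by-n part of the (eta*n)-by-(eta*n) intensity.
    """
    n = B.shape[0]
    m = eta * n
    field = np.zeros((m, m), dtype=complex)
    start = (m - n) // 2
    field[start:start + n, start:start + n] = B * np.exp(1j * W)
    # fftshift moves the zero-frequency sample to index m // 2.
    F = np.fft.fftshift(np.fft.fft2(field))
    lo = m // 2 - n // 2
    return np.abs(F[lo:lo + n, lo:lo + n]) ** 2
```

With n = 101 and η = 4 this produces a 404 × 404 far field and keeps its central 101 × 101 region, matching the sizes quoted above.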

Fig. 5 An example of the method used to recover the wavefront. It can be seen that αR carries key features of α, especially in the central region, and that it becomes very close to α after 100 iterations

As can be seen from the simulation results, the first far-field phase guess αR shown in Fig. 5f reflects many key features of the real far-field phase α shown in Fig. 5e. After 100 GS iterations, the retrieved far-field phase shown in Fig. 5g becomes almost the same as the real α. Meanwhile, the retrieved object phase WR shown in Fig. 5h also becomes very close to the given wavefront W shown in Fig. 5b. These results show that the GS iteration returns a precise phase retrieval solution, precisely because the solved phase approximation αR is already close to the real far-field phase α.

4.2 Comparison with arbitrary phase guesses

When arbitrary phase guesses are taken as the initial input, the GS iteration mostly returns poor phase retrieval results. To compare with αR, three arbitrary phase guesses, shown in Fig. 6, are taken as the initial input to the GS iteration: all zeros (α1), a random distribution (α2) and the sign (1 or −1) of the ideal far-field phase (α3).

Fig. 6 Three representative kinds (uniform, random, standard) of arbitrary initial phase guess, all with the size 101 × 101

For an effective comparison, the simulation involved 100 different randomly smooth wavefront aberrations (named Wi, i = 1, 2, 3, …, 100) with the same range [−π/2, π/2]. The other parameters remain unchanged:

$$d=0.2,\quad t=0.1,\quad \varepsilon =1,\quad \eta =4,\quad N=101,\quad 100{\text{ iterations}}.$$

The ORMS values from this simulation are shown in Fig. 7. All the arbitrary phase guesses α1, α2 and α3 tend to make the iteration converge to poor solutions, while αR leads to a near-global optimal phase retrieval solution in essentially every case.

Fig. 7 The ORMS of the iterative results obtained by inputting αR, α1, α2 and α3 into the GS algorithm as the initial phase. The results show that αR always contributes to a near-global optimal solution, whereas arbitrary phase guesses occasionally return a good result but mostly lead to poor phase retrieval solutions

4.3 Performance under noise

Measurement errors in h and l are inevitable. We studied the accuracy of the method under additive Gaussian noise of different strengths by numerical simulation. Here, h and l in Fig. 5 were chosen as the original images, and Gaussian white noise was imposed on them. The parameters were again set as:

$$d=0.2,\quad t=0.1,\quad \varepsilon =1,\quad \eta =4,\quad N=101,\quad 100{\text{ iterations}}.$$

Similarly, 100 different W (named Wi, i = 1, 2, 3, …, 100) with the same range [−π/2, π/2] were used in the simulation to obtain a representative ORMS value (recorded as \(\overline {{{\text{ORMS}}}}\) and shown in Table 1). These results indicate that the proposed method has good noise tolerance, with the average phase error being less than 0.1 at 20 dB noise.

Table 1 \(\overline {{{\text{ORMS}}}}\) of the recovered phase under Gaussian noise

It can also be seen from Table 1 that the three artificial phase guesses produced much worse phase retrieval under the same noise conditions. Moreover, the final results of the GS iteration appear to be unrelated to the noise intensity when α1, α2 and α3 are taken as the initial phase, which suggests that an artificial phase guess is essentially a gamble. In contrast, when αR is taken as the initial phase, the wavefront retrieval accuracy \(\overline {{{\text{ORMS}}}}\) varies consistently with the noise intensity, which indicates that the solved far-field phase approximation αR does reflect the real far-field phase α.
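The noise step of this test can be sketched as below. The text does not define how the noise level in dB is measured, so this sketch assumes it is the signal-to-noise ratio 10·log10(signal power / noise power), with the signal power taken as the mean squared value of the clean image. The function name `add_gaussian_noise` is hypothetical.

```python
import numpy as np

def add_gaussian_noise(img, snr_db, rng=None):
    """Add white Gaussian noise to an image at a given SNR in dB.

    Assumed definition: SNR = 10*log10(mean(img**2) / noise_variance).
    """
    rng = np.random.default_rng() if rng is None else rng
    signal_power = np.mean(img ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return img + rng.normal(0.0, np.sqrt(noise_power), img.shape)
```

Under this definition, 20 dB corresponds to a noise variance equal to 1% of the mean squared image value. Note that the noisy image can contain small negative values, which a real intensity measurement cannot.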

5 Discussion

In most iterative Fourier algorithms for phase retrieval, the iteration is sensitive to the initial value and converges only locally. A good initial value is essential for reaching a good solution. Previous studies tended to impose more constraints to guarantee a better convergence result. In contrast, we tried to obtain a good solution by finding a favorable initial phase. Based on theoretical analysis, an algebraic relation connecting the far-field phase and the intensity was established. Under a sharp phase mask, the far-field phase can be solved approximately. Although this approximation is not very accurate, it still carries key features of the real phase and thus helps the GS iteration converge to an excellent phase retrieval solution.

We studied what kind of phase mask returns a more accurate phase retrieval solution. After many trials, we conclude that the method does not rely on the shape of the phase mask. The shape can be round, square or any other, as long as its width d is small enough to satisfy the Taylor approximation, Eq. (3), and its height t is small enough to satisfy the difference approximation, Eq. (11). In general, d < 0.2 and t < 1 should hold. Compared with a smaller t, a smaller d is more effective in ensuring an accurate phase retrieval solution. If d is too large, more iterations may be needed to obtain an acceptable result. However, d cannot be too small either. With a tiny d in practice, the solved αR becomes worse because of noise, since the core term (q − p) in Eq. (16) becomes less accurate.

The significance of our work is that we find an effective approach to obtaining a good initial value for general iterative Fourier algorithms for phase retrieval. In addition, this idea implies that the exact far-field phase α could also be solved if more equations connecting the amplitude and the phase, such as Eq. (12), could be found. In that case, the traditional phase retrieval problem would have a general exact solution method.

6 Conclusion

To sum up, our study developed an effective algorithm for wavefront distortion detection with two focused far-field images: one recorded directly at focus and the other recorded after adding a phase mask. We demonstrate that an approximate far-field phase can be solved directly with a sharp phase mask. By taking this approximate phase as the initial input, the GS iteration converges much faster and achieves much better phase retrieval solutions.

This work largely solves the initial-sensitivity problem that occurs in iteration-based phase retrieval algorithms. Through comparative simulations, we found that the solved phase approximation reflects many key features of the actual far-field phase, which helps the GS iteration converge close to the global optimal solution. Moreover, the method does not require an accurate phase mask and exhibits high tolerance to noise. The method therefore has considerable potential for applications.

We acknowledge that the current research is still at the stage of theoretical deduction and numerical simulation. Experimental verification is now underway. With the help of the experiments, we expect on the one hand to verify the feasibility of the proposed method, and on the other hand to find better phase retrieval methods based on differential calculation.