Abstract
This paper presents a new fast iterative shrinkage-thresholding algorithm, termed AFISTA. The essential idea is to improve the convergence rate of FISTA using a new continuation strategy, leading to fewer iterations than FISTA requires. A convergence theorem for AFISTA is given. To further accelerate AFISTA, we equip it with the Barzilai-Borwein (BB) stepsize. In addition, for applications with an orthogonal sensing matrix A, we propose a specialized version of AFISTA. AFISTA is tailored to the basis pursuit problem and can be applied successfully to a variety of problems arising in signal and image processing, such as sparse signal recovery, signal and image denoising, image restoration, and compressive sensing. To demonstrate the efficiency of the method, we compare our results with generalizations of the linearized Bregman and fixed-point continuation (FPC) methods on sparse signal recovery, with the split Bregman method on compressive sensing for sparse MRI, and with the gradient projection for sparse reconstruction (GPSR) method on image deconvolution. Numerical results show that AFISTA outperforms all of the compared methods in convergence rate, and some of them in both convergence rate and quality of the reconstructed results.
1 Introduction
Let \(A\in \mathbb {R}^{m\times n}\) with m < n or m ≪ n (in compressed sensing), \(b\in \mathbb {R}^{m}\), and \(u\in \mathbb {R}^{n}\). A basis pursuit (BP) problem is a constrained minimization problem as follows [11]:
which gives the solution of the underdetermined linear system Au = b with minimal L1 norm. This formulation was suggested by Chen and Donoho [11] to find a sparse solution among the many possible solutions of Au = b. The basis pursuit problem emerges in many applications, such as statistics [15], image compression [22], spectral decomposition [23], compressed sensing [9, 14, 33, 38], and image reconstruction [39], especially in medical imaging [36, 37]. Instead of problem (1), one can consider the unconstrained problem:
for λ ≥ 0, known as LASSO (Least Absolute Shrinkage and Selection Operator), which is closely related to (1). Indeed, in [32] the authors proved that a solution of (1) is either u = 0 or a solution of problem (2) for some λ > 0. In (1), one can use other norms in place of the L1 norm. In particular, to promote sparsity, the L1 norm can be replaced with the L0 norm, since the latter yields the sparsest solution of the problem at hand (note that L0 is not actually a norm) [24, 26]. However, the resulting problem is NP-hard and therefore difficult to solve [24, 26]. In [29], it is noted that under suitable conditions, the solution of problem (2) gives the sparsest solution satisfying Au = b. Indeed, problem (2) is a convex approximation to the L0-regularized problem [17]:
where ∥u∥0 denotes the number of nonzero elements of u. Problem (2) can be cast as a second-order cone programming problem [17] and can therefore be solved by conventional methods such as interior-point methods [5]. However, because of their computational complexity, these conventional methods are not suitable for the large-scale data [29] that arise in real applications such as CS reconstruction of magnetic resonance imaging (MRI). Many algorithms solve problem (2) efficiently, including: the infeasible-point subgradient algorithm (ISA) [25], homotopy-based methods [27, 28], fixed-point continuation (FPC) [19, 40], the L1-Magic algorithm [6], linearized Bregman and its generalizations [21, 29, 41, 42], and proximal-gradient-based algorithms such as the alternating direction method (ADM) [30], the alternating direction method of multipliers (ADMM) [30], the iterative shrinkage-thresholding algorithm (ISTA), and its modification, the fast iterative shrinkage-thresholding algorithm (FISTA) [2, 3, 30, 34, 35]. Compared to the traditional approaches, these methods are well suited to problems with large-scale data and sparse solutions and have been used successfully for the LASSO problem. However, most of them are designed to solve LASSO (2) rather than the BP problem. Among them, fixed-point continuation and an active-set method can solve the BP problem directly only if λ is set to a tiny value (e.g., 1e-10).
In this paper, we propose a method based on FISTA that solves the BP problem directly. Indeed, we modify FISTA using a continuation strategy that both reduces the number of iterations compared to FISTA and solves the BP problem instead of the LASSO problem. Since the only difference between FISTA and AFISTA is the continuation strategy added to AFISTA, whose computational cost is negligible compared to the matrix-vector multiplications, the per-iteration cost of the two algorithms is almost the same, while the number of iterations required by AFISTA is considerably smaller. We also present a convergence analysis of the proposed method. To further accelerate it, we use the Barzilai-Borwein (BB) stepsize [4]. In addition, for applications with an orthogonal sensing matrix A, we propose a specialized version of AFISTA. The efficiency of the proposed method is illustrated through a variety of numerical experiments: sparse signal recovery, recovery of a noisy sinusoidal wave, compressive sensing for MRI images, and image deconvolution.
This paper is organized as follows. Section 2 recalls the ISTA and FISTA methods, which form the core of the proposed method. In Section 3, we propose the AFISTA method and its generalizations and present its convergence analysis. Section 4 reports numerical experiments. Finally, the conclusion section summarizes the paper.
2 ISTA and FISTA methods
In this section, the ideas of the ISTA and FISTA algorithms, the core of our proposed method, are briefly described. In [2], the authors considered the following general formulation to solve
where \(g:\mathbb {R}^{n} \rightarrow \mathbb {R} \) is a continuous convex function, possibly nonsmooth, and \(f: \mathbb {R}^{n} \rightarrow \mathbb {R} \) is a convex, continuously differentiable function with Lipschitz continuous gradient, i.e.,
where ∥.∥ denotes the Euclidean norm in the space \(\mathbb {R}^{n}\). The basic iterative scheme of the ISTA for solving the convex optimization problem (4) is as follows
Now by setting \(f(u) = \frac {1}{2}\Vert Au-b \Vert ^{2}\) and g(u) = λ∥u∥1, we have ∇f(u) = AT(Au − b) and L(f) = ∥ATA∥ and also [2]
where \(\mathcal {T}_{\alpha }:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\) is the shrinkage (soft thresholding) operator defined by
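The shrinkage operator above acts elementwise: \(\mathcal {T}_{\alpha }(x)_{i} = \operatorname{sign}(x_{i})\max (|x_{i}|-\alpha , 0)\). As a minimal illustration (function names are our own, not from the paper), it can be written as:

```python
import numpy as np

def soft_threshold(x, alpha):
    """Elementwise shrinkage (soft-thresholding) operator:
    T_alpha(x)_i = sign(x_i) * max(|x_i| - alpha, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - alpha, 0.0)
```

Components with magnitude at most alpha are set exactly to zero, which is what makes the operator produce sparse iterates.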
Finally, algorithm of the ISTA method for solving LASSO problem can be written as follows [2]
Algorithm 1 shows that each ISTA iteration uses exactly two matrix-vector multiplications, involving A and AT, followed by a shrinkage step, so it is suitable for large-scale applications. Note that ISTA is a special case of the proximal forward-backward method proposed in [8] and [31]; general convergence results for the proximal forward-backward method can be found in [13]. A sufficient but not necessary condition to ensure convergence of the sequence \(\left \lbrace u^{k} \right \rbrace \) generated by ISTA to an optimal point of the LASSO problem is that the stepsize parameter L satisfies \(\frac {1}{L}\in \left (0, \frac {1}{\Vert A^{T}A \Vert }\right ) \) [16], or equivalently L > ∥ATA∥. The authors in [7] proved that the sequence \(\left \lbrace u^{k} \right \rbrace \) converges to a minimizer of the LASSO problem whenever \(L>\frac {1}{2}\Vert A^{T}A \Vert \); this condition is likewise sufficient but not necessary.
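Since Algorithm 1 appears only as a figure, the ISTA iteration for the LASSO problem can be sketched as follows (a minimal illustration with our own naming; `soft_threshold` implements the shrinkage operator (7)):

```python
import numpy as np

def soft_threshold(x, alpha):
    return np.sign(x) * np.maximum(np.abs(x) - alpha, 0.0)

def ista(A, b, lam, n_iter=500):
    """ISTA for min 0.5*||Au - b||^2 + lam*||u||_1.
    Uses the fixed stepsize 1/L with L = ||A^T A||."""
    L = np.linalg.norm(A, 2) ** 2          # ||A^T A|| = sigma_max(A)^2
    u = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ u - b)           # gradient of the smooth part f
        u = soft_threshold(u - grad / L, lam / L)
    return u
```

Each pass performs the two matrix-vector products with A and AT noted above, followed by one shrinkage.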
The main drawback of ISTA is its rate of convergence: it converges slowly to the optimal objective function value. To remedy this, a faster algorithm called FISTA was proposed in [2]. The main difference between FISTA and ISTA is that FISTA computes the new iterate from a point yk composed of uk−1 and uk−2, whereas ISTA uses only uk−1. Algorithm 2 shows the FISTA method for the LASSO problem [2].
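In the same spirit as the ISTA sketch above, the FISTA iteration of Algorithm 2 can be sketched as follows (again a minimal illustration with our own names, assuming the classical momentum update of [2]):

```python
import numpy as np

def soft_threshold(x, alpha):
    return np.sign(x) * np.maximum(np.abs(x) - alpha, 0.0)

def fista(A, b, lam, n_iter=500):
    """FISTA for min 0.5*||Au - b||^2 + lam*||u||_1 (Beck & Teboulle)."""
    L = np.linalg.norm(A, 2) ** 2
    u_prev = np.zeros(A.shape[1])
    y = u_prev.copy()
    t = 1.0
    for _ in range(n_iter):
        # proximal gradient step taken at the extrapolated point y
        u = soft_threshold(y - A.T @ (A @ y - b) / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = u + ((t - 1.0) / t_next) * (u - u_prev)   # momentum step
        u_prev, t = u, t_next
    return u_prev
```

The extra lines (the t-update and the combination of uk and uk−1) are exactly the cheap operations referred to below, so the per-iteration cost matches ISTA.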
As can be seen, the computational costs of Algorithms 1 and 2 are almost the same, because the execution times of lines 4 and 5 are clearly marginal. We now recall two theorems on the convergence rates of the ISTA and FISTA methods from [2]. Suppose that \(F(u) = \frac {1}{2}\Vert Au-b \Vert ^{2} + \lambda \Vert u \Vert _{1}\), the problem is solvable, i.e., U∗ = argmin F ≠ ∅, and let F∗ denote the optimal value of F at an optimal point u∗.
Theorem 1
[2] Let \(\left \lbrace u^{k} \right \rbrace \) be the sequence generated by the ISTA method. Then for any k ≥ 1, \(F(u^{k}) - F^{*} \leq \frac {C}{k}\),
where C is a positive constant.
Theorem 2
[2] Let \(\left \lbrace u^{k} \right \rbrace \) be the sequence generated by the FISTA method. Then for any k ≥ 1, \(F(u^{k}) - F^{*} \leq \frac {C}{k^{2}}\),
where C is a positive constant.
In [35], the authors note that the ISTA and FISTA methods have a local linear convergence rate, i.e., their convergence becomes linear once the iterates are close to the optimal point.
3 AFISTA method
Considering the importance of the convergence rate in large-scale problems, we advocate using a continuation strategy together with the BB stepsize to accelerate the FISTA method.
3.1 Accelerating FISTA with continuation
Here, the AFISTA method is presented. In [19], the authors introduced a fixed-point continuation method that solves a sequence of LASSO problems with different values of the regularization parameter λ. Indeed, for a decreasing sequence \(\left \lbrace \lambda _{k} \right \rbrace \), they used the solution of the k-th LASSO problem, associated with the regularization parameter λk and obtained with a fixed-point algorithm, as a warm start for the next LASSO problem. Similar continuation strategies have also been used in [1, 17, 20, 28].
Here, we use the continuation technique to accelerate the FISTA method in a different way. In contrast to previous works, which produce a decreasing sequence \(\left \lbrace \lambda _{k} \right \rbrace \) with a fixed descent rate 0 < β < 1, we exploit the convergence of the sequence {uk} generated by FISTA to produce a decreasing sequence of regularization parameters that converges to zero. Specifically, we generate the decreasing sequence \(\left \lbrace \lambda _{k} \right \rbrace \) by first setting \( \lambda _{1}=\tilde {\lambda } \Vert b\Vert _{2}\Vert A^{T}b\Vert _{2} /L\) and then, at iteration k, setting λk+1 as follows
where \(\tilde {\lambda }\) and γ are positive constants. Note that ISTA and FISTA were proposed to solve the LASSO problem, which is distinct from the BP problem, whereas the proposed method, by driving λk to 0, solves the BP problem. To ensure that the sequence λk converges to 0, we require γ < 1. The numerical results show how successful this way of generating the sequence of regularization parameters in the continuation technique can be. We are now ready to present the AFISTA algorithm for the BP problem.
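Since the precise update rule (5) is not reproduced in this text, the following is only a heavily hedged sketch of FISTA with continuation: it shrinks λ by the factor γ whenever the iterates have nearly stabilized, which is one plausible reading consistent with the proof of Theorem 4 below, not the authors' exact rule. All names and the stabilization threshold are our own assumptions.

```python
import numpy as np

def soft_threshold(x, alpha):
    return np.sign(x) * np.maximum(np.abs(x) - alpha, 0.0)

def afista_sketch(A, b, lam_tilde=2.0, gamma=0.9, tol=1e-5, max_iter=10000):
    """Sketch of FISTA with continuation on lambda (NOT the paper's
    exact rule (5)): lambda is multiplied by gamma whenever the
    iterates have nearly stabilized, driving lambda -> 0 so that the
    limit solves the basis pursuit problem Au = b."""
    L = np.linalg.norm(A, 2) ** 2
    lam = lam_tilde * np.linalg.norm(b) * np.linalg.norm(A.T @ b) / L
    u_prev = np.zeros(A.shape[1])
    y = u_prev.copy()
    t = 1.0
    for _ in range(max_iter):
        u = soft_threshold(y - A.T @ (A @ y - b) / L, lam / L)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = u + ((t - 1.0) / t_next) * (u - u_prev)
        if np.linalg.norm(u - u_prev) <= 1e-3 * max(np.linalg.norm(u_prev), 1.0):
            lam *= gamma                 # continuation: decrease lambda
        u_prev, t = u, t_next
        if np.linalg.norm(A @ u - b) < tol * np.linalg.norm(b):
            break                        # constraint Au = b (nearly) met
    return u_prev
```

The key design point, matching the text, is that λ is not decreased on a fixed schedule but in response to the behavior of the FISTA iterates.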
Since the only difference between the FISTA and AFISTA methods lies in the fifth and seventh lines of the AFISTA algorithm, whose computational cost is negligible compared to the matrix-vector multiplications, the per-iteration cost of the two algorithms is almost the same, while the number of iterations required by AFISTA is considerably smaller. This fact is clearly visible in Fig. 1, where the total numbers of iterations needed by FISTA and AFISTA to reach the same stopping criterion are 7651 and 155, respectively. In this experiment, the sensing matrix A is a 300 × 1000 Gaussian random matrix, u∗ has 50 nonzeros, λ = 0.005 for FISTA, \(\tilde {\lambda } = 2 \) and γ = 0.9 for AFISTA, and both methods use the stopping criterion \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }<10^{-5}\).
Table 1 shows the difference between the convergence rates of AFISTA and FISTA in terms of the number of iterations (NoIter for short) under the stopping criterion \(\frac {\Vert u^{k}-u^{*} \Vert }{\Vert u^{*} \Vert }\leq \varepsilon \) for different values of ε. Here, Func. Err. denotes the objective function error F(uk) − F(u∗). Observe the large gap between AFISTA and FISTA: the former needs 715 iterations with γ = 0.5, 733 iterations with γ = 0.7, and 708 iterations with γ = 0.9 for ε = 10−14, whereas the latter requires 3282 iterations merely for ε = 10−1. In this test, the sensing matrix A is a 300 × 1000 Gaussian random matrix, u∗ has 50 nonzeros, and \(\tilde {\lambda } = 2.5 \). The reported numbers are averages over 10 random samples.
3.2 Convergence analysis of the AFISTA
In this section, we first present the convergence result for FISTA and then propose the convergence theorem for AFISTA. Note that the convergence of the objective function values was established in [2]; however, the convergence of the sequence \(\left \lbrace u_{k}\right \rbrace \) generated by FISTA was not known. In [10], the authors showed that, with a small modification, one can ensure the convergence of the objective function values as well as the convergence of the sequence \(\left \lbrace u_{k}\right \rbrace \) itself. Here, we present the theorem proved in [10].
Theorem 3
[10] Let a > 2 be a real number and, for all \(k\in \mathbb {N}\), let \(t_{k}=\frac {k+a-1}{a}\). Then the sequence \(\left \lbrace u_{k}\right \rbrace \) produced by FISTA converges to a minimizer of the objective function F.
Note that here we take the objective function \(F(u) = \frac {1}{2}\Vert Au-b \Vert ^{2} + \lambda \Vert u \Vert _{1}\), i.e., the objective of the LASSO problem. We are now ready to state the following theorem on the convergence of the sequence \(\left \lbrace u_{k}\right \rbrace \) generated by AFISTA.
Theorem 4
Let \(\left \lbrace u_{k}\right \rbrace \) be the sequence generated by AFISTA. Then \(\left \lbrace u_{k}\right \rbrace \) converges to a minimizer of the basis pursuit problem.
Proof
By the updating rule (5), the sequence \(\left \lbrace \lambda _{k}\right \rbrace \) is clearly non-increasing. We first show that it cannot become constant. Suppose, for some k ≥ 1, that λk = λk+1 = ⋯. This means that from iteration k onward we are solving the LASSO problem with regularization parameter λk using FISTA, started from the initial point uk. The convergence of the FISTA method then yields that, for a given 0 < γ < 1,
So the sequence \(\left \lbrace \lambda _{k}\right \rbrace \) is positive and decreasing, and hence converges. Moreover, from (6) one concludes that \(\left \lbrace \lambda _{k}\right \rbrace \) has a subsequence satisfying \(\lambda _{k_{i+1}} < \gamma \lambda _{k_{i}} \quad \forall i\in \mathbb {N}\). Thus \(\left \lbrace \lambda _{k}\right \rbrace \) is a convergent sequence with a subsequence converging to zero, so \(\left \lbrace \lambda _{k}\right \rbrace \rightarrow 0\), which enforces the constraint Au = b in the limit. Hence AFISTA eventually solves the basis pursuit problem. □
Note that throughout this paper we set a = 3 in the formula \(t_{k}=\frac {k+a-1}{a}\).
3.3 AFISTA with BB stepsize
The Barzilai-Borwein (BB) method was proposed in [4]. Its motivation is the same idea behind quasi-Newton methods: the BB method approximates the inverse of the Hessian of the problem at hand by a multiple of the identity matrix. That is, the stepsize αk is chosen so that αkI approximates the inverse Hessian at iteration k.
Let Sk− 1 = uk − uk− 1 and Gk− 1 = gk − gk− 1, then the BB stepsize can be computed as follows
and so by differentiating and setting it to zero, we obtain
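In code, the first BB stepsize \(\alpha _{k} = \frac {S_{k-1}^{T}S_{k-1}}{S_{k-1}^{T}G_{k-1}}\) (the standard BB1 formula; the paper's displayed formula is not reproduced in this text and may be either BB variant) can be sketched as follows, with names of our own choosing:

```python
import numpy as np

def bb_stepsize(u, u_prev, grad, grad_prev, fallback=1.0):
    """First Barzilai-Borwein stepsize: alpha = (s^T s) / (s^T g),
    with s = u - u_prev and g = grad - grad_prev, so that (1/alpha) I
    satisfies the secant condition along the last step."""
    s = u - u_prev
    g = grad - grad_prev
    sg = s @ g
    # guard against non-positive curvature along the step
    return (s @ s) / sg if sg > 0 else fallback
```

For a quadratic f(u) = 0.5 uᵀHu with H = cI, the formula recovers the exact inverse-Hessian scaling 1/c.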
Now, to further accelerate the AFISTA method, we use the BB method, which has been successfully applied in similar works on l1-minimization [12, 17, 40, 41]. Each AFISTA iteration contains two basic steps. The first is one iteration of gradient descent with fixed stepsize \(\frac {1}{L}\), which works toward minimizing \(f(u) = \frac {1}{2}\Vert Au-b \Vert ^{2}\). The second applies the shrinkage operator, which imposes the sparsity assumption on the obtained solution. Therefore, to improve the convergence rate of the first step, we replace the fixed stepsize with BB stepsizes. This discussion leads to an accelerated version of AFISTA, namely AFISTA-BB (see Algorithm 4).
Each iteration of the AFISTA-BB algorithm uses two matrix-vector multiplications, as in the previous methods, but in practice its convergence rate is better than that of AFISTA. The left and right panels of Fig. 2 show the residual and the relative error as functions of the iteration count, respectively. In this figure, to reach the stopping criterion \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }<10^{-5}\), the total numbers of iterations needed by AFISTA and AFISTA-BB are 122 and 71, respectively. Figure 2 thus shows that AFISTA-BB performs significantly better than AFISTA. The sensing matrix A is a 600 × 2000 Gaussian matrix; all other parameters are the same as in Fig. 1.
3.4 Orthogonal case
Let us consider the specific case AAT = I, which appears in many applications. In this case, all of the stepsizes are equal to one. For the steepest descent stepsize, we have
And for BB stepsize, we have
thus
Therefore, for this particular case, the AFISTA algorithm can be rewritten as follows
4 Numerical experiments
In this section, the efficiency of the AFISTA algorithm is shown for solving the BP problem. We applied AFISTA to several examples: first sparse signal recovery, second noisy sinusoidal wave recovery, third CS for sparse MRI, and finally an image deconvolution problem. All of the discussed algorithms were run in MATLAB on a Lenovo B590 laptop with an Intel(R) Celeron(R) CPU 1005M @ 1.90GHz processor. Throughout this section, the accuracy of the methods is measured by the relative error \(\frac {\Vert u^{k}-\bar {u} \Vert }{\Vert \bar {u} \Vert }\), where \({\bar {\mathbf {u}}}\) and uk stand for the original signal and the reconstruction at iteration k, respectively.
4.1 Sparse signal recovery
In this subsection, our goal is to reconstruct the original sparse signal \({\bar {\mathbf {u}}}\) from the sensing matrix A and the observation b. To generate \({\bar {\mathbf {u}}}\), the positions of its nonzero components were selected at random using the randsample command in MATLAB, and each nonzero component was then drawn from the standard Gaussian distribution (randn in MATLAB), uniformly at random from [− 1,1] (2 ∗rand − 1 in MATLAB), or uniformly from \(\left \lbrace -1,1\right \rbrace \) (sign(randn) in MATLAB). We use three types of measurement matrices: orthogonalized Gaussian matrices, whose entries are generated from a Gaussian distribution and whose rows are then orthogonalized by a QR decomposition; standard Gaussian matrices; and partial discrete cosine transform (DCT) matrices, whose m rows are randomly selected from the n × n DCT matrix. Note that for a partial DCT matrix, the matrix-vector multiplications involving A and AT can be computed with the discrete cosine transform and the inverse discrete cosine transform, respectively. We carried out two types of tests: a noise-free test, whose results are given in Fig. 3 and Table 2, and a noisy test, whose results are given in Figs. 4, 5, 6 and Table 3. In the noisy case, we take the measurement vector b as follows
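The test-data generation described above (the paper uses MATLAB; this is an equivalent NumPy sketch with our own function names) can be written as:

```python
import numpy as np

def orthogonalized_gaussian(m, n, rng=None):
    """Gaussian sensing matrix whose m rows are orthonormalized via a
    QR decomposition, so that A A^T = I."""
    rng = np.random.default_rng(rng)
    G = rng.standard_normal((n, m))     # tall matrix; orthonormalize columns
    Q, _ = np.linalg.qr(G)              # Q: n x m with orthonormal columns
    return Q.T                          # the rows of A are orthonormal

def sparse_signal(n, k, kind="gauss", rng=None):
    """k-sparse signal with random support; nonzeros drawn from a
    standard Gaussian, uniform [-1, 1], or random signs."""
    rng = np.random.default_rng(rng)
    u = np.zeros(n)
    support = rng.choice(n, size=k, replace=False)   # like randsample
    if kind == "gauss":
        u[support] = rng.standard_normal(k)          # like randn
    elif kind == "uniform":
        u[support] = 2.0 * rng.random(k) - 1.0       # like 2*rand - 1
    else:
        u[support] = rng.choice([-1.0, 1.0], size=k) # like sign(randn)
    return u
```

A standard Gaussian matrix is simply `rng.standard_normal((m, n))`; the partial DCT case is best handled matrix-free, as the text notes.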
where η is white Gaussian noise of variance σ2. To quantify the noise level, instead of σ we use the signal-to-noise ratio (SNR), defined as follows
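Since the paper's displayed SNR formula is not reproduced in this text, the following sketch uses one common convention (an assumption on our part; the authors' exact definition may differ by a constant or use variances):

```python
import numpy as np

def snr_db(clean, noise):
    """Signal-to-noise ratio in decibels, using the common convention
    SNR = 10 * log10(||clean||^2 / ||noise||^2). This is an assumed
    form; the paper's exact formula is not reproduced here."""
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))
```

Under this convention, noise with a tenth of the signal's amplitude gives an SNR of 20 dB, and negative SNRs mean the noise power exceeds the signal power.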
Here, we used the AFISTA-BB method with the following parameters: a 600 × 2000 standard Gaussian sensing matrix A, \(\Vert \bar {u}\Vert _{0}=50\), and \(\tilde {\lambda } =2 \). Figure 3 shows the residual and the relative error. The AFISTA-BB algorithm converges to a solution with residual and relative error of 10−15 in fewer than 350 iterations; the stopping criterion is \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }<10^{-15}\). Figures 4, 5 and 6 show the results of the noisy tests for three different noise levels. These figures show that, even for very high noise levels with negative SNRs, the AFISTA-BB algorithm works well.
In Table 2, we compare the specialized AFISTA for the orthogonal case, termed SAFISTA, with two variants of the linearized Bregman method [41], namely kicking + BB line search (KB) and L-BFGS (LB). In this table, following [41], we use orthogonalized Gaussian matrices as sensing matrices and report averages over 20 random instances. We set \(\tilde {\lambda }=0.1\) for Gaussian signals and \(\tilde {\lambda }=0.05\) for uniformly random [− 1,1] signals, α = 0.95, and the stopping criterion \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }<10^{-5}\). From Table 2, one can observe that SAFISTA requires fewer matrix-vector multiplications than the linearized Bregman variants, so SAFISTA is faster than these two methods. Notably, the reconstruction quality of all three methods is the same.
Table 3 compares AFISTA with linearized Bregman with kicking (KO) [29]. Let std denote the standard deviation. For standard Gaussian matrices we use AFISTA-BB with the stopping criterion std\(\left (Au^{k}-b\right ) <\sigma \), and for partial DCT matrices we use SAFISTA with std\(\left (Au^{k}-b\right ) < 0.5\sigma \). We report means over 10 instances. The numerical results show that the AFISTA method outperforms the KO method in both convergence rate and accuracy.
Table 4 compares SAFISTA with FPC-AS (FPC Active Set [40]). We used the latest version of the FPC-AS Matlab code, available at http://www.caam.rice.edu/~optimization/L1/FPC_AS/. For a fair comparison, we first ran the FPC-AS code and then, given the FPC-AS solution uF, terminated SAFISTA at the first iteration k whose iterate uk satisfied the following condition
Table 4 also shows average results over 20 random instances using orthogonalized Gaussian matrices as sensing matrices, with α = 0.95 and \(\tilde {\lambda } = 0.04\). Note that the table covers both noise-free and noisy cases. It shows that in the noise-free cases FPC-AS works better than SAFISTA, whereas in the noisy cases SAFISTA works better and requires fewer matrix-vector multiplications than FPC-AS. The last column of the table reports the ratio of the running time of FPC-AS to that of the proposed method; its average shows that in the noise-free cases the stopping times of the two methods are approximately the same, whereas in the noisy cases the proposed method is approximately twice as fast as FPC-AS.
4.2 Recovery of noisy sinusoidal wave
For the first compressive sensing application in the frequency domain, we consider a sinusoidal wave that is sparse in the frequency domain:
where ai and βi denote the magnitudes and frequencies of the signal. The observed signal \(\tilde {x}\) is noisy, of the form \(\tilde {x} = \bar {x} + \eta \), where η is white Gaussian noise of variance σ2. Note that in the CS setting, only a random subset of samples of the observed signal \(\tilde {x}\) is available, so the problem is to reconstruct the original signal \(\bar {x}\) from a random measurement of \(\tilde {x}\). The problem can then be written as Au = b, where \(b=\tilde {x}(I)\), I is a random subset of \( \left \lbrace 1,2,{\ldots } ,n \right \rbrace \), n is the total signal length, and the sensing matrix A is the partial inverse Fourier transform matrix. Note that the signal u lives in the Fourier domain, so an inverse Fourier transform must be applied to obtain the reconstructed signal in the physical domain. In the numerical experiments, to generate the original signal \(\bar {x}\), the magnitudes ai are drawn uniformly at random from [− 1,1] and the βi are taken to be random multiples of \(\frac {2\pi }{n}\), i.e., \(\beta _{i} = k_{i}\frac {2\pi }{n}\), where ki is selected at random from \(\left \lbrace 0,1,{\ldots } ,n-1 \right \rbrace \).
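The test-signal generation above can be sketched as follows (the paper's exact sinusoid expression is not reproduced in this text, so the sine form below is an assumption; names are our own):

```python
import numpy as np

def sparse_sinusoid(n, k, rng=None):
    """Signal sparse in the frequency domain: a sum of k sinusoids
    whose frequencies are random multiples of 2*pi/n and whose
    magnitudes are uniform in [-1, 1]. The exact functional form used
    in the paper is assumed, not quoted."""
    rng = np.random.default_rng(rng)
    t = np.arange(n)
    a = 2.0 * rng.random(k) - 1.0                       # magnitudes a_i
    freqs = rng.choice(n, size=k, replace=False) * (2.0 * np.pi / n)
    return sum(a[i] * np.sin(freqs[i] * t) for i in range(k))

def partial_measurement(x_noisy, m, rng=None):
    """CS measurement: keep a random subset I of the noisy samples,
    i.e. b = x_noisy(I)."""
    rng = np.random.default_rng(rng)
    I = rng.choice(len(x_noisy), size=m, replace=False)
    return I, x_noisy[I]
```

Reconstruction then solves Au = b for the Fourier-domain coefficients u, with A the rows of the inverse Fourier matrix indexed by I.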
In this test, the AFISTA-BB algorithm is used with parameters \( \tilde {\lambda }=0.01\), L = 1, and the stopping criterion \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }<0.01\). Numerical results for three different noise levels, with SNRs of 9.5369, 4.0767, and −0.9420, are shown in Figs. 7, 8 and 9, and the details are presented in Table 5. These results show that a reliable reconstruction at a higher noise level requires a greater number of measurements.
4.3 CS for sparse MRI
To illustrate the efficiency of our method in practical applications, we tested it on MRI images. All images used in this section are of size 720 × 960 and are available at https://usa.healthcare.siemens.com/magnetic-resonance-imaging/. After vectorization, the original signal \({\bar {\mathbf {u}}}\) has length n = 691200. In the CS application to sparse MRI, the sensing matrix A is a partial discrete Fourier transform matrix; the matrix-vector multiplications involving A and AT are computed with the fast Fourier transform and the inverse fast Fourier transform, respectively. Throughout this part, we use the AFISTA-BB algorithm with parameters \( \tilde {\lambda }=1\times 10^{-10}\), L = 1, and the stopping criterion \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }<5\times 10^{-3}\).
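The matrix-free partial-DFT operator described above can be sketched as follows (a 1-D illustration with our own names; the MRI experiments use the 2-D analogue). With the orthonormal FFT, the rows of A are orthonormal, so AAT = I and the specialized SAFISTA of Section 3.4 applies.

```python
import numpy as np

def make_partial_fft_ops(mask, n):
    """Matrix-free partial DFT sensing operator:
    A(u)   = the sampled rows of the orthonormal FFT of u,
    AT(b)  = inverse FFT of b zero-filled at the unsampled rows.
    `mask` is a boolean array of length n marking sampled frequencies."""
    def A(u):
        return np.fft.fft(u, norm="ortho")[mask]

    def AT(b):
        full = np.zeros(n, dtype=complex)
        full[mask] = b
        return np.fft.ifft(full, norm="ortho")

    return A, AT
```

Because the FFT is applied with the orthonormal scaling, A(AT(b)) = b exactly, which is the AAT = I property the orthogonal-case analysis relies on.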
Given the importance of the sparsity assumption, we start with angiogram images, which are usually sparse in the spatial domain. We used four angiogram images with different sparsity levels, which we call angio1, angio2, angio3, and angio4; they are used to produce Fig. 10 (from top to bottom) and Fig. 11, respectively. Figure 10 shows the results of our first test on angiography images, in which we compare the AFISTA-BB algorithm with the split Bregman method [18] (Matlab code available at http://tag7.web.rice.edu/Split_Bregman.html). The results show that for sufficiently sparse images, AFISTA-BB beats split Bregman in both convergence rate and solution accuracy. In other words, AFISTA-BB is faster than split Bregman, and the quality of the images it reconstructs is better. Details of the results of this test are presented in Table 6. Figure 11 shows the results of our second test on angiography images, using the image angio4, which is 52% sparse. Compared to the images used in Fig. 10, which are about 70% sparse, this image is not sparse enough to be reconstructed from 30% of its frequency-domain data, and Fig. 11 shows that the AFISTA-BB algorithm indeed fails to produce a proper reconstruction from 30% of the frequency-domain data.
The figure also shows that the AFISTA-BB algorithm works well when reconstructing the image angio4 from 50% of its frequency-domain data. Details of the results of this test are presented in Table 7. Figure 11 illustrates the importance of the sparsity assumption in the BP problem: for other MRI images with significantly lower sparsity levels, the AFISTA method will clearly not work.
As the last test, we applied AFISTA-BB to MRI images, taking the original signal to be the wavelet coefficients of the images, where the Haar wavelet is used to decompose the images up to five resolution levels. The results of this test are presented in Fig. 12 and detailed in Table 8.
4.4 Image deconvolution
For our final experiment, we test the proposed AFISTA-BB on the image deconvolution problem. In other words, we set the matrix A to RW, where R is a matrix representation of the blur operation and W denotes the inverse wavelet transform. Following [17], the blur kernel is set to \(h_{ij} = \frac {1}{i^{2} + j^{2}}\). We use two well-known test images, Lena and Cameraman, and compare the proposed method with the GPSR-BB algorithm [17]; both the paper and the code of the GPSR-BB algorithm are available at http://www.lx.it.pt/~mtf/GPSR/. For the GPSR algorithm, we use the demo "demo_image_deblur.m" with default parameter settings, except for the parameter "tolA". For the discrete wavelet transform, both algorithms use the Rice wavelet toolbox, which can be freely downloaded from http://www-dsp.rice.edu/software/rwt.shtml. Throughout this part, we use the AFISTA-BB algorithm with parameters \( \tilde {\lambda }=1\times 10^{-7}\), L = 1 × 103, and the stopping criterion \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }<\epsilon \). The results of the image deconvolution experiments are reported in Table 9 and Figs. 13 and 14, and they show the efficiency of the proposed method. Note that in this test the quality of the images reconstructed by the GPSR method is lower than that of the proposed method, as the PSNR values of the reconstructed images confirm. For example, from Table 9 we see that for the Lena test image, GPSR-BB reached PSNR = 33.69 in 15.28 seconds, whereas AFISTA-BB reached PSNR = 39.50 in 12.55 seconds. So AFISTA-BB reaches a better reconstruction quality in less time than GPSR-BB, and this test shows that the AFISTA method outperforms GPSR-BB in both convergence rate and accuracy.
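Constructing the blur kernel can be sketched as follows. Note an assumption: the common 9×9 benchmark kernel used in the GPSR literature is \(h_{ij} \propto \frac{1}{1 + i^{2} + j^{2}}\) for i, j = −4, …, 4 (the +1 avoids the singularity at i = j = 0 in the expression \(\frac{1}{i^{2}+j^{2}}\) quoted above), normalized to unit sum; the sketch uses that variant.

```python
import numpy as np

def blur_kernel(radius=4):
    """(2*radius+1)-square blur kernel with h_ij proportional to
    1/(1 + i^2 + j^2), i, j = -radius..radius, normalized to unit sum.
    The +1 term (the usual benchmark variant) is assumed here to avoid
    the singularity at i = j = 0."""
    i = np.arange(-radius, radius + 1)
    h = 1.0 / (1.0 + i[:, None] ** 2 + i[None, :] ** 2)
    return h / h.sum()
```

Convolving an image with this kernel gives the matrix-vector product with R; composing it with the inverse wavelet transform W yields the operator A = RW used above.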
5 Conclusion
This paper proposes AFISTA, which accelerates the FISTA method using a specific continuation strategy, and presents its convergence analysis. Although the main computational cost per iteration of the two methods remains almost the same, thanks to the continuation strategy the AFISTA method is significantly faster than FISTA. Furthermore, to improve its convergence rate, we equipped the AFISTA method with the BB stepsize, and we specialized AFISTA for the orthogonal case, i.e., AAT = I. A fundamental difference between the proposed method and FISTA is that the proposed method solves the BP problem, whereas FISTA solves the LASSO problem. As described above, AFISTA was applied to various signal and image processing tasks and compared with other methods, namely generalizations of linearized Bregman, fixed-point continuation (FPC), split Bregman, and gradient projection for sparse reconstruction (GPSR). The numerical results show that in most cases the proposed method beats the other methods in both convergence rate and reconstruction quality. The main limitation of the proposed method is that its parameters need to be set beforehand. The application of AFISTA to medical image processing is left for future work.
References
Becker S, Bobin J, Candes EJ (2011) NESTA: A fast and accurate first-order method for sparse recovery. SIAM J Imaging Sci 4(1):1–39
Beck A, Teboulle M (2009) A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J Imaging Sci 2(1):183–202
Beck A, Teboulle M (2010) Gradient-based algorithms with applications to signal recovery problems. In: Convex optimization in signal processing and communications. Cambridge University Press, Cambridge, pp 42–88
Barzilai J, Borwein J (1988) Two point step size gradient methods. IMA J Numer Anal 8:141–148
Ben-Tal A, Nemirovski A (2001) Lectures on modern convex optimization: Analysis, algorithms, and engineering applications. MPS/SIAM Ser. Optim. SIAM
Boyd S, Vandenberghe L. (2004) Convex optimization. Cambridge University Press, New York
Bredies K, Lorenz D (2008) Linear convergence of iterative soft-thresholding. J Fourier Anal Appl 14:813–837
Bruck RJ (1977) On the weak convergence of an ergodic iteration for the solution of variational inequalities for monotone operators in Hilbert space. J Math Anal Appl 61:159–164
Candes E, Romberg J, Tao T (2006) Stable signal recovery from incomplete and inaccurate measurements. Comm Pure Appl Math 59:1207–1223
Chambolle A, Dossal Ch (2015) On the convergence of the iterates of the “Fast Iterative Shrinkage/Thresholding Algorithm”. J Optim Theory Appl 166(3):968–982
Chen S, Donoho D (1994) Basis pursuit. In: Proceedings of the 28th Asilomar conference on signals, systems and computers. IEEE
Chen S, Yang M (2017) An adaptive fast iterative shrinkage threshold algorithm. In: 2017 29th Chinese control and decision conference (CCDC). IEEE, pp 2190–2194
Combettes PL, Wajs VR (2005) Signal recovery by proximal forward-backward splitting. Multiscale Model Simul 4:1168–1200
Donoho D (2006) Compressed sensing. IEEE Trans Inform Theory 52:1289–1306
Efron B, Hastie T, Johnstone I, Tibshirani R (2004) Least angle regression. Ann Statist 32:407–499
Facchinei F, Pang JS (2003) Finite-dimensional variational inequalities and complementarity problems. Vol. II, Springer Ser. Oper. Res. Springer, New York
Figueiredo MA, Nowak R, Wright SJ (2007) Gradient projection for sparse reconstruction: Application to compressed sensing and other inverse problems. IEEE Journal of Selected Topics in Signal Processing 1(4):586–597
Goldstein T, Osher S (2009) The split Bregman method for L1-regularized problems. SIAM J Imaging Sciences 2(2):323–343
Hale ET, Yin W, Zhang Y (2007) A fixed-point continuation method for l1-regularized minimization with applications to compressed sensing. Technical report, Rice University
Hale ET, Yin W, Zhang Y (2008) Fixed-point continuation for l1-minimization: Methodology and convergence. SIAM J Optim 19:1107–1130
Huang B, Ma Sh, Goldfarb D (2013) Accelerated linearized Bregman method. J Sci Comput 54:428–453
Islam SMR, Huang X, Ou KL (2015) Image compression based on compressive sensing using wavelet lifting scheme. The International Journal of Multimedia & Its Applications 7(1)
Jiajun H, Russell B, Downton J (2017) Seismic spectral decomposition via basis pursuit. In: SEG Technical Program Expanded Abstracts 2017. Society of Exploration Geophysicists, pp 2174–2179
Karimi S, Vavasis S (2017) IMRO: A proximal quasi-Newton method for solving l1-regularized least squares problems. SIAM J Optim 27(2):583–615
Lorenz DA, Pfetsch ME, Tillmann AM (2014) An infeasible-point subgradient method using adaptive approximate projections. Comput Optim Appl 57 (2):271–306
Lorenz DA, Pfetsch ME, Tillmann AM (2015) Solving basis pursuit: Heuristic optimality check and solver comparison. ACM Trans Math Softw (TOMS) 41(2):1–29
Malioutov DM, Cetin M, Willsky AS (2005) Homotopy continuation for sparse signal representation. In: Proceedings of the IEEE international conference on acoustics, speech, and signal processing, vol 5, pp 733–736
Osborne MR, Presnell B, Turlach BA (2000) A new approach to variable selection in least squares problems. IMA J Numer Anal 20(3):389–402
Osher S, Mao Y, Dong B, Yin W (2010) Fast linearized Bregman iteration for compressive sensing and sparse denoising. Commun Math Sci 8(1):93–111
Parikh N, Boyd S (2013) Proximal algorithms. Foundations and Trends in Optimization 1(3):123–231
Passty GB (1979) Ergodic convergence to a zero of the sum of monotone operators in Hilbert space. J Math Anal Appl 72:383–390
Rockafellar RT (1970) Convex analysis. Princeton University Press, Princeton
Tanay S, Srivastava Sh, Khare S, Stanimirovic PS, Petkovic MD (2019) An improved algorithm for basis pursuit problem and its applications. Appl Math Comput 355:385–398
Tao Zh (2020) Accelerating monotone fast iterative shrinkage–thresholding algorithm with sequential subspace optimization for sparse recovery. SIViP 14(4):771–780
Tao S, Boley D, Zhang S (2016) Local linear convergence of ISTA and FISTA on the LASSO problem. SIAM J Optim 26(1):313–336
Tesfamicael SA, Barzideh F (2015) Clustered compressive sensing: Application on medical imaging. Int J Electr Comput Eng 5(1)
Tesfamicael S, Barzideh F (2014) Clustered compressed sensing in fMRI data analysis using a Bayesian framework. Int J Inf Electr Eng 4(2)
Vahi I, Shahri PK, Ahani H (2020) A compressed-sensing-based compressor for ECG. Biomed Eng Lett: 1–9
van den Berg E (2020) A hybrid quasi-Newton projected-gradient method with application to lasso and basis-pursuit denoising. Math Program Comput 12(1):1–38
Wen Z, Yin W, Goldfarb D, Zhang Y (2010) A fast algorithm for sparse reconstruction based on shrinkage, subspace optimization, and continuation. SIAM J Sci Comput 32:1832–1857
Yin W (2010) Analysis and generalizations of the linearized Bregman method. SIAM J Imaging Sciences 3(4):856–877
Yin W, Osher S, Goldfarb D, Darbon J (2008) Bregman iterative algorithms for L1-minimization with applications to compressed sensing. SIAM J Imaging Sci 1:143–168
Babapour, S., Lakestani, M. & Fatholahzadeh, A. AFISTA: Accelerated FISTA for sparse signal recovery and compressive sensing. Multimed Tools Appl 80, 20707–20731 (2021). https://doi.org/10.1007/s11042-021-10701-w