1 Introduction

Let \(A\in \mathbb {R}^{m\times n}\) with m < n (typically m ≪ n in compressed sensing), \(b\in \mathbb {R}^{m}\), and \(u\in \mathbb {R}^{n}\). The basis pursuit (BP) problem is the following constrained minimization problem [11]:

$$ \min_{u\in \mathbb{R}^{n} }\left\lbrace ||u||_{1}: Au=b \right\rbrace, $$
(1)

which yields the solution of the underdetermined linear system with minimal L1 norm. This formulation was suggested by Chen and Donoho [11] to find a sparse solution among the many possible solutions of Au = b. The basis pursuit problem arises in many applications, such as statistics [15], image compression [22], spectral decomposition [23], compressed sensing [9, 14, 33, 38], and image reconstruction [39], especially for medical images [36, 37]. Instead of problem (1), one can consider the unconstrained problem:

$$ \min_{u\in \mathbb{R}^{n} } \frac{1}{2}\Vert Au-b \Vert_{2}^{2} + \lambda ||u||_{1} , $$
(2)

for λ ≥ 0, known as LASSO (Least Absolute Shrinkage And Selection Operator), which is closely related to (1). Indeed, in [32], the authors proved that a solution of (1) is either u = 0 or a solution of problem (2) for some λ > 0. In (1), one can use another norm instead of the L1 norm. In particular, when a sparse solution is sought, the L1 norm could be replaced with the L0 norm, since the latter gives the sparsest solution of the problem at hand (note that L0 is not actually a norm) [24, 26]. However, the resulting problem is NP-hard and therefore difficult to solve [24, 26]. In [29], it is mentioned that, under suitable circumstances, the solution of problem (2) gives the sparsest solution satisfying Au = b. Indeed, problem (2) is a convex approximation of the L0-regularized problem [17]:

$$ \min_{u\in \mathbb{R}^{n} } \frac{1}{2}\Vert Au-b \Vert_{2}^{2} + \lambda ||u||_{0} , $$
(3)

where \(\Vert u \Vert _{0}\) denotes the total number of nonzero elements of u. Problem (2) can be cast as a second-order cone programming problem [17] and can therefore be solved by conventional methods such as interior point methods [5]. However, because of their computational complexity, these conventional methods are not suitable for the large-scale data [29] arising in real applications such as CS reconstruction of Magnetic Resonance Imaging (MRI). Many algorithms solve problem (2) efficiently; some of them are: the infeasible-point subgradient algorithm (ISA) [25], homotopy-based methods [27, 28], fixed-point continuation (FPC) [19, 40], the L1-Magic algorithm [6], linearized Bregman and its generalizations [21, 29, 41, 42], and proximal-gradient-based algorithms such as the alternating direction method (ADM) [30], the alternating direction method of multipliers (ADMM) [30], and the iterative shrinkage thresholding algorithm (ISTA) together with its modification, the fast iterative shrinkage thresholding algorithm (FISTA) [2, 3, 30, 34, 35]. Compared to the traditional methods, the methods above have been shown to be suitable for problems with large-scale data and sparse solutions and have been used successfully for the LASSO problem. Most of these methods are designed to solve LASSO (2), which is distinct from the BP problem. Among them, fixed-point continuation and an active set method can also solve the BP problem directly, but only if λ is set to a tiny value (e.g., \(10^{-10}\)).

In this paper, we propose a method based on FISTA, which we call AFISTA, that solves the BP problem directly. Indeed, we modify FISTA using a continuation strategy, which leads to fewer iterations than FISTA and solves the BP problem instead of the LASSO problem. Since the only difference between FISTA and AFISTA is the continuation strategy added to AFISTA, whose cost is negligible compared to the matrix-vector multiplications, the computational cost of the two algorithms at each iteration is almost the same, while the number of iterations for AFISTA is considerably smaller than for FISTA. We also present a convergence analysis of the proposed method. To further accelerate the proposed method, we use the Barzilai-Borwein (BB) method [4]. Moreover, for applications with an orthogonal sensing matrix A, we propose a specialized version of the AFISTA method. The efficiency of the proposed method is illustrated through a variety of numerical experiments, including sparse signal recovery, recovery of a noisy sinusoidal wave, compressive sensing for MRI images, and image deconvolution.

This paper is organized as follows. Section 2 recalls the ISTA and FISTA methods, which are the core of the proposed method. In Section 3, we propose the AFISTA method and its generalizations and present its convergence analysis. Section 4 presents numerical experiments. Finally, the conclusion section summarizes the paper.

2 ISTA and FISTA methods

In this section, the ideas of the ISTA and FISTA algorithms, the core of our proposed method, are briefly described. In [2], the authors considered the following general minimization problem:

$$ \begin{array}{@{}rcl@{}} \min_{u\in \mathbb{R}^{n} } \quad f(u)+g(u) , \end{array} $$
(4)

where \(g:\mathbb {R}^{n} \rightarrow \mathbb {R} \) is a continuous convex function, which may be smooth or non-smooth, and \(f: \mathbb {R}^{n} \rightarrow \mathbb {R} \) is a convex, continuously differentiable function with Lipschitz continuous gradient, i.e.,

$$ \begin{array}{@{}rcl@{}} \Vert \nabla f(u_{1}) - \nabla f(u_{2}) \Vert \leq L \Vert u_{1} - u_{2} \Vert , \quad \forall u_{1}, u_{2} \in \mathbb{R}^{n}, \end{array} $$

where ∥.∥ denotes the Euclidean norm in the space \(\mathbb {R}^{n}\). The basic iterative scheme of the ISTA for solving the convex optimization problem (4) is as follows

$$ \begin{array}{@{}rcl@{}} u^{k} &=& \underset{ u \in \mathbb{R}^{n} }{\operatorname{argmin}} \quad f(u^{k-1})+\left\langle u - u^{k-1}, \nabla f(u^{k-1}) \right\rangle + \frac{L}{2}\big\Vert u - u^{k-1}\big\Vert^{2}+ g(u)\\ &=& \underset{ u \in \mathbb{R}^{n} }{\operatorname{argmin}} \quad g(u) + \frac{L}{2}\Big\Vert u - \left( u^{k-1} - \frac{1}{L}\nabla f(u^{k-1}) \right) \Big\Vert^{2}. \end{array} $$

Now, setting \(f(u) = \frac {1}{2}\Vert Au-b \Vert ^{2}\) and \(g(u) = \lambda \Vert u \Vert _{1}\), we have \(\nabla f(u) = A^{T}(Au-b)\) and \(L(f) = \Vert A^{T}A \Vert \), and therefore [2]

$$ \begin{array}{@{}rcl@{}} u^{k} = \mathcal{T}_{\lambda /L} \left( u^{k-1} - \frac{1}{L}A^{T}(Au^{k-1}-b) \right), \end{array} $$

where \(\mathcal {T}_{\alpha }:\mathbb {R}^{n}\rightarrow \mathbb {R}^{n}\) is the shrinkage (soft thresholding) operator defined by

$$ \begin{array}{@{}rcl@{}} \mathcal{T}_{\alpha}(\textbf{x})_{i}=\text{sgn}(x_{i})\max\left\lbrace |x_{i}|-\alpha , 0\right\rbrace . \end{array} $$
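For concreteness, the shrinkage operator amounts to one line of MATLAB; the following minimal sketch (the function name shrink is ours) also shows its effect on a small vector:

```matlab
% Componentwise soft thresholding T_alpha
shrink = @(x, alpha) sign(x) .* max(abs(x) - alpha, 0);

% Example: shrink([1.2; -0.3; 0.7], 0.5) returns [0.7; 0; 0.2]
```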

Finally, the ISTA method for solving the LASSO problem can be written as follows [2]

[Algorithm 1: ISTA for the LASSO problem]

Algorithm 1 shows that each ISTA iteration uses exactly two matrix-vector multiplications, involving A and AT, followed by a shrinkage step, so it is suitable for large-scale applications. Note that ISTA is a special case of the proximal forward-backward method proposed in [8] and [31]. General convergence results for the proximal forward-backward method can be found in [13]. A sufficient but not necessary condition ensuring the convergence of the sequence \(\left \lbrace u^{k} \right \rbrace \) generated by ISTA to an optimal point of the LASSO problem is that the stepsize parameter L satisfies \(\frac {1}{L}\in \left (0, \frac {1}{\Vert A^{T}A \Vert }\right ] \) [16], or equivalently L ≥∥ATA∥. The authors in [7] proved that the sequence \(\left \lbrace u^{k} \right \rbrace \) converges to a minimizer of the LASSO problem when \(L>\frac {1}{2}\Vert A^{T}A \Vert \); this condition is not necessary either.
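As a concrete illustration, a minimal MATLAB sketch of Algorithm 1 might look as follows, assuming A, b, lambda, and an iteration budget maxit are given (variable names are ours):

```matlab
% ISTA for the LASSO problem (sketch)
shrink = @(x, alpha) sign(x) .* max(abs(x) - alpha, 0);
L = norm(A' * A);                       % Lipschitz constant of the gradient of f
u = zeros(size(A, 2), 1);               % initial point u^0
for k = 1:maxit
    g = A' * (A * u - b);               % gradient: the two matrix-vector products
    u = shrink(u - g / L, lambda / L);  % shrinkage (proximal) step
end
```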

The main drawback of the ISTA algorithm is its rate of convergence: it converges slowly to the optimal objective function value. To fix this drawback, a faster algorithm called FISTA was proposed in [2]. The main difference between the FISTA and ISTA algorithms is that FISTA uses an extrapolated point \(y^{k}\) composed of \(u^{k-1}\) and \(u^{k-2}\), whereas ISTA is based only on \(u^{k-1}\). Algorithm 2 shows the FISTA method for the LASSO problem [2]

[Algorithm 2: FISTA for the LASSO problem]
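A corresponding MATLAB sketch of Algorithm 2, following the standard FISTA updates of [2] and again assuming A, b, lambda, and maxit are given (variable names are ours):

```matlab
% FISTA for the LASSO problem (sketch)
shrink = @(x, alpha) sign(x) .* max(abs(x) - alpha, 0);
L = norm(A' * A);
u = zeros(size(A, 2), 1);  y = u;  t = 1;
for k = 1:maxit
    u_old = u;
    u = shrink(y - A' * (A * y - b) / L, lambda / L);  % proximal gradient step at y^k
    t_old = t;
    t = (1 + sqrt(1 + 4 * t_old^2)) / 2;               % momentum parameter update
    y = u + ((t_old - 1) / t) * (u - u_old);           % extrapolation using u^{k-1}, u^{k-2}
end
```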

As is apparent, the computational costs of Algorithms 1 and 2 are almost the same, because the execution times of lines 4 and 5 are marginal. We now recall two theorems on the convergence rates of the ISTA and FISTA methods presented in [2]. Suppose that \(F(u) = \frac {1}{2}\Vert Au-b \Vert ^{2} + \lambda \Vert u \Vert _{1}\) and that problem (2) is solvable, i.e., \(U^{*} = \operatorname{argmin} F \neq \emptyset \), and let \(F^{*}\) be the optimal value of F at an optimal point \(u^{*}\).

Theorem 1

[2] Let \(\left \lbrace u^{k} \right \rbrace \) be the sequence generated by the ISTA method. Then, for any k ≥ 1,

$$ \begin{array}{@{}rcl@{}} F(u^{k})-F(u^{*})\leq \frac{C}{k}, \quad \forall u^{*} \in U^{*}, \end{array} $$

where C is a positive constant.

Theorem 2

[2] Let \(\left \lbrace u^{k} \right \rbrace \) be the sequence generated by the FISTA method. Then, for any k ≥ 1,

$$ \begin{array}{@{}rcl@{}} F(u^{k})-F(u^{*})\leq \frac{C}{(k+1)^{2}}, \quad \forall u^{*} \in U^{*}, \end{array} $$

where C is a positive constant.

In [35], the authors note that the ISTA and FISTA methods have a local linear convergence rate, i.e., these methods converge linearly once they are close to the optimal point.

3 AFISTA method

Considering the importance of the convergence rate in large-scale problems, we advocate using a continuation strategy together with the BB stepsize to accelerate the FISTA method.

3.1 Accelerating FISTA with continuation

Here, the AFISTA method is presented. In [19], the authors introduced a fixed-point continuation method that solves a sequence of LASSO problems with different values of the regularization parameter λ. Indeed, for a decreasing sequence \(\left \lbrace \lambda _{k} \right \rbrace \), they used the solution of the k-th LASSO problem, associated with the regularization parameter \(\lambda _{k}\) and obtained by the fixed-point algorithm, as a warm start for the next LASSO problem. Similar continuation strategies have also been used in [1, 17, 20, 28].

Here, we use the continuation technique to accelerate the FISTA method in a different way. In contrast to previous works, which produce a decreasing sequence \(\left \lbrace \lambda _{k} \right \rbrace \) with a fixed descent rate 0 < β < 1, we use the convergence of the sequence {uk} generated by the FISTA method to produce a decreasing sequence of regularization parameters that converges to zero. Indeed, we produce the decreasing sequence \(\left \lbrace \lambda _{k} \right \rbrace \) by first setting \( \lambda _{1}=\tilde {\lambda } \Vert b\Vert _{2}\Vert A^{T}b\Vert _{2} /L\) and then, at iteration k, setting \(\lambda _{k+1}\) as follows

$$ \begin{array}{@{}rcl@{}} &&\lambda^{\prime}=\tilde{\lambda}\times\Vert b\Vert_{2}\times\Vert u^{k}-u^{k-1}\Vert_{2},\\ &&\textbf{if} \quad \lambda^{\prime} < \gamma \lambda_{k}, \\ && \quad \lambda_{k+1} = \lambda^{\prime}, \\ &&\textbf{else} \\ && \quad \lambda_{k+1} = \lambda_{k}, \end{array} $$
(5)

where \(\tilde {\lambda }\) and γ are positive constants. Note that ISTA and FISTA were proposed to solve the LASSO problem, not the BP problem, whereas the proposed method, by driving \(\lambda _{k}\) to 0, solves the BP problem. To ensure the convergence of the sequence \(\lambda _{k}\) to 0, we need to take γ < 1. The numerical results below show how successful this way of generating the sequence of regularization parameters in the continuation technique can be. We are now ready to present the AFISTA algorithm for the BP problem

[Algorithm 3: AFISTA for the BP problem]
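Combining the FISTA iteration with the continuation rule (5) gives the following MATLAB sketch of AFISTA; it already uses \(t_{k}=\frac {k+a-1}{a}\) with a = 3, as in Section 3.2, and the variable names (lambda_tilde and gamma for \(\tilde {\lambda }\) and γ, plus maxit) are ours:

```matlab
% AFISTA for the BP problem (sketch): FISTA plus continuation rule (5)
shrink = @(x, alpha) sign(x) .* max(abs(x) - alpha, 0);
L = norm(A' * A);  a = 3;
u = zeros(size(A, 2), 1);  y = u;
lambda = lambda_tilde * norm(b) * norm(A' * b) / L;    % lambda_1
for k = 1:maxit
    u_old = u;
    u = shrink(y - A' * (A * y - b) / L, lambda / L);  % FISTA step with current lambda_k
    lambda_new = lambda_tilde * norm(b) * norm(u - u_old);
    if lambda_new < gamma * lambda                     % continuation rule (5)
        lambda = lambda_new;
    end
    t_k  = (k + a - 1) / a;                            % t_k with a = 3
    t_k1 = (k + a) / a;                                % t_{k+1}
    y = u + ((t_k - 1) / t_k1) * (u - u_old);          % extrapolation step
end
```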

Given that the only difference between the FISTA and AFISTA methods lies in the fifth and seventh lines of the AFISTA algorithm, whose computational cost is small compared to the matrix-vector multiplications, the computational cost of the two algorithms at each iteration is almost the same; however, the number of iterations for AFISTA is considerably smaller than for FISTA. This fact is clearly shown in Fig. 1, where the total numbers of iterations of the FISTA and AFISTA methods needed to reach the same stopping criterion are 7651 and 155, respectively. In this experiment, we take the sensing matrix A to be a 300 × 1000 Gaussian random matrix and the total number of nonzeros of u to be 50, with λ = 0.005 for the FISTA method and \(\tilde {\lambda } = 2 \) and γ = 0.9 for the AFISTA method, and we use the stopping criterion \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }<10^{-5}\) for both methods.

Fig. 1: Plot of the residual (left) and the relative error (right) as a function of the iteration count. Solid line: AFISTA. Dashed line: FISTA

Table 1 shows the difference between the convergence rates of the AFISTA and FISTA methods in terms of the number of iterations (NoIter for short) needed to reach the stopping criterion \(\frac {\Vert u^{k}-u^{*} \Vert }{\Vert u^{*} \Vert }\leq \varepsilon \) for different values of ε. Here, Func. Err. denotes the objective function error \(F(u^{k}) - F(u^{*})\). Observe the big difference between AFISTA and FISTA: the former needs 715 iterations with γ = 0.5, 733 iterations with γ = 0.7, and 708 iterations with γ = 0.9 for \(\varepsilon = 10^{-14}\), whereas the latter requires 3282 iterations for \(\varepsilon = 10^{-1}\). In this test, we take the sensing matrix A to be a 300 × 1000 Gaussian random matrix, the total number of nonzeros of u to be 50, and \(\tilde {\lambda } = 2.5 \). The numerical results are averaged over 10 random samples.

Table 1 Iteration numbers of AFISTA vs. FISTA to reach the stopping criterion \(\frac {\Vert u^{k}-u^{*} \Vert }{\Vert u^{*} \Vert }\leq \varepsilon \)

3.2 Convergence analysis of the AFISTA

In this section, we first recall the convergence result for FISTA and then state the convergence theorem for AFISTA. Note that the authors in [2] proved the convergence of the objective function values; however, the convergence of the sequence \(\left \lbrace u_{k}\right \rbrace \) generated by FISTA itself was not known. In [10], the authors show that, with a small modification, one can ensure the convergence of the objective function values as well as the convergence of the sequence \(\left \lbrace u_{k}\right \rbrace \) itself. Here, we restate the theorem proved in [10].

Theorem 3

[10] Let a > 2 be a real number and, for all \(k\in \mathbb {N}\), let \(t_{k}=\frac {k+a-1}{a}\). Then, the sequence \(\left \lbrace u_{k}\right \rbrace \) produced by FISTA converges to a minimizer of the objective function F.

Note that here we set \(F(u) = \frac {1}{2}\Vert Au-b \Vert ^{2} + \lambda \Vert u \Vert _{1}\), the objective function of the LASSO problem. We are now ready to present the following theorem on the convergence of the sequence \(\left \lbrace u_{k}\right \rbrace \) generated by AFISTA.

Theorem 4

Suppose that \(\left \lbrace u_{k}\right \rbrace \) is the sequence generated by AFISTA. Then, the sequence \(\left \lbrace u_{k}\right \rbrace \) converges to a minimizer of the Basis Pursuit problem.

Proof

Based on the updating rule (5), the sequence \(\left \lbrace \lambda _{k}\right \rbrace \) is clearly non-increasing. Moreover, it cannot remain constant forever. Suppose that at some iteration k ≥ 1 we have \(\lambda _{k} = \lambda _{k+1} = \cdots \). This means that, from iteration k on, we are solving the LASSO problem with regularization parameter \(\lambda _{k}\) using FISTA, starting from the initial point \(u_{k}\). Thus, the convergence of the FISTA method yields that, for the given 0 < γ < 1,

$$ \begin{array}{@{}rcl@{}} &&\exists N^{\prime}\in \mathbb{N} \text{ s. t. } \Vert u_{k+N^{\prime}-1} - u_{k+N^{\prime}-2} \Vert < \gamma \Vert u_{k-1} - u_{k-2} \Vert \\ &&\Rightarrow \tilde{\lambda}\Vert b\Vert \Vert u_{k+N^{\prime}-1} - u_{k+N^{\prime}-2} \Vert < \gamma \tilde{\lambda}\Vert b\Vert \Vert u_{k-1} - u_{k-2} \Vert \\ &&\Rightarrow \lambda_{k+N^{\prime}} < \gamma \lambda_{k}. \end{array} $$
(6)

Hence, the sequence \(\left \lbrace \lambda _{k}\right \rbrace \) is a positive non-increasing sequence and therefore converges. Also, from (6), one can conclude that the sequence \(\left \lbrace \lambda _{k}\right \rbrace \) has a subsequence satisfying \(\lambda _{k_{i+1}} < \gamma \lambda _{k_{i}} \quad \forall i\in \mathbb {N}\). This means that \(\left \lbrace \lambda _{k}\right \rbrace \) is a convergent sequence with a subsequence converging to zero. Thus, we have \(\left \lbrace \lambda _{k}\right \rbrace \rightarrow 0\), which forces the constraint Au = b in the limit. This means that eventually AFISTA solves the Basis Pursuit problem. □

Note that, throughout this paper we set a = 3 in formula \(t_{k}=\frac {k+a-1}{a}\).

3.3 AFISTA with BB stepsize

The Barzilai-Borwein (BB) method was proposed in [4]. Its motivation is the same idea that underlies quasi-Newton methods: the BB method provides an approximation to the inverse of the Hessian matrix of the problem at hand, and this approximation is simply a multiple of the identity matrix; that is, the stepsize \(\alpha _{k}\) is chosen so that \(\alpha _{k}I\) approximates the inverse of the Hessian at iteration k.

Let \(S^{k-1} = u^{k}-u^{k-1}\) and \(G^{k-1} = g^{k}-g^{k-1}\), where \(g^{k} = \nabla f(u^{k}) = A^{T}(Au^{k}-b)\); then the BB stepsize can be computed as follows

$$\alpha_{k} = \underset{ \alpha \in \mathbb{R} }{\operatorname{argmin}}\quad \Vert S^{k-1}-\alpha G^{k-1}\Vert^{2},$$

and so, by differentiating with respect to α and setting the derivative to zero, we obtain

$$ \begin{array}{@{}rcl@{}} \alpha_{k} = \frac{(G^{k-1})^{T}S^{k-1}}{(G^{k-1})^{T}G^{k-1}}. \end{array} $$
(7)
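In MATLAB, with the gradient \(g^{k} = A^{T}(Au^{k}-b)\), the BB stepsize (7) amounts to three lines (a sketch; variable names are ours):

```matlab
% BB stepsize (7); u, u_old, g, g_old are consecutive iterates and gradients
S = u - u_old;                  % S^{k-1}
G = g - g_old;                  % G^{k-1}
alpha = (G' * S) / (G' * G);    % alpha_k, so that alpha_k*I approximates the inverse Hessian
```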

Now, in order to further accelerate the AFISTA method, we use the BB method, which has been successfully employed in similar works on l1-minimization [12, 17, 40, 41]. The AFISTA algorithm contains two basic steps in each iteration. The first step is one iteration of the gradient descent method with fixed stepsize \(\frac {1}{L}\), which performs the minimization of the function \(f(u) = \frac {1}{2}\Vert Au-b \Vert ^{2}\). The second step applies the shrinkage operator, which imposes the sparsity assumption on the obtained solution. Therefore, to improve the convergence rate of the first step, we replace the fixed stepsize with the BB stepsizes. The above discussion leads to an accelerated version of AFISTA, namely AFISTA-BB (see Algorithm 4).

[Algorithm 4: AFISTA-BB]
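Relative to the AFISTA sketch above, the only change is in the gradient step, where the fixed stepsize 1/L is replaced by the BB stepsize \(\alpha _{k}\); scaling the shrinkage threshold by \(\alpha _{k}\) as well is our assumption about the proximal step:

```matlab
% AFISTA-BB gradient/shrinkage step (sketch)
g = A' * (A * y - b);
u = shrink(y - alpha * g, lambda * alpha);  % BB stepsize alpha_k instead of 1/L;
                                            % threshold scaled by alpha (our assumption)
```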

Each iteration of the AFISTA-BB algorithm uses two matrix-vector multiplications, as in the previous methods, but in practice its convergence rate is better than that of AFISTA. The left- and right-hand sides of Fig. 2 show the residual and the relative error as a function of the iteration count, respectively. In this figure, to reach the stopping criterion \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }<10^{-5}\), the total numbers of iterations of the AFISTA and AFISTA-BB methods are 122 and 71, respectively. Figure 2 thus shows that the AFISTA-BB method performs significantly better than AFISTA. The sensing matrix A is a 600 × 2000 Gaussian matrix; all other parameters are the same as in Fig. 1.

Fig. 2: Plot of the residual (left) and the relative error (right) as a function of the iteration count. Solid line: AFISTA-BB. Dashed line: AFISTA

3.4 Orthogonal case

Let us consider the specific case AAT = I, which appears in many applications. In this case, all of the stepsizes are equal to one. For the steepest descent stepsize, we have

$$ \begin{array}{@{}rcl@{}} \alpha_{k} = \frac{(g^{k})^{T}g^{k}}{(g^{k})^{T}A^{T}Ag^{k}} = \frac{ (Au^{k}-b)^{T}\underbrace{AA^{T}}_{=I}(Au^{k}-b) }{ (Au^{k}-b)^{T}\underbrace{AA^{T}AA^{T}}_{=I}(Au^{k}-b) } = 1. \end{array} $$

And for the BB stepsize, we have

$$ \begin{array}{@{}rcl@{}} &S^{k-1}=u^{k}-u^{k-1}\\ &G^{k-1}=g^{k}-g^{k-1}=A^{T}A(u^{k}-u^{k-1}), \end{array} $$

thus

$$ \begin{array}{@{}rcl@{}} \alpha_{k} = \frac{(G^{k-1})^{T}S^{k-1}}{(G^{k-1})^{T}G^{k-1}} =\frac{ (u^{k}-u^{k-1})^{T}A^{T}A(u^{k}-u^{k-1}) }{ (u^{k}-u^{k-1})^{T}A^{T}\underbrace{AA^{T}}_{=I}A(u^{k}-u^{k-1}) }=1. \end{array} $$

Therefore, for this particular case, the AFISTA algorithm can be rewritten as follows

[Algorithm 5: SAFISTA, the AFISTA algorithm for the orthogonal case]
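Since all stepsizes equal one when AAT = I, the gradient/shrinkage step of the earlier sketches simplifies accordingly (a sketch):

```matlab
% SAFISTA step (sketch): unit stepsize in the orthogonal case AA' = I
u = shrink(y - A' * (A * y - b), lambda);
```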

4 Numerical experiments

In this section, the efficiency of the AFISTA algorithm for solving the BP problem is demonstrated on several examples: first on sparse signal recovery, second on noisy sinusoidal wave recovery, third on CS for sparse MRI, and finally on an image deconvolution problem. All of the discussed algorithms were implemented in MATLAB on a Lenovo B590 laptop with an Intel(R) Celeron(R) 1005M CPU @ 1.90GHz. Throughout this section, the accuracy of the methods is measured by the relative error \(\frac {\Vert u^{k}-\bar {u} \Vert }{\Vert \bar {u} \Vert }\), where \({\bar {\mathbf {u}}}\) and \(u^{k}\) stand for the original signal and the one reconstructed at iteration k, respectively.

4.1 Sparse signal recovery

In this subsection, our goal is to reconstruct the original sparse signal \({\bar {\mathbf {u}}}\) from the sensing matrix A and the observation b. To generate the original signal \({\bar {\mathbf {u}}}\), the positions of its nonzero components were randomly selected using the randsample command in MATLAB, and each of these nonzero components was then sampled from the standard Gaussian distribution (randn in MATLAB), uniformly at random from [− 1,1] (2 ∗rand − 1 in MATLAB), or uniformly at random from \(\left \lbrace -1,1\right \rbrace \) (sign(randn) in MATLAB). We use three types of measurement matrices: orthogonalized Gaussian matrices, whose entries were generated from a Gaussian distribution and whose rows were orthogonalized by QR decomposition; standard Gaussian matrices; and partial discrete cosine transform (DCT) matrices, whose m rows were randomly selected from the n × n DCT matrix. Note that when we deal with the partial DCT matrix, we can compute matrix-vector multiplications involving A and AT using the discrete cosine transform and the inverse discrete cosine transform, respectively. Here, we have performed two types of tests: a noise-free test, whose results are given in Fig. 3 and Table 2, and a noisy test, whose results are given in Figs. 4, 5, 6 and Table 3. In the noisy case, we take the measurement vector b as follows

$$b=A\bar{u}+\eta , $$

where η is white Gaussian noise of variance σ2. To quantify the noise level, instead of σ we use the signal-to-noise ratio (SNR), defined as follows

$$ \begin{array}{@{}rcl@{}} \text{SNR}=20\log_{10}\left( \frac{\Vert \bar{u} \Vert}{\Vert \eta \Vert } \right) \end{array} $$
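The test data described above can be generated as in the following MATLAB sketch (the sizes and the noise level sigma are illustrative):

```matlab
% Sparse test signal and noisy Gaussian measurements (sketch)
n = 2000;  m = 600;  s = 50;  sigma = 0.1;   % illustrative values
u_bar = zeros(n, 1);
idx = randsample(n, s);                      % positions of the nonzeros
u_bar(idx) = randn(s, 1);                    % or 2*rand(s,1)-1, or sign(randn(s,1))
A = randn(m, n);                             % standard Gaussian sensing matrix
eta = sigma * randn(m, 1);                   % white Gaussian noise
b = A * u_bar + eta;
SNR = 20 * log10(norm(u_bar) / norm(eta));   % signal-to-noise ratio
```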
Fig. 3: Noise-free test. Top: original signal with \(\Vert \bar {u}\Vert _{0}=50\). Middle: signal reconstructed by the AFISTA-BB algorithm, with \(\Vert u^{k}\Vert _{0} = 51\), after 346 iterations. Bottom left: residual versus iteration count. Bottom right: relative error versus iteration count

Table 2 Noise-free test: comparison of SAFISTA for the orthogonal case with the KB and LB methods [41]. Numerical results are averaged over 20 random samples
Fig. 4: Noisy test for three different noise levels, with SNR = 16.35, 6.67, − 3.74, for a standard Gaussian signal with \(\Vert \bar {u}\Vert _{0}=50\). Original signal: red dots. Reconstructed signal: blue circles

Fig. 5: Noisy test for three different noise levels, with SNR = 8.30, 2.41, − 4.23, for a uniformly random [− 1,1] signal with \(\Vert \bar {u}\Vert _{0}=50\). Original signal: red dots. Reconstructed signal: blue circles

Fig. 6: Noisy test for three different noise levels, with SNR = 15.06, 5.59, − 4.91, for a uniformly random \(\left \lbrace -1,1\right \rbrace \) signal with \(\Vert \bar {u}\Vert _{0}=50\). Original signal: red dots. Reconstructed signal: blue circles

Table 3 Noisy test: comparison of AFISTA and KO [29]. Standard Gaussian and partial DCT matrices are used for the AFISTA-BB and SAFISTA algorithms, respectively. Numerical results are averaged over 10 random samples

Here, we used the AFISTA-BB method with the following parameters: the sensing matrix A is a 600 × 2000 standard Gaussian matrix, \(\Vert \bar {u}\Vert _{0}=50\), and \(\tilde {\lambda } =2 \). Figure 3 shows the residual and the relative error: the AFISTA-BB algorithm converges to a solution with residual and relative error of \(10^{-15}\) in fewer than 350 iterations. The stopping criterion is \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }<10^{-15}\). Figures 4, 5 and 6 show the results of the noisy tests with three different noise levels. These figures show that, even for large noise levels with negative SNRs, the AFISTA-BB algorithm works well.

In Table 2, we compare the specialized AFISTA for the orthogonal case, termed SAFISTA, with two methods from the linearized Bregman family [41], namely kicking + BB line search (KB) and L-BFGS (LB). In this table, following [41], we use orthogonalized Gaussian matrices as sensing matrices and report average results obtained over 20 random instances. We set \(\tilde {\lambda }=0.1\) for Gaussian signals and \(\tilde {\lambda }=0.05\) for uniformly random [− 1,1] signals, α = 0.95, and the stopping criterion \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }<10^{-5}\). From Table 2, one can observe that the number of matrix-vector multiplications required by SAFISTA is smaller than for the linearized Bregman methods, so SAFISTA is faster than these two methods. Notably, the reconstruction quality of all three methods is the same.

Table 3 shows the comparison of AFISTA with the linearized Bregman method with kicking (KO) [29]. Let std denote the standard deviation. For standard Gaussian matrices we use AFISTA-BB with stopping criterion std\(\left (Au^{k}-b\right ) <\sigma \), and for partial DCT matrices we use SAFISTA with stopping criterion std\(\left (Au^{k}-b\right ) < 0.5\sigma \). We report mean results obtained over 10 instances. The numerical results show that the AFISTA method outperforms the KO method in both convergence rate and accuracy.

Table 4 shows the comparison of SAFISTA with FPC-AS (FPC Active Set [40]). We used the latest version of the FPC-AS Matlab code, available at http://www.caam.rice.edu/~optimization/L1/FPC_AS/. For a fair comparison, we first run the FPC-AS code and then, having the FPC-AS solution \(u_{F}\), we terminate the SAFISTA method at the first iteration k whose iterate \(u^{k}\) satisfies

$$ \begin{array}{@{}rcl@{}} 10^{-5}\Vert u^{k} \Vert_{1} + \frac{1}{2}\Vert Au^{k}-b \Vert_{2}^{2}\leq 10^{-5}\Vert u_{F} \Vert_{1} + \frac{1}{2}\Vert Au_{F}-b \Vert_{2}^{2}. \end{array} $$
Table 4 Comparison of SAFISTA and the FPC active set method [40]. Numerical results are averaged over 20 random samples

Table 4 also reports the average results, obtained over 20 random instances using orthogonalized Gaussian sensing matrices. We take α = 0.95 and \(\tilde {\lambda } = 0.04\). Note that this table contains both noise-free and noisy cases. One can see that in the noise-free cases FPC-AS works better than SAFISTA, whereas in the noisy cases SAFISTA works better and requires fewer matrix-vector multiplications than FPC-AS. The last column of this table presents the time ratio of FPC-AS to the proposed method; its average shows that in the noise-free cases the stopping times of the two methods are approximately the same, while in the noisy cases the proposed method is approximately 2 times faster than FPC-AS.

4.2 Recovery of noisy sinusoidal wave

For the first compressive sensing application in the frequency domain, we consider a sinusoidal wave that is sparse in the frequency domain:

$$ \begin{array}{@{}rcl@{}} \bar{x}(t) = a_{1}\sin(\beta_{1}t) + a_{2}\cos(\beta_{2}t), \end{array} $$

where \(a_{i}\) and \(\beta _{i}\) denote the magnitudes and frequencies of the signal. The observed signal \(\tilde {x}\) is a noisy signal of the form \(\tilde {x} = \bar {x} + \eta \), where η is white Gaussian noise of variance σ2. Note that in the CS application, only a random measurement of the observed signal \(\tilde {x}\) is available, so the problem is to reconstruct the original signal \(\bar {x}\) from a random measurement of \(\tilde {x}\). The problem can then be written as Au = b, where \(b=\tilde {x}(I)\), I is a random subset of \( \left \lbrace 1,2,{\ldots } ,n \right \rbrace \), n is the total signal length, and the sensing matrix A is the partial inverse Fourier transform matrix. Note that the signal u lives in the Fourier domain, so we need to take an inverse Fourier transform to obtain the reconstructed signal in the physical domain. In the numerical experiments, to generate the original signal \(\bar {x}\), the magnitudes \(a_{i}\) are generated uniformly at random from [− 1,1] and the \(\beta _{i}\) are taken to be random multiples of \(\frac {2\pi }{n}\), i.e., \(\beta _{i} = k_{i}\frac {2\pi }{n}\), where \(k_{i}\) is randomly selected from \(\left \lbrace 0,1,{\ldots } ,n-1 \right \rbrace \).
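A MATLAB sketch of the test signal and the random measurements described above (the signal length and the noise level sigma are illustrative):

```matlab
% Sparse-in-frequency test signal and random time-domain samples (sketch)
n = 512;  t = (0:n-1)';  sigma = 0.1;       % illustrative values
k_i = randi([0, n-1], 2, 1);                % random frequency indices
beta = k_i * 2 * pi / n;                    % beta_i = k_i * 2*pi / n
a = 2 * rand(2, 1) - 1;                     % magnitudes uniform in [-1, 1]
x_bar = a(1) * sin(beta(1) * t) + a(2) * cos(beta(2) * t);
x_tilde = x_bar + sigma * randn(n, 1);      % noisy observed signal
I = randsample(n, round(0.3 * n));          % e.g., 30% random measurements
b = x_tilde(I);                             % b = x_tilde(I)
```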

In this test, the AFISTA-BB algorithm is used with parameters \( \tilde {\lambda }=0.01\), L = 1, and the stopping criterion \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }<0.01\). Numerical results for three different noise levels, with SNR = 9.5369, 4.0767, and − 0.9420, are shown in Figs. 7, 8 and 9. Details of the results obtained in this test are presented in Table 5. These results show that, to obtain a reliable reconstruction at a higher noise level, we need to use a greater number of measurements.

Fig. 7: Reconstruction using 30% of the noisy signal. Top left: original (red) and noisy (blue) signals. Top right: absolute magnitudes of the original (red dots) and reconstructed (blue circles) signals in the frequency domain. Bottom left: original (red) and reconstructed (blue) signals. Bottom right: a close-up view of the bottom-left panel

Fig. 8: Reconstruction using 50% of the noisy signal. Top left: original (red) and noisy (blue) signals. Top right: absolute magnitudes of the original (red dots) and reconstructed (blue circles) signals in the frequency domain. Bottom left: original (red) and reconstructed (blue) signals. Bottom right: a close-up view of the bottom-left panel

Fig. 9: Reconstruction using 70% of the noisy signal. Top left: original (red) and noisy (blue) signals. Top right: absolute magnitudes of the original (red dots) and reconstructed (blue circles) signals in the frequency domain. Bottom left: original (red) and reconstructed (blue) signals. Bottom right: a close-up view of the bottom-left panel

Table 5 Details of the results obtained in Figs. 7, 8 and 9

4.3 CS for sparse MRI

To illustrate the efficiency of our method in practical applications, we tested it on MRI images. All images used in this section are of size 720 × 960 and are available at https://usa.healthcare.siemens.com/magnetic-resonance-imaging/. After vectorization of these images, the original signal \({\bar {\mathbf {u}}}\) is of length n = 691200. In the CS application for sparse MRI, we take the sensing matrix A to be a partial discrete Fourier transform matrix. Note that we compute matrix-vector multiplications involving A and AT using the fast Fourier transform and the inverse fast Fourier transform, respectively. Throughout this part, we use the AFISTA-BB algorithm with parameters \( \tilde {\lambda }=1\times 10^{-10}\), L = 1, and the stopping criterion \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }<5\times 10^{-3}\).
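The products with the partial DFT matrix and its adjoint can be implemented with the FFT; the following sketch uses the orthonormal scaling, so that the sensing matrix satisfies the orthogonality assumption of Section 3.4 (the function names are ours):

```matlab
% Matrix-vector products with the partial DFT matrix via the FFT (sketch)
% mask is a logical n-vector marking the m sampled frequencies
function y = A_mul(u, mask, n)
    Fu = fft(u) / sqrt(n);       % orthonormal DFT, so that A*A' = I
    y  = Fu(mask);               % keep only the sampled frequencies
end

function u = At_mul(y, mask, n)
    Fu = zeros(n, 1);
    Fu(mask) = y;                % zero-fill the missing frequencies
    u = ifft(Fu) * sqrt(n);      % orthonormal inverse DFT (the adjoint)
end
```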

Considering the importance of the sparsity assumption, we first start with angiogram images, which are usually sparse in the spatial domain. We used four angiogram images with different sparsity levels, called angio1, angio2, angio3, and angio4; the first three are used, from top to bottom, in Fig. 10 and the last one in Fig. 11. Figure 10 shows the results of our first test on angiography images. In this figure, we compare the AFISTA-BB algorithm with the split Bregman method [18]; the Matlab code for the split Bregman method can be downloaded from http://tag7.web.rice.edu/Split_Bregman.html. The obtained results show that, for sufficiently sparse images, AFISTA-BB works better than split Bregman in both convergence rate and solution accuracy. In other words, AFISTA-BB is faster than split Bregman, and the quality of the images reconstructed by AFISTA-BB is better. Details of the results obtained in this test are presented in Table 6. Figure 11 shows the results of our second test on angiography images. In this test, we use the image angio4, which is 52% sparse. Compared to the images used in Fig. 10, which are about 70% sparse, this image is not sparse enough to be reconstructed from 30% of its frequency-domain data. From Fig. 11, we can see that the AFISTA-BB algorithm failed to produce a proper reconstruction using 30% of the frequency-domain data.

Table 6 Details of the results obtained in Fig. 10
Fig. 10: CS reconstruction of MR angiography images using 30% of the frequency-domain data. Columns from left to right: original images; images reconstructed from 30% of the frequency-domain data, with the missing samples filled with zeros; images reconstructed by the split Bregman algorithm after 8 inner iterations (3 outer iterations); and images reconstructed by AFISTA-BB using the stopping criterion \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }\leq 5\times 10^{-3}\)

Fig. 11: CS reconstruction of a 50% sparse MR angiography image. Top: original image. Middle left: image reconstructed from 30% of the frequency-domain data, with the missing samples filled with zeros. Middle right: result of the AFISTA-BB algorithm using 30% of the frequency-domain data. Bottom left: image reconstructed from 50% of the frequency-domain data, with the missing samples filled with zeros. Bottom right: result of the AFISTA-BB algorithm using 50% of the frequency-domain data

This figure also shows that the AFISTA-BB algorithm works well in reconstructing the image angio4 using 50% of its frequency-domain data. Details of the results obtained in this test are presented in Table 7. Figure 11 illustrates the importance of the sparsity assumption in the BP problem; it is clear that the AFISTA method will not work for other MRI images with significantly lower sparsity levels.

Table 7 Details of the results obtained in Fig. 11

As the last test, we applied AFISTA-BB to MRI images in the wavelet domain. Here, the original signal is taken to be the wavelet coefficients of the MRI images, where the Haar wavelet is used to decompose the images up to five resolution levels. The results of this test are presented in Fig. 12 and its details in Table 8.

Fig. 12: CS reconstruction of MRI images in the wavelet domain (the original signal is taken to be the wavelet coefficients of the MRI images) using 25% of the frequency-domain data. Columns from left to right: original images; images reconstructed from 25% of the frequency-domain data, with the missing samples filled with zeros; and images reconstructed by the AFISTA-BB algorithm using the stopping criterion \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }\leq 5\times 10^{-3}\)

Table 8 Details of the results presented in Fig. 12

4.4 Image deconvolution

For our final experiment, we test the proposed AFISTA-BB on the image deconvolution problem. In other words, we set the matrix A to be RW, in which R is a matrix representation of the blur operation and W denotes the inverse wavelet transform. Following [17], the blur kernel is set to \(h_{ij} = \frac {1}{i^{2} + j^{2}}\). Here, we use two well-known test images, Lena and Cameraman, and we compare the proposed method with the GPSR-BB algorithm [17]; both the paper and the code of the GPSR-BB algorithm are available at http://www.lx.it.pt/~mtf/GPSR/. For the GPSR algorithm, we use the demo "demo_image_deblur.m" with default settings for its parameters, except for the parameter "tolA". For the implementation of the discrete wavelet transform, both algorithms use the Rice wavelet toolbox, which can be freely downloaded from http://www-dsp.rice.edu/software/rwt.shtml. Throughout this part, we use the AFISTA-BB algorithm with parameters \( \tilde {\lambda }=1\times 10^{-7}\), \(L = 1\times 10^{3}\), and the stopping criterion \(\frac {\Vert Au^{k}-b\Vert }{\Vert b\Vert }<\epsilon \). The results of the image deconvolution experiments are reported in Table 9 and Figs. 13 and 14, which show the efficiency of the proposed method. Note that in this test the quality of the images reconstructed by the GPSR method is lower than that of the proposed method, as the PSNR values of the reconstructed images show. For example, from Table 9, we can see that for the Lena test image GPSR-BB reached PSNR = 33.69 in 15.28 seconds, whereas AFISTA-BB reached PSNR = 39.50 in 12.55 seconds. So AFISTA-BB reaches a better reconstruction quality in less time than GPSR-BB. Thus, this test shows that the AFISTA method outperforms the GPSR-BB method in both convergence rate and accuracy.
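A sketch of the blur kernel construction; the value at the center (i = j = 0), where the formula is undefined, and the normalization are our assumptions:

```matlab
% Blur kernel h_ij = 1/(i^2 + j^2) of size [9,9] (sketch)
r = 4;                           % kernel radius: size (2r+1) x (2r+1)
[i, j] = meshgrid(-r:r, -r:r);
h = 1 ./ (i.^2 + j.^2);
h(r+1, r+1) = 1;                 % center value where the formula is undefined: our assumption
h = h / sum(h(:));               % normalize so that the kernel sums to one
```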

Table 9 Image deconvolution test: Comparison of the AFISTA-BB with the GPSR-BB [17] for different tolerances
Fig. 13: Image deconvolution test with a blur kernel of size [9,9]. Top left: original image. Top right: blurred image. Bottom left: image reconstructed by the GPSR-BB algorithm, with PSNR 28.09. Bottom right: image reconstructed by AFISTA-BB, with PSNR 57.60

Fig. 14: Image deconvolution test with a blur kernel of size [17,17]. Top left: original image. Top right: blurred image. Bottom left: image reconstructed by the GPSR-BB algorithm, with PSNR 32.49. Bottom right: image reconstructed by AFISTA-BB, with PSNR 50.54

5 Conclusion

This paper proposed AFISTA, which accelerates the FISTA method using a specific continuation strategy, and presented a convergence analysis of the proposed method. Although the main computational cost of the two methods at each iteration remains almost the same, thanks to the continuation strategy the AFISTA method is significantly faster than the FISTA method. Furthermore, to improve its convergence rate, we furnished the AFISTA method with the BB stepsize. AFISTA was then specialized for the orthogonal case, i.e., AAT = I. A fundamental difference between the proposed method and the FISTA method is that the proposed method solves the BP problem while FISTA solves the LASSO problem. As described above, AFISTA was applied to various signal and image processing tasks and compared with other methods, including generalizations of linearized Bregman, fixed-point continuation (FPC), split Bregman, and gradient projection for sparse reconstruction (GPSR). Numerical results show that in most cases the proposed method is better than the other methods in both convergence rate and reconstruction quality. The main limitation of the proposed method is that its parameters need to be set beforehand. The application of AFISTA to medical image processing is our future work.