Abstract
In this paper, we propose an inertial accelerated primal-dual method for the linear equality constrained convex optimization problem. When the objective function has a “nonsmooth + smooth” composite structure, we further propose an inexact inertial primal-dual method that linearizes the smooth component and solves the subproblem inexactly. Assuming merely convexity, we prove that the proposed methods enjoy an \(\mathcal {O}(1/k^{2})\) convergence rate in terms of the objective residual and the feasibility violation of the primal problem. Numerical results are reported to demonstrate the validity of the proposed methods.
1 Introduction
Consider the linear equality constrained convex optimization problem:

$$ \min\limits_{x\in \mathbb {R}^{n}} F(x) \quad \text{s.t.}\quad Ax=b, \qquad (1) $$
where \(F: \mathbb {R}^{n}\to \mathbb {R}\) is a convex but possibly nonsmooth function, \(A\in \mathbb {R}^{m\times n}\) and \(b\in \mathbb {R}^{m}\). The problem (1) captures a number of important applications arising in various areas, and the following are three concrete examples.
Example 1.1
The basis pursuit problem (see, e.g., [9, 10]):

$$ \min\limits_{x\in \mathbb {R}^{n}} \|x\|_{1} \quad \text{s.t.}\quad Ax=b, \qquad (2) $$
where \(A\in \mathbb {R}^{m\times n}\) with m ≪ n, and ∥⋅∥1 is the ℓ1-norm of \(\mathbb {R}^{n}\) defined by \(\|x\|_{1} ={\sum }^{n}_{i=1} |x_{i}|\). Algorithms for the basis pursuit problem can be found in [31] and [33].
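The recast below is a standard observation, not part of this paper's method: problem (2) is equivalent to a linear program via the splitting x = u − v with u, v ≥ 0, so a generic LP solver can handle it. The instance sizes and the solver used here are illustrative only.

```python
import numpy as np
from scipy.optimize import linprog

# Basis pursuit  min ||x||_1  s.t.  Ax = b  recast as a linear program:
# write x = u - v with u, v >= 0, so that ||x||_1 = 1^T(u + v).
rng = np.random.default_rng(0)
m, n = 4, 10
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[[1, 6]] = [1.5, -2.0]            # sparse ground truth
b = A @ x_true

c = np.ones(2 * n)                      # objective 1^T u + 1^T v
A_eq = np.hstack([A, -A])               # A u - A v = b
res = linprog(c, A_eq=A_eq, b_eq=b, bounds=(0, None))
x_hat = res.x[:n] - res.x[n:]

print(np.linalg.norm(A @ x_hat - b))    # feasibility of the recovered x
```

Since x_true is feasible, the LP optimum satisfies ∥x̂∥₁ ≤ ∥x_true∥₁.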
Example 1.2
The linearly constrained ℓ1 − ℓ2 minimization problem [17]:

$$ \min\limits_{x\in \mathbb {R}^{n}} \|x\|_{1}+\frac{\beta}{2}\|x\|^{2}_{2} \quad \text{s.t.}\quad Ax=b, \qquad (3) $$
where β > 0 and ∥⋅∥2 is the ℓ2-norm of \(\mathbb {R}^{n}\) defined by \(\|x\|^{2}_{2} ={\sum }^{n}_{i=1} {x_{i}^{2}}\). When β is small enough, a solution of the problem (3) is also a solution of the basis pursuit problem (2). Since the problem (3) has the regularization term \(\frac {\beta }{2}\|x\|^{2}_{2}\), it is less sensitive to noise than the basis pursuit problem (2).
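The proximal map of the ℓ1 − ℓ2 objective has a simple closed form (soft-thresholding followed by a scalar shrinkage), which is part of what makes problem (3) attractive algorithmically. A minimal sketch, checked against a dense grid search:

```python
import numpy as np

def prox_l1_l2(x, t, beta):
    """Proximal map of t*(||.||_1 + (beta/2)||.||_2^2):
    soft-thresholding, then shrinkage by 1/(1 + t*beta)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0) / (1.0 + t * beta)

# sanity check against a 1-D grid search (the map is separable)
t, beta, x0 = 0.3, 2.0, 1.7
z_grid = np.linspace(-3, 3, 200001)
phi = np.abs(z_grid) + beta / 2 * z_grid**2 + (z_grid - x0)**2 / (2 * t)
z_star = z_grid[np.argmin(phi)]
print(prox_l1_l2(np.array([x0]), t, beta)[0], z_star)
```

For x0 below the threshold t the map returns exactly zero, which is what produces sparse iterates.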
Example 1.3
The global consensus problem [8]:

$$ \min\limits_{x_{1},\cdots,x_{N},z} \sum\limits^{N}_{i=1} f_{i}(x_{i}) \quad \text{s.t.}\quad x_{i}=z,\ \ i=1,2,\cdots,N, $$
where \(f_{i}:\mathbb {R}^{n}\to \mathbb {R}\) is convex, i = 1,2,⋯ ,N. The global consensus problem is a widely investigated model that has important applications in signal processing [23], routing of wireless sensor networks [22] and optimal consensus of agents [28].
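As a sketch of how consensus fits the template (1), one hypothetical encoding stacks the local copies and imposes chain constraints x_i − x_{i+1} = 0; the matrix construction below is purely illustrative and is not the formulation used in [8].

```python
import numpy as np

# Stack local copies x = (x_1, ..., x_N); the chain constraints
# x_i - x_{i+1} = 0, i = 1..N-1, are one way to encode consensus as Ax = 0.
N, d = 4, 3                       # N agents, each variable in R^d
D = np.zeros((N - 1, N))
for i in range(N - 1):
    D[i, i], D[i, i + 1] = 1.0, -1.0
A = np.kron(D, np.eye(d))         # ((N-1)d) x (Nd) constraint matrix
b = np.zeros((N - 1) * d)

x_equal = np.tile(np.ones(d), N)          # all copies equal -> feasible
x_diff = np.arange(N * d, dtype=float)    # copies differ -> infeasible
print(np.linalg.norm(A @ x_equal - b), np.linalg.norm(A @ x_diff - b))
```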
Recall that \((x^{*}, \lambda ^{*})\in \mathbb {R}^{n} \times \mathbb {R}^{m}\) is a KKT point of the problem (1) if

$$ 0\in \partial F(x^{*})+A^{T}\lambda ^{*} \quad \text{and}\quad Ax^{*}=b, \qquad (4) $$
where ∂F is the classical subdifferential of F defined by

$$ \partial F(x) = \{d\in \mathbb {R}^{n}: F(y)\geq F(x)+\langle d, y-x\rangle \ \ \forall y\in \mathbb {R}^{n}\}. $$
Let Ω be the KKT point set of the problem (1). Then, for any (x∗,λ∗) ∈Ω, from (4) we have
where \({\mathscr{L}}:\mathbb {R}^{n}\times \mathbb {R}^{m}\to \mathbb {R}\) is the Lagrangian function associated with the problem (1) defined by

$$ {\mathscr{L}}(x,\lambda) = F(x)+\langle \lambda, Ax-b\rangle. $$
A classical method for solving the problem (1) is the augmented Lagrangian method (ALM) [6]:

$$ \begin{array}{@{}rcl@{}} x_{k+1} &\in& \arg\min\limits_{x} {\mathscr{L}}(x,\lambda_{k})+\frac{\sigma}{2}\|Ax-b\|^{2},\\ \lambda_{k+1} &=& \lambda_{k}+\sigma(Ax_{k+1}-b). \end{array} \qquad (5) $$
In general, since \({\mathscr{L}}(x,\lambda _{k})+\frac {\sigma }{2}\|Ax-b\|^{2}\) is not strictly convex, the subproblem may have more than one solution and be difficult to solve. To overcome this disadvantage, the proximal ALM [11] has been proposed:

$$ \begin{array}{@{}rcl@{}} x_{k+1} &=& \arg\min\limits_{x} {\mathscr{L}}(x,\lambda_{k})+\frac{\sigma}{2}\|Ax-b\|^{2}+\frac{1}{2}\|x-x_{k}\|^{2}_{P},\\ \lambda_{k+1} &=& \lambda_{k}+\sigma(Ax_{k+1}-b), \end{array} \qquad (6) $$
where \(\|x\|^{2}_{P} = x^{T}Px\) with a positive semidefinite matrix P and P + σATA is positive definite.
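To make the iteration (6) concrete, the sketch below applies the proximal ALM, with the illustrative choice P = Id, to a small equality-constrained strongly convex quadratic program, where the subproblem reduces to a linear solve; all sizes and parameters are assumptions for this demo, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, sigma = 3, 6, 1.0
H = rng.standard_normal((n, n))
Q = H.T @ H + np.eye(n)                  # strongly convex for a clean test
q = rng.standard_normal(n)
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)
P = np.eye(n)                            # proximal weight (illustrative choice)

# reference KKT solution: [Q A^T; A 0][x; lam] = [-q; b]
K = np.block([[Q, A.T], [A, np.zeros((m, m))]])
x_ref = np.linalg.solve(K, np.concatenate([-q, b]))[:n]

x, lam = np.zeros(n), np.zeros(m)
M = Q + sigma * A.T @ A + P              # Hessian of the quadratic subproblem
for _ in range(5000):
    rhs = -q - A.T @ lam + sigma * A.T @ b + P @ x
    x = np.linalg.solve(M, rhs)          # exact subproblem solve
    lam = lam + sigma * (A @ x - b)      # dual ascent step

print(np.linalg.norm(x - x_ref), np.linalg.norm(A @ x - b))
```

The multiplier update is the same as in (5); only the primal subproblem gains the proximal term.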
In some practical situations, the objective function F has the composite structure: F(x) = f(x) + g(x), where f is a convex but possibly nonsmooth function and g is a convex smooth function. Then, the problem (1) becomes the linearly constrained composite convex optimization problem:

$$ \min\limits_{x\in \mathbb {R}^{n}} f(x)+g(x) \quad \text{s.t.}\quad Ax=b. \qquad (7) $$
An application of the method (6) to the problem (7) with linearizing the smooth function g leads to the linearized ALM [32]:
1.1 Related works
Under the assumption that F is smooth, He and Yuan [15] showed that the iteration-complexity of the method (5) is \(\mathcal {O}(1/k)\) in terms of the objective residual of the associated \({\mathscr{L}}(x,\lambda )\). When F is nonsmooth, Gu et al. [13] proved that the method (5) enjoys a worst-case \(\mathcal {O}(1/k)\) convergence rate in the ergodic sense. A worst-case \(\mathcal {O}(1/k)\) convergence rate in the non-ergodic sense of the method (6) was shown in [21]. When g has a Lipschitz continuous gradient with constant Lg and P ≻ LgId, Xu [32] proved that the method (8) achieves \(\mathcal {O}(1/k)\) convergence rate in the ergodic sense. Tran-Dinh and Zhu [30] proposed a modified version of the method (8) and proved that the objective residual and feasibility violation sequences generated by the method both enjoy \(\mathcal {O}(1/k)\) non-ergodic convergence rate. Liu et al. [24] investigated the non-ergodic convergence rate of an inexact augmented Lagrangian method for the problem (7).
Generally, naive first-order methods converge slowly. Much effort has been made to accelerate the existing first-order methods over the past decades. Nesterov [25] first proposed an accelerated version of the classical gradient method for a smooth convex optimization problem, and proved that the accelerated inertial gradient method enjoys \(\mathcal {O}(1/k^{2})\) convergence rate. Beck and Teboulle [5] proposed an iterative shrinkage-thresholding algorithm for solving the linear inverse problem, which achieves \(\mathcal {O}(1/k^{2})\) convergence rate. The acceleration idea of [25] was further applied in Nesterov [26] to design accelerated methods for unconstrained convex composite optimization problems. Su et al. [29] first studied accelerated methods from a continuous-time perspective. Since then, some new accelerated inertial methods based on second-order dynamical systems have been proposed for unconstrained optimization problems (see, e.g., [1, 3, 4]). For more results on inertial methods for unconstrained optimization problems, we refer the reader to [2, 12, 27].
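The gap between the \(\mathcal {O}(1/k)\) rate of plain gradient descent and the \(\mathcal {O}(1/k^{2})\) rate of Nesterov's scheme is easy to observe numerically. A minimal sketch on a diagonal convex quadratic (the spectrum and iteration count are illustrative choices):

```python
import numpy as np

# Nesterov's accelerated gradient vs. plain gradient descent on the convex
# quadratic f(x) = 0.5 * sum(lam_i * x_i^2) with an ill-conditioned spectrum.
n, K = 100, 1000
lam = (np.arange(1, n + 1) / n) ** 2   # eigenvalues in (0, 1]
L = lam.max()                          # Lipschitz constant of the gradient
f = lambda x: 0.5 * np.sum(lam * x ** 2)
x0 = np.ones(n)

# plain gradient descent, step 1/L
x = x0.copy()
for _ in range(K):
    x = x - (1.0 / L) * (lam * x)
f_gd = f(x)

# accelerated: y_k = x_k + (k-1)/(k+2)*(x_k - x_{k-1}), gradient step at y_k
x_prev, x_cur = x0.copy(), x0.copy()
for k in range(1, K + 1):
    y = x_cur + (k - 1) / (k + 2) * (x_cur - x_prev)
    x_prev, x_cur = x_cur, y - (1.0 / L) * (lam * y)
f_acc = f(x_cur)

print(f_gd, f_acc)                     # accelerated gap is much smaller
```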
Meanwhile, inertial accelerated methods for linearly constrained optimization problems have also been well developed. When f is differentiable, He and Yuan [15] proposed an accelerated inertial ALM for the problem (1) and proved that its convergence rate is \(\mathcal {O}(1/k^{2})\) by using an extrapolation technique similar to [5]. When f is a differentiable function with Lipschitz continuous gradient, by time discretization of a dynamical system, Boţ et al. [7] proposed an inertial ALM with \(\mathcal {O}(1/k^{2})\) convergence rate and established the convergence of the sequence of iterates. Kang et al. [18] presented an inexact version of the accelerated ALM with inexact computation of the subproblems and showed that the convergence rate remains \(\mathcal {O}(1/k^{2})\) under the assumption that F is strongly convex. Kang et al. [17] further presented an accelerated Bregman method for the linearly constrained ℓ1 − ℓ2 minimization problem, and a convergence rate of \(\mathcal {O}(1/k^{2})\) was proved when the accelerated Bregman method is applied to the problem (1). To linearize the augmented term of the Bregman method, Huang et al. [16] proposed an accelerated linearized Bregman algorithm with \(\mathcal {O}(1/k^{2})\) convergence rate. For the problem (7), Tran-Dinh and Zhu [30] proposed an inertial primal-dual method which enjoys \(o(1/(k\sqrt {\log k}))\) convergence rate. Xu [32] proposed an accelerated version of the linearized ALM (8), named the accelerated linearized augmented Lagrangian method, which is formulated as follows:
It was shown in Xu [32] that the algorithm (9) enjoys \(\mathcal {O}(1/k^{2})\) convergence rate under specific parameter settings. It is worth mentioning that, to achieve the \(\mathcal {O}(1/k^{2})\) rate, linearization of the augmented term is not allowed in the algorithm (9), and this may cause great difficulty in solving the subproblems. Xu [32] did not discuss the convergence analysis of the method when the subproblem is solved inexactly.
1.2 Inertial primal-dual methods
We first propose Algorithm 1, an inertial version of the proximal ALM (6), for solving the problem (1). Algorithm 1 is inspired by the second-order primal-dual dynamical system in [14, 34] and the Nesterov accelerated methods for unconstrained optimization problems [2, 5, 25]. When the objective has the composite structure F(x) = f(x) + g(x), by linearizing the smooth function g and introducing the perturbed sequence {𝜖k}k≥ 1 in Step 2 of Algorithm 1, we propose an inexact inertial accelerated primal-dual method (Algorithm 2) for the problem (7). In contrast to Algorithm 1, we solve the subproblem inexactly by finding an approximate solution instead of an exact solution.
1.3 Outline
The rest of the paper is organized as follows. In Section 2, we investigate the convergence rates of the proposed methods. In Section 3, we perform numerical experiments. Finally, we give a concluding remark in Section 4.
2 Convergence analysis
In this section we analyze the convergence rates of Algorithm 1 and Algorithm 2. Assuming merely convexity, we show that both of them enjoy \(\mathcal {O}(1/k^{2})\) convergence rates in terms of the objective function and the primal feasibility.
To do so, we first recall some standard notations and results which will be used in the paper. In what follows, we always use ∥⋅∥ to denote the ℓ2-norm.
Let \(\mathbb {S}_{+}(n)\) denote the set of all symmetric positive semidefinite matrices in \(\mathbb {R}^{n\times n}\) and Id be the identity matrix. For \(M\in \mathbb {S}_{+}(n)\), we introduce the semi-norm on \(\mathbb {R}^{n}\): \(\|x\|_{M} =\sqrt {x^{T}Mx}\) for any \(x\in \mathbb {R}^{n}\). This induces on \(\mathbb {S}_{+}(n)\) the following partial ordering: for any \(M_{1}, M_{2} \in \mathbb {S}_{+}(n)\),
For any \(x,y\in \mathbb {R}^{n}\), the following equality holds:
Now, we start to analyze Algorithm 1.
Lemma 1
Let \(\{(x_{k},\lambda _{k},\bar {x}_{k})\}_{k\geq 1}\) be the sequence generated by Algorithm 1. Then,
Proof
From Step 2, we have
This yields
It follows from Step 2 and Step 3 that
This together with (12) yields (11). □
Lemma 2
Suppose that F is a convex function, Ω≠∅ and \(M_{k-1}\succcurlyeq M_{k}\). Let \(\{(x_{k},\lambda _{k}, \bar {x}_{k}, \hat {\lambda }_{k})\}_{k\geq 1}\) be the sequence generated by Algorithm 1 and (x∗,λ∗) ∈Ω. Define
with
Then, for any k ≥ 1, we have
Proof
By computation,
and
Similarly, we have
and
By the definition of \({\mathscr{L}}(x,\lambda )\), we get \(\partial _{x} {\mathscr{L}}(x,\lambda ) = \partial F(x)+A^{T}\lambda \). Combining this and equality (18), we can rewrite (11) as
which implies
Since \(M_{k-1}\succcurlyeq M_{k}\succcurlyeq 0\), it follows from (10) and (15) that
Since \({\mathscr{L}}(x,\lambda ^{*})\) is a convex function with respect to x, from (16) and (19) we get
Combining (20) and (21) together, we have
Since Ax∗ = b, it follows from Step 3 of Algorithm 1 and (16) that
This together with (10) and (17) yields
It follows from (22) and (23) that
where the last inequality follows from α ≥ 3 and (x∗,λ∗) ∈Ω. This yields the desired result. □
To obtain the fast convergence rates, we need the following lemma.
Lemma 3
([20, Lemma 2], [19, Lemma 3.18]) Let \(\{a_{k}\}_{k=1}^{+\infty }\) be a sequence of vectors in \(\mathbb {R}^{n}\) such that
where τ > 1 and C ≥ 0. Then, \(\|{\sum }^{K}_{k=1}a_{k}\|\leq C\) for all K ≥ 1.
Now, we discuss the \(\mathcal {O}(1/k^{2})\) convergence rate of Algorithm 1.
Theorem 1
Suppose that F is a convex function, Ω≠∅ and \(M_{k-1}\succcurlyeq M_{k}\). Let \(\{(x_{k},\lambda _{k},\bar {x}_{k},\bar {\lambda }_{k})\}_{k\geq 1}\) be the sequence generated by Algorithm 1 and (x∗,λ∗) ∈Ω. The following conclusions hold:
-
\({\sum }_{k=1}^{+\infty }k^{2}(\|x_{k+1}-\bar {x}_{k}\|_{M_{k}}^{2}+\|\lambda _{k+1}-\bar {\lambda }_{k}\|^{2}) < +\infty \).
-
For all k > 1,
$$ \begin{array}{@{}rcl@{}} &&\|Ax_{k}-b\| \leq \frac{4(\alpha-1)^{2}\sqrt{2\mathcal{E}_{1}}}{s(k-1)(k+\alpha-3)},\\ &&|F(x_{k})-F(x^{*})|\leq \frac{(\alpha-1)^{2}\mathcal{E}_{1}}{s(k^{2}-k)}+ \frac{4(\alpha-1)^{2}\sqrt{2\mathcal{E}_{1}}\|\lambda^{*}\|}{s(k-1)(k+\alpha-3)}, \end{array} $$where \(\mathcal {E}_{1}=\frac {1}{2}\|x_{1}-x^{*}\|_{M_{0}}^{2}+\frac {1}{2}\|\lambda _{1}-\lambda ^{*}\|^{2}\).
Proof
From Lemma 2, we have
By the definition of \(\mathcal {E}_{k}\) and (24), \(\{\mathcal {E}_{k}\}_{k\geq 1}\) is a nonincreasing and positive sequence. As a consequence, \(\mathcal {E}_{k}\) converges to some point. It follows from (24) that
which is (i).
Combining (13) and (24), we have
This yields
for all K ≥ 1. It follows from (17) and Step 3 of Algorithm 1 that
This together with (26) implies
When α = 3, it follows from (27) that
When α > 3: applying Lemma 3 with ak = (α − 3)(k − 1)(Axk − b), \(\tau = \frac {\alpha -2}{\alpha -3}\) and \(C = \frac {2(\alpha -1)^{2}\sqrt {2\mathcal {E}_{1}}}{s}\), from (27), we obtain
which together with (27) yields
From above discussion, when α ≥ 3, we have
It follows from the definition of \(\mathcal {E}_{k}\) and (24) that
for all k > 1. This together with (28) implies that
for all k > 1. The proof is complete. □
To investigate the convergence of Algorithm 2, we need the following assumption.
Assumption (H)
Ω≠∅, f is a convex function, g is a convex smooth function and has a Lipschitz continuous gradient with constant Lg > 0, i.e.,

$$ \|\nabla g(x)-\nabla g(y)\|\leq L_{g}\|x-y\| \quad \forall x,y\in \mathbb {R}^{n}; $$

equivalently,

$$ g(y)\leq g(x)+\langle \nabla g(x), y-x\rangle +\frac{L_{g}}{2}\|y-x\|^{2} \quad \forall x,y\in \mathbb {R}^{n}. $$
Lemma 4
Let \(\{(x_{k},\lambda _{k},\bar {x}_{k})\}_{k\geq 1}\) be the sequence generated by Algorithm 2. Then,
Proof
From Step 2 of Algorithm 2, we have
which yields
The rest of the proof is similar to that of Lemma 1, and so we omit it. □
Lemma 5
Assume that Assumption (H) holds, and \(M_{k-1}\succcurlyeq M_{k}\succcurlyeq sL_{g} Id \). Let \(\{(x_{k},\lambda _{k}, \hat {\lambda }_{k})\}_{k\geq 1}\) be the sequence generated by Algorithm 2 and (x∗,λ∗) ∈Ω. Define
where \(\mathcal {E}_{k}\) is defined in (13) and \(\hat {x}_{k}\) is defined in (14). Then, for any k ≥ 1,
Proof
By the same arguments as in the proof of Lemma 2, we get
For notation simplicity, we denote
Then, \({\mathscr{L}}^{f}\) is a convex function, \(\partial {\mathscr{L}}^{f}(x) = \partial f(x)+A^{T}\lambda ^{*}\), and
It follows from (30) and (35) that
which yields
Since \(M_{k-1}\succcurlyeq M_{k}\), it follows from (10), (32) and (38) that
where the inequality follows from the convexity of \({\mathscr{L}}^{f}\). Since g has a Lipschitz continuous gradient, from (29) we get
By the convexity of g, we have
and
It follows from (41)–(43) that
and
This together with (33) yields
It follows from (39), (40) and (44) that
where the second inequality follows from the assumption \(M_{k}\succcurlyeq sL_{g} Id\succcurlyeq \frac {sL_{g}k}{k+\alpha -2}Id\).
It follows from (23), (31) and (45) that
where the last inequality follows from α ≥ 3 and (x∗,λ∗) ∈Ω. □
To analyze the convergence of Algorithm 2, we need the following discrete version of the Gronwall-Bellman lemma.
Lemma 6
([2, Lemma 5.14]) Let {ak}k≥ 1 and {bk}k≥ 1 be two nonnegative sequences such that \({\sum }^{+\infty }_{k=1} b_{k}< +\infty \) and
for all k ≥ 1, where c ≥ 0. Then,
for all k ≥ 1.
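A quick numeric sanity check, assuming the standard form of this lemma (as in [2]): if \(a_{k}^{2}\leq c^{2}+{\sum }_{j=1}^{k} b_{j}a_{j}\) for all k, then \(a_{k}\leq c+{\sum }_{j=1}^{k} b_{j}\). The worst case takes each a_k as large as the inequality allows:

```python
import numpy as np

# Worst-case check of the discrete Gronwall-Bellman bound, assuming the form
#   a_k^2 <= c^2 + sum_{j<=k} b_j a_j   implies   a_k <= c + sum_{j<=k} b_j.
rng = np.random.default_rng(3)
c = 1.0
b = rng.random(200) / np.arange(1, 201) ** 2     # summable sequence
a, S = [], 0.0                                   # S = sum_{j<k} b_j a_j
for bk in b:
    # take a_k as large as allowed: solve a^2 = c^2 + S + bk * a
    ak = (bk + np.sqrt(bk ** 2 + 4 * (c ** 2 + S))) / 2
    a.append(ak)
    S += bk * ak
bound = c + np.cumsum(b)
print(np.all(np.array(a) <= bound + 1e-12))      # True
```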
Theorem 2
Assume that Assumption (H) holds, \(M_{k-1}\succcurlyeq M_{k}\succcurlyeq sL_{g}Id\) for all k ≥ 1, and
Let {(xk,λk)}k≥ 1 be the sequence generated by Algorithm 2 and (x∗,λ∗) ∈Ω. Then, for all k > 1,
where
Proof
From Lemma 5 we have
and it yields
This together with (13) and the Cauchy–Schwarz inequality implies
Since \(M_{k-1}\succcurlyeq sL_{g} Id \),
Since \({\sum }^{+\infty }_{j=1} j\|\epsilon _{j}\|< +\infty \), applying Lemma 6 with \(a_{k}=\|\hat {x}_{k}-x^{*}\|\) to (48), we obtain
This together with (47) yields
for any k ≥ 1. Denote
By the definition of \(\mathcal {E}_{k}\) and (50), we have
and
By similar arguments in Theorem 1, we obtain
and
for all k > 1. □
Remark 1
It was shown in [32, Theorem 2.9] that the algorithm (9) with adaptive parameters enjoys \(\mathcal {O}(1/k^{2})\) rate. Xu [32] did not discuss whether the convergence rate of the algorithm (9) is preserved when the subproblem is solved inexactly. By Theorem 2, the \(\mathcal {O}(1/k^{2})\) convergence rate of Algorithm 2 is preserved even if the subproblem is solved inexactly, provided the errors are sufficiently small. The numerical experiments in Section 3 also show the effectiveness of the inexact algorithm.
3 Numerical experiments
In this section, we present numerical experiments to illustrate the efficiency of the proposed algorithms. All codes are run on a PC (with 2.3 GHz Quad-Core Intel Core i5 and 8 GB memory) under MATLAB Version 9.4.0.813654 (R2018a).
3.1 The quadratic programming problem
In this subsection, we test the algorithms on the nonnegative linearly constrained quadratic programming problem (NLCQP):

$$ \min\limits_{x\in \mathbb {R}^{n}} \frac{1}{2}x^{T}Qx+q^{T}x \quad \text{s.t.}\quad Ax=b,\ x\geq 0, $$
where \( q \in \mathbb {R}^{n}\), \(Q\in \mathbb {R}^{n\times n}\) is a positive semidefinite matrix, \(A\in \mathbb {R}^{m\times n}\), \( b \in \mathbb {R}^{m}\). Here, we compare Algorithm 2 (Al2) with the accelerated linearized augmented Lagrangian method (AALM [32, Algorithm 1]), which enjoys \(\mathcal {O}(1/k^{2})\) convergence rate with adaptive parameters.
Set m = 100 and n = 500. Let q be generated by standard Gaussian distribution, b be generated by uniform distribution, A = [B,Id] with \(B\in \mathbb {R}^{m\times (n-m)}\) generated by standard Gaussian distribution, and Q = 2HTH with \(H\in \mathbb {R}^{n\times n}\) generated by standard Gaussian distribution. Then, Q may not be positive definite. The optimal value F(x∗) is obtained by the Matlab function quadprog with tolerance 10− 15. In this case, F(x) = f(x) + g(x) with \(f(x) = \mathcal {I}_{y\geq 0}(x)\) and \(g(x)= \frac {1}{2}x^{T}Qx+q^{T}x\), where \(\mathcal {I}_{y\geq 0}\) is the indicator function of the set {y|y ≥ 0}, i.e., \(\mathcal {I}_{y\geq 0}(x)=0\) if x ≥ 0 and \(\mathcal {I}_{y\geq 0}(x)=+\infty \) otherwise.
Set the parameters of Algorithm 2 as: s = ∥Q∥, Mk = s ∗∥Q∥Id. Set the parameters of AALM ([32, Algorithm 1]) with adaptive parameters, in which \(\alpha _{k}=\frac {2}{k+1}\), βk = γk = ∥Q∥k, \(P_{k}=\frac {2\|Q\|}{k}Id\). Subproblems for both algorithms are solved by interior-point algorithms to a tolerance subtol. Figure 1 reports the objective error |F(xk) − F(x∗)| and the feasibility violation ∥Axk − b∥ for Al2 with α = 10, 20, 30 and for AALM over the first 500 iterations. As shown in Fig. 1, Algorithm 2 performs better and is more stable than AALM under different subtol.
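For reference, the test data described above can be generated as follows; the random seed and the helper functions are illustrative additions, and the compared algorithms themselves are not reproduced here.

```python
import numpy as np

# Data generation for the NLCQP test, following the description in the text.
rng = np.random.default_rng(4)
m, n = 100, 500
q = rng.standard_normal(n)
b = rng.random(m)                               # uniform distribution
B = rng.standard_normal((m, n - m))
A = np.hstack([B, np.eye(m)])                   # A = [B, Id]
H = rng.standard_normal((n, n))
Q = 2 * H.T @ H                                 # positive semidefinite

g = lambda x: 0.5 * x @ Q @ x + q @ x           # smooth part
grad_g = lambda x: Q @ x + q
prox_f = lambda x: np.maximum(x, 0.0)           # prox of the indicator of x >= 0

print(A.shape, np.linalg.eigvalsh(Q)[0] >= -1e-6)
```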
3.2 The basis pursuit problem
Consider the following basis pursuit problem:

$$ \min\limits_{x\in \mathbb {R}^{n}} \|x\|_{1} \quad \text{s.t.}\quad Ax=b, $$
where \(A\in \mathbb {R}^{m\times n}\), \(b\in \mathbb {R}^{m}\) and m ≤ n. Let A be generated by standard Gaussian distribution. The number of nonzero elements of the original solution x∗ is fixed at 0.1 ∗ n, and the nonzero elements are selected randomly in [− 2,2]. Set b = Ax∗. We compare Algorithm 1 with the inexact augmented Lagrangian method (IAL [24, Algorithm 1]). Here, subproblems for both algorithms are solved by the fast iterative shrinkage-thresholding algorithm (FISTA [5], [24, Algorithm 2]), and the stopping condition of FISTA is when
is satisfied or the number of iterations exceeds 100, where accuracy subtol = 1e − 4, 1e − 6, 1e − 8. In each test, we calculate the residual error ∥Axk − b∥ (Res) and the relative error of the solution \(\frac {\|x_{k}-x^{*}\|}{\|{x}^{*}\|}\) (Rel) with the stopping condition Res + Rel ≤ 1e − 8. Set the parameters of Algorithm 1 as α = n, s = 100, Mk = 0, and the parameter of IAL as β = 1. Let Init and Time denote the number of iterations and the CPU time in seconds, respectively. Under different tolerance subtol of the subproblem, Tables 1, 2 and 3 report the results for the basis pursuit problem with different dimensions. We observe that when the subproblem is solved with different accuracy, Algorithm 1 is faster than IAL in terms of both the number of iterations and the CPU time.
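Each ALM subproblem in this experiment takes the form min ∥x∥₁ + (β/2)∥Ax − c∥², with c absorbing the multiplier term, and is solved by FISTA with soft-thresholding. A self-contained sketch on a toy instance (sizes and parameters are assumptions for this demo):

```python
import numpy as np

def soft(x, t):
    """Soft-thresholding, the prox of t*||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def fista(A, c, beta, iters=300):
    """FISTA for min ||x||_1 + (beta/2)||Ax - c||^2, step 1/L."""
    L = beta * np.linalg.norm(A, 2) ** 2
    x = y = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(iters):
        x_new = soft(y - beta * A.T @ (A @ y - c) / L, 1.0 / L)
        t_new = (1 + np.sqrt(1 + 4 * t ** 2)) / 2
        y = x_new + (t - 1) / t_new * (x_new - x)
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(5)
A = rng.standard_normal((20, 50))
x_true = np.zeros(50)
x_true[[3, 17, 40]] = [1.0, -2.0, 0.5]
c = A @ x_true
beta = 10.0
obj = lambda x: np.abs(x).sum() + 0.5 * beta * np.linalg.norm(A @ x - c) ** 2
x_hat = fista(A, c, beta)
print(obj(x_hat) < obj(np.zeros(50)))        # objective has decreased
```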
3.3 The linearly constrained ℓ 1 − ℓ 2 minimization problem
Consider the following problem:

$$ \min\limits_{x\in \mathbb {R}^{n}} \|x\|_{1}+\frac{\beta}{2}\|x\|^{2}_{2} \quad \text{s.t.}\quad Ax=b, $$
where \(A\in \mathbb {R}^{m\times n}\) and \(b\in \mathbb {R}^{m}\). Let m = 1500, n = 3000, and A be generated by standard Gaussian distribution. Suppose that the original solution (signal) \({x}^{*}\in \mathbb {R}^{n}\) has only 150 non-zero elements, which are generated by the Gaussian distribution \(\mathcal {N}(0,4)\) in the interval [− 2,2], and that the noise ω is selected randomly with ∥ω∥ = 10− 4 and b = Ax∗ + ω.
Set parameters for Algorithm 2 (Al2) with α = 20, s = 1, Mk = sβ, and the parameters of IAALM ([18, Algorithm 1]) with γ = 1. Subproblems are solved by FISTA and the stopping condition is when
is satisfied or the number of iterations exceeds 100, where accuracy subtol = 1e − 6, 1e − 8. We terminate all the methods when ∥Axk − b∥≤ 5 ∗ 10− 4. In each test, we calculate the residual error res = ∥Ax − b∥, the relative error \(rel = \frac {\|x-x^{*}\|}{\|{x}^{*}\|}\) and the signal-to-noise ratio
where x is the recovery signal.
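The metrics can be computed as below; the SNR here uses the common definition 20 log₁₀(∥x∗∥/∥x − x∗∥), which is an assumption since the paper's display is not reproduced in this text.

```python
import numpy as np

def metrics(x, x_true, A, b):
    """Residual, relative error, and SNR (assumed common definition)."""
    res = np.linalg.norm(A @ x - b)
    rel = np.linalg.norm(x - x_true) / np.linalg.norm(x_true)
    snr = 20 * np.log10(np.linalg.norm(x_true) / np.linalg.norm(x - x_true))
    return res, rel, snr

# tiny smoke test: a recovery off by 1e-3 per entry gives a high SNR
x_true = np.array([1.0, -2.0, 0.0, 0.5])
A = np.eye(4)
b = x_true.copy()
res, rel, snr = metrics(x_true + 1e-3, x_true, A, b)
print(round(rel, 6), round(snr, 1))
```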
In Table 4, we present the numerical results of Algorithm 2 and IAALM for various β. When subtol = 1e − 6, IAALM does not work well, so we list only the numerical results of Algorithm 2 in Table 5. Based on Rel and SNR, it is seen that the sparse original signal is well restored when β ≤ 1. This is also shown in Fig. 2.
4 Conclusion
In this paper, we propose two inertial accelerated primal-dual methods for solving linear equality constrained convex optimization problems. Assuming merely convexity, we show that the proposed methods enjoy \(\mathcal {O}(1/k^{2})\) convergence rates even if the subproblem is solved inexactly. The numerical results demonstrate the validity and superior performance of our methods over some existing methods.
References
Apidopoulos, V., Aujol, J. F., Dossal, C.: Convergence rate of inertial forward-backward algorithm beyond Nesterov’s rule. Math. Program. 180(1), 137–156 (2020)
Attouch, H., Chbani, Z., Peypouquet, J., Redont, P.: Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity. Math. Program. 168(1-2), 123–175 (2018)
Attouch, H., Peypouquet, J.: The rate of convergence of Nesterov’s accelerated forward-backward method is actually faster than 1/k2. SIAM J. Optim. 26(3), 1824–1834 (2016)
Attouch, H., Peypouquet, J., Redont, P.: A dynamical approach to an inertial forward-backward algorithm for convex minimization. SIAM J. Optim. 24(1), 232–256 (2014)
Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imaging Sci. 2(1), 183–202 (2009)
Bertsekas, D.: Constrained optimization and lagrange multiplier methods. Academic Press (1982)
Boţ, R. I., Csetnek, E. R., Nguyen, D. K.: Fast augmented Lagrangian method in the convex regime with convergence guarantees for the iterates. arXiv:2111.09370 (2021)
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3, 1–122 (2010)
Candes, E. J., Wakin, M. B.: An introduction to compressive sampling. IEEE Signal. Proc. Mag. 25(2), 21–30 (2008)
Chen, S. S., Donoho, D. L., Saunders, M. A.: Atomic decomposition by basis pursuit. SIAM Rev. 43(1), 129–159 (2001)
Chen, G., Teboulle, M.: A proximal-based decomposition method for convex minimization problems. Math. Program. 64(1-3), 81–101 (1994)
Goldfarb, D., Ma, S., Scheinberg, K.: Fast alternating linearization methods for minimizing the sum of two convex functions. Math. Program. 141 (1-2), 349–382 (2013)
Gu, G., He, B., Yuan, X.: Customized proximal point algorithms for linearly constrained convex minimization and saddle-point problems: a unified approach. Comput. Optim. Appl. 59(1-2), 135–161 (2014)
He, X., Hu, R., Fang, Y. P.: Convergence rates of inertial primal-dual dynamical methods for separable convex optimization problems. SIAM J. Control Optim. 59(5), 3278–3301 (2021)
He, B., Yuan, X.: On the acceleration of augmented Lagrangian method for linearly constrained optimization. Optimization online http://www.optimization-online.org/DB_FILE/2010/10/2760.pdf (2010)
Huang, B., Ma, S., Goldfarb, D.: Accelerated linearized Bregman method. J. Sci. Comput. 54(2), 428–453 (2013)
Kang, M., Yun, S., Woo, H., Kang, M.: Accelerated Bregman method for linearly constrained ℓ1 − ℓ2 minimization. J. Sci. Comput. 56(3), 515–534 (2013)
Kang, M., Kang, M., Jung, M.: Inexact accelerated augmented Lagrangian methods. Comput. Optim. Appl. 62(2), 373–404 (2015)
Lin, Z., Li, H., Fang, C.: Accelerated optimization for machine learning. Springer, Singapore (2019)
Li, H., Lin, Z.: Accelerated alternating direction method of multipliers: An optimal \(\mathcal {O}(1/k)\) nonergodic analysis. J. Sci. Comput. 79(2), 671–699 (2019)
Ma, F., Ni, M.: A class of customized proximal point algorithms for linearly constrained convex optimization. Comput. Appl. Math. 37(2), 896–911 (2018)
Madan, R., Lall, S.: Distributed algorithms for maximum lifetime routing in wireless sensor networks. IEEE Trans Wirel. Commun. 5(8), 2185–2193 (2006)
Nedic, A., Ozdaglar, A.: Cooperative Distributed Multi-agent. Convex Optimization in Signal Processing and Communications. Cambridge University Press (2010)
Liu, Y. F., Liu, X., Ma, S.: On the nonergodic convergence rate of an inexact augmented Lagrangian framework for composite convex programming. Math. Oper. Res. 44(2), 632–650 (2019)
Nesterov, Y.: A method of solving a convex programming problem with convergence rate \(\mathcal {O}(1/k^{2})\). Sov. Math. Dokl. 27, 372–376 (1983)
Nesterov, Y.: Gradient methods for minimizing composite functions. Math. Program. 140(1), 125–161 (2013)
Nocedal, J., Wright, S.: Numerical optimization. Springer Science and Business Media (2006)
Shi, G., Johansson, K. H.: Randomized optimal consensus of multi-agent systems. Automatica 48(12), 3018–3030 (2012)
Su, W., Boyd, S., Candes, E. J.: A differential equation for modeling Nesterov’s accelerated gradient method: theory and insights. J. Mach. Learn. Res. 17(1), 5312–5354 (2016)
Tran-Dinh, Q., Zhu, Y.: Non-stationary first-Order primal-Dual algorithms with fast convergence rates. SIAM J. Optim. 30(4), 2866–2896 (2020)
Van Den Berg, E., Friedlander, M. P.: Probing the Pareto frontier for basis pursuit solutions. SIAM J. Sci. Comput. 31(2), 890–912 (2009)
Xu, Y.: Accelerated first-order primal-dual proximal methods for linearly constrained composite convex programming. SIAM J. Optim. 27(3), 1459–1484 (2017)
Yin, W., Osher, S., Goldfarb, D., Darbon, J.: Bregman iterative algorithms for ℓ1-minimization with applications to compressed sensing. SIAM J. Imaging Sci. 1(1), 143–168 (2008)
Zeng, X., Lei, J., Chen, J.: Dynamical primal-dual accelerated method with applications to network optimization. arXiv:1912.03690 (2019)
Acknowledgements
The authors would like to thank the referees and the editor for their helpful comments and suggestions, which have led to the improvement of this paper.
Funding
This work was supported by the National Natural Science Foundation of China (11471230).
He, X., Hu, R. & Fang, YP. Inertial accelerated primal-dual methods for linear equality constrained convex optimization problems. Numer Algor 90, 1669–1690 (2022). https://doi.org/10.1007/s11075-021-01246-y
Keywords
- Inertial accelerated primal-dual method
- Linear equality constrained convex optimization problem
- \(\mathcal {O}(1/k^{2})\) convergence rate
- Inexactness