Abstract
The alternating direction method of multipliers (ADMM) is widely used in solving structured convex optimization problems. Despite its success in practice, the convergence of the standard ADMM for minimizing the sum of \(N\,(N\geqslant 3)\) convex functions, whose variables are linked by linear constraints, remained unclear for a long time. Recently, Chen et al. (Math Program, doi:10.1007/s10107-014-0826-5, 2014) provided a counter-example showing that the ADMM for \(N\geqslant 3\) may fail to converge without further conditions. Since the ADMM for \(N\geqslant 3\) has been very successful when applied to many problems arising from real practice, it is worth investigating under what sufficient conditions it can be guaranteed to converge. In this paper, we present such sufficient conditions, which guarantee a sublinear convergence rate of the ADMM for \(N\geqslant 3\). Specifically, we show that if one of the functions is convex (not necessarily strongly convex), the other \(N-1\) functions are strongly convex, and the penalty parameter lies in a certain region, then the ADMM converges with rate \(O(1/t)\) in a certain ergodic sense and \(o(1/t)\) in a certain non-ergodic sense, where \(t\) denotes the number of iterations. As a by-product, we also provide a simple proof of the \(O(1/t)\) convergence rate of the two-block ADMM in terms of both objective error and constraint violation, without imposing any condition on the penalty parameter or strong convexity on the functions.
1 Introduction
We consider solving the following multi-block convex minimization problem:
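\(\min\limits_{x_1,\cdots ,x_N}\ \sum\limits_{i=1}^{N} f_i(x_i) \quad \mathrm{s.t.}\quad \sum\limits_{i=1}^{N} A_i x_i = b,\quad x_i\in \mathcal {X}_i,\ i=1,\cdots ,N,\)  (1.1)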
where \(A_i \in \mathbb {R}^{p\times n_i}\), \(b\in \mathbb {R}^p\), \(\mathcal {X}_i\subset \mathbb {R}^{n_i}\) are closed convex sets, and \(f_i:\mathbb {R}^{n_i}\rightarrow \mathbb {R}\) are closed convex functions. A recently popular way to solve (1.1), when the functions \(f_i\) have special structures, is to apply the alternating direction method of multipliers (ADMM) [1, 2]. The ADMM is closely related to the Douglas–Rachford [3] and Peaceman–Rachford [4] operator splitting methods that date back to the 1950s; these operator splitting methods were further studied in [5–8]. The ADMM has been revisited recently due to its success in solving problems with special structures arising from compressed sensing, machine learning, image processing, and so on; see the recent survey papers [9, 10] for more information.
The ADMM for solving (1.1) is based on the augmented Lagrangian framework. The augmented Lagrangian function for (1.1) is defined as
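\(\mathcal {L}_{\gamma }(x_1,\cdots ,x_N;\lambda ) := \sum\limits_{i=1}^{N} f_i(x_i) - \left\langle \lambda ,\ \sum\limits_{i=1}^{N} A_i x_i - b \right\rangle + \frac{\gamma }{2}\left\Vert \sum\limits_{i=1}^{N} A_i x_i - b\right\Vert ^2,\)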
where \(\lambda \) is the Lagrange multiplier and \(\gamma > 0\) is a penalty parameter. In a typical iteration of the standard ADMM for solving (1.1), the following updating procedure is implemented:
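\(\left\{ \begin{array}{l} x_i^{k+1} := \mathop {\mathrm{argmin}}\limits_{x_i\in \mathcal {X}_i}\ \mathcal {L}_{\gamma }\left( x_1^{k+1},\cdots ,x_{i-1}^{k+1},x_i,x_{i+1}^{k},\cdots ,x_N^{k};\lambda ^{k}\right) ,\quad i=1,\cdots ,N,\\ \lambda ^{k+1} := \lambda ^{k} - \gamma \left( A_1x_1^{k+1}+A_2x_2^{k+1}+\cdots +A_Nx_N^{k+1}-b\right) . \end{array}\right.\)  (1.2)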
The ADMM (1.2) for solving two-block convex minimization problems (i.e., \(N=2\)) has been studied extensively in the literature. The global convergence of ADMM (1.2) when \(N=2\) has been shown in [11, 12]. There are also some very recent works that study the convergence rate properties of ADMM when \(N=2\) (see, e.g., [13–18]).
However, the convergence of ADMM (1.2) when \(N\geqslant 3\) had remained unclear for a long time. In a recent work by Chen et al. [19], a counter-example was constructed that shows the failure of ADMM (1.2) when \(N\geqslant 3\). Since the ADMM (1.2) for \(N\geqslant 3\) has been successfully applied to solve many problems arising from real practice (see, e.g., [20, 21]), it is worth investigating under what sufficient conditions the ADMM (1.2) converges. Moreover, it has been observed by many researchers that the ADMM (1.2) often outperforms all its modified versions (see the observations in [22, 23]). In fact, Sun, Toh, and Yang made the following statement in [22]: “However, to the best of our knowledge, up to now the dilemma is that at least for convex conic programming, the modified versions though with convergence guarantee, often perform 2-3 times slower than the multi-block ADMM with no convergent guarantee.” There is thus a strong need to further study sufficient conditions that can guarantee the convergence of (1.2). It was shown by Han and Yuan in [24] that ADMM (1.2) globally converges if all the functions \(f_1,\cdots ,f_N\) are assumed to be strongly convex and the penalty parameter \(\gamma \) is smaller than a certain bound. Chen, Shen, and You [25] showed that the 3-block ADMM [i.e., \(N=3\) in (1.2)] globally converges if \(A_1\) is injective, \(f_2\) and \(f_3\) are strongly convex, and \(\gamma \) is smaller than a certain bound. After we released our work (see Footnote 1), Cai, Han, and Yuan [26] and Li, Sun, and Toh [27] independently proved that when \(N=3\), the ADMM (1.2) converges under the conditions that one function among \(f_1\), \(f_2\), and \(f_3\) is strongly convex and \(\gamma \) is smaller than a certain bound. Davis and Yin [28] studied a variant of the 3-block ADMM (see Algorithm 8 in [28]), which requires that \(f_1\) be strongly convex and \(\gamma \) be smaller than a certain bound to guarantee convergence. Recently, Lin, Ma, and Zhang [29] proposed several alternative approaches to ensure the sublinear convergence rate of (1.2) without requiring any function to be strongly convex. Furthermore, Lin, Ma, and Zhang [30] proved that the 3-block ADMM is globally convergent for any \(\gamma >0\) when it is applied to solve the so-called regularized least-squares decomposition problems. In a recent work by Hong and Luo [31], a variant of ADMM (1.2) with a small step size in updating the Lagrange multiplier was studied. Specifically, [31] proposed to replace the last equation in (1.2) by
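\(\lambda ^{k+1} := \lambda ^{k} - \alpha \gamma \left( A_1x_1^{k+1}+A_2x_2^{k+1}+\cdots +A_Nx_N^{k+1}-b\right) ,\)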
where \(\alpha >0\) is a small step size. Linear convergence of this variant is proved under the assumption that the objective function satisfies certain error bound conditions. However, the admissible range of \(\alpha \) is determined by parameters associated with the error bound conditions, which might make \(\alpha \) difficult to choose in practice. There are also studies on the convergence rates of some other variants of ADMM (1.2), and we refer the interested readers to [32–36] for details of these variants. In this paper, we focus on the ADMM (1.2) that directly extends the two-block ADMM to problems with more than two block variables.
1.1 Our Contributions
The main contribution of this paper is as follows. We show that the ADMM (1.2) with \(N\geqslant 3\) converges with rate \(O(1/t)\) in an ergodic sense and \(o(1/t)\) in a non-ergodic sense, under the assumption that \(f_2,\cdots ,f_N\) are strongly convex, \(f_1\) is convex but not necessarily strongly convex, and \(\gamma \) is smaller than a certain bound. It should be pointed out that our assumption is weaker than the one used in [24], in which all the functions are required to be strongly convex. Moreover, unlike the sufficient condition suggested in [19], we do not make any assumption on the matrices \(A_1,\cdots ,A_N\). To the best of our knowledge, the convergence rate results given in this paper are the first sublinear convergence rate results for the standard ADMM (1.2) with \(N\geqslant 3\). We also remark that, by further assuming additional conditions, we proved the global linear convergence of ADMM (1.2) in [37].
1.2 Organization
The rest of this paper is organized as follows. In Sect. 2, we provide some preliminaries for our convergence rate analysis. In Sect. 3, we prove the convergence rate of ADMM (1.2) in the ergodic sense. In Sect. 4, we prove the convergence rate of ADMM (1.2) in the non-ergodic sense. Section 5 draws some conclusions and points out some future directions.
2 Preliminaries
We prove the convergence results of ADMM only for \(N=3\), because the analysis extends easily to arbitrary \(N\). Thus, for ease of presentation and succinctness, we assume \(N=3\) in the rest of this paper; the results for general \(N\) are stated without proof.
We restate the problem (1.1) for \(N=3\) as
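\(\min\limits_{x_1,x_2,x_3}\ f_1(x_1)+f_2(x_2)+f_3(x_3) \quad \mathrm{s.t.}\quad A_1x_1+A_2x_2+A_3x_3=b,\quad x_i\in \mathcal {X}_i,\ i=1,2,3.\)  (2.1)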
The ADMM for solving (2.1) can be summarized as (note that some constant terms in the three subproblems are discarded):
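\(x_1^{k+1} := \mathop {\mathrm{argmin}}\limits_{x_1\in \mathcal {X}_1}\ f_1(x_1)+\frac{\gamma }{2}\left\Vert A_1x_1+A_2x_2^{k}+A_3x_3^{k}-b-\lambda ^{k}/\gamma \right\Vert ^2,\)  (2.2)
\(x_2^{k+1} := \mathop {\mathrm{argmin}}\limits_{x_2\in \mathcal {X}_2}\ f_2(x_2)+\frac{\gamma }{2}\left\Vert A_1x_1^{k+1}+A_2x_2+A_3x_3^{k}-b-\lambda ^{k}/\gamma \right\Vert ^2,\)  (2.3)
\(x_3^{k+1} := \mathop {\mathrm{argmin}}\limits_{x_3\in \mathcal {X}_3}\ f_3(x_3)+\frac{\gamma }{2}\left\Vert A_1x_1^{k+1}+A_2x_2^{k+1}+A_3x_3-b-\lambda ^{k}/\gamma \right\Vert ^2,\)  (2.4)
\(\lambda ^{k+1} := \lambda ^{k}-\gamma \left( A_1x_1^{k+1}+A_2x_2^{k+1}+A_3x_3^{k+1}-b\right) .\)  (2.5)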
The first-order optimality conditions for (2.2)–(2.4) are given, respectively, by \(x_{i}^{k+1}\in \mathcal {X}_{i}, i=1,2,3\), and
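\(\left( x_1-x_1^{k+1}\right) ^{\mathrm{T}}\left( g_1(x_1^{k+1})-A_1^{\mathrm{T}}\lambda ^{k}+\gamma A_1^{\mathrm{T}}\left( A_1x_1^{k+1}+A_2x_2^{k}+A_3x_3^{k}-b\right) \right) \geqslant 0,\quad \forall x_1\in \mathcal {X}_1,\)  (2.6)
\(\left( x_2-x_2^{k+1}\right) ^{\mathrm{T}}\left( g_2(x_2^{k+1})-A_2^{\mathrm{T}}\lambda ^{k}+\gamma A_2^{\mathrm{T}}\left( A_1x_1^{k+1}+A_2x_2^{k+1}+A_3x_3^{k}-b\right) \right) \geqslant 0,\quad \forall x_2\in \mathcal {X}_2,\)  (2.7)
\(\left( x_3-x_3^{k+1}\right) ^{\mathrm{T}}\left( g_3(x_3^{k+1})-A_3^{\mathrm{T}}\lambda ^{k}+\gamma A_3^{\mathrm{T}}\left( A_1x_1^{k+1}+A_2x_2^{k+1}+A_3x_3^{k+1}-b\right) \right) \geqslant 0,\quad \forall x_3\in \mathcal {X}_3,\)  (2.8)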
where \(g_i \in \partial f_i\) denotes a subgradient of \(f_i\) for \(i=1,2,3\). Moreover, by combining them with (2.5), the conditions (2.6)–(2.8) can be rewritten as
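\(\left( x_1-x_1^{k+1}\right) ^{\mathrm{T}}\left( g_1(x_1^{k+1})-A_1^{\mathrm{T}}\lambda ^{k+1}+\gamma A_1^{\mathrm{T}}\left( A_2(x_2^{k}-x_2^{k+1})+A_3(x_3^{k}-x_3^{k+1})\right) \right) \geqslant 0,\quad \forall x_1\in \mathcal {X}_1,\)  (2.9)
\(\left( x_2-x_2^{k+1}\right) ^{\mathrm{T}}\left( g_2(x_2^{k+1})-A_2^{\mathrm{T}}\lambda ^{k+1}+\gamma A_2^{\mathrm{T}}A_3(x_3^{k}-x_3^{k+1})\right) \geqslant 0,\quad \forall x_2\in \mathcal {X}_2,\)  (2.10)
\(\left( x_3-x_3^{k+1}\right) ^{\mathrm{T}}\left( g_3(x_3^{k+1})-A_3^{\mathrm{T}}\lambda ^{k+1}\right) \geqslant 0,\quad \forall x_3\in \mathcal {X}_3.\)  (2.11)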
We denote \(\Omega = \mathcal {X}_1\times \mathcal {X}_2 \times \mathcal {X}_3 \times \mathbb {R}^p\) and the optimal set of (2.1) as \(\Omega ^*\), and the following assumption is made throughout this paper.
Assumption 2.1
The optimal set \(\Omega ^*\) for problem (2.1) is non-empty.
According to the first-order optimality conditions for (2.1), solving (2.1) is equivalent to finding
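\(\left( x_1^*,x_2^*,x_3^*,\lambda ^*\right) \in \Omega \)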
such that the following holds:
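\(\left\{ \begin{array}{l} \left( x_i-x_i^{*}\right) ^{\mathrm{T}}\left( g_i(x_i^{*})-A_i^{\mathrm{T}}\lambda ^{*}\right) \geqslant 0,\quad \forall x_i\in \mathcal {X}_i,\ i=1,2,3,\\ A_1x_1^{*}+A_2x_2^{*}+A_3x_3^{*}=b,\end{array}\right.\)  (2.12)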
where \(g_i(x_i^*)\in \partial f_i(x_i^*)\), \(i=1,2,3\).
Furthermore, the following condition is assumed in our subsequent analysis.
Assumption 2.2
The functions \(f_2\) and \(f_3\) are strongly convex with parameters \(\sigma _2>0\) and \(\sigma _3>0\), respectively; i.e., the following two inequalities hold:
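\(f_2(y)\geqslant f_2(x)+g_2(x)^{\mathrm{T}}(y-x)+\frac{\sigma _2}{2}\Vert y-x\Vert ^2,\quad \forall \, x,y,\)  (2.13)
\(f_3(y)\geqslant f_3(x)+g_3(x)^{\mathrm{T}}(y-x)+\frac{\sigma _3}{2}\Vert y-x\Vert ^2,\quad \forall \, x,y,\)  (2.14)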
or equivalently,
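\(\left( g_2(x)-g_2(y)\right) ^{\mathrm{T}}(x-y)\geqslant \sigma _2\Vert x-y\Vert ^2,\quad \forall \, x,y,\)  (2.15)
\(\left( g_3(x)-g_3(y)\right) ^{\mathrm{T}}(x-y)\geqslant \sigma _3\Vert x-y\Vert ^2,\quad \forall \, x,y,\)  (2.16)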
where \(g_2(x)\in \partial f_2(x)\) and \(g_3(x) \in \partial f_3(x)\) are the subgradients of \(f_2\) and \(f_3\), respectively.
In our analysis, the following well-known identity is used frequently:
Notation
For simplicity, we use the following notation to denote the stacked vectors or tuples:
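\(u:=(x_{1},x_{2},x_{3}),\quad u^{k}:=(x_{1}^{k},x_{2}^{k},x_{3}^{k}),\quad u^{*}:=(x_{1}^{*},x_{2}^{*},x_{3}^{*}).\)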
We denote the objective function of problem (2.1) by \(f(u)\equiv f_1(x_{1})+f_{2}(x_{2})+f_{3}(x_{3})\); \(g_i\) is a subgradient of \(f_i\); \(\lambda _{\max }(B)\) denotes the largest eigenvalue of a real symmetric matrix B; \(\Vert x\Vert \) denotes the Euclidean norm of x.
3 Ergodic Convergence Rate of ADMM
In this section, we prove the \(O(1/t)\) convergence rate of ADMM (2.2)–(2.5) in the ergodic sense.
Lemma 3.1
Assume that \(\gamma \leqslant \min \left\{ \frac{\sigma _2}{2\lambda _{\max }(A_{2}^{\mathrm{T}}A_{2})}, \frac{\sigma _3}{2\lambda _{\max }(A_{3}^{\mathrm{T}}A_{3})}\right\} \), where \(\sigma _2\) and \(\sigma _3\) are defined in Assumption 2.2. Let \((x_1^{k+1},x_2^{k+1},x_3^{k+1},\lambda ^{k+1})\in \Omega \) be generated by ADMM from given \((x_{2}^{k},x_{3}^{k},\lambda ^{k})\). Then, for any primal optimal solution \(u^*=(x_1^*,x_2^*,x_3^*)\) of (2.1) and \(\lambda \in \mathbb {R}^p\), it holds that
Proof
Note that combining (2.9)–(2.11) yields
The key step in our proof is to bound the following two terms:
For the first term, we have
where in the second equality we used the identity (2.17), and the last equality follows from the updating formula for \(\lambda ^{k+1}\) in (2.5).
For the second term, we have
where in the second equality we applied the identity (2.17).
Therefore, we have
Combining (3.3), (3.2), and (2.5), it holds for any \(\lambda \in \mathbb {R}^p\) that
Using the convexity of \(f_1\) and the identity
letting \(u=u^{*}\) in (3.4), and applying the facts that [invoking (2.13) and (2.14)]
and
we obtain
This together with the facts that \(\gamma \lambda _{\max }(A_{2}^{\mathrm{T}}A_{2})-\frac{\sigma _2}{2} \leqslant 0\) and \(\gamma \lambda _{\max }(A_{3}^{\mathrm{T}}A_{3})-\frac{\sigma _3}{2}\leqslant 0\) implies the desired inequality (3.1).
Now, we are ready to present the \(O(1/t)\) ergodic convergence rate of the ADMM.
Theorem 3.2
Assume that \(\gamma \leqslant \min \left\{ \frac{\sigma _2}{2\lambda _{\max }(A_{2}^{\mathrm{T}}A_{2})}, \frac{\sigma _3}{2\lambda _{\max }(A_{3}^{\mathrm{T}}A_{3})}\right\} \). Let \((x_1^{k+1},x_2^{k+1},x_3^{k+1},\lambda ^{k+1})\in \Omega \) be generated by ADMM (2.2)–(2.5) from given \((x_2^{k},x_3^{k},\lambda ^{k})\). For any integer \(t>0\), let \(\bar{u}^{t}=(\bar{x}_{1}^{t}, \bar{x}_{2}^{t}, \bar{x}_{3}^{t})\) and \(\bar{\lambda }^{t}\) be defined as
Then, for any \((u^*,\lambda ^*) \in \Omega ^{*}\), by defining \(\rho :=\Vert \lambda ^*\Vert +1\), we have
Note that this also implies that both the error of the objective function value and the residual of the equality constraint converge to 0 with convergence rate \(O(1/t)\), i.e.,
Proof
Because \((u^{k},\lambda ^k)\in \Omega \), it holds that \((\bar{u}^{t},\bar{\lambda }^t)\in \Omega \) for all \(t\geqslant 0\). Using Lemma 3.1, the last equation of (2.12), and invoking the convexity of function \(f(\cdot )\), we have
Note that this inequality holds for all \(\lambda \in \mathbb {R}^{p}\). From weak duality of (2.1), we obtain
which implies that
because \(\rho =\Vert \lambda ^*\Vert +1\). Moreover, by letting \(\lambda :=-\rho (A_1\bar{x}_1^t+A_2\bar{x}_2^t+A_3\bar{x}_3^t-b)/\Vert A_1\bar{x}_1^t+A_2\bar{x}_2^t+A_3\bar{x}_3^t-b\Vert _2\) in (3.6), and using \(A_1x^*_1+A_2x^*_2+A_3x^*_3=b\), we obtain
We now define the function
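\(v(\xi ):=\min \left\{ f_1(x_1)+f_2(x_2)+f_3(x_3)\ :\ A_1x_1+A_2x_2+A_3x_3-b=\xi ,\ x_i\in \mathcal {X}_i,\ i=1,2,3\right\} .\)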
It is easy to verify that v is convex, \(v(0)=f(u^*)\), and \(\lambda ^* \in \partial v(0)\). Therefore, from the convexity of v, it holds that
Letting \(\bar{\xi }^t = A_1 \bar{x}_1^t + A_2 \bar{x}_2^t + A_3\bar{x}_3^t - b\), we have \(f(\bar{u}^t) \geqslant v(\bar{\xi }^t)\). Therefore, by denoting the constant
and combining (3.7), (3.8), and (3.9), we get
which, by using \(\rho =\Vert \lambda ^*\Vert +1\), yields
Moreover, by combining (3.7), (3.8), and (3.10), one obtains that
As a result, (3.5) follows immediately from (3.10) and (3.11).
Therefore, we have established the \(O(1/t)\) convergence rate of the ADMM (2.2)–(2.5) in an ergodic sense. Our proof extends readily to the \(N\)-block ADMM (1.2). The following theorem gives the \(O(1/t)\) convergence rate of the \(N\)-block ADMM (1.2); we omit the proof for the sake of succinctness.
Theorem 3.3
Assume that
where \(\sigma _{i}\) is the strong convexity parameter of \(f_{i}\), \(i=2,\cdots ,N\). Let \((x_{1}^{k+1},x_{2}^{k+1},x_{3}^{k+1},\cdots ,x_{N}^{k+1}, \lambda ^{k+1})\in \Omega \) be generated by the N-block ADMM (1.2). For any integer \(t>0\), we define
Then, for \(\rho :=\Vert \lambda ^*\Vert +1\), it holds that
Similarly to Theorem 3.2, this also implies that the \(N\)-block ADMM (1.2) converges with rate \(O(1/t)\) in terms of both the error of the objective function value and the residual of the equality constraints, i.e., it holds that
4 Non-ergodic Convergence Rate of ADMM
In this section, we prove an \(o(1/k)\) non-ergodic convergence rate for ADMM (2.2)–(2.5).
Let us first observe the following (see also Lemma 4.1 in [24]). Suppose at the \((k+1)\)-th iteration of ADMM (2.2)–(2.5), we have
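\(A_2\left( x_2^{k}-x_2^{k+1}\right) =0,\quad A_3\left( x_3^{k}-x_3^{k+1}\right) =0,\quad \lambda ^{k}-\lambda ^{k+1}=0.\)  (4.1)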
Then, (2.9)–(2.11) would immediately lead to
In other words, if (4.1) is satisfied, then \((x_1^{k+1},x_2^{k+1},x_3^{k+1},\lambda ^{k+1})\) is already an optimal solution of (2.1). It is therefore natural to introduce a residual of the system (4.1) as an optimality measure. Below is such a measure, denoted by \(R_{k+1}\):
In the sequel, we will show that \(R_k\) converges to 0 at the rate \(o(1/k)\). Note that this gives the convergence rate of ADMM (2.2)–(2.5) in a non-ergodic sense.
We first show that \(R_k\) is non-increasing.
Lemma 4.1
Assume \(\gamma \leqslant \min \{\frac{\sigma _2}{\lambda _{\max }(A_{2}^{\mathrm{T}}A_{2})}, \frac{\sigma _3}{\lambda _{\max }(A_{3}^{\mathrm{T}}A_{3})}\}\). Let the sequence \(\{x_1^k,x_2^k,x_3^k,\lambda ^k\}\) be generated by ADMM (2.2)–(2.5). It holds that \(R_k\) defined in (4.2) is non-increasing, i.e.,
Proof
Letting \(x_{1}=x_{1}^{k}\) in (2.6) yields
with \(g_1 \in \partial f_1\), which further implies that
where the last equality is due to the identity (2.17). Letting \(x_{1}=x_{1}^{k+1}\) in (2.9) with \(k+1\) changed to k yields
which further implies that
Combining (4.4) and (4.5) gives
Letting \(x_{2}=x_{2}^{k}\) in (2.7) yields
which further implies that
where the last equality is due to the identity (2.17). Letting \(x_{2}=x_{2}^{k+1}\) in (2.10) with \(k+1\) changed to k yields
which further implies that
Combining (4.7) and (4.8) gives
Letting \(x_3=x_3^k\) in (2.11) and \(x_3=x_3^{k+1}\) in (2.11) with \(k+1\) changed to k, and adding the two resulting inequalities, yields
where the last equality is due to the identity (2.17).
Combining (4.6), (4.9), and (4.10) yields
Note that (2.15) and (2.16) imply that
Combining (4.11) and (4.12), and the fact that \(\gamma \leqslant \min \left\{ \frac{\sigma _2}{\lambda _{\max }(A_{2}^{\mathrm{T}}A_{2})}, \frac{\sigma _3}{\lambda _{\max }(A_{3}^{\mathrm{T}}A_{3})}\right\} \), it is easy to see that \(R_{k+1} \leqslant R_k\) for \(k=0,1,2,\cdots \).
We are now ready to present the \(o(1/k)\) non-ergodic convergence rate of the ADMM (2.2)–(2.5).
Theorem 4.2
Assume \(\gamma \leqslant \min \left\{ \frac{\sigma _2}{2\lambda _{\max }(A_{2}^{\mathrm{T}}A_{2})}, \frac{\sigma _3}{2\lambda _{\max }(A_{3}^{\mathrm{T}}A_{3})}\right\} \). Let the sequence \(\{x_1^{k},x_2^k,x_3^k,\lambda ^k\}\) be generated by ADMM (2.2)–(2.5). Then \(\sum _{k=1}^\infty R_k < +\infty \) and \(R_k = o(1/k)\).
Proof
Combining (4.9) and (4.10) yields
Using (4.12) and the assumption that \(\gamma \leqslant \min \left\{ \frac{\sigma _2}{2\lambda _{\max }(A_{2}^{\mathrm{T}}A_{2})}, \frac{\sigma _3}{2\lambda _{\max }(A_{3}^{\mathrm{T}}A_{3})}\right\} \), we obtain
From the optimality conditions (2.12) and the convexity of f, it follows that
By combining (3.1) and (4.14), we have
Note that the first term in (4.15) is equal to
Therefore, (4.15) can be rearranged as
By (4.13) and (4.16), we get that
Note that we have proved that \(R_k\) is monotonically non-increasing, and \(\sum _{k=1}^\infty R_k<+\infty \). As observed in Lemma 1.2 of [32], one has
and therefore \(R_k = o(1/k)\).
Remark 4.3
We remark here that using similar arguments, it is easy to see that (4.13) and (4.16) together with the monotonicity of \(R_k\) also imply that \(R_k\) has a non-asymptotic sublinear convergence rate \(O(1/k)\).
Note that our analysis can be extended to N-block ADMM (1.2) easily. The results are summarized in the following theorem, and the proof is omitted for the sake of succinctness.
Theorem 4.4
Assume that
Let \((x_{1}^{k+1},x_{2}^{k+1},x_{3}^{k+1},\cdots ,x_{N}^{k+1},\lambda ^{k+1})\in \Omega \) be generated by ADMM (1.2). Then \(\sum _{k=1}^\infty R_k < +\infty \) and \(R_k=o(1/k)\), where \(R_k\) is defined as
5 Conclusions
In this paper, we analyzed the sublinear convergence rate of the standard Gauss–Seidel multi-block ADMM in both the ergodic and the non-ergodic sense. These are the first sublinear convergence rate results for the standard multi-block ADMM. Using the techniques developed in this paper, we can also analyze the convergence rates of some variants of the standard multi-block ADMM, such as the ones studied in [32] and [33], where the primal variables are updated in a Jacobi manner; we plan to pursue this direction of research in the future.
We remark here that the techniques developed in this paper lead to a very simple proof of the \(O(1/t)\) complexity of the two-block ADMM in terms of the objective error and constraint violation of (1.1) (\(N=2\)). Specifically, when \(N=2\), let \((x_1^k,x_2^k;\lambda ^k)\) denote the iterate generated by the two-block ADMM (1.2), and define
We can prove that
i.e., the convergence rate of the two-block ADMM is \(O(1/t)\) in terms of both objective error and constraint violation. Note that for \(N=2\), \(\gamma \) can be any positive number and there is no need to impose strong convexity on either \(f_1\) or \(f_2\). The proof of this result is as follows.
First, when \(N=2\), the optimality conditions (2.9)–(2.11) reduce to
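\(\left( x_1-x_1^{k+1}\right) ^{\mathrm{T}}\left( g_1(x_1^{k+1})-A_1^{\mathrm{T}}\lambda ^{k+1}+\gamma A_1^{\mathrm{T}}A_2\left( x_2^{k}-x_2^{k+1}\right) \right) \geqslant 0,\quad \forall x_1\in \mathcal {X}_1,\)  (5.2)
\(\left( x_2-x_2^{k+1}\right) ^{\mathrm{T}}\left( g_2(x_2^{k+1})-A_2^{\mathrm{T}}\lambda ^{k+1}\right) \geqslant 0,\quad \forall x_2\in \mathcal {X}_2.\)  (5.3)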
Therefore, by letting \(x_1=x_1^*\) in (5.2), \(x_2=x_2^*\) in (5.3), and using the convexity of \(f_1\) and \(f_2\), we have
where the second equality is due to (2.17). Thus for any \(\lambda \in \mathbb {R}^p\), it holds that,
Summing (5.4) over \(k=0,1,\ldots ,t\) yields,
Based on the above bound, the error analysis for both the objective and the residual follows the same line of arguments as the proof of Theorem 3.2.
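As a purely illustrative complement (not part of the analysis above), the multi-block scheme (1.2) can be prototyped numerically. The following minimal sketch assumes the toy instance \(f_i(x_i)=\frac{1}{2}\Vert x_i-c_i\Vert ^2\) with \(\mathcal {X}_i=\mathbb {R}^{n_i}\), so that each \(\sigma _i=1\); all data, dimensions, and variable names below are made up for illustration only.

import numpy as np

# Toy data (illustrative only): N strongly convex quadratic blocks.
rng = np.random.default_rng(0)
N, p, n = 3, 8, 5                      # number of blocks, constraint rows, block size
A = [rng.standard_normal((p, n)) for _ in range(N)]
c = [rng.standard_normal(n) for _ in range(N)]
b = rng.standard_normal(p)

# Penalty parameter chosen (conservatively) within the bound of Theorem 3.2,
# using sigma_i = 1 for every block in this toy instance.
gamma = min(1.0 / (2.0 * np.linalg.eigvalsh(Ai.T @ Ai).max()) for Ai in A)

x = [np.zeros(n) for _ in range(N)]
lam = np.zeros(p)
for k in range(500):
    # Gauss-Seidel sweep: block i uses the newest values of blocks 1,...,i-1.
    for i in range(N):
        r = sum(A[j] @ x[j] for j in range(N) if j != i) - b
        rhs = c[i] + A[i].T @ lam - gamma * A[i].T @ r
        x[i] = np.linalg.solve(np.eye(n) + gamma * A[i].T @ A[i], rhs)
    # Multiplier update, cf. (2.5).
    lam = lam - gamma * (sum(A[j] @ x[j] for j in range(N)) - b)

print("constraint violation:", np.linalg.norm(sum(A[j] @ x[j] for j in range(N)) - b))

With quadratic blocks, each subproblem reduces to a small positive definite linear system; monitoring the constraint violation (and the objective error when a reference solution is available) over the iterations illustrates the sublinear behaviour established above.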
Notes
Preprint available at http://arxiv.org/abs/1408.4265.
References
Gabay, D., Mercier, B.: A dual algorithm for the solution of nonlinear variational problems via finite-element approximations. Comp. Math. Appl. 2, 17–40 (1976)
Glowinski, R., Marrocco, A.: Sur l'approximation par éléments finis et la résolution par pénalisation-dualité d'une classe de problèmes de Dirichlet non linéaires. Revue Française d'Automatique, Informatique, Recherche Opérationnelle, Série Rouge (Analyse Numérique) R-2, pp. 41–76 (1975)
Douglas, J., Rachford, H.H.: On the numerical solution of the heat conduction problem in 2 and 3 space variables. Trans. Am. Math. Soc. 82, 421–439 (1956)
Peaceman, D.H., Rachford, H.H.: The numerical solution of parabolic elliptic differential equations. SIAM J. Appl. Math. 3, 28–41 (1955)
Eckstein, J.: Splitting methods for monotone operators with applications to parallel optimization. PhD thesis, Massachusetts Institute of Technology (1989)
Fortin, M., Glowinski, R.: Augmented Lagrangian Methods: Applications to the Numerical Solution of Boundary-Value Problems. North-Holland Publishing Co., Amsterdam (1983)
Glowinski, R., Le Tallec, P.: Augmented Lagrangian and Operator-Splitting Methods in Nonlinear Mechanics. SIAM, Philadelphia (1989)
Lions, P.L., Mercier, B.: Splitting algorithms for the sum of two nonlinear operators. SIAM J. Numer. Anal. 16, 964–979 (1979)
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)
Eckstein, J.: Augmented Lagrangian and alternating direction methods for convex optimization: A tutorial and some illustrative computational results (2012). Preprint http://www.optimization-online.org/DB_HTML/2012/12/3704.html
Eckstein, J., Bertsekas, D.P.: On the Douglas–Rachford splitting method and the proximal point algorithm for maximal monotone operators. Math. Program. 55, 293–318 (1992)
Gabay, D.: Applications of the method of multipliers to variational inequalities. In: Fortin, M., Glowinski, R. (eds.) Augmented Lagrangian Methods: Applications to the Solution of Boundary Value Problems. North-Holland, Amsterdam (1983)
Boley, D.: Local linear convergence of the alternating direction method of multipliers on quadratic or linear programs. SIAM J. Optim. 23(4), 2183–2207 (2013)
Davis, D., Yin, W.: Faster convergence rates of relaxed Peaceman–Rachford and ADMM under regularity assumptions. Technical report, UCLA CAM Report 14–58 (2014)
Deng, W., Yin, W.: On the global and linear convergence of the generalized alternating direction method of multipliers. J. Sci. Comput. (2015). doi:10.1007/s10915-015-0048-x
He, B., Yuan, X.: On the \({O}(1/n)\) convergence rate of Douglas–Rachford alternating direction method. SIAM J. Numer. Anal. 50, 700–709 (2012)
He, B., Yuan, X.: On nonergodic convergence rate of Douglas–Rachford alternating direction method of multipliers. Numerische Mathematik 130(3), 567–577 (2015)
Monteiro, R.D.C., Svaiter, B.F.: Iteration-complexity of block-decomposition algorithms and the alternating direction method of multipliers. SIAM J. Optim. 23, 475–507 (2013)
Chen, C., He, B., Ye, Y., Yuan, X.: The direct extension of ADMM for multi-block convex minimization problems is not necessarily convergent. Math. Program. (2014). doi:10.1007/s10107-014-0826-5
Peng, Y., Ganesh, A., Wright, J., Xu, W., Ma, Y.: RASL: Robust alignment by sparse and low-rank decomposition for linearly correlated images. IEEE Trans. Pattern Anal. Mach. Intell. 34(11), 2233–2246 (2012)
Tao, M., Yuan, X.: Recovering low-rank and sparse components of matrices from incomplete and noisy observations. SIAM J. Optim. 21, 57–81 (2011)
Sun, D., Toh, K.-C., Yang, L.: A convergent 3-block semiproximal alternating direction method of multipliers for conic programming with 4-type constraints. SIAM J. Optim. 25, 882–915 (2015)
Wang, X., Hong, M., Ma, S., Luo, Z.-Q.: Solving multiple-block separable convex minimization problems using two-block alternating direction method of multipliers (2013). Preprint arXiv:1308.5294
Han, D., Yuan, X.: A note on the alternating direction method of multipliers. J. Optim. Theory Appl. 155(1), 227–238 (2012)
Chen, C., Shen, Y., You, Y.: On the convergence analysis of the alternating direction method of multipliers with three blocks. Abstract and Applied Analysis, Article ID 183961 (2013). doi:10.1155/2013/183961
Cai, X., Han, D., Yuan, X.: The direct extension of ADMM for three-block separable convex minimization models is convergent when one function is strongly convex (2014). Preprint http://www.optimization-online.org/DB_HTML/2014/11/4644.html
Li, M., Sun, D., Toh, K.C.: A convergent 3-block semi-proximal ADMM for convex minimization problems with one strongly convex block. Asia-Pacific J. Oper. Res. 32(3), 1550024 (2015)
Davis, D., Yin, W.: A three-operator splitting scheme and its optimization applications. Technical report, UCLA CAM Report 15–13 (2015)
Lin, T., Ma, S., Zhang, S.: Iteration complexity analysis of multi-block ADMM for a family of convex minimization without strong convexity (2015). Preprint arXiv:1504.03087
Lin, T., Ma, S., Zhang, S.: Global convergence of unmodified 3-block ADMM for a class of convex minimization problems (2015). Preprint arXiv:1505.04252
Hong, M., Luo, Z.: On the linear convergence of the alternating direction method of multipliers (2012). Preprint arXiv:1208.3922
Deng, W., Lai, M., Peng, Z., Yin, W.: Parallel multi-block ADMM with \(o(1/k)\) convergence (2013). Preprint arXiv:1312.3040
He, B., Hou, L., Yuan, X.: On full Jacobian decomposition of the augmented Lagrangian method for separable convex programming (2013). Preprint http://www.optimization-online.org/DB_HTML/2013/05/3894.html
He, B., Tao, M., Yuan, X.: Alternating direction method with Gaussian back substitution for separable convex programming. SIAM J. Optim. 22, 313–340 (2012)
He, B., Tao, M., Yuan, X.: Convergence rate and iteration complexity on the alternating direction method of multipliers with a substitution procedure for separable convex programming (2013). Preprint http://www.optimization-online.org/DB_FILE/2012/09/3611.pdf
Hong, M., Chang, T.-H., Wang, X., Razaviyayn, M., Ma, S., Luo, Z.-Q.: A block successive upper bound minimization method of multipliers for linearly constrained convex optimization (2014). Preprint arXiv:1401.7079
Lin, T., Ma, S., Zhang, S.: On the global linear convergence of the ADMM with multi-block variables. SIAM J. Optim., to appear (2015)
Acknowledgments
We would like to thank the editor and the anonymous referees for carefully reading this paper and for insightful comments.
The research of S.-Q. Ma was supported in part by the Hong Kong Research Grants Council General Research Fund Early Career Scheme (No. CUHK 439513). The research of S.-Z. Zhang was supported in part by the National Science Foundation (No. CMMI 1161242).
Cite this article
Lin, TY., Ma, SQ. & Zhang, SZ. On the Sublinear Convergence Rate of Multi-block ADMM. J. Oper. Res. Soc. China 3, 251–274 (2015). https://doi.org/10.1007/s40305-015-0092-0