1 Introduction

There has been an impressive development on operator splitting methods in the area of partial differential equations, and among them are some alternating direction methods of multipliers (ADMMs for short). In this note, we focus on the Douglas–Rachford ADMM scheme proposed by Glowinski and Marrocco in [7] (see also [5]) and we restrict our discussion into the context of a convex minimization problem with linear constraints and a separable objective function:

$$\begin{aligned} \min \big \{\theta _1(x) + \theta _2(y) \; | \; Ax + By =b, \, x\in \mathcal{X}, y\in \mathcal{Y} \big \}, \end{aligned}$$
(1.1)

where \(A\in \mathfrak {R}^{m\times n_1}\), \(B\in \mathfrak {R}^{m\times n_2}\), \(b\in \mathfrak {R}^m\), \(\mathcal{X}\subset \mathfrak {R}^{n_1}\) and \(\mathcal{Y}\subset \mathfrak {R}^{n_2}\) are closed convex sets, \(\theta _1: \mathfrak {R}^{n_1} \rightarrow \mathfrak {R}\) and \(\theta _2: \mathfrak {R}^{n_2} \rightarrow \mathfrak {R}\) are convex functions (not necessarily smooth). The solution set of (1.1) is assumed to be nonempty, and we refer to [4, 8] for some convergence results without this assumption.

As in [10], in order to treat the original ADMM in [7] and the split inexact Uzawa method in [13] uniformly, we study the following ADMM scheme for (1.1):

$$\begin{aligned} x^{k+1} = \arg \min \bigl \{\theta _1(x)+\frac{\beta }{2}\Vert (Ax+By^k-b)- {1 \over \beta }\lambda ^k\Vert ^2+\frac{1}{2} \Vert x-x^k\Vert _G^2\;\big |\;x\in \mathcal{X} \bigr \}\!,\quad \end{aligned}$$
(1.2a)
$$\begin{aligned} y^{k+1} = \arg \min \bigl \{\theta _2(y)+\frac{\beta }{2}\Vert (Ax^{k+1}+By-b) -{1 \over \beta }\lambda ^k\Vert ^2\;\big |\;y\in \mathcal{Y}\bigr \}\!, \end{aligned}$$
(1.2b)
$$\begin{aligned} \lambda ^{k+1} = \lambda ^k-\beta (Ax^{k+1}+By^{k+1}-b)\!, \end{aligned}$$
(1.2c)

where \(\lambda ^k\in {\mathfrak {R}}^m\) is the Lagrange multiplier, \(\beta > 0\) is a penalty parameter, and \(G \in \mathfrak {R}^{n_1 \times n_1}\) is a symmetric and positive semidefinite matrix. In fact, the original ADMM scheme in [7] and the split inexact Uzawa method in [13] are recovered by taking \(G=0\) and \(G=(r I_{n_1}-\beta A^TA)\) with \(r >\beta \Vert A^TA\Vert \) (where \(\Vert \cdot \Vert \) denotes a matrix norm such as the spectrum norm) in (1.2a), respectively. We refer to review papers [1, 3, 6] and references therein for the history of ADMMs, and in particular, some efficient applications of ADMMs exploited recently.

Because of its impressive efficiency and wide applicability, it is interesting to investigate the convergence rate of the ADMM scheme (1.2). A first-step work is [10] where we showed a worst-case \(O(1/k)\) convergence rate measured by the iteration complexity, where \(k\) denotes the iteration counter, for the ADMM scheme (1.2)Footnote 1. With this first-step result, it becomes possible to investigate more intensive results on the convergence rate of the scheme (1.2). Recall that the convergence rate derived in [10] is in the ergodic sense because the approximate solution with an accuracy of \(O(1/k)\) is found based on all \(k\) iterates generated by (1.2). One may ask if we can establish the same convergence rate directly in a non-ergodic sense for the sequence generated by the scheme (1.2). The main purpose of this note is to answer this question affirmatively. We expect the new technique to be used to analyze convergence rates for other important numerical algorithms of the same kind.

2 Preliminaries

In this section, we provide some preliminaries which are useful in later analysis.

2.1 Notations

We first define some matrices which will simplify the notations in our analysis. More specifically, let

$$\begin{aligned} H= \left( \begin{array}{lll} G &{} 0 &{} 0\\ 0 &{} \beta B^TB &{} 0 \\ 0 &{} 0 &{} \frac{1}{\beta } I_m \end{array} \right) \!, \;\; M= \left( \begin{array}{lll} I_{n_1} &{} 0 &{} 0 \\ 0 &{} I_{n_2} &{} 0 \\ 0 &{} -\beta B &{} I_m \end{array} \right) \!,\;\; Q= \left( \begin{array}{lll} G &{} 0 &{} 0 \\ 0 &{} \beta B^TB&{} 0 \\ 0 &{} -B &{} \frac{1}{\beta } I_m \end{array} \right) \!. \end{aligned}$$
(2.1)

Without further assumption on \(B\), the matrix \(H\) defined above can only be guaranteed as symmetric and positive semidefinite. But we still use the notation \(\Vert w\Vert _H^2\) to represent the non-negative number \(w^THw\) in our analysis. Based on these matrices, some relationship can be easily derived and we summarize them in the following proposition.

Proposition 2.1

Let the matrices \(H\), \(M\) and \(Q\) be defined in (2.1). Then we have

  1. (1)

    \(Q=HM\);

  2. (2)

    The symmetric matrix \((Q^T+Q) - M^THM\) is positive semidefinite: \((Q^T+Q) - M^THM \succeq 0\).

Proof

The first conclusion is trivial, and we omit it. For the second one, we notice that

$$\begin{aligned}&(Q^T+Q) - M^THM =(Q^T+Q) - M^T Q \\&\quad = \left( \begin{array}{lll} 2 G &{} 0 &{} 0 \\ 0 &{} 2\beta B^TB&{} -B^T \\ 0 &{} -B &{} \frac{2}{\beta } I_m \end{array} \right) - \left( \begin{array}{lll} I_{n_1} &{} 0 &{} 0 \\ 0 &{} I_{n_2} &{} -\beta B^T \\ 0 &{} 0 &{} I_m \end{array} \right) \left( \begin{array}{lll} G &{} 0 &{} 0 \\ 0 &{} \beta B^TB&{} 0 \\ 0 &{} -B &{} \frac{1}{\beta } I_m \end{array} \right) \\&\quad = \left( \begin{array}{lll} 2 G &{} 0 &{} 0 \\ 0 &{} 2\beta B^TB&{} -B^T \\ 0 &{} -B &{} \frac{2}{\beta } I_m \end{array} \right) - \left( \begin{array}{lll} G &{} 0 &{} 0 \\ 0 &{} 2\beta B^TB&{} -B^T \\ 0 &{} -B &{} \frac{1}{\beta } I_m \end{array} \right) \\&\quad = \left( \begin{array}{lll} G &{} 0 &{} 0 \\ 0 &{} 0 &{} 0 \\ 0 &{} 0 &{} \frac{1}{\beta } I_m \end{array} \right) \succeq 0. \end{aligned}$$

Thus, the proposition is proved. \(\square \)

2.2 Variational inequality characterization of (1.1)

It is easy to see that (1.1) is characterized by a variational inequality (VI) problem: Find \(w^*=(x^*,y^*,\lambda ^*) \in \Omega :=\mathcal{X} \times \mathcal{Y} \times \mathfrak {R}^m\) such that

$$\begin{aligned} \hbox {VI}(\Omega ,F,\theta )\!: \qquad \theta (u) - \theta (u^*) + (w-w^*)^T F(w^*) \ge 0, \quad \forall \, w\in \Omega , \end{aligned}$$
(2.2a)

where

$$\begin{aligned} u= \left( \begin{array}{l} x\\ y \end{array} \right) \!, \; w = \left( \begin{array}{l} x\\ y\\ \lambda \end{array} \right) \!, \; F(w) = \left( \begin{array}{l} - A^T\lambda \\ - B^T\lambda \\ Ax + By -b \end{array} \right) \;\;\hbox {and}\;\; \theta (u)=\theta _1(x) + \theta _2(y)\!. \end{aligned}$$
(2.2b)

Note that the mapping \(F(w)\) is monotone because it is affine with a skew-symmetric matrix. We denote by \(\Omega ^*\) the solution set of VI\((\Omega ,F,\theta )\). Then, \(\Omega ^*\) is nonempty under the nonempty assumption on the solution set of (1.1).

3 Sketch of Proof

To establish a worst-case \(O(1/k)\) convergence rate for the sequence \(\{w^k\}\) generated by (1.2) in a non-ergodic sense, we need the assertion in the following lemma.

Lemma 3.1

Let the sequence \(\{w^k\}\) be generated by (1.2) and \(H\) be given in (2.1). Then, we have

$$\begin{aligned}&\theta (u) -\theta (u^{k+1}) + (w-w^{k+1})^T \left\{ F(w^{k+1}) + \eta (y^k,y^{k+1}) + H(w^{k+1} - w^k)\right\} \ge 0,\nonumber \\&\quad \forall \, w\in \Omega , \end{aligned}$$
(3.1)

where

$$\begin{aligned} \eta (y^k,y^{k+1}) := \beta \,\left( \!\! \begin{array}{l} A^T \\ B^T \\ 0 \end{array}\!\! \right) B\,(y^k-y^{k+1})\!. \end{aligned}$$
(3.2)

Proof

First, deriving the optimality condition of the minimization problem (1.2a), we have

$$\begin{aligned}&\theta _1(x) \!-\!\theta _1(x^{k+1}) \!+\! (x-x^{k+1})^T\left\{ A^T\left[ \beta (Ax^{k+1} \!+\! By^k -b)\!-\!\lambda ^k\right] \!+\! G (x^{k+1}-x^k)\right\} \\&\quad \ge 0, \forall \, x\in \mathcal{X}. \end{aligned}$$

By using (1.2c), it can be written as

$$\begin{aligned}&\theta _1(x) -\theta _1(x^{k+1}) + (x-x^{k+1})^T \{- A^T\lambda ^{k+1} +\beta A^TB(y^k-y^{k+1})\nonumber \\&\quad + G (x^{k+1}-x^k) \} \ge 0, \forall \, x\in \mathcal{X}. \end{aligned}$$
(3.3)

It follows from (1.2) that

$$\begin{aligned} (Ax^{k+1} + By^{k+1} - b) + \frac{1}{\beta }(\lambda ^{k+1} -\lambda ^k) =0. \end{aligned}$$
(3.4)

Combining (3.3), (4.2) and (3.4) together, we get \(w^{k+1}=(x^{k+1},y^{k+1},\lambda ^{k+1}) \in \Omega \), such that

$$\begin{aligned} \theta (u)&- \theta (u^{k+1}) + \left( \begin{array}{l} x- x^{k+1}\\ y- y^{k+1}\\ \lambda - \lambda ^{k+1} \end{array}\right) ^T \left\{ \left( \begin{array}{l} - A^T\lambda ^{k+1} +\beta A^TB(y^k-y^{k+1}) \\ -B^T \lambda ^{k+1} \\ A x^{k+1} +By^{k+1} -b \end{array}\right) \right. \\&+ \left. \left( \begin{array}{l} G(x^{k+1} - x^k) \\ 0 \\ \frac{1}{\beta }(\lambda ^{k+1}-\lambda ^k) \end{array}\right) \right\} \ge 0, \quad \forall \, w\in \Omega , \end{aligned}$$

which can be rewritten as

$$\begin{aligned}&{\theta (u) -\theta (u^{k+1}) + \left( \begin{array}{l} x- x^{k+1}\\ y- y^{k+1}\\ \lambda - \lambda ^{k+1} \end{array}\right) ^T \left\{ \left( \begin{array}{l} - A^T\lambda ^{k+1} \\ -B^T \lambda ^{k+1} \\ A x^{k+1} +By^{k+1} -b \end{array}\right) \right. }\\&\quad +\left. \beta \left( \begin{array}{l} A^TB(y^k-y^{k+1}) \\ B^TB(y^k-y^{k+1}) \\ 0 \end{array}\right) + \left( \begin{array}{l} G(x^{k+1} - x^k) \\ \beta B^TB(y^{k+1} - y^k) \\ \frac{1}{\beta }(\lambda ^{k+1}-\lambda ^k) \end{array}\right) \right\} \ge 0, \forall \, w\in \Omega . \end{aligned}$$

Using the notations of \(F(w)\), \(\eta (y^k,y^{k+1})\) and \(H\), we get the assertion (3.1) immediately. \(\square \)

Lemma 3.1 indicates that the quantity \(\Vert w^k-w^{k+1}\Vert _H^2\) can be used to measure the accuracy of the iterate \(w^{k+1}\) to a solution point of VI\((\Omega ,F,\theta )\). More specifically, since \(H\) is positive semidefinite, we conclude that \(H(w^{k+1}-w^k)=0\) and \(\eta (y^k ,y^{k+1})=0\) if \(\Vert w^k-w^{k+1}\Vert _H^2=0\). In other words, because of the variational inequality characterization (2.2), when \(\Vert w^k-w^{k+1}\Vert _H^2=0\), we have

$$\begin{aligned} \theta (u) -\theta (u^{k+1}) + (w-w^{k+1})^T F(w^{k+1}) \ge 0, \quad \forall \, w\in \Omega , \end{aligned}$$

which means \(w^{k+1}\) is a solution of VI\((\Omega ,F,\theta )\). Therefore, \(\Vert w^k - w^{k+1}\Vert _H^2\) can be viewed as an error measurement after \(k\) iterations of the ADMM scheme (1.2), and it is reasonable to seek an upper bound of \(\Vert w^k - w^{k+1}\Vert _H^2\) in term of the quantity \(O(1/k)\) for the purpose of investigating the convergence rate of ADMM. Based on this fact, our proof follows the following steps:

  1. 1.

    To show that \(\{w^k\}\) is strictly contractive with respect to \(\Omega ^*\), i.e.,

    $$\begin{aligned} \Vert w^{k+1} - w^*\Vert _H^2 \le \Vert w^k-w^*\Vert _H^2 - \Vert w^k - w^{k+1}\Vert _H^2 \quad \forall \, w^*\in \Omega ^*; \end{aligned}$$
    (3.5)
  2. 2.

    To show that \(\{\Vert w^{k} - w^{k+1}\Vert _H^2 \}\) is monotonically non-increasing, i.e.,

    $$\begin{aligned} \Vert w^{k} - w^{k+1}\Vert _H^2 \le \Vert w^{k-1}-w^{k}\Vert _H^2 \quad \forall \, k\ge 1;\end{aligned}$$
    (3.6)
  3. 3.

    To derive a worst-case \(O(1/k)\) convergence rate in a non-ergodic sense based on (3.5) and (3.6), i.e.,

    $$\begin{aligned} \Vert w^k-w^{k+1}\Vert _H^2 \le \frac{1}{(k+1)}\Vert w^0-w^*\Vert _H^2\quad \forall w^*\in \Omega ^*. \end{aligned}$$
    (3.7)

In the following, our analysis is thus divided into three sections to address these three tasks.

4 Strict Contraction

We prove the conclusion (3.5) in this section, and our proof is inspired by Theorem 1 in [9]. We first present several lemmas.

Lemma 4.1

Let the sequence \(\{w^k\}\) be generated by (1.2). Then, we have

$$\begin{aligned} (y^k-y^{k+1})^TB^T (\lambda ^k-\lambda ^{k+1})\ge 0, \quad \forall \, k\ge 0. \end{aligned}$$
(4.1)

Proof

Deriving the optimality conditions of the minimization problem (1.2b), we have

$$\begin{aligned}&y^{k+1}\in \mathcal{Y}, \quad \theta _2(y) -\theta _2(y^{k+1}) {+} (y-y^{k+1})^T \left\{ B^T[\beta (Ax^{k+1} + By^{k+1} -b) -\lambda ^k] \right\} \\&\quad \ge 0, \forall \, y\in \mathcal{Y}. \end{aligned}$$

Substituting (1.2c) into the last inequality, we obtain

$$\begin{aligned} y^{k+1}\in \mathcal{Y}, \quad \theta _2(y) -\theta _2(y^{k+1}) + (y-y^{k+1})^T (-B^T\lambda ^{k+1} ) \ge 0, \; \forall \, y\in \mathcal{Y}. \end{aligned}$$
(4.2)

Obviously, analogous to (4.2), we have

$$\begin{aligned} y^k \in \mathcal{Y}, \quad \theta _2(y) - \theta _2(y^k) + (y -y^{k})^T ( - B^T\lambda ^{k} ) \ge 0, \; \forall \; y\in \mathcal{Y}. \end{aligned}$$
(4.3)

Setting \(y=y^k\) and \(y=y^{k+1}\) in in (4.2) and (4.3), respectively, and then adding the two resulting inequalities, we get (4.1) immediately. \(\square \)

Lemma 4.2

Let the sequence \(\{w^k\}\) be generated by (1.2) and \(H\) be given in (2.1). Then, we have

$$\begin{aligned} (w^{k+1} - w^*)^T H (w^k-w^{k+1}) \ge 0, \quad \forall \, w^*\in \Omega ^*. \end{aligned}$$
(4.4)

Proof

Setting \(w^*\) in (3.1), we obtain

$$\begin{aligned}&{ (w^{k+1} - w^*)^T H(w^k-w^{k+1}) } \nonumber \\&\quad \ge \theta (u^{k+1}) - \theta (u^*) + (w^{k+1} - w^*)^T F(w^{k+1})\nonumber \\&\qquad + (w^{k+1} - w^*)^T\eta (y^k,y^{k+1}), \forall w^*\in \Omega ^*. \end{aligned}$$
(4.5)

In the following we show that the right-hand side of (4.5) is non-negative. First, since \(w^{k+1}\in \Omega \) and \(w^*\in \Omega ^*\), it follows from (2.2a) that

$$\begin{aligned}\theta (u^{k+1})- \theta (u^*) + (w^{k+1} - w^*)^T F(w^*) \ge 0. \end{aligned}$$

Because of the monotonicity of \(F\), we have

$$\begin{aligned} (w^{k+1} - w^*)^T F(w^{k+1}) \ge (w^{k+1} - w^*)^T F(w^*) . \end{aligned}$$

Thus, we obtain

$$\begin{aligned} \theta (u^{k+1}) - \theta (u^*) + (w^{k+1} - w^*)^T F(w^{k+1}) \ge 0.\end{aligned}$$
(4.6)

On the other hand, by using the notation of \(\eta (y^k,y^{k+1})\) [see (3.2)], \(Ax^*+By^*=b\) and (1.2c), we have

$$\begin{aligned}&{ (w^{k+1} - w^*)^T\eta (y^k,y^{k+1}) } \nonumber \\&\quad =(y^k-y^{k+1})^TB^T \beta \{ (Ax^{k+1} + By^{k+1}) - (Ax^*+ By^*)\} \nonumber \\&\quad = (y^k-y^{k+1})^TB^T \beta ( Ax^{k+1} + By^{k+1} -b) \nonumber \\&\quad = (y^k-y^{k+1})^TB^T (\lambda ^k-\lambda ^{k+1}). \end{aligned}$$

Combining with (4.1), we get

$$\begin{aligned} (w^{k+1} - w^*)^T\eta (y^k,y^{k+1}) \ge 0. \end{aligned}$$
(4.7)

Substituting (4.6) and (4.7) into the right-hand side of (4.5), the lemma is proved. \(\square \)

With the proved lemmas, we are now ready to show the assertion (3.5).

Theorem 4.1

Let the sequence \(\{w^k\}\) be generated by (1.2) and \(H\) be given in (2.1). Then (3.5) is satisfied for any \(k\ge 0\).

Proof

Using (4.4), we have

$$\begin{aligned} \Vert w^k -w^*\Vert _H^2&= \Vert (w^{k+1} - w^*) + (w^k - w^{k+1}\Vert _H^2 \nonumber \\&= \Vert w^{k+1} \!-\! w^*\Vert _H^2 \!+\! 2(w^{k+1} \!-\! w^*)^TH (w^k \!-\! w^{k+1}) + \Vert w^k - w^{k+1}\Vert _H^2 \nonumber \\&\ge \Vert w^{k+1} - w^*\Vert _H^2 + \Vert w^k - w^{k+1}\Vert _H^2, \end{aligned}$$

and thus the assertion (3.5) is proved. \(\square \)

5 Monotonicity

This section shows the assertion (3.6), i.e, the sequence \(\{\Vert w^{k+1} - w^{k+2}\Vert _H^2\}\) is monotonically non-increasing. Again, we need to prove several lemmas for this purpose.

First of all, for the convenience of analysis, we introduce an auxiliary variable \({\tilde{w}}^k\) defined as

$$\begin{aligned} {\tilde{w}}^k= \left( \begin{array}{l} {\tilde{x}}^k \\ {\tilde{y}}^k \\ {\tilde{\lambda }}^k \end{array} \right) = \left( \begin{array}{l} x^{k+1}\\ y^{k+1} \\ \lambda ^k-\beta (Ax^{k+1}+By^{k}-b) \end{array} \right) \!, \end{aligned}$$
(5.1)

where \(w^{k}\) is generated by (1.2). Then, we have the relationship

$$\begin{aligned} w^{k+1} = w^k - M(w^k-\tilde{w}^k)\!, \end{aligned}$$
(5.2)

where the matrix \(M\) is given in (2.1).

Lemma 5.1

Let \(\{w^k\}\) be the sequence generated by (1.2), the associated sequence \(\{{\tilde{w}}^k\}\) be defined by (5.1) and \(Q\) be given in (2.1). Then, we have

$$\begin{aligned} \tilde{w}^k \in \Omega , \quad \theta (u) -\theta (\tilde{u}^k) + (w- \tilde{w}^k)^T \left\{ F(\tilde{w}^k) + Q(\tilde{w}^k - w^k)\right\} \ge 0, \quad \forall \, w\in \Omega .\nonumber \\ \end{aligned}$$
(5.3)

Proof

By using the notation \(\tilde{w}^k\) in (5.1), and the facts

$$\begin{aligned} \frac{1}{\beta }(\lambda ^k-\tilde{\lambda }^k) = (A\tilde{x}^k +B\tilde{y}^k -b) -B(\tilde{y}^k-y^k) \quad \hbox {and} \quad \Omega =\mathcal{X} \times \mathcal{Y} \times \mathfrak {R}^m, \end{aligned}$$

the inequality (3.1) can be rewritten as

$$\begin{aligned}&\tilde{w}^k\in \Omega , \quad { \theta (u) -\theta (\tilde{u}^k) + \left( \begin{array}{c} x- \tilde{x}^k\\ y- \tilde{y}^k\\ \lambda - \tilde{\lambda }^{k} \end{array}\right) ^T } \nonumber \\&\quad \times \left\{ \left( \begin{array}{c} - A^T\tilde{\lambda }^k \\ -B \tilde{\lambda }^k \\ A \tilde{x}^k +B\tilde{y}^k -b \end{array}\right) + \left( \begin{array}{l} G(\tilde{x}^k - x^k) \\ \beta B^TB(\tilde{y}^k - y^k)\\ -B(\tilde{y}^k-y^k) + \frac{1}{\beta }(\tilde{\lambda }^k-\lambda ^k) \end{array}\right) \right\} \ge 0, \quad \forall \, w\in \Omega . \end{aligned}$$

The assertion (5.3) thus follows immediately from the definition of \(Q\). \(\square \)

Lemma 5.1 enables us to establish an important inequality in the following lemma.

Lemma 5.2

Let \(\{w^k\}\) be the sequence generated by (1.2), the associated sequence \(\{{\tilde{w}}^k\}\) be defined by (5.1) and \(Q\) be given in (2.1). Then, we have

$$\begin{aligned} (\tilde{w}^k - \tilde{w}^{k+1})^T Q \{ (w^k-w^{k+1})- (\tilde{w}^k -\tilde{w}^{k+1})\} \ge 0. \end{aligned}$$
(5.4)

Proof

Setting \(w=\tilde{w}^{k+1}\) in (5.3), we have

$$\begin{aligned} \theta (\tilde{u}^{k+1}) -\theta (\tilde{u}^k) + (\tilde{w}^{k+1} - \tilde{w}^k)^T \{ F(\tilde{w}^k) + Q(\tilde{w}^k - w^k)\} \ge 0. \end{aligned}$$
(5.5)

Note that (5.3) is also true for \(k:=k+1\) and thus we have

$$\begin{aligned} \theta (u)-\theta (\tilde{u}^{k+1}) + (w -\tilde{w}^{k+1} )^T \{ F(\tilde{w}^{k+1}) + Q(\tilde{w}^{k+1} - w^{k+1})\} \ge 0, \quad \forall \, w\in \Omega . \end{aligned}$$

Setting \(w=\tilde{w}^k\) in the above inequality, we obtain

$$\begin{aligned} \theta (\tilde{u}^k)-\theta (\tilde{u}^{k+1}) + (\tilde{w}^k -\tilde{w}^{k+1} )^T \{F(\tilde{w}^{k+1}) + Q(\tilde{w}^{k+1} - w^{k+1})\} \ge 0. \end{aligned}$$
(5.6)

Adding (5.5) to (5.6) and using the monotonicity of \(F\), we get (5.4) immediately. \(\square \)

Lemma 5.3

Let \(\{w^k\}\) be the sequence generated by (1.2), the associated sequence \(\{{\tilde{w}}^k\}\) be defined by (5.1), the matrices \(H\), \(M\) and \(Q\) be given in (2.1). Then, we have

$$\begin{aligned}&(w^k - \tilde{w}^k)^T M^THM \left\{ (w^k-\tilde{w}^k) - (w^{k+1}-\tilde{w}^{k+1})\right\} \ge \frac{1}{2} \Vert (w^k-\tilde{w}^k)\nonumber \\&\quad - (w^{k+1}-\tilde{w}^{k+1})\Vert _{(Q^T+Q)}^2. \end{aligned}$$
(5.7)

Proof

First, adding the term

$$\begin{aligned} \left\{ (w^k -w^{k+1}) - (\tilde{w}^k - \tilde{w}^{k+1})\right\} ^T Q \left\{ (w^k-w^{k+1}) - (\tilde{w}^k -\tilde{w}^{k+1})\right\} \end{aligned}$$

to the both sides of (5.4), and using \(w^TQw= \frac{1}{2}w^T(Q^T+Q)w\), we get

$$\begin{aligned}&(w^k - w^{k+1})^T Q \left\{ (w^k - w^{k+1})-(\tilde{w}^k-\tilde{w}^{k+1})\right\} \nonumber \\&\quad \ge \frac{1}{2} \Vert (w^k-w^{k+1})-(\tilde{w}^k- \tilde{w}^{k+1})\Vert _{(Q^T+Q)}^2\!. \end{aligned}$$

Reordering \((w^k - w^{k+1}) -(\tilde{w}^k-\tilde{w}^{k+1})\) in the above inequality to \((w^k-\tilde{w}^k) - (w^{k+1}-\tilde{w}^{k+1})\) , we get

$$\begin{aligned}&(w^k - w^{k+1})^T Q \left\{ (w^k-\tilde{w}^k) - (w^{k+1}-\tilde{w}^{k+1})\right\} \nonumber \\&\quad \ge \frac{1}{2} \Vert (w^k-\tilde{w}^k) - (w^{k+1}-\tilde{w}^{k+1})\Vert _{(Q^T+Q)}^2\!. \end{aligned}$$

Substituting the term \((w^k - w^{k+1})\) into the left-hand side of the last inequality, and using the relationship in (5.2) and the fact \(Q=HM\) in Proposition 2.1, we obtain (5.7). \(\square \)

Finally, we are ready to show the assertion (3.6) in the following theorem.

Theorem 5.1

Let \(\{w^k\}\) be the sequence generated by (1.2) and \(H\) be given in (2.1). Then, (3.6) is satisfied for any \(k \ge 0\).

Proof

Setting \(a=M(w^k-\tilde{w}^k)\) and \(b=M(w^{k+1}-\tilde{w}^{k+1})\) in the identity

$$\begin{aligned}\Vert a\Vert _H^2 - \Vert b\Vert _H^2 = 2a^TH(a-b) - \Vert a-b\Vert _H^2\!, \end{aligned}$$

we obtain

$$\begin{aligned}&{ \Vert M(w^k-\tilde{w}^k)\Vert _H^2 - \Vert M(w^{k+1}-\tilde{w}^{k+1})\Vert _H^2 } \\&\quad = 2(w^k-\tilde{w}^k)^TM^THM\{(w^k-\tilde{w}^k) - (w^{k+1}-\tilde{w}^{k+1})\} \\&\qquad - \Vert M\{(w^k-\tilde{w}^k) - (w^{k+1}-\tilde{w}^{k+1})\}\Vert _H^2\!. \end{aligned}$$

Inserting (5.7) into the first term of the right-hand side of the last equality, we obtain

$$\begin{aligned}&{ \Vert M(w^k-\tilde{w}^k)\Vert _H^2 - \Vert M(w^{k+1}-\tilde{w}^{k+1})\Vert _H^2 } \\&\quad \ge \Vert (w^k-\tilde{w}^k) \!-\! (w^{k+1}\!-\!\tilde{w}^{k+1})\Vert _{(Q^T+Q)}^2 \!-\! \Vert M\{(w^k-\tilde{w}^k) \!-\! (w^{k+1}-\tilde{w}^{k+1})\}\Vert _H^2 \\&\quad = \Vert (w^k-\tilde{w}^k) - (w^{k+1}-\tilde{w}^{k+1})\Vert _{\{(Q^T+Q)- M^THM\}}^2 \\&\quad \ge 0, \end{aligned}$$

where the last inequality is because of the positive definiteness of the matrix \((Q^T+Q)-M^THM\) proved in Proposition 2.1. In other words, we derive

$$\begin{aligned} \Vert M(w^{k+1} -\tilde{w}^{k+1})\Vert _H^2 \le \Vert M(w^k-\tilde{w}^k)\Vert _H^2\!. \end{aligned}$$
(5.8)

Recall the relationship in (5.2). The assertion (3.6) follows immediately from (5.8). \(\square \)

6 Non-ergodic convergence rate

With Theorems 4.1 and 5.1, we can prove the assertion (3.7). That is, a worst-case \(O(1/k)\) convergence rate in a non-ergodic sense for the ADMM scheme (1.2) is established.

Theorem 6.1

Let \(\{w^k\}\) be the sequence generated by (1.2). Then, the assertion (3.7) is satisfied.

Proof

First, it follows from (3.5) that

$$\begin{aligned} \sum _{t=0}^{\infty } \Vert w^t - w^{t+1}\Vert _H^2 \le \Vert w^0 - w^*\Vert _H^2,\quad \forall \, w^*\in \Omega ^*. \end{aligned}$$
(6.1)

According to Theorem 5.1, the sequence \(\{\Vert w^t-w^{t+1}\Vert _H^2 \}\) is monotonically non-increasing. Therefore, we have

$$\begin{aligned} (k+1)\Vert w^k-w^{k+1}\Vert _H^2 \le \sum _{t=0}^{k} \Vert w^t -w^{t+1}\Vert _H^2. \end{aligned}$$
(6.2)

The assertion (3.7) follows from (6.1) and (6.2) immediately. \(\square \)

Notice that \(\Omega ^*\) is convex and closed (see Theorem 2.1 in [10]). Let \(d: = \inf \{ \Vert w^0 - w^*\Vert _H \, |\, w^*\in \Omega ^* \}\). Then, for any given \(\epsilon >0\), Theorem 6.1 shows that the ADMM scheme (1.2) needs at most \( \lfloor {d^2}/{\epsilon } \rfloor \) iterations to ensure that \(\Vert w^k- w^{k+1}\Vert _H^2\le \epsilon \). Recall that \(w^{k+1}\) is a solution of VI\((\Omega ,F,\theta )\) if \(\Vert w^k- w^{k+1}\Vert _H^2=0\) (see Lemma 3.1). A worst-case \(O(1/k)\) convergence rate in a non-ergodic sense for the ADMM scheme (1.2) is thus established in Theorem 6.1.

Finally, we remark that because of the monotonicity of the sequence \(\{\Vert w^k-w^{k+1}\Vert _H^2\}\) and the fact (6.1), and using Lemma 1.2 in [2], we can immediately refine the worst-case convergence rate in Theorem 6.1 from \(O(1/k)\) to \(o(1/k)\).