Abstract
In this paper, by combining the logarithmic-quadratic proximal (LQP) method and the alternating direction method, we propose an LQP alternating direction method for solving structured variational inequalities. The new iterate is generated by searching along a descent direction with a new step size αk; the choice of the descent direction and of the step size selection strategy is important for the algorithm’s efficiency. The O(1/t) convergence rate of the proposed method is established, and its efficiency is verified by numerical experiments.
1 Introduction
This paper considers the constrained convex programming problem with the following separable structure:
$$ \min\ \{\theta_{1}(x)+\theta_{2}(y)\ |\ Ax+By=b,\ x\in \mathbb{R}_{+}^{n},\ y\in \mathbb{R}_{+}^{m}\}, $$
(1.1)
where \(\theta _{1}:\mathbb {R}_{+}^{n}\rightarrow \mathbb {R}\) and \(\theta _{2}:\mathbb {R}_{+}^{m}\rightarrow \mathbb {R}\) are closed proper convex functions, not necessarily smooth, \(A\in \mathbb {R}^{l\times n}\) and \(B \in \mathbb {R}^{l\times m}\) are given matrices, and \(b\in \mathbb {R}^{l}\) is a given vector.
A very rich class of applications can be modeled as problem (1.1), and various methods have been developed for it. A benchmark solver for problem (1.1) is the alternating direction method (ADM), originally proposed by Glowinski and Marrocco [25]. We refer to, e.g., [7, 17, 21, 24, 26,27,28, 30,31,32,33, 35, 38, 39] for some early references on ADM in the PDE and optimization literature. In particular, ADM has recently found impressive applications in many areas such as image processing and statistical learning (see, e.g., [20] and references therein). To alleviate the subproblems arising when a splitting method is applied to (1.1), one important approach is to regularize the objective functions appropriately and thus force the solutions to stay strictly within the interiors of \(\mathbb {R}_{+}^{n}\) and \(\mathbb {R}_{+}^{m}\). A very good choice for such regularization is the LQP regularization, which was originally proposed in [1] and has been studied extensively in many articles such as [2,3,4,5, 19, 29, 37, 41].
Let ∂(⋅) denote the subgradient operator of a convex function, and let f(x) ∈ ∂𝜃1(x) and g(y) ∈ ∂𝜃2(y) be subgradients of 𝜃1(x) and 𝜃2(y), respectively. By attaching a Lagrange multiplier vector \(\lambda \in \mathbb {R}^{l}\) to the linear constraint Ax + By = b, problem (1.1) can be written in terms of finding \(w \in {\mathcal W}=\mathbb {R}_{+}^{n} \times \mathbb {R}_{+}^{m} \times \mathbb {R}^{l}\) such that
$$ (w'-w)^{\top} Q(w)\ge 0,\qquad \forall w'\in {\mathcal W}, $$
(1.2)
where
$$ w=\left( \begin{array}{c} x\\ y\\ \lambda \end{array}\right),\qquad Q(w)=\left( \begin{array}{c} f(x)-A^{\top}\lambda\\ g(y)-B^{\top}\lambda\\ Ax+By-b \end{array}\right), $$
(1.3)
and (⋅)⊤ denotes the transpose.
Problem (1.2)–(1.3) is referred to as a structured variational inequality (SVI for short).
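For later reference, the mapping Q(w) in (1.3) is inexpensive to evaluate once f and g are available. A minimal NumPy sketch (the function name and calling convention are our own):

```python
import numpy as np

def Q(x, y, lam, f, g, A, B, b):
    """Evaluate the SVI mapping Q(w) of (1.3) at w = (x, y, lambda),
    returned as one flat vector stacked in the same block order."""
    return np.concatenate((f(x) - A.T @ lam,
                           g(y) - B.T @ lam,
                           A @ x + B @ y - b))
```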
Yuan and Li [41] developed the following LQP-based decomposition method by applying LQP terms to regularize the ADM subproblems: for a given \(w^{k}=(x^{k},y^{k},\lambda ^{k})\in \mathbb {R}^{n}_{++}\times \mathbb {R}^{m}_{++}\times \mathbb {R}^{l}\) and μ ∈ (0,1), the new iterate (xk+ 1, yk+ 1, λk+ 1) is obtained via solving the following system:
where \( X_{k}=\text {diag}({x^{k}_{1}},...,{x^{k}_{n}})\), x− 1 is an n-vector whose j th element is 1/xj, \( Y_{k}=\text {diag}({y^{k}_{1}},...,{y^{k}_{m}}),\) y− 1 is an m-vector whose j th element is 1/yj, and \(H \in \mathbb {R}^{l \times l}\), \(R_{1} \in \mathbb {R}^{n \times n}\), and \(S_{1} \in \mathbb {R}^{m \times m}\) are symmetric positive definite, where R1 = diag(r1,...,rn) and S1 = diag(s1,...,sm).
Later, Bnouhachem et al. [9, 10] and Li [36] proposed some LQP-ADMs and made the LQP-ADM more practical. Each iteration of the above methods contains a prediction and a correction: the predictor is obtained via solving (1.4)–(1.6), and the new iterate is obtained by a convex combination of the previous point and the one generated by a projection-type method along a descent direction in [9, 36], while the new iterate is computed directly by an explicit formula derived from the original LQP method in [10]. The main disadvantage of the methods in [8,9,10,11, 36, 37, 41] is that solving (1.5) requires the solution of (1.4). Hence, these alternating direction methods are not eligible for parallel computing, in the sense that the solutions of (1.4)–(1.5) cannot be obtained simultaneously. This characteristic excludes the possibility of applying some advanced computing technologies to solve (1.4)–(1.5). To overcome this difficulty, Bnouhachem and Hamdi [12] proposed a parallel descent LQP-ADM for solving SVI. The main advantage of this method is that the predictor is obtained via solving a system of nonlinear equations in parallel. Very recently, Bnouhachem and Rassias [18] proposed a new LQP alternating direction scheme for the separable constrained convex programming problem; the predictor \(\tilde {w}^{k}=(\tilde {x}^{k},\tilde {y}^{k},\tilde {\lambda }^{k})\) is obtained via solving the LQP system approximately under a significantly relaxed accuracy criterion, and the new iterate wk+ 1(αk) = (xk+ 1, yk+ 1, λk+ 1) is given by
where
and
We notice that the convergence of this method was established under the assumption that the mappings f(x) and g(y) are Lipschitz continuous. However, in many applications, these mappings may not be Lipschitz continuous, and even when they are, it could be difficult to verify the Lipschitz continuity condition. It is therefore interesting to design a new, easily implementable method for continuous mappings whose only available information is the mapping values.
In this paper, we propose a descent LQP alternating direction method for SVI. The global convergence of the proposed method is guaranteed by the continuity, rather than the Lipschitz continuity, of these mappings. Since a self-adaptive adjustment rule is necessary in practice, we propose a rule that adjusts the scalar parameter automatically. To illustrate the proposed method and demonstrate its efficiency, some applications and their numerical results are also provided. Our results can be viewed as significant extensions of previously known results.
2 The proposed method
We recall some concepts and results which are needed in the sequel.
For any vector \(u\in \mathbb {R}^{n}\), \(\|u\|^{2}=u^{\top }u\) and \(\|u\|_{\infty }=\max \{|u_{1}|,\ldots ,|u_{n}|\}\). Let \(D \in \mathbb {R}^{n \times n}\) be a symmetric positive definite matrix; λl(D) and λm(D) denote the largest and the smallest eigenvalues of D, respectively. The D-norm of u is defined by \(\|u\|_{D}^{2}=u^{\top } Du\).
Some fundamental properties of the projection operator are listed without proof (see, e.g., [6]).
Lemma 2.1
Let Ω be a nonempty closed convex subset of \(\mathbb {R}^{l}\) and denote by PΩ[⋅] the projection onto Ω with respect to the Euclidean norm, that is,
$$ P_{\Omega}[v]=\text{argmin}\ \{\|v-u\| \mid u\in {\Omega}\}. $$
Then, we have the following inequalities:
We recall some basic definitions, which will be used in our later analysis.
Definition 2.1
The mapping \(T:\mathbb {R}^{n}\to \mathbb {R}^{n}\) is said to be
-
(a) monotone if
$$ (Tx-Ty)^{\top}(x-y)\ge 0,\qquad \forall x,y\in \mathbb{R}^{n}; $$
-
(b) L-Lipschitz continuous if there exists a constant L > 0 such that
$$ \|Tx-Ty\|\le L\|x-y\|,\qquad\forall x,y\in \mathbb{R}^{n}. $$
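As a concrete illustration (our own toy example, not from the paper), an affine mapping T(x) = Mx + q with M positive semidefinite is monotone and Lipschitz continuous with constant L = ∥M∥2; in particular, the mappings f(X) = X − C and g(Y) = Y − C used in Section 6 are of this type with M = I:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
M = rng.standard_normal((n, n))
M = M @ M.T                          # positive semidefinite, so T is monotone
q = rng.standard_normal(n)
T = lambda x: M @ x + q              # affine mapping T(x) = Mx + q

x, y = rng.standard_normal(n), rng.standard_normal(n)
print((T(x) - T(y)) @ (x - y) >= 0)  # monotonicity: (Tx - Ty)^T (x - y) >= 0
print(np.linalg.norm(M, 2))          # L = ||M||_2 is a valid Lipschitz constant
```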
We make the following standard assumptions:
Assumption A
f is monotone and continuous on \(\mathbb {R}^{n}_{+}\) and g is monotone and continuous on \(\mathbb {R}^{m}_{+}\).
Assumption B
The solution set of SVI, denoted by \({\mathcal W}^{*}\), is nonempty.
Let βk > 0, r > 0, and s > 0 be scalars, and let \(H \in \mathbb {R}^{l \times l}\), \(R \in \mathbb {R}^{n \times n}\), and \(S \in \mathbb {R}^{m \times m}\) be positive definite diagonal matrices, where R = rIn×n and S = sIm×m. We propose the following inexact LQP alternating direction method for solving SVI:
Algorithm 2.1
-
Step 0. The initial step: Given ε > 0, μ ∈ (0,1), η ∈ (0,1), ρ > 0, and \(w^{0}=(x^{0}, y^{0}, \lambda ^{0}) \in \mathbb {R}^{n}_{++}\times \mathbb {R}^{m}_{++}\times \mathbb {R}^{l}\), set \({\upbeta }_{0}=\min \limits \left \{\frac {(1-\eta )\lambda _{m}(R)}{3\lambda _{l}(H)\|A\|^{2}},\frac {(1-\eta )\lambda _{m}(S)}{3\lambda _{l}(H)\|B\|^{2}}\right \}\) and k = 0.
-
Step 1. Prediction step: Compute \(\tilde {w}^{k}=(\tilde {x}^{k},\tilde {y}^{k},\tilde {\lambda }^{k})\in \mathbb {R}^{n}_{++}\times \mathbb {R}^{m}_{++}\times \mathbb {R}^{l}\) by solving the following system:
$$ \begin{array}{@{}rcl@{}} &&{\upbeta}_{k}\left( f(x) - A^{\top} [\lambda^{k} - H (A x^{k} + B y^{k} - b) ]\right)\\ &+& R[(x-x^{k}) + \mu (x^{k} - {X_{k}^{2}} x^{-1})]=:{\xi^{k}_{x}}\approx 0, \end{array} $$
(2.3a)
$$ \begin{array}{@{}rcl@{}} &&{\upbeta}_{k}\left( g(y) - B^{\top} [\lambda^{k} - H (A \tilde{x}^{k}+By^{k}- b)]\right)\\ &+& S[(y-y^{k}) + \mu (y^{k} - {Y_{k}^{2}} y^{-1})]=:{\xi^{k}_{y}}\approx 0, \end{array} $$
(2.3b)
$$ \tilde{\lambda}^{k} = \lambda^{k} - H (A\tilde{x}^{k} + B \tilde{y}^{k} - b), $$
(2.3c)
where βk is a proper parameter which satisfies
$$ \| {\xi^{k}_{x}}\|\leq\eta r\|x^{k}-\tilde{x}^{k}\|,\qquad \| {\xi^{k}_{y}}\|\leq\eta s\|y^{k}-\tilde{y}^{k}\|, $$
(2.4)
and
$$ \xi^{k}=\left( \begin{array}{c} {\xi^{k}_{x}}\\ {\xi^{k}_{y}} \\ 0 \end{array}\right)={\upbeta}_{k}\left( \begin{array}{c} f(\tilde{x}^{k})-f(x^{k})+\rho A^{\top}HA(x^{k}-\tilde{x}^{k})\\ g(\tilde{y}^{k})-g(y^{k})+\rho B^{\top}HB(y^{k}-\tilde{y}^{k}) \\ 0 \end{array}\right). $$
(2.5)
-
Step 2. Convergence verification: If \( \max \limits \{\|x^{k}-\tilde {x}^{k}\|_{\infty },\|y^{k}-\tilde {y}^{k}\|_{\infty },\|\lambda ^{k}-\tilde {\lambda }^{k}\|_{\infty }\}<\epsilon ,\) then stop.
-
Step 3. Correction step: The new iterate wk+ 1(αk) = (xk+ 1, yk+ 1, λk+ 1) is given by
$$ w^{k+1}(\alpha_{k})= (1-\sigma) w^{k}+\sigma P_{\mathcal W}[w^{k}-\alpha_{k}d_{2}(w^{k},\tilde{w}^{k})], \quad \sigma\in(0,1), $$
(2.6)
where
$$ \alpha_{k}=\frac{\varphi(w^{k}, \tilde{w}^{k})}{\|d_{1}(w^{k}, \tilde{w}^{k})\|^{2}}, $$
(2.7)
$$ d_{2}(w^{k},\tilde{w}^{k})=\left( \begin{array}{c} {\upbeta}_{k}(f(\tilde{x}^{k}) - A^{\top} \tilde{\lambda}^{k})+{\upbeta}_{k} A^{\top}H(A(x^{k}-\tilde{x}^{k})+B(y^{k}-\tilde{y}^{k}))\\ {\upbeta}_{k}(g(\tilde{y}^{k}) - B^{\top} \tilde{\lambda}^{k})+{\upbeta}_{k} B^{\top}H(A(x^{k}-\tilde{x}^{k})+B(y^{k}-\tilde{y}^{k}))\\ {\upbeta}_{k}(A\tilde{x}^{k}+B\tilde{y}^{k}-b) \end{array}\right), $$
(2.8)
$$ \begin{array}{@{}rcl@{}} \varphi(w^{k}, \tilde{w}^{k})&=&(w^{k}-\tilde{w}^{k})^{\top}d_{1}(w^{k},\tilde{w}^{k})-\mu\| x^{k} -\tilde{x}^{k} \|_{R}^{2}-\mu\| y^{k} - \tilde{y}^{k} \|_{S}^{2} \\&&+{\upbeta}_{k}(\lambda^{k}-\tilde{\lambda}^{k})^{\top}(A(x^{k}-\tilde{x}^{k})+B(y^{k}-\tilde{y}^{k})), \end{array} $$
(2.9)
$$ d_{1}(w^{k},\tilde{w}^{k})=\left( \begin{array}{c} (1+\mu)R(x^{k}-\tilde{x}^{k})-{\upbeta}_{k}[f(x^{k})-f(\tilde{x}^{k})]+\rho{\upbeta}_{k} A^{\top}HA(x^{k}-\tilde{x}^{k}) \\ (1+\mu)S(y^{k}-\tilde{y}^{k})-{\upbeta}_{k}[g(y^{k})-g(\tilde{y}^{k})] +{\upbeta}_{k} B^{\top}HA(x^{k}-\tilde{x}^{k})+\rho{\upbeta}_{k} B^{\top}HB(y^{k}-\tilde{y}^{k})\\ {\upbeta}_{k}H^{-1}(\lambda^{k}-\tilde{\lambda}^{k}) \end{array}\right). $$
(2.10)
-
Step 4. Adjusting:
Choose a suitable βk+ 1 as the starting prediction parameter for the next iteration by the adaptive rule
$$ {\upbeta}_{k+1} :=\left\{\begin{array}{ll} \min\{{\upbeta}_{0}, \tau{\upbeta}_{k}\} & \text{ if } \max\{r_{1},r_{2}\} \le 0.5, \\ {\upbeta}_{k} & \text{ otherwise}, \end{array}\right. $$
(2.11)
where
and τ > 1. Set k := k + 1 and go to Step 1.
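To make Steps 3 and 4 concrete, the following sketch implements the correction (2.6)–(2.7) and the adaptive rule (2.11) for a flat vector representation of w. It is a schematic outline under our own naming conventions (proj denotes the projection onto \({\mathcal W}\), and r1, r2 are the ratios tested in (2.11)), not the authors' reference code:

```python
import numpy as np

def correction_step(wk, d1, d2, phi, sigma, proj):
    """Correction step (2.6)-(2.7): move from w^k along d2(w^k, w~^k) with
    step size alpha_k, project onto W, and combine with w^k via sigma."""
    alpha_k = phi / np.dot(d1, d1)          # (2.7): phi(w^k, w~^k) / ||d1||^2
    w_new = (1.0 - sigma) * wk + sigma * proj(wk - alpha_k * d2)
    return w_new, alpha_k

def update_beta(beta_k, beta_0, r1, r2, tau):
    """Self-adaptive rule (2.11) with tau > 1: enlarge beta_k when the
    residual ratios r1, r2 are small enough, otherwise keep it."""
    return min(beta_0, tau * beta_k) if max(r1, r2) <= 0.5 else beta_k
```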
Remark 2.1
In general, the prediction step is implementable, and sometimes the approximate solution of (2.3a)–(2.3c) can be obtained directly by choosing a suitable βk > 0. Since R = rIn×n and S = sIm×m, if f is Lipschitz continuous on \(\mathbb {R}^{n}_{+}\) with Lipschitz constant kf and g is Lipschitz continuous on \(\mathbb {R}^{m}_{+}\) with Lipschitz constant kg, then criterion (2.4) is satisfied whenever \({\upbeta }_{k}\leq \min \limits \{\frac {\eta r}{k_{f}+\rho \|A^{\top }HA\|},\frac {\eta s}{k_{g}+\rho \|B^{\top }HB\|}\}.\)
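Under the Lipschitz scenario of this remark, the bound on βk can be evaluated directly. A minimal sketch, assuming the constants kf and kg are known (an assumption made here purely for illustration) and using the spectral norm for ∥A⊤HA∥ and ∥B⊤HB∥:

```python
import numpy as np

def beta_bound(kf, kg, A, B, H, r, s, eta, rho):
    """Upper bound on beta_k from Remark 2.1 that guarantees criterion (2.4)
    when f and g are Lipschitz with constants kf and kg."""
    bx = eta * r / (kf + rho * np.linalg.norm(A.T @ H @ A, 2))
    by = eta * s / (kg + rho * np.linalg.norm(B.T @ H @ B, 2))
    return min(bx, by)
```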
2.1 Relationship with some existing methods
Our method can be viewed as an extension and improvement of some well-known results; for example:
-
It follows from (2.5) that (2.3a)–(2.3b) can be written as
$$ (R+\rho {\upbeta}_{k} A^{\top}HA)(\tilde{x}^{k})^{2}-s^{k}\tilde{x}^{k}-\mu R {X_{k}^{2}}=0, $$
(2.12)
$$ (S+\rho {\upbeta}_{k} B^{\top}HB)(\tilde{y}^{k})^{2}-p^{k}\tilde{y}^{k}-\mu S {Y_{k}^{2}}=0, $$
(2.13)
with
$$ s^{k}=(1-\mu)Rx^{k}-{\upbeta}_{k}\left( f(x^{k}) - A^{\top} [\lambda^{k} - H ((1-\rho)A x^{k} + B {y}^{k} - b) ]\right), $$
$$ p^{k}=(1-\mu)Sy^{k}-{\upbeta}_{k}\left( g(y^{k}) - B^{\top} [\lambda^{k} - H (A \tilde{x}^{k} + (1-\rho)B y^{k} - b)]\right). $$
The proposed method obtains the predictors \(\tilde {x}^{k}\) and \(\tilde {y}^{k}\) quite easily by an explicit formula. Since R = rIn×n and S = sIm×m, if A = Il×n, B = −Il×m, and H = Il×l, then the positive solution of (2.12)–(2.13) can be obtained explicitly by
$$ \tilde{x}^{k}_{i}=\frac{\bar{s}^{k}_{i}+\sqrt{(\bar{s}^{k}_{i})^{2}+4\mu r(r+{\upbeta}_{k}\rho)({x^{k}_{i}})^{2}}}{2(r+{\upbeta}_{k}\rho)}, $$
(2.14)
$$ \tilde{y}^{k}_{i}=\frac{\bar{p}^{k}_{i}+\sqrt{(\bar{p}^{k}_{i})^{2}+4\mu s(s+{\upbeta}_{k}\rho)({y^{k}_{i}})^{2}}}{2(s+{\upbeta}_{k}\rho)}, $$
(2.15)
with
$$ \bar{s}^{k}=r(1-\mu)x^{k}-{\upbeta}_{k}\left( f(x^{k}) - [\lambda^{k} - ((1-\rho) x^{k} - {y}^{k} - b) ]\right), $$
$$ \bar{p}^{k}=s(1-\mu)y^{k}-{\upbeta}_{k}\left( g(y^{k}) + [\lambda^{k} - (\tilde{x}^{k} - (1-\rho) y^{k} - b)]\right). $$
Therefore, the proposed method is easily implementable (see the sketch following this list). In contrast, the predictors \(\tilde {x}^{k}\) and \(\tilde {y}^{k}\) in [22, 34, 40, 42] must be computed by the projection method, whose applicability is limited by the fact that the projection is easy to evaluate only in very special cases.
-
If βk = 1, ∀k ≥ 0, \(x^{k+1}=\tilde {x}^{k}, y^{k+1}=\tilde {y}^{k}\), and \(\lambda ^{k+1}=\tilde {\lambda }^{k}\) in (2.3a), (2.3b), and (2.3c), respectively, and \({\xi ^{k}_{x}}={\xi ^{k}_{y}}=0,\) then we obtain the method proposed in [41].
-
The methods proposed in [8,9,10,11,12, 36] solve problem (2.3a)–(2.3b) (with βk = 1, ∀k ≥ 0) exactly. It is more practical to find approximate solutions of problems (2.3a)–(2.3b) rather than exact ones, since requiring exact solutions excludes some practical applications. To eliminate this drawback, we solve problem (2.3a)–(2.3b) approximately. Moreover, since a self-adaptive adjustment rule is necessary in practice, we propose a rule that adjusts the scalar parameter βk automatically.
-
The convergence of the method proposed in [18] was established under the assumption that the mappings f(x) and g(y) are Lipschitz continuous. However, in many applications, these mappings may not be Lipschitz continuous, and in that case it could be difficult to verify the Lipschitz continuity condition. The convergence of the proposed method is guaranteed under the mere continuity of f(x) and g(y). Moreover, the parameter βk in the prediction procedure of the proposed method is selected self-adaptively.
-
Compared with the methods in [8,9,10,11,12,13,14,15,16, 18, 36], the proposed method obtains the new iterate by using a new descent direction with a new step size αk.
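As announced in the first item of the list above, when A = Il×n, B = −Il×m, and H = Il×l, the predictors can be coded directly from the closed-form positive roots (2.14)–(2.15). A minimal componentwise sketch (the names are ours, and f and g are assumed to return subgradient values):

```python
import numpy as np

def predictors(xk, yk, lamk, f, g, beta_k, r, s, mu, rho, b):
    """Explicit predictors (2.14)-(2.15) for the special case A = I, B = -I,
    H = I: componentwise positive roots of the quadratics (2.12)-(2.13)."""
    s_bar = r * (1 - mu) * xk - beta_k * (f(xk) - (lamk - ((1 - rho) * xk - yk - b)))
    x_til = (s_bar + np.sqrt(s_bar**2 + 4 * mu * r * (r + beta_k * rho) * xk**2)) \
            / (2 * (r + beta_k * rho))
    p_bar = s * (1 - mu) * yk - beta_k * (g(yk) + (lamk - (x_til - (1 - rho) * yk - b)))
    y_til = (p_bar + np.sqrt(p_bar**2 + 4 * mu * s * (s + beta_k * rho) * yk**2)) \
            / (2 * (s + beta_k * rho))
    lam_til = lamk - (x_til - y_til - b)    # (2.3c) with A = I, B = -I, H = I
    return x_til, y_til, lam_til
```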
We need the following lemma to analyze the convergence of the proposed method.
Lemma 2.2
[41] Let \(q(u)\in \mathbb {R}^{n}\) be a monotone mapping of u with respect to \(\mathbb {R}^{n}_{+}\) and let \(R\in \mathbb {R}^{n\times n}\) be a positive definite diagonal matrix. For a given uk > 0, let \(U_{k}:= \text {diag}({u^{k}_{1}},{u^{k}_{2}},\cdots , {u^{k}_{n}})\) and let u− 1 be the n-vector whose jth element is 1/uj. Then the equation
has a unique positive solution u. Moreover, for any v ≥ 0, we have
Now we are ready to present an inequality that provides a lower bound on \(\varphi (w^{k}, \tilde {w}^{k})\) for all \(w^{k}\in \mathbb {R}_{++}^{n} \times \mathbb {R}_{++}^{m}\times \mathbb {R}^{l}\). This inequality is crucial for analyzing the contraction property and the convergence of the iterative sequence.
Theorem 2.1
For given \(w^{k}\in \mathbb {R}_{++}^{n} \times \mathbb {R}_{++}^{m}\times \mathbb {R}^{l},\) let \(\tilde {w}^{k}\) be generated by (2.3a)–(2.3c). Then there exist two constants α1 > 0 and α2 > 0 such that
and
Proof
Since
It follows from the definition of \(\varphi (w^{k}, \tilde {w}^{k})\) that
By using the Cauchy–Schwarz inequality and (2.4), we have
where
Then, we have
where α1 > 0 is a constant. The third inequality holds because βk ≤β0 for any k. The fourth inequality is obtained from the definition of β0.
Recalling the definition in (2.10), we rewrite \(d_{1}(w^{k}, \tilde {w}^{k})\) as
where
Note that, for any \(a, b \in \mathbb {R}^{n},\) we have
It follows that
where α2 > 0 is a constant. Therefore, it follows from (2.7) and (2.18) that
and this completes the proof. □
3 Basic results
In this section, we first present some lemmas that are needed to prove the global convergence of the proposed method.
Lemma 3.1
For given \(w^{k}=(x^{k}, y^{k},\lambda ^{k})\in \mathbb {R}_{++}^{n} \times \mathbb {R}_{++}^{m}\times \mathbb {R}^{l},\) let \(\tilde {w}^{k}\) be generated by (2.3a)–(2.3c). Then for any \(w=(x,y, \lambda ) \in {\mathcal W}\), we have
Proof
Applying Lemma 2.2 to (2.3a) by setting uk = xk, \(u=\tilde {x}^{k}\), v = x in (2.17) and
we get
Recall
Adding (3.2) and (3.3), we obtain
Similarly, applying Lemma 2.2 to (2.3b); substituting uk = yk, \( u= \tilde {y}^{k}\), and v = y; and replacing R, n with S, m, respectively, in (2.17) and
we get
Recall
Adding (3.5) and (3.6), we have
It follows from (3.4), (3.7), (2.3c), and (2.5) that
and the assertion of this lemma is proved. □
Lemma 3.2
For given \(w^{k}=(x^{k}, y^{k},\lambda ^{k})\in \mathbb {R}_{++}^{n} \times \mathbb {R}_{++}^{m}\times \mathbb {R}^{l},\) let \(\tilde {w}^{k}\) be generated by (2.3a)–(2.3c). Then for any \(w^{*}=(x^{*},y^{*}, \lambda ^{*}) \in {\mathcal W}^{*}\), we have
Proof
Recalling the definition in (2.9), we have
Using the monotonicity of f and g, we obtain
It follows from (3.11) that
Combining (3.10) and the above inequality, we can get the assertion of this lemma. □
The following theorem provides a unified framework for proving the convergence of the proposed method.
Theorem 3.1
Let \(w^{*}\in {\mathcal W}^{*}, w^{k+1}(\alpha _{k})\) be defined by (2.6) and
then
Proof
Since \(w^{*}\in {\mathcal W}^{*}\) and
it follows from (2.2) that
From (2.6), we get
Using the following identity
for \(a=w^{k}-{w^{k}_{p}}\), \(b={w^{k}_{p}}-w^{*},\) and (3.16), we obtain
Using the definition of Θ(αk) and (3.17), we get
Applying (3.1) (with \(w={w_{p}^{k}}\)), we obtain
Adding (3.9) and (3.19), we get
Applying (3.20) to the last term on the right side of (3.18), we obtain
and the theorem is proved. □
4 Convergence of the proposed method
In this section, we prove the global convergence of the proposed method. From the computational point of view, a relaxation factor γ ∈ (0,2) is preferable in the correction step. The assertion (3.14) enables us to study the contraction property of the iterative sequence.
Theorem 4.1
Let \(w^{*}\in {\mathcal W}^{*}\) be a solution of SVI and let wk+ 1(γαk) be generated by (2.6). Then {wk} and \(\{\tilde {w}^{k}\}\) are bounded, and
where
Proof
It follows from (3.14), (2.18), and (2.19) that
Since γ ∈ (0,2), we have
and thus, {wk} is a bounded sequence.
It follows from (4.1) that
which means that
Since {wk} is a bounded sequence, we conclude that \(\{\tilde {w}^{k}\}\) is also bounded. □
With Lemma 3.1 and Theorem 4.1 at hand, we are able to prove the convergence of the proposed method. The following result can be proved by using the technique of Theorem 4.2 in [19].
Theorem 4.2
The sequence {wk} generated by the proposed method converges to some \(w^{\infty }\) which is a solution of SVI.
Proof
Since {wk} is bounded, it has at least one cluster point. Let \(w^{\infty }\) be a cluster point of {wk} and let the subsequence \(\{ w^{k_{j}}\}\) converge to \(w^{\infty }\). Since \({\mathcal W}\) is a closed set, we have \(w^{\infty }\in {\mathcal W}\). By the construction of βk, we have 0 < βk ≤ β0, ∀k. It follows from (4.2) that
Moreover, (4.3) and (3.1) imply that
and consequently
which means that \(w^{\infty }\) is a solution of SVI.
Now we prove that the sequence {wk} converges to \(w^{\infty }.\) Since
for any 𝜖 > 0, there exists an integer kl > 0 such that
Therefore, for any k ≥ kl, it follows from (4.1) and (4.5) that
This implies that the sequence {wk} converges to \(w^{\infty }\) which is a solution of SVI. □
5 O(1/t) convergence rate
In this section, we show that the proposed method has an O(1/t) convergence rate. Recall that \({\mathcal W}^{*}\) can be characterized as (see (2.3.2) on page 159 of [23])
This implies that \(\hat {w}\) is an approximate solution of SVI with the accuracy 𝜖 > 0 if it satisfies
In the rest of this section, we show that after t iterations of the proposed method, we can find a \(\hat {w}\in {\mathcal W}\) such that (5.1) is satisfied with 𝜖 = O(1/t).
Our analysis needs a new sequence defined by
Based on (3.10) and (5.2), we easily obtain the relationship
Using (1.3), (2.8), and (5.2), we obtain
Lemma 5.1
Let \(\hat {w}^{k}\) be defined by (5.2) and let \(w\in {\mathcal W}\). Then we have
Proof
It follows from (3.4) and (3.7) that
and
Then, using the notation of \(\hat {w}^{k}\) in (5.2), inequalities (5.6) and (5.7) can be written as
and
In addition, it follows from (2.3c) and (5.2) that
Combining (5.8)–(5.10) and recalling the definition of \(d_{1}(w^{k},\tilde {w}^{k})\), we obtain the assertion (5.5). The proof is completed. □
Lemma 5.2
For given \(w^{k}\in \mathbb {R}_{++}^{n} \times \mathbb {R}_{++}^{m}\times \mathbb {R}^{l},\) let \({w_{p}^{k}}\) be defined by (3.15) and let \(w\in {\mathcal W}\). Then we have
Proof
Since \({w_{p}^{k}}\in {\mathcal W},\) substituting \(w={w_{p}^{k}}\) in (5.5) and using (5.3), we get
On the other hand, by (3.15) and (5.4), \({w_{p}^{k}}\) is the projection of \(w^{k}-\gamma \alpha _{k} {\upbeta }_{k}Q(\hat {w}^{k})\) onto \(\mathcal W\); it follows from (2.1) that
and consequently
Applying the identity \(a^{\top }b =\frac {1}{2}\left (\|a\|^{2}-\|a-b\|^{2}+\|b\|^{2}\right )\) to the right-hand side of the last inequality, we obtain
Adding (5.12) and (5.13), we get
and by using the monotonicity of Q, we obtain (5.11) and the proof is completed. □
Lemma 5.3
For given \(w^{k}\in \mathbb {R}_{++}^{n} \times \mathbb {R}_{++}^{m}\times \mathbb {R}^{l}\), let wk+ 1(γαk) be generated by (2.6) and let \(w\in {\mathcal W}\). Then we have
Proof
Using the following identity
we get
Substituting (5.16) into (5.15), we obtain
Substituting (5.17) into (5.11), we obtain (5.14), the required result. □
Now, we are ready to present the O(1/t) convergence rate of the proposed method.
Theorem 5.1
For any integer t > 0, we have a \(\hat {w}_{t}\in \mathcal W\) which satisfies
where
Proof
Summing the inequality (5.14) over k = 0,⋅⋅⋅,t, we obtain
Using the notations of Υt and \(\hat {w}_{t}\) in the above inequality, we derive
Indeed, \(\hat {w}_{t}\in \mathcal W\) because it is a convex combination of \(\hat {w}^{0},\hat {w}^{1},\cdot \cdot \cdot , \hat {w}^{t}.\)
The proof is completed. □
If \(\inf ^{\infty }_{k=0}{\upbeta }_{k}=\upbeta >0,\) then from (2.19) we have
For any compact set \(\mathcal D\subset \mathcal W\), let \(d =\sup \{\|w-w^{0}\| \mid w\in \mathcal D\}\). Then, for any given 𝜖 > 0, after at most
iterations, we have
That is, the O(1/t) convergence rate is established in an ergodic sense.
6 Preliminary computational results
In order to verify the theoretical assertions, we consider the following optimization problem with matrix variables:
where ∥⋅∥F is the matrix Frobenius norm, that is,
It has been shown [9] that problem (6.1) can be converted to the following variational inequality: find \(u^{*}=(X^{*},Y^{*},Z^{*})\in \mathcal W \) \(= {S_{+}^{n} \times S_{+}^{n}\times \mathbb {R}^{n\times n}}\) such that
Problem (6.2) is a special case of (1.2)–(1.3) with matrix variables where A = In×n, B = −In×n, b = 0, f(X) = X − C, g(Y ) = Y − C, and \({\mathcal W} =S_{+}^{n} \times S_{+}^{n}\times \mathbb {R}^{n\times n}\).
For simplicity, we take R = rIn×n, S = sIn×n, and H = In×n, where r > 0 and s > 0 are scalars. In all tests, we take γ = 1.98, μ = 0.1, τ = 1.5, η = 0.95, ρ = 1.47, σ = 0.95, C = rand(n), and (X0, Y0, Z0) = (In×n, In×n, 0n×n) as the initial point. The iteration is stopped as soon as
All codes were written in Matlab, and we compare the proposed method with those in [9, 12, 13, 16, 36]. The iteration numbers, denoted by k, and the computational times for problem (6.1) with different dimensions are given in Tables 1 and 2.
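A minimal sketch of this test setting, assuming the stopping rule uses the same infinity-norm residual as the convergence test of Step 2 (the variable names and the helper project_psd are our own):

```python
import numpy as np

def project_psd(M):
    """Projection onto S^n_+ via the spectral decomposition."""
    vals, vecs = np.linalg.eigh((M + M.T) / 2.0)
    return (vecs * np.maximum(vals, 0.0)) @ vecs.T

n = 100
rng = np.random.default_rng(0)
C = rng.random((n, n))                            # C = rand(n), as in the tests
X, Y, Z = np.eye(n), np.eye(n), np.zeros((n, n))  # initial point (X^0, Y^0, Z^0)
f = lambda X: X - C                               # f(X) = X - C
g = lambda Y: Y - C                               # g(Y) = Y - C

def residual(X, Xt, Y, Yt, Z, Zt):
    """Infinity-norm stopping residual mirroring Step 2 of Algorithm 2.1."""
    return max(np.abs(X - Xt).max(), np.abs(Y - Yt).max(), np.abs(Z - Zt).max())
```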
Tables 1 and 2 report the comparison between the methods of [9, 12, 13, 16, 36] and the proposed method. The proposed method requires considerably fewer iterations and exhibits a faster convergence speed.
7 Conclusions
In this paper, we proposed a new descent logarithmic-quadratic proximal alternating direction method for solving structured variational inequalities. The optimal step size along the descent direction is chosen to improve the efficiency of the new method. In addition, we proposed a self-adaptive rule that adjusts the scalar parameter automatically. The numerical efficiency of the algorithm is verified by comparisons with several existing algorithms.
References
Auslender, A., Teboulle, M., Ben-Tiba, S.: A logarithmic-quadratic proximal method for variational inequalities. Comput. Optim. Appl. 12, 31–40 (1999)
Auslender, A., Teboulle, M.: Entropic proximal decomposition methods for convex programs and variational inequalities. Math. Program. 91, 33–47 (2001)
Auslender, A., Teboulle, M.: Interior gradient and epsilon-subgradient descent methods for constrained convex minimization. Math. Oper. Res. 29, 1–26 (2004)
Auslender, A., Teboulle, M.: Interior gradient and proximal methods for convex and conic optimization. SIAM J. Optim. 16, 697–725 (2006)
Auslender, A., Teboulle, M.: Lagrangian duality and related multiplier methods for variational inequality problems. SIAM J. Optim. 10, 1097–1115 (2000)
Bertsekas, D.P., Gafni, E.M.: Projection method for variational inequalities with applications to the traffic assignment problem. Math. Program. Study 17 (1982)
Bertsekas, D.P., Tsitsiklis, J.N.: Parallel and Distributed Computation: Numerical Methods. Athena Scientific, Belmont (1997)
Bnouhachem, A., Benazza, H., Khalfaoui, M.: An inexact alternating direction method for solving a class of structured variational inequalities. Appl. Math. Comput. 219, 7837–7846 (2013)
Bnouhachem, A.: On LQP alternating direction method for solving variational inequality problems with separable structure. J. Inequal. Appl. 2014(80), 1–15 (2014)
Bnouhachem, A., Xu, M.H.: An inexact LQP alternating direction method for solving a class of structured variational inequalities. Comput. Math. Appl. 67, 671–680 (2014)
Bnouhachem, A., Ansari, Q.H.: A descent LQP alternating direction method for solving variational inequality problems with separable structure. Appl. Math. Comput. 246, 519–532 (2014)
Bnouhachem, A., Hamdi, A.: Parallel LQP alternating direction method for solving variational inequality problems with separable structure. J. Inequal. Appl. 2014(392), 1–14 (2014)
Bnouhachem, A., Hamdi, A.: A hybrid LQP alternating direction method for solving variational inequality problems with separable structure. Appl. Math. Inf. Sci. 9(3), 1259–1264 (2015)
Bnouhachem, A., Al-Homidan, S., Ansari, Q.H.: New descent LQP alternating direction methods for solving a class of structured variational inequalities. Fixed Point Theory Appl. 2015(137), 1–11 (2015)
Bnouhachem, A., Latif, A., Ansari, Q.H.: On the O(1/t) convergence rate of the alternating direction method with LQP regularization for solving structured variational inequality problems. J. Inequal. Appl. 2016(297), 1–14 (2016)
Bnouhachem, A., Bensi, F., Hamdi, A.: On alternating direction method for solving variational inequality problems with separable structure. J. Nonlinear Sci. Appl. 10(1), 175–185 (2017)
Bnouhachem, A., Ansari, Q.H., Al-Homidan, S.: SQP alternating direction for structured variational inequality. J. Nonlinear Convex Anal. 19(3), 461–476 (2018)
Bnouhachem, A., Rassias, T.M.: A new descent alternating direction method with LQP regularization for the structured variational inequalities. Optim. Lett. 13 (1), 175–192 (2018)
Bnouhachem, A., Rassias, T.M.: On descent alternating direction method with LQP regularization for the structured variational inequalities, Optim. Lett., in press (2019)
Boyd, S., Parikh, N., Chu, E., Peleato, B., Eckstein, J.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3, 1–122 (2010)
Chan, T.F., Glowinski, R.: Finite Element Approximation and Iterative Solution of a Class of Mildly Non-Linear Elliptic Equations. Stanford University, Technical Report (1978)
Chen, Z., Wan, L., Yang, Y.: An inexact alternating direction method for structured variational inequalities. J. Optim. Theory Appl. 163(2), 439–459 (2014)
Facchinei, F., Pang, J.S.: Finite-Dimensional Variational Inequalities and Complementarity Problems, vol. I and II. Springer Series in Operations Research. Springer, New York (2003)
Fukushima, M.: Application of the alternating directions method of multipliers to separable convex programming problems. Comput. Optim. Appl. 1(1), 93–111 (1992)
Glowinski, R., Marrocco, A.: Sur l’approximation par éléments finis d’ordre un et la résolution par pénalisation-dualité d’une classe de problèmes de Dirichlet non linéaires. Rev. Fr. Autom. Inform. Rech. Opér. Anal. Numér. 2, 41–76 (1975)
Glowinski, R.: Numerical Methods for Nonlinear Variational Problems. Springer, New York (1984)
Glowinski, R., Tallec, P.L.: Augmented Lagrangian and Operator-Splitting Methods in Nonlinear Mechanics. Society for Industrial and Applied Mathematics, Philadelphia (1989)
He, B.S., Liao, L.Z., Han, D.R., Yang, H.: A new inexact alternating directions method for monotone variational inequalities. Math. Program. 92, 103–118 (2002)
He, B.S., Liao, L.-Z., Yuan, X.M.: A LQP based interior prediction-correction method for nonlinear complementarity problems. J. Comput. Math. 24(1), 33–44 (2006)
He, B.S.: Parallel splitting augmented Lagrangian methods for monotone structured variational inequalities. Comput. Optim. Appl. 42, 195–212 (2009)
He, B.S., Tao, M., Yuan, X.M.: Alternating direction method with Gaussian back substitution for separable convex programming. SIAM J. Optim. 22, 313–340 (2012)
Hou, L.S.: On the O(1/t) convergence rate of the parallel descent-like method and parallel splitting augmented Lagrangian method for solving a class of variational inequalities. Appl. Math. Comput. 219, 5862–5869 (2013)
Jiang, Z.K., Bnouhachem, A.: A projection-based prediction-correction method for structured monotone variational inequalities. Appl. Math. Comput. 202, 747–759 (2008)
Jiang, Z.K., Yuan, X.M.: New parallel descent-like method for solving a class of variational inequalities. J. Optim. Theory Appl. 145, 311–323 (2010)
Kontogiorgis, S., Meyer, R.R.: A variable-penalty alternating directions method for convex optimization. Math. Program. 83, 29–53 (1998)
Li, M.: A hybrid LQP-based method for structured variational inequalities. Int. J. Comput. Math. 89(10), 1412–1425 (2012)
Tao, M., Yuan, X.M.: On the O(1/t) convergence rate of alternating direction method with Logarithmic-quadratic proximal regularization. SIAM J. Optim. 22(4), 1431–1448 (2012)
Tseng, P.: Applications of a splitting algorithm to decomposition in convex programming and variational inequalities. SIAM J. Control Optim. 29, 119–138 (1991)
Tseng, P.: Alternating projection-proximal methods for convex programming and variational inequalities. SIAM J. Optim. 7, 951–965 (1997)
Wang, K., Xu, L.L., Han, D.R.: A new parallel splitting descent method for structured variational inequalities. J. Ind. Manag. Optim. 10(2), 461–476 (2014)
Yuan, X.M., Li, M.: An LQP-based decomposition method for solving a class of variational inequalities. SIAM J. Optim. 21(4), 1309–1318 (2011)
Zhang, W., Han, D., Jiang, S.: A modified alternating projection based prediction-correction method for structured variational inequalities. Appl. Numer. Math. 83, 12–21 (2014)
This paper is dedicated to Mohamed Bnouhachem and Mohamed Khalfaoui
Keywords
- Variational inequalities
- Monotone operator
- Logarithmic-quadratic proximal method
- Projection method
- Alternating direction method