Abstract
In this paper, we perturb the Mann iteration scheme to obtain a strong convergence result for the approximation of solutions to a constrained convex minimization problem in a real Hilbert space. Furthermore, we give a computational analysis of our iterative scheme.
1 Introduction
Let H be a real Hilbert space and C a nonempty, closed, and convex subset of H. Let f : H → ℝ be a convex and continuously Fréchet differentiable functional. Consider the following constrained convex minimization problem:
$$\min_{x \in C} f(x). \qquad (1)$$
The gradient projection method (for short, GPM) generates a sequence {x n } using the following recursive formula:
$$x_{n+1}=P_{C}(x_{n}-\lambda \nabla f(x_{n})),\quad n\geq 0, \qquad (2)$$
or more generally,
$$x_{n+1}=P_{C}(x_{n}-\lambda_{n} \nabla f(x_{n})),\quad n\geq 0, \qquad (3)$$
where in both (2) and (3), the initial guess x 0 is taken arbitrarily from C, and the parameters λ or λ n are positive real numbers known as stepsizes.
The gradient projection (or projected-gradient) algorithm is a powerful tool for solving constrained convex optimization problems and has been extensively studied (see [5, 6, 9, 11, 15, 17–23] and the references therein). It has been recently applied to solve split feasibility problems which find applications in image reconstructions and the intensity-modulated radiation therapy (see [3, 4, 14, 25]).
The convergence of algorithms (2) and (3) depends on the behavior of the gradient ∇f. The gradient projection method (3) has been considered with several stepsize rules:
- Constant stepsize, where λ n = λ for all n, for some λ > 0.
- Diminishing stepsize, where λ n → 0 and \({\sum }_{n=1}^{\infty } \lambda _{n}=\infty \).
- Polyak’s stepsize, where \(\lambda _{n}=\frac {f(x_{n})-f^{\ast }}{\|\nabla f(x_{n})\|^{2}}\), where f ∗ is the optimal value of (1).
- Modified Polyak’s stepsize, where \(\lambda _{n}=\frac {f(x_{n})-\hat {f_{n}}}{\|\nabla f(x_{n})\|^{2}}\) and \(\hat {f_{n}}=\min _{0\leq j \leq n}f(x_{j})-\delta \) for some scalar δ > 0.
The constant stepsize rule is suitable when we are interested in finding an approximate solution to problem (1). The diminishing stepsize rule is an off-line rule and is typically used with \(\lambda _{n}=\frac {c}{n+1}\) or \(\lambda _{n}=\frac {c}{\sqrt {n+1}}\) for some c > 0. The constant and the diminishing stepsizes are also well suited for some distributed implementations of the method.
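As an illustration (not part of the original analysis), the constant and diminishing stepsize rules can be tried on a simple one-dimensional instance of (1); the objective f(x) = (x − 3)², the set C = [0, 2], and all numerical values below are chosen purely for demonstration.

```python
# Illustrative 1-D instance of problem (1):
# minimize f(x) = (x - 3)^2 over C = [0, 2]; the minimizer is x = 2.
# grad f(x) = 2(x - 3) is L-Lipschitzian with L = 2.

def project(x):                    # metric projection onto C = [0, 2]
    return min(max(x, 0.0), 2.0)

def grad(x):
    return 2.0 * (x - 3.0)

def gpm(x0, stepsize, n_iters):
    """Gradient projection iteration x_{n+1} = P_C(x_n - lambda_n * grad f(x_n))."""
    x = x0
    for n in range(n_iters):
        x = project(x - stepsize(n) * grad(x))
    return x

x_const = gpm(0.0, lambda n: 0.4, 50)            # constant stepsize, 0 < 0.4 < 2/L
x_dimin = gpm(0.0, lambda n: 1.0 / (n + 1), 50)  # diminishing: c/(n+1) with sum = infinity

print(x_const, x_dimin)  # both approach the minimizer 2
```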
As a matter of fact, it is known (see [11]) that if ∇f is α-strongly monotone and L-Lipschitzian with constants α, L > 0, then for \(0<\lambda <\frac{2\alpha}{L^{2}}\) the operator
$$T:=P_{C}(I-\lambda \nabla f) \qquad (4)$$
is a contraction; hence, the sequence {x n } defined by algorithm (2) converges in norm to the unique solution of the minimization problem (1). More generally, if the sequence {λ n } is chosen to satisfy the property
$$0<\liminf_{n\to \infty }\lambda_{n}\leq \limsup_{n\to \infty }\lambda_{n}<\frac{2\alpha }{L^{2}},$$
then the sequence {x n } defined by algorithm (3) converges in norm to the unique minimizer of (1). However, if the gradient ∇f fails to be strongly monotone, the operator T defined by (4) may fail to be contractive; consequently, the sequence {x n } generated by algorithm (3) may fail to converge strongly (see [22, Sect. 5]). If ∇f is Lipschitzian, then algorithms (2) and (3) can still converge in the weak topology under certain conditions.
Recently, Xu [22] gave an alternative operator-oriented approach to algorithm (3), namely, an averaged mapping approach. He applied this approach to the gradient projection algorithm (3) and to the relaxed gradient projection algorithm. Moreover, he constructed a counterexample showing that algorithm (2) need not converge in norm in an infinite-dimensional space, and he presented two modifications of gradient projection algorithms which are shown to converge strongly. Further, he regularized the minimization problem (1) to devise an iterative scheme that generates a sequence converging in norm to the minimum-norm solution of (1) in the consistent case.
Very recently, motivated by the work of Xu [22], Ceng et al. [6] proposed implicit and explicit iterative schemes for finding the approximate minimizer of a constrained convex minimization problem and proved that the sequences generated by their schemes converge strongly to a solution of the constrained convex minimization problem. Such a solution is also a solution of a variational inequality defined over the set of fixed points of a nonexpansive mapping.
Motivated by the works of Xu [22] and Ceng et al. [6], we prove a strong convergence theorem for finding the approximate minimizer of a constrained convex minimization problem in a real Hilbert space. Furthermore, we give computational analysis of our result. Our result complements the results of Xu [22] and several other works in this direction.
2 Preliminaries
Definition 1
A mapping T : C → C is said to be nonexpansive if
$$\|Tx-Ty\| \leq \|x-y\|\quad \forall x, y \in C.$$
Construction of fixed points of nonexpansive mappings is an important subject in nonlinear mapping theory and its applications, in particular, in image recovery and signal processing (see, for example, [3, 4, 14, 25]). For the past 40 years or so, the approximation of fixed points of nonexpansive mappings and fixed points of some of their generalizations and approximation of zeros of accretive-type operators have been a flourishing area of research for many mathematicians. For example, the reader can consult the recent monographs of Berinde [2] and Chidume [7].
For any point u ∈ H, there exists a unique point P C u ∈ C such that
$$\|u-P_{C}u\| \leq \|u-y\|\quad \forall y \in C.$$
P C is called the metric projection of H onto C. We know that P C is a nonexpansive mapping of H onto C. It is also known that P C satisfies
$$\langle x-y, P_{C}x-P_{C}y\rangle \geq \|P_{C}x-P_{C}y\|^{2}$$
for all x, y ∈ H. Furthermore, P C x is characterized by the properties P C x ∈ C and
$$\langle x-P_{C}x, y-P_{C}x\rangle \leq 0$$
for all y ∈ C.
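The characterization above can be checked numerically. The following sketch is an illustration we add here, with C taken as the closed unit ball of ℝ² and the sample point x chosen arbitrarily; it verifies that ⟨x − P C x, y − P C x⟩ ≤ 0 for randomly drawn y ∈ C.

```python
# Numerical sanity check of the projection characterization:
# for C the closed unit ball in R^2, P_C x = x / max(1, ||x||), and
# <x - P_C x, y - P_C x> <= 0 holds for every y in C.
import math
import random

def proj_ball(x):
    nrm = math.hypot(x[0], x[1])
    s = 1.0 / max(1.0, nrm)
    return (s * x[0], s * x[1])

def inner(u, v):
    return u[0] * v[0] + u[1] * v[1]

random.seed(0)
x = (3.0, 4.0)                      # a point outside the ball
p = proj_ball(x)                    # its projection, (0.6, 0.8)
for _ in range(1000):
    # draw y in C by rejection sampling inside the unit ball
    y = (random.uniform(-1, 1), random.uniform(-1, 1))
    if math.hypot(y[0], y[1]) > 1.0:
        continue
    assert inner((x[0] - p[0], x[1] - p[1]), (y[0] - p[0], y[1] - p[1])) <= 1e-12
print("characterization holds for all sampled y")
```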
Definition 2
A mapping T : H → H is said to be firmly nonexpansive if and only if 2T − I is nonexpansive, or equivalently,
$$\langle x-y, Tx-Ty\rangle \geq \|Tx-Ty\|^{2}\quad \forall x, y \in H.$$
Alternatively, T is firmly nonexpansive if and only if T can be expressed as
$$T=\frac{1}{2}(I+S),$$
where S : H → H is nonexpansive. Projections are firmly nonexpansive.
Definition 3
A mapping T : H → H is said to be an averaged mapping if and only if it can be written as the average of the identity mapping I and a nonexpansive mapping; that is,
$$T=(1-\alpha)I+\alpha S, \qquad (5)$$
where α ∈ (0, 1) and S : H → H is nonexpansive. More precisely, when (5) holds, we say that T is α-averaged. Thus, firmly nonexpansive mappings (in particular, projections) are \(\frac {1}{2}\)-averaged mappings.
Some properties of averaged mappings are gathered in the following proposition.
Proposition 1.
([4, 8]) Let S, T, V : H → H be given operators.
(a) If T = (1 − α)S + αV for some α ∈ (0, 1) and if S is averaged and V is nonexpansive, then T is averaged.
(b) T is firmly nonexpansive if and only if the complement I − T is firmly nonexpansive.
(c) If T = (1 − α)S + αV for some α ∈ (0, 1) and if S is firmly nonexpansive and V is nonexpansive, then T is averaged.
(d) The composite of finitely many averaged mappings is averaged. That is, if each of the mappings \(\{T_{i}\}_{i=1}^{N}\) is averaged, then so is the composite T 1 …T N . In particular, if T 1 is α 1 -averaged and T 2 is α 2 -averaged, where α 1 , α 2 ∈ (0, 1), then the composite T 1 T 2 is α-averaged, where α = α 1 + α 2 − α 1 α 2 .
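Part (d) can be illustrated numerically. The sketch below is our own illustration, with rotations of ℝ² serving as the nonexpansive mappings and the angles chosen arbitrarily: we form T = T 1 T 2 from an α 1 - and an α 2 -averaged mapping and check that S := (T − (1 − α)I)/α is nonexpansive for α = α 1 + α 2 − α 1 α 2.

```python
# Numeric check of Proposition 1(d) in R^2: if T1 is a1-averaged and T2 is
# a2-averaged, then T = T1 T2 is a-averaged with a = a1 + a2 - a1*a2,
# i.e. S := (T - (1 - a)I)/a is nonexpansive.
import math
import random

def rotation(theta):                      # rotations are nonexpansive (isometries)
    c, s = math.cos(theta), math.sin(theta)
    return lambda v: (c * v[0] - s * v[1], s * v[0] + c * v[1])

def averaged(alpha, S):                   # T = (1 - alpha) I + alpha S
    return lambda v: ((1 - alpha) * v[0] + alpha * S(v)[0],
                      (1 - alpha) * v[1] + alpha * S(v)[1])

a1, a2 = 0.3, 0.5
T1, T2 = averaged(a1, rotation(1.0)), averaged(a2, rotation(-0.7))
a = a1 + a2 - a1 * a2                     # predicted averagedness constant
T = lambda v: T1(T2(v))
S = lambda v: ((T(v)[0] - (1 - a) * v[0]) / a,
               (T(v)[1] - (1 - a) * v[1]) / a)

random.seed(1)
ok = True
for _ in range(1000):
    u = (random.uniform(-5, 5), random.uniform(-5, 5))
    v = (random.uniform(-5, 5), random.uniform(-5, 5))
    d_S = math.hypot(S(u)[0] - S(v)[0], S(u)[1] - S(v)[1])
    d_uv = math.hypot(u[0] - v[0], u[1] - v[1])
    ok = ok and d_S <= d_uv + 1e-9        # nonexpansiveness of S
print(ok)
```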
Definition 4
A nonlinear operator T with domain D(T) ⊂ H and range R(T) ⊂ H is said to be
(a) monotone if
$$\langle x-y, Tx-Ty\rangle \geq 0\quad \forall x, y \in D(T); $$
(b) β-strongly monotone if there exists β > 0 such that
$$\langle x-y, Tx-Ty\rangle \geq \beta \|x-y\|^{2}\quad \forall x, y \in D(T); $$
(c) ν-inverse strongly monotone (for short, ν-ism) if there exists ν > 0 such that
$$\langle x-y, Tx-Ty\rangle \geq \nu \|Tx-Ty\|^{2}\quad \forall x, y \in D(T). $$
It can be easily seen that (i) if T is nonexpansive, then I − T is monotone; (ii) the projection mapping P C is a 1-ism. The inverse strongly monotone (also referred to as co-coercive) operators have been widely used to solve practical problems in various fields, for instance, in traffic assignment problems; see, for example, [3, 10] and the references therein.
The following proposition gathers some results on the relationship between averaged mappings and inverse strongly monotone operators.
Proposition 2.
([4]) Let T : H → H be a given operator.
(a) T is nonexpansive if and only if the complement I − T is \(\frac {1}{2}\)-ism.
(b) If T is ν-ism, then for γ > 0, γT is \(\frac {\nu }{\gamma }\)-ism.
(c) T is averaged if and only if the complement I − T is ν-ism for some ν > 1/2. Indeed, for α ∈ (0, 1), T is α-averaged if and only if I − T is \(\frac {1}{2\alpha }\)-ism.
Since the gradient ∇f is L-Lipschitzian, it is \(\frac{1}{L}\)-inverse strongly monotone (ism) [1]; hence, for suitable λ, the complement I − λ∇f is an averaged mapping. Consequently, the GPM can be rewritten as the composite of a projection and an averaged mapping, which is again an averaged mapping. This shows that averaged mappings play an important role in the gradient projection method. Recall that a mapping T is nonexpansive if and only if it is Lipschitz with a Lipschitz constant not more than one, and that a mapping is an averaged mapping if and only if it can be expressed as a proper convex combination of the identity mapping and a nonexpansive mapping. An averaged mapping with a fixed point is asymptotically regular, and its Picard iterates at each point converge weakly to a fixed point of the mapping. This convergence property is quite helpful.
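The convergence property of Picard iterates just mentioned can be illustrated in ℝ², where weak and strong convergence coincide. In this sketch of ours, S is a rotation (a nonexpansive mapping whose only fixed point is the origin, with the angle chosen arbitrarily), and T = ½(I + S) is ½-averaged.

```python
# Picard iterates of an averaged mapping with a fixed point converge to
# that fixed point. Here S is the rotation of R^2 by theta (nonexpansive,
# Fix(S) = {0}) and T = (I + S)/2 is 1/2-averaged.
import math

theta = 2.0
c, s = math.cos(theta), math.sin(theta)

def T(v):  # T = 0.5 * (I + S) with S the rotation by theta
    Sv = (c * v[0] - s * v[1], s * v[0] + c * v[1])
    return (0.5 * (v[0] + Sv[0]), 0.5 * (v[1] + Sv[1]))

x = (10.0, -4.0)
for _ in range(200):     # Picard iteration x_{n+1} = T x_n
    x = T(x)
print(math.hypot(x[0], x[1]))  # close to 0, the unique fixed point
```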
In the sequel, we shall also make use of the following lemmas.
Lemma 1.
Let H be a real Hilbert space. Then the following inequality holds:
Lemma 2.
Let H be a real Hilbert space. The following inequality holds:
Lemma 3.
(Reich [16]) Let K be a closed and convex subset of a reflexive real Banach space E with a uniformly Gâteaux differentiable norm. Assume that every weakly compact and convex subset of E has the fixed-point property for nonexpansive mappings. Let A : K → E be an accretive mapping which satisfies the range condition K ⊆ R(I + sA) for all s > 0. Suppose that 0 ∈ R(A). Then for each x ∈ K, the strong limit \(\lim _{t \to \infty } J_{t} x\) exists and belongs to A −1 (0). If we denote \(\lim _{t \to \infty } J_{t} x\) by Qx, then Q : K → A −1 (0) is the unique sunny nonexpansive retraction of K onto A −1 (0).
We remark that in Hilbert spaces, a sunny nonexpansive retraction is a projection mapping.
Lemma 4.
(Moore and Nnoli, [12]) Let \(\{\theta _{n}\}_{n=1}^{\infty }\) be a sequence of nonnegative real numbers satisfying the following relation:
where (i) 0 < α n < 1; (ii) \(\sum _{n=1}^{\infty }\alpha _{n}=\infty \); and Φ : [0, ∞) → [0, ∞) is a strictly increasing function with Φ(0) = 0. Suppose that σ n = o(α n ) (where σ n = o(α n ) if and only if \(\lim _{n \to \infty } \frac {\sigma _{n}}{\alpha _{n}}=0\)). Then θ n → 0 as n → ∞.
We adopt the following notation: x n → x means that x n → x strongly.
3 Main Result
In this section, we modify the gradient projection method so as to obtain strong convergence. We use a constant stepsize λ since we are interested in finding an approximate solution to problem (1).
Theorem 1.
Let C be a nonempty, closed, and convex subset of a real Hilbert space H. Suppose that the minimization problem (1) is consistent and let S denote its solution set. Assume that the gradient ∇f is L-Lipschitzian with constant L > 0. For a fixed u ∈ C, let the sequence {x n } be generated iteratively by x 1 ∈ C,
where \(0<\lambda <\frac {2}{L}\) and {α n }, {γ n } are sequences in (0,1) satisfying the following conditions:
(i) α n (1 + γ n ) < 1;
(ii) α n = o(γ n );
(iii) \(\sum \nolimits _{n=1}^{\infty }\alpha _{n}\gamma _{n}=+\infty. \)
Then the sequence {x n } converges strongly to a minimizer \(\hat {x}\) of (1), where \(\hat {x}\) is the projection of the starting point u onto the solution set of the convex problem being solved.
Proof
Observe that x ∗ ∈ C solves the minimization problem (1) if and only if x ∗ solves the fixed-point equation
$$x^{\ast }=P_{C}(I-\lambda \nabla f)x^{\ast },$$
where λ > 0 is any fixed positive number. Since the gradient ∇f is L-Lipschitzian with constant L > 0, it is \(\frac {1}{L}\)-ism [1], which implies that λ∇f is \(\frac {1}{\lambda L}\)-ism. So by Proposition 2(c), I − λ∇f is \(\frac {\lambda L}{2}\)-averaged. Now, since the projection P C is \(\frac {1}{2}\)-averaged, we see from Proposition 1(d) that the composite P C (I − λ∇f) is \(\frac {2 + \lambda L}{4}\)-averaged for \(0 < \lambda < \frac {2}{L}\). Therefore, we can write
where T is nonexpansive and \(\beta = \frac {2 + \lambda L}{4} \in (0,1)\). Then, we can rewrite (6) as
where θ n = β α n ∈ (0, 1) ∀n ≥ 1. For any x ∗ ∈ S, notice that T x ∗ = x ∗. We first show that the sequence \(\{x_{n}\}_{n=1}^{\infty }\) is bounded. Observe that T is nonexpansive, which is equivalent to
Using (7), we have
Then, using (8), we obtain
which implies that
It follows from (9) and (10) that
Thus, \(\{x_{n}\}_{n=1}^{\infty }\) is bounded and so is {T x n }.
Next, we show that the sequence \(\{x_{n}\}_{n=1}^{\infty }\) converges strongly to \(\hat {x}\). Let A := I − T. Then A is a bounded, continuous, and monotone mapping. Since \(\{x_{n}\}_{n=1}^{\infty }\) is bounded, we have that \(\{Ax_{n}\}_{n=1}^{\infty }\) is bounded. Furthermore, by Theorem 2 of [13], A satisfies the range condition. Observe that if for all γ > 0, we define
then we easily see that A γ satisfies the range condition and
where \(J_{s}^{A_{\gamma }}\) is the resolvent of the operator A γ ∀γ > 0. Observe that
where B is any closed ball containing the sequence {x n }. This implies that \(\lim _{\gamma \to 0} \|A_{\gamma } x_{n}\|=0\) for each n ≥ 1. Furthermore, we obtain from Lemma 3 that \(\lim _{s \to \infty } J_{s}^{A_{\gamma }}u=Q^{\gamma } u\), where Q γ is the unique projection mapping from R(I + s A) onto \(A_{\gamma }^{-1}(0) = A^{-1}(0) = F(T)\). Thus, from uniqueness of projection mapping, we obtain that if Q is the projection mapping of R(I+s A) onto A −1(0), then Q γ = Q for all γ > 0. This implies that \(Q u =\lim _{s \to \infty } J_{s}^{A_{\gamma }}u\) for all γ > 0. Let \(\hat {x}:=Q u\). We show that if
then \(\lim _{n \to \infty } \xi _{n}=0\). We further observe that since \(J_{s}^{A_{\gamma }}=(I+sA_{\gamma })^{-1}\), we have \((I+sA_{\gamma })J_{s}^{A_{\gamma }}u=u\). This implies that \(A_{\gamma } \circ J_{s}^{A_{\gamma }}u=\frac {1}{s}(u-J_{s}^{A_{\gamma }}u)\), and thus, since A γ is monotone, we have that
This implies that for some positive constant M > 0,
Thus, \(\limsup _{\gamma \to 0}\langle u-J_{s}^{A_{\gamma }}u, x_{n}-J_{s}^{A_{\gamma }}u \rangle \leq 0~\forall n \geq 1\). Therefore, given 𝜖 > 0, there exists δ := δ(𝜖) > 0 such that for all γ ∈ (0, δ],
Moreover, we have (in particular, for γ = δ) that for some constant M 0 > 0,
Observe that
Thus, as s → ∞, we obtain from (11) that \(\langle u-\hat {x},x_{n}-\hat {x}\rangle \leq \epsilon \), so
and since 𝜖 > 0 is arbitrary, (12) gives
from which we can deduce that \(\underset {n \to \infty }\lim \xi _{n}=0\). From (7), we have
where \(\delta _{n}={\alpha _{n}^{2}}(\gamma _{n}+1)M^{\ast }+2\alpha _{n}\gamma _{n}\xi _{n}=o(\alpha _{n})\) for some M ∗ > 0. Hence, by Lemma 4, we have that \(x_{n}\to \hat {x}\) as n → ∞. This completes the proof. □
4 Computational Analysis
In this section, we give a computational analysis result using our iterative scheme.
Let H be a real Hilbert space and C := {x ∈ H : ∥x∥ ≤ r} (i.e., the closed ball centered at the origin with radius r), and define f : C → ℝ by \(f(x) = \frac {1}{2}\|x\|^{2}\). Then f is convex and differentiable with ∇f(x) = x. Observe that ∇f is 1-Lipschitzian. Let us consider the problem
$$\min_{x \in C} f(x)=\min_{x \in C}\frac{1}{2}\|x\|^{2}. \qquad (13)$$
Clearly, we see that the optimal solution \(\hat {x}\) to the minimization problem (13) is \(\hat {x}=0\).
Now, by our Theorem 1, we can take \(\alpha _{n}=\frac {1}{(n+1)^{1/2}}, \gamma _{n}=\frac {1}{(n+1)^{1/3}},~n \geq 1\) and \(\lambda =\frac {1}{2}\). By the choice of our function f and λ, we see that \(P_{C}(x-\frac {1}{2}\nabla f(x)) = P_{C}(x-\frac {1}{2}x) = P_{C}(\frac {1}{2}x) = \frac {1}{2}x\). Furthermore, our iterative scheme (6) becomes
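For illustration only, the single gradient projection step P C (x − ½∇f(x)) = ½x computed above can be iterated numerically. Note that this sketch implements the plain step of algorithm (2), not the full scheme (6) of Theorem 1 (whose display is given above), and the starting point and radius are arbitrary.

```python
# Plain gradient projection step (2) on the example of Section 4:
# f(x) = 0.5 * ||x||^2 over the ball C of radius r, with lambda = 1/2.
# One step is P_C(x - 0.5 * grad f(x)) = P_C(0.5 * x) = 0.5 * x,
# so the iterates halve toward the minimizer 0.
import math

r = 2.0
def proj_C(v):                              # projection onto the ball of radius r
    nrm = math.hypot(v[0], v[1])
    return v if nrm <= r else (r * v[0] / nrm, r * v[1] / nrm)

def step(v):                                # P_C(v - 0.5 * grad f(v)), grad f(v) = v
    return proj_C((0.5 * v[0], 0.5 * v[1]))

x = (1.5, -0.8)
for _ in range(60):
    x = step(x)
print(math.hypot(x[0], x[1]))  # approaches the minimizer 0
```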
5 Conclusions
Since the gradient projection method (GPM) fails, in general, to converge in norm in infinite-dimensional Hilbert spaces, we have provided in this paper a strongly convergent modification of the GPM.
We note that Theorems 4.1 and 4.2 of Xu [22] are weak convergence results, while our Theorem 1 is a strong convergence result. Thus, our Theorem 1 improves Theorems 4.1 and 4.2 of Xu [22]. Furthermore, our iterative scheme in this paper does not involve the “CQ” algorithm studied by Xu [22]. Also, our result does not require the additional projections which were required in Theorem 5.4 of Xu [22] and Theorem 3.3 of [24] in order to guarantee strong convergence. Our method of proof is different from the methods of proof of Xu [22], Yao et al. [23], Ceng et al. [6], Su and Xu [18], and others.
References
Baillon, J.B., Haddad, G.: Quelques propriétés des opérateurs angle-bornés et n-cycliquement monotones. Israel J. Math. 26, 137–150 (1977)
Berinde, V.: Iterative approximation of fixed points. Lecture Notes in Mathematics 1912. Springer, Berlin (2007)
Bertsekas, D.P., Gafni, E.M.: Projection methods for variational inequalities with applications to the traffic assignment problem. Math. Program. Stud. 17, 139–159 (1982)
Byrne, C.: A unified treatment of some iterative algorithms in signal processing and image reconstruction. Inverse Probl. 20, 103–120 (2004)
Calamai, P.H., Moré, J.J.: Projected gradient methods for linearly constrained problems. Math. Program. 39, 93–116 (1987)
Ceng, L.-C., Ansari, Q.H., Yao, J.-C.: Some iterative methods for finding fixed points and for solving constrained convex minimization problems. Nonlinear Anal. 74, 5286–5302 (2011)
Chidume, C.E.: Geometric properties of Banach spaces and nonlinear iterations. Lecture Notes in Mathematics 1965. Springer (2009)
Combettes, P.L.: Solving monotone inclusions via compositions of nonexpansive averaged operators. Optimization 53, 475–504 (2004)
Gafni, E.M., Bertsekas, D.P.: Two metric projection methods for constrained optimization. SIAM J. Control Optim. 22, 936–964 (1984)
Han, D., Lo, H.K.: Solving non-additive traffic assignment problems: a descent method for co-coercive variational inequalities. Eur. J. Oper. Res. 159, 529–544 (2004)
Levitin, E.S., Polyak, B.T.: Constrained minimization problems. USSR Comput. Math. Math. Phys. 6, 1–50 (1966)
Moore, C., Nnoli, B.V.C.: Iterative solution of nonlinear equations involving set-valued uniformly accretive operators. Comput. Math. Appl. 42, 131–140 (2001)
Morales, C.H.: Surjectivity theorems for multivalued mappings of accretive type. Comment. Math. Univ. Carolin. 26, 397–413 (1985)
Podilchuk, C.I., Mammone, R.J.: Image recovery by convex projections using a least-squares constraint. J. Opt. Soc. Am. A 7, 517–521 (1990)
Polyak, B.T.: Introduction to Optimization. Optimization Software, New York (1987)
Reich, S.: Strong convergence theorems for resolvents of accretive mappings in Banach spaces. J. Math. Anal. Appl. 75, 287–292 (1980)
Ruszczynski, A.: Nonlinear Optimization. Princeton University Press, New Jersey (2006)
Su, M., Xu, H.K.: Remarks on the gradient-projection algorithm. J. Nonlinear Anal. Optim. 1, 35–43 (2010)
Wang, C.Y., Xiu, N.H.: Convergence of gradient projection methods for generalized convex minimization. Comput. Optim. Appl. 16, 111–120 (2000)
Xiu, N.H., Wang, C.Y., Zhang, J.Z.: Convergence properties of projection and contraction methods for variational inequality problems. Appl. Math. Optim. 43, 147–168 (2001)
Xiu, N., Wang, D., Kong, L.: A note on the gradient projection method with exact stepsize rule. J. Comput. Math. 25, 221–230 (2007)
Xu, H.K.: Averaged mappings and the gradient-projection algorithm. J. Optim. Theory Appl. 150, 360–378 (2011)
Yao, Y., Kang, S.M., Jigang, W., Yang, P.-X.: A regularized gradient projection method for the minimization problem. J. Appl. Math. 2012 (259813), 9 (2012). doi:10.1155/2012/259813
Yao, Y., Liou, Y-C., Wen, C-F.: Variant gradient projection methods for the minimization problems. Abstr. Appl. Anal. 2012 (792078), 21 (2012). doi:10.1155/2012/792078
Youla, D.: On deterministic convergence of iterations of relaxed projection mappings. J. Vis. Commun. Image Represent. 1, 12–20 (1990)
Acknowledgements
The author is very grateful to the Editor and the three anonymous referees for many insightful, detailed, and helpful comments which led to significant improvement of the previous version of the paper.
Shehu, Y. Approximation of Solutions to Constrained Convex Minimization Problem in Hilbert Spaces. Vietnam J. Math. 43, 515–523 (2015). https://doi.org/10.1007/s10013-014-0091-1