
One of the newest approaches in general NSO is to use the gradient sampling algorithms developed by Burke et al. [51, 52]. The gradient sampling method (GS) minimizes an objective function that is locally Lipschitz continuous and smooth on an open dense subset \(D \subset {\mathbb {R}^{n}}\); the objective may be nonsmooth and/or nonconvex. The GS may be considered a stabilized steepest descent algorithm. The central idea behind these techniques is to approximate the subdifferential of the objective function through random sampling of gradients near the current iterate. The ongoing development of gradient sampling algorithms (see e.g. [67]) suggests that they have the potential to rival bundle methods in terms of both theoretical strength and practical performance. However, here we introduce only the original GS [51, 52].

1 Gradient Sampling Method

Let \(f\) be a locally Lipschitz continuous function on \({\mathbb {R}^{n}}\), and suppose that \(f\) is smooth on an open dense subset \( D \subset \mathbb {R}^n\). In addition, assume that there exists a point \(\bar{\varvec{x}}\) such that the level set \({{\mathrm{lev}}}_{f(\bar{\varvec{x}})} = \{ {\varvec{x}}\mid f({\varvec{x}}) \le f(\bar{\varvec{x}})\}\) is compact.
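For instance (a simple illustration, not taken from [51, 52]), the function

\[
f({\varvec{x}}) = |x_1| + x_2^2, \qquad {\varvec{x}} = (x_1, x_2) \in \mathbb {R}^2,
\]

is locally Lipschitz continuous on \(\mathbb {R}^2\), smooth on the open dense subset \(D = \{ {\varvec{x}} \in \mathbb {R}^2 \mid x_1 \ne 0 \}\), and all of its level sets are compact.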

At a given iterate \({\varvec{x}}_k\), the gradient of the objective function is computed at a set of randomly generated nearby points \({\varvec{x}}_{kj}\) with \(j \in \{1,2,\ldots ,m\}\) and \(m>n+1\). These gradients are used to construct a search direction as the vector with the shortest norm in the convex hull of the sampled gradients. A standard line search is then applied to obtain a point with a lower objective function value. The stabilization of the method is controlled by the sampling radius \(\varepsilon _k\) used to sample the gradients.

The pseudo-code of the GS is the following:

[Algorithm figure: pseudo-code of the GS; not reproduced here.]
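Since the pseudo-code figure is not reproduced here, the following is a minimal Python sketch of the GS iteration under two simplifications: the sampling radius is kept fixed and gradients are assumed to be available at every sampled point. The helper min_norm_in_hull, the generic SLSQP call used for the min-norm subproblem, the Armijo-type backtracking line search, and all parameter values are illustrative choices, not part of the original algorithm in [51, 52], which additionally updates the sampling radius and handles points falling outside \(D\).

```python
import numpy as np
from scipy.optimize import minimize


def min_norm_in_hull(G):
    """Shortest vector in the convex hull of the rows of G (shape (m, n)).

    Solves  min ||G^T lam||^2  s.t.  lam >= 0, sum(lam) = 1  with a
    generic SLSQP call; a dedicated QP solver would normally be used.
    """
    m = G.shape[0]
    lam0 = np.full(m, 1.0 / m)
    res = minimize(lambda lam: np.sum((G.T @ lam) ** 2), lam0,
                   method="SLSQP",
                   bounds=[(0.0, 1.0)] * m,
                   constraints=[{"type": "eq",
                                 "fun": lambda lam: np.sum(lam) - 1.0}])
    return G.T @ res.x


def gradient_sampling(f, grad, x0, eps=0.1, m=None, beta=1e-4,
                      gamma=0.5, tol=1e-6, max_iter=200, rng=None):
    """Sketch of gradient sampling with a fixed sampling radius eps."""
    rng = np.random.default_rng(rng)
    x = np.asarray(x0, dtype=float)
    n = x.size
    m = m if m is not None else 2 * n + 2          # m > n + 1 sample points
    for _ in range(max_iter):
        # Draw m points uniformly from the eps-ball around the iterate and
        # collect their gradients together with the gradient at x itself.
        U = rng.normal(size=(m, n))
        U *= (eps * rng.uniform(size=(m, 1)) ** (1.0 / n)
              / np.linalg.norm(U, axis=1, keepdims=True))
        G = np.vstack([grad(x)] + [grad(x + u) for u in U])
        # Search direction: negative of the shortest vector in the convex
        # hull of the sampled gradients.
        g = min_norm_in_hull(G)
        if np.linalg.norm(g) <= tol:               # approximately stationary
            break                                  # for the current radius
        d = -g / np.linalg.norm(g)
        # Simple backtracking (Armijo) line search for a lower f-value.
        t = 1.0
        while (f(x + t * d) > f(x) - beta * t * np.linalg.norm(g)
               and t > 1e-12):
            t *= gamma
        x = x + t * d
    return x
```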

Note that the probability of obtaining a point \({\varvec{x}}_{kj}\notin D\) is zero in the above algorithm. In addition, it is reported in [52] that having \({\varvec{x}}_k+t_k {\varvec{d}}_k \notin D\) is highly unlikely in practice.

The GS algorithm may be applied to any function \(f:\mathbb {R}^n \rightarrow \mathbb {R}\) that is continuous on \(\mathbb {R}^n\) and differentiable almost everywhere. Furthermore, it has been shown that when \(f\) is locally Lipschitz continuous, smooth on an open dense subset \(D\) of \(\mathbb {R}^n\), and has bounded level sets, every cluster point \(\bar{{\varvec{x}}}\) of the sequence generated by the GS with fixed \(\varepsilon \) is \(\varepsilon \)-stationary with probability 1 (that is, \({\pmb 0}\in \partial _\varepsilon ^G f(\bar{{\varvec{x}}})\), see also Definition 3.3 in Part I). In addition, if \(f\) has a unique \(\varepsilon \)-stationary point \(\bar{{\varvec{x}}}\), then the set of all cluster points generated by the algorithm converges to \(\bar{{\varvec{x}}}\) as \(\varepsilon \) is reduced to zero.
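As a small illustration of this behaviour, the gradient_sampling sketch given after the pseudo-code above can be run with a fixed \(\varepsilon \) on a hypothetical nonsmooth test problem (not taken from [51, 52]); the iterates are then expected to settle near an (approximately) \(\varepsilon \)-stationary point, which for this convex example is close to the unique minimizer.

```python
import numpy as np

# f(x) = |x1| + 2*(x2 - 1)^2: locally Lipschitz, smooth off the line x1 = 0,
# with compact level sets and unique minimizer (0, 1).
f = lambda x: abs(x[0]) + 2.0 * (x[1] - 1.0) ** 2
grad = lambda x: np.array([np.sign(x[0]), 4.0 * (x[1] - 1.0)])

x_bar = gradient_sampling(f, grad, x0=[3.0, -2.0], eps=0.1, rng=0)
print(x_bar)   # expected to end up near the minimizer (0, 1)
```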