1 Introduction

In this paper, we consider the following nonsmooth, nonconvex optimization problem

$$ \left\{ \begin{aligned} \underset{x\in X}{\min} & f(x) \\ \text{s.t.} & g(x,t)\le 0, t \in T, \end{aligned}\right. $$
(1)

where \(f:\mathbb {R}^{n}\rightarrow \mathbb {R}\) is locally Lipschitz but potentially nonsmooth and nonconvex, and \(g:\mathbb {R}^{n}\times T\rightarrow {\mathbb {R}}\) is twice continuously differentiable on X × T. In addition, T is a fixed, nonempty compact subset of \(\mathbb {R}^{p}\) and X is a convex compact subset of \(\mathbb {R}^{n}\). This problem is called the semi-infinite programming (SIP) problem. For simplicity of notation, we consider the semi-infinite problem with one semi-infinite constraint and a one-dimensional index set T = [a,b], that is, p = 1. However, the algorithm proposed in this paper may also be useful for more general cases.
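To fix ideas, the following small sketch encodes a toy instance of (1); the particular f, g, box X, and interval T are our own illustration and are not taken from the cited references.

```python
import numpy as np

# Toy instance of problem (1); hypothetical data for illustration only.
a, b = 0.0, 1.0                       # index set T = [0, 1], i.e., p = 1
X_lb = np.array([-2.0, -2.0])         # box X = [-2, 2]^2 (convex and compact)
X_ub = np.array([ 2.0,  2.0])

def f(x):
    # locally Lipschitz, nonsmooth, nonconvex objective
    return max(x[0] ** 2, x[1]) - 0.5 * abs(x[0] - x[1])

def g(x, t):
    # twice continuously differentiable semi-infinite constraint g(x, t) <= 0
    return x[0] + t * x[1] - 1.0 + 0.1 * np.sin(np.pi * t) * x[0] ** 2

def G_sampled(x, grid=np.linspace(a, b, 201)):
    # sampled approximation of G(x) = max_{t in T} g(x, t); illustrative only,
    # the method below replaces it by the certified overestimate of Section 2.3
    return max(g(x, t) for t in grid)
```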

SIP has recently drawn a lot of attention due to its applications in many different fields, including portfolio optimization, robot trajectory planning, the vibrating membrane problem, minimal-cost control of air pollution, and optimization over probability measures (see [11, 15, 22]).

There exists a wide range of numerical methods for SIP problems. Traditional methods for SIP, such as discretization, exchange, and reduction-based methods [14, 21, 32, 39, 44], solve a sequence of finite nonlinear programming subproblems, that is, subproblems with finitely many constraints chosen from the original constraints. In particular, in the reduction-based methods, the finite subset of constraints is generated by implicitly finding all the local maxima t(x) of g(x,⋅) on T for each x ∈ X. Based on this reduction principle, a number of nonsmooth approaches have been presented [10, 24, 31]. These traditional methods suffer from the major drawback that they generally do not provide a feasible point in a finite number of iterations.

A breakthrough for SIP was achieved by Bhattacharjee et al. [4], who replaced the semi-infinite constraint by an overestimation obtained via interval extensions and presented a convergent upper bounding approach to SIP. To our knowledge, it is the first algorithm with feasible iterates for SIP. Based on this upper bounding approach, a number of algorithms for finding the global solution of SIP problems have been presented [5, 8, 27, 29, 30]. Under the assumptions that the functions f and g(⋅,t) are continuously differentiable on X for each t ∈ T, g(x,⋅) is continuous on T for each x ∈ X, there exist SIP Slater points arbitrarily close to all global minima, and the resulting finite nonlinear programs (NLPs) are solved to local optimality, the method using interval extensions with feasible iterates generates an 𝜖-optimal estimate of the global solution of the SIP on finite termination [5]. The main drawback of this method is the rapid increase in the size of the lower and upper bounding problems as nodes of increasing depth are visited by the branch-and-bound procedure. Other methods to generate the global solution of SIP problems were proposed by Mitsos and coworkers [8, 27, 30]. The upper bounding problems of these methods are not generated by interval extensions, but rather by a restriction of the constraints' right-hand side by a negative value (< −𝜖g, for some 𝜖g > 0). For a fixed point x, these methods use three subproblems: the lower bounding problem, the upper bounding problem, and the lower level problem. Under the assumptions that f is continuous on X, g is continuous on X × T, there exists an 𝜖f-optimal SIP Slater point, and the resulting finite NLPs are solved to global optimality, the algorithms terminate finitely with a feasible point and solve the SIP approximately to global optimality. We point out that the authors of [8, 27, 29, 30] require global solution of all subproblems, which is computationally expensive.

In contrast to these methods, the adaptive convexification method discussed in [9] does not focus on the global solution of the SIP problem, but on global solutions of the lower level problem. The authors construct convex relaxations of the lower level problem using the α BB method [1, 2] of global optimization, replace the convex lower level problems with their Karush-Kuhn-Tucker (KKT) conditions, and solve the resulting mathematical programs with complementarity constraints. Under the assumptions that \(f:\mathbb {R}^{n}\rightarrow \mathbb {R}\) and \(g:\mathbb {R}^{n}\times \mathbb {R}^{p}\rightarrow \mathbb {R}^{m}\) are twice continuously differentiable, the algorithm generates a feasible point and reaches an 𝜖-stationary point of the original SIP problem on finite termination. This method requires slightly stronger assumptions on the constraints, but can give tighter bounds than interval extensions. Based on the method discussed in [9], a number of convexification algorithms have been presented [36, 38, 41]. We point out that the method discussed in [41] relies on constructing concave relaxations of the lower level problem and solving the resulting approximate problems with finitely many constraints.

Bundle methods are among the most efficient methods for solving nonsmooth optimization problems. See [18, 20, 25, 35] for a more detailed discussion and comparison of some bundle methods for nonsmooth convex constrained optimization. When the objective function is nonconvex, the recent results [12, 13] make assumptions on the structure of the objective function and propose a redistributed bundle method for unconstrained nonsmooth, nonconvex optimization problems. Based on this redistributed method, many nonconvex bundle approaches have been suggested. The authors of [24, 43] propose proximal bundle methods for constrained nonsmooth, nonconvex optimization problems, where the objective and constraint functions are lower-C2. Lv et al. [23] propose an inexact proximal bundle method for constrained nonsmooth, nonconvex optimization problems, where the objective and constraint functions are lower-C1. Hoseini and Nobakhtian [16, 17] propose bundle methods for constrained nonsmooth, nonconvex optimization problems, where the objective and constraint functions are regular. The methods presented in [23, 24, 31] can be used to solve the SIP problem, but the calculation of the lower level problem in these methods is costly since the lower level problem needs to be solved to a given fixed accuracy at each point. Moreover, on finite termination, these methods are not guaranteed to furnish a feasible point.

In this paper, our aim is to design a feasible algorithm for nonsmooth, nonconvex SIP that combines some of the ideas of the proximal bundle methods and of the convexification methods. The main idea is to generate upper bounding problems for the SIP by replacing the lower level problem with its concave relaxations using the α BB method, and to solve the upper bounding problems by a new feasible proximal bundle method. The relaxed lower level problem is a finite nonsmooth problem. Under certain assumptions given in Theorem 6 below, it can be shown that every accumulation point of the iterative sequence is an 𝜖-stationary point of the original SIP problem. Under slightly stronger assumptions given in Theorem 7 below, every accumulation point of the iterative sequence is a local solution of the original SIP problem. These assumptions appeared in the nonsmooth, nonconvex bundle methods cited above. Compared to the related works [23, 24, 31], our algorithm starts with a coarse subset of T, which saves computational time on constraint evaluations. Moreover, we adopt an aggregation technique to control the size of the subproblems flexibly. In particular, our algorithm obtains a feasible point in finitely many iterations. Compared to [9, 36, 38, 41], the assumptions made on the structure of the objective function in this paper are weaker. We point out that the subproblems of our algorithm are solved by an existing QP solver and do not depend on the number of constraints of the upper bounding problem, whereas the subproblems of the algorithms from [9, 36, 38, 41] are solved by existing NLP solvers and do depend on the number of constraints of the upper bounding problem. Thus, our algorithm is naturally significantly less time-consuming when the number of subintervals/nodes increases. The results of the numerical experiments also show the good performance of the new method.

The remainder of this paper is organized as follows. The main concepts used throughout the paper are described in the next section, where we also state the methods to construct upper bounding functions of the constraints. Section 3 presents our feasible proximal bundle algorithm with convexification. Convergence of the proximal bundle method with convexification is addressed in Section 4. Section 5 presents some numerical results and Section 6 gives conclusions.

Our notation is fairly standard. The Euclidean inner product of two vectors \(x,y\in \mathbb {R}^{n}\) is denoted by 〈x,y〉 := xTy, and the associated norm by ∥⋅∥. The positive-part function is denoted by \(x^{+} := \max \limits \{x, 0\}\). For a set \(X\subset \mathbb {R}^{n}\), conv X denotes its convex hull. The closed ball with center \(x\in \mathbb {R}^{n}\) and radius r > 0 is denoted by B(x;r), that is, \({B}(x; r):=\{y\in \mathbb {R}^{n} | \|y-x\|\leq r\}\).

2 Preliminaries

In this section, we first recall some concepts and results related to nonsmooth analysis. Next, we give the necessary condition of the optimization measure. Then, we introduce the convexification method to construct the concave relaxation of the lower level problem.

2.1 Background

Definition 1

(Lipschitz continuity) A function \(f:\mathbb {R}^{n}\rightarrow \mathbb {R}\) is Lipschitz continuous on a set \(X\subset \mathbb {R}^{n}\) if there exists L > 0 such that

$$ |f(y)-f(z)|\leq L\|y-z\| \quad \text{for all}\ y,z\in X. $$

Then, L is called a Lipschitz constant for f on X.

Definition 2

The directional derivative of f at x in the direction \(d\in \mathbb {R}^{n}\) is defined by

$$ f^{\prime}(x;d):=\underset{h\downarrow 0}{\lim}\frac{f(x+hd)-f(x)}{h}. $$

Let \(f:\mathbb {R}^{n}\rightarrow \mathbb {R}\) be a locally Lipschitz continuous function at a point \(x\in \mathbb {R}^{n}\). Then, the generalized directional derivative (Clarke) of f at x in the direction of \(d\in \mathbb {R}^{n}\) is defined by

$$ f^{\circ}(x;d):=\underset{\begin{array}{cc}y\rightarrow x \\ h\downarrow 0 \end{array}}{\limsup} \frac{f(y+hd)-f(y)}{h}. $$

Definition 3

A function \(f:\mathbb {R}^{n}\rightarrow \mathbb {R}\) is called regular at \(x\in \mathbb {R}^{n}\) if it is locally Lipschitz continuous at x and for all \(d\in \mathbb {R}^{n}\) the classical directional derivative \(f^{\prime }(x;d)\) exists and we have \(f^{\prime }(x;d)=f^{\circ }(x;d)\).

Definition 4

A function \(f:\mathbb {R}^{n}\rightarrow \mathbb {R}\) is called semismooth at \(x\in \mathbb {R}^{n}\) if f is Lipschitz on a ball about x and, for each \(d\in \mathbb {R}^{n}\) and for any sequences \(\{h_{k}\}\subset \mathbb {R}^{+}\), \(\{\theta _{k}\}\subset \mathbb {R}^{n}\), and \(\{\xi _{k}\}\subset \mathbb {R}^{n}\) such that \(\{h_{k}\}\downarrow 0\), \(\{\theta _{k}/ h_{k}\}\rightarrow 0\in \mathbb {R}^{n}\), and \(\xi _{k}\in \partial f(x+h_{k}d+\theta _{k})\), the sequence {〈ξk,d〉} has exactly one accumulation point.

Definition 5

A locally Lipschitz function \(f:O\rightarrow \mathbb {R}\), where O is an open subset of \(\mathbb {R}^{n}\), is called lower-C1 (lower-C2) on O if, on some neighborhood V of each x0 ∈ O, there is a representation \(f(x)=\max \limits _{\omega \in {{\varOmega }}}f_{\omega }(x)\) in which the functions fω are (twice) continuously differentiable on V and the index set Ω is a compact space such that fω(x) and ∇fω(x) (∇2fω(x)) depend continuously not just on x ∈ V but jointly on (ω,x) ∈Ω× V.

Definition 6

Let \(f:\mathbb {R}^{n}\rightarrow \mathbb {R}\) be a locally Lipschitz continuous function at a point \(x\in \mathbb {R}^{n}\). Then, the subdifferential of f at x is defined by

$$ \partial f(x):=\{\xi\in\mathbb{R}^{n} \ | \ f^{\circ}(x;d)\geq\xi^{T}d \ \text{for all}\ d\in\mathbb{R}^{n}\}. $$

Each element ξf(x) is called a subgradient of f at x. If \(f:\mathbb {R}^{n}\rightarrow \mathbb {R}\) is a convex function, then the subdifferential of f at \(x\in \mathbb {R}^{n}\) is defined by

$$ \partial f(x):=\{ \xi \ | \ f(y)\geq f(x)+\langle\xi,y-x\rangle \ \text{for all}\ y\in\mathbb{R}^{n} \}. $$
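As a simple illustration (our own example), consider the convex function \(f(x)=|x|\) on \(\mathbb {R}\). Then \(f^{\circ }(0;d)=f^{\prime }(0;d)=|d|\), so \(\partial f(0)=\{\xi \in \mathbb {R} \ | \ |d|\geq \xi d \ \text{for all}\ d\in \mathbb {R}\}=[-1,1]\), while \(\partial f(x)=\{\text {sign}(x)\}\) for x ≠ 0.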

Theorem 1

[3, Theorem 3.3] Let \(f:\mathbb {R}^{n}\rightarrow \mathbb {R}\) be a locally Lipschitz continuous function at \(x\in \mathbb {R}^{n}\) with a Lipschitz constant L. Then, the subdifferential f(x) is a nonempty, convex, and compact set such that

$$ \partial f(x)\subseteq {{B}(0;L)}. $$

Theorem 2

[3, Theorem 3.23] Let \(f_{i}:\mathbb {R}^{n}\rightarrow \mathbb {R}\) be locally Lipschitz continuous at x for all i = 1,...,m. Then, the function

$$ f(x)=\max\{f_{i}(x) | i=1,...,m\} $$

is locally Lipschitz continuous at x and

$$ \partial f(x)\subseteq\text{conv} \{\partial f_{i}(x) | i\in\mathcal{I}(x)\}, $$
(2)

where

$$ \mathcal{I}(x):=\{i\in\{1,...,m\} | f_{i}(x)=f(x)\}. $$

In addition, if fi is regular at x for all i = 1,...,m, then f is also regular at x and equality holds in (2).

Now, we introduce the indicator function IX of a set \(X\subset \mathbb {R}^{n}\) as follows:

$$ I_{X}(x)=\left\{ \begin{aligned} 0 \ \ \ & \ \text{if}\ x\in X, \\ \infty\ \ & \ \text{if}\ x\notin X. \end{aligned}\right. $$

The indicator function IX of a set \(X\subset \mathbb {R}^{n}\) is convex if and only if X is convex. If the point x belongs to X, then

$$ \partial I_{X}(x)=N_{X}(x):=\{v\in\mathbb{R}^{n} | \langle v,y-x\rangle\leq0,\ \text{for all}\ y\in X\}. $$
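For instance (our own illustration), if \(X={\prod }_{i=1}^{n}[l_{i},u_{i}]\) is a box with li < ui and x ∈ X, then NX(x) consists of the vectors \(v\in \mathbb {R}^{n}\) with vi ≤ 0 if xi = li, vi ≥ 0 if xi = ui, and vi = 0 if li < xi < ui.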

From [26, Proposition 4] and [3, Theorem 3.13], we have the following:

Theorem 3

If \(f:X\rightarrow \mathbb {R}\) is continuously differentiable then f is semismooth and regular on X.

From [37, Proposition 2.4] and [7, Theorem 2], we have the following:

Proposition 1

Let O be an open set containing X. The following statements are equivalent:

1.:

f is semismooth and regular on O.

2.:

For all x ∈ O and all 𝜖 > 0, there exists ρ > 0 such that for all y ∈ B(x;ρ) and ξ ∈ ∂f(y)

$$ f(z)-f(y)\geq\langle\xi ,z-y\rangle-\epsilon\|z-y\|,\qquad z\in B(x;\rho). $$
3.:

f is lower-C1 on O.

Now we return to our basic problem (1), and give the following assumption.

Assumption 1

The Slater constraint qualification holds for problem (1), i.e., there exists \(\hat {x}\in \mathbb {R}^{n}\) such that \(g(\hat {x},t)<0\) for all tT.

Assumption 1 was used in [41]. In this paper, we always assume that the Slater constraint qualification holds.

2.2 Necessary conditions

To solve problem (1), we introduce the improvement function

$$ H(y,x):=\max\left\{f(y)-f(x) ,G(y)\right\},\ \ y\in\mathbb{R}^{n}, $$
(3)

where

$$ G(y)=\underset{t\in T}{\max} g(y,t). $$

Now, the subdifferential of G at a point x is defined by [6]

$$ \partial G(x)=\text{conv}\left\{\nabla_{x} g(x,t) | t\in A(x)\right\}, $$
(4)

where A(x) = {t ∈ T | G(x) = g(x,t)}.
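For illustration, the short sketch below evaluates the improvement function (3) and returns one subgradient of H(⋅,x) at y chosen via the max rule of Theorem 2 and formula (4); the grid approximation of G and all names are our own and serve only as an example.

```python
import numpy as np

def improvement_function(y, x, f, subgrad_f, g, grad_x_g, t_grid):
    # Sampled evaluation of H(y, x) = max{f(y) - f(x), G(y)} from (3), together
    # with one subgradient of H(., x) at y picked by the max rule (Theorem 2);
    # the grid approximation of G(y) = max_t g(y, t) is illustrative only.
    gvals = np.array([g(y, t) for t in t_grid])
    i_max = int(np.argmax(gvals))
    G_y = float(gvals[i_max])
    val_f = f(y) - f(x)
    if val_f >= G_y:                              # objective branch active
        return max(val_f, G_y), subgrad_f(y)
    return G_y, grad_x_g(y, t_grid[i_max])        # constraint branch, cf. (4)

# tiny toy oracles (hypothetical, for illustration only)
f = lambda x: max(x[0] ** 2, x[1])
subgrad_f = lambda x: np.array([2.0 * x[0], 0.0]) if x[0] ** 2 >= x[1] else np.array([0.0, 1.0])
g = lambda x, t: x[0] + t * x[1] - 1.0
grad_x_g = lambda x, t: np.array([1.0, t])

H_val, xi = improvement_function(np.array([0.3, 0.4]), np.array([0.2, 0.1]),
                                 f, subgrad_f, g, grad_x_g, np.linspace(0.0, 1.0, 101))
```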

Definition 7

[42](EMFCQ) A feasible point x is said to satisfy the extended Mangasarian-Fromovitz constraint qualification for SIP problem (1) if there exists a feasible direction d of X at x such that

$$ \nabla_{x} g(x,t)^{T} d<0,\ \text{for all}\ \ t\in T_{act}(x), $$
(5)

where Tact(x) = {t ∈ T | g(x,t) = 0}.

Theorem 4

Let \(x^{*}\) be a local minimizer of (1). Then the following statements hold:

(a):

\(0 \in \partial H(x^{*},x^{*}) + \partial I_{X}(x^{*})\).

(b):

There exist nonnegative multipliers λi, i = 0,...,n + 1, and indices \(t_{i}\in A(x^{*})\), i = 1,...,n + 1, such that \({{\sum }_{i=0}^{n+1}} \lambda _{i}=1\), \(\lambda _{i}g(x^{*},t_{i})=0\), i = 1,...,n + 1, and

$$ 0 \in\lambda_{0}\partial f(x^{*})+\sum\limits_{i=1}^{n+1}\lambda_{i}\nabla_{x} g(x^{*},t_{i})+\partial I_{X}(x^{*}). $$
(c):

If EMFCQ holds at \(x^{*}\) for problem (1), then there exist nonnegative multipliers λi and indices \(t_{i}\in T_{act}(x^{*})\), i = 1,...,n + 1, such that

$$ 0 \in \partial f(x^{*})+\sum\limits_{i=1}^{n+1}\lambda_{i}\nabla_{x} g(x^{*},t_{i})+\partial I_{X}(x^{*}). $$

Proof

  (a)

    Let F be the feasible region of the problem (1), namely, F = {x ∈ X | G(x) ≤ 0}. Since \(x^{*}\in \mathbb {R}^{n}\) is a local optimum of the problem (1), we have \(G(x^{*})\leq 0\) and there exists a neighborhood U of \(x^{*}\) such that \(f(x^{*})\leq f(x)\) for all \(x\in U\cap F\). Then, for all \(x\in U\cap F\subseteq X\)

    $$ H(x,x^{*})=\max\{f(x)-f(x^{*}) , G(x)\}\geq f(x)-f(x^{*})\geq 0=H(x^{*},x^{*}). $$

    This means that \(x^{*}\) is a local minimum of \(H(\cdot ,x^{*})\) on X, which implies \(0 \in \partial H(x^{*},x^{*}) + \partial I_{X}(x^{*})\).

  (b)

    If \(G(x^{*})<0\), we have \(\partial H(x^{*},x^{*})\subseteq \partial f(x^{*})\). Then, the assertion of the theorem is proved by choosing λ0 = 1 and λi = 0 for i = 1,...,n + 1. If \(G(x^{*})=0\), we have \(\partial H(x^{*},x^{*})\subseteq \text {conv}\{\partial f(x^{*})\cup \partial G(x^{*})\}\), where \(\partial G(x^{*}) = \text {conv}\{\nabla _{x} g(x^{*},t_{i}),\ t_{i}\in A(x^{*})\}\). Then, due to the definition of the convex hull and Carathéodory's theorem, there exist λi ≥ 0, \(i=0,1,\dots ,n+1\), such that

    $$ 0 \in\lambda_{0}\partial f(x^{*})+\sum\limits_{i=1}^{n+1}\lambda_{i}\nabla_{x} g(x^{*},t_{i})+\partial I_{X}(x^{*}). $$
  (c)

    Suppose that λ0 = 0; then we deduce that \(G(x^{*})=0\) and \(A(x^{*})=T_{act}(x^{*})\). Moreover, \(-{\sum }_{i=1}^{n+1}\lambda _{i}\nabla _{x} g(x^{*},t_{i})\in \partial I_{X}(x^{*})\). For any feasible direction d of X at \(x^{*}\), we have

    $$ \left\langle-\sum\limits_{i=1}^{n+1}\lambda_{i}\nabla_{x} g(x^{*},t_{i}),d\right\rangle\leq 0, \ \ t_{i}\in T_{act}(x^{*}), $$

    which contradicts (5). Thus, λ0 is strictly positive, and item (c) holds by setting λi := λi/λ0. □

Definition 8

\(x^{*}\) is called an 𝜖-KKT point of (1) if there exist nonnegative multipliers λi, i = 1,...,n + 1, such that

$$ 0\in \partial f(x^{*})+\sum\limits_{i=1}^{n+1}\lambda_{i}\nabla_{x} g(x^{*},t_{i})+\partial I_{X}(x^{*}), t_{i}\in T_{act}^{\epsilon}(x^{*}), $$

where \(T_{act}^{\epsilon }(x^{*})=\{t\in T | g(x^{*},t)\in [-\epsilon ,0]\}\).

2.3 The concave relaxation of the lower level problem

In this subsection, we consider the following lower level problem

$$ Q(x):\qquad\qquad \underset{t\in T}{\max} g(x,t).\qquad\qquad\qquad\qquad $$

We note that the lower level problem is easy to solve for those x ∈ X at which g(x,⋅) is convex on T: by Lemma 1 below, its maximum is then attained at an endpoint of T.

For \(N\in \mathbb {N}\), let a = t0 < t1 < ... < tN = b define a subdivision E of T:

$$ E:=\{t_{u} | u\in U^{\prime}=\{0,1,...,N\}\}. $$

Then, we have

$$ \underset{t\in T}{\max} g(x,t)= \underset{u\in U}{\max} \underset{t\in T_{u}}{\max} g(x,t), $$

where U = {1,2,...,N} and Tu = [tu− 1,tu]. Tu is called a subinterval of T and satisfies

$$ T=\underset{u\in U}{\bigcup} T_{u}. $$

The length of the subinterval Tu is defined by |Tu| = |tutu− 1| and the length of the subdivision E is defined by \(|E|=\max \limits _{u\in U}|T_{u}|\).

Following the approach of [41], for all uU, we generate the upper bound function gu of g on Tu,

$$ g_{u}(x,t)=g(x,t)+\frac{\alpha_{u}}{2}(t-\frac{t_{u-1}+t_{u}}{2})^{2},~~\forall~~x\in X. $$

In this case, the function gu is twice continuously differentiable with respect to t and the second-order derivative with respect to t of gu is

$$ {\nabla^{2}_{t}} g_{u}(x,t)={\nabla^{2}_{t}}{g}(x,t)+\alpha_{u}. $$

It follows that gu(x,t) is convex with respect to t on Tu if the parameter αu satisfies

$$ \alpha_{u}\geq\underset{(x,t)\in X\times T_{u}}{\max} \left\{-{\nabla_{t}^{2}} g(x,t)\right\}. $$

On the other hand, we require gu to be an upper bound function of g, so αu must also be nonnegative. Thus,

$$ \alpha_{u}\geq\max\left\{0 ,\underset{(x,t)\in X\times T_{u}}{\max} \left\{-{\nabla_{t}^{2}} g(x,t)\right\}\right\},~~\forall~~ u\in U. $$
(6)
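As a possible way to obtain such a parameter in practice, the sketch below estimates the right-hand side of (6) by sampling \(-{\nabla _{t}^{2}} g\); this is only a heuristic for illustration (the function names and data are our own), since a certified bound would require, e.g., interval arithmetic as in the α BB literature [1, 2].

```python
import numpy as np

def estimate_alpha(d2g_dt2, X_samples, t_lo, t_hi, n_t=50):
    # Heuristic estimate of alpha_u in (6):
    #   alpha_u >= max{0, max_{(x,t) in X x T_u} -d^2 g / dt^2 (x, t)}.
    # Sampling explores only finitely many points, so the returned value is an
    # estimate rather than a guaranteed bound (a certified bound needs interval tools).
    ts = np.linspace(t_lo, t_hi, n_t)
    worst = max(-d2g_dt2(x, t) for x in X_samples for t in ts)
    return max(0.0, worst)

# example with g(x, t) = x[0] + x[1] * sin(3 t), so d^2 g / dt^2 = -9 x[1] sin(3 t)
d2g = lambda x, t: -9.0 * x[1] * np.sin(3.0 * t)
X_samples = [np.array([a1, a2]) for a1 in (-2.0, 2.0) for a2 in (-2.0, 2.0)]
alpha_u = estimate_alpha(d2g, X_samples, 0.0, 1.0)
```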

Lemma 1

[41] Let f be a convex function and C = convS, where S is an arbitrary set of points. Then

$$ \sup\{f(x) | x\in C\}=\sup\{f(x) | x\in S\}, $$

where the first supremum is attained only when the second supremum is attained.

We know from Lemma 1 that, for every u ∈ U, the maximum of gu(x,⋅) over Tu must be attained on the boundary of Tu, that is,

$$ \begin{array}{@{}rcl@{}} \underset{t\in T_{u}}{\max} g_{u}(x,t)&=&\underset{t\in \{t_{u-1},t_{u}\}}{\max} g_{u}(x,t)=\max\left\{g(x,t_{u-1}),g(x,t_{u})\right\}\\ &&+\frac{\alpha_{u}}{8}(t_{u}-t_{u-1})^{2}, ~~\forall~~x\in X. \end{array} $$

Then, we have

$$ \underset{t\in \{t_{u-1},t_{u}\}}{\max} g(x,t)\leq \underset{t\in T_{u}}{\max} g(x,t)\leq\underset{t\in \{t_{u-1},t_{u}\}}{\max} g_{u}(x,t),~~\forall~~x\in X. $$

The upper bound function G(⋅,α,E) of G on X is defined by

$$ G(x,\alpha,E)=\underset{u\in U}{\max} \underset{t\in T_{u}}{\max} g_{u}(x,t). $$

It can be rewritten as

$$ G(x,\alpha,E)=\underset{t_{u}\in E}{\max} \left\{g(x,t_{u})+h(t_{u},\alpha_{u})\right\}, $$
(7)

where

$$ h(t_{u},\alpha_{u})=\left\{ \begin{aligned} &\frac{\alpha_{u+1}}{8}({t_{u+1}-t_{u}})^{2},&\text{if}\ \ u=0,\qquad\ \\ &\max\left\{\frac{\alpha_{u}}{8}({t_{u} -t_{u-1}})^{2}\ ,\frac{\alpha_{u+1}}{8}({t_{u+1}-t_{u}})^{2}\right\},& \text{if}\ \ 0<u<N, \\ &\frac{\alpha_{u}}{8}({t_{u} -t_{u-1}})^{2},&\text{if}\ \ u=N,\qquad \end{aligned} \right. $$
(8)

where the parameter αu satisfies (6). We define

$$ \bar{A}(x,\alpha,E)=\{t_{u}\in E | G(x,\alpha,E)=g(x,t_{u})+h(t_{u},\alpha_{u})\}, $$
(9)

and

$$ D(E,\alpha)=\underset{u\in U}{\max} \frac{\alpha_{u}}{8}({t_{u} -t_{u-1}})^{2}. $$
(10)

Then, we can obtain that

$$ G(x)\leq G(x,\alpha,E)\leq G(x)+D(E,\alpha). $$
(11)
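For illustration, the following sketch evaluates the overestimate G(x,α,E) of (7)-(8), the gap D(E,α) of (10), and the active node set (9) for a given subdivision; the helper name and the sample data are our own.

```python
import numpy as np

def upper_bound_G(x, g, E, alpha, tol=1e-12):
    # G(x, alpha, E) of (7)-(8), D(E, alpha) of (10), and A_bar(x, alpha, E) of (9).
    # E = [t_0, ..., t_N] is the subdivision, alpha = [alpha_1, ..., alpha_N].
    E, alpha = np.asarray(E, float), np.asarray(alpha, float)
    N = len(E) - 1
    bump = alpha * (E[1:] - E[:-1]) ** 2 / 8.0       # alpha_u / 8 * |T_u|^2
    h = np.empty(N + 1)                               # h(t_u, alpha_u) from (8)
    h[0], h[N] = bump[0], bump[N - 1]
    if N > 1:
        h[1:N] = np.maximum(bump[:N - 1], bump[1:N])
    vals = np.array([g(x, t) for t in E]) + h
    G_val = float(vals.max())
    active = [float(E[u]) for u in range(N + 1) if vals[u] >= G_val - tol]   # (9)
    return G_val, float(bump.max()), active           # value, D(E, alpha), A_bar

# sample data: g(x, t) = x[0] + x[1] * sin(3 t) on T = [0, 1];
# alpha = 18 satisfies (6) when X = [-2, 2]^2
g = lambda x, t: x[0] + x[1] * np.sin(3.0 * t)
E = np.linspace(0.0, 1.0, 6)
alpha = np.full(len(E) - 1, 18.0)
G_val, D, A_bar = upper_bound_G(np.array([0.2, 0.5]), g, E, alpha)
```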

Now the upper bounding problem of the original problem is defined by

$$ \left\{\begin{aligned} \underset{x\in X}{\min} & f(x) \\ \text{s.t.} & G(x,\alpha,E)\le 0. \end{aligned}\right. $$
(12)

Using the improvement function mentioned in the preceding section, we define \(H(\cdot ,x,\alpha ,E):\mathbb {R}^{n}\rightarrow \mathbb {R}\) as follows:

$$ H(y,x,\alpha,E):=\max\left\{ f(y)-f(x) , G(y,\alpha,E) \right\}, $$

where α and E vary along the iterations.

Lemma 2

[41] If EMFCQ holds for all feasible points of the original problem (1), then the MFCQ holds for all feasible points of the approximation problem (12) when |E| is sufficiently small.

Theorem 5

Suppose that \(f: X\rightarrow \mathbb {R}\) and \(g: X\times T\rightarrow \mathbb {R}\) are locally Lipschitz continuous. Then the following statements hold:

1.:

If \(x^{*}\) is a local minimizer of (12), then \(0 \in \partial H(x^{*},x^{*},\alpha ,E) + \partial I_{X}(x^{*})\).

2.:

If \(0 \in \partial H(x^{*},x^{*},\alpha ,E) + \partial I_{X}(x^{*})\) and \(G(x^{*},\alpha ,E)\leq 0\), then there exist nonnegative multipliers λi, i = 0,...,n + 1, and indices \(t_{i}\in \bar {A}(x^{*},\alpha ,E), i=1,...,n+1\), such that \({{\sum }_{i=0}^{n+1}} \lambda _{i}=1\), \(\lambda _{i}(g(x^{*},t_{i})+h(t_{i},\alpha _{i}))=0\), i = 1,...,n + 1, and

$$ 0 \in\lambda_{0}\partial f(x^{*})+\sum\limits_{i=1}^{n+1}\lambda_{i}\nabla_{x} g(x^{*},t_{i})+\partial I_{X}(x^{*}). $$

If, in addition, MFCQ holds at \(x^{*}\) for problem (12), then there exist nonnegative multipliers λi and indices \(t_{i}\in \bar {T}_{act}(x^{*},\alpha ,E), i=1,...,n+1\), such that

$$ 0 \in \partial f(x^{*})+\sum\limits_{i=1}^{n+1}\lambda_{i}\nabla_{x} g(x^{*},t_{i})+\partial I_{X}(x^{*}), $$

where \(\bar {T}_{act}(x^{*},\alpha ,E)=\{t_{u}\in E | g(x^{*},t_{u})+h(t_{u},\alpha _{u})=0\}.\) If, in addition, D(E,α) ≤ 𝜖, then \(x^{*}\) is an 𝜖-KKT point of problem (1).

Proof

The proof of this theorem is analogous to that of Theorem 4; we only need to show that \(x^{*}\) is an 𝜖-KKT point of the original problem if D(E,α) ≤ 𝜖. It suffices to prove that if \(t_{i}\in \bar {T}_{act}(x^{*},\alpha ,E)\), i = 1,...,n + 1, then \(t_{i}\in T_{act}^{\epsilon }(x^{*})\), i = 1,...,n + 1, provided that the condition D(E,α) ≤ 𝜖 holds. For each \(t_{i}\in \bar {T}_{act}(x^{*},\alpha ,E)\), we have

$$ g(x^{*},t_{i})+\frac{\alpha_{\bar{i}}}{8}|T_{\bar{i}}|^{2}=0, $$

where

$$ \bar{i}=\left\{ \begin{aligned} i & \text{if} \frac{\alpha_{i}}{8}|T_{i}|^{2}\geq \frac{\alpha_{i+1}}{8}|T_{i+1}|^{2}, \\ i+1 & \text{otherwise}, \end{aligned} \right. $$

where, by convention, \(|T_{0}| = 0\) and \(|T_{N+1}| = 0\). It implies that \(g(x^{*},t_{i})\leq 0\) and

$$ g(x^{*},t_{i})=-\frac{\alpha_{\bar{i}}}{8}|T_{\bar{i}}|^{2}\geq -D(E,\alpha) \geq -\epsilon. $$

It follows that \(t_{i}\in T_{act}^{\epsilon }(x^{*})\), i = 1,...,n + 1, if the condition D(E,α) ≤ 𝜖 holds. □

3 The numerical approach

Let k be the current iteration index and xk be the stability center (or serious iterate) at iteration k, where iterations generate a sequence of trial points \(\{y^{j}\}_{j\in J_{k}^{ora}}\), with \(J_{k}^{ora}\subseteq \{0,1,...,k\}\). In particular, \(x^{k}\in \{y^{j}\}_{j\in J_{k}^{ora}}\). Let \(E^{k}=\{t_{u} | u\in U^{\prime }_{k}=\{0,1,...,N_{k}\}\}\) be the current subdivision of T and αk be the corresponding convexification parameter.

Following the redistributed proximal approach [12, 13], we generate the augmented functions of f and G(⋅,αk,Ek) defined by

$$ \begin{array}{@{}rcl@{}} f^{k}(\cdot):= &\ f_{ }(\cdot) +\frac{{\eta_{1}^{k}}}{2}\|\cdot-x^{k}\|^{2}, \\ g^{k}(\cdot):= &\ G(\cdot,\alpha^{k},E^{k})+\frac{{\eta_{2}^{k}}}{2}\|\cdot-x^{k}\|^{2}. \end{array} $$

The idea is to utilize the augmented functions in the model construction. We can first construct a convex piecewise linear approximation defined by

$$ \begin{array}{@{}rcl@{}} \hat{H}_{k}(y) &:=& \underset{j\in J_{k}^{ora}}{\max} \left\{ f^{k}(y^{j})+ \left\langle\xi_{f}^{j} +{\eta_{1}^{k}}(y^{j}-x^{k}),y-y^{j}\right\rangle-f(x^{k}),\right. \\ &&\left. g^{k}(y^{j})+ \left\langle\xi_{g}^{k,j}+{\eta_{2}^{k}}(y^{j}-x^{k}),y-y^{j}\right\rangle\right\}, \end{array} $$

where \({\xi ^{j}_{f}}\in \partial f(y^{j})\) and \(\xi ^{k,j}_{g}\in \partial G(y^{j},\alpha ^{k},E^{k})\). It can be rewritten in an equivalent form

$$ \hat{H}_{k}(y) = G^{+}(x^{k},\alpha^{k},E^{k})+\underset{j\in J_{k}^{ora}}{\max} \left\{ -c_{k,j}^{f}+\langle s_{f}^{k,j},y-x^{k}\rangle, -c_{k,j}^{g}+\langle s_{g}^{k,j},y-x^{k}\rangle\right\}. $$
(13)

where

$$ \begin{aligned} & s_{f}^{k,j}:={\xi_{f}^{j}} \ +{\eta_{1}^{k}} {{{\varDelta}}^{j}_{k}} \ \ \text{and}\ \ c_{k,j}^{f} := e_{k,j}^{f}+{\eta_{1}^{k}} {b^{j}_{k}}+G^{+}(x^{k},\alpha^{k},E^{k}), \\ & s_{g}^{k,j}:=\xi_{g}^{k,j}+{\eta_{2}^{k}} {{{\varDelta}}^{j}_{k}} \ \ \text{and}\ \ c_{k,j}^{g} := e_{k,j}^{g}+{\eta_{2}^{k}} {b^{j}_{k}}+G^{+}(x^{k},\alpha^{k},E^{k}) - G(x^{k},\alpha^{k},E^{k}), \end{aligned} $$
(14)

with

$$ \begin{aligned} & e_{k,j}^{f}:=f(x^{k})-f(y^{j})-{\langle\xi_{f}^{j}},x^{k}-y^{j}\rangle, \\ & e_{k,j}^{g}:=G(x^{k},\alpha^{k},E^{k})-G(y^{j},\alpha^{k},E^{k})-\langle\xi_{g}^{k,j},x^{k}-y^{j}\rangle,\\ & {b_{k}^{j}}:=\frac{1}{2}\|y^{j}-x^{k}\|^{2},\ \ {{{\varDelta}}_{k}^{j}}:=y^{j}-x^{k}. \end{aligned} $$
(15)

Any choice of the parameters \({\eta _{1}^{k}}\), \({\eta _{2}^{k}}\) that keeps \(c_{k,j}^{f},c_{k,j}^{g}\) in (13) nonnegative is acceptable. In our method, we take

$$ {\eta_{1}^{k}}= \max\left\{\underset{\begin{array}{cc}j\in J_{k}^{ora} \\ y^{j}\neq x^{k} \end{array}}{\max}\left\{- \frac{\ e_{k,j}^{f}\ }{{b_{k}^{j}}}\right\}, 0\right\}+\gamma,\ \ {\eta_{2}^{k}}= \max\left\{\underset{\begin{array}{cc}j\in J_{k}^{ora} \\ y^{j}\neq x^{k} \end{array}}{\max}\left\{- \frac{\ e_{k,j}^{g}\ }{{b_{k}^{j}}}\right\}, 0\right\}+\gamma, $$
(16)

where γ is a small positive parameter. In particular, since \(x^{k}\in \{y^{j}\}_{j\in J_{k}^{ora}}\), we obtain that

$$ \hat{H}_{k}(x^{k})=G^{+}(x^{k},\alpha^{k},E^{k})=H(x^{k},x^{k},\alpha^{k},E^{k}). $$
(17)
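For illustration, the sketch below assembles the linearization errors (15), the parameters (16), and the cut coefficients (14) from a bundle of oracle information; the data layout (a list of dictionaries) and all names are our own.

```python
import numpy as np

GAMMA = 1e-3   # the small positive parameter gamma of (16)

def cut_data(bundle, xk, f_xk, G_plus_xk, G_xk):
    # Each bundle entry is a dict with keys
    #   'y', 'f', 'xi_f' : trial point y^j, f(y^j), a subgradient of f at y^j,
    #   'G', 'xi_g'      : G(y^j, alpha^k, E^k) and a subgradient of G(., alpha^k, E^k).
    e_f, e_g, b, Delta = [], [], [], []
    for cut in bundle:                                     # linearization errors (15)
        d = xk - cut['y']
        e_f.append(f_xk - cut['f'] - cut['xi_f'] @ d)
        e_g.append(G_xk - cut['G'] - cut['xi_g'] @ d)
        b.append(0.5 * float(np.dot(cut['y'] - xk, cut['y'] - xk)))
        Delta.append(cut['y'] - xk)
    # parameters (16); entries with y^j = x^k (i.e., b_j = 0) are skipped
    eta1 = max([-ef / bj for ef, bj in zip(e_f, b) if bj > 0] + [0.0]) + GAMMA
    eta2 = max([-eg / bj for eg, bj in zip(e_g, b) if bj > 0] + [0.0]) + GAMMA
    # cut coefficients (14)
    s_f = [cut['xi_f'] + eta1 * D for cut, D in zip(bundle, Delta)]
    s_g = [cut['xi_g'] + eta2 * D for cut, D in zip(bundle, Delta)]
    c_f = [ef + eta1 * bj + G_plus_xk for ef, bj in zip(e_f, b)]
    c_g = [eg + eta2 * bj + G_plus_xk - G_xk for eg, bj in zip(e_g, b)]
    return s_f, c_f, s_g, c_g, eta1, eta2
```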

Then, we seek the new point yk+ 1 as the solution of

$$ \left\{\begin{aligned} \min & \hat{H}_{k}(y)+\frac{1}{2}\mu_{k}\|y-x^{k}\|^{2} \\ \text{s.t.} & y\in X, \end{aligned}\right. $$
(18)

where μk > 0 is a proximal parameter.

The following lemma characterizes the solution of (18).

Lemma 3

Let yk+ 1 be the unique solution to (18) and assume μk > 0. Then, we have that

$$ y^{k+1}=x^{k}-\frac{1}{\mu_{k}}(S^{k+1}+v^{k+1}),\ \ \text{where}\ S^{k+1}\in\partial \hat{H}_{k}(y^{k+1}),\ v^{k+1}\in \partial I_{X}(y^{k+1}). $$
(19)

Lemma 3 implies that

$$ \langle v^{k+1},y^{k+1}-x^{k}\rangle{\geq}0. $$
(20)

The linearization error of \(\hat {H}_{k}(\cdot )\) is defined by

$$ C^{k+1}:=\hat{H}_{k}(x^{k})-\hat{H}_{k}(y^{k+1})-\langle S^{k+1},x^{k}-y^{k+1}\rangle\geq 0. $$
(21)

Since the piecewise linear model \(\hat {H}_{k}\) is convex, we obtain the following affine lower approximation of \(\hat {H}_{k}\):

$$ \hat{H}_{k}(y)\geq\hat{H}_{k}(y^{k+1})+\langle S^{k+1},y-y^{k+1}\rangle=G^{+}(x^{k},\alpha^{k},E^{k})+\langle S^{k+1},y-x^{k}\rangle-C^{k+1}. $$

Hence, the cutting plane model can be redefined in the form

$$ \begin{array}{lll} \hat{H}_{k}(y) &= & G^{+}(x^{k},\alpha^{k},E^{k})+\max\left\{\underset{j\in J_{k}^{agg}}{\max}\left\{-C^{j}+\langle S^{j},y-x^{k}\rangle\right\},\right.\\ && \left.\underset{j\in J_{k}^{ora}}{\max}\left\{ -c_{k,j}^{f}+\langle s_{f}^{k,j},y-x^{k}\rangle, -c_{k,j}^{g}+\langle s_{g}^{k,j},y-x^{k}\rangle\right\}\right\}, \end{array} $$
(22)

where \(J_{k}^{ora}\subseteq \{0,1,...,k\}\) and \(J_{k}^{agg}\subseteq \{1,...,k\}\). Then, (18) can be stated as a quadratic programming problem in \(\mathbb {R}^{1}\times X\):

$$ \text{compute }(r,y) \\ := \left\{ \begin{aligned} \arg \min & r+\frac{\mu_{k}}{2}\|y-x^{k}\|^{2} \\ \text{s.t.}\ \ & G^{+}(x^{k},\alpha^{k},E^{k})-c_{k,j}^{f}+\langle s_{f}^{k,j},y-x^{k}\rangle\leq r, j\in J^{ora}_{k},\\ & G^{+}(x^{k},\alpha^{k},E^{k})-c_{k,j}^{g}+\langle s_{g}^{k,j},y-x^{k}\rangle\leq r, j\in J^{ora}_{k},\\ & G^{+}(x^{k},\alpha^{k},E^{k})- C^{j} +\langle S^{j} ,y-x^{k}\rangle\leq r, j\in J^{agg}_{k}. \end{aligned}\right. $$
(23)

From the KKT conditions for (23), one easily obtains a representation for Sk+ 1 and Ck+ 1 (defined by (19) and (21)):

$$ \begin{aligned} S^{k+1} & =\underset{j\in J_{k}^{ora}}{\sum} \left( \lambda_{1,k+1}^{j} s_{f}^{k,j}+\lambda_{2,k+1}^{j} s_{g}^{k,j}\right)+\underset{j\in J_{k}^{agg}}{\sum} \lambda_{3,k+1}^{j} {S^{j}}, \\ C^{k+1} & =\underset{j\in J_{k}^{ora}}{\sum} \left( \lambda_{1,k+1}^{j} c_{k,j}^{f}+\lambda_{2,k+1}^{j} c_{k,j}^{g}\right)+\underset{j\in J_{k}^{agg}}{\sum} \lambda_{3,k+1}^{j} {C^{j}}, \end{aligned} $$
(24)

and (λ1,k+ 1,λ2,k+ 1,λ3,k+ 1) is a solution to

$$ \left\{ \begin{aligned} \min & \frac{1}{2}\left\|\underset{j\in J_{k}^{ora}}{\sum} \left( \lambda_{1,k+1}^{j} s_{f}^{k,j}+\lambda_{2,k+1}^{j} s_{g}^{k,j}\right)+\underset{j\in J_{k}^{agg}}{\sum}\lambda_{3,k+1}^{j} S^{j} \right\|^{2} \\ & + {\mu_{k}}\left( \underset{j\in J_{k}^{ora}}{\sum} \left( \lambda_{1,k+1}^{j} c_{k,j}^{f}+\lambda_{2,k+1}^{j} c_{k,j}^{g}\right)+\underset{j\in {J_{k}^{agg}}}{\sum} \lambda_{3,k+1}^{j} C^{j}\right)\\ \text{s.t.} & \ \ \underset{j\in J_{k}^{ora}}{\sum}\left( \lambda_{1,k+1}^{j} +\lambda_{2,k+1}^{j} \right)+\underset{j\in J_{k}^{agg}}{\sum}\lambda_{3,k+1}^{j} =1,\\ & \lambda_{1,k+1}^{j},\ \lambda_{2,k+1}^{j}\geq 0 \text{for all} j\in J^{ora}_{k} \text{and} \lambda_{3,k+1}^{j}\geq 0 \text{for all} j\in J^{agg}_{k}. \end{aligned}\right. $$
(25)
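For illustration, a minimal sketch of how the primal subproblem (23) can be solved when X is a box is given below. It assumes the cvxpy modelling package (not used in the paper) and collects the f-cuts, g-cuts, and aggregate cuts in a single list, so the names and data layout are our own; the dual values returned for the cut constraints correspond, up to the solver's sign conventions, to the multipliers λ in (24)-(25).

```python
import cvxpy as cp
import numpy as np

def solve_qp_subproblem(xk, mu, G_plus, cuts, X_lb, X_ub):
    # Primal QP (23) over a box X = [X_lb, X_ub]; `cuts` is a list of pairs
    # (c_j, s_j), each encoding a constraint  G_plus - c_j + <s_j, y - x^k> <= r.
    n = xk.size
    y, r = cp.Variable(n), cp.Variable()
    cut_cons = [G_plus - c + s @ (y - xk) <= r for (c, s) in cuts]
    box_cons = [y >= X_lb, y <= X_ub]
    prob = cp.Problem(cp.Minimize(r + 0.5 * mu * cp.sum_squares(y - xk)),
                      cut_cons + box_cons)
    prob.solve()
    lambdas = [float(c.dual_value) for c in cut_cons]   # multipliers, cf. (24)-(25)
    return np.asarray(y.value), float(r.value), lambdas

# usage with two illustrative cuts around x^k = (0, 0)
xk = np.zeros(2)
cuts = [(0.0, np.array([1.0, 0.0])), (0.1, np.array([0.0, -1.0]))]
y_new, r_new, lam = solve_qp_subproblem(xk, mu=1.0, G_plus=0.0, cuts=cuts,
                                        X_lb=-np.ones(2), X_ub=np.ones(2))
```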

To measure the quality of the candidate, the nominal decrease δk+ 1 is defined by:

$$ \delta_{k+1}:=C^{k+1}+\frac{1}{\mu_{k}}\|S^{k+1}+v^{k+1}\|^{2}\geq0. $$
(26)

According to (20), (21), and Lemma 3, we can obtain that

$$ \begin{array}{lll} \delta_{k+1}&=&H(x^{k},x^{k},\alpha^{k},E^{k})-\hat{H}_{k}(y^{k+1})-\langle v^{k+1},y^{k+1}-x^{k}\rangle\\ &\leq& H(x^{k},x^{k},\alpha^{k},E^{k})-\hat{H}_{k}(y^{k+1}). \end{array} $$
(27)

We now present the criterion that determines whether a new stability center is generated:

$$ \max\left\{ f(y^{k+1})-f(x^{k}) ,G(y^{k+1},\alpha^{k},E^{k}) \right\}-G^{+}(x^{k},\alpha^{k},E^{k})\leq -m \delta_{k+1}, $$
(28)

where m is a positive descent parameter. If (28) holds, the trial point yk+ 1 brings a sufficient descent for the objective or constraint function. Then, the stability center will be moved to yk+ 1, i.e., xk+ 1 := yk+ 1. This case is called a serious step (or descent step). Otherwise, the trial point yk+ 1 is used to enrich the model and the stability center is not changed. This case is called a null step.
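A minimal sketch of this serious/null decision is given below; the function name and the numbers in the usage line are our own placeholders.

```python
def descent_test(f_y, f_xk, G_y_new, G_plus_xk, delta, m):
    # Criterion (28): accept y^{k+1} as the new stability center when it
    # decreases the improvement function by at least m * delta_{k+1}.
    return max(f_y - f_xk, G_y_new) - G_plus_xk <= -m * delta

# usage: serious step if True (set x^{k+1} = y^{k+1}), otherwise null step
serious = descent_test(f_y=1.20, f_xk=1.30, G_y_new=-0.05,
                       G_plus_xk=0.0, delta=0.5, m=0.1)
```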

In the following, we introduce a subdivision strategy that refines the current subdivision and constructs a refined approximation of the lower level problem.

Refinement procedure

  1.

    For a given subset Tu = [tu− 1,tu], let αu be the corresponding convexification parameter.

  2.

    Define

    $$ E^{\alpha BB}(T_{{u}})=\left\{t_{{u},i} | t_{{u},i}=t_{{u}-1}+\frac{i}{2s+1}(t_{{u}}-t_{{u}-1}),i=1,...,2s, s\in\mathbb{N}\right\}. $$
    (29)
  3.

    Put Tu,i = [tu,i− 1,tu,i], i = 1,..., 2s + 1, where tu,0 = tu− 1 and tu,2s+ 1 = tu.

  4.

    Determine the corresponding convexification parameters αu,i, i = 1,..., 2s + 1, such that \(\alpha _{u,i}\leq \alpha _{u}\) and

    $$ \alpha_{{u},i}\geq\max\left\{0,\underset{(x,t)\in X\times T_{{u},i}}{\max} \left\{-{\nabla_{t}^{2}} g(x,t)\right\}\right\}. $$
    (30)

Remark 1

Note that the computation of αu,i involves a global optimization problem. However, we can use an upper bound for the right-hand side in (30); in particular, we can choose αu,i = αu. Therefore, the computation of the globally optimal value of the lower level problem \(\max \limits _{t\in T}\{g(x,t)\}\) requires more computational effort than the computation of αu,i.
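For illustration, the refinement of one subinterval, using the safe choice \(\alpha _{u,i}=\alpha _{u}\) from Remark 1, can be sketched as follows (the function name is our own):

```python
import numpy as np

def refine(t_lo, t_hi, alpha_u, s=1):
    # Refinement of T_u = [t_lo, t_hi]: the 2s interior nodes of (29) and,
    # following Remark 1, alpha_{u,i} = alpha_u for the 2s+1 new subintervals,
    # which satisfies (30) since alpha_u already bounds -d^2g/dt^2 on X x T_u.
    i = np.arange(1, 2 * s + 1)
    new_nodes = t_lo + i / (2 * s + 1) * (t_hi - t_lo)    # E^{alphaBB}(T_u)
    new_alphas = np.full(2 * s + 1, float(alpha_u))
    return new_nodes, new_alphas

nodes, alphas = refine(0.0, 0.2, alpha_u=18.0, s=1)       # splits T_u into 3 pieces
```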

Now, we show that our refinement procedure brings a sufficient descent for the new constraint function.

Lemma 4

Assume that Tu,i = [tu,i− 1,tu,i] and αu,i, i = 1,..., 2s + 1, are obtained by applying the refinement procedure to the subset Tu = [tu− 1,tu]. Then, we have

$$ g(x,t)\leq g_{1,i}(x,t)\leq g_{2}(x,t) \quad \text{for all}\ t\in[t_{u,i-1},t_{u,i}],\ i=1,...,2s+1, $$

where

$$ \begin{array}{@{}rcl@{}} g_{1,i}(x,t)&= & g(x,t)+\frac{\alpha_{u,i}}{2}\left( t-\frac{t_{u,i-1}+t_{u,i}}{2}\right)^{2},\\ g_{2}(x,t)&= & g(x,t)+\frac{\alpha_{u}}{2}\left( t-\frac{t_{u-1}+t_{u}}{2}\right)^{2}. \end{array} $$

Proof

The first inequality can be obtained directly from the definitions of g1,i, i = 1,..., 2s + 1. We set \(\psi _{i}(t)=\frac {\alpha _{u,i}}{2}(t-\frac {t_{u,i-1}+t_{u,i}}{2})^{2}-\frac {\alpha _{u}}{2}(t-\frac {t_{u-1}+t_{u}}{2})^{2}\). Then, we only need to prove that ψi(t) ≤ 0 for any i = 1,..., 2s + 1. We note that \(|t-\frac {t_{u,i-1}+t_{u,i}}{2}|\leq \frac {t_{u}-t_{u-1}}{2(2s+1)}\leq \frac {t_{u-1}+t_{u}}{2}-t\) for all t ∈ [tu,i− 1,tu,i], i = 1,...,s. Then, for any i ≤ s, we have

$$ \begin{array}{@{}rcl@{}} \nabla_{t} \psi_{i}(t)& = &\ \alpha_{u,i}\left( t-\frac{t_{u,i-1}+t_{u,i}}{2}\right)-\alpha_{u}\left( t-\frac{t_{u-1}+t_{u}}{2}\right)\\ &= &\ \alpha_{u,i}\left( t-\frac{t_{u,i-1}+t_{u,i}}{2}\right)+\alpha_{u}\left( \frac{t_{u-1}+t_{u}}{2}-t\right) \\ &\geq &\ 0. \end{array} $$

Therefore, we conclude that ψi(t) is nondecreasing on [tu,i− 1,tu,i], i = 1,...,s. It implies that ψi(t) ≤ ψi(tu,i) for all t ∈ [tu,i− 1,tu,i], i = 1,...,s. For any i ≤ s, we have

$$ \begin{array}{@{}rcl@{}} \psi_{i}(t_{u,i}) &= &\ \frac{\alpha_{u,i}}{2}\left( \frac{t_{u,i}-t_{u,i-1}}{2}\right)^{2}-\frac{\alpha_{u}}{2}\left( t_{u,i}-\frac{t_{u-1}+t_{u}}{2}\right)^{2} \\ &\leq &\ \frac{\alpha_{u,i}}{2}\left( \left( \frac{t_{u,i}-t_{u,i-1}}{2}\right)^{2}-\left( t_{u,i}-\frac{t_{u-1}+t_{u}}{2}\right)^{2}\right)\\ &= &\ \frac{\alpha_{u,i}}{2}\left( \left( \frac{t_{u,i}-t_{u,i-1}}{2}\right)^{2}-\left( \frac{t_{u}-t_{u-1}}{2(2s+1)}\right)^{2}\right) \\ &\leq &\ 0. \end{array} $$

Then, we have ψi(t) ≤ 0 for any t ∈ [tu,i− 1,tu,i], i = 1,...,s. Similarly, we can prove that ψi(t) ≤ 0 for any t ∈ [tu,i− 1,tu,i], i = s + 2,..., 2s + 1. For i = s + 1, we have

$$ \begin{array}{@{}rcl@{}} \psi_{i}(t) &= & \frac{\alpha_{u,i}}{2}\left( t-\frac{t_{u,i-1}+t_{u,i}}{2}\right)^{2}-\frac{\alpha_{u}}{2}\left( t-\frac{t_{u-1}+t_{u}}{2}\right)^{2} \\ &= & \left( \frac{\alpha_{u,i}}{2}-\frac{\alpha_{u}}{2}\right)\left( t-\frac{t_{u-1}+t_{u}}{2}\right)^{2}\\ &\leq & 0. \end{array} $$

The proof is complete. □

Now, we present the algorithm in full detail.

Algorithm 1

(Feasible proximal bundle method with convexification for SIP problems) Data. m ∈ (0, 1), tol1 ≥ 0, tol2 > 0, tol3 ≥ 0, \(\mu _{\max \limits }>\mu _{\min \limits }>0\), \(\mu _{0}\in [\mu _{\min \limits },\mu _{\max \limits }]\), \({\eta _{1}^{0}}\geq 0\), \({\eta _{2}^{0}}\geq 0\), γ > 0, s > 0, \(i_{n\max \limits }>0\), \(\tilde {X}\subset \mathbb {R}\). Set k = 0,N0 = 2s + 1, and

$$ E^{0}=\left\{t_{u} | t_{u}=a+\frac{u}{N_{0}}(b-a), u=0,1,...,N_{0}\right\}\subset T. $$

Determine the parameter α0 with (6) on E0.

Step 0 (Initialization)

Solve the problem

$$ \left\{ \begin{aligned} \underset{(x,z)\in X\times\tilde{X}}{\min} \qquad & z \\ \text{s.t.} \quad\qquad & G(x,\alpha^{0},E^{0})-z\leq 0. \end{aligned} \right. $$
(31)

and from the solution \(\tilde {x}\) compute \(G(\tilde {x},\alpha ^{0},E^{0})\). Set \(y^{0}=\tilde {x}\), \(G(y^{0},\alpha ^{0},E^{0})=G(\tilde {x},\alpha ^{0},E^{0})\). If z > 0, set iz = 1, \(J_{1}^{ora}=\{0\}\), and go to Step 7. Otherwise, compute \(f(y^{0}),\ {\xi _{f}^{0}}\in \partial f(y^{0}),\ \xi _{g}^{0,0}\in \partial G(y^{0},\alpha ^{0},E^{0})\) and set x0 = y0, f(x0) = f(y0), G(x0,α0,E0) = G(y0,α0,E0). Define the associated \(e_{0,0}^{f}, e_{0,0}^{g}, {b^{0}_{0}}, {{{\varDelta }}_{0}^{0}}\) by (15). Set iz = 0, ix = 0, in = 0, \(J_{0}^{ora}=\{0\}\), \(J_{0}^{agg}=\emptyset \) and go to Step 1.

Step 1 (Quadratic programming subproblem)

Given the model \(\hat {H}_{k}\) defined by (14) and (22), compute Sk+ 1, Ck+ 1, yk+ 1, (λ1,k+ 1,λ2,k+ 1,λ3,k+ 1) by solving the subproblem (23) and its dual (25).

Step 2 (Stopping test)

Compute δk+ 1 by (26). If δk+ 1tol1 and D(Ek,αk) ≤ tol2, STOP.

Step 3 (Bundle management)

Set

$$ \begin{array}{@{}rcl@{}} & J_{k+1}^{ora}\subseteq J_{k}^{ora}\cup\{k+1\}\ \text{and} \ J_{k+1}^{ora}\supseteq\{ k+1 \}, \\ & J_{k+1}^{agg}\subseteq J_{k}^{agg}\cup\{k+1\}\ \text{and} \ J_{k+1}^{agg}\supseteq\{ k+1 \}. \end{array} $$

Step 4 (Descent test)

Compute

$$ f(y^{k+1}),\ G(y^{k+1},\alpha^{k},E^{k}),\ \xi_{f}^{k+1}\in\partial f(y^{k+1}),\ \xi_{g}^{k,{k+1}}\in\partial G(y^{k+1},\alpha^{k},E^{k}). $$

Define the associated \(e_{k,{k+1}}^{f}\), \(e_{k,{k+1}}^{g}\), \(b_{k}^{k+1}\), and \({{\varDelta }}_{k}^{k+1}\) by (15). If the descent condition (28) holds, then set in = 0 and go to Step 6. Otherwise, set in = in + 1 and go to Step 5.

Step 5 (Null step update)

Choose \(\mu _{k+1}\in [\mu _{k},\mu _{\max \limits }]\) and set

$$ \begin{array}{@{}rcl@{}} \left( x^{k+1},f(x^{k+1})\right) &=&\left( x^{k},f(x^{k})\right),\\ \left( e_{k+1,j}^{f},b_{k+1}^{j},{{\varDelta}}_{k+1}^{j}\right) &=&\left( e_{k,j}^{f},{b_{k}^{j}},{{{\varDelta}}_{k}^{j}}\right),\ \text{for\ all\ }j\in J_{k+1}^{ora}. \end{array} $$

If \(i_{n}>i_{n\max \limits }\) and D(Ek,αk) > tol2, then go to Step 7. Otherwise, set

$$ \begin{array}{@{}rcl@{}} &\left( E^{k+1},\alpha^{k+1},G(x^{k+1},\alpha^{k+1},E^{k+1})\right) = \left( E^{k},\alpha^{k},G(x^{k},\alpha^{k},E^{k})\right),\\ &\left( \xi^{k+1,j}_{g},e_{k+1,j}^{g}\right) =\left( \xi^{k,j}_{g},e_{k,j}^{g}\right),\ \text{for\ all\ }j\in J_{k+1}^{ora}. \end{array} $$

and go to Step 8.

Step 6 (Serious step update)

Choose \(\mu _{k+1}\in [\mu _{\min \limits },\mu _{\max \limits }]\) and set

$$ i_{x}=k+1,\ x^{k+1}=y^{k+1},\ f(x^{k+1})=f(y^{k+1}),\ J_{k+1}^{agg}=\emptyset. $$

For all \(j\in J_{k+1}^{ora}\), we set

$$ \begin{array}{@{}rcl@{}} e_{k+1,j}^{f} & =&\ e_{k,j}^{f}+f(x^{k+1})-f(x^{k})-{\langle\xi_{f}^{j}}, x^{k+1}-x^{k}\rangle, \\ b_{k+1}^{j} & =&\ {b_{k}^{j}}+\|x^{k+1}-x^{k}\|^{2}/2-{\langle{{\varDelta}}_{k}^{j}}, x^{k+1}-x^{k}\rangle,\\ {{\varDelta}}_{k+1}^{j} & =&\ {{{\varDelta}}_{k}^{j}}-x^{k+1}+x^{k}. \end{array} $$

If D(Ek,αk) > tol2, go to Step 7. Otherwise, set

$$ E^{k+1}=E^{k},\ \alpha^{k+1}=\alpha^{k},\ G(x^{k+1},\alpha^{k+1},E^{k+1})=G(y^{k+1},\alpha^{k},E^{k}). $$

For all \(j\in J_{k+1}^{ora}\), we set \(\xi ^{k+1,j}_{g}=\xi ^{k,j}_{g}\) and

$$ e_{k+1,j}^{g} = e_{k,j}^{g}+G(x^{k+1},\alpha^{k+1},E^{k+1})-G(x^{k},\alpha^{k},E^{k})-\langle\xi_{g}^{k,j}, x^{k+1}-x^{k}\rangle. $$

Then, go to Step 8.

Step 7 (Subdivision update)

Select the index set Wk as follows:

$$ \begin{array}{@{}rcl@{}} W_{k}\supseteq\left\{u | G(y^{j},\alpha^{k},E^{k})\right.&=&g(y^{j},t_{u})+h(t_{u},\alpha_{u}) \text{and} h(t_{u},\alpha_{u})>tol_{2},\\ &&\left.\text{for all} j\in J_{k+1}^{ora},\ t_{u}\in E^{k}\right\}, \end{array} $$

and

$$ W_{k}\subseteq\left\{u | h(t_{u},\alpha_{u})>tol_{2}, \text{for all}\ t_{u}\in E^{k}\right\}. $$

Take the Refinement procedure on subsets \(T_{\bar {u}}\), uWk, where \(\bar {u}\) is defined by

$$ \bar{u}=\left\{ \begin{aligned} u & \text{if} \frac{\alpha_{u}}{8}|T_{u}|^{2}\geq \frac{\alpha_{u+1}}{8}|T_{u+1}|^{2}, \\ u+1 & \text{otherwise}, \end{aligned} \right. $$

where, by convention, \(|T_{0}| = 0\) and \(|T_{N_{k}+1}|=0\). Put

$$ \begin{array}{@{}rcl@{}} E^{k+1}&=&\ \{ t_{u} | u=0,...,N_{k+1}\}:=E^{k}\bigcup\left\{E^{\alpha BB}(T_{\bar{u}}) | u\in W_{k}\right\}, \\ \alpha^{k+1}&=&\ \{\alpha_{u} | u=1,...,N_{k+1}\}:=\alpha^{k}\bigcup\\ && \left\{\alpha_{\bar{u},i}: u\in W_{k}, i=1,...,2s+1\right\} \backslash \left\{\alpha_{\bar{u}}: u\in W_{k}\right\}. \end{array} $$

If iz = 1, set E0 = Ek+ 1, α0 = αk+ 1, k = 0 and go to Step 0. Otherwise, for all \(j\in J_{k+1}^{ora}\), compute \(G(y^{j},\alpha ^{k+1},E^{k+1}),\ \xi _{g}^{k+1,j}\in \partial G(y^{j},\alpha ^{k+1},E^{k+1})\) and define the associated \(e_{k+1,j}^{g}\) by (15). Then, go to Step 8.

Step 8 (Parameter update)

Compute the parameters \(\eta _{1}^{k+1}\) and \(\eta _{2}^{k+1}\) by (16). Set k = k + 1 and go to Step 1.
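To summarize the control flow of Algorithm 1, a structural sketch of its main loop is given below. It is only an outline under simplifying assumptions: the helper callables (oracle, solve_qp, refine_all) are our own names, the bundle is reset at every serious step, and the bookkeeping of Steps 3 and 8 as well as the recomputation of constraint linearizations after a refinement are omitted; it should not be read as a full implementation of Algorithm 1.

```python
def bundle_sip_sketch(oracle, solve_qp, refine_all, x0, E0, alpha0,
                      mu=1.0, m=0.1, tol1=1e-6, tol2=1e-4, max_iter=200):
    # oracle(y, E, alpha)  -> (f(y), xi_f, G(y, alpha, E), xi_g)
    # solve_qp(...)        -> candidate y^{k+1}, nominal decrease delta, gap D(E, alpha)
    # refine_all(E, alpha) -> refined subdivision and parameters (Step 7)
    xk, E, alpha = x0, E0, alpha0
    bundle = [oracle(xk, E, alpha)]                              # initial cut (Step 0)
    for _ in range(max_iter):
        y_new, delta, D = solve_qp(bundle, xk, mu, E, alpha)     # Step 1
        if delta <= tol1 and D <= tol2:                          # Step 2: stopping test
            return xk
        cut = oracle(y_new, E, alpha)                            # Step 4: new oracle data
        f_y, G_y = cut[0], cut[2]
        f_x, G_plus_x = bundle[0][0], max(bundle[0][2], 0.0)
        if max(f_y - f_x, G_y) - G_plus_x <= -m * delta:         # descent test (28)
            xk, bundle = y_new, [cut]                            # Step 6: serious step
        else:
            bundle.append(cut)                                   # Step 5: null step
        if D > tol2:
            E, alpha = refine_all(E, alpha)                      # Step 7: subdivision update
    return xk
```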

Remark 2

Under Assumption 1 and the bound (11), there exist E, α, and a point x (for instance, the Slater point \(\hat {x}\) from Assumption 1) such that

$$G(x,\alpha,E)\leq 0 \quad \text{whenever}\ D(E,\alpha)\ \text{is sufficiently small}.$$

Therefore, Algorithm 1 cannot pass infinitely many times through step 0, i.e., Algorithm 1 generates a feasible initial point x0 such that G(x0,α0,E0) ≤ z ≤ 0 at step 0. According to (28), if the descent test is satisfied, then it holds that G(xk+ 1,αk,Ek) ≤ G+(xk,αk,Ek) − mδk+ 1G+(xk,αk,Ek). By Lemma 4, we have G(xk+ 1,αk+ 1,Ek+ 1) ≤ G+(xk,αk,Ek). If xk is infeasible, then we can obtain that G(x0,α0,E0) > 0, which contradicts the fact that G(x0,α0,E0) ≤ z ≤ 0. Therefore, if a serious step is declared, then the following conditions hold:

$$ f(x^{k+1})-f(x^{k})\leq -m\delta_{k+1}\ \ \text{and}\ \ G(x^{k+1},\alpha^{k+1},E^{k+1})\leq0. $$

Lemma 5

Algorithm 1 cannot pass infinitely many times through step 7, i.e., there exist \(k_{E}, \bar {E}, \bar {\alpha }\) such that

$$ E^{k}=\bar{E},\ \alpha^{k}=\bar{\alpha},\ D(\bar{E},\bar{\alpha})\leq tol_{2}\ \text{for all}\ k\geq k_{E}. $$

4 Convergence

Firstly, we define the index set of serious step as follows:

$$ \mathcal{K}_{s}:=\{k \ | \ y^{k}\ \text{is a serious step, i.e.,}\ x^{k}=y^{k}\}. $$

We start with the following lemma.

Lemma 6

Assume that f is bounded below on X and Algorithm 1 generates an infinite number of serious steps. Then \(C^{k}\rightarrow 0\) and \(S^{k}+v^{k}\rightarrow 0\) as \(k\rightarrow \infty \) in \(\mathcal {K}_{s}\).

Proof

We first show that

$$ \underset{k\in\mathcal{K}_{s}}{\sum}\delta_{k}<+\infty. $$
(32)

Making use of (28) and Remark 2, we conclude that, for all \(k\in \mathcal {K}_{s}\),

$$ m \delta_{k}\leq f(x^{k-1})-f(x^{k}). $$

Thus, the sequence {f(xk)} is nonincreasing and bounded below. Let \(\bar {f}=\inf \{f(x) | x\in X\}\). Therefore, we obtain that

$$ \underset{k\in\mathcal{K}_{s}}{\sum}\delta_{k}\leq\frac{1}{m}\left( f(x^{0})-\bar{f}\right)<+\infty. $$

Now, we obtain that \(\{\delta _{k}\}_{k\in \mathcal {K}_{s}}\rightarrow 0\). Making use of (26), we conclude that

$${ C^{k}\leq\delta_{k}\ \ \text{and}\ \ \frac{1}{\mu_{k-1}}\|S^{k}+v^{k}\|^{2}\leq\delta_{k}.} $$

It immediately follows that \(C^{k}\rightarrow 0\) and \(S^{k}+v^{k}\rightarrow 0\) as \(k\rightarrow \infty \) in \(\mathcal {K}_{s}\) (a consequence of \(\mu _{k-1}\in [\mu _{\min \limits },\mu _{\max \limits }]\)). □

Let Ψk(yk+ 1) denote the optimal value of the objective function of subproblem (18), i.e., \({{\varPsi }}_{k}(y^{k+1})=\hat {H}_{k}(y^{k+1})+\frac {\mu _{k}}{2}\|y^{k+1}-x^{k}\|^{2}\).

Lemma 7

Assume that Algorithm 1 takes a finite number of serious steps and \(\bar {k}\) is the last serious iteration, i.e., for all \(k\geq \bar {k}\) we have \(x^{k}=x^{\bar {k}}\). Then we can obtain that

$$ H(x^{k},x^{k},\alpha^{k},E^{k})\geq{{\varPsi}}_{k}(y^{k+1})\geq {{\varPsi}}_{k-1}(y^{k})+\frac{\mu_{k-1}}{2}\|{y^{k+1}}-y^{k}\|^{2},\ \text{for\ all\ } k>\max\{\bar{k},k_{E}\}. $$

Proof

In what follows, we consider \(k>\max \limits \{\bar {k},k_{E}\}\). Using Lemma 3 and the convexity of \(\hat {H}_{k}\), we obtain that

$$ \begin{aligned} H(x^{k},x^{k},\alpha^{k},E^{k})= &\ \hat{H}_{k}(x^{k})\\ \geq &\ \hat{H}_{k}(y^{k+1})+\langle S^{k+1},x^{k}-y^{k+1}\rangle\\ \geq &\ \hat{H}_{k}(y^{k+1})+\langle S^{k+1},x^{k}-y^{k+1}\rangle+\langle v^{k+1}, x^{k}-y^{k+1}\rangle\\ \geq &\ \hat{H}_{k}(y^{k+1})+\frac{\mu_{k}}{2}\|y^{k+1}-x^{k}\|^{2}\\ = &\ {{\varPsi}}_{k}(y^{k+1}), \end{aligned} $$

where the first equality is by (17), the second inequality is due to (20). Since \(k\in J_{k}^{agg}\), (21) and (22) imply that

$$ \begin{array}{@{}rcl@{}} \hat{H}_{k}(y^{k+1}) &\geq\ G^{+}(x^{k},\alpha^{k},E^{k})-C^{k}+\langle S^{k},y^{k+1}-x^{k} \rangle \\ & =\ \hat{H}_{k-1}(y^{k})+\langle S^{k},y^{k+1}-y^{k} \rangle. \end{array} $$

Moreover,

$$ \begin{aligned} {{\varPsi}}_{k}(y^{k+1}) = &\ \hat{H}_{k}(y^{k+1})+\frac{\mu_{k}}{2}\|y^{k+1}-x^{k}\|^{2}\\ \geq &\ \hat{H}_{k-1}(y^{k})+\langle S^{k},y^{k+1}-y^{k} \rangle+\frac{\mu_{k-1}}{2}\|y^{k+1}-x^{k}\|^{2}\\ \geq &\ \hat{H}_{k-1}(y^{k})+\langle S^{k},y^{k+1}-y^{k} \rangle+\langle v^{k}, y^{k+1}-y^{k}\rangle+\frac{\mu_{k-1}}{2}\|y^{k+1}-x^{k}\|^{2}\\ \geq &\ \hat{H}_{k-1}(y^{k})-\mu_{k-1}\langle y^{k}-x^{k},y^{k+1}-y^{k}\rangle+\frac{\mu_{k-1}}{2}\|y^{k}-x^{k}+{y^{k+1}}-y^{k}\|^{2} \\ = &\ \hat{H}_{k-1}(y^{k})+\frac{\mu_{k-1}}{2}\|y^{k}-x^{k}\|^{2}+\frac{\mu_{k-1}}{2}\|{y^{k+1}}-y^{k}\|^{2}\\ = &\ {{\varPsi}}_{k-1}(y^{k})+\frac{\mu_{k-1}}{2}\|{y^{k+1}}-y^{k}\|^{2}, \end{aligned} $$

where the second inequality is by the fact that 〈vk,yk+ 1yk〉≤ 0. □

Lemma 8

Assume that Algorithm 1 takes a finite number of serious steps. Suppose that \(\{{\eta ^{k}_{1}}\}\) and \(\{{\eta _{2}^{k}}\}\) are bounded above. Then, \(C^{k}\rightarrow 0\) and \(S^{k}+v^{k}\rightarrow 0\) as \(k\rightarrow \infty \).

Proof

Let \(\bar {k}\) be the last serious iteration. In what follows, we consider \(k>\max \{\bar {k},k_{E}\}\). By the definition (22) of \(\hat {H}_{k}\) and the fact that \(k\in J_{k}^{ora}\), we obtain that

$$ \hat{H}_{k}(y^{k+1})\geq G^{+}(x^{k},\alpha^{k},E^{k})+\max\left\{ -c_{k,k}^{f}+\langle s_{f}^{k,k},y^{k+1}-x^{k}\rangle, -c_{k,k}^{g}+\langle s_{g}^{k,k},y^{k+1}-x^{k}\rangle\right\}. $$

It can be rewritten as

$$ \begin{array}{lll} \hat{H}_{k}(y^{k+1})&\geq& \max \left\{ f^{k}(y^{k})+ \left\langle\xi_{f}^{k} \ +{\eta_{1}^{k}}(y^{k}-x^{k}),y^{k+1}-y^{k}\right\rangle-f(x^{k}),\right. \\ && \left.g^{k}(y^{k})+ \left\langle\xi^{k,k}_{g}+{\eta_{2}^{k}}(y^{k}-x^{k}),y^{k+1}-y^{k}\right\rangle\right\} \\ &\geq& \max \left\{ \left\langle\xi_{f}^{k} \ +{\eta_{1}^{k}}(y^{k}-x^{k}),y^{k+1}-y^{k}\right\rangle+ f(y^{k})-f(x^{k}),\right. \\ && \left.\left\langle\xi^{k,k}_{g}+{\eta_{2}^{k}}(y^{k}-x^{k}),y^{k+1}-y^{k}\right\rangle+ G(y^{k},\alpha^{k},E^{k})\right\}. \end{array} $$

Since \(\{y^{k}\}\subset X\), the functions f and g are Lipschitz continuous on X and X × T, respectively, and X is compact, Theorem 1 implies that \(\{{\xi _{f}^{k}}\}\) and \(\{\xi ^{k,k}_{g}\}\) are bounded. Therefore, by the boundedness of \(\{{\eta _{1}^{k}}\}\) and \(\{{\eta _{2}^{k}}\}\), we get that \(\{{\xi _{f}^{k}}+{\eta _{1}^{k}}(y^{k}-x^{k})\}\) and \(\{\xi ^{k,k}_{g}+{\eta _{2}^{k}}(y^{k}-x^{k})\}\) are bounded. We set

$$ L^{\prime}=\max\left\{\|{\xi_{f}^{k}}+{\eta_{1}^{k}}(y^{k}-x^{k})\|,\|\xi^{k,k}_{g}+{\eta_{2}^{k}}(y^{k}-x^{k})\|\right\}. $$

Then, we can obtain

$$ \hat{H}_{k}(y^{k+1})\geq \max\left\{f(y^{k})-f(x^{k}) ,G(y^{k},\alpha^{k},E^{k})\right\}-L^{\prime}\|y^{k+1}-y^{k}\|. $$

According to (27) and (28), we can obtain that

$$\begin{aligned} \delta_{k+1}\leq &\ H(x^{k},x^{k},\alpha^{k},E^{k})-\hat{H}_{k}(y^{k+1}) \\ \leq &\ G^{+}(x^{k},\alpha^{k},E^{k})-\max\left\{f(y^{k})-f(x^{k}) ,G(y^{k},\alpha^{k},E^{k})\right\}+L^{\prime}\|y^{k+1}-y^{k}\|\\ \leq &\ m \delta_{k}+L^{\prime}\|y^{k+1}-y^{k}\|, \end{aligned} $$

where the third inequality holds because the descent test (28) fails for \(k>\bar {k}\). Combining this relation with (26) yields

$$ 0\leq \delta_{k+1}\leq \ m \delta_{k}+L^{\prime}\|y^{k+1}-y^{k}\|. $$
(33)

According to Lemma 7, we infer that

$$ \{y^{k+1}-y^{k}\}\rightarrow 0, \text{as} k\rightarrow\infty, $$

which implies \(\lim _{k\rightarrow \infty }\delta _{k}=0\) [33, Lemma 3, p. 45]. Making use of (26), we conclude that

$${ C^{k}\leq\delta_{k} \text{and} \frac{1}{\mu_{k-1}}\|S^{k}+v^{k}\|^{2}\leq\delta_{k}.} $$

It immediately follows that \(C^{k}\rightarrow 0\) and \(S^{k}+v^{k}\rightarrow 0\) as \(k\rightarrow \infty \). □

Let

$$\eta_{\min}^{1,j}=\max\{0,-\frac{e_{k,j}^{f}}{{b_{k}^{j}}}\}+\gamma,\ \eta_{\min}^{2,j}=\max\{0,-\frac{e_{k,j}^{g}}{{b_{k}^{j}}}\}+\gamma,\ \forall \ \ j\in\{0,...,k\},$$
$$\eta_{\max}^{i}=\max_{j\in\{0,...,k\}}{\eta_{i}^{j}},\ i\in\{1,2\},\ {J^{C}_{k}}=\{j\in J^{ora}_{k} | \lambda_{1,k+1}^{j} = \lambda_{2,k+1}^{j}=0\}.$$

Lemma 9

Let \(x^{k}\) (\(k\geq k_{E}\)) be the current stability center. Suppose that \(\{{\eta ^{k}_{1}}\}\) and \(\{{\eta _{2}^{k}}\}\) are bounded above. Then there exist \(\hat {\lambda }_{1,k+1},\hat {\lambda }_{2,k+1}\in \mathbb {R}^{k+1}\) and \(\hat {J}_{k}\) with

$$ \left\{ \begin{aligned} & \hat{J}_{k}\subseteq\{0,1,...,k\}, \\ & \hat{\lambda}_{1,k+1}^{j},\hat{\lambda}_{2,k+1}^{j}\geq 0,\ \hat{\lambda}_{1,k+1}^{j}+\hat{\lambda}_{2,k+1}^{j}>0,\ \ \text{for\ all}\ j\in \hat{J}_{k},\\ & \underset{j\in \hat{J}_{k}}{\sum} (\hat{\lambda}_{1,k+1}^{j}+\hat{\lambda}_{2,k+1}^{j})=1, \end{aligned}\right. $$
(34)

such that

$$ S^{k+1} = \underset{j\in \hat{J}_{k}}{\sum} \left( \hat{\lambda}_{1,k+1}^{j}\left( {\xi_{f}^{j}}+\hat{\eta}_{1}^{j} {{{\varDelta}}_{k}^{j}}\right)+\hat{\lambda}_{2,k+1}^{j}\left( \xi_{g}^{k,j}+\hat{\eta}_{2}^{j} {{{\varDelta}}_{k}^{j}}\right)\right),\qquad\ \ $$
(35)

and

$$ \begin{aligned} C^{k+1} =& \underset{j\in \hat{J}_{k}}{\sum} \left( \hat{\lambda}_{1,k+1}^{j}\left( e_{k,j}^{f}+\hat{\eta}_{1}^{j} {b_{k}^{j}}+G^{+}(x^{k},\bar{\alpha},\bar{E})\right)\right. \\ &+ \left.\hat{\lambda}_{2,k+1}^{j}\left( e_{k,j}^{g}+\hat{\eta}_{2}^{j} {b_{k}^{j}}+G^{+}(x^{k},\bar{\alpha},\bar{E})-G(x^{k},\bar{\alpha},\bar{E})\right)\right), \end{aligned} $$
(36)

where \(\hat {\eta }_{i}^{j}\in [\eta _{\min \limits }^{i,j},\eta _{\max \limits }^{i}],~i\in \{1,2\}\).

Proof

In what follows, we consider \(k\geq k_{E}\). If \(k\in \mathcal {K}_{s}\) or \(k = k_{E}\), then we have \(J^{agg}_{k}=\emptyset \) and

$$ \begin{array}{@{}rcl@{}} S^{k+1} &=& \underset{j\in J_{k}^{ora}}{\sum} \left( \lambda_{1,k+1}^{j} \left( {\xi_{f}^{j}} +{\eta_{1}^{k}} {{{\varDelta}}^{j}_{k}}\right)+\lambda_{2,k+1}^{j}\left( \xi_{g}^{k,j}+{\eta_{2}^{k}}{{{\varDelta}}^{j}_{k}}\right)\right),\\ C^{k+1} &=& \underset{j\in J_{k}^{ora}}{\sum} \left( {\lambda}_{1,k+1}^{j}\left( e_{k,j}^{f}+{\eta}_{1}^{k} {b_{k}^{j}}+G^{+}(x^{k},\bar{\alpha},\bar{E})\right)\right. \\ &&+ \left.{\lambda}_{2,k+1}^{j}\left( e_{k,j}^{g}+{\eta}_{2}^{k} {b_{k}^{j}}+G^{+}(x^{k},\bar{\alpha},\bar{E})-G(x^{k},\bar{\alpha},\bar{E})\right)\right), \end{array} $$

where \(\lambda _{1,k+1}^{j}, \lambda _{2,k+1}^{j}\geq 0\) for all \(j\in J^{ora}_{k}\) and \({\sum }_{j\in J_{k}^{ora}}(\lambda _{1,k+1}^{j} +\lambda _{2,k+1}^{j})=1\). Then, the result holds by setting \(\hat {J}_{k}=J_{k}^{ora}\setminus {J^{C}_{k}}\), \(\hat {\lambda }_{1,k+1}^{j}=\lambda _{1,k+1}^{j}\), \(\hat {\lambda }_{2,k+1}^{j}=\lambda _{2,k+1}^{j}\), \(\hat {\eta }_{1}^{j}={\eta _{1}^{k}}\) and \(\hat {\eta }_{2}^{j}={\eta _{2}^{k}}\).

If \(k\notin \mathcal {K}_{s}\) and k > kE, then there exists \(\tilde {k}\in \{k_{E}+1,...,k-1\}\) such that \(x^{k}=x^{\tilde {k}}\) and \(J^{agg}_{\tilde {k}}=\emptyset \). It holds that \({{\varDelta }}^{j}_{\tilde {k}}={{{\varDelta }}^{j}_{k}}\), \(b^{j}_{\tilde {k}}={b^{j}_{k}}\), \(\xi _{g}^{\tilde {k},j}=\xi _{g}^{k,j}\) for all \(j\in J_{k}^{ora}\) and \(G(x^{\tilde {k}},\bar {\alpha },\bar {E})=G(x^{k},\bar {\alpha },\bar {E})\). Hence,

$$ \begin{array}{@{}rcl@{}} S^{\tilde{k}+1} &=&\underset{j\in J_{\tilde{k}}^{ora}}{\sum}\left( \lambda_{1,\tilde{k}+1}^{j} \left( {\xi_{f}^{j}} +\eta_{1}^{\tilde{k}} {{{\varDelta}}^{j}_{k}}\right)+\lambda_{2,\tilde{k}+1}^{j}\left( \xi_{g}^{{k},j}+\eta_{2}^{\tilde{k}}{{{\varDelta}}^{j}_{k}}\right)\right),\\ C^{\tilde{k}+1} &=&\underset{j\in J_{\tilde{k}}^{ora}}{\sum}\left( \lambda_{1,\tilde{k}+1}^{j} \left( e_{k,j}^{f}+{\eta}_{1}^{\tilde{k}} {b_{k}^{j}}+G^{+}(x^{k},\bar{\alpha},\bar{E})\right)+\right.\\ &&\left.\lambda_{2,\tilde{k}+1}^{j}\left( e_{k,j}^{g}+{\eta}_{2}^{\tilde{k}} {b_{k}^{j}}+G^{+}(x^{k},\bar{\alpha},\bar{E})-G(x^{k},\bar{\alpha},\bar{E})\right)\right), \end{array} $$

where \({\sum }_{j\in J_{\tilde {k}}^{ora}}\left (\lambda _{1,\tilde {k}+1}^{j} +\lambda _{2,\tilde {k}+1}^{j} \right )=1\). Since \(J^{agg}_{\tilde {k}}=\emptyset \), we have \(J^{agg}_{\tilde {k}+1}=\{\tilde {k}+1\}\). Hence,

$$ \begin{array}{@{}rcl@{}} S^{\tilde{k}+2} &= &\underset{j\in J_{\tilde{k}+1}^{ora}}{\sum}\left( \lambda_{1,\tilde{k}+2}^{j} s_{f}^{\tilde{k}+1,j} +\lambda_{2,\tilde{k}+2}^{j} s_{g}^{\tilde{k}+1,j}\right) +\underset{j\in J_{\tilde{k}+1}^{agg}}{\sum}\lambda_{3,\tilde{k}+2}^{j}\cdot S^{j}\\ &=&\underset{j\in J_{\tilde{k}+1}^{ora}}{\sum}\left( \lambda_{1,\tilde{k}+2}^{j} \left( {\xi_{f}^{j}} +\eta_{1}^{\tilde{k}+1} {{{\varDelta}}^{j}_{k}}\right)+\lambda_{2,\tilde{k}+2}^{j}\left( \xi_{g}^{k,j}+\eta_{2}^{\tilde{k}+1}{{{\varDelta}}^{j}_{k}}\right)\right) \\ &&+\underset{j\in J_{\tilde{k}}^{ora}}{\sum}\left( \lambda_{1,\tilde{k}+1}^{j} \lambda_{3,\tilde{k}+2}^{\tilde{k}+1} \left( {\xi_{f}^{j}} +\eta_{1}^{\tilde{k}} {{{\varDelta}}^{j}_{k}}\right)+\lambda_{2,\tilde{k}+1}^{j}\lambda_{3,\tilde{k}+2}^{\tilde{k}+1}\left( \xi_{g}^{\tilde{k},j}+\eta_{2}^{\tilde{k}}{{{\varDelta}}^{j}_{k}}\right)\right)\\ \end{array} $$

and

$$ \begin{array}{@{}rcl@{}} C^{\tilde{k}+2} &=& \underset{j\in J_{\tilde{k}+1}^{ora}}{\sum} \left( \lambda_{1,\tilde{k}+2}^{j} c^{f}_{\tilde{k}+1,j} +\lambda_{2,\tilde{k}+2}^{j} c^{g}_{\tilde{k}+1,j}\right) +\underset{j\in J_{\tilde{k}+1}^{agg}}{\sum}\lambda_{3,\tilde{k}+2}^{j}\cdot C^{j}\\ &=&\underset{j\in J_{\tilde{k}+1}^{ora}}{\sum}\left( \lambda_{1,\tilde{k}+2}^{j} \left( e_{k,j}^{f}+{\eta}_{1}^{\tilde{k}+1} {b_{k}^{j}}+G^{+}(x^{k},\bar{\alpha},\bar{E})\right)\right. \\ &&+\left.\lambda_{2,\tilde{k}+2}^{j} \left( e_{k,j}^{g}+{\eta}_{2}^{\tilde{k}+1} {b_{k}^{j}}+G^{+}(x^{k},\bar{\alpha},\bar{E})-G(x^{k},\bar{\alpha},\bar{E})\right)\right) \\ &&+\underset{j\in J_{\tilde{k}}^{ora}}{\sum}\left( \lambda_{1,\tilde{k}+1}^{j} \lambda_{3,\tilde{k}+2}^{\tilde{k}+1} \left( e_{k,j}^{f}+{\eta}_{1}^{\tilde{k}} {b_{k}^{j}}+G^{+}(x^{k},\bar{\alpha},\bar{E})\right)\right.\\ &&+\left.\lambda_{2,\tilde{k}+1}^{j}\lambda_{3,\tilde{k}+2}^{\tilde{k}+1} \left( e_{k,j}^{g}+{\eta}_{2}^{\tilde{k}} {b_{k}^{j}}+G^{+}(x^{k},\bar{\alpha},\bar{E})-G(x^{k},\bar{\alpha},\bar{E})\right)\right).\\ \end{array} $$

It is easily seen that

$$ \underset{j\in J_{\tilde{k}+1}^{ora}}{\sum}\left( \lambda_{1,\tilde{k}+2}^{j} +\lambda_{2,\tilde{k}+2}^{j} \right)+\underset{j\in J_{\tilde{k}}^{ora}}{\sum}\left( \lambda_{1,\tilde{k}+1}^{j} \lambda_{3,\tilde{k}+2}^{\tilde{k}+1} +\lambda_{2,\tilde{k}+1}^{j} \lambda_{3,\tilde{k}+2}^{\tilde{k}+1} \right)=1. $$

We set

$$ \hat{J}_{\tilde{k}+1}= \left\{ \begin{aligned} & {J}^{ora}_{\tilde{k}+1}\setminus {J}^{C}_{\tilde{k}+1}, & \text{if}\ \ \lambda_{3,\tilde{k}+2}^{\tilde{k}+1}=0, \\ & ({J}^{ora}_{\tilde{k}+1}\setminus {J}^{C}_{\tilde{k}+1}) \cup\ ({J}^{ora}_{\tilde{k}}\setminus {J}^{C}_{\tilde{k}}), & \text{if}\ \ \lambda_{3,\tilde{k}+2}^{\tilde{k}+1}\neq0. \end{aligned} \right. $$

For any \(j\in \hat {J}_{\tilde {k}+1}\) and i ∈{1, 2}, we set

$$ \left\{ \begin{aligned} & \hat{\lambda}_{i,\tilde{k}+2}^{j}=\lambda_{i,\tilde{k}+1}^{j} \lambda_{3,\tilde{k}+2}^{\tilde{k}+1}\ & \text{and} \quad \hat{\eta}_{i}^{j}=\eta_{i}^{\tilde{k}}, \quad & \quad \text{if}\ \ j\in {J}^{ora}_{\tilde{k}}\setminus {J}^{ora}_{\tilde{k}+1}, \\ & \hat{\lambda}_{i,\tilde{k}+2}^{j}=\lambda_{i,\tilde{k}+2}^{j} & \text{and} \quad \hat{\eta}_{i}^{j}=\eta_{i}^{\tilde{k}+1}, & \quad \text{if}\ \ j\in {J}^{ora}_{\tilde{k}+1}\setminus {J}^{ora}_{\tilde{k}}, \\ & \hat{\lambda}_{i,\tilde{k}+2}^{j}=\lambda_{i,\tilde{k}+2}^{j}+\lambda_{i,\tilde{k}+1}^{j} \lambda_{3,\tilde{k}+2}^{\tilde{k}+1}\ & \text{and} \quad \hat{\eta}_{i}^{j}={\ddot{\eta}}_{i}^{j}, \quad & \quad \text{if}\ \ j\in {J}^{ora}_{\tilde{k}}\cap {J}^{ora}_{\tilde{k}+1}, \end{aligned} \right. $$

where

$$ \ddot{\eta}_{i}^{j}= \left\{ \begin{aligned} & \frac{\eta_{i}^{\tilde{k}}+\eta_{i}^{\tilde{k}+1}}{2}, & \quad \text{if}\ \ \hat{\lambda}_{i,\tilde{k}+2}^{j}=0, \\ & \frac{\lambda_{i,\tilde{k}+2}^{j} \eta_{i}^{\tilde{k}+1} +\lambda_{i,\tilde{k}+1}^{j} \lambda_{3,\tilde{k}+2}^{\tilde{k}+1}\eta_{i}^{\tilde{k}}}{\hat{\lambda}_{i,\tilde{k}+2}^{j}}, & \quad \text{if}\ \ \hat{\lambda}_{i,\tilde{k}+2}^{j}>0. \end{aligned} \right. $$

Combining these, we have

$$\eta_{\min}^{i,j} \leq \hat{\eta}_{i}^{j} \leq \eta_{\max}^{i},\ \forall \ i=\{1,2\},\ j\in\hat{J}_{\tilde{k}+1},$$

and

$$ \begin{array}{@{}rcl@{}} S^{\tilde{k}+2} &= & {\sum}_{j\in \hat{J}_{\tilde{k}+1}}\left( \hat{\lambda}_{1,\tilde{k}+2}^{j} \left( {\xi_{f}^{j}} +\hat{\eta}_{1}^{j} {{{\varDelta}}^{j}_{k}}\right)+\hat{\lambda}_{2,\tilde{k}+2}^{j}\left( \xi_{g}^{\tilde{k}+1,j}+\hat{\eta}_{2}^{j}{{{\varDelta}}^{j}_{k}}\right)\right), \\ C^{\tilde{k}+2} &= &{\sum}_{j\in \hat{J}_{\tilde{k}+1}}\left( \hat{\lambda}_{1,\tilde{k}+2}^{j} \left( e_{k,j}^{f}+{\hat{\eta}}_{1}^{j} {b_{k}^{j}}+G^{+}(x^{k},\bar{\alpha},\bar{E})\right) \right.\\ &&+\left.\hat{\lambda}_{2,\tilde{k}+2}^{j} \left( e_{k,j}^{g}+{\hat{\eta}}_{2}^{j} {b_{k}^{j}}+G^{+}(x^{k},\bar{\alpha},\bar{E})-G(x^{k},\bar{\alpha},\bar{E})\right)\right). \end{array} $$

By the same argument, we can obtain that there exist \(\hat {\lambda }_{1,k+1},\hat {\lambda }_{2,k+1}\in \mathbb {R}^{k+1}\) and \(\hat {J}_{k}\) satisfying (34) such that (35) and (36) hold. □

From Theorems 2 and 3, we obtain that G(⋅,αk,Ek) is regular on X. The authors of [17] state that the following lemma holds. We give the proof since our cutting plane model and bundle update differ from those in [17]; in addition, the proof in [17] does not consider the aggregate information of the cutting plane model.

Lemma 10

Assume that f is regular on X. Suppose that \(\{{\eta ^{k}_{1}}\}\) and \(\{{\eta _{2}^{k}}\}\) are bounded above. Let \(x^{k}\) (\(k\geq k_{E}\)) be the current stability center and \(\bar {\eta }=\max \limits _{j\in \{0,1,...,k\}}\{{\eta _{1}^{j}},{\eta _{2}^{j}}\}\). The upper envelope model is defined as

$$ \begin{array}{@{}rcl@{}} {{\varPhi}}(y,x^{k},\alpha^{k},E^{k})&=& G^{+}(x^{k},\alpha^{k},E^{k})\\ &&+\underset{i=1,2}{\sup}\left\{m^{i}_{y^{+},\xi_{i},\eta} |\ y^{+}\in {B}(x^{k};R),\ \xi_{1}\in\partial f(y^{+}),\ \xi_{2}\in\partial G(y^{+},\alpha^{k},E^{k}),\ \eta\in[\gamma,\bar{\eta}]\right\}, \end{array} $$

where

$$ m^{i}_{y^{+},\xi_{i},\eta}:={-\frac{\gamma}{2}}\|y^{+}-x^{k}\|^{2}+\langle\xi_{i}+\eta(y^{+}-x^{k}),y-x^{k}\rangle, i=1,2. $$
(37)

Here, B(xk;R) is a fixed ball large enough to contain all possible trial steps during iteration k. Then, the following statements hold.

  1.

    Φ(⋅,xk,αk,Ek) is a convex function and \(\hat {H}_{k}(\cdot )\leq {{\varPhi }}(\cdot ,x^{k},\alpha ^{k},E^{k})\).

  2.

    Φ(xk,xk,αk,Ek) = G+(xk,αk,Ek).

  3.

    \(\partial {{\varPhi }}(x^{k},x^{k},\alpha ^{k},E^{k})\subseteq \partial H(x^{k},x^{k},\alpha ^{k},E^{k})\).

Proof

We only show that \(\hat {H}_{k}(\cdot )\leq {{\varPhi }}(\cdot ,x^{k},\alpha ^{k},E^{k})\); for the rest of the proof, we refer the reader to [17].

Using (14) together with (15), for \(j\in J_{k}^{ora}\) we obtain that

$$ \begin{array}{@{}rcl@{}} -c_{k,j}^{f}+\langle s_{f}^{k,j},y-x^{k}\rangle & \leq -\gamma {b_{k}^{j}}+{\langle\xi_{f}^{j}}+{\eta^{k}_{1}}(y^{j}-x^{k}),y-x^{k}\rangle=m^{1}_{y^{j},{\xi_{f}^{j}},{\eta^{k}_{1}}}, \\ -c_{k,j}^{g}+\langle s_{g}^{k,j},y-x^{k}\rangle & \leq -\gamma {b_{k}^{j}}+\langle\xi_{g}^{k,j}+{\eta^{k}_{2}}(y^{j}-x^{k}),y-x^{k}\rangle=m^{2}_{y^{j},\xi_{g}^{k,j},{\eta^{k}_{2}}}. \end{array} $$

On the other hand, from Lemma 9, we can obtain that for \(j\in J_{k}^{agg}\) there exist \(\hat {\lambda }_{1,j},\hat {\lambda }_{2,j}\in \mathbb {R}^{j}\), and \(\hat {J}_{j-1}\) satisfying (34) such that

$$ \begin{array}{@{}rcl@{}} S^{j} &= & \underset{l\in \hat{J}_{j-1}}{\sum} \left( \hat{\lambda}_{1,j}^{l}\left( {\xi_{f}^{l}}+\hat{\eta}_{1}^{l} {{{\varDelta}}_{k}^{l}}\right)+\hat{\lambda}_{2,j}^{l}\left( \xi_{g}^{k,l}+\hat{\eta}_{2}^{l} {{{\varDelta}}_{k}^{l}}\right)\right), \\ C^{j} &= & \underset{l\in \hat{J}_{j-1}}{\sum} \left( \hat{\lambda}_{1,j}^{l}\left( e_{k,l}^{f}+\hat{\eta}_{1}^{l} {b_{k}^{l}}+G^{+}(x^{k},\bar{\alpha},\bar{E})\right)\right.\\ &&+\left.\hat{\lambda}_{2,j}^{l}\left( e_{k,l}^{g}+\hat{\eta}_{2}^{l} {b_{k}^{l}}+G^{+}(x^{k},\bar{\alpha},\bar{E})-G(x^{k},\bar{\alpha},\bar{E})\right)\right), \end{array} $$

where \(\hat {\eta }_{i}^{l}\geq {\eta }_{\min \limits }^{i,l},\ i=1,2\). According to (16) and the definition of \({\eta }_{\min \limits }^{i,l}\), we can obtain that

$$ \begin{array}{@{}rcl@{}} e_{k,l}^{f}+\hat{\eta}_{1}^{l} {b_{k}^{l}} \geq \gamma {b_{k}^{l}}, ~~ e_{k,l}^{g}+\hat{\eta}_{2}^{l} {b_{k}^{l}} \geq \gamma {b_{k}^{l}}. \end{array} $$

Hence,

$$ \begin{array}{@{}rcl@{}} &&-C^{j}+\langle S^{j} ,y-x^{k}\rangle \\ &\leq & \underset{l\in \hat{J}_{j-1}}{\sum}\left( \hat{\lambda}_{1,j}^{l}\left( -\gamma {b_{k}^{l}}+\langle {\xi_{f}^{l}}+\hat{\eta}_{1}^{l} {{{\varDelta}}_{k}^{l}},y-x^{k}\rangle\right)\right.\\ &&+\left.\hat{\lambda}_{2,j}^{l}\left( -\gamma {b_{k}^{l}}+\langle \xi_{g}^{k,l}+\hat{\eta}_{2}^{l} {{{\varDelta}}_{k}^{l}},y-x^{k}\rangle\right)\right) \\ &\leq & \max\{-\gamma {b_{k}^{l}}+\langle {\xi_{f}^{l}}+\hat{\eta}_{1}^{l} {{{\varDelta}}_{k}^{l}},y-x^{k}\rangle, -\gamma {b_{k}^{l}}+\langle \xi_{g}^{k,l}+\hat{\eta}_{2}^{l} {{{\varDelta}}_{k}^{l}},y-x^{k}\rangle\} \\ &\leq & \underset{\eta\in[\gamma,\bar{\eta}]}{\sup} \{m^{1}_{y^{j},{\xi_{f}^{j}},\eta}, m^{2}_{y^{j},\xi_{g}^{k,j},\eta}\}. \end{array} $$

By the definition (22) of \(\hat {H}_{k}(\cdot )\), we can obtain that \(\hat {H}_{k}(\cdot )\leq {{\varPhi }}(\cdot ,x^{k},\alpha ^{k},E^{k})\). □

Theorem 6

Assume that f is bounded below and regular on X. Suppose that \(\{{\eta ^{k}_{1}}\}\) and \(\{{\eta _{2}^{k}}\}\) are bounded above. Consider the stopping parameter tol1 = 0 and suppose there is no termination. Then, one of the following two mutually exclusive situations occurs:

(i):

Algorithm 1 generates an infinite sequence of serious steps. Let \(\bar {x}\) be an accumulation point of {xk}; then \(0\in \partial H(\bar {x},\bar {x},\bar {\alpha },\bar {E})+\partial I_{X}(\bar {x})\).

(ii):

Algorithm 1 generates a finite sequence of serious steps. Let \(\bar {x}\) be the last stability center; then \(0\in \partial H(\bar {x},\bar {x},\bar {\alpha },\bar {E})+\partial I_{X}(\bar {x})\).

In addition, if MFCQ holds at \(\bar {x}\) for problem (12) with \(E=\bar {E}\), then \(\bar {x}\) is a tol2-KKT point of problem (1).

Proof

In what follows, we consider \(k\geq k_{E}\). From Theorem 1 and Lemma 9, the sequence {Sk} is bounded; thus, passing to a subsequence if necessary, we may assume that \(S^{k+1}\rightarrow \bar {S}\). Hence, \(v^{k+1}\rightarrow -\bar {S}\) as \(k\rightarrow \infty \). Using (21) together with Lemma 10, we obtain that

$$ \begin{array}{@{}rcl@{}} {{\varPhi}}(y,x^{k},\bar{\alpha},\bar{E})&\geq & \ \hat{H}_{k}(y)\\ &\geq & \ \hat{H}_{k}(y^{k+1})+\langle S^{k+1},y-y^{k+1}\rangle \\ &= & \ G^{+}(x^{k},\bar{\alpha},\bar{E})-C^{k+1}+\langle S^{k+1},y-x^{k}\rangle. \end{array} $$

By (19) and the convexity of \(I_{X}\), we have \(I_{X}(y)\geq I_{X}(y^{k+1})+\langle v^{k+1},y-y^{k+1}\rangle \). As a result, it holds that

$$ {{\varPhi}}(y,x^{k},\bar{\alpha},\bar{E})+I_{X}(y)\geq G^{+}(x^{k},\bar{\alpha},\bar{E})-C^{k+1}+\langle S^{k+1}+v^{k+1},y-x^{k}\rangle+\langle v^{k+1},x^{k}-y^{k+1}\rangle. $$
(38)
  (i)

    First, note that {xk} is in the compact set X, so it has an accumulation point, say for some infinite set \(K\subseteq \mathcal {K}_{s}\), \(x^{k}\rightarrow \bar {x}\) as \(K\ni k\rightarrow \infty \). For all k + 1 ∈ K, we have \(x^{k}-y^{k+1}=\frac {1}{\mu _{k}}(S^{k+1}+v^{k+1})\). By Lemma 6, \(C^{k+1}\rightarrow 0\), \(S^{k+1}+v^{k+1}\rightarrow 0\), \(x^{k}-y^{k+1}\rightarrow 0\), and \(x^{k}\rightarrow \bar {x}\) as \(K\ni k+1\rightarrow \infty \). Passing to the limit in (38) as \(K\ni k+1\rightarrow \infty \), we obtain that

    $$ {{\varPhi}}(y,\bar{x},\bar{\alpha},\bar{E})+I_{X}(y)\geq G^{+}(\bar{x},\bar{\alpha},\bar{E})+\langle 0,y-\bar{x}\rangle={{\varPhi}}(\bar{x},\bar{x},\bar{\alpha},\bar{E})+\langle 0,y-\bar{x}\rangle. $$

    Using the convexity of \({{\varPhi }}(\cdot ,\bar {x},\bar {\alpha },\bar {E})+I_{X}(\cdot )\), we have \(0\in \partial {{\varPhi }}(\bar {x},\bar {x},\bar {\alpha },\bar {E})+\partial I_{X}(\bar {x}).\) According to Lemma 10, we can conclude that \(0\in \partial H(\bar {x},\bar {x},\bar {\alpha },\bar {E})+\partial I_{X}(\bar {x}).\)

  (ii)

    Let \(\bar {k}\) be the last serious iteration, i.e., \(x^{k}=\bar {x}\) for all \(k\geq \bar {k}\). For \(k\geq \bar {k}\), we have that \(\bar {x}-y^{k+1}=x^{k}-y^{k+1}=\frac {1}{\mu _{k}}(S^{k+1}+v^{k+1})\). By Lemma 8, \(C^{k+1}\rightarrow 0\), \(S^{k+1}+v^{k+1}\rightarrow 0\), and \(y^{k+1}-x^{k}\rightarrow 0\) as \(k\rightarrow \infty \). Passing to the limit in (38) as \(k\rightarrow \infty \), we obtain that

    $$ {{\varPhi}}(y,\bar{x},\bar{\alpha},\bar{E})+I_{X}(y)\geq G^{+}(\bar{x},\bar{\alpha},\bar{E})+\langle 0,y-\bar{x}\rangle={{\varPhi}}(\bar{x},\bar{x},\bar{\alpha},\bar{E})+\langle 0,y-\bar{x}\rangle. $$

    By the same discussion, we deduce that \(0\in \partial H(\bar {x},\bar {x},\bar {\alpha },\bar {E})+\partial I_{X}(\bar {x})\).

    From Theorem 5 and Remark 2, we obtain that \(\bar {x}\) is a tol2-KKT point of the original problem (1). The proof is complete. □

Theorem 7

Assume that f is bounded below and regular on X. Suppose that \(\{{\eta ^{k}_{1}}\}\) and \(\{{\eta _{2}^{k}}\}\) are bounded above. Consider the stopping parameter tol1 = 0 and suppose that the algorithm loops forever. If f is semismooth, then for any accumulation point \(\bar {x}\) of the sequence {xk} and for each 𝜖 > 0 there exists ρ > 0 such that

$$ \max\left\{f(y)-f(\bar{x}) ,G(y,\bar{\alpha},\bar{E})\right\}\geq -\epsilon\|y-\bar{x}\|, \ {\text{for\ all} \ y\in X\cap B(\bar{x};\rho)}. $$

If, in addition, the set \(X_{\rho ,\epsilon ,tol_{2}}=\{y\in X\cap B(\bar {x};\rho ): G(y)<-(\rho \epsilon +tol_{2})\}\) is not empty, then

$$ f(y)\geq f(\bar{x})-\epsilon\|y-\bar{x}\|\ \text{for\ all}\ y\in X_{\rho,\epsilon,tol_{2}}. $$

Proof

Let \(\hat {\lambda }^{j}=\hat {\lambda }_{1,k+1}^{j}+\hat {\lambda }_{2,k+1}^{j}\); then, since \(0\leq \hat {\lambda }^{j}\leq 1\) for every \(j\in \hat {J}_{k}\), we obtain that

$$ C^{k+1}\geq \gamma\underset{j\in \hat{J}_{k}}{\sum} \hat{\lambda}^{j} {b_{k}^{j}} \geq \gamma \underset{j\in \hat{J}_{k}}{\sum} (\hat{\lambda}^{j})^{2} {b_{k}^{j}}=\frac{\gamma}{2}\underset{j\in \hat{J}_{k}}{\sum}\left( \hat{\lambda}^{j}\|y^{j}-x^{k}\|\right)^{2}. $$
(39)

Inequality (39), together with Lemmas 6 and 8, implies that \(\|y^{j}-x^{k}\|\rightarrow 0\) for all \(j\in \hat {J}_{k}\). Moreover, \(\|y^{j}-\bar {x}\|\leq \|y^{j}-x^{k}\|+\|x^{k}-\bar {x}\|\rightarrow 0\) for all \(j\in \hat {J}_{k}\). From Proposition 1 and Theorem 3, we obtain that for any fixed 𝜖 > 0 there exist ρ > 0, \(\hat {\lambda }_{1,k+1},\hat {\lambda }_{2,k+1}\in \mathbb {R}^{k+1}\), and \(\hat {J}_{k}\) satisfying (34) such that \(y^{j}\in B(\bar {x};\rho )\) for all \(j\in \hat {J}_{k}\) and

$$ \begin{aligned} & f(y)\geq \ f(y^{j})+{\langle\xi^{j}_{f}},y-y^{j}\rangle-\epsilon\|y-y^{j}\|,\ y\in X\cap B(\bar{x};\rho), \\ & G(y,\bar{\alpha},\bar{E})\geq \ G(y^{j},\bar{\alpha},\bar{E})+\langle\xi^{k,j}_{g},y-y^{j}\rangle-\epsilon\|y-y^{j}\|,\ y\in X\cap B(\bar{x};\rho). \end{aligned} $$

Using (35) together with (36), for \(j\in \hat {J}_{k}\) we see that

$$ \begin{array}{@{}rcl@{}} && \ \max\left\{f(y)-f(x^{k}) ,G(y,\bar{\alpha},\bar{E})\right\} \\ &\geq & \ \underset{j\in \hat{J}_{k}}{\sum}\left( \hat{\lambda}_{1,k+1}^{j}\left( f(y)-f(x^{k})\right)+\hat{\lambda}_{2,k+1}^{j} G(y,\bar{\alpha},\bar{E})\right)\\ &\geq & \ \underset{j\in \hat{J}_{k}}{\sum}\left( \hat{\lambda}_{1,k+1}^{j}\left( f(y^{j})+{\langle\xi^{j}_{f}},y-y^{j}\rangle-\epsilon\|y-y^{j}\|-f(x^{k})\right)\right.\\ &&+\left.\hat{\lambda}_{2,k+1}^{j}\left( G(y^{j},\bar{\alpha},\bar{E})+\langle\xi^{k,j}_{g},y-y^{j}\rangle-\epsilon\|y-y^{j}\|\right)\right)\\ &= & \underset{j\in \hat{J}_{k}}{\sum}\left( -\epsilon\hat{\lambda}^{j}\|y-y^{j}\|+\hat{\lambda}_{1,k+1}^{j}\left( -e^{f}_{k,j}+{\langle\xi^{j}_{f}},y-x^{k}\rangle\right)\right.\\ &&\left.+\hat{\lambda}_{2,k+1}^{j} \left( -e^{g}_{k,j}+\langle\xi^{k,j}_{g},y-x^{k}\rangle + G(x^{k},\bar{\alpha},\bar{E}) \right)\right)\\ &\geq & \ -C^{k+1} +G^{+}(x^{k},\bar{\alpha},\bar{E})+\langle S^{k+1},y-x^{k}\rangle-\left\langle \underset{j\in \hat{J}_{k}}{\sum}\hat{\eta}^{j}(y^{j}-x^{k}),y-x^{k}\right\rangle\\ &&-\epsilon\underset{j\in \hat{J}_{k}}{\sum}\hat{\lambda}^{j}\|y-y^{j}\|\\ &\geq & \ -C^{k+1}+\langle S^{k+1},y-x^{k}\rangle-\left\langle \underset{j\in \hat{J}_{k}}{\sum}\hat{\eta}^{j}(y^{j}-x^{k}),y-x^{k}\right\rangle\\ &&-\epsilon\underset{j\in \hat{J}_{k}}{\sum}\hat{\lambda}^{j}\left( \|y-x^{k}\|+\|y^{j}-x^{k}\|\right). \end{array} $$

From Theorem 1 and Lemma 9, the sequence {Sk} is bounded; thus, passing to a subsequence if necessary, we may assume that \(S^{k+1}\rightarrow \bar {S}\). By Lemmas 6 and 8, passing to the limit as \(k\rightarrow \infty \), we obtain that

$$ \max\left\{f(y)-f(\bar{x}) ,G(y,\bar{\alpha},\bar{E})\right\}\geq \langle \bar S,y-\bar x\rangle-\epsilon\|y-\bar{x}\|. $$

As already seen, \(S^{k+1}+v^{k+1}\rightarrow 0\) implies that \(-\bar S\in \partial I_{X}(\bar x)\); thus, \(\langle -\bar S,y-\bar x\rangle \leq 0\) for all \(y\in X\). Hence,

$$ \max\left\{f(y)-f(\bar{x}) ,G(y,\bar{\alpha},\bar{E})\right\}\geq -\epsilon\|y-\bar{x}\|. $$

By (11), we have \(G(y,\bar {\alpha },\bar {E})\leq G(y)+D(\bar {E},\bar {\alpha })\leq G(y)+tol_{2}\). For \(y\in X_{\rho ,\epsilon ,tol_{2}}=\{y\in X\cap B(\bar {x};\rho ): G(y)<-(\rho \epsilon +tol_{2})\}\) we have \(G(y)<-(\rho \epsilon +tol_{2})\) and \(\|y-\bar {x}\|\leq \rho \), so \(G(y,\bar {\alpha },\bar {E})<-\rho \epsilon \leq -\epsilon \|y-\bar {x}\|\). The maximum above is therefore attained by \(f(y)-f(\bar {x})\), which gives \(f(y)\geq f(\bar {x})-\epsilon \|y-\bar {x}\|\) for all \(y\in X_{\rho ,\epsilon ,tol_{2}}\). The proof is complete. □

5 Computational results

From Section 2.3, the SIP can be rewritten as a finitely constrained optimization problem when the lower level problem is concave. Hence, we consider both nonconvex nonsmooth constrained optimization problems and nonconvex nonsmooth SIP problems.

In this section, we report numerical results to verify the practical efficiency of our approach. We consider two classes of test problems, which are introduced in the Appendix.

Solvers. For comparison purposes, we use the following solvers:

  • PBM-inexact, the Proximal Bundle Method from [23].

  • ICPBM, the Infeasible Constrained Proximal Bundle Method from [17].

  • SolvOpt, a publicly available solver for local nonlinear optimization problems. The code is available at https://imsc.uni-graz.at/kuntsevich/solvopt/index.html.

  • ASA, the Adaptive Subdivision Algorithm from [41].

  • fseminf, the MATLAB Optimization Toolbox routine for finding a minimum of a semi-infinitely constrained multivariable nonlinear function.

Solvers ICPBM and SolvOpt are designed for nonsmooth nonconvex constrained problems; these methods are only applicable to Problems 1–8. Solvers ASA and fseminf are designed for semi-infinite programming problems; these methods are only applicable to Problems 9–23. All solvers were implemented in MATLAB R2018a (Windows 10, 64-bit). The experiments were performed on a PC with an Intel Core i5 processor at 2.4 GHz. We solve the quadratic subproblems using quadprog, which is available in the MATLAB Optimization Toolbox. For solving the NLP subproblems, ASA and Algorithm 1 use fmincon from the MATLAB Optimization Toolbox.

Parameters. The parameters of Algorithm 1 are set as follows: \(tol_{1}=10^{-6},\ tol_{2}=10^{-6},\ s=2,\ \gamma =2,\ i_{n\max \limits }=100.\) We choose the descent parameter as follows:

$$ m=\left\{\begin{aligned} & 0.01, \quad \text{if}\ \ n\leq 20 \\ & 0.25, \quad \text{if}\ \ 20<n\leq200 \\ & 0.55, \quad \text{if}\ \ n>200. \end{aligned}\right. $$

The proximal parameter is only updated at serious steps. We set μ0 = 5 and

$$ \mu_{k+1}:=\max\left( \mu_{\min},\min\left( \bar{\mu}_{k+1},\mu_{\max}\right)\right), $$

where

$$ \begin{array}{@{}rcl@{}} \bar{\mu}_{k+1}:=\frac{\|S^{k+1}-S^{k}\|^{2}}{\langle S^{k+1}-S^{k},y^{k+1}-y^{k}\rangle},\\ \mu_{\min}=1,\ \mu_{\max}=10^{5}.\\ \end{array} $$
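For concreteness, the following MATLAB-style sketch summarizes how the descent and proximal parameters above could be computed; μ0 = 5 is used as the starting value. The function names descent_parameter and proximal_parameter are ours, and the fallback to μmax when the curvature estimate ⟨S^{k+1}−S^{k}, y^{k+1}−y^{k}⟩ is not positive is an assumption, since the text does not specify this case.

    function m = descent_parameter(n)
    % Descent parameter m chosen according to the problem dimension n.
        if n <= 20
            m = 0.01;
        elseif n <= 200
            m = 0.25;
        else
            m = 0.55;
        end
    end

    function mu_next = proximal_parameter(S_new, S_old, y_new, y_old)
    % Safeguarded update of the proximal parameter, applied at serious steps only.
        mu_min = 1;  mu_max = 1e5;
        denom  = (S_new - S_old)' * (y_new - y_old);   % column vectors assumed
        if denom > 0
            mu_bar = norm(S_new - S_old)^2 / denom;
        else
            mu_bar = mu_max;   % assumed safeguard when the curvature estimate is not positive
        end
        mu_next = max(mu_min, min(mu_bar, mu_max));
    end

In this sketch, descent_parameter would be called once with the problem dimension, while proximal_parameter would be invoked only after a serious step, with S and y taken from two consecutive iterations.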

In ICPBM, the stopping tolerance tol has been set to \(10^{-6}\). For all other parameters of ICPBM, we have used the default values suggested by the code [17]. Likewise, for PBM-inexact and ASA we have used the default parameter values [23, 41]. In ASA, the maximum number of iterations is fixed at 10,000. In fseminf, the initial sampling interval is set as follows:

$$ \left\{ \begin{aligned} & a :\frac{b-a}{10^{3}}:b, \ \ & p=1, \\ & a_{i} :\frac{b_{i}-a_{i}}{10^{2}}:b_{i},i=1,2,\ & p=2. \end{aligned} \right. $$

Here, a : h : b denotes the uniform grid from a to b with step size h (MATLAB colon notation). The stopping criteria of fseminf are set as follows:

$$ TOLFun=10^{-6}, TOLCon=10^{-6}, TOLX=10^{-6}. $$

Results. We summarize the results of our numerical experiments in Tables 1 and 2, where we use the following notation:

  • P is the problem number,

  • NF is the number of function evaluations,

  • NT is the number of nodes in the final subdivision,

  • time is the CPU time in seconds,

  • f is the objective function value when the algorithm terminates,

  • g is the constraint function value when the algorithm terminates for constrained optimization problems,

  • G∗ measures the feasibility of x∗ when the algorithm terminates for SIP problems, i.e.,

    $$ G^{*}=\underset{t\in\bar{E}}{\max} g(x^{*},t), $$

    where x∗ is the solution obtained when the algorithm terminates (a short computational sketch follows this list) and

    $$ \bar{E}= \left\{ \begin{aligned} & a :\frac{b-a}{10^{6}}:b, & p=1, \\ & a_{i} :\frac{b_{i}-a_{i}}{10^{4}}:b_{i},i=1,2, & p=2. \end{aligned} \right. $$
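As an illustration of this feasibility check, the MATLAB-style sketch below evaluates g on the fine grid Ē for the one-dimensional case p = 1; the handle g(x,t), the variable x_star, and the function name feasibility_measure are ours and not part of the algorithm. For p = 2, the same idea applies to a two-dimensional grid (e.g., built with ndgrid).

    function G_star = feasibility_measure(g, x_star, a, b)
    % Evaluate the constraint g(x*, t) on the fine grid E_bar = a : (b-a)/10^6 : b
    % and report the largest value as the feasibility measure G*.
        E_bar  = a : (b - a)/1e6 : b;
        vals   = arrayfun(@(t) g(x_star, t), E_bar);
        G_star = max(vals);
    end

A value G∗ ≤ 0 indicates that the returned point satisfies the semi-infinite constraint on the grid Ē.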

The results presented in Table 1 show that the new solver Algorithm 1 is more reliable than the other solvers in finding an optimal solution to each problem. The solvers Algorithm 1 and PBM-inexact reach the same optimal value in most problems; only in a few cases (Problems 3 and 7 (n = 100)) does Algorithm 1 reach a lower objective value. Although less reliable than Algorithm 1, the solver PBM-inexact also performs quite well and yields feasible points in all problems. In some problems (Problem 7 (n = 20, 100, 200) and Problem 8 (n = 100, 200)), Algorithm 1 reaches a lower objective value than the solver ICPBM. The solver SolvOpt fails to find the optimal value in Problems 7 and 8 for n ≥ 20.

Figure 1 depicts the performance plot comparing the solvers Algorithm 1, PBM-inexact, ICPBM, and SolvOpt on Problems 1–8. It can be observed that Algorithm 1 performs much better than the other solvers in terms of CPU time. The solver SolvOpt is the fastest for n ≤ 10; however, it requires more computational effort than the other solvers on Problems 7 and 8 with n > 10.

Fig. 1 Performance plot for Algorithm 1, PBM-inexact, ICPBM, and SolvOpt

Overall, we conclude that the new solver Algorithm 1 is efficient for solving nonconvex, nonsmooth constrained optimization problems.

Table 1 Summary of numerical results with PBM-inexact, ICPBM, and SolvOpt
Table 2 Summary of numerical results with PBM-inexact, ASA, and fseminf

The results presented in Table 2 show that the solver Algorithm 1 is efficient in finding an optimal solution to each SIP problem. Both Algorithm 1 and PBM-inexact find a feasible point for each SIP problem. The solver ASA fails to find an approximately optimal solution in 40% of the cases, while fseminf fails in 80% of the cases. Table 2 also shows that Algorithm 1 requires far more nodes than ASA; nevertheless, Algorithm 1 is much faster than ASA. This is expected because the subproblems of ASA require more computational effort than those of Algorithm 1, and the subproblems of our method do not depend on the number of nodes in the subdivision.

Figure 2 depicts the performance plot comparing the solvers Algorithm 1, PBM-inexact, ASA, and fseminf on Problems 9–23. Algorithm 1 performs much better than the solvers PBM-inexact and ASA in terms of CPU time. The solver fseminf is the fastest in some cases; however, it fails to find the optimal value in most problems.

Fig. 2 Performance plot for Algorithm 1, PBM-inexact, ASA, and fseminf

Overall, we conclude that Algorithm 1 is efficient for solving nonconvex, nonsmooth SIP problems.

6 Conclusions

In this paper, we design a feasible proximal bundle method with convexification for SIP problems. We consider SIP problems that involve nonsmooth, nonconvex functions and present an upper bounding problem. The upper bounding problem is constructed from a concave relaxation of the lower level problem, which results in a restriction of the SIP. The relaxed lower level problem is a finite nonsmooth problem constructed with ideas from the αBB method of global optimization. We then introduce an improvement function for the upper bounding problem, construct a cutting plane model of the improvement function, reformulate the cutting plane model as a quadratic programming problem, and solve it. In contrast to the traditional bundle method, the cutting plane model involves new parameters E and α, which vary along the iterations. If \(D(E,\ \alpha )\rightarrow 0\), the upper bounding problems converge to the original SIP problem. We perform an initialization to generate a feasible starting point for our proximal bundle method. Under our refinement procedure and descent condition, each iterate is feasible. Reasonable convergence properties of our algorithm are obtained under mild assumptions. The presented numerical experiments confirm that Algorithm 1 is efficient for solving nonsmooth, nonconvex SIP problems. The feasibility of Algorithm 1 is demonstrated through both theoretical analysis and numerical experiments.