1 Introduction

Many practical optimization problems involve both continuous and integer variables, and such problems are generally modeled as mixed-integer nonlinear programming problems (MINLPs). A problem of this type is called a convex MINLP if its objective and constraint functions are convex; otherwise, it is a nonconvex MINLP.

Since MINLPs provide a powerful framework for modeling many optimization problems mathematically, they have been used in a wide range of applications, such as electricity transmission and electric power systems, the design of water distribution networks, the process industry, finance, scheduling, and production planning and control. We refer readers to [1,2,3,4,5,6,7,8,9] for more details on applications. The study of MINLPs and of their solution algorithms has been an active area of research over the past decades. These algorithms are mainly based on branch-and-bound (B&B), the extended cutting plane method, Lagrangian relaxation, outer approximation (OA) and generalized Benders decomposition (GBD). Readers are invited to consult [10,11,12,13,14,15,16,17,18,19,20,21,22,23] for more details on these solution methods.

One focus of this paper is OA. Duran and Grossmann [15] first introduced OA to solve MINLPs whose objective and constraint functions are affine in the integer variables and convex in the continuous variables. Fletcher and Leyffer [16] improved OA to solve convex MINLPs with continuously differentiable functions and considered a nonsmooth extension in the context of penalty functions. Kesavan et al. [17] applied OA to separable nonconvex MINLPs and proposed outer approximation algorithms that solve several relaxed master problems and nonlinear programming problems. Drewes and Ulbrich [24] used OA to solve a mixed-integer second-order cone programming problem with a linear objective function and an affine constraint function. Eronen and Mäkelä [25] applied generalized OA to nonsmooth convex MINLPs. Building on the above references, we study OA for an MINLP defined by nonsmooth convex functions and use subgradients to reformulate the problem as an equivalent mixed-integer linear program (MILP). An outer approximation algorithm is given for solving this MILP, and several illustrative examples are computed with this algorithm. Compared with the aforementioned references, one part of the main work in this paper is to show the existence of subgradients that are applicable to the reformulation and to provide a complete proof of the equivalence between the MINLP and the reformulated problem.

The other focus of this paper is GBD. Benders [20] first introduced this decomposition for solving mixed-integer linear programming problems and for exploiting the structure of complicated problems. Geoffrion [21] generalized this decomposition method to deal with MINLPs whose subproblems are nonlinear programs. Hooker and Ottosson [8] studied logic-based Benders decomposition and applied it to planning and scheduling problems (cf. [6, 7] and references therein). As the other part of the main work in this paper, we study GBD and a generalized Benders algorithm for an MINLP with a conic constraint. The main idea of this algorithm is presented, and several examples together with their solutions obtained by the generalized Benders algorithm are provided.

The remainder of this paper is organized as follows. In Sect. 2, we give some definitions and preliminaries to be used in this paper. In Sect. 3, we consider OA and the outer approximation algorithm and use this algorithm to solve nonsmooth convex MINLP examples. Section 4 is devoted to the study of GBD and the generalized Benders algorithm. It is shown that several convex MINLP examples with conic constraint can be solved by this algorithm. Section 5 is devoted to the proofs for results in Sects. 3 and 4. Conclusions of this paper are given in Sect. 6.

2 Preliminaries

Let \(\Vert \cdot \Vert \) be the norm on the Euclidean space \({\mathbb {R}}^n\) and denote by \(\langle \cdot , \cdot \rangle \) the inner product of two elements of \({\mathbb {R}}^n\). For any subset \(C\subset {\mathbb {R}}^n\), we denote by \(\mathrm{int}(C)\) and \(\mathrm{cl}(C)\) the interior and the closure of C, respectively.

Let A be a closed convex set of \({\mathbb {R}}^n\) and \(x\in A\). We denote by

$$\begin{aligned} T(A, x):=\mathrm{cl}({\mathbb {R}}_+(A-x)) \end{aligned}$$

the contingent cone of A at x. Thus, \(v\in T(A, x)\) if and only if there exist \(v_k\rightarrow v\) and \(t_k\rightarrow 0^+\) such that \(x+t_kv_k\in A\) for all \(k\in {\mathbb {N}}\), where \({\mathbb {N}}\) denotes the set of all natural numbers.

We denote by

$$\begin{aligned} N(A, x):=\{\gamma \in {\mathbb {R}}^n: \langle \gamma , y-x\rangle \le 0\ \ \mathrm{for\;all} \ y\in A\} \end{aligned}$$

the normal cone of A at x. It is known that \(N(A, x)\) is the dual of \(T(A, x)\); that is,

$$\begin{aligned} N(A, x)=\{\gamma \in {\mathbb {R}}^n: \langle \gamma , v\rangle \le 0\ \ \mathrm{for\;all} \ v\in T(A, x)\}. \end{aligned}$$
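
As a simple illustration of these two cones (the concrete computation is ours; the interval \([0,2]\) reappears as the set X in Example 3.2 below), take \(A=[0,2]\subset {\mathbb {R}}\). Then

$$\begin{aligned} T(A,0)=[0,+\infty ),\ \ N(A,0)=(-\infty ,0],\ \ \mathrm{and} \ \ T(A,1)={\mathbb {R}},\ \ N(A,1)=\{0\}, \end{aligned}$$

so the normal cone reduces to \(\{0\}\) at any interior point of A.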

Let \(\varphi : {\mathbb {R}}^n\rightarrow {\mathbb {R}}\cup \{+\infty \}\) be a proper lower semicontinuous convex function. We denote by \( \mathrm{dom}(\varphi ):=\{u\in {\mathbb {R}}^n: \varphi (u)<+\infty \}\) the domain of \(\varphi \). Let \(x\in \mathrm{dom}(\varphi )\). We denote by

$$\begin{aligned} \partial \varphi (x):=\{\alpha \in {\mathbb {R}}^n: \langle \alpha , y-x\rangle \le \varphi (y)-\varphi (x)\ \ \mathrm{for\ all} \ y\in {\mathbb {R}}^n\} \end{aligned}$$
(1)

the subdifferential of \(\varphi \) at x. Each vector in \(\partial \varphi (x)\) is said to be a subgradient of \(\varphi \) at x.
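
As a standard illustration of (1) (added here for the reader's convenience), consider the absolute value function on \({\mathbb {R}}\):

$$\begin{aligned} \varphi (x)=|x|:\quad \partial \varphi (0)=[-1,1] \ \ \mathrm{and} \ \ \partial \varphi (x)=\{\mathrm{sgn}(x)\} \ \ \mathrm{for} \ x\ne 0, \end{aligned}$$

since \(|y|\ge \alpha y\) holds for all \(y\in {\mathbb {R}}\) exactly when \(|\alpha |\le 1\).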

Let \(K\subset {\mathbb {R}}^m\) be a closed and convex cone with nonempty interior. The partial order on \({\mathbb {R}}^m\) induced by K is defined as follows: for any \(z_1, z_2\in {\mathbb {R}}^m\),

$$\begin{aligned} z_1\preceq _K z_2\Longleftrightarrow z_2-z_1\in K\ \mathrm{and} \ z_1\prec _K z_2\Longleftrightarrow z_2-z_1\in \mathrm{int}(K). \end{aligned}$$
(2)

For a function \(\psi : {\mathbb {R}}^n\rightarrow {\mathbb {R}}^m\), recall that \(\psi \) is said to be K-convex if

$$\begin{aligned} \psi (\lambda x_1+(1-\lambda )x_2)\preceq _K\lambda \psi (x_1)+(1-\lambda )\psi (x_2)\ \ \forall x_1, x_2\in {\mathbb {R}}^n \ \mathrm{and} \ \forall \lambda \in [0, 1]. \end{aligned}$$

For the case when \(\psi : {\mathbb {R}}^n\rightarrow {\mathbb {R}}\) and \(K:=[0, +\infty [\), K-convexity of \(\psi \) reduces to the convexity of the real-valued function \(\psi \).
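
Two elementary facts (ours, recorded for later use) may help in checking K-convexity: every affine map \(\psi (x)=Ax+b\) is K-convex for any such cone K, and for \(K={\mathbb {R}}^m_+\) the K-convexity of \(\psi \) amounts to the convexity of each component \(\psi _i\). More generally, for a closed convex cone K one has the scalarized characterization

$$\begin{aligned} \psi \ \mathrm{is} \ K\text{-convex} \ \Longleftrightarrow \ \langle u^*, \psi (\cdot )\rangle \ \mathrm{is\ convex\ for\ every} \ u^*\in K^+, \end{aligned}$$

where \(K^+\) is the dual cone of K defined in (18) below; this characterization is one way to verify the K-convexity claims in the examples of Sect. 4.2.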

The following lemma, proved in [19, Proposition 2.3], will be used in our analysis.

Lemma 2.1

Let \(\phi :{\mathbb {R}}^n\times {\mathbb {R}}^p\rightarrow {\mathbb {R}}\) be a continuous and convex function and suppose that \(({\bar{x}}, {\bar{y}})\in {\mathbb {R}}^n\times {\mathbb {R}}^p\). Then for any \(\alpha \in \partial \phi (\cdot ,{\bar{y}})({\bar{x}})\), there exists \(\beta \in {\mathbb {R}}^p\) such that \((\alpha , \beta )\in \partial \phi ({\bar{x}}, {\bar{y}})\).

We conclude this section with the following lemma (cf. [26, Theorem 2.4.18]).

Lemma 2.2

Let \(\varphi : {\mathbb {R}}^n\rightarrow {\mathbb {R}}\) be a continuous convex function. Define

$$\begin{aligned} \varphi _+(x):=\max \{\varphi (x), 0\}, \ \ \forall x\in {\mathbb {R}}^n. \end{aligned}$$

Then \(\varphi _+\) is a convex and continuous function and

$$\begin{aligned} \partial \varphi _+(x)=[0, 1]\partial \varphi (x) \end{aligned}$$

holds for all \(x\in {\mathbb {R}}^n\) with \(\varphi (x)=0\), where \([0, 1]\partial \varphi (x):=\{t\gamma : t\in [0, 1]\ { and}\ \gamma \in \partial \varphi (x)\}\) for any \(x\in {\mathbb {R}}^n\).
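
For instance (a one-dimensional check of Lemma 2.2, added for illustration), take \(\varphi (x)=x\) on \({\mathbb {R}}\), so that \(\varphi (0)=0\) and \(\partial \varphi (0)=\{1\}\); then

$$\begin{aligned} \varphi _+(x)=\max \{x, 0\} \ \ \mathrm{and} \ \ \partial \varphi _+(0)=[0, 1]\partial \varphi (0)=[0, 1], \end{aligned}$$

which agrees with the direct computation of the subdifferential of \(\max \{\cdot , 0\}\) at its kink.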

3 OA for MINLPs with Nonsmooth Convex Functions

In this section, we study OA and an outer approximation algorithm for MINLPs with nonsmooth convex functions. To illustrate this algorithm, we compute nonsmooth convex MINLP examples with detailed computing procedure, including the choice of nonnegative multipliers and subgradients that satisfy the KKT conditions and the computation of normal cone.

3.1 Overview of an Outer Approximation Algorithm

This subsection is devoted to an overview of OA for MINLPs with nonsmooth convex functions. The main idea of this method is to reformulate the MINLP as an equivalent mixed-integer linear programming (MILP) problem and then to construct an outer approximation algorithm that solves a sequence of relaxed MILPs using suitable solvers or software packages.

Suppose that \(f,g_i: {\mathbb {R}}^n\times {\mathbb {R}}^p\rightarrow {\mathbb {R}}\) \((i=1,\ldots ,m)\) are continuous convex functions (not necessarily differentiable), \(X\subset {\mathbb {R}}^n\) is a nonempty compact convex set and \(Y\subset {\mathbb {Z}}^p\) is a set of integer points. A general nonsmooth MINLP is defined as follows:

$$\begin{aligned} \mathrm{(P)} \left\{ \begin{array}{l} \min \limits _{x,y} \ f(x, y)\\ \mathrm{s.t.}\ \ g_i(x, y)\le 0, i=1,\ldots ,m,\\ \ \ \ \ \ \ \, x\in X, y\in Y\ \mathrm{integer}. \end{array} \right. \end{aligned}$$
(3)

For any fixed \(y\in Y\), we consider the following subproblem P(y):

$$\begin{aligned} P(y) \left\{ \begin{array}{l} \min \limits _{x} \ f(x, y)\\ \mathrm{s.t.}\ \ g_i(x, y)\le 0, i=1,\ldots ,m,\\ \ \ \ \ \ \ \, x\in X. \end{array} \right. \end{aligned}$$

We divide Y into two disjoint subsets:

$$\begin{aligned} T:=\{y\in Y: P(y) \ \mathrm{is\ feasible}\} \ \ \mathrm{and} \ \ S:=\{y\in Y: P(y) \ \mathrm{is\ infeasible}\}. \end{aligned}$$
(4)

We assume that problem (P) satisfies Slater constraint qualification:

(A1) For any \(y\in T\), there is \({{\hat{x}}}\in X\) satisfying \(g_i({{\hat{x}}}, y)<0\) for all \(i=1,\ldots ,m\).

Let \(y\in Y\) be given. Then exactly one of following two cases occurs:

Case 1: \(y\in T\). We assume that \(y=y_j\) for some \(y_j\in T\).

Solve subproblem \(P(y_j)\) and obtain an optimal solution \(x_j\in X\). By assumption (A1) and KKT conditions (cf. [27]), there exist \(\lambda _{j,1},\ldots ,\lambda _{j,m}\ge 0\) such that

$$\begin{aligned} \sum _{i=1}^m\lambda _{j,i}g_i(x_j,y_j)=0 \ \mathrm{and} \ 0\in \partial f(\cdot ,y_j)(x_j)+\sum \limits _{i=1}^m\lambda _{j,i}\partial g_i(\cdot ,y_j)(x_j)+ N(X, x_j).\nonumber \\ \end{aligned}$$
(5)

This and Lemma 2.1 imply that there exist \(\alpha _j\in \partial f(\cdot ,y_j)(x_j), \xi _{j,i}\in \partial g_i(\cdot ,y_j)(x_j)\) and \(\beta _j\in {\mathbb {R}}^p,\eta _{j,i}\in {\mathbb {R}}^p\) such that \((\alpha _j, \beta _j)\in \partial f(x_j,y_j)\), \((\xi _{j,i},\eta _{j,i})\in \partial g_i(x_j,y_j)\)\((i=1,\ldots ,m)\) and

$$\begin{aligned} -\alpha _j-\sum \limits _{i=1}^m\lambda _{j,i}\xi _{j,i}\in N(X, x_j). \end{aligned}$$
(6)

Case 2: \(y\not \in T\). We assume that \(y=y_l\) for some \(y_l\in S\).

Let \(J_l\) be a subset of \(\{1,\ldots ,m\}\) such that there is some \({\tilde{x}}\in X\) satisfying

$$\begin{aligned} g_i({\tilde{x}}, y_l)< 0,\ \ \forall i\in J_l. \end{aligned}$$

Denote by \(I_l:=\{1,\ldots ,m\}\backslash J_l\) the complement of \(J_l\) and consider the following subproblem \(F(y_l)\):

$$\begin{aligned} F(y_l)\left\{ \begin{array}{l} \min \limits _{x} \ \sum \limits _{i\in I_l}[g_i(x, y_l)]_+:=\min \limits _{x}\sum \limits _{i\in I_l}\max \{g_i(x, y_l), 0\}\\ \mathrm{s.t.}\ \ g_i(x, y_l)\le 0\ \ \forall i\in J_l,\\ \ \ \ \ \ \ \, x\in X. \end{array} \right. \end{aligned}$$
(7)

Solve subproblem \(F(y_l)\) and obtain the optimal solution \(x_l\). By KKT conditions, there exist \(\lambda _{l,i}\ge 0\) for all \(i\in J_l\) such that

$$\begin{aligned}&\sum \limits _{i\in J_l}\lambda _{l,i}g_i(x_l,y_l)=0 \ \mathrm{and} \ 0\in \sum \limits _{i\in I_l}\partial [g_i(\cdot ,y_l)]_+(x_l)\nonumber \\&\quad +\sum \limits _{i\in J_l}\lambda _{l,i}\partial g_i(\cdot ,y_l)(x_l)+ N(X, x_l). \end{aligned}$$
(8)

This together with Lemmas 2.1 and 2.2 implies that there exist \(\lambda _{l,i}\in [0,1]\) for all \(i\in I_l\) and \(\xi _{l,i}\in \partial g_i(\cdot ,y_l)(x_l)\), \(\eta _{l,i}\in {\mathbb {R}}^p\)\((\forall i\in I_l\cup J_l)\) such that \((\xi _{l,i}, \eta _{l,i})\in \partial g_i(x_l,y_l)\) for all \(i\in I_l\cup J_l=\{1,\ldots ,m\}\) and

$$\begin{aligned}&-\sum \limits _{i\in I_l\cup J_l}\lambda _{l,i}\xi _{l,i}\in N(X,x_l),\nonumber \\&\lambda _{l,i}=0,\ \mathrm{if} \ i\in I_l\ \mathrm{with} \ g_i(x_l,y_l)<0, \nonumber \\&\lambda _{l,i}=1,\ \mathrm{if} \ i\in I_l\ \mathrm{with} \ g_i(x_l,y_l)>0. \end{aligned}$$
(9)

The following proposition provides constraints to exclude integers producing infeasible subproblems. The proof is given in Sect. 5.

Proposition 3.1

Consider the following constraint:

$$\begin{aligned}&g_i(x_l, y_l)+(\xi _{l,i}, \eta _{l,i})^T\begin{pmatrix}x-x_l\\ y-y_l\end{pmatrix}\le 0, i=1,\ldots ,m, \nonumber \\&\quad x\in X, y\in Y. \end{aligned}$$
(10)

Then for any \(({{\hat{x}}},{{\hat{y}}})\in X\times Y\) feasible to the constraints in (10), one has \({{\hat{y}}}\not = y_l\).

By virtue of the subgradients given by (5), (6), (8) and (9), one can establish the following MILP master program (MP):

$$\begin{aligned} \mathrm{(MP)}\left\{ \begin{array}{l} \min \limits _{x,y,\theta } \ \theta \\ \mathrm{s.t.}\ \ f(x_j, y_j)+(\alpha _j, \beta _j)^T\begin{pmatrix}x-x_j\\ y-y_j\end{pmatrix}\le \theta \ \ \ \forall y_j\in T, \\ \ \ \ \ \ \ \, g_i(x_j, y_j)+(\xi _{j,i}, \eta _{j,i})^T\begin{pmatrix}x-x_j\\ y-y_j\end{pmatrix}\le 0\ \ \forall y_j\in T, i=1,\ldots ,m,\\ \ \ \ \ \ \ \, g_i(x_l, y_l)+(\xi _{l,i}, \eta _{l,i})^T\begin{pmatrix}x-x_l\\ y-y_l\end{pmatrix}\le 0\ \ \forall y_l\in S, i=1,\ldots ,m,\\ \ \ \ \ \ \ \, x\in X, y\in Y\ \mathrm{integer}. \end{array} \right. \end{aligned}$$
(11)

The following proposition shows the equivalence between the master program (MP) and the MINLP problem (P). The reader is invited to consult Sect. 5 for the proof.

Proposition 3.2

The master program (MP) of (11) is equivalent to problem (P) in the sense that both problems have the same optimal value and the optimal solution \(({\bar{x}}, {\bar{y}})\) to problem (P) corresponds to the optimal solution \(({\bar{x}}, {\bar{y}}, {\bar{\theta }})\) to (MP) with \({\bar{\theta }}=f({\bar{x}}, {\bar{y}})\).

Remark 3.1

The KKT conditions play an important role in the equivalent reformulation of MINLP problem (P), and Example 3.2 shows that the subgradients, if not satisfying KKT conditions, may be insufficient for this equivalent reformulation. \(\square \)

Based on Proposition 3.2, the outer approximation algorithm can be constructed to solve a relaxed MP of (11).

At iteration k, subsets \(T^k\) and \(S^k\) are defined as follows:

$$\begin{aligned} T^k:=T\cap \{y_1,\ldots , y_k\} \ \ \mathrm{and} \ \ S^k:=S\cap \{y_1,\ldots , y_k\}. \end{aligned}$$

Exactly one of the following cases occurs:

(a) \(y_k\in T^k\): Solve subproblem \(P(y_k)\) and denote the optimal solution by \(x_k\). Solve the following optimization problem \({ OP}(x_k,y_k)\):

$$\begin{aligned} { Find } \left[ \begin{array}{l} (\alpha _k,\beta _k)\\ (\xi _{k,i},\eta _{k,i})\\ \lambda _{k,i}\in {\mathbb {R}} \end{array} \right] { such \ that} \ \left\{ \begin{array}{l} -\alpha _k-\sum \limits _{i=1}^m\lambda _{k,i}\xi _{k,i}\in N(X, x_k),\\ (\alpha _k,\beta _k)\in \partial f(x_k,y_k),\\ (\xi _{k,i},\eta _{k,i})\in \partial g_i(x_k,y_k), i=1,\ldots ,m,\\ \lambda _{k,i}\ge 0, i=1,\ldots ,m. \end{array} \right. \end{aligned}$$
(12)

Obtain subgradients \((\alpha _k,\beta _k)\) and \((\xi _{k,i},\eta _{k,i})\) for all \(i=1,\ldots ,m\).

(b) \(y_k\in S^k\): Take \(I_k,J_k\subset \{1,\ldots ,m\}\) and create subproblem \(F(y_k)\) as in (7). Solve \(F(y_k)\) and denote the solution by \(x_k\). Solve the following optimization problem \({ Sub\text{- }OP}(x_k,y_k)\):

$$\begin{aligned} { Find\ } \left[ \begin{array}{l} (\xi _{k,i},\eta _{k,i})\\ \lambda _{k,i}\in {\mathbb {R}} \end{array} \right] { such \ that} \ \left\{ \begin{array}{l} -\sum \limits _{i=1}^m\lambda _{k,i}\xi _{k,i}\in N(X, x_k),\\ (\xi _{k,i},\eta _{k,i})\in \partial g_i(x_k,y_k),\ i=1,\ldots ,m,\\ \lambda _{k,i}\ge 0, \ i=1,\ldots ,m,\\ \lambda _{k,i}=0, \ \ \ \ i\in I_k\ { with} \ g_i(x_k,y_k)<0,\\ \lambda _{k,i}=1, \ \ \ \ i\in I_k\ { with} \ g_i(x_k,y_k)>0,\\ \lambda _{k,i}\in [0,1], i\in I_k\ { with} \ g_i(x_k,y_k)=0. \end{array} \right. \end{aligned}$$
(13)

Obtain subgradients \((\xi _{k,i},\eta _{k,i})\) for all \(i=1,\ldots ,m\).

Let \(UBD^k:=\min \{f(x_j,y_j): y_j\in T^k\}\). Consider the following relaxed master program \(MP^k\):

$$\begin{aligned} MP^k\left\{ \begin{array}{l} \min \limits _{x,y,\theta }\ \theta \\ \mathrm{s.t.}\ \ \, \theta <UBD^k\\ \ \ \ \ \ \ \, f(x_j, y_j)+(\alpha _j, \beta _j)^T\begin{pmatrix}x-x_j\\ y-y_j\end{pmatrix}\le \theta \ \ \forall y_j\in T^k,\ \ \\ \ \ \ \ \ \, g_i(x_j, y_j)+(\xi _{j,i}, \eta _{j,i})^T\begin{pmatrix}x-x_j\\ y-y_j\end{pmatrix}\le 0\ \ \forall y_j\in T^k, i=1,\ldots ,m,\\ \ \ \ \ \ \, g_i(x_l, y_l)+(\xi _{l,i}, \eta _{l,i})^T\begin{pmatrix}x-x_l\\ y-y_l\end{pmatrix}\le 0\ \ \forall y_l\in S^k, i=1,\ldots ,m,\\ \ \ \ \ \ \, x\in X, y\in Y\ \mathrm{integer}. \end{array} \right. \end{aligned}$$
(14)

The outer approximation algorithm for solving the relaxed master problems \(MP^k\) is stated as follows.

[Algorithm 1: the outer approximation algorithm (pseudocode figure)]
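
Since the pseudocode figure is not reproduced here, the following Python-style sketch records the loop of Algorithm 1 as we read it from the description above; it is not the authors' implementation. All callables passed to the routine are assumptions supplied by the user: they wrap the NLP/MILP solver calls (BARON in Sect. 3.2) and are assumed to return the data described in (12), (13) and (14).

```python
# Sketch of Algorithm 1 (outer approximation); all solver callables are user-supplied.
def outer_approximation(f, is_P_feasible, solve_P, solve_F,
                        solve_OP, solve_sub_OP, solve_master, y1, eps):
    T_cuts, S_cuts = [], []              # linearization data for y_j in T^k and y_l in S^k
    UBD, incumbent = float("inf"), None  # best objective value found so far and its (x, y)
    y_k = y1
    while True:
        if is_P_feasible(y_k):
            x_k = solve_P(y_k)                                 # optimal solution of P(y_k)
            T_cuts.append((x_k, y_k, solve_OP(x_k, y_k)))      # subgradients/multipliers as in (12)
            if f(x_k, y_k) < UBD:
                UBD, incumbent = f(x_k, y_k), (x_k, y_k)
        else:
            x_k = solve_F(y_k)                                 # feasibility subproblem (7)
            S_cuts.append((x_k, y_k, solve_sub_OP(x_k, y_k)))  # subgradients as in (13)
        # relaxed master MP^k of (14); the strict bound theta < UBD^k is imposed as theta <= UBD - eps
        feasible, y_hat = solve_master(T_cuts, S_cuts, UBD - eps)
        if not feasible:
            return incumbent, UBD        # eps-optimal point, or (None, inf) if (P) is infeasible
        y_k = y_hat                      # next integer assignment, as in Example 3.1
```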

The following proposition is on the convergence of Algorithm 1. The proof is given in Sect. 5.

Proposition 3.3

Suppose that the cardinality of Y is finite. Then either problem (P) is infeasible or Algorithm 1 terminates in a finite number of steps at an optimal value of problem (P).

3.2 Illustrative Examples

In this subsection, Algorithm 1 is tested on numerical examples, and the results show that it is applicable to solving MINLPs with nonsmooth convex functions. All computational tests for these examples are implemented in MATLAB version R2014a and solved by BARON version 16.10.6 (cf. [12]).

Example 3.1

Let \(X=\{(u,v): -1\le u+v\le 1 \ { and} \ -1\le u-v\le 1 \}\) and \(Y=\{-1, 1, 2\}\). Consider the following convex MINLP:

$$\begin{aligned}&\min \limits _{x,y} \ f(x, y)=\max \{-u+y, v^2+y-1\} \nonumber \\&\mathrm{s.t.}\ \ g_1(x, y)=\max \{u-y, v+y-1\}\le 0, \nonumber \\&\qquad g_2(x,y)=\max \{2u-4+y, v-y\}\le 0, \nonumber \\&\qquad x=(u,v)\in X, y\in Y. \end{aligned}$$
(15)

Choose the tolerance \(\varepsilon =0.005\). Let \(y_1=2\) be the initial point.

At the first iteration, check subproblem \(P(y_1)\), which is feasible. Solve subproblem \(P(y_1)\) and denote the optimal solution by \(x_1=(0,-1)\). Solve optimization problem \({ OP}(x_1,y_1)\) as in (12) and obtain multipliers \(\lambda _{1,1}=2, \lambda _{1,2}=0.5\) and subgradients \( (\alpha _1,\beta _1)=((-1,0),1)\), \((\xi _{1,1},\eta _{1,1})=((0,1),1), (\xi _{1,2},\eta _{1,2})=((2,0),1)\). Solve the following relaxed MILP problem:

$$\begin{aligned} MP^1\left\{ \begin{array}{l} \min \limits _{x,y,\theta }\ \theta \\ \mathrm{s.t.}\ \ \theta \le f(x_1,y_1)-\varepsilon ,\\ \ \ \ \ \ \ \, f(x_1, y_1)+(\alpha _1, \beta _1)^T\begin{pmatrix}x-x_1\\ y-y_1\end{pmatrix}\le \theta , \\ \ \ \ \ \ \ \, g_i(x_1, y_1)+(\xi _{1,i}, \eta _{1,i})^T\begin{pmatrix}x-x_1\\ y-y_1\end{pmatrix}\le 0, i=1,2,\\ \ \ \ \ \ \ \, x\in X, y\in Y. \end{array} \right. \end{aligned}$$

Denote the optimal solution by \(({{\hat{x}}}, {{\hat{y}}}, {\hat{\theta }})=((1,0), -1, -2)\).
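
As an aside, the subgradient and multiplier data reported for this first iteration can be checked numerically. The short script below (ours, plain NumPy, not part of Algorithm 1) samples points of \(X\times Y\) and of X and verifies the subgradient inequalities of (1) and the normal-cone condition (6) for the values listed above.

```python
# Numerical sanity check of the first-iteration data in Example 3.1 (ours, not part of Algorithm 1).
import numpy as np

rng = np.random.default_rng(0)

f  = lambda u, v, y: max(-u + y, v**2 + y - 1)
g1 = lambda u, v, y: max(u - y, v + y - 1)
g2 = lambda u, v, y: max(2*u - 4 + y, v - y)
in_X = lambda u, v: abs(u + v) <= 1 and abs(u - v) <= 1

x1, y1 = np.array([0.0, -1.0]), 2
alpha, beta = np.array([-1.0, 0.0]), 1.0              # reported (alpha_1, beta_1)
xi  = [np.array([0.0, 1.0]), np.array([2.0, 0.0])]    # reported xi_{1,1}, xi_{1,2}
eta = [1.0, 1.0]                                      # reported eta_{1,1}, eta_{1,2}
lam = [2.0, 0.5]                                      # reported lambda_{1,1}, lambda_{1,2}

# subgradient inequalities of (1) on sampled points of X x Y
for _ in range(10000):
    u, v = rng.uniform(-1, 1, 2)
    if not in_X(u, v):
        continue
    y = rng.choice([-1, 1, 2])
    dx, dy = np.array([u, v]) - x1, y - y1
    assert f(u, v, y)  >= f(*x1, y1)  + alpha @ dx + beta * dy - 1e-12
    assert g1(u, v, y) >= g1(*x1, y1) + xi[0] @ dx + eta[0] * dy - 1e-12
    assert g2(u, v, y) >= g2(*x1, y1) + xi[1] @ dx + eta[1] * dy - 1e-12

# normal-cone condition (6): gamma = -alpha_1 - sum_i lambda_{1,i} xi_{1,i} must satisfy
# <gamma, x - x_1> <= 0 for every x in X
gamma = -alpha - lam[0]*xi[0] - lam[1]*xi[1]
for _ in range(10000):
    u, v = rng.uniform(-1, 1, 2)
    if in_X(u, v):
        assert gamma @ (np.array([u, v]) - x1) <= 1e-12

print("reported subgradients and multipliers pass the sampled checks")
```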

In the second iteration, we set \(y_2={{\hat{y}}}=-1\) and check subproblem \(P(y_2)\) by Algorithm 1; \(P(y_2)\) is infeasible. Consider the following subproblem \(F(y_2)\):

$$\begin{aligned} F(y_2) \left\{ \begin{array}{l} \min \limits _{x} \ \max \{g_1(x,y_2),0\}\\ \mathrm{s.t.}\ \ g_2(x, y_2)=\max \{2u-5, v+1\}\le 0, \\ \ \ \ \ \ \ \, x=(u,v)\in X. \end{array} \right. \end{aligned}$$

Solve subproblem \(F(y_2)\) and denote the optimal solution by \(x_2=(0,-1)\). Solve optimization problem \({ Sub\text{- }OP}(x_2,y_2)\) as in (13) and obtain subgradients \((\xi _{2,1}, \eta _{2,1})=((1,0),-1)\) and \((\xi _{2,2}, \eta _{2,2})=((0,1),-1)\). According to Algorithm 1, consider the following constraints:

$$\begin{aligned} g_i(x_2, y_2)+(\xi _{2,i}, \eta _{2,i})^T\begin{pmatrix}x-x_2\\ y-y_2\end{pmatrix}\le 0, i=1,2. \end{aligned}$$

Add these two constraints into \(MP^1\) and obtain the relaxed MILP problem \(MP^2\). Solve \(MP^2\) and denote the optimal solution by \(({{\hat{x}}}, {{\hat{y}}}, {\hat{\theta }})=((1,0), 1, 0)\). This concludes the second iteration.

In the third iteration, we set \(y_3={{\hat{y}}}=1\) and check subproblem \(P(y_3)\), which is feasible. Solve subproblem \(P(y_3)\) and denote the optimal solution by \(x_3=(1,0)\). Solve optimization problem \({ OP}(x_3,y_3)\) as in (12) and obtain multipliers \(\lambda _{3,1}=\lambda _{3,2}=0\) and subgradients \((\alpha _3,\beta _3)=((0, 0),1), (\xi _{3,1},\eta _{3,1})=((1,0),-1), (\xi _{3,2},\eta _{3,2})=((0,1),-1)\).

By Algorithm 1, consider the following constraints:

$$\begin{aligned}&\theta \le f(x_3,y_3)-\varepsilon ,\\&f(x_3, y_3)+(\alpha _3, \beta _3)^T\begin{pmatrix}x-x_3\\ y-y_3\end{pmatrix}\le \theta , \\&g_i(x_3, y_3)+(\xi _{3,i}, \eta _{3,i})^T\begin{pmatrix}x-x_3\\ y-y_3\end{pmatrix}\le 0,i=1,2. \end{aligned}$$

Add these four constraints into \(MP^2\) and obtain the relaxed MILP problem \(MP^3\). As \(MP^3\) is infeasible, the algorithm stops and the incumbent \((x_3, y_3)=((1,0), 1)\) is an \(\varepsilon \)-optimal solution. \(\square \)

The following example shows that the chosen subgradients, if they do not satisfy the KKT conditions, may not be sufficient for solving nonsmooth convex MINLPs.

Example 3.2

Let \(X=[0, 2]\) and \(Y=\{1, 2, 3\}\). Consider the following convex MINLP:

$$\begin{aligned} \mathrm{(P)} \left\{ \begin{array}{l} \min \limits _{x,y} \ f(x, y)=x+y\\ \mathrm{s.t.}\ \ g_1(x, y)=\max \{-x+y+1, x-y+1\}\le 0, \\ \ \ \ \ \ \ \, g_2(x,y)=x-y\le 0,\\ \ \ \ \ \ \ \, x\in X, y\in Y. \end{array} \right. \end{aligned}$$
(16)

It is not hard to verify that (P) of (16) is infeasible.

Let \(y_1=1\) be the initial point and check subproblem \(P(y_1)\). As \(P(y_1)\) is infeasible, we consider the following subproblem \(F(y_1)\):

$$\begin{aligned} F(y_1) \left\{ \begin{array}{l} \min \limits _{x} \ \max \{g_1(x,1),0\}\\ \mathrm{s.t.}\ \ g_2(x, y_1)=x-1\le 0, \\ \ \ \ \ \ \ \, x\in X. \end{array} \right. \end{aligned}$$

Solve subproblem \(F(y_1)\) and denote the optimal solution by \(x_1=1\). Solve optimization problem \({ Sub-OP}(x_1,y_1)\) as in (13) and obtain subgradients \(\xi _{1,1}=-1, \xi _{1,2}=1\), multiplier \(\lambda _{1,2}=1\) and numbers \(\eta _{1,1}=1, \eta _{1,2}=-1\) such that

$$\begin{aligned} \xi _{1,1}+\lambda _{1,2}\xi _{1,2}\in N(X,x_1)\ \ \mathrm{and} \ \ (\xi _{1,i},\eta _{1,i})\in \partial g_i(x_1,y_1), i=1,2. \end{aligned}$$

According to Algorithm 1, consider the following constraints:

$$\begin{aligned}&g_1(x_1, y_1)+(\xi _{1,1}, \eta _{1,1})^T\begin{pmatrix}x-x_1\\ y-y_1\end{pmatrix}\le 0,\\&g_2(x_1, y_1)+(\xi _{1,2}, \eta _{1,2})^T\begin{pmatrix}x-x_1\\ y-y_1\end{pmatrix}\le 0. \end{aligned}$$

This system of constraints is infeasible, which shows that problem (P) of (16) is infeasible.

However, if taking

$$\begin{aligned} (\xi _{1,1}^*,\eta _{1,1}^*)=(1,-1)\in \partial g_1(x_1,y_1)\ \ \mathrm{and} \ \ (\xi _{1,2}^*,\eta _{1,2}^*)=(1,-1)\in \partial g_2(x_1,y_1), \end{aligned}$$

then one can verify that these two subgradients do not satisfy KKT conditions; that is,

$$\begin{aligned} \not \exists \lambda _{1,2}\ge 0 \ \ \mathrm{satisfying} \ \ \xi _{1,1}^*+\lambda _{1,2}\xi _{1,2}^*\in N(X,x_1)=\{0\}. \end{aligned}$$

Consider the following constraints:

$$\begin{aligned}&g_1(x_1, y_1)+(\xi _{1,1}^*, \eta _{1,1}^*)^T\begin{pmatrix}x-x_1\\ y-y_1\end{pmatrix}\le 0,\\&g_2(x_1, y_1)+(\xi _{1,2}^*, \eta _{1,2}^*)^T\begin{pmatrix}x-x_1\\ y-y_1\end{pmatrix}\le 0, \end{aligned}$$

which has a feasible point \(({{\hat{x}}},{{\hat{y}}})=(0,1)\). This means that OA for this MINLP may generate an infinite loop between the point \((x_1, y_1)\) and the point (0, 1). Thus, with such subgradients, OA fails for the MINLP problem (16). \(\square \)

4 GBD for MINLPs with Conic Constraint

In this section, we study GBD for solving MINLPs with a vector conic constraint. Several MINLP examples with a vector conic constraint are solved, and the detailed computing procedure, including the choice of optimal Lagrange multipliers for the subproblems, is also provided.

4.1 Overview of the Generalized Benders Algorithm

This subsection is devoted to an overview of the generalized Benders algorithm for solving MINLPs with a vector conic constraint. The main idea of this algorithm is to decompose the original MINLP into subproblems, then to reformulate the MINLP as an equivalent master program via optimal Lagrange multipliers of the subproblems, and finally to construct an algorithm that solves a sequence of relaxed master programs.

Let \(X\subset {\mathbb {R}}^n\) be a closed bounded convex set, \(Y\subset {\mathbb {Z}}^p\) be a set of integer points and \(K\subset {\mathbb {R}}^m\) be a closed convex cone with a nonempty interior which specifies a partial order \(\preceq _K\) on \({\mathbb {R}}^m\) as in (2). The MINLP problem is defined as follows:

$$\begin{aligned} \mathrm{(VOP)} \left\{ \begin{array}{l} \min \limits _{x,\, y} \ f(x, y)\\ \mathrm{s.t.}\ \ g(x, y)\preceq _K 0, \\ \ \ \ \ \ \ \, x\in X, y\in Y\ \mathrm{integer}, \end{array} \right. \end{aligned}$$
(17)

where \(f: {\mathbb {R}}^n\times {\mathbb {R}}^p\rightarrow {\mathbb {R}}\) and \(g: {\mathbb {R}}^n\times {\mathbb {R}}^p\rightarrow {\mathbb {R}}^m\) satisfy that \(f(\cdot , y)\) is convex and \(g(\cdot , y)\) is K-convex on X for any fixed \(y\in Y\).

We denote by

$$\begin{aligned} K^+:=\{z^*\in {\mathbb {R}}^m: \langle z^*,z\rangle \ge 0\ \mathrm{for\ all} \ z\in K\} \end{aligned}$$
(18)

the dual cone of K.
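
For example (these computations are ours), the nonnegative orthant is self-dual, and for the planar cone appearing in Example 4.1 below one has

$$\begin{aligned} K={\mathbb {R}}^m_+\ \Rightarrow \ K^+={\mathbb {R}}^m_+; \qquad K=\{(u,v)\in {\mathbb {R}}^2: u\ge 2|v|\}\ \Rightarrow \ K^+=\{(a,b)\in {\mathbb {R}}^2: 2a\ge |b|\}, \end{aligned}$$

where the second identity follows since K is generated by the rays (2, 1) and (2, -1), so \((a,b)\in K^+\) exactly when \(2a+b\ge 0\) and \(2a-b\ge 0\).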

Let \(y\in Y\). We consider the following subproblem P(y):

$$\begin{aligned} P(y) \left\{ \begin{array}{l} \min \limits _{x} \ f(x, y)\\ \mathrm{s.t.}\ \ g(x, y)\preceq _K 0, \\ \ \ \ \ \ \ \, x\in X. \end{array} \right. \end{aligned}$$

Recall that the perturbation function \(v_y(\cdot )\) associated with subproblem P(y) is defined by

$$\begin{aligned} v_y: {\mathbb {R}}^m\rightarrow {\mathbb {R}}\cup \{+\infty \},\ z\mapsto v_y(z):=\inf _{x\in X}\{f(x, y): g(x, y)\preceq _K z\}, \end{aligned}$$
(19)

where we use the convention that the infimum over the empty set is \(+\infty \).

Recall that a linear continuous functional \({\bar{u}}^*\in {\mathbb {R}}^m\) is said to be an optimal Lagrange multiplier for problem P(y) if there exists \({\bar{x}}\in X\) such that \(({\bar{x}}, {\bar{u}}^*)\) satisfies the following optimality conditions:

$$\begin{aligned}&\mathrm{(i)}\ f({\bar{x}},y)+\langle {\bar{u}}^*, g(\bar{x},y)\rangle =\min \limits _{x\in X}\big \{f(x,y)+\langle {\bar{u}}^*, g(x,y)\rangle \big \}, \nonumber \\&\mathrm{(ii)}\ \langle {\bar{u}}^*, g({\bar{x}},y)\rangle =0,\nonumber \\&\mathrm{(iii)}\ {\bar{u}}^*\in K^+, \nonumber \\&\mathrm{(iv)}\ g({\bar{x}},y)\preceq _K0. \end{aligned}$$
(20)
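
To fix ideas, here is a minimal scalar illustration of (20) (our own construction, not one of the examples below): take \(X=[0,2]\), \(K={\mathbb {R}}_+\) (so \(K^+={\mathbb {R}}_+\)), \(f(x,y)=x\) and \(g(x,y)=1-x\), with no dependence on y. Then \(({\bar{x}}, {\bar{u}}^*)=(1, 1)\) satisfies (20):

$$\begin{aligned} \mathrm{(i)}\ {\bar{x}}+{\bar{u}}^*(1-{\bar{x}})=1=\min _{x\in [0,2]}\{x+1\cdot (1-x)\},\ \ \mathrm{(ii)}\ {\bar{u}}^*(1-{\bar{x}})=0,\ \ \mathrm{(iii)}\ {\bar{u}}^*=1\in K^+,\ \ \mathrm{(iv)}\ 1-{\bar{x}}=0\preceq _K 0. \end{aligned}$$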

The following condition is assumed to hold for (VOP) in (17):

(A2) Suppose that the functions \(f(\cdot , y), g(\cdot , y)\) are continuous on X for any \(y\in Y\) and that the Slater constraint qualification

$$\begin{aligned} g({{\hat{x}}}, y)\prec _K0\ \ { for \ some} \ {{\hat{x}}}\in X \end{aligned}$$
(21)

holds for any \(y\in Y\) such that subproblem P(y) is feasible.

The following proposition establishes properties of the perturbation function \(v_y(\cdot )\) and of the optimal Lagrange multipliers for subproblem P(y). The proof is given in Sect. 5.

Proposition 4.1

The following statements hold:

(i):

For any \(y\in Y\), the perturbation function \(v_y(\cdot )\) is convex, lower semicontinuous and monotone nonincreasing (with respect to \(\preceq _K\)) on \({\mathbb {R}}^m\), and \({ dom}(v_y)\) is convex and closed.

(ii):

For any \(y\in Y\) such that subproblem P(y) is feasible, \(v_y(\cdot )\) is continuous at \(0=(0,\ldots ,0)\in {\mathbb {R}}^m\) and \(-\partial v_y(0)\) is the set of all optimal Lagrange multipliers of subproblem P(y).
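
Continuing the scalar illustration given after (20) (again our own), the perturbation function (19) and the description of the optimal Lagrange multipliers in statement (ii) can be made completely explicit:

$$\begin{aligned} v(z)=\inf _{x\in [0,2]}\{x: 1-x\le z\}= \left\{ \begin{array}{ll} \max \{1-z, 0\}, &{} z\ge -1,\\ +\infty , &{} z<-1, \end{array} \right. \end{aligned}$$

so v is convex, nonincreasing and continuous at 0 with \(v(z)=1-z\) near \(z=0\); hence \(\partial v(0)=\{-1\}\) and \(-\partial v(0)=\{1\}\), which is exactly the optimal Lagrange multiplier \({\bar{u}}^*=1\) exhibited after (20).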

The following proposition is a key result for reformulating MINLP (VOP) as an equivalent master program. The reader is invited to consult Sect. 5 for the proof.

Proposition 4.2

The following statements hold:

(i):

Let \(y\in Y\). Then subproblem P(y) is feasible if and only if

$$\begin{aligned} \inf _{x\in X}\langle u^*, g(x, y)\rangle \le 0 \ \ \forall u^*\in K^+\ { with}\ \Vert u^*\Vert =1. \end{aligned}$$
(22)
(ii):

Let \(y\in Y\) be such that subproblem P(y) is feasible. Then

$$\begin{aligned} v_y(0)=\max _{u^*\in K^+}\Big \{\inf _{x\in X}\big \{f(x, y)+\langle u^*, g(x, y)\rangle \big \}\Big \}. \end{aligned}$$
(23)

Based on Proposition 4.2, we have the following master program (MP) which is equivalent to MINLP problem (VOP) in (17):

$$\begin{aligned} \mathrm{(MP)}\left\{ \begin{array}{l} \min \limits _{y,\theta }\ \theta \\ \mathrm{s.t.}\ \ \inf \limits _{x\in X}\{f(x, y)+\langle u^*, g(x, y)\rangle \}\le \theta , \ \ \forall u^*\in K^+,\\ \ \ \ \ \ \, \inf \limits _{x\in X}\langle z^*, g(x, y)\rangle \le 0,\ \ \forall z^*\in K^+\ \mathrm{with}\ \Vert z^*\Vert =1,\\ \ \ \ \ \ \ \, y\in Y,\, \theta \in {\mathbb {R}}. \end{array} \right. \end{aligned}$$
(24)

To solve the equivalent master program (MP) in (24), we consider the following relaxation of (MP) and the optimization problems \(L\text{- }P(y_k)\) and \(In\text{- }P(y_k)\) associated with subproblem \(P(y_k)\).

For any subsets \(T\subset K^+\) and \(S\subset K^+\), the relaxed master program \(\mathrm{RMP}(T, S)\) is defined as follows:

$$\begin{aligned} \mathrm{RMP}(T, S)\left\{ \begin{array}{l} \mathop {\min }\limits _{y,\theta } \ \theta \\ \mathrm{s.t.}\ \mathop {\inf }\limits _{x\in X}\{f(x, y)+\langle u^*, g(x, y)\rangle \}\le \theta , \ \ \forall u^*\in T,\\ \ \ \ \ \ \, \mathop {\inf }\limits _{x\in X}\langle z^*, g(x, y)\rangle \le 0,\ \ \forall z^*\in S,\\ \ \ \ \ \ \, y\in Y,\, \theta \in {\mathbb {R}}. \end{array} \right. \end{aligned}$$
(25)

For any \(y_k\in Y\) such that \(P(y_k)\) is feasible, the optimization problem \({ L-P}(y_k)\) is defined as follows:

$$\begin{aligned}&{ Find }\ \left[ \begin{array}{l} x_k\in X\\ u_k^*\in {\mathbb {R}}^m \end{array} \right] ~~ { such \ that} ~~ \left\{ \begin{array}{l} x_k\in \mathop {argmin}\limits _{x\in X}\{f(x,y_k)+\langle u^*_k, g(x,y_k)\rangle \},\\ \langle u_k^*, g(x_k,y_k)\rangle =0,\\ u_k^*\in K^+,\\ g(x_k,y_k)\preceq _K 0. \end{array} \right. \end{aligned}$$
(26)

For any \(y_k\in Y\) such that \(P(y_k)\) is infeasible, the optimization problem \({ In\text{- }P}(y_k)\) is defined as follows:

$$\begin{aligned}&{ Find\ }\ \left[ \begin{array}{l} z_k^*\in {\mathbb {R}}^m \end{array} \right] ~~ { such \ that} ~~ \left\{ \begin{array}{l} \mathop {\inf }\limits _{x\in X}\langle z_{k}^*, g(x, y_{k})\rangle >0,\\ z_k^*\in K^+,\\ \Vert z^*_k\Vert =1. \end{array} \right. \end{aligned}$$
(27)

The generalized Benders algorithm is stated as follows. This algorithm can be used to find the optimum of MINLP problem (VOP).

[Algorithm 2: the generalized Benders algorithm (pseudocode figure)]
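
Again, since the pseudocode figure is not reproduced here, the following Python-style sketch records the loop of Algorithm 2 as we read it from Sect. 4.1 and the examples of Sect. 4.2; it is not the authors' implementation. The callables are assumptions: solve_L_P is assumed to return an optimal solution, an optimal Lagrange multiplier and the value \(v_{y_k}(0)\) of a feasible subproblem as in (26), solve_In_P returns a separating multiplier as in (27), and solve_RMP solves the relaxed master program (25).

```python
# Sketch of Algorithm 2 (generalized Benders decomposition); callables are user-supplied.
def generalized_benders(is_P_feasible, solve_L_P, solve_In_P, solve_RMP,
                        y1, z1_star, eps, max_iter=100):
    T_mult, S_mult = [], [z1_star]       # multiplier sets T^k (optimality) and S^k (feasibility)
    UBD, incumbent = float("inf"), None  # best subproblem value v_y(0) found so far and its (x, y)
    y_k = y1
    for _ in range(max_iter):
        if is_P_feasible(y_k):
            x_k, u_star, v_k = solve_L_P(y_k)      # (26): solution x_k, multiplier u*_k, value v_{y_k}(0)
            T_mult.append(u_star)
            if v_k < UBD:
                UBD, incumbent = v_k, (x_k, y_k)
        else:
            S_mult.append(solve_In_P(y_k))         # (27): separating multiplier z*_k
        y_next, theta = solve_RMP(T_mult, S_mult)  # (25): next integer point and lower bound
        if UBD <= theta + eps:                     # termination test in the form used in Example 4.1
            return incumbent, UBD                  # eps-tolerance optimal solution of (VOP)
        y_k = y_next
    raise RuntimeError("iteration limit reached")
```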

The following proposition shows the termination of Algorithm 2 after a finite number of steps. The detailed proof is presented in Sect. 5.

Proposition 4.3

Suppose that the cardinality of Y is finite. Then for any given \(\varepsilon >0\), Algorithm 2 terminates in a finite number of steps.

4.2 Illustrative Examples

In this subsection, MINLP examples are solved by Algorithm 2. The algorithm is implemented in MATLAB version R2014a, and the examples are all solved by BARON version 16.10.6 (cf. [12]). The implementation of Algorithm 2 also demonstrates that GBD is applicable to solving MINLPs with a conic constraint.

Example 4.1

Let \(K:=\{(u,v)\in {\mathbb {R}}^2:u\ge 2|v|, u\ge 0\}\), \(Y=\{-1, 0, 1, 2\}\) and \(X=\{(u,v)\in {\mathbb {R}}^2: -1\le u+v\le 1 \ { and} \ -1\le u-v\le 1 \}\). Consider the following MINLP with vector conic constraint:

$$\begin{aligned}&\min \limits _{x,y} \ f(x, y) \nonumber \\&\mathrm{s.t.}\ \ g(x, y)\preceq _K 0, \nonumber \\&\qquad x=(u,v)\in X, \nonumber \\&\qquad y\in Y, \end{aligned}$$
(28)

where \(f(x, y)=\max \{-u+y, v^2+y-1\}\), \(g(x,y)=(g_1(x,y),g_2(x,y))\) with

$$\begin{aligned} g_1(x, y)=u^2+v^2-\frac{1}{2}y\ \ { and} \ \ g_2(x,y)=-\frac{1}{2}v^2+\frac{1}{4}y. \end{aligned}$$

It is easy to verify that \(g(\cdot ,y)\) is K-convex. We solve the MINLP in (28) by Algorithm 2.

Choose the tolerance \(\varepsilon =0.005\). Let \(y_1=-1\) be the initial point and \(z_1^*=(1,0)\in K^+\) with \(\Vert z^*_1\Vert =1\). Solve optimization problem \(L\text{- }P(y_1)\) as in (26) and obtain an optimal Lagrange multiplier \(u^*_1=(0.991,0.01)\). Set \(T^1=\{u_1^*\}\), \(S^1=\{z^*_1\}\) and \(\mathrm{UBD}^1=v_{y_1}(0)=2\). Solve the relaxed master program \(\mathrm{RMP}(T^1, S^1)\) and denote the optimal solution by \((y_2,\theta _2)=(1,0.2547)\). Since the termination criterion is not satisfied, we check subproblem \(P(y_2)\) which is feasible and obtain optimal Lagrange multiplier \(u^*_2=(0,0)\) by solving \({ L-P}(y_2)\) as in (26). Set \(T^2=T^1\cup \{u_2^*\}\), \(S^2=S^1\) and \(\mathrm{UBD}^2=\min \{\mathrm{UBD}^1, v_{y_2}(0)\}=0\). Solve the relaxed master program \(\mathrm{RMP}(T^2, S^2)\) and denote the optimal solution by \((y_3,\theta _3)=(1,0)\). Since \(\mathrm{UBD}^2\le \theta _3+\varepsilon \), it follows that the termination criterion is satisfied and the algorithm stops. This means that \((x_3,y_3)=((1,0), 1)\) is an \(\varepsilon \)-tolerance optimal solution of problem (28). \(\square \)

Example 4.2

Let \(K:=\{(u,v)\in {\mathbb {R}}^2:-u\ge 2|v|, u\le 0\}\), \(Y=\{-2, -1, 0, 1\}\) and \(X=\{(u,v): u^2+v^2\le 2 \}\). Consider the following MINLP with vector conic constraint:

$$\begin{aligned}&\min \limits _{x,y} \ f(x, y) \nonumber \\&\mathrm{s.t.}\ \ g(x, y)\preceq _K 0, \nonumber \\&\qquad x=(u,v)\in X, \nonumber \\&\qquad y\in Y, \end{aligned}$$
(29)

where \(f(x, y)=-2u+v+y\), \(g(x,y)=(g_1(x,y),g_2(x,y))\) with

$$\begin{aligned} g_1(x, y)=-u^2-v^2+y\ \ { and} \ \ g_2(x,y)=-\frac{1}{2}u^2+\frac{1}{4}v^2+\frac{1}{2}y. \end{aligned}$$

Choose the tolerance \(\varepsilon =0.005\). Let \(y_1=1\) be the initial point and \(z_1^*=(-1,0)\in K^+\) with \(\Vert z^*_1\Vert =1\). Solve optimization problem \(L\text{- }P(y_1)\) as in (26) and obtain the optimal Lagrange multiplier \(u^*_1=(-0.001,0.002)\). Set \(T^1=\{u_1^*\}\), \(S^1=\{z^*_1\}\) and \(\mathrm{UBD}^1=v_{y_1}(0)=-1.001\). Solve the relaxed master program \(\mathrm{RMP}(T^1, S^1)\) and denote the optimal solution by \((y_2,\theta _2)=(-2,-5.1697)\). Noting that the termination criterion is not satisfied, we check subproblem \(P(y_2)\) (which is feasible) and get the optimal value \(v_{y_2}(0)=-5.1623\) of \(P(y_2)\). This implies that the termination criterion is satisfied and the algorithm stops. Thus \((x_2,y_2)=((1.2649,-0.6325), -2)\) is an \(\varepsilon \)-tolerance optimal solution of MINLP problem (29). \(\square \)

Let us turn back to Example 3.1 in Sect. 3. We next use the generalized Benders decomposition procedure of Algorithm 2 to solve this example.

For this case, one has that \(K=\{(u,v):u\ge 0, v\ge 0\}\) and \(K^+=K\).

Let \(y_1=2\) be the initial point and \(z_1^*=(0,1)\in K^+\) with \(\Vert z^*_1\Vert =1\). Obtain optimal Lagrange multiplier \(u^*_1=(1,0.001)\) by solving the optimization problem \({ L-P}(y_1)\) as in (26). Set \(T^1=\{u_1^*\}\), \(S^1=\{z^*_1\}\) and \(\mathrm{UBD}^1=v_{y_1}(0)=2\). Solve the relaxed master program \(\mathrm{RMP}(T^1, S^1)\) and denote the optimal solution by \((y_2,\theta _2)=(1,-0.002)\). Since \(v_{y_2}(0)\le \theta _2+\varepsilon \), the termination criterion is satisfied and then the algorithm stops. Thus \((x_2,y_2)=((1,0), 1)\) is an \(\varepsilon \)-tolerance optimal solution of MINLP problem (15). \(\square \)

5 Proofs of Results in Sects. 3 and 4

This section is devoted to the proofs of Propositions 3.1–3.3 and 4.1–4.3.

Proof of Proposition 3.1

From \(y_l\in S\), one has \(\sum _{i\in I_l}[g_i(x_l, y_l)]_+>0\). Suppose on the contrary that there exists \({{\hat{x}}}\in X\) such that \(({{\hat{x}}}, y_l)\) is feasible to the constraints in (10). Then

$$\begin{aligned} g_i(x_l, y_l)+\langle \xi _{l,i},{{\hat{x}}}-x_l\rangle \le 0, \ \ \forall i\in I_l\cup J_l=\{1,\ldots ,m\}. \end{aligned}$$

This implies that

$$\begin{aligned} \sum _{i=1}^m(\lambda _{l,i}g_i(x_l, y_l)+\langle \lambda _{l,i}\xi _{l,i},{{\hat{x}}}-x_l\rangle )\le 0 \end{aligned}$$

and by (8), one has

$$\begin{aligned} \sum _{i\in I_l}\lambda _{l,i}g_i(x_l, y_l)+\sum _{i=1}^m\langle \lambda _{l,i}\xi _{l,i},{{\hat{x}}}-x_l\rangle \le 0. \end{aligned}$$
(30)

Noting that \({{\hat{x}}}-x_l\in T(X,x_l)\) by the convexity of X, it follows from (9) and (30) that

$$\begin{aligned} \sum _{i\in I_l}\lambda _{l,i}g_i(x_l, y_l)\le 0. \end{aligned}$$

By (9), one can verify that

$$\begin{aligned} \sum _{i\in I_l}[g_i(x_l, y_l)]_+=\sum _{i\in I_l}\lambda _{l,i}g_i(x_l, y_l)\le 0, \end{aligned}$$

which contradicts \(\sum _{i\in I_l}[g_i(x_l, y_l)]_+>0\). The proof is complete. \(\square \)

For the proof of Proposition 3.2, we need the following lemma.

Lemma 5.1

Let \(x_j\) solve subproblem \(P(y_j)\) and let the subgradients \((\alpha _j,\beta _j), (\xi _{j,i},\eta _{j,i})\) be given as in Case 1, where \(y_j\in T\). Then

$$\begin{aligned} \langle \alpha _j, x-x_j \rangle \ge 0,\ \forall x\in X\ { with}\ g_i(x_j, y_j)+\langle \xi _{j,i}, x-x_j\rangle \le 0, \forall i=1,\ldots ,m.\nonumber \\ \end{aligned}$$
(31)

Proof

Let \(x\in X\) be such that

$$\begin{aligned} g_i(x_j, y_j)+\langle \xi _{j,i}, x-x_j\rangle \le 0, \forall i=1,\ldots ,m. \end{aligned}$$

This and (5) imply that

$$\begin{aligned} \sum _{i=1}^m\langle \lambda _{j,i}\xi _{j,i}, x-x_j\rangle =\sum _{i=1}^m(\lambda _{j,i}g_i(x_j,y_j)+\langle \lambda _{j,i}\xi _{j,i}, x-x_j\rangle )\le 0. \end{aligned}$$
(32)

Noting that \(x-x_j\in T(X,x_j)\) by the convexity of X, it follows from (6) and (32) that

$$\begin{aligned} \langle \alpha _j, x-x_j\rangle =\Big \langle \alpha _j+\sum \limits _{i=1}^m\lambda _{j,i}\xi _{j,i}, x-x_j\Big \rangle -\Big \langle \sum \limits _{i=1}^m\lambda _{j,i}\xi _{j,i}, x-x_j\Big \rangle \ge 0. \end{aligned}$$

Hence (31) holds. The proof is complete. \(\square \)

Proof of Proposition 3.2

Suppose that \(({{\hat{x}}},\hat{y},{\hat{\theta }})\) is an optimal solution of master program (MP) in (11) and \((x_{j_0},y_{j_0})\) solves problem (P) in (3). We need to prove that \({\hat{\theta }}=f(x_{j_0},y_{j_0})\).

Note that \(({{\hat{x}}},{{\hat{y}}},{\hat{\theta }})\) is feasible to (MP) and thus \(({{\hat{x}}},{{\hat{y}}},{\hat{\theta }})\) satisfies constraints in (MP). By Proposition 3.1, one has \({{\hat{y}}}\not \in S\) and thus we can assume that \({{\hat{y}}}=y_{j_1}\) for some \(y_{j_1}\in T\). Then

$$\begin{aligned} f(x_{j_1}, y_{j_1})+(\alpha _{j_1}, \beta _{j_1})^T\begin{pmatrix}\hat{x}-x_{j_1}\\ 0\end{pmatrix}\le {\hat{\theta }} \end{aligned}$$
(33)

and

$$\begin{aligned} g_i(x_{j_1}, y_{j_1})+(\xi _{j_1,i}, \eta _{j_1,i})^T\begin{pmatrix}{{\hat{x}}}-x_{j_1}\\ 0\end{pmatrix}\le 0, i=1,\ldots ,m. \end{aligned}$$

This and Lemma 5.1 imply that \(\langle \alpha _{j_1}, \hat{x}-x_{j_1}\rangle \ge 0\) and by (33), one has

$$\begin{aligned} {\hat{\theta }}\ge f(x_{j_1}, y_{j_1})\ge f(x_{j_0}, y_{j_0}). \end{aligned}$$

On the other hand, \((x_{j_0},y_{j_0})\) is feasible to problem (P) and thus \(g_i(x_{j_0},y_{j_0})\le 0\) for all \(i=1,\ldots , m\). Noting that \((\alpha _j,\beta _j)\in \partial f(x_j,y_j),(\xi _{j,i},\eta _{j,i})\in \partial g_i(x_j,y_j)\) for all \(y_j\in T\) and \((\xi _{l,i},\eta _{l,i})\in \partial g_i(x_l,y_l)\) for all \(y_l\in S\), it follows that

$$\begin{aligned}&f(x_j, y_j)+(\alpha _j, \beta _j)^T\begin{pmatrix}x_{j_0}-x_j\\ y_{j_0}-y_j\end{pmatrix}\le f(x_{j_0},y_{j_0})\ \ \ \forall y_j\in T, \nonumber \\&g_i(x_j, y_j)+(\xi _{j,i}, \eta _{j,i})^T\begin{pmatrix}x_{j_0}-x_j\\ y_{j_0}-y_j\end{pmatrix}\le g_i(x_{j_0},y_{j_0})\le 0\ \ \forall y_j\in T, i=1,\ldots ,m,\nonumber \\&g_i(x_l, y_l)+(\xi _{l,i}, \eta _{l,i})^T\begin{pmatrix}x_{j_0}-x_l\\ y_{j_0}-y_l\end{pmatrix}\le g_i(x_{j_0},y_{j_0})\le 0\ \ \forall y_l\in S, i=1,\ldots ,m. \end{aligned}$$

This implies that \((x_{j_0},y_{j_0},f(x_{j_0},y_{j_0}))\) is feasible to (MP) and consequently

$$\begin{aligned} {\hat{\theta }} \le f(x_{j_0},y_{j_0}). \end{aligned}$$

Hence \({\hat{\theta }}=f(x_{j_0},y_{j_0})\). The proof is complete. \(\square \)

Proof of Proposition 3.3

We prove that no integer variable will be generated more than once. Granting this, Algorithm 1 terminates in a finite number of steps by the finite cardinality of Y.

At iteration k, suppose that \(({{\hat{x}}}, {{\hat{y}}}, \hat{\theta })\) is an optimal solution to the relaxed master program \(MP^k\). We need to show that \({{\hat{y}}}\not \in T^k\cup S^k\).

By Proposition 3.1, one has \({{\hat{y}}}\not \in S^k\) and thus it suffices to prove that \({{\hat{y}}} \not \in T^k\). Suppose on the contrary that \({{\hat{y}}}=y_{j_k}\) for some \(y_{j_k}\in T^k\). Then \(({{\hat{x}}}, y_{j_k}, \hat{\theta })\) is feasible to the relaxed master program \(MP^k\) and

$$\begin{aligned}&\hat{\theta }<UBD^k\le f(x_{j_k}, y_{j_k}), \nonumber \\&f(x_{j_k}, y_{j_k})+\langle \alpha _{j_k}, {{\hat{x}}}-x_{j_k}\rangle \le \hat{\theta },\nonumber \\&g_i(x_{j_k}, y_{j_k})+\langle \xi _{j_k,i}, {{\hat{x}}}-x_{j_k}\rangle \le 0, i=1,\ldots ,m. \end{aligned}$$
(34)

By Lemma 5.1, one has

$$\begin{aligned} \langle \alpha _{j_k}, {{\hat{x}}}-x_{j_k}\rangle \ge 0. \end{aligned}$$

This and (34) imply that \(f(x_{j_k}, y_{j_k})\le \hat{\theta }\), which contradicts \(\hat{\theta }<f(x_{j_k}, y_{j_k})\).

Now suppose that Algorithm 1 terminates at the k-th step for some k; that is, the relaxed master program \(MP^{k}\) is infeasible. To complete the proof, we need to show that the optimal value of problem (P) is attained.

Let \(r^*\) denote the optimal value of problem (P) in (3). If there exists \(y_j\in T^{k-1}\) such that \(f(x_j, y_j)=r^*\), then the conclusion holds. Next, we consider the case that \(f(x_j, y_j)>r^*\) for all \(y_j\in T^{k-1}\). Then \(UBD^{k-1}>r^*\) and \(y_k\in T^k\) (otherwise, \(y_k\in S^k\), \(UBD^{k}=UBD^{k-1}\) by Algorithm 1 and consequently \(MP^{k}\) is feasible, a contradiction). This implies that subproblem \(P(y_k)\) is feasible and thus \(f(x_k, y_k)\ge r^*\). We claim that \(f(x_k, y_k)=r^*\). (Otherwise \(f(x_k, y_k)> r^*\). Then \(r^*<UBD^{k}\) and thus one can verify that \(MP^k\) is feasible, a contradiction). The proof is complete. \(\square \)

Proof of Proposition 4.1

(i) Let \(y\in Y\). From the convexity of \(f(\cdot ,y)\) and the K-convexity of \(g(\cdot ,y)\), one can verify that \(v_y(\cdot )\) is convex and monotone nonincreasing, and thus \(\mathrm{dom}(v_y)\) is convex. Since X is compact and \(f(\cdot , y), g(\cdot , y)\) are continuous, it is easy to verify that \(\mathrm{dom}(v_y)\) is closed.

We next prove the lower semicontinuity of \(v_y(\cdot )\). Let \(z\in \mathrm{dom}(v_y)\) and \(z_k\rightarrow z\) with \(z_k\in \mathrm{dom}(v_y)\). Then for any \(k\in {\mathbb {N}}\), there exists \(x_k\in X\) such that \(g(x_k,y)\preceq _K z_k\) and

$$\begin{aligned} v_y(z_k)\ge f(x_k,y)-\frac{1}{k}. \end{aligned}$$

Noting that X is compact, without loss of generality, we can assume that \(x_k\rightarrow {\bar{x}}\in X\) (considering subsequence if necessary). Thus \(g({\bar{x}},y)\preceq _K z\) by the continuity of \(g(\cdot ,y)\). This implies that

$$\begin{aligned} \liminf _{k\rightarrow \infty }v_y(z_k)\ge \liminf _{k\rightarrow \infty }(f(x_k,y)-\frac{1}{k})\ge f({\bar{x}},y)\ge v_y(z). \end{aligned}$$

(ii) Let \(y\in Y\) be such that P(y) is feasible. By the Slater constraint qualification in (21), one has \(-g(\hat{x},y)\in \mathrm{int}(K)\) and then there exists \(\delta >0\) such that \(-g({{\hat{x}}},y)+B(0,\delta )\subset K\). This implies that \(0\in \mathrm{int}(\mathrm{dom}(v_y))\) and it follows from [28, Proposition 3.3] that \(v_y\) is continuous at 0.

Let \(-{\bar{u}}^*\in \partial v_y(0)\) and suppose that \({\bar{x}}\) is an optimal solution of subproblem P(y). We next prove that \(({\bar{x}}, {\bar{u}}^*)\) satisfies the optimality conditions in (20).

For any \(z\in K\), by \(-{\bar{u}}^*\in \partial v_y(0)\), one has

$$\begin{aligned} \langle {\bar{u}}^*,z\rangle \ge v_y(0)-v_y(z)\ge 0 \end{aligned}$$

as \(v_y(\cdot )\) is monotone nonincreasing. This implies that \(\bar{u}^*\in K^+\). Noting that \({\bar{x}}\) solves P(y), it follows that

$$\begin{aligned} g({\bar{x}},y)\preceq _K 0, \ v_y(g({\bar{x}},y))=v_y(0)\ \mathrm{and} \ \langle {\bar{u}}^*, g({\bar{x}},y)\rangle \ge v_y(0)-v_y(g({\bar{x}},y))=0.\nonumber \\ \end{aligned}$$
(35)

From \({\bar{u}}^*\in K^+\), one has \(\langle {\bar{u}}^*, g(\bar{x},y)\rangle \le 0\) and so \(\langle {\bar{u}}^*, g({\bar{x}},y)\rangle =0\) by (35).

For any \(x\in X\), by \(-{\bar{u}}^*\in \partial v_y(0)\), one has

$$\begin{aligned} f(x,y)\ge & {} v_y(g(x,y))\ge v_y(0)-\langle {\bar{u}}^*, g(x,y)\rangle =f({\bar{x}},y)\\&+\langle {\bar{u}}^*, g(\bar{x},y)\rangle -\langle {\bar{u}}^*, g(x,y)\rangle . \end{aligned}$$

This implies that \(f({\bar{x}},y)+\langle {\bar{u}}^*, g(\bar{x},y)\rangle =\min _{x\in X}\big \{f(x,y)+\langle {\bar{u}}^*, g(x,y)\rangle \big \}\). Hence \({\bar{u}}^*\) is an optimal Lagrange multiplier of P(y).

Conversely, let \({\bar{u}}^*\) be an optimal Lagrange multiplier of P(y). Then there exists \({\bar{x}}\in X\) such that \(({\bar{x}}, \bar{u}^*)\) satisfies the optimality conditions in (20). Let \(z\in \mathrm{dom}(v_y)\) and let \(x\in X\) satisfy \(g(x,y)\preceq _Kz\). Then \(\langle {\bar{u}}^*,z\rangle \ge \langle {\bar{u}}^*,g(x,y)\rangle \) and

$$\begin{aligned} f(x,y)+\langle {\bar{u}}^*,z\rangle \ge f(x,y)+\langle \bar{u}^*,g(x,y)\rangle \ge f({\bar{x}},y)+\langle {\bar{u}}^*,g(\bar{x},y)\rangle =v_y(0). \end{aligned}$$

This implies that \(v_y(z)\ge v_y(0)-\langle {\bar{u}}^*,z\rangle \) and thus \(-{\bar{u}}^*\in \partial v_y(0)\). The proof is complete. \(\square \)

Proof of Proposition 4.2

(i) Note that if P(y) is feasible, then there is \(x\in X\) such that \(g(x,y)\preceq _K0\) and thus the necessity part follows.

The sufficiency part. By (22), one has

$$\begin{aligned} \sup _{u^*\in K^+, \Vert u^*\Vert =1}\inf _{x\in X}\langle u^*, g(x, y)\rangle \le 0. \end{aligned}$$

This implies that

$$\begin{aligned} \max _{u^*\in K^+}\Big \{\inf _{x\in X}\langle u^*, g(x, y)\rangle \Big \}=0. \end{aligned}$$
(36)

Note that P(y) is feasible if and only if \(0\in \mathrm{dom}(v_y)\) and it suffices to prove that

$$\begin{aligned} 0\in \mathrm{dom}(v_y). \end{aligned}$$

Suppose on the contrary that \(0\not \in \mathrm{dom}(v_y)\). Since \(\mathrm{dom}(v_y)\) is convex and closed by Proposition 4.1(i), it follows from the separation theorem that there exist \(z^*\in {\mathbb {R}}^m\) with \(\Vert z^*\Vert =1\) and \(r\in {\mathbb {R}}\) such that

$$\begin{aligned} \inf \{\langle z^*,z\rangle : z\in \mathrm{dom}(v_y)\}>r>0. \end{aligned}$$
(37)

We first show that \(z^*\in K^+\). Let \(z\in K\) and \(k\in {\mathbb {N}}\). Take any \(x\in X\). Then \(kz\in K\) and \(g(x,y)+kz\in \mathrm{dom}(v_y)\). By (37), one has \(\langle z^*,g(x,y)+kz\rangle>r>0\) and thus

$$\begin{aligned} \frac{1}{k}\langle z^*,g(x,y)\rangle +\langle z^*,z\rangle>\frac{r}{k}>0. \end{aligned}$$

By taking limits as \(k\rightarrow \infty \), one has \(\langle z^*,z\rangle \ge 0\). Hence \(z^*\in K^+\).

Noting that \(f(\cdot , y)\) is continuous and X is compact, it follows that

$$\begin{aligned} \inf _{x\in X}f(x,y)>-\infty \end{aligned}$$

and for any \(t>0\), one has \(tz^*\in K^+\) and

$$\begin{aligned} \inf _{x\in X}\{f(x,y)+\langle tz^*, g(x,y)\rangle \}\ge \inf _{x\in X}f(x,y)+t\inf _{x\in X}\langle z^*, g(x,y)\rangle >\inf _{x\in X}f(x,y)+tr. \end{aligned}$$

By taking limits as \(t\rightarrow +\infty \), one has

$$\begin{aligned} \max _{u^*\in K^+}\Big \{\inf _{x\in X}\big \{f(x, y)+\langle u^*, g(x, y)\rangle \big \}\Big \}=+\infty , \end{aligned}$$

which contradicts (36).

(ii) Using weak duality, one has

$$\begin{aligned} \max _{u^*\in K^+}\Big \{\inf _{x\in X}\{f(x,y)+\langle u^*, g(x,y)\rangle \}\Big \}\le \inf _{x\in X}\{f(x,y): g(x,y)\preceq _K 0\}=v_y(0). \end{aligned}$$

On the other hand, suppose that \(-{\bar{u}}^*\in \partial v_y(0)\). By Proposition 4.1(ii), \({\bar{u}}^*\in K^+\) and for any \(x\in X\), one has

$$\begin{aligned} f(x,y)\ge v_y(g(x,y))\ge v_y(0)-\langle {\bar{u}}^*,g(x,y)\rangle \end{aligned}$$

and thus

$$\begin{aligned} \inf _{x\in X}\{f(x,y)+\langle {\bar{u}}^*,g(x,y)\rangle \}\ge v_y(0). \end{aligned}$$

This means that

$$\begin{aligned} \max _{u^*\in K^+}\Big \{\inf _{x\in X}\{f(x,y)+\langle u^*, g(x,y)\rangle \}\Big \}\ge v_y(0). \end{aligned}$$

Hence (23) holds. The proof is complete. \(\square \)

For the proof of Proposition 4.3, we need the following lemma.

Lemma 5.2

Let \(L:Y\times K^+\rightarrow {\mathbb {R}}\) be defined as

$$\begin{aligned} L(y, u^*):=\inf _{x\in X}\big \{f(x, y)+\langle u^*, g(x, y)\rangle \big \},\ \ \forall (y, u^*)\in Y\times K^+. \end{aligned}$$
(38)

Then for any \({\bar{y}}\in Y\), one has that \(L({\bar{y}}, \cdot )\) is continuous on \(K^+\).

Proof

Let \({\bar{y}}\in Y\) be fixed and \({\bar{u}}^*\in K^+\). We first prove that \(L({\bar{y}}, \cdot )\) is lower semicontinuous at \({\bar{u}}^*\). To do this, take any \(u^*_k\rightarrow {\bar{u}}^*\) with \(u^*_k\in K^+\). Then for any \(k\in {\mathbb {N}}\), there exists \(x_k\in X\) such that

$$\begin{aligned} f(x_k, {\bar{y}})+\langle u^*_k, g(x_k, {\bar{y}})\rangle =L({\bar{y}}, u^*_k) \end{aligned}$$
(39)

as X is compact and \(f(\cdot , {\bar{y}}), g(\cdot , {\bar{y}})\) are continuous. Without loss of generality, we can assume that \(x_k\rightarrow {\bar{x}}\in X\) (considering subsequence if necessary). By (39), one has

$$\begin{aligned} L({\bar{y}}, {\bar{u}}^*)\le & {} f({\bar{x}}, {\bar{y}})+\langle {\bar{u}}^*, g(\bar{x}, {\bar{y}})\rangle \le \lim _k(f(x_k, {\bar{y}})+\langle u^*_k, g(x_k, {\bar{y}})\rangle ). \end{aligned}$$

This implies that \(L({\bar{y}}, \cdot )\) is lower semicontinuous at \({\bar{u}}^*\).

We next show that \(L({\bar{y}}, \cdot )\) is upper semicontinuous at \({\bar{u}}^*\). For any \(x\in X\), one has

$$\begin{aligned} f(x, {\bar{y}})+\langle {\bar{u}}^*, g(x, {\bar{y}})\rangle= & {} \lim _k(f(x, {\bar{y}})+\langle u^*_k, g(x, {\bar{y}})\rangle )\ge \limsup _k L({\bar{y}}, u_k^*). \end{aligned}$$

This implies that

$$\begin{aligned} L({\bar{y}}, {\bar{u}}^*)=\inf _{x\in X}(f(x, {\bar{y}})+\langle {\bar{u}}^*, g(x, {\bar{y}})\rangle )\ge \limsup _k L({\bar{y}}, u_k^*) \end{aligned}$$

and consequently \(L({\bar{y}}, \cdot )\) is upper semicontinuous at \(\bar{u}^*\). Hence \(L({\bar{y}}, \cdot )\) is continuous at \({\bar{u}}^*\). The proof is complete. \(\square \)

Proof of Proposition 4.3

Suppose on the contrary that there exists \(\varepsilon _0>0\) such that Algorithm 2 generates an infinite sequence \(\{(y_k, \theta _k)\}\) such that subproblem \(P(y_k)\) is feasible and \(\theta _k\in {\mathbb {R}}\) for all k. Then for any k, \(u^*_k\in K^+\) is an optimal Lagrange multiplier of subproblem \(P(y_k)\), and thus \(u_k^*\in -\partial v_{y_k}(0)\) by Proposition 4.1(ii). From the relaxed master program \(\mathrm{RMP}(T^k,S^k)\), one can verify that \(\{\theta _k\}\) is nondecreasing and bounded above. By the finite cardinality of Y, there exist \({\bar{y}}\in Y\) and a subsequence \(\{k_i:i=1,2,\ldots \}\) such that \(y_{k_i}\equiv {\bar{y}}\) for all \(i\in {\mathbb {N}}\). Using Proposition 4.1(ii), one has that \(v_{{\bar{y}}}(\cdot )\) is continuous at 0, and it follows from [28, Proposition 1.11] that \(\partial v_{{\bar{y}}}(0)\) is closed and bounded. This implies that \(\{u^*_{k_i}\}\) is bounded. Thus, without loss of generality, we can assume that \((u^*_{k_i}, \theta _{k_i})\rightarrow ({\bar{u}}^*,{\bar{\theta }})\in -\partial v_{\bar{y}}(0)\times {\mathbb {R}}\).

By solving relaxed master program \(\mathrm{RMP}(T^{k_i},S^{k_i})\) in Algorithm 2, one has

$$\begin{aligned} \theta _{k_{i+1}}\ge L(y_{k_{i+1}}, u_{k_i}^*)=L({\bar{y}}, u_{k_i}^*). \end{aligned}$$

By Lemma 5.2, \(L({\bar{y}},\cdot )\) is continuous at \({\bar{u}}^*\), and it follows by taking limits as \(i\rightarrow \infty \) that

$$\begin{aligned} {\bar{\theta }}\ge L({\bar{y}}, {\bar{u}}^*)=v_{{\bar{y}}}(0). \end{aligned}$$

This implies that when i is sufficiently large, one has

$$\begin{aligned} \theta _{k_i}+\frac{\varepsilon _0}{2}>{\bar{\theta }}\ge v_{\bar{y}}(0)=v_{y_{k_i}}(0)>v_{y_{k_i}}(0)-\frac{\varepsilon _0}{2} \end{aligned}$$

and consequently

$$\begin{aligned} v_{y_{k_i}}(0)\le \theta _{k_i}+\varepsilon _0. \end{aligned}$$

Then the termination criterion in Algorithm 2 is satisfied and thus Algorithm 2 terminates at the \(k_i\)-th step, a contradiction. The proof is complete. \(\square \)

6 Conclusions

Compared with smooth functions, nonsmooth functions appear more frequently and more generally in optimization problems; for example, in the space of all continuous functions on a closed interval [a, b], the complement of the subset of nowhere-differentiable functions is of the first category. In this sense, nonsmooth functions far outnumber smooth ones. Meanwhile, over the past three decades, nonsmooth analysis has grown into a broad body of mathematical theory connected with many aspects of mathematical programming and optimization. This paper is motivated by the idea of applying techniques from nonsmooth analysis to nonsmooth optimization problems. Using subgradients, we study two algorithms, based on OA and GBD, for dealing with MINLPs that lack differentiability data. Further work may proceed along two lines: one is to solve more complicated nonsmooth MINLPs that are larger in scale and difficult to handle by differentiability-based techniques, and the other is to tackle nonsmooth MINLPs by relaxing or dropping the convexity assumption.