1 Introduction

Many practical optimization problems are modelled as mixed-integer nonlinear programming problems (MINLPs) involving continuous and discrete variables, and the study of solution algorithms for such problems has been an active area of research over the past decades (cf. [3, 10, 12–16, 21, 22, 25] and the references therein). Suppose that \(f,g_i: {\mathbb {R}}^n\times {\mathbb {R}}^p\rightarrow {\mathbb {R}}\, (i=1,\ldots ,m)\) are nonlinear functions, \(X\) is a nonempty compact convex set in \({\mathbb {R}}^n\) and \(Y\) is a set of discrete variables in \({\mathbb {R}}^p\). The general form of an MINLP is as follows:

$$\begin{aligned} (\hbox {P}) \left\{ \begin{array}{ll} \mathop {\hbox {minimize}}\limits _{x,\,y} &{} f(x, y)\\ \hbox {subject to} &{} g_i(x, y)\le 0, i=1,\ldots ,m,\\ &{} x\in X, y\in Y\ \hbox {discrete variable}. \end{array}\right. \end{aligned}$$
(1.1)

This paper is devoted to a class of convex MINLPs in which the objective and constraint functions \(f,g_i\ (i=1,\ldots ,m)\) are convex but not necessarily differentiable.

The class of convex MINLPs has been extensively studied, and several methods for these problems have been developed over the past decades, including branch-and-bound, generalized Benders decomposition, the extended cutting-plane method, NLP/LP-based branch-and-bound and the outer approximation method (cf. [2, 5–9, 11, 12, 15, 19, 23–25] and the references therein). The extended cutting-plane method was proposed by Westerlund and Pettersson [24] for solving differentiable convex MINLPs. Subsequently, Westerlund and Pettersson [25] extended this method to deal with a more general class of MINLPs involving pseudo-convex functions. It was shown in [25] that an MINLP with a pseudo-convex objective and pseudo-convex constraints can be solved to global optimality by cutting-plane techniques. In 2014, Eronen et al. [6] generalized the extended cutting-plane method to convex nonsmooth MINLPs and provided an ECP algorithm which was proved to converge to a global optimum. Recently they [7] further developed this extended cutting-plane method to handle nonsmooth MINLPs under pseudo-convexity assumptions.

Duran and Grossmann [5] introduced the outer approximation method for a particular class of MINLPs restricted to separable convex differentiable functions rather than general convex differentiable functions of all variables; these separable functions are sums of convex differentiable functions of the continuous variables and linear functions of the discrete variables. Fletcher and Leyffer [8] subsequently extended the outer approximation method to convex MINLPs with convex and continuously differentiable objective and constraint functions, and provided a linear outer approximation algorithm that attains an optimal solution of such an MINLP by solving a finite sequence of relaxed subproblems. This extension is the pioneering work on the outer approximation method in the sense that the discrete variables may enter the problem nonlinearly. In 2008, Bonami et al. [2] also studied outer approximation algorithms for convex and continuously differentiable MINLPs. Recently the authors of [6] and [23] used the outer approximation method to study convex nonsmooth MINLPs and established the resulting algorithms. Differentiability of the functions plays an important role in the construction of the relaxation and is essential for solving the relaxed subproblems efficiently. Since nonsmooth optimization problems defined by non-differentiable functions arise in practice, it is interesting and significant, from both the theoretical and the applied viewpoint, to consider convex non-differentiable MINLPs. Motivated by this, and inspired by [2, 5, 6, 8, 23], we continue the study of convex MINLPs by dropping the differentiability assumption and aim to construct an outer approximation algorithm for solving such problems. The outer approximation method used here follows the lines of [8, 23] and uses the KKT conditions to linearize the objective and constraint functions at different points so as to construct an equivalent MILP reformulation of the problem.

The paper is organized as follows. In Sect. 2, we give some definitions and preliminaries used in this paper. Section 3 contains the equivalent reformulation of the convex MINLP by the outer approximation method and an outer approximation algorithm for finding optimal solutions of this MINLP. The reformulation relies mainly on KKT conditions and projection techniques. The algorithm requires solving a finite sequence of nonlinear programs (feasible and infeasible subproblems) and relaxations of the mixed-integer linear master program. The convergence theorem for the established algorithm is also presented therein. The conclusion of this paper is presented in Sect. 4. Section 5 is an “Appendix” containing the proofs of the main results used to construct the algorithm.

2 Preliminaries

Let \(\Vert \cdot \Vert \) be the norm on \({\mathbb {R}}^n\) and denote the inner product of two elements of \({\mathbb {R}}^n\) by \(\langle \cdot , \cdot \rangle \). Let \({\varOmega }\) be a closed convex set in \({\mathbb {R}}^n\) and \(x\in {\varOmega }\). We denote by \(T({\varOmega }, x)\) the contingent cone of \({\varOmega }\) at \(x\); that is, \(v\in T({\varOmega }, x)\) if and only if there exist a sequence \(\{v_k\}\) in \({\mathbb {R}}^n\) converging to \(v\) and a sequence \(\{t_k\}\) in \((0, +\infty )\) decreasing to \(0\) such that \(x+t_kv_k\in {\varOmega }\) for all \(k\in \mathbb {N}\), where \(\mathbb {N}\) denotes the set of all natural numbers. It is known from [1] that

$$\begin{aligned} T({\varOmega }, x)=cl({\mathbb {R}}_+({\varOmega }-x)) \end{aligned}$$

where \(cl\) denotes the closure.

Let \(N({\varOmega }, x)\) denote the normal cone of \({\varOmega }\) at \(x\), that is

$$\begin{aligned} N({\varOmega }, x):=\{\gamma \in {\mathbb {R}}^n: \langle \gamma , z-x\rangle \le 0\quad \hbox {for all} \ z\in {\varOmega }\}. \end{aligned}$$
(2.1)

It is easy to verify that the normal cone \(N({\varOmega }, x)\) and the contingent cone \(T({\varOmega }, x)\) are mutually polar; that is

$$\begin{aligned} N({\varOmega }, x)=\big (T({\varOmega }, x)\big )^{\circ }:=\big \{\gamma \in {\mathbb {R}}^n: \langle \gamma , v\rangle \le 0 \quad \hbox {for all} \ v\in T({\varOmega }, x)\big \}. \end{aligned}$$

Readers are invited to consult the book [1] for more details on contingent and normal cones.
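As a simple illustration (a standard example, not taken from [1]), take \({\varOmega }=[0,2]\subset {\mathbb {R}}\), the interval that will also serve as the continuous feasible set in Example 3.1 below. Then

$$\begin{aligned} T([0,2], 0)=[0,+\infty ),\quad N([0,2], 0)=(-\infty , 0],\qquad T([0,2], 2)=(-\infty , 0],\quad N([0,2], 2)=[0,+\infty ), \end{aligned}$$

while at any interior point \(x\in (0,2)\) one has \(T([0,2], x)={\mathbb {R}}\) and \(N([0,2], x)=\{0\}\); in each case the two cones are polar to each other.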

Let \(\varphi : {\mathbb {R}}^n\rightarrow {\mathbb {R}}\) be a continuous convex function, \(\bar{x}\in {\mathbb {R}}^n\) and \(h\in {\mathbb {R}}^n\). Recall (cf. [20]) that \(d^+\varphi (\bar{x})(h)\) denotes the directional derivative of \(\varphi \) at \(\bar{x}\) along the direction \(h\) and is defined by

$$\begin{aligned} d^+\varphi (\bar{x})(h):=\lim \limits _{t\rightarrow 0^+}\frac{\varphi (\bar{x}+th)-\varphi (\bar{x})}{t}. \end{aligned}$$

We denote by \(\partial \varphi (\bar{x})\) the subdifferential of \(\varphi \) at \(\bar{x}\), which is defined by

$$\begin{aligned} \partial \varphi (\bar{x}):=\{\alpha \in {\mathbb {R}}^n:\; \langle \alpha , x-\bar{x}\rangle \le \varphi (x)- \varphi (\bar{x})\ \hbox {for all} \ x\in {\mathbb {R}}^n\}. \end{aligned}$$

Each vector in \(\partial \varphi (\bar{x})\) is called a subgradient of \(\varphi \) at \(\bar{x}\). It is known from [20] that \(\alpha \in \partial \varphi (\bar{x})\) if and only if

$$\begin{aligned} \langle \alpha , h\rangle \le d^+\varphi (\bar{x})(h)\quad \hbox {for all} \ h\in {\mathbb {R}}^n. \end{aligned}$$
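A standard one-dimensional illustration of these notions (classical, not taken from [20]) is the absolute value function: for \(\varphi (x)=|x|\) one has

$$\begin{aligned} \partial \varphi (0)=[-1,1] \quad \hbox {and} \quad d^+\varphi (0)(h)=|h|, \end{aligned}$$

and indeed \(\alpha \in [-1,1]\) if and only if \(\alpha h\le |h|\) for all \(h\in {\mathbb {R}}\), in accordance with the characterization above.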

Recall that \(\varphi \) is said to be Gâteaux differentiable at \(\bar{x}\) if there exists \(d\varphi (\bar{x})\in {\mathbb {R}}^n\) such that

$$\begin{aligned} \lim \limits _{t\rightarrow 0^+}\frac{\varphi (\bar{x}+th)-\varphi (\bar{x})}{t}=\langle d\varphi (\bar{x}), h\rangle \quad \hbox {for all} \, h\in {\mathbb {R}}^n \end{aligned}$$
(2.2)

and \(\varphi \) is said to be Fréchet differentiable at \(\bar{x}\) if \(\varphi \) is Gâteaux differentiable there and the limit in (2.2) exists uniformly for \(\Vert h\Vert \le 1\) as \(t\rightarrow 0^+\).

It is known from [20] that \(\varphi \) is Gâteaux differentiable at \(\bar{x}\) if and only if \(\partial \varphi (\bar{x})\) is a singleton. Moreover, Gâteaux differentiability of \(\varphi \) is equivalent to Fréchet differentiability of \(\varphi \), owing to the local Lipschitz property of \(\varphi \) and the compactness of the closed unit ball in \({\mathbb {R}}^n\).

Given a continuous convex function \(\phi : {\mathbb {R}}^n\times {\mathbb {R}}^p\rightarrow {\mathbb {R}}\) and \((\bar{x}, \bar{y})\in {\mathbb {R}}^n\times {\mathbb {R}}^p\), a vector \((\alpha , \beta )\in {\mathbb {R}}^n\times {\mathbb {R}}^p\) is a subgradient of \(\phi \) at \((\bar{x}, \bar{y})\) if and only if

$$\begin{aligned} \phi (x, y)\ge \phi (\bar{x}, \bar{y})+(\alpha , \beta )^T\begin{pmatrix}x-\bar{x}\\ y-\bar{y}\end{pmatrix}\quad \hbox {for all} \ (x, y)\in {\mathbb {R}}^n\times {\mathbb {R}}^p, \end{aligned}$$
(2.3)

where \((\alpha , \beta )^T\) is the transpose of matrix \((\alpha , \beta )\). When \(\bar{y}\) is fixed (resp. \(\bar{x}\) is fixed), the subdifferential of \(\phi (\cdot ,\bar{y})\) (resp. \(\phi (\bar{x},\cdot )\)) at \(\bar{x}\) (resp. \(\bar{y}\)) is the set defined by

$$\begin{aligned}&\partial \phi (\cdot ,\bar{y})(\bar{x}):=\big \{\alpha \in {\mathbb {R}}^n: \phi (x, \bar{y})\ge \phi (\bar{x}, \bar{y})+\langle \alpha , x-\bar{x}\rangle \quad \hbox {for all} \ x\in {\mathbb {R}}^n\big \} \\&\Big ({ resp.}\ \partial \phi (\bar{x},\cdot )(\bar{y}):=\big \{\beta \in {\mathbb {R}}^p: \phi (\bar{x}, y)\ge \phi (\bar{x}, \bar{y})+\langle \beta , y-\bar{y}\rangle \quad \hbox {for all} \ y\in {\mathbb {R}}^p\big \}\Big ). \end{aligned}$$

The following proposition on the subdifferential of convex functions is easy to verify from the definition.

Proposition 2.1

Let \(\phi : {\mathbb {R}}^n\times {\mathbb {R}}^p\rightarrow {\mathbb {R}}\) be a continuous convex function and \((\bar{x}, \bar{y})\in {\mathbb {R}}^n\times {\mathbb {R}}^p\). Then for any \((\alpha , \beta )\in \partial \phi (\bar{x}, \bar{y})\), one has \(\alpha \in \partial \phi (\cdot ,\bar{y})(\bar{x})\) and \(\beta \in \partial \phi (\bar{x},\cdot )(\bar{y})\).

It is natural to ask whether the converse of Proposition 2.1 holds. This question is of interest even for smooth convex functions. Explicitly, it reads as follows:

Given a continuous convex function \(\phi : {\mathbb {R}}^n\times {\mathbb {R}}^p\rightarrow {\mathbb {R}}\) and a vector \(\bar{\alpha }\) in \(\partial \phi (\cdot ,\bar{y})(\bar{x})\), does there exist some vector \(\bar{\beta }\in {\mathbb {R}}^p\) such that \((\bar{\alpha }, \bar{\beta })\in \partial \phi (\bar{x}, \bar{y})\)?

The following two propositions provide an affirmative answer to this question. They will play a key role in the construction of the outer approximation algorithm in the sequel. The first proposition concerns convex and Fréchet differentiable functions.

Proposition 2.2

Let \(\phi :{\mathbb {R}}^n\times {\mathbb {R}}^p\rightarrow {\mathbb {R}}\) be a continuous convex function and \((\bar{x}, \bar{y})\in {\mathbb {R}}^n\times {\mathbb {R}}^p\). Suppose that \(\phi (\cdot ,\bar{y})\) is Fréchet differentiable at \(\bar{x}\) and \(\phi (\bar{x}, \cdot )\) is Fréchet differentiable at \(\bar{y}\). Then \(\phi \) is Fréchet differentiable at \((\bar{x}, \bar{y})\).

Proof

By the Fréchet differentiability of \(\phi (\cdot ,\bar{y})\) and \(\phi (\bar{x}, \cdot )\), one has

$$\begin{aligned} \partial \phi (\cdot ,\bar{y})(\bar{x})=\{\triangledown _x\phi (\bar{x},\bar{y})\}\quad \hbox {and} \quad \partial \phi (\bar{x}, \cdot )(\bar{y})=\{\triangledown _y\phi (\bar{x},\bar{y})\}. \end{aligned}$$

This and Proposition 2.1 imply that \(\partial \phi (\bar{x}, \bar{y})\) is a singleton with

$$\begin{aligned} \partial \phi (\bar{x}, \bar{y})=\{(\triangledown _x \phi (\bar{x},\bar{y}), \triangledown _y\phi (\bar{x},\bar{y}))\}. \end{aligned}$$

Hence \(\phi \) is Gâteaux differentiable at \((\bar{x}, \bar{y})\) and consequently Fréchet differentiable at \((\bar{x}, \bar{y})\). The proof is complete. \(\square \)

Proposition 2.2 may fail for non-convex functions. Consider the function \(\phi \) on \({\mathbb {R}}\times {\mathbb {R}}\) defined by \(\phi (x,y)=\frac{x^2y^2}{(x^2+y^2)^{3/2}}\) if \(x^2+y^2\not =0\) and \(\phi (x,y)=0\) otherwise. Then \(\phi \) is continuous on \({\mathbb {R}}\times {\mathbb {R}}\) and the partial derivatives \(\triangledown _x\phi (0,0)\) and \(\triangledown _y\phi (0,0)\) exist (both equal \(0\)). However, one can verify that \(\phi \) is not differentiable at \((0,0)\).
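This failure of differentiability can be checked numerically. The following small NumPy script (our own sketch; the name phi is ours) confirms that the partial difference quotients at the origin vanish while the difference quotient along the diagonal direction tends to \(1/4\):

import numpy as np

def phi(x, y):
    # phi(x, y) = x^2 y^2 / (x^2 + y^2)^(3/2), with phi(0, 0) = 0
    r2 = x * x + y * y
    return 0.0 if r2 == 0.0 else (x * x * y * y) / r2 ** 1.5

# The partial difference quotients at the origin vanish (phi is zero on both axes) ...
for t in [1e-2, 1e-4, 1e-6]:
    print(phi(t, 0.0) / t, phi(0.0, t) / t)      # both print 0.0

# ... yet along the unit diagonal direction h = (1, 1)/sqrt(2) the quotient
# (phi(t h) - phi(0, 0)) / t tends to 1/4 instead of <(0, 0), h> = 0,
# so phi is not (Gateaux or Frechet) differentiable at (0, 0).
h = np.array([1.0, 1.0]) / np.sqrt(2.0)
for t in [1e-2, 1e-4, 1e-6]:
    print(phi(t * h[0], t * h[1]) / t)           # approaches 0.25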

Proposition 2.3

Let \(\phi :{\mathbb {R}}^n\times {\mathbb {R}}^p\rightarrow {\mathbb {R}}\) be a continuous convex function and \((\bar{x}, \bar{y})\in {\mathbb {R}}^n\times {\mathbb {R}}^p\). Then for any \(\bar{\alpha }\in \partial \phi (\cdot ,\bar{y})(\bar{x})\), there exists \(\bar{\beta }\in {\mathbb {R}}^p\) such that \((\bar{\alpha }, \bar{\beta })\in \partial \phi (\bar{x}, \bar{y})\).

Proof

Let \(F_{\bar{y}}:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^n\times {\mathbb {R}}^p\) be defined by \(F_{\bar{y}}(x):=(x, \bar{y})\). Then \(\phi (\cdot ,\bar{y})=\phi \circ F_{\bar{y}}\), and it is easy to verify that \(F_{\bar{y}}\) is differentiable at \(\bar{x}\) and

$$\begin{aligned} \triangledown F_{\bar{y}}(\bar{x})(h)=(h, 0)\in {\mathbb {R}}^n\times {\mathbb {R}}^p \end{aligned}$$
(2.4)

holds for all \(h\in {\mathbb {R}}^n\). Let \(\bar{\alpha }\in \partial \phi (\cdot ,\bar{y})(\bar{x})\). We first prove that

$$\begin{aligned} \bar{\alpha }\in \triangledown F_{\bar{y}}(\bar{x})^*(\partial \phi (\bar{x}, \bar{y})) \end{aligned}$$
(2.5)

where \(\triangledown F_{\bar{y}}(\bar{x})^*\) is the conjugate operator of \(\triangledown F_{\bar{y}}(\bar{x})\).

Since \(\phi \) is continuous at \((\bar{x}, \bar{y})\), the subdifferential \(\partial \phi (\bar{x}, \bar{y})\) is a nonempty, convex and compact set by [20, Proposition 1.11], and therefore \(\triangledown F_{\bar{y}}(\bar{x})^*(\partial \phi (\bar{x}, \bar{y}))\) is convex and compact since \(\triangledown F_{\bar{y}}(\bar{x})^*\) is continuous.

Suppose to the contrary that \(\bar{\alpha }\not \in \triangledown F_{\bar{y}}(\bar{x})^*(\partial \phi (\bar{x}, \bar{y}))\). By the separation theorem, there exists \(\bar{u}\in {\mathbb {R}}^n\) with \(\Vert \bar{u}\Vert =1\) such that

$$\begin{aligned} \langle \bar{\alpha }, \bar{u}\rangle&> \max \{\langle \triangledown F_{\bar{y}}(\bar{x})^*(\alpha , \beta ), \bar{u}\rangle : (\alpha , \beta )\in \partial \phi (\bar{x}, \bar{y})\}\\&= \max \{\langle (\alpha , \beta ), \triangledown F_{\bar{y}}(\bar{x})(\bar{u})\rangle : (\alpha , \beta )\in \partial \phi (\bar{x}, \bar{y})\}. \end{aligned}$$

This and (2.4) imply that

$$\begin{aligned} \langle \bar{\alpha }, \bar{u}\rangle >\max \{\langle (\alpha , \beta ), (\bar{u}, 0)\rangle : (\alpha , \beta )\in \partial \phi (\bar{x}, \bar{y})\}. \end{aligned}$$
(2.6)

Noting that \(\bar{\alpha }\in \partial \phi (\cdot ,\bar{y})(\bar{x})\) and \(\phi \) is a continuous convex function on \({\mathbb {R}}^n\times {\mathbb {R}}^p \), it follows from [20, Proposition 2.24] and (2.6) that

$$\begin{aligned} d^+\phi (\cdot ,\bar{y})(\bar{x})(\bar{u})\ge \langle \bar{\alpha }, \bar{u}\rangle >d^+\phi (\bar{x}, \bar{y})(\bar{u}, 0)=d^+\phi (\cdot ,\bar{y})(\bar{x})(\bar{u}), \end{aligned}$$

which is a contradiction. Thus (2.5) holds.

By virtue of (2.5), there exists \((\hat{\alpha }, \bar{\beta })\in \partial \phi (\bar{x},\bar{y})\) such that \(\bar{\alpha }=\triangledown F_{\bar{y}}(\bar{x})^*(\hat{\alpha }, \bar{\beta })\). It suffices to prove that \(\bar{\alpha }=\hat{\alpha }\).

For any \(h\in {\mathbb {R}}^n\), by using (2.4), one has

$$\begin{aligned} \langle \bar{\alpha }, h\rangle =\langle \triangledown F_{\bar{y}}(\bar{x})^*(\hat{\alpha }, \bar{\beta }), h\rangle =\langle (\hat{\alpha }, \bar{\beta }), \triangledown F_{\bar{y}}(\bar{x})(h)\rangle =\langle (\hat{\alpha }, \bar{\beta }), (h, 0)\rangle =\langle \hat{\alpha }, h\rangle . \end{aligned}$$

This means that \(\bar{\alpha }=\hat{\alpha }\). The proof is complete. \(\square \)
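To illustrate Proposition 2.3, take \(\phi (x,y)=|x|+|y|\) on \({\mathbb {R}}\times {\mathbb {R}}\) and \((\bar{x},\bar{y})=(0,0)\) (a simple example of our own choosing). Then

$$\begin{aligned} \partial \phi (\cdot ,0)(0)=[-1,1]\quad \hbox {and} \quad \partial \phi (0, 0)=[-1,1]\times [-1,1], \end{aligned}$$

so every \(\bar{\alpha }\in \partial \phi (\cdot ,0)(0)\) can indeed be completed to a subgradient \((\bar{\alpha }, \bar{\beta })\in \partial \phi (0,0)\), here with any \(\bar{\beta }\in [-1,1]\).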

The following proposition concerns the subdifferential of the maximum of a convex function and zero; it is taken from [26, Theorem 2.4.18] and will be used later in our analysis.

Proposition 2.4

Let \(\varphi : {\mathbb {R}}^n\rightarrow {\mathbb {R}}\) be a convex and continuous function. Define \(\varphi _+(x):=\max \{\varphi (x), 0\}\) for all \(x\in {\mathbb {R}}^n\). Then \(\varphi _+\) is a convex continuous function and

$$\begin{aligned} \partial \varphi _+(x)=[0, 1]\partial \varphi (x) \end{aligned}$$
(2.7)

holds for all \(x\in {\mathbb {R}}^n\) with \(\varphi (x)=0\), where \([0, 1]\partial \varphi (x):=\{t\gamma : t\in [0, 1]\ \hbox {and}\ \gamma \in \partial \varphi (x)\}\) for any \(x\in {\mathbb {R}}^n\).
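For instance, for \(\varphi (x)=x\) on \({\mathbb {R}}\) one has \(\varphi (0)=0\), \(\partial \varphi (0)=\{1\}\) and \(\varphi _+(x)=\max \{x,0\}\), so

$$\begin{aligned} \partial \varphi _+(0)=[0, 1]=[0, 1]\cdot \{1\}=[0, 1]\partial \varphi (0), \end{aligned}$$

in agreement with (2.7).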

3 Main results

In this section, we study the convex MINLP problem (1.1) without the differentiability assumption and aim to establish an outer approximation algorithm for solving it.

Let the convex MINLP be defined as in (1.1) and set \(g:=(g_1,\ldots ,g_m)\). When \(f,g_i\ (i=1,\ldots ,m)\) in (1.1) are convex and smooth, it is known from [2, 6, 8] that the main idea of the outer approximation algorithm for convex smooth MINLPs is to linearize the objective function and the constraints at different points so as to build a mixed-integer linear programming (MILP) relaxation of the problem; that is, given some set \(K\) of optimal solutions of suitable subproblems, one can build the following relaxation of problem (P) in (1.1):

$$\begin{aligned} \left\{ \begin{array}{ll} \mathop {\hbox {minimize}} &{} \theta \\ \hbox {subject to} &{} f(x_j, y_j)+ \triangledown f(x_j,y_j)^T\begin{pmatrix}x-x_j\\ y-y_j\end{pmatrix}\le \theta , \\ &{} g(x_j, y_j)+ \triangledown g(x_j,y_j)^T\begin{pmatrix}x-x_j\\ y-y_j\end{pmatrix}\le 0,\\ &{} x\in X, y\in Y\ \hbox {discrete variable}. \end{array} \right. \quad \forall (x_j,y_j)\in K \end{aligned}$$
(3.1)

When dealing with problem (P) in (1.1), subgradients take the place of gradients in the relaxation of (P). Note, however, that replacing the gradients in (3.1) by arbitrary subgradients is not sufficient to obtain an equivalent reformulation of problem (P) (see Example 3.1 below). As in [3, 6], with the help of the KKT conditions we select certain special subgradients, which are then used to reformulate problem (P) as an equivalent MILP master program of the form (3.1).
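The basic convexity fact behind (3.1) is that every linearization built from a (sub)gradient is a global underestimator, so adding such cuts can only relax the feasible region. The following minimal NumPy sketch (our own toy function \(f(x,y)=|x|+(y-1)^2\), not data from the paper) checks this property numerically:

import numpy as np

def f(z):
    # toy nonsmooth convex function f(x, y) = |x| + (y - 1)^2
    x, y = z
    return abs(x) + (y - 1.0) ** 2

def subgradient(z):
    # one valid subgradient of f at z; at x = 0 any value in [-1, 1] works
    x, y = z
    return np.array([np.sign(x) if x != 0.0 else 0.3, 2.0 * (y - 1.0)])

def cut(z_j):
    # affine minorant z -> f(z_j) + <s_j, z - z_j> used in (3.1)
    f_j, s_j = f(z_j), subgradient(z_j)
    return lambda z: f_j + s_j @ (np.asarray(z) - np.asarray(z_j))

ell = cut(np.array([0.0, 2.0]))
rng = np.random.default_rng(0)
samples = rng.uniform(-3.0, 3.0, size=(1000, 2))
assert all(ell(z) <= f(z) + 1e-12 for z in samples)   # the cut never overestimates f

Exactly this underestimation property guarantees that every feasible point of (P) satisfies the linearized constraints, so problems of the form (3.1) are genuine relaxations.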

3.1 An overview of the method

For the equivalent reformulation of problem (P) in (1.1), following the construction in (3.1), we use projection to express problem (P) in terms of the \(y\) variables. For any fixed \(y\in Y\), consider the following subproblem \(P^y\):

$$\begin{aligned} P^y \left\{ \begin{array}{ll} \mathop {\hbox {minimize}}\limits _{x} &{} f(x, y)\\ \hbox {subject to} &{} g(x, y)\le 0, \\ &{} x\in X. \end{array} \right. \end{aligned}$$
(3.2)

If there exists some \(x\in X\) such that \(g(x, y)\le 0\), the subproblem \(P^y\) is said to be feasible; otherwise, \(P^y\) is said to be infeasible.

To ensure the validity of the KKT conditions, we impose the following assumption:

Assumption (A1) For any \(y\in Y\) such that the subproblem \(P^y\) is feasible, the following Slater constraint qualification holds:

$$\begin{aligned} g(\hat{x}, y)<0\ \ { for\ some}\ \hat{x}\in X. \end{aligned}$$
(Slater CQ)

Let

$$\begin{aligned} {\varSigma }:=\{y\in Y: g(x, y)\le 0\quad \hbox {for some}\ x\in X\} \end{aligned}$$
(3.3)

denote the set of all discrete variables \(y\) that produce feasible subproblems. Then the projection of problem (P) onto the \(y\) variables can be written as follows:

$$\begin{aligned} \mathop {\hbox {minimize}}\limits _{y_j\in {\varSigma }}\left\{ \begin{array}{ll} \mathop {\hbox {minimize}}\limits _{x} &{} f(x, y_j)\\ \hbox {subject to} &{} g(x, y_j)\le 0,\\ &{} x\in X. \end{array} \right. \end{aligned}$$
(3.4)

Now let \(y_j\in {\varSigma }\) be fixed. Since \(X\) is compact and \(f,g_i\) are continuous, an optimal solution to subproblem \(P^{y_j}\) exists; let \(x_j\) be such an optimal solution. By assumption (A1) and the KKT conditions, there exists \((\lambda _{j,1},\ldots ,\lambda _{j,m})\in {\mathbb {R}}^m\) such that

$$\begin{aligned} \left\{ \begin{array}{l} 0\in \partial f(\cdot ,y_j)(x_j)+\sum \limits _{i\in I(x_j)}\lambda _{j,i}\partial g_i(\cdot ,y_j)(x_j)+ N(X, x_j),\\ \lambda _{j,i}g_i(x_j,y_j)=0, \ \ i=1,\ldots ,m,\\ \lambda _{j,i}\ge 0, \ \ i=1,\ldots ,m, \end{array} \right. \end{aligned}$$
(3.5)

where

$$\begin{aligned} I(x_j):=\{i\in \{1,\ldots ,m\}: g_i(x_j,y_j)=0\} \end{aligned}$$
(3.6)

is the active constraint set. This means that we can take \(\alpha _j\in \partial f(\cdot ,y_j)(x_j)\) and \(\xi _{j,i}\in \partial g_i(\cdot ,y_j)(x_j) (i=1,\ldots ,m)\) such that

$$\begin{aligned} -\alpha _j-\sum \limits _{i\in I(x_j)}\lambda _{j,i}\xi _{j,i}\in N(X, x_j). \end{aligned}$$
(3.7)

By Proposition 2.3, there exist \(\beta _j\in {\mathbb {R}}^p\) and \(\eta _{j,i}\in {\mathbb {R}}^p (i=1,\ldots ,m)\) such that

$$\begin{aligned} (\alpha _j, \beta _j)\in \partial f(x_j,y_j)\quad \hbox {and} \quad (\xi _{j,i},\eta _{j,i})\in \partial g_i(x_j,y_j), \forall i\in \{1,\ldots ,m\}. \end{aligned}$$
(3.8)

Set \(\xi _j:=(\xi _{j,1},\ldots ,\xi _{j,m})\) and \(\eta _j:=(\eta _{j,1},\ldots ,\eta _{j,m})\). We consider the following linear problem:

$$\begin{aligned} LP(x_j, y_j) \left\{ \begin{array}{ll} \mathop {\hbox {minimize}}\limits _{x} &{} f(x_j, y_j)+(\alpha _j, \beta _j)^T\begin{pmatrix}x-x_j\\ 0\end{pmatrix}\\ \hbox {subject to} &{} g(x_j, y_j)+(\xi _j, \eta _j)^T\begin{pmatrix}x-x_j\\ 0\end{pmatrix}\le 0, \\ &{} x\in X. \end{array} \right. \end{aligned}$$
(3.9)

The following theorem establishes the equivalence between subproblem \(P^{y_j}\) and the linear program \(LP(x_j,y_j)\) of (3.9). The proof of this theorem will be given in Sect. 5.

Theorem 3.1

Let \(LP(x_j, y_j)\) be defined as in (3.9). Then \(x_j\) is an optimal solution of \(LP(x_j, y_j)\) and \(f(x_j, y_j)\) is the optimal value of \(LP(x_j, y_j)\).
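In computations, the multipliers \(\lambda _{j,i}\) in (3.5) are typically available as the dual values reported by the solver of subproblem \(P^{y_j}\). The following minimal CVXPY sketch on a toy one-dimensional instance (our own data: minimize \(|x-2|\) subject to \(x-1\le 0\) and \(X=[0,3]\), with \(y\) fixed and suppressed) illustrates this:

import cvxpy as cp

# Toy subproblem P^{y_j} with y_j fixed and suppressed (our own data):
#   minimize |x - 2|  subject to  x - 1 <= 0,  x in X = [0, 3].
# At the solution x_j = 1 the KKT system (3.5) reads
#   0 in d|.-2|(1) + lambda * d(x - 1)(1) + N([0, 3], 1) = {-1} + lambda + {0},
# so lambda = 1, which the solver reports as the dual value of the constraint.
x = cp.Variable()
con = [x - 1 <= 0, x >= 0, x <= 3]
prob = cp.Problem(cp.Minimize(cp.abs(x - 2)), con)
prob.solve()
print(x.value)            # approximately 1.0, the minimizer x_j
print(con[0].dual_value)  # approximately 1.0, the multiplier lambda_{j,1}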

We denote

$$\begin{aligned} T:=\Big \{j: P^{y_j}\ \hbox {is feasible and } \ x_j\ \hbox {is an optimal solution to}\ P^{y_j}\Big \}. \end{aligned}$$
(3.10)

Let \(j\in T\). By assumption (A1), we can take \((\lambda _{j,1},\ldots ,\lambda _{j,m})\in {\mathbb {R}}^m_+\), \(\alpha _j\in \partial f(\cdot ,y_j)(x_j)\) and \(\xi _{j,i}\in \partial g_i(\cdot ,y_j)(x_j)\) \((i=1,\ldots ,m)\) such that (3.7) holds. Applying Proposition 2.3, there exist \(\beta _j\in {\mathbb {R}}^p\) and \(\eta _{j,i}\in {\mathbb {R}}^p\) \((i=1,\ldots ,m)\) such that (3.8) holds. Then we set

$$\begin{aligned} \xi _j:=(\xi _{j,1},\ldots ,\xi _{j,m}) \quad \hbox {and} \quad \eta _j:=(\eta _{j,1},\ldots ,\eta _{j,m}). \end{aligned}$$

We consider the following MILP:

$$\begin{aligned} (M_{{\varSigma }})\left\{ \begin{array}{ll} \mathop {\hbox {minimize}}\limits _{x,\,y,\,\theta } &{} \theta \\ \hbox {subject to} &{} f(x_j, y_j)+(\alpha _j, \beta _j)^T\begin{pmatrix}x-x_j\\ y-y_j\end{pmatrix}\le \theta \quad \forall j\in T,\ \ \\ &{} g(x_j, y_j)+(\xi _j, \eta _j)^T\begin{pmatrix}x-x_j\\ y-y_j\end{pmatrix}\le 0\quad \forall j\in T,\\ &{} x\in X, y\in {\varSigma }\ \hbox {discrete variable}. \end{array} \right. \end{aligned}$$
(3.11)

By virtue of Theorem 3.1, we obtain the following theorem on the equivalence of problem (P) of (1.1) and MILP \((M_{{\varSigma }})\) of (3.11).

Theorem 3.2

Assume that MINLP problem (P) of (1.1) satisfies assumption (A1). Then MILP \((M_{{\varSigma }})\) of (3.11) is equivalent to problem (P) in the sense that both have the same optimal value and the optimal solution \((\bar{x}, \bar{y})\) of problem (P) corresponds to the optimal solution \(( \bar{x}, \bar{y}, \bar{\theta })\) of \((M_{{\varSigma }})\) with \(\bar{\theta }=f(\bar{x}, \bar{y})\).

To reformulate problem (P) completely, it remains to handle the discrete variables \(y\in Y\backslash {\varSigma }\) by means of supporting hyperplanes. Along the lines of [2, 8], we study infeasible subproblems so as to eliminate those discrete variables that give rise to infeasibility.

Let \(y_l\in Y\backslash {\varSigma }\). Then subproblem \(P^{y_l}\) is infeasible; that is,

$$\begin{aligned} \not \exists x\in X \quad \hbox {satisfying} \ g_i(x, y_l)\le 0 \quad \hbox {for all} \ i=1,\ldots , m. \end{aligned}$$

Let \(J_l\) be a subset of \(\{1,\ldots ,m\}\) such that there is some \(\hat{x}\in X\) satisfying

$$\begin{aligned} g_i(\hat{x}, y_l)< 0,\quad \forall i\in J_l. \end{aligned}$$
(3.12)

Denote by \(J_l^{\bot }:=\{1,\ldots ,m\}\backslash J_l\) the complement of \(J_l\). To detect the infeasibility, we study the following subproblem \(F^{y_l}\):

$$\begin{aligned} F^{y_l}\left\{ \begin{array}{ll} \mathop {\hbox {minimize}}\limits _{x} &{} \sum \limits _{i\in J_l^{\bot }}[g_i(x, y_l)]_+\\ \hbox {subject to} &{} g_i(x, y_l)\le 0 \quad \forall i\in J_l,\\ &{} x\in X, \end{array} \right. \end{aligned}$$
(3.13)

where \([g_{i}(x,y_{l})]_+ := \max \{g_{i}{(x, y_l)},0\}\).

Since \(X\) is compact and \(g_i\ (i=1,\ldots ,m)\) are continuous, an optimal solution to subproblem \(F^{y_l}\) exists; let \(x_l\) be such an optimal solution. To ease the exposition, we divide the set \(J_l^{\bot }\) into three disjoint subsets \(J_l^1, J_l^2\) and \(J_l^3\), defined as

$$\begin{aligned} \left\{ \begin{array}{l} J_l^1:=\{i\in J_l^{\bot }: g_i(x_l,y_l)=0\},\\ J_l^2:=\{i\in J_l^{\bot }: g_i(x_l,y_l)>0\},\\ J_l^3:=\{i\in J_l^{\bot }: g_i(x_l,y_l)<0\}. \end{array} \right. \end{aligned}$$
(3.14)

This means that \(J_l^{\bot }=J_l^1\cup J_l^2\cup J_l^3\) and by using continuity of \(g_i\), one has

$$\begin{aligned} \partial [g_i(\cdot , y_l)]_+(x_l)=\partial g_i(\cdot , y_l)(x_l), \ \forall i\in J_l^2 \quad \hbox {and} \quad \partial [g_i (\cdot , y_l)]_+(x_l)=\{0\},\ \forall i\in J_l^3. \end{aligned}$$

By (3.12) and the KKT conditions, there exist \(\lambda _{l,i}\in {\mathbb {R}}\) for all \(i\in J_l\) such that

$$\begin{aligned} \left\{ \begin{array}{l} 0\in \sum \limits _{i\in J_l^{\bot }}\partial [g_i(\cdot ,y_l)]_+(x_l)+\sum \limits _{i\in J_l}\lambda _{l,i}\partial g_i(\cdot ,y_l)(x_l)+ N(X, x_l),\\ \lambda _{l,i}g_i(x_l,y_l)=0, \ \ \forall i\in J_l,\\ \lambda _{l,i}\ge 0, \ \ \forall i\in J_l. \end{array} \right. \end{aligned}$$
(3.15)

Set \(\lambda _{l,i}:=1\) for all \(i\in J_l^2\) and \(\lambda _{l,i}:=0\) for all \(i\in J_l^3\). Using Proposition 2.4, there exist \(\lambda _{l,i}\in [0,1]\ (\forall i\in J_l^1)\) and \(\xi _{l,i}\in \partial g_i(\cdot ,y_l)(x_l)\ (\forall i\in J_l^{\bot }\cup J_l)\) such that

$$\begin{aligned} -\sum _{i\in J_l^{\bot }\cup J_l}\lambda _{l,i}\xi _{l,i}\in N(X,x_l). \end{aligned}$$
(3.16)

By virtue of Proposition 2.3, there exist \(\eta _{l,i}\in {\mathbb {R}}^p\) such that \((\xi _{l,i}, \eta _{l,i})\in \partial g_i(x_l,y_l)\) for all \(i\in J_l^{\bot }\cup J_l\).

Since subproblem \(P^{y_l}\) is infeasible, the continuity of \(g_i\) and the compactness of \(X\) imply that every optimal solution \(x_l\) of subproblem \(F^{y_l}\) satisfies \(\sum _{i\in J_l^{\bot }}[g_i(x_l, y_l)]_+>0\). This gives the following theorem on subproblem \(F^{y_l}\). The proof is also given in Sect. 5.

Theorem 3.3

The discrete variable \(y_l\in Y\backslash {\varSigma }\) is infeasible for the following constraints:

$$\begin{aligned} \left\{ \begin{array}{l} g_i(x_l, y_l)+(\xi _{l,i}, \eta _{l,i})^T\begin{pmatrix}x-x_l\\ y-y_l\end{pmatrix}\le 0, \ \ \forall i\in J_l^{\bot }\cup J_l,\\ x\in X, y\in Y. \end{array} \right. \end{aligned}$$
(3.17)

It is necessary to ensure that discrete variables that produce infeasible subproblems are also infeasible in the reformulated master program. We denote

$$\begin{aligned} S:=\big \{l: P^{y_l}\ \hbox {is infeasible and}\ x_l\ \hbox {solves}\ F^{y_l} \big \}. \end{aligned}$$
(3.18)

For any \(l\in S\), take \(\lambda _{l,i}\ge 0\) and \(\xi _{l,i}\in \partial g_i(\cdot ,y_l)(x_l) (i=1,\ldots ,m)\) such that (3.16) holds. Take \(\eta _{l,i}\in {\mathbb {R}}^p\) such that \((\xi _{l,i},\eta _{l,i})\in \partial g_i(x_l,y_l)\) for any \(i\in \{1,\ldots ,m\}\) by Proposition 2.3. We set \(\xi _l:=(\xi _{l,1},\ldots ,\xi _{l,m})\) and \(\eta _l:=(\eta _{l,1},\ldots ,\eta _{l,m})\). Then by using Theorem 3.3, we have the following theorem which shows how to eliminate those discrete variables giving rise to infeasible subproblems.

Theorem 3.4

For any \(l\in S\), let \((\xi _l, \eta _l)\) be defined as above. Then the following constraints

$$\begin{aligned} \left\{ \begin{array}{l} g(x_l, y_l)+(\xi _{l}, \eta _{l})^T\begin{pmatrix}x-x_l\\ y-y_l\end{pmatrix}\le 0, \ \ \forall l\in S,\\ x\in X, y\in Y \end{array} \right. \end{aligned}$$
(3.19)

exclude all discrete variables \(y_l\in Y\) for which subproblem \(P^{y_l}\) is infeasible.

It follows from Theorem 3.4 that we can add the linearizations obtained from \(F^{y_l}\) whenever subproblem \(P^{y_l}\) is infeasible, so as to correctly represent the constraint \(y\in {\varSigma }\) of (3.3). This gives rise to the MILP master program (MP), which is equivalent to MINLP problem (P) in (1.1) and is used to reformulate problem (P).

Let \(T\) and \(S\) be defined as (3.10) and (3.18), respectively. For any \(j\in T\), by assumption (A1), we can take \(\lambda _{j,i}\ge 0\) \((i=1,\ldots ,m)\), \(\alpha _j\in \partial f(\cdot ,y_j)(x_j)\) and \(\xi _{j,i}\in \partial g_i(\cdot ,y_j)(x_j) (i=1,\ldots ,m)\) such that (3.7) holds, and by Proposition 2.3, we take \(\beta _j\in {\mathbb {R}}^p\) and \(\eta _{j,i}\in {\mathbb {R}}^p (i=1,\ldots ,m)\) such that (3.8) holds. We set

$$\begin{aligned} \xi _j:=(\xi _{j,1},\ldots ,\xi _{j,m}) \quad \hbox {and} \quad \eta _j:=(\eta _{j,1},\ldots ,\eta _{j,m}). \end{aligned}$$

For any \(l\in S\), we take \(\lambda _{l,i}\ge 0\) and \(\xi _{l,i}\in \partial g_i(\cdot ,y_l)(x_l) (i=1,\ldots ,m)\) such that (3.16) holds and by Proposition 2.3, we take \(\eta _{l,i}\in {\mathbb {R}}^p\) such that \((\xi _{l,i},\eta _{l,i})\in \partial g_i(x_l,y_l)\). Set

$$\begin{aligned} \xi _l:=(\xi _{l,1},\ldots ,\xi _{l,m}) \quad \hbox {and} \quad \eta _l:=(\eta _{l,1},\ldots ,\eta _{l,m}). \end{aligned}$$

The MILP master problem (MP) is given as follows:

$$\begin{aligned} \hbox {(MP)}\left\{ \begin{array}{ll} \mathop {\hbox {minimize}}\limits _{x,\,y,\,\theta } &{} \theta \\ \hbox {subject to} &{} f(x_j, y_j)+(\alpha _j, \beta _j)^T\begin{pmatrix}x-x_j\\ y-y_j\end{pmatrix}\le \theta \quad \forall j\in T,\ \ \\ &{} g(x_j, y_j)+(\xi _j, \eta _j)^T\begin{pmatrix}x-x_j\\ y-y_j\end{pmatrix}\le 0 \quad \forall j\in T,\\ &{} g(x_l, y_l)+(\xi _l, \eta _l)^T\begin{pmatrix}x-x_l\\ y-y_l\end{pmatrix}\le 0 \quad \forall l\in S,\\ &{} x\in X, y\in Y\ \hbox {discrete variable}. \end{array} \right. \end{aligned}$$
(3.20)

The following theorem, immediate from Theorems 3.3 and 3.4, is one of the main results in the procedure of reformulating MINLP problem (P) of (1.1) as the equivalent MILP master program (MP).

Theorem 3.5

Assume that MINLP problem (P) of (1.1) satisfies assumption (A1). Then master program (MP) of (3.20) is equivalent to problem (P) in the sense that both problems have the same optimal value and that the optimal solution \((\bar{x}, \bar{y})\) to problem (P) corresponds to the optimal solution \((\bar{x}, \bar{y}, \bar{\theta })\) to (MP) of (3.20) with \(\bar{\theta }=f(\bar{x}, \bar{y})\).

Remark 3.1

Theorem 3.5 is an extension of the main results given in [8, 23]: it generalizes the outer approximation method, in the sense of equivalently reformulating the convex MINLP problem (P), from the differentiable case to the non-differentiable one. Further, it follows from Theorem 3.5 that all optimal solutions of problem (P) are optimal solutions of the master program (MP). The converse, however, is not necessarily true, since some optimal solutions of (MP) may be infeasible for problem (P). We refer the reader to [2, Example 1] and [23, Remark 3.1] for details.

Theorem 3.5 shows that suitable subgradients obtained from the KKT conditions make it possible to reformulate MINLP problem (P) as an equivalent MILP master program by the outer approximation method. This procedure is no longer valid if arbitrary subgradients are chosen to replace gradients. The following example demonstrates that substituting an arbitrary subgradient for the gradient in the outer approximation method is insufficient for an equivalent reformulation.

Example 3.1

We consider the following convex MINLP problem:

$$\begin{aligned} \left\{ \begin{array}{ll} \mathop {\hbox {minimize}}\limits _{x,\,y} &{} f(x, y):=x+y\\ \hbox {subject to} &{} g_1(x, y):=\max \{-x+y+1, x-y+1\}\le 0,\\ &{} g_2(x,y):=x-y\le 0,\\ &{} x\in [0, 2], \ y\in \{1,2,3\}. \end{array} \right. \end{aligned}$$
(3.21)

One can verify that the convex MINLP (3.21) is infeasible. Now take the initial point \(y_0=1\). Then subproblem \(P^{y_0}\) is infeasible, and we consider the following subproblem \(F^{y_0}\):

$$\begin{aligned} \left\{ \begin{array}{ll} \mathop {\hbox {minimize}}\limits _{x} &{} [g_1(x, y_0)]_+=g_1(x, y_0)\\ \hbox {subject to} &{} g_2(x, y_0)\le 0,\\ &{} x\in [0, 2]. \end{array} \right. \end{aligned}$$
(3.22)

It is easy to verify that \(x_0=1\) is the optimal solution to subproblem \(F^{y_0}\) and

$$\begin{aligned}{}[-1, 1]\times [-1, 1]\subset \partial g_1(x_0, y_0). \end{aligned}$$
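The optimal solution \(x_0=1\) of subproblem \(F^{y_0}\) can also be confirmed numerically; the following minimal CVXPY sketch (our own modelling of (3.22), not part of the original development) returns \(x_0\approx 1\) with optimal value \(1>0\), certifying that \(P^{y_0}\) is infeasible:

import cvxpy as cp

# Feasibility subproblem F^{y_0} of (3.22) with y_0 = 1 (our own CVXPY model).
x = cp.Variable()
y0 = 1.0
g1 = cp.maximum(-x + y0 + 1, x - y0 + 1)    # g_1(x, y_0), convex and nonsmooth
g2 = x - y0                                 # g_2(x, y_0)
prob = cp.Problem(cp.Minimize(cp.pos(g1)),  # J_0 = {2}, so only g_1 enters the objective
                  [g2 <= 0, x >= 0, x <= 2])
prob.solve()
print(x.value, prob.value)   # approximately 1.0 and 1.0: the optimal value is
                             # positive, so P^{y_0} is indeed infeasible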

Now, if we take \((\xi _{0,1},\eta _{0,1})=(1,1)\in \partial g_1(x_0, y_0)\), \((\xi _{0,2},\eta _{0,2})=\triangledown g_2(x_0, y_0)\) and \((\alpha _0, \beta _0)=\triangledown f(x_0, y_0)\), then the MILP is defined as

$$\begin{aligned} \left\{ \begin{array}{ll} \mathop {\hbox {minimize}}\limits _{x,\,y,\,\theta } &{} \theta \\ \hbox {subject to} &{} x+y\le \theta ,\ \ \\ &{} x+y-1\le 0,\\ &{} x-y\le 0,\\ &{} x\in [0, 2], \ y\in \{1,2,3\}. \end{array} \right. \end{aligned}$$
(3.23)

The optimal solution to the MILP (3.23) is \((x,y,\theta )=(0,1,1)\). This means that the outer approximation method for this MINLP may generate an infinite loop between the points \((x_0,y_0)\) and \((0,1)\). Thus the outer approximation method with this choice of subgradient is invalid for the MINLP problem (3.21). Tracking down why the method fails here, one notices that the KKT conditions at \((x_0,y_0)\) do not hold for \((\xi _{0,1},\eta _{0,1})\); that is,

$$\begin{aligned} \not \exists (\lambda _{0,1},\lambda _{0,2})\in {\mathbb {R}}^2_+\ \ \hbox {satisfying} \quad \triangledown f(x_0, y_0)+\lambda _{0,1}(\xi _{0,1},\eta _{0,1})+\lambda _{0,2}\triangledown g_2(x_0, y_0)=0. \end{aligned}$$
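The claimed optimum of the MILP (3.23) can be verified directly by enumerating the three admissible values of \(y\) and solving the remaining linear program in \((x,\theta )\); a small SciPy sketch (our own, purely for checking) is as follows:

from scipy.optimize import linprog

# Check of the MILP (3.23): enumerate y in {1, 2, 3} and solve the remaining
# linear program in the variables z = (x, theta) with SciPy (our own sketch).
best = None
for y in (1, 2, 3):
    c = [0.0, 1.0]                    # minimize theta
    A_ub = [[1.0, -1.0],              # x + y <= theta
            [1.0, 0.0],               # x + y - 1 <= 0
            [1.0, 0.0]]               # x - y <= 0
    b_ub = [-y, 1.0 - y, float(y)]
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0.0, 2.0), (None, None)])
    if res.success and (best is None or res.fun < best[2]):
        best = (res.x[0], y, res.fun)

print(best)   # approximately (0.0, 1, 1.0): the MILP is feasible with optimum
              # theta = 1, although the MINLP (3.21) has no feasible point at all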

3.2 The algorithm

In this subsection, based on solving the MILP master program (MP) in (3.20), we present an outer approximation algorithm for finding an optimal solution of problem (P) in (1.1), along the lines of [5, 8, 23].

At iteration \(k\), the sets \(T\) and \(S\) in the master program (MP) of (3.20) are replaced by the sets \(T^k\) and \(S^k\), respectively, which are defined as

$$\begin{aligned} \left\{ \begin{array}{l} T^k:=\{j\le k: P^{y_j}\ \hbox {is feasible and} \ x_j \ \hbox {solves }\ P^{y_j} \},\\ S^k:=\{l\le k: P^{y_l}\ \hbox {is infeasible and} \ x_l \ \hbox {solves } \ F^{y_l} \}. \end{array} \right. \end{aligned}$$
(3.24)

If \(k\in T^k\) then \(x_k\) solves \(P^{y_k}\) and there exist \((\alpha _k,\beta _k)\in {\mathbb {R}}^n\times {\mathbb {R}}^p\) and \((\xi _{k,i},\eta _{k,i})\in {\mathbb {R}}^n\times {\mathbb {R}}^p\) for all \(i=1,\ldots ,m\) such that

$$\begin{aligned} \left\{ \begin{array}{l} -\alpha _k-\sum \limits _{i=1}^m\lambda _{k,i}\xi _{k,i}\in N(X, x_k)\ \ \hbox {for some} \ (\lambda _{k,1},\ldots ,\lambda _{k,m})\in {\mathbb {R}}^m_+,\\ (\alpha _k,\beta _k)\in \partial f(x_k,y_k),\\ (\xi _{k,i},\eta _{k,i})\in \partial g_i(x_k,y_k),\ \forall i=1,\ldots ,m. \end{array} \right. \end{aligned}$$
(3.25)

If \(k\in S^k\) then \(x_k\) solves \(F^{y_k}\) and there exist \((\xi _{k,i},\eta _{k,i})\in {\mathbb {R}}^n\times {\mathbb {R}}^p\) for all \(i=1,\ldots ,m\) such that

$$\begin{aligned} \left\{ \begin{array}{l} -\sum \limits _{i=1}^m\lambda _{k,i}\xi _{k,i}\in N(X, x_k)\ \ \hbox {for some} \ (\lambda _{k,1},\ldots ,\lambda _{k,m})\in {\mathbb {R}}^m_+,\\ (\xi _{k,i},\eta _{k,i})\in \partial g_i(x_k,y_k),\ \forall i=1,\ldots ,m. \end{array} \right. \end{aligned}$$
(3.26)

Set

$$\begin{aligned} \xi _k:=(\xi _{k,1},\ldots ,\xi _{k,m})\quad \hbox {and} \quad \eta _k:=(\eta _{k,1},\ldots ,\eta _{k,m}). \end{aligned}$$

To prevent a discrete variable assignment \(y_j\) (for any \(j\in T^k\)) from being the solution to the relaxed master program, we define \(UBD^k:=\min \{f(x_j,y_j): j\in T^k\}\) and add the constraint \(\theta <UBD^k\) to the master program. This gives rise to the following relaxed master program \(MP^k\):

$$\begin{aligned} MP^k\left\{ \begin{array}{ll} \mathop {\hbox {minimize}}\limits _{x,\,y,\,\theta } &{} \theta \\ \hbox {subject to} &{} \theta <UBD^k\\ &{} f(x_j, y_j)+(\alpha _j, \beta _j)^T\begin{pmatrix}x-x_j\\ y-y_j\end{pmatrix}\le \theta \quad \forall j\in T^k,\ \ \\ &{} g(x_j, y_j)+(\xi _j, \eta _j)^T\begin{pmatrix}x-x_j\\ y-y_j\end{pmatrix}\le 0\quad \forall j\in T^k,\\ &{} g(x_l, y_l)+(\xi _l, \eta _l)^T\begin{pmatrix}x-x_l\\ y-y_l\end{pmatrix}\le 0\quad \forall l\in S^k,\\ &{} x\in X, y\in Y\ \hbox {discrete variable}. \end{array} \right. \end{aligned}$$
(3.27)

The new discrete variable assignment \(y_{k+1}\) is obtained by solving \(MP^k\), and the whole process is repeated until the relaxed master program becomes infeasible.

We are now in a position to state the outer approximation algorithm for solving problem (P) in detail.

Algorithm 1 The outer approximation algorithm for problem (P) (pseudocode listing not reproduced here).
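Since the pseudocode listing of Algorithm 1 is not reproduced here, the following Python-style schematic (our own reconstruction of the loop structure described in (3.24)–(3.27); the three helper routines are hypothetical placeholders, not part of the paper) indicates how the iterations are organized:

from typing import Optional

# Schematic of the iteration (3.24)-(3.27) described above (our own sketch);
# the three helper routines are hypothetical placeholders, not the paper's code.

def solve_primal_subproblem(y):
    """Solve P^y of (3.2); return (x_j, f(x_j, y), cuts) or None if infeasible."""
    raise NotImplementedError

def solve_feasibility_subproblem(y):
    """Solve F^y of (3.13); return (x_l, cuts) built from the KKT subgradients."""
    raise NotImplementedError

def solve_relaxed_master(cuts, upper_bound) -> Optional[tuple]:
    """Solve MP^k of (3.27) with the constraint theta < upper_bound;
    return (x, y, theta) or None if the relaxed master program is infeasible."""
    raise NotImplementedError

def outer_approximation(y0, max_iter=1000):
    cuts = []                    # accumulated linearizations (cuts from T^k and S^k)
    upper_bound = float("inf")   # UBD^k
    incumbent = None             # best feasible point (x, y) found so far
    y = y0
    for _ in range(max_iter):
        result = solve_primal_subproblem(y)
        if result is not None:                    # current index belongs to T^k
            x, f_val, new_cuts = result
            if f_val < upper_bound:
                upper_bound, incumbent = f_val, (x, y)
        else:                                     # current index belongs to S^k
            x, new_cuts = solve_feasibility_subproblem(y)
        cuts.extend(new_cuts)
        master = solve_relaxed_master(cuts, upper_bound)
        if master is None:       # relaxed master infeasible: stop
            return incumbent     # optimal point of (P), or None if (P) is infeasible
        _, y, _ = master         # new discrete variable assignment y_{k+1}
    return incumbent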

Under the assumption that the discrete set \(Y\) has finite cardinality, the following theorem shows that Algorithm 1 detects the feasibility or infeasibility of problem (P) in (1.1) and that the procedure terminates after a finite number of steps. The proof is also given in Sect. 5.

Theorem 3.6

Suppose that MINLP problem (P) in (1.1) satisfies assumption (A1) and the cardinality of \(Y\) is finite. Then either problem (P) is infeasible or Algorithm 1 terminates in a finite number of steps at an optimal value of problem (P).

4 Conclusions

This paper is mainly devoted to the study of a class of convex MINLPs in which the objective and constraint functions are continuous and non-differentiable. Without the differentiability assumption, subgradients of the objective and constraint functions, which substitute for the gradients used in convex smooth MINLPs, are chosen via the KKT conditions and used to reformulate the MINLP problem as an equivalent mixed-integer linear program. A counterexample shows that subgradients not satisfying the KKT conditions may be invalid for the MILP reformulation, which demonstrates the necessity of the KKT conditions in the equivalent reformulation. By solving a finite sequence of subproblems and relaxed MILP problems, an outer approximation algorithm for this convex MINLP is presented to find an optimal solution of the problem. The finite convergence of the algorithm is also proved. The work of this paper extends that of [5, 8, 23] and generalizes the outer approximation method, in the sense of dealing with convex MINLPs, from the differentiable case to the non-differentiable one.