1 Introduction

1.1 Motivation

Theories and methods of optimization originated in fields such as the military, economics, management, and engineering. After World War II, the application of optimization theory and methods spread from military to civilian use, and optimization methods and theories for engineering and modern management were developed. Since Dantzig proposed the simplex method for solving general linear programming problems in 1947, optimization has been an independent discipline. With the development of the steepest descent method, the conjugate gradient method, Newton's method and quasi-Newton methods in and after the 1950s, nonlinear programming advanced greatly. Up to now, theoretical research on optimization problems such as linear programming, nonlinear programming, integer programming, semi-definite programming, stochastic programming and multi-objective programming has developed rapidly, and with the rapid development of informatization and artificial intelligence, computational experiments on these optimization problems have gained even greater scope. As more and more problems are solved and requirements keep rising, global optimization has become a new branch of the optimization discipline and has attracted more and more study. Global optimization seeks the global optimal solution of a programming problem, or a proof that no global optimal solution exists. This involves the study of optimality conditions: a point that does not satisfy a necessary condition cannot be an optimal solution; a point that satisfies a sufficient condition must be an optimal point; and even a good algorithm may be in vain without a verifiable termination criterion, which is exactly an optimality condition. Optimality conditions are therefore an indispensable part of optimization theory, and global optimality conditions are a piece of theory that urgently needs to be developed.

1.2 Literature review

Research on optimality conditions proceeds along two lines: local and global. The study of local optimality conditions matured with the development of nonlinear programming; the KKT conditions, first proposed by Kuhn and Tucker [1] in 1951, are local optimality conditions for continuous optimization problems. Research on global optimality conditions currently covers only optimization problems with special structure, since global optimality conditions that are easy to verify are difficult to obtain for general optimization problems. For constrained optimization problems, Yang [2] and Pinar [3] gave global optimality conditions for convex programming problems, while Wu [4] and Alexanders [5] made progress in characterizing the global optimal solutions of several special non-convex programming problems; in particular, Wu [4] used the theory of abstract convex analysis to give a sufficient global optimality condition for the weakly convex minimization problem with inequality constraints. In 2006, Schichl [6] observed that studies of optimality conditions generally rely on qualitative conditions (smoothness, linearity, convexity, etc.) and on technical conditions that describe the type of problem, that is, constraint qualifications. Wu, Jeyakumar and Rubinov [7] used the Lagrangian function to give a sufficient global optimality condition for the non-convex quadratic programming problem with quadratic equality and inequality constraints. Jeyakumar, Srisatkunarajah and Huy [8] discussed non-convex continuous optimization problems with equality and inequality constraints and gave Kuhn-Tucker sufficient global optimality conditions for such problems. For the non-convex quadratic programming problem with binary constraints, Beck [9] established a sufficient condition for a feasible solution to be the global optimal solution and obtained a necessary condition for the global optimal solution. Many other scholars have paid a great deal of attention to global optimality conditions for polynomial optimization problems, e.g., Bienstock [10], David [11], Qi [12] and Jeyakumar [13]. For 0-1 quadratic programming problems, Chen and Zhang [14] used the relationship between 0-1 quadratic programming and integer quadratic programming to obtain necessary and sufficient global optimality conditions for integer quadratic programming with linear inequality constraints; Fang [15] also studied optimality conditions and gave a canonical dual approach to solving 0-1 quadratic programming problems, as did Hsia [16]. Optimality conditions for multi-objective optimization problems have been studied by, among others, Wang [17], Jean-Pierre [18] and Perkki [19]. In [20], Jeyakumar and Li further deepened the theory of Schichl [6], proposing necessary global optimality conditions for optimization problems with a nonlinear objective function and polynomial constraints and putting forward a concept for converting the optimality conditions of the polynomial optimization problems proposed by Schichl [6].

1.3 Proposed approaches

The polynomial transformation theorem of Schichl [6], i.e. the alternative theorem, is a key tool for developing global optimality conditions in polynomial optimization. Some of the scholars mentioned above used the alternative theorem to discuss global optimality conditions. However, the existing results are limited to special cases in which both the objective function and the constraint functions are polynomials. This paper deepens that study and discusses global optimality conditions for general nonlinear programming problems through the alternative theorem. By using the polynomial transformation theorem from Schichl [6] and Marshall [21] together with a polynomial construction, we develop necessary global optimality conditions for nonlinear programming problems with non-polynomial constraint functions and sufficient global optimality conditions for problems with a polynomial objective function and non-polynomial constraint functions. In particular, this paper presents some necessary and sufficient global optimality conditions for 0-1 quadratic programming problems.

The outline of the paper is as follows. In Sect. 2, we describe the alternative theorem and give some relevant theoretical background and notation. In Sect. 3, we present necessary global optimality conditions for nonlinear programming problems with non-polynomial constraint functions and develop sufficient global optimality conditions for problems with a polynomial objective function and non-polynomial constraint functions. In Sect. 4, we discuss necessary and sufficient global optimality conditions for 0-1 quadratic programming problems. Section 5 concludes the paper.

2 Preliminaries

2.1 Symbol description

Denote the real polynomial ring on \(\varvec{R}^{n}\) by \(\varvec{R[x]}\), where \({x}=(x_{{1}},\dots ,x_{{n}})^{T}\in \varvec{R}^{n}\). The set of all natural numbers is denoted by \(\varvec{N}\). The notation \(\varvec{A}\succeq 0\) means that the matrix \(\varvec{A}\) is positive semi-definite. A polynomial \(f\) is a sum of squares polynomial (SOS polynomial) in \(\varvec{R[x]}\) if there exist \(k \in \varvec{N}\) and \(f_{i} \in \varvec{R[x]}, i=1,\dots ,k\), such that \(f(x)=\sum \nolimits _{i=1}^{k}f_{i}^{2}(x)\) for each \(x \in \varvec{R}^{n}\). Let \(f_{1},\dots ,f_{k} \in \varvec{R[x]}\). The following notation is used throughout the paper:

$$\begin{aligned} S\langle f_{1},\ldots ,f_{k}\rangle:= & {} \left\{ \prod \limits _{i=1}^{k}f_{i}^{e_{i}}\,\Big \vert \, e_{i}\in \{0,1\}\right\} ,\\ I\langle f_{1},\ldots ,f_{k}\rangle:= & {} \left\{ \sum \limits _{i=1}^{k}a_{i}f_{i}\,\Big \vert \, a_{i}\in \varvec{R[x]}\right\} ,\\ M\langle f_{1},\ldots ,f_{k}\rangle:= & {} \left\{ \prod \limits _{i=1}^{k}f_{i}^{e_{i}}\,\Big \vert \, e_{i}\in \varvec{N}\cup \{0\}\right\} ,\\ C\langle f_{1},\ldots ,f_{k}\rangle:= & {} \left\{ \sum \limits _{i=1}^{2^{k}}y_{i}u_{i}\,\Big \vert \, y_{i}\ \text {is a SOS polynomial},\ u_{i}\in S\langle f_{1},\ldots ,f_{k}\rangle \right\} . \end{aligned}$$
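For instance (our own small illustration), for \(k=2\) these sets read

$$\begin{aligned} S\langle f_{1},f_{2}\rangle= & {} \{1,\ f_{1},\ f_{2},\ f_{1}f_{2}\},\\ I\langle f_{1},f_{2}\rangle= & {} \{a_{1}f_{1}+a_{2}f_{2}\,\vert \, a_{1},a_{2}\in \varvec{R[x]}\},\\ M\langle f_{1},f_{2}\rangle= & {} \{1,\ f_{1},\ f_{2},\ f_{1}^{2},\ f_{1}f_{2},\ f_{2}^{2},\dots \},\\ C\langle f_{1},f_{2}\rangle= & {} \{y_{1}+y_{2}f_{1}+y_{3}f_{2}+y_{4}f_{1}f_{2}\,\vert \, y_{i}\ \text {is a SOS polynomial}\}, \end{aligned}$$

so \(S\) consists of the \(2^{k}=4\) square-free products and \(C\) of all SOS combinations of them.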

2.2 Alternative theorem

In [6], Schichl and Neumaier pointed out that research on optimality conditions generally depends on qualitative conditions (smoothness, linearity, convexity, etc.) and on technical conditions describing the type of problem, that is, constraint qualifications, which are imposed to remove specific difficulties or counter-examples in the proofs. However, since optimality conditions are used to exclude candidate solutions or to verify the existence of solutions, their own verifiability is very important, so optimality conditions should not depend on any constraint qualification. Schichl and Neumaier pointed out that the alternative theorem is a key point in the study of optimality conditions, and that a very strong non-degeneracy condition is required in (second-order) necessary conditions. They constructed necessary and sufficient conditions for polynomial optimization problems without any constraint qualification by using a polynomial analogue of the transposition theorem for linear systems, namely a Positivstellensatz.

Theorem 1

[6] (Polynomial transformation theorem II) Let P, Q and R be vectors of polynomials. Then exactly one of the following two conditions is true:

  1. (i)

    \(P(x)\ge 0,Q(x)=0,R_{i}(x)> 0(i=1,\ldots ,k)\), for some \(x\in \varvec{R}^{n}\);

  2. (ii)

    \(w_{1}+w_{2}+w_{3}=0\), for some \(w_{1}\in C\langle P,R\rangle ,w_{2}\in I\langle Q\rangle ,w_{3}\in M\langle R\rangle\).

The theorem above is referred to as an alternative theorem: exactly one of the two alternatives holds. The alternative theorem is a key tool for constructing and proving optimality conditions; some global optimality conditions based on it are presented in [20].
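As a simple one-variable illustration (ours, not taken from [6]): take empty \(P\) and \(Q\) and \(R(x)=-x^{2}-1\) on \(\varvec{R}\). Condition (i) fails, since \(R(x)>0\) has no solution; correspondingly, (ii) holds with

$$\begin{aligned} w_{1}=(x^{2}+1)\cdot 1\in C\langle R\rangle ,\quad w_{2}=0,\quad w_{3}=R(x)\in M\langle R\rangle ,\quad w_{1}+w_{2}+w_{3}=0, \end{aligned}$$

where \(x^{2}+1\) is a SOS polynomial and \(1 \in S\langle R\rangle\).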

3 Global optimality conditions for nonlinear programming problems

In [20], the authors discussed problems with polynomial constraints. Here, we consider problems with non-polynomial constraints and present some global optimality conditions.

We consider the problem

$$\begin{aligned}(P_{1})\quad \left\{ \begin{array}{ll} \min &{}p(x)-k(x) \\ \text {s.t.} &{}p_{i}(x)+k_{i}(x)\ge 0,\quad i=1,\dots ,k;\\ &{}h_{j}(x)=0,\quad j=1,\dots ,l, \end{array}\right. \end{aligned}$$

where \(p(x), p_{i}(x), h_{j}(x)\) are real polynomials on \(\varvec{R}^{n}\) and \(k(x), k_{i}(x)\) are twice continuously differentiable convex functions on \(\varvec{R}^{n}\). We denote

$$\begin{aligned} F_{0}=\{x\vert p_{i}(x)+k_{i}(x)\ge 0, i=1,\dots ,k, h_{j}(x)=0, j=1,\dots ,l\}. \end{aligned}$$

Let \(\hat{x}\) be a feasible point of \((P_{1})\), and define the function \(\tilde{g}_{i\hat{x}}^{*}:\varvec{R}^{n}\rightarrow \varvec{R}\),

$$\begin{aligned}\tilde{g}_{i\hat{x}}^{*}(x)=p_{i}(x)-p_{i}(\hat{x})+\nabla k_{i}(\hat{x})^{T}x-\nabla k_{i}(\hat{x})^{T}\hat{x},\quad i=1,\dots ,k,\end{aligned}$$

and the function \(\tilde{g}_{\hat{x}}^{*}: \varvec{R}^{n}\rightarrow \varvec{R}\),

$$\begin{aligned} g(x)=p(x)-\nabla k(\hat{x})^{T}x ~~and~~ \tilde{g}_{\hat{x}}^{*}(x)=g(\hat{x})-g(x). \end{aligned}$$
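As a small illustration of this construction (with a hypothetical one-dimensional instance \(p(x)=x^{2}\), \(k(x)=e^{x}\), \(\hat{x}=0\) of our own choosing, not data from the paper), the following SymPy sketch confirms that \(\tilde{g}_{\hat{x}}^{*}\) is a polynomial even though \(k\) is not:

```python
# A minimal sketch of the construction above, assuming the hypothetical
# 1-D instance p(x) = x**2 (polynomial part) and k(x) = exp(x) (convex part).
import sympy as sp

x = sp.symbols('x')
p = x**2                          # real polynomial p(x)
k = sp.exp(x)                     # twice continuously differentiable convex k(x)
xhat = sp.Integer(0)              # the chosen point x-hat

# g(x) = p(x) - k'(x-hat)*x  and  g~*_{x-hat}(x) = g(x-hat) - g(x)
dk = sp.diff(k, x).subs(x, xhat)
g = p - dk * x
g_tilde = sp.expand(g.subs(x, xhat) - g)

print(g_tilde)                    # -> x - x**2, a polynomial as required
print(g_tilde.is_polynomial(x))   # -> True
```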

Theorem 2

If \(\hat{x}\) is a global optimal solution of \((P_{1})\), then there exist

$$\begin{aligned} \gamma \in \varvec{N},a_{j}(x) \in \varvec{R[x]}, j=1,\dots ,l, \end{aligned}$$

and SOS polynomials \(y_{i}(x),i=1,\dots ,2^{k+1}\),

$$\begin{aligned} u_{i}(x) \in \left\{ \prod \limits _{i=1}^{k}[\tilde{g}_{i\hat{x}}^{*}(x)]^{e_{i}}[\tilde{g}_{\hat{x}}^{*}(x)]^{e}\,\Big \vert \, e_{i},e \in \{0,1\}\right\} ,\quad u(x)=(u_{1}(x),\dots ,u_{2^{k+1}}(x)), \end{aligned}$$

such that, for some \(1\le i^{*}\le 2^{k+1}\), we have

$$\begin{aligned} u_{i^{*}}(\hat{x})=1, y_{i^{*}}(\hat{x})=0, \end{aligned}$$

and for each \(x\in \varvec{R}^{n}\), we have

$$\begin{aligned} \begin{array}{l} \ \sum \limits _{i=1}^{2^{k+1}}y_{i}(x)u_{i}(x)+\sum \limits _{j=1}^{l}a_{j}(x)h_{j}(x)+[\tilde{g}_{\hat{x}}^{*}(x)]^{\gamma }=0. \end{array} \end{aligned}$$
(1)

Proof

Suppose \(\hat{x}\) is a global optimal solution of \((P_{1})\).

Let

$$\begin{aligned} \begin{array}{ll} &{}g(x)=p(x)-\nabla k(\hat{x})^{T}x, \\ &{}\tilde{g}_{\hat{x}}^{*}(x)=g(\hat{x})-g(x)=p(\hat{x})-p(x)-\nabla k(\hat{x})^{T}(\hat{x}-x), \\ &{}\phi (x)=p(x)-k(x)-g(x), \\ &{}\nabla \phi (x)=\nabla p(x)-\nabla k(x)-\nabla g(x)=-\nabla k(x)+\nabla k(\hat{x}), \\ &{}\nabla ^{2}\phi (x)=-\nabla ^{2}k(x)\preceq 0, \\ &{}\nabla \phi (\hat{x})=0. \end{array} \end{aligned}$$


Since \(\nabla ^{2}\phi (x)\preceq 0\), \(\phi\) is concave on \(\varvec{R}^{n}\); together with \(\nabla \phi (\hat{x})=0\), this shows that \(\hat{x}\) is a global maximizer of \(\phi\). So,

$$\begin{aligned}&\phi (x)\le \phi (\hat{x}), p(x)-k(x)-g(x)\le p(\hat{x})-k(\hat{x})-g(\hat{x}),\\&\quad g(\hat{x})-g(x)\le p(\hat{x})-k(\hat{x})-(p(x)-k(x)). \end{aligned}$$

Thus,

$$\begin{aligned} g(\hat{x})-g(x)\le 0\quad \text {when}\quad p(\hat{x})-k(\hat{x})-(p(x)-k(x))\le 0. \end{aligned}$$

For \(i=1,\dots ,k\), let

$$\begin{aligned} \begin{array}{ll} &{} g_{i}(x)=p_{i}(x)+\nabla k_{i}(\hat{x})^{T}x, \\ &{}\tilde{g}_{i\hat{x}}^{*}(x)=g_{i}(x)-g_{i}(\hat{x})=p_{i}(x)-p_{i}(\hat{x})+\nabla k_{i}(\hat{x})^{T}(x-\hat{x}), \\ &{}\phi _{i}(x)=p_{i}(x)+k_{i}(x)-g_{i}(x), \\ &{}\nabla \phi _{i}(x)=\nabla p_{i}(x)+\nabla k_{i}(x)-\nabla g_{i}(x)=\nabla k_{i}(x)-\nabla k_{i}(\hat{x}), \\ &{}\nabla ^{2}\phi _{i}(x)=\nabla ^{2}k_{i}(x)\succeq 0, \\ &{}\nabla \phi _{i}(\hat{x})=0. \end{array} \end{aligned}$$

Since \(\nabla ^{2}\phi _{i}(x)\succeq 0\), \(\phi _{i}\) is convex; together with \(\nabla \phi _{i}(\hat{x})=0\), this shows that \(\hat{x}\) is a global minimizer of \(\phi _{i}(x)\). Hence,

$$\begin{aligned}&\phi _{i}(x)\ge \phi _{i}(\hat{x}),\\&\quad p_{i}(x)+k_{i}(x)-g_{i}(x)\ge p_{i}(\hat{x})+k_{i}(\hat{x})-g_{i}(\hat{x}),\\&\quad p_{i}(x)+k_{i}(x)-(p_{i}(\hat{x})+k_{i}(\hat{x}))\ge g_{i}(x)-g_{i}(\hat{x}). \end{aligned}$$

Since \(\hat{x}\) is a feasible point of the problem, \(p_{i}(\hat{x})+k_{i}(\hat{x})\ge 0\).

If \(p_{i}(x)+k_{i}(x)<0\), then \(\tilde{g}_{i\hat{x}}^{*}(x)<0\).

By construction, \(\tilde{g}_{\hat{x}}^{*}(x)\) and \(\tilde{g}_{i\hat{x}}^{*}(x), i=1,\dots ,k\), are polynomials. So, to apply Theorem 1, let

$$\begin{aligned} P(x)= & {} (\tilde{g}_{1\hat{x}}^{*}(x),\dots ,\tilde{g}_{k\hat{x}}^{*}(x)),\\ Q(x)= & {} (h_{1}(x),\dots ,h_{l}(x)),\\ R(x)= & {} \tilde{g}_{\hat{x}}^{*}(x). \end{aligned}$$

Then exactly one of the following holds:

  1. (i)

    \(P(x)\ge 0, Q(x)=0, R(x)>0\) for some \(x \in \varvec{R}^{n}\);

  2. (ii)

    \(w_{1}(x)+w_{2}(x)+w_{3}(x)=0\) for some \(w_{1}(x) \in C\langle P(x),R(x) \rangle , w_{2}(x) \in I\langle Q(x) \rangle , w_{3}(x) \in M\langle R \rangle .\)

    $$\begin{aligned} C\langle P(x),R(x)\rangle= & {} \left\{ \sum \limits _{i=1}^{2^{k+1}}y_{i}(x)u_{i}(x)\,\Big \vert \, y_{i}(x)\ \text {is a SOS polynomial},\ u_{i}(x) \in S\langle P(x),R(x) \rangle \right\} ,\\ S\langle P(x),R(x)\rangle= & {} \left\{ \prod \limits _{i=1}^{k}[\tilde{g}_{i\hat{x}}^{*}(x)]^{e_{i}}[\tilde{g}_{\hat{x}}^{*}(x)]^{e}\,\Big \vert \, e_{i},e \in \{0,1\}\right\} ,\\ I\langle Q(x) \rangle= & {} \left\{ \sum \limits _{j=1}^{l}a_{j}(x)h_{j}(x)\,\Big \vert \, a_{j}(x) \in \varvec{R[x]}\right\} ,\\ M\langle R \rangle= & {} \left\{ [\tilde{g}_{\hat{x}}^{*}(x)]^{\gamma }\,\Big \vert \, \gamma \in \varvec{N}\cup \{0\}\right\} . \end{aligned}$$

Since \(\hat{x}\) is a global solution, \(R(x)\le 0\) for all \(x \in F_{0}\).

For \(x \in \varvec{R}^{n}\setminus F_{0}\),

Case 1: if \(h_{j}(x)=0\) for all \(j=1,\dots ,l\), then there must exist \(1\le i_{0}\le k\) such that

$$\begin{aligned} p_{i_{0}}(x)+k_{i_{0}}(x)<0,\tilde{g}_{i_{0}\hat{x}}^{*}(x)<0, \end{aligned}$$

then \(P(x)\ge 0\) does not hold.

Case 2: if there exists \(1\le j_{0}\le l\) such that

$$\begin{aligned} h_{j_{0}}(x)\ne 0, \end{aligned}$$

then \(Q(x)\ne 0\).

So, for each \(x \in \varvec{R}^{n}\),

$$\begin{aligned} P(x)\ge 0, Q(x)=0, R(x)>0, \end{aligned}$$

do not hold at the same time; that is, (i) fails. Thus, (ii) holds, which yields (1) when \(\hat{x}\) is a global solution of \((P_{1})\).

Substitute \(x=\hat{x}\) into (1) and note that

$$\begin{aligned} \tilde{g}_{\hat{x}}^{*}(\hat{x})=0, \tilde{g}_{i\hat{x}}^{*}(\hat{x})=0, i=1,\dots ,k, h_{j}(\hat{x})=0, j=1,\dots ,l, \sum \limits _{i=1}^{2^{k+1}}y_{i}(\hat{x})u_{i}(\hat{x})=0. \end{aligned}$$

By construction of \(u_{i}(x)\), there must exist \(1\le i^{*}\le 2^{k+1}\), such that

$$\begin{aligned} u_{i^{*}}(\hat{x})=1, \end{aligned}$$

and for all \(1\le i\ne i^{*}\le 2^{k+1}\), we have

$$\begin{aligned} u_{i}(\hat{x})=0, \end{aligned}$$

so \(y_{i^{*}}(\hat{x})=0\). That completes the proof. \(\square\)

We consider the problem

$$\begin{aligned}(P_{2})\quad \left\{ \begin{array}{ll} \min &{}f(x) \\ \text {s.t.} &{}p_{i}(x)-k_{i}(x)\ge 0,\quad i=1,\dots ,k;\\ &{}\beta _{j}(x)-\alpha _{j}(x)=0,\quad j=1,\dots ,l, \end{array}\right. \end{aligned}$$

where \(f(x), p_{i}(x), \beta _{j}(x)\) are real polynomials on \(\varvec{R}^{n}\) and \(k_{i}(x), \alpha _{j}(x)\) are twice continuously differentiable convex functions on \(\varvec{R}^{n}\). We denote

$$\begin{aligned} F=\{x\vert p_{i}(x)-{k}_{i}(x)\ge 0, i=1,\dots ,k, \beta _{j}(x)-\alpha _{j}(x)=0, j=1,\dots ,l\}. \end{aligned}$$

Assume that, for \(i=1,\dots ,k,\)

$$\begin{aligned} \{x\vert {p}_{i}(x)-{k}_{i}(x)\ge 0\}\ne \varvec{R}^{n}, \{x \vert {p}_{i}(x)-{k}_{i}(x)< 0\}\ne \emptyset , \end{aligned}$$

and choose \(x^{(i)} \in \varvec{R}^{n}\) such that

$$\begin{aligned} p_{i}(x^{(i)})-k_{i}(x^{(i)})=0\quad \text {or}\quad p_{i}(x^{(i)})-k_{i}(x^{(i)})<0. \end{aligned}$$

Define the function \(\tilde{g}_{ix^{(i)}}^{**}:\varvec{R}^{n}\rightarrow \varvec{R}\),

$$\begin{aligned} \tilde{g}_{ix^{(i)}}^{**}(x)=p_{i}(x)-p_{i}(x^{(i)})+\nabla k_{i}(x^{(i)})^{T}x^{(i)}-\nabla k_{i}(x^{(i)})^{T}x,\quad i=1,\dots ,k, \end{aligned}$$

and the function \(\tilde{g}_{j\hat{x}}^{***}:\varvec{R}^{n}\rightarrow \varvec{R}\),

$$\begin{aligned} \tilde{g}_{j\hat{x}}^{***}(x)=\beta _{j}(x)-\beta _{j}(\hat{x})+\nabla \alpha _{j}(\hat{x})^{T}\hat{x}-\nabla \alpha _{j}(\hat{x})^{T}x, j=1,\dots ,l. \end{aligned}$$

Theorem 3

Let \(\hat{x}\) be a feasible point of problem \((P_{2})\). If there exist \(\gamma \in \varvec{N}\) and SOS polynomials \(y_{i}(x), i=1,\dots ,2^{k+l+1}\), with \(u(x)=(u_{1}(x),\dots ,u_{2^{k+l+1}}(x))\) and

$$\begin{aligned} u_{i}(x) \in \left\{ \prod \limits _{i=1}^{k}[\tilde{g}_{ix^{(i)}}^{**}(x)]^{e_{i}}\prod \limits _{j=1}^{l}[\tilde{g}_{j\hat{x}}^{***}(x)]^{e_{j}}[f(\hat{x})-f(x)]^{e}\,\Big \vert \, e_{i},e_{j},e \in \{0,1\}\right\} , \end{aligned}$$

such that for each \(x\in \varvec{R}^{n}\), we have

$$\begin{aligned} \begin{array}{l} \ \sum \limits _{j=1}^{2^{k+l+1}}y_{j}(x)u_{j}(x)+[f(\hat{x})-f(x)]^{\gamma }=0, \end{array} \end{aligned}$$
(2)

then \(\hat{x}\) is a global optimal solution of problem \((P_{2})\).

Proof

Let

$$\begin{aligned}\begin{array}{ll} &{} \eta _{i}(x)=p_{i}(x)-k_{i}(x)-(p_{i}(x)-\nabla k_{i}(x^{(i)})^{T}x), \\ &{}\nabla \eta _{i}(x)=\nabla p_{i}(x)-\nabla k_{i}(x)-\nabla p_{i}(x)+\nabla k_{i}(x^{(i)})=-\nabla k_{i}(x)+\nabla k_{i}(x^{(i)}), \\ &{}\nabla ^{2}\eta _{i}(x)=-\nabla ^{2}k_{i}(x)\preceq 0, \\ &{}\nabla \eta _{i}(x^{(i)})=0. \end{array} \end{aligned}$$

\(\square\)

Since \(\nabla ^{2}\eta _{i}(x)\preceq 0\), \(\eta _{i}\) is concave; together with \(\nabla \eta _{i}(x^{(i)})=0\), this shows that \(x^{(i)}\) is a global maximizer of \(\eta _{i}(x)\). Hence,

$$\begin{aligned}&\eta _{i}(x)\le \eta _{i}(x^{(i)}),\\&\quad p_{i}(x)-k_{i}(x)-(p_{i}(x)-\nabla k_{i}(x^{(i)})^{T}x)\le p_{i}(x^{(i)})-k_{i}(x^{(i)})-(p_{i}(x^{(i)})-\nabla k_{i}(x^{(i)})^{T}x^{(i)}),\\&\quad (p_{i}(x^{(i)})-\nabla k_{i}(x^{(i)})^{T}x^{(i)})-(p_{i}(x)-\nabla k_{i}(x^{(i)})^{T}x)\le p_{i}(x^{(i)})-k_{i}(x^{(i)})-(p_{i}(x)-k_{i}(x)), \end{aligned}$$

If \(p_{i}(x)-k_{i}(x)\ge 0\), then, since \(p_{i}(x^{(i)})-k_{i}(x^{(i)})\le 0\) by the choice of \(x^{(i)}\), we have \(p_{i}(x^{(i)})-k_{i}(x^{(i)})-(p_{i}(x)-k_{i}(x))\le 0\), and therefore

$$\begin{aligned} \tilde{g}_{ix^{(i)}}^{**}(x)=(p_{i}(x)-\nabla k_{i}(x^{(i)})^{T}x)-(p_{i}(x^{(i)})-\nabla k_{i}(x^{(i)})^{T}x^{(i)})\ge 0. \end{aligned}$$

Let

$$\begin{aligned} \begin{array}{ll} &{} g_{j\hat{x}}(x)=\beta _{j}(x)-\nabla \alpha _{j}(\hat{x})^{T}x, \\ &{}\tilde{g}_{j\hat{x}}^{***}(x)=g_{j\hat{x}}(x)-g_{j\hat{x}}(\hat{x})=\beta _{j}(x)-\beta _{j}(\hat{x})-\nabla \alpha _{j}(\hat{x})^{T}(x-\hat{x}), \\ &{}\pi _{j}(x)=\beta _{j}(x)-\alpha _{j}(x)-g_{j\hat{x}}(x), \\ &{}\nabla \pi _{j}(x)=\nabla \beta _{j}(x)-\nabla \alpha _{j}(x)-\nabla g_{j\hat{x}}(x)=-\nabla \alpha _{j}(x)+\nabla \alpha _{j}(\hat{x}), \\ &{}\nabla ^{2}\pi _{j}(x)=-\nabla ^{2}\alpha _{j}(x)\preceq 0, \\ &{}\nabla \pi _{j}(\hat{x})=0. \end{array} \end{aligned}$$

Similarly,

$$\begin{aligned}&\pi _{j}(x)\le \pi _{j}(\hat{x}),\\&\quad \beta _{j}(x)-\alpha _{j}(x)-g_{j\hat{x}}(x)\le \beta _{j}(\hat{x})-\alpha _{j}(\hat{x})-g_{j\hat{x}}(\hat{x}), \\&\quad g_{j\hat{x}}(\hat{x})-g_{j\hat{x}}(x)\le \beta _{j}(\hat{x})-\alpha _{j}(\hat{x})-(\beta _{j}(x)-\alpha _{j}(x)). \end{aligned}$$

Since \(\hat{x}\) is a feasible point,

$$\begin{aligned} \beta _{j}(\hat{x})-\alpha _{j}(\hat{x})=0,\quad \text {and}\quad g_{j\hat{x}}(\hat{x})-g_{j\hat{x}}(x)\le 0\ \text {when}\ \beta _{j}(x)-\alpha _{j}(x)=0. \end{aligned}$$

By construction, \(\tilde{g}_{ix^{(i)}}^{**}(x), i=1,\dots ,k, \tilde{g}_{j\hat{x}}^{***}(x), j=1,\dots ,l\) are polynomials.

So, let

$$\begin{aligned} P(x)=(\tilde{g}_{1x^{(1)}}^{**}(x),\dots ,\tilde{g}_{kx^{(k)}}^{**}(x),\tilde{g}_{1\hat{x}}^{***}(x),\dots ,\tilde{g}_{l\hat{x}}^{***}(x)),~R(x)=f(\hat{x})-f(x). \end{aligned}$$

Then exactly one of the following holds:

  1. (i)

    \(P(x)\ge 0, R(x)>0\) for some \(x \in \varvec{R}^{n}\);

  2. (ii)

    \(w_{1}(x)+w_{3}(x)=0\) for some \(w_{1}(x) \in C\langle P(x),R(x) \rangle , w_{3}(x) \in M\langle R \rangle\).

$$\begin{aligned} C\langle P(x),R(x)\rangle= & {} \left\{ \sum \limits _{i=1}^{2^{k+l+1}}y_{i}(x)u_{i}(x)\,\Big \vert \, y_{i}(x)\ \text {is a SOS polynomial},\ u_{i}(x) \in S\langle P(x),R(x)\rangle \right\} ,\\ S\langle P(x),R(x)\rangle= & {} \left\{ \prod \limits _{i=1}^{k}[\tilde{g}_{ix^{(i)}}^{**}(x)]^{e_{i}}\prod \limits _{j=1}^{l}[\tilde{g}_{j\hat{x}}^{***}(x)]^{e_{j}}[f(\hat{x})-f(x)]^{e}\,\Big \vert \, e_{i},e_{j},e \in \{0,1\}\right\} ,\\ M\langle R\rangle= & {} \left\{ [f(\hat{x})-f(x)]^{\gamma }\,\Big \vert \, \gamma \in \varvec{N}\cup \{0\}\right\} . \end{aligned}$$

If (2) holds, then (ii) holds, so (i) does not hold; that is,

$$\begin{aligned} P(x)\ge 0, R(x)>0, \end{aligned}$$

do not hold simultaneously for any \(x \in \varvec{R}^{n}\).

For all \(x \in F\) we have \(\beta _{j}(x)-\alpha _{j}(x)=0\), hence

$$\begin{aligned} \tilde{g}_{j\hat{x}}^{***}(x)=g_{j\hat{x}}(x)-g_{j\hat{x}}(\hat{x})\ge 0. \end{aligned}$$

Also,

$$\begin{aligned} p_{i}(x)-k_{i}(x)\ge 0, \tilde{g}_{ix^{(i)}}^{**}(x)\ge 0. \end{aligned}$$

So, for each \(x \in F\), we have \(P(x)\ge 0\), which means that \(R(x)>0\) does not hold.

Thus,

$$\begin{aligned} f(\hat{x})-f(x)\le 0, \end{aligned}$$

for each \(x \in F\), so \(\hat{x}\) is a global minimum of \((P_{2})\). That completes the proof. \(\square\)

Now we consider the problem

$$\begin{aligned}(P_{3})\quad \left\{ \begin{array}{ll} \min &{}f(x) \\ \text {s.t.} &{}p_{1}(x)-k_{1}(x)\ge 0;\\ &{}x \in \{0,1\}^{n}, \end{array}\right. \end{aligned}$$

where \(f(x),p_{1}(x)\) are real polynomials on \(\varvec{R}^{n}\) and \(k_{1}(x)\) is a twice continuously differentiable convex function on \(\varvec{R}^{n}\). Choose \(x^{(1)}\in \varvec{R}^{n}\) such that

$$\begin{aligned} p_{1}(x^{(1)})-k_{1}(x^{(1)})=0\quad \text {or}\quad p_{1}(x^{(1)})-k_{1}(x^{(1)})<0. \end{aligned}$$

Define the function \(\tilde{g}_{1x^{(1)}}^{**}:\varvec{R}^{n}\rightarrow \varvec{R}\),

$$\begin{aligned} \tilde{g}_{1x^{(1)}}^{**}(x)=p_{1}(x)-p_{1}(x^{(1)})+\nabla k_{1}(x^{(1)})^{T}x^{(1)}-\nabla k_{1}(x^{(1)})^{T}x. \end{aligned}$$

Corollary 1

Let \(\hat{x}\) be a feasible point of problem \((P_{3})\). If there exist \(\gamma \in \varvec{N}\), \(a_{j}(x)\in \varvec{R[x]}, j=1,\ldots ,n\), and SOS polynomials \(y_{i}(x), i=1,2,3,4\), such that for each \(x\in \varvec{R}^{n}\),

$$\begin{aligned} \begin{array}{l} y_{1}(x)+y_{2}(x)(f(\hat{x})-f(x))+y_{3}(x)\tilde{g}_{1x^{(1)}}^{**}(x)+y_{4}(x)\tilde{g}_{1x^{(1)}}^{**}(x)(f(\hat{x})-f(x))\\ +\sum \limits _{j=1}^{n}a_{j}(x)(x_{j}^{2}-x_{j})+(f(\hat{x})-f(x))^{\gamma }=0. \end{array} \end{aligned}$$
(3)

then \(\hat{x}\) is a global optimal solution of \((P_{3})\).

Proof

Let \(\beta _{j}(x)=x_{j}^{2}-x_{j}, \alpha _{j}(x)=0, j=1,\dots ,n\). As in the proof of Theorem 3, let

$$\begin{aligned} P(x)=\tilde{g}_{1x^{(1)}}^{**}(x), Q(x)=(x_{1}^{2}-x_{1},\dots ,x_{n}^{2}-x_{n}), R(x)=f(\hat{x})-f(x), \end{aligned}$$

then exactly one of the following holds:

  1. (i)

    There exist some \(x \in \varvec{R}^{n}\) such that

    $$\begin{aligned} \tilde{g}_{1x^{(1)}}^{**}(x) \ge 0,\ x_{i}^{2}-x_{i}=0\ \text {for}\ i=1,\ldots ,n,\ \text {and}\ f(\hat{x})-f(x)>0; \end{aligned}$$
  2. (ii)

    \(w_{1}+w_{2}+w_{3}=0\) for some

    $$\begin{aligned}&w_{1}\in C\langle \tilde{g}_{1x^{(1)}}^{**}(x),f(\hat{x})-f(x)\rangle ,\\&\quad w_{2}\in I\langle x_{1}^{2}-x_{1},\dots ,x_{n}^{2}-x_{n}\rangle ,\\&\quad w_{3}\in M\langle f(\hat{x})-f(x)\rangle . \end{aligned}$$

    If (3) holds, then there exist \(\gamma \in \varvec{N} \cup \{0\}\),

    $$\begin{aligned} \begin{array}{ll} w_{1}&{}\in C\langle \tilde{g}_{1x^{(1)}}^{**}(x),f(\hat{x})-f(x)\rangle \\ {} &{}\;\;=\{y_{1}(x)+y_{2}(x)(f(\hat{x})-f(x))+y_{3}(x)\tilde{g}_{1x^{(1)}}^{**}(x)+y_{4}(x)\tilde{g}_{1x^{(1)}}^{**}(x)(f(\hat{x})-f(x))\\ &{}\;\;\;\;\;\;\;\vert y_{i}\ \text {is a SOS polynomial}, i=1,2,3,4\}, \end{array} \end{aligned}$$

    and

    $$\begin{aligned} w_{2}\in I\langle x_{1}^{2}-x_{1},\dots ,x_{n}^{2}-x_{n}\rangle =\{\sum \limits _{j=1}^{n}a_{j}(x)(x_{j}^{2}-x_{j})\vert a_{j} \in \varvec{R[x]}\}, \end{aligned}$$

    such that for each \(x\in \varvec{R}^{n}\), we have

    $$\begin{aligned} w_{1}(x)+w_{2}(x)+(f(\hat{x})-f(x))^{\gamma }=0. \end{aligned}$$

    So, (ii) holds, and hence (i) does not hold. Since \(\hat{x}\) is a feasible point of \((P_{3})\), from the proof of Theorem 3, every feasible point \(x\) satisfies

    $$\begin{aligned} P(x)\ge 0, Q(x)=0. \end{aligned}$$

    As (i) does not hold, we have \(f(\hat{x})-f(x)\le 0\) for every feasible \(x\), so \(\hat{x}\) is a global solution of \((P_{3})\). That completes the proof.

\(\square\)

4 Global optimality conditions for 0–1 quadratic problems

Global optimality conditions were proposed by Schichl and Neumaier [6] for problems in which the objective function and the constraint functions are polynomials. In 2011, Jeyakumar and Li [20] further deepened Schichl and Neumaier's theory: they proposed necessary global optimality conditions for optimization problems with a nonlinear objective function and polynomial constraints by putting forward the concept of an over-estimator to convert the optimality conditions for polynomial problems given by Schichl and Neumaier.

Here is the problem considered in [20],

$$\begin{aligned}(P_{4})\quad \left\{ \begin{array}{ll} \min &{}f(x) \\ \text {s.t.} &{}g_{i}(x) \ge 0,\quad i=1,\ldots ,k;\\ &{}h_{j}(x)=0,\quad j=1,\ldots ,l, \end{array}\right. \end{aligned}$$

where \(f\) is a twice continuously differentiable function on \(\varvec{R}^{n}\), and \(g_{i}\) and \(h_{j}\) are polynomial functions on \(\varvec{R}^{n}\). The feasible set of \((P_{4})\) is given by

$$\begin{aligned} F_{1}:=\{x\vert g_{i}(x) \ge 0, h_{j}(x)=0, i=1,\ldots ,k \ j=1,\ldots ,l\}. \end{aligned}$$

For \(\hat{x} \in \varvec{R}^{n}\), define \(\tilde{g}_{\hat{x}}:\varvec{R}^{n}\rightarrow \varvec{R}\) by

$$\begin{aligned} g(x):=p(x)-(\nabla p(\hat{x})-\nabla f(\hat{x}))^{T}x,\tilde{g}_{\hat{x}}(x)=g(\hat{x})-g(x). \end{aligned}$$

Lemma 1

[20] For the problem \((P_{4})\), suppose that there exist a convex set \(C \supseteq F_{1}\) and a polynomial \(p \in \varvec{R[x]}\), such that, for each \(x \in C\),

$$\begin{aligned} \nabla ^{2}f(x)-\nabla ^{2}p(x)\preceq 0. \end{aligned}$$

If \(\hat{x}\) is a global optimal solution of \((P_{4})\), then there exist

$$\begin{aligned} \gamma \in \varvec{N}, a_{j} \in \varvec{R[x]}, j=1,\ldots ,l, \end{aligned}$$

and

$$\begin{aligned} y_{i} \in SOS, u_{i} \in S\langle g_{1},\ldots ,g_{k},\tilde{g}_{\hat{x}}\rangle , i=1,\ldots ,2^{k+1}, \end{aligned}$$

such that \(\min \{y_{i}(\hat{x}),u_{i}(\hat{x})\}=0\) for \(i \in \{1,\ldots ,2^{k+1}\}\) and for each \(x\in \varvec{R}^{n}\),

$$\begin{aligned} \begin{array}{l} \ \sum \limits _{i=1}^{2^{k+1}}y_{i}(x)u_{i}(x)+\sum \limits _{j=1}^{l}a_{j}(x)h_{j}(x)+[\tilde{g}_{\hat{x}}(x)]^{\gamma }=0. \end{array} \end{aligned}$$
(4)

Lemma 1 is a necessary global optimality condition for optimization problems with a twice continuously differentiable objective function and polynomial constraints. In this section, we take the objective function in \((P_{4})\) to be \(f(x)=\frac{1}{2}x^{T}\varvec{Q}x+\varvec{b}^{T}x\).

Consider the following 0-1 quadratic problem

$$\begin{aligned}(P_{5})\quad \left\{ \begin{array}{ll} \min &{}f(x)=\frac{1}{2}x^{T}\varvec{Q}x+\varvec{b}^{T}x \\ \text {s.t.} &{}x \in \{0,1\}^{n}, \end{array}\right. \end{aligned}$$

where \(\varvec{Q}\) is a real symmetric matrix and \(\varvec{b} \in \varvec{R}^{n}\).
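Since the feasible set of \((P_{5})\) is finite, small instances can be solved exactly by enumeration. The following sketch (a naive reference solver of our own, used below only to cross-check the optimality conditions, not part of the theory) does this in Python:

```python
# Brute-force reference solver for (P5):
#   min (1/2) x^T Q x + b^T x  over  x in {0,1}^n.
# Exponential in n; intended only for cross-checking small examples.
import itertools
import numpy as np

def solve_p5(Q, b):
    n = len(b)
    best_x, best_f = None, np.inf
    for bits in itertools.product((0, 1), repeat=n):
        x = np.array(bits, dtype=float)
        f = 0.5 * x @ Q @ x + b @ x
        if f < best_f:
            best_x, best_f = x, f
    return best_x, best_f
```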

Theorem 4

Let \(\hat{x} \in \{0,1\}^{n}\). Then \(\hat{x}\) is a global optimal solution of \((P_{5})\) if and only if there exist

$$\begin{aligned} \gamma \in \varvec{N}, a_{j}(x) \in \varvec{R[x]},j=1,\ldots ,n, \end{aligned}$$

and SOS polynomials \(y_{1}(x)\) and \(y_{2}(x)\), such that

$$\begin{aligned}y_{1}(\hat{x})=0, \end{aligned}$$

and for each \(x \in \varvec{R}^{n}\),

$$\begin{aligned} \begin{array}{l} \ y_{1}(x)+y_{2}(x)(f(\hat{x})-f(x))+\sum \limits _{j=1}^{n}a_{j}(x)(x_{j}^{2}-x_{j})+(f(\hat{x})-f(x))^{\gamma }=0. \end{array} \end{aligned}$$
(5)

Proof

If \(\hat{x}\) is a global optimal solution of \((P_{5})\), then by Lemma 1, (4) holds: for problem \((P_{5})\), we can take the convex set

$$\begin{aligned} \varvec{R}^{n} \supseteq \{0,1\}^{n}, \end{aligned}$$

and

$$\begin{aligned} p(x)=f(x) \in \varvec{R[x]}, \end{aligned}$$

such that

$$\begin{aligned} \nabla ^{2}f(x)-\nabla ^{2}f(x)\preceq 0. \end{aligned}$$

The constraint \(x \in \{0,1\}^{n}\) can be written as

$$\begin{aligned} h_{i}=x_{i}^{2}-x_{i}=0, i=1,\ldots ,n. \end{aligned}$$

We denote

$$\begin{aligned} \tilde{g}_{\hat{x}}(x)=f(\hat{x})-f(x). \end{aligned}$$

Since there are no inequality constraints in \((P_{5})\), \(k=0\) and \(2^{k+1}=2\). So,

$$\begin{aligned} S\langle g_{1},\dots ,g_{k},\tilde{g}_{\hat{x}}\rangle =\{1,f(\hat{x})-f(x)\}, \end{aligned}$$

set \(u_{1}(x)=1,u_{2}(x)=f(\hat{x})-f(x)\). Since \(u_{1}(\hat{x})=1\) and \(u_{2}(\hat{x})=0\), the condition

$$\begin{aligned} \min \{y_{i}(\hat{x}),u_{i}(\hat{x})\}=0,\quad i=1,2, \end{aligned}$$

forces \(y_{1}(\hat{x})=0\).

By Lemma 1, (4) can be written as

$$\begin{aligned} \begin{array}{l} \ \sum \limits _{i=1}^{2}y_{i}(x)u_{i}(x)+\sum \limits _{j=1}^{n}a_{j}(x)h_{j}(x)+[\tilde{g}_{\hat{x}}(x)]^{\gamma }=0, \end{array} \end{aligned}$$
(6)

so, (5) is obtained.

Conversely, if (5) holds, then for each feasible point \(x \in \{0,1\}^{n}\),

$$\begin{aligned} y_{1}(x)+y_{2}(x)(f(\hat{x})-f(x))+(f(\hat{x})-f(x))^{\gamma }=0, \end{aligned}$$

Since the SOS polynomials satisfy \(y_{1}(x)\ge 0\) and \(y_{2}(x)\ge 0\), we must have \(f(\hat{x})-f(x)\le 0\); otherwise every term on the left-hand side would be nonnegative and the last one positive. This shows that

$$\begin{aligned} f(\hat{x})\le f(x) \end{aligned}$$

for each feasible point \(x \in \{0,1\}^{n}\). Thus, \(\hat{x}\) is a global optimal solution of \((P_{5})\). That completes the proof. \(\square\)

Theorem 5

If \(\hat{x}\) is a global optimal solution of \((P_{5})\), then one of the following two conditions is true:

  1. (i)

    there exist \(u=y_{2}(\hat{x})+1, \hat{v}_{1}=a_{1}(\hat{x}),\ldots ,\hat{v}_{n}=a_{n}(\hat{x})\), such that

    $$\begin{aligned} \sum \limits _{j=1}^{n}\nabla h_{j}(\hat{x})\hat{v}_{j}-\nabla f(\hat{x})u=0; \end{aligned}$$
  2. (ii)

    there exist \(u=y_{2}(\hat{x}), \hat{v}_{1}=a_{1}(\hat{x}),\ldots ,\hat{v}_{n}=a_{n}(\hat{x})\), such that

    $$\begin{aligned} \sum \limits _{j=1}^{n}\nabla h_{j}(\hat{x})\hat{v}_{j}-\nabla f(\hat{x})u=0. \end{aligned}$$

    where \(a_{j}(x), j=1,\dots ,n\), and \(y_{2}(x)\) are polynomials satisfying (5).

Proof

From Theorem 4, if \(\hat{x}\) is a global optimal solution of \((P_{5})\), then for each \(x\in \varvec{R}^{n}\) the identity (5) holds; differentiating it with respect to \(x\) yields

$$\begin{aligned} \begin{array}{l} y_{1}(x)+y_{2}(x)(f(\hat{x})-f(x))+\sum \limits _{j=1}^{n}a_{j}(x)(x_{j}^{2}-x_{j})+(f(\hat{x})-f(x))^{\gamma }=0, \end{array}\\ \begin{array}{l} \nabla y_{1}(x)+\nabla y_{2}(x)(f(\hat{x})-f(x))+y_{2}(x)(-\nabla f(x))+\sum \limits _{j=1}^{n}\nabla a_{j}(x)(x_{j}^{2}-x_{j})\\ +\sum \limits _{j=1}^{n}a_{j}(x){(0,\dots ,0,2x_{j}-1,0,\dots ,0)}^{T}+\gamma (f(\hat{x})-f(x))^{\gamma -1}(-\nabla f(x))=0. \end{array} \end{aligned}$$


Let \(x=\hat{x}\). If \(\gamma =1\), then

$$\begin{aligned} \begin{array}{l} \ \nabla y_{1}(\hat{x})+ y_{2}(\hat{x})(-\nabla f(\hat{x}))+\sum \limits _{j=1}^{n}a_{j}(\hat{x})(0,\dots ,0,2\hat{x}_{j}-1,0,\dots ,0)^{T}-\nabla f(\hat{x})=0. \end{array} \end{aligned}$$

If \(\gamma >1\), then

$$\begin{aligned} \begin{array}{l} \ \nabla y_{1}(\hat{x})+ y_{2}(\hat{x})(-\nabla f(\hat{x}))+\sum \limits _{j=1}^{n}a_{j}(\hat{x}) (0,\dots ,0,2\hat{x}_{j}-1,0,\dots ,0)^{T}=0. \end{array} \end{aligned}$$

By Theorem 4, \(y_{1}(\hat{x})=0\).

Since \(y_{1}(x)\) is a sum of squares polynomial,

$$\begin{aligned} y_{1}(x)=\sum \limits _{j=1}^{s}f_{j}^{2}(x). \end{aligned}$$

\(y_{1}(\hat{x})=0\) means for all \(1\le j\le s\),

$$\begin{aligned} f_{j}(\hat{x})=0. \end{aligned}$$

So,

$$\begin{aligned} \begin{array}{l} \nabla y_{1}(\hat{x})=\sum \limits _{j=1}^{s}2f_{j}(\hat{x})\nabla f_{j}(\hat{x})=0. \end{array} \end{aligned}$$

Thus, the conclusion is obtained. \(\square\)

Corollary 2

If \(\hat{x}\) is a global optimal solution of \((P_{5})\) and \((a_{1}(\hat{x}),\dots ,a_{n}(\hat{x}))\ne (0,\dots ,0)\), then one of the following two conditions is true:

  1. (i)

    If \(\gamma =1\) in (5), then the Lagrangian multipliers in the KKT conditions are

    $$\begin{aligned} v_{1}=-\frac{a_{1}(\hat{x})}{y_{2}(\hat{x})+1},\ldots ,v_{n}=-\frac{a_{n}(\hat{x})}{y_{2}(\hat{x})+1}; \end{aligned}$$
  2. (ii)

    If \(\gamma >1\) in (5), then the Lagrangian multipliers in the KKT conditions are

    $$\begin{aligned} v_{1}=-\frac{a_{1}(\hat{x})}{y_{2}(\hat{x})},\ldots ,v_{n}=-\frac{a_{n}(\hat{x})}{y_{2}(\hat{x})}. \end{aligned}$$

Proof

If \(\gamma =1\), then \(u=y_{2}(\hat{x})+1\). As \(y_{2}(x)\) is a SOS polynomial, \(y_{2}(x)\ge 0\) for each \(x \in \varvec{R}^{n}\), so

$$\begin{aligned} y_{2}(\hat{x})\ge 0, y_{2}(\hat{x})+1\ge 1. \end{aligned}$$

By Theorem 5, the identity

$$\begin{aligned} -\nabla f(\hat{x})u+\sum \limits _{j=1}^{n}\nabla h_{j}(\hat{x})\hat{v}_{j}=0, \end{aligned}$$

can be written as

$$\begin{aligned} \nabla f(\hat{x})+\sum \limits _{j=1}^{n}\nabla h_{j}(\hat{x})\left( -\frac{\hat{v}_{j}}{u}\right) =0, \end{aligned}$$

that is, Lagrangian multipliers in KKT conditions are

$$\begin{aligned} v_{i}=-\frac{\hat{v}_{i}}{u}=-\frac{a_{i}(\hat{x})}{y_{2}(\hat{x})+1},i=1,\dots ,n. \end{aligned}$$

If \(\gamma >1\), then \(u=y_{2}(\hat{x})\) and \(\nabla h_{i}(\hat{x})=(0,\dots ,0,2\hat{x}_{i}-1,0,\dots ,0)^{T}\).

For \(i=1,\dots ,n\), \(\nabla h_{i}(\hat{x})\) are linearly independent. Note that

$$\begin{aligned} u=y_{2}(\hat{x})>0. \end{aligned}$$

If \(u=y_{2}(\hat{x})=0\), then by the assumption \((a_{1}(\hat{x}),\dots ,a_{n}(\hat{x}))\ne (0,\dots ,0)\), the identity \(-\nabla f(\hat{x})u+\sum \limits _{j=1}^{n}\nabla h_{j}(\hat{x})\hat{v}_{j}=0\) would contradict the linear independence of the \(\nabla h_{j}(\hat{x})\). Thus, (ii) holds when \(\gamma >1\). That completes the proof. \(\square\)

For any real symmetric matrix \(\varvec{ Q}\), there exists an orthogonal matrix \(\varvec{ A}\) such that \(\varvec{ AQA}^{T}=\Lambda\), where \(\Lambda\) is the diagonal matrix whose entries \(\lambda _{i}\) \((i=1,\dots ,n)\) are the eigenvalues of \(\varvec{Q}\). Based on this result, we obtain the following theorems.

Theorem 6

Consider the problem \((P_{5})\) with \(\varvec{ b} =(b_{1},\ldots ,b_{n})^{T}\), and let \(\lambda = \underset{1\le i\le n}{\max }\{\vert \lambda _{i}\vert \}\). If \(b_{i}-\frac{1}{2}\lambda \ge 0,~i=1,\ldots ,n\), then the global optimal solution of \((P_{5})\) is \(x=0\).

Proof

Let \(x\in \varvec{ R}^{n}\) and \(\varvec{z}=\varvec{ A}x\); then

$$\begin{aligned} \varvec{z}^{T}\varvec{z}= & {} x^{T}\varvec{A}^{T}\varvec{A}x=x^{T}x=\sum \limits _{i=1}^{n}x_{i}^{2},\\ x^{T}\varvec{Q}x= & {} \varvec{z}^{T}\varvec{AQA}^{T}\varvec{z}=\sum \limits _{i=1}^{n}\lambda _{i}z_{i}^{2}. \end{aligned}$$

Set

$$\begin{aligned}&y_{1}(x)=\frac{1}{2}\sum \limits _{i=1}^{n}(\lambda _{i}+\lambda )z_{i}^{2}+\sum \limits _{i=1}^{n}(b_{i}-\frac{1}{2}\lambda )x_{i}^{2}, y_{2}(x)=0,\\&\quad \gamma =1, a_{i}(x)=-b_{i}, i=1,\ldots ,n. \end{aligned}$$

Since \(\lambda _{i}+\lambda \ge 0\) and \(b_{i}-\frac{1}{2}\lambda \ge 0, i=1,\ldots ,n\), \(y_{1}(x)\) is a SOS polynomial and \(y_{1}(0)=0\). Moreover,

$$\begin{aligned}&y_{1}(x)+y_{2}(x)(f(0)-f(x))+\sum \limits _{j=1}^{n}a_{j}(x) (x_{j}^{2}-x_{j})+(f(0)-f(x))\\&\quad =\frac{1}{2}\sum \limits _{i=1}^{n}(\lambda _{i}+\lambda )z_{i}^{2} +\sum \limits _{i=1}^{n}\left( b_{i}-\frac{1}{2}\lambda \right) x_{i}^{2} +\sum \limits _{j=1}^{n}(-b_{j})(x_{j}^{2}-x_{j})-\frac{1}{2}x^{T}\varvec{Q}x-\varvec{b}^{T}x\\&\quad =\frac{1}{2}\sum \limits _{i=1}^{n}(\lambda _{i}+\lambda )z_{i}^{2} -\frac{1}{2}\sum \limits _{i=1}^{n}\lambda x_{i}^{2}-\frac{1}{2}x^{T}\varvec{Q}x =0, \end{aligned}$$

where the last equality uses \(\sum \nolimits _{i=1}^{n}z_{i}^{2}=\sum \nolimits _{i=1}^{n}x_{i}^{2}\) and \(\sum \nolimits _{i=1}^{n}\lambda _{i}z_{i}^{2}=x^{T}\varvec{Q}x\). Thus, (5) holds with \(\hat{x}=0\).

From Theorem 4, \(0\) is the global optimal solution of \((P_{5})\). \(\square\)
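As a quick numeric sanity check of Theorem 6 (on hypothetical data of our own choosing, reusing the `solve_p5` sketch above):

```python
# Check Theorem 6 numerically: if b_i - lambda/2 >= 0 for all i, where
# lambda = max_i |lambda_i(Q)|, then x = 0 should be the global optimum.
import numpy as np

Q = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam = np.max(np.abs(np.linalg.eigvalsh(Q)))   # lambda = max |eigenvalue|
b = np.full(2, 0.5 * lam + 0.1)               # chosen so b_i - lambda/2 >= 0

assert np.all(b - 0.5 * lam >= 0)             # hypothesis of Theorem 6
print(solve_p5(Q, b))                         # -> (array([0., 0.]), 0.0)
```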

Theorem 7

Let \(\hat{x} \in \{0,1\}^{n}\) and \(\varvec{c}=\varvec{Q}\hat{x}+\varvec{b}+\lambda \hat{x}-\frac{1}{2}\lambda \varvec{e}\), where \(\varvec{e}=(1,\ldots ,1)^{T}\). If \(c_{i}\ge 0\) when \(\hat{x}_{i}=0\) and \(c_{i}\le 0\) when \(\hat{x}_{i}=1\), then \(\hat{x}\) is a global optimal solution of \((P_{5})\).

Proof

Let \(x\in \varvec{R}^{n}\), \(\varvec{z}=\varvec{A}x\), \(\varvec{\hat{z}}=\varvec{A}\hat{x}\), and set

$$\begin{aligned} y_{1}(x)= & {} \frac{1}{2}\sum \limits _{i=1}^{n}(\lambda _{i}+\lambda )(z_{i}-\hat{z}_{i})^{2}+\sum \limits _{\hat{x}_{i}=0}c_{i}x_{i}^{2}+\sum \limits _{\hat{x}_{i}=1}(-c_{i})(x_{i}-1)^{2},\\ y_{2}(x)= & {} 0,\quad \gamma =1, \end{aligned}$$

\(a_{i}(x)=-\frac{1}{2}\lambda -c_{i}\) when \(\hat{x}_{i}=0\); \(a_{i}(x)=-\frac{1}{2}\lambda +c_{i}\) when \(\hat{x}_{i}=1\).

As in the proof of Theorem 6, \(y_{1}(x)\) is a SOS polynomial, and

$$\begin{aligned} y_{1}(\hat{x})=\frac{1}{2}\sum \limits _{i=1}^{n}(\lambda _{i}+\lambda )(\hat{z}_{i}-\hat{z}_{i})^{2}+\sum \limits _{\hat{x}_{i}=0}c_{i}\hat{x}_{i}^{2}+\sum \limits _{\hat{x}_{i}=1}-c_{i}(\hat{x}_{i}-1)^{2}=0. \end{aligned}$$

So, (5) can be written as

$$\begin{aligned} \begin{array}{ll} &{}y_{1}(x)+\sum \limits _{\hat{x}_{i}=0}\left( -\frac{1}{2}\lambda -c_{i}\right) (x_{i}^{2}-x_{i})+\sum \limits _{\hat{x}_{i}=1}\left( -\frac{1}{2}\lambda +c_{i}\right) (x_{i}^{2}-x_{i})+\frac{1}{2}\hat{x}^{T}\varvec{Q}\hat{x}+\varvec{b}^{T}\hat{x} \\ &{}-\frac{1}{2}x^{T}\varvec{Q}x-\varvec{b}^{T}x \\ =&{}\sum \limits _{i=1}^{n}\left( c_{i}x_{i}+\frac{1}{2}\lambda x_{i}\right) -\sum \limits _{\hat{x}_{i}=1}c_{i}+\hat{x}^{T}\varvec{Q}\hat{x}+\varvec{b}^{T}\hat{x}-\varvec{b}^{T}x+\frac{1}{2}\sum \limits _{i=1}^{n}\lambda \hat{z}_{i}^{2}-\sum \limits _{i=1}^{n}\lambda z_{i}\hat{z}_{i} \\ &{}-\sum \limits _{i=1}^{n}\lambda _{i}z_{i}\hat{z}_{i} \\ =&{}\hat{x}^{T}\varvec{Q}\hat{x}+\varvec{b}^{T}\hat{x}+\frac{1}{2}\sum \limits _{i=1}^{n}\lambda \hat{z}_{i}^{2}-\sum \limits _{\hat{x}_{i}=1}c_{i} \\ =&{}0. \end{array} \end{aligned}$$

Thus, by Theorem 4, \(\hat{x}\) is a global optimal solution of \((P_{5})\). \(\square\)

Example

We consider the problem \((P_{5})\) with \({{\varvec{Q}}} = \left( \begin{array}{cccccc} -2 &{} 3 &{} 5 \\ 3 &{} -8 &{} -1 \\ 5 &{} -1 &{} 9 \\ \end{array} \right) ,\)

$$\begin{aligned} {{\varvec{b}}}=(1,2,-14)^{T}. \end{aligned}$$

The eigenvalues of \(\varvec{Q}\) are \(\lambda _{1}=10.9343\), \(\lambda _{2}=-2.2102\), \(\lambda _{3}=-9.7241\), so \(\lambda =10.9343\). For the feasible point \(\hat{x}=(0,1,1)^{T}\),

$$\begin{aligned} \varvec{c}=\varvec{Q}\hat{x}+\varvec{b}+\lambda \hat{x}-\frac{1}{2}\lambda \varvec{e}=(3.5328,-1.5328,-0.5328)^{T}, \end{aligned}$$

which satisfies \(c_{i}\ge 0\) when \(\hat{x}_{i}=0\) and \(c_{i}\le 0\) when \(\hat{x}_{i}=1\); from Theorem 7, \(\hat{x}=(0,1,1)^{T}\) is a global optimal solution.

On the other hand, the sufficient condition of Theorem 7 fails at \(x_{0}=(1,1,1)^{T}\):

$$\begin{aligned} \varvec{c}=(12.46715,1.46715,4.46715)^{T}, \end{aligned}$$

so \(x_{0i}=1\) but \(c_{i}>0\) for every \(i\). Indeed, \(x_{0}\) is not a global optimal solution, since \(f(x_{0})=-4.5>f(\hat{x})=-12.5\).
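The numbers in this example can be reproduced with a short script; the sketch below checks the condition of Theorem 7 at \(\hat{x}\) and confirms the global optimum by enumeration (again reusing the `solve_p5` sketch above):

```python
# Reproduce the example: Theorem 7's condition at x-hat = (0,1,1)^T,
# then confirm global optimality by enumeration.
import numpy as np

Q = np.array([[-2.0,  3.0,  5.0],
              [ 3.0, -8.0, -1.0],
              [ 5.0, -1.0,  9.0]])
b = np.array([1.0, 2.0, -14.0])
lam = np.max(np.abs(np.linalg.eigvalsh(Q)))   # lambda = 10.9343...
xhat = np.array([0.0, 1.0, 1.0])

c = Q @ xhat + b + lam * xhat - 0.5 * lam * np.ones(3)
print(c)                  # -> approx ( 3.5328, -1.5328, -0.5328)

# Theorem 7: c_i >= 0 where x-hat_i = 0 and c_i <= 0 where x-hat_i = 1.
print(np.all(c[xhat == 0] >= 0) and np.all(c[xhat == 1] <= 0))  # -> True
print(solve_p5(Q, b))     # -> (array([0., 1., 1.]), -12.5), matching x-hat
```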

In Theorem 7, the conditions \(c_{i}\ge 0\) when \(\hat{x}_{i}=0\) and \(c_{i}\le 0\) when \(\hat{x}_{i}=1\) amount to

$$\begin{aligned} (2\hat{x}_{i}-1)c_{i}\le 0, 1\le i\le n. \end{aligned}$$

So, with \(\hat{X}=Diag(\hat{x}_{1},\ldots ,\hat{x}_{n})\), the conditions in Theorem 7 can be expressed as

$$\begin{aligned} (2\hat{X}-\varvec{I})\varvec{c}=(2\hat{X}-\varvec{I})\left( \varvec{Q}\hat{x}+\varvec{b}+\lambda \hat{x}-\frac{1}{2}\lambda \varvec{e}\right) \le 0. \end{aligned}$$

Since \((2\hat{x}_{i}-1)^{2}=1\), we have \((2\hat{X}-\varvec{I})(\lambda \hat{x}-\frac{1}{2}\lambda \varvec{e})=\frac{1}{2}\lambda \varvec{e}\), so this is equivalent to

$$\begin{aligned} 2(2\hat{X}-\varvec{I})(\varvec{Q}\hat{x}+\varvec{b})\le -\lambda \varvec{e}. \end{aligned}$$

Thus, Theorem 7 can also be written as follows.

Theorem 8

Let \(\hat{x} \in \{0,1\}^{n}\), \(\lambda = \underset{1\le i\le n}{\max }\{\vert \lambda _{i}\vert \}\) and \(\hat{X}=Diag(\hat{x}_{1},\ldots ,\hat{x}_{n})\). If

$$\begin{aligned} \begin{array}{l}\ 2(2\hat{X}-\varvec{I})(\varvec{Q}\hat{x}+\varvec{b})\le -\lambda \varvec{e},\end{array} \end{aligned}$$
(7)

then \(\hat{x}\) is a global optimal solution of \((P_{5})\).

In [14], the authors presented a sufficient global optimality condition for \((P_{5})\).

Theorem 9

[14] Let \(\hat{x} \in \{0,1\}^{n}\), \(\lambda _{n}=\underset{1\le i\le n}{\min }\{\lambda _{i} \}\), \(\lambda _{1}=\underset{1\le i \le n}{\max }\{\lambda _{i}\}\). If

$$\begin{aligned} \begin{array}{l}\ 2(2\hat{X}-\varvec{I})(\varvec{Q}\hat{x}+\varvec{b})\le \lambda _{n}\varvec{e},\end{array} \end{aligned}$$
(8)

then \(\hat{x}\) is a global optimal solution of \((P_{5})\).
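For completeness, conditions (7) and (8) can be transcribed directly into checkers; a sketch:

```python
# Checkers for the sufficient conditions (7) (Theorem 8) and (8) (Theorem 9),
# transcribed directly from the matrix inequalities in the text.
import numpy as np

def cond7(Q, b, xhat):
    lam = np.max(np.abs(np.linalg.eigvalsh(Q)))   # lambda = max_i |lambda_i|
    D = np.diag(2.0 * xhat - 1.0)                 # 2*X-hat - I
    return bool(np.all(2.0 * D @ (Q @ xhat + b) <= -lam))

def cond8(Q, b, xhat):
    lam_n = np.min(np.linalg.eigvalsh(Q))         # lambda_n = min_i lambda_i
    D = np.diag(2.0 * xhat - 1.0)
    return bool(np.all(2.0 * D @ (Q @ xhat + b) <= lam_n))
```

On the example above, both `cond7(Q, b, xhat)` and `cond8(Q, b, xhat)` return `True` at \(\hat{x}=(0,1,1)^{T}\), consistent with Theorem 10 below.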

There is a close relationship between Theorem 8 and Theorem 9.

Theorem 10

Let \(\hat{x} \in \{0,1\}^{n}\), \(\lambda = \underset{1\le i\le n}{\max }\{\vert \lambda _{i}\vert \}\), \(\lambda _{n}=\underset{1\le i\le n}{\min }\{\lambda _{i} \}\),

$$\begin{aligned} \lambda _{1}=\underset{1\le i \le n}{\max }\{\lambda _{i}\},\quad \hat{X}=Diag(\hat{x}_{1},\ldots ,\hat{x}_{n}). \end{aligned}$$

If (7) holds, then (8) holds.

Proof

If \(\lambda _{1}\ge \dots \ge \lambda _{n}\ge 0\), then \(\lambda =\lambda _{1}\).

If \(0\ge \lambda _{1}\ge \dots \ge \lambda _{n}\), then \(\lambda =-\lambda _{n}\).

If \(\lambda _{1}\ge \dots \ge 0\ge \dots \ge \lambda _{n}\) and \(\lambda _{1}\ge -\lambda _{n}\), then \(\lambda =\lambda _{1}\).

If \(\lambda _{1}\ge \dots \ge 0\ge \dots \ge \lambda _{n}\) and \(\lambda _{1}\le -\lambda _{n}\), then \(\lambda =-\lambda _{n}\).

Thus, \(\lambda _{1}\ge -\lambda _{n}\) when \(\lambda =\lambda _{1}\), and \(\lambda _{1}\le -\lambda _{n}\) when \(\lambda =-\lambda _{n}\).

Case 1: if \(\lambda =-\lambda _{n}\), then (7) and (8) coincide. Case 2: if \(\lambda =\lambda _{1}\), then

$$\begin{aligned} 2(2\hat{X}-\varvec{I})(\varvec{Q}\hat{x}+\varvec{b})\le -\lambda \varvec{e}=-\lambda _{1}\varvec{e}\le \lambda _{n}\varvec{e}. \end{aligned}$$

The theorem is proved. \(\square\)

5 Conclusion

For nonlinear programming problems whose constraint functions are not real polynomials, this paper gives some global optimality conditions. Necessary global optimality conditions for nonlinear programming problems with non-polynomial constraint functions are proposed, and sufficient global optimality conditions for problems with a polynomial objective function and non-polynomial constraint functions are developed. In particular, necessary and sufficient global optimality conditions for 0-1 quadratic problems are discussed. The 0-1 quadratic problems can be studied further in depth: when the exponent \(\gamma\) is not too large, one can examine the \(y_{1}(x),y_{2}(x),a_{j}(x),j=1,\ldots ,n\), corresponding to \(\gamma =1,2,3\), and ask whether the corresponding \(y_{1}(x),y_{2}(x),a_{j}(x)\) have the same construction when \(\gamma\) is odd and when \(\gamma\) is even. If not, since the general situation only arises when \(\gamma\) is large, one can impose \(\gamma \le k, k\in \varvec{N}\), as a rule, in which case no \(y_{1}(x),y_{2}(x),a_{j}(x),j=1,\ldots ,n\), satisfy the global optimality conditions for any feasible point.

This work was supported by the Research Fund of Fundamentals Department of Air Force Engineering University (JK2022204).