1 Introduction

Let \(f(x,y)\) be a continuous function in \((x,y)\in X \times Y\), where \( X \subseteq \mathbb {R}^n\) and \( Y \subseteq \mathbb {R}^m\) are two sets (n and m are positive integers). A pair \((x^*,y^*)\in X \times Y\) is said to be a saddle point of \(f(x,y)\) over \(X \times Y\) if

$$\begin{aligned} f(x^*,y) \leqslant f(x^*,y^*) \leqslant f(x,y^*), ~ \forall (x,y) \in X \times Y. \end{aligned}$$
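For example, take \(f(x,y)=x^2-y^2\) with \(X=Y=\mathbb {R}\). Then \((x^*,y^*)=(0,0)\) is a saddle point, since

$$\begin{aligned} f(0,y) = -y^2 \leqslant 0 = f(0,0) \leqslant x^2 = f(x,0), \quad \forall (x,y) \in \mathbb {R}\times \mathbb {R}. \end{aligned}$$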

Saddle point problems arise in a wide range of fields, such as game theory and equilibrium theory [25, 27, 28, 41], robust optimization [3], optimal control problems [44], generative adversarial nets in deep learning [15], etc. We refer to [5] for the basic theory of saddle point problems.

There is much work on saddle point problems based on subgradients [30], primal-dual methods [8, 17], perturbations [20], variational inequalities [9, 31], the Uzawa method [13], Krylov subspace methods [40], splitting iterations [2], preconditioners for saddle point problems [4, 7, 12, 16] and other approaches. These classical methods focus on convex-concave saddle point problems. Dauphin, Pascanu, Gulcehre, Cho, Ganguli and Bengio studied saddle point problems in non-convex optimization [11, 39]. Nie, Yang and Zhou [38] proposed a numerical method for computing saddle points in polynomial optimization. Zhou, Wang and Zhao [45] proposed an approach for computing saddle points of rational functions. The polynomials or rational functions in [38, 45] are not limited to the convex-concave type. Recently, Adolphs, Daneshmand, Lucchi and Hofmann [1] defined local saddle points.

Definition 1.1

A pair \((x^*,y^*)\in X \times Y\) is said to be a local saddle point of the continuous function \(f(x,y)\) if there exists \(\tau >0\) such that

$$\begin{aligned} f(x^*,y) \leqslant f(x^*,y^*) \leqslant f(x,y^*) \end{aligned}$$

for all \((x,y) \in (X\cap B(x^*,\tau )) \times (Y \cap B(y^*,\tau ))\), where

$$\begin{aligned} B(x^*,\tau ) := \{ x\in \mathbb {R}^n~|~\left\| x-x^* \right\| ^2 \leqslant \tau \}, \quad B(y^*,\tau ) := \{ y\in \mathbb {R}^m~|~\left\| y-y^* \right\| ^2 \leqslant \tau \}. \end{aligned}$$

In Definition 1.1, let \(f(x,y)\in \mathbb {R}[x,y]\) be a polynomial in \((x,y)\in X \times Y\), where \(X:=\mathbb {R}^n\) and \(Y:=\mathbb {R}^m\). If there exist \(\tau >0\) and \((x^*,y^*)\in \mathbb {R}^n\times \mathbb {R}^m\) such that

$$\begin{aligned} f(x^*,y) \leqslant f(x^*,y^*) \leqslant f(x,y^*) \end{aligned}$$
(1.1)

for all \((x,y) \in B(x^*,\tau ) \times B(y^*,\tau )\) (resp., \((x,y) \in \mathbb {R}^n\times \mathbb {R}^m\)), we say \((x^*,y^*)\) is a local (resp., global) saddle point of the polynomial \(f(x,y)\).

In this paper, we focus on computing local saddle points in unconstrained polynomial optimization. Suppose that a polynomial has finitely many local saddle points. We propose an algorithm, based on second-order optimality conditions and Lasserre's hierarchy of relaxations, for computing all of them and the corresponding local saddle point values. If a polynomial does not have local saddle points, the algorithm can detect this nonexistence. In addition, under the assumption that the polynomial has finitely many critical points, the algorithm terminates after finitely many iterations.

The rest of this paper is organized as follows. We review the basic theory of polynomial optimization in Sect. 2. In Sect. 3, after introducing the second-order optimality conditions and Lasserre's hierarchy of relaxations, we give an algorithm for computing local saddle points in unconstrained polynomial optimization. Convergence results and their proofs are given in Sect. 4. In Sect. 5, some examples illustrate the effectiveness of the algorithm. Conclusions and discussions are given in Sect. 6.

2 Preliminaries

2.1 Notation

Let \(\mathbb {N}\), \(\mathbb {N}^+\), \(\mathbb {R}\) and \(\mathbb {C}\) be the set of nonnegative integers, positive integers, real numbers and complex numbers, respectively. For \(n,m\in \mathbb {N}^+\), \(\mathbb {R}^n\) and \(\mathbb {R}^m\) denote the n- and m-dimensional Euclidean spaces, respectively. \(\mathbb {R}[x]:=\mathbb {R}[x_1,x_2,\dots ,x_n]\) is the ring of polynomials in \(x:=(x_1,x_2,\dots ,x_n)\) with real coefficients, and \(\mathbb {R}[x]_d\) is the set of polynomials in \(\mathbb {R}[x]\) of degree at most d. \(\mathbb {R}[x,y]\) and \(\mathbb {R}[x,y]_d\) are defined similarly. For \(n\in \mathbb {N}^+\), let \(\mathbb {N}_d^n:=\{ \alpha \in \mathbb {N}^n~|~\alpha _1+ \alpha _2+\dots +\alpha _n\leqslant d \}\). The symbol \(\deg (f)\) stands for the degree of the polynomial f. The norm \(\left\| \cdot \right\| \) is the standard Euclidean norm. \(\left\lceil k \right\rceil \) denotes the smallest integer not less than k. We denote by \({{\,\mathrm{tr}\,}}(A)\) the trace of a matrix A. The superscript \({{\,\mathrm{T}\,}}\) means the transpose of a matrix or vector. We write \(A\succ 0\), \(A\succeq 0\), \(A\prec 0\) and \(A\preceq 0\) to express that the matrix A is positive definite, positive semidefinite, negative definite and negative semidefinite, respectively. The symbol \({{\,\mathrm{diag}\,}}(D_1,D_2,\dots ,D_n)\) denotes the block diagonal matrix whose diagonal square blocks are \(D_1, D_2,\dots ,D_n\). \(\nabla _{x}f=(f_{x_1},f_{x_2},\dots ,f_{x_n})\) denotes the gradient vector of the polynomial \(f(x,y)\) in \(x:=(x_1,x_2,\dots ,x_n)\), where \(f_{x_i}\) is the partial derivative of \(f(x,y)\) with respect to \(x_i\). \(\nabla _yf\) and \(f_{y_i}\) are defined similarly. \(\nabla _x^2f\) (resp., \(\nabla _y^2f\)) stands for the Hessian matrix of the polynomial \(f(x,y)\) in x (resp., y).

For \(x=(x_1,x_2,\dots ,x_n)\in \mathbb {R}^n\), \(\alpha =(\alpha _1,\alpha _2,\dots ,\alpha _n)\in \mathbb {N}^n\) and \(d\in \mathbb {N}\), we define the following symbols

$$\begin{aligned} x^\alpha := x_1^{\alpha _1}x_2^{\alpha _2}\dots x_n^{\alpha _n}, \quad [x]_d := (1,x_1,x_2,\dots ,x_n,x_1x_2,x_1x_3,\dots ,x_n^d)^{{{\,\mathrm{T}\,}}}. \end{aligned}$$

A polynomial \(p(x)\in \mathbb {R}[x]_d\) can be written as

$$\begin{aligned} p(x) = \sum _{\alpha \in \mathbb {N}_d^n}p_\alpha x^\alpha =vec(p)^{{{\,\mathrm{T}\,}}}[x]_d, \end{aligned}$$

where vec(p) means the column coefficient vector of the polynomial p(x) with respect to the basis \([x]_d\).
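For instance, when \(n=1\) and \(d=2\), the polynomial \(p(x)=3+2x-x^2\) has

$$\begin{aligned} [x]_2 = (1,x,x^2)^{{{\,\mathrm{T}\,}}}, \quad vec(p) = (3,2,-1)^{{{\,\mathrm{T}\,}}}. \end{aligned}$$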

2.2 Sum of squares, ideals, quadratic modules and moments

A polynomial s(x) is a sum of squares (SOS) if it can be written as \(s(x)=s_1^2(x)+s_2^2(x)+\dots +s_l^2(x)\) for some polynomials \(s_1(x),s_2(x),\dots ,s_l(x) \in \mathbb {R}[x]\). The symbol \(\Sigma [x]\) denotes the set of SOS polynomials, and its k-th truncation is \(\Sigma [x]_k:=\Sigma [x]\cap \mathbb {R}[x]_k\). Obviously, if a polynomial is SOS, then it is nonnegative everywhere, but the converse is not necessarily true.
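For example, \(s(x)=(x_1-x_2)^2+(x_1x_2-1)^2\) is SOS and hence nonnegative, whereas the Motzkin polynomial \(x_1^4x_2^2+x_1^2x_2^4-3x_1^2x_2^2+1\) is nonnegative on \(\mathbb {R}^2\) (by the arithmetic-geometric mean inequality) but is not SOS.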

A subset \(I \subset \mathbb {R}[x]\) is an ideal if it satisfies \(f+g\in I\) for all \(f \in I,g \in I\) and \(fh \in I \) for all \(f \in I,h \in \mathbb {R}[x]\). For a tuple \(p=(p_1,p_2,\dots ,p_{l_1})\) of polynomials in \(\mathbb {R}[x]\), \({{\,\mathrm{I}\,}}(p)\) denotes the smallest ideal which contains all \(p_i\), i.e., \({{\,\mathrm{I}\,}}(p)\) is defined as

$$\begin{aligned} {{\,\mathrm{I}\,}}(p) := \{ p_1h_1+p_2h_2+\dots +p_{l_1}h_{l_1}~|~ h_i \in \mathbb {R}[x], ~i = 1,2,\dots ,l_1 \}. \end{aligned}$$

We often need its truncation in computation, which is defined as

$$\begin{aligned} {{\,\mathrm{I}\,}}_{2k}(p) := \{&p_1h_1+p_2h_2+\dots +p_{l_1}h_{l_1}~|~h_i \in \mathbb {R}[x], \\&\deg (p_ih_i) \leqslant 2k, ~i = 1,2,\dots ,l_1 \}. \end{aligned}$$

The real and complex varieties of an ideal \(I\subseteq \mathbb {R}[x]\) are defined respectively as

$$\begin{aligned}&{{\,\mathrm{V}\,}}_{\mathbb {R}}(I) := \{ u \in \mathbb {R}^n~|~f(u)=0, ~\forall f \in I \}, \\&{{\,\mathrm{V}\,}}_{\mathbb {C}}(I) := \{ v \in \mathbb {C}^n~|~f(v)=0, ~\forall f \in I \}. \end{aligned}$$

For a tuple \(q=(q_1,q_2,\dots ,q_{l_2})\) of polynomials in \(\mathbb {R}[x]\), the quadratic module of q is defined as

$$\begin{aligned} {{\,\mathrm{Q}\,}}(q) := \{ s_0+s_1q_1+s_2q_2+\dots +s_{l_2}q_{l_2}~|~ s_i \in \Sigma [x], ~i = 0,1,2,\dots ,l_2 \}. \end{aligned}$$

The truncation of quadratic module \({{\,\mathrm{Q}\,}}(q)\) is defined as

$$\begin{aligned} {{\,\mathrm{Q}\,}}_k(q) := \{&s_0+s_1q_1+s_2q_2+\dots +s_{l_2}q_{l_2}~|~ s_0 \in \Sigma [x]_{2k},\\&s_i \in \Sigma [x]_{2k-\deg (q_i)},~ i = 1,2,\dots ,l_2 \}. \end{aligned}$$

Let \(\mathbb {R}^{\mathbb {N}_d^n}\) be the space of real sequences indexed by \(\alpha \in \mathbb {N}_d^n\). A truncated multi-sequence (tms) \(z \in \mathbb {R}^{\mathbb {N}_d^n}\), labelled as \((z)_d:=(z_{\alpha })_{\alpha \in \mathbb {N}_d^n}\), gives a Riesz linear functional such that

$$\begin{aligned} {\mathscr {L}}_z: ~f(x) = \sum _{\alpha \in \mathbb {N}_d^n}f_{\alpha }x^{\alpha } \quad \longmapsto \quad {\mathscr {L}}_z(f)=\sum _{\alpha \in \mathbb {N}_d^n}f_{\alpha }z_{\alpha }. \end{aligned}$$

Let \(u(x)\in \mathbb {R}[x]\) with \(\deg (u)\leqslant 2k\). The k-th localizing matrix of u(x) generated by a tms \(z\in \mathbb {R}^{\mathbb {N}_{2k}^n}\) is the symmetric matrix \(L_u^{(k)}(z)\) satisfying

$$\begin{aligned} {\mathscr {L}}_z(uab)=vec(a)^{{{\,\mathrm{T}\,}}} \left( L_u^{(k)}(z) \right) vec(b) \end{aligned}$$

for all \(a(x),b(x)\in \mathbb {R}[x]_{k-\left\lceil \deg (u)/2 \right\rceil }\). For example, when \(n=2\), \(k=2\) and \(u(x)=2-3x_1x_2\), for \(z\in \mathbb {R}^{\mathbb {N}_4^2}\), we have

$$\begin{aligned} L_u^{(2)}(z) = \begin{bmatrix} 2z_{00}-3z_{11} &{} 2z_{10}-3z_{21} &{} 2z_{01}-3z_{12} \\ 2z_{10}-3z_{21} &{} 2z_{20}-3z_{31} &{} 2z_{11}-3z_{22} \\ 2z_{01}-3z_{12} &{} 2z_{11}-3z_{22} &{} 2z_{02}-3z_{13} \end{bmatrix}. \end{aligned}$$

When \(u(x)=1\), \(L_u^{(k)}(z)\) is called the moment matrix, and it is denoted as

$$\begin{aligned} M_k(z) := L_1^{(k)}(z). \end{aligned}$$

For example, when \(n=2\) and \(k=1\), for \(z \in \mathbb {R}^{\mathbb {N}_2^2}\), we have

$$\begin{aligned} M_1(z) = \begin{bmatrix} z_{00} &{} z_{10} &{} z_{01} \\ z_{10} &{} z_{20} &{} z_{11} \\ z_{01} &{} z_{11} &{} z_{02} \\ \end{bmatrix}. \end{aligned}$$
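To make these constructions concrete, the following sketch (ours, not from the paper; the helper names are our own) assembles \(L_u^{(k)}(z)\) and \(M_k(z)\) from a tms stored as a dictionary of moments, using the defining identity \({\mathscr {L}}_z(uab)=vec(a)^{{{\,\mathrm{T}\,}}} L_u^{(k)}(z)\, vec(b)\): the entry indexed by the monomials \(x^{\alpha }\) and \(x^{\gamma }\) is \(\sum _{\beta } u_{\beta } z_{\alpha +\beta +\gamma }\).

```python
from itertools import combinations_with_replacement
import numpy as np

def monomials(n, d):
    """All exponent tuples alpha in N^n with |alpha| <= d, listed by degree."""
    out = []
    for deg in range(d + 1):
        for c in combinations_with_replacement(range(n), deg):
            alpha = [0] * n
            for i in c:
                alpha[i] += 1
            out.append(tuple(alpha))
    return out

def localizing_matrix(u, z, n, k):
    """u, z: dicts mapping exponent tuples to coefficients / moments."""
    deg_u = max(sum(beta) for beta in u)
    basis = monomials(n, k - (deg_u + 1) // 2)   # degree <= k - ceil(deg(u)/2)
    L = np.empty((len(basis), len(basis)))
    for i, a in enumerate(basis):
        for j, b in enumerate(basis):
            L[i, j] = sum(c * z[tuple(ai + bi + wi for ai, bi, wi in zip(a, b, beta))]
                          for beta, c in u.items())
    return L

# The two examples from the text: n = 2, u = 2 - 3*x1*x2 with k = 2, and M_1(z).
z = {alpha: 1.0 for alpha in monomials(2, 4)}    # a toy tms with all moments = 1
print(localizing_matrix({(0, 0): 2.0, (1, 1): -3.0}, z, n=2, k=2))
print(localizing_matrix({(0, 0): 1.0}, z, n=2, k=1))   # the moment matrix M_1(z)
```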

Let H be an \(l\times l\) symmetric matrix, each entry \(H_{ij}\) of which is a polynomial in \(\mathbb {R}[x]\). The k-th localizing matrix of H generated by the tms \(z \in \mathbb {R}^{\mathbb {N}_{2k}^n}\) is the symmetric block matrix \(L_{H}^{(k)}(z)\), which is defined as

$$\begin{aligned} L_{H}^{(k)}(z) := \big (L_{H_{ij}}^{(k)}(z)\big )_ {1 \leqslant i \leqslant l, 1 \leqslant j \leqslant l}, \end{aligned}$$

and each block \(L_{H_{ij}}^{(k)}(z)\) is a standard localizing matrix of the polynomial \(H_{ij}\).

Let \(\Sigma [x]^{l\times l}\) be the cone of all finite sums \(s_1s_1^{{{\,\mathrm{T}\,}}}+s_2s_2^{{{\,\mathrm{T}\,}}}+\dots +s_rs_r^{{{\,\mathrm{T}\,}}}\) with \(s_1,s_2,\dots ,s_r\in \mathbb {R}[x]^l\). When \(l=1\), \(\Sigma [x]^{l\times l}\) reduces to \(\Sigma [x]\). The quadratic module of H is

$$\begin{aligned} {{\,\mathrm{Q}\,}}(H) := \Sigma [x] + \left\{ {{\,\mathrm{tr}\,}}(HS)~|~ S \in \Sigma [x]^{l\times l} \right\} . \end{aligned}$$

The k-th truncation of \({{\,\mathrm{Q}\,}}(H)\) is defined as

$$\begin{aligned} {{\,\mathrm{Q}\,}}_k(H) := \Sigma [x]_{2k} + \left\{ {{\,\mathrm{tr}\,}}(HS) ~\left| ~ \begin{array}{l} S \in \Sigma [x]^{l \times l}, ~~\deg (H_{ij}S_{ij}) \leqslant 2k, \\ \forall ~1 \leqslant i \leqslant l, ~1 \leqslant j \leqslant l \end{array} \right. \right\} . \end{aligned}$$

If there exists \(r(x)\in {{\,\mathrm{I}\,}}(p)+{{\,\mathrm{Q}\,}}(q)+{{\,\mathrm{Q}\,}}(H)\) such that \(\{x~|~r(x)\geqslant 0\}\) is a compact set in \(\mathbb {R}^n\), then \({{\,\mathrm{I}\,}}(p)+{{\,\mathrm{Q}\,}}(q)+{{\,\mathrm{Q}\,}}(H)\) is said to be archimedean. If \({{\,\mathrm{I}\,}}(p)+{{\,\mathrm{Q}\,}}(q)+{{\,\mathrm{Q}\,}}(H)\) is archimedean, then the set \(\mathbb {K}:=\{x~|~p_1(x)=0,\dots ,p_{l_1}(x)=0,~q_1(x)\geqslant 0,\dots , q_{l_2}(x)\geqslant 0,~H \succeq 0 \}\) is compact. The converse is in general not true. However, if \(\mathbb {K}\) is compact and \(\mathbb {K}\subseteq \{x~|~N-\left\| x \right\| ^2 \geqslant 0\}\) for some \(N \geqslant 0\), then \({{\,\mathrm{I}\,}}(p)+{{\,\mathrm{Q}\,}}(q,N-\left\| x \right\| ^2)+{{\,\mathrm{Q}\,}}(H)\) is archimedean. If \({{\,\mathrm{I}\,}}(p)+{{\,\mathrm{Q}\,}}(q)+{{\,\mathrm{Q}\,}}(H)\) is archimedean and the polynomial \(\psi (x)\in \mathbb {R}[x]\) is strictly positive on \(\mathbb {K}\), then \(\psi (x) \in {{\,\mathrm{I}\,}}(p)+{{\,\mathrm{Q}\,}}(q)+{{\,\mathrm{Q}\,}}(H)\). Polynomial optimization is closely related to truncated moment problems, and optimizers can be extracted from Lasserre's hierarchy of relaxations. We refer to [14, 33, 37] for details.

3 Computation of local saddle points

3.1 Optimality conditions

There exists classical work on second-order optimality conditions for unconstrained optimization. Let \(\phi (x)\) be a polynomial in x. If there exist \(\tau >0\) and \(x^*\) such that \(\phi (x^*)\) is the smallest value of \(\phi (x)\) on \(B(x^*,\tau )\), then \(x^*\) is a local minimizer of \(\phi (x)\) and \(\phi (x^*)\) is a local minimum. If \(x^*\) is a local minimizer of the polynomial \(\phi (x)\), then

$$\begin{aligned} \nabla _x\phi (x^*)=0, \quad \nabla _x^2\phi (x^*)\succeq 0. \end{aligned}$$

Conversely, if there exists \(x^*\) such that

$$\begin{aligned} \nabla _x\phi (x^*)=0, \quad \nabla _x^2\phi (x^*)\succ 0, \end{aligned}$$

then \(x^*\) is a local minimizer of the polynomial \(\phi (x)\).
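The gap between these two conditions can be seen already in one variable: \(\phi (x)=x^4\) has a local (indeed global) minimizer at \(x^*=0\) although \(\nabla _x^2\phi (0)=0\) is only positive semidefinite, while \(\phi (x)=x^3\) satisfies the necessary condition \(\nabla _x\phi (0)=0\), \(\nabla _x^2\phi (0)\succeq 0\) at \(x^*=0\) but does not have a local minimizer there.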

We detect local saddle points by the second-order optimality conditions. If the polynomial \(f(x,y)\) has local saddle points, then it has finitely many local saddle point values. This is shown in Proposition 4.1.

Suppose that \((x',y')\) is a minimizer of the problem

$$\begin{aligned} {\left\{ \begin{array}{ll} f_1 := \min &{} f(x,y) \\ \text {subject to}&{} \nabla _xf(x,y) = 0, ~~ \nabla _x^2f(x,y) \succeq 0, \\ &{} \nabla _yf(x,y) = 0, ~~ \nabla _y^2f(x,y) \preceq 0. \end{array}\right. } \end{aligned}$$
(3.1)

According to the second-order sufficient optimality condition, if \(\nabla _x^2f(x',y')\succ 0\) and \(\nabla _y^2f(x',y') \prec 0\), then \((x',y')\) is a local saddle point of the polynomial \(f(x,y)\) and \(f_1\) is the corresponding local saddle point value. Otherwise, according to the inequality (1.1), if \(x'\) is a minimizer of

$$\begin{aligned} {\left\{ \begin{array}{ll} \min &{} f(x,y') \\ \text {subject to}&{} \nabla _xf(x,y') = 0, ~~ \nabla _x^2f(x,y') \succeq 0, \\ &{} \left\| x-x' \right\| ^2 \leqslant \tau \end{array}\right. } \end{aligned}$$
(3.2)

for some small \(\tau >0\), and \(y'\) is a maximizer of

$$\begin{aligned} {\left\{ \begin{array}{ll} \max &{} f(x',y) \\ \text {subject to}&{} \nabla _yf(x',y) = 0, ~~ \nabla _y^2f(x',y) \preceq 0, \\ &{} \left\| y-y' \right\| ^2 \leqslant \tau \end{array}\right. } \end{aligned}$$
(3.3)

for some small \(\tau >0\), then \((x',y')\) is a local saddle point of the polynomial \(f(x,y)\) and \(f_1\) is the corresponding local saddle point value. If \(\tau >0\) is very small, but \(x'\) is not a minimizer of the problem (3.2) or \(y'\) is not a maximizer of the problem (3.3), it is mostly an open question how to determine whether \((x',y')\) is a local saddle point, because detecting local optimality is NP-hard [29].
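As a small illustration of how the second-order sufficient condition can be checked at a candidate point, the following sketch (ours; the function name and the test polynomial \(x^2-y^2\) are not from the paper) evaluates the gradient and the two Hessians symbolically and tests their definiteness numerically.

```python
import numpy as np
import sympy as sp

def hessian_at(f, vars1, vars2, subs):
    """Numeric Hessian block [d^2 f / (dv_i dv_j)] evaluated at the point subs."""
    return np.array([[float(sp.diff(f, a, b).subs(subs)) for b in vars2]
                     for a in vars1])

def is_strict_local_saddle(f, xvars, yvars, point, tol=1e-8):
    """Check grad f = 0, Hess_x f > 0 and Hess_y f < 0 at the given point."""
    allvars = list(xvars) + list(yvars)
    subs = dict(zip(allvars, point))
    grad = [float(sp.diff(f, v).subs(subs)) for v in allvars]
    Hx = hessian_at(f, xvars, xvars, subs)
    Hy = hessian_at(f, yvars, yvars, subs)
    return (max(abs(g) for g in grad) < tol
            and np.all(np.linalg.eigvalsh(Hx) > tol)
            and np.all(np.linalg.eigvalsh(Hy) < -tol))

x, y = sp.symbols('x y')
print(is_strict_local_saddle(x**2 - y**2, (x,), (y,), (0.0, 0.0)))  # prints True
```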

We assume that \(f_{r-1}~(r \geqslant 2)\) has been obtained. To detect new local saddle points, we add the constraint \(f(x,y) \geqslant f_{r-1} + \delta \) for some \(\delta >0\) and consider the following problem

$$\begin{aligned} {\left\{ \begin{array}{ll} f_r := \min &{} f(x,y) \\ \text {subject to}&{} \nabla _xf(x,y) = 0, ~~ \nabla _x^2f(x,y) \succeq 0, \\ &{} \nabla _yf(x,y) = 0, ~~ \nabla _y^2f(x,y) \preceq 0, \\ &{} f(x,y) - f_{r-1} - \delta \geqslant 0. \end{array}\right. } \end{aligned}$$
(3.4)

Analogously, suppose \((x'',y'')\) is a minimizer of (3.4). According to the second-order sufficient optimality condition, if \(\nabla _x^2f(x'',y'')\succ 0\) and \(\nabla _y^2f(x'',y'')\prec 0\), then \((x'',y'')\) is a local saddle point of \(f(x,y)\) and \(f_r\) is the corresponding local saddle point value. Otherwise, according to the inequality (1.1), if \(x''\) is a minimizer of

$$\begin{aligned} {\left\{ \begin{array}{ll} \min &{} f(x,y'') \\ \text {subject to}&{} \nabla _xf(x,y'') = 0, ~~ \nabla _x^2f(x,y'') \succeq 0, \\ &{} f(x,y'') - f_{r-1} - \delta \geqslant 0, \\ &{} \left\| x-x'' \right\| ^2 \leqslant \tau \end{array}\right. } \end{aligned}$$
(3.5)

for some small \(\tau >0\), and \(y''\) is a maximizer of

$$\begin{aligned} {\left\{ \begin{array}{ll} \max &{} f(x'',y) \\ \text {subject to}&{} \nabla _yf(x'',y) = 0, ~~ \nabla _y^2f(x'',y) \preceq 0, \\ &{} f(x'',y) - f_{r-1} - \delta \geqslant 0, \\ &{} \left\| y-y'' \right\| ^2 \leqslant \tau \end{array}\right. } \end{aligned}$$
(3.6)

for some small \(\tau >0\), then \((x'',y'')\) is a local saddle point of \(f(x,y)\), and \(f_r\) is the corresponding local saddle point value. If \(\tau >0\) is very small, but \(x''\) is not a minimizer of the problem (3.5) or \(y''\) is not a maximizer of the problem (3.6), it is mostly an open question how to determine whether \((x'',y'')\) is a local saddle point.

We assume again that \(f_{r-1}~(r \geqslant 2)\) has been obtained. In order to avoid missing local saddle points caused by an inappropriate choice of \(\delta \), we introduce the following maximization problem

$$\begin{aligned} {\left\{ \begin{array}{ll} f_r^{\delta } := \max &{} f(x,y) \\ \text {subject to}&{} \nabla _{x} f(x,y) = 0, ~~ \nabla _{x}^2 f(x,y) \succeq 0, \\ &{} \nabla _{y} f(x,y) = 0, ~~ \nabla _{y}^2 f(x,y) \preceq 0, \\ &{} f_{r-1} + \delta - f(x,y) \geqslant 0. \end{array}\right. } \end{aligned}$$
(3.7)

It is obvious that \(f_r^{\delta } \geqslant f_{r-1}\). Note that if \(f_r^{\delta }>f_{r-1}\), then a smaller positive value of \(\delta \) is required. Therefore, the criterion for choosing a suitable positive value of \(\delta \) is \(f_r^{\delta }=f_{r-1}\), which guarantees that there is no other minimum of the problem (3.1) between \(f_{r-1}\) and \(f_r\). We refer to the paper [36] for details.
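For instance, suppose (purely for illustration) that the values of f attained on the feasible set of (3.1) are 0, 1 and 3, and that \(f_{r-1}=0\). Choosing \(\delta =2\) gives \(f_r^{\delta }=1>f_{r-1}\), so \(\delta \) is halved; \(\delta =1\) still gives \(f_r^{\delta }=1\); with \(\delta =0.5\) we get \(f_r^{\delta }=0=f_{r-1}\), and solving (3.4) with this \(\delta \) returns \(f_r=1\), the next candidate value, without skipping it.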

When we solve the problems (3.2), (3.3), (3.5) and (3.6), a suitable value of \(\tau \) is very important for checking local saddle points. There is no general rule for choosing a suitable value of \(\tau \), but it should not be too tiny. In practice, a value like 0.01 or 0.05 is small enough.

3.2 Lasserre’s hierarchy of semidefinite relaxations

The optimization problems (3.1)–(3.7) are solved by Lasserre's hierarchy of relaxations. In this section, we illustrate in detail how to solve the problems (3.1) and (3.4). The other problems can be solved by the same approach, so we omit the details for them. The theory of Lasserre's hierarchy of relaxations can be found in [22, 23].

For convenience, we introduce the following notation,

$$\begin{aligned}&g(x,y) := f(x,y)-f_{r-1}-\delta , \quad H_f := {{\,\mathrm{diag}\,}}(\nabla _{x}^2f,-\nabla _y^2f), \\&d_f := \left\lceil \deg (f)/2 \right\rceil , ~d_g := \left\lceil \deg (g)/2 \right\rceil , \\&d_G := \max \{ \left\lceil \deg (f_{x_i})/2 \right\rceil ,\left\lceil \deg (f_{y_j})/2 \right\rceil , ~i=1,2,\dots ,n, ~j=1,2,\dots ,m \}, \\&d_H := \max \{ \left\lceil \deg ((H_f)_{ij})/2 \right\rceil , ~ 1\leqslant i\leqslant m+n, ~1\leqslant j\leqslant m+n \}. \end{aligned}$$

Applying Lasserre's hierarchy of relaxations to (3.1), we get the following relaxation problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \rho _1^k := \min &{} {\mathscr {L}}_z(f) \\ \text {subject to}&{} M_k(z) \succeq 0, ~~ z_0 = 1, \\ &{} L_{f_{x_i}}^{(k)}(z) = 0, ~i=1,2,\dots ,n, \\ &{} L_{f_{y_j}}^{(k)}(z) = 0, ~j=1,2,\dots ,m, \\ &{} L_{H_f}^{(k)}(z) \succeq 0. \end{array}\right. } \end{aligned}$$
(3.8)

in which k is called the relaxation order and \(k \geqslant \max \{ d_f, d_G, d_H \}\). The relaxation problem (3.8) can be reformulated as a semidefinite programming (SDP) problem, which can be solved by many SDP solvers based on interior point methods, such as SeDuMi [42], SDPT3 [43], etc. The inequality \(\rho _1^k\leqslant f_1\) holds for every k because (3.8) is a relaxation of (3.1). Moreover, as the relaxation order k increases, more constraints are added to the relaxation problem (3.8), so the sequence \(\{\rho _1^k\}\) is monotonically increasing, namely

$$\begin{aligned} \rho _1^{k} \leqslant \rho _1^{k+1} \leqslant \rho _1^{k+2} \leqslant \cdots \leqslant f_1. \end{aligned}$$

Suppose \(z^*\) is a minimizer of (3.8). If \(z^*\) has a flat truncation [33], which means there exists a positive integer t such that

$$\begin{aligned} {{\,\mathrm{rank}\,}}(M_t(z^*)) = {{\,\mathrm{rank}\,}}(M_{t-s}(z^*)), \end{aligned}$$
(3.9)

where \(\max \{ d_f,d_G,d_H \} \leqslant t \leqslant k\) and \(s := \max \{1,d_G,d_H \}\), then we can get at least \(r:={{\,\mathrm{rank}\,}}(M_k(z^*))\) minimizers of (3.1). According to Theorem 1.1 in [10] or Proposition 2.1 in [36], if \(z^*\) is feasible for (3.8) and satisfies (3.9), then \(z^*\) admits a unique r-atomic measure supported in the feasible set of (3.1), which means there exist \(\mu _1>0,\mu _2>0,\dots ,\mu _r>0\) and r distinct points \(p_1,p_2,\dots ,p_r\) in the feasible set of (3.1) such that

$$\begin{aligned} z^* = \mu _1 [p_1]_{2k}+\mu _2 [p_2]_{2k}+\dots +\mu _r [p_r]_{2k}, \end{aligned}$$

where \(\mu _1 + \mu _2 + \dots + \mu _r = 1\). If the rank condition (3.9) holds, then \(f_1=\rho _1^k\) and all the points \(p_1,p_2,\dots ,p_r\) are minimizers of (3.1). They can be extracted by solving some singular value decompositions and eigenvalue problems; one can refer to [18] for more details.
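The rank condition (3.9) is easy to test numerically once the moment matrices are formed. The following sketch (ours; the tms of a Dirac measure is used only as toy data) compares the numerical ranks of two nested moment matrices, which is how flatness is typically checked in floating point arithmetic.

```python
import numpy as np
from itertools import combinations_with_replacement

def monomial_vector(p, k):
    """[p]_k: values of all monomials of degree <= k at the point p."""
    vals = []
    for deg in range(k + 1):
        for c in combinations_with_replacement(range(len(p)), deg):
            v = 1.0
            for i in c:
                v *= p[i]
            vals.append(v)
    return np.array(vals)

def numerical_rank(A, tol=1e-6):
    return int(np.sum(np.linalg.svd(A, compute_uv=False) > tol))

# Toy check: the tms of a Dirac measure at p has flat moment matrices of rank 1.
p = np.array([0.5, -1.0])
M_t = np.outer(monomial_vector(p, 2), monomial_vector(p, 2))    # plays the role of M_t(z*)
M_ts = np.outer(monomial_vector(p, 1), monomial_vector(p, 1))   # plays the role of M_{t-s}(z*)
print(numerical_rank(M_t) == numerical_rank(M_ts))              # flat truncation holds
```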

The dual problem of (3.8) is

$$\begin{aligned} {\left\{ \begin{array}{ll} \gamma _1^k := \max &{} \gamma \\ \text {subject to}&{} f-\gamma \in {{\,\mathrm{I}\,}}_{2k}(\nabla f) + {{\,\mathrm{Q}\,}}_k(H_f). \end{array}\right. } \end{aligned}$$
(3.10)

The problem (3.10) is also equivalent to an SDP problem. By weak duality, the inequality \(\gamma _1^k\leqslant \rho _1^k\) holds for every k. If the Slater condition holds for the problem (3.8), i.e., there exists an interior point in the feasible set of (3.8), then (3.10) achieves its optimal value and \(\gamma _1^k=\rho _1^k\) for every k. Besides, the feasible region of (3.10) expands as k increases, so the sequence \(\{ \gamma _1^k \}\) is monotonically increasing, that is

$$\begin{aligned} \gamma _1^k \leqslant \gamma _1^{k+1} \leqslant \gamma _1^{k+2} \leqslant \cdots \leqslant f_1. \end{aligned}$$

If the archimedean condition holds [33], i.e., there exist k and \(\xi (x,y) \in {{\,\mathrm{I}\,}}_{2k}(\nabla f) + {{\,\mathrm{Q}\,}}_k(H_f)\) such that the set \(\{(x,y)~|~\xi (x,y)\geqslant 0\}\) is compact, then \(\gamma _1^k \rightarrow f_1\) as \(k \rightarrow +\infty \).

Applying Lasserre’s hierarchy of relaxations to (3.4), we get the following optimization problem

$$\begin{aligned} {\left\{ \begin{array}{ll} \rho _r^k := \min &{} {\mathscr {L}}_z(f) \\ \text {subject to}&{} M_k(z)\succeq 0, ~~ z_0 = 1, \\ &{} L_{f_{x_i}}^{(k)}(z) = 0, ~i=1,2,\dots ,n, \\ &{} L_{f_{y_j}}^{(k)}(z) = 0, ~j=1,2,\dots ,m, \\ &{} L_{H_f}^{(k)}(z) \succeq 0, ~~ L_g^{(k)}(z) \succeq 0, \end{array}\right. } \end{aligned}$$
(3.11)

where \(k \geqslant \max \{ d_f, d_G, d_H, d_g \}\). Likewise, if a minimizer \({\tilde{z}}^*\) of (3.11) has a flat truncation [33], i.e., there exists a positive integer \({\tilde{t}}\) such that

$$\begin{aligned} {{\,\mathrm{rank}\,}}(M_{{\tilde{t}}}({\tilde{z}}^*)) = {{\,\mathrm{rank}\,}}(M_{{\tilde{t}}-{\tilde{s}}}({\tilde{z}}^*)), \end{aligned}$$
(3.12)

where \(\max \{d_f, d_G, d_H, d_g\}\leqslant {\tilde{t}}\leqslant k\) and \({\tilde{s}}:=\max \{1, d_G, d_H, d_g\}\), then we can get at least \(r:={{\,\mathrm{rank}\,}}(M_k({\tilde{z}}^*))\) minimizers of (3.4).

The dual problem of (3.11) is

$$\begin{aligned} {\left\{ \begin{array}{ll} \gamma _r^k := \max &{} \gamma \\ \text {subject to}&{} f-\gamma \in {{\,\mathrm{I}\,}}_{2k}(\nabla f) + {{\,\mathrm{Q}\,}}_k(g) + {{\,\mathrm{Q}\,}}_k(H_f). \end{array}\right. } \end{aligned}$$
(3.13)

For the problems (3.11) and (3.13), we can obtain conclusions similar to those for (3.8) and (3.10).

3.3 An algorithm

We now give an algorithm for computing local saddle points of unconstrained polynomial optimization. It can detect the existence of local saddle points. If there exist several distinct local saddle point values, we obtain them from the smallest one to the largest one.

Algorithm 3.1

Let \(\delta _0>0\), \(\tau >0\), \({\mathcal {S}}:=\emptyset \), \(r:=1.\)

Step 1:

If the problem (3.1) is infeasible, then there is no local saddle point and stop. Otherwise, we get the minimum \(f_1\) and the set of minimizers \({\mathcal {C}}_1.\)

Step 2:

For a point \((x_{1,i}',y_{1,i}')\in {\mathcal {C}}_1\), if \(\nabla _x^2f(x_{1,i}',y_{1,i}')\succ 0\) and \(\nabla _y^2f(x_{1,i}',y_{1,i}')\prec 0\), then let \({\mathcal {S}}:={\mathcal {S}}\cup \{(x_{1,i}',y_{1,i}')\}\), and continue to verify the rest of points in \({\mathcal {C}}_1\). Otherwise, go to Step 3.

Step 3:

Solve the problem (3.2) with \(y'\) replaced by \(y_{1,i}'\) and get the minimum \(\eta _1\). If \(f(x_{1,i}',y_{1,i}') > \eta _1\), then \((x_{1,i}',y_{1,i}')\) is not a local saddle point. Update \({\mathcal {C}}_1 := {\mathcal {C}}_1 \backslash \{(x_{1,i}',y_{1,i}')\}\) and go back to Step 2. Otherwise, go to Step 4.

Step 4:

Solve the problem (3.3) with \(x'\) replaced by \(x_{1,i}'\) and get the maximum \(\nu _1\). If \(f(x_{1,i}',y_{1,i}')<\nu _1\), then \((x_{1,i}',y_{1,i}')\) is not a local saddle point. Update \({\mathcal {C}}_1 := {\mathcal {C}}_1 \backslash \{(x_{1,i}',y_{1,i}')\}\) and go back to Step 2. Otherwise, let \({\mathcal {S}}:={\mathcal {S}}\cup \{(x_{1,i}',y_{1,i}')\}\) and go back to Step 2 to verify the rest of the points in \({\mathcal {C}}_1\). If all points in \({\mathcal {C}}_1\) have been verified, then let \(r:=r+1\), \(\delta :=\delta _0\), and go to Step 5.

Step 5:

Solve the problem (3.7) and get the maximum \(f_r^{\delta }\); then go to Step 6.

Step 6:

If \(f_r^{\delta }>f_{r-1}\), then let \(\delta :=\delta /2\) and go back to Step 5. If \(f_r^{\delta }=f_{r-1}\), then go to Step 7.

Step 7:

If the problem (3.4) is infeasible, then there is no local saddle point value greater than \(f_{r-1}\) and stop. Otherwise, we get the minimum \(f_r\) and the set of minimizers \({\mathcal {C}}_r\).

Step 8:

For a point \((x_{r,i}',y_{r,i}')\in {\mathcal {C}}_r\), if \(\nabla _x^2f(x_{r,i}',y_{r,i}')\succ 0\) and \(\nabla _y^2f(x_{r,i}',y_{r,i}')\prec 0\), then let \({\mathcal {S}}:={\mathcal {S}}\cup \{(x_{r,i}',y_{r,i}')\}\), and continue to verify the rest of points in \({\mathcal {C}}_r\). Otherwise, go to Step 9.

Step 9:

Solve the problem (3.5) with \(y''\) replaced by \(y_{r,i}'\) and get the minimum \(\eta _r\). If \(f(x_{r,i}',y_{r,i}')>\eta _r\), then \((x_{r,i}',y_{r,i}')\) is not a local saddle point. Update \({\mathcal {C}}_r := {\mathcal {C}}_r \backslash \{ (x_{r,i}',y_{r,i}') \}\) and go back to Step 8. Otherwise, go to Step 10.

Step 10:

Solve the problem (3.6) with \(x''\) replaced by \(x_{r,i}'\) and get the maximum \(\nu _r\). If \(f(x_{r,i}',y_{r,i}')<\nu _r\), then \((x_{r,i}',y_{r,i}')\) is not a local saddle point. Update \({\mathcal {C}}_r := {\mathcal {C}}_r \backslash \{(x_{r,i}',y_{r,i}')\}\) and go back to Step 8. Otherwise, let \({\mathcal {S}}:={\mathcal {S}}\cup \{(x_{r,i}',y_{r,i}')\}\) and go back to Step 8 to verify the rest of the points in \({\mathcal {C}}_r\). If all points in \({\mathcal {C}}_r\) have been verified, then let \(r:=r+1\), \(\delta :=\delta _0\), and go back to Step 5.

In Step 1, we compute the smallest minimum \(f_1\) and the set of minimizers \({\mathcal {C}}_1\) via the second-order necessary optimality condition, provided the problem (3.1) is feasible. In Steps 2–4, we use the second-order sufficient optimality condition and the definition of local saddle points to verify whether each minimizer in \({\mathcal {C}}_1\) is a local saddle point. In Steps 5–6, solving the problem (3.7) gives a suitable value of \(\delta \). Then, in Step 7, we obtain the minimum \(f_r\) and the set of minimizers \({\mathcal {C}}_r\) of the problem (3.4), provided (3.4) is feasible. In Steps 8–10, we verify whether each point in \({\mathcal {C}}_r\) is a local saddle point by the second-order sufficient optimality condition and the definition of local saddle points.
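To illustrate the filtering idea behind Algorithm 3.1 without the Lasserre relaxations, the following self-contained sketch (ours; the toy polynomial is not from the paper) enumerates the critical points of a small polynomial symbolically, keeps the points satisfying the strict second-order conditions, and lists the saddle values in increasing order. Algorithm 3.1 replaces the symbolic solve by the relaxations (3.8) and (3.11), which also handle points where the Hessian conditions are not strict via Steps 3–4 and 9–10.

```python
import sympy as sp

x, y = sp.symbols('x y', real=True)
f = x**4 - 2*x**2 - y**4 + 2*y**2              # toy example, not from the paper

grad = [sp.diff(f, v) for v in (x, y)]
critical = sp.solve(grad, (x, y), dict=True)   # nine critical points

saddles = []
for pt in critical:
    fxx = sp.diff(f, x, 2).subs(pt)
    fyy = sp.diff(f, y, 2).subs(pt)
    if fxx > 0 and fyy < 0:                    # strict second-order sufficient condition
        saddles.append(((pt[x], pt[y]), f.subs(pt)))

for point, value in sorted(saddles, key=lambda t: t[1]):
    print(point, value)                        # the four points (+-1, +-1), each with value 0
```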

4 Convergence properties

The first result shows that a polynomial has finitely many local saddle point values.

Proposition 4.1

If a polynomial \(f(x,y)\) has local saddle points, then it has finitely many local saddle point values.

Proof

If \((x^*,y^*)\) is a local saddle point of the polynomial \(f(x,y)\), then

$$\begin{aligned} \nabla f(x^*,y^*) = 0. \end{aligned}$$

So \((x^*,y^*)\) is a critical point of \(f(x,y)\), and \(f(x^*,y^*)\) is a critical value. By the proof of Theorem 8 in [32] or Lemma 3.2 in [34], the polynomial \(f(x,y)\) has finitely many critical values. Meanwhile, the local saddle point \((x^*,y^*)\) also satisfies

$$\begin{aligned} \nabla _x^2f(x^*,y^*) \succeq 0, \quad \nabla _y^2f(x^*,y^*) \preceq 0, \end{aligned}$$

Since every local saddle point value is a critical value, the set of local saddle point values of \(f(x,y)\) is finite. \(\square \)

The finite termination of Algorithm 3.1 is given as follows.

Theorem 4.2

For the polynomial \(f(x,y)\), let \({\mathcal {C}}_r\) be the set of minimizers of (3.1) for \(r=1\), and of (3.4) for \(r\geqslant 2\). Let \({\mathcal {S}}_r\) be the set of points \((x_{r,i}',y_{r,i}')\in {\mathcal {C}}_r\) such that \(f(x_{r,i}^*,y_{r,i}')=f(x_{r,i}',y_{r,i}')\) and \(f(x_{r,i}',y_{r,i}^*)=f(x_{r,i}',y_{r,i}')\) for \(r \geqslant 1\), where \(x_{r,i}^*\) denotes a minimizer of (3.2) (resp., (3.5)) and \(y_{r,i}^*\) a maximizer of (3.3) (resp., (3.6)). If the system \(\nabla f(x,y)=0,\nabla _x^2f(x,y)\succeq 0,\nabla _y^2f(x,y)\preceq 0\) has finitely many real solutions, then Algorithm 3.1 stops after finitely many iterations. If \({\mathcal {S}}_r\) is not empty, then each point \((x^*,y^*)\in {\mathcal {S}}_r\) is a local saddle point of \(f(x,y)\) and \(f(x^*,y^*)\) is a local saddle point value.

Proof

If the problem (3.1) is infeasible, i.e., the set \({\mathcal {C}}_1\) is empty, then the polynomial has no local saddle point. Otherwise, by the assumption, the feasible set of (3.1) is finite, so each set \({\mathcal {C}}_r\) is finite and the algorithm can produce only finitely many distinct values \(f_1<f_2<\cdots \). Therefore, the algorithm terminates after finitely many iterations. If \({\mathcal {S}}_r\) is not empty, then each \((x^*,y^*)\in {\mathcal {S}}_r\) is optimal for both problems (3.2) and (3.3) when \(r=1\), or for both problems (3.5) and (3.6) when \(r\geqslant 2\). In terms of the inequality (1.1), each point \((x^*,y^*)\) in \({\mathcal {S}}_r\) is a local saddle point of the polynomial \(f(x,y)\), and \(f(x^*,y^*)\) is a local saddle point value. \(\square \)

Remark

For a generic polynomial \(f(x,y)\), the polynomial system \(\nabla f(x,y)=0\) has finitely many complex solutions, so Algorithm 3.1 terminates after finitely many iterations [38, Theorem 3.3].

Some properties related to the relaxation (3.8), its dual (3.10) and the problem (3.1) are listed as follows.

Theorem 4.3

Let \(f_1\), \(\rho _1^k\) and \(\gamma _1^k\) be the optimal values of (3.1), (3.8) and (3.10), respectively. Suppose \({{\,\mathrm{I}\,}}(\nabla f)+{{\,\mathrm{Q}\,}}(H_f)\) is archimedean. Then

  1. (i)

    If (3.8) is infeasible, then (3.1) is infeasible;

  2. (ii)

    If \({{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f)\) is finite, and (3.1) is infeasible, then (3.10) is unbounded for all k big enough and (3.8) is infeasible;

  3. (iii)

    If \({{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f)\) is finite, and (3.1) is feasible, then \(\rho _1^k=\gamma _1^k=f_1\) for some k;

  4. (iv)

    If \({{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f)\) is finite, and (3.1) is feasible, then all minimizers of (3.8) satisfy the flat truncation condition (3.9) with respect to the feasible set of (3.1) for all k big enough.

Proof

(i) The conclusion is obvious, because (3.8) is a relaxation problem of (3.1).

(ii) Since \({{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f)\) is finite, the set \(\{(x,y)~|~-\left\| \nabla f(x,y) \right\| ^2\geqslant 0\}={{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f)\) is compact, and \(-\left\| \nabla f \right\| ^2\in {{\,\mathrm{I}\,}}(\nabla f)\), so the ideal \({{\,\mathrm{I}\,}}(\nabla f)\) is archimedean. If (3.1) is infeasible, then \(H_f\) is not positive semidefinite at any point of \({{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f)\), i.e., the feasible set of (3.1) is empty. According to Corollary 3.16 in [21], we get

$$\begin{aligned} -1 \in {{\,\mathrm{I}\,}}(\nabla f) + {{\,\mathrm{Q}\,}}(H_f). \end{aligned}$$

Hence, for all k big enough, (3.10) is unbounded, and by weak duality, (3.8) is infeasible.

(iii) When \({{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f)\) is finite, \({{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f)\) can be written as \(\{ w_1,w_2,\dots ,w_L \}\), where L is a positive integer and \(w_i\ne w_j\) for all \(i\ne j\). Let \(t_1,t_2,\dots ,t_L\) be the interpolating polynomials such that \(t_i(w_j)=0\) for \(i\ne j\) and \(t_i(w_j)=1\) for \(i=j\).

For each \(w_i\), if \(f(w_i)-f_1\geqslant 0\), let

$$\begin{aligned} Q_i := (f(w_i)-f_1)t_i^2. \end{aligned}$$

If \(f(w_i)-f_1<0\), then \(H_f(w_i)\) is not positive semidefinite. Hence, there exists an eigenvalue \(\lambda <0\) of \(H_f(w_i)\), whose eigenvector is denoted by v. Let \(h(x,y) := v^{{{\,\mathrm{T}\,}}} H_f v\) and

$$\begin{aligned} Q_i := \Big ( \frac{f(w_i)-f_1}{h(w_i)} \Big ) ht_i^2. \end{aligned}$$

Hence, each \(Q_i \in {{\,\mathrm{Q}\,}}(H_f)\). Let \(Q:=Q_1+Q_2+\dots +Q_L\), so \(Q\in {{\,\mathrm{Q}\,}}_{N_1}(H_f)\) for some \(N_1>0\). The polynomial

$$\begin{aligned} p := f - f_1 - Q \in \{ b \in \mathbb {R}[x,y]~|~b(u,v) = 0,~ \forall (u,v) \in {{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f) \}. \end{aligned}$$

By Corollary 4.1.8 in [6], there exist an integer \(l>0\) and \(q\in \Sigma [x,y]\subseteq {{\,\mathrm{Q}\,}}(H_f)\) such that

$$\begin{aligned} p^{2l} + q \in {{\,\mathrm{I}\,}}(\nabla f). \end{aligned}$$

Applying Lemma 2.1 in [35] to p and q, with \(\nabla f\), \(H_f\) and any \(c\geqslant 1/(2l)\), there exists \(N^*>N_1\) such that for all \(\epsilon >0\), \( p+\epsilon =\phi _{\epsilon } + \theta _{\epsilon }\), where \(\phi _{\epsilon }\in {{\,\mathrm{I}\,}}_{2N^*}(\nabla f)\) and \(\theta _{\epsilon }\in {{\,\mathrm{Q}\,}}_{N^*}(H_f)\). So we have

$$\begin{aligned} f-(f_1-\epsilon ) = \phi _{\epsilon } + (\theta _{\epsilon } + Q), \quad \theta _{\epsilon } + Q \in {{\,\mathrm{Q}\,}}_{N^*}(H_f), \end{aligned}$$

which means that for every \(\epsilon >0\), \(\gamma =f_1-\epsilon \) is feasible for (3.10) with \(k=N^*\). Hence \(\gamma _1^{N^*}\geqslant f_1-\epsilon \) for all \(\epsilon >0\), i.e., \(\gamma _1^{N^*}\geqslant f_1\). Combining this with \(\gamma _1^k \leqslant \rho _1^k \leqslant f_1\) for every k, we obtain \(\rho _1^k=\gamma _1^k=f_1\) for \(k=N^*\).

(iv) Because \({{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f)\) is finite, according to Proposition 4.6 in [24], for all k big enough, the truncation \((z)_{2k}\) of every z feasible for (3.8) is flat with respect to \(\nabla f=0\). Because \(d_G=d_H+1\) and \(L_{H_f}^{(k)}(z) \succeq 0\), \((z)_{2k}\) is also flat with respect to \(H_f \succeq 0\), so the conclusion is proved. \(\square \)

We now give analogous results for the relaxation (3.11), its dual (3.13) and the problem (3.4).

Theorem 4.4

Let \(f_r\), \(\rho _r^k\) and \(\gamma _r^k\) be the optimal values of (3.4), (3.11) and (3.13), respectively. Suppose \({{\,\mathrm{I}\,}}(\nabla f)+{{\,\mathrm{Q}\,}}(g)+{{\,\mathrm{Q}\,}}(H_f)\) is archimedean. Then

  1. (i)

    If (3.11) is infeasible, then (3.4) is infeasible;

  2. (ii)

    If \({{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f)\cap \{(x,y)~|~g(x,y)\geqslant 0\}\) is finite, and (3.4) is infeasible, then (3.13) is unbounded for all k big enough and (3.11) is infeasible;

  3. (iii)

    If \({{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f)\cap \{(x,y)~|~g(x,y)\geqslant 0\}\) is finite, and (3.4) is feasible, then \(\rho _r^k=\gamma _r^k=f_r\) for some k;

  4. (iv)

    If \({{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f)\cap \{(x,y)~|~g(x,y)\geqslant 0\}\) is finite, and (3.4) is feasible, then all minimizers of (3.11) satisfy the flat truncation condition (3.12) with respect to the feasible set of (3.4) for all k big enough.

Proof

(i) The conclusion is obvious, because (3.11) is a relaxation problem of (3.4).

(ii) Since \({{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f)\cap \{(x,y)~|~g(x,y)\geqslant 0\}\) is finite, it is compact, and hence \({{\,\mathrm{I}\,}}(\nabla f)+{{\,\mathrm{Q}\,}}(g)\) is archimedean. Because (3.4) is infeasible, \(H_f\) is not positive semidefinite at any point of \({{\,\mathrm{V}\,}}_{\mathbb {R}}(\nabla f)\cap \{(x,y)~|~g(x,y)\geqslant 0\}\), i.e., the feasible set of (3.4) is empty. According to Corollary 3.16 in [21], we get

$$\begin{aligned} -1 \in {{\,\mathrm{I}\,}}(\nabla f) + {{\,\mathrm{Q}\,}}(g) + {{\,\mathrm{Q}\,}}(H_f). \end{aligned}$$

Hence, for all k big enough, (3.13) is unbounded, and by weak duality, (3.11) is infeasible.

(iii) The proof is similar to that of the item (iii) in Theorem 4.3.

(iv) The proof is similar to that of item (iv) in Theorem 4.3, using Remark 4.9 in [24]. \(\square \)

5 Numerical experiments

In this section, we apply Algorithm 3.1 to compute local saddle points of unconstrained polynomial optimization problems; global saddle points are computed by Algorithm 3.1 in [38]. All examples in this section are computed in MATLAB R2016b on a Lenovo laptop with a dual-core 2.5 GHz CPU and 4.0 GB RAM. All Lasserre relaxation problems are solved with the MATLAB software packages YALMIP [26] and GloptiPoly 3 [19], which call the semidefinite optimization solver SeDuMi [42].

Example 5.1

Consider the function over \((x,y)\in \mathbb {R}\times \mathbb {R}\)

$$\begin{aligned} f(x,y) = -4x^2+\frac{21}{10}x^4-\frac{1}{3}x^6-xy+4y^2-4y^4. \end{aligned}$$

Applying Algorithm 3.1, after 2 iterations, local saddle point values \(f_1^*\) and \(f_2^*\) were found, as shown in Table 1.

Table 1 The numerical results of Example 5.1

Since the problem (3.11) is infeasible for \(r=3\) and \(f_3^{\delta }=f_2^*\), \(f_2^*\) is the largest local saddle point value. Hence, we got 2 local saddle point values and 4 local saddle points. Moreover, by Algorithm 3.1 in [38], we found that the example has no global saddle point.

Example 5.2

Consider the function over \((x,y)\in \mathbb {R}\times \mathbb {R}\)

$$\begin{aligned} f(x,y) = x^4y^2-x^2y^4+x^3y-xy^3+x-y. \end{aligned}$$

Applying Algorithm 3.1, after 3 iterations, local saddle point values \(f_1^*\), \(f_2^*\) and \(f_3^*\) were found, as shown in Table 2.

Table 2 The numerical results of Example 5.2

Since the problem (3.11) is infeasible for \(r=4\) and \(f_4^{\delta }=f_3^*\), \(f_3^*\) is the largest local saddle point value. Hence, we got 3 local saddle point values and 3 local saddle points. Moreover, by Algorithm 3.1 in [38], we found that the example has no global saddle point.

Example 5.3

Consider the function over \((x,y)\in \mathbb {R}^2\times \mathbb {R}^2\)

$$\begin{aligned} f(x,y) =&~ x_1^3y_1+x_2^3y_2-x_1y_1^3-x_2y_2^3+ x_1^2y_1+x_2^2y_2-x_1y_1^2-x_2y_2^2+ \\&~ x_1y_1+x_2y_2+x_1+x_2-y_1-y_2. \end{aligned}$$

Applying Algorithm 3.1, after 1 iteration, we got local saddle point value \(f_1^* \approx 1.9346\) and local saddle point

$$\begin{aligned} (-0.7189, -0.7189, -1.2503, -1.2503). \end{aligned}$$

Since the problem (3.11) is infeasible for \(r=2\) and \(f_2^{\delta }=f_1^*\), \(f_1^*\) is the largest local saddle point value. Hence, we got 1 local saddle point value and 1 local saddle point. Moreover, by Algorithm 3.1 in [38], we found that the example has no global saddle point.

Example 5.4

Consider the function over \((x,y)\in \mathbb {R}^2\times \mathbb {R}^2\)

$$\begin{aligned} f(x,y) =&~ x_1^6+x_2^6-y_1^6-y_2^6-5(x_1^3x_2^2+x_2^2y_1^3+ y_1x_2^4)+6x_1^2x_2^2+6y_1^3y_2+ \\&~ 6x_1x_2y_1y_2-7(x_1x_2y_1+x_2y_1y_2)+ (x_1+x_2+y_1+y_2-1)^2. \end{aligned}$$

Applying Algorithm 3.1, after 2 iterations, local saddle point values \(f_1^*\) and \(f_2^*\) were found, as shown in Table 3.

Table 3 The numerical results of Example 5.4

Since the problem (3.11) is infeasible for \(r=3\) and \(f_3^{\delta }=f_2^*\), \(f_2^*\) is the largest local saddle point value. Hence, we got 2 local saddle point values and 2 local saddle points. Moreover, we got the global saddle point

$$\begin{aligned} (-0.7087, 0.5995,-1.6767,-1.3325) \end{aligned}$$

and its saddle point value 18.2221 by Algorithm 3.1 in [38].

Example 5.5

Consider the function over \((x,y)\in \mathbb {R}^3\times \mathbb {R}^3\)

$$\begin{aligned} f(x,y) =&~ (x_1+x_2+x_3+y_1+y_2+y_3)^2+x_1x_2y_1y_2+x_2x_3y_2y_3+ \\&~ x_1x_3y_1y_3+x_1^2y_3^2+x_2^2y_1^2+x_3^2y_2^2. \end{aligned}$$

Applying Algorithm 3.1, the problem (3.8) is infeasible for \(r=1\), so the example has no local saddle point.

Example 5.6

Consider the function over \((x,y)\in \mathbb {R}^3\times \mathbb {R}^3\)

$$\begin{aligned} f(x,y) =&~ \sum _{i=1}^{3}x_i^4+\sum _{i=1}^{3}(x_i-1)^2- \sum _{i=1}^{3}y_i^4-\sum _{i=1}^{3}(y_i-1)^2+ \\&~ x_3^2y_1y_2+2x_2^2y_1y_3+3x_1^2y_2y_3- y_3^2x_1x_2-5y_2^2x_1x_3-4y_1^2x_2x_3. \end{aligned}$$

Applying Algorithm 3.1, after 1 iteration, we got local saddle point value \(f_1^* \approx -0.3681\) and local saddle point

$$\begin{aligned} (0.5785,0.6043,0.6936,0.4363,0.4233,0.6357). \end{aligned}$$

Since the problem (3.11) is infeasible for \(r=2\) and \(f_2^{\delta }=f_1^*\), \(f_1^*\) is the largest local saddle point value. Hence, we got 1 local saddle point value and 1 local saddle point. Moreover, by Algorithm 3.1 in [38], we found that this local saddle point is also the global saddle point.

6 Conclusions and discussions

This paper concentrates on detecting local saddle points in unconstrained polynomial optimization. An algorithm based on second-order optimality conditions is proposed for computing local saddle points, in which all polynomial optimization subproblems are solved by Lasserre's hierarchy of relaxations. If a polynomial has no local saddle points, the algorithm can detect this nonexistence. When a polynomial has several local saddle points, the algorithm obtains them from the smallest local saddle point value to the largest one. Besides, we prove the finite termination of the algorithm.

Finally, for future work, one can consider the following problem: for a polynomial with constraints, how can we compute its saddle points?