Abstract
We propose a hierarchy of semidefinite programming (SDP) relaxations for polynomial optimization with sparse patterns over unbounded feasible sets. The convergence of the proposed SDP hierarchy is established for a class of polynomial optimization problems. This is done by employing known sums-of-squares sparsity techniques of Kojima and Muramatsu (Comput Optim Appl 42(1):31–41, 2009) and Lasserre (SIAM J Optim 17:822–843, 2006) together with a representation theorem for polynomials over unbounded sets obtained recently in Jeyakumar et al. (J Optim Theory Appl 163(3):707–718, 2014). We demonstrate that the proposed sparse SDP hierarchy can solve some classes of large scale polynomial optimization problems with unbounded feasible sets using the polynomial optimization solver SparsePOP developed by Waki et al. (ACM Trans Math Softw 35:15, 2008).
1 Introduction
The optimal value of a polynomial optimization problem over a compact semialgebraic set can be approximated as closely as desired by solving a hierarchy of semidefinite programming (SDP) relaxations, and convergence is generically finite under a mild assumption, called the Archimedean condition, which requires compactness of the feasible region (see [19, 20, 24, 27, 28, 30, 31]). The size of the SDP relaxations in this hierarchy, now known as the Lasserre hierarchy, grows rapidly as the number of variables and the relaxation order increase, which prevents its application to large scale polynomial optimization problems: the resulting SDP relaxations become too large to solve. A great deal of attention has recently been focused on reducing the size of these SDP relaxations. This has led to a sparse variant of the Lasserre hierarchy that has allowed applications to various large scale polynomial optimization problems over compact semialgebraic feasible sets [5, 8, 16, 18, 36], such as sensor network localization problems [17, 25], hybrid system identification problems [7], optimal power flow problems [9], robust optimization problems [1], portfolio optimization problems [15] and filter design problems [22].
More recently, the standard Lasserre hierarchy of SDP relaxations has been shown to extend to polynomial optimization over unbounded semialgebraic feasible sets via suitable modifications [11]. The purpose of this paper is to present a convergent sparse SDP hierarchy for solving polynomial optimization problems with sparse patterns and unbounded semialgebraic feasible sets, extending the unbounded version of the Lasserre hierarchy [11]. More specifically, we make the following contributions to global polynomial optimization.
-
(1)
We present a sparse SDP hierarchy for solving polynomial optimization problems with unbounded feasible sets incorporating the objective function in the construction of quadratic modules that generate the sequence of SDP relaxations. This approach extends the relaxation scheme, developed for convex polynomial optimization over noncompact sets [12]. This is done by employing known sums-of-squares sparsity techniques [18, 21] together with a representation theorem for polynomials over unbounded sets, established recently in [11].
-
(2)
By solving some numerical test problems, we illustrate that our sparse SDP hierarchy can easily be adapted with the current large scale polynomial optimization solver SparsePOP [37] to solve some classes of large scale polynomial optimization problems with unbounded feasible sets. We also apply our SDP hierarchy to solve a class of sparse polynomial optimization problems with unbounded feasible sets and hidden coercivity.
The organization of the paper is as follows. In Sect. 2, we fix the notation and recall some basic facts on polynomial optimization. In Sect. 3, we present our sparse SDP hierarchy for polynomial optimization with unbounded feasible sets and establish its convergence. In Sect. 4, we illustrate how our proposed scheme works by solving various large scale numerical test problems.
2 Preliminaries
Throughout this paper, \(\mathbb {R}^n\) denotes the n-dimensional Euclidean space. The inner product in \(\mathbb {R}^n\) is defined by \(\langle x,y\rangle := x^Ty\) for all \(x,y \in \mathbb {R}^n\). The non-negative orthant of \(\mathbb {R}^n\) is denoted by \(\mathbb {R}^n_{+}\) and is defined by \(\mathbb {R}^n_{+}:=\{(x_1,\ldots ,x_n) \in \mathbb {R}^n \ | \ x_i \ge 0\}\). The closed ball with center x and radius r is denoted by \(\overline{\mathbb {B}}(x,r)\). We use \(\mathbf{e}_n\) to denote the vector in \(\mathbb {R}^n\) whose elements are all one. For a multi-index \(\alpha =(\alpha _1,\ldots ,\alpha _n)\), the monomial \(x^{\alpha }\) is defined by \(x^{\alpha }=x_1^{\alpha _1}\cdots x_n^{\alpha _n}\). Denote by \(\mathbb {R}[\underline{x}]\) the ring of polynomials in \(x := (x_1,x_2, \ldots , x_n)\) with real coefficients. The degree of a real polynomial f is denoted by \(\mathrm{deg}f\). We say that a real polynomial \(f\in \mathbb {R}[\underline{x}]\) is a sum of squares (SOS) polynomial if there exist real polynomials \(f_j\), \(j = 1,\ldots , r\), such that \(f=\sum _{j=1}^rf_j^2\). The set of all SOS real polynomials in the variable x is denoted by \(\Sigma ^2[\underline{x}]\), and the set of all SOS real polynomials in the variable x with degree at most d is denoted by \(\Sigma ^2_d[\underline{x}]\). An important property of SOS polynomials is that checking whether or not a given polynomial of degree d is a sum of squares can be done by solving a semidefinite linear programming problem (see [10, 13, 14, 20, 24, 28, 32]).
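As a small illustration of this property, the following Python sketch (our own example code, not part of the cited references) numerically verifies a classical SOS identity, \(2x^4+2x^3y-x^2y^2+5y^4=\frac{1}{2}(2x^2-3y^2+xy)^2+\frac{1}{2}(y^2+3xy)^2\); in practice such a decomposition is what an SDP solver would return.

```python
import random

def f(x, y):
    # A standard example of an SOS polynomial that is not a sum of monomial squares.
    return 2*x**4 + 2*x**3*y - x**2*y**2 + 5*y**4

def sos_form(x, y):
    # Known SOS decomposition: f = (1/2)(2x^2 - 3y^2 + xy)^2 + (1/2)(y^2 + 3xy)^2.
    return 0.5*(2*x**2 - 3*y**2 + x*y)**2 + 0.5*(y**2 + 3*x*y)**2

random.seed(0)
for _ in range(1000):
    x, y = random.uniform(-5, 5), random.uniform(-5, 5)
    # Two degree-4 polynomials agreeing at many generic points are identical.
    assert abs(f(x, y) - sos_form(x, y)) <= 1e-8 * max(1.0, abs(f(x, y)))
print("SOS decomposition verified at 1000 random points")
```

Checking the identity symbolically is a finite linear-algebra computation; the random-point check above is just a quick numerical sanity test of the decomposition.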
The quadratic module \(\mathbf{M}(g;h)=\mathbf{M}(-g_1,\ldots ,-g_m;h_1,\ldots ,h_q)\subset \mathbb {R}[\underline{x}]\) associated with the semi-algebraic set K, defined by
$$\begin{aligned} K:=\{x \in \mathbb {R}^n: g_j(x) \le 0, \ j=1,\ldots ,m, \ h_s(x)=0, \ s=1,\ldots ,q\}, \end{aligned}$$
is given by
$$\begin{aligned} \mathbf{M}(g;h):=\left\{ \sigma _0+\sum _{j=1}^m \sigma _j(-g_j)+\sum _{s=1}^q \phi _s h_s : \sigma _0,\sigma _j \in \Sigma ^2[\underline{x}], \ \phi _s \in \mathbb {R}[\underline{x}]\right\} . \end{aligned}$$
We note that any member of the quadratic module \(\mathbf {M}(g;h)\) is a polynomial which is non-negative over the set K. The quadratic module \(\mathbf {M}(g;h)\) is called Archimedean [20, 24] if there exists a polynomial \(\psi \in \mathbf {M}(g;h)\) such that \(\{x: \psi (x) \ge 0\}\) is compact. A necessary condition for the quadratic module \(\mathbf {M}(g;h)\) to be Archimedean is that the set K is a compact set.
Lemma 2.1
(Putinar's Positivstellensatz [29]) Let \(f,g_j,h_s\), \(j=1,\ldots ,m\), \(s=1,\ldots ,q\), be real polynomials with \(K:=\{x:g_j(x) \le 0,j=1,\ldots ,m, h_s(x)=0, s=1,\ldots ,q\} \ne \emptyset \). Let \(\mathbf{M}(g;h)\) be Archimedean. If \(f(x)>0\) for all \(x \in K\), then \(f \in \mathbf{M}(g;h)\).
We now introduce a sparse version of the Putinar Positivstellensatz, which was derived by Lasserre [21] and improved later on by Kojima and Muramatsu [18]. To do this, we need the definitions of the support of a polynomial and the running intersection property. Recall that, for a polynomial \(f(x)=\sum _{\alpha }f_{\alpha }x^{\alpha }\) on \(\mathbb {R}^n\) with degree d, the support of f is denoted by \(\mathrm{supp}f\) and is defined by
$$\begin{aligned} \mathrm{supp}f:=\{i \in \{1,\ldots ,n\}: f_{\alpha } \ne 0 \text{ for } \text{ some } \alpha \text{ with } \alpha _i>0\}. \end{aligned}$$
Definition 2.1
(Running intersection property) Let \(I_l\), \(l=1,\ldots ,p\), be index sets such that \(\bigcup _{l=1}^p I_l=\{1,\ldots ,n\}\). We say that the running intersection property holds for \(I_l\), \(l=1,\ldots ,p\), if for each \(l=1,\ldots ,p-1\), there exists \(s \le l\) such that
$$\begin{aligned} I_{l+1} \cap \left( \bigcup _{j=1}^{l} I_j\right) \subseteq I_s. \end{aligned}$$
We note that the running intersection property has its roots in graph theory, and it can be satisfied for many polynomial optimization problems with structured sparsity patterns. In particular, it is known that the maximal cliques of a chordal graph (after possible reordering) satisfy the running intersection property [35, Theorem 3.5]. For a detailed discussion, see [18].
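For concreteness, the running intersection property of Definition 2.1 can be checked directly from its defining condition; the following Python sketch (the function name is ours) does so for an ordered list of index sets.

```python
def running_intersection(index_sets):
    """Check: for each l = 1,...,p-1 there is s <= l with
    I_{l+1} ∩ (I_1 ∪ ... ∪ I_l) ⊆ I_s."""
    for l in range(1, len(index_sets)):
        union_prev = set().union(*index_sets[:l])
        overlap = set(index_sets[l]) & union_prev
        if not any(overlap <= set(index_sets[s]) for s in range(l)):
            return False
    return True

# A chained pattern I_l = {l, l+1} (as in the Rosenbrock example of Sect. 4)
# satisfies the property:
chain = [{l, l + 1} for l in range(1, 10)]
assert running_intersection(chain)
# A pattern that violates it: I_3 meets both earlier blocks but lies in neither.
assert not running_intersection([{1, 2}, {3, 4}, {1, 3}])
print("running intersection property checks passed")
```

Note that the property depends on the ordering of the sets, which is why a "possible reordering" of the maximal cliques is mentioned above.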
Lemma 2.2
(Sparse version of Putinar's Positivstellensatz [18, 21]) Let \(f(x)=\sum _{l=1}^p f^l(x)\), \(g_j(x)=\sum _{\alpha }(g_j)_{\alpha }x^{\alpha }\), \(j=1,\ldots ,m\), and \(h_s(x)=\sum _{\alpha }(h_s)_{\alpha }x^{\alpha }\), \(s=1,\ldots ,q\), be polynomials on \(\mathbb {R}^n\) with degree d. Let \(I_l\) be a set of indices such that \(\mathrm{supp}f^l \subseteq I_l \subseteq \{1,\ldots ,n\}\), \(l=1,\ldots ,p\) and \(\bigcup _{l=1}^p I_l=\{1,\ldots ,n\}\). Suppose that for each \(j=1,\ldots ,m\) and \(s=1,\ldots ,q\), \(\mathrm{supp}g_j \cup \mathrm{supp}h_s \subseteq I_l\) for some \(l \in \{1,\ldots ,p\}\), and the running intersection property holds for \(I_l\), \(l=1,\ldots ,p\). Let \(K:=\{x:g_j(x) \le 0,j=1,\ldots ,m, h_s(x)=0, s=1,\ldots ,q\} \ne \emptyset \) and let \(\mathbf{M}(g;h)\) be Archimedean. If \(f(x)>0\) on K, then
$$\begin{aligned} f=\sum _{l=1}^p\left( \sigma _{0l}+\sum _{j=1}^m \sigma _{jl}(-g_j)+\sum _{s=1}^q \phi _{sl}h_s\right) , \end{aligned}$$
where \(\sigma _{jl}\), \(j=0,1,\ldots ,m\), are SOS polynomials with variables \(\{x_i:i \in I_l\}\), and \(\phi _{sl}\), \(s=1,\ldots ,q\), are real polynomials with variables \(\{x_i:i \in I_l\}\).
We note that the assumption "for each \(j=1,\ldots ,m\) and \(s=1,\ldots ,q\), \(\mathrm{supp}g_j \cup \mathrm{supp}h_s \subseteq I_l\) for some \(l \in \{1,\ldots ,p\}\)" and the running intersection property are automatically satisfied in the special case where \(p=1\) and \(I_1=\{1,\ldots ,n\}\). So, in this case, the sparse version of the Putinar Positivstellensatz reduces to the Putinar Positivstellensatz.
2.1 Coercivity of polynomials
We now recall the definition of coercive polynomials and two strengthened variants. These definitions play an important role in our analysis of positivity over noncompact sets later on.
Definition 2.2
(Coercivity and strong coercivity) Let \(f(x)=\sum _{\alpha }f_{\alpha }x^{\alpha }\) be a polynomial on \(\mathbb {R}^n\) with degree d. Let \(f(x)=\sum _{i=0}^d f_i(x)\) where each \(f_i\) is a homogeneous polynomial of degree i, \(i=0,1,\ldots ,d\). We say the polynomial f is
-
coercive if \(f(x) \rightarrow +\infty \) whenever \(\Vert x\Vert \rightarrow +\infty \);
-
s -strongly coercive for some \(s \in \{1,\ldots ,d\}\) if \(f_{s}(x)>0\) for all \(x \ne 0\) and \(f_{i}(x) \ge 0\) for all \(s+1 \le i \le d\) and \(x \in \mathbb {R}^n\);
-
strongly coercive if f is d-strongly coercive.
It follows from the definition that an s-strongly coercive polynomial, \(s=1,\ldots ,d\), must be coercive. On the other hand, the converse is not true. As an example, the 2-dimensional Rosenbrock function
$$\begin{aligned} f(x_1,x_2)=(1-x_1)^2+100(x_2-x_1^2)^2 \end{aligned}$$
is a coercive polynomial which is not s-strongly coercive for \(s=1,2,3,4\). We also note that it was shown in [11] that the strong coercivity can be numerically checked by solving semidefinite programming problems. Furthermore, any polynomial of the form \(\sum _{i=1}^na_ix_i^d+ \sum _{|\alpha |\le d-1}h_{\alpha }x^{\alpha }\) where \(|\alpha |=\sum _{i=1}^n \alpha _i\) with \(\alpha =(\alpha _1,\ldots ,\alpha _n) \in (\mathbb {N} \cup \{0\})^n\), \(d \in 2\mathbb {N}\) and \(a_i>0\), \(i=1,\ldots ,n\), is a strongly coercive polynomial.
The following proposition shows that a coercive polynomial is always level-bounded. Moreover, the corresponding bound for the lower level set can be computed using the coefficients of the underlying polynomial if the polynomial is assumed to be s-strongly coercive for some \(s=1,\ldots ,d\).
Proposition 2.1
(Boundedness of the lower level set via coercivity) Let \(f(x)=\sum _{\alpha }f_{\alpha }x^{\alpha }\) be a coercive polynomial on \(\mathbb {R}^n\) with degree d. Let \(f(x)=\sum _{i=0}^d f_i(x)\) where each \(f_i\) is a homogeneous polynomial with degree i, \(i=0,1,\ldots ,d\). Then, for each \(c \in \mathbb {R}\), the lower level set \(\{x:f(x) \le c\}\) is a compact set. Furthermore, if f is s-strongly coercive for some \(s=1,\ldots ,d\), then \(\{x: f(x) \le c\} \subseteq \overline{\mathbb {B}}(0,r)\) where
$$\begin{aligned} r=\max \left\{ 1, \displaystyle \frac{c+\sum _{0 \le |\alpha | \le s-1}|f_{\alpha }|}{\rho _{s}}\right\} \quad \text{ and } \quad \rho _s=\min \{f_{s}(v):\Vert v\Vert =1\}. \end{aligned}$$
Proof
Fix any \(c \in \mathbb {R}\). We first show that the lower level set \(\{x:f(x) \le c\}\) is a compact set. To see this, suppose on the contrary that there exists an unbounded sequence \(\{x_n\}_{n=1}^{\infty }\) with \(f(x_n) \le c\) for all n. By passing to a subsequence if necessary, we may assume that \(\Vert x_n\Vert \rightarrow +\infty \). As f is a coercive polynomial, we must have \(f(x_n) \rightarrow +\infty \). This contradicts the fact that \(f(x_n) \le c\) for all \(n \in \mathbb {N}\).
To see the second assertion, we assume that f is s-strongly coercive for some \(s=1,\ldots ,d\). Let \(a \in \mathbb {R}^n\) be any point such that \(f(a) \le c\). Then,
This gives us that either \(\Vert a\Vert \le 1\) or
where the second inequality follows from the fact that \(|x^{\alpha }| \le \Vert x\Vert ^{|\alpha |}\) for all \(\Vert x\Vert \ge 1\) and the last inequality is from the fact that \(\Vert x\Vert ^{|\alpha |} \le \Vert x\Vert ^{s-1}\) for all \(|\alpha | \le s-1\) and \(\Vert x\Vert \ge 1\). So, we have
and hence, the conclusion follows. \(\square \)
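To make Proposition 2.1 concrete, consider \(f(x_1,x_2)=x_1^4+x_2^4-3x_1\), which is 4-strongly coercive since its degree-4 part \(x_1^4+x_2^4\) is positive away from the origin. Here \(\rho _4=\min \{v_1^4+v_2^4:\Vert v\Vert =1\}=1/2\) and the only coefficient with \(0 \le |\alpha | \le 3\) is \(-3\), so for \(c=10\) the proposition gives \(r=\max \{1,(10+3)/(1/2)\}=26\). The Python sketch below (our own example, an empirical check rather than a proof) verifies the inclusion \(\{f \le c\} \subseteq \overline{\mathbb {B}}(0,r)\) on a grid.

```python
import math

def f(x1, x2):
    # f is 4-strongly coercive: its top homogeneous part f_4(x) = x1^4 + x2^4
    # is positive for x != 0, and there are no degree-3 terms.
    return x1**4 + x2**4 - 3*x1

c = 10.0
rho4 = 0.5                       # min of v1^4 + v2^4 over the unit circle
r = max(1.0, (c + 3.0) / rho4)   # bound from Proposition 2.1: r = 26

# Every grid point in the lower level set {f <= c} must lie in the ball of radius r.
N = 400
for i in range(N + 1):
    for j in range(N + 1):
        x1 = -5 + 10 * i / N
        x2 = -5 + 10 * j / N
        if f(x1, x2) <= c:
            assert math.hypot(x1, x2) <= r
print("level set {f <= 10} contained in the ball of radius", r)
```

In this example the actual level set is much smaller than the ball of radius 26; the proposition trades tightness for a bound computable directly from the coefficients.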
Corollary 2.1
Let \(f(x)=\sum _{\alpha }f_{\alpha }x^{\alpha }\) and \(g_j(x)=\sum _{\alpha }(g_j)_{\alpha }x^{\alpha }\), \(j=1,\ldots ,m\), be polynomials on \(\mathbb {R}^n\) with degree d.
-
(i)
If there exist \(\mu _j \ge 0\), \(j=0,1,\ldots ,m\), such that \(\mu _0 f+\sum _{j=1}^m \mu _j g_j\) is coercive, then, for each \(c \in \mathbb {R}\), the set \(\{x: g_j(x) \le 0, j=1,\ldots ,m, f(x) \le c\}\) is a compact set.
-
(ii)
If \(\mu _0 f+\sum _{j=1}^m \mu _j g_j\) is s-strongly coercive for some \(s\in \{1,\ldots ,d\}\), then
$$\begin{aligned} \{x: g_j(x) \le 0, \quad j=1,\ldots ,m, f(x) \le c\} \subseteq \overline{\mathbb {B}}(0,r), \end{aligned}$$where
$$\begin{aligned} r=\max \left\{ 1, \displaystyle \frac{\mu _0 c+\sum _{0 \le |\alpha | \le s-1}|\mu _0f_{\alpha }+\sum _{j=1}^m\mu _j(g_j)_{\alpha }|}{\rho _{s}}\right\} \end{aligned}$$and
$$\begin{aligned} \rho _s=\min \left\{ \left( \mu _0 f+\sum _{j=1}^m\mu _jg_j\right) _{s}(v):\Vert v\Vert =1\right\} . \end{aligned}$$
Proof
Note that \(\{x: g_j(x) \le 0, j=1,\ldots ,m, f(x) \le c\} \subseteq \{x: \mu _0 f(x)+\sum _{j=1}^m \mu _j g_j(x) \le \mu _0 c\}\). The conclusion follows by applying Proposition 2.1 with f replaced by \(\mu _0 f+\sum _{j=1}^m \mu _j g_j\). \(\square \)
3 A sparse hierarchy for optimization over unbounded sets
Consider the polynomial optimization problem
where \(f^l,g_j,h_s\) are (nonconvex) polynomials on \(\mathbb {R}^n\), \(l=1,\ldots ,p\), \(j=1,\ldots ,m\) and \(s=1,\ldots ,q\). Let the feasible set be denoted by K, that is, \(K=\{x: g_j(x) \le 0, j=1,\ldots ,m, h_s(x)=0, \, s=1,\ldots ,q\}\).
Let \(f(x)=\sum _{l=1}^p f^l(x)\), \(g_j(x)=\sum _{\alpha }(g_j)_{\alpha }x^{\alpha }\), \(j=1,\ldots ,m\) and \(h_s(x)=\sum _{\alpha } (h_s)_{\alpha } x^{\alpha }\), \(s=1,\ldots ,q\), be polynomials on \(\mathbb {R}^n\) with degree d. For simplicity, we assume here that the objective polynomials and the constraint polynomials have the same degree: one can always let d be the maximum of the degrees of f, \(g_j\), \(j=1,\ldots ,m\), and \(h_s\), \(s=1,\ldots ,q\), and regard f, \(g_j\) and \(h_s\) as polynomials of degree d.
Let \(I_l\) be a set of indices such that \(\mathrm{supp}f^l \subseteq I_l \subseteq \{1,\ldots ,n\}\), \(l=1,\ldots ,p\) and \(\bigcup _{l=1}^p I_l=\{1,\ldots ,n\}\). Let \(x^0 \in \{x: g_j(x) \le 0, j=1,\ldots ,m\}\) and let c be a number such that \(c>\max _{1 \le l \le p}\{f^l(x^0)\}\). For each integer k, we define the truncated sparse version of the quadratic module \(\bar{\mathbf {M}}_k\) by
$$\begin{aligned} \bar{\mathbf {M}}_k:=\left\{ \sum _{l=1}^p\left( \sigma _{0l}+\sum _{j=1}^m \sigma _{jl}(-g_j)+\bar{\sigma }_{l}(c-f^l)+\sum _{s=1}^q \phi _{sl}h_s\right) : \sigma _{0l},\sigma _{jl},\bar{\sigma }_{l} \in \Sigma ^2[\underline{x_l}], \ \phi _{sl} \in \mathbb {R}[\underline{x_l}], \right. \\ \left. \mathrm{deg}\,\sigma _{0l} \le 2k, \ \mathrm{deg}(\sigma _{jl}g_j) \le 2k, \ \mathrm{deg}(\bar{\sigma }_{l}(c-f^l)) \le 2k, \ \mathrm{deg}(\phi _{sl}h_s) \le 2k\right\} , \end{aligned}$$
where, for each \(l=1,\ldots ,p\), \(\Sigma ^2[\underline{x_l}]\) denotes the set of all SOS polynomials in the variables \(\{x_i: i \in I_l\}\) and \(\mathbb {R}[\underline{x_l}]\) denotes the set of all real polynomials in the variables \(\{x_i:i \in I_l\}\).
Consider the following relaxation problem
$$\begin{aligned} \bar{f}^*_k:=\sup \{\lambda \in \mathbb {R}: f-\lambda \in \bar{\mathbf {M}}_k\}. \end{aligned}$$
(3.1)
By construction, \(\bar{f}^*_k \le \bar{f}^*_{k + 1} \le \cdots \le \min (P).\) Note that, if we set \(\bar{\sigma }_l \equiv 0\), \(l=1,\ldots ,p\), then the hierarchy (3.1) reduces to the known sparse SDP hierarchy proposed in [17, 18].
Theorem 3.1
(Convergence of sparse SDP hierarchy) Let \(f(x)=\sum _{l=1}^p f^l(x)\), \(g_j(x)=\sum _{\alpha }(g_j)_{\alpha }x^{\alpha }\), \(j=1,\ldots ,m\) and \(h_s(x)=\sum _{\alpha } (h_s)_{\alpha } x^{\alpha }\), \(s=1,\ldots ,q\), be polynomials on \(\mathbb {R}^n\) with degree d. Let \(I_l\) be a set of indices such that \(\mathrm{supp}f^l \subseteq I_l \subseteq \{1,\ldots ,n\}\), \(l=1,\ldots ,p\) and \(\bigcup _{l=1}^p I_l=\{1,\ldots ,n\}\). Consider the SDP hierarchy (3.1) and denote its optimal value by \(\bar{f}_k^*\). Let \(K=\{x: g_j(x) \le 0, j=1,\ldots ,m, \, h_s(x)=0, \, s=1,\ldots ,q\}\), \(x^0 \in K\) and \(c>\max _{1 \le l \le p}\{f^l(x^0)\}\).
Suppose that for each \(j=1,\ldots ,m\) and \(s=1,\ldots ,q\), \(\mathrm{supp}g_j \cup \mathrm{supp}h_s \subseteq I_l \text{ for } \text{ some } l \in \{1,\ldots ,p\}\), and the running intersection property holds for \(I_l\), \(l=1,\ldots ,p\). If there exist \(\mu _{0l} \ge 0\), \(l=1,\ldots ,p\), \(\mu _j \ge 0\), \(j=1,\ldots ,m\), and \(\tau _s \in \mathbb {R}\), \(s=1,\ldots ,q\), such that \(\sum _{l=1}^p\mu _{0l} f^l+\sum _{j=1}^m \mu _j g_j+\sum _{s=1}^q \tau _s h_s\) is coercive, then \(\lim _{k \rightarrow \infty }\bar{f}_k^*=\min (P)\).
Proof
Let \(\epsilon >0\). Then, we have \(f-\min (P)+\epsilon >0\) over the feasible set K. As there exist \(\mu _{0l} \ge 0\), \(l=1,\ldots ,p\), \(\mu _j \ge 0\), \(j=1,\ldots ,m\) and \(\tau _s \in \mathbb {R}\), \(s=1,\ldots ,q\), such that \(\sum _{l=1}^p\mu _{0l} f^l+\sum _{j=1}^m \mu _j g_j+\sum _{s=1}^q \tau _s h_s\) is coercive, we see that \(\sum _{l=1}^p\mu _{0l} (f^l-c)+\sum _{j=1}^m \mu _j g_j+\sum _{s=1}^q \tau _s h_s\) is also coercive. It follows from Proposition 2.1 that \(\{x: \sum _{l=1}^p\mu _{0l} (c-f^l(x))+\sum _{j=1}^m \mu _j (-g_j)(x)+\sum _{s=1}^q (-\tau _s) h_s(x) \ge 0\}\) is compact. So, by definition, \(\mathbf{M}(-g_1,\ldots ,-g_m,c-f^1,\ldots ,c-f^p;h_1,\ldots ,h_q)\) is Archimedean. Note that \(f-\min (P)+\epsilon >0\) over \(\hat{K}\) where
$$\begin{aligned} \hat{K}:=\{x: g_j(x) \le 0, \ j=1,\ldots ,m, \ h_s(x)=0, \ s=1,\ldots ,q, \ f^l(x) \le c, \ l=1,\ldots ,p\}. \end{aligned}$$
Hence, by Lemma 2.2, we obtain that
$$\begin{aligned} f-\min (P)+\epsilon =\sum _{l=1}^p\left( \sigma _{0l}+\sum _{j=1}^m \sigma _{jl}(-g_j)+\bar{\sigma }_{l}(c-f^l)+\sum _{s=1}^q \phi _{sl}h_s\right) \end{aligned}$$
for some sum-of-squares polynomials \(\sigma _{0l},\ldots ,\sigma _{ml},\bar{\sigma }_{l}\) in the variables \(\{x_i:i \in I_l\}\) and real polynomials \(\phi _{sl}\) in the variables \(\{x_i:i \in I_l\}\), \(l=1,\ldots ,p\). Thus, for each \(\epsilon >0\), there exists \(k_0 \in \mathbb {N}\) such that \(\bar{f}_{k_0}^* \ge \min (P)-\epsilon \). On the other hand, from the construction of the hierarchy, we see that \(\bar{f}^*_k \le \bar{f}^*_{k + 1} \le \cdots \le \min (P).\) Therefore, the conclusion follows. \(\square \)
Remark 3.1
It is worth noting that, if a polynomial f on \(\mathbb {R}^n\) is convex in the sense that \(\nabla ^2 f(x)\) is positive semi-definite for all \(x \in \mathbb {R}^n\) and there exists \(x^* \in \mathbb {R}^n\) such that \(\nabla ^2 f(x^*)\) is positive definite, then it is coercive (see, for example, [12]). Therefore, the coercivity assumption of Theorem 3.1, namely that there exist \(\mu _{0l} \ge 0\), \(l=1,\ldots ,p\), \(\mu _j \ge 0\), \(j=1,\ldots ,m\), and \(\tau _s \in \mathbb {R}\), \(s=1,\ldots ,q\), such that \(\sum _{l=1}^p\mu _{0l} f^l+\sum _{j=1}^m \mu _j g_j+\sum _{s=1}^q \tau _s h_s\) is coercive, is satisfied if one of the polynomials \(f^l,g_j,h_s\), \(l=1,\ldots ,p\), \(j=1,\ldots ,m\), \(s=1,\ldots ,q\), is convex and has a positive definite Hessian at some point \(x^* \in \mathbb {R}^n\).
In the special case where \(p=1\), \(I_1=\{1,\ldots ,n\}\) and each \(h_s \equiv 0\), \(s=1,\ldots ,q\), the optimization problem (P) reduces to
$$\begin{aligned} \min _{x \in \mathbb {R}^n}\{f(x): g_j(x) \le 0, \ j=1,\ldots ,m\}. \end{aligned}$$
In this case, the sparse truncated quadratic module \(\bar{\mathbf {M}}_k\) reduces to the truncated quadratic module \(\mathbf {M}_k\) generated by the polynomials \(c - f\) and \(-g_1, \ldots , - g_m\) given by
$$\begin{aligned} \mathbf {M}_k:=\left\{ \sigma _0+\sum _{j=1}^m \sigma _j(-g_j)+\bar{\sigma }(c-f): \sigma _0,\sigma _j,\bar{\sigma } \in \Sigma ^2[\underline{x}], \right. \\ \left. \mathrm{deg}\,\sigma _0 \le 2k, \ \mathrm{deg}(\sigma _jg_j) \le 2k, \ \mathrm{deg}(\bar{\sigma }(c-f)) \le 2k\right\} , \end{aligned}$$
and the corresponding relaxation problem collapses to
$$\begin{aligned} \bar{f}^*_k:=\sup \{\lambda \in \mathbb {R}: f-\lambda \in \mathbf {M}_k\}. \end{aligned}$$
(3.2)
So, Theorem 3.1 collapses to the convergence result of the dense SDP hierarchy for polynomial optimization problems over noncompact sets proposed in [11].
Remark 3.2
(Locating a global minimizer) It is worth noting that, in addition to the assumptions of Theorem 3.1, if we further assume that problem (P) has a unique solution \(\bar{x}\), then we can also find the global minimizer \(\bar{x}\) by solving a sequence of semidefinite programming problems which are the dual problems of the equivalent SDP reformulation of (3.2). The details are omitted here and they can be found in [20, 21].
4 Numerical experiments
In this section, we show the effectiveness of the sparse SDP hierarchy proposed in Sect. 3 by solving some numerical test problems with unbounded feasible sets. All the numerical tests were conducted on a computer with a 2.8 GHz Intel Core i7 and 8 GB RAM, equipped with Matlab 7.14 (R2012a).
The purpose of the numerical experiments is to illustrate how our proposed sparse SDP hierarchy works for solving polynomial optimization problems with unbounded feasible sets. Therefore, we first select some known test problems whose objectives are nonconvex coercive polynomials, and minimize them over unbounded feasible sets. The selected test problems are:
-
(1)
a nonconvex quadratic programming problem with an unbounded feasible set, in which Lasserre’s hierarchy is known to fail;
-
(2)
the Rosenbrock function over the nonnegative orthant;
-
(3)
the Chained-wood function over the nonnegative orthant.
For the numerical tests, we first form the proposed sparse SDP hierarchy (3.1) for the above test problems. Then, we use the Matlab software SparsePOP [37] to solve the corresponding sparse SDP problems. Let \(f(x)=\sum _{l=1}^p f^l(x)\). We note that the proposed sparse SDP hierarchy (3.1) can be obtained from the existing sparse SDP hierarchy used in [18, 36, 37], designed for problems with compact feasible sets, by adding the additional constraints \(f^l(x) \le c\), \(l=1,\ldots ,p\), to the test problems with unbounded feasible sets, where c is appropriately chosen. SparsePOP can solve polynomial optimization problems exploiting the sparsity described in Sect. 3 by setting the parameter sparseSW = 1, and can also implement Lasserre's hierarchy by setting the parameter sparseSW = 0. In addition, a parameter called the relaxation order can be chosen in SparsePOP, depending on the degree of the polynomial optimization problem. The larger the relaxation order, the better the approximation to the optimal value of the polynomial optimization problem that can be expected.
In SparsePOP, the accuracy of an obtained objective value is computed by
$$\begin{aligned} \epsilon _{\mathrm{obj}}=\frac{|\mathtt{POP.objValue}-\mathtt{SDP.objValue}|}{\max \{1,|\mathtt{POP.objValue}|\}}, \end{aligned}$$
where POP.objValue means the value of the objective function of the polynomial optimization problem computed at a candidate for the optimal solution, and SDP.objValue means the value of the objective function of the SDP relaxation of the polynomial optimization problem. Moreover, the CPU time reported in the subsequent discussion is measured in seconds. For details, we refer to [37].
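Assuming the standard relative-error form used by SparsePOP (our reading of [37]; the function name below is ours), this accuracy measure can be computed as follows.

```python
def relative_obj_error(pop_obj, sdp_obj):
    # |POP.objValue - SDP.objValue| / max(1, |POP.objValue|):
    # a small value indicates that the SDP relaxation bound is tight
    # at the candidate solution.
    return abs(pop_obj - sdp_obj) / max(1.0, abs(pop_obj))

# e.g. a candidate objective value of 27.9629 against an SDP bound of 27.9628
print(relative_obj_error(27.9629, 27.9628))
```

The normalization by \(\max \{1,|\mathtt{POP.objValue}|\}\) keeps the measure meaningful when the optimal value is near zero.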
4.1 A 2-dimensional QP with an unbounded feasible set
Consider the following nonconvex quadratic optimization problem:
It was shown in [23, 26] that the global minimizers are \((\pm \frac{M+\sqrt{M^2+4}}{2},\pm 1)\) and the global minimum is \(1+\frac{2M^2+4+2M\sqrt{M^2+4}}{4}\). For instance, if \(M=5\), then the global minimizers are \((\pm 5.1926,\pm 1)\) and the global minimum is 27.9629.
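These closed-form expressions can be checked numerically. The Python sketch below (our own check, assuming the objective \(x_1^2+x_2^2\) stated later in this subsection) evaluates the objective at the stated minimizer for \(M=5\).

```python
import math

M = 5.0
x1 = (M + math.sqrt(M**2 + 4)) / 2.0   # first coordinate of the minimizer
x2 = 1.0
value = x1**2 + x2**2                  # objective of the QP in Sect. 4.1

assert abs(x1 - 5.1926) < 1e-4         # matches the stated minimizer
assert abs(value - 27.9629) < 1e-3     # matches the stated global minimum
print(round(value, 4))                 # prints 27.9629
```

The value agrees with \(1+(2M^2+4+2M\sqrt{M^2+4})/4\), since \(x_1^2=(2M^2+4+2M\sqrt{M^2+4})/4\).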
Let \(M=5\). It was shown in [26] that Lasserre's hierarchy only provides the lower bound 2, no matter how large the relaxation order k is chosen. As explained in [26], the main reason why Lasserre's hierarchy fails to achieve the global minimum of this quadratic problem is that the feasible set of this nonconvex quadratic problem is unbounded.
Notice that the objective \(f(x_1,x_2):=x_1^2+x_2^2\) is coercive. Thus, we can apply the proposed sparse hierarchy. Since the problem has only two variables without sparsity, we test the problem using our dense SDP hierarchy (3.2) with \(c=40\). Note that \(c>f(6,1)\) and the point (6, 1) is feasible for this quadratic optimization problem.
As shown in Table 1, by running SparsePOP with sparseSW = 0 and relaxation order 4, the relaxation problem in the dense hierarchy (3.2) solves the original problem and returns a good approximation to the true global minimizer (5.1926, 1.000). Table 1 summarizes the optimal values of the dense SDP hierarchy (3.2) with relaxation order \(k=2,3,4\), which illustrates the effectiveness of our approach.
4.2 The Rosenbrock function over nonnegative orthant
The Rosenbrock function is described as
$$\begin{aligned} f_R(x)=1+\sum _{i=2}^{n}\left( 100(x_i-x_{i-1}^2)^2+(1-x_i)^2\right) . \end{aligned}$$
Clearly, \(f_R\) is a SOS nonconvex polynomial, and is coercive. We add constraints to the Rosenbrock function to have a polynomial optimization problem with an unbounded region as follows:
It can be easily verified that this problem has a unique global minimizer \(\mathbf{e}_n:=(\underbrace{1,\ldots ,1}_n)\).
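As a quick sanity check, assuming the SparsePOP-style form \(f_R(x)=1+\sum _{i=2}^n(100(x_i-x_{i-1}^2)^2+(1-x_i)^2)\), in which \(f_R-1\) is a sum of squares, the following Python sketch confirms that \(f_R(\mathbf{e}_n)=1\) and that sampled nonnegative points give no smaller value.

```python
import random

def f_R(x):
    # Rosenbrock test function, SparsePOP-style normalization (our assumption):
    # f_R(x) = 1 + sum_{i=2}^n (100(x_i - x_{i-1}^2)^2 + (1 - x_i)^2),
    # so f_R - 1 is a sum of squares and f_R >= 1 everywhere.
    n = len(x)
    return 1 + sum(100*(x[i] - x[i-1]**2)**2 + (1 - x[i])**2 for i in range(1, n))

n = 1000
e_n = [1.0] * n
assert f_R(e_n) == 1.0        # the value at the global minimizer e_n

random.seed(0)
for _ in range(100):
    x = [random.uniform(0, 2) for _ in range(n)]
    assert f_R(x) >= 1.0      # consistent with f_R - 1 being SOS
print("f_R(e_n) =", f_R(e_n))
```

Over the nonnegative orthant, \(f_R-1\) vanishes only at \(\mathbf{e}_n\) (the squares force \(x_i=1\) for \(i \ge 2\) and \(x_1^2=x_2\), hence \(x_1=1\) when \(x_1 \ge 0\)), consistent with the uniqueness claim.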
Let \(I_l=\{l,l+1\}\), \(l=1,\ldots ,n-1\) and \(g_j(x)=x_j\), \(j=1,\ldots ,n\). Then, the assumptions in Theorem 3.1 are satisfied as
and for each \(l=1,\ldots ,n-1\),
Therefore, according to Theorem 3.1, the optimal value of the proposed sparse SDP hierarchy (3.1) converges to the global minimum of \((EP_R)\).
We now test our proposed sparse SDP hierarchy (3.1) on the global optimization problem \((EP_R)\) for different dimensions n with \(c=2\). From Table 2, we see that for \(n=10{,}000\) or \(n=20{,}000\), SparsePOP returns an accurate solution with relaxation order 2 in the proposed sparse SDP hierarchy (3.1).
When the dimension n of the problem is large, directly applying the dense SDP hierarchy proposed in [11] without exploiting sparsity leads to very large SDP problems which cannot be handled by an SDP solver such as SeDuMi [34]. Indeed, we confirmed in our numerical computation that the dense SDP hierarchy proposed in [11] can only be used to solve \((EP_R)\) up to dimension 20; problems of dimension larger than 20 resulted in out-of-memory errors.
4.3 Chained-wood function over nonnegative orthant
Let \(n \in 4\mathbb {N}\), where \(\mathbb {N}\) denotes the set of positive integers. Consider the Chained-wood function given by
$$\begin{aligned} f_C(x)=1+\sum _{j \in J}\big (100(x_{j+1}-x_j^2)^2+(1-x_j)^2+90(x_{j+3}-x_{j+2}^2)^2 \\ +(1-x_{j+2})^2+10(x_{j+1}+x_{j+3}-2)^2+0.1(x_{j+1}-x_{j+3})^2\big ), \end{aligned}$$
where \(J=\{1,3,5,\ldots ,n-3\}\). The function \(f_C\) is a SOS nonconvex polynomial, and is coercive. Adding nonnegative constraints for the variables \(x_i, \ i=1,\ldots ,n\) results in the following polynomial optimization problem:
Clearly, the feasible set of this polynomial optimization is unbounded, and this problem has a unique global minimizer \(\mathbf{e}_n:=(\underbrace{1,\ldots ,1}_n)\).
Note that \(f_C\) can be equivalently rewritten as
For each \(l=1,\ldots ,\frac{n}{2}-1\), let \(I_l=\{2l-1,2l,2l+1,2l+2\}\) and \(g_j(x)=x_j,\) \(j=1,\ldots ,n\). Then, the assumptions in Theorem 3.1 are satisfied: for each \(j=1,\ldots ,n\), \(\mathrm{supp} g_j=\{j\} \subseteq I_{l_j}\) with \(l_j=\max \{1,\lceil \frac{j-2}{2}\rceil \} \in \{1,\ldots ,\frac{n}{2}-1\}\), and for each \(l=1,\ldots ,\frac{n}{2}-1\),
Therefore, Theorem 3.1 implies that the optimal value of the proposed sparse SDP hierarchy (3.1) converges to the global minimum of \((EP_C)\).
We now test our proposed sparse SDP hierarchy (3.1) on \((EP_C)\) for different values of n with \(c=n+1\). From Table 3, we see that for dimension \(n=10{,}000\) or \(n=20{,}000\), SparsePOP with relaxation order 2 returns an accurate solution in the proposed sparse SDP hierarchy (3.1).
The numerical experiment using the dense SDP hierarchy proposed in [11] could solve \((EP_C)\) only up to \(n=10\). We observe again that much larger problems can be solved by the sparse SDP hierarchy.
4.4 Further examples
In this part, for further illustration, we consider the following polynomial optimization problem:
where \(I_l \subseteq \{1,\ldots ,n\}\), \(l=1,\ldots ,q\), \(\bigcup _{l=1}^qI_l=\{1,\ldots ,n\}\), \(n_l=|I_l|\), that is, \(n_l\) is the cardinality of \(I_l\), \(l=1,\ldots ,q\), \(I_l=\{i_1,\ldots ,i_{n_l}\}\) and \(g_j^l\) are polynomials, \(j=1,\ldots ,m, \, l=1,\ldots ,q\).
Note that problem \((P_0)\) is a polynomial optimization reformulation of the sparse optimization problem with polynomial inequality constraints. Problems of this kind arise in signal processing and statistics [3, 4] and have received a great deal of attention in the literature. For a recent comprehensive survey, see [2]. Other related references can be found in [33]. The relationship between problem \((P_0)\) and the sparse optimization problem is briefly outlined in the Appendix.
We also note that there are other efficient approaches for solving problem \((P_0)\) (or the corresponding equivalent sparse optimization problem), such as the branch and bound approach and techniques for mathematical programs with equilibrium constraints (see, for example, [6]). Due to the limitations of current SDP solvers, our approach cannot compete with these ad hoc but numerically tractable approaches. The purpose of this section is to illustrate that our proposed sparse SDP hierarchy can be used to solve instances of problem \((P_0)\) of large size, where the usual dense SDP hierarchy may not be directly applicable.
We now introduce an SDP hierarchy for problem \((P_0)\) using the results in Sect. 3. It is worth noting that problem \((P_0)\) is a minimization problem with variables \((x,w)\in \mathbb {R}^n \times \mathbb {R}^n\). Thus, the feasible set of \((P_0)\) can be unbounded in general (for example, simply take \(g_j\equiv 0\); then \((x,w)=(0,w)\) is feasible for problem \((P_0)\) for any \(w \in \mathbb {R}^n\)).
Let \(f(x,w)=\sum _{l=1}^q f_l(x^l,w^l)\), where \(f_l(x^l,w^l)=\sum _{i \in I_l}(w_{i}^l)^2\), \(x=(x^{1},\ldots ,x^{q}) \in \mathbb {R}^{\sum _{l=1}^q n_l}\) and \(w=(w^1,\ldots ,w^{q}) \in \mathbb {R}^{\sum _{l=1}^q n_l}\). For each \(l=1,\ldots ,q\), define \(\bar{g}_{il}(x^l,w^l)=(1-w_i^l)x_i^l\), \(i \in I_l\), and \(G_l(x^l,w^l):=\Vert x^l\Vert _2^2-M^l\). Let c be a number such that \(c>n_l\), \(l=1,\ldots ,q\). For each integer k, we define the following sparse truncated quadratic module \(\hat{\mathbf {M}}_k\) generated by the polynomials \(c - f_l\), \(-g_j^l\), \(j=1,\ldots ,m\), \(\bar{g}_{il}\), \(i \in I_l\), and \(G_l\), \(l=1,\ldots ,q\), as follows:
Consider the following relaxation problem
$$\begin{aligned} f^*_k:=\sup \{\lambda \in \mathbb {R}: f-\lambda \in \hat{\mathbf {M}}_k\}. \end{aligned}$$
(4.2)
Then, one can show that \(\lim _{k \rightarrow \infty }f_k^*=\min (P_0)\) if the running intersection property holds.
Proposition 4.1
(Convergence of the sparse SDP hierarchy for \((P_0)\)) For problem \((P_0)\) and the SDP hierarchy (4.2), assume that the running intersection property holds for \(I_l\), \(l=1,\ldots ,q\). Then, \(\lim _{k \rightarrow \infty }f_k^*=\min (P_0)\).
Proof
Note that the problem \((P_0)\) can be equivalently rewritten as
As \(\bigcup _{l=1}^qI_l=\{1,\ldots ,n\}\), \(\sum _{l=1}^qf_l(x^l,w^l)+\sum _{l=1}^q G_l(x^l,w^l)\) is strongly coercive. So, the assumptions in Theorem 3.1 are satisfied. Therefore, Theorem 3.1 implies that \(\lim _{k \rightarrow \infty }f_k^*=\min (P_0)\). \(\square \)
We now illustrate the SDP hierarchy using a numerical example. For the numerical test on the effectiveness of the proposed SDP hierarchy, we let \(q \in \mathbb {N}\). For each \(l=1,\ldots ,q\), we generate a 3-by-4 matrix \(A_l\) containing random values drawn from the uniform distribution on the interval [0, 1] and define \(b_l=4 A_l(:,1)\), where \(A_l(:,1)\) denotes the first column of \(A_l\). Let \(I_l=\{4l-3,4l-2,4l-1,4l\}\), \(l=1,\ldots ,q\). Denote \(x^l=(x^l_i)_{i \in I_l} \in \mathbb {R}^4\) and \(w^l=(w^l_i)_{i \in I_l} \in \mathbb {R}^4\), \(l=1,\ldots ,q\). Consider the problem:
It is not hard to see that the optimal solution of (EP) is \((x^*,w^*)\) with optimal value q, where \(x^*:=[x^{1*},\ldots ,x^{q*}]\) with \(x^{l*}=[4,0,0,0]\) and \(w^*:=[w^{1*},\ldots ,w^{q*}]\) with \(w^{l*}=[1,0,0,0]\), \(l=1,\ldots ,q\). Moreover, it can be verified that the running intersection property holds. So, the optimal value of the proposed sparse SDP hierarchy (4.2) converges to the optimal value of (EP).
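Assuming the block constraints take the form \(A_lx^l=b_l\) together with \((1-w_i^l)x_i^l=0\) (our reading of the construction above), the Python sketch below verifies that the stated point \((x^*,w^*)\) is feasible and attains the objective value q.

```python
import random

random.seed(1)
q = 5
total = 0.0
for _ in range(q):
    # Random 3-by-4 block A_l with entries uniform on [0, 1]; b_l = 4 * A_l(:, 1).
    A = [[random.random() for _ in range(4)] for _ in range(3)]
    b = [4.0 * row[0] for row in A]

    x_star = [4.0, 0.0, 0.0, 0.0]   # candidate block solution x^{l*}
    w_star = [1.0, 0.0, 0.0, 0.0]   # candidate block solution w^{l*}

    # A_l x^{l*} = 4 A_l(:,1) = b_l, so the linear constraints hold exactly.
    Ax = [sum(A[i][j] * x_star[j] for j in range(4)) for i in range(3)]
    assert all(abs(Ax[i] - b[i]) <= 1e-12 for i in range(3))
    # Complementarity (1 - w_i) x_i = 0 holds componentwise.
    assert all(abs((1.0 - w_star[j]) * x_star[j]) <= 1e-12 for j in range(4))

    total += sum(w * w for w in w_star)   # block contribution to the objective

assert total == q   # objective value of (x*, w*) is q
print("objective at (x*, w*):", total)
```

This only confirms that \((x^*,w^*)\) is feasible with value q; optimality follows from the structure of the blocks, since each block must use at least one nonzero component of \(x^l\) to satisfy \(A_lx^l=b_l\).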
As shown in Table 4, for \(q=200\) and \(q=1000\), SparsePOP with relaxation order 2, \(c=5\) and \(M=100\) returns an accurate optimal value and an accurate optimal solution \(x^*\) in the proposed sparse SDP hierarchy (4.2).
We also notice that the numerical experiment using the dense SDP hierarchy proposed in [11] could solve the above problem only up to \(q=5\) blocks. We observe again that much larger problems can be solved by the sparse SDP hierarchy.
Finally, we note that an optimal solution of (EP) can be obtained with the solver SparsePOP. For example, consider problem (EP) with \(q=200\),
Then the true unique solution of this problem is \((x^*,w^*)\), where \(x^*:=[x^{1*},\ldots ,x^{q*}]\) with \(x^{l*}=[4,0,0,0]\) and \(w^*:=[w^{1*},\ldots ,w^{q*}]\) with \(w^{l*}=[1,0,0,0]\), \(l=1,\ldots ,200\). In our numerical experiment, SparsePOP with relaxation order 2 and \(c=5\) returns the accurate optimal value 200 and an accurate optimal solution \((x^*,w^*)\) for the proposed sparse SDP hierarchy (4.2), as shown in Table 5.
References
Andersen, M., Vandenberghe, L., Dahl, J.: Linear matrix inequalities with chordal sparsity patterns and applications to robust quadratic optimization. In: Proceedings of IEEE International Symposium on Computer-Aided Control System Design (CACSD) (2010)
Bruckstein, A.M., Donoho, D.L., Elad, M.: From sparse solutions of systems of equations to sparse modeling of signals and images. SIAM Rev. 51(1), 34–81 (2009)
Candès, E.J., Tao, T.: Decoding by linear programming. IEEE Trans. Inf. Theory 51, 4203–4215 (2005)
Candès, E.J., Wakin, M.B., Boyd, S.: Enhancing sparsity by reweighted \(l_1\) minimization. J. Fourier Anal. Appl. 14(5), 877–905 (2008)
Demmel, J., Nie, J.W.: Sparse SOS relaxations for minimizing functions that are summations of small polynomials. SIAM J. Optim. 19(4), 1534–1558 (2008)
Feng, M., Mitchell, J.E., Pang, J.S., Shen, X., Wachter, A.: Complementarity formulations of \(l_0\)-norm optimization problems. http://www.optimization-online.org/DB_HTML/2013/09/4053.html
Feng, C., Lagoa, C., Sznaier, M.: Hybrid system identification via sparse polynomial optimization. In: American Control Conference (ACC), pp. 160–165 (2010)
Fukuda, M., Kojima, M., Murota, K., Nakata, K.: Exploiting sparsity in semidefinite programming via matrix completion I: general framework. SIAM J. Optim. 11, 647–674 (2000)
Ghaddar, B., Marecek, J., Mevissen, M.: Optimal power flow as a polynomial optimization problem. IEEE Trans. Power Syst. 99, 1–8 (2015)
Hà, H.V., Phạm, T.S.: Representations of positive polynomials and optimization on noncompact semi-algebraic sets. SIAM J. Optim. 20, 3082–3103 (2010)
Jeyakumar, V., Lasserre, J.B., Li, G.: On polynomial optimization over non-compact semi-algebraic sets. J. Optim. Theory Appl. 163(3), 707–718 (2014)
Jeyakumar, V., Li, G., Pham, T.S.: Convergence of the Lasserre hierarchy of SDP relaxations for convex polynomial programs without compactness. Oper. Res. Lett. 42, 34–40 (2014)
Jeyakumar, V., Li, G.: A new class of alternative theorems for SOS-convex inequalities and robust optimization. Appl. Anal. 94, 56–74 (2015)
Jeyakumar, V., Li, G.: Exact SDP relaxations for classes of nonlinear semidefinite programming problems. Oper. Res. Lett. 40, 529–536 (2012)
Kleniati, P., Parpas, P., Rustem, B.: Partitioning procedure for polynomial optimization. J. Glob. Optim. 48, 549–567 (2010)
Kim, S., Kojima, M., Mevissen, M., Yamashita, M.: Exploiting sparsity in linear and nonlinear matrix inequalities via positive semidefinite matrix completion. Math. Program. 129, 33–68 (2011)
Kim, S., Kojima, M., Waki, H., Yamashita, M.: Algorithm 920: SFSDP: a sparse version of full semidefinite programming relaxation for sensor network localization problems. ACM Trans. Math. Softw. 38, 1–19 (2012)
Kojima, M., Muramatsu, M.: A note on sparse SOS and SDP relaxations for polynomial optimization problems over symmetric cones. Comput. Optim. Appl. 42(1), 31–41 (2009)
Lasserre, J.B.: Global optimization with polynomials and the problem of moments. SIAM J. Optim. 11, 796–817 (2001)
Lasserre, J.B.: Moments, Positive Polynomials and Their Applications. Imperial College Press, London (2009)
Lasserre, J.B.: Convergent SDP-relaxations in polynomial optimization with sparsity. SIAM J. Optim. 17, 822–843 (2006)
Lu, W.-S., Hinamoto, T.: Design of FIR filters with discrete coefficients via polynomial programming: towards the global solution. In: IEEE International Symposium on Circuit and Systems, pp. 2048–2051 (2007)
Luo, Z.-Q., Sidiropoulos, N.D., Tseng, P., Zhang, S.: Approximation bounds for quadratic optimization with homogeneous quadratic constraints. SIAM J. Optim. 18, 1–28 (2007)
Marshall, M.: Positive Polynomials and Sums of Squares, Mathematical Surveys and Monographs, vol. 146. American Mathematical Society, Providence (2008)
Nie, J.W.: Sum of squares method for sensor network localization. Comput. Optim. Appl. 43(2), 151–179 (2009)
Nie, J.W.: An exact Jacobian SDP relaxation for polynomial optimization. Math. Program. 137(1–2), 225–255 (2013)
Nie, J.: Optimality conditions and finite convergence of Lasserre’s hierarchy. Math. Program. 146, 97–121 (2014)
Parrilo, P.A.: Structured semidefinite programs and semi-algebraic geometry methods in robustness and optimization, Ph.D. thesis, California Institute of Technology (2000)
Putinar, M.: Positive polynomials on compact semi-algebraic sets. Indiana Univ. Math. J. 42, 969–984 (1993)
Scheiderer, C.: Sums of squares on real algebraic curves. Math. Z. 245, 725–760 (2003)
Schmüdgen, K.: The K-moment problem for compact semi-algebraic sets. Math. Ann. 289, 203–206 (1991)
Schweighofer, M.: Global optimization of polynomials using gradient tentacles and sums of squares. SIAM J. Optim. 17, 920–942 (2006)
Foucart, S., Rauhut, H.: A Mathematical Introduction to Compressive Sensing. Applied and Numerical Harmonic Analysis. Birkhäuser/Springer, New York (2013)
Sturm, J.F.: Using SeDuMi 1.02, a MATLAB toolbox for optimization over symmetric cones. Optim. Methods Softw. 11–12, 625–653 (1999)
Vandenberghe, L., Andersen, M.: Chordal graphs and semidefinite optimization. Found. Trends Optim. 1, 241–433 (2014)
Waki, H., Kim, S., Kojima, M., Muramatsu, M.: Sums of squares and semidefinite programming relaxations for polynomial optimization problems with structured sparsity. SIAM J. Optim. 17, 218–242 (2006)
Waki, H., Kim, S., Kojima, M., Muramatsu, M., Sugimoto, H.: Algorithm 883: SparsePOP: a sparse semidefinite programming relaxation of polynomial optimization problems. ACM Trans. Math. Softw. 35, 15 (2008)
The first and the fourth author were partially supported by a Grant from Australian Research Council. The second author was supported by NRF 2012-R1A1A2-038982 and NRF 2014-R1A2A1A11049618. The third author was supported by Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2013R1A1A2005378).
Appendix
The polynomial optimization problem \((P_0)\) is closely related to the problem of finding a solution with the least number of nonzero components to a system of polynomial inequalities and simple bounds. Mathematically, this problem can be formulated as
where \(n_l \in \mathbb {N}\), \(l=1,\ldots ,q\), and \(\Vert x\Vert _0\) denotes the \(l_0\)-seminorm of the vector \(x \in \mathbb {R}^n\), which gives the number of nonzero components of the vector x.
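For concreteness, the \(l_0\)-seminorm simply counts nonzero entries (it is a seminorm only in an informal sense, since it fails positive homogeneity); a minimal NumPy sketch:

```python
import numpy as np

def l0_seminorm(x):
    """||x||_0: the number of nonzero components of the vector x.

    Note: not a true norm, since ||t * x||_0 = ||x||_0 for any t != 0.
    """
    return int(np.count_nonzero(x))

assert l0_seminorm(np.array([4.0, 0.0, 0.0, 0.0])) == 1
```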
In the case where \(q=1\), \(I_1=\{1,\ldots ,n\}\), \(g_j(x)=a_j^Tx-b_j\), \(j=1,\ldots ,m\), and \(g_{j}(x)=-(a_j^Tx-b_j)\), \(j=m+1,\ldots ,2m\), the problem \((P_0')\) collapses to the sparse optimization problem of finding the solution with the least number of nonzero components satisfying simple bounds as well as the linear equations \(Ax=b\) with more unknowns than equations:
where \(A=(a_1,\ldots ,a_m)^T \in \mathbb {R}^{m \times n}\) \((m \le n)\), \(b=(b_1,\ldots ,b_m)^T \in \mathbb {R}^m\). We note that the standard sparse optimization problem which is given by
arises in signal processing and was examined in, for example, [2–4, 33]. Moreover, problems \((P_1)\) and \((P_2)\) have the same optimal value whenever \(M> \Vert x^*\Vert _2^2\) for some solution \(x^*\) of problem \((P_2)\).
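Although minimizing \(\Vert x\Vert _0\) subject to \(Ax=b\) is NP-hard in general, the problem \((P_2)\) can be illustrated on tiny instances by enumerating candidate supports. The following brute-force sketch is purely illustrative (it is not how the SDP hierarchy proceeds, and the small system is our own toy example):

```python
from itertools import combinations
import numpy as np

def sparsest_solution(A, b, tol=1e-9):
    """Brute-force solve min ||x||_0 s.t. Ax = b by enumerating supports.

    Exponential in n; only for illustrating problem (P_2) on tiny instances.
    """
    m, n = A.shape
    for k in range(n + 1):                     # try supports of growing size
        for support in combinations(range(n), k):
            S = list(support)
            x = np.zeros(n)
            if S:
                # least squares restricted to the candidate support
                x_S, *_ = np.linalg.lstsq(A[:, S], b, rcond=None)
                x[S] = x_S
            if np.linalg.norm(A @ x - b) <= tol:
                return x                       # first hit has minimal ||x||_0
    return None

# A 2x3 system with more unknowns than equations:
A = np.array([[1.0, 2.0, 1.0], [0.0, 1.0, 1.0]])
b = np.array([2.0, 2.0])
x = sparsest_solution(A, b)
```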
In fact, the problem \((P_0)\) and problem \((P_0')\) are equivalent in the sense that \(\min (P_0)=\min (P_0')\) and \((x^{1*}, \ldots , x^{q*})\in \mathbb {R}^{n_1} \times \cdots \times \mathbb {R}^{n_q}\) is a solution of problem \((P_0)\) if and only if \((x^{l*},w^{l*}) \in \mathbb {R}^{n_l} \times \mathbb {R}^{n_l}\), \(l=1,\ldots ,q\), is a solution of problem \((P_0')\) where \(w^{l*}=(w^{l*}_{i_1},\ldots ,w^{l*}_{i_{n_l}}) \in \mathbb {R}^{n_l}\) is defined by
Jeyakumar, V., Kim, S., Lee, G.M. et al. Semidefinite programming relaxation methods for global optimization problems with sparse polynomials and unbounded semialgebraic feasible sets. J Glob Optim 65, 175–190 (2016). https://doi.org/10.1007/s10898-015-0356-6