1 Introduction

Copositive programming (CP) and completely positive programming (CPP) relaxations [15] for quadratic optimization problems (QOPs) have attracted considerable attention in recent years. Burer [4] showed that binary and continuous nonconvex QOPs with linear constraints can be represented as CPPs, and Eichfelder and Povh [5] extended this result to QOPs with an additional constraint \({\mathbf {u}}\in D\) on the variable vector \({\mathbf {u}}\), where D is a closed (not necessarily convex) set. More recently, it was shown in [1] that a QOP model with quadratic constraints can be reformulated as a CPP under the hierarchy of copositivity and the zeros at infinity conditions. In [6], QOPs with linear and complementarity constraints were shown to admit CPP formulations. All of these results show that the proposed CP and CPP relaxations are exact for the given QOP; that is, the optimal value of the CPP relaxation coincides with that of the given QOP.

For polynomial optimization problems (POPs), the semidefinite programming (SDP) relaxations proposed in [7] have been very popular as solution methods. Since CPP relaxations are stronger than SDP relaxations for QOPs, it is natural to ask whether the results on CP and CPP relaxations for QOPs can be extended to a class of POPs. Peña et al. [8] proposed a canonical convexification procedure for POPs under the hierarchy of copositivity and the zeros at infinity conditions, and in [9] formulated a class of POPs as an equivalent conic program over the cone of completely positive tensors under certain conditions.

The main goal of this paper is to propose the moment cone relaxation for a class of POPs as an extension of the CPP relaxation given in [1], and to show, under certain conditions, that the optimal value of the POP coincides with that of its moment cone relaxation.

The POP considered in this paper is quite general in that it covers various types of QOPs and POPs. We refer to [1, 3, 4, 9] for QOPs and POPs that can be transformed into the form studied here while satisfying the conditions required for equivalence to the moment cone relaxation.

In Sect. 2, we summarize the notation and describe the POP considered in this paper. The illustrative example introduced in Sect. 2 is used throughout for a better understanding of the discussion. The main results, showing the equivalence of the optimal value of the POP and that of its moment cone relaxation, are stated in Sect. 3 and proved in Sect. 4. In Sect. 5, we describe how to transform a general POP into the form considered in this paper and discuss similarities and differences between the proposed moment cone relaxation and the completely positive reformulation [9]. We conclude in Sect. 6.

2 Preliminaries

2.1 Notation and Symbols

Let \({\mathbb {R}}\) denote the set of real numbers, \({\mathbb {R}}_+\) the set of nonnegative real numbers, and \({\mathbb {Z}}_+\) the set of nonnegative integers. We denote the ith coordinate unit vector by \({\mathbf {e}}_i \in {\mathbb {R}}^n\) and the vector of all ones by \(\mathbf{1} \in {\mathbb {R}}^n\). Let \(\left| \varvec{\beta }\right| _1 = \sum _{i=1}^n \beta _i\) for each \(\varvec{\beta }\in {\mathbb {Z}}^n_+\). \({\mathbb {R}}[\varvec{x}]\) is the set of real-valued multivariate polynomials in n variables \(x_1,\ldots ,x_n \in {\mathbb {R}}\). A polynomial \(f \in {\mathbb {R}}[\varvec{x}]\) is represented as \(f(\varvec{x}) = \sum _{\varvec{\varvec{\beta }}\in {{\mathcal {H}}}} f_{\varvec{\varvec{\beta }}} \varvec{x}^{\varvec{\varvec{\beta }}}\), where \({\mathcal {H}}\subset {\mathbb {Z}}^n_+\) is a nonempty finite set, \(f_{\varvec{\varvec{\beta }}}\;(\varvec{\beta }\in {\mathcal {H}})\) are real coefficients, \(\varvec{x}^{\varvec{\varvec{\beta }}} = x_1^{\beta _1}x_2^{\beta _2} \cdots x_n^{\beta _n}\), and \(\varvec{\beta }= (\beta _1,\beta _2,\ldots ,\beta _n) \in {\mathbb {Z}}^n_+\). We note that if \(\mathbf{0} \in {\mathcal {H}}\), then \(\varvec{x}^{{\mathbf{0}}} = 1\) for any \(\varvec{x}\in {\mathbb {R}}^n\), and \(f_{{\mathbf{0}}}\varvec{x}^{{\mathbf{0}}}\) represents the constant term \(f_{{\mathbf{0}}}\) of the polynomial \(f \in {\mathbb {R}}[\varvec{x}]\). The support and the degree of f are defined by \(\text{ supp }(f) := \{ \varvec{\beta }\in {\mathcal {H}}: f_{\varvec{\varvec{\beta }}} \not = 0 \} \subset {\mathbb {Z}}^n_+\) and \(\text{ deg }(f) := \max \{ \left| \varvec{\beta }\right| _1 : \varvec{\beta }\in \text{ supp }(f) \},\) respectively. For a nonempty finite subset \({\mathcal {H}}\) of \({\mathbb {Z}}^n_+\), \(\left| {\mathcal {H}}\right| \) stands for the number of elements of \({\mathcal {H}}\).
\({\mathbb {R}}[\varvec{x},{\mathcal {H}}] := \{ f \in {\mathbb {R}}[\varvec{x}] : \text{ supp }(f) \subset {\mathcal {H}}\}\). Let \({\mathbb {R}}^{{{\mathcal {H}}}}\) denote the \(\left| {\mathcal {H}}\right| \)-dimensional Euclidean space whose coordinates are indexed by \(\varvec{\beta }\in {\mathcal {H}}\). For \(A \subset {\mathbb {R}}^{{{\mathcal {H}}}}\), conv A denotes the convex hull of A, cone A the cone generated by A and closure A the closure of A; hence, closure conv A is the closure of the convex hull of A. For the definitions of cone A and closure A, we refer to [10]. Each vector of \({\mathbb {R}}^{{{\mathcal {H}}}}\) with elements \(z_{\varvec{\varvec{\beta }}}\;(\varvec{\beta }\in {\mathcal {H}})\) is denoted as \((z_{\varvec{\varvec{\beta }}} : {\mathcal {H}})\). We assume that \((z_{\varvec{\varvec{\beta }}} : {\mathcal {H}})\) is a column vector when it is multiplied by a matrix. If \(\varvec{x}\in {\mathbb {R}}^n\), \((\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}})\) denotes the \(\left| {\mathcal {H}}\right| \)-dimensional (column) vector with elements \(z_{\varvec{\varvec{\beta }}} = \varvec{x}^{\varvec{\varvec{\beta }}}\;(\varvec{\beta }\in {\mathcal {H}})\). We frequently write a polynomial \(f \in {\mathbb {R}}[\varvec{x},{\mathcal {H}}]\) as \(f(\varvec{x}) = (f_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}})\) for some \((f_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {R}}^{{{\mathcal {H}}}}\), where \( (f_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}})\) denotes the inner product \(\sum _{\varvec{\varvec{\beta }}\in {{\mathcal {H}}}} f_{\beta } \varvec{x}^{\varvec{\varvec{\beta }}}\) of \( (f_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {R}}^{{{\mathcal {H}}}}\) and \( (\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {R}}^{{{\mathcal {H}}}}\).

2.2 Polynomial Optimization Problems

Let \({\mathbb {R}}[\varvec{x}]\) be the set of real-valued multivariate polynomials in n variables \(x_1,\ldots ,x_n \in {\mathbb {R}}\), where \(\varvec{x}= (x_1,x_2,\ldots ,x_n) \in {\mathbb {R}}^n\). As a theoretical framework for the moment cone relaxation, we consider the following POP:

$$\begin{aligned} \begin{array}{llll} \text{ minimize }&\quad \psi (\varvec{x})&\text{ subject } \text{ to }&\quad h_0(\varvec{x}) = 1, \, h_j(\varvec{x}) = 0 \; (j \in J), \, \varvec{x}\in {\mathbb {L}}, \end{array} \end{aligned}$$
(1)

where \( J = \{1,\ldots ,\ell \}, \, J_0 = \{ 0 \} \bigcup J = \{0,1,\ldots ,\ell \}, \psi , \, h_j \in {\mathbb {R}}[\varvec{x}] \; (j \in J_0) \) and \({\mathbb {L}}\) is a closed (not necessarily convex) cone in \({\mathbb {R}}^n\). This model is an extension of the standard QOP model [3] of minimizing a quadratic form over the simplex \(\left\{ \varvec{x}\in {\mathbb {R}}^n : \varvec{x}\ge \mathbf{0}, \left( \sum _{i=1}^n x_i \right) ^2 = 1 \right\} \) and the QOP model studied in [1, 2]. We assume throughout the paper that

$$\begin{aligned} \psi , \,h_j \in {\mathbb {R}}[\varvec{x}] \,(j \in J_0)\quad \text{ are } \text{ homogeneous } \text{ polynomials } \text{ with } \text{ degree } \tau \ge 1. \end{aligned}$$
(2)

Here \(f \in {\mathbb {R}}[\varvec{x}]\) is called a homogeneous polynomial with degree \(\tau \) if

$$\begin{aligned} f(\lambda \varvec{x}) = \lambda ^{\tau } f(\varvec{x}) \,\text{ for } \text{ every } \varvec{x}\in {\mathbb {R}}^n \text{ and } \lambda \in {\mathbb {R}}. \end{aligned}$$

Let \({\mathbb {K}}\subset {\mathbb {R}}^m\) denote a closed cone, \(J = \{1,\ldots ,\ell \}\) and \(\varphi , g_j \in {\mathbb {R}}[{\mathbf {w}}]\;(j \in J)\). We can handle a more general POP of the form:

$$\begin{aligned} \begin{array}{llll} \text{ minimize }&\quad \varphi ({\mathbf {w}})&\text{ subject } \text{ to }&\quad g_j({\mathbf {w}}) = 0 \; (j \in J), \, {\mathbf {w}}\in {\mathbb {K}}. \end{array} \end{aligned}$$
(3)

We note that the homogeneity of the polynomials \(\varphi , \,g_j \in {\mathbb {R}}[{\mathbf {w}}]\;(j \in J)\) is not assumed in POP (3). However, (3) is easily transformed into a POP (1) satisfying the homogeneity condition (2) by introducing an auxiliary variable \(w_0 \in {\mathbb {R}}_+\) fixed to 1, which corresponds to the equality constraint \(h_0(w_0,{\mathbf {w}}) = 1\), with \(\varvec{x}= (w_0,{\mathbf {w}})\) in (1) and the cone \({\mathbb {L}}= {\mathbb {R}}_+ \times {\mathbb {K}}\).
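The homogenization step can be sketched concretely. The following snippet is a minimal illustration under our own conventions: polynomials are dicts mapping exponent tuples to coefficients, and the helper names `homogenize` and `evaluate` are ours, not from the paper. Each monomial of degree \(d \le \tau \) is padded with \(w_0^{\tau -d}\), and setting \(w_0 = 1\) recovers the original polynomial.

```python
# Homogenize a (possibly nonhomogeneous) polynomial in w = (w_1, ..., w_m)
# to degree tau by padding each monomial with powers of an auxiliary
# variable w_0, as in the transformation of POP (3) into POP (1).
# Dict-of-exponent-tuples representation and function names are illustrative.

def homogenize(poly, tau):
    """Return the homogenized polynomial in the variables (w_0, w_1, ..., w_m)."""
    out = {}
    for beta, coef in poly.items():
        deg = sum(beta)
        assert deg <= tau, "target degree must dominate every monomial"
        out[(tau - deg,) + tuple(beta)] = coef  # w_0^(tau - |beta|) * w^beta
    return out

def evaluate(poly, point):
    """Evaluate a dict-represented polynomial at a point."""
    total = 0.0
    for beta, coef in poly.items():
        term = coef
        for b, x in zip(beta, point):
            term *= x ** b
        total += term
    return total

# Example: g(w) = w1^2 - 3*w1 + 2, a nonhomogeneous degree-2 polynomial.
g = {(2,): 1.0, (1,): -3.0, (0,): 2.0}
h = homogenize(g, tau=2)         # {(0, 2): 1.0, (1, 1): -3.0, (2, 0): 2.0}
# Setting w_0 = 1 recovers g:
print(evaluate(h, (1.0, 5.0)), evaluate(g, (5.0,)))   # both 12.0
```

The resulting `h` satisfies the homogeneity identity \(h(\lambda \varvec{x}) = \lambda ^{\tau } h(\varvec{x})\) of condition (2) by construction.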

The conditions for the equivalence involve the hierarchy of copositivity and a variation of the zeros at infinity condition, introduced in [8] and later used in [1]. The hierarchy of copositivity for (1) is described as

$$\begin{aligned} \begin{aligned}&h_0(\varvec{x}) \ge 0 \quad \text{ for } \text{ every } \quad \varvec{x}\in {\mathbb {L}}, \\&h_j(\varvec{x}) \ge 0 \quad \text{ for } \text{ every } \quad \varvec{x}\in \widetilde{G}_{j-1} \quad (j \in J), \end{aligned} \end{aligned}$$
(4)

where

$$\begin{aligned} \widetilde{G}_0= & {} \left\{ \varvec{x}\in {\mathbb {L}}: \; h_0(\varvec{x}) = 1 \right\} , \nonumber \\ \widetilde{G}_{j}= & {} \left\{ \varvec{x}\in \widetilde{G}_{j-1} : h_j (\varvec{x}) = 0 \right\} \nonumber \\= & {} \left\{ \varvec{x}\in {\mathbb {L}}: \; h_0(\varvec{x}) = 1 \text{ and } h_k( \varvec{x}) = 0 \; (k=1,2,\ldots ,j) \right\} . \end{aligned}$$
(5)

We note that a simple copositivity condition

$$\begin{aligned} h_j(\varvec{x}) \ge 0 \text{ for } \text{ every } \varvec{x}\in {\mathbb {L}}\; (j \in J_0), \end{aligned}$$
(6)

is a stronger version of the hierarchy of copositivity (4), and

$$\begin{aligned} \varvec{x}= \mathbf{0} \quad \text{ if } \quad \varvec{x}\in {\mathbb {L}}\quad \text{ and } \quad h_j(\varvec{x}) = 0 \,(j \in J_0) \end{aligned}$$
(7)

is a stronger version of the zeros at infinity condition since \(\varvec{x}= \mathbf{0}\) is required. Condition (6) is not very restrictive from a theoretical point of view because \(\psi (\varvec{x})\) can always be replaced by \(h_0(\varvec{x})\psi (\varvec{x})\) and \(h_j(\varvec{x})\) by \(h_j(\varvec{x})^2\;(j \in J_0)\) in POP (1) to satisfy both (2) and (6). This, however, may destroy the sparsity of the polynomials. Condition (7), together with (2), requires that the feasible region of POP (1) be bounded, whereas the weaker zeros at infinity condition allows the feasible region to be unbounded.
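The sparsity loss caused by squaring can be made concrete. Below is a small sketch (the dict-of-exponents representation and the helper `multiply` are our own illustration): squaring \(x_1x_2 - x_3^2 - x_4^2\), which has 3 monomials, produces the 6 monomials of \(\text{ supp }(h_1)\) listed for the example in Sect. 2.3.

```python
from collections import defaultdict

def multiply(p, q):
    """Multiply two polynomials given as {exponent tuple: coefficient} dicts."""
    out = defaultdict(float)
    for a, ca in p.items():
        for b, cb in q.items():
            out[tuple(s + t for s, t in zip(a, b))] += ca * cb
    return {beta: c for beta, c in out.items() if c != 0.0}

# h(x) = x1*x2 - x3^2 - x4^2 in n = 4 variables: 3 monomials before squaring.
h = {(1, 1, 0, 0): 1.0, (0, 0, 2, 0): -1.0, (0, 0, 0, 2): -1.0}
h_sq = multiply(h, h)
print(len(h), len(h_sq))   # 3 6: the support doubles after squaring
```

For higher-degree constraints with many monomials, the growth is correspondingly larger, which is the sparsity concern raised above.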

A popular choice for the closed cone \({\mathbb {L}}\) in (1) is the Cartesian product of \({\mathbb {R}}^{n_1}\) and \({\mathbb {R}}^{n_2}_+\) for \(n_1\) and \(n_2\) satisfying \(n=n_1+n_2\). More generally, we can choose for \({\mathbb {L}}\) a second-order cone, the vectorization of the cone of positive semidefinite symmetric matrices, or the vectorization of the cone of nonnegative symmetric matrices. We also note that if \({\mathbb {L}}_1\) and \({\mathbb {L}}_2\) are cones in \({\mathbb {R}}^n\), then so are their intersection, union, difference, symmetric difference and Minkowski sum.

For (homogeneous) QOPs, two different descriptions of the completely positive cone are known: \( \text{ the } \text{ convex } \text{ cone } \text{ generated } \text{ by } \left\{ \varvec{x}\varvec{x}^T : \varvec{x}\in {\mathbb {R}}^n_+ \right\} \) and

$$\begin{aligned} \left\{ \sum _{p=1}^q \varvec{x}_p\varvec{x}_p^T : \varvec{x}_p \in {\mathbb {R}}^n_+ \, (p=1,\ldots ,q), \quad q \ge 0 \right\} . \end{aligned}$$

These two descriptions are equivalent. In [9], the completely positive cone in the former description is generalized to the cone of completely positive tensors; the latter description can be generalized similarly. When this procedure is applied to nonhomogeneous POPs of the form (3), however, the two generalized descriptions differ: in particular, the latter is neither convex nor conic. In contrast, the two descriptions remain equivalent in our homogeneous POP model (1) satisfying (2) (see Lemma 3.1).
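A quick numerical illustration of the latter description (a sketch with our own helper names, and not a membership test, since deciding complete positivity is hard in general): a sum of rank-1 terms \(\varvec{x}_p\varvec{x}_p^T\) with \(\varvec{x}_p \ge \mathbf{0}\) is entrywise nonnegative, and its quadratic form satisfies \(\varvec{v}^T A \varvec{v}= \sum _p (\varvec{x}_p \cdot \varvec{v})^2 \ge 0\). The former description yields the same set because \(\lambda \varvec{x}\varvec{x}^T = (\sqrt{\lambda }\varvec{x})(\sqrt{\lambda }\varvec{x})^T\) with \(\sqrt{\lambda }\varvec{x}\ge \mathbf{0}\).

```python
import random

random.seed(0)

def outer(x):
    """Rank-1 matrix x x^T as a list of lists."""
    return [[xi * xj for xj in x] for xi in x]

def mat_add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def quad_form(A, v):
    """v^T A v."""
    n = len(v)
    return sum(A[i][j] * v[i] * v[j] for i in range(n) for j in range(n))

# Latter description: a sum of rank-1 terms x_p x_p^T with x_p >= 0.
xs = [[random.random() for _ in range(4)] for _ in range(3)]
A = outer(xs[0])
for x in xs[1:]:
    A = mat_add(A, outer(x))

# A is entrywise nonnegative ...
assert all(a >= 0 for row in A for a in row)
# ... and PSD: v^T A v equals the sum of squares sum_p (x_p . v)^2.
v = [random.uniform(-1, 1) for _ in range(4)]
dot = lambda a, b: sum(s * t for s, t in zip(a, b))
assert abs(quad_form(A, v) - sum(dot(x, v) ** 2 for x in xs)) < 1e-9
```

This exhibits only the necessary "doubly nonnegative" properties of completely positive matrices; it does not certify membership.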

This difference is a fundamental and essential feature of our POP model, which makes it possible to (a) allow a straightforward extension from the CPP relaxation for the QOP model [1] to POP (1), (b) keep the derivation of an equivalent moment cone relaxation of (1) simple, (c) directly handle cases where the closed cone \({\mathbb {L}}\) is neither convex nor pointed and (d) naturally take account of the sparsity of the polynomials \(\psi , \,h_j \in {\mathbb {R}}[\varvec{x}]\;(j \in J_0)\) in POP (1). We note that the last advantage is important for developing efficient approximations of the moment cone relaxation problem in practice, such as the doubly nonnegative relaxation. A more detailed comparison between our moment cone relaxation and the completely positive reformulation in [9] is given in Sect. 5.

2.3 An Illustrative Example

We consider a polynomial optimization problem

$$\begin{aligned} \begin{array}{l@{\quad }l} \text{ minimize } &{} (x_1^4 + 2x_1^2x_2^2 - 4x_3^4)\\ \text{ subject } \text{ to } &{} x_1^4+x_2^4+x_3^4 = 1, x_1x_2 - x_3^2 \ge 0, \,x_i \ge 0\quad (i=1,2,3). \end{array} \end{aligned}$$
(8)

By introducing a slack variable \(x_4 \in {\mathbb {R}}\) and a variable vector \(\varvec{x}= (x_1,x_2,x_3,x_4)\), we convert the problem into

$$\begin{aligned} \begin{array}{llll} \text{ minimize }&\quad \psi (\varvec{x})&\text{ subject } \text{ to }&\quad h_0(\varvec{x}) = 1, h_1(\varvec{x}) = 0, \varvec{x}\in {\mathbb {R}}^4_+, \end{array} \end{aligned}$$
(9)

where \( \psi (\varvec{x}) = x_1^4 + 2x_1^2x_2^2 - 4x_3^4, h_0(\varvec{x}) = x_1^4+x_2^4+x_3^4, h_1(\varvec{x}) = (x_1x_2 - x_3^2 -x_4^2)^2. \) In addition to condition (2) with \(\tau = 4\), the problem (9) satisfies conditions (6) and (7) with \(J_0 = \{ 0,1\}\) and \({\mathbb {L}}= {\mathbb {R}}^4_+\). This problem serves as an illustrative example in the subsequent discussions. We see that

$$\begin{aligned} \text{ deg }(\psi )= & {} \text{ deg }(h_0) = \text{ deg }(h_1) = 4, \text{ supp }(\psi ) = \begin{array}{c} \{ (4 \, 0 \, 0 \, 0), \, (2 \, 2 \, 0 \, 0 ), ( 0 \, 0 \, 4 \, 0 )\}, \end{array}\\ \text{ supp }(h_0)= & {} \begin{array}{c} \{ ( 4 \, 0 \, 0 \, 0 ), (0 \, 4 \, 0 \, 0 ), \, ( 0 \, 0 \, 4 \, 0) \}, \end{array}\\ \text{ supp }(h_1)= & {} \left\{ \begin{array}{c} ( 2 \, 2 \, 0 \, 0 ), \, (1 \, 1 \, 2 \, 0), \, ( 1 \, 1 \, 0 \, 2 ), \, ( 0 \, 0 \, 4 \, 0 ), \, (0 \, 0 \, 2 \, 2 ),\\ (0 \, 0 \, 0 \, 4 ) \end{array}\right\} .\\ \text{ Let } {\mathcal {H}}_{\min }= & {} \text{ supp }(\psi ) \bigcup \text{ supp }(h_0) \bigcup \text{ supp }(h_1) \\= & {} \left\{ \begin{array}{c} (4 \, 0 \, 0 \, 0 ), ( 2 \, 2 \, 0 \, 0), (1 \, 1 \, 2 \, 0), (1 \, 1 \, 0 \, 2), \, ( 0 \, 4 \, 0 \, 0 ), \\ ( 0 \, 0 \, 4 \, 0 ),\, ( 0 \,0 \, 2 \, 2 ), \, ( 0 \, 0 \, 0 \, 4 ) \end{array} \right\} . \end{aligned}$$

Then, we can regard \(\psi , \,h_0, \,h_1 \in {\mathbb {R}}[\varvec{x},{\mathcal {H}}]\) for any \({\mathcal {H}}\supset {\mathcal {H}}_{\min }\). For example, if we take \({\mathcal {H}}= {\mathcal {H}}_{\min }\), \(\psi \in {\mathbb {R}}[\varvec{x},{\mathcal {H}}]\) is represented as follows:

$$\begin{aligned} \begin{aligned} (\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}})&= \left( \psi _{(4000)},\psi _{(2200)},\psi _{(1120)},\psi _{(1102)}, \psi _{(0400)},\psi _{(0040)},\psi _{(0022)},\psi _{(0004)} \right) , \\&= \left( 1,2,0,0,0,-4,0,0 \right) \in {\mathbb {R}}^{{{\mathcal {H}}}}, \\ (\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}})&= \left( {x_1^4},{x_1^2x_2^2},{x_1x_2x_3^2},{x_1x_2x_4^2},{x_2^4},{x_3^4},{x_3^2x_4^2},{x_4^4} \right) \in {\mathbb {R}}^{{{\mathcal {H}}}}, \\ \psi (\varvec{x})&= (\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {R}}[\varvec{x},{\mathcal {H}}]. \end{aligned} \end{aligned}$$
(10)
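The representation (10) can be verified directly. A short sketch (the list ordering follows (10); the helper names are ours): the inner product of the coefficient vector \((\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}})\) with the monomial vector \((\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}})\) reproduces \(\psi (\varvec{x})\).

```python
# Verify representation (10) of psi over H = H_min for the example (9).
H = [(4, 0, 0, 0), (2, 2, 0, 0), (1, 1, 2, 0), (1, 1, 0, 2),
     (0, 4, 0, 0), (0, 0, 4, 0), (0, 0, 2, 2), (0, 0, 0, 4)]
psi_coef = [1, 2, 0, 0, 0, -4, 0, 0]   # (psi_beta : H) from (10)

def monomial_vector(x):
    """The vector (x^beta : H) for x in R^4."""
    out = []
    for beta in H:
        t = 1.0
        for b, xi in zip(beta, x):
            t *= xi ** b
        out.append(t)
    return out

def psi_direct(x):
    """psi(x) = x1^4 + 2 x1^2 x2^2 - 4 x3^4 evaluated directly."""
    return x[0] ** 4 + 2 * x[0] ** 2 * x[1] ** 2 - 4 * x[2] ** 4

x = (0.7, 1.3, 0.5, 0.2)
inner = sum(c * m for c, m in zip(psi_coef, monomial_vector(x)))
assert abs(inner - psi_direct(x)) < 1e-12   # the two evaluations agree
```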

3 Main Results

We consider POP (1) satisfying condition (2). Recall that \({\mathbb {L}}\) denotes a closed (not necessarily convex) cone. Let \(T_*\) denote the feasible region of POP (1):

$$\begin{aligned} T_*= & {} \left\{ \varvec{x}\in {\mathbb {L}}: h_0(\varvec{x}) = 1, \, h_j(\varvec{x}) = 0 \,(j \in J) \right\} . \end{aligned}$$

Condition (2) can be restated as

$$\begin{aligned} \psi (\lambda \varvec{x})= & {} \lambda ^{\tau } \psi (\varvec{x}), \,h_j(\lambda \varvec{x}) = \lambda ^{\tau } h_j(\varvec{x}) \,(j \in J_0) \nonumber \\&\quad \text{ for } \text{ some } \text{ integer } \tau \ge 1, \text{ every } \varvec{x}\in {\mathbb {R}}^{n} \text{ and } \text{ every } \lambda \in {\mathbb {R}}_+. \end{aligned}$$
(11)

Let \({\mathcal {H}}_{\min } = \text{ supp }(\psi ) \bigcup \left( \bigcup _{j \in J_0} \text{ supp }(h_j) \right) . \) Then, (11) is equivalent to

$$\begin{aligned} \left| \varvec{\beta }\right| _1= & {} \tau \text{ for } \text{ some } \text{ integer } \tau \ge 1 \hbox { and every } \varvec{\beta }\in {\mathcal {H}}_{\min }. \end{aligned}$$
(12)

Let \({\mathcal {H}}_{\max } = \left\{ \varvec{\beta }\in {\mathbb {Z}}^n_+ : \left| \varvec{\beta }\right| _1 = \tau \right\} \). Choose \({\mathcal {H}}\subset {\mathbb {Z}}^n_+\) as \( {\mathcal {H}}_{\min } \subset {\mathcal {H}}\subset {\mathcal {H}}_{\max }. \) Then, the polynomials \(\psi , \,h_j \in {\mathbb {R}}[\varvec{x}]\;(j \in J_0)\) are written as

$$\begin{aligned} \psi (\varvec{x}) = (\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}}), \,h_j(\varvec{x}) = ((h_j)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \, (j \in J_0), \end{aligned}$$

for some \((\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}), \, ((h_j)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {R}}^{{{\mathcal {H}}}}\;(j \in J_0)\). Let

$$\begin{aligned} \widetilde{T}({\mathcal {H}})= & {} \left\{ (\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {R}}^{{{\mathcal {H}}}} : \varvec{x}\in T_* \right\} \\= & {} \left\{ (\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {R}}^{{{\mathcal {H}}}} : \begin{array}{ll} \varvec{x}\in {\mathbb {L}}, \, ((h_0)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = 1, \\ ((h_j)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = 0 \, (j \in J) \end{array} \right\} . \end{aligned}$$

Then, we can rewrite POP (1) as

$$\begin{aligned} \begin{array}{llll} \text{ minimize }&(\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}})&\text{ subject } \text{ to }&(z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \widetilde{T}({\mathcal {H}}). \end{array} \end{aligned}$$

Since the objective function is linear with respect to \((z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {R}}^{{{\mathcal {H}}}}\), the problem above is equivalent to

$$\begin{aligned} \begin{array}{llll} \text{ minimize }&(\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}})&\text{ subject } \text{ to }&(z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \text{ conv } \widetilde{T}({\mathcal {H}}). \end{array} \end{aligned}$$
(13)

In the case of POP (9), we see that

$$\begin{aligned} \widetilde{T}({\mathcal {H}})= & {} \left\{ \begin{array}{l} \left( {x_1^4},{x_1^2x_2^2},{x_1x_2x_3^2},{x_1x_2x_4^2}, {x_2^4},{x_3^4},{x_3^2x_4^2},{x_4^4}\right) : \varvec{x}\in {\mathbb {R}}^4_+, \\ h_0(\varvec{x}) = x_1^4+x_2^4+x_3^4 = 1, h_1(\varvec{x}) = (x_1x_2 - x_3^2 -x_4^2)^2 = 0 \, \end{array}\right\} , \\ (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}})= & {} \left( z_{(4000)},z_{(2200)},z_{(1120)},z_{(1102)},z_{(0400)}, z_{(0040)},z_{(0022)},z_{(0004)}\right) . \end{aligned}$$

See also (10) for \((\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}})\) and \((\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}})\).

Define the moment cone generated by \({\mathcal {H}}\) and \({\mathbb {L}}\) as

$$\begin{aligned} {\mathbb {M}}({\mathcal {H}},{\mathbb {L}}):= & {} \left\{ \sum _{p=1}^q ((\varvec{x}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) : \varvec{x}_p \in {\mathbb {L}}\,(p=1,2,\ldots ,q) \text{ and } q \in {\mathbb {Z}}_+ \right\} . \end{aligned}$$
(14)

\({\mathbb {M}}({\mathcal {H}},{\mathbb {L}})\) forms a convex cone by Lemma 3.1 below. Hence, by Carathéodory's theorem, the number q of terms in the summation in the description (14) of \({\mathbb {M}}({\mathcal {H}},{\mathbb {L}})\) can be fixed to \(q^* = \left| {\mathcal {H}}\right| \);

$$\begin{aligned} {\mathbb {M}}({\mathcal {H}},{\mathbb {L}})= & {} \left\{ \sum _{p=1}^{q^*} ((\varvec{x}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) : \varvec{x}_p \in {\mathbb {L}}\,(p=1,2,\ldots ,q^*) \right\} . \end{aligned}$$

Lemma 3.1

Suppose that \({\mathbb {L}}\) is a closed cone in \({\mathbb {R}}^n\) and \( {\mathcal {H}}_{\min } \subset {\mathcal {H}}\subset {\mathcal {H}}_{\max }. \)

  1. (a)

    \({\mathbb {M}}({\mathcal {H}},{\mathbb {L}})\) is a convex cone.

  2. (b)

    Assume that \(\left\{ \tau {\mathbf {e}}_1,\ldots ,\tau {\mathbf {e}}_n \right\} \subset {\mathcal {H}}\). If \(\tau \) is even or \({\mathbb {L}}= {\mathbb {R}}^n_+\), then \({\mathbb {M}}({\mathcal {H}},{\mathbb {L}})\) is closed, where \({\mathbf {e}}_i\) denotes the ith coordinate unit vector of \({\mathbb {R}}^n\).

Proof

See Sections 4.1 and 4.2. \(\square \)

If the assumption in (b) is not satisfied, \({\mathbb {M}}({\mathcal {H}},{\mathbb {L}})\) is not necessarily closed. For example, let \(n=2, {\mathbb {L}}= {\mathbb {R}}^2_+, \tau = 2, {\mathcal {H}}= \left\{ (2,0), (1,1) \right\} \not \ni (0,2)\). Then

$$\begin{aligned} {\mathbb {M}}({\mathcal {H}}, {\mathbb {R}}^2_+) = \left\{ (x_1^2, x_1x_2) + (y_1^2,y_1y_2) : \varvec{x}= (x_1,x_2), {\mathbf {y}}= (y_1,y_2) \in {\mathbb {R}}^2_+ \right\} . \end{aligned}$$

If we take a sequence \(\left\{ \varvec{x}_r = (1/r,r) \in {\mathbb {R}}^2_+ : r=1,2,\ldots \,\right\} \), then the sequence

$$\begin{aligned} \left\{ ((\varvec{x}_r)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = ((1/r)^2,1) \in {\mathbb {M}}({\mathcal {H}},{\mathbb {R}}^2_+) : r=1,2,\ldots , \right\} \end{aligned}$$

converges to \((0,1) \not \in {\mathbb {M}}({\mathcal {H}}, {\mathbb {R}}^2_+)\).
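This sequence can be traced numerically. A short sketch (the helper name `mono` is ours): the monomial vectors of \(\varvec{x}_r = (1/r,r)\) are \(((1/r)^2, (1/r)\cdot r) = (1/r^2, 1)\), which approach the limit \((0,1)\); a zero first coordinate would force \(x_1 = y_1 = 0\) in any two-term representation, making the second coordinate 0 rather than 1.

```python
# The monomial vectors of x_r = (1/r, r) in R^2_+ for H = {(2,0), (1,1)}.
# They equal (1/r^2, 1) and converge to (0, 1), which lies outside
# M(H, R^2_+), illustrating the failure of closedness.
def mono(x):
    x1, x2 = x
    return (x1 ** 2, x1 * x2)

pts = [mono((1.0 / r, float(r))) for r in (1, 10, 100, 1000)]
print(pts[-1])   # approximately (1e-06, 1.0), approaching the limit (0, 1)
```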

Define

$$\begin{aligned} \widehat{T}({\mathcal {H}}):= & {} \left\{ (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {R}}^{{{\mathcal {H}}}} : \begin{array}{l} (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {M}}({\mathcal {H}},{\mathbb {L}}), \, ((h_0)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = 1, \\ ( (h_j)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}})= 0 \, (j \in J) \end{array} \right\} . \end{aligned}$$

We introduce the moment cone relaxation of POP (1).

$$\begin{aligned} \begin{array}{llll} \text{ minimize }&\quad (\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}})&\text{ subject } \text{ to }&\quad (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \widehat{T}({\mathcal {H}}). \end{array} \end{aligned}$$
(15)

Recall that \({\mathcal {H}}\) can be an arbitrary subset of \({\mathbb {Z}}^n_+\) satisfying

$$\begin{aligned}&{\mathcal {H}}_{\min } = \text{ supp }(\psi ) \bigcup \left( \bigcup _{j \in J_0} \text{ supp }(h_j) \right) \subset {\mathcal {H}}\subset {\mathcal {H}}_{\max }, \text{ or } \\&{\mathcal {H}}_{\min } \bigcup \left\{ \tau {\mathbf {e}}_1,\ldots ,\tau {\mathbf {e}}_n \right\} \subset {\mathcal {H}}\subset {\mathcal {H}}_{\max }, \end{aligned}$$

where the second choice guarantees the closedness of \({\mathbb {M}}({\mathcal {H}},{\mathbb {L}})\) when \(\tau \) is even or \({\mathbb {L}}= {\mathbb {R}}^n_+\) (Lemma 3.1 (b)). If the polynomials \(\psi , \,h_j \in {\mathbb {R}}[\varvec{x}]\;(j \in J_0)\) of POP (1) are sparse, i.e., involve only a small number of monomials, the dimension \(\left| {\mathcal {H}}\right| \) of the variable vector \((z_{\varvec{\varvec{\beta }}} : {\mathcal {H}})\) of the problem (15) can be small. Thus, the moment cone relaxation (15) naturally inherits such sparsity from POP (1).
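For the illustrative example (\(n = 4\), \(\tau = 4\)), this saving is already visible: the dense monomial set \({\mathcal {H}}_{\max }\) has \(\binom{n+\tau -1}{\tau } = 35\) elements, while \({\mathcal {H}}_{\min }\) has only 8. A small sketch (the enumeration is our own check, using only the supports listed in Sect. 2.3):

```python
from itertools import product
from math import comb

# Dense vs. sparse monomial sets for the example (9): n = 4, tau = 4.
n, tau = 4, 4
H_max = [beta for beta in product(range(tau + 1), repeat=n) if sum(beta) == tau]
assert len(H_max) == comb(n + tau - 1, tau)   # stars-and-bars count: 35
H_min_size = 8                                # listed explicitly in Sect. 2.3
print(H_min_size, len(H_max))                 # 8 35
```

The variable vector of (15) thus has dimension 8 instead of 35 when \({\mathcal {H}}= {\mathcal {H}}_{\min }\) is chosen.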

We note that the problems (13) and (15) have the same linear objective function \((\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}})\). Let

$$\begin{aligned} T_0= & {} \left\{ \varvec{x}\in {\mathbb {L}}: h_0(\varvec{x}) \ge 0 \right\} , \quad \,T_j = \left\{ \varvec{x}\in T_{j-1} : h_j(\varvec{x}) = 0 \right\} \quad (j \in J) \\= & {} \left\{ \varvec{x}\in {\mathbb {L}}: h_0(\varvec{x}) \ge 0, h_i(\varvec{x}) = 0 \,(i=1,\ldots ,j) \right\} \quad (j \in J). \end{aligned}$$

We consider the following conditions to ensure that (13) and (15) have equivalent feasible regions in the sense that \(\text{ closure } \text{ conv } \widetilde{T}({\mathcal {H}}) = \text{ closure } \widehat{T}({\mathcal {H}})\).

$$\begin{aligned}&h_0(\varvec{x}) \ge 0 \quad \text{ for } \text{ every } \varvec{x}\in {\mathbb {L}}, \quad \hbox {i.e.}, \,T_0 = {\mathbb {L}}, \end{aligned}$$
(16)
$$\begin{aligned}&h_j(\varvec{x}) \ge 0 \quad \text{ for } \text{ every } \varvec{x}\in T_{j-1} (j \in J), \end{aligned}$$
(17)
$$\begin{aligned}&T_*^{\infty } \supset \left\{ \varvec{x}\in {\mathbb {L}}: h_j(\varvec{x}) = 0 (j\in J_0) \right\} . \end{aligned}$$
(18)

Here, for every \(A \subset {\mathbb {R}}^n\), \(A^{\infty }\) denotes the horizon cone defined by

$$\begin{aligned} A^{\infty } := \left\{ \varvec{x}\in {\mathbb {R}}^{n} : \begin{array}{l} \text{ there } \text{ exists } (\mu _r,{\mathbf {y}}_r) \in {\mathbb {R}}_+ \times A \,(\hbox {r}=1,2,\ldots \,) \\ \text{ such } \text{ that } (\mu _r,\mu _r{\mathbf {y}}_r) \rightarrow (0,\varvec{x})\hbox { as }r \rightarrow \infty \end{array} \right\} \end{aligned}$$

(see, for example, [11]). If \(\tau =2\) (i.e., (1) represents a homogeneous QOP), conditions (16), (17) and (18) are equivalent to the set of conditions (A)', (\(\widetilde{B}\)), (\(\widetilde{C}\)) and (D) assumed in [1]. These conditions will be compared in Sect. 5 to the conditions imposed in [9] on the nonhomogeneous POP of the form (3) for the equivalence to its completely positive reformulation. It is easily verified that the converse inclusion of (18) always holds.
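As a toy illustration of the horizon cone (our own example, not taken from the paper): for \(A = \{ {\mathbf {y}}\in {\mathbb {R}}^2_+ : y_1y_2 = 1 \}\), taking \({\mathbf {y}}_r = (r, 1/r) \in A\) and \(\mu _r = 1/r\) gives \(\mu _r{\mathbf {y}}_r \rightarrow (1,0)\), so the direction \((1,0)\) belongs to \(A^{\infty }\) even though it is not in A.

```python
# Horizon cone illustration for A = { y in R^2_+ : y1*y2 = 1 } (toy example):
# with y_r = (r, 1/r) in A and mu_r = 1/r, the scaled points mu_r * y_r
# converge to (1, 0), exhibiting (1, 0) as a horizon direction of A.
def scaled_point(r):
    mu = 1.0 / r
    y = (float(r), 1.0 / r)        # y lies in A since y1 * y2 = 1
    return (mu * y[0], mu * y[1])  # equals (1, 1/r^2)

pts = [scaled_point(r) for r in (1, 10, 100)]
print(pts[-1])   # close to the horizon direction (1, 0)
```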

The following theorem asserts that the closures of the feasible regions \(\text{ conv } \widetilde{T}({\mathcal {H}})\) of (13) and \(\widehat{T}({\mathcal {H}})\) of the moment cone relaxation problem (15) coincide; thus, (13) and (15) are equivalent. Since POPs (1) and (13) have the same optimal value, (15) attains the exact optimal value of (1).

Theorem 3.1

Assume that \({\mathbb {L}}\) is a closed cone and conditions (11), (16), (17) and (18) hold. If \({\mathcal {H}}_{\min } \subset {\mathcal {H}}\subset {\mathcal {H}}_{\max }\), then closure conv \(\widetilde{T}({\mathcal {H}}) = \) closure \(\widehat{T}({\mathcal {H}})\) and

$$\begin{aligned} \inf \left\{ \psi (\varvec{x}) : \varvec{x}\in T_* \right\} = \inf \left\{ (\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) : (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \widehat{T}({\mathcal {H}}) \right\} . \end{aligned}$$
(19)

Proof

See Sections 4.3 and 4.4. \(\square \)

We conclude this section by extending the previous discussions to more general cases.

Corollary 3.1

Assume that \({\mathbb {L}}\) is a closed cone and that conditions (11), (16), (17) and (18) hold. If \({\mathcal {H}}_{\min } \subset {\mathcal {H}}\) (but not necessarily \({\mathcal {H}}\subset {\mathcal {H}}_{\max }\)), then (19) holds.

Proof

We first observe that conditions (11), (16), (17) and (18) do not depend on the choice of \({\mathcal {H}}\supset {\mathcal {H}}_{\min }\) and that all the definitions of \((\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}})\), \({\mathbb {M}}({\mathcal {H}},{\mathbb {L}})\), \(((h_j)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {R}}^{{{\mathcal {H}}}}\;(j \in J_0)\), \(\widetilde{T}({\mathcal {H}})\) and \(\widehat{T}({\mathcal {H}})\) remain consistent, although Lemma 3.1 may no longer hold. We can easily verify that \(\widetilde{T}({\mathcal {H}}) \subset \widehat{T}({\mathcal {H}})\). Hence,

$$\begin{aligned} \inf \left\{ \psi (\varvec{x}) : \varvec{x}\in T_* \right\}= & {} \inf \left\{ (\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) : (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \widetilde{T}({\mathcal {H}}) \right\} \\\ge & {} \inf \left\{ (\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) : (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \widehat{T}({\mathcal {H}}) \right\} . \end{aligned}$$

On the other hand, if \((\bar{z}_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \widehat{T}({\mathcal {H}})\), then \((\bar{z}_{\varvec{\varvec{\beta }}} : {\mathcal {H}}\bigcap {\mathcal {H}}_{\max }) \in \widehat{T}({\mathcal {H}}\bigcap {\mathcal {H}}_{\max })\) and \( (\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (\bar{z}_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = (\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}\bigcap {\mathcal {H}}_{\max }) \cdot (\bar{z}_{\varvec{\varvec{\beta }}} : {\mathcal {H}}\bigcap {\mathcal {H}}_{\max }). \) Therefore,

$$\begin{aligned}&\inf \left\{ (\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) : (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \widehat{T}({\mathcal {H}}) \right\} \\&\quad \ge \inf \left\{ (\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}\bigcap {\mathcal {H}}_{\max }) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}\bigcap {\mathcal {H}}_{\max }) : \right. \\&\qquad \left. (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}\bigcap {\mathcal {H}}_{\max }) \in \widehat{T}({\mathcal {H}}\bigcap {\mathcal {H}}_{\max }) \right\} \\&\quad = \inf \left\{ \psi (\varvec{x}) : \varvec{x}\in T_* \right\} . \end{aligned}$$

Here the last equality follows from Theorem 3.1. \(\square \)

4 Proof

4.1 Proof of (a) in Lemma 3.1

Suppose that \( \sum _{p=1}^{q} ((\varvec{x}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {M}}({\mathcal {H}},{\mathbb {L}})\), \(\,\varvec{x}_p \in {\mathbb {L}}\, (p=1,\ldots ,q)\), \(\sum _{p=1}^{\bar{q}} ((\bar{\varvec{x}}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {M}}({\mathcal {H}},{\mathbb {L}})\), \(\bar{\varvec{x}}_p \in {\mathbb {L}}\;(p=1,\ldots ,\bar{q})\), \(\lambda \ge 0\) and \(\bar{\lambda } \ge 0\). Since \({\mathbb {L}}\) is a cone, we see that \( \lambda ^{1/\tau } \varvec{x}_p \in {\mathbb {L}}\, (p=1,\ldots ,q) \hbox { and } \bar{\lambda }^{1/\tau } \bar{\varvec{x}}_p \in {\mathbb {L}}\,(p=1,\ldots ,\bar{q}). \) By \({\mathcal {H}}\subset {\mathcal {H}}_{\max } = \left\{ \varvec{\beta }\in {\mathbb {Z}}^n_+ : \left| \varvec{\beta }\right| _1 = \tau \right\} \),

$$\begin{aligned}&\lambda \sum _{p=1}^{q} ((\varvec{x}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) + \bar{\lambda } \sum _{p=1}^{\bar{q}} ((\bar{\varvec{x}}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \\&\quad = \sum _{p=1}^{q} ((\lambda ^{1/\tau } \varvec{x}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) + \sum _{p=1}^{\bar{q}} ((\bar{\lambda }^{1/\tau } \bar{\varvec{x}}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {M}}({\mathcal {H}},{\mathbb {L}}). \end{aligned}$$

Thus, we have shown that \({\mathbb {M}}({\mathcal {H}},{\mathbb {L}})\) is a convex cone. \(\square \)

4.2 Proof of (b) in Lemma 3.1

Consider a sequence

$$\begin{aligned} {\mathbb {M}}({\mathcal {H}},{\mathbb {L}}) \ni ((z_r)_{\varvec{\varvec{\beta }}} : {\mathcal {H}})= & {} \sum _{p=1}^{q} ((\varvec{x}_{rp})^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \nonumber \\&\quad \text{ with } \varvec{x}_{rp} \in {\mathbb {L}}\, (p=1,2,\ldots ,q) \, (r=1,2,\ldots ), \end{aligned}$$
(20)

which converges to some \((\bar{z}_{\varvec{\varvec{\beta }}} : {\mathcal {H}})\) as \(r \rightarrow \infty \). We show that the sequence \(\left\{ \varvec{x}_{rp} \in {\mathbb {L}}: \right. \;\left. r=1,2,\ldots \,\right\} \) is bounded \((p=1,2,\ldots ,q)\). From (20), we observe

$$\begin{aligned} \sum _{p=1}^{q} (\varvec{x}_{rp})^{\varvec{\varvec{\beta }}} = (z_r)_{\varvec{\varvec{\beta }}} \rightarrow \bar{z}_{\varvec{\varvec{\beta }}} \, \text{ as } r \rightarrow \infty \,(\varvec{\beta }\in {\mathcal {H}}). \end{aligned}$$

By the assumption, we know that \(\left\{ \tau {\mathbf {e}}_1,\ldots ,\tau {\mathbf {e}}_n \right\} \subset {\mathcal {H}}\). As a result, the above relation holds for \(\varvec{\beta }= \tau {\mathbf {e}}_i \in {\mathcal {H}}\;(i=1,\ldots ,n)\). If each \(\varvec{x}_{rp}\) is written as \((x_{rp1},\ldots ,x_{rpn})\), then \((\varvec{x}_{rp})^{(\tau {{\mathbf {e}}}_i)} = (x_{rpi})^{\tau } \ge 0\) since \(\tau \) is an even integer or \({\mathbb {L}}= {\mathbb {R}}^n_+\) by the assumption. Hence, we obtain that

$$\begin{aligned} 0 \le (x_{rpi})^{\tau }\le & {} \sum _{p'=1}^{q} (x_{rp'i})^{\tau } = \sum _{p'=1}^{q} (\varvec{x}_{rp'})^{(\tau {\mathbf {e}}_i)} = (z_r)_{(\tau {{\mathbf {e}}}_i)} \rightarrow \bar{z}_{(\tau {{\mathbf {e}}}_i)} \, \text{ as } \quad r \rightarrow \infty \nonumber \\&\quad \text{ for } \quad i=1,\ldots ,n \quad \text{ and } \quad p=1,2,\ldots ,q \quad (r=1,2,\ldots ). \end{aligned}$$
(21)

This implies that all sequences \(\{ \varvec{x}_{rp} \in {\mathbb {L}}: r=1,2,\ldots , \}\;(p=1,2,\ldots ,q)\) are bounded. Thus, we can take a subsequence of (20) along which \(\varvec{x}_{rp} \in {\mathbb {L}}\) converges to some \(\bar{\varvec{x}}_p \in {\mathbb {L}}\) as \(r \rightarrow \infty \;(p=1,2,\ldots ,q)\). Therefore,

$$\begin{aligned} (\bar{z}_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = \sum _{p=1}^{q} ((\bar{\varvec{x}}_{p})^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {M}}({\mathcal {H}},{\mathbb {L}}). \end{aligned}$$

\(\square \)
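The key estimate (21), that each nonnegative term \((x_{rpi})^{\tau }\) is dominated by the coordinate sum \((z_r)_{(\tau {\mathbf {e}}_i)}\), can be illustrated numerically (a sketch with arbitrary toy data, not part of the proof):

```python
import random

tau = 4          # even, so every tau-th power is nonnegative
n, q = 3, 5      # dimension and number of summands (arbitrary toy sizes)

random.seed(0)
xs = [[random.uniform(-2, 2) for _ in range(n)] for _ in range(q)]

for i in range(n):
    z_i = sum(x[i] ** tau for x in xs)        # plays the role of (z_r)_{tau e_i}
    for x in xs:
        # each nonnegative term is bounded by the whole sum, as in (21)
        assert 0 <= x[i] ** tau <= z_i
print("0 <= (x_pi)^tau <= (z)_{tau e_i} for every p and i")
```

Since the right-hand side converges, each coordinate sequence is bounded, which is what allows the extraction of a convergent subsequence.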

4.3 Proof of Closure conv \(\widetilde{T}({\mathcal {H}}) \subset \) Closure \(\widehat{T}({\mathcal {H}})\) in Theorem 3.1

Assume that \((z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = (\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \widetilde{T}({\mathcal {H}})\). Then, \((z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \widehat{T}({\mathcal {H}})\) by definition. As \(\text{ closure } \widehat{T}({\mathcal {H}})\) is convex and closed, \( \text{ closure } \text{ conv } \widetilde{T}({\mathcal {H}}) \subseteq \,\text{ closure } \widehat{T}({\mathcal {H}})\) follows. \(\square \)

4.4 Proof of Closure \(\widehat{T}({\mathcal {H}}) \subset \) Closure conv \(\widetilde{T}({\mathcal {H}})\) in Theorem 3.1

It suffices to show that \(\widehat{T}({\mathcal {H}}) \subset \text{ closure } \text{ conv } \widetilde{T}({\mathcal {H}})\). Suppose \((z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \widehat{T}({\mathcal {H}})\). Then, \( \,((h_0)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = 1, \, ((h_j)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}})= 0 \, (j \in J),\)

$$\begin{aligned}&(z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = \sum _{p=1}^{q} \left( (\varvec{x}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}\right) \quad \text{ for } \text{ some } \varvec{x}_p \in {\mathbb {L}}\quad (p=1,\ldots ,q). \end{aligned}$$

It follows that

$$\begin{aligned} 1= & {} \left( (h_0)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}\right) \cdot \left( \sum _{p=1}^{q} \left( (\varvec{x}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}\right) \right) \nonumber \\= & {} \sum _{p=1}^{q} \left( (h_0)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot \left( (\varvec{x}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}\right) \right) = \sum _{p=1}^{q} h_0(\varvec{x}_p) , \nonumber \\ 0= & {} \left( (h_j)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}\right) \cdot \left( \sum _{p=1}^{q} \left( (\varvec{x}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}\right) \right) \nonumber \\= & {} \sum _{p=1}^{q} \left( (h_j)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot \left( (\varvec{x}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}\right) \right) = \sum _{p=1}^{q} h_j(\varvec{x}_p) \quad \left( j \in J = \{1,\ldots ,\ell \}\right) . \end{aligned}$$
(22)

We will show by induction that

$$\begin{aligned} \varvec{x}_p \in T_j \quad (p=1,\ldots ,q) \quad (j=0,\ldots ,\ell ). \end{aligned}$$
(23)

It follows from \(\varvec{x}_p \in {\mathbb {L}}\) and (16) that \(\varvec{x}_p \in T_0\). Now assume that \(\varvec{x}_p \in T_{j-1}\) for some \(j \in J\;(p=1,\ldots ,q)\). By (17), we see that \(h_{j}(\varvec{x}_p) \ge 0\) \((p=1,\ldots ,q)\). Since these nonnegative terms sum to zero by (22), we obtain \(h_{j}(\varvec{x}_p) = 0\), hence \(\varvec{x}_p \in T_{j}\;(p=1,\ldots ,q)\). Thus, we have shown (23).

From \(\varvec{x}_p \in T_0\), we know that \(\lambda _p = h_0(\varvec{x}_p)\) is nonnegative \((p=1,2,\ldots ,q)\). Let \( \,I_+ = \{ p : \lambda _p = h_0(\varvec{x}_p) > 0 \}, \,\,I_0 = \{ p : \lambda _p = h_0(\varvec{x}_p) = 0 \} \) and \( \bar{\varvec{x}}_p = \varvec{x}_p / (\lambda _p)^{1/\tau } \in {\mathbb {L}}\,(p \in I_+). \) It follows from (11), (12) and \({\mathcal {H}}_{\min } \subset {\mathcal {H}}\subset {\mathcal {H}}_{\max }\) that

$$\begin{aligned} \left( (\varvec{x}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}\right)= & {} \left( \left( (\lambda _p)^{1/\tau }\bar{\varvec{x}}_p \right) ^{\varvec{\varvec{\beta }}} : {\mathcal {H}}\right) = \lambda _p \left( \left( \bar{\varvec{x}}_p\right) ^{\varvec{\varvec{\beta }}} : {\mathcal {H}}\right) \quad (p \in I_+), \\ h_0(\bar{\varvec{x}}_p)= & {} h_0( \varvec{x}_p / \lambda _p^{1/{\tau }}) = h_0(\varvec{x}_p)/\lambda _p = 1 \,(p \in I_+), \\ h_j(\bar{\varvec{x}}_p)= & {} h_j\left( \varvec{x}_p / \lambda _p^{1/{\tau }}\right) = h_j(\varvec{x}_p)/\lambda _p = 0 \,\,(p \in I_+) \,(j \in J). \end{aligned}$$

Hence, \(((\bar{\varvec{x}}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \widetilde{T}({\mathcal {H}}) \, \,(p \in I_+)\), and

$$\begin{aligned} 1= & {} \sum _{p = 1}^{q} h_0(\varvec{x}_p) = \sum _{p \in I_+} h_0(\varvec{x}_p) = \sum _{p \in I_+} \lambda _p, \,\lambda _p > 0 \,(p \in I_+), \nonumber \\ (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}})= & {} \sum _{p=1}^{q} ((\varvec{x}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \,= \sum _{p \in I_+} \lambda _p ((\bar{\varvec{x}}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) + \sum _{p \in I_0} ((\varvec{x}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}), \\ \varvec{x}_p\in & {} \left\{ \varvec{x}\in {\mathbb {L}}: h_j(\varvec{x}) = 0 \,(j \in J_0) \right\} \,(p \in I_0). \end{aligned}$$

By (18), for each \(p \in I_0\), there exists a sequence \(\left\{ (\mu _{pr}, {\mathbf {y}}_{pr}) \in {\mathbb {R}}_+ \times T_* \right\} \) such that \((\mu _{pr},\mu _{pr} {\mathbf {y}}_{pr}) \rightarrow (0,\varvec{x}_p)\) as \(r \rightarrow \infty \). Let \(\tilde{p} \in I_+\) and \(\tilde{I}_+ = I_+ \backslash \{ \tilde{p} \}\). Then, for sufficiently large \(r\) such that \( \lambda _{\tilde{p} } - \sum _{p \in I_0} (\mu _{pr})^{\tau } > 0\),

$$\begin{aligned}&\text{ conv } \widetilde{T}({\mathcal {H}}) \\&\quad \ni \left( \lambda _{\tilde{p} }- \sum _{p \in I_0} (\mu _{pr})^{\tau } \right) ((\bar{\varvec{x}}_{\tilde{p}})^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) + \sum _{p \in \tilde{I}_+} \lambda _p ((\bar{\varvec{x}}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}})\\&\qquad +\sum _{p \in I_0} (\mu _{pr})^{\tau }(({\mathbf {y}}_{pr})^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \\&\quad = \left( \lambda _{\tilde{p}} - \sum _{p \in I_0} (\mu _{pr})^{\tau } \right) ((\bar{\varvec{x}}_{\tilde{p}})^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) + \sum _{p \in \tilde{I}_+} \lambda _p ((\bar{\varvec{x}}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}})\\&\qquad +\sum _{p \in I_0} ((\mu _{pr}{\mathbf {y}}_{pr})^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \\&\quad \rightarrow \sum _{p \in I_+} \lambda _p ((\bar{\varvec{x}}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) + \sum _{p \in I_0} ((\varvec{x}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \,\text{ as } r \rightarrow \infty . \end{aligned}$$

Therefore, we have shown that \((z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \text{ closure } \text{ conv } \widetilde{T}({\mathcal {H}})\). \(\square \)

5 Nonhomogeneous Model

The discussion up to this point has focused on POP (1) described by homogeneous polynomials \(\psi , \,h_j \in {\mathbb {R}}[\varvec{x}]\;(j \in J_0)\) characterized by condition (2). In this section, we deal with POPs of the form (3) described by general (nonhomogeneous) polynomials \(\varphi , \, g_j \in {\mathbb {R}}[{\mathbf {w}}]\;(j \in J = \{1,\ldots ,\ell \})\) of arbitrary degrees, where \({\mathbf {w}}= (w_1,\ldots ,w_m) \in {\mathbb {R}}^m\). Peña et al. [9] formulated this type of POP (3) with \({\mathbb {K}}= {\mathbb {R}}^m_+\) as a linear optimization problem over the cone of completely positive tensors, equivalent to POP (3), in Theorem 5 of [9]. We impose conditions involving \({\mathbf {w}}\in {\mathbb {K}}\) that are similar to but more general than theirs, in the sense that \({\mathbb {K}}\) can be any cone, not necessarily convex or pointed. We then convert POP (3) into a POP (1) satisfying conditions (11), (16), (17) and (18), so that Theorem 3.1 applies. Let \(\tau = \max \{ \text{ deg }(\varphi ), \, \text{ deg }(g_j) \, (j \in J) \},\)

$$\begin{aligned} {\mathcal {G}}_{\min }= & {} \text{ supp } (\varphi ) \bigcup \left( \bigcup _{j \in J} \text{ supp }(g_j) \right) , \quad {\mathcal {G}}_{\max } = \left\{ \varvec{\alpha }\in {\mathbb {Z}}^{m}_+ : \left| \varvec{\alpha }\right| _1 \le \tau \right\} . \end{aligned}$$

Choose \({\mathcal {G}}\subset {\mathbb {Z}}^m_+\) such that \({\mathcal {G}}_{\min }\bigcup \{ \mathbf{0} \} \subset {\mathcal {G}}\subset {\mathcal {G}}_{\max }\). Then, \(\varphi , \,g_j \in {\mathbb {R}}[{\mathbf {w}}]\;(j \in J)\) can be represented as

$$\begin{aligned} \varphi ({\mathbf {w}})= & {} (\varphi _{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \cdot ({\mathbf {w}}^{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \quad \text{ for } \text{ some } (\varphi _{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \in {\mathbb {R}}^{{{\mathcal {G}}}}, \\ g_j({\mathbf {w}})= & {} ((g_j)_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \cdot ({\mathbf {w}}^{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \quad \text{ for } \text{ some } ((g_j)_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \in {\mathbb {R}}^{{{\mathcal {G}}}} \quad (j \in J). \end{aligned}$$

Let \( \text{ Copos }({\mathcal {G}},{\mathbb {K}})^* = \text{ cone } \text{ conv } \left\{ ({\mathbf {w}}^{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \in {\mathbb {R}}^{{{\mathcal {G}}}} : {\mathbf {w}}\in {\mathbb {K}}\right\} . \) We now consider the linear conic program over the cone \(\text{ Copos }({\mathcal {G}},{\mathbb {K}})^*\)

$$\begin{aligned} \begin{array}{llll} \text{ minimize }&\quad (\varphi _{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \cdot (y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}})&\quad \text{ subject } \text{ to }&(y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \in \widehat{S}({\mathcal {G}}), \end{array} \end{aligned}$$
(24)

where

$$\begin{aligned} \widehat{S}({\mathcal {G}})= & {} \left\{ (y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \in {\mathbb {R}}^{{{\mathcal {G}}}} : \begin{array}{l} (y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \in \text{ Copos }({\mathcal {G}},{\mathbb {K}})^*, \\ ((g_0)_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \cdot (y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) = 1, \\ ((g_j)_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \cdot (y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) = 0 \,(j \in J) \end{array} \right\} , \\ g_0({\mathbf {w}})= & {} ((g_0)_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \cdot ({\mathbf {w}}^{\varvec{\varvec{\alpha }}} : {\mathcal {G}}), \,\text{ where } (g_0)_{\varvec{\varvec{\alpha }}} = \left\{ \begin{array}{lll} 1, &{} \quad \text{ if }\quad \varvec{\alpha }= \mathbf{0} \in {\mathcal {G}}, \\ 0, &{}\quad \text{ if }\quad \varvec{\alpha }\in {\mathcal {G}}\quad \hbox {and} \quad \varvec{\alpha }\not = \mathbf{0}, \end{array} \right. . \end{aligned}$$

We note that \(g_0 \in {\mathbb {R}}[{\mathbf {w}},{\mathcal {G}}]\) has been consistently defined since \(\mathbf{0} \in {\mathcal {G}}\).

Let \(S_*\) denote the feasible region of POP (3). By construction, if \({\mathbf {w}}\in {\mathbb {R}}^m\) is a feasible solution of POP (3), then \((y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) = ({\mathbf {w}}^{\varvec{\varvec{\alpha }}} : {\mathcal {G}})\) is a feasible solution of the problem (24), and the objective value \((\varphi _{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \cdot (y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}})\) coincides with the objective value \(\varphi ({\mathbf {w}})\) at \({\mathbf {w}}\in {\mathbb {R}}^m\). Therefore, the problem (24) serves as a relaxation problem of POP (3), and

$$\begin{aligned} \inf \left\{ \varphi ({\mathbf {w}}) : {\mathbf {w}}\in S_* \right\} \ge \inf \left\{ (\varphi _{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \cdot (y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) : (y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \in \widehat{S}({\mathcal {G}}) \right\} . \end{aligned}$$
(25)

We now convert POP (3) into POP (1), and the problem (24) into the moment cone problem (15), respectively, then show the identity (19) by applying Theorem 3.1. Let \(n = 1+m, \,{\mathbb {L}}= {\mathbb {R}}_+ \times {\mathbb {K}}\) and \(J_0 = \{ 0 \} \bigcup J\). Define \(\theta : {\mathcal {G}}\rightarrow {\mathbb {Z}}^n_+\) by \(\theta (\varvec{\alpha }) := (\tau -\left| \varvec{\alpha }\right| _1,\varvec{\alpha })\) for every \(\varvec{\alpha }\in {\mathcal {G}}\). It is obvious that \(\theta \) is a one-to-one mapping from \({\mathcal {G}}\) onto its image \({\mathcal {H}}= \theta ({\mathcal {G}}) = \left\{ \theta (\varvec{\alpha }) : \varvec{\alpha }\in {\mathcal {G}}\right\} \). Observe that \(((1,{\mathbf {w}})^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = ( {\mathbf {w}}^{\varvec{\varvec{\alpha }}} : {\mathcal {G}})\) under the identification \(\varvec{\beta }= \theta (\varvec{\alpha })\).
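The index map \(\theta \) can be made concrete with a small sketch (an illustration with assumed toy data, not taken from the paper): \(\theta \) prepends the slack exponent \(\tau - \left| \varvec{\alpha }\right| _1\), and the displayed monomial identity then follows immediately.

```python
tau = 3  # arbitrary toy degree

def theta(alpha):
    """Homogenizing index map: alpha in G -> (tau - |alpha|_1, alpha) in H."""
    return (tau - sum(alpha),) + tuple(alpha)

def mono(x, beta):
    """Evaluate the monomial x^beta."""
    p = 1.0
    for xi, bi in zip(x, beta):
        p *= xi ** bi
    return p

G = [(0, 0), (1, 0), (0, 2), (2, 1)]   # an arbitrary support set with 0 in G
w = (1.3, -0.8)
for alpha in G:
    beta = theta(alpha)
    assert sum(beta) == tau                           # every image lies in H_max
    assert abs(mono((1.0,) + w, beta) - mono(w, alpha)) < 1e-12
print("(1, w)^{theta(alpha)} == w^alpha for all alpha in G")
```

Padding with the extra coordinate \(w_0\) is what turns the nonhomogeneous data over \({\mathcal {G}}\) into homogeneous data of common degree \(\tau \) over \({\mathcal {H}}\).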

Thus, the \(\left| {\mathcal {G}}\right| \)-dimensional space \({\mathbb {R}}^{{{\mathcal {G}}}}\) can be identified with the \(\left| {\mathcal {H}}\right| \)-dimensional space \({\mathbb {R}}^{{{\mathcal {H}}}}\); the coordinate index \(\varvec{\alpha }\in {\mathcal {G}}\) of the space \({\mathbb {R}}^{{{\mathcal {G}}}}\) corresponds to the coordinate index \(\theta (\varvec{\alpha }) \in {\mathcal {H}}\) of the space \({\mathbb {R}}^{{{\mathcal {H}}}}\) and vice versa. Specifically, the coordinate index \(\mathbf{0} \in {\mathcal {G}}\) corresponds to \(\theta (\mathbf{0}) = (\tau ,\mathbf{0}) \in {\mathcal {H}}\). As a result, the polynomials \(\psi , \,h_j \in {\mathbb {R}}[\varvec{x},{\mathcal {H}}]\;(j \in J_0)\) can be consistently defined by

$$\begin{aligned} \psi (\varvec{x}):= & {} (\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (\varvec{x}^{\varvec{\varvec{\beta }}} : {\mathcal {H}}),\quad \text{ where } \psi _{{\theta }(\varvec{\varvec{\alpha }})} = \varphi _{\varvec{\varvec{\alpha }}} \,(\varvec{\alpha }\in {\mathcal {G}}), \\ h_j(\varvec{x}):= & {} \left( (h_j)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}\right) \cdot (\varvec{x}^{\varvec{\varvec{\beta }}}: {\mathcal {H}}), \quad \text{ where } (h_j)_{{\theta }(\varvec{\varvec{\alpha }})} = (g_j)_{\varvec{\varvec{\alpha }}} \,(\varvec{\alpha }\in {\mathcal {G}}) \quad \left( j \in J_0\right) . \end{aligned}$$

We observe that, by construction,

$$\begin{aligned} h_0(\varvec{x})= & {} (w_0)^{\tau } \,\text{ for } \text{ every } \varvec{x}= (w_0,{\mathbf {w}}) \in {\mathbb {L}}= {\mathbb {R}}_+ \times {\mathbb {K}}, \nonumber \\ \psi (\varvec{x})= & {} \varphi ({\mathbf {w}}), \,h_j(\varvec{x}) = g_j({\mathbf {w}}) \,(j \in J) \\&\text{ if } \quad \varvec{x}= (w_0,{\mathbf {w}}) \in {\mathbb {L}} \text{ satisfies } h_0(\varvec{x}) = w_0^{\tau } = 1. \nonumber \end{aligned}$$
(26)

Therefore, POP (3) is equivalent to POP (1) with these polynomials \(\psi , \,h_j \in {\mathbb {R}}[\varvec{x},{\mathcal {H}}]\;(j \in J_0)\) and the cone \({\mathbb {L}}= {\mathbb {R}}_+ \times {\mathbb {K}}\). Thus,

$$\begin{aligned} \inf \left\{ \varphi ({\mathbf {w}}) : {\mathbf {w}}\in S_* \right\} = \inf \left\{ \psi (\varvec{x}) : \varvec{x}\in T_* \right\} . \end{aligned}$$
(27)
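The homogenization and the identity (26) can be checked on a toy example (all data below are illustrative assumptions, not from the paper): coefficients of a nonhomogeneous \(\varphi \) indexed by \({\mathcal {G}}\) are reindexed by \(\theta \), and on the slice \(w_0 = 1\) the homogenized polynomial agrees with the original one.

```python
tau = 2  # max degree among phi, g_j in this toy example

# A toy nonhomogeneous polynomial phi(w) = 2 + 3w - w^2 on m = 1 variable,
# stored as {alpha: coefficient} over G = {(0,), (1,), (2,)}.
phi = {(0,): 2.0, (1,): 3.0, (2,): -1.0}

def homogenize(coeffs):
    """psi over H: each index alpha maps to beta = (tau - |alpha|_1, alpha)."""
    return {(tau - sum(a),) + a: c for a, c in coeffs.items()}

def evaluate(coeffs, x):
    """Evaluate a polynomial given as {exponent vector: coefficient}."""
    val = 0.0
    for beta, c in coeffs.items():
        m = c
        for xi, bi in zip(x, beta):
            m *= xi ** bi
        val += m
    return val

psi = homogenize(phi)
w = 1.7
# (26): on the slice w0 = 1 the homogenization agrees with the original.
assert abs(evaluate(psi, (1.0, w)) - (2 + 3 * w - w * w)) < 1e-12
# h_0(x) = w0^tau: homogenizing g_0 = 1 gives the single monomial w0^tau.
h0 = homogenize({(0,): 1.0})
assert h0 == {(tau, 0): 1.0}
print("psi(1, w) == phi(w) and h_0(x) == w0^tau")
```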

Define \(\displaystyle {\mathbb {M}}^o({\mathcal {H}},{\mathbb {L}}) := \left\{ \sum _{p=1}^{q} ((w_{p0},{\mathbf {w}}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) : \begin{array}{ll} w_{p0} > 0, \,(w_{p0},{\mathbf {w}}_p) \in {\mathbb {L}}\\ (p=1,\ldots ,q), \,q \ge 0 \end{array}\right\} .\)

Lemma 5.1

\(\text{ Copos }({\mathcal {G}},{\mathbb {K}})^* = {\mathbb {M}}^o({\mathcal {H}},{\mathbb {L}}) \subset {\mathbb {M}}({\mathcal {H}},{\mathbb {L}})\).

Proof

Suppose that \(w_{p0} > 0\) and \((w_{p0},{\mathbf {w}}_p) \in {\mathbb {L}}\;(p=1,\ldots ,q)\). Then

$$\begin{aligned} \sum _{p=1}^{q} ((w_{p0},{\mathbf {w}}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}})= & {} \sum _{p=1}^{q} (w_{p0})^{\tau }((1,{\mathbf {w}}_p/w_{p0})^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \\= & {} \sum _{p=1}^{q} (w_{p0})^{\tau } (({\mathbf {w}}_p/w_{p0})^{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \in \text{ Copos }({\mathcal {G}},{\mathbb {K}})^*. \end{aligned}$$

Now suppose that \((y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \in \text{ Copos }({\mathcal {G}},{\mathbb {K}})^* \). Then there exist \(\lambda _p > 0\) and \({\mathbf {w}}_p \in {\mathbb {K}}\;(p=1,\ldots ,q)\) such that \( (y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) = \sum _{p=1}^{q} \lambda _p (({\mathbf {w}}_p)^{\varvec{\varvec{\alpha }}} : {\mathcal {G}}). \) Hence,

$$\begin{aligned} (y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}})= & {} \sum _{p=1}^{q} \lambda _p ((1,{\mathbf {w}}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = \sum _{p=1}^{q} (( (\lambda _p)^{1/\tau },(\lambda _p)^{1/\tau }{\mathbf {w}}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \\\in & {} \left\{ \sum _{p=1}^{q} ((w_{p0},{\mathbf {w}}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) : w_{p0} > 0, \,(w_{p0},{\mathbf {w}}_p) \in {\mathbb {L}}\,(p=1,\ldots ,q) \right\} . \end{aligned}$$

Thus, we have shown the desired identity. The latter inclusion relation follows directly from the definition. \(\square \)
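The key factorization in Lemma 5.1, \(((w_{p0},{\mathbf {w}}_p)^{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = (w_{p0})^{\tau } (({\mathbf {w}}_p/w_{p0})^{\varvec{\varvec{\alpha }}} : {\mathcal {G}})\), can also be checked numerically (a sketch with arbitrary assumed data):

```python
tau = 3  # arbitrary toy degree

def theta(alpha):
    """Homogenizing index map: alpha in G -> (tau - |alpha|_1, alpha) in H."""
    return (tau - sum(alpha),) + tuple(alpha)

def mono(x, beta):
    """Evaluate the monomial x^beta."""
    p = 1.0
    for xi, bi in zip(x, beta):
        p *= xi ** bi
    return p

G = [(0, 0), (3, 0), (1, 1), (0, 2)]   # arbitrary support with |alpha|_1 <= tau
w0, w = 2.0, (0.5, 1.25)               # w0 > 0, w a point of an assumed cone K
for alpha in G:
    beta = theta(alpha)
    lhs = mono((w0,) + w, beta)
    rhs = w0 ** tau * mono(tuple(wi / w0 for wi in w), alpha)
    assert abs(lhs - rhs) < 1e-9
print("(w0, w)^beta == w0^tau * (w / w0)^alpha for all alpha in G")
```

This factorization is what identifies sums of moment vectors with positive \(w_{p0}\) with conic combinations of the vectors \(({\mathbf {w}}^{\varvec{\varvec{\alpha }}} : {\mathcal {G}})\), \({\mathbf {w}}\in {\mathbb {K}}\).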

By Lemma 5.1, we can rewrite the problem (24) as

$$\begin{aligned} \begin{array}{llll} \text{ minimize }&\quad (\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}})&\quad \text{ subject } \text{ to }&\quad (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \widehat{T}^o({\mathcal {H}}), \end{array} \end{aligned}$$
(28)

where

$$\begin{aligned} \widehat{T}^o({\mathcal {H}})= & {} \left\{ (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {R}}^{{{\mathcal {H}}}} : \begin{array}{l} (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in {\mathbb {M}}^o({\mathcal {H}},{\mathbb {L}}), \\ ((h_0)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = 1, \\ ((h_j)_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) = 0 \, (j \in J) \end{array} \right\} . \end{aligned}$$

Since \(\widehat{T}^o({\mathcal {H}}) \subset \widehat{T}({\mathcal {H}})\), we obtain that

$$\begin{aligned}&\inf \left\{ (\varphi _{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \cdot (y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) : (y_{\varvec{\varvec{\alpha }}} : {\mathcal {G}}) \in \widehat{S}({\mathcal {G}})\right\} \nonumber \\&\quad = \inf \left\{ (\psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) : (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \widehat{T}^o({\mathcal {H}}) \right\} \nonumber \\&\quad \ge \inf \left\{ ( \psi _{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \cdot (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) : (z_{\varvec{\varvec{\beta }}} : {\mathcal {H}}) \in \widehat{T}({\mathcal {H}}) \right\} . \end{aligned}$$
(29)

For the conditions imposed on POP (3), we need to introduce some notation. Let \( S_j = \left\{ {\mathbf {w}}\in {\mathbb {K}}: g_i({\mathbf {w}}) = 0 \,(i < j) \right\} \,(j \in J) \,\) and \( \widehat{{\mathcal {G}}} = \left\{ \varvec{\alpha }\in {\mathcal {G}}: \left| \varvec{\alpha }\right| _1 = \tau \right\} . \) For each \(j \in J\), the homogeneous component of \(g_j\) with degree \(\tau \) is written as \( \hat{g}_j({\mathbf {w}}) = ((g_j)_{\varvec{\varvec{\alpha }}} : \widehat{{\mathcal {G}}}) \cdot ({\mathbf {w}}^{\varvec{\varvec{\alpha }}} : \widehat{{\mathcal {G}}}). \) We assume the following conditions.

$$\begin{aligned}&g_j({\mathbf {w}}) \ge 0 \quad \text{ for } \text{ every } \quad {\mathbf {w}}\in S_{j} \,(j \in J), \end{aligned}$$
(30)
$$\begin{aligned}&S_{*}^{\infty } \supset \left\{ {\mathbf {w}}\in {\mathbb {K}}: \hat{g}_j({\mathbf {w}}) = 0 \,(j \in J) \right\} . \end{aligned}$$
(31)

Note that these conditions do not depend on any choice of \({\mathcal {G}}\subset {\mathbb {Z}}^m_+\) such that \({\mathcal {G}}_{\min }\bigcup \{ \mathbf{0} \} \subset {\mathcal {G}}\subset {\mathcal {G}}_{\max }\). Condition (30) is equivalent to the one assumed in Theorem 5 of [9] if \({\mathbb {R}}_+^m\) is chosen for \({\mathbb {K}}\). Condition (31) is more general than the one assumed there, since the convexity of the cone is not required. In fact, we can take any closed (even nonconvex and/or nonpointed) cone in \({\mathbb {R}}^m\) in POP (3), while the cone \({\mathbb {K}}\) is restricted to \({\mathbb {R}}^m_+\) in Theorem 5 of [9].

If \( {\mathcal {H}}_{\min } := \text{ supp } (\psi ) \bigcup \left( \bigcup _{j \in J} \text{ supp }(h_j) \right) \,\text{ and } {\mathcal {H}}_{\max } := \left\{ \varvec{\beta }\in {\mathbb {Z}}^{n}_+ : \left| \varvec{\beta }\right| _1 = \tau \right\} , \) then \({\mathcal {H}}_{\min } \subset {\mathcal {H}}\subset {\mathcal {H}}_{\max }\) obviously holds. In addition, condition (11) holds by construction. In the remainder of this section, we show that conditions (16), (17) and (18) are satisfied, so that Theorem 3.1 can be applied.

By definition, \(h_0(\varvec{x}) = w_0^{\tau }\) for every \(\varvec{x}= (w_0,{\mathbf {w}}) \in {\mathbb {L}}= {\mathbb {R}}_+ \times {\mathbb {K}}\). Thus, (16) follows. Let \(j \in J\). By (11), we observe that the identity

$$\begin{aligned} h_j(w_0,{\mathbf {w}}) = (w_0)^{\tau } h_j(1,{\mathbf {w}}/w_0) = (w_0)^{\tau } g_j({\mathbf {w}}/w_0) \end{aligned}$$

holds for every \(\varvec{x}= (w_0,{\mathbf {w}}) \in {\mathbb {L}}\) with \(w_0 > 0\). Hence,

$$\begin{aligned} h_j(w_0,{\mathbf {w}})\ge & {} \text{ or } = 0 \text{ for } \text{ every } \varvec{x}= (w_0,{\mathbf {w}}) \in {\mathbb {L}}= {\mathbb {R}}_+ \times {\mathbb {K}}\,\text{ with } w_0 > 0 \\&\text{ if } \text{ and } \text{ only } \text{ if } g_j({\mathbf {w}}) \ge \text{ or } = 0 \,\text{ for } \text{ every } {\mathbf {w}}\in {\mathbb {K}}, \text{ respectively }. \end{aligned}$$

By the continuity, we can relax the restriction \(w_0 > 0\) into \(w_0 \ge 0\), and obtain

$$\begin{aligned} h_j(w_0,{\mathbf {w}})\ge & {} \text{ or } = 0 \text{ for } \text{ every } \varvec{x}= (w_0,{\mathbf {w}}) \in {\mathbb {L}}= {\mathbb {R}}_+ \times {\mathbb {K}}\\&\text{ if } \text{ and } \text{ only } \text{ if } g_j({\mathbf {w}}) \ge \text{ or } = 0 \,\text{ for } \text{ every } {\mathbf {w}}\in {\mathbb {K}}, \text{ respectively }. \end{aligned}$$

This relation holds for every \(j \in J\). Therefore, (17) follows from (30).
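The homogeneity identity used above, \(h_j(w_0,{\mathbf {w}}) = (w_0)^{\tau } g_j({\mathbf {w}}/w_0)\), can be illustrated with a toy constraint polynomial (an assumed example, not from the paper):

```python
tau = 2  # max degree in this toy example

# Toy constraint polynomial g(w) = w1*w2 - 1 on m = 2 variables,
# stored as {alpha: coefficient}.
g = {(1, 1): 1.0, (0, 0): -1.0}

def homogenize(coeffs):
    """Homogenize to degree tau: alpha maps to (tau - |alpha|_1, alpha)."""
    return {(tau - sum(a),) + a: c for a, c in coeffs.items()}

def evaluate(coeffs, x):
    """Evaluate a polynomial given as {exponent vector: coefficient}."""
    val = 0.0
    for beta, c in coeffs.items():
        m = c
        for xi, bi in zip(x, beta):
            m *= xi ** bi
        val += m
    return val

h = homogenize(g)
w0, w = 3.0, (0.4, 2.5)  # a point with w0 > 0 in an assumed cone L = R_+ x K
lhs = evaluate(h, (w0,) + w)
rhs = w0 ** tau * evaluate(g, tuple(wi / w0 for wi in w))
assert abs(lhs - rhs) < 1e-12
print("h(w0, w) == w0^tau * g(w / w0)")
```

Since \({\mathbb {K}}\) is a cone, \({\mathbf {w}}/w_0\) ranges over all of \({\mathbb {K}}\) as \({\mathbf {w}}\) does, which is why the sign conditions on \(h_j\) over \({\mathbb {L}}\) and on \(g_j\) over \({\mathbb {K}}\) are equivalent.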

Assume that \( \varvec{x}= (w_0,{\mathbf {w}}) \in \left\{ \varvec{x}\in {\mathbb {L}}: h_j(\varvec{x}) = 0 \,(j \in J_0) \right\} . \) Then \({\mathbf {w}}\in {\mathbb {K}}\), \(w_0 = 0\) and \(0 = h_j(0,{\mathbf {w}}) = \hat{g}_j({\mathbf {w}}) \,(j \in J)\). By condition (31), there exists a sequence \(\left\{ (\mu _r,{\mathbf {v}}_r) \in {\mathbb {R}}^{n} \right\} \) such that \((\mu _r,{\mathbf {v}}_r) \in {\mathbb {R}}_+ \times {\mathbb {K}}, \,g_j({\mathbf {v}}_r) = 0 \,\) \((j \in J) \text{ and } (\mu _r,\mu _r {\mathbf {v}}_r) \rightarrow (0,{\mathbf {w}}) \, \text{ as } r \rightarrow \infty . \) By letting \({\mathbf {y}}_r = (1,{\mathbf {v}}_r) \in {\mathbb {L}}\;(r=1,2,\ldots )\), we have

$$\begin{aligned}&(\mu _r, {\mathbf {y}}_r) \in {\mathbb {R}}_+ \times {\mathbb {L}}, \,h_0({\mathbf {y}}_r) = 1, \,h_j({\mathbf {y}}_r) = g_j({\mathbf {v}}_r) = 0 \,(j \in J), \\&(\mu _r,\mu _r{\mathbf {y}}_r) = (\mu _r,(\mu _r,\mu _r{\mathbf {v}}_r)) \rightarrow (0,(0,{\mathbf {w}})) = (0,\varvec{x}) \,\text{ as } r \rightarrow \infty . \end{aligned}$$

This implies that \(\varvec{x}\in T_*^{\infty }\). Consequently, we have shown (18).

By applying Theorem 3.1, we know that the identity (19) is satisfied. Taking into account all equalities and inequalities in (19), (25), (27) and (29), we finally conclude that equality holds in (25), i.e., POP (3) and its relaxation (24) have the same optimal value.

6 Conclusions

We have shown that the results on the CPP relaxation for QOPs [1] can be extended to POP (1) satisfying conditions (11), (16), (17) and (18). For this extension, we have introduced the moment cone (14) and the moment cone relaxation (15) of the POP, which provides the exact optimal value of the POP. We note that implementing the moment cone relaxation computationally is quite difficult.

When compared with the conditions used in [9], the conditions for the equivalence presented in this paper are weaker as the convexity of the cone is not required.

Another difference between the proposed moment cone relaxation for POP (1) and the completely positive reformulation in [9] is that the proposed relaxation takes the sparsity of POP (1) into account, instead of using all the monomials of degree up to \(\tau \). As a result, a much smaller moment cone relaxation can be obtained by the proposed method.

The doubly nonnegative relaxation, a further relaxation of the moment cone relaxation for POPs, can be considered for implementation. The moment cone relaxation in this paper has been derived in a manner that preserves the sparsity of the polynomials. As a result, its doubly nonnegative relaxation, which requires each variable to be nonnegative, involves fewer variables, and the sparsity can be exploited.