1 Introduction

Let \(\mathbb R[X]:=\mathbb R[X_1,\ldots ,X_n]\) be the ring of polynomials in the variables \(X_1,\ldots ,X_n\) with real coefficients. Denote by \(\varDelta _n\) the standard n-simplex in \(\mathbb R^n\), which is defined by

$$\begin{aligned} \varDelta _n:=\left\{ x=(x_1,\ldots ,x_n)\in \mathbb R^n| x_i\ge 0, \sum _{i=1}^n x_i=1\right\} . \end{aligned}$$

Pólya [9] proved in 1928 that for a homogeneous polynomial \(f\in \mathbb R[X]\), if \(f(x)>0\) for every \(x\in \varDelta _n\), then there exists a sufficiently large integer N such that all coefficients of the polynomial \((X_1+\cdots +X_n)^N \cdot f\) are positive.
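For illustration, the following minimal sympy sketch (our own toy example, not from the paper) finds the smallest such N for \(f=X_1^2-X_1X_2+X_2^2\), which is homogeneous and strictly positive on \(\varDelta _2\):

```python
# Brute-force illustration of Polya's theorem for f = x^2 - x*y + y^2,
# homogeneous and strictly positive on the simplex x + y = 1, x, y >= 0.
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 - x*y + y**2

for N in range(10):
    g = sp.expand((x + y)**N * f)
    coeffs = sp.Poly(g, x, y).coeffs()   # nonzero coefficients only
    # require all coefficients positive AND all N+3 monomials of degree N+2 present
    if all(c > 0 for c in coeffs) and len(coeffs) == N + 3:
        print(f"N = {N} works: {g}")     # prints N = 3
        break
```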

Powers and Reznick [10] gave an explicit bound for the number N and applied it to give a constructive version of Handelman’s Positivstellensatz. More explicitly, let \(P\subseteq \mathbb R^n\) be a convex, compact polyhedron with non-empty interior, bounded by linear polynomials \(L_1, \ldots , L_m\in \mathbb R[X]\). By choosing the signs of the \(L_i\)’s appropriately, we may assume that

$$\begin{aligned} P=\left\{ x\in \mathbb R^n| L_i(x)\ge 0, ~ i=1,\ldots ,m\right\} . \end{aligned}$$
(1)

Theorem 1

(Handelman’s Positivstellensatz [4]) For a polynomial \(f\in \mathbb R[X]\), if \(f(x)>0\) for all \(x\in P\), then f can be represented as

$$\begin{aligned} f = \sum _{|\alpha |\le M} f_\alpha L_1^{\alpha _1}\ldots L_m^{\alpha _m} \end{aligned}$$

for some \(M\in \mathbb N\) and real numbers \(f_\alpha \ge 0\) for all \(\alpha =(\alpha _1,\ldots ,\alpha _m) \in \mathbb N^m\) with \(|\alpha |\le M\).

Krivine [6] proved Handelman’s Positivstellensatz for a special polyhedron. Moreover, one can find a generalization of this Positivstellensatz in [11, Theorem 5.4.6] (or [8, Theorem 7.1.6]). A bound for the number M was given by Powers and Reznick [10], using the bound for the number N in Pólya’s Positivstellensatz.
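For a fixed bound M, finding the coefficients \(f_\alpha \) in Theorem 1 is a linear feasibility problem: equating the coefficients of f with those of \(\sum _\alpha f_\alpha L_1^{\alpha _1}\ldots L_m^{\alpha _m}\) gives linear equations in the unknowns \(f_\alpha \ge 0\). The sketch below (not the paper’s method; the instance \(P=[0,1]\subseteq \mathbb R\) with \(L_1=x\), \(L_2=1-x\) and \(f=x^2-x+1\) is our own toy example) solves this with scipy:

```python
# Handelman representation on P = [0,1] as a linear feasibility problem.
import numpy as np
import sympy as sp
from itertools import product
from scipy.optimize import linprog

x = sp.symbols('x')
L = [x, 1 - x]
f = x**2 - x + 1        # strictly positive on [0, 1]
M = 2                   # degree bound |alpha| <= M

alphas = [a for a in product(range(M + 1), repeat=2) if sum(a) <= M]
basis = [sp.expand(L[0]**a[0] * L[1]**a[1]) for a in alphas]

def coeff_vec(p, deg):
    # ascending coefficient vector (constant, x, ..., x^deg), padded with zeros
    c = [float(v) for v in sp.Poly(p, x).all_coeffs()[::-1]]
    return np.array(c + [0.0] * (deg + 1 - len(c)))

A = np.column_stack([coeff_vec(p, M) for p in basis])   # columns are the L^alpha
b = coeff_vec(f, M)

# feasibility LP: minimize 0 subject to A w = b, w >= 0
res = linprog(np.zeros(len(alphas)), A_eq=A, b_eq=b,
              bounds=[(0, None)] * len(alphas))
print(res.status, dict(zip(alphas, np.round(res.x, 6))))
# one valid certificate is f = x^2 + x(1-x) + (1-x)^2
```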

Theorem 1 yields the following consequence.

Corollary 1

For a polynomial \(f\in \mathbb R[X]\), if \(f(x)>0\) for all \(x\in P\), then f can be represented as

$$\begin{aligned} f = \sum _{e=(e_1,\ldots ,e_m)\in \{0,1\}^m} f^2_e L_1^{e_1}\ldots L_m^{e_m}, \end{aligned}$$

where \(f_e\in \mathbb R[X]\) and \(\deg (f_e^2) \le M\).

This corollary is a special case of Schmüdgen’s Positivstellensatz [14] for convex, compact polyhedra, which includes an explicit bound on the degrees of the sums of squares coefficients \(f_e^2\).

Schmüdgen’s Positivstellensatz has many important applications, especially in solving polynomial optimization problems and moment problems for compact semi-algebraic sets. Therefore, as a special case of Schmüdgen’s Positivstellensatz, Handelman’s theorem for polynomials plays an important role in applications.

A matrix version of Pólya’s Positivstellensatz was given by Scherer and Hol [13], with applications, e.g., in robust polynomial semi-definite programs. A version of Schmüdgen’s theorem for operator polynomials was obtained by Cimprič and Zalar [3]. Positivstellensätze for polynomial matrices have been studied by several other authors; see for example [1, 2, 7, 12]. The main aim of this paper is to give a version of Handelman’s Positivstellensatz for polynomial matrices with an explicit degree bound.

We need to introduce some notation. For \(t\in \mathbb N^*\), let \(\mathcal {M}_t(R)\) denote the ring of square matrices of order t with entries from a commutative unital ring R. Denote by \(\mathcal {S}_t(R)\) the subset of \(\mathcal {M}_t(R)\) consisting of all symmetric matrices.

In this paper we are mainly concerned with the case where R is the ring \(\mathbb R[X]\) of polynomials in n variables \(X_1,\ldots , X_n\) with real coefficients. Each element \(\mathbf{A } \in \mathcal {M}_t(\mathbb R[X])\) is a matrix whose entries are polynomials from \(\mathbb R[X]\), called a polynomial matrix. Such an \(\mathbf{A }\) is also called a matrix polynomial, because it can be viewed as a polynomial in \(X_1,\ldots , X_n\) whose coefficients are matrices from \(\mathcal {M}_t(\mathbb R)\). Namely, we can write \(\mathbf{A }\) as

$$\begin{aligned} \mathbf{A }=\sum _{|\alpha |=0}^{d} \mathbf{A }_{\alpha }X^{\alpha }, \end{aligned}$$

where \(\alpha =(\alpha _1,\ldots ,\alpha _n) \in \mathbb N^{n}\), \(|\alpha |:=\alpha _1+\cdots + \alpha _n\), \(X^{\alpha }:=X_1^{\alpha _1}\ldots X_n^{\alpha _n}\), \(\mathbf{A }_\alpha \in \mathcal {M}_t(\mathbb R)\), and d is the maximum of the degrees of the entries of \(\mathbf{A }\), called the degree of the matrix polynomial \(\mathbf{A }\). To unify notation, throughout the paper each element of \(\mathcal {M}_t(\mathbb R[X])\) is called a polynomial matrix.

For any polynomial matrix \(\mathbf{A }\in \mathcal {M}_t(\mathbb R[X])\) and for any subset \(K\subseteq \mathbb R^n\), by \(\mathbf{A }\succcurlyeq 0 \) (resp. \(\mathbf{A }\succ 0\)) on K we mean that for any \(x\in K\), the matrix \(\mathbf{A }(x)\) is positive semidefinite (resp. positive definite), i.e. all eigenvalues of the matrix \(\mathbf{A }(x)\) are non-negative (resp. positive).

For any polynomial matrices \(\mathbf{A }, \mathbf{B }\in \mathcal {M}_t(\mathbb R[X])\), the notation \(\mathbf{A }\succcurlyeq \mathbf{B }\) on K means that \(\mathbf{A }-\mathbf{B }\succcurlyeq \mathbf{0 }\) on K.

Suppose that we have a convex, compact polyhedron \(P\subseteq \mathbb R^n\) with non-empty interior, bounded by linear polynomials \(L_1, \ldots , L_m\in \mathbb R[X]\), defined by (1). Let \(\mathbf{F } \in \mathcal {S}_t(\mathbb R[X])\) be a polynomial matrix of degree \(d>0\). Assume \(\mathbf{F } \succ \mathbf{0 }\) on P. The main result of this paper is presented in Theorem 3, a matrix version of Handelman’s Positivstellensatz, stating that there exists a number \(N_0\) such that for every integer \(N >N_0\) the polynomial matrix \(\mathbf{F }\) can be written as

$$\begin{aligned} \mathbf{F }=\sum _{|\alpha |=N+d}\mathbf{F }_\alpha L_1^{\alpha _1}\ldots L_m^{\alpha _m}, \end{aligned}$$

where the \(\mathbf{F }_\alpha \in \mathcal {S}_t(\mathbb R)\), \(|\alpha |=N+d\), are positive definite constant matrices.

The main idea of the proof is inherited from Powers and Reznick [10], using a matrix version of Pólya’s Positivstellensatz [13] and the continuous dependence of the eigenvalue functions of the polynomial matrix \(\mathbf{F }\) on the entries of \(\mathbf{F }\) (by [16, Theorem 1]). As a corollary of this theorem, we give a special case of Schmüdgen’s Positivstellensatz for polynomial matrices positive definite on convex, compact polyhedra. Furthermore, we give a procedure to find such a representation for the polynomial matrix \(\mathbf{F }\).

2 Representation of polynomial matrices positive definite on simplices

In this section we consider a simple case where P is an n-simplex with vertices \(\{v_0,v_1,\ldots ,v_n\}\) and let \(\{L_0,L_1,\ldots ,L_n\}\) be the set of barycentric coordinates of P, that is, each \(L_i\in \mathbb R[X]\) is linear and

$$\begin{aligned} X=\sum _{i=0}^n L_i(X) v_i, ~ \sum _{i=0}^nL_i(X)=1, L_i(v_j)=\delta _{ij}. \end{aligned}$$
(2)

Let \(\mathbf{F }\in \mathcal {S}_t(\mathbb R[X])\) be a polynomial matrix of degree \(d>0\). We can express \(\mathbf{F }\) as

$$\begin{aligned} \mathbf{F }(X)=\sum _{|\alpha |\le d} \mathbf{A }_\alpha X^{\alpha }, \end{aligned}$$

where \(\mathbf{A }_\alpha \in \mathcal {M}_t(\mathbb R)\).

Let us consider the Bernstein–Bézier form of \(\mathbf{F }\) with respect to P:

$$\begin{aligned} \widetilde{\mathbf{F }}_d(Y):=\widetilde{\mathbf{F }}_d(Y_0,\ldots ,Y_n):=\sum _{|\alpha |\le d} \mathbf{A }_\alpha \left( \sum _{i=0}^n Y_iv_i\right) ^\alpha \left( \sum _{i=0}^n Y_i\right) ^{d-|\alpha |}. \end{aligned}$$
(3)

It is easy to see that \(\widetilde{\mathbf{F }}_d(Y)\in \mathcal {S}_t(\mathbb R[Y])\) is a homogeneous polynomial matrix of degree d. Moreover, it follows from the relations (2) that

$$\begin{aligned} \widetilde{\mathbf{F }}_d(L_0,\ldots ,L_n)=\mathbf{F }(X). \end{aligned}$$
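As an illustration, the following sympy sketch (our own scalar example, \(t=1\), on the standard triangle; not from the paper) builds the Bernstein–Bézier form (3) and verifies the identity above:

```python
# Bernstein-Bezier form of f on the triangle with vertices (0,0), (1,0), (0,1),
# whose barycentric coordinates are L0 = 1-x-y, L1 = x, L2 = y.
import sympy as sp

x, y = sp.symbols('x y')
Y = sp.symbols('Y0 Y1 Y2')
v = [(0, 0), (1, 0), (0, 1)]      # vertices of the simplex
f = x**2 + 3*x*y - y + 2          # degree d = 2
d = 2

X1 = sum(Yi * vi[0] for Yi, vi in zip(Y, v))   # substitute X = sum_i Y_i v_i
X2 = sum(Yi * vi[1] for Yi, vi in zip(Y, v))
poly = sp.Poly(f, x, y)
F_tilde = sp.expand(sum(
    A * X1**a1 * X2**a2 * sum(Y)**(d - a1 - a2)   # pad each term to degree d
    for (a1, a2), A in zip(poly.monoms(), poly.coeffs())))

# verify F_tilde(L0, L1, L2) = f, as in the display above
L = [1 - x - y, x, y]
assert sp.expand(F_tilde.subs(dict(zip(Y, L)))) == sp.expand(f)
print(F_tilde)
```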

Following Scherer and Hol [13], for each multi-index \(\alpha =(\alpha _1,\ldots ,\alpha _n)\in \mathbb N^n\), let us denote

$$\begin{aligned} \alpha !:=\alpha _1!\ldots \alpha _n!; ~ D_\alpha :=\partial _1^{\alpha _1}\ldots \partial _n^{\alpha _n}. \end{aligned}$$

With these notations, we can re-write \(\mathbf{F }\) as

$$\begin{aligned} \mathbf{F }(X)= \sum _{|\alpha |\le d} \dfrac{D_\alpha \mathbf{F }(0)}{\alpha !}X^\alpha . \end{aligned}$$

With the spectral norm \(\Vert {\cdot }\Vert \), following Scherer and Hol [13], we define

$$\begin{aligned} C(\mathbf{F }):=\max _{|\alpha |\le d}\dfrac{\Vert D_\alpha \mathbf{F }(0)\Vert }{|\alpha |!}. \end{aligned}$$
(4)
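The constant (4) is straightforward to evaluate; a small sketch (the matrix below is our own toy example, not from the paper), using sympy derivatives and the numpy spectral norm:

```python
# Evaluating C(F) = max_{|alpha| <= d} ||D_alpha F(0)|| / |alpha|! numerically.
import math
from itertools import product
import numpy as np
import sympy as sp

x, y = sp.symbols('x y')
F = sp.Matrix([[x**2 + 1, x*y],
               [x*y, y**2 + 2]])   # assumed example, degree d = 2
d, vars_ = 2, (x, y)

C = 0.0
for alpha in product(range(d + 1), repeat=len(vars_)):
    if sum(alpha) > d:
        continue
    D = F
    for v, k in zip(vars_, alpha):
        D = sp.diff(D, v, k)                       # D_alpha F
    D0 = np.array(D.subs({x: 0, y: 0}), dtype=float)
    C = max(C, np.linalg.norm(D0, 2) / math.factorial(sum(alpha)))
print("C(F) =", C)                                 # spectral norms at the origin
```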

Using these notations, we have the following representation of polynomial matrices which are positive on simplices.

Theorem 2

Let \(P\subseteq \mathbb R^n\) be an n-simplex given as above and \(\mathbf{F }\in \mathcal {S}_t(\mathbb R[X])\) a polynomial matrix of degree \(d>0\). Assume that \(\mathbf{F }\succcurlyeq \lambda \mathbf{I }_t\) on P for some \(\lambda >0\). Let \(C:=C(\widetilde{\mathbf{F }_d})\). Then for each \(N>\dfrac{d(d-1)}{2} \dfrac{C}{\lambda }-d\), \(\mathbf{F }\) can be represented as

$$\begin{aligned} \mathbf{F }=\sum _{|\alpha |=N+d}\mathbf{F }_\alpha L_0^{\alpha _0}\ldots L_n^{\alpha _n}, \end{aligned}$$

where each \(\mathbf{F }_\alpha \in \mathcal {S}_t(\mathbb R)\) is positive definite.

Proof

Let us denote by \(\varDelta _{n+1}\) the standard simplex in \(\mathbb R^{n+1}\), i.e.

$$\begin{aligned} \varDelta _{n+1}=\left\{ (y_0,\ldots ,y_n)\in \mathbb R^{n+1}| y_i \ge 0, \sum _{i=0}^n y_i=1\right\} . \end{aligned}$$

Since \(\mathbf{F }(x)\succcurlyeq \lambda \mathbf{I }_t\) for all \(x\in P\), the Bernstein–Bézier form \(\widetilde{\mathbf{F }_d}\) of \(\mathbf{F }\) with respect to P satisfies

$$\begin{aligned} \widetilde{\mathbf{F }_d}(y_0,\ldots ,y_n)\succcurlyeq \lambda \mathbf{I }_t, \forall (y_0,\ldots ,y_n)\in \varDelta _{n+1}. \end{aligned}$$

Then it follows from Pólya’s theorem for polynomial matrices [13, Theorem 3] that for each \(N> \dfrac{d(d-1)}{2} \dfrac{C}{\lambda }-d\),

$$\begin{aligned} \left( \sum _{i=0}^n Y_i\right) ^N\widetilde{\mathbf{F }_d}(Y)=\sum _{|\alpha |=N+d}\mathbf{F }_\alpha Y_0^{\alpha _0}\ldots Y_n^{\alpha _n}, \end{aligned}$$
(5)

where each \(\mathbf{F }_\alpha \in \mathcal {S}_t(\mathbb R)\) is positive definite. Substituting \(Y_i\) by \(L_i\) on both sides of (5), noting that

$$\begin{aligned} \widetilde{\mathbf{F }_d}(L_0(X),\ldots ,L_n(X)) = \mathbf{F }(X) \text{ and } \sum _{i=0}^n L_i(X)=1, \end{aligned}$$

we obtain the required representation for \(\mathbf{F }\). \(\square \)

3 Representation of polynomial matrices positive definite on convex, compact polyhedra

Throughout this section, let \(P\subseteq \mathbb R^n\) be a convex, compact polyhedron with non-empty interior, given by (1). By [15], there exist positive numbers \(c_i\in \mathbb R\) such that \(\sum _{i=1}^m c_iL_i(X)=1\). Replacing each \(L_i\) by \(c_iL_i\) we may assume that

$$\begin{aligned} \sum _{i=1}^m L_i(X)=1. \end{aligned}$$
(6)

Moreover, it is easy to check that for each \(i=1,\ldots ,n\), there exist real numbers \(b_{ij}\in \mathbb R, j=1,\ldots ,m\) such that

$$\begin{aligned} X_i = \sum _{j=1}^m b_{ij}L_j(X). \end{aligned}$$

Let us consider the \(n\times m\) matrix \(\mathbf{B }:=(b_{ij})_{i=1,\ldots ,n; j=1,\ldots ,m}\). Then for \(X=(X_1,\ldots ,X_n)\) and \(L=(L_1,\ldots ,L_m)\), we have \( X^T=\mathbf{B }\cdot L^T. \) In other words, we have

$$\begin{aligned} X=L\cdot \mathbf{B }^T. \end{aligned}$$
(7)
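One way to compute such a matrix \(\mathbf{B }\) is to solve the (generally under-determined) linear system on the coefficients of the \(L_j\) by least squares. A numpy sketch, using the data of Example 1 below (\(L_1=\frac{1+x}{4}\), \(L_2=\frac{1-x}{4}\), \(L_3=\frac{1+y}{4}\), \(L_4=\frac{1-y}{4}\)):

```python
# Solve X_i = sum_j b_ij L_j(X) for B by (minimum-norm) least squares.
import numpy as np

# columns: coefficient vectors of L_1..L_4 in the basis {1, x, y}
Lcoef = np.array([[0.25,  0.25, 0.25,  0.25],   # constant terms
                  [0.25, -0.25, 0.00,  0.00],   # x-coefficients
                  [0.00,  0.00, 0.25, -0.25]])  # y-coefficients

B = np.zeros((2, 4))
for i in range(2):
    target = np.zeros(3)
    target[i + 1] = 1.0                          # coefficient vector of X_i
    B[i], *_ = np.linalg.lstsq(Lcoef, target, rcond=None)
print(np.round(B, 6))   # recovers [[2, -2, 0, 0], [0, 0, 2, -2]]
```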

Denote \(\mathbb R[Y]:=\mathbb R[Y_1,\ldots ,Y_m]\), and consider the ring homomorphism

$$\begin{aligned} \varphi {:}\,\mathbb R[Y] \rightarrow \mathbb R[X], \quad Y_i\longmapsto L_i(X), \forall i=1,\ldots ,m. \end{aligned}$$

It follows from (6) that \(\sum _{i=1}^m Y_i-1\in \text{ Ker }(\varphi )\). Since the ring \(\mathbb R[Y]\) is Noetherian, the ideal \(I:=\text{ Ker }(\varphi )\) is generated by finitely many polynomials \(R_1(Y),\ldots , R_s(Y)\in \mathbb R[Y]\),

$$\begin{aligned} I:=\text{ Ker }(\varphi )=\left\langle R_1(Y),\ldots ,R_s(Y)\right\rangle , \end{aligned}$$

where \(\sum _{i=1}^m Y_i-1\) is one of the \(R_i\)’s.
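In practice, generators of \(\text{ Ker }(\varphi )\) can be computed as the elimination ideal \(\left\langle Y_i-L_i(X)\right\rangle \cap \mathbb R[Y]\), via a Gröbner basis for an order eliminating the X-variables. A sympy sketch, again with the \(L_i\) of Example 1 below:

```python
# Kernel of phi: Y_i -> L_i(X), computed as an elimination ideal with a lex
# Groebner basis in which x, y come first (and hence are eliminated).
import sympy as sp

x, y = sp.symbols('x y')
Ys = sp.symbols('y1 y2 y3 y4')
L = [(1 + x) / 4, (1 - x) / 4, (1 + y) / 4, (1 - y) / 4]

G = sp.groebner([Yi - Li for Yi, Li in zip(Ys, L)],
                x, y, *Ys, order='lex')
kernel_gens = [g for g in G.exprs if not g.has(x, y)]
print(kernel_gens)        # [y1 + y2 - 1/2, y3 + y4 - 1/2]
```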

Note that the homomorphism \(\varphi \) induces a ring homomorphism

$$\begin{aligned} M_\varphi {:}\,\mathcal {M}_t(\mathbb R[Y]) \longrightarrow \mathcal {M}_t(\mathbb R[X]),\quad \mathbf{G }=(g_{ij}(Y))\longmapsto (\varphi (g_{ij}(Y))). \end{aligned}$$

Lemma 1

The homomorphism \(M_\varphi \) is surjective, and

$$\begin{aligned} \mathcal {I}:=\text{ Ker }(M_\varphi )=\left\langle R_1(Y)\mathbf{I }_t,\ldots , R_s(Y)\mathbf{I }_t\right\rangle , \end{aligned}$$

where \(\mathbf{I }_t\) denotes the identity matrix in \(\mathcal {M}_t(\mathbb R[Y])\).

Proof

For each \(g(X)=\sum _{|\alpha |\le d}a_\alpha X^\alpha \in \mathbb R[X]\), denote

$$\begin{aligned} \widetilde{g}(Y):=\sum _{|\alpha |\le d}a_\alpha (Y\cdot \mathbf{B }^T)^\alpha \left( \sum _{i=1}^mY_i\right) ^{d-|\alpha |}\in \mathbb R[Y]. \end{aligned}$$
(8)

It is clear that \(\widetilde{g}\) is homogeneous of degree d. Moreover, by (6) and (7), \(\varphi (\widetilde{g}(Y))=g(X)\); hence \(\varphi \) is surjective. The surjectivity of \(M_\varphi \) then follows from that of \(\varphi \).

On the other hand, \(\mathbf{G }=(g_{ij}(Y))\in \text{ Ker }(M_\varphi )\) if and only if \(g_{ij}\in \text{ Ker }(\varphi )\) for all \(i,j=1,\ldots ,t\). Hence for each \(i,j=1,\ldots ,t\) we have

$$\begin{aligned} g_{ij}(Y)=\sum _{k=1}^s a_{ijk}(Y)R_k(Y), \text{ for } \text{ some } a_{ijk}(Y) \in \mathbb R[Y]. \end{aligned}$$

Then \(\mathbf{G }\) can be written as

$$\begin{aligned} \mathbf{G }=\sum _{k=1}^s R_k\mathbf{A }_{\mathbf{k}}=\sum _{k=1}^s (R_k\mathbf{I }_t)\mathbf{A }_{\mathbf{k}}, \end{aligned}$$

where \(\mathbf{A }_{\mathbf{k}}=(a_{ijk}(Y))\in \mathcal {M}_t(\mathbb R[Y])\) for each \(k=1,\ldots , s\). That is, \(\mathbf{G }\in \left\langle R_1\mathbf{I }_t,\ldots , R_s\mathbf{I }_t\right\rangle .\) The proof is complete. \(\square \)

Let \(\mathbf{F }=(f_{ij})\in \mathcal {S}_t(\mathbb R[X])\) be a polynomial matrix of degree \(d>0\). Denote \(\widetilde{\mathbf{F }}:=(\widetilde{f_{ij}})\in \mathcal {S}_t(\mathbb R[Y])\), where each \(\widetilde{f_{ij}}\) is defined by (8), which is a homogeneous polynomial of degree d.

Let \(\lambda (\mathbf{F })\) be an eigenvalue function of \(\mathbf{F }\). It follows from [16, Theorem 1] that \(\lambda (\mathbf{F })\) depends continuously on the entries \(f_{ij}(X)\), \(i,j=1,\ldots ,t\). That is, there exists a continuous function \(\varLambda {:}\,\mathbb R^{t\times t} \rightarrow \mathbb R\) such that \(\lambda (\mathbf{F })=\varLambda (f_{ij}(X))\). Denote \(\widetilde{\lambda (\mathbf{F })}(Y):=\varLambda (\widetilde{f_{ij}}(Y))\), which is actually an eigenvalue function of the polynomial matrix \(\widetilde{\mathbf{F }}\).

Denote \(R(Y):=\sum _{i=1}^s R_i^2(Y)\). With the notations given above, we have the following useful lemma.

Lemma 2

Let \(\mathbf{F }=(f_{ij})\in \mathcal {S}_t(\mathbb R[X])\) be a polynomial matrix of degree \(d>0\), and let \(\lambda (\mathbf{F })\) be an eigenvalue function of \(\mathbf{F }\). If \(\lambda (\mathbf{F })> 0\) on P, then there exists a sufficiently large constant c such that \(\widetilde{\lambda (\mathbf{F })}+cR>0\) on the standard m-simplex \( \varDelta _m\). More explicitly, this holds for every \(c>-m_1/m_2\), where \(m_1\) is the minimum of \(\widetilde{\lambda (\mathbf{F })}\) on \(\varDelta _m\) and \(m_2\) is the minimum of the polynomial R on the compact set \(\varDelta _m\cap \{y\in \mathbb R^m|\widetilde{\lambda (\mathbf{F })}(y)\le 0\}\).

Proof

The proof goes along the same lines as the proof of [10, Lemma 4], using continuity of the function \(\widetilde{\lambda (\mathbf{F })}\). \(\square \)
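The constants \(m_1\) and \(m_2\) can be estimated numerically, for instance by sampling \(\varDelta _m\). A rough sketch (sampling only, so the resulting c is an estimate rather than a certified bound; the callables lam_tilde and R below are our own toy stand-ins):

```python
# Estimate c > -m1/m2 from Lemma 2 by sampling the standard m-simplex.
import numpy as np

rng = np.random.default_rng(0)

def sample_simplex(k, m):
    # uniform samples on the standard m-simplex via normalized exponentials
    p = rng.exponential(size=(k, m))
    return p / p.sum(axis=1, keepdims=True)

def estimate_c(lam_tilde, R, m, k=100_000):
    ys = sample_simplex(k, m)
    lam = np.array([lam_tilde(y) for y in ys])
    m1 = lam.min()                         # estimate of min lam_tilde on simplex
    if m1 > 0:
        return 0.0                         # already positive: no shift needed
    m2 = min(R(y) for y in ys[lam <= 0])   # estimate of min R where lam_tilde <= 0
    return -m1 / m2                        # any c above this estimate works

# toy inputs modeled on the unit-square example of Section 3:
R = lambda y: (y[0] + y[1] - 0.5)**2 + (y[2] + y[3] - 0.5)**2
lam_tilde = lambda y: 8*(y[0] + y[1])**2 - 1   # assumed surrogate, positive where R = 0
print(estimate_c(lam_tilde, R, m=4))
```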

Applying this lemma, we obtain the following.

Lemma 3

Let \(\mathbf{F }=(f_{ij})\in \mathcal {S}_t(\mathbb R[X])\) be a polynomial matrix of degree \(d>0\). Denote \(\widetilde{\mathbf{F }}:=(\widetilde{f_{ij}})\in \mathcal {S}_t(\mathbb R[Y])\). Assume \(\mathbf{F }\succ \mathbf{0 }\) on P. Then there exists a sufficiently large constant c such that \(\widetilde{\mathbf{F }}+cR\mathbf{I }_t \succ \mathbf{0 }\) on the standard m-simplex \(\varDelta _m\).

Proof

Since \(\mathbf{F }\) is positive definite on P, its eigenvalue functions \(\lambda _k(\mathbf{F })\), \(k=1,\ldots ,t\), are positive on P. It follows from Lemma 2 that for each k there exists a sufficiently large constant \(c_k\) such that \(\widetilde{\lambda _k(\mathbf{F })} + c_k R\) is positive on \(\varDelta _m\). Let \( c:=\max _{k=1,\ldots ,t}{c_k}\). Then \(\widetilde{\lambda _k(\mathbf{F })} + c R\) is positive on \(\varDelta _m\) for each \(k=1, \ldots , t\). Note that \(\widetilde{\lambda _k(\mathbf{F })}\), \(k=1,\ldots ,t\), are the eigenvalue functions of the polynomial matrix \(\widetilde{\mathbf{F }}\), and the eigenvalue functions of the matrix \(\widetilde{\mathbf{F }}+cR\mathbf{I }_t\) are \(\widetilde{\lambda _k(\mathbf{F })}+cR\), \(k=1,\ldots ,t\). It follows that \(\widetilde{\mathbf{F }}+cR\mathbf{I }_t\) is positive definite on \(\varDelta _m\). The proof is complete. \(\square \)

Note that \(\overline{\mathbf{F }}:=\widetilde{\mathbf{F }}+cR\mathbf{I }_t\) need not be homogeneous. However, homogenizing \(\overline{\mathbf{F }}\) with respect to \(\sum _{i=1}^m Y_i\), we obtain a homogeneous polynomial matrix of the same degree as \(\overline{\mathbf{F }}\). More explicitly, if we express \(\overline{\mathbf{F }}\) as

$$\begin{aligned} \overline{\mathbf{F }} = \sum _{|\beta |\le d} \overline{\mathbf{F }}_\beta Y^\beta ,\quad \overline{\mathbf{F }}_\beta \in \mathcal {S}_t(\mathbb R), \end{aligned}$$

then its homogenization by \(\sum _{i=1}^m Y_i\) is

$$\begin{aligned} \overline{\mathbf{F }}^h=\sum _{|\beta |\le d} \overline{\mathbf{F }}_\beta Y^\beta \left( \sum _{i=1}^m Y_i\right) ^{d-|\beta |}. \end{aligned}$$
(9)

\(\overline{\mathbf{F }}^h\) is a homogeneous polynomial matrix of degree d. Moreover, \(M_\varphi (\overline{\mathbf{F }}^h)=\mathbf{F }\) (since \(\varphi (\sum _{i=1}^m Y_i)=1\), \(\varphi (R)=0\), and \(M_\varphi (\widetilde{\mathbf{F }})=\mathbf{F }\)), and \(\overline{\mathbf{F }}^h\) is positive definite on \(\varDelta _m\), since it agrees with \(\overline{\mathbf{F }}\) there.
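A sympy sketch of the homogenization (9) for a single entry (apply it entrywise for a polynomial matrix; the inhomogeneous polynomial below is our own example):

```python
# Homogenize p by sum(Ys): pad each term of degree |beta| with (sum Ys)^(d-|beta|).
import sympy as sp

def homogenize(p, Ys, d):
    poly = sp.Poly(p, *Ys)
    s = sum(Ys)
    return sp.expand(sum(
        c * sp.prod(v**e for v, e in zip(Ys, mon)) * s**(d - sum(mon))
        for mon, c in zip(poly.monoms(), poly.coeffs())))

Ys = sp.symbols('y1 y2 y3 y4')
p = Ys[0]**2 + 3*Ys[1] - 1        # assumed inhomogeneous example, d = 2
ph = homogenize(p, Ys, 2)
# ph is homogeneous of degree 2 and agrees with p on the set sum(Ys) = 1
sub = {Ys[3]: 1 - Ys[0] - Ys[1] - Ys[2]}
assert sp.expand(ph.subs(sub) - p.subs(sub)) == 0
print(ph)
```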

Now we can state and prove the following matrix version of Handelman’s Positivstellensatz.

Theorem 3

Let P, \(\mathbf{F }\), \(\overline{\mathbf{F }}\), \(\overline{\mathbf{F }}^h\) be given as above, with \(\mathbf{F }\) positive definite on P. Assume that \(\overline{\mathbf{F }}^h\succcurlyeq \lambda \mathbf{I }_t\) on \(\varDelta _m\) for some \(\lambda >0\). Let \(d:=\text{ deg }(\overline{\mathbf{F }})\) and \(C:=C(\overline{\mathbf{F }}^h)\). Then for each \(N> \dfrac{d(d-1)}{2} \dfrac{C}{\lambda }-d\), \(\mathbf{F }\) can be represented as

$$\begin{aligned} \mathbf{F }=\sum _{|\alpha |=N+d}\mathbf{F }_\alpha L_1^{\alpha _1}\ldots L_m^{\alpha _m}, \end{aligned}$$
(10)

where each \(\mathbf{F }_\alpha \in \mathcal {S}_t(\mathbb R)\) is positive definite.

Proof

First, apply the matrix version of Pólya’s Positivstellensatz [13, Theorem 3] to \(\overline{\mathbf{F }}^h\), observing that \(d=\deg (\overline{\mathbf{F }}^h)\); this yields a representation \(\left( \sum _{i=1}^m Y_i\right) ^N\overline{\mathbf{F }}^h=\sum _{|\alpha |=N+d}\mathbf{F }_\alpha Y_1^{\alpha _1}\ldots Y_m^{\alpha _m}\) with positive definite \(\mathbf{F }_\alpha \in \mathcal {S}_t(\mathbb R)\). Then apply \(M_\varphi \) to both sides, using the facts that \(M_\varphi (\overline{\mathbf{F }}^h)=\mathbf{F }\) and \(\varphi (\sum _{i=1}^m Y_i)=1\), to obtain (10). \(\square \)

In summary, we formulate the construction given above as a procedure to find a representation for a polynomial matrix \(\mathbf{F }=(f_{ij})\in \mathcal {S}_t(\mathbb R[X])\) positive definite on a convex, compact polyhedron \(P\subseteq \mathbb R^n\) as follows (a code sketch consolidating steps (8)–(10) is given after the list):

(1) Following [4] to find positive constants \(c_i\in \mathbb R\) such that \(\sum _{i=1}^m c_iL_i(X) = 1\). Constructing the \(c_i\)’s comes down to finding a positive solution of an under-determined linear system.

(2) Solving the system of equations

$$\begin{aligned} X_i=\sum _{j=1}^m b_{ij}L_{j}(X), \quad i=1,\ldots ,n, \end{aligned}$$

to find the matrix \(\mathbf{B }=(b_{ij})_{i=1,\ldots ,n; j=1,\ldots ,m}\).

(3) Using (8) to find \(\widetilde{f_{ij}}\), \(i,j=1,\ldots ,t\).

(4) Using Gröbner bases to find a generating set \(\{R_1,\ldots ,R_s\}\) of the kernel \(\text{ Ker }(\varphi )\) of the ring homomorphism \(\varphi \).

(5) Following the proof of Lemma 3 to find a sufficiently large c such that \(\widetilde{\mathbf{F }}+c R\mathbf{I }_t \succ \mathbf{0 }\) on \(\varDelta _m\).

(6) Using (9) to construct the homogenization \(\overline{\mathbf{F }}^h\) of \(\overline{\mathbf{F }}:=\widetilde{\mathbf{F }}+c R \mathbf{I }_t\).

(7) Following the proof of Lemma 4 below to find a positive number \(\lambda \) such that \(\overline{\mathbf{F }}^h(y)\succcurlyeq \lambda \mathbf{I }_t\) for all \(y\in \varDelta _m\).

Lemma 4

Let \(K\subseteq \mathbb R^m\) be a non-empty compact set, and \(\mathbf{G }\in \mathcal {S}_t(\mathbb R[Y])\). Then there exists a number \(c \in \mathbb R\) such that

$$\begin{aligned} \mathbf{G }(y)\succcurlyeq c\mathbf{I }_t, \text{ for } \text{ all } y \in K. \end{aligned}$$

In particular, if \(\mathbf{G }(y)\succ 0\) for all \(y\in K\), then we can choose a number \(c >0 \) such that \(\mathbf{G }(y)\succcurlyeq c \mathbf{I }_t, \text{ for } \text{ all } y \in K. \)

Proof

Let \(\lambda _{1}(\mathbf{G }), \ldots , \lambda _t(\mathbf{G })\) be (real-valued) eigenvalue functions of the polynomial matrix \(\mathbf{G }\in \mathcal {S}_t(\mathbb R[Y]). \) It follows from [16, Theorem 1] that \(\lambda _i(\mathbf{G })\) are continuous functions. Since K is compact, let

$$\begin{aligned} c_i:= \min _{y\in K} \lambda _i(\mathbf{G })(y), \quad i=1,\ldots , t. \end{aligned}$$

Denote \(c:=\min _{i=1,\ldots ,t} c_i\). Since eigenvalue functions of \(\mathbf{G }-c\mathbf{I }_t\) are \(\lambda _i(\mathbf{G })-c\), \(i=1,\ldots ,t\), it follows from the definition of c that

$$\begin{aligned} \lambda _i(\mathbf{G })(y) - c \ge \lambda _i(\mathbf{G })(y) -c_i\ge 0 \end{aligned}$$

for all \(y\in K\) and for all \(i=1,\ldots ,t\). This implies that \(\mathbf{G }(y)\succcurlyeq c\mathbf{I }_t, \text{ for } \text{ all } y \in K.\) \(\square \)

(8) Using the formula (4) to find the number \(C:=C(\overline{\mathbf{F }}^h)\).

(9) Finding an integer \(N> \dfrac{d(d-1)}{2} \dfrac{C}{\lambda }-d\).

(10) Finding the matrix coefficients of the polynomial matrix \((\sum _{i=1}^m Y_i)^N \overline{\mathbf{F }}^h\in \mathcal {S}_t(\mathbb R[Y])\) and substituting \(Y_i\) by \(L_i(X)\), we obtain the desired representation for \(\mathbf{F }\).
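The following sketch consolidates steps (8)–(10) (our own sympy rendering, not the paper’s MATLAB/GloptiPoly computation; for the N produced by the bound, the expansion is only practical on small instances):

```python
# Steps (8)-(10): choose N above the bound, expand (sum Y_i)^N * Fh, collect the
# constant matrix coefficient of each monomial Y^alpha, and substitute Y_i -> L_i.
import math
import sympy as sp

def handelman_representation(Fh, Ys, L, d, C, lam):
    # Fh: homogenized polynomial matrix in the Y-variables (sympy Matrix);
    # L: the linear polynomials L_i(X); C, lam: constants from steps (7)-(8).
    N = math.floor(d * (d - 1) / 2 * C / lam - d) + 1   # smallest N above the bound
    t = Fh.shape[0]
    G = sp.expand(sum(Ys)**N * Fh)
    polys = [[sp.Poly(G[i, j], *Ys) for j in range(t)] for i in range(t)]
    monoms = {mon for row in polys for p in row for mon in p.monoms()}
    coeff_matrices, F_rep = {}, sp.zeros(t, t)
    for mon in monoms:
        mon_expr = sp.prod(Yv**e for Yv, e in zip(Ys, mon))
        Fa = sp.Matrix(t, t, lambda i, j: polys[i][j].coeff_monomial(mon_expr))
        coeff_matrices[mon] = Fa                        # constant matrix F_alpha
        F_rep += Fa * sp.prod(Li**e for Li, e in zip(L, mon))  # F_alpha * L^alpha
    return N, coeff_matrices, sp.expand(F_rep)          # F_rep should equal F
```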

We illustrate the procedure given above by the following example, which is computed explicitly using MATLAB Version 7.10 (Release 2010a) and its add-on GloptiPoly 3, developed by Henrion et al. [5].

Example 1

Let us consider the unit square centered at the origin

$$\begin{aligned} P:=\left\{ (x,y)\in \mathbb R^2| L'_1=1+x\ge 0,~ L'_2=1-x\ge 0,~ L'_3=1+y\ge 0,~ L'_4=1-y\ge 0\right\} . \end{aligned}$$

Choosing \(c_1=c_2=c_3=c_4=\dfrac{1}{4}\), we have \(\sum _{i=1}^4c_iL'_i(x,y)=1\). Therefore, setting

$$\begin{aligned} L_1:=\dfrac{1}{4}+\dfrac{1}{4}x, ~L_2:=\dfrac{1}{4}-\dfrac{1}{4}x,~ L_3:=\dfrac{1}{4}+\dfrac{1}{4}y, ~L_4:=\dfrac{1}{4}-\dfrac{1}{4}y \in \mathbb R[x,y], \end{aligned}$$

we have \( \sum _{i=1}^4 L_i = 1\).

It is easy to see that the matrix \(\mathbf{B }=\begin{bmatrix} 2 & -2 & 0 & 0\\ 0 & 0 & 2 & -2 \end{bmatrix}\) satisfies the equation

$$\begin{aligned} \mathbf{B }\cdot \left[ L_1 ~ L_2 ~ L_3 ~ L_4\right] ^T = [x ~ y]^T. \end{aligned}$$

Let \(\varphi {:}\,\mathbb R[y_1,y_2,y_3,y_4] \rightarrow \mathbb R[x,y]\) be the ring homomorphism defined by \(\varphi (y_i):=L_i(x,y)\), \(i=1,2,3,4\). Using any monomial ordering in \(\mathbb R[y_1,y_2,y_3,y_4]\) we can find a Gröbner basis for the kernel \(\text{ Ker }(\varphi )\) of \(\varphi \):

$$\begin{aligned} \{R_1, R_2\}:=\left\{ y_1+y_2-\frac{1}{2}, y_3+y_4-\frac{1}{2}\right\} . \end{aligned}$$

Consider \(R:=R_1^2+R_2^2\).

Now we consider the polynomial matrix

$$\begin{aligned} \mathbf{F }:=\begin{bmatrix} -4x^2y+7x^2+y+3 & x^3+5xy-3x\\ x^3+5xy-3x & x^4+x^2y+3x^2-4y+6 \end{bmatrix}. \end{aligned}$$

The eigenvalue functions of \(\mathbf{F }\) are

$$\begin{aligned} \lambda _1(\mathbf{F })=6x^2 - 4x^2y - 4y + 6; ~ \lambda _2(\mathbf{F })=x^4 + x^2y + 4x^2 +y+ 3. \end{aligned}$$

For any \((x,y)\in P\) we have \(\lambda _i(\mathbf{F })(x,y)\ge 2\), \(i=1,2\). Hence \(\mathbf{F }(x,y)\succcurlyeq 2\mathbf{I }_2\) for every \((x,y)\in P\).
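These eigenvalue functions are easy to check symbolically; a short sympy verification (ours, not in the paper):

```python
# Verify that lam1, lam2 are the eigenvalue functions of F: the characteristic
# polynomial of F must factor as (s - lam1)(s - lam2).
import sympy as sp

x, y, s = sp.symbols('x y s')
F = sp.Matrix([[-4*x**2*y + 7*x**2 + y + 3, x**3 + 5*x*y - 3*x],
               [x**3 + 5*x*y - 3*x, x**4 + x**2*y + 3*x**2 - 4*y + 6]])
lam1 = 6*x**2 - 4*x**2*y - 4*y + 6
lam2 = x**4 + x**2*y + 4*x**2 + y + 3
assert sp.expand(F.charpoly(s).as_expr() - (s - lam1)*(s - lam2)) == 0
```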

With the matrix \(\mathbf{B }\) considered above, using the formula (8), we find \(\widetilde{f_{ij}}\), \(i,j=1,2\), and then we obtain the polynomial matrix \(\widetilde{\mathbf{F }}=(\widetilde{f}_{ij})\). We can compute exactly the eigenvalue functions \(\lambda _1(\widetilde{\mathbf{F }})\) and \(\lambda _2(\widetilde{\mathbf{F }})\) of \(\widetilde{\mathbf{F }}\) which satisfy

$$\begin{aligned} \min _{\varDelta _4}\lambda _1(\widetilde{\mathbf{F }}) =1,~ \min _{\varDelta _4}\lambda _2(\widetilde{\mathbf{F }}) =-2. \end{aligned}$$

Moreover, \(\min _{\varDelta _4\cap \{\lambda _2(\widetilde{\mathbf{F }})\le 0\}} R(y_1,y_2,y_3,y_4) = 0.125\). Thus we can choose

$$\begin{aligned} c>-\dfrac{-2}{0.125}=16, \quad \text{namely } c=17, \end{aligned}$$

for which \(\overline{\mathbf{F }}:=\widetilde{\mathbf{F }}+cR \mathbf{I }_2 \succ \mathbf{0 } \) on \(\varDelta _4\).

Homogenizing \(\overline{\mathbf{F }}\) by \(\sum \nolimits _{i=1}^4y_i\) we obtain a homogeneous polynomial matrix \(\overline{\mathbf{F }}^h=(\overline{f_{ij}}^h)\). Then we compute exactly the eigenvalue functions of the matrix \(\overline{\mathbf{F }}^h\) which satisfy

$$\begin{aligned} \min _{\varDelta _4}\lambda _1(\overline{\mathbf{F }}^h) = 1.9706,~ \min _{\varDelta _4}\lambda _2(\overline{\mathbf{F }}^h) = 1.5294. \end{aligned}$$

It follows that \(\overline{\mathbf{F }}^h\succcurlyeq 1.5294 ~ \mathbf{ I }_2\) on \(\varDelta _4\), and we put \(\lambda :=1.5294\).

Using the formula (4), we can find the number \(C:=C(\overline{\mathbf{F }}^h)=\dfrac{1044}{24}=\dfrac{87}{2}\).

Therefore, since \(\dfrac{d(d-1)}{2} \dfrac{C}{\lambda }-d = 6\cdot \dfrac{87/2}{1.5294}-4\approx 166.66\) with \(d=4\), choosing \(N=167\), the polynomial matrix \((y_1+y_2+y_3+y_4)^{167}\overline{\mathbf{F }}^h\) has positive definite matrix coefficients.
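A quick check of this choice of N against the bound from Theorem 3:

```python
# d = 4, C = 87/2, lambda = 1.5294: the bound is about 166.66, so N = 167 works.
d, C, lam = 4, 87 / 2, 1.5294
print(d * (d - 1) / 2 * C / lam - d)   # 166.655..., smallest admissible N is 167
```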

Finding the matrix coefficients of the polynomial matrix \((y_1+y_2+y_3+y_4)^{167}\overline{\mathbf{F }}^h\in \mathcal {S}_2(\mathbb R[y_1,y_2,y_3,y_4])\) and substituting \(y_i\) by \(L_i(x,y)\), we obtain the desired representation for \(\mathbf{F }\).

As a consequence of Theorem 3, we obtain the following matrix version of Schmüdgen’s Positivstellensatz for convex, compact polyhedra.

Corollary 2

Let P, \(\mathbf{F }\), \(\overline{\mathbf{F }}\), \(\overline{\mathbf{F }}^h\) be given as above, with \(\mathbf{F }\) positive definite on P. Assume that \(\overline{\mathbf{F }}^h\succcurlyeq \lambda \mathbf{I }_t\) on \(\varDelta _m\) for some \(\lambda >0\). Let \(d:=\text{ deg }(\overline{\mathbf{F }})\) and \(C:=C(\overline{\mathbf{F }}^h)\). Then for \(N> \dfrac{d(d-1)}{2} \dfrac{C}{\lambda }-d\), \(\mathbf{F }\) can be represented as

$$\begin{aligned} \mathbf{F }=\sum _{e=(e_1,\ldots ,e_m)\in \{0,1\}^m}(\mathbf{F }_e^T \mathbf{F }_e) L_1^{e_1}\ldots L_m^{e_m}, \end{aligned}$$
(11)

where \(\mathbf{F }_e\in \mathcal {M}_t(\mathbb R[X])\) and the degree of each sum of squares \(\mathbf{F }_e^T\mathbf{F }_e\) does not exceed \(N+d\).

Proof

The proof follows directly from Theorem 3, with the observation that any positive definite matrix \(\mathbf{F }_{\alpha }\in \mathcal {S}_t(\mathbb R)\) can be written as

$$\begin{aligned} \mathbf{F }_{\alpha } = \mathbf{G }_{\alpha }^T \mathbf{G }_{\alpha }, \end{aligned}$$

where \( \mathbf{G }_{\alpha }\in \mathcal {M}_t(\mathbb R)\) is a non-singular matrix (for instance, via the Cholesky decomposition). \(\square \)