1 Introduction

A homogeneous polynomial \(f\in {\mathbb {R}}[x]\) of degree \(d\) in variables \(x=(x_1,\dots ,x_n)\) is called hyperbolic with respect to a point \(e\in {\mathbb {R}}^{n}\) if \(f(e)\ne 0\) and for every \(a\in {\mathbb {R}}^{n}\), all roots of the univariate polynomial \(f(te+a)\in {\mathbb {R}}[t]\) are real. Its hyperbolicity cone, denoted \(C(f,e)\), is the connected component of \(e\) in \({\mathbb {R}}^{n} \backslash \mathcal {V}_{{\mathbb {R}}}(f)\), and can also be defined as

$$\begin{aligned} C(f,e) = \{ a\in {\mathbb {R}}^{n} \;:\; f(te-a) \ne 0 \;\text { when }\; t\le 0\}. \end{aligned}$$

As shown in Gårding [6], \(C(f,e)\) is an open convex cone and \(f\) is hyperbolic with respect to any point contained in it. Hyperbolicity is reflected in the topology of the real projective variety \({\mathcal {V}}_{\mathbb {R}}(f)\) in \({\mathbb {P}}^{n-1}({\mathbb {R}})\). If \({\mathcal {V}}_{{\mathbb {R}}}(f)\) is smooth, then \(f\) is hyperbolic if and only if \({\mathcal {V}}_{\mathbb {R}}(f)\) consists of \(\lfloor \frac{d}{2}\rfloor \) nested ovaloids, and a pseudo-hyperplane if \(d\) is odd (see [8, Thm. 5.2]) (Fig. 1).
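To make the definition concrete, the following minimal numerical sketch (not part of the original text) tests real-rootedness of \(f(te+a)\) along randomly sampled lines; it assumes Python with sympy and numpy available. A failed check certifies that \(f\) is not hyperbolic with respect to \(e\), while passing it only provides numerical evidence.

```python
# Sketch: test whether f(t*e + a) is real-rooted along random lines through e.
import numpy as np
import sympy as sp

t = sp.Symbol('t')
x = sp.symbols('x1 x2 x3')
f = x[0]**2 - x[1]**2 - x[2]**2          # hyperbolic with respect to e = (1, 0, 0)
e = (1.0, 0.0, 0.0)

def real_rooted_along(f, x, e, a, tol=1e-8):
    """True if all roots of t -> f(t*e + a) are real up to numerical tolerance."""
    subs = {xi: t*ei + ai for xi, ei, ai in zip(x, e, a)}
    coeffs = [float(c) for c in sp.Poly(sp.expand(f.subs(subs)), t).all_coeffs()]
    return bool(np.all(np.abs(np.roots(coeffs).imag) < tol))

rng = np.random.default_rng(0)
print(all(real_rooted_along(f, x, e, rng.normal(size=3)) for _ in range(200)))  # True
```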

Fig. 1 A quartic hyperbolic hypersurface and two of its affine slices

A hyperbolic program, introduced and developed by Güler [7], Renegar [14] and others, is the problem of maximizing a linear function over an affine section of the convex cone \(C(f,e)\). This provides a very general context in which interior point methods are effective. For example, taking \(f = \prod _i x_i\) and \(e=(1,\ldots , 1)\), we see that \(C(f,e)\) is the positive orthant \(({\mathbb {R}}_+\!)^{n}\) and the corresponding hyperbolic program is a linear program. If instead we take \(f\) as the determinant of a symmetric matrix of variables \(X=(x_{ij})\) and \(e\) is the identity matrix, then \(C(f,e)\) is the cone of positive definite matrices.

$$\begin{aligned} \begin{array}{c|c|c|c} f &{} e &{} C(f,e) &{} \text {Hyperbolic program} \\ \hline \prod _i x_i &{} (1,\ldots , 1) &{} ({\mathbb {R}}_+)^{n} &{} \text {Linear program} \\ \det (X) &{} I &{} \text {Positive definite matrices} &{} \text {Semidefinite program} \end{array} \end{aligned}$$
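The second row of this table can be illustrated numerically: along the line through the identity matrix in the direction of a symmetric matrix \(A\), the restriction \(\det (tI+A)\) has only real roots, namely the negatives of the eigenvalues of \(A\). A small sketch (not from the paper; it assumes numpy):

```python
# Sketch: det(t*I + A) is real-rooted for real symmetric A.
import numpy as np

rng = np.random.default_rng(1)
B = rng.normal(size=(5, 5))
A = (B + B.T) / 2                        # a random real symmetric matrix
coeffs = np.poly(-A)                     # coefficients of det(t*I + A) in t
roots = np.roots(coeffs)
print(np.allclose(roots.imag, 0))                                        # all roots real
print(np.allclose(np.sort(roots.real), np.sort(-np.linalg.eigvalsh(A))))  # roots = -eigenvalues
```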

It is a fundamental open question whether or not every hyperbolic program can be rewritten as a semidefinite program. Helton and Vinnikov [8] showed that if \(f\in {\mathbb {R}}[x_1,x_2,x_3]\) is hyperbolic with respect to a point \(e\), then \(f\) has a definite determinantal representation \(f = \det (\sum _ix_iM_i)\) where \(M_1, M_2, M_3\) are real symmetric matrices and the matrix \(\sum _i e_i M_i\) is positive definite. Thus every three-dimensional hyperbolicity cone is a slice of the cone of positive semidefinite matrices. For a survey of these results and future perspectives, see also [18]. On the other hand, Brändén [2] has given an example of a hyperbolic polynomial \(f\) (see Example 5.11) such that no power of \(f\) has a definite determinantal representation. There is a close connection between definite determinantal representations of a hyperbolic polynomial \(f\) and polynomials of degree one less that interlace it, which has also been used in [12] to study Hermitian determinantal representations of hyperbolic curves.

Definition 1.1

Let \(f,g\in {\mathbb {R}}[t]\) be univariate polynomials with only real zeros and with \(\deg (g)=\deg (f)-1\). Let \(\alpha _1\leqslant \cdots \leqslant \alpha _d\) be the roots of \(f\), and let \(\beta _1\leqslant \cdots \leqslant \beta _{d-1}\) be the roots of \(g\). We say that \(g\) interlaces \(f\) if \(\alpha _i\leqslant \beta _i\leqslant \alpha _{i+1}\) for all \(i=1,\dots ,d-1\). If all these inequalities are strict, we say that \(g\) strictly interlaces \(f\).

If \(f\in {\mathbb {R}}[x]\) is hyperbolic with respect to \(e\) and \(g\) is homogeneous of degree \(\deg (f)-1\), we say that \(g\) interlaces \(f\) with respect to \(e\) if \(g(te+a)\) interlaces \(f(te+a)\) for every \(a\in {\mathbb {R}}^n\). This implies that \(g\) is also hyperbolic with respect to \(e\). We say that \(g\) strictly interlaces \(f\) if \(g(te+a)\) strictly interlaces \(f(te+a)\) for \(a\) in a nonempty Zariski-open subset of \({\mathbb {R}}^n\) (Fig. 2).
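Definition 1.1 is straightforward to check for explicitly computed root lists. The following helper is an illustrative sketch (not from the paper; it assumes numpy), shown here together with the univariate derivative example discussed in the next paragraph.

```python
# Sketch: check the (strict) interlacing condition of Definition 1.1 numerically.
import numpy as np

def interlaces(f_roots, g_roots, strict=False, tol=1e-9):
    """alpha_i <= beta_i <= alpha_{i+1} for sorted real root lists (strictly if requested)."""
    a, b = np.sort(np.asarray(f_roots, float)), np.sort(np.asarray(g_roots, float))
    assert len(b) == len(a) - 1, "need deg(g) = deg(f) - 1"
    if strict:
        return bool(np.all(a[:-1] < b - tol) and np.all(b < a[1:] - tol))
    return bool(np.all(a[:-1] <= b + tol) and np.all(b <= a[1:] + tol))

# The roots of p' strictly interlace the roots of p = (t-1)(t-2)(t-4).
p = np.poly([1, 2, 4])
print(interlaces([1, 2, 4], np.roots(np.polyder(p)).real, strict=True))   # True
```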

The most natural example of an interlacing polynomial is the derivative. If \(f(t)\) is a real polynomial with only real roots, then its derivative \(f'(t)\) has only real roots, which interlace the roots of \(f\). Extending this to multivariate polynomials, we see that the roots of \(\frac{\partial }{\partial t}f(te+a)\) interlace those of \(f(te+a)\) for all \(a\in {\mathbb {R}}^{n}\). Thus

$$\begin{aligned} D_ef = \sum _{i=1}^n e_i \frac{\partial f}{\partial x_i} \end{aligned}$$

interlaces \(f\) with respect to \(e\). If \(f\) is square-free, then \(D_ef\) strictly interlaces \(f\). This was already noted by Gårding [6] and has been used extensively, for example in [1] and [14]; for general information on interlacing polynomials, see also [5] and [13, Ch. 6].

Fig. 2 Two affine slices of a cubic interlacing a quartic

Remark 1.2

If \(f\) is square-free and \(d=\deg (f)\), then \(f\) is hyperbolic with respect to \(e\) if and only if \(f(te+a)\) has \(d\) distinct real roots for \(a\) in a Zariski-open subset of \({\mathbb {R}}^{n}\). In this case, if \(g\) interlaces \(f\) and has no common factors with \(f\), then \(g\) strictly interlaces \(f\).

In this paper, we examine the set of polynomials in \({\mathbb {R}}[x]_{d-1}\) interlacing a fixed hyperbolic polynomial. The main result is a description of a hyperbolicity cone \(C(f,e)\) as a linear slice of the cone of nonnegative polynomials. Using the cone of sums of squares instead gives an inner approximation of \(C(f,e)\) by a projection of a spectrahedron. This is closely related to recent results due to Netzer and Sanyal [10] and Parrilo and Saunderson [11]. We discuss both this theorem and the resulting approximation in Sect. 3. In Sect. 4 we see that the relaxation we obtain is exact if some power of \(f\) has a definite determinantal representation. A multiaffine example for which our relaxation is not exact is discussed in Sect. 5. Here we also provide a criterion to test whether or not a hyperbolic multiaffine polynomial has a definite determinantal representation. The full cone of interlacers has a nice structure, which we discuss in Sect. 6. First we need to build up some basic facts about interlacing polynomials.

2 Interlacers

Let \(f\) be a homogeneous polynomial of degree \(d\) that is hyperbolic with respect to the point \(e\in {\mathbb {R}}^{n}\). We will always assume that \(f(e)>0\). Define \(\mathrm{Int}(f,e)\) to be the set of real polynomials of degree \(d-1\) that interlace \(f\) with respect to \(e\) and are positive at \(e\):

$$\begin{aligned} \mathrm{Int}(f,e) \;=\; \bigl \{ g\in {\mathbb {R}}[x]_{d-1} \;:\;g \hbox { interlaces } f \hbox { with respect to }e\text { and }g(e)>0\bigr \}. \end{aligned}$$

As noted above, the hyperbolicity cone \(C(f,e)\) depends only on \(f\) and the connected component of \({\mathbb {R}}^n{\setminus }{\mathcal {V}}_{\mathbb {R}}(f)\) containing \(e\). In other words, we have \(C(f,e)=C(f,a)\) for all \(a\in C(f,e)\). We will see shortly that \(\mathrm{Int}(f,e)\) does not depend on \(e\) either, but only on \(C(f,e)\).

Theorem 2.1

Let \(f \in {\mathbb {R}}[x]_d\) be square-free and hyperbolic with respect to \(e\in {\mathbb {R}}^{n}\), where \(f(e)>0\). For \(h \in {\mathbb {R}}[x]_{d-1}\), the following are equivalent:

(1) \(h \in \mathrm{Int}(f,e)\);

(2) \(h \in \mathrm{Int}(f,a)\) for all \(a \in C(f,e)\);

(3) \( D_ef\cdot h\) is nonnegative on \(\mathcal {V}_{{\mathbb {R}}}(f)\);

(4) \(D_ef \cdot h- f \cdot D_eh \) is nonnegative on \({\mathbb {R}}^{n}\).

The proof of this theorem and an important corollary are at the end of this section. First, we need to build up some theory about the forms in \(\mathrm{Int}(f,e)\).

Lemma 2.2

Suppose \(f_1, f_2\), and \(h\) are real homogeneous polynomials.

(a) The product \(f_1\cdot f_2\) is hyperbolic with respect to \(e\) if and only if both \(f_1\) and \(f_2\) are hyperbolic with respect to \(e\). In this case, \(C(f_1\cdot f_2 , e) = C(f_1,e)\cap C(f_2,e)\).

(b) If \(f_1\) and \(f_2\) are hyperbolic with respect to \(e\), then \(f_1\cdot h\) interlaces \(f_1\cdot f_2\) if and only if \(h\) interlaces \(f_2\).

(c) If \(h\) interlaces \((f_1)^k f_2\) for \(k\in \mathbb {N}\), then \((f_1)^{k-1}\) divides \(h\).

Proof

These statements are checked directly after reducing to the one-dimensional case.\(\square \)

Lemma 2.3

For any \(g\) and \(h\) in \(\mathrm{Int}(f,e)\), the product \(g\cdot h\) is nonnegative on \(\mathcal {V}_{{\mathbb {R}}}(f)\).

Proof

To prove this statement, it suffices to restrict to any line \(x = te+a\) where \(a\in {\mathbb {R}}^{n}\). Suppose that \(f(te+a) \in {\mathbb {R}}[t]\) has roots \(\alpha _1\le \cdots \le \alpha _d\) and \(g(te+a)\) and \(h(te+a)\) have roots \(\beta _1\le \cdots \le \beta _{d-1}\) and \(\gamma _1\le \cdots \le \gamma _{d-1}\), respectively. By the assumption that both \(g\) and \(h\) interlace \(f\), we know that \(\beta _i, \gamma _i\in [\alpha _i,\alpha _{i+1}]\) for all \(1\le i\le d-1\). Thus, if \(\alpha _i\) and \(\alpha _j\) are not also roots of \(g(te+a)\) or \(h(te+a)\), the polynomial \(g(te+a)h(te+a)\) has an even number of roots, counted with multiplicity, in the interval \([\alpha _i, \alpha _j]\). Hence the sign of \(g(\alpha _ie+a)h(\alpha _ie+a)\) is the same for all \(i\) for which it is not zero. Because \(g(e)h(e)>0\), this common sign must be nonnegative (Fig. 3).\(\square \)

Fig. 3 Affine slices of two cubics interlacing a quartic

Lemma 2.4

Suppose that \(f\) is square-free and that \(g \in \mathrm{Int}(f,e)\) strictly interlaces \(f\). Then a polynomial \(h\in {\mathbb {R}}[x]_{d-1}\) belongs to \(\mathrm{Int}(f,e)\) if and only if \(g\cdot h\) is nonnegative on \(\mathcal {V}_{{\mathbb {R}}}(f)\).

Proof

One direction follows from Lemma 2.3. For the other, let \(h\in {\mathbb {R}}[x]_{d-1}\) be such that \(g\cdot h\) is nonnegative on \(\mathcal {V}_{\mathbb {R}}(f)\). First, let us consider the case where \(f\) and \(h\) have no common factor. Then, for generic \(a \in {\mathbb {R}}^n\), the roots of \(f(te+a)\) are distinct from each other and from the roots of \(g(te+a)\) and \(h(te+a)\). The product \(g(te+a)h(te+a)\) is then positive on all of the roots of \(f(te+a)\). Since \(g(te+a)\) takes opposite signs at consecutive roots of \(f(te+a)\), we see that \(h(te+a)\) must have a root between each pair of consecutive roots of \(f(te+a)\), and thus \(h\) interlaces \(f\) with respect to \(e\).

Now suppose \(f=f_1\cdot f_2\) and \(h=f_1\cdot h_1\). We will show that \(h_1\) interlaces \(f_2\), and thus \(h\) interlaces \(f\). Again, we can choose generic \(a\) for which the roots of \(f(te+a)\) and \(g(te+a)\) are all distinct. Consider two consecutive roots \(\alpha <\beta \) of the polynomial \(f_2(te+a)\). Let \(k\) be the number of roots of \(f_1(te+a)\) in the interval \((\alpha , \beta )\). Because \(g\) strictly interlaces \(f=f_1\cdot f_2\), its restriction \(g(te+a)\) must have \(k+1\) roots in the interval \((\alpha , \beta )\). Thus the polynomial \(g(te+a)f_1(te+a)\) has an odd number of roots in this interval and must therefore have different signs at \(\alpha \) and \(\beta \). Since \(g\cdot f_1 \cdot h_1 \ge 0\) on \(\mathcal {V}_{{\mathbb {R}}}(f)\), the polynomial \(h_1(te+a)\) must have a root in this interval. Thus \(h_1\) interlaces \(f_2\) and \(h\) interlaces \(f\).\(\square \)

Example 2.5

In the above lemma, it is indeed necessary that \(f\) and \(g\) be without common factors. For example, consider \(f=(x^2+y^2-z^2)(x-2z)\) and \(g=(x^2+y^2-z^2)\). Both \(f\) and \(g\) are hyperbolic with respect to \([0:0:1]\) and \(g\) interlaces \(f\) with respect to this point. However if \(h=y(x-2z)\), then \(g\cdot h\) vanishes identically on \(\mathcal {V}_{{\mathbb {R}}}(f)\) but \(h\) does not interlace \(f\).

For \(a\in C(f,e)\), the derivative \(D_af\) obviously interlaces \(f\) with respect to \(a\), since \(f\) is hyperbolic with respect to \(a\). We need to show that \(D_af\) also interlaces \(f\) with respect to \(e\).

Lemma 2.6

For \(a\in C(f,e)\), the polynomial \(D_e f\cdot D_a f\) is nonnegative on \(\mathcal {V}_{{\mathbb {R}}}(f)\).

Proof

For any \(b\in C(f,e)\) and \(x\in \mathcal {V}_{{\mathbb {R}}}(f)\), let \(\alpha _1(b,x)\le \cdots \le \alpha _d(b,x)\) denote the roots of \(f(tb +x)\). Because \(C(f,e)\) is convex, the line segment joining \(e\) and \(a\) belongs to this cone. As we vary \(b\) from \(e\) to \(a\) along this line segment, the points \(\{\alpha _i(b,x)b +x\}_{i\in [d]}\) form \(d\) non-crossing arcs in the plane \(x+\mathrm{span}\{e,a\}\), as shown in Fig. 4. Since \(f(x)=0\), one of these arcs is just the point \(x\). That is, there is some \(k\) for which \(\alpha _k(b,x)=0\) for all \(b\) in the convex hull of \(e\) and \(a\).

Now \(f(e)>0\) implies \(f(b)>0\) for all \(b\in C(f,e)\). Thus \(\frac{\partial }{\partial t}f(tb+x)\) is positive for \(t>\alpha _d(b,x)\). Furthermore, the sign of this derivative at the \(i\)th root, \(\alpha _i(b,x)\), depends only on \(i\). Specifically, for all \(i=1,\ldots , d\),

$$\begin{aligned} (-1)^{d-i}\cdot D_b f(\alpha _{i}(b,x)b+x) \; \ge \;0. \end{aligned}$$

In particular, the sign of \(D_bf\) on the \(k\)th root, \(\alpha _k(b,x)b+x = x\), is constant:

$$\begin{aligned} (-1)^{d-k}D_bf(x) \ge 0. \end{aligned}$$

Since \(k\) does not depend on \(b\), taking \(b=e\) and \(b=a\) shows that \(D_ef(x)\) and \(D_af(x)\) both have sign \((-1)^{d-k}\) (or vanish). Thus, regardless of \(k\), the product \(D_ef(x)D_af(x)\) is nonnegative.\(\square \)

Fig. 4 Non-crossing arcs of Lemma 2.6

Now we are ready to prove Theorem 2.1.

Proof of Theorem 2.1

      (\(4 \Rightarrow 3\)) Clear.

(\(1 \Leftrightarrow 3\)) If \(f\) is square-free, then \(D_ef\) strictly interlaces \(f\). This equivalence then follows from Lemma 2.4.

(\(1,3 \Rightarrow 4\)) Here we need a useful fact about Wronskians. The Wronskian of two univariate polynomials \(p(t),q(t)\) is the polynomial

$$\begin{aligned} W(p,q) \; = \; p\cdot q' - p' \cdot q \; = \; p^2 \cdot \left( \frac{q}{p}\right) '. \end{aligned}$$

It is a classical fact that if the roots of \(p\) and \(q\) are all distinct and interlace, then \(W(p,q)\) is a nonnegative or nonpositive polynomial [20, §2.3]. Thus if \(h\in \mathrm{Int}(f,e)\) is coprime to \(f\), then for generic \(x\), the roots of \(f(te+x)\) and \(h(te+x)\) interlace and are all distinct. Thus their Wronskian \( h(te+x) f'(te+x)- h'(te+x) f(te+x)\) is either nonnegative or nonpositive for all \(t\). By \((3)\), the product \(h(te+x)f'(te+x)\) is nonnegative on the zeroes of \(f\), so we see that the Wronskian is nonnegative. Setting \(t=0\) shows that \(h\cdot D_ef - D_eh\cdot f \) is nonnegative for generic \(x\in {\mathbb {R}}^n\) and hence, by continuity, for all \(x\in {\mathbb {R}}^n\), as desired. If \(f\) and \(h\) share a factor, say \(f=f_1\cdot f_2\), \(h=f_1\cdot h_1\), we can use the identity \(W(f_1\cdot f_2,f_1\cdot h_1)=f_1^2W(f_2,h_1)\) to reduce to the coprime case.

(\(2 \Leftrightarrow 1\)) Because \(f\) is square-free, neither \(D_e f\) nor \(D_af\) shares a factor with \(f\). Thus \(D_ef\) strictly interlaces \(f\) with respect to \(e\) and \(D_af\) strictly interlaces \(f\) with respect to \(a\).

Suppose \(h\) interlaces \(f\) with respect to \(a\) and \(h(a)>0\). By Lemma 2.4, \(h\cdot D_af\) is nonnegative on \(\mathcal {V}_{{\mathbb {R}}}(f)\). Using Lemma 2.6, we see that \(D_ef \cdot D_af\) is also nonnegative on \(\mathcal {V}_{{\mathbb {R}}}(f)\). Taking the product, it follows that \((D_af)^2\cdot D_ef \cdot h \) is nonnegative on \(\mathcal {V}_{{\mathbb {R}}}(f)\). Because \(D_af \) and \(f\) have no common factors, we can conclude that \(D_ef \cdot h\) is nonnegative on \(\mathcal {V}_{{\mathbb {R}}}(f)\). Using Lemma 2.4 again, we have \(h\in \mathrm{Int}(f,e)\). Switching the roles of \(a\) and \(e\) in this argument gives the reverse implication.\(\square \)

Corollary 2.7

The set \(\mathrm{Int}(f,e)\) is a closed convex cone. If \(f\) is square-free, this cone is linearly isomorphic to a section of the cone of nonnegative polynomials of degree \(2\deg (f)-2\):

$$\begin{aligned} \mathrm{Int}(f,e) = \{ h\in {\mathbb {R}}[x]_{\deg (f)-1} \;:\; D_ef \cdot h - f \cdot D_eh \;\ge 0 \;\text { on } {\mathbb {R}}^{n} \}. \end{aligned}$$
(2.1)

If \(f=f_1\cdot f_2\) where \(\mathcal {V}(f)=\mathcal {V}(f_2)\) and \(f_2\) is square-free, then

$$\begin{aligned} \mathrm{Int}(f,e) = f_1 \cdot \mathrm{Int}(f_2, e) \end{aligned}$$

and is isomorphic to a section of the cone of nonnegative polynomials of degree \(2\deg (f_2)-2\).

Proof

For square-free \(f\), the description (2.1) follows directly from Theorem 2.1. The map

$$\begin{aligned} h \;\; \mapsto \;\; D_ef \cdot h - f \cdot D_eh \end{aligned}$$

is a linear map from \({\mathbb {R}}[x]_{\deg (f)-1}\) to \({\mathbb {R}}[x]_{2\deg (f)-2}\). We see that \(\mathrm{Int}(f,e)\) is the preimage of the cone of nonnegative polynomials in \({\mathbb {R}}[x]_{2\deg (f)-2}\) under this map. We can also check that this map is injective. Because \(f\) is square-free, \(D_ef\) and \(f\) are coprime. Hence if \(D_ef \cdot h - f \cdot D_eh\) were identically zero for some nonzero \(h\), then \(f\) would divide \(D_ef \cdot h\) and therefore \(h\), which is impossible since \(\deg (h) < \deg (f)\).

If \(f\) is not square-free, then \(f\) factors as \(f_1\cdot f_2\) as above. By Lemma 2.2(c), any polynomial that interlaces \(f\) must be divisible by \(f_1\). By part (b), the remainder must interlace \(f_2\). Thus \(\mathrm{Int}(f,e) \subseteq f_1 \cdot \mathrm{Int}(f_2, e)\). Similarly, if \(h\) interlaces \(f_2\), then \(f_1\cdot h\) interlaces \(f=f_1\cdot f_2\). Thus \(\mathrm{Int}(f,e)\) is the image of the convex cone \(\mathrm{Int}(f_2,e)\) under a linear map, namely multiplication by \(f_1\). This shows that it is linearly isomorphic to a section of the cone of nonnegative polynomials of degree \(2\deg (f_2)-2\).\(\square \)

3 Hyperbolicity cones and nonnegative polynomials

An interesting consequence of the results in the preceding section is that we can recover the hyperbolicity cone \(C(f,e)\) as a linear section of \(\mathrm{Int}(f,e)\), and thus as a linear section of the cone of nonnegative polynomials. We show this by considering which directional derivatives \(D_af\) interlace \(f\). In applying Theorem 2.1, we will often deal with the polynomials

$$\begin{aligned} \Delta _{e,a}f = D_e f \cdot D_a f - f\cdot D_eD_a f. \end{aligned}$$

Notice that \(\Delta _{e,a}f\) is homogeneous of degree \(2d-2\), symmetric in \(e\) and \(a\), and linear in each.
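The following symbolic helper is a small sketch (not from the paper; it assumes sympy) that computes \(\Delta _{e,a}f\) directly from the definition; for the quadric treated in Example 3.5 below it reproduces the formula derived there.

```python
# Sketch: compute Delta_{e,a} f = D_e f * D_a f - f * D_e D_a f symbolically.
import sympy as sp

def D(f, v, xs):
    """Directional derivative of f in direction v."""
    return sum(vi*sp.diff(f, xi) for vi, xi in zip(v, xs))

def Delta(f, e, a, xs):
    return sp.expand(D(f, e, xs)*D(f, a, xs) - f*D(D(f, a, xs), e, xs))

x1, x2, x3 = xs = sp.symbols('x1 x2 x3')
a1, a2, a3 = sp.symbols('a1 a2 a3')
f = x1**2 - x2**2 - x3**2                 # the quadric of Example 3.5
expected = 2*(a1*x1**2 - 2*a2*x1*x2 - 2*a3*x1*x3 + a1*x2**2 + a1*x3**2)
print(sp.expand(Delta(f, (1, 0, 0), (a1, a2, a3), xs) - expected) == 0)   # True
```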

Theorem 3.1

Let \(f\in {\mathbb {R}}[x]_d\) be square-free and hyperbolic with respect to the point \(e\in {\mathbb {R}}^{n}\). The intersection of \(\mathrm{Int}(f,e)\) with the plane spanned by the partial derivatives of \(f\) is the image of \(\overline{C(f,e)}\) under the linear map \(a \mapsto D_a f\). That is,

$$\begin{aligned} \overline{C(f,e)} = \{ a\in {\mathbb {R}}^{n} \;:\; D_a f \;\in \; \mathrm{Int}(f,e) \}. \end{aligned}$$
(3.1)

Furthermore, \(\overline{C(f,e)}\) is a section of the cone of nonnegative polynomials of degree \(2d-2\):

$$\begin{aligned} \overline{C(f,e)} = \{ a \in {\mathbb {R}}^{n} \;\;:\;\; \Delta _{e,a}f \ge 0 \text { on } {\mathbb {R}}^{n}\}. \end{aligned}$$
(3.2)

Proof

Let \(C\) be the set on the right hand side of (3.1). From Theorem 2.1, we see that \(D_a f\) interlaces \(f\) with respect to \(e\) for all \(a\in C(f,e)\). This shows \(C(f,e)\subset C\) and hence the inclusion \(\overline{C(f,e)}\subset C\), since \(C\) is closed. If this inclusion were strict, there would exist a point \(a\in C{\setminus }\overline{C(f,e)}\) with \(f(a)\ne 0\), since \(C\) is also a convex cone by Corollary 2.7. To show the reverse inclusion, it therefore suffices to show that for any point \(a\) outside of \(\overline{C(f,e)}\) with \(f(a)\ne 0\), the polynomial \(D_a f\) does not belong to \(\mathrm{Int}(f,e)\). If \(a\) belongs to \(-\overline{C(f,e)}\), then \(-D_a f\) belongs to \(\mathrm{Int}(f,e)\). In particular, \(-D_a f (e) >0\) and \(D_a f\) does not belong to \(\mathrm{Int}(f,e)\). Thus we may assume \(a\notin \overline{C(f,e)}\cup -\overline{C(f,e)}\). Since \(f\) is hyperbolic with respect to \(e\), all of the roots \(\alpha _1\le \cdots \le \alpha _d\) of \(f(te+a)\) are real. The reciprocals of these roots, \(1/\alpha _1, \ldots , 1/\alpha _d\), are roots of the polynomial \(f(e+ta)\).

Because \(a\) is not in \(\overline{C(f,e)}\cup -\overline{C(f,e)}\), there is some \(1\le i < d\) for which \(\alpha _i < 0<\alpha _{i+1}\). Since \(f(e)\ne 0\) and \(f(a)\ne 0\), we can take reciprocals to find the roots of \(f(e+ta)\):

$$\begin{aligned} \frac{1}{\alpha _i} \le \frac{1}{\alpha _{i-1}} \le \cdots \le \frac{1}{\alpha _1} < 0 < \frac{1}{\alpha _d} \le \frac{1}{\alpha _{d-1}}\le \cdots \le \frac{1}{\alpha _{i+1}}. \end{aligned}$$

By Rolle’s Theorem, the roots of \(\frac{\partial }{\partial t} f(e+ta)\) interlace those of \(f(e+ta)\). Note that the polynomial \(\frac{\partial }{\partial t} f(e+ta)\) is precisely \(D_a f(e+ta)\), so the roots of \(D_a f(e+ta)\) interlace those of \(f(e+ta)\). In particular, there is some root \(\beta \) of \(D_a f(e+ta)\) in the open interval \((1/\alpha _1, 1/\alpha _d)\), and thus \(1/\beta \not \in [\alpha _1, \alpha _d]\) is a zero of \(D_a f(te+a)\). Therefore \(D_a f(te+a)\) has only \(d-2\) roots in the interval \([\alpha _1, \alpha _d]\) and thus cannot interlace \(f\) with respect to \(e\) (Fig. 5).

Combining this with Theorem 2.1 shows the equality in (3.2). \(\square \)

Fig. 5 For \(a\) outside of the hyperbolicity cone, \(D_af\) does not interlace \(f\)

Corollary 3.2

Relaxing nonnegativity to sums-of-squares in (3.2) gives an inner approximation to the hyperbolicity cone of \(f\):

$$\begin{aligned} \{a \in {\mathbb {R}}^{n}\;:\; \Delta _{e,a}f\ \text { is a sum of squares}\} \subseteq \overline{C(f,e)}. \end{aligned}$$
(3.3)

If the relaxation (3.3) is exact, then the hyperbolicity cone is a projection of a spectrahedron, namely of a section of the cone of positive semidefinite matrices in \(\mathrm{Sym}_N({\mathbb {R}})\), where \(N = \left( {\begin{array}{c}n+d-2\\ n-1\end{array}}\right) = \dim {\mathbb {R}}[x]_{d-1}\). A homogeneous polynomial \(F\) of even degree is a sum of squares if and only if there exists a positive semidefinite matrix \(G\) such that \(F = v^TGv\), where \(v\) is the vector of monomials of degree \(\deg (F)/2\). We call such a matrix \(G\) a Gram matrix of \(F\). The linear equations giving the Gram matrices of \(\Delta _{e,a}f\) give the desired section of \(\mathrm{Sym}_N({\mathbb {R}})\).

If the relaxation (3.3) is not exact, one can allow for denominators in the sums of squares and successively improve the relaxation. More precisely, for any integer \(N\geqslant 0\) consider

$$\begin{aligned} C_N=\left\{ a \in {\mathbb {R}}^{n}\;:\; \left( \sum \limits _{i=1}^n\! x_i^2\right) ^N\cdot \Delta _{e,a}f\ \text { is a sum of squares} \right\} \subseteq \overline{C(f,e)}.\qquad \end{aligned}$$
(3.4)

As above, \(C_N\) is seen to be a projection of a spectrahedron. Furthermore, by a result of Reznick in [15], for any positive definite form \(F\in {\mathbb {R}}[x]\) there exists some positive integer \(N\) such that \((\sum _{i=1}^n x_i^2)^N\cdot F\) is a sum of squares. Thus if \({\mathcal {V}}_{\mathbb {R}}(f)\) is smooth, then \(\{\Delta _{e,a}f\,|\, a\in {\mathbb {R}}^n\}\) contains a strictly positive polynomial, for example \(\Delta _{e,e}f\). It follows that the hyperbolicity cone \(\overline{C(f,e)}\) is the closure of the union of all the cones \(C_N\).

Remark 3.3

In a recent paper [10], Netzer and Sanyal showed that the hyperbolicity cone of a hyperbolic polynomial without real singularities is the projection of a spectrahedron. Their proof uses general results on projected spectrahedra due to Helton and Nie and is not fully constructive. In particular, it does not imply anything about equality in (3.3) or (3.4).

Explicit representations of hyperbolicity cones as projected spectrahedra have recently been obtained by Parrilo and Saunderson in [11] for elementary symmetric polynomials and for directional derivatives of polynomials possessing a definite determinantal representation.

Remark 3.4

We also have the relaxation

$$\begin{aligned} \{a \in {\mathbb {R}}^{n}\;:\; D_e f \cdot D_a f \ \text { is a sum of squares modulo }( f ) \} \subseteq \overline{C(f,e)}.\qquad \end{aligned}$$
(3.5)

It is unclear whether or not this relaxation is always equal to (3.3). Its exactness would also show \(\overline{C(f,e)}\) to be the projection of a spectrahedron. We will see below that if \(f\) has a definite determinantal representation, then we get equality in (3.3) and (3.5).

Example 3.5

Consider the quadratic form \(f(x) = x_1^2 - x_2^2-\cdots -x_n^2\), which is hyperbolic with respect to the point \(e=(1,0,\ldots , 0)\). The hyperbolicity cone \(C(f,e)\) is known as the Lorentz cone. In this example, the relaxation (3.3) is exact. To see this, note that

$$\begin{aligned} \Delta _{e,a}f&= (2x_1)\left( 2a_1x_1-\sum _{j\ne 1}2a_jx_j\right) - \left( x_1^2-\sum _{j\ne 1} x_j^2\right) (2a_1) \\&=2\left( a_1x_1^2-2\sum _{j\ne 1} a_jx_1x_j +\sum _{j\ne 1}a_1x_j^2\right) . \end{aligned}$$

Since every nonnegative quadratic form is a sum of squares, there is equality in (3.3). In fact, taking the Gram matrix of \(\frac{1}{2}\Delta _{e,a}f\), we recover the Lorentz cone as

$$\begin{aligned} \overline{C(f,e)} = \left\{ a \in {\mathbb {R}}^{n}\;:\; \begin{pmatrix} a_1 &{}\quad -a_2 &{} \quad \ldots &{} \quad -a_n \\ -a_2 &{}\quad a_1 &{} \quad &{}\quad 0 \\ \vdots &{} \quad &{}\quad \ddots &{}\quad \vdots \\ -a_n &{}\quad 0 &{}\quad \ldots &{}\quad a_1 \end{pmatrix} \succeq 0 \right\} . \end{aligned}$$

Note also that this Gram matrix gives a definite determinantal representation of \(a_1^{n-2}f(a)\).
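As a numerical cross-check (a sketch, not from the paper; it assumes numpy), positive semidefiniteness of the displayed matrix can be compared with membership in the Lorentz cone at random points:

```python
# Sketch: the arrowhead Gram matrix above is PSD exactly when a lies in the Lorentz cone.
import numpy as np

def gram(a):
    n = len(a)
    G = a[0]*np.eye(n)
    G[0, 1:] = G[1:, 0] = -np.asarray(a[1:])
    return G

rng = np.random.default_rng(2)
agree = 0
for _ in range(1000):
    a = rng.normal(size=4)
    psd = np.linalg.eigvalsh(gram(a)).min() >= -1e-9
    agree += int(psd == (a[0] >= np.linalg.norm(a[1:]) - 1e-9))
print(agree, "of 1000 random points agree")
```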

Example 3.6

Consider the hyperbolic cubic polynomial

$$\begin{aligned} f= (x - y) (x + y) (x + 2 y) - x z^2, \end{aligned}$$

with \(e=[1:0:0]\). Here the polynomial \(\Delta _{e,a}f\) has degree four in \(x,y,z\). In this case, the relaxation (3.3) is exact, as shown in Corollary 4.5 below. (One can also see exactness from the fact that every nonnegative ternary quartic is a sum of squares). Using the basis \((x^2, y^2, z^2, x y, x z, y z)\) of \({\mathbb {R}}[x,y,z]_2\), we can then write the cone \(\overline{C(f,e)}\) as the set of \((a,b,c)\) in \({\mathbb {R}}^3\) for which there exists \((g_1, \ldots , g_6)\in {\mathbb {R}}^6\) to make the real symmetric matrix

$$\begin{aligned} \begin{pmatrix} 3 a + 2 b&{}\quad g_1&{}\quad g_2&{}\quad 4 a -2 b&{}\quad -2 c&{}\quad g_3\\ g_1&{}\quad 9 a + 2 b&{}\quad g_4&{}\quad 4a - 8 b&{}\quad g_5&{}\quad -2 c\\ g_2&{}\quad g_4&{}\quad a&{}\quad g_6&{}\quad 0&{}\quad 0\\ 4 a - 2 b&{}\quad 4a - 8 b&{}\quad g_6&{}\quad 8 a - 20 b - 2 g_1&{}\quad -2 c - g_3&{}\quad -g_5\\ -2 c&{}\quad g_5&{}\quad 0&{}\quad -2 c - g_3&{}\quad 2 b - 2 g_2&{}\quad -2 a - g_6\\ g_3&{}\quad -2 c&{}\quad 0&{}\quad -g_5&{}\quad -2 a - g_6&{}\quad 2a + 6 b - 2 g_4 \end{pmatrix} \end{aligned}$$

positive semidefinite.
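As a symbolic sanity check (a sketch, not part of the paper's text; it assumes sympy), one can verify that \(v^TGv=\Delta _{e,a}f\) holds identically for the matrix above, for every choice of the parameters \(g_1,\ldots ,g_6\):

```python
# Sketch: verify that v^T G v = Delta_{e,a} f for all g1, ..., g6 (Example 3.6).
import sympy as sp

x, y, z, a, b, c = sp.symbols('x y z a b c')
g = sp.symbols('g1:7')
f = (x - y)*(x + y)*(x + 2*y) - x*z**2

De = lambda h: sp.diff(h, x)                                     # e = (1, 0, 0)
Da = lambda h: a*sp.diff(h, x) + b*sp.diff(h, y) + c*sp.diff(h, z)
Delta = sp.expand(De(f)*Da(f) - f*De(Da(f)))

v = sp.Matrix([x**2, y**2, z**2, x*y, x*z, y*z])
G = sp.Matrix([
    [3*a + 2*b, g[0],      g[1], 4*a - 2*b,           -2*c,          g[2]],
    [g[0],      9*a + 2*b, g[3], 4*a - 8*b,            g[4],        -2*c],
    [g[1],      g[3],      a,    g[5],                 0,            0],
    [4*a - 2*b, 4*a - 8*b, g[5], 8*a - 20*b - 2*g[0], -2*c - g[2],  -g[4]],
    [-2*c,      g[4],      0,   -2*c - g[2],           2*b - 2*g[1], -2*a - g[5]],
    [g[2],     -2*c,       0,   -g[4],                -2*a - g[5],   2*a + 6*b - 2*g[3]],
])
print(sp.expand((v.T*G*v)[0] - Delta) == 0)                      # True
```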

The sums of squares relaxation (3.3) is not always exact. A counterexample comes from a multiaffine hyperbolic polynomial and will be discussed in Example 5.11.

4 Definite symmetric determinants

We consider \(\det (X)\) as a polynomial in \({\mathbb {R}}[X_{ij}:i\le j \in [d]]\), where \(X = (X_{ij})\) is a symmetric matrix of variables. Since all eigenvalues of a real symmetric matrix are real, this polynomial is hyperbolic with respect to the identity matrix. The hyperbolicity cone \(C(\det (X),I)\) is the cone of positive definite matrices. Hence, for any positive semidefinite matrix \(E\ne 0\), the polynomial

$$\begin{aligned} D_E(\det (X)) = {{\mathrm{tr}}}\!\left( E\cdot X^{\mathrm{adj}}\right) \end{aligned}$$
(4.1)

interlaces \(\det (X)\), where \(X^\mathrm{adj}\) denotes the adjugate matrix, whose entries are the signed \((d-1)\times (d-1)\)-minors of \(X\). This holds true when we restrict to linear subspaces. For real symmetric \(d\times d\) matrices \(M_1, \ldots , M_n\) and variables \(x=(x_1,\ldots , x_n)\), denote

$$\begin{aligned} M(x) = \sum _{j=1}^n x_j M_j. \end{aligned}$$

If \(M(e)\) is positive definite for some \(e\in {\mathbb {R}}^{n}\), then the polynomial \( \det (M(x))\) is hyperbolic with respect to the point \(e\).

Proposition 4.1

If \(M\) is a real symmetric matrix of linear forms such that \(M(e) \succ 0\) for some \(e\in {\mathbb {R}}^{n}\), then for any positive semidefinite matrix \(E\), the polynomial \({{\mathrm{tr}}}\!\left( E\cdot M^{\mathrm{adj}}\right) \) interlaces \(\det (M)\) with respect to \(e\).

Proof

By the discussion above, the polynomial \(D_E(\det (X)) ={{\mathrm{tr}}}\!\left( E\cdot X^{\mathrm{adj}}\right) \) interlaces \(\det (X)\) with respect to the identity matrix. (In fact these are all of the interlacers of \(\det (X)\). See Example 6.2 below.) By Theorem 2.1, \({{\mathrm{tr}}}\!\left( E\cdot X^{\mathrm{adj}}\right) \) interlaces \(\det (X)\) with respect to any positive definite matrix, in particular \(M(e)\). Restricting to the linear space \(\{M(x) : x\in {\mathbb {R}}^{n}\}\) shows that \({{\mathrm{tr}}}\!\left( E\cdot M^{\mathrm{adj}}\right) \) interlaces \(\det (M)\) with respect to \(e\).\(\square \)

Theorem 4.2

If \(f\in {\mathbb {R}}[x]_d\) has a definite symmetric determinantal representation \(f=\det (M)\) with \(M(e)\succeq 0\) and \(M(a)\succeq 0\), then \(\Delta _{e,a}f\) is a sum of squares. In particular, there is equality in (3.3).

Proof

Because \(M(e)\) and \(M(a)\) are positive semidefinite, we can write them as sums of rank-one matrices: \(M(e) = \sum _i \lambda _i \lambda _i^T\) and \(M(a) = \sum _j \mu _j \mu _j^T\), where \(\lambda _i, \mu _j \in {\mathbb {R}}^d\). Then \(D_e f = \langle M(e), M^\mathrm{adj} \rangle =\langle \sum _i \lambda _i \lambda _i^T, M^\mathrm{adj} \rangle = \sum _i \lambda _i^T M^\mathrm{adj}\lambda _i\), so

$$\begin{aligned} D_e f = \sum _i \lambda _i^T M^\mathrm{adj}\lambda _i \;\;\;\text { and, similarly, }\;\;\; D_a f = \sum _j \mu _j^T M^\mathrm{adj}\mu _j. \end{aligned}$$

Furthermore, by Proposition 4.6 below, the second derivative \(D_eD_af\) is

$$\begin{aligned} D_eD_af =D_e\left( \sum _j \mu _j^T M^\mathrm{adj}\mu _j\right) = \sum _{i,j} u_{ij} \;\;\; \text {where} \;\;\; u_{ij} = \left| \begin{matrix} M &{}\quad \lambda _i &{}\quad \mu _j \\ \lambda _i^T &{}\quad 0 &{}\quad 0 \\ \mu _j^T &{}\quad 0 &{}\quad 0 \end{matrix} \right| . \end{aligned}$$

Now, again using Proposition 4.6, we see that \(\Delta _{e,a}f \) equals

$$\begin{aligned} \sum _{i,j} \biggl (( \lambda _i^T M^\mathrm{adj}\lambda _i )(\mu _j^T M^\mathrm{adj}\mu _j) - \det (M)\cdot u_{ij}\biggl ) =\sum _{i,j} (\lambda _i^T M^\mathrm{adj}\mu _j)^2, \end{aligned}$$
(4.2)

which is the desired sum of squares.\(\square \)

In fact, something stronger is true. We can also consider the case where some power of \(f\) has a definite determinantal representation. This is particularly interesting because taking powers of a hyperbolic polynomial does not change the hyperbolicity cone.

Corollary 4.3

If \(f\in {\mathbb {R}}[x]_d\) and a power \(f^r\) has a definite symmetric determinantal representation \(f^r=\det (M)\) with \(M(e),M(a)\succeq 0\), then \(\Delta _{e,a}(f)\) is a sum of squares. In particular, there is equality in (3.3).

Proof

Let \(f^r\) have a definite determinantal representation. We have \(\Delta _{e,a}(f^r)=r f^{2(r-1)} \Delta _{e,a}f.\) Theorem 4.2 states that \(\Delta _{e,a}(f^r)\) is a sum of squares,

$$\begin{aligned} g_1^2+\cdots +g_s^2 \;=\; r f^{2(r-1)} \Delta _{e,a}f \end{aligned}$$

for some \(g_i \in {\mathbb {R}}[x]\). Let \(p\) be an irreducible factor of \(f^{2(r-1)}\). Then \(p\) is hyperbolic with respect to \(e\) and the right hand side vanishes on \(\mathcal {V}_{{\mathbb {C}}}(p)\). Therefore, each \(g_i\) vanishes on \({\mathcal {V}}_{\mathbb {R}}(p)\) and thus on \({\mathcal {V}}_{\mathbb {C}}(p)\), since \({\mathcal {V}}_{\mathbb {R}}(p)\) is Zariski dense in \({\mathcal {V}}_{\mathbb {C}}(p)\). Thus we can divide the \(g_i\) by \(p\). By iterating this argument, we get the claim. \(\square \)

Remark 4.4

This result is closely related to (but does not seem to follow from) [9, Thm. 1.6], which says that the parametrized Hermite matrix of \(f\) is a sum of matrix squares whenever a power of \(f\) possesses a definite determinantal representation.

Corollary 4.5

If \(f\in {\mathbb {R}}[x_1, x_2, x_3]\) is hyperbolic with respect to \(e\in {\mathbb {R}}^{3}\), then there is equality in (3.3).

Proof

By the Helton-Vinnikov Theorem [8], every hyperbolic polynomial in three variables has a definite determinantal representation. The claim then follows from Theorem 4.2.\(\square \)

The following determinantal identities were needed in the proof of Theorem 4.2 above.

Proposition 4.6

Let \(X\) be a \(d\times d\) matrix of variables \(X_{ij}\) and let \(|\cdot |\) denote \(\det (\cdot )\). Then for any vectors \(\alpha , \beta , \gamma , \delta \in {\mathbb {C}}^d\) we have

$$\begin{aligned} \left| \begin{matrix} X &{} \quad \beta \\ \alpha ^T &{} \quad 0 \end{matrix}\right| \cdot \left| \begin{matrix}X &{}\quad \delta \\ \gamma ^T &{}\quad 0 \end{matrix}\right| -\left| \begin{matrix}X &{}\quad \delta \\ \alpha ^T &{}\quad 0 \end{matrix}\right| \cdot \left| \begin{matrix}X &{}\quad \beta \\ \gamma ^T &{}\quad 0 \end{matrix}\right| =|X| \cdot \left| \begin{matrix}X &{}\quad \beta &{}\quad \delta \\ \alpha ^T &{}\quad 0 &{}\quad 0 \\ \gamma ^T &{}\quad 0 &{}\quad 0 \end{matrix}\right| \end{aligned}$$
(4.3)

in \({\mathbb {C}}[X_{ij}:\; 1\le i,j\le d]\). Furthermore,

$$\begin{aligned} D_{\beta \alpha ^T}|X| = -\left| \begin{matrix}X &{}\quad \beta \\ \alpha ^T &{} \quad 0 \end{matrix}\right| \;\;\;\;\; \text {and}\;\;\;\;\; D_{\delta \gamma ^T } D_{ \beta \alpha ^T }|X| = \left| \begin{matrix}X &{}\quad \beta &{}\quad \delta \\ \alpha ^T &{}\quad 0 &{}\quad 0 \\ \gamma ^T &{}\quad 0 &{}\quad 0 \end{matrix}\right| . \end{aligned}$$

Proof

We will prove the first identity using Schur complements. See, for example, [3, §1]. If \(A\) is an invertible \(m\times m\) block of the \(n\times n\) matrix \({\small \begin{pmatrix}A &{} C \\ B &{} D \end{pmatrix}}\), then the determinant of the whole matrix equals \(|A| \cdot |D-BA^{-1}C|\). If \(D\) is the zero matrix, this simplifies to

$$\begin{aligned} \left| \begin{matrix}A &{}\quad C \\ B &{}\quad 0 \end{matrix}\right| = |A|\cdot \left| \frac{-1}{|A|}\cdot BA^\mathrm{adj}C \right| = |A| \cdot \left( \frac{-1}{|A|}\right) ^{n-m} \cdot | BA^\mathrm{adj}C|. \end{aligned}$$

To obtain the desired identity, we set \(A=X, B = {\small \begin{pmatrix} \alpha ^T \\ \gamma ^T \end{pmatrix}}\), and \(C = {\begin{pmatrix} \beta&\delta \end{pmatrix}}\):

$$\begin{aligned} \left| \begin{matrix}X &{}\quad \beta &{}\quad \delta \\ \alpha ^T &{}\quad 0 &{}\quad 0 \\ \gamma ^T &{}\quad 0 &{}\quad 0 \end{matrix} \right|&= |X| \cdot \left( \frac{-1}{|X|}\right) ^{2} \cdot \left| \begin{pmatrix} \alpha ^T \\ \gamma ^T \end{pmatrix} X^\mathrm{adj} \begin{pmatrix} \beta&\quad \delta \end{pmatrix}\right| \\&= \frac{1}{|X|} \cdot \left| \begin{matrix} \alpha ^T X^\mathrm{adj} \beta &{}\quad \alpha ^T X^\mathrm{adj} \delta \\ \gamma ^T X^\mathrm{adj} \beta &{}\quad \gamma ^T X^\mathrm{adj} \delta \end{matrix}\right| . \end{aligned}$$

Multiplying both sides by \(\det (X)\) and evaluating each single-bordered determinant by the same formula with \(n-m=1\), namely \({\small \left| \begin{matrix}X &{} \beta \\ \alpha ^T &{} 0 \end{matrix}\right| } = -\alpha ^T X^{\mathrm{adj}}\beta \), finishes the proof of the determinantal identity; the two signs cancel in each product.

For the claim about derivatives of the determinant, by additivity, we only need to look at the case when \(\alpha , \beta , \gamma , \delta \) are unit vectors, \(e_i, e_j, e_k, e_l\), respectively. Then \(D_{\beta \alpha ^T}|X| = D_{e_j e_i^T}|X|\) is the derivative of \(|X|\) with respect to the entry \(X_{ji}\). This is the cofactor of \(X_{ji}\), i.e. the signed minor of \(X\) obtained by removing the \(j\)th row and \(i\)th column, which is precisely the negative of the determinant \({\small \left| \begin{matrix}X &{} e_j \\ e_i^T &{} 0 \end{matrix}\right| }\). Taking the derivative of this bordered determinant with respect to \(X_{lk}\) in the same way produces a second sign, and the two signs cancel to give

$$\begin{aligned} \frac{\partial ^2 |X|}{\partial X_{ji} \partial X_{lk}} \;=\; D_{ e_l e_k^T} D_{e_je_i^T }|X| \;=\; -D_{ e_le_k^T} \left| \begin{matrix}X &{}\quad e_j \\ e_i^T &{}\quad 0 \end{matrix}\right| \; = \; \left| \begin{matrix}X &{}\quad e_j &{}\quad e_l \\ e_i^T &{}\quad 0 &{}\quad 0 \\ e_k^T &{}\quad 0 &{}\quad 0 \end{matrix}\right| . \end{aligned}$$

\(\square \)
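The identities of Proposition 4.6 are easy to confirm symbolically. The following is a small verification sketch (not from the paper; it assumes sympy); the second check confirms the first-derivative formula with the sign as stated above.

```python
# Sketch: verify identity (4.3) and the first-derivative formula on a 3x3 matrix.
import sympy as sp

d = 3
X = sp.Matrix(d, d, lambda i, j: sp.Symbol(f'x{i+1}{j+1}'))
al = sp.Matrix(sp.symbols('al1:4'))
be = sp.Matrix(sp.symbols('be1:4'))
ga = sp.Matrix(sp.symbols('ga1:4'))
de = sp.Matrix(sp.symbols('de1:4'))

def border(X, cols, rows):
    """Determinant of X bordered by the given column and row vectors (zero corner block)."""
    top = X.row_join(sp.Matrix.hstack(*cols))
    bot = sp.Matrix.hstack(*rows).T.row_join(sp.zeros(len(rows), len(cols)))
    return top.col_join(bot).det()

lhs = (border(X, [be], [al])*border(X, [de], [ga])
       - border(X, [de], [al])*border(X, [be], [ga]))
rhs = X.det()*border(X, [be, de], [al, ga])
print(sp.expand(lhs - rhs) == 0)                                  # identity (4.3): True

i, j = 0, 1                               # 0-indexed; the entry X_{ji} is X[j, i]
deriv = sp.diff(X.det(), X[j, i])
print(sp.expand(deriv + border(X, [sp.eye(d)[:, j]], [sp.eye(d)[:, i]])) == 0)  # True
```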

We conclude this section with a general result inspired by Dixon’s construction of determinantal representations of plane curves, which will be applied in the next section. If \(f\in {\mathbb {R}}[x]_d\) has a definite determinantal representation, \(f=\det (M)\) with \(M(e)\succ 0\), then \(M^\mathrm{adj}\) is a \(d \times d\) matrix with entries of degree \(d-1\). This matrix has rank at most one on \({\mathcal {V}}(f)\), as seen by the identity \(M\cdot M^\mathrm{adj} = \det (M)\cdot I\). By Proposition 4.1, the top left entry \(M^\mathrm{adj}_{11}\) interlaces \(f\) with respect to \(e\). In fact, these properties of \(M^\mathrm{adj}\) are enough to reconstruct a definite determinantal representation \(M\).

Theorem 4.7

Let \(A=(a_{ij})\) be a symmetric \(d \times d\) matrix of real forms of degree \(d-1\). Suppose that \(f\in {\mathbb {R}}[x]_d\) is irreducible and hyperbolic with respect to \(e \in {\mathbb {R}}^n\). If \(A\) has rank one modulo \((f)\), then \(f^{d-2}\) divides the entries of \(A^\mathrm{adj}\) and the matrix \(M=(1/f^{d-2})A^\mathrm{adj}\) has linear entries. Furthermore there exists \(\gamma \in {\mathbb {R}}\) such that

$$\begin{aligned} \det (M) = \gamma f. \end{aligned}$$

If \(a_{11}\) interlaces \(f\) with respect to \(e\) and \(A\) has full rank, then \(\gamma \ne 0\) and \(M(e)\) is definite.

Proof

By assumption, \(f\) divides all the \(2\times 2\) minors of \(A\). Therefore, \(f^{d-2}\) divides all of the \((d-1)\times (d-1)\) minors of \(A\) and thus all of the entries of the adjugate matrix \(A^\mathrm{adj}\), see [12, Lemma 4.7]. We can then consider the matrix \(M = (1/f^{d-2})\cdot A^\mathrm{adj}\). By similar arguments, \(f^{d-1}\) divides \(\det (A)\). Because these both have degree \(d(d-1)\), we conclude that \(\det (A) = \lambda f^{d-1}\) for some \(\lambda \in {\mathbb {R}}\). Putting all of this together, we find that

$$\begin{aligned} \det (M) =\frac{1}{f^{d(d-2)}}\cdot \det (A^\mathrm{adj})=\frac{1}{f^{d(d-2)}} \det (A)^{d-1}=\lambda ^{d-1} f, \end{aligned}$$

so we can take \(\gamma =\lambda ^{d-1}\). Now, suppose that \(a_{11}\) interlaces \(f\) and that \(\gamma =\lambda =0\). Then \(\det (A)\) is identically zero. In particular, the determinant of \(A(e)\) is zero, there is some nonzero vector \(v\in {\mathbb {R}}^d\) in its kernel, and \(v^TA(e)v\) is also zero.

We will show that the polynomial \(v^TAv\) is not identically zero and that it interlaces \(f\) with respect to \(e\). This will contradict the conclusion that \(v^TA(e)v=0\). Because \(A\) has rank one on \({\mathcal {V}}(f)\), for each \(i=1,\ldots , d\) we have that

$$\begin{aligned} (e_i^TAe_i)(v^TAv) - (e_i^TAv)^2 = 0 \;\; \text { modulo }\;(f). \end{aligned}$$
(4.4)

If \(v^TAv\) is identically zero, then \(e_i^TAv\) vanishes on \({\mathcal {V}}(f)\). Since \(e_i^TAv\) only has degree \(d-1\), it must vanish identically as well. As this holds for each \(i\), this implies that \(Av\) is zero, which contradicts the assumption that \(A\) has full rank. Thus \(v^TAv\) cannot be identically zero.

Furthermore, (4.4) shows that \(a_{11}\cdot (v^TAv)\) is nonnegative on \({\mathcal {V}}_{{\mathbb {R}}}(f)\). Then Lemma 2.4 shows that \(v^TAv\) interlaces \(f\) with respect to \(e\). In particular, \(v^TAv\) cannot vanish at the point \(e\). Thus the determinant of \(A\) and hence \(M\) cannot be identically zero.

Thus \(M\) is a determinantal representation of \(\gamma f\). To show that \(M(e)\) is definite, it suffices to show that \(A(e)\) is definite. For any nonzero vector \(v \in {\mathbb {R}}^d\), we see from (4.4) with \(i=1\) that \(a_{11}v^TAv\) is nonnegative on \({\mathcal {V}}_{{\mathbb {R}}}(f)\). Thus \(a_{11}(e) \cdot v^TAv\) belongs to \(\mathrm{Int}(f,e)\) by Lemma 2.4 and in particular \(a_{11}(e) \cdot v^TA(e)v\) is positive for every nonzero \(v\in {\mathbb {R}}^d\). Hence the matrix \(A(e)\) is definite.\(\square \)

5 Multiaffine polynomials

An interesting special case of a hyperbolic polynomial is a multiaffine polynomial whose hyperbolicity cone contains the positive orthant. These polynomials are deeply connected to the theory of matroids [2, 4, 21].

Definition 5.1

A polynomial \(f\in {\mathbb {R}}[x]\) is called affine in \(x_i\) if the degree of \(f\) in \(x_i\) is at most one. If \(f\) is affine in each variable \(x_1, \ldots , x_n\), then \(f\) is called multiaffine.

Much of the literature on these polynomials deals with complex polynomials, rather than real polynomials, and the property of stability in place of hyperbolicity.

Definition 5.2

A polynomial \(f\in {\mathbb {C}}[x]\) is called stable if \(f(\mu )\) is non-zero whenever the imaginary part of each coordinate \(\mu _i\) is positive for all \(1 \le i \le n\).

A real homogeneous polynomial \(f \in {\mathbb {R}}[x]\) is stable if and only if \(f\) is hyperbolic with respect to every point in the positive orthant. After a linear change of variables, every hyperbolic polynomial is stable. In 2004, Choe et al. [4] showed that if \(f\in {\mathbb {R}}[x]_d\) is stable, homogeneous, and multiaffine, then its support (the collection of \(I\subset \{1,\dots ,n\}\) for which the monomial \(\prod _{i\in I}x_i\) appears in \(f\)) is the set of bases of a matroid. They further showed that any representable matroid is the support of some stable multiaffine polynomial. In 2010, Brändén [2] used this deep connection to disprove the generalized Lax conjecture by showing that the bases-generating polynomial of the Vámos matroid (see Example 5.11) is hyperbolic but none of its powers has a definite determinantal representation. This example will also provide a counterexample to equality in our relaxation (3.3).

The Wronskian polynomials \(\Delta _{e,a}f\) also played a large part in the study of multiaffine stable polynomials. They are particularly useful when the points \(e\) and \(a\) are unit vectors. In this case, we will simplify our notation and write

$$\begin{aligned} \Delta _{ij}(f) :=\Delta _{e_i, e_j}(f) = \frac{\partial f}{\partial x_i} \cdot \frac{\partial f}{\partial x_j} - f\cdot \frac{\partial ^2 f}{\partial x_i \partial x_j}. \end{aligned}$$

Using these polynomials, Brändén [1] established a necessary and sufficient condition for multiaffine polynomials to be stable.

Theorem 5.3

([1, Theorem 5.6]) For multiaffine \(f\in {\mathbb {R}}[x]\), the following are equivalent:

(1) \(\Delta _{ij} f\) is nonnegative on \({\mathbb {R}}^n\) for all \(1 \le i,j \le n\),

(2) \(f\) is stable.

Brändén also notes that the implication (2)\(\Rightarrow \)(1) holds for polynomials that are not multiaffine, giving an alternative proof of a part of Theorem 2.1 above. The other implication, however, does not hold in general, as the following example shows.

Example 5.4

Let \(h=x_1^2+x_2^2, q=x_1+x_2\) and \(N \in {\mathbb {N}}\). Clearly \(q^N h\) is not hyperbolic with respect to any \(e \in {\mathbb {R}}^2\), but for all \(i,j \in \{1,2\}\) we have

$$\begin{aligned} \Delta _{ij}(q^N h)\,&=\,q^{2N} \Delta _{ij}h + N q^{2N-2} h^2 \Delta _{ij}q \\&=\,q^{2N-2}(q^2 \Delta _{ij}h+ N h^2). \end{aligned}$$

Now let \(z \in {\mathbb {R}}\) be the minimal value that \(q^2 \Delta _{ij}h\) takes on the unit sphere and let \(N>|z|\). Then, since \(\Delta _{ij}(q^N h)\) is homogeneous, we see that \(\Delta _{ij}(q^N h)\) is nonnegative on \({\mathbb {R}}^2\). Because \(\Delta _{ij}(q^N h)\) is a homogeneous polynomial in two variables, it is even a sum of squares. Thus \(\Delta _{ab} (q^N h)\) is a sum of squares for all \(a,b\) in the positive orthant. This also shows that the converse of Corollary 4.3 is not true, i.e. there is some polynomial \(f\) such that \(\Delta _{e,a}f\) is a sum of squares for all \(e,a\) in some full dimensional cone, but no power of \(f\) has a definite determinantal representation.

In an analogous way, the polynomials \(\Delta _{ij}\) can also be used to determine whether or not a homogeneous multiaffine stable polynomial has a definite determinantal representation.

Theorem 5.5

Let \(f\in {\mathbb {R}}[x]_d\) be homogeneous and stable. Suppose \(f\) is affine in the variables \(x_1, \ldots , x_d\) and the coefficient of \(x_1 \cdots x_d\) in \(f\) is non-zero. Then the following are equivalent:

(1) \(\Delta _{ij} f\) is a square in \({\mathbb {R}}[x]\) for all \(1 \le i,j \le d\);

(2) \(\frac{\partial f}{\partial x_i} \cdot \frac{\partial f}{\partial x_j}\) is a square in \({\mathbb {R}}[x]/(f)\) for all \(1 \le i,j \le d\);

(3) \(f\) has a definite determinantal representation.

Lemma 5.6

Let \(f\in {\mathbb {R}}[x]\) be affine in \(x_i\) and \(x_j\) for some \(i,j \in \{1,\dots ,n\}\). If \(f=g\cdot h\) with \(g,h \in {\mathbb {R}}[x]\), then \(\Delta _{ij} f\) is a square if and only if \(\Delta _{ij} g\) and \(\Delta _{ij} h\) are squares.

Proof

Suppose \(\Delta _{ij} f\) is a square. Since \(f\) is affine in \(x_i, x_j\), both \(g\) and \(h\) are affine in \(x_i, x_j\) and either \(\frac{\partial g}{\partial x_i}=0\) or \(\frac{\partial h}{\partial x_i}=0\). It follows that either \(\Delta _{ij}g=0\) or \(\Delta _{ij}h=0\). Using the identity \(\Delta _{ij} f = g^2 \Delta _{ij}h + h^2 \Delta _{ij} g\), we see that either \(\Delta _{ij}g=0\) or \(\Delta _{ij}g=(\Delta _{ij} f )/ h^2.\) In both cases \(\Delta _{ij} g\) is a square. The same holds true for \(\Delta _{ij} h\). For the converse, suppose that \(\Delta _{ij} g\) and \(\Delta _{ij} h\) are squares. As we saw above, one of them is zero. Thus \(\Delta _{ij} f=h^2 \Delta _{ij} g\) or \(\Delta _{ij} f=g^2 \Delta _{ij} h\).\(\square \)

Proof of Theorem 5.5

      (\(1 \Rightarrow 2\)) Clear.

(\(2\Rightarrow 3\)) First, suppose that \(f\) is irreducible. We will construct a matrix \(A\) satisfying the hypotheses of Theorem 4.7. For every \(i \le j\), the polynomial \(\frac{\partial f}{\partial x_i} \cdot \frac{\partial f}{\partial x_j}\) is equivalent to a square \(a_{ij}^2\) modulo \((f)\). In the case \(i=j\) we can choose \(a_{ii}=\frac{\partial f}{\partial x_i}\). Then it is easy to check that \(a_{11}a_{ii}\) equals \(a_{1i}^2\) modulo \((f)\). Further, for every \(2\le i<j \le d\), the polynomials \((a_{11}a_{ij})^2\) and \((a_{1i}a_{1j})^2\) are equivalent modulo \((f)\). After changing the sign of \(a_{ij}\) if necessary, we see that \(a_{11}a_{ij}\) equals \(a_{1i}a_{1j}\) modulo \((f)\). Because \(f\) is irreducible, it follows that the symmetric matrix \(A =(a_{ij})_{ij}\) has rank one on \({\mathcal {V}}(f)\).

We now need to show that \(A\) has full rank. For each \(k=1, \ldots , d\), consider the point \(p_k=\sum _{j\in [d]\backslash \{k\}}e_j\), which lies in the real variety of \(f\). For \(j\ne k\), we see that \(\partial f / \partial x_j\) vanishes at \(p_k\), and therefore so must \(a_{kj}\). On the other hand, \(a_{kk}(p_k) = \partial f / \partial x_k (p_k)\) equals the nonzero coefficient of \(x_1\cdots x_d\) in \(f\). Now suppose that \(Av=0\) for some \(v\in {\mathbb {R}}^d\). The \(k\)th row of this is \(\sum _j v_j a_{kj}=0\). Plugging in the point \(p_k\) then shows that \(v_k\) must be zero, and thus \(v\) is the zero vector. Since \(f\) is stable, \(a_{11}=\partial f/\partial x_1\) interlaces it, and so by Theorem 4.7, \(f\) has a definite determinantal representation.

If \(f\) is reducible and \(g\) is an irreducible factor of \(f\), then, by Lemma 5.6, \(\Delta _{ij} g\) is a square. Since every irreducible factor of \(f\) then has a definite determinantal representation, so does \(f\) (take a block diagonal sum of the representations).

(\(3 \Rightarrow 1\)) Let \(f = \det (M)= \det (\sum _ix_i M_i)\), where \(M_1, \ldots , M_n\) are real symmetric \(d\times d\) matrices with \(\sum _i M_i \succ 0\). Because \(f\) is affine in each of the variables \(x_1, \ldots , x_d\), the matrices \(M_1, \ldots , M_d\) must have rank one. Furthermore, since \(f\) is stable, these rank-one matrices must be positive semidefinite (see [2], proof of Theorem 2.2). Thus we can write \(M_i = v_i v_i^{ T}\), with \(v_i \in {\mathbb {R}}^{d}\) for each \(1 \le i \le d\). Then by (4.2) and Proposition 4.6, we have \(\Delta _{ij} f = (v_i^T M^\mathrm{adj}v_j)^2\) for \(1 \le i,j \le d\). \(\square \)

Corollary 5.7

Let \(f\in {\mathbb {R}}[x]\) be homogeneous, stable and multiaffine. Then the following are equivalent:

(1) \(\Delta _{ij} f\) is a square for all \(1 \le i,j \le n\);

(2) \(\frac{\partial f}{\partial x_i} \cdot \frac{\partial f}{\partial x_j}\) is a square in \({\mathbb {R}}[x]/(f)\) for all \(1 \le i,j \le n\);

(3) \(f\) has a definite determinantal representation.

Proof

This is an immediate consequence of the preceding theorem.\(\square \)

Corollary 5.8

Let \(1 \le k \le n\) and let \(f\in {\mathbb {R}}[x]\) be a multiaffine stable polynomial. If \(f\) has a definite determinantal representation, then \(\frac{\partial f}{\partial x_k}\) and \(f|_{x_k=0}\) also have a definite determinantal representation.

Proof

Fix \(1 \le i,j \le n\) and set \(g=\frac{\partial f}{\partial x_k}\) and \(h=f|_{x_k=0}\). Wagner and Wei [21] calculated

$$\begin{aligned} \Delta _{ij} f = x_k^2 \cdot \Delta _{ij} g + x_k \cdot p+\Delta _{ij} h, \end{aligned}$$

where \(p, g, h \in {\mathbb {R}}[x_1, \ldots , x_n]\) do not depend on \(x_k\). Since \(\Delta _{ij} f\) is a square, say \(\Delta _{ij} f=q^2\), and \(\Delta _{ij} f\) has degree at most two in \(x_k\), we may write \(q=x_kq_1+q_0\) with \(q_0, q_1\) independent of \(x_k\). Comparing the coefficients of \(x_k^2\) and of \(x_k^0\) shows that \(\Delta _{ij} g=q_1^2\) and \(\Delta _{ij} h=q_0^2\) are squares as well. Since \(g\) and \(h\) are again multiaffine and stable, they have a definite determinantal representation by Corollary 5.7.

\(\square \)

Corollary 5.9

Let \(f=g\cdot h\), where \(f,g,h \in {\mathbb {R}}[x]\) are multiaffine stable polynomials. Then \(f\) has a definite determinantal representation if and only if both \(g\) and \(h\) do.

Proof

This follows directly from Lemma 5.6 and Theorem 5.5.\(\square \)

Example 5.10

[Elementary Symmetric Polynomials] Let \(e_d \in {\mathbb {R}}[x]\) be the elementary symmetric polynomial of degree \(d\). We have \(\Delta _{ij} e_1=1, \Delta _{ij} e_n=0\) and \(\Delta _{ij} e_{n-1}= (x_1 \ldots x_n/x_i x_j)^2\) for all \(1 \le i<j \le n\). It is a classical result that these are the only cases where \(e_d\) has a definite determinantal representation [16]. Indeed, for \(n\ge 4\) and \(2 \le d \le n-2\) the coefficients of the monomials \((x_3 x_5 \cdots x_{d+2})^2, (x_4 x_5 \cdots x_{d+2})^2\) and \(x_3 x_4 ( x_5 \cdots x_{d+2})^2\) in \(\Delta _{12} e_d\) are all \(1\). Specializing to \(x_j=1\) for \(j\ge 5\) then shows that \(\Delta _{12}e_d\) is not a square.
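The case \(n=4\), \(d=2\) can be checked directly; the following is a small sketch (not from the paper; it assumes sympy):

```python
# Sketch: Delta_{12} e_2 in four variables is nonnegative but not a square.
import sympy as sp
from itertools import combinations

xs = sp.symbols('x1:5')
e2 = sum(xi*xj for xi, xj in combinations(xs, 2))
D12 = sp.expand(sp.diff(e2, xs[0])*sp.diff(e2, xs[1]) - e2*sp.diff(e2, xs[0], xs[1]))
print(D12)                                        # x3**2 + x3*x4 + x4**2 (up to term order)
_, factors = sp.factor_list(D12)
print(all(mult % 2 == 0 for _, mult in factors))  # False: not a perfect square
```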

Example 5.11

[The Vámos Polynomial] The relaxations (3.3) and (3.5) are not always exact. An example of this comes from the multiaffine quartic polynomial in \({\mathbb {R}}[x_1, \ldots , x_8]_4\) given as the bases-generating polynomial of the Vámos matroid:

$$\begin{aligned} h(x_1, \ldots , x_8) = \sum _{I \in \left( {\begin{array}{c}[8]\\ 4\end{array}}\right) \backslash C} \prod _{i\in I} x_i, \end{aligned}$$

where \(C = \{\{1,2,3,4\},\{1,2,5,6\},\{1,2,7,8\},\{3,4,5,6\},\{3,4,7,8\}\}\). Wagner and Wei [21] have shown that the polynomial \(h\) is stable, using an improved version of Theorem 5.3 and representing \(\Delta _{13}h\) as a sum of squares. But it turns out that \(\Delta _{78} h\) is not a sum of squares. Because the cone of sums of squares is closed, it follows that for some \(a,e\) in the hyperbolicity cone of \(h\), the polynomial \(D_{e}h\cdot D_{a}h - h\cdot D_{e}D_{a}h\) is not a sum of squares. In order to show that \(\Delta _{78} h\) is not a sum of squares, it suffices to restrict to the subspace \(\{x_1 = x_2,\; x_3=x_4,\; x_5=x_6\}\) (note that \(\Delta _{78} h\) does not involve \(x_7\) and \(x_8\)) and show that the resulting polynomial \(W=(1/4)\,\Delta _{78} h (z,z,y,y,x,x,w,w)\), with \(z =x_1 = x_2\), \(y =x_3=x_4\) and \(x =x_5=x_6\), is not a sum of squares. This restriction is given by

$$\begin{aligned} W&= x^4y^2 + 2x^3y^3 + x^2y^4 + x^4yz + 5x^3y^2z + 6x^2y^3z + 2xy^4z + x^4z^2 + 5x^3yz^2 \\&+ 10x^2y^2z^2 + 6xy^3z^2 + y^4z^2 + 2x^3z^3 + 6x^2yz^3 + 6xy^2z^3 +2y^3z^3 + x^2z^4 \\&+ 2xyz^4 + y^2z^4. \end{aligned}$$

This polynomial vanishes at six points in \({\mathbb {P}}^2({\mathbb {R}})\),

$$\begin{aligned}{}[1:0:0], \;[0:1:0],\; [0:0:1],\;[1:-1:0],\;[1:0:-1], \text { and }\;[0:1:-1]. \end{aligned}$$

Thus if \(W\) is written as a sum of squares \(\sum _k h_k^2\), then each \(h_k\) must vanish at each of these six points. The subspace of \({\mathbb {R}}[x,y,z]_3\) of cubics vanishing in these six points is four dimensional and spanned by \(v =\{x^2y+xy^2, x^2z+xz^2, y^2z+yz^2, xyz\}\). Then \(W\) is a sum of squares if and only if there exists a positive semidefinite \(4\times 4\) matrix \(G\) such that \(W = v^TGv\). However, the resulting linear equations in the variables \(G_{ij}, \; 1\le i\le j \le 4\), have the unique solution

$$\begin{aligned} G = \begin{pmatrix} 1&{}\quad 1/2&{}\quad 1&{}\quad 2\\ 1/2&{}\quad 1&{}\quad 1&{}\quad 2\\ 1&{}\quad 1&{}\quad 1&{} \quad 2\\ 2&{}\quad 2&{}\quad 2&{}\quad 5 \end{pmatrix}. \end{aligned}$$

One can see that \(G\) is not positive semidefinite from its determinant, which is \(-1/4\). Thus \(W\) cannot be written as a sum of squares.
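These computations can be reproduced symbolically. The following sketch (not from the paper; it assumes sympy) builds \(h\), computes \(W\), and checks that \(W = v^TGv\) for the matrix \(G\) above and that \(\det (G)=-1/4\):

```python
# Sketch: reproduce W from the Vamos polynomial and check W = v^T G v, det(G) = -1/4.
import itertools
import sympy as sp

xs = sp.symbols('x1:9')
C = [{1, 2, 3, 4}, {1, 2, 5, 6}, {1, 2, 7, 8}, {3, 4, 5, 6}, {3, 4, 7, 8}]
h = sum(sp.Mul(*(xs[i - 1] for i in I))
        for I in map(set, itertools.combinations(range(1, 9), 4)) if I not in C)

d78 = sp.expand(sp.diff(h, xs[6])*sp.diff(h, xs[7]) - h*sp.diff(h, xs[6], xs[7]))
x, y, z, w = sp.symbols('x y z w')
W = sp.expand(d78.subs(dict(zip(xs, (z, z, y, y, x, x, w, w)))) / 4)

v = sp.Matrix([x**2*y + x*y**2, x**2*z + x*z**2, y**2*z + y*z**2, x*y*z])
G = sp.Matrix([[1, sp.Rational(1, 2), 1, 2],
               [sp.Rational(1, 2), 1, 1, 2],
               [1, 1, 1, 2],
               [2, 2, 2, 5]])
print(sp.expand(W - (v.T*G*v)[0]) == 0)   # True
print(G.det())                            # -1/4, so G is not positive semidefinite
```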

This, along with Corollary 4.3, provides another proof that no power of the Vámos polynomial \(h(x)\) has a definite determinantal representation.

The polynomial \(\Delta _{78} h\) is also not a sum of squares modulo the ideal \((h)\). To see this, suppose \(\Delta _{78} h=\sum _i q_i^2 + p\cdot h\) for some \(p,q_i \in {\mathbb {R}}[x]\) and consider the terms with largest degree in \(x_7\) and \(x_8\) in this expression. Writing \(h= h_0+h_1(x_7+x_8)+x_7x_8h_2\) where \(h_0, h_1, h_2\) lie in \({\mathbb {R}}[x_1, \ldots , x_6]\), we see that the leading form \(x_7x_8 h_2\) is real radical, meaning that whenever a sum of squares \(\sum _i g_i^2\) lies in the ideal \((x_7x_8 h_2)\), this ideal contains each polynomial \(g_i\). Since \(\Delta _{78} h\) does not involve the variables \(x_7\) and \(x_8\), we can then reduce the polynomials \(q_i\) modulo the ideal \((h)\) so that they do not contain the variables \(x_7, x_8\). See [19, Lemma 3.4]. Because \(h\) does involve the variables \(x_7\) and \(x_8\), this results in a representation of \(\Delta _{78} h\) as a sum of squares, which is impossible, as we have just seen.

6 The cone of interlacers and its boundary

Here we investigate the convex cone \(\mathrm{Int}(f,e)\) of polynomials interlacing \(f\). We compute this cone in two examples coming from optimization and discuss its algebraic boundary, the minimal polynomial vanishing on the boundary of \(\mathrm{Int}(f,e)\), when this cone is full dimensional. For smooth polynomials, this algebraic boundary is irreducible.

If the real variety of a hyperbolic polynomial \(f\) is smooth, then the cone \(\mathrm{Int}(f,e)\) of interlacers is full dimensional in \({\mathbb {R}}[x]_{d-1}\). On the other hand, if \({\mathcal {V}}(f)\) has a real singular point, then every polynomial that interlaces \(f\) must pass through this point. This has two interesting consequences for the hyperbolic polynomials coming from linear programming and semidefinite programming.

Example 6.1

Consider \(f = \prod _{i=1}^nx_i\). The singular locus of \({\mathcal {V}}(f)\) consists of the set of vectors with two or more zero coordinates. The subspace of polynomials in \({\mathbb {R}}[x]_{n-1}\) vanishing in these points is spanned by the \(n\) polynomials \(\{\prod _{j\ne i}x_j\,:\; i=1,\ldots , n\}\). Note that this is exactly the linear space spanned by the partial derivatives of \(f\). Theorem 3.1 then shows that the cone of interlacers is isomorphic to \(\overline{C(f,e)} = ({\mathbb {R}}_{\ge 0})^{n}\):

$$\begin{aligned} \mathrm{Int}\left( \prod x_i, \mathbf {1}\right) =\left\{ \; \sum _{i=1}^n a_i \prod _{j\ne i}x_j \;:\; a\in ({\mathbb {R}}_{\ge 0})^{n} \;\right\} \cong ({\mathbb {R}}_{\ge 0})^{n}. \end{aligned}$$

Interestingly, this also happens when we replace the positive orthant by the cone of positive definite symmetric matrices.

Example 6.2

Let \(f = \det (X)\) where \(X\) is a \(d\times d\) symmetric matrix of variables. The singular locus of \({\mathcal {V}}(f)\) is the locus of matrices with rank \(\le d-2\). The corresponding ideal is generated by the \((d-1)\times (d-1)\) minors of \(X\). Since these have degree \(d-1\), we see that the polynomials interlacing \(\det (X)\) must lie in the linear span of the \((d-1)\times (d-1)\) minors of \(X\). Again, this is exactly the linear span of the directional derivatives \(D_E(f) = {{\mathrm{tr}}}(E\cdot X^\mathrm{adj})\). Thus Theorem 3.1 identifies \(\mathrm{Int}(f,e)\) with the cone of positive semidefinite matrices:

$$\begin{aligned} \mathrm{Int}\left( \det (X), I\right) =\biggl \{\; {{\mathrm{tr}}}(A\cdot X^\mathrm{adj})\;:\; A\in {\mathbb {R}}^{d\times d}_{\succeq 0} \;\biggr \} \cong {\mathbb {R}}^{d\times d}_{\succeq 0}. \end{aligned}$$
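The identification \(D_E(\det ) = {{\mathrm{tr}}}(E\cdot X^\mathrm{adj})\) is Jacobi's formula and can be confirmed symbolically for small \(d\). The following SymPy sketch does this for \(d=3\); the helper `sym_matrix` is ad hoc.

```python
import sympy as sp

d = 3
xs = sp.symbols(f'x0:{d*(d+1)//2}')   # entries of the symmetric matrix of variables X
es = sp.symbols(f'e0:{d*(d+1)//2}')   # entries of a symmetric direction E

def sym_matrix(vals):
    """Fill a symmetric d x d matrix from a list of upper-triangular entries."""
    M = sp.zeros(d, d)
    k = 0
    for i in range(d):
        for j in range(i, d):
            M[i, j] = M[j, i] = vals[k]
            k += 1
    return M

X, E = sym_matrix(xs), sym_matrix(es)
t = sp.Symbol('t')

lhs = sp.diff(sp.det(X + t * E), t).subs(t, 0)   # directional derivative D_E det(X)
rhs = (E * X.adjugate()).trace()                 # Jacobi's formula: tr(E * adj(X))
print(sp.expand(lhs - rhs) == 0)                 # True
```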

If \({\mathcal {V}}_{{\mathbb {R}}}(f)\) is nonsingular, then the cone \(\mathrm{Int}(f,e)\) is full dimensional and its algebraic boundary is a hypersurface in \({\mathbb {R}}[x]_{d-1}\). Any polynomial \(g\) on the boundary of \(\mathrm{Int}(f,e)\) must meet \({\mathcal {V}}(f)\) non-transversally at some point. As we see in the next theorem, this algebraic condition exactly characterizes the algebraic boundary of \(\mathrm{Int}(f,e)\).

Theorem 6.3

Let \(f \in {\mathbb {R}}[x]_d\) be hyperbolic with respect to \(e\in {\mathbb {R}}^n\) and assume that the projective variety \({\mathcal {V}}(f)\) is smooth. Then the algebraic boundary of the convex cone \(\mathrm{Int}(f,e)\) is the irreducible hypersurface in \({\mathbb {C}}[x]_{d-1}\) given by

$$\begin{aligned} \left\{ g\in {\mathbb {C}}[x]_{d-1} \;:\; \exists \, p\in {\mathbb {P}}^{n-1} \text { such that } f(p)=g(p)=0 \text { and } {{\mathrm{rank}}}\begin{pmatrix} \nabla f (p) \\ \nabla g (p) \end{pmatrix} \le 1\right\} . \end{aligned}$$
(6.1)

Proof

First, we show that the set (6.1) is irreducible. Consider the incidence variety \(X\) of polynomials \(g\) and points \(p\) satisfying this condition,

$$\begin{aligned} X = \left\{ (g,p) \!\in \! {\mathbb {P}}({\mathbb {C}}[x]_{d-1})\times {\mathbb {P}}^{n-1} \;:\;f(p)=g(p)\!=\!0 \text { and }{{\mathrm{rank}}}\begin{pmatrix} \nabla f (p) \\ \nabla g (p) \end{pmatrix} \!\le \! 1\right\} \!. \end{aligned}$$

The image of the projection \(\pi _2\) onto the second factor is \({\mathcal {V}}(f)\subset {\mathbb {P}}^{n-1}\). Since \(\nabla f(p)\ne 0\) at every point \(p\) of the smooth variety \({\mathcal {V}}(f)\), the fibres of \(\pi _2\) are linear spaces in \({\mathbb {P}}({\mathbb {C}}[x]_{d-1})\) of constant dimension. In particular, all fibres of \(\pi _2\) are irreducible of the same dimension. Since \(X\) and \({\mathcal {V}}(f)\) are projective and the latter is irreducible, this implies that \(X\) is irreducible (see [17, §I.6, Thm. 8]), so its projection \(\pi _1(X)\) onto the first factor, which is our desired set (6.1), is also irreducible.

If \({\mathcal {V}}(f)\) is smooth, then by [12, Lemma 2.4], \(f\) and \(D_ef\) have no common real zeros. This shows that the set of polynomials \(g\in {\mathbb {R}}[x]_{d-1}\) for which \(D_ef\cdot g\) is strictly positive on \({\mathcal {V}}_{{\mathbb {R}}}(f)\) is nonempty, as it contains \(D_ef\) itself. This set is open and contained in \(\mathrm{Int}(f,e)\), so \(\mathrm{Int}(f,e)\) is full dimensional in \({\mathbb {R}}[x]_{d-1}\). Thus its algebraic boundary \(\overline{\partial \mathrm{Int}(f,e)}\) is a hypersurface in \({\mathbb {C}}[x]_{d-1}\). To finish the proof, we just need to show that this hypersurface is contained in (6.1), since the latter is irreducible.

To see this, suppose that \(g\in {\mathbb {R}}[x]_{d-1}\) lies in the boundary of \(\mathrm{Int}(f,e)\). By Theorem 2.1, there is some point \(p\in {\mathcal {V}}_{{\mathbb {R}}}(f)\) at which \(g\cdot D_ef\) vanishes. As \({\mathcal {V}}(f)\) is nonsingular, \(D_ef(p)\) cannot be zero, again by [12, Lemma 2.4]. Thus \(g(p)=0\). Moreover, the polynomial \(g\cdot D_ef - f\cdot D_eg\) is globally nonnegative and vanishes at \(p\), so its gradient also vanishes at \(p\). As \(f(p)=g(p)=0\), this means that \(D_ef(p)\cdot \nabla g (p) = D_eg(p)\cdot \nabla f(p)\), so the two gradients are linearly dependent at \(p\). Thus the pair \((g,p)\) belongs to \(X\) above.\(\square \)
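The key step is that the Wronskian-type polynomial \(g\cdot D_ef - f\cdot D_eg\) is nonnegative and vanishes at \(p\), so \(p\) is a global minimum and the gradient computation applies. The following SymPy sketch illustrates this on a toy example not taken from the text: \(f=x^2-y^2-z^2\), \(e=(1,0,0)\), and the boundary interlacer \(g=x+y\).

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = x**2 - y**2 - z**2   # hyperbolic with respect to e = (1, 0, 0); smooth projective conic
g = x + y                # a linear form on the boundary of Int(f, e)

D_e = lambda h: sp.diff(h, x)    # derivative in direction e = (1, 0, 0)
W = sp.expand(g * D_e(f) - f * D_e(g))
print(W)                 # x**2 + 2*x*y + y**2 + z**2 = (x + y)**2 + z**2 >= 0 everywhere

# p = (1, -1, 0) is a common real zero of f and g; W vanishes there, so grad W(p) = 0,
# forcing the gradients of f and g to be parallel at p (the rank condition in (6.1))
p = {x: 1, y: -1, z: 0}
grad = lambda h: sp.Matrix([sp.diff(h, v) for v in (x, y, z)])
print(grad(W).subs(p).T)                        # [0, 0, 0]
print(grad(f).subs(p).T, grad(g).subs(p).T)     # [2, 2, 0] and [1, 1, 0]: rank 1
```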

When \({\mathcal {V}}(f)\) has real singularities, computing the dimension of \(\mathrm{Int}(f,e)\) becomes more subtle. In particular, it depends on the type of singularity.

Example 6.4

Consider the two hyperbolic quartic polynomials

$$\begin{aligned} f_1 \!=\! 3 y^4 \!+\! x^4 \!+\! 5 x^3 z \!+\! 6 x^2 z^2 \!-\! 6 y^2 z^2 \;\;\text { and } \;\; f_2 \!= \!(x^2 \!+\! y^2 \!+\! 2 x z)(x^2 \!+\! y^2 \!+\! 3 x z), \end{aligned}$$

whose real varieties are shown in Fig. 6 in the plane \(\{z=1\}\). Both are hyperbolic with respect to the point \(e=[-1:0:1]\) and singular at \([0:0:1]\). Every polynomial interlacing either of these must pass through the point \([0:0:1]\). However, for a polynomial \(g\) to interlace \(f_2\), its partial derivative \(\partial g/\partial y\) must also vanish at \([0:0:1]\). Thus \(\mathrm{Int}(f_1,e)\) has codimension one in \({\mathbb {R}}[x,y,z]_3\) whereas \(\mathrm{Int}(f_2,e)\) has codimension two.
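Both singularities are easy to confirm symbolically; the SymPy sketch below also shows that the two conic factors of \(f_2\) pass through \([0:0:1]\) with the same tangent line \(\{x=0\}\).

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f1 = 3*y**4 + x**4 + 5*x**3*z + 6*x**2*z**2 - 6*y**2*z**2
q1, q2 = x**2 + y**2 + 2*x*z, x**2 + y**2 + 3*x*z
f2 = sp.expand(q1 * q2)

p = {x: 0, y: 0, z: 1}
grad = lambda h: [sp.diff(h, v).subs(p) for v in (x, y, z)]

for f in (f1, f2):
    # value and gradient both vanish at [0:0:1], so the point is singular
    print(f.subs(p), grad(f))            # 0 [0, 0, 0]

# the two branches of f2 at [0:0:1] share the tangent line {x = 0}
print(grad(q1), grad(q2))                # [2, 0, 0] and [3, 0, 0]
```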

Theorem 3.1 states that \(\overline{C(f,e)}\) is a linear slice of the cone \(\mathrm{Int}(f,e)\). By taking boundaries of these cones, we recover \({\mathcal {V}}(f)\) as a linear slice of the algebraic boundary of \(\mathrm{Int}(f,e)\).

Fig. 6

Two singular hyperbolic quartics with different dimensions of interlacers

Definition 6.5

We say that a polynomial \(f\in {\mathbb {R}}[x]\) is cylindrical if there exists an invertible linear change of coordinates \(T\) on \({\mathbb {R}}^{n}\) such that \(f(Tx)\in {\mathbb {R}}[x_1,\dots ,x_{n-1}]\).

Corollary 6.6

For non-cylindrical \(f\), the map \({\mathbb {R}}^{n}\rightarrow {\mathbb {R}}[x]_{d-1}\) given by \(a \mapsto D_a f\) is injective and maps the boundary of \(\overline{C(f,e)}\) into the boundary of \(\mathrm{Int}(f,e)\). If \(f\) is irreducible, this map identifies \(\mathcal {V}(f)\) with a component of the Zariski closure of the boundary of \(\mathrm{Int}(f,e)\) in the linear subspace spanned by \(\partial f/\partial x_1, \ldots , \partial f/\partial x_n\).

Proof

Since \(f\) is not cylindrical, the \(n\) partial derivatives \(\partial f/\partial x_j\) are linearly independent, so that \(a\mapsto D_af\) is injective. The claim now follows from taking the boundaries of the cones in (3.1). If \(f\) is irreducible, then the Zariski closure of the boundary of \(\overline{C(f,e)}\) is \(\mathcal {V}(f)\).\(\square \)
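The dichotomy used in this proof is easy to test in examples: a cylindrical polynomial has linearly dependent partial derivatives, so \(a\mapsto D_af\) has a kernel, while a non-cylindrical one does not. A short SymPy sketch with two illustrative polynomials, neither taken from the text:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')

def partials_rank(f):
    """Dimension of the span of the partial derivatives of f inside R[x, y, z]_{d-1}."""
    grads = [sp.Poly(sp.diff(f, v), x, y, z).as_dict() for v in (x, y, z)]
    monoms = sorted({m for g in grads for m in g})
    M = sp.Matrix([[g.get(m, 0) for m in monoms] for g in grads])
    return M.rank()

# cylindrical in R[x, y, z] (does not involve z): partials are dependent, D_a f = 0 for a = (0, 0, 1)
print(partials_rank(x**2 - y**2))         # 2
# non-cylindrical: partials are independent, so a -> D_a f is injective
print(partials_rank(x**2 - y**2 - z**2))  # 3
```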

Example 6.7

We take the cubic form \(f(x,y,z)= (x - y) (x + y) (x + 2 y) - x z^2\), which is hyperbolic with respect to the point \([1:0:0]\). Using the computer algebra system Macaulay 2, we can calculate the minimal polynomial in \({\mathbb {Q}}[c_{11}, c_{12}, c_{13}, c_{22}, c_{23}, c_{33}]\) that vanishes on the boundary of the cone \(\mathrm{Int}(f,e)\). Note that any conic

$$\begin{aligned} q = c_{11}x^2+c_{12}xy+c_{13}xz+c_{22}y^2+c_{23}yz+c_{33}z^2 \end{aligned}$$

on the boundary of \(\mathrm{Int}(f,e)\) must have a singular intersection point with \({\mathcal {V}}(f)\). Saturating with the ideal \((x,y,z)\) and eliminating the variables \(x,y,z\) from the ideal

$$\begin{aligned} (f,q ) + \mathrm{minors}_2(\mathrm{Jacobian}(f, q)) \end{aligned}$$

gives an irreducible polynomial of degree twelve in the six coefficients of \(q\). This hypersurface is the algebraic boundary of \(\mathrm{Int}(f,e)\). When we restrict to the three-dimensional subspace given by \(q = a \frac{\partial f}{\partial x}+b \frac{\partial f}{\partial y}+c \frac{\partial f}{\partial z}\), this polynomial of degree twelve factors as

$$\begin{aligned}&a \!\cdot \!f(a,b,c)\!\cdot \! (961 a^8 \!+\! 5952 a^7 b \!+\! 11076 a^6 b^2 \!-\! 3416 a^5 b^3 \!-\! 34770 a^4 b^4 \!-\! 31344 a^3 b^5 \\&\quad + 14884 a^2 b^6 + 34632 a b^7 + 13689 b^8 - 1896 a^6 c^2 - 4440 a^5 b c^2+ 6984 a^4 b^2 c^2 \\&\quad + 25728 a^3 b^3 c^2 + 15960 a^2 b^4 c^2 - 7560 a b^5 c^2 - 7560 b^6 c^2 + 1074 a^4 c^4 \\&\quad - 1680 a^3 b c^4- 7116 a^2 b^2 c^4 - 2376 a b^3 c^4 + 2106 b^4 c^4 + 16 a^2 c^6\\&\quad + 936 a b c^6 - 27 c^8). \end{aligned}$$

One might hope that \(\mathrm{Int}(f,e)\) is also a hyperbolicity cone of some hyperbolic polynomial, but we see that this is not the case. Restricting to \(c=0\) shows that the polynomial above, unlike \(f\), is not hyperbolic with respect to \([1:0:0]\): the factors \(a\) and \(f(a,b,0)\) are real-rooted in \(a\), but the degree-eight factor is not.
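Both claims can be probed numerically, independently of the Macaulay 2 computation: along the direction \([1:0:0]\) the cubic \(f\) is real-rooted, while the degree-eight factor above, restricted to \(c=0\) and \(b=1\), has non-real roots as a polynomial in \(a\). A NumPy sketch:

```python
import numpy as np

rng = np.random.default_rng(1)

# f = (x - y)(x + y)(x + 2y) - x z^2 as a cubic in x: coefficients [1, 2y, -(y^2 + z^2), -2y^3]
max_imag = 0.0
for _ in range(500):
    y, z = rng.normal(size=2)
    roots = np.roots([1.0, 2*y, -(y**2 + z**2), -2*y**3])
    max_imag = max(max_imag, float(np.max(np.abs(roots.imag))))
print("f along [1:0:0]: largest |imaginary part| of a root =", max_imag)   # numerically ~ 0

# the degree-eight factor of the boundary polynomial at c = 0, b = 1, read off from the display above
octic = [961, 5952, 11076, -3416, -34770, -31344, 14884, 34632, 13689]
roots = np.roots(octic)
print(roots[np.abs(roots.imag) > 0.1])   # genuinely complex roots: this factor is not real-rooted
```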