1 Introduction

The concept of polynomial stability arises naturally in a variety of disciplines, such as Analysis, Electrical Engineering, and Control Theory (Borcea et al. 2009; Wagner 2011; Basu and Fettweis 1987; Kummert 2002; Doyle 1982; Gurvits 2008; Xu et al. 2008; Scheicher 2013; Ghamgui et al. 2013; Li et al. 2013). In this paper, we discuss two-variable polynomial stability with respect to the open unit bidisk

$$\begin{aligned} {\mathbb {D}}^2= \{(z_1, z_2) \in {\mathbb {C}}^2:|z_1|<1,\ |z_2|<1\}. \end{aligned}$$

A bivariate polynomial will be called stable if it has no zeros in \(\mathbb {D}^2\), and strongly stable (footnote 1) if it has no zeros in the closure of the unit bidisk. The bidegree of \(p\in \mathbb {C}[z_1, z_2]\) is the pair \(\deg p=(\deg _1p, \deg _2p)\) of its partial degrees in each variable. The reverse of \(p\) is defined as \(\overleftarrow{p}(z)=z^{\deg p}\bar{p}(1/z)\), where for \(z=(z_1,z_2)\) and \(n=(n_1,n_2)\) we set \(z^n=z_1^{n_1}z_2^{n_2}\), \(\bar{p}(z)=\overline{p(\bar{z})}\), \(\bar{z}=(\bar{z}_1, \bar{z}_2)\), and \(1/z=(1/z_1, 1/z_2)\). A polynomial is self-reversive if it agrees with its reverse. A stable polynomial \(p\) is scattering Schur (Basu and Fettweis 1987) if \(p\) and \(\overleftarrow{p}\) are coprime, i.e., have no common factors.

For every stable \(p\in \mathbb {C}[z_1, z_2]\), with \(p(0, 0)=1\), we construct a representation

$$\begin{aligned} p(z_1,z_2)=\det (I_{|n|} - K Z_n ), \end{aligned}$$
(1.1)

with \(n=\deg p\) and \(K\) a contraction (i.e., \(\Vert K \Vert \le 1\) where \(\Vert \cdot \Vert \) stands for the largest singular value), where \(|n|=n_1+n_2\) and \(Z_n=z_1I_{n_1}\oplus z_2I_{n_2}\); see Theorem 2.1. Although we follow a slightly different path to achieve this result, we are essentially in the trail of Kummert (1989, 1990, 2002), who established (1.1) in the case of scattering Schur polynomials (Kummert 2002, Theorem 1). Note that, given a contractive \(K\), every polynomial defined by (1.1) is stable, so one gains practical means of designing stable bivariate polynomials.
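
Since a representation (1.1) with contractive \(K\) certifies stability, one can generate stable bivariate polynomials numerically from random contractions. The following sketch (ours, assuming NumPy; the bidegree and the sampling grid are illustrative choices, not part of Theorem 2.1) draws a random contraction \(K\) and spot-checks that \(p(z_1,z_2)=\det (I_{|n|}-KZ_n)\) does not vanish on a grid in \(\mathbb {D}^2\):

import numpy as np
rng = np.random.default_rng(0)
n1, n2 = 2, 3                                  # target bidegree n = (n1, n2)
# Any matrix divided by its largest singular value is a contraction.
G = rng.standard_normal((n1 + n2, n1 + n2))
K = G / np.linalg.svd(G, compute_uv=False)[0]
def p(z1, z2):
    # p(z1, z2) = det(I_{|n|} - K Z_n) with Z_n = z1 I_{n1} (+) z2 I_{n2}
    Z = np.diag([z1] * n1 + [z2] * n2)
    return np.linalg.det(np.eye(n1 + n2) - K @ Z)
# On the grid below |z1|, |z2| <= 0.95, so ||K Z_n|| <= 0.95 < 1, every
# eigenvalue of K Z_n has modulus < 1, and the determinant cannot vanish.
circle = np.exp(2j * np.pi * np.arange(16) / 16)
radii = np.linspace(0.1, 0.95, 5)
vals = [abs(p(r1 * w1, r2 * w2)) for r1 in radii for r2 in radii
        for w1 in circle for w2 in circle]
print(min(vals) > 0)                           # True: no zeros detected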

As an application of Theorem 2.1, we also establish in Theorem 3.2 a representation (1.1) for stable self-reversive polynomials, with \(K\) a unitary matrix; notice that stable self-reversive polynomials are never scattering Schur. This representation was previously established directly in (Geronimo et al. (2013), Section 10) in a somewhat different setting.

In the one-variable case, the situation is transparent: every \(p\in \mathbb {C}[z]\), with \(p(0)=1\), can be written in the form

$$\begin{aligned} p(z)=(1-a_1z)\cdots (1-a_nz)=\det \left( I_n- \left[ \begin{array}{c@{\quad }c@{\quad }c} a_1 &{} &{}\\ &{}\ddots &{}\\ &{} &{} a_n \end{array}\right] \left[ \begin{array}{c@{\quad }c@{\quad }c} z &{} &{} \\ &{}\ddots &{} \\ &{} &{} z \end{array}\right] \right) , \end{aligned}$$

where \(1/a_i\) are the zeros of \(p\) counting multiplicities. Thus \(p\) admits a representation (1.1), with \(K=\mathrm{diag }[a_1,\ldots ,a_n]\). This representation is minimal in size, \(n=\deg p\), and in norm, \(\Vert K\Vert =\max _{1\le i\le n}|a_i|\). Observe that \(p\) is stable (respectively, strongly stable) if and only if \(K=\mathrm{diag }[a_1,\ldots ,a_n]\) is contractive (respectively, strictly contractive); in particular, all zeros of \(p\) are on the unit circle if and only if (1.1) holds with \(K\) a diagonal unitary. We also note that \(K\) can be chosen to have all singular values, with perhaps one exception, equal to 1. By a result of Horn (1954), this choice is realized by an upper-triangular \(K\) with eigenvalues \(a_1, \ldots , a_n\) and singular values equal to \(1,\ldots , 1, |\prod _{i=1}^n a_i |\); see also (Horn and Johnson (1991), Theorem 3.6.6).
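
A quick symbolic check of the diagonal representation (a sketch of ours, using SymPy; the values \(a_i\) are sample choices):

import sympy as sp
z = sp.symbols('z')
a1, a2, a3 = sp.Rational(1, 2), sp.Rational(-1, 3), sp.Rational(1, 4)
p = sp.expand((1 - a1*z) * (1 - a2*z) * (1 - a3*z))
detform = sp.expand(sp.det(sp.eye(3) - sp.diag(a1, a2, a3) * z))  # det(I_n - K zI_n)
print(detform == p)                        # True
print(max(abs(a1), abs(a2), abs(a3)))      # ||K|| = max |a_i| = 1/2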

Our main result, Theorem 2.1, shows that in the two-variable case we are likewise able to find a representation (1.1) that is minimal both in the size and in the norm of \(K\). In particular, this means that for a two-variable stable polynomial \(p\) with \(p(0,0)=1\), we can find a representation (1.1) with \(n=\deg p\) and \(K\) a contraction.

In three or more variables, such a result does not hold; indeed, it follows from (Grinshpan et al. (2013), Example 5.1) that for \(5/6<r<1\) the strongly stable polynomial

$$\begin{aligned} q(z_1, z_2, z_3)=1+\frac{r}{5}\ z_1z_2z_3\bigg (z_1^2z_2^2 +z_2^2z_3^2+z_3^2z_1^2-2z_1z_2z_3^2-2z_1z_2^2z_3-2z_1^2z_2z_3 \bigg ), \end{aligned}$$

does not have a representation (1.1) with \(K\) a contractive \(9\times 9\) matrix, even though its degree is \((3,3,3)\). In general, the problem of finding a representation (1.1) with some (not necessarily contractive) matrix \(K\) and \(n=\deg p\) for a multivariable polynomial \(p\) is overdetermined (see Grinshpan et al. 2013). It is also unknown whether a stable polynomial \(p\) in more than two variables admits a representation of the form (1.1) with a contractive matrix \(K\) of any size (see Grinshpan et al. 2013 for a discussion). An alternative certificate for stability in any number of variables is given in Woerdeman (2013). The general problem of constructing linear determinantal representations of a polynomial is a well-known classical problem in algebraic geometry; see Kerner and Vinnikov (2012) and the references therein.

The analogues of stable self-reversive polynomials for the real line (as opposed to the unit circle in the complex plane) are real-zero polynomials [or—upon homogenization—hyperbolic polynomials first introduced by Gårding (1951, 1959)]. Determinantal representations of these polynomials were actively studied in recent years in relation to semidefinite programming; see Vinnikov (2012) for a survey. Using a stable factorization of univariate matrix polynomials that are positive semidefinite on the real line, we construct in Sect. 4 a positive Hermitian determinantal representation for two-variable real-zero polynomials, i.e., a determinantal representation of the form \(p(x_1,x_2)= p(0,0) \det (I+x_1A_1+x_2A_2)\), where \(A_1\) and \(A_2\) are Hermitian matrices of size equal to the total degree of \(p\). This reproves the main result of Helton and Vinnikov (2007), see also Hanselka (in preparation) (which amounts to the solution of the Lax conjecture for hyperbolic polynomials in three variables, see Lewis et al. 2005), in a somewhat weaker form [see also (Vinnikov 2012, Section 5) and Plaumann and Vinzant 2013]. Indeed, in Helton and Vinnikov (2007) it was proven that \(A_1\) and \(A_2\) can be chosen to be real symmetric. The advantage of the approach here is that the proof uses factorizations of matrix polynomials (unlike the algebro-geometric techniques used in Helton and Vinnikov (2007)), making it especially suitable for computations.

2 Norm-constrained determinantal representations

Given a non-constant bivariate polynomial \(p\), its stability radius is defined as

$$\begin{aligned} s(p):= \max \Big \{ r>0:p(z)\ne 0,\ z\in r{\mathbb {D}}^2\Big \}. \end{aligned}$$

Thus \(p\) is stable if and only if \(s(p)\ge 1\), and strongly stable if and only if \(s(p)>1\).

Theorem 2.1

Let \(p(z_1, z_2)\), with \(p(0, 0)=1\), be a non-constant bivariate polynomial. Then \(p\) admits a representation (1.1) with \(n = \mathrm{deg}\, p\) and \(\Vert K\Vert = s(p)^{-1}\).

Before we delve into the bivariate case, let us consider an alternative way to obtain (1.1) in the univariate case that does not require us to compute the roots of \(p\). Let \(p(z) = p_0 + \cdots + p_n z^n\) be a strongly stable polynomial. Then classical matrix theory says that \(p\) is the characteristic polynomial of the associated companion matrix. Writing this in a form especially useful for our purposes, we have

$$\begin{aligned} p(z) = p_0 \det (I_n - z C_p), \end{aligned}$$

where

$$\begin{aligned} C_p = \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} - \frac{p_1}{p_0} &{} 1 &{} &{} 0 \\ \vdots &{} &{} \ddots &{} \\ -\frac{p_{n-1}}{p_0} &{} 0 &{} &{} 1\\ - \frac{p_{n}}{p_0} &{} 0 &{} \cdots &{} 0 \end{array}\right] . \end{aligned}$$
(2.1)

As \(p\) is strongly stable, all the eigenvalues of \(C_p\) lie in \({\mathbb {D}}\). Thus \(C_p\) is similar to a strict contraction. For our purposes, it will suffice to find a similarity to a (not necessarily strict) contraction. To this end, we proceed by introducing the Bezoutian

$$\begin{aligned} Q = AA^* - B^* B, \end{aligned}$$

where

$$\begin{aligned} A = \left[ \begin{array}{c@{\quad }c@{\quad }c} p_0 &{} &{} \\ \vdots &{} \ddots &{} \\ p_{n-1} &{} \cdots &{} p_0 \end{array}\right] , B = \left[ \begin{array}{c@{\quad }c@{\quad }c} p_{n} &{} \cdots &{} p_{1} \\ &{} \ddots &{} \vdots \\ &{} &{} p_{n} \end{array}\right] . \end{aligned}$$

Here \(M^*\) denotes the complex conjugate transpose of the matrix \(M\).

The Schur–Cohn criterion [see, e.g., (Lancaster and Tismenetsky 1985, Section 13.5)] tells us that \(p\) is strongly stable if and only if \(Q>0\). If we now factor \(Q= PP^*\), with \(P\) a square (and thus automatically invertible) matrix, then \(K=P^{-1}C_pP\) is a contraction (Woerdeman 2013). To see this, one shows that \(\left[ \begin{array}{c@{\quad }c} Q^{-1} &{}C_p^*Q^{-1}\\ Q^{-1} C_p &{} Q^{-1}\end{array}\right] \) and consequently \(P(I-K^*K)P^*\) are positive semi-definite. Since the range of \(K^*\) is contained in the range of \(P^*\), it follows that \(\Vert K\Vert \le 1\). We now have the desired representation \(p(z)=p_0 \det (I_n - z K )\).
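
This univariate construction is easy to carry out numerically. A minimal sketch (ours, assuming NumPy; the sample polynomial \(p(z)=1+0.9z+0.2z^2\), with zeros \(-2\) and \(-2.5\), is our own choice):

import numpy as np
p = np.array([1.0, 0.9, 0.2])        # p_0, ..., p_n; strongly stable
n = len(p) - 1
C = np.zeros((n, n))                 # companion matrix C_p as in (2.1)
C[:, 0] = -p[1:] / p[0]
C[:-1, 1:] = np.eye(n - 1)
A = np.zeros((n, n)); B = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1):
        A[i, j] = p[i - j]           # lower triangular Toeplitz, first column p_0, ..., p_{n-1}
        B[j, i] = p[n - i + j]       # upper triangular Toeplitz, first row p_n, ..., p_1
Q = A @ A.T - B.T @ B                # Bezoutian; positive definite by Schur-Cohn
P = np.linalg.cholesky(Q)            # Q = P P^*
K = np.linalg.solve(P, C @ P)        # K = P^{-1} C_p P
print(np.linalg.norm(K, 2) <= 1 + 1e-12)                     # True: K is a contraction
z = 0.3 + 0.4j
print(np.isclose(p[0] * np.linalg.det(np.eye(n) - z * K),
                 np.polyval(p[::-1], z)))                    # p(z) = p_0 det(I - zK)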

This alternative derivation of a representation (1.1) in the univariate case provides the basis for the construction of (1.1) in the bivariate case, where now the coefficients \(p_i\) and the matrices \(C_p\), \(A\), \(B\), \(Q\), and \(P\) will depend on one of the variables.

We will need the following proposition.

Proposition 2.2

Every strongly stable polynomial is scattering Schur.

Proof

We will prove the statement here for bivariate polynomials, but it can obviously be extended to any number of variables.

Suppose \(p\) is a strongly stable bivariate polynomial and is not scattering Schur, and let \(f\) be a nontrivial common factor of \(p\) and \(\overleftarrow{p}\). Then \(f\) depends on at least one of the two variables \(z_1\) and \(z_2\). Assume for definiteness that \(f\) depends on \(z_1\); then it is certainly possible to fix \(z_2=\lambda \) on the unit circle \(\mathbb {T}\) so that \(q(z_1)=f(z_1,\lambda )\) is a nontrivial polynomial in \(z_1\). Since \(q\) divides \(p(z_1,\lambda )\), \(q\) has no zeros in \(\overline{\mathbb {D}}\). Since \(q\) divides \(\overleftarrow{p}(z_1,\lambda )=\lambda ^{n_2} z_1^{n_1} \overline{p(1/{\bar{z}_1},\lambda )}\), \(q\) also has no zeros in \(\mathbb {C}\setminus \overline{\mathbb {D}}\), a contradiction to the Fundamental Theorem of Algebra.

Thus, \(p\) is scattering Schur. \(\square \)

Notice that an alternative proof of Proposition 2.2 can be given as follows: one can argue that every irreducible common factor of \(p\) and \(\overleftarrow{p}\) would be self-reversive. By Proposition 3.1 (iii), it would have a zero on \(\mathbb {T}^2\) (\(\mathbb {T}^d\) in the general case), which contradicts the assumption of strong stability.

Before proceeding to the proof of Theorem 2.1, we make two further comments. First, we view polynomials of bidegree \((n_1, n_2)\) or less with constant term \(1\) as points in the coefficient space \(\mathbb {C}^N\), where \(N=(n_1+1)(n_2+1)-1\). Notice that if a sequence of polynomials in \(\mathbb {C}^N\) converges to a polynomial of bidegree \((n_1, n_2)\), then the polynomials in the sequence will eventually have the same bidegree. Notice also that strongly stable polynomials form an open set in \(\mathbb {C}^N\), and that every stable polynomial \(p\) is a limit of strongly stable dilations \(p_r(z_1,z_2):=p(rz_1, rz_2)\) as \(r \uparrow 1\).

We will also make use of 1D system realization theory. A univariate matrix-valued rational function \(f\) is said to have a (finite-dimensional) transfer-function realization if

$$\begin{aligned} f(z) = D + Cz (I-Az)^{-1}B \end{aligned}$$
(2.2)

for some complex matrix \(\left[ \begin{array}{c@{\quad }c} A &{}B \\ C &{} D \end{array}\right] \). A realization (2.2) of \(f\) is called minimal if the block \(A\) is of minimal possible size. Every rational matrix-valued function \(f\) which is analytic and contractive on \(\mathbb {D}\) has a realization (Kalman et al. 1969); moreover, the system matrix \(\left[ \begin{array}{c@{\quad }c} A &{} B\\ C &{} D \end{array}\right] \) of a minimal realization of \(f\) can be chosen to be contractive (Arov 1979).

Proof of Theorem 2.1

It is enough to establish the existence of a representation \(\det (I_{|n|}-KZ_n)\), with \(K\) a contraction, for a stable polynomial of degree \(n\) with constant term \(1\). Indeed, given a bivariate polynomial \(p\) with \(p(0,0)=1\), \(q(z)=p(s(p)\,z)\) is stable, and we get \(q(z)=\det (I_{|n|}-Z_nK_q)\), \(\Vert K_q\Vert \le 1\), giving us \(p(z)=\det (I_{|n|}-Z_nK)\) where we set \(K=K_q/s(p)\) so that \(\Vert K\Vert \le s(p)^{-1}\). Since necessarily \(\Vert K\Vert \ge s(p)^{-1}\), the proof is complete.

Next, if we can show the existence of a representation \(\det (I_{|n|}-KZ_n)\), with \(K\) a contraction, for a dense subset of stable polynomials of degree \(n\) with constant term \(1\), then we are done. Indeed, if \(p^{(j)} = \det (I_{|n|}-K^{(j)} Z_n)\), with \(K^{(j)}\) a contraction, and \(p^{(j)} \rightarrow p\), then \(p=\det (I_{|n|} - K Z_n )\), where \(K\) is a limit point of the sequence \(\{ K^{(j)} :j \in {\mathbb {N}} \}\) (which exists as the contractions in \({\mathbb {C}}^{|n|\times |n|}\) form a compact set). Thus, we are allowed to make some generic assumptions on \(p\). For starters,

we may assume that \(p\) is strongly stable; by Proposition 2.2, this implies that \(p\) is scattering Schur, hence in particular \(p\) and \(\overleftarrow{p}\) have only finitely many common roots (see Geronimo and Woerdeman 2004, 2006). In addition, we will assume that these common roots are simple, do not have the same \(z_1\) coordinate, and do not occur with \(z_1=0\) (footnote 2). Finally, we assume that, when we expand \(p\) in the powers of \(z_2\),

$$\begin{aligned} p(z_1, z_2) = p_0( z_1 ) + \cdots + p_{n_2}( z_1) z_2^{n_2}, \end{aligned}$$
(2.3)

the coefficient \(p_{n_2}\) is of degree \(n_1\) and is coprime with \(p_0\).

We now start the proof of the existence of a representation (1.1) with \(n=\deg p\) and \(\Vert K \Vert \le 1\) for an irreducible strongly stable polynomial \(p\) as above. Using the expansion (2.3), we introduce the companion matrix

$$\begin{aligned} \mathtt C (z_1) = \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} - \frac{p_1(z_1)}{p_0(z_1)} &{} 1 &{} &{} 0 \\ \vdots &{} &{} \ddots &{} \\ -\frac{p_{n_2-1}(z_1)}{p_0(z_1)} &{} 0 &{} &{} 1\\ - \frac{p_{n_2}(z_1)}{p_0(z_1)} &{} 0 &{} \cdots &{} 0 \end{array}\right] , \end{aligned}$$
(2.4)

and the triangular Toeplitz matrices

$$\begin{aligned} \mathtt A(z_1) = \left[ \begin{array}{c@{\quad }c@{\quad }c} p_0 (z_1)&{} &{} \\ \vdots &{} \ddots &{} \\ p_{n_2-1} (z_1) &{} \cdots &{} p_0(z_1) \end{array}\right] , \end{aligned}$$
$$\begin{aligned} \mathtt B(z_1) = \left[ \begin{array}{c@{\quad }c@{\quad }c} p_{n_2} (z_1) &{} \cdots &{} p_{1} (z_1) \\ &{} \ddots &{} \vdots \\ &{} &{} p_{n_2} (z_1) \end{array}\right] . \end{aligned}$$

Form the Bezoutian \(Q(z_1):= \mathtt A(z_1) \mathtt A(1/\overline{z}_1)^* - \mathtt B(1/\overline{z}_1)^* \mathtt B(z_1)\). Since the polynomial \(p(z_1, \cdot )\) is strongly stable for every \(z_1\in \mathbb {T}\) (\(:=\{ z\in {\mathbb {C}}: \ |z|=1 \}\)), we have that \(Q(z_1)\) is positive definite for every \(z_1\in \mathbb {T}\) (Lancaster and Tismenetsky 1985, Section 13.5). Then there exists an \(n_2\times n_2\) matrix-valued polynomial \(P(z_1) = P_0 + \cdots + P_{n_1} z_1^{n_1}\) such that the factorization \(Q(z_1) = P(z_1)P(z_1)^*\), \(z_1 \in {\mathbb {T}}\), holds and \(P(z_1)\) is invertible for every \(z_1 \in \overline{\mathbb {D}}\) (Rosenblatt 1958; Dritschel and Rovnyak 2010). Since \(p_0(z_1)=p(z_1,0)\ne 0\) for every \(z_1\in \overline{\mathbb {D}}\), the rational matrix-valued function

$$\begin{aligned} M(z_1):= P(z_1)^{-1}\mathtt C(z_1) P(z_1) \end{aligned}$$

is analytic on \(\overline{\mathbb {D}}\). In fact, \(M\) is also contractive there (Woerdeman 2013). To see this, one shows that \(\left[ \begin{array}{c@{\quad }c} Q(z_1)^{-1} &{} \mathtt C(z_1)^*Q(z_1)^{-1}\\ Q(z_1)^{-1}\mathtt C(z_1) &{} Q(z_1)^{-1}\end{array}\right] \) and consequently \(P(z_1)(I-M(z_1)^*M(z_1))P(z_1)^*\) are positive semi-definite for \(z_1\in \mathbb T\). Since the range of \(M(z_1)^*\) is contained in the range of \(P(z_1)^*\), it follows that \(\Vert M(z_1)\Vert \le 1\) for every \(z_1\in \mathbb {T}\), and by the maximum principle, for every \(z_1\in \overline{\mathbb {D}}\).

Claim: The only poles of \(M\) are the zeros of \(p_0\).

To prove the claim, we will first show that a zero \(z_1=a\) of \(P\) has geometric multiplicity 1 and the corresponding vector in the left kernel is a left eigenvector of \(\mathtt C(a)\). Indeed, first observe that by analytic continuation, \(Q(z_1)=P(z_1)P(1/\bar{z}_1)^*\), where the analyticity domains of the rational matrix-valued functions on the two sides of the equality coincide. Then the zeros of \(P\) are exactly the zeros of \(Q\) that lie in \({\mathbb {C}} \setminus \overline{\mathbb {D}}\). Let \(z_1\ne 0\). Observe that

$$\begin{aligned} R(z_1) := \left[ \begin{array}{c@{\quad }c} \mathtt A(z_1) &{}\mathtt B(1/\bar{z_1})^* \\ \mathtt B(z_1) &{} \mathtt A (1/\bar{z_1})^* \end{array}\right] \end{aligned}$$

is the resultant for the polynomials \(p(z_1, \cdot )\) and \(g(\cdot )\), where

$$\begin{aligned} g(z_2) =\overleftarrow{p} (z_1, z_2)/z_1^{n_1}= \overline{p_{n_1,n_2}}/z_1^{n_1} + \cdots + \overline{p_{00}} z_2^{n_2}. \end{aligned}$$

Thus \(\det R(z_1) = 0 \) if and only if \(p(z_1,z_2) = 0 =\overleftarrow{p} (z_1, z_2)\) for some \(z_2\), and as we noticed in footnote 2, the assumptions on the common roots of \(p\) and \(\overleftarrow{p}\) imply that the zeros of \(R\) have multiplicity 1. Moreover, when \(p(z_1,z_2) = 0 =\overleftarrow{p} (z_1, z_2)\) we have that the row vector \(v_{2n_2-1} (z_2)\) is in the left kernel of \(R(z_1)\); here

$$\begin{aligned} v_k (z) = \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 1&z&\cdots&z^k \end{array}\right] . \end{aligned}$$

When \(\overline{p_0} (1/z_1) \ne 0\), we have that \(\mathtt A (1/\bar{z_1})^*\) is invertible, and

$$\begin{aligned} \left[ \begin{array}{c@{\quad }c} v_{n_2-1} (z_2)&z_2^{n_2} v_{n_2-1} (z_2 ) \end{array}\right] \left[ \begin{array}{c@{\quad }c} \mathtt A(z_1) &{} \mathtt B(1/\bar{z_1})^* \\ \mathtt B(z_1) &{} \mathtt A (1/\bar{z_1})^* \end{array}\right] = 0 \end{aligned}$$

implies

$$\begin{aligned} v_{n_2-1} (z_2) Q(z_1) = 0. \end{aligned}$$
(2.5)

Notice that we used that \(\mathtt A(w_1)^*\) and \(\mathtt B(w_2)\) commute as they are both upper triangular Toeplitz matrices. Thus the left kernel of \(P(a)\) is spanned by \(v_{n_2-1}(z_2)\), where \((a,z_2)\) is a common zero of \(p\) and \(\overleftarrow{p}\) (footnote 3). It is now straightforward to check that \(v_{n_2-1}(z_2)\) is a left eigenvector of \(\mathtt C(a)\) corresponding to the eigenvalue \(1/z_2\).

To show that the only poles of \(M\) are the zeros of \(p_0\), we observe that the only other possible source of poles of \(M\) would be the zeros of \(P\). As the zero \(a\) of \(P\) has multiplicity 1, and is not a zero of \(p_0\), we obtain that

$$\begin{aligned}&P(z_1) = P_0 + P_1 (z_1-a) + \cdots +P_{n_1} (z_1-a)^{n_1},\\&P(z_1)^{-1} = S_{-1}/(z_1-a) + S_0 + S_1 (z_1-a) + \cdots , \end{aligned}$$

where \(\dim \mathrm{Ker} P_0 = 1\) and \(\mathrm{rank} S_{-1} = 1\) [see, e.g., (Bart et al. (1979), Chapter II)]. In addition, as \(P(z_1)\) and \(P(z_1)^{-1}\) multiply to \(I_{n_2}\), we have \(P_0 S_{-1} = 0 = S_{-1} P_0\). By the result of the previous paragraph, we must have that \(S_{-1}= w v_{n_2-1}(z_2)\), for some column vector \(w\). But then we have that

$$\begin{aligned} M(z_1) = P(z_1)^{-1} \mathtt C(z_1) P(z_1) = S_{-1} \mathtt C(a) P_0/(z_1-a) + G(z_1), \end{aligned}$$

where \(G(z_1)\) is analytic in a neighborhood of \(a\). Since

$$\begin{aligned} v_{n_2-1} (z_2) \mathtt C(a) P_0 = (1/z_2) v_{n_2-1} (z_2) P(a) = 0, \end{aligned}$$

we have \( S_{-1} \mathtt C(a) P_0 = 0\), and thus \(M(z_1)\) does not have a pole at \(a\). This proves the claim.

By the assumption that \(p_{n_2}\) is of degree \(n_1\) and is coprime with \(p_0\), the McMillan degree of \(\mathtt C\), and hence of \(M\), is \(n_1\). Therefore there exists a minimal contractive realization of \(M\) with \(A\) an \(n_1\times n_1\) matrix:

$$\begin{aligned} M(z_1) = D + Cz_1 (I_{n_1}-Az_1)^{-1} B \end{aligned}$$

[see (Bart et al. (1979), Section 4.2) or (Ball et al. (1990), Sections 4.1, 4.2) for the notion of the McMillan degree of a rational matrix-valued function and its equality to the size of a minimal realization of the function]. We have

$$\begin{aligned} p(z_1, z_2)&= p_0(z_1)\det (I_{n_2}-M(z_1)z_2)\\&\quad =\frac{p_0(z_1)}{\det (I_{n_1}-z_1A)} \det \left( I_{|n|}-\left[ \begin{array}{c@{\quad }c} A &{} B\\ C &{}D \end{array}\right] \left[ \begin{array}{c@{\quad }c} z_1I_{n_1}&{} 0\\ 0 &{} z_2I_{n_2} \end{array}\right] \right) . \end{aligned}$$

As \(\det (I_{n_1}-z_1A)\) is the denominator of the coprime fraction representation of \(\det M\) (Ball et al. 1990, Section 4.2), it follows that \(p_0(z_1)=\det (I_{n_1}-z_1A)\). This proves that

$$\begin{aligned} p(z_1, z_2)= \det \left( I_{|n|}-\left[ \begin{array}{c@{\quad }c} A &{} B\\ C &{} D\end{array}\right] \left[ \begin{array}{c@{\quad }c} z_1I_{n_1}&{} 0\\ 0 &{} z_2I_{n_2}\end{array}\right] \right) , \end{aligned}$$

and we are done. \(\square \)

Notice that the proof outlines a procedure to find a representation (1.1) of a stable polynomial \(p\) with \(n=\deg p\) and \(K\) a contraction. Let us try this out on two simple examples.

Example 2.3

Let \(p(z_1,z_2) = 1+az_1+bz_2\), where \(a+b<1\) and \(a, b>0\). Then \(p_0(z_1) = 1+az_1 \) and \(p_1(z_1) = b\). We have

$$\begin{aligned} M(z_1) =\mathtt {C}(z_1)=- \frac{b}{1+a z_1} = - b+\sqrt{ab}\ z_1(1+az_1)^{-1}\sqrt{ab} \end{aligned}$$

and obtain the representation \(p(z_1, z_2) = \det \left( I_2 - \left[ \begin{array}{c@{\quad }c} -a &{} \sqrt{ab} \\ \sqrt{ab} &{} -b \end{array}\right] \left[ \begin{array}{c@{\quad }c} z_1 &{} 0\\ 0 &{} z_2 \end{array}\right] \right) \) with a contraction \( \left[ \begin{array}{c@{\quad }c} -a &{} \sqrt{ab} \\ \sqrt{ab} &{} -b \end{array}\right] \).
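
This can be confirmed symbolically; a sketch of ours, using SymPy with symbolic \(a, b\):

import sympy as sp
a, b = sp.symbols('a b', positive=True)
z1, z2 = sp.symbols('z1 z2')
K = sp.Matrix([[-a, sp.sqrt(a*b)], [sp.sqrt(a*b), -b]])
Z = sp.diag(z1, z2)
print(sp.expand(sp.det(sp.eye(2) - K * Z)))   # expands to 1 + a*z1 + b*z2: the z1*z2 terms cancel
print(K.eigenvals())                          # {0: 1, -a - b: 1}, so ||K|| = a + b < 1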

Example 2.4

Let \(p(z_1,z_2) = 1+z_1 + z_2 + \frac{z_1^2}{4} + \frac{z_2^2}{4} + \frac{z_1 z_2}{2}\). Then we find that

$$\begin{aligned} Q(z_1) = \frac{1}{4} (1+z_1)(1+\frac{1}{z_1}) \left[ \begin{array}{c@{\quad }c} z_1 + 3 + \frac{1}{z_1} &{} z_1+2 \\ \frac{1}{z_1}+2 &{} z_1 + 3 + \frac{1}{z_1} \end{array}\right] , \end{aligned}$$

which gives us a factor

$$\begin{aligned} P(z_1) = \frac{1}{2} (1+z_1)\left[ \begin{array}{c@{\quad }c} 1 &{} 1+z_1 \\ -z_1 &{} 1+z_1 \end{array}\right] . \end{aligned}$$
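
One can verify this factorization numerically on the unit circle, where \(1/\bar{z}_1=z_1\) (a sketch of ours, NumPy):

import numpy as np
def Q(z):      # the Bezoutian displayed above
    return 0.25 * (1 + z) * (1 + 1/z) * np.array([[z + 3 + 1/z, z + 2],
                                                  [1/z + 2, z + 3 + 1/z]])
def P(z):      # the factor displayed above
    return 0.5 * (1 + z) * np.array([[1, 1 + z],
                                     [-z, 1 + z]])
for theta in np.linspace(0.1, 6.0, 7):
    z = np.exp(1j * theta)
    assert np.allclose(Q(z), P(z) @ P(z).conj().T)   # Q(z1) = P(z1) P(z1)^* on |z1| = 1
print("factorization verified on the unit circle")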

Next

$$\begin{aligned} M(z_1) =P(z_1)^{-1}\mathtt {C}(z_1)P(z_1) =\begin{pmatrix} \frac{z_1 + 1}{(z_1+2)^2} -1 &{} \frac{(z_1 + 1)^2}{(z_1+2)^2} \\ - \frac{(z_1 + 1)^2}{(z_1+2)^2} &{} 1 - \frac{3z_1 + 5}{(z_1+2)^2} \end{pmatrix}. \end{aligned}$$

A minimal contractive realization of \(M(z_1)=D+Cz_1(I-Az_1)^{-1} B\) is given by

$$\begin{aligned} \left[ \begin{array}{c@{\quad }c} A &{} B \\ C &{} D \end{array}\right] = \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} -\frac{3}{4} &{} \frac{1}{4} &{} - \frac{1}{4} &{} -\frac{1}{4} \\ -\frac{1}{4} &{} -\frac{1}{4} &{} \frac{1}{4} &{} - \frac{3}{4} \\ -\frac{1}{4} &{} -\frac{1}{4} &{} -\frac{3}{4} &{} \frac{1}{4} \\ \frac{1}{4} &{} -\frac{3}{4} &{} -\frac{1}{4} &{} -\frac{1}{4} \end{array}\right] =:K. \end{aligned}$$

Now we find \(p(z_1,z_2) = \det ( I_4 - K Z_{(2,2)} )\).
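
A numerical confirmation (our sketch, NumPy); note that the singular values of \(K\) come out as \(1, 1, 1, 0\), so all but one of them equal \(1\):

import numpy as np
K = np.array([[-3.,  1., -1., -1.],
              [-1., -1.,  1., -3.],
              [-1., -1., -3.,  1.],
              [ 1., -3., -1., -1.]]) / 4.0
print(np.round(np.linalg.svd(K, compute_uv=False), 12))   # [1. 1. 1. 0.]
def p(z1, z2):
    return 1 + z1 + z2 + z1**2/4 + z2**2/4 + z1*z2/2
rng = np.random.default_rng(1)
for _ in range(5):
    z1, z2 = rng.standard_normal(2) + 1j * rng.standard_normal(2)
    Z = np.diag([z1, z1, z2, z2])
    assert np.isclose(np.linalg.det(np.eye(4) - K @ Z), p(z1, z2))
print("p(z1, z2) = det(I_4 - K Z_{(2,2)}) verified")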

Remark 2.5

It is an interesting question whether the procedure for constructing a representation (1.1) with \(n=\deg p\) and \(K\) a contraction outlined in the proof of Theorem 2.1 continues to work if some of the generic assumptions on the stable polynomial \(p\) are dropped. A necessary condition for the procedure to work is that \(p\) is a scattering Schur polynomial, since if \(p\) and \(\overleftarrow{p}\) are not relatively prime then the Bezoutian \(Q\) is singular.

Remark 2.6

Theorem 2.1 implies that if \(p\) is a strongly stable bivariate polynomial with \(p(0,0)=1\) then it admits a determinantal representation (1.1) with \(n=\deg p\) and \(K\) a strict contraction, \(\Vert K\Vert <1\). Notice that in order to construct such a determinantal representation we cannot apply the procedure outlined in the proof of Theorem 2.1 to the polynomial \(q(z)=p(s(p)\,z)\) since \(q\) is stable rather than strongly stable (so the generic assumptions in the proof of Theorem 2.1 are never fulfilled for \(q\)); rather, we should apply the procedure to the polynomial \(\tilde{q}(z) = p(r\,z)\) for some \(r\) with \(1 < r < s(p)\).

Kummert (1989, 1990, 2002) proved Theorem 2.1 for bivariate scattering Schur polynomials. He first constructed for such a polynomial \(p\) a 2D Givone–Roesser system realization (Givone and Roesser 1972) of \(f=\overleftarrow{p}/p\):

$$\begin{aligned} f=D+CZ_n(I_{|n|}-AZ_n)^{-1}B, \end{aligned}$$
(2.6)

with the complex \((|n|+1)\times (|n|+1)\) matrix \(\left[ \begin{array}{c@{\quad }c} A &{} B\\ C &{} D \end{array}\right] \) being unitary, and then wrote it as

$$\begin{aligned} \frac{\overleftarrow{p}}{p}=\frac{\det \left[ \begin{array}{c@{\quad }c} I_{|n|}-AZ_n &{} B\\ -CZ_n &{}D \end{array}\right] }{\det (I_{|n|}-AZ_n)}. \end{aligned}$$

Since the fraction representation on the left-hand side is coprime and the bidegree of the polynomial in the denominator of the fraction on the right-hand side is less than or equal to \(n=(n_1,n_2)\) (in the componentwise sense), the denominators must be equal:

$$\begin{aligned} p=\det (I_{|n|}-AZ_n). \end{aligned}$$

Since \(A\) is a contraction, Theorem 2.1 follows for this case, with \(K=A\). We also remark that in this construction \(K\) has all singular values, except one, equal to 1.

Let us note that the existence of 2D Givone–Roesser unitary system realizations was proved by Agler (1990) for a much more general class of contractive analytic operator-valued functions on the bidisk \(\mathbb {D}^2\); however, the (unitary) system matrix in such a realization has, in general, infinite-dimensional Hilbert-space operator blocks (in particular, \(A\) is a contraction on an infinite-dimensional Hilbert space). Kummert’s result in the special case of scalar rational inner functions is sharper in the sense that it provides a concrete finite-dimensional unitary realization of the smallest possible size. We also remark that an alternative construction of a finite-dimensional Givone–Roesser unitary system realization for matrix-valued rational inner functions in two variables is given in Ball et al. (2005).

The general case of Theorem 2.1 can also be deduced from the special case of scattering Schur polynomials: by Proposition 2.2, the latter form a dense subset of the set of all stable bivariate polynomials of bidegree \(n=(n_1,n_2)\) with constant term \(1\), and the approximation argument from our proof of Theorem 2.1 applies.

Notice that at least when \(p\) is an irreducible scattering Schur polynomial, all the representations (1.1) of \(p\) with \(n=\deg p\) and \(K\) a contraction having all but one singular value equal to \(1\) arise from a Givone–Roesser realization (2.6) of the rational inner function \(f=\overleftarrow{p}/p\) with a unitary system matrix \(\left[ \begin{array}{c@{\quad }c} A &{} B\\ C &{} D \end{array}\right] \) where \(K=A\). Indeed, let \(\left[ \begin{array}{c@{\quad }c} A &{} B\\ C &{} D \end{array}\right] \), \(A:=K\), be the Halmos dilation of \(K\), i.e., an embedding of \(K\) into a \((|n|+1)\times (|n|+1)\) unitary matrix, and define \(f\) by (2.6). Then \(f\) is a rational inner function, so necessarily \(f=\overleftarrow{q}/q\) for some stable polynomial \(q\) which is scattering Schur. We see that \(q\) divides \(p\); since \(A\) has all but one singular value equal to \(1\), \(B\) and \(C\) are nonzero, implying that \(|D|<1\), so that \(f\) is not a constant of absolute value \(1\) and \(q\) is not a constant polynomial; therefore \(q=p\) as asserted.

With the existence of contractive determinantal representations \(p(z_1,z_2) = \det (I_{|n|} - KZ_n)\), \(\Vert K\Vert \le 1\), of a stable bivariate polynomial \(p\) of degree \(n=(n_1,n_2)\) with \(p(0,0)=1\) established, the next natural question is classifying such determinantal representations either up to structured similarity: \(K \mapsto \left[ \begin{matrix} T_1 &{} 0 \\ 0 &{} T_2 \end{matrix}\right] K \left[ \begin{matrix} T_1 &{} 0 \\ 0 &{} T_2 \end{matrix}\right] ^{-1}\), for \(T_1\) and \(T_2\) complex invertible matrices of size \(n_1\) and \(n_2\), respectively, or up to structured unitary equivalence \(K \mapsto \left[ \begin{matrix} U_1 &{} 0 \\ 0 &{} U_2 \end{matrix}\right] K \left[ \begin{matrix} U_1 &{} 0 \\ 0 &{} U_2 \end{matrix}\right] ^{-1}\), for \(U_1\) and \(U_2\) unitary matrices of size \(n_1\) and \(n_2\), respectively. Notice that the proof of Theorem 2.1 that we presented provides a single contractive determinantal representation up to structured similarity (since all minimal contractive realizations of \(M(z_1)\) are similar), but—in general—many different contractive determinantal representations up to structured unitary equivalence (since in general \(M(z_1)\) will admit many different minimal contractive realizations up to unitary equivalence, see Arov 1979).

Proposition 2.7

Assume that \(p\) is an irreducible strongly stable bivariate polynomial of degree \(n=(n_1,n_2)\) having no critical points on the zero set \({\mathcal Z}_p\) of \(p\), and such that upon the expansion of \(p\) in the powers of \(z_1\) the coefficients of both \(z_1^{n_1}\) and of \(1\) are polynomials in \(z_2\) of degree \(n_2\) with distinct roots, and similarly for the expansion of \(p\) in the powers of \(z_2\). Then \(p\) admits a \(g\)-dimensional family of strictly contractive determinantal representations \(p(z_1,z_2) = \det (I_{|n|} - KZ_n)\), \(\Vert K\Vert < 1\), no two of which are structured similar, where \(g=n_1n_2-n_1-n_2+1\).

Unlike the rest of this paper, the proof uses some theory of algebraic curves and of their determinantal representations (for which we refer to Fulton 1969; Vinnikov 1989; Ball and Vinnikov 1996, 1999; Kerner and Vinnikov 2012).

Proof

We let \(\tilde{p}(z_1,z_2) = z_1^{n_1} z_2^{n_2} p(1/z_1,1/z_2)\) and let \({\tilde{\mathbf {p}}}(z_0,z_1,z_2) = z_0^{n_1+n_2} \tilde{p}(z_1/z_0,z_2/z_0)\) be the homogenization of \(\tilde{p}\). The assumptions on \(p\) imply that the zero set \({\mathcal Z}_{{{\tilde{\mathbf {p}}}}}\) of \({\tilde{\mathbf {p}}}\) in the complex projective plane—which is an irreducible projective algebraic curve—is actually smooth except for the two points at infinity on the curve, \([0 :1 :0]\) and \([0 :0 :1]\), which are ordinary multiple points of multiplicity \(n_2\) and \(n_1\), respectively. Furthermore, the genus of the desingularizing Riemann surface of \({\mathcal Z}_{{{\tilde{\mathbf {p}}}}}\) is exactly \(g\), and the \(n_2\) distinct preimages of the point \([0 :1 :0]\), respectively the \(n_1\) distinct preimages of the point \([0 :0 :1]\), are the distinct simple poles of the affine coordinate function \(z_1/z_0\), respectively \(z_2/z_0\).

Consider now a determinantal representation \({\tilde{\mathbf {p}}}(z_0,z_1,z_2) = c \, \det (z_0A_0+z_1A_1+z_2A_2)\), where \(c \in {\mathbb C}\), \(c \ne 0\), and \(A_0,A_1,A_2 \in {\mathbb C}^{(n_1+n_2) \times (n_1+n_2)}\). Such representations have a natural equivalence relation given by \((A_0,A_1,A_2) \mapsto (SA_0T,SA_1T,SA_2T)\) for \(S\), \(T\) complex invertible matrices of size \(n_1+n_2\). It is a basic fact that the kernel of a determinantal representation of the linear matrix pencil \(U(z_0,z_1,z_2) = z_0A_0+z_1A_1+z_2A_2\)—which is nonzero precisely for the points \([z_0 :z_1 :z_2]\) on the curve \({\mathcal Z}_{{{\tilde{\mathbf {p}}}}}\)—is one-dimensional, except possibly at the singular points \([0 :1 :0]\) and \([0 :0 :1]\), where the dimension of the kernel is less than or equal to the multiplicity. A determinantal representation is called maximal (or maximally generated) if the dimension of the kernel at each singular point equals the corresponding multiplicity. It is then the case that \(\ker U(0,0,1) \dotplus \ker U(0,1,0) = {\mathbb C}^{n_1+n_2}\) and \(\ker _\ell U(0,0,1) \dotplus \ker _\ell U(0,1,0) = {\mathbb C}^{1 \times (n_1+n_2)}\) (where \(\ker _\ell \) denotes the left kernel of a matrix) are direct sum decompositions into subspaces of dimensions \(n_1\) and \(n_2\) [see, e.g., (Ball and Vinnikov (1996), (2.24)–(2.25) and the proof of Theorem 3.3); we are using here the fact that \([0 :0 :1]\) and \([0 :1 :0]\) are the two intersection points of \({\mathcal Z}_{{\tilde{\mathbf {p}}}}\) with the line \(z_0=0\)]. Since \(U(0,0,1) = A_2\) and \(U(0,1,0) = A_1\), this implies that we can find invertible matrices \(S\), \(T\) such that \(S A_2 T = \left[ \begin{matrix} 0 &{} 0 \\ 0 &{} I_{n_2} \end{matrix} \right] \) and \(S A_1 T = \left[ \begin{matrix} I_{n_1} &{} 0 \\ 0 &{} 0 \end{matrix} \right] \). There is therefore a one-to-one correspondence between maximal determinantal representations of \({\tilde{\mathbf {p}}}\) up to equivalence, and determinantal representations of the form \({\tilde{\mathbf {p}}}(z_0,z_1,z_2) = \det (z_0 A_0 + Z_n)\), which is the same as \(p(z_1,z_2) = \det (I_{|n|} - AZ_n)\) (\(A=-A_0\)), up to structured similarity.

Now, maximal determinantal representations of \({\tilde{\mathbf {p}}}\) up to equivalence are classified by points in a certain dense open subset of the Jacobian variety of the desingularizing Riemann surface of \({\mathcal Z}_{{\tilde{\mathbf {p}}}}\). More explicitly, the formulae (5.3) in Ball and Vinnikov (1999) yield a matrix \(A(\zeta )\) which is analytic in \(\zeta \), \(\zeta \in O \subseteq {\mathbb C}^g\) a certain open dense subset, such that for any determinantal representation of the form \(p(z_1,z_2) = \det (I_{|n|} - Z_nA)\), \(A\) is structured similar to \(A(\zeta )\) for \(\zeta \in O\) which is uniquely determined modulo a lattice of rank \(2g\) in \({\mathbb C}^g\).

Let \(p(z_1,z_2) = \det (I_{|n|} - Z_nK^0)\), \(\Vert K^0\Vert <1\), be a strictly contractive determinantal representation that exists by Theorem 2.1. Then \(K^0 = \left[ \begin{matrix} T_1 &{} 0 \\ 0 &{} T_2 \end{matrix}\right] A(\zeta ^0) \left[ \begin{matrix} T_1 &{} 0 \\ 0 &{} T_2 \end{matrix}\right] ^{-1}\), for \(T_1\) and \(T_2\) complex invertible matrices of size \(n_1\) and \(n_2\), respectively, and \(\zeta ^0 \in O\). If we set \(K(\zeta ) = \left[ \begin{matrix} T_1 &{} 0 \\ 0 &{} T_2 \end{matrix}\right] A(\zeta ) \left[ \begin{matrix} T_1 &{} 0 \\ 0 &{} T_2 \end{matrix}\right] ^{-1}\), then \(\Vert K(\zeta )\Vert <1\) and \(K(\zeta )\) is not structured similar to \(K(\zeta ')\) for distinct \(\zeta ,\zeta '\) in a sufficiently small open neighborhood of \(\zeta ^0\). \(\square \)

3 The case of self-reversive polynomials

Given a stable polynomial \(p\), one has the factorization \(p=us,\) where \(u\) is a stable self-reversive polynomial and \(s\) is a scattering Schur polynomial (Basu and Fettweis 1987, Theorem 4). In the case where \(p\) is stable and self-reversive, the factor \(s\) is a constant. Our next theorem specializes the result of Theorem 2.1 to this case. We first give several equivalent conditions for a stable polynomial to be self-reversive; while we formulate and prove the next proposition for bivariate polynomials, it is clear that it extends to any number of variables.

Proposition 3.1

Let \(p\) be a stable bivariate polynomial of bidegree \(n=(n_1,n_2)\) with \(p(0,0)=1\); then the following statements are equivalent:

(i) \(p\) is self-reversive up to a unimodular constant;

(ii) the coefficient of \(z_1^{n_1}z_2^{n_2}\) in \(p\) is unimodular;

(iii) if \((z_1, z_2) \in {\mathbb T}^2\), the one-variable polynomial \(t\mapsto p(tz_1, tz_2)\) has all its zeros on \({\mathbb T}\).

Proof

(i)\(\Rightarrow \)(ii) is obvious, since for a bivariate polynomial \(p\) of bidegree \(n=(n_1,n_2)\), the constant term of \(\overleftarrow{p}\) equals the conjugate of the coefficient of \(z_1^{n_1}z_2^{n_2}\) in \(p\).

(ii)\(\Rightarrow \)(iii) Let \(p(z_1,z_2)=\sum _{j=0}^rp_j(z_1,z_2)\), \(p_0(z_1,z_2)=1\), be the expansion of \(p\) in homogeneous polynomials. Then

$$\begin{aligned} p_{(z_1,z_2)}(t):=p(tz_1,tz_2)=\sum _{j=0}^rp_j(z_1,z_2)t^j. \end{aligned}$$
(3.1)

If the coefficient \(p_{n_1,n_2}\) of \(z_1^{n_1}z_2^{n_2}\) is unimodular, then \(r=n_1+n_2\) and \(p_r(z_1,z_2)=p_{n_1,n_2}z_1^{n_1}z_2^{n_2}\). For \((z_1,z_2)\in \mathbb {T}^2\), we can write

$$\begin{aligned} p_{(z_1,z_2)}(t) = (1-a_1(z_1,z_2)t) \cdots (1-a_r(z_1,z_2)t), \end{aligned}$$

where \(1/a_1(z_1,z_2),\ldots ,1/a_r(z_1,z_2)\) are the roots of \(p_{(z_1,z_2)}\) counting multiplicities. Because of stability, \(|a_i(z_1,z_2)| \le 1\); but \(p_r(z_1,z_2) = (-1)^r a_1(z_1,z_2) \cdots a_r(z_1,z_2)\) is unimodular, hence \(|a_i(z_1,z_2)| = 1\).

(iii)\(\Rightarrow \)(i) Let \(p_{(z_1,z_2)}\) be as in (3.1). By the assumption, for every \((z_1,z_2)\in \mathbb {T}^2\) the polynomial \(p_{(z_1,z_2)}\) is self-reversive up to a unimodular constant, hence \(p_r(z_1,z_2)\) is either zero or unimodular. Since the polynomial \(p_r\) is nonzero, it is not identically zero on \(\mathbb {T}^2\) (e.g., by the uniqueness principle for bivariate analytic functions). By continuity, \(p_r(z_1,z_2)\) is unimodular, and thus \(\deg p_{(z_1,z_2)}=r\) for every \((z_1,z_2)\in \mathbb {T}^2\). It follows (e.g., by Rudin’s characterization of rational inner functions (Rudin 1969, Theorem 5.2.5)) that \(p_r\) is a monomial:

$$\begin{aligned} p_r(z_1,z_2)=p_{m_1,m_2}z_1^{m_1}z_2^{m_2}, \end{aligned}$$

with \((m_1,m_2)\le (n_1,n_2)\), \(|m|=r\), and \(|p_{m_1,m_2}|=1\).

Now, the fact that for \((z_1,z_2)\in \mathbb {T}^2\) the polynomial \(p_{(z_1,z_2)}\) is self-reversive up to a unimodular constant implies that

$$\begin{aligned} \overline{p_{r-j}(z_1,z_2)} = \overline{p_r(z_1,z_2)} p_j(z_1,z_2) \end{aligned}$$

for \((z_1,z_2)\in \mathbb {T}^2\) and \(j=0,\ldots ,r\), and therefore by analytic continuation

$$\begin{aligned} \overline{p_{r-j}\Big (\frac{1}{\bar{z}_1},\frac{1}{\bar{z}_2}\Big )} = \overline{p_r\Big (\frac{1}{\bar{z}_1},\frac{1}{\bar{z}_2}\Big )} p_j(z_1,z_2) \end{aligned}$$

for all \(z_1,z_2 \ne 0\) and \(j=0,\ldots ,r\). It follows that

$$\begin{aligned} \overline{p\Big (\frac{1}{\bar{z}_1},\frac{1}{\bar{z}_2}\Big )} = \overline{p_r\Big (\frac{1}{\bar{z}_1},\frac{1}{\bar{z}_2}\Big )} p(z_1,z_2), \end{aligned}$$

and finally, since \(p_r(z_1,z_2)=p_{m_1,m_2}z_1^{m_1}z_2^{m_2}\),

$$\begin{aligned} z_1^{m_1}z_2^{m_2} \overline{p\Big (\frac{1}{\bar{z}_1},\frac{1}{\bar{z}_2}\Big )} = \overline{p}_{m_1,m_2} p(z_1,z_2). \end{aligned}$$

Comparing the degrees of \(z_1\) and \(z_2\) we see that \(n_1=m_1\), \(n_2=m_2\), and \(p\) is self-reversive up to the unimodular constant \(\overline{p}_{m_1,m_2}\). \(\square \)

Theorem 3.2

Let a nonconstant bivariate polynomial \(p\) of bidegree \(n=(n_1,n_2)\), with \(p(0,0)=1\), be stable. Then \(p\) is self-reversive up to a unimodular constant if and only if \(p\) admits a representation (1.1) with \(n=\deg p\) and \(K\) unitary.

The result above can also be derived using the theory of distinguished varieties. Indeed, by combining Theorem 1.12 in Agler and McCarthy (2005) (see also Theorem 2.1 in Knese 2010) with Lemma 2.5 and Proposition 4.3 (or the Proposition before Lemma 2.5) in Knese (2010), one may arrive at the same result. We provide a proof based on Theorem 2.1.

Proof

The proof in one direction is immediate. If \(p=\det (I_{|n|}-KZ_n)\), with \(K\) unitary and \(n=\deg p\), then

$$\begin{aligned} \overleftarrow{p}(z)=z^{n}\det (I_{|n|}-\overline{K}Z_n^{-1})=\det (Z_n-\overline{K})=\det (Z_n-K^*)=\det (-K^*)p(z), \end{aligned}$$

with \(\det (-K^*)\in \mathbb {T}\).

Conversely, assume that \(p\) is self-reversive up to a unimodular constant, or equivalently (by Proposition 3.1) that the coefficient of \(z_1^{n_1}z_2^{n_2}\) in \(p\) is unimodular. By Theorem 2.1, \(p\) has a representation (1.1) with a contractive \(K\). Observe that the modulus of the coefficient of \(z_1^{n_1}z_2^{n_2}\) equals \(|\det K |\), which in turn equals the product of the singular values of \(K\). As \(|\det K |=1\), all singular values of \(K\) must be equal to 1, yielding that \(K\) is unitary. \(\square \)

We notice that the procedure outlined in the proof of Theorem 2.1 to find a representation (1.1) does not work for self-reversive polynomials as the Bezoutian \(Q\) is \(0\), and a limiting process (as in the beginning of the proof of Theorem 2.1) is necessary. However, using the ideas in Knese (2010) (see also Geronimo et al. 2013, where a more general case is considered) we can provide the following construction.

Suppose \(p\in \mathbb {C}[z_1,z_2]\) is a stable self-reversive polynomial with \(p(0,0)=1\). It suffices to assume that \(p\) is irreducible. Indeed, if the polynomials \(p_1\) and \(p_2\) admit the representation (1.1) as in Theorem 2.1, then so does their product. We have

$$\begin{aligned} p_1(z_1,z_2)p_2(z_1,z_2)&= \det (I-K_1 Z^{(1)})\det (I-K_2 Z^{(2)})\\&\quad =\det (I-(K_1\oplus K_2)(Z^{(1)}\oplus Z^{(2)})). \end{aligned}$$

If \(K_1\) and \(K_2\) are unitary, then so is \(K_1\oplus K_2\). Using a permutation matrix \(P\) such that \(Z=P(Z^{(1)}\oplus Z^{(2)})P^{-1}=z_1I_{n_1}\oplus z_2I_{n_2}\), we obtain

$$\begin{aligned} p_1(z_1,z_2)p_2(z_1,z_2)=\det (I-P(K_1\oplus K_2)P^{-1}\,P(Z^{(1)}\oplus Z^{(2)})P^{-1})=\det (I-KZ) \end{aligned}$$

with a unitary matrix \(K=P(K_1\oplus K_2)P^{-1}\). Since \(\deg (p_1p_2)=\deg p_1+\deg p_2\), this representation is of the required (minimal) size.
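
For instance (a sketch of ours, NumPy; the factors \(p_1=(1-z_1)(1-z_2)\) and \(p_2=1-z_1z_2\), with their obvious unitary representations, are illustrative choices):

import numpy as np
K1 = np.eye(2)                          # (1 - z1)(1 - z2) = det(I - K1 diag(z1, z2))
K2 = np.array([[0., 1.], [1., 0.]])     # 1 - z1 z2       = det(I - K2 diag(z1, z2))
P = np.eye(4)[[0, 2, 1, 3]]             # reorders diag(z1, z2, z1, z2) into diag(z1, z1, z2, z2)
K = P @ np.block([[K1, np.zeros((2, 2))],
                  [np.zeros((2, 2)), K2]]) @ P.T        # still unitary
rng = np.random.default_rng(2)
z1, z2 = rng.standard_normal(2) + 1j * rng.standard_normal(2)
Zn = np.diag([z1, z1, z2, z2])
lhs = np.linalg.det(np.eye(4) - K @ Zn)
rhs = (1 - z1) * (1 - z2) * (1 - z1 * z2)
print(np.isclose(lhs, rhs))             # True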

So, let \(p\) be a stable self-reversive irreducible polynomial. We also assume that \(n_2>0\), the case \(n_2=0\) being trivial. It follows (Knese (2010), Theorem 2.9) (see also Geronimo et al. 2013, Lemma 10.7) that the polynomial \(\overleftarrow{\partial p/\partial z_2}\) is stable and \(\frac{z_2\partial p/\partial z_2}{\overleftarrow{\partial p/\partial z_2}}\) is a coprime fraction representation of a rational inner function. Then it is well known (see Cole and Wermer 1999; see also Kummert 2002; Geronimo and Woerdeman 2004; Ball et al. 2005; Knese 2008) that there exist bivariate polynomials \(A_1\), ..., \(A_{n_1}\) of bidegree \((n_1-1,n_2)\) or less, and bivariate polynomials \(B_1\), ..., \(B_{n_2}\) of bidegree \((n_1,n_2-1)\) or less, not all zero, such that

$$\begin{aligned}&\overleftarrow{\frac{\partial p}{\partial z_2}} (z_1,z_2) \overline{\overleftarrow{\frac{\partial p}{\partial z_2}} (w_1,w_2)}- z_2 \overline{w_2} \frac{\partial p}{\partial z_2} (z_1, z_2) \overline{ \frac{\partial p}{\partial z_2} (w_1, w_2) }\nonumber \\&\quad = (1-z_1 \overline{w_1}) \sum _{i=1}^{n_1}A_i(z_1,z_2) \overline{ A_i(w_1,w_2)} + (1-z_2 \overline{w_2}) \sum _{j=1}^{n_2}B_j(z_1,z_2) \overline{ B_j(w_1,w_2)}.\nonumber \\ \end{aligned}$$
(3.2)

(Furthermore, these polynomials can be found using semidefinite programming software; in Example 3.3 below we use the software package CVX, a package for specifying and solving convex programs (Grant and Boyd 2013, 2008).) It is straightforward to verify the identity \(n_2p=\overleftarrow{\frac{\partial p}{\partial z_2}}+z_2\frac{\partial p}{\partial z_2}\), which implies that the left-hand side of (3.2) is equal to

$$\begin{aligned} n_2^2p(z_1,z_2)\overline{p(w_1,w_2)}-n_2z_2\frac{ \partial p}{\partial z_2}(z_1,z_2) \overline{p(w_1,w_2)}- n_2p(z_1,z_2)\overline{w_2\frac{\partial p}{\partial z_2}(w_1,w_2)}. \end{aligned}$$

When both \((z_1,z_2)\) and \((w_1,w_2)\) lie in the zero set \(\mathcal {Z}_p\) of the polynomial \(p\), this expression and, thus, the left-hand side of (3.2) are equal to 0. Then we use the standard “lurking isometry” argument. We first rewrite the equality (3.2) restricted to \(\mathcal {Z}_p\times \mathcal {Z}_p\) as

$$\begin{aligned} \left[ \begin{array}{c} A(w_1,w_2)\\ B(w_1,w_2) \end{array}\right] ^*\left[ \begin{array}{c} A(z_1,z_2)\\ B(z_1,z_2) \end{array}\right] =\left[ \begin{array}{c} w_1A(w_1,w_2)\\ w_2B(w_1,w_2) \end{array}\right] ^*\left[ \begin{array}{c} z_1A(z_1,z_2)\\ z_2B(z_1,z_2) \end{array}\right] , \end{aligned}$$

where \(A(z_1,z_2):=\mathrm{col }_{i=1,\ldots ,n_1}[A_i(z_1,z_2)]\) and \(B(z_1,z_2):=\mathrm{col }_{j=1,\ldots ,n_2}[B_j(z_1,z_2)]\). Then we observe that this identity uniquely determines an isometry

$$\begin{aligned} T:\mathrm{span }\left\{ \left[ \begin{array}{c} z_1A(z_1,z_2)\\ z_2B(z_1,z_2) \end{array}\right] :(z_1,z_2)\in \mathcal {Z}_p \right\} \rightarrow \mathrm{span }\left\{ \left[ \begin{array}{c} A(z_1,z_2)\\ B(z_1,z_2) \end{array}\right] :(z_1,z_2)\in \mathcal {Z}_p\right\} , \end{aligned}$$

defined on generating vectors by

$$\begin{aligned} T:\left[ \begin{array}{c} z_1A(z_1,z_2)\\ z_2B(z_1,z_2) \end{array}\right] \mapsto \left[ \begin{array}{c} A(z_1,z_2)\\ B(z_1,z_2) \end{array}\right] \end{aligned}$$

and then extended by linearity. We shall see a posteriori that in fact the span of the vectors on the right-hand side is all of \({\mathbb C}^{|n|}\), so that the isometry \(T\) is a unitary mapping of \(\mathbb {C}^{|n|}\) onto itself. At any rate, \(T\) can be extended to a unitary mapping \(K\) of \(\mathbb {C}^{|n|}\) onto itself, i.e., to a \(|n| \times |n|\) unitary matrix.

The nonzero polynomial \(P=\left[ \begin{array}{c} A\\ B \end{array}\right] \in \mathbb {C}^{|n|}[z_1,z_2]\) does not vanish identically on \(\mathcal {Z}_p\). Indeed, Bézout’s theorem (Fulton 1969, p. 112) says that two bivariate polynomials with no common factors have at most finitely many common zeros, no more than the product of their total degrees. Therefore, if \(P\) vanished identically on \(\mathcal {Z}_p\), then the irreducible polynomial \(p\) would divide every component of \(P\); but since these components, \(A_i\), \(i=1,\ldots ,n_1\), and \(B_j\), \(j=1,\ldots ,n_2\), are polynomials of smaller bidegree than \(p\), this is impossible. Moreover, the set \(\mathcal {Z}_p\setminus \mathcal {Z}_P\) is Zariski relatively open and dense in \(\mathcal {Z}_p\). Since \(KZ_nP=P\) on \(\mathcal {Z}_p\) by the construction of \(K\), the polynomial \(q=\det (I_{|n|}-KZ_n)\) vanishes on \(\mathcal {Z}_p\setminus \mathcal {Z}_P\), and hence on \(\mathcal {Z}_p\) as well. Applying Bézout’s theorem again, we see that \(p\) divides \(q\). Since \(\deg q=\deg p=n\) and \(p(0)=q(0)=1\), we must have \(p=q\), i.e., \(p\) has a representation (1.1) with \(n=\deg p\) and \(K\) unitary. This provides an alternative proof of the non-trivial direction in Theorem 3.2.

We notice that the restriction of \(P=\left[ \begin{array}{c} A\\ B \end{array}\right] \in \mathbb {C}^{|n|}[z_1,z_2]\) to \(\mathcal {Z}_p\) is a section of the kernel bundle of the determinantal representation \(I_{|n|}-KZ_n\) of the irreducible polynomial \(p\), see Vinnikov (1989), Kerner and Vinnikov (2012). It follows (essentially since such a section is generated by the columns of the adjoint matrix \(\mathrm{adj }(I_{|n|}-KZ_n)\)) that the entries of the restriction of \(P\) to \(\mathcal {Z}_p\) are linearly independent, in other words there exists no nonzero \(c \in {\mathbb C}^{1 \times |n|}\) such that \(c P(z_1,z_2) = 0\) for all \((z_1,z_2) \in \mathcal {Z}_p\). Therefore the span of \(P(z_1,z_2)\), \((z_1,z_2) \in \mathcal {Z}_p\), is all of \({\mathbb C}^{|n|}\), so that the isometry \(T=K\) is already a unitary mapping of \(\mathbb {C}^{|n|}\) onto itself and no extension is needed.

We illustrate this with the following example.

Example 3.3

Let \(p(z_1, z_2) = 1-z_1z_2 -\frac{1}{2} z_1^2 -\frac{1}{2} z_2^2 + z_1^2z_2^2\), so that \(\deg p=(2,2)\). We compute

$$\begin{aligned} \frac{\partial p}{\partial z_2} (z_1, z_2) = -z_1 -z_2 + 2z_1^2 z_2,\quad \overleftarrow{\frac{\partial p}{ \partial z_2}} (z_1,z_2) = 2 -z_1z_2 -z_1^2, \end{aligned}$$

and find

$$\begin{aligned} A_1(z_1,z_2)= \sqrt{2} (1-z_1z_2),\quad A_2(z_1,z_2)= z_1-z_2, \end{aligned}$$
$$\begin{aligned} B_1(z_1,z_2)=\sqrt{2}(1-z_1^2),\quad B_2(z_1,z_2) =z_1+z_2-2z_1^2z_2, \end{aligned}$$

so that (3.2) holds.

Taking the zeros \((0, \sqrt{2})\), \((\sqrt{2}, 0)\), \((\frac{1}{2}, -1+\frac{3}{\sqrt{2}})\), \(( -1+\frac{3}{\sqrt{2}}, \frac{1}{2})\), we find that the unitary \(K=T\) is the matrix

$$\begin{aligned} K= \frac{1}{\sqrt{2}} \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 0&{} 1 &{} 0 &{} 1\\ 1 &{} 0 &{} -1 &{} 0 \\ 0 &{} -1 &{} 0 &{} 1 \\ 1 &{} 0 &{} 1 &{} 0 \end{array}\right] . \end{aligned}$$

One can easily check that \(p(z_1, z_2) = \det (I_4 - K Z_{(2,2)}).\)
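
The lurking isometry computation itself is easy to reproduce; a sketch of ours (NumPy) stacks the vectors \(\left[ \begin{array}{c} z_1A\\ z_2B \end{array}\right] \) and \(\left[ \begin{array}{c} A\\ B \end{array}\right] \) at the four listed zeros, solves for \(K\), and confirms both the unitarity and the determinantal representation:

import numpy as np
s2 = np.sqrt(2)
A = lambda z1, z2: np.array([s2 * (1 - z1*z2), z1 - z2])               # col[A_1, A_2]
B = lambda z1, z2: np.array([s2 * (1 - z1**2), z1 + z2 - 2*z1**2*z2])  # col[B_1, B_2]
zeros = [(0, s2), (s2, 0), (0.5, -1 + 3/s2), (-1 + 3/s2, 0.5)]
# K maps [z1*A; z2*B] to [A; B] at every point of the zero set of p.
X = np.column_stack([np.r_[z1*A(z1, z2), z2*B(z1, z2)] for z1, z2 in zeros])
Y = np.column_stack([np.r_[A(z1, z2), B(z1, z2)] for z1, z2 in zeros])
K = Y @ np.linalg.inv(X)
print(np.allclose(K, np.array([[0., 1., 0., 1.], [1., 0., -1., 0.],
                               [0., -1., 0., 1.], [1., 0., 1., 0.]]) / s2))  # True
print(np.allclose(K @ K.T, np.eye(4)))                  # K is (real) unitary
z1, z2 = 0.3 - 0.2j, -0.1 + 0.7j
Zn = np.diag([z1, z1, z2, z2])
p = 1 - z1*z2 - z1**2/2 - z2**2/2 + z1**2*z2**2
print(np.isclose(np.linalg.det(np.eye(4) - K @ Zn), p))  # True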

4 Real-zero polynomials and Hermitian determinantal representations

We consider bivariate real-zero polynomials, which are polynomials \(p\in \mathbb {R}[x_1,x_2]\) with the property that for every \((x_1,x_2) \in {\mathbb R}^2\) the one-variable polynomial \(p(tx_1,tx_2)\) has only real zeros. In (Helton and Vinnikov (2007), Theorem 2.2) it was shown that every real-zero polynomial \(p\) with \(p(0,0)=1\) may be represented as

$$\begin{aligned} p(x_1,x_2)=\det (I+ x_1A_1 + x_2A_2), \end{aligned}$$
(4.1)

where \(A_1,A_2 \in {\mathbb R}^{d\times d}\) are symmetric matrices and \(d\) is the total degree of \(p\); in the homogeneous setting of hyperbolic polynomials this statement was known as the Lax conjecture, see Lewis et al. (2005). We refer to Vinnikov (2012) for a detailed survey and further references. The proof in Helton and Vinnikov (2007) is based on the results of Vinnikov (1993) and Ball and Vinnikov (1999), see also Dubrovin (1983), and uses algebro-geometric techniques—the correspondence between (certain) determinantal representations of an irreducible plane curve and line bundles on its desingularization, together with a detailed analysis of the action of the complex conjugation on the Jacobian variety and the theory of Riemann’s theta function; a new proof, using instead the theory of quadratic forms, has been discovered recently in Hanselka (in preparation). A somewhat weaker statement—namely, the existence of a representation (4.1) where now \(A_1,A_2 \in {\mathbb C}^{d\times d}\) are Hermitian matrices—has been established recently in (Vinnikov (2012), Section 5) and Plaumann and Vinzant (2013); these proofs are also algebro-geometric but avoid the transcendental machinery of Jacobian varieties and theta functions. In this section (Theorem 4.1), we provide a new proof (actually, two closely related proofs) of the existence of a representation (4.1) with \(A_1,A_2 \in {\mathbb C}^{d\times d}\) Hermitian matrices using factorizations of matrix-valued polynomials. One advantage of our proof is that it provides a constructive way to find such a representation. The most involved step is finding a stable factorization for a one-variable matrix polynomial that is positive semidefinite on the real line. As the latter can be implemented using any semidefinite programming package or a Riccati equation solver [see, e.g., Hachez and Woerdeman 2007 or (Bakonyi and Woerdeman (2011), Section 2.7)], this construction can be easily implemented numerically. For more on computational questions related to the construction of determinantal representations of real-zero polynomials, see Henrion (2010), Plaumann et al. (2011), Plaumann et al. (2012), Leykin and Plaumann (2012). By a simple trick, Theorem 4.1 also implies the existence of a \(2d \times 2d\) real symmetric representation for \(p^2\)—see Remark 4.6.

Theorem 4.1

Let \(p\) be a bivariate real-zero polynomial of total degree \(d>0\) with \(p(0,0)=1\). Then there exist \(d\times d\) Hermitian matrices \(A_1\) and \(A_2\) so that (4.1) holds.

We will need two lemmata. The first one is simply a restatement of one of the results of Nuij (1968) in the non-homogeneous setting.

Lemma 4.2

Let \(p\) be a real-zero polynomial of total degree \(d\) and with \(p(0,0)=1\). For every \(\epsilon > 0\) there exists a real-zero polynomial \(q\) of total degree \(d\) and with \(q(0,0)=1\) such that each coefficient of \(q\) is within \(\epsilon \) distance of the corresponding coefficient of \(p\), and for every \(x_2\in {\mathbb R}\) the one-variable polynomial \(\check{q}_{x_2}\) defined via

$$\begin{aligned} \check{q}_{x_2}(t):= t^d q(1/t, x_2/t) \end{aligned}$$

has only simple real zeros.

Proof

Let \({\mathbf p}(x_0,x_1,x_2):= x_0^d p(x_1/x_0,x_2/x_0)\). Then \({\mathbf p}\) is a degree \(d\) homogeneous polynomial in three variables that is hyperbolic with respect to \(e=(1,0,0)\), which means that \({\mathbf p}(e)\ne 0\) and for every \((x_0,x_1,x_2)\in {\mathbb R}^3\) the one-variable polynomial \(t \mapsto {\mathbf p}(x_0-t,x_1,x_2)\) has only real zeros. By a result of Nuij (1968), the polynomial \({\mathbf p}\) can be approximated arbitrarily closely, in the sense of coefficients, by a degree \(d\) homogeneous polynomial \({\mathbf q}\) which is strictly hyperbolic with respect to \(e\); that is, \({\mathbf q}\) is hyperbolic with respect to \(e\), and for every \((x_0,x_1,x_2)\in {\mathbb R}^3\) with \((x_1,x_2)\ne (0,0)\) the zeros of \(t \mapsto {\mathbf q}(x_0-t,x_1,x_2)\) are simple. But then \(q(x_1,x_2):={\mathbf q}(1,x_1,x_2)/{\mathbf q}(1,0,0)\) has the desired property. Notice that while a priori the total degree of \(q\) is at most \(d\), it will actually be equal to \(d\) if we choose \(\epsilon \) small enough. \(\square \)

The following result is due to Hanselka (in preparation). For the sake of completeness, we include a proof.

Lemma 4.3

Let \(M\) be a \(d \times d\) matrix-valued polynomial in one variable with Hermitian coefficients. If the polynomial \(\det (tI_d - M(s))\) has total degree at most \(d\), then \(M\) is linear (i.e., \(\deg M \le 1\)).

Proof

Let

$$\begin{aligned} \det (tI_d - M(s)) = t^d + p_1(s)t^{d-1} + \cdots + p_d(s), \end{aligned}$$

where \(p_j\) is a polynomial of degree at most \(j\). Assume that \(M\) is a polynomial of degree \(k\), and write \(-M(s) = B_0 + \cdots + B_k s^k\). The sum of \(j \times j\) principal minors in \(-M(s)\) is exactly \(p_j(s)\); therefore the coefficient of \(s^{kj}\) in \(p_j(s)\) is the sum of \(j \times j\) principal minors in \(B_k\). But \(\deg p_j \le j\) for all \(j\), hence if \(k>1\) we conclude that the sum of \(j \times j\) principal minors in \(B_k\) is zero for all \(j>0\). It follows that \(B_k\) is nilpotent. Since \(B_k\) is also Hermitian, it must be zero, a contradiction. \(\square \)

We will present two closely related proofs of Theorem 4.1: the first proof uses the Hermite matrix [considered in the context of real-zero polynomials and determinantal representations in Henrion (2010) and in Netzer et al. (2013)], whereas the second proof uses intertwining polynomials and the Bezoutian [considered in this context in Vinnikov (2012), Plaumann and Vinzant (2013), and Kummer et al. (2012)].

First proof of Theorem 4.1

We first claim that if we can establish the existence of a required determinantal representation for a dense subset of real-zero polynomials of total degree \(d\) and with constant term \(1\), then we are done (footnote 4). Indeed, assume that we have real-zero polynomials \(p^{(n)}\), \(n\in {\mathbb N}\), of total degree \(d\) with \(p^{(n)}(0,0)=1\), so that the sequence \(\{p^{(n)}\}_{n\in {\mathbb N}}\) converges to \(p\) and so that there exist Hermitian \(d\times d\) matrices \(A_1^{(n)}\) and \(A_2^{(n)}\) with \(p^{(n)}(x_1,x_2)=\det (I_d + x_1 A_1^{(n)} + x_2 A_2^{(n)})\). Let

$$\begin{aligned} \mu := \min \{ |t|:\ p(t,0)p(0,t) = 0 \}. \end{aligned}$$

Clearly, \(\mu >0\). Then for \(n\) large enough the spectra of \(A_1^{(n)}\) and \(A_2^{(n)}\) lie in the interval \((-2\mu ^{-1},2\mu ^{-1})\). Since the spectral radius of an Hermitian matrix coincides with its operator \((2,2)\) norm, the matrices \(A_1^{(n)}\) and \(A_2^{(n)}\) have norms bounded by \(2\mu ^{-1}\), and therefore the sequence \(\{(A_1^{(n)}, A_2^{(n)}) \}_{n\in {\mathbb N}}\), has a limit point \((A_1,A_2)\). Then we get that \(p(x_1,x_2)=\det (I_d + x_1A_1+x_2A_2)\), with Hermitian \(d\times d\) matrices \(A_1\) and \(A_2\), as desired.

Given \(p\), we introduce

$$\begin{aligned} \check{p}_{x_2}(t):= t^d p(1/t, x_2/t) = t^d + p_1(x_2) t^{d-1} + \cdots + p_d(x_2). \end{aligned}$$

One easily observes that \(\deg p_j \le j\), \(j=1,\ldots , d\), and that for every \(x_2\in {\mathbb R}\) the polynomial \(\check{p}_{x_2}\) has only real zeros. Furthermore, we may assume by the previous paragraph and by Lemma 4.2 that for every \(x_2\in {\mathbb R}\) the polynomial \(\check{p}_{x_2}\) has only simple zeros.

Let \(C(x_2)\) be the companion matrix

$$\begin{aligned} C(x_2) = \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 0 &{} \cdots &{} 0 &{} -p_d(x_2) \\ 1 &{} &{} 0 &{} -p_{d-1}(x_2) \\ &{} \ddots &{} &{} \vdots \\ 0 &{} &{} 1 &{} -p_1(x_2) \end{array}\right] . \end{aligned}$$
(4.2)

Then

$$\begin{aligned} \check{p}_{x_2}(t) = \det (tI_d -C(x_2)). \end{aligned}$$

Denote the zeros of \(\check{p}_{x_2}\) by \(\lambda _1(x_2), \ldots , \lambda _d(x_2)\), and let \(s_j(x_2)\) be their \(j\)th Newton sum:

$$\begin{aligned} s_j(x_2) = \sum _{k=1}^d \lambda _k(x_2)^j,\quad j=0, 1,\ldots . \end{aligned}$$

As is well known, \(s_j(x_2)\) can be expressed in terms of the \(p_j(x_2)\) via Newton's identities:

$$\begin{aligned} s_0(x_2)=d,\quad s_1(x_2)= -p_1(x_2), \end{aligned}$$
$$\begin{aligned} s_j(x_2) = -jp_j(x_2)-\sum _{k=1}^{j-1} p_k(x_2) s_{j-k}(x_2),\quad j=2,3,\ldots , \end{aligned}$$

where we set \(p_k=0\) for \(k>d\). Note that \(s_j\) is a polynomial of degree \(\le j\), \(j=0,\ldots ,2d-2\). We let \(H(x_2)\) be the Hermite matrix of \(\check{p}_{x_2}\), namely (see, e.g., Krein and Naimark 1936) the Hankel matrix whose entries are the Newton sums of the zeros of \(\check{p}_{x_2}\):

$$\begin{aligned} H(x_2) = [s_{i+j}(x_2)]_{i,j=0,\ldots ,d-1}. \end{aligned}$$

Clearly, \(H\) is a matrix polynomial of degree at most \(2d-2\). For example, for \(d=2\) we have

$$\begin{aligned} H(x_2) = \left[ \begin{array}{c@{\quad }c} 2 &{} -p_1(x_2)\\ -p_1(x_2) &{} p_1(x_2)^2-2 p_2(x_2) \end{array}\right] . \end{aligned}$$
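The construction up to this point is readily automated. In the following sympy sketch (ours; the helper names are not from the text) we compute \(\check{p}_{x_2}\), the companion matrix (4.2), the Newton sums, and \(H(x_2)\), and verify (4.5), for the polynomial of Example 4.4 below.

```python
# A sympy sketch (ours) of the Hermite-matrix construction.
import sympy as sp

t, x1, x2 = sp.symbols('t x1 x2')

def p_check(p, d):
    r"""\check{p}_{x_2}(t) = t^d p(1/t, x_2/t)."""
    return sp.expand(t**d * p.subs({x1: 1/t, x2: x2/t}, simultaneous=True))

def coeff_list(pc, d):
    r"""[p_1, ..., p_d] from pc = t^d + p_1 t^{d-1} + ... + p_d."""
    return [sp.expand(pc.coeff(t, d - j)) for j in range(1, d + 1)]

def companion(pk, d):
    r"""The companion matrix C(x_2) of (4.2)."""
    C = sp.zeros(d, d)
    for i in range(1, d):
        C[i, i - 1] = 1
    for i in range(d):
        C[i, d - 1] = -pk[d - 1 - i]   # last column: -p_d, ..., -p_1
    return C

def newton_sums(pk, d, m):
    r"""s_0, ..., s_m via Newton's identities, with p_k = 0 for k > d."""
    pe = lambda k: pk[k - 1] if k <= d else 0
    s = [sp.Integer(d)]
    for j in range(1, m + 1):
        s.append(sp.expand(-j * pe(j) - sum(pe(k) * s[j - k] for k in range(1, j))))
    return s

p = 1 + 4*x1 + 10*x2 - x1**2 - 2*x1*x2 - x2**2   # the polynomial of Example 4.4
d = 2
pc = p_check(p, d)
pk = coeff_list(pc, d)
C = companion(pk, d)
s = newton_sums(pk, d, 2*d - 2)
H = sp.Matrix(d, d, lambda i, j: s[i + j])

assert sp.expand((t*sp.eye(d) - C).det() - pc) == 0          # det(tI - C) = \check{p}
assert (C.T*H - H*C).applyfunc(sp.expand) == sp.zeros(d, d)  # identity (4.5)
print(H)   # matches the d = 2 display above
```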

Since all the zeros of \(\check{p}_{x_2}\) are real and simple for real \(x_2\), we have that \(H(x_2) > 0\), \(x_2 \in {\mathbb R}\). This is well known and it follows immediately from

$$\begin{aligned} H(x_2) = V(x_2)^\top V(x_2), \end{aligned}$$
(4.3)

where \(V(x_2)\) is the (real) Vandermonde matrix

$$\begin{aligned} V(x_2) = [\lambda _{k+1}(x_2)^{j}]_{k,j=0,\ldots ,d-1}. \end{aligned}$$
(4.4)

In addition, one may easily check (e.g., using (4.3)) that

$$\begin{aligned} C(x_2)^\top H(x_2) = H(x_2) C(x_2). \end{aligned}$$
(4.5)

By the positive definiteness of \(H(x_2)\) for all real \(x_2\), we may factor \(H(x_2)\) as

$$\begin{aligned} H(x_2) = Q(x_2)^* Q(x_2),\quad x_2\in \mathbb {R}, \end{aligned}$$
(4.6)

where \(Q(x_2)\) is a matrix polynomial of degree at most \(d-1\) and \(Q(x_2)\) is invertible for \(\mathrm{Im } x_2 \ge 0\) (Yakubovich 1970; Rosenblum and Rovnyak 1985). We now let

$$\begin{aligned} M(x_2) = Q(x_2) C(x_2) Q(x_2)^{-1}, \end{aligned}$$

and obtain that

$$\begin{aligned} \check{p}_{x_2}(t) = \det (tI_d - M(x_2)). \end{aligned}$$

Note that \(M(x_2)=M(x_2)^*\) for \(x_2 \in {\mathbb R}\). Indeed, (4.5) implies that

$$\begin{aligned} C(x_2)^*Q(x_2)^*Q(x_2) = Q(x_2)^*Q(x_2)C(x_2), \quad x_2 \in {\mathbb R}. \end{aligned}$$

Multiplying on the left with \(Q(x_2)^{*-1}\) and on the right with \(Q(x_2)^{-1}\) yields \(M(x_2) = M(x_2)^*\), \(x_2\in {\mathbb R}\).

Next, we claim that the rational matrix function \(M(x_2)\) is in fact a matrix polynomial. The only possible poles arise from the zeros of \(Q(x_2)\). Let \(a\) be a zero of \(Q(x_2)\). Then \(\mathrm{Im } a <0\). We rewrite (4.6) as

$$\begin{aligned} H(x_2) = Q(\bar{x}_2)^* Q(x_2) \end{aligned}$$

for all \(x_2 \in {\mathbb C}\), and substitute in (4.5), obtaining

$$\begin{aligned} C(x_2)^\top Q(\bar{x}_2)^* = Q(\bar{x}_2)^* Q(x_2) C(x_2) Q(x_2)^{-1} = Q(\bar{x}_2)^* M(x_2), \end{aligned}$$
(4.7)

for all \(x_2\in {\mathbb C}\). Since \(Q(\bar{a})\) is invertible, we conclude that

$$\begin{aligned} M(x_2) = Q(\bar{x}_2)^{*-1} C(x_2)^\top Q(\bar{x}_2)^* \end{aligned}$$

is regular at \(a\), i.e., \(a\) is not a pole of \(M(x_2)\).

It follows now from Lemma 4.3 that \(\deg M \le 1\), i.e., we can write

$$\begin{aligned} M(x_2) = -A_1 - A_2x_2, \end{aligned}$$

where \(A_1\) and \(A_2\) are \(d\times d\) Hermitian matrices. Then

$$\begin{aligned} \check{p}_{x_2}(t) = \det (tI_d + A_1 + A_2x_2), \end{aligned}$$

and thus

$$\begin{aligned} p(x_1,x_2)= x_1^d \check{p}_{\frac{x_2}{x_1}}\Big (\frac{1}{x_1}\Big )= \det (I_d + x_1A_1+x_2A_2). \end{aligned}$$

\(\square \)

Note that the proof provides a constructive way to find a representation (4.1). We illustrate this with an example.

Example 4.4

Let \(p(x,y) = 1+10y+4x -y^2 -2xy -x^2.\) Then \(\check{p}_y(t) = t^2 + (10y+4)t +(-1-2y-y^2).\) We get

$$\begin{aligned} H(y) = \left[ \begin{array}{c@{\quad }c} 2 &{} -10y-4\\ -10 y-4 &{} 102y^2+84y+18 \end{array}\right] . \end{aligned}$$

Factoring as in (4.6) we find that

$$\begin{aligned} Q(y) = \left[ \begin{array}{c@{\quad }c} \sqrt{2} &{} -2\sqrt{2}-5\sqrt{2}y\\ 0 &{} \sqrt{10}( 1+ y\frac{11+3i}{5}) \end{array}\right] . \end{aligned}$$

Then

$$\begin{aligned} M(y) = Q(y)C(y)Q(y)^{-1} =\left[ \begin{array}{c@{\quad }c} -5y-2 &{}\sqrt{5}(1+\frac{y}{5}(11-3i))\\ \sqrt{5}(1+\frac{y}{5}(11+3i)) &{} -5y-2 \end{array}\right] . \end{aligned}$$

Ultimately, we find that

$$\begin{aligned} p(x,y)= \det \left( I_2 + x \left[ \begin{array}{c@{\quad }c} 2 &{}-\sqrt{5} \\ -\sqrt{5} &{} 2 \end{array}\right] + y \left[ \begin{array}{c@{\quad }c} 5 &{} \frac{\sqrt{5}}{5}(-11+3i)\\ \frac{\sqrt{5}}{5}(-11-3i) &{} 5 \end{array}\right] \right) . \end{aligned}$$
(4.8)

Incidentally, the polynomial \(p\) was constructed using \(A_1=\left[ \begin{array}{c@{\quad }c} 1 &{} 2\\ 2 &{} 3 \end{array}\right] \) and \(A_2=\left[ \begin{array}{c@{\quad }c} 4 &{} 5\\ 5 &{} 6 \end{array}\right] \).
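All of the computations in this example can be checked symbolically. The following sympy sketch (ours) verifies the factorization (4.6), the Hermitian property and linearity of \(M(y)\), and the representation (4.8).

```python
# Symbolic verification (ours) of the computations in Example 4.4.
import sympy as sp

x, y = sp.symbols('x y', real=True)

p = 1 + 10*y + 4*x - y**2 - 2*x*y - x**2
C = sp.Matrix([[0, 1 + 2*y + y**2], [1, -(10*y + 4)]])   # companion matrix (4.2)
H = sp.Matrix([[2, -10*y - 4], [-10*y - 4, 102*y**2 + 84*y + 18]])
Q = sp.Matrix([[sp.sqrt(2), -2*sp.sqrt(2) - 5*sp.sqrt(2)*y],
               [0, sp.sqrt(10)*(1 + y*(11 + 3*sp.I)/5)]])

# the factorization (4.6): H(y) = Q(y)^* Q(y) for real y
assert (Q.H * Q - H).applyfunc(sp.expand) == sp.zeros(2, 2)

# M(y) = Q C Q^{-1} is Hermitian for real y and linear in y
M = (Q * C * Q.inv()).applyfunc(sp.simplify)
assert (M - M.H).applyfunc(sp.simplify) == sp.zeros(2, 2)

A1 = -M.subs(y, 0)
A2 = -(M - M.subs(y, 0)).applyfunc(lambda e: sp.simplify(e / y))
assert (M + A1 + A2*y).applyfunc(sp.simplify) == sp.zeros(2, 2)  # M = -A1 - A2*y

# the representation (4.8)
assert sp.simplify((sp.eye(2) + x*A1 + y*A2).det() - p) == 0
```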

Before presenting a second proof of Theorem 4.1, we introduce a definition. Let \(p\) be a real-zero polynomial of total degree \(d\), \(p(0,0)=1\), and let \(q\) be a real-zero polynomial of total degree less than \(d\), \(q(0,0) > 0\). We define

$$\begin{aligned} \check{p}_{x_2}(t):= t^d p(1/t, x_2/t),\quad \check{q}_{x_2}(t):= t^{d-1} q(1/t, x_2/t), \end{aligned}$$

and let, for \(x_2 \in {\mathbb R}\), \(\lambda _1(x_2) \le \cdots \le \lambda _d(x_2)\) and \(\mu _1(x_2) \le \cdots \le \mu _{d-1}(x_2)\) be the zeros of \(\check{p}_{x_2}\) and of \(\check{q}_{x_2}\), respectively, counting multiplicities. We will say that \(q\) interlaces \(p\) if

$$\begin{aligned} \lambda _1(x_2) \le \mu _1(x_2) \le \lambda _2(x_2) \le \cdots \le \lambda _{d-1}(x_2) \le \mu _{d-1}(x_2) \le \lambda _d(x_2), \end{aligned}$$
(4.9)

for all \(x_2 \in {\mathbb R}\). We will say that \(q\) strictly interlaces \(p\) if all the zeros of \(\check{p}_{x_2}\) are simple and strict inequalities hold in (4.9), for all \(x_2 \in {\mathbb R}\).

As an example, let \(p\) be a real-zero polynomial of total degree \(d\), \(p(0,0)=1\), and let \((x^0_1,x^0_2)\) belong to the connected component of \((0,0)\) in \(\{(x_1,x_2)\in {\mathbb R}^2 :p(x_1,x_2)>0\}\). We set \({\mathbf p}(x_0,x_1,x_2)=x_0^d p(x_1/x_0,x_2/x_0)\) and define

$$\begin{aligned} q(x_1,x_2) = \left. \frac{d}{ds}\,{\mathbf p}(1+s,x_1+sx^0_1,x_2+sx^0_2)\right| _{s=0}; \end{aligned}$$
(4.10)

\(q\) is called the Renegar derivative of \(p\) with respect to \((x^0_1,x^0_2)\) and it interlaces \(p\); see Renegar (2006), Netzer et al. (2010). The interlacing is strict if all the zeros of \(\check{p}_{x_2}\) are simple for all \(x_2 \in {\mathbb R}\). Notice that for \((x^0_1,x^0_2)=(0,0)\), we have simply \(\check{q}_{x_2} = \left( \check{p}_{x_2}\right) '\).
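The following sympy sketch (ours) computes the Renegar derivative (4.10) at \((x^0_1,x^0_2)=(0,0)\) for the polynomial of Example 4.4, confirms that \(\check{q}_{x_2} = \left( \check{p}_{x_2}\right) '\), and spot-checks the strict interlacing at a few real points.

```python
# A sympy sketch (ours): the Renegar derivative (4.10) at (0,0) for the
# polynomial of Example 4.4, with an interlacing spot check.
import sympy as sp

t, s, x0, x1, x2 = sp.symbols('t s x0 x1 x2', real=True)

p = 1 + 4*x1 + 10*x2 - x1**2 - 2*x1*x2 - x2**2
d = 2

P = sp.expand(x0**d * p.subs({x1: x1/x0, x2: x2/x0}, simultaneous=True))
q = sp.expand(sp.diff(P.subs(x0, 1 + s), s).subs(s, 0))   # (4.10) at (0,0)

p_check = sp.expand(t**d * p.subs({x1: 1/t, x2: x2/t}, simultaneous=True))
q_check = sp.expand(t**(d - 1) * q.subs({x1: 1/t, x2: x2/t}, simultaneous=True))

assert sp.expand(q_check - sp.diff(p_check, t)) == 0      # \check{q} = (\check{p})'

for v in [-2, -1, 0, sp.Rational(3, 2)]:                  # sampled real x2
    lam = sorted(complex(r).real for r in sp.Poly(p_check.subs(x2, v), t).nroots())
    mu = [complex(r).real for r in sp.Poly(q_check.subs(x2, v), t).nroots()]
    assert lam[0] < mu[0] < lam[1]                        # strict interlacing, d = 2
```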

Second proof of Theorem 4.1

We assume as in the first proof that \(p\) is a real-zero polynomial of total degree \(d\), \(p(0,0)=1\), such that the polynomial \(\check{p}_{x_2}\) has only simple zeros for all \(x_2 \in {\mathbb R}\). We choose a real-zero polynomial \(q\) of total degree less than \(d\), \(q(0,0) > 0\), that strictly interlaces \(p\). We let \(B(x_2)\) be the Bezoutian of the polynomials \(\check{q}_{x_2}\) and \(\check{p}_{x_2}\), namely (see, e.g., Krein and Naimark 1936)

$$\begin{aligned} B(x_2) = [b_{ij}(x_2)]_{i,j=0,\ldots ,d-1}, \end{aligned}$$

where \(b_{ij}(x_2)\) are determined from

$$\begin{aligned} \frac{\check{p}_{x_2}(t)\check{q}_{x_2}(s) -\check{q}_{x_2}(t) \check{p}_{x_2}(s)}{t-s} = \sum _{i,j=0,\ldots ,d-1} b_{ij}(x_2) t^is^j. \end{aligned}$$
(4.11)

It is easily seen that \(b_{ij}(x_2)\) are polynomials (over \({\mathbb Z}\)) in the coefficients of \(\check{p}_{x_2}\) and \(\check{q}_{x_2}\), hence polynomials in \(x_2\), i.e., \(B\) is a matrix polynomial. The defining equation (4.11) can be conveniently rewritten as

$$\begin{aligned} v_{d-1}(t) B(x_2) v_{d-1}(s)^\top = \frac{\check{p}_{x_2}(t)\check{q}_{x_2}(s) -\check{q}_{x_2}(t) \check{p}_{x_2}(s)}{t-s}, \end{aligned}$$
(4.12)

and taking the limit \(t \rightarrow s\),

$$\begin{aligned} v_{d-1}(s) B(x_2) v_{d-1}(s)^\top = \left( \check{p}_{x_2}\right) '(s)\check{q}_{x_2}(s)-\left( \check{q}_{x_2}\right) '(s) \check{p}_{x_2}(s), \end{aligned}$$
(4.13)

where

$$\begin{aligned} v_{d-1}(t) = \left[ \begin{array}{c@{\quad }c@{\quad }c@{\quad }c} 1&t&\cdots&t^{d-1}\end{array}\right] . \end{aligned}$$

Since the zeros of \(\check{p}_{x_2}\) and \(\check{q}_{x_2}\) are real, simple, and alternate for real \(x_2\), we have that \(B(x_2) > 0\), \(x_2 \in {\mathbb R}\). Indeed,

$$\begin{aligned} V(x_2)B(x_2)V(x_2)^\top =\mathrm{diag } \left[ \left( \check{p}_{x_2}\right) '(\lambda _i(x_2))\check{q}_{x_2}(\lambda _i(x_2)) \right] _{i=1,\ldots ,d}, \end{aligned}$$

where \(V(x_2)\) is the Vandermonde matrix (4.4) based at the zeros of \(\check{p}_{x_2}\). In addition, one has

$$\begin{aligned} C(x_2) B(x_2) = B(x_2) C(x_2)^\top \end{aligned}$$
(4.14)

(where \(C(x_2)\) is given by (4.2)), which can be verified by multiplying by \(V(x_2)\) on the left and by \(V(x_2)^\top \) on the right and using (4.13).
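Like the Hermite matrix, the Bezoutian is straightforward to compute symbolically. The following sympy sketch (ours) extracts \(B(x_2)\) from (4.11) for the polynomial of Example 4.4, with \(q\) its Renegar derivative at the origin, and checks (4.14) together with positivity at sampled real points.

```python
# A sympy sketch (ours): the Bezoutian (4.11) of \check{q} and \check{p} for
# the polynomial of Example 4.4, with q the Renegar derivative at (0,0).
import sympy as sp

t, s, y = sp.symbols('t s y', real=True)
d = 2

p_check = t**2 + (10*y + 4)*t - 1 - 2*y - y**2      # \check{p}_y from Example 4.4
q_check = sp.diff(p_check, t)                       # \check{q}_y for (4.10) at (0,0)

# extract b_ij from (p(t) q(s) - q(t) p(s)) / (t - s) = sum_{i,j} b_ij t^i s^j
num = sp.expand(p_check * q_check.subs(t, s) - q_check * p_check.subs(t, s))
quot = sp.expand(sp.cancel(num / (t - s)))
B = sp.Matrix(d, d, lambda i, j: quot.coeff(t, i).coeff(s, j))
print(B)   # [[102*y**2 + 84*y + 18, 10*y + 4], [10*y + 4, 2]]

C = sp.Matrix([[0, 1 + 2*y + y**2], [1, -(10*y + 4)]])        # companion matrix (4.2)
assert (C*B - B*C.T).applyfunc(sp.expand) == sp.zeros(d, d)   # identity (4.14)

for v in [-3, 0, 2]:                                # B(y) > 0 at sampled real y
    Bv = B.subs(y, v)
    assert Bv[0, 0] > 0 and Bv.det() > 0
```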

By the positive definiteness of \(B(x_2)\) for all real \(x_2\), we may factor \(B(x_2)\) as

$$\begin{aligned} B(x_2) = P(x_2) P(x_2)^*,\quad x_2\in \mathbb {R}, \end{aligned}$$
(4.15)

where \(P(x_2)\) is a matrix polynomial and \(P(x_2)\) is invertible for \(\mathrm{Im } x_2 \ge 0\) (Yakubovich 1970; Rosenblum and Rovnyak 1985), and we let

$$\begin{aligned} M(x_2) = P(x_2)^{-1} C(x_2) P(x_2), \end{aligned}$$

and obtain that

$$\begin{aligned} \check{p}_{x_2}(t) = \det (tI_d - M(x_2)). \end{aligned}$$

As in the first proof of the theorem, (4.14) and (4.15) imply that \(M(x_2) = M(x_2)^*\), \(x_2\in {\mathbb R}\), and that the rational matrix function \(M(x_2)\) is regular at a zero \(a\) of \(P(x_2)\), so that it is in fact a matrix polynomial.Footnote 5 It follows from Lemma 4.3 that \(M\) is linear:

$$\begin{aligned} M(x_2) = -A_1 - A_2x_2, \end{aligned}$$

where \(A_1\) and \(A_2\) are \(d\times d\) Hermitian matrices, and then

$$\begin{aligned} p(x_1,x_2)= x_1^d \check{p}_{\frac{x_2}{x_1}}\Big (\frac{1}{x_1}\Big )= \det (I_d + x_1A_1+x_2A_2). \end{aligned}$$

\(\square \)

This second proof of Theorem 4.1 is, of course, also constructive once we choose a strictly interlacing polynomial \(q\), e.g., the one given in (4.10). Choosing \(q\) as in (4.10) with \((x_1^0,x_2^0)=(0,0)\) in Example 4.4, we find that

$$\begin{aligned} B(y) = \left[ \begin{array}{c@{\quad }c} 18+84y+102y^2 &{} 4+10y \\ 4+10y &{} 2 \end{array}\right] , \end{aligned}$$

and we ultimately obtain the same representation (4.8).

We also note that the algebro-geometric proof of Theorem 4.1 given in Vinnikov (2012) and in Plaumann and Vinzant (2013) likewise uses an interlacing polynomial \(q\), and yields a determinantal representation with

$$\begin{aligned} q(x_1,x_2) = c\, \mathrm{adj }(I+x_1A_1+x_2A_2) \, c^*, \end{aligned}$$

where \(\mathrm{adj }\) denotes the classical adjoint or adjugate matrix (the transpose of the matrix of cofactors) and \(c \in {\mathbb C}^{1 \times d}\). It would be interesting to see whether this relation holds for the determinantal representation constructed in the second proof of Theorem 4.1 above (meaning that the two constructions are essentially equivalent, despite using quite different methods).

Remark 4.5

Note that for \(d=2\) we can always convert a representation \(p(x_1,x_2)=\det (I_2+ x_1A_1 + x_2A_2)\) with \(A_1\) and \(A_2\) Hermitian, to one with real symmetric \(A_1\) and \(A_2\). Indeed, write \(A_1=UDU^*\), with \(U\) unitary and \(D\) diagonal, and consider \(U^*A_2U\), which has a complex \((1,2)\) entry with, say, argument \(\theta \). Then letting \(V=\small \left[ \begin{array}{c@{\quad }c} 1 &{}0 \\ 0 &{} e^{i\theta } \end{array}\right] \) and \(\hat{A}_1 = D = VDV^*, \hat{A}_2 = VU^*A_2UV^* \in {\mathbb R}^{2\times 2}\), we obtain \(p(x_1,x_2)=\det (I_2+ x_1\hat{A}_1 + x_2\hat{A}_2)\), as desired.
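A numerical sketch (ours) of this conversion, applied to the Hermitian pair of (4.8), may look as follows.

```python
# Numerical sketch (ours) of Remark 4.5 on the Hermitian pair of (4.8).
import numpy as np

r5 = np.sqrt(5.0)
A1 = np.array([[2.0, -r5], [-r5, 2.0]], dtype=complex)
A2 = np.array([[5.0, (r5/5)*(-11 + 3j)], [(r5/5)*(-11 - 3j), 5.0]])

# A1 = U D U^* with U unitary and D diagonal
Dvals, U = np.linalg.eigh(A1)
D = np.diag(Dvals)

X = U.conj().T @ A2 @ U
theta = np.angle(X[0, 1])                  # argument of the (1,2) entry
V = np.diag([1.0, np.exp(1j*theta)])

A1h = D                                    # = V D V^*
A2h = V @ X @ V.conj().T
assert np.allclose(A2h.imag, 0, atol=1e-12)   # real symmetric up to rounding

# spot check: the new pair still represents p
rng = np.random.default_rng(0)
for _ in range(5):
    u, v = rng.uniform(-1, 1, size=2)
    pval = 1 + 4*u + 10*v - u**2 - 2*u*v - v**2
    det = np.linalg.det(np.eye(2) + u*A1h + v*A2h)
    assert np.isclose(det.real, pval, atol=1e-10)
```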

Remark 4.6

(See (Ramana and Goldman (1995), Section 1.4) and (Netzer and Thom (2012), Lemma 2.14)) Starting from a representation as in Theorem 4.1, we may represent \(p(x_1,x_2)^2\) as

$$\begin{aligned} p(x_1,x_2)^2=\det (I_{2d} + x_1 \alpha _1 + x_2 \alpha _2 ), \end{aligned}$$
(4.16)

where \(\alpha _1=\alpha _1^\top , \alpha _2=\alpha _2^\top \in {\mathbb R}^{2d \times 2d}\). Indeed, with \(A_1\) and \(A_2\) as in Theorem 4.1, we write

$$\begin{aligned} A_1=A_{1R} + i A_{1I},\quad A_2 = A_{2R} + i A_{2I}, \end{aligned}$$

where \(A_{1R}, A_{2R}, A_{1I}, A_{2I} \in {\mathbb R}^{d\times d}.\) It is easy to check that since \(A_1\) and \(A_2\) are Hermitian, \(A_{1R}\), \(A_{2R}\) are symmetric and \(A_{1I}\), \(A_{2I}\) are skew-symmetric. Let now

$$\begin{aligned} \alpha _1 = \left[ \begin{array}{c@{\quad }c} A_{1R} &{} A_{1I}\\ -A_{1I} &{} A_{1R} \end{array}\right] ,\qquad \alpha _2 = \left[ \begin{array}{c@{\quad }c} A_{2R} &{} A_{2I}\\ -A_{2I} &{} A_{2R} \end{array}\right] , \end{aligned}$$

and (4.16) follows. Indeed, using

$$\begin{aligned} U=\frac{1}{\sqrt{2}}\left[ \begin{array}{c@{\quad }c} I_d &{} I_d\\ iI_d &{} -i I_d \end{array}\right] , \end{aligned}$$

it is easy to check that

$$\begin{aligned} U \left[ \begin{array}{c@{\quad }c} A_1 &{} 0 \\ 0 &{} A_1^\top \end{array}\right] U^* = \alpha _1,\qquad U \left[ \begin{array}{c@{\quad }c} A_2 &{} 0\\ 0 &{} A_2^\top \end{array}\right] U^* = \alpha _2, \end{aligned}$$

so that \(\det (I_{2d} + x_1 \alpha _1 + x_2 \alpha _2 ) = \det (I_d + x_1A_1+x_2A_2)\det (I_d + x_1A_1^\top +x_2A_2^\top ) = p(x_1,x_2)^2\), since a matrix and its transpose have the same determinant.
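The doubling is immediate to implement. The following numpy sketch (ours) applies it to the Hermitian pair of (4.8) and spot-checks (4.16).

```python
# Numerical sketch (ours) of Remark 4.6: doubling the Hermitian pair of (4.8)
# into a real symmetric pair representing p^2.
import numpy as np

r5 = np.sqrt(5.0)
A1 = np.array([[2.0, -r5], [-r5, 2.0]], dtype=complex)
A2 = np.array([[5.0, (r5/5)*(-11 + 3j)], [(r5/5)*(-11 - 3j), 5.0]])

def double(A):
    """Return [[A_R, A_I], [-A_I, A_R]] for A = A_R + i*A_I."""
    return np.block([[A.real, A.imag], [-A.imag, A.real]])

a1, a2 = double(A1), double(A2)
assert np.allclose(a1, a1.T) and np.allclose(a2, a2.T)  # real symmetric

rng = np.random.default_rng(1)
for _ in range(5):
    u, v = rng.uniform(-1, 1, size=2)
    pval = 1 + 4*u + 10*v - u**2 - 2*u*v - v**2
    det = np.linalg.det(np.eye(4) + u*a1 + v*a2)
    assert np.isclose(det, pval**2, atol=1e-10)         # (4.16)
```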

5 Conclusions

The paper gives a detailed analysis of certain determinantal representations of bivariate polynomials and can be divided into two parts, the relation between which is of special interest.

The first and main part deals with polynomials \(p(z_1, z_2)\) stable with respect to the unit bidisk and normalized by \(p(0, 0)=1\). For every such polynomial of bidegree \((n_1, n_2)\) we construct a determinantal representation of the form

$$\begin{aligned} p(z_1,z_2)=\det (I - K Z ), \end{aligned}$$

where \(Z\) is an \((n_1+n_2)\times (n_1+n_2)\) diagonal matrix with the coordinate variables \(z_1\), \(z_2\) on the diagonal and \(K\) is a contraction. This representation is an important certificate of stability; it was first established by Kummert, see (Kummert (1989), Theorem 1), in the case of scattering Schur polynomials. Our method employs a polynomial factorization technique for an appropriately formed stable Bezoutian matrix in one variable. We pay separate attention to the case of self-reversive polynomials; such polynomials are parameterized by unitary representation matrices \(K\). A constructive approach to building a unitary representation matrix is outlined and illustrated.

The second part of the paper deals with real-zero polynomials and Hermitian determinantal representations. It was shown in Hanselka (in preparation, Theorem 2.2) that every real-zero polynomial \(p\) with \(p(0,0)=1\) may be represented as

$$\begin{aligned} p(x_1,x_2)=\det (I+ x_1A_1 + x_2A_2), \end{aligned}$$

where \(A_1, A_2 \in {\mathbb R}^{d\times d}\) are symmetric matrices and \(d\) is the total degree of \(p\). In the homogeneous setting of hyperbolic polynomials this statement was known as the Lax conjecture; see Lewis et al. (2005). A somewhat weaker statement, namely the existence of the preceding representation where now \(A_1, A_2 \in {\mathbb C}^{d\times d}\) are Hermitian matrices, has been established recently in (Speyer (2005), Section 5) and Plaumann et al. (2011). We provide a new approach to this problem based on the factorization of matrix-valued polynomials, parallel to the one employed in the first part of the paper. Two closely related proofs of the Hermitian case are given. Each is constructive and can be numerically implemented using a semidefinite programming package.