1 Introduction

Let \({\mathcal {S}}\) denote the Schur class of complex-valued functions analytic on the open unit disk \({{\mathbb {D}}}\subset {\mathbb {C}}\) and mapping \({{\mathbb {D}}}\) into its closure, i.e., the closed unit ball of the Hardy space \(H^\infty ({{\mathbb {D}}})\). The classical Carathéodory–Schur problem (CSP) [11, 21, 22] consists of finding a Schur-class function f with prescribed first n Taylor coefficients \(f_0,\ldots ,f_{n-1}\) at the origin. The answer is given in terms of the matrices

$$\begin{aligned} {\mathfrak {S}}_n^f=I_n-\textbf{T}_n^f\textbf{T}_n^{f*},\quad \text{ where }\quad \textbf{T}_n^f=\left[ \begin{array}{cccc}f_{0} &{} 0 &{} \ldots &{} 0 \\ f_{1}&{} f_{0} &{} \ddots &{} \vdots \\ \vdots &{} \ddots &{} \ddots &{} 0 \\ f_{n-1}&{} \ldots &{} f_{1} &{} f_{0}\end{array}\right] , \end{aligned}$$
(1.1)

constructed from the given coefficients. Namely, the problem has a solution if and only if \({\mathfrak {S}}_n^f\) is positive semidefinite (i.e., the matrix \(\textbf{T}_n^f\) is contractive). Moreover, if \({\mathfrak {S}}_n^f\) is positive definite, the problem has infinitely many solutions that can be parametrized by a linear fractional formula. If \({\mathfrak {S}}_n^f\) is singular, then the problem has a unique solution, which necessarily is a Blaschke product of degree \(\deg f=\textrm{rank} \, {\mathfrak {S}}_n^f\). As a consequence of this fact, it was shown in [11] (with further elaboration in [23]) that the set of all functions analytic on \({\mathbb {D}}\) with the first n Taylor coefficients fixed contains a unique element (a scalar multiple of a finite Blaschke product) with the minimal possible \(H^\infty \)-norm. Another consequence is the Carathéodory approximation theorem asserting that any Schur-class function can be uniformly approximated on any compact subset of \({{\mathbb {D}}}\) by finite Blaschke products. Yet another consequence is the identification of Schur-class functions as power series f with Toeplitz matrices \(\textbf{T}_n^f\) contractive for all \(n\ge 1\) and, more specifically, of Blaschke products of degree k as power series for which the matrix \({\mathfrak {S}}_n^f\) is positive semidefinite and has rank equal to \(\min (n,k)\) for all \(n\ge 1\).
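In the complex special case, the solvability criterion above is easy to check numerically. The sketch below (the helper `schur_matrix` and the sample data are ours, not from the paper) builds \({\mathfrak {S}}_n^f\) from the Taylor coefficients of a degree-one Blaschke factor and confirms that it is positive semidefinite with rank \(\min (n,1)=1\):

```python
import numpy as np

def schur_matrix(coeffs):
    """S_n^f = I_n - T_n^f (T_n^f)^* built from f_0, ..., f_{n-1}, as in (1.1)."""
    n = len(coeffs)
    T = np.zeros((n, n), dtype=complex)
    for i in range(n):
        T[i, :i + 1] = coeffs[i::-1]
    return np.eye(n) - T @ T.conj().T

# Taylor coefficients at 0 of the degree-one Blaschke factor (z+a)/(1+conj(a)z):
a = 0.4 - 0.2j
f = [a, 1 - abs(a) ** 2, -(1 - abs(a) ** 2) * np.conj(a)]

eigs = np.linalg.eigvalsh(schur_matrix(f))
print(np.all(eigs > -1e-9))      # True: the CSP with this data is solvable
print(int(np.sum(eigs > 1e-9)))  # 1 = min(n, deg f), as stated above
```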

It turns out that the CSP is equivalent to the following structured positive semidefinite matrix extension problem: given \(f_0,\ldots ,f_{n-1}\in {{\mathbb {C}}}\), find an extension \(\{f_j\}_{j\ge 0}\) so that \({\mathfrak {S}}_k^f\succeq 0\) for all \(k\ge 0\). The latter problem admits a fairly straightforward quaternion analog, with \(f_0,\ldots ,f_{n-1}\in {\mathbb {H}}\) and appropriately defined matrix positivity. This problem will be settled in Sect. 3. The Schur-complement argument used there will allow us to extend a given invertible matrix \({\mathfrak {S}}_n^f\) (not necessarily positive semidefinite) in such a way that the number of negative eigenvalues of \({\mathfrak {S}}_m^f\) is the same as that of \({\mathfrak {S}}_n^f\) for all \(m>n\). In Sect. 4, using the power-series characterization of quaternion Schur-class functions and finite Blaschke products, we will arrive at the analytic version of the quaternion Carathéodory–Schur interpolation problem and establish quaternion analogs of the Carathéodory–Fejér theorem on the minimal-norm solution and of the Carathéodory approximation theorem.

The CSP originally appeared in [8, 9, 26] in the setting of analytic functions with positive real part in \({{\mathbb {D}}}\) (that is, in the Carathéodory class \({\mathcal {C}}\)). The quaternionic counterpart of this class will be recalled in Sect. 5 and the interpolation results will be translated from the Schur-class setting using the Cayley transform.
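The Cayley-transform connection can be previewed numerically in the complex case: if f is in the Schur class, then \(c=(1+f)/(1-f)\) has nonnegative real part, since \(\Re c=(1-|f|^2)/|1-f|^2\). A quick sketch (this particular normalization of the transform is our assumption; the paper fixes its version in Sect. 5):

```python
import numpy as np

rng = np.random.default_rng(0)
a = 0.4 - 0.2j

def f(z):                        # a Schur-class function (Blaschke factor)
    return (z + a) / (1 + np.conj(a) * z)

# 500 sample points in the open unit disk:
r = 0.95 * np.sqrt(rng.uniform(0, 1, 500))
z = r * np.exp(2j * np.pi * rng.uniform(0, 1, 500))

c = (1 + f(z)) / (1 - f(z))      # Cayley transform of f
print(np.all(c.real >= -1e-12))  # True: c has nonnegative real part
```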

In Sect. 6, we use the indefinite results from Sect. 3 to handle the CSP in the class of generalized Schur power series (those for which the associated matrix \({\mathfrak {S}}_n^f\) has \(\kappa \) negative squares for all n large enough). As in the complex setting [18], the solution set of the problem is parametrized by a linear fractional formula with a Schur-class parameter. Due to possible zero cancellation (which does not occur in the definite case), some parameters give rise to only partial solutions of the problem. These special “excluded” parameters are classified in Sect. 6.3, and the corresponding quasi-solutions are studied in Sect. 6.4.

2 Preliminaries

We denote by \({\mathbb {H}}\) the skew field of quaternions

$$\begin{aligned} \alpha =x_0+\textbf{i}x_1+\textbf{j}x_2+\textbf{k}x_3 \qquad (x_0,x_1,x_2,x_3\in {\mathbb {R}}), \end{aligned}$$
(2.1)

with imaginary units \(\textbf{i}, \textbf{j}, \textbf{k}\) commuting with the reals and subject to the equalities \(\textbf{i}^2=\textbf{j}^2=\textbf{k}^2=\textbf{ijk}=-1\). By \({\mathbb {H}}[[z]]\), we denote the ring of formal power series in one variable z, where z commutes with the quaternion coefficients, with the ring operations given by

$$\begin{aligned} (f+g)(z)=\sum _{k=0}^\infty z^k(f_k+g_k)\quad \text{ and }\quad (fg)(z)=\sum _{k=0}^\infty z^k\bigg (\sum _{\ell =0}^k f_\ell g_{k-\ell }\bigg ). \end{aligned}$$
(2.2)

The real part, the conjugate, and the absolute value of \(\alpha \in {\mathbb {H}}\) of the form (2.1) are defined by

$$\begin{aligned} \Re \alpha =x_0,\quad {\overline{\alpha }}=x_0-\textbf{i}x_1-\textbf{j}x_2-\textbf{k}x_3,\quad |\alpha |=\sqrt{\alpha {\overline{\alpha }}}=\sqrt{x_0^2+x_1^2+x_2^2+x_3^2}. \end{aligned}$$

For any non-real element \(\alpha \in {\mathbb {H}}\), its minimal central (real) polynomial equals

$$\begin{aligned} \varvec{\mu }_\alpha (z)=z^2-2z\Re \alpha +|\alpha |^2. \end{aligned}$$
(2.3)

Two elements \(\alpha ,\beta \in {\mathbb {H}}\) are called similar (\(\alpha \sim \beta \)) if \(\beta =\gamma \alpha \gamma ^{-1}\) for some non-zero \(\gamma \in {\mathbb {H}}\). We denote by

$$\begin{aligned}{}[\alpha ]:=\{\beta \in {\mathbb {H}}: \; \beta \sim \alpha \}= \{\beta \in {\mathbb {H}}: \; \Re \alpha =\Re \beta \; \; \text{ and } \; \; |\alpha |=|\beta |\} \end{aligned}$$
(2.4)

the similarity class of \(\alpha \). The second characterization in (2.4) follows from (2.3) and the general division-ring fact that \(\alpha \sim \beta \) if and only if \(\varvec{\mu }_\alpha =\varvec{\mu }_\beta \); it interprets \([\alpha ]\) as a 2-sphere of radius \(\sqrt{|\alpha |^2-(\Re \alpha )^2}\) centered at \(\Re \alpha \).

Another object associated with a non-real \(\alpha \in {\mathbb {H}}\) is its centralizer

$$\begin{aligned} {{\mathbb {C}}}_\alpha :=\{\beta \in {\mathbb {H}}: \, \alpha \beta =\beta \alpha \}=\textbf{span}_{{\mathbb {R}}}(1, \, \alpha ) \end{aligned}$$
(2.5)

which can be interpreted as the two-dimensional real subspace of \({\mathbb {H}}\) spanned by 1 and \(\alpha \). For \(\alpha \in {{\mathbb {R}}}\), (2.4) and (2.5) amount to \([\alpha ]=\{\alpha \}\) and \({{\mathbb {C}}}_\alpha ={\mathbb {H}}\).
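These algebraic facts can be verified directly from the Hamilton product. A minimal sketch (helper names are ours), checking that conjugation by a non-zero \(\gamma \) preserves the real part and the absolute value, in agreement with (2.4):

```python
def qmul(a, b):
    """Hamilton product of quaternions stored as 4-tuples (x0, x1, x2, x3)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(a):                    # quaternion conjugate
    return (a[0], -a[1], -a[2], -a[3])

def qabs2(a):                    # |alpha|^2
    return sum(x * x for x in a)

def qinv(a):                     # alpha^{-1} = conj(alpha) / |alpha|^2
    n2 = qabs2(a)
    return tuple(x / n2 for x in qconj(a))

alpha = (1.0, 2.0, -0.5, 0.3)
gamma = (0.7, -1.1, 0.2, 2.5)    # any non-zero quaternion
beta = qmul(qmul(gamma, alpha), qinv(gamma))   # beta ~ alpha

# beta lies on the 2-sphere (2.4): same real part, same absolute value
print(abs(beta[0] - alpha[0]) < 1e-12, abs(qabs2(beta) - qabs2(alpha)) < 1e-12)
```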

2.1 Matrices over \({\mathbb {H}}\)

We denote by \({\mathbb {H}}^{n\times m}\) the set of \(n\times m\) matrices with quaternion entries, shortening this notation to \({\mathbb {H}}^n\) in case \(m=1\). An element \(\alpha \in {{\mathbb {H}}}\) is called a right eigenvalue of a matrix \(A\in {\mathbb {H}}^{n\times n}\) if \(A\textbf{x}=\textbf{x}\alpha \) for some non-zero \(\textbf{x}\in {{\mathbb {H}}}^{n}\). In this case, for any \(\beta =h^{-1}\alpha h\in [\alpha ]\), we also have \(A\textbf{x}h= \textbf{x}hh^{-1}\alpha h=\textbf{x}h\beta \), and hence, any element in the similarity class \([\alpha ]\) is a right eigenvalue of A. Therefore, the right spectrum \(\sigma _\textbf{r}(A)\) of A is the union of disjoint similarity classes (some of which may be real singletons).

Given a matrix \(A=[a_{ij}]\), its adjoint (conjugate transpose) \(A^*\) is defined as \(A^*=[{\overline{a}}_{ji}]\). If A is Hermitian (i.e., \(A=A^*\)), all its right eigenvalues are real; if they are all nonnegative, the matrix A is called positive semidefinite; in notation, \(A\succeq 0\). It turns out that \(A\succeq 0\) if and only if \(\textbf{x}^*A\textbf{x}\ge 0\) for all \(\textbf{x}\in {\mathbb {H}}^n\). If all eigenvalues are positive (equivalently, \(\textbf{x}^*A\textbf{x}>0\) for all non-zero \(\textbf{x}\in {\mathbb {H}}^n\)), we say that A is positive definite and write \(A\succ 0\). We will write \(\nu _{\pm }(A)\) to denote the number of positive/negative eigenvalues (counted with multiplicities) of a Hermitian matrix A. For any matrix T, the nonnegative square roots of the right eigenvalues of \(TT^*\succeq 0\) are called the singular values of T.
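One convenient way to compute with quaternion matrices is to write each entry as \(a+b\textbf{j}\) with a, b complex and pass to a \(2n\times 2n\) complex matrix; a Hermitian quaternion matrix then corresponds to a Hermitian complex matrix whose (real) eigenvalues occur with even multiplicities. A sketch, assuming this standard encoding (the helper `chi` is ours):

```python
import numpy as np

def chi(A1, A2):
    """2n x 2n complex representation of the quaternion matrix A1 + A2*j,
    whose entries are written as a + b*j with a, b complex."""
    return np.block([[A1, A2], [-A2.conj(), A1.conj()]])

rng = np.random.default_rng(1)
n = 4
M1 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
M2 = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A1 = (M1 + M1.conj().T) / 2     # A1 Hermitian and
A2 = (M2 - M2.T) / 2            # A2 antisymmetric make A1 + A2*j Hermitian over H

X = chi(A1, A2)
eigs = np.linalg.eigvalsh(X)    # chi of a Hermitian matrix is complex Hermitian
print(np.allclose(eigs[0::2], eigs[1::2]))  # True: real eigenvalues, in pairs
```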

To deal with Hermitian structured extensions, we will need the Cauchy interlacing theorem for quaternionic Hermitian matrices, which can be proven either by suitably modified complex-setting arguments avoiding determinants [25, Lecture 8] or by applying the classical result to the \(2n\times 2n\) complex representation of a given \(n\times n\) quaternion matrix [24].

Theorem 2.1

( [24, 25]) If \(A\in {\mathbb {H}}^{n\times n}\) is a Hermitian matrix with eigenvalues \(\lambda _1\le \ldots \le \lambda _n\), and \(B\in \mathbb {H}^{m\times m}\) is a principal submatrix of A with eigenvalues \(\mu _1\le \ldots \le \mu _m\), then \(\lambda _k\le \mu _k\le \lambda _{k+n-m}\) for \(k=1,\ldots ,m\).
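Theorem 2.1 can be sanity-checked numerically; complex Hermitian matrices are a special case of quaternionic ones, so the sketch below (random data, our variable names) verifies the interlacing inequalities for a leading principal submatrix:

```python
import numpy as np

rng = np.random.default_rng(2)
n, m = 6, 4
M = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
A = (M + M.conj().T) / 2        # Hermitian (complex, a special case of H)
B = A[:m, :m]                   # leading principal m x m submatrix

lam = np.linalg.eigvalsh(A)     # lambda_1 <= ... <= lambda_n (ascending)
mu = np.linalg.eigvalsh(B)      # mu_1 <= ... <= mu_m
ok = all(lam[t] <= mu[t] + 1e-12 and mu[t] <= lam[t + n - m] + 1e-12
         for t in range(m))
print(ok)  # True: lambda_k <= mu_k <= lambda_{k+n-m}
```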

In what follows, we write \(I_n\) for the \(n\times n\) identity matrix and we will make use of \(Z_n\in {{\mathbb {R}}}^{n\times n}\) and \(\textbf{e}_n,\widetilde{\textbf{e}}_n\in {{\mathbb {R}}}^n\) given by

$$\begin{aligned} Z_n=\left[ \begin{array}{cccc}0 &{} 0 &{} \ldots &{} 0 \\ 1&{} 0 &{} \ddots &{} \vdots \\ \vdots &{} \ddots &{} \ddots &{} 0 \\ 0&{} \ldots &{} 1 &{} 0\end{array}\right] , \quad \textbf{e}_n=\begin{bmatrix}1 \\ 0 \\ \vdots \\ 0\end{bmatrix},\quad \widetilde{\textbf{e}}_n=\begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1\end{bmatrix}, \end{aligned}$$
(2.6)

dropping the subscript n if the dimension is clear from the context.

2.2 Stein equations and Schur complements

Given a matrix \(J\in {\mathbb {H}}^{m\times m}\) such that \(J=J^*=J^{-1}\) and a sequence \(c_j\in {\mathbb {H}}^{1\times m}\) (\(j\ge 0\)), for each fixed \(n\ge 0\), let us denote by \(P_n\in {\mathbb {H}}^{n\times n}\) the unique solution of the Stein equation

$$\begin{aligned} P_n-Z_nP_nZ_n^*=C_nJC_n^*,\quad \text{ where }\quad C_n=\left[ {\begin{matrix} c_0 \\ c_1 \\ \vdots \\ c_{n-1} \end{matrix}}\right] \in {\mathbb {H}}^{n\times m}. \end{aligned}$$
(2.7)

The \(n^2\) entries of \(P_n\) are determined by the mn elements of \(C_n\), and Eq. (2.7) imposes a certain (displacement) structure on \(P_n\) in case \(m\ll n\). The integer m is called the displacement rank of \(P_n\); see [17]. An important fact is that the displacement structure of matrices is inherited by their Schur complements. Note that the matrix \(P_n\) is uniquely recovered from (2.7) by the formula \(P_n={\displaystyle \sum _{j=0}^{n-1}Z_n^jC_nJC_n^*Z_n^{*j}}\), from which we can see that \(P_n\) is the leading principal submatrix of \(P_{n+k}\) for any \(k\ge 1\). Writing \(P_{n+k}\) as

$$\begin{aligned} P_{n+k}=\begin{bmatrix} P_n &{} B_{n,k}^* \\ B_{n,k} &{} D_k\end{bmatrix} \end{aligned}$$
(2.8)

and assuming that \(P_n\) is invertible, we can factor \(P_{n+k}\) as

$$\begin{aligned} P_{n+k}=\begin{bmatrix}I &{} 0 \\ B_{n,k}P_n^{-1} &{} I\end{bmatrix} \begin{bmatrix}P_n &{} 0 \\ 0 &{} \textbf{S}_k\end{bmatrix} \begin{bmatrix}I &{} P_n^{-1}B^*_{n,k} \\ 0 &{} I\end{bmatrix}, \end{aligned}$$
(2.9)

where

$$\begin{aligned} \textbf{S}_k:=D_k-B_{n,k}P_n^{-1}B_{n,k}^*. \end{aligned}$$

The latter matrix is called the Schur complement of the block \(P_n\) in (2.8). It follows from (2.9) that:

$$\begin{aligned} \nu _{\pm }(P_{n+k})=\nu _{\pm }(P_{n})+\nu _{\pm }(\textbf{S}_k)\quad \text{ for } \text{ all }\quad k\ge 1. \end{aligned}$$
(2.10)
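The constructions above lend themselves to a direct numerical check in the complex special case: the sum formula solves (2.7), the matrices \(P_n\) nest, and the inertia of the Schur complement adds up as in (2.10). A sketch (helper names ours):

```python
import numpy as np

def stein_solve(C, J):
    """Unique solution of P - Z P Z* = C J C*, with Z the lower shift (2.6),
    via the sum formula P = sum_j Z^j C J C* Z*^j."""
    n = C.shape[0]
    Z = np.diag(np.ones(n - 1), -1)
    R = C @ J @ C.conj().T
    P, Zj = np.zeros((n, n), dtype=complex), np.eye(n)
    for _ in range(n):
        P += Zj @ R @ Zj.T      # Zj = Z^j is real, so Zj.T = (Z^j)^*
        Zj = Z @ Zj
    return P

def inertia(H, tol=1e-9):
    e = np.linalg.eigvalsh(H)
    return int(np.sum(e > tol)), int(np.sum(e < -tol))

rng = np.random.default_rng(3)
n, k = 4, 3
J = np.diag([1.0, -1.0])
C = rng.standard_normal((n + k, 2)) + 1j * rng.standard_normal((n + k, 2))

P_big, P = stein_solve(C, J), stein_solve(C[:n], J)
print(np.allclose(P_big[:n, :n], P))       # True: P_n nests inside P_{n+k}

B, D = P_big[n:, :n], P_big[n:, n:]
S = D - B @ np.linalg.inv(P) @ B.conj().T  # Schur complement of P_n
nup, num = inertia(P)
sup, sum_ = inertia(S)
print(inertia(P_big) == (nup + sup, num + sum_))  # True: relation (2.10)
```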

We next show that \(\textbf{S}_k\) satisfies the Stein identity similar to (2.7).

Lemma 2.2

The matrix \(\textbf{S}_k\) satisfies the Stein identity

$$\begin{aligned} \textbf{S}_k-Z_k\textbf{S}_kZ_k^*=C^{\varvec{\prime }}_kJC^{\varvec{\prime }*}_k, \end{aligned}$$
(2.11)

where \(C^{\varvec{\prime }}_k\in {\mathbb {H}}^{k\times m}\) is given by

$$\begin{aligned} C^{\varvec{\prime }}_k=\left[ {\begin{matrix} c^{\varvec{\prime }}_0 \\ \vdots \\ c^{\varvec{\prime }}_{k-1} \end{matrix}}\right] = (I-Z_k)\begin{bmatrix}-B_{n,k}P_n^{-1}&I_k\end{bmatrix}(I-Z_{n+k})^{-1}C_{n+k}. \end{aligned}$$
(2.12)

Furthermore, the top row \(c^{\varvec{\prime }}_0\) in \(C^{\varvec{\prime }}_k\) is non-zero.

Proof

We start with the Stein identity

$$\begin{aligned} P_{n+k}-Z_{n+k}P_{n+k}Z_{n+k}^*=C_{n+k}JC_{n+k}^* \end{aligned}$$
(2.13)

(the same as (2.7) but with n replaced by \(n+k\)), from which it follows that:

$$\begin{aligned}&(I-Z_{n+k})^{-1}C_{n+k}JC_{n+k}^*(I-Z^*_{n+k})^{-1}\nonumber \\&=(I-Z_{n+k})^{-1}\left( P_{n+k}-Z_{n+k}P_{n+k}Z_{n+k}^*\right) (I-Z^*_{n+k})^{-1}\nonumber \\&=(I-Z_{n+k})^{-1}P_{n+k}(I-Z^*_{n+k})^{-1}\nonumber \\&\quad -\left( (I-Z_{n+k})^{-1}-I\right) P_{n+k}\left( (I-Z^*_{n+k})^{-1}-I\right) \nonumber \\&=(I-Z_{n+k})^{-1}P_{n+k}+P_{n+k}(I-Z^*_{n+k})^{-1}-P_{n+k}. \end{aligned}$$
(2.14)

Making use of the latter equality and (2.12) gives

$$\begin{aligned} C^{\varvec{\prime }}_kJC^{\varvec{\prime }*}_k&= (I-Z_k)\begin{bmatrix}-B_{n,k}P_n^{-1}&I_k\end{bmatrix} (I-Z_{n+k})^{-1}C_{n+k}JC_{n+k}^*\nonumber \\&\quad \times (I-Z_{n+k}^*)^{-1} \begin{bmatrix}-P_n^{-1}B_{n,k}^*\\ I_k\end{bmatrix}(I-Z^*_k)\nonumber \\&=(I-Z_k)\begin{bmatrix}-B_{n,k}P_n^{-1}&I_k\end{bmatrix}\big ( (I-Z_{n+k})^{-1}P_{n+k}\nonumber \\&\quad +P_{n+k}(I-Z^*_{n+k})^{-1}-P_{n+k}\big ) \begin{bmatrix}-P_n^{-1}B_{n,k}^*\\ I_k\end{bmatrix}(I-Z^*_k). \end{aligned}$$
(2.15)

Due to partition (2.9), we have

$$\begin{aligned}&\begin{bmatrix}-B_{n,k}P_n^{-1}&I_k\end{bmatrix}P_{n+k}\begin{bmatrix}-P_n^{-1}B_{n,k}^*\\ I_k\end{bmatrix} =D_k-B_{n,k}P_n^{-1}B_{n,k}^*=\textbf{S}_k,\\&\begin{bmatrix}-B_{n,k}P_n^{-1}&I_k\end{bmatrix}(I-Z_{n+k})^{-1}P_{n+k} \begin{bmatrix}-P_n^{-1}B_{n,k}^*\\ I_k\end{bmatrix}= (I-Z_k)^{-1}{} \textbf{S}_k. \end{aligned}$$

Substituting the latter equalities into (2.15), we get

$$\begin{aligned} C^{\varvec{\prime }}_kJC_k^{\varvec{\prime }*}&=(I-Z_k)\left( (I-Z_k)^{-1}{} \textbf{S}_k +\textbf{S}_k(I-Z^*_k)^{-1}-\textbf{S}_k\right) (I-Z^*_k)\\&=\textbf{S}_k-Z_k\textbf{S}_kZ_k^*, \end{aligned}$$

thus verifying (2.11). To prove the last statement, let us assume, arguing by contradiction, that \(c^{\varvec{\prime }}_0=0\). Upon letting \(k=1\) in (2.12), we then have

$$\begin{aligned} c^{\varvec{\prime }}_0=\begin{bmatrix}-B_{n,1}P_n^{-1}&1\end{bmatrix}(I-Z_{n+1})^{-1}C_{n+1}=0, \end{aligned}$$

and therefore

$$\begin{aligned} 0=c^{\varvec{\prime }}_0JC_n^* =\begin{bmatrix}-B_{n,1}P_n^{-1}&1\end{bmatrix}(I-Z_{n+1})^{-1}C_{n+1}JC_n^*. \end{aligned}$$
(2.16)

By the Stein identity (2.13) (with \(k=1\))

$$\begin{aligned} (I-Z_{n+1})^{-1}C_{n+1}JC_n^*&=(I-Z_{n+1})^{-1}\left( \begin{bmatrix}P_{n} \\ B_{n,1}\end{bmatrix} -Z_{n+1}\begin{bmatrix}P_{n} \\ B_{n,1}\end{bmatrix}Z_n^*\right) \\&=\begin{bmatrix}P_{n} \\ B_{n,1}\end{bmatrix}Z_n^*+(I-Z_{n+1})^{-1}\begin{bmatrix}P_{n} \\ B_{n,1}\end{bmatrix}(I-Z_n^*). \end{aligned}$$

Substituting the latter equality into (2.16) gives

$$\begin{aligned} 0=\begin{bmatrix}-B_{n,1}P_n^{-1}&1\end{bmatrix}(I-Z_{n+1})^{-1}\begin{bmatrix}P_{n} \\ B_{n,1}\end{bmatrix}(I-Z_n^*). \end{aligned}$$

Multiplying both sides by \((I-Z_n^*)^{-1}P_n^{-1}(I-Z_n)\) on the right gives

$$\begin{aligned} 0&=\begin{bmatrix}-B_{n,1}P_n^{-1}&1\end{bmatrix}(I-Z_{n+1})^{-1}\begin{bmatrix}I_n \\ B_{n,1}P_n^{-1}\end{bmatrix}(I-Z_n)\nonumber \\&=\begin{bmatrix}-B_{n,1}P_n^{-1}&1\end{bmatrix}\begin{bmatrix}(I-Z_n)^{-1}&{} 0 \\ \widetilde{\textbf{e}}_n^*(I-Z_n)^{-1} &{} 1 \end{bmatrix}\begin{bmatrix}I_n \\ B_{n,1}P_n^{-1}\end{bmatrix}(I-Z_n)\nonumber \\&=\widetilde{\textbf{e}}_n^*-B_{n,1}P_n^{-1}Z_n. \end{aligned}$$
(2.17)

By (2.6), the rightmost entry of the row-vector \(\widetilde{\textbf{e}}_n^*-B_{n,1}P_n^{-1}Z_n\) equals one, which contradicts (2.17), thus completing the proof. \(\square \)

If \(c_0\ne 0\) and \(J=\pm I_m\), then the matrix \(P_n\) defined by Eq. (2.7) is positive or negative definite. Otherwise (i.e., if \(\sigma (J)=\{\pm 1\}\)), \(P_n\) may have positive, negative, and zero eigenvalues. In this case, determining the inertia of \(P_n\) in terms of \(C_n\) is a non-trivial question. Since the family \(\{P_n\}_{n\ge 1}\) is nested in the sense of (2.8), it follows by Theorem 2.1 that \(\nu _\pm (P_{n+k})\ge \nu _\pm (P_n)\) for all \(n,k\ge 1\); a follow-up question is then to extend a given \(P_n\) to \(P_{n+k}\) (by an appropriate choice of \(c_n,\ldots ,c_{n+k-1}\)) with the minimal possible negative (or positive) inertia. Below, we will examine both questions for two particular choices of \(C_n\) and J in (2.7), leading to the Hermitian matrices \({\mathfrak {S}}_n\) and \({\mathfrak {C}}_n\) defined in (2.19).

2.3 Two particular cases

With any power series \(f\in {\mathbb {H}}[[z]]\) (or a quaternion sequence \(\{f_j\}_{j\ge 0}\)), we associate lower triangular Toeplitz matrices \(\textbf{T}_n^f\) by the rule

$$\begin{aligned} \textbf{T}_n^f=\left[ \begin{array}{cccc}f_{0} &{} 0 &{} \ldots &{} 0 \\ f_{1}&{} f_{0} &{} \ddots &{} \vdots \\ \vdots &{} \ddots &{} \ddots &{} 0 \\ f_{n-1}&{} \ldots &{} f_{1} &{} f_{0}\end{array}\right] \quad \text{ if }\quad f(z)=\sum _{k=0}^\infty f_kz^k, \end{aligned}$$
(2.18)

and subsequently, Hermitian matrices

$$\begin{aligned} {\mathfrak {S}}_n^f=I_n-\textbf{T}_n^f\textbf{T}_n^{f*}\quad \text{ and }\quad {\mathfrak {C}}_n^f=\textbf{T}_n^f+\textbf{T}_n^{f*} \end{aligned}$$
(2.19)

for all \(n\ge 1\). Let us note that every Hermitian Toeplitz matrix arises as \({\mathfrak {C}}_n^f\) for a suitable f (take \(f_0\) equal to half the diagonal entry); in this sense, \({\mathfrak {C}}_n^f\) is a generic Hermitian Toeplitz matrix. Furthermore, the relations

$$\begin{aligned} \textbf{T}_n^{f+g}=\textbf{T}_n^f+\textbf{T}_n^g, \quad \textbf{T}_n^{fg}=\textbf{T}_n^f\textbf{T}_n^g,\quad \textbf{T}_n^{f^{-1}}=(\textbf{T}_n^{f})^{-1} \end{aligned}$$
(2.20)

that hold for all \(f,g\in {\mathbb {H}}[[z]]\) (in the rightmost relation, f needs to be invertible in \({\mathbb {H}}[[z]]\), i.e., \(f_0\ne 0\)) follow immediately from (2.2) and (2.18).
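In the complex special case, the relations (2.20) express the fact that lower triangular Toeplitz matrices realize truncated power-series arithmetic. A quick numerical check (the helper `toeplitz_lower` is ours):

```python
import numpy as np

def toeplitz_lower(coeffs):
    """T_n^f of (2.18) built from f_0, ..., f_{n-1}."""
    n = len(coeffs)
    T = np.zeros((n, n), dtype=complex)
    for i in range(n):
        T[i, :i + 1] = coeffs[i::-1]
    return T

rng = np.random.default_rng(4)
n = 5
f = rng.standard_normal(n) + 1j * rng.standard_normal(n)
g = rng.standard_normal(n) + 1j * rng.standard_normal(n)
Tf, Tg = toeplitz_lower(f), toeplitz_lower(g)

# Cauchy-product coefficients of fg, as in (2.2):
fg = [sum(f[l] * g[m - l] for l in range(m + 1)) for m in range(n)]

print(np.allclose(toeplitz_lower(f + g), Tf + Tg))   # T^{f+g} = T^f + T^g
print(np.allclose(toeplitz_lower(fg), Tf @ Tg))      # T^{fg}  = T^f T^g
Tinv = np.linalg.inv(Tf)                             # f_0 != 0 almost surely
print(np.allclose(Tinv, toeplitz_lower(Tinv[:, 0]))) # inverse is again of type (2.18)
```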

Let us also note that \({\mathfrak {S}}_n={\mathfrak {S}}_n^f\) and \({\mathfrak {C}}_n={\mathfrak {C}}_n^f\) are unique solutions to the respective Stein equations

$$\begin{aligned} {\mathfrak {S}}_n-Z_n{\mathfrak {S}}_nZ_n^*&=\textbf{e}_n\textbf{e}_n^*-F_nF_n^*=\begin{bmatrix}\textbf{e}_n&F_n\end{bmatrix}\begin{bmatrix} 1 &{} 0 \\ 0 &{}-1\end{bmatrix}\begin{bmatrix}{} \textbf{e}^*_n\\ F^*_n\end{bmatrix}, \end{aligned}$$
(2.21)
$$\begin{aligned} {\mathfrak {C}}_n-Z_n{\mathfrak {C}}_nZ_n^*&=\textbf{e}_nF_n^*+F_n\textbf{e}_n^*=\begin{bmatrix}\textbf{e}_n&F_n\end{bmatrix}\begin{bmatrix} 0 &{} 1 \\ 1 &{}0\end{bmatrix}\begin{bmatrix}{} \textbf{e}^*_n\\ F^*_n\end{bmatrix}, \end{aligned}$$
(2.22)

where \(Z_n\) and \(\textbf{e}_n\) are given in (2.6) and where

$$\begin{aligned} F_n:=\textbf{T}_n^{f}{} \textbf{e}_n=\left[ {\begin{matrix} f_0 \\ \vdots \\ f_{n-1} \end{matrix}}\right] . \end{aligned}$$
(2.23)

Indeed, since \(Z_n\textbf{T}_n^{f}=\textbf{T}_n^{f}Z_n\) and \(I_n-Z_nZ_n^*=\textbf{e}_n\textbf{e}_n^*\), we get (2.21) as follows:

$$\begin{aligned} {\mathfrak {S}}_n-Z_n{\mathfrak {S}}_n Z_n^*&=I_n-\textbf{T}_n^{f}{} \textbf{T}_n^{f*}-Z_n(I_n-\textbf{T}_n^{f}{} \textbf{T}_n^{f*})Z_n^*\\&=I_n-Z_nZ_n^*-\textbf{T}_n^{f}(I_n-Z_nZ_n^*)\textbf{T}_n^{f*}\\&=\textbf{e}_n\textbf{e}_n^*-\textbf{T}_n^{f}{} \textbf{e}_n\textbf{e}_n^*\textbf{T}_n^{f*}=\textbf{e}_n\textbf{e}_n^*-F_nF_n^*. \end{aligned}$$

Equality (2.22) is verified similarly.
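Both Stein identities can be confirmed numerically in the complex special case, with the right-hand side of (2.22) read as \(\textbf{e}_nF_n^*+F_n\textbf{e}_n^*\). A sketch (variable names ours):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 5
f = rng.standard_normal(n) + 1j * rng.standard_normal(n)

T = np.zeros((n, n), dtype=complex)     # T_n^f as in (2.18)
for i in range(n):
    T[i, :i + 1] = f[i::-1]

Z = np.diag(np.ones(n - 1), -1)         # lower shift Z_n of (2.6)
e = np.zeros((n, 1)); e[0, 0] = 1.0     # e_n
F = f.reshape(-1, 1)                    # F_n = T_n^f e_n of (2.23)

S = np.eye(n) - T @ T.conj().T          # S_n^f
C = T + T.conj().T                      # C_n^f
print(np.allclose(S - Z @ S @ Z.T, e @ e.T - F @ F.conj().T))   # (2.21)
print(np.allclose(C - Z @ C @ Z.T, e @ F.conj().T + F @ e.T))   # (2.22)
```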

Remark 2.3

The rightmost expressions in Eqs. (2.21), (2.22) tell us that the latter equations are particular cases of (2.7) with \(m=2\), \(C_n=\begin{bmatrix}{} \textbf{e}_n&F_n\end{bmatrix}\) and \(J=\left[ {\begin{matrix} 1 &{} 0 \\ 0 &{}-1 \end{matrix}}\right] \) in (2.21) and \(J=\left[ {\begin{matrix} 0 &{} 1 \\ 1 &{}0 \end{matrix}}\right] \) in (2.22).

3 Extensions of \({\mathfrak {S}}_n\) and \({\mathfrak {C}}_n\) with minimal negative inertia

In this section, we consider the following structured extension problems: given \(f_0,\ldots ,f_{n-1}\) (i.e., given a matrix \({\mathfrak {S}}_n\) or \({\mathfrak {C}}_n\)), find \(\{f_j\}_{j\ge n}\) so that \(\nu _-({\mathfrak {S}}_{n+k})\) (or \(\nu _-({\mathfrak {C}}_{n+k})\)) is minimal for each fixed \(k\ge 1\). The two settings are equivalent in the sense that the results for one setting translate to the other via the Cayley transform. However, for specific questions, one setting might be more convenient than the other. We start with Hermitian Toeplitz matrices \({\mathfrak {C}}_n\) (\(n\ge 1\)).

3.1 Extensions of singular \({\mathfrak {C}}_n\) with minimal negative inertia

Given a sequence \(\{f_j\}_{j\ge 0}\), let us consider conformal block-decompositions

$$\begin{aligned} \textbf{T}^f_{n+k}=\begin{bmatrix}{} \textbf{T}_{n}^f &{} 0 \\ T_{n,k} &{} \textbf{T}_{k}^f\end{bmatrix}\quad \text{ and }\quad {\mathfrak {C}}_{n+k}:={\mathfrak {C}}^f_{n+k}=\begin{bmatrix}{\mathfrak {C}}_n &{} T_{n,k}^* \\ T_{n,k} &{} {\mathfrak {C}}_k\end{bmatrix} \end{aligned}$$
(3.1)

for all \(k\ge 1\), where

$$\begin{aligned} T_{n,k}=\left[ {\begin{matrix} f_n &{} f_{n-1} &{} \ldots &{} f_1\\ f_{n+1} &{} f_{n} &{}\ldots &{}f_2\\ \vdots &{}\vdots &{}&{} \vdots \\ f_{n+k-1} &{} f_{n+k-2} &{} \ldots &{}f_k \end{matrix}}\right] , \end{aligned}$$
(3.2)

and let us assume that \({\mathfrak {C}}_n\) is invertible. Upon letting

$$\begin{aligned} C_{n+k}=\begin{bmatrix}{} \textbf{e}_{n+k}&F_{n+k}\end{bmatrix}, \quad J=\left[ {\begin{matrix} 0 &{} 1 \\ 1 &{}0 \end{matrix}}\right] ,\quad P_j={\mathfrak {C}}_j,\quad C^{\varvec{\prime }}_k=\begin{bmatrix}X_k&Y_k\end{bmatrix} \end{aligned}$$

in Lemma 2.2, we conclude that \(\textbf{S}_k={\mathfrak {C}}^f_k-T_{n,k}{\mathfrak {C}}_n^{-1}T_{n,k}^*\), the Schur complement of \({\mathfrak {C}}_n\) in \({\mathfrak {C}}_{n+k}\), satisfies the Stein identity

$$\begin{aligned} \textbf{S}_k-Z_k\textbf{S}_kZ_k^*=X_kY_k^*+Y_kX_k^*, \end{aligned}$$
(3.3)

where \(X_k,Y_k\in {\mathbb {H}}^k\) are given by the formula

$$\begin{aligned} \begin{bmatrix}X_k&Y_k\end{bmatrix}&= \left[ {\begin{matrix} x_0 &{} y_0\\ \vdots &{}\vdots \\ x_{k-1} &{} y_{k-1} \end{matrix}}\right] \nonumber \\&=(I-Z_k)\begin{bmatrix}-T_{n,k}{\mathfrak {C}}_n^{-1}&I_k\end{bmatrix}(I-Z_{n+k})^{-1}\begin{bmatrix} \textbf{e}_{n+k}&F_{n+k}\end{bmatrix}, \end{aligned}$$
(3.4)

and furthermore, \(\begin{bmatrix}x_0&y_0\end{bmatrix}\ne 0\). Without loss of generality, we may assume \(x_0\ne 0\). Since the formula (3.4) holds for all \(k\ge 1\), we get two sequences \(\{x_j\}_{j\ge 0}\) and \(\{y_j\}_{j\ge 0}\). Since \(x_0\ne 0\), the matrix \(\textbf{T}_k^{x}\) is invertible for all \(k\ge 1\), so we can introduce the sequence \(\{\varepsilon _j\}_{j\ge 0}\) by the formula

$$\begin{aligned} {\mathcal {E}}_k=\left[ {\begin{matrix} \varepsilon _0 \\ \vdots \\ \varepsilon _{k-1} \end{matrix}}\right] :=(\textbf{T}_k^{x})^{-1}Y_k\quad \text{ for }\quad k\ge 1. \end{aligned}$$
(3.5)

Multiplying both sides of (3.3) by \((\textbf{T}_k^{x})^{-1}\) on the left and by its adjoint on the right and taking into account that \(\textbf{T}_k^{x}\) commutes with \(Z_k\), we arrive at

$$\begin{aligned} (\textbf{T}_k^{x})^{-1}{} \textbf{S}_k(\textbf{T}_k^{x*})^{-1}-Z_k(\textbf{T}_k^{x})^{-1}{} \textbf{S}_k(\textbf{T}_k^{x*})^{-1}Z_k^*= \textbf{e}_k\mathcal {E}_k^*+{\mathcal {E}}_k\textbf{e}^*_k, \end{aligned}$$

which is the Stein equation of the form (2.22) and, hence, has a unique solution \({\mathfrak {C}}_k^\varepsilon \). Thus

$$\begin{aligned} (\textbf{T}_k^{x})^{-1}{} \textbf{S}_k(\textbf{T}_k^{x*})^{-1}={\mathfrak {C}}_k^\varepsilon =\textbf{T}^\varepsilon _k+\textbf{T}^{\varepsilon *}_k. \end{aligned}$$
(3.6)

By the Sylvester law of inertia (see, e.g., [20] for the quaternionic version), \(\nu _{\pm }(\textbf{S}_k)=\nu _{\pm }({\mathfrak {C}}_k^\varepsilon )\). Combining the latter equalities with the general relation (2.10), we conclude

$$\begin{aligned} \nu _\pm ({\mathfrak {C}}_{n+k})=\nu _\pm ({\mathfrak {C}}_n)+\nu _\pm (\textbf{S}_k)=\nu _\pm ({\mathfrak {C}}_n)+\nu _\pm ({\mathfrak {C}}^\varepsilon _k)\quad \text{ for } \text{ all }\quad k\ge 1. \end{aligned}$$
(3.7)
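The chain (3.3)–(3.7) can be tested numerically in the complex special case: starting from coefficients with \({\mathfrak {C}}_n\) invertible, compute the Schur complement and the columns \(X_k, Y_k\) from (3.4), and check the Stein identity (3.3). A sketch (variable names ours; \(f_0\) is taken with large real part so that \({\mathfrak {C}}_n\) is safely invertible):

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 3, 3
N = n + k
f = np.concatenate(([3.0 + 0j],
                    0.3 * (rng.standard_normal(N - 1) + 1j * rng.standard_normal(N - 1))))

def T_low(c):                     # T_m^f of (2.18)
    m = len(c)
    T = np.zeros((m, m), dtype=complex)
    for i in range(m):
        T[i, :i + 1] = c[i::-1]
    return T

def frakC(m):                     # C_m^f = T_m^f + T_m^{f*} of (2.19)
    T = T_low(f[:m])
    return T + T.conj().T

def Z(m):                         # lower shift Z_m of (2.6)
    return np.diag(np.ones(m - 1), -1)

Cn = frakC(n)
Tnk = np.array([[f[n + i - j] for j in range(n)] for i in range(k)])  # block (3.2)
S = frakC(k) - Tnk @ np.linalg.inv(Cn) @ Tnk.conj().T  # Schur complement of C_n

E = np.zeros((N, 1), dtype=complex); E[0, 0] = 1.0     # e_{n+k}
F = f.reshape(-1, 1)                                   # F_{n+k}
XY = ((np.eye(k) - Z(k))
      @ np.hstack([-Tnk @ np.linalg.inv(Cn), np.eye(k)])
      @ np.linalg.inv(np.eye(N) - Z(N))
      @ np.hstack([E, F]))                             # formula (3.4)
X, Y = XY[:, [0]], XY[:, [1]]

lhs = S - Z(k) @ S @ Z(k).T
print(np.allclose(lhs, X @ Y.conj().T + Y @ X.conj().T))  # True: identity (3.3)
```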

Lemma 3.1

Given a sequence \(\{f_j\}_{j\ge 0}\), let us assume that \({\mathfrak {C}}^f_n\) is invertible and \({\mathfrak {C}}^f_{n+1}\) is singular. If \(\nu _-({\mathfrak {C}}^f_{n+k})=\nu _-({\mathfrak {C}}_n^f)\) for some \(k>1\), then

  1. \(\textrm{rank}({\mathfrak {C}}^f_{n+k})=\textrm{rank}({\mathfrak {C}}^f_{n})\).

  2. The elements \(f_{n+1},\ldots ,f_{n+k-1}\) are uniquely determined by \(f_0,\ldots ,f_n\).

Proof

Let \(\{\varepsilon _j\}_{j\ge 0}\) be the sequence constructed from \(\{f_j\}_{j\ge 0}\) as in (3.5). Since \({\mathfrak {C}}^f_{n+1}\) is singular, \(\textbf{S}_1=0\), and hence, \(\varepsilon _0+{\overline{\varepsilon }}_0=0\), by (3.6) (with \(k=1\)). Since \(\nu _-({\mathfrak {C}}^f_{n+k})=\nu _-({\mathfrak {C}}_n^f)\), we have \(\nu _-({\mathfrak {C}}^\varepsilon _k)=0\), so that \({\mathfrak {C}}^\varepsilon _k\) is a positive semidefinite matrix with zero diagonal entries. Therefore, \({\mathfrak {C}}^\varepsilon _k=0\), and hence, \(\nu _+({\mathfrak {C}}^f_{n+k})=\nu _+({\mathfrak {C}}_n^f)\) (by (3.7)), from which part (1) follows.

For part (2), let us decompose \({\mathfrak {C}}_{n+k}:={\mathfrak {C}}^f_{n+k}\) as follows:

$$\begin{aligned} {\mathfrak {C}}_{n+k}=\begin{bmatrix}f_0+f_0^* &{} \textbf{b}^*&{} \textbf{c}^* \\ \textbf{b}&{} {\mathfrak {C}}_n &{} T_{n,k-1}^* \\ \textbf{c} &{}T_{n,k-1}&{} {\mathfrak {C}}_{k-1}\end{bmatrix}, \quad \textbf{b}=\begin{bmatrix}f_1 \\ \vdots \\ f_{n}\end{bmatrix}, \; \; \textbf{c}=\begin{bmatrix}f_{n+1} \\ \vdots \\ f_{n+k-1}\end{bmatrix}. \end{aligned}$$

By part (1), the Schur complement of the block \({\mathfrak {C}}_n\) is the zero matrix

$$\begin{aligned} \begin{bmatrix}f_0+f_0^* &{} \textbf{c}^* \\ \textbf{c} &{} {\mathfrak {C}}_{k-1}\end{bmatrix}-\begin{bmatrix}{} \textbf{b}^* \\ T_{n,k-1}\end{bmatrix}{\mathfrak {C}}_n^{-1} \begin{bmatrix}{} \textbf{b}&T_{n,k-1}^*\end{bmatrix}=0. \end{aligned}$$

In particular, we have \(\textbf{c}=T_{n,k-1}{\mathfrak {C}}_n^{-1}{} \textbf{b}\), which we write entry-wise as

$$\begin{aligned} f_{n+j}=\begin{bmatrix}f_{n+j-1}&f_{n+j-2}&\ldots&f_j\end{bmatrix}{\mathfrak {C}}_n^{-1}\left[ {\begin{matrix} f_1 \\ \vdots \\ f_{n} \end{matrix}}\right] , \; \; j=1,\ldots ,k-1. \end{aligned}$$
(3.8)

The latter formula recursively recovers \(f_{n+1},\ldots ,f_{n+k-1}\) from \(f_0,\ldots ,f_n\). \(\square \)
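For a concrete illustration of the recursion (3.8), one can take trigonometric-moment data \(f_0=c_0/2\), \(f_j=c_j=\sum _l \rho _l e^{\textbf{i}j\theta _l}\) with n unimodular atoms: then \({\mathfrak {C}}_m^f=[c_{i-j}]\) is positive semidefinite of rank \(\min (m,n)\), the hypotheses of Lemma 3.1 hold, and (3.8) must reproduce the higher moments. A sketch (the moment construction is ours, not from the paper):

```python
import numpy as np

n, k = 3, 4
theta = np.array([0.4, 1.7, 2.9])        # n distinct points on the unit circle
rho = np.array([1.0, 0.5, 2.0])          # positive masses
c = lambda j: np.sum(rho * np.exp(1j * j * theta))   # moments c_j

# With f_0 = c_0/2 and f_j = c_j, the matrix C_m^f = [c_{i-j}] is a psd
# moment matrix of rank min(m, n), so Lemma 3.1 applies to this data.
f = [c(0) / 2] + [c(j) for j in range(1, n + 1)]     # f_0, ..., f_n
Cn = np.array([[c(i - j) for j in range(n)] for i in range(n)])
Cn_inv = np.linalg.inv(Cn)
b = np.array(f[1:n + 1]).reshape(-1, 1)              # [f_1; ...; f_n]

# Recursion (3.8): f_{n+j} = [f_{n+j-1}, ..., f_j] Cn^{-1} [f_1; ...; f_n]
for j in range(1, k):
    row = np.array([f[n + j - 1 - t] for t in range(n)]).reshape(1, -1)
    f.append((row @ Cn_inv @ b)[0, 0])

recovered = np.array(f[n + 1:])
expected = np.array([c(j) for j in range(n + 1, n + k)])
print(np.allclose(recovered, expected))  # True: (3.8) reproduces the moments
```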

Remark 3.2

If \({{\widehat{A}}}=\left[ {\begin{matrix} A &{} *\\ *&{}* \end{matrix}}\right] \) is any Hermitian extension of \(A=A^*\) with \(\textrm{rank}({{\widehat{A}}})=\textrm{rank}(A)\), then necessarily \(\nu _{\pm }({{\widehat{A}}})=\nu _{\pm }(A)\).

Proof

By Theorem 2.1, \(\nu _{\pm }({{\widehat{A}}})\ge \nu _{\pm }(A)\). Therefore, the equality

$$\begin{aligned} \nu _+({{\widehat{A}}})+\nu _-({{\widehat{A}}})=\textrm{rank}({{\widehat{A}}})=\textrm{rank}(A)=\nu _+(A)+\nu _-(A) \end{aligned}$$

is possible if and only if \(\nu _{\pm }({{\widehat{A}}})=\nu _{\pm }(A)\). \(\square \)

Lemma 3.3

Given a sequence \(\{f_j\}_{j\ge 0}\), let us assume that

  1. the matrix \({\mathfrak {C}}^f_n\) is invertible,

  2. the matrices \({\mathfrak {C}}^f_{n+1}, \ldots , {\mathfrak {C}}^f_{n+k}\) are all singular,

  3. \(\textrm{rank}({\mathfrak {C}}^f_n)< \textrm{rank}({\mathfrak {C}}^f_{n+k})=d\).

Then, \(\nu _{\pm }({\mathfrak {C}}^f_{n+k+j})=\nu _{\pm }({\mathfrak {C}}^f_{n+k})+j \;\) for all \(j=1,\ldots , n+k-d\).

Proof

As in the proof of Lemma 3.1, we consider the sequence \(\{\varepsilon _j\}_{j\ge 0}\) defined as in (3.5). By (3.7), assumptions 2 and 3 in the lemma translate to the Toeplitz matrices \({\mathfrak {C}}_j^\varepsilon \) as follows: the leading principal submatrices \({\mathfrak {C}}_1^\varepsilon ,\ldots , {\mathfrak {C}}_{k-1}^\varepsilon \) of \({\mathfrak {C}}_k^\varepsilon \) are all singular, while \(\textrm{rank}({\mathfrak {C}}_k^\varepsilon )> 0\) (i.e., \({\mathfrak {C}}_k^\varepsilon \ne 0\)). Due to the Toeplitz structure of \({\mathfrak {C}}_k^\varepsilon \), it follows that there is an integer i, such that:

$$\begin{aligned} \Re \varepsilon _0=\varepsilon _1=\cdots =\varepsilon _{i-1}=0,\quad \varepsilon _i\ne 0,\quad \frac{k}{2}<i<k. \end{aligned}$$

Then, the matrix \({\mathfrak {C}}_k^\varepsilon \) is of the form

$$\begin{aligned} {\mathfrak {C}}_k^\varepsilon =\begin{bmatrix}0 &{} 0 &{} B_{i,k}^* \\ 0&{}0&{}0\\ B_{i,k}&{}0&{}0\end{bmatrix},\quad \text{ where }\quad B_{i,k}=\left[ {\begin{matrix} \varepsilon _i &{} 0&{} \ldots &{} 0 \\ \varepsilon _{i+1}&{}\varepsilon _i&{}\ddots &{} \vdots \\ \vdots &{}\ddots &{}\ddots &{} 0 \\ \varepsilon _{k-1}&{}\ldots &{} \varepsilon _{i+1}&{}\varepsilon _i \end{matrix}}\right] . \end{aligned}$$
(3.9)

Since the triangular Toeplitz matrix \(B_{i,k}\in \mathbb {H}^{(k-i)\times (k-i)}\) is invertible, we have \(\textrm{rank}({\mathfrak {C}}_k^\varepsilon )=2(k-i)\). Upon considering \({\mathfrak {C}}_k^\varepsilon \) as the extension of the block \(\left[ {\begin{matrix} 0 &{} 0 \\ 0&{}0 \end{matrix}}\right] \) in (3.9), we conclude by Theorem 2.1 that \(\nu _\pm ({\mathfrak {C}}_k^\varepsilon )\le k-i\). Since

$$\begin{aligned} \nu _+({\mathfrak {C}}_k^\varepsilon )+\nu _-({\mathfrak {C}}_k^\varepsilon )= \textrm{rank}({\mathfrak {C}}_k^\varepsilon )=2(k-i), \end{aligned}$$

it follows that \(\nu _\pm ({\mathfrak {C}}_k^\varepsilon )=k-i\). On the other hand, due to (3.7)

$$\begin{aligned} d:=\textrm{rank}({\mathfrak {C}}^f_{n+k})=\textrm{rank}({\mathfrak {C}}^f_n)+\textrm{rank}({\mathfrak {C}}_k^\varepsilon )=n+2(k-i), \end{aligned}$$

and therefore, \(i=\frac{n+2k-d}{2}\). Again, making use of (3.7), we now get

$$\begin{aligned} \nu _\pm ({\mathfrak {C}}^f_{n+k})=\nu _\pm ({\mathfrak {C}}^f_{n})+\nu _\pm ({\mathfrak {C}}_k^\varepsilon )=\nu _\pm ({\mathfrak {C}}^f_{n})+k-i=\nu _\pm ({\mathfrak {C}}^f_{n})+\frac{d-n}{2}. \end{aligned}$$
(3.10)

Now, let us consider the matrix \({\mathfrak {C}}^f_{n+k+j}\), \(1\le j\le n+k-d\). The Schur complement of its leading principal submatrix \({\mathfrak {C}}^f_{n}\) is congruent to the Toeplitz matrix \({\mathfrak {C}}_{k+j}^\varepsilon \), the structured extension of \({\mathfrak {C}}_k^\varepsilon \). Therefore, \({\mathfrak {C}}_{k+j}^\varepsilon \) is of the form

$$\begin{aligned} {\mathfrak {C}}_{k+j}^\varepsilon =\begin{bmatrix}0 &{} 0 &{} B_{i,k+j}^* \\ 0&{}0&{}0\\ B_{i,k+j}&{}0&{}0\end{bmatrix}, \end{aligned}$$

where the matrix \(B_{i,k+j}\in {\mathbb {H}}^{(k-i+j)\times (k-i+j)}\) is defined as in (3.9) and the integer i is the same as above: \(i=\frac{n+2k-d}{2}\). Therefore

$$\begin{aligned} \textrm{rank}({\mathfrak {C}}_{k+j}^\varepsilon )=2(k-i+j)=d-n+2j. \end{aligned}$$

Upon invoking Theorem 2.1 as in the previous part of the proof, we derive \(\nu _\pm ({\mathfrak {C}}_{k+j}^\varepsilon )=\frac{d-n}{2}+j\), and subsequently, on account of (3.10)

$$\begin{aligned} \nu _\pm ({\mathfrak {C}}^f_{n+k+j})=\nu _\pm ({\mathfrak {C}}^f_{n})+\nu _\pm ({\mathfrak {C}}_{k+j}^\varepsilon )=\nu _\pm ({\mathfrak {C}}^f_{n})+\frac{d-n}{2}+j =\nu _\pm ({\mathfrak {C}}^f_{n+k})+j, \end{aligned}$$

which completes the proof. \(\square \)

For the sake of completeness, we include the following remark, although it will not be used in the rest of the paper.

Remark 3.4

In the setting of Lemma 3.3

$$\begin{aligned} \nu _\pm ({\mathfrak {C}}_{n+j}^f)=\nu _\pm ({\mathfrak {C}}^f_{n})\quad \text{ for } \; 1\le j\le i:=\frac{n+2k-d}{2}, \end{aligned}$$

and hence, \(f_{n+1},\ldots , f_{n+i-1}\) are uniquely determined by \(f_0,\ldots ,f_{n}\) (by Lemma 3.1). Moreover, for further extensions (with \(j\le 2i\))

$$\begin{aligned} \nu _{\pm }({\mathfrak {C}}^f_{n+j})=\nu _{\pm }({\mathfrak {C}}^f_{n})+j-i\quad \text{ for } \; \; i<j\le 2i= n+2k-d. \end{aligned}$$

Indeed, we see from (3.9) that \({\mathfrak {C}}_{i}^\varepsilon =0\), and hence, the first statement follows by (3.7). Furthermore, the formula (3.9) makes sense with k replaced by any j, \(i<j\le 2i\). Then, we may conclude exactly as in the proof of Lemma 3.3 that \(\nu _{\pm }({\mathfrak {C}}_{j}^\varepsilon )=j-i\), and consequently,

$$\begin{aligned} \nu _\pm ({\mathfrak {C}}_{n+j}^f)=\nu _\pm ({\mathfrak {C}}^f_{n})+\nu _{\pm }({\mathfrak {C}}_{j}^\varepsilon )=\nu _\pm ({\mathfrak {C}}^f_{n})+j-i. \end{aligned}$$

Upon letting \(N:=n+k\) in Lemmas 3.1 and 3.3, we arrive at the following extension result for singular Hermitian Toeplitz matrices; see [15] for the complex case.

Theorem 3.5

Given \(f_0,\ldots ,f_{N-1}\), let us assume that the matrix \({\mathfrak {C}}_N:={\mathfrak {C}}^f_{N}\) is singular and that \({\mathfrak {C}}_n\) (\(n<N\)) is the maximal invertible leading principal submatrix of \({\mathfrak {C}}_{N}\).

  1.

    If \(\textrm{rank}({\mathfrak {C}}_{N})=\textrm{rank}({\mathfrak {C}}_{n})=n\), then \(\nu _-({\mathfrak {C}}_{N})=\nu _-({\mathfrak {C}}_{n}):=\kappa \) and for each \(k\ge 1\), the extension \({\mathfrak {C}}_{N+k}\) with \(\nu _-({\mathfrak {C}}_{N+k})=\kappa \) is unique and satisfies \(\textrm{rank}({\mathfrak {C}}_{N+k})=n\).

  2.

    If \(\textrm{rank}({\mathfrak {C}}_{N})=d>n\), then for any choice of \(f_{N},\ldots ,f_{2N-d-1}\)

    $$\begin{aligned} \nu _\pm ({\mathfrak {C}}_{N+j})=\nu _\pm ({\mathfrak {C}}_{N})+j\quad \text{ for }\quad j=1,\ldots ,N-d. \end{aligned}$$

    In particular, the matrix \({\mathfrak {C}}_{2N-d}\) is invertible.

To translate Theorem 3.5 to the setting of matrices \({\mathfrak {S}}_j\), we make the following observation.

Remark 3.6

Given a sequence \(\{f_j\}_{j\ge 0}\), there exists a sequence \(\{g_j\}_{j\ge 0}\), such that the matrix \({\mathfrak {S}}_n^f\) is congruent to the matrix \({\mathfrak {C}}_n^g\) for each \(n\ge 1\).

Proof

If \(f_0\ne 1\), we recursively define the sequence \(\{g_j\}_{j\ge 0}\) by

$$\begin{aligned} g_0=(1-f_0)^{-1}(1+f_0), \quad g_j=(1-f_0)^{-1}\bigg (f_j(1+g_0)+\sum _{k=1}^{j-1}f_kg_{j-k}\bigg ). \end{aligned}$$

Then, the associated Toeplitz matrices are related by

$$\begin{aligned} \textbf{T}_n^g=(I-\textbf{T}_n^f)^{-1}(I+\textbf{T}_n^f)\quad \text{ for } \text{ all }\quad n\ge 1, \end{aligned}$$

and therefore

$$\begin{aligned} {\mathfrak {C}}_n^g:=\textbf{T}_n^g+\textbf{T}_n^{g*}&=2(I-\textbf{T}_n^f)^{-1}\big (I-\textbf{T}_n^f\textbf{T}_n^{f*}\big ) (I-\textbf{T}_n^{f*})^{-1}\\&=2(I-\textbf{T}_n^f)^{-1}{\mathfrak {S}}_n^f (I-\textbf{T}_n^{f*})^{-1}. \end{aligned}$$

If \(f_0=1\), the matrix \(I-\textbf{T}_n^f\) is not invertible. In this case, we pass to the sequence \(\{-f_j\}_{j\ge 0}\) and then construct \(\{g_j\}_{j\ge 0}\) as above. Then, we have

$$\begin{aligned} {\mathfrak {C}}_n^g=2(I-\textbf{T}_n^{-f})^{-1}{\mathfrak {S}}_n^{-f} (I-\textbf{T}_n^{-f*})^{-1}=2(I+\textbf{T}_n^{f})^{-1}{\mathfrak {S}}_n^{f} (I+\textbf{T}_n^{f*})^{-1}, \end{aligned}$$

and the desired congruence follows. \(\square \)
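The congruence of Remark 3.6 is easy to check numerically in the classical complex case, to which the quaternionic formulas specialize. The following sketch (helper names are ours) builds \(\textbf{T}_n^g\) via the Cayley transform and verifies both the congruence and the resulting equality of inertias:

```python
import numpy as np

def toeplitz_lower(c):
    """Lower-triangular Toeplitz matrix with first column c."""
    n = len(c)
    return sum(np.diag(np.full(n - j, c[j]), -j) for j in range(n))

rng = np.random.default_rng(0)
n = 5
f = rng.standard_normal(n) + 1j * rng.standard_normal(n)  # f[0] != 1 generically
Tf = toeplitz_lower(f)
I = np.eye(n)

Tg = np.linalg.solve(I - Tf, I + Tf)   # Cayley transform T^g = (I - T^f)^{-1}(I + T^f)
Cg = Tg + Tg.conj().T                  # C_n^g = T_n^g + T_n^{g*}
Sf = I - Tf @ Tf.conj().T              # S_n^f = I_n - T_n^f T_n^{f*}
M = np.linalg.inv(I - Tf)

assert np.allclose(Cg, 2 * M @ Sf @ M.conj().T)  # the congruence of Remark 3.6

def inertia(A):
    w = np.linalg.eigvalsh(A)
    return int(np.sum(w > 1e-10)), int(np.sum(w < -1e-10))

assert inertia(Cg) == inertia(Sf)      # congruent Hermitian matrices share inertia
```

Since \(I-\textbf{T}_n^f\) is invertible precisely when \(f_0\ne 1\), a randomly chosen \(f_0\) is generically admissible.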

Since congruent matrices have the same inertia, we arrive at the following conclusion.

Remark 3.7

Lemmas 3.1 and 3.3 and Theorem 3.5 remain true with all matrices \({\mathfrak {C}}_j\) replaced by \({\mathfrak {S}}_j\).

Theorem 3.5 is concerned with structured extensions of a given singular matrix \({\mathfrak {C}}_N\) (or \({\mathfrak {S}}_N\)). As a matter of fact, an invertible matrix \({\mathfrak {C}}_N\) (or \({\mathfrak {S}}_N\)) can always be extended without increasing the negative inertia, and there are infinitely many such extensions. In the complex setting, this result goes back to [16]. The quaternion version is given below in the setting of the matrices \({\mathfrak {S}}_N\), which seems more convenient here.

3.2 Extensions of invertible \({\mathfrak {S}}_n\) with minimal negative inertia

We start with \(f_0,\ldots ,f_{n+k-1}\in {\mathbb {H}}\), such that the matrix \({\mathfrak {S}}_n:={\mathfrak {S}}^f_n\) is invertible, and consider the block-decomposition of \({\mathfrak {S}}^f_{n+k}\) conformal with (3.1)

(3.11)

Upon letting

$$\begin{aligned} C_{n+k}=\begin{bmatrix}{} \textbf{e}_{n+k}&F_{n+k}\end{bmatrix}, \quad J=\left[ {\begin{matrix} 1 &{} 0 \\ 0 &{}-1 \end{matrix}}\right] ,\quad P_j={\mathfrak {S}}_j,\quad C^{\varvec{\prime }}_k=\begin{bmatrix}X_k&Y_k\end{bmatrix} \end{aligned}$$
(3.12)

[where \(F_{n+k}\) is defined as in (2.23)] in Lemma 2.2, we conclude that

$$\begin{aligned} \textbf{S}_k&={\mathfrak {S}}^f_{k}-T_{n,k}T_{n,k}^*- T_{n,k}{} \textbf{T}_{n}^{f*}{\mathfrak {S}}_n^{-1}{} \textbf{T}_{n}^f T_{n,k}^*\nonumber \\&={\mathfrak {S}}^f_{k}-T_{n,k}\left( I_n-\textbf{T}_{n}^{f*}\textbf{T}^f_{n}\right) ^{-1}T_{n,k}^*, \end{aligned}$$
(3.13)

the Schur complement of \({\mathfrak {S}}_n\) in (3.11), satisfies the Stein identity

$$\begin{aligned} \textbf{S}_k-Z_k\textbf{S}_kZ_k^*=X_kX_k^*-Y_kY_k^*, \end{aligned}$$
(3.14)

where \(X_k,Y_k\in {\mathbb {H}}^k\) are given by the formula

$$\begin{aligned} \begin{bmatrix}X_k&Y_k\end{bmatrix}&= \left[ {\begin{matrix} x_0 &{} y_0\\ \vdots &{}\vdots \\ x_{k-1} &{} y_{k-1} \end{matrix}}\right] \nonumber \\&=(I-Z_k)\begin{bmatrix}T_{n,k}{} \textbf{T}_n^{f*}{\mathfrak {S}}_n^{-1}&I_k\end{bmatrix}(I-Z_{n+k})^{-1}\begin{bmatrix} \textbf{e}_{n+k}&F_{n+k}\end{bmatrix}, \end{aligned}$$
(3.15)

and furthermore, \(\begin{bmatrix}x_0&y_0\end{bmatrix}\ne 0\).

Remark 3.8

Let us assume that \(\nu _{-}({\mathfrak {S}}_{n+1})=\nu _{-}({\mathfrak {S}}_{n})\). Then, \(|x_0|>|y_0|\) if \({\mathfrak {S}}_{n+1}\) is invertible and \(|x_0|=|y_0|>0\) if \({\mathfrak {S}}_{n+1}\) is singular. The latter is clear from the equalities

$$\begin{aligned} \nu _{\pm }({\mathfrak {S}}_{n+1})=\nu _{\pm }({\mathfrak {S}}_{n})+\nu _{\pm }(\textbf{S}_1)\quad \text{ and }\quad \textbf{S}_1=|x_0|^2-|y_0|^2 \end{aligned}$$

which, in turn, are the particular cases (\(k=1\)) of (2.10) and (3.14), respectively.
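The inertia-additivity formula \(\nu _{\pm }({\mathfrak {S}}_{n+1})=\nu _{\pm }({\mathfrak {S}}_{n})+\nu _{\pm }(\textbf{S}_1)\) invoked here is an instance of the Haynsworth inertia identity for a Hermitian matrix with an invertible leading block. A minimal complex-case check (all helper names are ours):

```python
import numpy as np

def inertia(A, tol=1e-10):
    """(nu_+, nu_-) of a Hermitian matrix, counted from its eigenvalues."""
    w = np.linalg.eigvalsh(A)
    return int(np.sum(w > tol)), int(np.sum(w < -tol))

rng = np.random.default_rng(1)
n, k = 4, 3
H = rng.standard_normal((n + k, n + k)) + 1j * rng.standard_normal((n + k, n + k))
H = H + H.conj().T                          # Hermitian; blocks generically invertible

A, B, D = H[:n, :n], H[:n, n:], H[n:, n:]
S = D - B.conj().T @ np.linalg.solve(A, B)  # Schur complement of A in H

# Haynsworth: inertia(H) = inertia(A) + inertia(Schur complement)
assert inertia(H) == tuple(a + b for a, b in zip(inertia(A), inertia(S)))
```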

We next introduce matrix polynomials \(\Psi =\left[ {\begin{matrix} \psi _{11} &{} \psi _{12}\\ \psi _{21}&{} \psi _{22} \end{matrix}}\right] \) and \(\Theta =\left[ {\begin{matrix} \theta _{11} &{} \theta _{12}\\ \theta _{21} &{} \theta _{22} \end{matrix}}\right] \)

$$\begin{aligned} \Psi (z)&=z^n\bigg (I_2-\begin{bmatrix}{} \textbf{e}^* \\ F_n^*\end{bmatrix}(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}\begin{bmatrix}{} \textbf{e}&\quad -F_n\end{bmatrix}\bigg )\nonumber \\&\quad +\sum _{j=1}^n z^{n-j}\begin{bmatrix}{} \textbf{e}^* \\ F_n^*\end{bmatrix}(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}(I-Z_n)Z_n^{j-1}\begin{bmatrix}{} \textbf{e}&\quad -F_n\end{bmatrix}\nonumber \\&=z^nI_2-(z-1)\begin{bmatrix}{} \textbf{e}^* \\ F_n^*\end{bmatrix}(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}\textbf{Z}_n(z)\begin{bmatrix}{} \textbf{e}&\quad -F_n\end{bmatrix}, \end{aligned}$$
(3.16)
$$\begin{aligned} \Theta (z)&=I_2+(z-1)\begin{bmatrix}{} \textbf{e}^* \\ F_n^*\end{bmatrix}(I-zZ_n^{*})^{-1}{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}\begin{bmatrix}\textbf{e}&\quad -F_n\end{bmatrix}, \end{aligned}$$
(3.17)

where

$$\begin{aligned} \textbf{Z}_n(z):=\sum _{j=1}^n z^{n-j}Z_n^{j-1}\quad \text{ and }\quad (I-zZ_n^{*})^{-1}=\sum _{j=0}^{n-1} z^jZ_n^{*j}. \end{aligned}$$

Remark 3.9

At least one of the polynomials \(\theta _{21}\) and \(\theta _{22}\) in (3.17) and at least one of the polynomials \(\psi _{11}\) and \(\psi _{21}\) in (3.16) are invertible in \({\mathbb {H}}[[z]]\), that is

$$\begin{aligned} \begin{bmatrix} \theta _{21,0}&\theta _{22,0}\end{bmatrix}\ne 0\quad \text{ and }\quad \begin{bmatrix} \psi _{11,0} \\ \psi _{21,0}\end{bmatrix}\ne 0. \end{aligned}$$
(3.18)

Proof

To prove the first relation in (3.18), let us assume, toward a contradiction, that

$$\begin{aligned} \theta _{21,0}&=-F_n^*{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}{} \textbf{e}=0, \end{aligned}$$
(3.19)
$$\begin{aligned} \theta _{22,0}&=1+F_n^*{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}F_n=0. \end{aligned}$$
(3.20)

Then, it follows by (2.21) that:

$$\begin{aligned} 0&=\theta _{22,0}\, F_n^*+\theta _{21,0}\, \textbf{e}^*\\&=F_n^*-F_n^*{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}(\textbf{e}{} \textbf{e}^*-F_nF_n^*)\\&=F_n^*-F_n^*{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}({\mathfrak {S}}_n-Z_n{\mathfrak {S}}_nZ_n^*)\\&=F_n^*{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}((I-Z_n){\mathfrak {S}}_n-{\mathfrak {S}}_n+Z_n{\mathfrak {S}}_nZ_n^*)\\&=F_n^*{\mathfrak {S}}_n^{-1}Z_n(I-Z_n)^{-1}{\mathfrak {S}}_n(Z_n^*-I), \end{aligned}$$

and therefore, upon canceling invertible matrices on the right

$$\begin{aligned} F_n^*{\mathfrak {S}}_n^{-1}Z_n=0. \end{aligned}$$
(3.21)

Combining (3.21) with (3.19) gives

$$\begin{aligned} 0=\theta _{21,0}=-F_n^*{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}\textbf{e}=-\sum _{j=0}^{n-1}F_n^*{\mathfrak {S}}_n^{-1}Z^j_n\textbf{e}=-F_n^*{\mathfrak {S}}_n^{-1}\textbf{e} \end{aligned}$$

which, together with (3.21), implies \(F_n^*{\mathfrak {S}}_n^{-1}=0\), and hence, \(F_n=0\), contradicting (3.20); this completes the proof of the first relation in (3.18). To prove the second one, we use the explicit formulas for \(\psi _{11}\) and \(\psi _{21}\) (derived from (3.16)) and assume, toward a contradiction, that

$$\begin{aligned} \begin{bmatrix} \psi _{11,0} \\ \psi _{21,0}\end{bmatrix}= \begin{bmatrix}{} \textbf{e}^* \\ F_n^*\end{bmatrix}(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}\widetilde{\textbf{e}}_n=0. \end{aligned}$$
(3.22)

Then, it follows by (2.21) that:

$$\begin{aligned} 0=\begin{bmatrix}{} \textbf{e}&-F_n\end{bmatrix}\begin{bmatrix} \psi _{11,0} \\ \psi _{21,0}\end{bmatrix}&=({\mathfrak {S}}_n-Z_n{\mathfrak {S}}_nZ_n^*)(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}\widetilde{\textbf{e}}_n\\&=(I-Z_n){\mathfrak {S}}_n(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}\widetilde{\textbf{e}}_n+Z_n\widetilde{\textbf{e}}_n\\&=(I-Z_n){\mathfrak {S}}_n(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}\widetilde{\textbf{e}}_n\ne 0, \end{aligned}$$

and the latter contradiction completes the proof. \(\square \)

In what follows, we will use the equality

$$\begin{aligned}&{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}(\textbf{e}{} \textbf{e}^*-F_nF_n^*)(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}\nonumber \\&={\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}+(I-Z^*_n)^{-1}{\mathfrak {S}}_n^{-1}-{\mathfrak {S}}_n^{-1} \end{aligned}$$
(3.23)

which can be verified directly upon making use of the Stein identity (2.21), or upon specifying the general identity (2.14) to the present setting.

Lemma 3.10

The polynomials \(\Theta \) and \(\Psi \) defined in (3.17) and (3.16) are subject to identities

$$\begin{aligned} \Theta (z)\Psi (z)=z^nI_2=\Psi (z)\Theta (z). \end{aligned}$$
(3.24)

Proof

By (3.17) and (3.16)

$$\begin{aligned} \Theta (z)\Psi (z)-z^nI_2=(z-1)\begin{bmatrix}{} \textbf{e}^* \\ F_n^*\end{bmatrix} (I-zZ_n^{*})^{-1}W(z)\begin{bmatrix}{} \textbf{e}&\quad -F_n\end{bmatrix}, \end{aligned}$$

where

$$\begin{aligned} W(z)&=z^n{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}-(I-zZ_n^{*})(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}{} \textbf{Z}_n(z) \nonumber \\&\quad +(1-z){\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}(\textbf{e}\textbf{e}^*-F_nF_n^*)(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}{} \textbf{Z}_n(z). \end{aligned}$$
(3.25)

Observe that \((zI-Z_n)\textbf{Z}_n(z)=z^nI\) and therefore

$$\begin{aligned} (1-z)(I-Z_n)^{-1}{} \textbf{Z}_n(z)=\textbf{Z}_n(z)-z^n(I-Z_n)^{-1}. \end{aligned}$$
(3.26)

Making use of the last equality along with (3.23), we simplify the rightmost term in (3.25) to

$$\begin{aligned}&(1-z)\left( {\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}+(I-Z_n^*)^{-1}Z_n^*{\mathfrak {S}}_n^{-1}\right) \textbf{Z}_n(z)\\&\quad ={\mathfrak {S}}_n^{-1}{} \textbf{Z}_n(z)-z^n{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}+((I-Z_n^*)^{-1}(I-zZ_n^*)-I){\mathfrak {S}}_n^{-1}{} \textbf{Z}_n(z)\\&=-z^n{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}+(I-Z_n^*)^{-1}(I-zZ_n^*){\mathfrak {S}}_n^{-1}\textbf{Z}_n(z). \end{aligned}$$

Substituting the latter expression into (3.25) leads to \(W(z)=0\), thus confirming the first equality in (3.24). The second equality is now clear. \(\square \)
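The identity \((zI-Z_n)\textbf{Z}_n(z)=z^nI\) underlying the proof above uses only the nilpotency \(Z_n^n=0\) of the shift matrix, and is easy to confirm numerically together with (3.26); the sketch below (our own setup) checks both at an arbitrary test point:

```python
import numpy as np
from numpy.linalg import inv, matrix_power

n = 5
Z = np.diag(np.ones(n - 1), -1)   # shift matrix: nilpotent, Z^n = 0
I = np.eye(n)
z = 0.3 + 0.7j                    # arbitrary test point

# Z_n(z) = sum_{j=1}^n z^{n-j} Z^{j-1}
Zn_z = sum(z ** (n - j) * matrix_power(Z, j - 1) for j in range(1, n + 1))

# (zI - Z_n) Z_n(z) = z^n I
assert np.allclose((z * I - Z) @ Zn_z, z ** n * I)

# (1 - z)(I - Z_n)^{-1} Z_n(z) = Z_n(z) - z^n (I - Z_n)^{-1}, i.e., (3.26)
assert np.allclose((1 - z) * inv(I - Z) @ Zn_z, Zn_z - z ** n * inv(I - Z))
```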

Lemma 3.11

Given \(f_0,\ldots ,f_{n+k-1}\in {\mathbb {H}}\) such that the matrix \({\mathfrak {S}}_n\) is invertible, let \(F_{n+k}\) and \(\Theta \) be defined as in (2.23) and (3.17). Then

$$\begin{aligned} \begin{bmatrix}{} \textbf{e}_{n+k}&-F_{n+k}\end{bmatrix}\Theta (z)=\begin{bmatrix}0 &{} 0\\ X_k &{} -Y_k\end{bmatrix}+ (zI-Z_{n+k})\Phi _k(z), \end{aligned}$$
(3.27)

where the columns \(X_k,Y_k\in {\mathbb {H}}^k\) are the same as in (3.15) and \(\Phi _k\) is the \((n+k)\times 2\)-matrix polynomial given by

$$\begin{aligned} \Phi _k(z)=\begin{bmatrix} {\mathfrak {S}}_n \\ -T_{n,k}\textbf{T}_n^{f*}\end{bmatrix}(I-zZ_n^*)^{-1}(I-Z_n^*){\mathfrak {S}}_n^{-1}(I-Z_n)^{-1} \begin{bmatrix} \textbf{e}_n&\quad -F_n\end{bmatrix}.\qquad \end{aligned}$$
(3.28)

Proof

Here, we will use the equality

$$\begin{aligned} \textbf{e}_{n+k}{} \textbf{e}_n^*-F_{n+k}F_n^*=\begin{bmatrix}{\mathfrak {S}}_{n} \\ -T_{n,k}{} \textbf{T}_n^{f*}\end{bmatrix} -Z_{n+k}\begin{bmatrix}{\mathfrak {S}}_{n} \\ -T_{n,k}{} \textbf{T}_n^{f*}\end{bmatrix}Z_n^*, \end{aligned}$$

which follows from (2.13) (specified to the present setting) upon multiplying both sides by \(\left[ {\begin{matrix} I_n \\ 0 \end{matrix}}\right] \) on the right. Using this equality along with (3.17) and (3.28), we compute the polynomial (which eventually turns out to be a constant)

$$\begin{aligned}&L:=\begin{bmatrix}{} \textbf{e}_{n+k}&\quad -F_{n+k}\end{bmatrix}\Theta (z)-(zI-Z_{n+k})\Phi _k(z)\nonumber \\&=\begin{bmatrix}{} \textbf{e}_{n+k}&\quad -F_{n+k}\end{bmatrix}+\bigg ((z-1)\left( \begin{bmatrix}{\mathfrak {S}}_{n} \\ -T_{n,k}{} \textbf{T}_n^{f*}\end{bmatrix} -Z_{n+k}\begin{bmatrix}{\mathfrak {S}}_{n} \\ -T_{n,k}{} \textbf{T}_n^{f*}\end{bmatrix}Z_n^*\right) \nonumber \\&\qquad \qquad \qquad \qquad \qquad -(zI-Z_{n+k})\begin{bmatrix} {\mathfrak {S}}_n \\ -T_{n,k}{} \textbf{T}_n^{f*}\end{bmatrix}(I-Z_n^*)\bigg )\nonumber \\&\qquad \times (I-zZ_n^{*})^{-1}{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}\begin{bmatrix}{} \textbf{e}_{n}&\quad -F_{n}\end{bmatrix}\nonumber \\&=\begin{bmatrix}{} \textbf{e}_{n+k}&\quad -F_{n+k}\end{bmatrix}-(I-Z_{n+k})\begin{bmatrix} {\mathfrak {S}}_n \\ -T_{n,k}{} \textbf{T}_n^{f*}\end{bmatrix}{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1} \begin{bmatrix}{} \textbf{e}_{n}&\quad -F_{n}\end{bmatrix}. \end{aligned}$$
(3.29)

It is readily seen from (3.29) that the top n rows of L vanish

$$\begin{aligned} \begin{bmatrix}I_n&0\end{bmatrix}L=\begin{bmatrix}{} \textbf{e}_{n}&\quad -F_{n}\end{bmatrix}-\begin{bmatrix}{} \textbf{e}_{n}&\quad -F_{n}\end{bmatrix}=0, \end{aligned}$$
(3.30)

while for the bottom k rows, we have

$$\begin{aligned} \begin{bmatrix}0&I_k\end{bmatrix}L&=\begin{bmatrix}0&I_k\end{bmatrix}\begin{bmatrix}{} \textbf{e}_{n+k}&\quad -F_{n+k}\end{bmatrix}\\&\quad +(\textbf{e}_k\widetilde{\textbf{e}}_n^*+(I-Z_k)T_{n,k}\textbf{T}_n^{f*}{\mathfrak {S}}_n^{-1})(I-Z_n)^{-1}\begin{bmatrix}{} \textbf{e}_{n}&\quad -F_{n}\end{bmatrix}\\&=\begin{bmatrix}(\textbf{e}_k\widetilde{\textbf{e}}_n^*+(I-Z_k)T_{n,k}\textbf{T}_n^{f*}{\mathfrak {S}}_n^{-1})(I-Z_n)^{-1}&I_k\end{bmatrix} \begin{bmatrix}{} \textbf{e}_{n+k}&\quad -F_{n+k}\end{bmatrix}\\&=(I-Z_k)\begin{bmatrix}T_{n,k}{} \textbf{T}_n^{f*}{\mathfrak {S}}_n^{-1}&I_k\end{bmatrix}\left[ {\begin{matrix} (I-Z_n)^{-1}&{} 0 \\ (I-Z_k)^{-1}\textbf{e}_k\widetilde{\textbf{e}}_n^* (I-Z_n)^{-1}&{} (I-Z_k)^{-1} \end{matrix}}\right] \\&\quad \times \begin{bmatrix}{} \textbf{e}_{n+k}&\quad -F_{n+k}\end{bmatrix}\\&=(I-Z_k)\begin{bmatrix}T_{n,k}{} \textbf{T}_n^{f*}{\mathfrak {S}}_n^{-1}&I_k\end{bmatrix}(I-Z_{n+k})^{-1}\begin{bmatrix}{} \textbf{e}_{n+k}&\quad -F_{n+k}\end{bmatrix}. \end{aligned}$$

Comparing the last expression with (2.12) implies

$$\begin{aligned} \begin{bmatrix}0&I_k\end{bmatrix}L=\begin{bmatrix}X_k&\quad -Y_k\end{bmatrix}, \end{aligned}$$

which together with (3.29) and (3.30) implies (3.27). \(\square \)

The next result is a consequence of formulas (3.27) and (3.15). Here, we start with an infinite sequence \(\{f_j\}_{j\ge 0}\) (or its Z-transform \(f(z)=\sum f_jz^j\)). If the first n terms (coefficients) are such that the matrix \({\mathfrak {S}}_n\) is invertible, the formulas (3.15) define the sequences \(\{x_j\}_{j\ge 0}\), \(\{y_j\}_{j\ge 0}\) and their Z-transforms x(z), y(z).

Lemma 3.12

If the first n coefficients of \(f(z)=\sum f_jz^j\in {\mathbb {H}}[[z]]\) are such that the matrix \({\mathfrak {S}}_n={\mathfrak {S}}_n^f\) (1.1) is invertible, and if \(\theta _{ij}\) are the polynomials defined as in (3.17), then

$$\begin{aligned} \begin{bmatrix}1&\quad -f\end{bmatrix}\Theta = \begin{bmatrix} \theta _{11}-f\theta _{21}&\quad \theta _{12}-f\theta _{22}\end{bmatrix}= z^n\begin{bmatrix}x&\quad -y\end{bmatrix}, \end{aligned}$$
(3.31)

where

$$\begin{aligned} y(z)=\sum _{j=0}^\infty y_jz^j,\qquad x(z)=\sum _{j=0}^\infty x_jz^j \end{aligned}$$

are the power series with coefficients defined via formulas (3.15) for \(k\ge 1\).

Proof

Let \(p_{n+k}\) be the polynomial defined by

$$\begin{aligned} p_{n+k}(z):=f_0+\cdots +f_{n+k-1}z^{n+k-1}=f(z)-z^{n+k}\cdot \sum _{j=0}^\infty f_{n+k+j}z^j. \end{aligned}$$
(3.32)

Multiplying both sides of (3.27) by \(\begin{bmatrix}1&z&\ldots&z^{n+k-1}\end{bmatrix}\) and taking into account the structure of matrices in (2.6) and (3.15), we get

$$\begin{aligned} \begin{bmatrix}1&\quad -p_{n+k}(z)\end{bmatrix}\Theta (z)=z^n\begin{bmatrix}{\displaystyle \sum _{j=0}^{k-1} x_jz^j}&-{\displaystyle \sum _{j=0}^{k-1} y_jz^j}\end{bmatrix}+z^{n+k}\widetilde{\textbf{e}}_{n+k}^*\Phi _k(z). \end{aligned}$$
(3.33)

Due to (3.32), the latter equality can be rearranged as

$$\begin{aligned} \begin{bmatrix}1&\quad -f(z)\end{bmatrix}\Theta (z)&= z^n\begin{bmatrix}{\displaystyle \sum _{j=0}^{k-1} x_jz^j}&\quad -{\displaystyle \sum _{j=0}^{k-1} y_jz^j}\end{bmatrix}\nonumber \\&\quad +z^{n+k}\bigg (\widetilde{\textbf{e}}_{n+k}^*\Phi _k(z)-\bigg (\sum _{j=0}^\infty f_{n+k+j}z^j\bigg ) \begin{bmatrix}\theta _{21}(z)&\theta _{22}(z)\end{bmatrix}\bigg ). \end{aligned}$$
(3.34)

It follows from (3.34) that the equality (3.31) holds for power series \(x,y\in {\mathbb {H}}[[z]]\) whose first k coefficients are given by the formulas (3.15). Since the equalities (3.15) and (3.34) hold for all \(k\ge 1\), the statement follows. \(\square \)

Theorem 3.13

Let \(f_0,\ldots ,f_{n-1}\in {\mathbb {H}}\) be such that the matrix \({\mathfrak {S}}_n={\mathfrak {S}}_n^f\) is invertible and let \(\Theta \) and \(\Psi \) be the matrix polynomials defined in (3.17) and (3.16). Given \(f\in {\mathbb {H}}[[z]]\), the following are equivalent:

  1.

    f is of the form

    $$\begin{aligned} f(z)=\underbrace{f_0+f_1z+\cdots +f_{n-1}z^{n-1}}_{p_n(z)}+\cdots \end{aligned}$$
    (3.35)
  2.

    \(\begin{bmatrix}1&\quad -f\end{bmatrix}\Theta =z^n \begin{bmatrix}x&\quad -y\end{bmatrix}\) for some \(x,y\in {\mathbb {H}}[[z]]\).

  3.

    f is of the form

    $$\begin{aligned} f=(x\psi _{11}-y\psi _{21})^{-1}(y\psi _{22}-x\psi _{12}) \end{aligned}$$
    (3.36)

    for some \(x,y\in {\mathbb {H}}[[z]]\), such that \(x\psi _{11}-y\psi _{21}\) is invertible in \({\mathbb {H}}[[z]]\).

Proof

Implication \((1)\Rightarrow (2)\) follows by Lemma 3.12.

Proof of \((2)\Rightarrow (1)\): We first note that equality (3.33) makes sense even for \(k=0\), in which case it takes the form

$$\begin{aligned} \begin{bmatrix}1&\quad -p_{n}(z)\end{bmatrix}\Theta (z)=z^n\widetilde{\textbf{e}}_{n}^*\Phi _0(z), \end{aligned}$$
(3.37)

where according to (3.28)

$$\begin{aligned} \Phi _0(z)={\mathfrak {S}}_n(I-zZ_n^*)^{-1}(I-Z_n^*){\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}\begin{bmatrix} \textbf{e}_n&\quad -F_n\end{bmatrix}. \end{aligned}$$

Assuming that (3.31) holds for some \(x,y\in {\mathbb {H}}[[z]]\), we subtract (3.31) from (3.37) to get

$$\begin{aligned} \begin{bmatrix}0&f-p_n\end{bmatrix}\Theta =(f-p_n)\begin{bmatrix}\theta _{21}&\theta _{22}\end{bmatrix} =z^n \big (\widetilde{\textbf{e}}_{n}^*\Phi _0-\begin{bmatrix}x&-y\end{bmatrix}\big ). \end{aligned}$$
(3.38)

By Remark 3.9, the free coefficient of the polynomial \(\begin{bmatrix}\theta _{21}&\theta _{22}\end{bmatrix}\) is non-zero. Then, it follows from (3.38) that the first n coefficients of the power series \(f-p_n\) vanish, and hence, f is of the form (3.35).

Proof of \((2)\Leftrightarrow (3)\): Assume first that equality (3.31) holds for some \(x,y\in {\mathbb {H}}[[z]]\). Multiplying both sides of (3.31) by \(\Psi (z)\) on the right, making use of (3.24) and canceling \(z^n\), we arrive at

$$\begin{aligned} \begin{bmatrix}1&-f\end{bmatrix}=\begin{bmatrix}x&-y\end{bmatrix}\Psi =\begin{bmatrix}x\psi _{11}-y\psi _{21}&x\psi _{12}- y\psi _{22}\end{bmatrix}. \end{aligned}$$
(3.39)

Therefore, \(x\psi _{11}-y\psi _{21}=1\) (in particular, it is invertible in \({\mathbb {H}}[[z]]\)) and

$$\begin{aligned} f=y\psi _{22}-x\psi _{12}=(x\psi _{11}-y\psi _{21})^{-1}(y\psi _{22}-x\psi _{12}). \end{aligned}$$

Conversely, if f is of the form (3.36), then \((x\psi _{11}-y\psi _{21})f=y\psi _{22}-x\psi _{12}\), which can be rearranged as in (3.39). Multiplying both sides of (3.39) by \(\Theta (z)\) on the right and making use of (3.24), we get back to (3.31). \(\square \)

The next theorem parametrizes all structured extensions \({\mathfrak {S}}_{n+k}\) of a given invertible \({\mathfrak {S}}_{n}\) with minimally possible negative inertia.

Theorem 3.14

Let \(f_0,\ldots ,f_{n-1}\in {\mathbb {H}}\) be such that the matrix \({\mathfrak {S}}_n={\mathfrak {S}}_n^f\) is invertible and let \(\Psi =\left[ {\begin{matrix} \psi _{11}&{} \psi _{12}\\ \psi _{21}&{}\psi _{22} \end{matrix}}\right] \) and \(\Theta =\left[ {\begin{matrix} \theta _{11}&{} \theta _{12}\\ \theta _{21}&{}\theta _{22} \end{matrix}}\right] \) be the polynomials defined in (3.16), (3.17). Then

  1.

    The equality

    $$\begin{aligned} (\psi _{11}-\varepsilon \psi _{21})^{-1}(\varepsilon \psi _{22}-\psi _{12})= (\theta _{11}\varepsilon +\theta _{12})(\theta _{21}\varepsilon +\theta _{22})^{-1} \end{aligned}$$
    (3.40)

    holds for any \(\varepsilon \in {\mathbb {H}}[[z]]\) subject to equivalent conditions

    $$\begin{aligned} \psi _{11,0}-\varepsilon _0\psi _{21,0}\ne 0 \; \; \Longleftrightarrow \; \; \theta _{21,0}\varepsilon _0+\theta _{22,0}\ne 0. \end{aligned}$$
    (3.41)
  2.

    Conditions (3.41) are met for any \(\varepsilon \in {\mathbb {H}}[[z]]\) with \(|\varepsilon _0|\le 1\) if and only if the bottom diagonal entry of \({\mathfrak {S}}_n^{-1}\) is positive

    $$\begin{aligned} \widetilde{\textbf{e}}^*_n{\mathfrak {S}}_n^{-1}\widetilde{\textbf{e}}_n>0. \end{aligned}$$
    (3.42)
  3.

    An extended sequence \(\{f_j\}_{j\ge 0}\) satisfies equalities

    $$\begin{aligned} \nu _-({\mathfrak {S}}_{n+k})=\nu _-({\mathfrak {S}}_{n})\quad \text{ for } \text{ all }\quad k\ge 1 \end{aligned}$$
    (3.43)

    if and only if its Z-transform \(f(z):=\sum f_jz^j\) is of the form

    $$\begin{aligned} f=(\psi _{11}-\varepsilon \psi _{21})^{-1}(\varepsilon \psi _{22}-\psi _{12}) =(\theta _{11}\varepsilon +\theta _{12})(\theta _{21}\varepsilon +\theta _{22})^{-1}, \end{aligned}$$
    (3.44)

    where \(\varepsilon \in {\mathbb {H}}[[z]]\) is any power series subject to the conditions (3.41) and such that \({\mathfrak {S}}_{k}^\varepsilon :=I-\textbf{T}_k^{\varepsilon }{} \textbf{T}_k^{\varepsilon *}\succeq 0\) for all \(k\ge 1\).

Proof of (1)

By (3.24), \(\Theta _0\Psi _0=0\) and hence in particular

$$\begin{aligned} \theta _{21,0}\psi _{11,0}+\theta _{22,0}\psi _{21,0}=0. \end{aligned}$$
(3.45)

Let us assume that

$$\begin{aligned} \psi _{11,0}-\varepsilon _0\psi _{21,0}=0. \end{aligned}$$
(3.46)

Then, \(\psi _{21,0}\ne 0\), since otherwise \(\psi _{11,0}=0\), which is not possible, by Remark 3.9. Hence

$$\begin{aligned} \theta _{22,0}=-\theta _{21,0}\psi _{11,0}\psi _{21,0}^{-1},\quad \varepsilon _0=\psi _{11,0}\psi _{21,0}^{-1}, \end{aligned}$$

and subsequently

$$\begin{aligned} \theta _{21,0}\varepsilon _0+\theta _{22,0}=\theta _{21,0}\psi _{11,0}\psi _{21,0}^{-1}-\theta _{21,0}\psi _{11,0}\psi _{21,0}^{-1}=0. \end{aligned}$$

The converse implication is verified similarly. This completes the justification of the equivalence (3.41). Once the conditions (3.41) are met, both sides in (3.40) make sense, and the equality (3.40) can be written as

$$\begin{aligned} (\psi _{11}-\varepsilon \psi _{21})(\theta _{11}\varepsilon +\theta _{12})- (\varepsilon \psi _{22}-\psi _{12})(\theta _{21}\varepsilon +\theta _{22})=0, \end{aligned}$$

or equivalently, as \(\begin{bmatrix}1&\quad -\varepsilon \end{bmatrix}\Psi \Theta \begin{bmatrix}\varepsilon \\ 1\end{bmatrix}=0\) which holds true due to (3.24). \(\square \)

Proof of (2)

The left condition in (3.41) holds for all \(\varepsilon _0\) subject to \(|\varepsilon _0|\le 1\) if and only if \(|\psi _{11,0}|>|\psi _{21,0}|\). The latter is equivalent to (3.42), since

$$\begin{aligned} |\psi _{11,0}|^2-|\psi _{21,0}|^2=\widetilde{\textbf{e}}^*_n{\mathfrak {S}}_n^{-1}\widetilde{\textbf{e}}_n. \end{aligned}$$

The latter equality follows from explicit formulas (3.22) and the identity (3.23). Details are given in computation (6.17) below (for \(j=0\)).\(\square \)

Proof of (3)

The Z-transform of any extended sequence is of the form (3.35) and therefore (by Theorem 3.13), it is of the form (3.36) for some \(x,y\in {\mathbb {H}}[[z]]\) subject to

$$\begin{aligned} x_0\psi _{11,0}-y_0\psi _{21,0}\ne 0. \end{aligned}$$
(3.47)

By Remark 3.8, equality (3.43) (for \(k=1\)) guarantees that \(x_0\ne 0\). Then, (3.47) is equivalent to the first inequality in (3.41). Furthermore, x is invertible in \({\mathbb {H}}[[z]]\). Letting \(\varepsilon :=x^{-1}y\in {\mathbb {H}}[[z]]\), we can write the formula (3.36) in the form (3.44). Since \(x_0\ne 0\), the Toeplitz matrix \(\textbf{T}_k^{x}\) associated with the power series x is invertible for any \(k\ge 1\). Furthermore, \(\textbf{T}_k^y=\textbf{T}_k^{x}{} \textbf{T}_k^\varepsilon \), \(\textbf{T}_k^{x}{} \textbf{e}=X_k\), \(\textbf{T}_k^{y}{} \textbf{e}=Y_k\). Multiplying both sides in (3.14) by \((\textbf{T}_k^{x})^{-1}\) on the left, by its adjoint on the right and commuting \((\textbf{T}_k^{x})^{-1}\) and \(Z_k\), we get

$$\begin{aligned} (\textbf{T}_k^{x})^{-1}{} \textbf{S}_k(\textbf{T}_k^{x*})^{-1}-Z_k(\textbf{T}_k^{x})^{-1}{} \textbf{S}_k(\textbf{T}_k^{x*})^{-1}Z_k^*=\textbf{e}{} \textbf{e}^*- \textbf{T}_k^{\varepsilon }{} \textbf{e}{} \textbf{e}^*\textbf{T}_k^{\varepsilon *}, \end{aligned}$$

which is the Stein equation of the form (2.21). Therefore, it admits a unique solution \({\mathfrak {C}}_k^\varepsilon \). Thus, \((\textbf{T}_k^{x})^{-1}{} \textbf{S}_k(\textbf{T}_k^{x*})^{-1}={\mathfrak {C}}_k^\varepsilon \), and therefore, \(\nu _{\pm }(\textbf{S}_k)=\nu _{\pm }({\mathfrak {C}}_k^\varepsilon )\), by the Sylvester law of inertia. Then, we have by (2.10)

$$\begin{aligned} \nu _\pm ({\mathfrak {S}}^f_{n+k})=\nu _\pm ({\mathfrak {S}}_n)+\nu _\pm (\textbf{S}_k)=\nu _\pm ({\mathfrak {S}}_n)+\nu _\pm ({\mathfrak {S}}^\varepsilon _k), \end{aligned}$$
(3.48)

and hence, equalities (3.43) hold if and only if \(\nu _- ({\mathfrak {S}}^\varepsilon _k)=0\), meaning that \({\mathfrak {S}}^\varepsilon _k\succeq 0\) for all \(k\ge 1\). \(\square \)
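The step \(\textbf{T}_k^y=\textbf{T}_k^{x}\textbf{T}_k^\varepsilon \) used in the proof of (3) reflects the fact that lower-triangular Toeplitz matrices multiply exactly as truncated power series (Cauchy products of their symbols). A minimal real-coefficient check, with helper names of our choosing:

```python
import numpy as np

def T(c):
    """Lower-triangular Toeplitz matrix built from coefficients c = (f_0, ..., f_{k-1})."""
    k = len(c)
    return sum(np.diag(np.full(k - j, c[j]), -j) for j in range(k))

x = [1.0, 2.0, 3.0]
eps = [4.0, 5.0, 6.0]
# Cauchy product of the symbols: y_j = sum_{i <= j} x_i * eps_{j-i}
y = [sum(x[i] * eps[j - i] for i in range(j + 1)) for j in range(3)]

assert y == [4.0, 13.0, 28.0]
assert np.allclose(T(x) @ T(eps), T(y))   # T_k^y = T_k^x T_k^eps
```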

4 Carathéodory–Schur problem in the Schur class \({\mathcal {S}}_{{\mathbb {H}}}\)

So far, we have made no assumptions on the convergence of formal power series over \({\mathbb {H}}\). For a fixed \(\rho >0\), we now introduce the subring \({\mathcal {H}}_\rho \) of \({\mathbb {H}}[[z]]\) consisting of the power series converging (absolutely) on the ball \(\mathbb {B}_\rho =\{\alpha \in {\mathbb {H}}: \, |\alpha |<\rho \}\)

$$\begin{aligned} {\mathcal {H}}_\rho =\bigg \{f(z)=\sum _{k=0}^\infty f_kz^k: \; \limsup _{k\rightarrow \infty } \root k \of {|f_k|}\le 1/\rho \bigg \}. \end{aligned}$$

Remark 4.1

If \(f\in {\mathcal {H}}_\rho \) and \(f_0\ne 0\), then \(f^{-1}\in \mathcal {H}_\delta \) for some \(\delta >0\).

Indeed, since f converges absolutely in \({\mathbb {B}}_\rho \), we have \(\sum _{k=1}^\infty |f_k||\alpha |^k<|f_0|\) for all \(\alpha \in \mathbb {B}_\delta \) and some \(\delta \in (0,\rho )\). Then an induction argument shows that the coefficients \(g_k\) of \(f^{-1}\), obtained recursively from the identity \(f\cdot f^{-1}=\textbf{1}\), satisfy the inequalities \(|g_k|\le \frac{1}{|f_0|\delta ^{k}}\) for all \(k\ge 0\), and hence, \(f^{-1}\) converges in \({\mathbb {B}}_\delta \).
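In the commutative (real or complex) case, the recursion behind Remark 4.1 is the familiar one obtained by comparing coefficients in \(f\cdot f^{-1}=\textbf{1}\); over \({\mathbb {H}}\), the same recursion applies with the order of factors kept fixed. A sketch with real coefficients (the helper name is ours):

```python
def inv_series(f, N):
    """First N coefficients of f^{-1}, from f * f^{-1} = 1 (requires f[0] != 0)."""
    g = [1.0 / f[0]]
    for k in range(1, N):
        g.append(-sum(f[j] * g[k - j] for j in range(1, k + 1)) / f[0])
    return g

# f(z) = 1/(1 - z) has coefficients (1, 1, 1, ...); its reciprocal is 1 - z
f = [1.0] * 8
g = inv_series(f, 8)
assert g == [1.0, -1.0] + [0.0] * 6

# sanity check: the Cauchy product of f and g is the identity series
prod = [sum(f[j] * g[k - j] for j in range(k + 1)) for k in range(8)]
assert prod == [1.0] + [0.0] * 7
```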

Note that any \(f\in {\mathcal {H}}_\rho \) can be evaluated at any \(\alpha \in {\mathbb {B}}_\rho \) on the left or on the right via (absolutely) converging series

$$\begin{aligned} f^{\varvec{e_\ell }}(\alpha )=\sum _{k=0}^\infty \alpha ^k f_k\quad \text{ and }\quad f^{\varvec{e_r}}(\alpha )=\sum _{k=0}^\infty f_k\alpha ^k. \end{aligned}$$
(4.1)

We will write simply \(f(\alpha )\) if \(f^{\varvec{e_\ell }}(\alpha )=f^{\varvec{e_r}}(\alpha )\). This is the case, as is readily seen from (4.1) and (2.5), when \(f\in \mathbb {C}_\alpha [[z]]\); in particular, when \(f\in {\mathbb {R}}[[z]]\) or when \(\alpha \) is real.
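The left and right evaluations (4.1) can be illustrated directly with quaternions represented as 4-vectors under the Hamilton product (all names below are ours). As noted above, the two evaluations agree for real coefficients but differ in general:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def eval_left(coeffs, a):
    """f^{e_l}(a) = sum_k a^k f_k."""
    out, ak = np.zeros(4), np.array([1.0, 0.0, 0.0, 0.0])
    for fk in coeffs:
        out = out + qmul(ak, fk)
        ak = qmul(ak, a)
    return out

def eval_right(coeffs, a):
    """f^{e_r}(a) = sum_k f_k a^k."""
    out, ak = np.zeros(4), np.array([1.0, 0.0, 0.0, 0.0])
    for fk in coeffs:
        out = out + qmul(fk, ak)
        ak = qmul(ak, a)
    return out

qi = np.array([0.0, 1.0, 0.0, 0.0])  # the unit i
qj = np.array([0.0, 0.0, 1.0, 0.0])  # the unit j

# real coefficients: left and right evaluations coincide
real = [np.array([c, 0.0, 0.0, 0.0]) for c in (1.0, 2.0, 3.0)]
assert np.allclose(eval_left(real, qi), eval_right(real, qi))

# quaternionic coefficients: f(z) = j + jz evaluated at i
f = [qj, qj]
assert np.allclose(eval_left(f, qi), [0.0, 0.0, 1.0, 1.0])    # j + ij = j + k
assert np.allclose(eval_right(f, qi), [0.0, 0.0, 1.0, -1.0])  # j + ji = j - k
```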

The functions \(f^{\varvec{e_\ell }}, f^{\varvec{e_r}}: \, {\mathbb {B}}_\rho \rightarrow {\mathbb {H}}\) induced by a power series \(f\in {\mathcal {H}}_\rho \) are not only continuous on \({\mathbb {B}}_\rho \), but also holomorphic in the following sense: for each pure unit \(\alpha \) (\(\Re \alpha =0\), \(|\alpha |=1\)), the restrictions of \(f^{\varvec{e_\ell }}\) and \(f^{\varvec{e_r}}\) to the plane \({\mathbb {C}}_\alpha =\{x+\alpha y: \, x,y\in {\mathbb {R}}\}\) (more precisely, to the disk \({\mathbb {B}}_\rho \cap {\mathbb {C}}_\alpha \)) satisfy the respective Cauchy–Riemann equations

$$\begin{aligned}&\frac{\partial }{\partial x}f^{\varvec{e_\ell }}(x+\alpha y)=-\alpha \frac{\partial }{\partial y}f^{\varvec{e_\ell }}(x+\alpha y),\nonumber \\&\frac{\partial }{\partial x}f^{\varvec{e_r}}(x+\alpha y)=-\frac{\partial }{\partial y}f^{\varvec{e_r}}(x+\alpha y)\alpha . \end{aligned}$$
(4.2)

These left and right notions of holomorphy were the starting point of the theory of left- and right-regular (also called slice-regular) functions initiated in [12]; we refer to [13] for a detailed exposition of more recent developments. The main advantage of the holomorphic approach is that the Cauchy–Riemann equations (4.2) define regular functions over general domains. However, if a function \(g: \, {\mathbb {B}}_\rho \rightarrow {\mathbb {H}}\) is regular, it can be expanded in a Taylor series \(\sum g_kz^k\) with \(g_k=g^{(k)}(0)/k!\), which belongs to \({\mathcal {H}}_\rho \). Evaluated on the left or on the right, this series recovers the original function and its dual (left or right) counterpart.

4.1 Quaternionic Schur class \({\mathcal {S}}_{{\mathbb {H}}}\)

Quaternionic Schur functions were introduced in [2] as left-regular functions taking the values less than one in modulus over the unit ball \({\mathbb {B}}_1\) of \({\mathbb {H}}\); see [4] for a thorough account of the subject. An attempt to capture both left and right settings within a power-series approach was undertaken in [6, Section 3]. Here, we will proceed using different and more explicit arguments.

An element \(\alpha \in {\mathbb {B}}_\rho \) is called a left or right zero of \(f\in {\mathcal {H}}_\rho \) if, respectively, \(f^{\varvec{e_\ell }}(\alpha )=0\) or \(f^{\varvec{e_r}}(\alpha )=0\). If \(V\subset \mathbb {B}_\rho \) is a similarity class, then any power series \(f\in \mathcal H_\rho \) either has no zeros in V or it has one left and one right zero in V, or \(f^{\varvec{e_\ell }}(\alpha )=f^{\varvec{e_r}}(\alpha )=0\) for all \(\alpha \in V\). This observation goes back to [19] for the polynomial case; the power series case is similar (see, e.g., [7, §2.2]). Therefore, for every \(f\in {\mathcal {H}}_\rho \) and a similarity class \(V\subset {\mathbb {B}}_\rho \), either \(f^{\varvec{e_\ell }}(\alpha )=c=f^{\varvec{e_r}}(\alpha )\) for all \(\alpha \in V\), or for any \(\alpha \in V\), there is a unique \(\alpha '\in V\) such that \(f^{\varvec{e_r}}(\alpha ')=f^{\varvec{e_\ell }}(\alpha )\). We thus arrive at the following observation.

Remark 4.2

If \(f\in {\mathcal {H}}_\rho \), then the images of a similarity class V under \(f^{\varvec{e_\ell }}\) and \(f^{\varvec{e_r}}\) are equal as sets: \(f^{\varvec{e_\ell }}(V)=f^{\varvec{e_r}}(V)\). Consequently, \(f^{\varvec{e_\ell }}(\mathbb {B}_{\rho '})=f^{\varvec{e_r}}({\mathbb {B}}_{\rho '})\) for any \(\rho '<\rho \).

We now introduce the norm

$$\begin{aligned} \Vert f\Vert _{\infty }:={\displaystyle \sup _{\alpha \in \mathbb {B}_1}|f^{\varvec{e_\ell }}(\alpha )|}= {\displaystyle \sup _{\alpha \in \mathbb {B}_1}|f^{\varvec{e_r}}(\alpha )|} \end{aligned}$$
(4.3)

on \({\mathcal {H}}_1\) (considered as an \({\mathbb {H}}\)-bimodule), and define the Schur class \({\mathcal {S}}_{{\mathbb {H}}}\) to be

$$\begin{aligned} {\mathcal {S}}_{{\mathbb {H}}}:=\left\{ f\in {\mathcal {H}}_1: \; \Vert f\Vert _{\infty }\le 1\right\} . \end{aligned}$$

By Remark 4.2, \({\displaystyle \sup _{\alpha \in \mathbb {B}_\rho }|f^{\varvec{e_\ell }}(\alpha )|}= {\displaystyle \sup _{\alpha \in \mathbb {B}_\rho }|f^{\varvec{e_r}}(\alpha )|}\) for any \(f\in {\mathcal {H}}_1\) and \(\rho <1\), from which the second equality in (4.3) follows. The fact that \(\Vert \cdot \Vert _\infty \) is indeed a norm is easily verified. The coefficient characterization of the class \({\mathcal {S}}_{{\mathbb {H}}}\) is very much the same as in the complex case [22].

Theorem 4.3

([1]) A power series \(f\in {{\mathbb {H}}}[[z]]\) belongs to \({\mathcal {S}}_{{\mathbb {H}}}\) if and only if the Toeplitz matrix \(\textbf{T}_n^f\) defined as in (2.18) is contractive (i.e., \({\mathfrak {S}}_n^f=I_n-\textbf{T}_n^f\textbf{T}_n^{f*}\) is positive semidefinite) for all \(n\ge 1\).
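In the classical complex case, the criterion of Theorem 4.3 is easy to test numerically. The sketch below uses plain complex arithmetic (so it only illustrates the commutative case underlying the theorem), with helper names of our own choosing: it builds the lower-triangular Toeplitz matrix \(\textbf{T}_n^f\) from given coefficients and checks positive semidefiniteness of \({\mathfrak {S}}_n^f=I_n-\textbf{T}_n^f\textbf{T}_n^{f*}\).

```python
import numpy as np

def toeplitz_lower(coeffs):
    """Lower-triangular Toeplitz matrix T_n^f built from f_0, ..., f_{n-1} as in (1.1)."""
    n = len(coeffs)
    T = np.zeros((n, n), dtype=complex)
    for k, c in enumerate(coeffs):
        T += c * np.diag(np.ones(n - k), -k)
    return T

def is_contractive(coeffs, tol=1e-10):
    """Check that S_n = I - T T* is positive semidefinite, i.e., that T_n^f is contractive."""
    T = toeplitz_lower(coeffs)
    S = np.eye(len(coeffs)) - T @ T.conj().T
    return bool(np.min(np.linalg.eigvalsh(S)) >= -tol)

# f(z) = z/2 is a Schur function: every truncated Toeplitz matrix is contractive
print(is_contractive([0, 0.5, 0, 0]))    # True
# f(z) = 2z is not: already T_2 has norm 2
print(is_contractive([0, 2.0]))          # False
```

The same computation applies verbatim over the quaternions once a quaternionic matrix arithmetic (e.g., via the standard complex \(2\times 2\) representation) is substituted for `complex`.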

Following [3], we define a Blaschke product of degree n to be the power-series product:

$$\begin{aligned} f=\phi \cdot \textbf{b}_{\alpha _1}{} \textbf{b}_{\alpha _2}\cdots \textbf{b}_{\alpha _n} \qquad (\alpha _i\in {\mathbb {B}}_1, \; \phi \in {\mathbb {H}}, \; |\phi |=1), \end{aligned}$$
(4.4)

where the Blaschke factor \(\textbf{b}_\alpha \) is the power series defined by

$$\begin{aligned} \textbf{b}_\alpha (z)=(z-\alpha )(1-z{\overline{\alpha }})^{-1}=-\alpha +(1-|\alpha |^2)\sum _{k=0}^\infty {\overline{\alpha }}^kz^{k+1} \quad (\alpha \in {\mathbb {B}}_1). \end{aligned}$$
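The series expansion of the Blaschke factor can be verified numerically in the complex case; the following sketch (the helper `blaschke_coeffs` is ours, not from the text) compares the closed form of \(\textbf{b}_\alpha \) with its truncated Taylor series at a point of the disk.

```python
import numpy as np

def blaschke_coeffs(alpha, N):
    """First N+1 Taylor coefficients of b_alpha: -alpha, then (1-|alpha|^2) conj(alpha)^k."""
    return np.array([-alpha] + [(1 - abs(alpha)**2) * np.conj(alpha)**k for k in range(N)])

alpha, z = 0.3 + 0.4j, 0.5 - 0.2j                 # |alpha| < 1, |z| < 1
direct = (z - alpha) / (1 - z * np.conj(alpha))   # closed form of b_alpha(z), complex case
series = sum(c * z**k for k, c in enumerate(blaschke_coeffs(alpha, 60)))
print(abs(direct - series) < 1e-12, abs(direct) < 1)   # True True
```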

It is easy to show (see e.g., [7, Proposition 3.2]) that

$$\begin{aligned} |\textbf{b}_{\alpha }^{\varvec{e_\ell }}(\gamma )|=|\textbf{b}_{\alpha }^{\varvec{e_r}}(\gamma )| \; {\left\{ \begin{array}{ll}<1 &{} \text{ if } \; |\gamma |<1, \\ =1 &{} \text{ if } \; |\gamma |=1,\end{array}\right. } \end{aligned}$$

and therefore, \(\textbf{b}_{\alpha }\) and, more generally, f of the form (4.4), are in \({\mathcal {S}}_{{\mathbb {H}}}\). In what follows, we will write \({\mathcal {S}}_{{\mathbb {H}},n}\) for the set of all Blaschke products of degree n.

In certain cases, it is more convenient to deal with the power-series representation of a finite Blaschke product rather than with its factorization (4.4), which in general is far from unique. The next result (see [7, Theorem 5.3] for the proof) is supplementary to Theorem 4.3.

Theorem 4.4

A power series \(f(z)=\sum f_kz^k\in {{\mathbb {H}}}[[z]]\) is a Blaschke product of degree k if and only if the associated matrix \({\mathfrak {S}}_n^f=I_n-\textbf{T}_{n}^f\textbf{T}_{n}^{f*}\) is positive semidefinite and \({\text {rank}}({\mathfrak {S}}_n^f)=\textrm{min} (k,n)\) for all \(n\ge 1\).
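The rank characterization of Theorem 4.4 can be illustrated in the commutative complex case, where Blaschke factors multiply as ordinary power series. The sketch below (helpers of our own making, under that complex-case assumption) forms a degree-2 Blaschke product, truncates its Taylor series, and observes \(\textrm{rank}\,{\mathfrak {S}}_n^f=\min (2,n)\).

```python
import numpy as np

def toeplitz_lower(c):
    n = len(c)
    return sum(c[k] * np.diag(np.ones(n - k), -k) for k in range(n)).astype(complex)

def blaschke_product_coeffs(alphas, N):
    """Taylor coefficients (up to z^N) of a product of complex Blaschke factors b_alpha."""
    poly = np.zeros(N + 1, dtype=complex); poly[0] = 1.0
    for a in alphas:
        factor = np.concatenate(([-a], (1 - abs(a)**2) * np.conj(a)**np.arange(N)))
        out = np.zeros(N + 1, dtype=complex)
        for k in range(N + 1):                 # truncated Cauchy product (exact up to z^N)
            out[k] = poly[:k + 1] @ factor[k::-1]
        poly = out
    return poly

f = blaschke_product_coeffs([0.3, -0.5j], 7)   # a Blaschke product of degree k = 2
ranks = []
for n in (1, 2, 3, 6):
    T = toeplitz_lower(f[:n])
    S = np.eye(n) - T @ T.conj().T
    ranks.append(int(np.linalg.matrix_rank(S, tol=1e-9)))
print(ranks)   # [1, 2, 2, 2], i.e., rank = min(2, n)
```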

Remark 4.5

The class \({\mathcal {S}}_{{\mathbb {H}}}\) and finite Blaschke products can be equivalently characterized in terms of positive semidefinite matrices

$$\begin{aligned} {\mathfrak {R}}_n^f:=I_n-\textbf{T}^{f*}_n\textbf{T}_n^f. \end{aligned}$$

Indeed, \({\mathfrak {S}}_n^f\) and \({\mathfrak {R}}_n^f\) are both Schur complements of the identity blocks in the partitioned matrix \(\left[ {\begin{matrix} I_n &{} \textbf{T}^{f*}_n \\ \textbf{T}_n^f &{} I_n \end{matrix}}\right] \) and, therefore, \(\nu _{\pm }({\mathfrak {S}}_n^f)=\nu _{\pm }({\mathfrak {R}}_n^f)\). Note also that \({\mathfrak {R}}_n^f\) is uniquely recovered from the Stein identity

$$\begin{aligned} {\mathfrak {R}}_n-Z_n^*{\mathfrak {R}}_nZ_n=\widetilde{\textbf{e}}_n\widetilde{\textbf{e}}_n^*-{\widetilde{F}}_n{\widetilde{F}}_n^*,\quad \text{ where }\quad {\widetilde{F}}_n=\left[ {\begin{matrix} f_{n-1} \\ \vdots \\ f_0 \end{matrix}}\right] \end{aligned}$$

and where \(\widetilde{\textbf{e}}_n\) is defined as in (2.6).

4.2 The Carathéodory–Schur problem in \({\mathcal {S}}_{{\mathbb {H}}}\)

The problem consists of finding an \(f\in {\mathcal {S}}_{{\mathbb {H}}}\) with prescribed n first coefficients

$$\begin{aligned} f(z)=f_0+f_1z+\cdots +f_{n-1}z^{n-1}+\cdots \qquad (f\in \mathcal S_{{\mathbb {H}}}). \end{aligned}$$
(4.5)

The solvability and the uniqueness criteria for this problem are given below.

Theorem 4.6

  (1)

    Given \(f_0,\ldots ,f_{n-1}\in {\mathbb {H}}\), there exists an \(f\in {\mathcal {S}}_{{\mathbb {H}}}\) of the form (4.5) if and only if \({\mathfrak {S}}_n:=I_n-\textbf{T}_n^{f}\textbf{T}_n^{f*}\succeq 0\).

  (2)

    Such f is unique if and only if \({\mathfrak {S}}_n\) is singular. In this case, f is a Blaschke product of degree \(\textrm{deg} f=\textrm{rank}({\mathfrak {S}}_n)\).

The latter result appears in [4, Section 10.4], where it is established using quaternionic de Branges–Rovnyak spaces. The more elementary power-series proof presented here bypasses reproducing-kernel arguments. The necessity in part (1) follows from Theorem 4.3. The rest follows from the more detailed Theorems 4.7 and 4.8 below, which treat the indeterminate and the determinate case, respectively.

Theorem 4.7

Let us suppose that \({\mathfrak {S}}_n\succ 0\). Then, the formula

$$\begin{aligned} f=(\theta _{11}\varepsilon +\theta _{12})(\theta _{21}\varepsilon +\theta _{22})^{-1}= (\psi _{11}-\varepsilon \psi _{21})^{-1}(\varepsilon \psi _{22}-\psi _{12}), \quad \varepsilon \in {\mathcal {S}}_{{\mathbb {H}}} \qquad \end{aligned}$$
(4.6)

parametrizes all \(f\in {\mathcal {S}}_{{\mathbb {H}}}\) subject to condition (4.5). Furthermore, f of the form (4.6) is a finite Blaschke product if and only if the parameter \(\varepsilon \) is a finite Blaschke product. In this case, \(\deg f=n+\deg \varepsilon \).

Proof

If \(f\in {\mathcal {S}}_{{\mathbb {H}}}\), then \({\mathfrak {S}}^f_{n+k}\succeq 0\) for all \(k\ge 1\), by Theorem 4.3. If, in addition, f satisfies (4.5), then f is of the form (3.36) for some \(\varepsilon \in {\mathbb {H}}[[z]]\) such that \({\mathfrak {S}}_k^{\varepsilon }\succeq 0\) for all \(k\ge 1\), i.e., \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\), again by Theorem 4.3.

If f is a Blaschke product of degree \(m>n\), then by Theorem 4.4 combined with (3.48), we have

$$\begin{aligned} \textrm{rank} ({\mathfrak {S}}_k^{\varepsilon })&=\textrm{rank} \, {\mathfrak {S}}^f_{n+k}-n=\min \{m,n+k\}-n=\min \{m-n,k\} \end{aligned}$$
(4.7)

for all \(k\ge 1\), and hence, \(\varepsilon \) is a Blaschke product of degree \(m-n\), again by Theorem 4.4.

For the converse statement, we first note that since \({\mathfrak {S}}_n\succ 0\), the inequality (3.42) holds, and hence, the linear fractional expressions (4.6) make sense for any \(\varepsilon \in \mathcal S_{{\mathbb {H}}}\) (by Theorem 3.14, part (2)).

Let f be of the form (4.6) for some \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\). Theorem 3.14 guarantees that the first n coefficients of f are equal to the prescribed \(f_0,\ldots ,f_{n-1}\) and equalities (3.43) hold, which in the present case amount to

$$\begin{aligned} \nu _-({\mathfrak {S}}_{n+k})=\nu _-({\mathfrak {S}}_{n})=0. \end{aligned}$$

Hence, \({\mathfrak {S}}^f_{n+k}\succeq 0\) for all \(k\ge 1\) and, therefore, \(f\in {\mathcal {S}}_{{\mathbb {H}}}\). Finally, if \(\varepsilon \) is a Blaschke product of degree r, it follows from (4.7) that f is a Blaschke product of degree \(n+r\). \(\square \)

We now turn to the singular case. To use the above notation, we consider the problem (4.5) with prescribed \(n+k\) first coefficients \(f_0,\ldots ,f_{n+k-1}\) and assume that

$$\begin{aligned} {\mathfrak {S}}^f_{n+k}\succeq 0\quad \text{ and }\quad \textrm{rank}({\mathfrak {S}}^f_n)=\textrm{rank}({\mathfrak {S}}^f_{n+1})=n. \end{aligned}$$
(4.8)

We next let

$$\begin{aligned} B=\begin{bmatrix}f_1 \\ f_2 \\ \vdots \\ f_n\end{bmatrix}\quad \text{ and }\quad A=\begin{bmatrix} 0 &{} 1 &{} 0 &{}\ldots &{} 0 \\ 0 &{} 0 &{} 1&{} \ddots &{} \vdots \\ \vdots &{} \vdots &{}\vdots &{} \ddots &{} 0 \\ 0 &{} 0 &{} 0 &{} \ldots &{}1\\ a_1 &{} a_2 &{} a_3 &{}\ldots &{} a_n\end{bmatrix}, \end{aligned}$$
(4.9)

where \(\begin{bmatrix} a_1&\ldots&a_n\end{bmatrix}=-\begin{bmatrix} f_n&f_{n-1}&\ldots&f_1\end{bmatrix} \textbf{T}_n^{f*}{\mathfrak {S}}_n^{-1}\).

Theorem 4.8

Under the assumptions (4.8), the power series

$$\begin{aligned} s(z)=f_0+z\textbf{e}_n^*\left( I_n-zA\right) ^{-1}B \end{aligned}$$
(4.10)

is a Blaschke product of degree n and is the unique Schur-class power series with the first \(n+k\) coefficients equal to \(f_0,\ldots ,f_{n+k-1}\).

Proof

The construction (4.10) appears in [7]. Since A is a companion matrix, it follows that \(\textbf{e}_n^*A^j=\textbf{e}_n^* Z_n^{*j}\) for \(j=1,\ldots ,n-1\). On the other hand, it is readily seen from (2.6) and (4.9) that \(\textbf{e}_n^* Z_n^{*j}B=f_{j+1}\) for \(j=1,\ldots ,n-1\). Therefore, \(\textbf{e}_n^*A^jB=f_{j+1}\) for \(j=1,\ldots ,n-1\) and hence

$$\begin{aligned} s(z)=f_0+\sum _{j=1}^\infty \textbf{e}_n^*A^{j-1}B z^j=f_0+f_1z+\cdots +f_nz^n+\sum _{j=n+1}^\infty \textbf{e}_n^*A^{j-1}B z^j. \end{aligned}$$

Furthermore, it was shown in [7, Theorem 5.4] that A and B defined in (4.9) satisfy the equality

$$\begin{aligned} \begin{bmatrix}A&{}B\\ \textbf{e}_n^*&{}f_0\end{bmatrix}\begin{bmatrix}{\mathfrak {S}}_n &{} 0 \\ 0 &{} 1\end{bmatrix}\begin{bmatrix}A^* &{}\textbf{e}_n\\ B^* &{} {\overline{f}}_0\end{bmatrix}=\begin{bmatrix}{\mathfrak {S}}_n &{} 0 \\ 0 &{} 1\end{bmatrix}. \end{aligned}$$
(4.11)

Since \({\mathfrak {S}}_n\) is positive definite, the latter equality tells us that the matrix

$$\begin{aligned} \begin{bmatrix}{\widetilde{A}}&{}{\widetilde{B}}\\ {\widetilde{C}}&{}f_0\end{bmatrix}= \begin{bmatrix}{\mathfrak {S}}_n^{-\frac{1}{2}} &{} 0 \\ 0 &{} 1\end{bmatrix}\begin{bmatrix}A&{}B\\ \textbf{e}_n^*&{}f_0\end{bmatrix} \begin{bmatrix}{\mathfrak {S}}_n^{\frac{1}{2}} &{} 0 \\ 0 &{} 1\end{bmatrix} \end{aligned}$$

is unitary. Since s of the form (4.10) can be realized also as

$$\begin{aligned} s(z)=f_0+z{\widetilde{C}}(I-z{\widetilde{A}})^{-1}{\widetilde{B}} \end{aligned}$$

and since the matrix \(\left[ {\begin{matrix} {\widetilde{A}}&{}{\widetilde{B}}\\ {\widetilde{C}}&{}f_0 \end{matrix}}\right] \) is unitary, it follows by [7, Theorem 3.11] that s is a Blaschke product of degree at most n. Since \({\mathfrak {S}}_n^s={\mathfrak {S}}_n\succ 0\), it follows by Theorem 4.4 that \(\deg s=n\). It remains to show that

$$\begin{aligned} s_j:=\textbf{e}_n^*A^{j-1}B=f_j\quad \text{ for }\quad j=n+1,\ldots ,n+k-1. \end{aligned}$$
(4.12)

By (4.8), the matrix \({\mathfrak {S}}^f_{n+1}\) is singular. By Lemma 3.1 (applied to \({\mathfrak {S}}_n\) rather than \({\mathfrak {C}}_n\)), all further positive semidefinite structured extensions of \({\mathfrak {S}}^f_{n+1}\) are uniquely determined by the elements \(f_0,\ldots ,f_n\) (i.e., by \({\mathfrak {S}}^f_{n+1}\)). On the other hand, since s is a Blaschke product of degree n, the matrix \({\mathfrak {S}}_{n+k}^s\) is positive semidefinite and \(\textrm{rank}({\mathfrak {S}}_{n+k}^s)=n\). Since \(s_j=f_j\) for \(j=0,\ldots ,n\), we have \({\mathfrak {S}}^s_{n+1}={\mathfrak {S}}^f_{n+1}\). Thus, \({\mathfrak {S}}_{n+k}^s\) is another positive semidefinite structured extension of \({\mathfrak {S}}^f_{n+1}\). Since such an extension is unique, \({\mathfrak {S}}_{n+k}^s={\mathfrak {S}}^f_{n+k}\), and (4.12) follows. \(\square \)
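For the simplest complex-case instance of the construction (4.9)–(4.10), one can take \(n=1\) and prescribe the first two Taylor coefficients of a Blaschke factor \(\textbf{b}_\alpha \); the realization then reproduces the whole factor. The sketch below is only a sanity check under those commutative assumptions, not the quaternionic statement itself.

```python
import numpy as np

# Complex-case sketch of (4.9)-(4.10) with n = 1: prescribe f_0, f_1 coming from a
# degree-1 Blaschke factor b_alpha, so that S_1 > 0 while rank(S_2) = 1.
alpha = 0.4 + 0.3j
f0, f1 = -alpha, 1 - abs(alpha)**2            # first two Taylor coefficients of b_alpha

S1 = 1 - abs(f0)**2                           # S_1 = 1 - T_1 T_1^*  (a positive scalar)
a1 = -f1 * np.conj(f0) / S1                   # bottom row of the companion matrix A in (4.9)
A, B = np.array([[a1]]), np.array([[f1]])

# s(z) = f_0 + z e^* (I - zA)^{-1} B has Taylor coefficients s_j = e^* A^{j-1} B (j >= 1)
s = [f0] + [(np.linalg.matrix_power(A, j - 1) @ B)[0, 0] for j in range(1, 8)]

# b_alpha itself has coefficients -alpha, (1-|alpha|^2) conj(alpha)^{j-1}
b = [-alpha] + [(1 - abs(alpha)**2) * np.conj(alpha)**(j - 1) for j in range(1, 8)]
print(np.allclose(s, b))                       # True: the realization recovers b_alpha
```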

Remark 4.9

The latter proof is independent of Theorem 4.7. Alternatively, it follows from Theorem 4.7 that an \(s\in \mathcal {S}_{{\mathbb {H}}}\) subject to condition:

$$\begin{aligned} s(z)=f_0+f_1z+\cdots +f_nz^n+\cdots \end{aligned}$$
(4.13)

exists and arises via formula (4.6) for some \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) with the unimodular free coefficient \(\varepsilon _0=x_0^{-1}y_0\). Since a Schur-class power series with unimodular free coefficient is necessarily constant, \(\varepsilon \equiv \varepsilon _0\), and hence

$$\begin{aligned} s(z)=(\theta _{11}\varepsilon _0+\theta _{12})(\theta _{21}\varepsilon _0+\theta _{22})^{-1} \end{aligned}$$

is the unique Schur-class power series subject to (4.13). It belongs to \({\mathcal {S}}_{{{\mathbb {H}}},n}\) by the last statement in Theorem 4.7. Due to the uniqueness, this s is the same as in the formula (4.10).

4.3 Carathéodory approximation theorem

As an application of Theorem 4.7, we now show that any \(f\in {\mathcal {S}}_{{\mathbb {H}}}\) can be uniformly approximated by (quaternion) finite Blaschke products on compact subsets of \(\mathbb {B}_1\). The function-theoretic (rather than power-series) context is crucial here.

Theorem 4.10

Let \(f\in {\mathcal {S}}_{{\mathbb {H}}}\). For any \(\rho <1\) and \(\epsilon >0\), there exists a finite Blaschke product B, such that

$$\begin{aligned} \left| f^{\varvec{e_\ell }}(\alpha )-B^{\varvec{e_\ell }}(\alpha )\right|<\epsilon \quad \text{ and }\quad \left| f^{\varvec{e_r}}(\alpha )-B^{\varvec{e_r}}(\alpha )\right| <\epsilon \end{aligned}$$
(4.14)

for all \(\alpha \in \overline{{\mathbb {B}}}_\rho \).

Proof

We choose n such that \(2\rho ^n<\epsilon \) and assume that \(f(z)=\sum _{j\ge 0}f_jz^j\) is not a finite Blaschke product (since otherwise we can take \(B=f\)), so that the associated matrix \({\mathfrak {S}}^f_n=I-\textbf{T}^f_n\textbf{T}^{f*}_n\) is positive definite (if \({\mathfrak {S}}^f_n\) were singular, then by Theorem 4.6, f would be the unique solution of the corresponding problem (4.5) and hence a finite Blaschke product). By Theorem 4.7, there is a finite Blaschke product B solving the problem (4.5), i.e., having the same first n coefficients as f. Then \(g=\frac{1}{2}(f-B)\) is a Schur-class power series with the first n coefficients equal to zero, i.e., \(g(z)=z^n h(z)\) for some \(h\in {\mathcal {H}}_1\). Since for any \(m\ge 1\), the Toeplitz matrices (2.18) associated with g and h are related by \(\textbf{T}^g_{n+m}=\left[ {\begin{matrix} 0 &{} 0 \\ \textbf{T}^h_{m} &{} 0 \end{matrix}}\right] \), we have

$$\begin{aligned} {\mathfrak {S}}_{m+n}^g=I_{m+n}-\textbf{T}^g_{n+m}{} \textbf{T}^{g*}_{n+m}=\begin{bmatrix} I_n &{} 0 \\ 0 &{} I_m-\textbf{T}^h_{m}{} \textbf{T}^{h*}_{m}\end{bmatrix} =\begin{bmatrix} I_n &{} 0 \\ 0 &{} {\mathfrak {S}}_m^h\end{bmatrix}\succeq 0 \end{aligned}$$

for all \(m\ge 1\), and hence, \(h\in {\mathcal {S}}_{{\mathbb {H}}}\), by Theorem 4.3. Therefore, for any \(\alpha \in \overline{\mathbb {B}}_\rho \)

$$\begin{aligned} \left| f^{\varvec{e_\ell }}(\alpha )-B^{\varvec{e_\ell }}(\alpha )\right| =2|g^{\varvec{e_\ell }}(\alpha )|=2|\alpha |^n|h^{\varvec{e_\ell }}(\alpha )|\le 2\rho ^n<\epsilon \end{aligned}$$

which verifies the first inequality in (4.14). The second inequality follows similarly. \(\square \)

The complex-valued counterpart of Theorem 4.10 stating that any Schur function \(f\in {\mathcal {S}}\) can be uniformly approximated by finite Blaschke products on compact subsets of the unit disk \({\mathbb {D}}\subset {\mathbb {C}}\), is due to Carathéodory [10].

4.4 Carathéodory–Fejér extremal problem

It was shown in [11] (with further elaboration in [23]) that the set of all functions analytic on \({\mathbb {D}}\) and with fixed n first Taylor coefficients contains a unique element (a scalar multiple of a finite Blaschke product) with minimally possible \(H^\infty \)-norm. Due to Theorems 4.7 and 4.6, this result extends to the quaternion setting as follows. By rescaling we see that given \(g_0,\ldots ,g_{n-1}\in {\mathbb {H}}\) and \(\lambda >0\), a power series \(g\in {\mathcal {H}}_1\) satisfies

$$\begin{aligned} g(z)=g_0+g_1z+\cdots +g_{n-1}z^{n-1}+\cdots \quad \text{ and }\quad \Vert g\Vert _\infty \le \lambda \end{aligned}$$
(4.15)

[see (4.3)] if and only if the power series \(f=\lambda ^{-1}g\) solves the problem (4.5) with \(f_j=\lambda ^{-1}g_j\) for \(j=0,\ldots ,n-1\). Then, it follows by Theorem 4.6 that \(g\in {\mathcal {H}}_1\) subject to conditions (4.15) exists if and only if:

$$\begin{aligned} I_n-\textbf{T}_n^f\textbf{T}_n^{f*}=I_n-\lambda ^{-2}{} \textbf{T}_n^g\textbf{T}_n^{g*}\succeq 0 \end{aligned}$$

or equivalently

$$\begin{aligned} \lambda ^2I_n-\textbf{T}_n^g\textbf{T}_n^{g*}\succeq 0. \end{aligned}$$

The minimal possible \(\lambda >0\) for which the latter inequality holds is equal to \(\sigma _{\textrm{max}}(\textbf{T}_n^g)\), the maximal singular value of the matrix \(\textbf{T}_n^g\). Again, by Theorem 4.6, the unique \(s\in {\mathcal {S}}_{{\mathbb {H}}}\) solving the problem (4.5) with \(f_j=\sigma _{\textrm{max}}(\textbf{T}_n^g)^{-1}g_j\) is a Blaschke product of degree \(k=\textrm{rank}(\sigma _{\textrm{max}}(\textbf{T}_n^g)^2I_n-\textbf{T}_n^g\textbf{T}_n^{g*})\) (explicitly constructed as in Theorem 4.8). Consequently, the minimal possible \(\Vert g\Vert _\infty \) for g subject to the first condition in (4.15) equals \(\sigma _{\textrm{max}}(\textbf{T}_n^g)\), and \(g=\sigma _{\textrm{max}}(\textbf{T}_n^g)\,s\) is the unique power series on which this minimum is attained.
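In the complex case, the extremal value \(\sigma _{\textrm{max}}(\textbf{T}_n^g)\) and the singularity of \(\lambda ^2I_n-\textbf{T}_n^g\textbf{T}_n^{g*}\) at the minimum are immediate to check numerically; the sketch below uses sample coefficients of our own choosing.

```python
import numpy as np

def toeplitz_lower(c):
    n = len(c)
    return sum(c[k] * np.diag(np.ones(n - k), -k) for k in range(n)).astype(complex)

# Prescribed coefficients g_0, ..., g_{n-1} (complex-case illustration)
g = [1.0, 0.5j, -0.25]
Tg = toeplitz_lower(g)
lam = np.linalg.norm(Tg, 2)        # sigma_max(T_n^g): the minimal attainable sup-norm

# At lambda = sigma_max, the matrix lam^2 I - Tg Tg* is positive semidefinite and singular
M = lam**2 * np.eye(3) - Tg @ Tg.conj().T
evals = np.linalg.eigvalsh(M)
print(-1e-8 < evals.min() < 1e-8)  # True: the smallest eigenvalue is (numerically) zero
```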

5 The Carathéodory class \({\mathcal {C}}_{{\mathbb {H}}}\)

To define the Carathéodory class in the quaternion setting, we recall the evaluation formulas (4.1) and observe, as a consequence of the equality \(\Re (\alpha \beta )=\Re (\beta \alpha )\) holding for all \(\alpha ,\beta \in {\mathbb {H}}\), that \(\Re (f^{\varvec{e_\ell }}(\alpha ))=\Re (f^{\varvec{e_r}}(\alpha ))\), which justifies the notation

$$\begin{aligned} \Re (f(\alpha )):=\Re (f^{\varvec{e_\ell }}(\alpha ))=\Re (f^{\varvec{e_r}}(\alpha )) \quad \text{ for }\quad f\in {\mathcal {H}}_\rho \; \;\text{ and }\; \; \alpha \in {\mathbb {B}}_\rho . \end{aligned}$$

We now define the Carathéodory class

$$\begin{aligned} {\mathcal {C}}_{{\mathbb {H}}}:=\left\{ f\in {\mathcal {H}}_1: \; \Re (f(\alpha ))\ge 0 \; \; \text{ for } \text{ all } \; \; \alpha \in \mathbb {B}_1\right\} . \end{aligned}$$
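The identity \(\Re (f^{\varvec{e_\ell }}(\alpha ))=\Re (f^{\varvec{e_r}}(\alpha ))\) behind this definition depends only on \(\Re (\alpha \beta )=\Re (\beta \alpha )\), and it can be checked numerically. The sketch below uses our own minimal quaternion arithmetic (quaternions stored as arrays \([a,b,c,d]=a+bi+cj+dk\); the helper names are not from the text) to compare left and right evaluations of a power series at a sample point.

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions represented as arrays [a, b, c, d]."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([a1*a2 - b1*b2 - c1*c2 - d1*d2,
                     a1*b2 + b1*a2 + c1*d2 - d1*c2,
                     a1*c2 - b1*d2 + c1*a2 + d1*b2,
                     a1*d2 + b1*c2 - c1*b2 + d1*a2])

def qpow(q, k):
    out = np.array([1.0, 0.0, 0.0, 0.0])
    for _ in range(k):
        out = qmul(out, q)
    return out

coeffs = [np.array([1.0, 0.0, 0.0, 0.0]),      # f_0, f_1, f_2 in H (sample values)
          np.array([0.2, -0.3, 0.1, 0.5]),
          np.array([0.0, 0.4, -0.2, 0.1])]
alpha = np.array([0.1, 0.4, -0.2, 0.3])        # a point with |alpha| < 1

left  = sum(qmul(qpow(alpha, k), f) for k, f in enumerate(coeffs))   # f^{e_l}(alpha)
right = sum(qmul(f, qpow(alpha, k)) for k, f in enumerate(coeffs))   # f^{e_r}(alpha)
print(np.isclose(left[0], right[0]))           # True: the real parts agree
```

The full left and right values generally differ; only their real parts (the first components) must coincide.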

5.1 Cayley transform

The evaluation functionals \(f\rightarrow f^{\varvec{e_\ell }}(\alpha )\) are right-linear but not multiplicative, in general (unless \(\alpha \) is real). By (2.2) and (4.1), for any \(f,g\in {\mathcal {H}}_\rho \) and any \(\alpha \in {\mathbb {B}}_\rho \), we have

$$\begin{aligned} (fg)^{\varvec{e_\ell }}(\alpha )=\sum _{k=0}^\infty \alpha ^kf^{\varvec{e_\ell }}(\alpha )g_k, \end{aligned}$$

from which it follows that:

$$\begin{aligned} (fg)^{\varvec{e_\ell }}(\alpha )=\left\{ \begin{array}{ccc} f^{\varvec{e_\ell }}(\alpha )\cdot g^{\varvec{e_\ell }}\left( f^{\varvec{e_\ell }}(\alpha )^{-1}\alpha f^{\varvec{e_\ell }}(\alpha )\right) &{}\text{ if } &{} f^{\varvec{e_\ell }}(\alpha )\ne 0, \\ 0 &{} \text{ if } &{} f^{\varvec{e_\ell }}(\alpha )= 0.\end{array}\right. \end{aligned}$$
(5.1)

In case f has no zeros in \({\mathbb {B}}_\rho \), the formula (5.1) suggests introducing the transformation

$$\begin{aligned} \Upsilon _{f,{\varvec{\ell }}}: \, \alpha \mapsto f^{\varvec{e_\ell }}(\alpha )^{-1}\alpha f^{\varvec{e_\ell }}(\alpha ) \end{aligned}$$
(5.2)

which turns out to be a bijection (even a homeomorphism) on \(\mathbb {B}_\rho \) preserving all similarity classes; see [13, §5.5]. Indeed, letting \(f^\sharp \) denote the power-series conjugate of f

$$\begin{aligned} f^\sharp (z)=\bigg (\sum _{k=0}^\infty z^kf_k\bigg )^\sharp =\sum _{k=0}^\infty z^k {\overline{f}}_k, \end{aligned}$$
(5.3)

and observing that \(ff^\sharp \in {\mathbb {R}}[[z]]\) and, therefore, \(ff^\sharp (\alpha )\) commutes with \(\alpha \) for all \(\alpha \in {\mathbb {B}}_\rho \), we apply the formula (5.1) (with \(g=f^\sharp \)) to get

$$\begin{aligned} \Upsilon _{f^\sharp ,{\varvec{\ell }}}\circ \Upsilon _{f,{\varvec{\ell }}}(\alpha )&= \left( f^{\varvec{e_\ell }}(\alpha ) f^{\sharp {\varvec{e_\ell }}}(\Upsilon _{f,{\varvec{\ell }}}(\alpha ))\right) ^{-1}\alpha f^{\varvec{e_\ell }}(\alpha ) f^{\sharp {\varvec{e_\ell }}}(\Upsilon _{f,{\varvec{\ell }}}(\alpha ))\\&=(ff^\sharp )(\alpha )^{-1}\alpha (ff^\sharp )(\alpha )=\alpha , \end{aligned}$$

which shows that the map \(\Upsilon _{f,{\varvec{\ell }}}\) is invertible with the inverse equal to \(\Upsilon _{f^\sharp ,{\varvec{\ell }}}\).

We now recall the Cayley transform connecting the classes \(\mathcal C_{{\mathbb {H}}}\) and \({\mathcal {S}}_{{\mathbb {H}}}\). Let \(\textbf{1}\) denote the power series with the free coefficient equal to one and all other coefficients equal to zero. If \(g\in {\mathcal {C}}_{{\mathbb {H}}}\), then the series \(\textbf{1}+g\) has non-zero left and right values over \(\mathbb {B}_1\), and hence its formal inverse belongs to \({\mathcal {H}}_1\).

Remark 5.1

The Cayley transform

$$\begin{aligned} f={\mathfrak {T}}[g]:=(\textbf{1}+g)^{-1}(g-\textbf{1})=(g-\textbf{1})(\textbf{1}+g)^{-1} \end{aligned}$$
(5.4)

establishes a one-to-one correspondence between \({\mathcal {C}}_{\mathbb {H}}\) and \({\mathcal {S}}_{{\mathbb {H}}}\backslash \{\textbf{1}\}\).

Proof

Making use of formula (5.1) and notation (5.2), we evaluate the power-series equality \((\textbf{1}+g)f=g-\textbf{1}\) on the left to get

$$\begin{aligned} (1+g^{\varvec{e_\ell }}(\alpha ))f^{\varvec{e_\ell }}(\Upsilon _{\textbf{1}+g,{\varvec{\ell }}}(\alpha ))=g^{\varvec{e_\ell }}(\alpha )-1, \end{aligned}$$
(5.5)

which, in turn, implies

$$\begin{aligned} 1-|f^{\varvec{e_\ell }}(\Upsilon _{\textbf{1}+g,{\varvec{\ell }}}(\alpha ))|^2= 2\cdot \frac{g^{\varvec{e_\ell }}(\alpha )+\overline{g^{\varvec{e_\ell }}(\alpha )}}{|1+g^{\varvec{e_\ell }}(\alpha )|^2}. \end{aligned}$$
(5.6)

Since \(g\in {\mathcal {C}}_{{\mathbb {H}}}\), the expression on the right side is nonnegative. Since \(\Upsilon _{\textbf{1}+g,{\varvec{\ell }}}\) is a bijection on \({\mathbb {B}}_1\), we conclude from (5.6) that \(|f^{\varvec{e_\ell }}(\beta )|\le 1\) for all \(\beta \in {\mathbb {B}}_1\) and, hence, \(f\in {\mathcal {S}}_{{\mathbb {H}}}\). It is clear from (5.5) that \(f^{\varvec{e_\ell }}(\Upsilon _{\textbf{1}+g,{\varvec{\ell }}}(\alpha ))\ne 1\) and, therefore, \(f\ne \textbf{1}\). Conversely, if \(f\in {\mathcal {S}}_{\mathbb {H}}\backslash \{\textbf{1}\}\), then the formal inverse of \(\textbf{1}-f\) exists; the power series \(g={\mathfrak {T}}^{-1}[f]=(\textbf{1}+f)(\textbf{1}-f)^{-1}\) (the inverse Cayley transform of f) belongs to \({\mathcal {H}}_1\) and satisfies equality (5.6) for all \(\alpha \in {\mathbb {B}}_1\). Since the left side in (5.6) is now nonnegative, we conclude that \(g\in {\mathcal {C}}_{{\mathbb {H}}}\). \(\square \)
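In the complex case, the content of Remark 5.1 reduces to the familiar fact that the Cayley transform maps functions of nonnegative real part into the Schur class; a quick numerical sketch (sample function \(g(z)=1+z\), our own choice) illustrates this on circles inside the disk.

```python
import numpy as np

# Complex-case sketch of the Cayley transform (5.4): g(z) = 1 + z has
# nonnegative real part on the closed unit disk, so f = (g - 1)(1 + g)^{-1}
# should be a Schur function (scalars commute in C, so the two forms agree).
theta = np.linspace(0, 2 * np.pi, 181)
ok = True
for r in (0.3, 0.7, 0.95):
    z = r * np.exp(1j * theta)
    g = 1 + z
    f = (g - 1) / (1 + g)
    ok = ok and bool(np.all(np.real(g) >= 0)) and bool(np.all(np.abs(f) < 1))
print(ok)   # True: Re g >= 0 on the disk and |f| < 1 there
```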

As an application of Remark 5.1, we characterize Carathéodory-class power series in terms of their coefficients.

Theorem 5.2

The power series \(g(z)=\sum _{k\ge 0} g_k z^k\) belongs to the Carathéodory class \({\mathcal {C}}_{{\mathbb {H}}}\) if and only if

$$\begin{aligned} \textbf{T}_n^g+\textbf{T}_n^{g*}= \left[ \begin{array}{cccc}g_{0}+{\overline{g}}_0 &{} {\overline{g}}_1 &{} \ldots &{} {\overline{g}}_{n-1} \\ g_{1}&{} g_{0}+{\overline{g}}_0 &{} \ddots &{} \vdots \\ \vdots &{} \ddots &{} \ddots &{} {\overline{g}}_1 \\ g_{n-1}&{} \ldots &{} g_{1} &{} g_{0}+{\overline{g}}_0\end{array}\right] \succeq 0 \end{aligned}$$
(5.7)

for all \(n\ge 1\), where \(\textbf{T}^g_n\) is the Toeplitz matrix defined as in (2.18).

Proof

Let \(f={\mathfrak {T}}[g]\). Applying formulas (2.20) to the power-series equality (5.4), we see that the Toeplitz matrices \(\textbf{T}_n^g\) and \(\textbf{T}_n^f\) associated with g and f via formula (2.18) are related by

$$\begin{aligned} \textbf{T}_n^f=\big (I+\textbf{T}_n^g\big )^{-1}\big (\textbf{T}_n^g-I\big ). \end{aligned}$$
(5.8)

Then, we have

$$\begin{aligned} I-\textbf{T}_n^f\textbf{T}_n^{f*}=2\big (I+\textbf{T}_n^g\big )^{-1}\big (\textbf{T}_n^g+\textbf{T}_n^{g*}\big ) \big (I+\textbf{T}_n^{g*}\big )^{-1}. \end{aligned}$$
(5.9)

Since \(g\in {\mathcal {C}}_{{\mathbb {H}}}\Leftrightarrow f\in \mathcal {S}_{{\mathbb {H}}}\) (by Remark 5.1) \(\Leftrightarrow I_n-\textbf{T}_n^f\textbf{T}_n^{f*}\succeq 0\) for all \(n\ge 1\) (by Theorem 4.3) \(\Leftrightarrow \) (5.7) (by (5.9)), the statement follows. \(\square \)
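The algebra behind (5.8) and (5.9) uses no positivity and can be confirmed numerically in the commutative complex case; the sketch below (helper names our own, arbitrary sample coefficients) checks the identity for a random lower-triangular Toeplitz matrix.

```python
import numpy as np

def toeplitz_lower(c):
    n = len(c)
    return sum(c[k] * np.diag(np.ones(n - k), -k) for k in range(n)).astype(complex)

# With T^f = (I + T^g)^{-1}(T^g - I) as in (5.8), identity (5.9) states
# I - T^f T^{f*} = 2 (I + T^g)^{-1} (T^g + T^{g*}) (I + T^{g*})^{-1}.
rng = np.random.default_rng(1)
g = rng.standard_normal(4) + 1j * rng.standard_normal(4)
g[0] = 2.0                                    # keep I + T^g safely invertible
Tg, I = toeplitz_lower(g), np.eye(4)
Tf = np.linalg.solve(I + Tg, Tg - I)          # complex-case analogue of (5.8)
lhs = I - Tf @ Tf.conj().T
rhs = 2 * np.linalg.solve(I + Tg, (Tg + Tg.conj().T) @ np.linalg.inv(I + Tg.conj().T))
print(np.allclose(lhs, rhs))                  # True
```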

Remark 5.3

If \(g\in {\mathcal {C}}_{{\mathbb {H}}}\) and \(f\in {\mathcal {S}}_{\mathbb {H}}\) are such that \(f={\mathfrak {T}}[g]\), then

$$\begin{aligned} \textrm{rank} \, \big (\textbf{T}_n^g+\textbf{T}_n^{g*}\big )=\textrm{rank} \, \big (I-\textbf{T}_n^f\textbf{T}_n^{f*}\big ) \quad \text{ for } \text{ all } \; \; n\ge 1. \end{aligned}$$

The latter equality follows from (5.9), since the matrix \(I+\textbf{T}_n^g\) is lower triangular and invertible.

Definition 5.4

We denote by \({\mathcal {C}}_{{{\mathbb {H}}},n}\) the class of all \(g\in {\mathbb {H}}[[z]]\), such that

$$\begin{aligned} \textbf{T}_k^g+\textbf{T}_k^{g*}\succeq 0 \quad \text{ and }\quad {\text {rank}}(\textbf{T}_k^g+\textbf{T}_k^{g*})=\textrm{min} (k,n) \; \; \text{ for } \text{ all } \; \; k\ge 1. \end{aligned}$$

Equivalently, \({\mathcal {C}}_{{{\mathbb {H}}},n}\) is the set of quaternion power series whose Cayley transform is a Blaschke product of degree n

$$\begin{aligned} {\mathcal {C}}_{{{\mathbb {H}}},n}=\left\{ g\in {\mathbb {H}}[[z]]: \; \; {\mathfrak {T}}[g]\in {\mathcal {S}}_{{{\mathbb {H}}},n}\right\} . \end{aligned}$$

The latter equivalence follows from Theorem 4.4 and Remark 5.3. As \({\mathcal {S}}_{{{\mathbb {H}}},0}\) is identified with the unit sphere of \({\mathbb {H}}\), we identify \(\mathcal C_{{{\mathbb {H}}},0}\) with the space of all pure quaternions.

5.2 The Carathéodory–Schur problem in \({\mathcal {C}}_{{\mathbb {H}}}\)

The problem of finding a power series

$$\begin{aligned} g(z)=g_0+g_1z+\cdots +g_{n-1}z^{n-1}+\cdots \qquad (g\in \mathcal C_{{\mathbb {H}}}) \end{aligned}$$
(5.10)

with preassigned first n coefficients is equivalent to the problem (4.5) in the following sense: g solves the problem (5.10) if and only if its Cayley transform (5.4) solves the problem (4.5) with \(f_0,\ldots ,f_{n-1}\) determined from \(g_0,\ldots ,g_{n-1}\) via the formula (5.8), or equivalently, by

$$\begin{aligned} \left[ {\begin{matrix} f_0 \\ \vdots \\ f_{n-1} \end{matrix}}\right] =\big (I_n+\textbf{T}_n^g\big )^{-1}\big (\textbf{T}_n^g-I_n\big )\textbf{e}_n. \end{aligned}$$
(5.11)

To complete the story, it remains to spell out Theorems 4.6, 4.7, and 4.8 in terms of \(g_0,\ldots ,g_{n-1}\). The details are given below.

Theorem 5.5

Given \(g_0,\ldots ,g_{n-1}\in {\mathbb {H}}\), there exists \(g\in \mathcal C_{{\mathbb {H}}}\) of the form (5.10) if and only if \({\mathfrak {C}}_n:=\textbf{T}_n^g+\textbf{T}_n^{g*}\succeq 0\). Such g is unique if and only if \({\mathfrak {C}}_n\) is singular, in which case \(g\in {\mathcal {C}}_{{{\mathbb {H}}},d}\), where \(d=\textrm{rank} ({\mathfrak {C}}_n)\).

Proof

For \(f_0,\ldots ,f_{n-1}\) defined as in (5.11), the matrices \({\mathfrak {S}}_n=I_n-\textbf{T}_n^f\textbf{T}_n^{f*}\) and \({\mathfrak {C}}_n=\textbf{T}_n^g+\textbf{T}_n^{g*}\) are related as in (5.9), and hence, \({\mathfrak {S}}_n\succeq 0\) if and only if \({\mathfrak {C}}_n\succeq 0\), and the first statement follows by Theorem 4.6. The problem (5.10) has a unique solution g if and only if \(f=\mathfrak {T}[g]\) is the unique solution of the associated problem (4.5), which is the case if and only if \({\mathfrak {S}}_n\) is singular or, equivalently, \({\mathfrak {C}}_n\) is singular. Since \(\textrm{rank} \, {\mathfrak {S}}_n=\textrm{rank} \, {\mathfrak {C}}_n\) (by Remark 5.3) and f is a Blaschke product of degree \(d=\textrm{rank} \, {\mathfrak {S}}_n\) (by Theorem 4.6), it follows that \(g\in {\mathcal {C}}_{{{\mathbb {H}}},d}\). \(\square \)

For the next theorem, we let \(\overline{{\mathcal {C}}}_{\mathbb {H}}:={\mathcal {C}}_{{\mathbb {H}}}\cup \{\infty \}\) and we assign \(\infty \) to the class \({\mathcal {C}}_{{{\mathbb {H}}},0}\). With this convention, the Cayley transform (5.4) extends to a bijection from \(\overline{{\mathcal {C}}}_{{\mathbb {H}}}\) to \({\mathcal {S}}_{{\mathbb {H}}}\).

Theorem 5.6

Given \(g_0,\ldots , g_{n-1}\), let us suppose that \({\mathfrak {C}}_n=\textbf{T}_n^g+\textbf{T}_n^{g*}\succ 0\) and let us introduce the \(2\times 2\)-matrix polynomial

$$\begin{aligned} {\mathfrak {A}}(z)=I_2+(z-1)\begin{bmatrix}G_n^* \\ -\textbf{e}_n^*\end{bmatrix}(I-zZ_n^{*})^{-1}{\mathfrak {C}}_n^{-1}(I-Z_n)^{-1}\begin{bmatrix}\textbf{e}_n&-G_n\end{bmatrix}, \end{aligned}$$
(5.12)

where \(Z_n\) and \(\textbf{e}_n\) are defined in (2.6) and where

$$\begin{aligned} G_n:=\textbf{T}^g_n\textbf{e}_n=\left[ {\begin{matrix} g_0\\ \vdots \\ g_{n-1} \end{matrix}}\right] . \end{aligned}$$
(5.13)

Then, the formula

$$\begin{aligned} g=({\mathfrak {A}}_{11}\varphi +{\mathfrak {A}}_{12})({\mathfrak {A}}_{21}\varphi +{\mathfrak {A}}_{22})^{-1}, \quad \varphi \in \overline{{\mathcal {C}}}_{{\mathbb {H}}} \end{aligned}$$
(5.14)

establishes a bijection between \(\overline{{\mathcal {C}}}_{{\mathbb {H}}}\) and the set of all \(g\in {\mathcal {C}}_{{\mathbb {H}}}\) subject to condition (5.10). Moreover, \(\varphi \in {\mathcal {C}}_{{{\mathbb {H}}},k}\) if and only if \(g\in {\mathcal {C}}_{{{\mathbb {H}}},n+k}\).

Proof

If we use \(f_0,\ldots ,f_{n-1}\) from (5.11) to construct the polynomial \(\Theta \) as in (3.17), then by Theorem 4.7 and the discussion preceding Theorem 5.5, the formula (4.6) written as

$$\begin{aligned} {\mathfrak {T}}[g]=(\theta _{11}{\mathfrak {T}}[\varphi ]+\theta _{12})(\theta _{21}{\mathfrak {T}}[\varphi ]+\theta _{22})^{-1}, \quad \varphi \in \overline{\mathcal {C}}_{{\mathbb {H}}}, \end{aligned}$$
(5.15)

parametrizes the Cayley transforms \(f={\mathfrak {T}}[g]\) of all solutions g to the problem (5.10). Since (5.4) is a linear fractional transform based on the matrix \(\left[ {\begin{matrix} 1 &{}-1\\ 1&{} 1 \end{matrix}}\right] \), we can take the superposition of three linear fractional maps to recover g from (5.15) by formula (5.14) with

$$\begin{aligned} \begin{bmatrix}{\mathfrak {A}}_{11} &{} {\mathfrak {A}}_{12}\\ {\mathfrak {A}}_{21} &{} {\mathfrak {A}}_{22}\end{bmatrix}= \frac{1}{2}\begin{bmatrix}1 &{} 1 \\ -1 &{} 1\end{bmatrix}\begin{bmatrix}\theta _{11} &{} \theta _{12}\\ \theta _{21} &{} \theta _{22}\end{bmatrix} \begin{bmatrix}1 &{} -1 \\ 1 &{} 1\end{bmatrix}. \end{aligned}$$
(5.16)

The latter formula defines the same polynomial as in (5.12). Indeed, substituting (3.17) into (5.16) gives

$$\begin{aligned} {\mathfrak {A}}(z)&=I_2+\frac{z-1}{2}\begin{bmatrix}{} \textbf{e}_n^*(\textbf{T}_n^{f*}+I)\\ \textbf{e}_n^*(\textbf{T}_n^{f*}-I)\end{bmatrix}(I-zZ_n^{*})^{-1}{\mathfrak {S}}_n^{-1}\\&\qquad \qquad \times (I-Z_n)^{-1}\begin{bmatrix}(I-\textbf{T}_n^{f})\textbf{e}_n&-(\textbf{T}_n^{f}+I)\textbf{e}_n\end{bmatrix}. \end{aligned}$$
(5.17)

By (5.9) and since \(\textbf{T}_n^{f}\) commutes with \(Z_n\), we have

$$\begin{aligned}&(I-zZ_n^{*})^{-1}{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}\\&\quad =\frac{1}{2}(I-zZ_n^{*})^{-1}(I+\textbf{T}_n^{g*}){\mathfrak {C}}_n^{-1}(I+\textbf{T}_n^{g})(I-Z_n)^{-1}\\&\quad =\frac{1}{2}(I+\textbf{T}_n^{g*})(I-zZ_n^{*})^{-1}{\mathfrak {C}}_n^{-1}(I-Z_n)^{-1}(I+\textbf{T}_n^{g}). \end{aligned}$$

Plugging in the latter equality into (5.17), making use of equalities

$$\begin{aligned} (I+\textbf{T}_n^{g})(I-\textbf{T}_n^{f})=2I_n\quad \text{ and }\quad (I+\textbf{T}_n^{g})(\textbf{T}_n^{f}+I)=2\textbf{T}_n^{g} \end{aligned}$$

which follow directly from (5.8), and taking into account (5.13), we arrive at

$$\begin{aligned} {\mathfrak {A}}(z)&=I_2+\frac{z-1}{4}\begin{bmatrix}{} \textbf{e}_n^*(\textbf{T}_n^{f*}+I) \\ \textbf{e}_n^*(\textbf{T}_n^{f*}-I)\end{bmatrix}(I+\textbf{T}_n^{g*})(I-zZ_n^{*})^{-1}{\mathfrak {C}}_n^{-1}(I-Z_n)^{-1}\\&\qquad \qquad \qquad \times (I+\textbf{T}_n^{g}) \begin{bmatrix}(I-\textbf{T}_n^{f})\textbf{e}_n&-(\textbf{T}_n^{f}+I)\textbf{e}_n\end{bmatrix}\\&=I_2+\frac{z-1}{4}\begin{bmatrix}2\textbf{e}_n^*\textbf{T}_n^{g*}\\ -2\textbf{e}_n^*\end{bmatrix}(I-zZ_n^{*})^{-1}{\mathfrak {C}}_n^{-1}(I-Z_n)^{-1} \begin{bmatrix}2\textbf{e}_n&-2\textbf{T}_n^{g}{} \textbf{e}_n\end{bmatrix}\\&=I_2+(z-1)\begin{bmatrix}G_n^*\\ -\textbf{e}_n^*\end{bmatrix}(I-zZ_n^{*})^{-1}{\mathfrak {C}}_n^{-1}(I-Z_n)^{-1} \begin{bmatrix}{} \textbf{e}_n&-G_n\end{bmatrix}, \end{aligned}$$

which is the same as (5.12). The last statement follows from Theorem 4.7 and the definition of the class \(\mathcal {C}_{{{\mathbb {H}}},n}\). \(\square \)
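The two Toeplitz identities used in the proof above are easy to test numerically. The following sketch works over the complex numbers (a commutative stand-in for \({\mathbb {H}}\)) and assumes that, on the Toeplitz level, relation (5.8) amounts to \(\textbf{T}_n^g=(I+\textbf{T}_n^f)(I-\textbf{T}_n^f)^{-1}\); the shift \(Z_n\) is taken to be the lower shift matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5

# Random coefficients f_0, ..., f_{n-1} with |f_0| < 1, so that I - T_f
# is invertible.  Complex numbers serve as a commutative stand-in for H.
f = rng.standard_normal(n) + 1j * rng.standard_normal(n)
f[0] *= 0.3 / abs(f[0])

Z = np.diag(np.ones(n - 1), -1)                      # lower shift Z_n
Tf = sum(f[j] * np.linalg.matrix_power(Z, j) for j in range(n))
I = np.eye(n)

# Assumed Toeplitz form of the Cayley transform (5.8):
Tg = (I + Tf) @ np.linalg.inv(I - Tf)

# The two identities used in the proof of Theorem 5.6:
assert np.allclose((I + Tg) @ (I - Tf), 2 * I)
assert np.allclose((I + Tg) @ (Tf + I), 2 * Tg)
```

Both identities reduce to \((I+\textbf{T}_n^g)=2(I-\textbf{T}_n^f)^{-1}\) together with the fact that lower-triangular Toeplitz matrices commute.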

We next assume that \(g_0,\ldots ,g_n\in {\mathbb {H}}\) are such that

$$\begin{aligned} \textrm{rank}({\mathfrak {C}}_n)=\textrm{rank}({\mathfrak {C}}_{n+1})=n, \end{aligned}$$
(5.18)

and let

$$\begin{aligned} B'=\begin{bmatrix}g_1 \\ \vdots \\ g_n\end{bmatrix}\quad \text{ and }\quad A'=Z_n^*-\widetilde{\textbf{e}}_n\begin{bmatrix}g_n&\ldots&g_1\end{bmatrix}{\mathfrak {C}}_n^{-1}, \end{aligned}$$

where \(\widetilde{\textbf{e}}_n\) is given in (2.6). Note that \(A'\) is a companion matrix as in (4.9), but with the bottom row equal to \(\begin{bmatrix}g_n&\ldots&g_1\end{bmatrix}{\mathfrak {C}}_n^{-1}\).

Theorem 5.7

Under the assumptions (5.18), the power series

$$\begin{aligned} h(z)=g_0+z\textbf{e}_n^*\left( I_n-zA'\right) ^{-1}B' \end{aligned}$$
(5.19)

belongs to \({\mathcal {C}}_{{{\mathbb {H}}},n}\) and is the unique Carathéodory-class power series whose first \(n+1\) coefficients equal \(g_0,\ldots ,g_{n}\).

Proof

We use the given \(g_0,\ldots ,g_n\) to define the elements \(f_0,\ldots ,f_n\) by the formula (5.11) with n replaced by \(n+1\). Then, we introduce the matrices \(\textbf{T}^f_{n+1}\) and \({\mathfrak {S}}_{n+1}=I-\textbf{T}^f_{n+1}\textbf{T}^{f*}_{n+1}\). Since \(\textrm{rank}({\mathfrak {S}}_n)=\textrm{rank}({\mathfrak {S}}_{n+1})=n\) (by (5.18) and Remark 5.3), there is a unique \(s\in {\mathcal {S}}_{{\mathbb {H}}}\) subject to condition (4.13); it is a Blaschke product of degree n and is given by the realization formula (4.10). Its inverse Cayley transform \(h={\mathfrak {T}}^{-1}[s]=(1-s)^{-1}(1+s)\) has all the desired properties. It remains to verify that \(h={\mathfrak {T}}^{-1}[s]\) can be written in the form (5.19).

Toward this end, note that for s defined as in (4.10), we have

$$\begin{aligned}&(1-s(z))^{-1}=\big (1-f_0-z\textbf{e}_n^*(I-zA)^{-1}B\big )^{-1}\\&\quad =(1-f_0)^{-1}+z(1-f_0)^{-1}{} \textbf{e}_n^*\big (1-z(A+B(1-f_0)^{-1}\textbf{e}^*_n)\big )^{-1}B(1-f_0)^{-1}, \end{aligned}$$

from which we get

$$\begin{aligned} h(z)&=(1-s(z))^{-1}(1+s(z))=-1+2(1-s(z))^{-1}\nonumber \\&=g_0+2z(1-f_0)^{-1}{} \textbf{e}_n^*\big (1-z(A+B(1-f_0)^{-1}\textbf{e}^*_n)\big )^{-1}B(1-f_0)^{-1}.\qquad \end{aligned}$$
(5.20)
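The state-space inversion formula in the first display above can be checked numerically at a sample point. The sketch below uses real data and a generic state matrix \(A\) (complex or quaternionic entries would behave the same way), with \(\textbf{e}\) taken, as an assumed convention, to be the first standard basis vector.

```python
import numpy as np

rng = np.random.default_rng(1)
r = 4
A = 0.3 * rng.standard_normal((r, r))                # generic state matrix
B = rng.standard_normal((r, 1))
e = np.zeros((r, 1)); e[0] = 1.0                     # first standard basis vector
f0 = 0.4                                             # scalar with 1 - f0 != 0
I = np.eye(r)

def s(z):
    # s(z) = f_0 + z e^*(I - zA)^{-1}B, as in (4.10)
    return f0 + z * (e.T @ np.linalg.solve(I - z * A, B))[0, 0]

# State matrix of the inverse system: A + B(1 - f0)^{-1} e^*
A_inv = A + B @ e.T / (1 - f0)

def rhs(z):
    # right-hand side of the displayed inversion formula
    return (1 / (1 - f0)
            + z / (1 - f0)
            * (e.T @ np.linalg.solve(I - z * A_inv, B))[0, 0] / (1 - f0))

z = 0.17
assert np.isclose(1.0 / (1.0 - s(z)), rhs(z))
```

The formula is the standard feedback realization of the inverse of a transfer function with invertible value at the origin.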

It remains to express the right side in terms of the original \(g_0,\ldots ,g_n\). We first use the equalities

$$\begin{aligned} B(1-f_0)^{-1}=\big (I+\textbf{T}_n^g\big )^{-1}B',\quad 2(1-f_0)^{-1}\textbf{e}_n^*=\textbf{e}_n^*\big (I+\textbf{T}_n^g\big ) \end{aligned}$$

(the first follows from relation (5.8) with \(n+1\) instead of n, and the second is immediate, since \((1+g_0)(1-f_0)=2\)) to write (5.20) as

$$\begin{aligned} h(z)=g_0+z\textbf{e}_n^*\big (1-zU\big )^{-1}B', \end{aligned}$$

where

$$\begin{aligned} U=\big (I+\textbf{T}_n^g\big )A\big (I+\textbf{T}_n^g\big )^{-1}+B'\textbf{e}^*_n\big (I+\textbf{T}_n^g\big )^{-1}. \end{aligned}$$
(5.21)

It remains to show that \(U=A'\). To this end, we write the companion matrix A in (4.9) as

$$\begin{aligned} A=Z_n^*-\widetilde{\textbf{e}}_n\begin{bmatrix}f_n&\ldots&f_1\end{bmatrix}{} \textbf{T}_n^{f*}{\mathfrak {S}}_n^{-1} \end{aligned}$$

(\(\widetilde{\textbf{e}}_n\) is defined in (2.6)) and then make substitutions (5.8), (5.9) and

$$\begin{aligned} \begin{bmatrix}f_n&\ldots&f_1\end{bmatrix}=2(1+g_0)^{-1}\begin{bmatrix}g_n&\ldots&g_1\end{bmatrix}\big (I+\textbf{T}_n^g\big )^{-1} \end{aligned}$$

to write A in terms of \(g_0,\ldots ,g_n\) as

$$\begin{aligned} A=Z_n^*-\widetilde{\textbf{e}}_n(1+g_0)^{-1}\begin{bmatrix}g_n&\ldots&g_1\end{bmatrix}\big (I+\textbf{T}_n^g\big )^{-1} (\textbf{T}_n^{g*}-I){\mathfrak {C}}_n^{-1}\big (I+\textbf{T}_n^{g}\big ). \end{aligned}$$

Substituting the latter expression into (5.21) results in

$$\begin{aligned} U&=\left( \big (I+\textbf{T}_n^g\big )Z_n^*+B'{} \textbf{e}^*_n\right) \big (I+\textbf{T}_n^g\big )^{-1}\nonumber \\&\quad -\widetilde{\textbf{e}}_n\begin{bmatrix}g_n&\ldots&g_1\end{bmatrix} \big (I+\textbf{T}_n^g\big )^{-1}(\textbf{T}_n^{g*}-I){\mathfrak {C}}_n^{-1}. \end{aligned}$$
(5.22)

By direct inspection, one can see that

$$\begin{aligned} \big (I+\textbf{T}_n^g\big )Z_n^*+B'{} \textbf{e}^*_n= Z_n^*\big (I+\textbf{T}_n^g\big )+\widetilde{\textbf{e}}_n\begin{bmatrix}g_n&\ldots&g_1\end{bmatrix}, \end{aligned}$$

which allows us to write (5.22) as

$$\begin{aligned} U&=Z_n^*+\widetilde{\textbf{e}}_n\begin{bmatrix}g_n&\ldots&g_1\end{bmatrix}\big (I+\textbf{T}_n^g\big )^{-1}\big ( I-(\textbf{T}_n^{g*}-I){\mathfrak {C}}_n^{-1}\big )\\&=Z_n^*+\widetilde{\textbf{e}}_n\begin{bmatrix}g_n&\ldots&g_1\end{bmatrix}\big (I+\textbf{T}_n^g\big )^{-1} \left( {\mathfrak {C}}_n-\textbf{T}_n^{g*}+I\right) {\mathfrak {C}}_n^{-1}\\&=Z_n^*+\widetilde{\textbf{e}}_n\begin{bmatrix}g_n&\ldots&g_1\end{bmatrix}{\mathfrak {C}}_n^{-1}, \end{aligned}$$

where the last equality holds since \({\mathfrak {C}}_n-\textbf{T}_n^{g*}+I=\textbf{T}_n^g+I\). Therefore, \(U=A^\prime \), and the proof is complete. \(\square \)
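The commutation identity used in the last step of this proof (the one relating \((I+\textbf{T}_n^g)Z_n^*\) and \(Z_n^*(I+\textbf{T}_n^g)\)) can be verified numerically. A complex-coefficient sketch, under the assumed conventions that \(Z_n\) is the lower shift, \(\textbf{e}_n\) the first and \(\widetilde{\textbf{e}}_n\) the last standard basis vector:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5
g = rng.standard_normal(n + 1) + 1j * rng.standard_normal(n + 1)   # g_0, ..., g_n

Z = np.diag(np.ones(n - 1), -1)                      # lower shift Z_n
Zs = Z.conj().T
Tg = sum(g[j] * np.linalg.matrix_power(Z, j) for j in range(n))
I = np.eye(n)

e = np.zeros((n, 1)); e[0] = 1.0                     # e_n  (first basis vector)
et = np.zeros((n, 1)); et[-1] = 1.0                  # \tilde{e}_n  (last basis vector)
Bp = g[1:].reshape(n, 1)                             # B' = col(g_1, ..., g_n)
row = g[n:0:-1].reshape(1, n)                        # [g_n ... g_1]

lhs = (I + Tg) @ Zs + Bp @ e.T
rhs = Zs @ (I + Tg) + et @ row
assert np.allclose(lhs, rhs)
```

Both sides are the same "almost upper-shifted" Toeplitz-like matrix: the rank-one terms supply the column and row that the two one-sided shifts of \(I+\textbf{T}_n^g\) truncate.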

The current section is included for the convenience of future reference. For example, Theorem 5.6 leads to a Herglotz-type representation theorem for left- and right-regular functions generated by a Carathéodory-class power series, which in turn leads to a quite meaningful quaternionic version of the trigonometric moment problem and, subsequently, to a spectral theorem for quaternionic unitary operators very much similar to the classical complex-valued one. These topics will be elaborated in a separate publication. In the last section, we return to the Schur-class setting and discuss its indefinite generalization.

6 The generalized Schur class \({\mathcal {S}}^\kappa _{{\mathbb {H}}}\)

Following the classical case [18], we say that a power series \(f\in {\mathbb {H}}[[z]]\) belongs to the generalized Schur class \({\mathcal {S}}^\kappa _{{\mathbb {H}}}\) if there exists an integer \(n_0\ge 0\), such that the Hermitian matrices \({\mathfrak {S}}^f_n=I-\textbf{T}_n^{f}{} \textbf{T}_n^{f*}\) have \(\kappa \) negative eigenvalues counted with multiplicities:

$$\begin{aligned} \nu _-({\mathfrak {S}}^f_n)=\kappa \quad \text{ for } \text{ all }\quad n\ge n_0. \end{aligned}$$

Similarly, a power series \(g\in {\mathbb {H}}[[z]]\) belongs to the generalized Carathéodory class \({\mathcal {C}}^\kappa _{{\mathbb {H}}}\) if the above equalities hold for the matrices \({\mathfrak {C}}^g_n=\textbf{T}_n^g+\textbf{T}_n^{g*}\) rather than \({\mathfrak {S}}^f_n\). In the complex setting, this class appeared in [16].

6.1 The indefinite Carathéodory–Schur problem

The problem consists of finding an \(f\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\) with prescribed first n coefficients \(f_0,\ldots ,f_{n-1}\)

$$\begin{aligned} f(z)=f_0+zf_1+\cdots +f_{n-1}z^{n-1}+\cdots ,\qquad f\in \mathcal S^\kappa _{{\mathbb {H}}}, \end{aligned}$$
(6.1)

with the minimal possible \(\kappa \) (which cannot be less than \(\nu _-({\mathfrak {S}}_n)\), by the eigenvalue interlacing theorem). If the matrix \({\mathfrak {S}}_n\) is invertible, then the minimal \(\kappa \) equals \(\nu _-({\mathfrak {S}}_n)\), as the next result shows.

Theorem 6.1

Let us suppose that \({\mathfrak {S}}_n\) is invertible, let \(\kappa :=\nu _-({\mathfrak {S}}_n)\), and let \(\Psi \) and \(\Theta \) be the polynomials defined as in (3.16), (3.17). Then, the formula

$$\begin{aligned} f=(\theta _{11}\varepsilon +\theta _{12})(\theta _{21}\varepsilon +\theta _{22})^{-1}= (\psi _{11}-\varepsilon \psi _{21})^{-1}(\varepsilon \psi _{22}-\psi _{12}) \end{aligned}$$
(6.2)

with free parameter \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) subject to conditions (3.41) parametrizes all \(f\in \mathcal S^\kappa _{{\mathbb {H}}}\) of the form (6.1).

Proof

Let us assume that f belongs to \({\mathcal {S}}^\kappa _{{\mathbb {H}}}\) (i.e., \(\nu _-({\mathfrak {S}}^f_{n+k})=\nu _-({\mathfrak {S}}_n)=\kappa \) for all \(k\ge 1\)) and satisfies (6.1). Then, by Theorem 3.14, f is of the form (3.44) for some \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) subject to conditions (3.41).

Conversely, let f be of the form (3.44) for some \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) subject to conditions (3.41). Again, by Theorem 3.14, the first n coefficients of f are equal to the prescribed \(f_0,\ldots ,f_{n-1}\) and equalities (3.43) hold. Thus, \(\nu _-({\mathfrak {S}}^f_{n+k})=\nu _-({\mathfrak {S}}_n)=\kappa \) for all \(k\ge 1\), and hence, \(f\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\). \(\square \)

The singular case relies on Theorem 3.5 in the \({\mathfrak {S}}_n\)-setting.

Theorem 6.2

Let us suppose that \({\mathfrak {S}}_n\) is singular, \(\textrm{rank}({\mathfrak {S}}_n)=d<n\), and let \({\mathfrak {S}}_r\) (\(r<n\)) be the maximal invertible leading submatrix of \({\mathfrak {S}}_n\).

  1.

    If \(d=r\) (i.e., \(\textrm{rank}({\mathfrak {S}}_n)=\textrm{rank}({\mathfrak {S}}_r)\)), then the formula (4.10) (with n replaced by r) defines a unique \(s\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\) (\(\kappa =\nu _-({\mathfrak {S}}_n)=\nu _-({\mathfrak {S}}_r)\)) with initial coefficients \(f_0,\ldots ,f_{n-1}\).

  2.

    If \(d>r\), then the minimally possible \(\kappa \) equals

    $$\begin{aligned} \kappa =\nu _-({\mathfrak {S}}_n)+n-d=\nu _-({\mathfrak {S}}_n)+\nu _0({\mathfrak {S}}_n), \end{aligned}$$
    (6.3)

    where \(\nu _0\) stands for the multiplicity of the zero eigenvalue.

Proof

It was verified in the proof of Theorem 4.8 that the first \(r+1\) coefficients of s of form (4.10) (with r instead of n) are equal to \(f_0,\ldots , f_r\). Therefore, \({\mathfrak {S}}_r^s={\mathfrak {S}}_r\) and \({\mathfrak {S}}_{r+1}^s={\mathfrak {S}}_{r+1}\). We next show that for all \(m>r\)

$$\begin{aligned} {\mathfrak {S}}_m^s:=I-\textbf{T}^s_m\textbf{T}^{s*}_m=\begin{bmatrix}{} \textbf{e}_r^* \\ \textbf{e}_r^*A \\ \vdots \\ \textbf{e}_r^*A^{m-1}\end{bmatrix}{\mathfrak {S}}_r \begin{bmatrix}{} \textbf{e}_r&A^*\textbf{e}_r&\ldots&A^{*(m-1)}{} \textbf{e}_r\end{bmatrix}. \end{aligned}$$
(6.4)

Since the matrices on both sides of (6.4) are Hermitian, it suffices to verify that the corresponding entries on and below the main diagonal are equal, that is

$$\begin{aligned} 1-\sum _{j=0}^{k-1}|s_j|^2&=\textbf{e}_r^*A^{k-1}{\mathfrak {S}}_r A^{*(k-1)}{} \textbf{e}_r\quad \text{ for } \text{ all }\quad k,\nonumber \\ -\sum _{j=0}^{\ell -1} s_{k-\ell +j}{\overline{s}}_j&=\textbf{e}_r^*A^{k-1}{\mathfrak {S}}_r A^{*(\ell -1)}{} \textbf{e}_r\quad \text{ for } \text{ all }\quad k>\ell . \end{aligned}$$
(6.5)

Using the formulas \(s_j=\textbf{e}_r^*A^{j-1}B\) for \(j\ge 1\) and the equalities

$$\begin{aligned} BB^*={\mathfrak {S}}_r-A{\mathfrak {S}}_rA^*,\quad B{\overline{f}}_0=-A{\mathfrak {S}}_r\textbf{e}_r,\quad 1-|f_0|^2=\textbf{e}_r^*{\mathfrak {S}}_r\textbf{e}_r \end{aligned}$$

which follow from (4.11), we transform the left-side expressions in (6.5):

$$\begin{aligned} 1-\sum _{j=0}^{k-1}|s_j|^2&=1-|f_0|^2-\sum _{j=1}^{k-1}\textbf{e}_r^*A^{j-1}BB^*A^{*(j-1)}\textbf{e}_r\\&=\textbf{e}_r^*{\mathfrak {S}}_r\textbf{e}_r-\sum _{j=1}^{k-1}\textbf{e}_r^*A^{j-1}({\mathfrak {S}}_r-A{\mathfrak {S}}_rA^*)A^{*(j-1)}\textbf{e}_r\\&=\sum _{j=1}^{k-1}\textbf{e}_r^*A^{j}{\mathfrak {S}}_rA^{*j}\textbf{e}_r-\sum _{j=1}^{k-2}\textbf{e}_r^*A^{j}{\mathfrak {S}}_rA^{*j}\textbf{e}_r\\&=\textbf{e}_r^*A^{k-1}{\mathfrak {S}}_rA^{*(k-1)}\textbf{e}_r,\\ -\sum _{j=0}^{\ell -1} s_{k-\ell +j}{\overline{s}}_j&=-\textbf{e}_r^*A^{k-\ell -1}B{\overline{f}}_0- \sum _{j=1}^{\ell -1} \textbf{e}_r^*A^{k-\ell +j-1}BB^*A^{*(j-1)}\textbf{e}_r\\&=\textbf{e}_r^*A^{k-\ell }{\mathfrak {S}}_r\textbf{e}_r-\sum _{j=1}^{\ell -1} \textbf{e}_r^*A^{k-\ell +j-1}({\mathfrak {S}}_r-A{\mathfrak {S}}_rA^*)A^{*(j-1)}\textbf{e}_r\\&=\textbf{e}_r^*A^{k-\ell }\bigg ({\mathfrak {S}}_r-\sum _{j=1}^{\ell -1}A^{j-1}({\mathfrak {S}}_r-A{\mathfrak {S}}_rA^*)A^{*(j-1)}\bigg )\textbf{e}_r\\&=\textbf{e}_r^*A^{k-1}{\mathfrak {S}}_r A^{*(\ell -1)}\textbf{e}_r, \end{aligned}$$

confirming equalities (6.5) and hence (6.4). It follows from (6.4) that:

$$\begin{aligned} \textrm{rank}({\mathfrak {S}}_m^s)\le \textrm{rank}({\mathfrak {S}}_r)\quad \text{ for } \text{ all }\quad m>r. \end{aligned}$$

Since \({\mathfrak {S}}_r^s={\mathfrak {S}}_r\), we actually have \(\textrm{rank}({\mathfrak {S}}_m^s)=\textrm{rank}({\mathfrak {S}}_r)\) and, therefore, \(\nu _-({\mathfrak {S}}_m^s)=\nu _-({\mathfrak {S}}_r)=\kappa \) for all \(m>r\). Therefore, \(s\in {\mathcal {S}}_{{\mathbb {H}}}^\kappa \). Since both \({\mathfrak {S}}_n\) and \({\mathfrak {S}}_n^s\) are structured extensions of \({\mathfrak {S}}_r\) with \(\nu _-({\mathfrak {S}}_n)=\nu _-({\mathfrak {S}}_n^s)=\nu _-({\mathfrak {S}}_r)\), they are equal by Lemma 3.1, and hence, \(s_j=f_j\) for \(j=1,\ldots ,n-1\).
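The formula \(s_j=\textbf{e}_r^*A^{j-1}B\) for the Taylor coefficients of the realization (4.10) comes from expanding \((I-zA)^{-1}=\sum _m z^mA^m\). A numerical sanity check (real data, coefficients recovered independently by discrete Fourier inversion on a small circle; the convention \(\textbf{e}_r=\) first basis vector is assumed):

```python
import numpy as np

rng = np.random.default_rng(6)
r = 4
A = 0.4 * rng.standard_normal((r, r))        # generic state matrix
B = rng.standard_normal((r, 1))
e = np.zeros((r, 1)); e[0] = 1.0             # e_r: first standard basis vector
f0 = 0.2
I = np.eye(r)

# Predicted coefficients of s(z) = f_0 + z e^*(I - zA)^{-1}B:
# s_0 = f_0 and s_j = e^* A^{j-1} B for j >= 1.
coeffs = [f0] + [(e.T @ np.linalg.matrix_power(A, j - 1) @ B)[0, 0]
                 for j in range(1, 6)]

# Independent recovery via the FFT on the circle |z| = rho
N, rho = 64, 0.1
zs = rho * np.exp(2j * np.pi * np.arange(N) / N)
vals = np.array([f0 + z * (e.T @ np.linalg.solve(I - z * A, B))[0, 0]
                 for z in zs])
recovered = np.fft.fft(vals) / N / rho ** np.arange(N)

assert np.allclose(coeffs, recovered[:6])
```

The aliasing error of the FFT recovery is of size \(\Vert A\Vert ^{N}\rho ^{N}\), negligible for the chosen radius.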

To prove (2), we recall that for any choice of \(f_n,\ldots , f_{2n-d}\), the matrix \({\mathfrak {S}}_{2n-d}\) is invertible and has \(\nu _-({\mathfrak {S}}_r)+n-d\) negative eigenvalues. Therefore, the \(\kappa \) defined in (6.3) is the minimal possible. To get all \(f\in \mathcal S_{{\mathbb {H}}}^\kappa \) with the first n coefficients equal to \(f_0,\ldots ,f_{n-1}\), we first choose an arbitrary tuple \(\textbf{f}=(f_n,\ldots ,f_{2n-d})\) and then apply the linear fractional formula (6.2) with the polynomial \(\Psi _\textbf{f}\) constructed via formula (3.16), but based on the given \(f_0,\ldots ,f_{2n-d}\); the formula (3.16) makes sense, since the matrix \({\mathfrak {S}}_{2n-d}\) is invertible. \(\square \)

6.2 Regular meromorphic functions associated with \({\mathcal {S}}^\kappa _{{\mathbb {H}}}\)

To put the class \({\mathcal {S}}^\kappa _{{\mathbb {H}}}\) in the function-theoretic context, we need the absolute convergence of these series in a neighborhood of the origin.

Theorem 6.3

Any power series \(f\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\) converges absolutely in a neighborhood of the origin.

Proof

For any \(f\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\), there is \(n\ge \kappa \), such that

$$\begin{aligned} {\mathfrak {S}}_n:={\mathfrak {S}}^f_{n}\quad \text{ is } \text{ invertible } \text{ and }\quad \nu _-({\mathfrak {S}}_{n})=\kappa . \end{aligned}$$

Indeed, if no such n existed, then, in particular, the matrix \({\mathfrak {S}}^f_\kappa \) would have \(\kappa \) negative eigenvalues and be singular, which is impossible. We next fix such an n, define the polynomial \(\Theta \) as in (3.17), and conclude by Theorem 6.1 that f admits a representation (6.2) for some \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) such that \(\theta _{21,0}\varepsilon _0+\theta _{22,0}\ne 0\). Therefore, the power series \(\theta _{21}\varepsilon +\theta _{22}\) has no zeros in a neighborhood of the origin, and hence f (of the form (6.2)) converges absolutely in this neighborhood. \(\square \)

Thus, the power series \(f\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\) can be left and right evaluated in a neighborhood of the origin, giving rise to left- and right-regular functions \(f^{\varvec{e_\ell }}\) and \(f^{\varvec{e_r}}\). Further elaboration comes from the Krein–Langer-type factorization result: for any \(f\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\), there exist Schur-class power series \(S_L\), \(S_R\) and Blaschke products \(B_L\) and \(B_R\) of degree \(\kappa \) so that f admits the following power-series factorizations:

$$\begin{aligned} f(z)=B_L(z)^{-1}S_L(z)=S_R(z)B_R(z)^{-1}. \end{aligned}$$

Furthermore, \(B_LB_L^\sharp =B_RB_R^\sharp \) (where \(B^\sharp \) is defined via formula (5.3)). If we denote by \({\mathcal {Z}}\) the zero set of the real Blaschke product \({\widetilde{B}}:=B_LB_L^\sharp =B_RB_R^\sharp \), then the functions \(f^{\varvec{e_\ell }}\) and \(f^{\varvec{e_r}}\) admit meromorphic (semi-regular) extensions to \({\mathbb {B}}\backslash {\mathcal {Z}}\) by the formulas

$$\begin{aligned} f^{\varvec{e_\ell }}(\alpha )={\widetilde{B}}(\alpha )^{-1}(B_L^\sharp S_L)^{\varvec{e_\ell }}(\alpha ),\quad f^{\varvec{e_r}}(\alpha )=(S_R B^\sharp _R)^{\varvec{e_r}}(\alpha ){\widetilde{B}}(\alpha )^{-1}. \end{aligned}$$
(6.6)

We refer to [3] and [4] for further details. Note that the left-regular generalized Schur functions considered in [3, 4] are slightly more general than the ones arising from \(f\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\), as they are allowed to have a pole at the origin.

6.3 Excluded parameters

We now return to the parametrization formula (6.2) and focus on parameters \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) that do not satisfy conditions (3.41); we will refer to them as excluded parameters. In what follows, we will use the notation:

$$\begin{aligned} \textbf{L}_\Psi [\varepsilon ]:=(\psi _{11}-\varepsilon \psi _{21})^{-1}(\varepsilon \psi _{22}-\psi _{12}), \quad \Psi =\left[ {\begin{matrix} \psi _{11} &{} \psi _{12}\\ \psi _{21}&{}\psi _{22} \end{matrix}}\right] , \end{aligned}$$
(6.7)

for the left linear fractional transformation based on the matrix polynomial \(\Psi \). As a consequence of part (2) in Theorem 3.14, we get the following result.

Remark 6.4

An excluded parameter of the linear fractional transformation (6.7) exists if and only if \(\widetilde{\textbf{e}}^*_n{\mathfrak {S}}_n^{-1}\widetilde{\textbf{e}}_n\le 0\).

The notion of an excluded parameter was introduced in [14] in the context of the Nevanlinna–Pick interpolation problem (with no derivatives involved in the interpolation conditions) for generalized Schur functions. In the setting of the Carathéodory–Schur problem (see [5] for the complex case), the power series \(\psi _{11}-\varepsilon \psi _{21}\) may have a multiple zero at the origin which suggests a more detailed classification of excluded parameters.

Definition 6.5

We will say that \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) is an excluded parameter of order at least k of the linear fractional transformation (6.7) if

$$\begin{aligned} \psi _{11}-\varepsilon \psi _{21}=z^kh\quad \text{ for } \text{ some } \; \; h\in {\mathbb {H}}[[z]], \end{aligned}$$
(6.8)

and is an excluded parameter of order k if, in addition, \(h_0=h(0)\ne 0\).

The next criterion extends Remark 6.4.

Proposition 6.6

There exists an excluded parameter \(\varepsilon \in \mathcal S_{{\mathbb {H}}}\) of order at least \(k\le n\) of the transformation (6.7) if and only if the \(k\times k\) bottom principal submatrix of \({\mathfrak {S}}_n^{-1}\) is negative semidefinite.

Proof

By Remark 3.9, at least one of the elements \(\psi _{11,0}\) and \(\psi _{21,0}\) is non-zero. Therefore, the condition (6.8) implies in particular that \(\psi _{21,0}\ne 0\) and hence, \(\psi _{21}\) is invertible in \({\mathbb {H}}[[z]]\). Writing (6.8) in terms of associated Toeplitz matrices as \(\textbf{T}_k^{\varepsilon }{} \textbf{T}_k^{\psi _{21}}=\textbf{T}_k^{\psi _{11}}\), we conclude by Remark 4.5 that (6.8) holds for some \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) if and only if the following matrix \(R_k\) is positive semidefinite:

$$\begin{aligned} R_k:=\textbf{T}_k^{\psi _{21}*}{} \textbf{T}_k^{\psi _{21}}-\textbf{T}_k^{\psi _{11}*}{} \textbf{T}_k^{\psi _{11}} =\textbf{T}_k^{\psi _{21}*} \left( I_k-\textbf{T}_k^{\varepsilon *}\textbf{T}_k^{\varepsilon }\right) \textbf{T}_k^{\psi _{21}}\succeq 0. \end{aligned}$$
(6.9)
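The translation of the power-series condition (6.8) into the Toeplitz-matrix identity \(\textbf{T}_k^{\varepsilon }\textbf{T}_k^{\psi _{21}}=\textbf{T}_k^{\psi _{11}}\) rests on the fact that multiplication of power series modulo \(z^k\) corresponds to multiplication of their \(k\times k\) lower-triangular Toeplitz matrices. A complex-coefficient sketch of this correspondence (over \({\mathbb {H}}\) the order of the factors matters, but the mechanism is the same):

```python
import numpy as np

rng = np.random.default_rng(5)
k = 6
a = rng.standard_normal(k) + 1j * rng.standard_normal(k)
b = rng.standard_normal(k) + 1j * rng.standard_normal(k)

Z = np.diag(np.ones(k - 1), -1)                      # k x k lower shift, Z^k = 0

def T(c):
    # lower-triangular Toeplitz matrix of the series with coefficients c
    return sum(c[j] * np.linalg.matrix_power(Z, j) for j in range(k))

prod = np.convolve(a, b)[:k]                         # coefficients of a(z)b(z) mod z^k
assert np.allclose(T(a) @ T(b), T(prod))
```

Since \(Z^k=0\), the map \(c\mapsto \textbf{T}_k^c\) is an algebra homomorphism from power series modulo \(z^k\) to lower-triangular Toeplitz matrices.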

The desired statement will follow once we show that

$$\begin{aligned} R_k=-\begin{bmatrix}0&I_k\end{bmatrix}{\mathfrak {S}}_n^{-1}\begin{bmatrix}0 \\ I_k\end{bmatrix} \quad \text{ for } \text{ all }\quad k\le n. \end{aligned}$$
(6.10)

Toward this end, we first observe that \(R_k\) satisfies (and is uniquely recovered from) the Stein identity

$$\begin{aligned} R_k-Z_k^*R_kZ_k=M_kM_k^*-N_kN_k^*, \end{aligned}$$
(6.11)

where \(M_k^*\) and \(N_k^*\) are the bottom rows of the matrices \(\textbf{T}_k^{\psi _{21}}\) and \(\textbf{T}_k^{\psi _{11}}\), respectively:

$$\begin{aligned} M^*_k&=\widetilde{\textbf{e}}_k^*\textbf{T}_k^{\psi _{21}}= \begin{bmatrix}\psi _{21,k-1}&\ldots&\psi _{21,1}&\psi _{21,0}\end{bmatrix}, \nonumber \\ N^*_k&=\widetilde{\textbf{e}}_k^*\textbf{T}_k^{\psi _{11}}= \begin{bmatrix}\psi _{11,k-1}&\ldots&\psi _{11,1}&\psi _{11,0}\end{bmatrix}. \end{aligned}$$
(6.12)

Indeed, since \(Z_k\textbf{T}_k^{\psi _{21}}=\textbf{T}_k^{\psi _{21}}Z_k\) and \(I-Z_k^*Z_k=\widetilde{\textbf{e}}_k\widetilde{\textbf{e}}_k^*\), we have

$$\begin{aligned} R_k-Z_k^*R_kZ_k&= \textbf{T}_k^{\psi _{21}*}\left( I-Z_k^*Z_k\right) \textbf{T}_k^{\psi _{21}} -\textbf{T}_k^{\psi _{11}*}\left( I-Z_k^*Z_k\right) \textbf{T}_k^{\psi _{11}}\\&=\textbf{T}_k^{\psi _{21}*}\widetilde{\textbf{e}}_k\widetilde{\textbf{e}}_k^*\textbf{T}_k^{\psi _{21}} -\textbf{T}_k^{\psi _{11}*}\widetilde{\textbf{e}}_k\widetilde{\textbf{e}}_k^*\textbf{T}_k^{\psi _{11}} =M_kM_k^*-N_kN_k^*. \end{aligned}$$
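Both the Stein identity (6.11) and the uniqueness of its solution (a consequence of the nilpotency of \(Z_k\)) can be tested numerically; the sketch below uses random complex coefficients for \(\psi _{11}\) and \(\psi _{21}\) as a stand-in for the quaternionic data.

```python
import numpy as np

rng = np.random.default_rng(3)
k = 5
psi21 = rng.standard_normal(k) + 1j * rng.standard_normal(k)
psi11 = rng.standard_normal(k) + 1j * rng.standard_normal(k)

Z = np.diag(np.ones(k - 1), -1)                      # lower shift, nilpotent
T21 = sum(psi21[j] * np.linalg.matrix_power(Z, j) for j in range(k))
T11 = sum(psi11[j] * np.linalg.matrix_power(Z, j) for j in range(k))

R = T21.conj().T @ T21 - T11.conj().T @ T11          # R_k as in (6.9)
M = T21.conj().T[:, -1:]                             # M_k = (bottom row of T21)^*
N = T11.conj().T[:, -1:]                             # N_k = (bottom row of T11)^*
rhs = M @ M.conj().T - N @ N.conj().T

# Stein identity (6.11)
assert np.allclose(R - Z.conj().T @ R @ Z, rhs)

# Uniqueness: Z is nilpotent, so iterating the identity recovers R
R_rec = sum(np.linalg.matrix_power(Z.conj().T, m) @ rhs
            @ np.linalg.matrix_power(Z, m) for m in range(k))
assert np.allclose(R, R_rec)
```

The recovery step makes the uniqueness claim concrete: iterating \(R=Z_k^*RZ_k+M_kM_k^*-N_kN_k^*\) terminates after k steps because \(Z_k^k=0\).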

One can see from (3.16) that for \(k=n\), the formulas (6.12) can be written as

$$\begin{aligned} \begin{bmatrix}N^*_n \\ M^*_n\end{bmatrix}=\begin{bmatrix}{} \textbf{e}^*\\ F^*_n\end{bmatrix} (I-Z^*_n)^{-1}{\mathfrak {S}}_n^{-1}(I-Z_n) \end{aligned}$$
(6.13)

which together with (3.23) implies

$$\begin{aligned} N_nN_n^*-M_nM_n^*&=(I-Z_n^*){\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}\left( \textbf{e}{} \textbf{e}^*-F_nF_n^*\right) \nonumber \\&\qquad \times (I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}(I-Z_n)\nonumber \\&=(I-Z_n^*){\mathfrak {S}}_n^{-1}+{\mathfrak {S}}_n^{-1}(I-Z_n)-(I-Z_n^*){\mathfrak {S}}_n^{-1}(I-Z_n)\nonumber \\&={\mathfrak {S}}_n^{-1}-Z_n^*{\mathfrak {S}}_n^{-1}Z_n. \end{aligned}$$
(6.14)

Upon comparing the \(k\times k\) bottom principal blocks in (6.14), we see that the \(k\times k\) bottom principal submatrix of \({\mathfrak {S}}_n^{-1}\) satisfies the same Stein equation (6.11) as the matrix \(-R_k\). By the uniqueness of the solution, we have equality (6.10) which completes the proof. \(\square \)

The next statement shows, in particular, that the assumption \(k\le n\) in Proposition 6.6 is not restrictive.

Proposition 6.7

Let \(\varepsilon \) be an excluded parameter of the transformation (6.7) of order at least k, i.e., let us assume that (6.8) holds. Then, \(k\le n\) and

$$\begin{aligned} \psi _{12}-\varepsilon \psi _{22}=z^kg\quad \text{ for } \text{ some } \; \; g\in {\mathbb {H}}[[z]]. \end{aligned}$$
(6.15)

Therefore, the formula \(f_\varepsilon =\textbf{L}_\Psi [\varepsilon ]\) defines a power series \(f_\varepsilon \in {\mathbb {H}}[[z]]\).

Proof

By (3.16), the leading coefficients of \(\psi _{11}\) and \(\psi _{21}\) are given by

$$\begin{aligned} \psi _{11,n}=1-\textbf{e}_n^*(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}{} \textbf{e}_n, \quad \psi _{21,n}=-F_n^*(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}{} \textbf{e}_n. \qquad \end{aligned}$$
(6.16)

Then, it follows by (3.23) that:

$$\begin{aligned} |\psi _{21,n}|^2-|\psi _{11,n}|^2&= \textbf{e}_n^*{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}\left( F_nF_n^*-\textbf{e}_n\textbf{e}_n^* \right) (I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}{} \textbf{e}_n\\&\quad +\textbf{e}_n^*(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}{} \textbf{e}_n +\textbf{e}_n^*{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}{} \textbf{e}_n-1\\&=\textbf{e}_n^*{\mathfrak {S}}_n^{-1}{} \textbf{e}_n-1. \end{aligned}$$

Similar computations show that for \(j=0,\ldots ,n-1\)

$$\begin{aligned} |\psi _{21,j}|^2-|\psi _{11,j}|^2&= \textbf{e}_n^*Z_n^{*(n-j-1)}(I-Z_n^*){\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}\left( F_nF_n^*-\textbf{e}_n\textbf{e}_n^*\right) \nonumber \\&\quad \times (I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}(I-Z_n)Z_n^{(n-j-1)}{} \textbf{e}_n\nonumber \\&=\textbf{e}_n^*Z_n^{*(n-j-1)}\left( Z_n^*{\mathfrak {S}}_n^{-1}Z_n-{\mathfrak {S}}_n^{-1}\right) Z_n^{(n-j-1)}\textbf{e}_n.\qquad \end{aligned}$$
(6.17)

If we now consider the matrix \(R_{n+1}\) defined as in (6.9), then it follows from the latter computations that its leading entry is equal to:

$$\begin{aligned} \left[ R_{n+1}\right] _{11}&=\sum _{j=0}^n \left( |\psi _{21,j}|^2-|\psi _{11,j}|^2\right) \nonumber \\&=\textbf{e}_n^*{\mathfrak {S}}_n^{-1}{} \textbf{e}_n+\sum _{j=0}^{n-1}\textbf{e}_n^*Z_n^{*(n-j-1)}\left( Z_n^*{\mathfrak {S}}_n^{-1}Z_n- {\mathfrak {S}}_n^{-1}\right) Z_n^{(n-j-1)}{} \textbf{e}_n-1\nonumber \\&=\textbf{e}_n^*Z_n^{*n}{\mathfrak {S}}_n^{-1}Z_n^n\textbf{e}_n-1=-1. \end{aligned}$$
(6.18)

If the condition (6.8) holds for \(k>n\), then the matrix \(R_{n+1}\) is positive semidefinite (by (6.9)) which is not the case, due to (6.18).
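The telescoping step behind (6.18) uses only the nilpotency of the shift (\(Z_n^n=0\)) and works with any Hermitian matrix in place of \({\mathfrak {S}}_n^{-1}\); a minimal numerical check:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 6
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S = X + X.conj().T                                   # any Hermitian matrix, playing S_n^{-1}

Z = np.diag(np.ones(n - 1), -1)                      # nilpotent shift: Z^n = 0
e = np.zeros((n, 1)); e[0] = 1.0

def q(m):
    # scalar e^* Z^{*m} S Z^m e
    Zm = np.linalg.matrix_power(Z, m)
    return (e.T @ Zm.conj().T @ S @ Zm @ e)[0, 0]

# Telescoping as in (6.18):
# q(0) + sum_{j=0}^{n-1} (q(n-j) - q(n-j-1)) = q(n) = 0  since Z^n = 0
total = q(0) + sum(q(n - j) - q(n - j - 1) for j in range(n))
assert np.isclose(total, 0.0)
```

In (6.18) the same cancellation leaves only \(\textbf{e}_n^*Z_n^{*n}{\mathfrak {S}}_n^{-1}Z_n^n\textbf{e}_n=0\), whence \(\left[ R_{n+1}\right] _{11}=-1\).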

To prove (6.15), we first observe from (3.16), (6.12) and (2.18) that

$$\begin{aligned} \begin{bmatrix}\psi _{12,k-1}&{} \ldots &{}\psi _{12,1} &{} \psi _{12,0}\\ \psi _{22,k-1}&{} \ldots &{}\psi _{22,1} &{} \psi _{22,0}\end{bmatrix}=\begin{bmatrix}\psi _{11,k-1}&{} \ldots &{}\psi _{11,1} &{} \psi _{11,0}\\ \psi _{21,k-1}&{} \ldots &{}\psi _{21,1} &{} \psi _{21,0}\end{bmatrix}\textbf{T}_k^f \end{aligned}$$
(6.19)

for any \(k=1,\ldots ,n\), which implies the equalities

$$\begin{aligned} \textbf{T}_k^{\psi _{12}}=\textbf{T}_k^{\psi _{11}}\textbf{T}_k^f\quad \text{ and }\quad \textbf{T}_k^{\psi _{22}}=\textbf{T}_k^{\psi _{21}}{} \textbf{T}_k^f\quad \text{ for }\quad k=1,\ldots ,n. \end{aligned}$$

Due to condition (6.8), we have \(\textbf{T}_k^{\varepsilon }{} \textbf{T}_k^{\psi _{21}}=\textbf{T}_k^{\psi _{11}}\), and hence

$$\begin{aligned} \textbf{T}_k^{\varepsilon }\textbf{T}_k^{\psi _{22}}-\textbf{T}_k^{\psi _{12}}= \textbf{T}_k^{\varepsilon }\textbf{T}_k^{\psi _{21}}\textbf{T}_k^f-\textbf{T}_k^{\psi _{11}}\textbf{T}_k^f =\big (\textbf{T}_k^{\varepsilon }\textbf{T}_k^{\psi _{21}}-\textbf{T}_k^{\psi _{11}}\big )\textbf{T}_k^f=0 \end{aligned}$$

which is equivalent to (6.15). \(\square \)

We finally present a refinement of Proposition 6.6.

Proposition 6.8

There exists an excluded parameter \(\varepsilon \in \mathcal S_{{\mathbb {H}}}\) of order k of the transformation (6.7) if and only if the \(k\times k\) bottom principal submatrix of \({\mathfrak {S}}_n^{-1}\) (i.e., the matrix \(-R_k\); see (6.10)) is either

  (1)

    negative definite, in which case there are infinitely many excluded parameters of order k, or

  (2)

    the maximal negative semidefinite bottom principal submatrix of \({\mathfrak {S}}_n^{-1}\), in which case there is a unique excluded parameter \(\varepsilon \) of order k.

Proof

Equality (6.8) specifies (in terms of \(\psi _{11,j}\) and \(\psi _{21,j}\)) the first k coefficients of the unknown \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\). As was pointed out above, equality (6.8) holds for some \(\varepsilon \in \mathcal S_{{\mathbb {H}}}\) if and only if the matrix \(R_k\) is positive semidefinite. We next apply the results from Section 4. If \(R_k\) is singular, there is a unique \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) subject to condition (6.8), and this unique \(\varepsilon \) is a Blaschke product of degree \(\deg \varepsilon =\textrm{rank}(R_k)\). Moreover, the order of this excluded parameter is less than \(k+1\) if and only if the matrix \(R_{k+1}\) is not positive semidefinite.

If \(R_k\succ 0\), then the set of all \(\varepsilon \in \mathcal S_{{\mathbb {H}}}\) subject to condition (6.8) is parametrized by a linear fractional formula

$$\begin{aligned} \varepsilon =({\mathfrak {d}}_{11}-\sigma {\mathfrak {d}}_{21})^{-1}(\sigma {\mathfrak {d}}_{22}-{\mathfrak {d}}_{12}) \end{aligned}$$
(6.20)

with polynomial coefficients and a free parameter \(\sigma \in \mathcal S_{{\mathbb {H}}}\). The first k coefficients \(\varepsilon _0,\ldots , \varepsilon _{k-1}\) of each \(\varepsilon \) of the form (6.20) are the same and guarantee (6.8). The coefficient \(\varepsilon _k\) is uniquely determined by the free coefficient of the parameter \(\sigma \) in (6.20). Since there is a unique \(\varepsilon ^\prime _k\in {\mathbb {H}}\) such that the power series

$$\begin{aligned} \varepsilon (z)=\varepsilon _0+\cdots +\varepsilon _{k-1}z^{k-1}+\varepsilon ^\prime _k z^k+\cdots \end{aligned}$$
(6.21)

is an excluded parameter of order at least \(k+1\), i.e., such that

$$\begin{aligned} \psi _{11}-\varepsilon \psi _{21}=z^{k+1}{\widetilde{h}}\quad \text{ for } \text{ some } \; \; {\widetilde{h}}\in {\mathbb {H}}[[z]], \end{aligned}$$
(6.22)

and since there is a unique \(\sigma ^\prime _0\in {\mathbb {H}}\) producing the series (6.21) via formula (6.20) (more precisely, any \(\sigma \in {\mathcal {S}}_{{\mathbb {H}}}\) with the free coefficient equal to \(\sigma ^\prime _0\) does), we conclude that all excluded parameters \(\varepsilon \) of order k are parametrized by the formula (6.20) with free parameter \(\sigma \in {\mathcal {S}}_{{\mathbb {H}}}\) such that \(\sigma _0\ne \sigma ^\prime _0\). Note that if the matrix \(R_{k+1}\) is not positive semidefinite, then no \(\varepsilon \) subject to condition (6.22) belongs to \(\mathcal S_{{\mathbb {H}}}\), and hence the formula (6.20) with free parameter \(\sigma \in {\mathcal {S}}_{{\mathbb {H}}}\) describes all excluded parameters of order k. \(\square \)

6.4 Quasi-solutions arising from excluded parameters

Any excluded parameter \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) gives rise, via formula (6.7), to a power series \(f_\varepsilon =\textbf{L}_\Psi [\varepsilon ]\) which is not a solution to the Carathéodory problem (6.1): such an \(f_\varepsilon \) neither belongs to \({\mathcal {S}}_{{\mathbb {H}}}^\kappa \) nor satisfies the condition (6.1). However, something can still be said about this \(f_\varepsilon \). The next theorem is the main result of this section.

Theorem 6.9

Let \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) be an excluded parameter of order k. Then, the power series \(f_\varepsilon =\textbf{L}_\Psi [\varepsilon ]\) belongs to the class \({\mathcal {S}}_{\mathbb {H}}^{\kappa -k}\) and is of the form

$$\begin{aligned} f_\varepsilon (z)=f_0+\cdots +f_{n-k-1}z^{n-k-1}+f_{\varepsilon ,n-k} z^{n-k}+\cdots ,\quad f_{\varepsilon ,n-k} \ne f_{n-k}. \qquad \end{aligned}$$
(6.23)

In other words, the first \(n-k\) coefficients of \(f_\varepsilon \) are equal to the prescribed \(f_0,\ldots ,f_{n-k-1}\), but \(f_{\varepsilon ,n-k}\) differs from the prescribed \(f_{n-k}\).

Proof

Since \(\varepsilon \) is an excluded parameter of order k, it satisfies equality (6.8) (with \(h_0\ne 0\)) and therefore, by Proposition 6.7, also equality (6.15). Combining these two equalities gives

$$\begin{aligned} \begin{bmatrix} 1&\quad -\varepsilon \end{bmatrix}\Psi =z^k\begin{bmatrix} h&\quad g\end{bmatrix}. \end{aligned}$$
(6.24)

By (6.2), \((\psi _{11}-\varepsilon \psi _{21})f_\varepsilon =(\varepsilon \psi _{22}-\psi _{12})\) which on account of (6.24) can be written as \(z^khf_\varepsilon =-z^kg\), which in turn allows us to write (6.24) as

$$\begin{aligned} \begin{bmatrix} 1&\quad -\varepsilon \end{bmatrix}\Psi =z^kh\begin{bmatrix} 1&\quad -f_\varepsilon \end{bmatrix}. \end{aligned}$$

Multiplying both parts by \(\Theta \) on the right and making use of (3.24) gives

$$\begin{aligned} z^kh\begin{bmatrix} 1&\quad -f_\varepsilon \end{bmatrix}\Theta =\begin{bmatrix} 1&\quad -\varepsilon \end{bmatrix}\Psi \Theta = z^n\begin{bmatrix} 1&\quad -\varepsilon \end{bmatrix}. \end{aligned}$$

We next cancel \(z^k\) and recall that \(h_0\ne 0\) (i.e., h is invertible in \({\mathbb {H}}[[z]]\)) to conclude that

$$\begin{aligned} \begin{bmatrix} 1&\quad -f_\varepsilon \end{bmatrix}\Theta =z^{n-k}\begin{bmatrix}u&\quad -v\end{bmatrix}\quad \text{ for } \text{ some }\quad u,v\in {\mathbb {H}}[[z]]. \end{aligned}$$

As in the proof of the implication \((2)\Rightarrow (1)\) in Theorem 3.13, we use the polynomials (3.32) and combine the last equality with (3.37) to get

$$\begin{aligned} (f_\varepsilon -p_{n-k})\begin{bmatrix}\theta _{21}&\theta _{22}\end{bmatrix}&= \begin{bmatrix}0&\quad f_\varepsilon -p_{n}\end{bmatrix}\Theta +(p_n-p_{n-k})\begin{bmatrix}\theta _{21}&\theta _{22}\end{bmatrix}\nonumber \\&=\begin{bmatrix} 1&\quad -p_n\end{bmatrix}\Theta -\begin{bmatrix} 1&\quad -f_\varepsilon \end{bmatrix}\Theta +z^{n-k}{\widetilde{p}} \begin{bmatrix}\theta _{21}&\theta _{22}\end{bmatrix}\nonumber \\&=z^{n-k}\big (z^k\widetilde{\textbf{e}}_{n}^*\Phi _0-\begin{bmatrix}u&\quad -v\end{bmatrix}+{\widetilde{p}} \begin{bmatrix}\theta _{21}&\theta _{22}\end{bmatrix}\big ), \end{aligned}$$
(6.25)

where \({\widetilde{p}}\) is the polynomial given by

$$\begin{aligned} {\widetilde{p}}(z)=f_{n-k}+f_{n-k+1}z+\cdots +f_{n-1}z^{k-1}. \end{aligned}$$

Since the free coefficient of the polynomial \(\begin{bmatrix}\theta _{21}&\theta _{22}\end{bmatrix}\) is non-zero, it follows from (6.25) that the first \(n-k\) coefficients of \(f_\varepsilon -p_{n-k}\) vanish and, hence, \(f_\varepsilon \) is indeed of the form:

$$\begin{aligned} f_\varepsilon (z)=f_0+\cdots +f_{n-k-1}z^{n-k-1}+\cdots \end{aligned}$$

It remains to verify the inequality in (6.23) and to show that \(f_\varepsilon \) belongs to \({\mathcal {S}}_{{\mathbb {H}}}^{\kappa -k}\). This will be done below after some needed preliminaries. \(\square \)

Let us consider the decomposition (3.11) (with \(n+k\) replaced by n)

$$\begin{aligned} {\mathfrak {S}}_{n}&=\begin{bmatrix}{\mathfrak {S}}_{n-k} &{} -\textbf{T}_{n-k}^f T_{n-k,k}^* \\ -T_{n-k,k}\textbf{T}_{n-k}^{f*} &{} {\mathfrak {S}}_{k}-T_{n-k,k}T_{n-k,k}^*\end{bmatrix} \nonumber \\&=\begin{bmatrix}I &{} 0 \\ -T_{n-k,k}\textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&{} I\end{bmatrix} \begin{bmatrix}{\mathfrak {S}}_{n-k} &{} 0 \\ 0 &{} \textbf{S}_k\end{bmatrix} \begin{bmatrix}I &{} -{\mathfrak {S}}_{n-k}^{-1}\textbf{T}_{n-k}^f T_{n-k,k}^*\\ 0 &{} I\end{bmatrix}, \end{aligned}$$
(6.26)

where \(\textbf{S}_k\) is the Schur complement of the block \({\mathfrak {S}}_{n-k}\), and the conformal decomposition [justified by equality (6.10)]

$$\begin{aligned} {\mathfrak {S}}_n^{-1}=\begin{bmatrix}\alpha &{} \beta \\ \beta ^* &{} -R_k\end{bmatrix}. \end{aligned}$$
(6.27)
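As a numerical sanity check (no part of the proof), the block factorization (6.26) and the resulting identification of the 22-block of the inverse — the analogue of (6.28) — can be verified in the complex specialization; the matrix below is a generic Hermitian stand-in for \({\mathfrak {S}}_n\), with illustrative sizes.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2
# generic Hermitian matrix standing in for S_n (complex case; quaternionic
# matrices would require a complex matrix representation)
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S = A + A.conj().T
P, B, Q = S[:n-k, :n-k], S[:n-k, n-k:], S[n-k:, n-k:]
Sc = Q - B.conj().T @ np.linalg.inv(P) @ B          # Schur complement of P
L = np.eye(n, dtype=complex)
L[n-k:, :n-k] = B.conj().T @ np.linalg.inv(P)       # unit lower factor
D = np.zeros((n, n), dtype=complex)
D[:n-k, :n-k], D[n-k:, n-k:] = P, Sc                # block diagonal
assert np.allclose(L @ D @ L.conj().T, S)           # analogue of (6.26)
# 22-block of S^{-1} equals Sc^{-1}: the analogue of -R_k = S_k^{-1}
assert np.allclose(np.linalg.inv(S)[n-k:, n-k:], np.linalg.inv(Sc))
print("ok")
```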

Lemma 6.10

Let \({\mathfrak {S}}_n\) and \({\mathfrak {S}}_n^{-1}\) be partitioned as in (6.26), (6.27).

  1.

    If \(R_k\succeq 0\), then \(\nu _-({\mathfrak {S}}_{n-k})=\nu _-({\mathfrak {S}}_n)-k\).

  2.

    If, moreover, \(R_k\succ 0\), then

    $$\begin{aligned} R_k^{-1}=-\textbf{S}_k, \end{aligned}$$
    (6.28)

    and the following equality holds:

    $$\begin{aligned} \begin{bmatrix}N_k&M_k\end{bmatrix}=(I-Z_k^*)\textbf{S}_k^{-1}(I-Z_k)^{-1}\begin{bmatrix}X_k&Y_k\end{bmatrix}, \end{aligned}$$
    (6.29)

where the columns \(N_k,M_k\) are defined via formula (6.12) and \(X_k,Y_k\) are defined via formula (3.15) (with \(n+k\) replaced by n):

    $$\begin{aligned} \begin{bmatrix}X_k&Y_k\end{bmatrix}= (I-Z_k)\begin{bmatrix}T_{n-k,k}{} \textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k\end{bmatrix}(I-Z_{n})^{-1}\begin{bmatrix} \textbf{e}_{n}&F_{n}\end{bmatrix}. \end{aligned}$$
    (6.30)

Proof

Let us consider the matrix

$$\begin{aligned} P_\delta :={\mathfrak {S}}_n^{-1}-\delta \begin{bmatrix} 0 &{} 0 \\ 0 &{} I_k\end{bmatrix}= \begin{bmatrix}\alpha &{} \beta \\ \beta ^* &{} -R_k-\delta I_k\end{bmatrix}, \quad \delta >0. \end{aligned}$$

If \(\delta \) is small enough, then \(P_\delta \) is invertible and has the same inertia as \({\mathfrak {S}}_n^{-1}\) and \({\mathfrak {S}}_n\). On the other hand, since \(R_k+\delta I_k\succ 0\), we have

$$\begin{aligned} \nu _-({\mathfrak {S}}_n)=\nu _-(P_\delta )=\nu _-(S_\delta )+k, \; \; \text{ where } \; \; S_\delta :=\alpha +\beta (R_k+\delta I_k)^{-1}\beta ^*. \qquad \end{aligned}$$
(6.31)

Note that the inertia of \(S_\delta \) is the same for all sufficiently small \(\delta \). Furthermore, since \(P_\delta \) increases and tends to \({\mathfrak {S}}_n^{-1}\) as \(\delta \searrow 0\), it follows that \(P_\delta ^{-1}\) decreases and tends to \({\mathfrak {S}}_n\) as \(\delta \searrow 0\). Since \(P_\delta ^{-1}\) has the form \(P_\delta ^{-1}=\left[ {\begin{matrix} S_\delta ^{-1} &{} *\\ *&{} * \end{matrix}}\right] \), we see that \(S_\delta ^{-1}\) decreases and tends to \({\mathfrak {S}}_{n-k}\). Since \(\nu _-(S_\delta )\) does not depend on \(\delta \), we conclude that \(\nu _-(S_\delta )=\nu _-({\mathfrak {S}}_{n-k})\). Combining the latter equality with (6.31), we get the desired conclusion in part (1).
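The inertia count just used — the inertia of \(P_\delta \) is that of the negative definite block \(-(R_k+\delta I_k)\) plus that of its Schur complement \(S_\delta \) — can be illustrated numerically. The sketch below (real symmetric stand-in, illustrative sizes and data) is not part of the proof.

```python
import numpy as np

def inertia(M, tol=1e-9):
    """(number of positive, number of negative) eigenvalues of Hermitian M."""
    w = np.linalg.eigvalsh(M)
    return int((w > tol).sum()), int((w < -tol).sum())

rng = np.random.default_rng(1)
n, k = 7, 3
A = rng.standard_normal((n, n))
M = (A + A.T) / 2.0
# force the trailing k x k block to be negative definite, mimicking -(R_k + delta I_k)
B = M[n-k:, n-k:]
M[n-k:, n-k:] = B - (np.abs(np.linalg.eigvalsh(B)).max() + 1.0) * np.eye(k)
alpha, beta = M[:n-k, :n-k], M[:n-k, n-k:]
# Schur complement of the trailing block: alpha + beta (R+dI)^{-1} beta^*
S_delta = alpha - beta @ np.linalg.inv(M[n-k:, n-k:]) @ beta.T
pM, nM = inertia(M)
pS, nS = inertia(S_delta)
print(pM == pS and nM == nS + k)   # nu_-(M) = nu_-(S_delta) + k, as in (6.31)
```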

To prove part (2), we first invert \({\mathfrak {S}}_n\) via factorization (6.26)

$$\begin{aligned} {\mathfrak {S}}_{n}^{-1}=\left[ {\begin{matrix} I &{} {\mathfrak {S}}_{n-k}^{-1}\textbf{T}_{n-k}^{f}T_{n-k,k}^*\\ 0 &{} I \end{matrix}}\right] \left[ {\begin{matrix} {\mathfrak {S}}_{n-k}^{-1} &{} 0 \\ 0 &{} \textbf{S}_k^{-1} \end{matrix}}\right] \left[ {\begin{matrix} I &{} 0 \\ T_{n-k,k}\textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&{} I \end{matrix}}\right] . \end{aligned}$$
(6.32)

Comparing the 22-blocks in (6.27) and (6.32) results in \(-R_k=\textbf{S}_k^{-1}\) which is equivalent to (6.28). We next combine (6.13) and (6.32) to get

$$\begin{aligned} \begin{bmatrix}N_k&M_k\end{bmatrix}&=\begin{bmatrix} 0&I_k\end{bmatrix} \begin{bmatrix}N_n&M_n\end{bmatrix}\\&=\begin{bmatrix} 0&I_k\end{bmatrix}(I-Z_n^*){\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}\begin{bmatrix}\textbf{e}&F_n\end{bmatrix}\\&=(I-Z_k^*)\begin{bmatrix} 0&I_k\end{bmatrix}{\mathfrak {S}}_n^{-1}(I-Z_n)^{-1}\begin{bmatrix}\textbf{e}&F_n\end{bmatrix}\\&=(I-Z_k^*)\begin{bmatrix} 0&I_k\end{bmatrix} \left[ {\begin{matrix} I &{} {\mathfrak {S}}_{n-k}^{-1}\textbf{T}_{n-k}^{f}T_{n-k,k}^*\\ 0 &{} I \end{matrix}}\right] \left[ {\begin{matrix} {\mathfrak {S}}_{n-k}^{-1} &{} 0 \\ 0 &{} \textbf{S}_k^{-1} \end{matrix}}\right] \\&\qquad \times \left[ {\begin{matrix} I &{} 0 \\ T_{n-k,k}\textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&{} I \end{matrix}}\right] (I-Z_n)^{-1}\begin{bmatrix}\textbf{e}&F_n\end{bmatrix}\\&=(I-Z_k^*)\textbf{S}_k^{-1}\begin{bmatrix}T_{n-k,k}\textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I\end{bmatrix} (I-Z_n)^{-1}\begin{bmatrix}\textbf{e}&F_n\end{bmatrix}. \end{aligned}$$

By (6.30)

$$\begin{aligned} \begin{bmatrix}T_{n-k,k}{} \textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I\end{bmatrix} (I-Z_n)^{-1}\begin{bmatrix}{} \textbf{e}&F_n\end{bmatrix}=(I-Z_k)^{-1}\begin{bmatrix}X_k&Y_k\end{bmatrix}. \end{aligned}$$

Combining the two last equalities leads us to (6.29). \(\square \)

Remark 6.11

Since for an excluded parameter \(\varepsilon \) of order k, the first \(n-k\) coefficients of \(f_\varepsilon \) are equal to \(f_0,\ldots ,f_{n-k-1}\), and since \(\nu _-({\mathfrak {S}}_{n-k})=\nu _-({\mathfrak {S}}_{n})-k=\kappa -k\), it follows that \(f_\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}^m\) with \(m\ge \kappa -k\).

Thus, it remains to show that \(m\le \kappa -k\). In the complex setting [5, Theorem 6.1], the latter was shown by combining the winding number argument with the Krein–Langer factorization (6.6) of \(f_\varepsilon \). By the winding number argument (or Rouché's theorem), it follows that for any \(\varepsilon \in {\mathcal {S}}\), the function \(\psi _{11}-\varepsilon \psi _{21}\) has exactly \(\kappa \) zeros inside the unit disk \({\mathbb {D}}\). If \(\varepsilon \) is an excluded parameter of order k, then after canceling \(z^k\) in the numerator and the denominator of \(\textbf{L}_{\Psi }[\varepsilon ]\), we get a fraction \(f_\varepsilon \) whose denominator has \(\kappa -k\) zeros inside \({{\mathbb {D}}}\). Since \(f_\varepsilon \) is a generalized Schur power series, its membership in \({\mathcal {S}}^{\kappa -k}\) follows from the Krein–Langer characterization. To bypass the Krein–Langer formulas in the quaternionic case, as well as a suitable quaternionic version of Rouché's theorem (which we do not have at the moment, except for a trivial planar case), we will follow a different and substantially more computational approach. We first establish explicit formulas for the coefficients in the linear fractional formula (6.20).
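For the complex setting referred to above, the winding number count of zeros inside \({\mathbb {D}}\) can be sketched numerically via the argument principle. The cubic below, with two of its three roots in \({\mathbb {D}}\), is purely illustrative and not taken from the paper.

```python
import numpy as np

def zeros_in_disk(f, m=20001):
    """Number of zeros of an analytic f inside |z|<1 (no zeros on |z|=1),
    computed as the winding number of f along the unit circle."""
    t = np.linspace(0.0, 2.0 * np.pi, m)
    u = np.unwrap(np.angle(f(np.exp(1j * t))))   # continuous argument of f(e^{it})
    return int(round((u[-1] - u[0]) / (2.0 * np.pi)))

# two of the three roots (0.5 and -0.3) lie inside the unit disk
f = lambda z: (z - 0.5) * (z + 0.3) * (z - 2.0)
print(zeros_in_disk(f))  # 2
```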

Lemma 6.12

If \(R_k=-\textbf{S}_k^{-1}\succ 0\), then the set of all \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) subject to condition (6.8) is parametrized by the linear fractional formula

$$\begin{aligned} \varepsilon =({\mathfrak {d}}_{11}-\sigma {\mathfrak {d}}_{21})^{-1}(\sigma {\mathfrak {d}}_{22}-{\mathfrak {d}}_{12}), \quad \sigma \in \mathcal S_{{\mathbb {H}}}, \end{aligned}$$
(6.33)

with the matrix of coefficients \({\mathfrak {D}}=\left[ {\begin{matrix} {\mathfrak {d}}_{11} &{} {\mathfrak {d}}_{12}\\ {\mathfrak {d}}_{21}&{} {\mathfrak {d}}_{22} \end{matrix}}\right] \) given by

$$\begin{aligned} {\mathfrak {D}}(z)=I_2+(z-1)\begin{bmatrix}X_k^* \\ Y_k^*\end{bmatrix}(I-zZ_k^{*})^{-1}\textbf{S}_k^{-1}(I-Z_k)^{-1}\begin{bmatrix}X_k&\quad -Y_k\end{bmatrix} \end{aligned}$$
(6.34)

with the columns \(X_k\) and \(Y_k\) defined in (6.30).

Proof

Let \(\varepsilon _0,\ldots ,\varepsilon _{k-1}\) denote the coefficients of \(\varepsilon \) prescribed by condition (6.8). Comparing the bottom rows in the matrix equality \(\textbf{T}_k^{\varepsilon }=\textbf{T}_k^{\psi _{11}}(\textbf{T}_k^{\psi _{21}})^{-1}\) (which is equivalent to (6.8)) gives

$$\begin{aligned} \begin{bmatrix}\varepsilon _{k-1}&\ldots&\varepsilon _{1}&\varepsilon _{0}\end{bmatrix}= \begin{bmatrix}\psi _{11,k-1}&\ldots&\psi _{11,1}&\psi _{11,0}\end{bmatrix}(\textbf{T}_k^{\psi _{21}})^{-1}. \end{aligned}$$

Multiplying both sides by the unit anti-diagonal matrix \(V_k\) and making use of notation (6.12), we get

$$\begin{aligned} E_k^*:=\begin{bmatrix}\varepsilon _{0}&\varepsilon _{1}&\ldots&\varepsilon _{k-1}\end{bmatrix}=N_k^*(\textbf{T}_k^{\psi _{21}})^{-1}V_k. \end{aligned}$$
(6.35)

Observe also from (6.12) that

$$\begin{aligned} \textbf{e}_k^*=M_k^*(\textbf{T}_k^{\psi _{21}})^{-1}V_k. \end{aligned}$$
(6.36)

It follows from (6.11) that:

$$\begin{aligned} \textbf{e}_k\textbf{e}_k^*-E_kE_k^*&=V_k(\textbf{T}_k^{\psi _{21}*})^{-1}\big (M_kM_k^*-N_kN_k^*\big )(\textbf{T}_k^{\psi _{21}})^{-1}V_k\\&=V_k(\textbf{T}_k^{\psi _{21}*})^{-1}\big (R_k-Z_k^*R_kZ_k\big )(\textbf{T}_k^{\psi _{21}})^{-1}V_k\\&={\mathfrak {R}}-Z_k{\mathfrak {R}} Z_k^* \end{aligned}$$

where we have set

$$\begin{aligned} {\mathfrak {R}}=V_k(\textbf{T}_k^{\psi _{21}*})^{-1} R_k(\textbf{T}_k^{\psi _{21}})^{-1}V_k \end{aligned}$$
(6.37)

and used equalities

$$\begin{aligned} Z_k(\textbf{T}_k^{\psi _{21}})^{-1}=(\textbf{T}_k^{\psi _{21}})^{-1}Z_k\quad \text{ and }\quad Z_kV_k=V _kZ_k^* \end{aligned}$$
(6.38)

for the last step. Note that the matrix \({\mathfrak {R}}\) is positive definite. Following the formula (3.17), we introduce the matrix polynomial:

$$\begin{aligned} {{\widetilde{\Theta }}}(z)=I_2+(z-1)\begin{bmatrix}\textbf{e}_k^* \\ E_k^*\end{bmatrix}(I-zZ_k^{*})^{-1}{\mathfrak {R}}^{-1}(I-Z_k)^{-1}\begin{bmatrix}\textbf{e}_k&\quad -E_k\end{bmatrix} \end{aligned}$$
(6.39)

and conclude by virtue of Theorem 4.7 that the formula

$$\begin{aligned} \sigma \mapsto ({{\widetilde{\theta }}}_{11}\sigma ^\sharp +{{\widetilde{\theta }}}_{12})({{\widetilde{\theta }}}_{21}\sigma ^\sharp +{{\widetilde{\theta }}}_{22})^{-1}, \quad \sigma \in {\mathcal {S}}_{{\mathbb {H}}}, \end{aligned}$$

parametrizes all Schur-class power series with first k coefficients equal to \({\overline{\varepsilon }}_0,\ldots , {\overline{\varepsilon }}_{k-1}\). Taking power-series conjugates (see (5.3)) and still writing \(\sigma \) instead of \(\sigma ^\sharp \), we then get all \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) subject to condition (6.8):

$$\begin{aligned} \varepsilon =(\sigma {{\widetilde{\theta }}}_{21}^\sharp +{{\widetilde{\theta }}}_{22}^\sharp )^{-1}(\sigma {{\widetilde{\theta }}}_{11}^\sharp +{{\widetilde{\theta }}}_{12}^\sharp ). \end{aligned}$$
(6.40)

Comparing the latter formula with (6.33), we see that it remains to verify polynomial equalities

$$\begin{aligned} {\mathfrak {d}}_{11}={{\widetilde{\theta }}}_{22}^\sharp , \; \; \mathfrak {d}_{12}=-{{\widetilde{\theta }}}_{12}^\sharp , \; \; \mathfrak {d}_{21}=-{{\widetilde{\theta }}}_{21}^\sharp , \; \; \mathfrak {d}_{22}={{\widetilde{\theta }}}_{11}^\sharp . \end{aligned}$$
(6.41)

Toward this end, we substitute (6.35), (6.36), (6.37) into (6.39) and make use of equalities (6.38) to write \({\widetilde{\Theta }}\) in terms of \(M_k\) and \(N_k\)

$$\begin{aligned} {{\widetilde{\Theta }}}(z)=I_2+(z-1)\begin{bmatrix}M_k^* \\ N_k^*\end{bmatrix}(I-zZ_k^{*})^{-1}R_k^{-1}(I-Z_k)^{-1}\begin{bmatrix}M_k&\quad -N_k\end{bmatrix}. \end{aligned}$$

We next use equalities (6.28) and (6.29) to write \({\widetilde{\Theta }}\) in terms of \(X_k\) and \(Y_k\)

$$\begin{aligned} {{\widetilde{\Theta }}}(z)&=I_2-(z-1)\begin{bmatrix}M_k^* \\ N_k^*\end{bmatrix}(I-zZ_k)^{-1}(I-Z_k)^{-1}\begin{bmatrix}Y_k&\quad -X_k\end{bmatrix}\\&=I_2-(z-1)\begin{bmatrix}Y_k^* \\ X_k^*\end{bmatrix}(I-Z_k^*)^{-1}\textbf{S}_k^{-1}(I-zZ_k)^{-1}\begin{bmatrix}Y_k&\quad -X_k\end{bmatrix}. \end{aligned}$$

Therefore

$$\begin{aligned} \begin{bmatrix}{{\widetilde{\theta }}}_{22}^\sharp &{} -{{\widetilde{\theta }}}_{12}^\sharp \\ -{{\widetilde{\theta }}}_{21}^\sharp &{} {{\widetilde{\theta }}}_{11}^\sharp \end{bmatrix}=I_2+(z-1)\begin{bmatrix}X_k^* \\ Y_k^*\end{bmatrix}(I-zZ_k^{*})^{-1}\textbf{S}_k^{-1}(I-Z_k)^{-1}\begin{bmatrix}X_k&\quad -Y_k\end{bmatrix}, \end{aligned}$$

which is the same polynomial as that in (6.34). Thus, equalities (6.41) hold, which completes the proof of the lemma. \(\square \)
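The commutation relations (6.38) used in the proof above are elementary properties of the lower shift \(Z_k\), the anti-diagonal matrix \(V_k\), and lower-triangular Toeplitz matrices. A quick numerical check (complex stand-in for \(\textbf{T}_k^{\psi _{21}}\), illustrative coefficients):

```python
import numpy as np

k = 5
Z = np.diag(np.ones(k - 1), -1)                  # lower shift Z_k (nilpotent)
V = np.fliplr(np.eye(k))                         # unit anti-diagonal V_k
c = np.array([1.0 + 0.5j, -0.3, 0.2, 0.1, -0.05])  # c[0] != 0, illustrative
# lower-triangular Toeplitz matrix T = sum_j c_j Z^j (invertible since c[0] != 0)
T = sum(c[j] * np.linalg.matrix_power(Z, j) for j in range(k))
Tinv = np.linalg.inv(T)
assert np.allclose(Z @ Tinv, Tinv @ Z)           # Z_k T^{-1} = T^{-1} Z_k
assert np.allclose(Z @ V, V @ Z.conj().T)        # Z_k V_k = V_k Z_k^*
print("ok")
```

Both identities hold because \(T^{-1}\) is again a polynomial in \(Z_k\), while conjugation by \(V_k\) swaps \(Z_k\) and \(Z_k^*\).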

Remark 6.13

Note that the polynomial (6.34) has the same structure as \(\Theta \) in (3.17). If we consider the polynomial

$$\begin{aligned} {\mathfrak {K}}(z)=z^{k}I_2-(z-1)\begin{bmatrix}X_k^* \\ Y_k^*\end{bmatrix}(I-Z_k^*)^{-1}{} \textbf{S}_k^{-1}\textbf{Z}_k(z)\begin{bmatrix}X_k&\quad -Y_k\end{bmatrix} \end{aligned}$$
(6.42)

modeled from (3.16), then we have the identity

$$\begin{aligned} {\mathfrak {K}}(z){\mathfrak {D}}(z)=z^kI_2={\mathfrak {D}}(z){\mathfrak {K}}(z) \end{aligned}$$
(6.43)

which is verified by mimicking the proof of Lemma 3.10.

Lemma 6.14

Let \(\Psi :=\Psi _n\), \({\mathfrak {D}}\) and \({\mathfrak {K}}\) be the matrix polynomials defined in (3.16), (6.34), and (6.42), respectively. Then

$$\begin{aligned} {\mathfrak {D}}(z)\Psi (z)=z^{k}\Psi _{n-k}(z)\quad \text{ and }\quad \Psi (z)={\mathfrak {K}}(z)\Psi _{n-k}(z), \end{aligned}$$
(6.44)

where according to (3.16)

$$\begin{aligned} \Psi _{n-k}(z)=z^{n-k}I_2-(z-1)\begin{bmatrix}{} \textbf{e}^* \\ F_{n-k}^*\end{bmatrix}(I-Z_{n-k}^*)^{-1}{\mathfrak {S}}_{n-k}^{-1}\textbf{Z}_{n-k}(z)\begin{bmatrix}{} \textbf{e}&-F_{n-k}\end{bmatrix}.\nonumber \\ \end{aligned}$$
(6.45)

Note that \(\Psi _{n-k}\) is defined as in (3.16) but with n replaced by \(n-k\).

Proof

Upon replacing \(\left[ {\begin{matrix} X_k&-Y_k \end{matrix}}\right] \) in (6.34) by its expression in (6.30), we get

$$\begin{aligned} {\mathfrak {D}}(z)&=I_2+(z-1)\begin{bmatrix}X_k^* \\ Y_k^*\end{bmatrix}(I-zZ_k^{*})^{-1}{} \textbf{S}_k^{-1} \begin{bmatrix}T_{n-k,k}{} \textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k\end{bmatrix}\nonumber \\&\qquad \qquad \times (I-Z_{n})^{-1}\begin{bmatrix}{} \textbf{e}_{n}&\quad -F_{n}\end{bmatrix}. \end{aligned}$$
(6.46)

On the other hand, substituting the decomposition (6.32) for \({\mathfrak {S}}_n^{-1}\) into (3.16), making use of (6.30) and taking into account (6.45) gives

$$\begin{aligned} \Psi (z)=&z^nI_2-(z-1)\begin{bmatrix}{} \textbf{e}^* \\ F_n^*\end{bmatrix}(I-Z_n^*)^{-1}\begin{bmatrix}{\mathfrak {S}}_{n-k}^{-1} &{}0 \\ 0 &{} 0\end{bmatrix} \textbf{Z}_n(z)\begin{bmatrix}{} \textbf{e}&\quad -F_n\end{bmatrix}\nonumber \\&-(z-1)\begin{bmatrix}{} \textbf{e}^* \\ F_n^*\end{bmatrix}(I-Z_n^*)^{-1}\left[ {\begin{matrix} {\mathfrak {S}}_{n-k}^{-1}\textbf{T}_{n-k}^{f}T_{n-k,k}^* \\ I_k \end{matrix}}\right] {} \textbf{S}_k^{-1}\nonumber \\&\quad \times \left[ {\begin{matrix} T_{n-k,k}{} \textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k \end{matrix}}\right] {} \textbf{Z}_n(z)\begin{bmatrix}{} \textbf{e}&\quad -F_n\end{bmatrix}\nonumber \\ =&z^{k}\Psi _{n-k}(z)-(z-1)\begin{bmatrix}X_k^* \\ Y_k^*\end{bmatrix}(I-Z_k^*)^{-1}{} \textbf{S}_k^{-1}\nonumber \\&\quad \times \left[ {\begin{matrix} T_{n-k,k}{} \textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k \end{matrix}}\right] {} \textbf{Z}_n(z)\begin{bmatrix}{} \textbf{e}&\quad -F_n\end{bmatrix}. \end{aligned}$$
(6.47)

Multiplying the right-side expressions in (3.16), (6.46) and taking into account (6.47), we get

$$\begin{aligned} {\mathfrak {D}}(z)\Psi (z)=z^{k}\Psi _{n-k}(z)-(z-1)\begin{bmatrix}X_k^* \\ Y_k^*\end{bmatrix}W(z) \begin{bmatrix}{} \textbf{e}&\quad -F_n\end{bmatrix}, \end{aligned}$$
(6.48)

where

$$\begin{aligned} W(z)=&(I-Z_k^*)^{-1}{} \textbf{S}_k^{-1}\begin{bmatrix}T_{n-k,k}{} \textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k\end{bmatrix}{} \textbf{Z}_n(z)\nonumber \\&-z^n(I-zZ_k^{*})^{-1}{} \textbf{S}_k^{-1}\begin{bmatrix}T_{n-k,k}\textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k\end{bmatrix} (I-Z_{n})^{-1}\nonumber \\&+(z-1)(I-zZ_k^{*})^{-1}{} \textbf{S}_k^{-1}\begin{bmatrix}T_{n-k,k}{} \textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k\end{bmatrix}\nonumber \\&\quad \times (I-Z_n)^{-1}\big (\textbf{e}\textbf{e}^*-F_nF_n^*\big )(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}{} \textbf{Z}_n(z). \end{aligned}$$
(6.49)

By (3.23) and due to decompositions (6.26) and (6.32)

$$\begin{aligned}&\begin{bmatrix}T_{n-k,k}{} \textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k\end{bmatrix}(I-Z_{n})^{-1} \big (\textbf{e}{} \textbf{e}^*-F_nF_n^*\big )(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}\\&=\begin{bmatrix}T_{n-k,k}{} \textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k\end{bmatrix} \big ((I-Z_n)^{-1}+{\mathfrak {S}}_nZ_n^*(I-Z_n^*)^{-1}{\mathfrak {S}}_n^{-1}\big )\\&=\begin{bmatrix}T_{n-k,k}{} \textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k\end{bmatrix}(I-Z_n)^{-1}\\&\quad +\textbf{S}_k Z_k^*(I-Z_k^*)^{-1}\begin{bmatrix}T_{n-k,k}\textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k\end{bmatrix}. \end{aligned}$$

Using the latter equality along with (3.26) and

$$\begin{aligned} (z-1)(I-zZ_k^{*})^{-1}Z_k^*(I-Z_k^*)^{-1}=(I-zZ_k^{*})^{-1}-(I-Z_k^*)^{-1}, \end{aligned}$$

we simplify the last term on the right side of (6.49) to

$$\begin{aligned}&(z-1)(I-zZ_k^{*})^{-1}{} \textbf{S}_k^{-1} \begin{bmatrix}T_{n-k,k}{} \textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k\end{bmatrix}(I-Z_n)^{-1}{} \textbf{Z}_n(z)\\&\qquad +(z-1)(I-zZ_k^{*})^{-1}Z_k^*(I-Z_k^*)^{-1} \begin{bmatrix}T_{n-k,k}{} \textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k\end{bmatrix}{} \textbf{Z}_n(z) \\&\quad =z^n(I-zZ_k^{*})^{-1}{} \textbf{S}_k^{-1} \begin{bmatrix}T_{n-k,k}{} \textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k\end{bmatrix}(I-Z_n)^{-1}\\&\qquad -(I-Z_k^{*})^{-1}\begin{bmatrix}T_{n-k,k}\textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k\end{bmatrix}{} \textbf{Z}_n(z). \end{aligned}$$

Substituting the latter equality into (6.49), we see that \(W(z)=0\), and, hence, the first equality in (6.44) follows from (6.48). The second equality follows from the first by multiplying the latter by \({\mathfrak {K}}(z)\) on the left and making use of (6.43). \(\square \)
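The resolvent-type identity displayed in the proof holds for any square matrix \(A\) in place of \(Z_k^*\) (whenever both inverses exist), since \((I-zA)^{-1}-(I-A)^{-1}=(I-zA)^{-1}\big ((I-A)-(I-zA)\big )(I-A)^{-1}=(z-1)(I-zA)^{-1}A(I-A)^{-1}\). A numerical spot check with the nilpotent upper shift (illustrative sizes and \(z\)):

```python
import numpy as np

k, z = 4, 0.3 + 0.7j
A = np.diag(np.ones(k - 1), 1).astype(complex)   # nilpotent upper shift, like Z_k^*
I = np.eye(k, dtype=complex)
lhs = (z - 1) * np.linalg.inv(I - z * A) @ A @ np.linalg.inv(I - A)
rhs = np.linalg.inv(I - z * A) - np.linalg.inv(I - A)
print(np.allclose(lhs, rhs))  # True
```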

6.5 Completion of the proof of Theorem 6.9

In the previous section, we showed that if \(R_k\succeq 0\) and \(\varepsilon \) is an excluded parameter of (6.7) of order k, then the power series \(f_\varepsilon =\textbf{L}_\Psi [\varepsilon ]\) is of the form (6.23). However, the rightmost inequality in (6.23) has not been justified yet.

Taking for granted that \(f_\varepsilon \) belongs to \(\mathcal S_{{\mathbb {H}}}^{\kappa -k}\) (i.e., that \(\nu _-({\mathfrak {S}}_{m}^{f_\varepsilon })\le \kappa -k\) for all \(m\ge 0\)), let us assume that the first \(n-k+1\) coefficients of \(f_\varepsilon \) are equal to \(f_0,\ldots ,f_{n-k}\). Then, \({\mathfrak {S}}_{n-k+1}^{f_\varepsilon }={\mathfrak {S}}_{n-k+1}\) which cannot be the case, since \(\nu _-({\mathfrak {S}}_{n-k+1})=\kappa -k+1\) by part (1) in Lemma 6.10. It remains to show that \(f_\varepsilon \in \mathcal S_{{\mathbb {H}}}^{\kappa -k}\). We first consider the definite case.

Case 1: Let us assume that \(R_k\succ 0\). By Lemma 6.12, the excluded parameter \(\varepsilon \) is of the form (6.33), that is

$$\begin{aligned} \varepsilon =\textbf{L}_{{\mathfrak {D}}}[\sigma ]\quad \text{ for } \text{ some }\quad \sigma \in {\mathcal {S}}_{{\mathbb {H}}}, \end{aligned}$$
(6.50)

where \({\mathfrak {D}}\) is the matrix polynomial given by (6.34). Taking the superposition of left linear fractional transformations (6.50) and (6.7) and taking into account the identity (6.44), we get

$$\begin{aligned} f_\varepsilon =\textbf{L}_{\Psi }[\textbf{L}_{{\mathfrak {D}}}[\sigma ]]=\textbf{L}_{{\mathfrak {D}}\Psi }[\sigma ]= \textbf{L}_{z^k\Psi _{n-k}}[\sigma ]=\textbf{L}_{\Psi _{n-k}}[\sigma ]. \end{aligned}$$
(6.51)

If \(\sigma \in {\mathcal {S}}_{{\mathbb {H}}}\) were an excluded parameter of the transformation \(\textbf{L}_{\Psi _{n-k}}\), then \(\varepsilon \) would be an excluded parameter of the transformation (6.7) of order greater than k, which is not the case. Hence, \(\sigma \) is not excluded and, therefore, \(f_\varepsilon =\textbf{L}_{\Psi _{n-k}}[\sigma ]\) belongs to \({\mathcal {S}}_{{\mathbb {H}}}^{\kappa -k}\) by virtue of Theorem 6.1.

Remark 6.15

The linear fractional transformation \(\textbf{L}_{\Psi _{n-k}}\) in (6.51) is associated with the truncated \(\textbf{CSP}\) with the prescribed coefficients \(f_0,\ldots ,f_{n-k-1}\) only. In the context of the original \(\textbf{CSP}\) (with n prescribed coefficients), it parametrizes all quasi-solutions f arising from excluded parameters \(\varepsilon \) of the transformation \(\textbf{L}_{\Psi }\) of order at least k. Furthermore, if \(\sigma \) is still an excluded parameter of \(\textbf{L}_{\Psi _{n-k}}\) of order r, then \(\varepsilon \) defined as in (6.50) is an excluded parameter of \(\textbf{L}_{\Psi }\) of order \(k+r\).

Case 2: Let us assume that \(-R_k\) is the maximal negative semidefinite (singular) bottom principal submatrix of \({\mathfrak {S}}_n^{-1}\). Let \(\textrm{rank}(R_k)=d\) (\(0\le d<k\)). Then, \(R_d\succ 0\). Since \(R_{k+1}\not \succeq 0\), it follows by virtue of Theorem 3.5 that the matrix \(R_{2k-d}\) is invertible and has k positive and \(k-d\) negative eigenvalues. Since \(\textbf{S}_{2k-d}=-R^{-1}_{2k-d}\) is the Schur complement of the leading principal submatrix \({\mathfrak {S}}_{n-2k+d}\) of \({\mathfrak {S}}_n\), it follows that:

$$\begin{aligned} \nu _-({\mathfrak {S}}_{n-2k+d})=\nu _-({\mathfrak {S}}_{n})-\nu _+(R_{2k-d}^{-1})=\kappa -k, \end{aligned}$$

which together with Lemma 6.10 (part (1)) implies

$$\begin{aligned} \nu _-({\mathfrak {S}}_{n-2k+d})=\nu _-({\mathfrak {S}}_{n-k})=\kappa -k. \end{aligned}$$
(6.52)

Upon invoking Case 1, we may assume without loss of generality that \(d=0\) and \(n-2k+d=0\). Indeed, since \(\varepsilon \) is an excluded parameter of order at least d, it is of the form (6.50), where \({\mathfrak {D}}\) is defined in (6.34) (with d instead of k). Then, \(f_\varepsilon \) is of the form \(f_\varepsilon =\textbf{L}_{\Psi _{n-d}}[\sigma ]\) where \(\sigma \) is an excluded parameter of \(\textbf{L}_{\Psi _{n-d}}\) of order \(k-d\) and the recalculated \(R_{k-d}\) is the zero matrix. On the other hand, keeping in mind the second equality in (6.44) and representing \(f_\varepsilon \) as

$$\begin{aligned} f_\varepsilon =\textbf{L}_{\Psi _{n-2k+d}}[{\widetilde{f}}_\varepsilon ],\quad \text{ where }\quad {\widetilde{f}}_{\varepsilon }=\textbf{L}_{{\mathfrak {K}}}[\varepsilon ], \end{aligned}$$

we see from (6.52) that, to justify the membership of \(f_\varepsilon \) in \({\mathcal {S}}_{{\mathbb {H}}}^{\kappa -k}\), it suffices to show that \({\widetilde{f}}_\varepsilon \) belongs to \({\mathcal {S}}_{{\mathbb {H}}}\). Thus, Case 2 will follow from the very particular Case 3 below.

Case 3: Let us assume that \(n=2k\) and \(R_k=0\). We will show that in this case, the only excluded parameter of \(\textbf{L}_{\Psi }\) is the unimodular constant \(\varepsilon \equiv f_0\) and the corresponding \(f_\varepsilon \) is equal to the unimodular constant \(-f_0\) and, hence, belongs to \({\mathcal {S}}_{{\mathbb {H}}}\).

Since \(R_k=0\), inverting \({\mathfrak {S}}_n^{-1}\) shows that \({\mathfrak {S}}_{n-k}=0\). Therefore

$$\begin{aligned} |f_0|=1\quad \text{ and }\quad f_j=0 \quad \text{ for }\quad j=1,\ldots , k-1. \end{aligned}$$
(6.53)

In this case, decompositions (3.1), (6.26), and (6.32) take the form

$$\begin{aligned}{} & {} \textbf{T}_{2k}^f=\begin{bmatrix} f_0I_k &{} 0 \\ T_{k,k} &{} f_0I_k\end{bmatrix},\quad \text{ where }\quad T_{k,k}=\left[ {\begin{matrix} f_k &{}0 &{} \ldots &{}0\\ f_{k+1} &{} f_{k} &{}\ldots &{}0\\ \vdots &{}\vdots &{}&{} \vdots \\ f_{2k-1} &{} f_{2k-2} &{} \ldots &{}f_k \end{matrix}}\right] ,\nonumber \\{} & {} {\mathfrak {S}}_{2k}=\begin{bmatrix} 0 &{} -f_0T^*_{k,k}\\ -T_{k,k}{\overline{f}}_0 &{} -T_{k,k}T_{k,k}^*\end{bmatrix}, \quad {\mathfrak {S}}_{2k}^{-1}=\begin{bmatrix} I_k &{} -f_0T^{-1}_{k,k}\\ -T^{-*}_{k,k}{\overline{f}}_0 &{} 0\end{bmatrix}. \end{aligned}$$
(6.54)
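The block formulas (6.54) can be spot-checked numerically in the complex specialization: with \(|f_0|=1\), \(f_1=\cdots =f_{k-1}=0\) and \(f_k\ne 0\) (illustrative values below, not taken from the text), the matrix \(I-\textbf{T}_{2k}^f\textbf{T}_{2k}^{f*}\) has exactly the displayed block structure and inverse.

```python
import numpy as np

k = 3
f0 = np.exp(1j * 0.7)                       # |f0| = 1
tail = np.array([0.8, -0.4 + 0.2j, 0.3])    # f_k, ..., f_{2k-1}, with f_k != 0
f = np.concatenate(([f0], np.zeros(k - 1), tail))
n = 2 * k
# lower-triangular Toeplitz T_{2k}^f built from the coefficients f_0, ..., f_{2k-1}
T = sum(f[j] * np.diag(np.ones(n - j), -j) for j in range(n))
S = np.eye(n) - T @ T.conj().T
Tkk = T[k:, :k]                             # the block T_{k,k} of (6.54)
# block structure of S_{2k} as in (6.54)
assert np.allclose(S[:k, :k], 0)
assert np.allclose(S[:k, k:], -f0 * Tkk.conj().T)
assert np.allclose(S[k:, k:], -Tkk @ Tkk.conj().T)
# and the stated inverse
Sinv = np.block([[np.eye(k), -f0 * np.linalg.inv(Tkk)],
                 [-np.linalg.inv(Tkk).conj().T * np.conj(f0), np.zeros((k, k))]])
assert np.allclose(S @ Sinv, np.eye(n))
print("ok")
```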

Making use of formulas (6.12) (for \(k=n\)) and (6.13), we next compute

$$\begin{aligned} \begin{bmatrix}1&-f_0\end{bmatrix}&\begin{bmatrix}\psi _{11,2k-1}&{} \ldots &{}\psi _{11,1} &{} \psi _{11,0}\\ \psi _{21,2k-1}&{} \ldots &{}\psi _{21,1} &{} \psi _{21,0}\end{bmatrix}= \begin{bmatrix}1&-f_0\end{bmatrix}\begin{bmatrix}N^*_{2k} \\ M^*_{2k}\end{bmatrix}\nonumber \\&\qquad =\begin{bmatrix}1&-f_0\end{bmatrix}\begin{bmatrix}\textbf{e}^*\\ F^*_{2k}\end{bmatrix} (I-Z^*_{2k})^{-1}{\mathfrak {S}}_{2k}^{-1}(I-Z_{2k})\nonumber \\&\qquad =\begin{bmatrix}b_{2k-1}&\ldots&b_k&0&\ldots&0\end{bmatrix}, \end{aligned}$$
(6.55)

where

$$\begin{aligned} \begin{bmatrix}b_{2k-1}&\ldots&b_k\end{bmatrix}= f_0\begin{bmatrix}{\overline{f}}_{k}&\ldots&{\overline{f}}_{2k-1}\end{bmatrix}(I-Z_k^*)^{-1} T_{k,k}^{-*}(I-Z_k){\overline{f}}_0. \end{aligned}$$
(6.56)

The last equality in (6.55) follows from the block-decomposition (6.54) for \({\mathfrak {S}}_n^{-1}\) and the equality:

$$\begin{aligned} \textbf{e}_{2k}^*-f_0F_{2k}^*=-\begin{bmatrix}0&\ldots&0&f_0{\overline{f}}_k&\ldots&f_0{\overline{f}}_{2k-1}\end{bmatrix}, \end{aligned}$$

which, in turn, holds true due to (6.53). The explicit formula (6.56) plays no role in the sequel and is given for the sake of completeness only. Comparing the k rightmost entries in (6.55) gives

$$\begin{aligned} \psi _{11,j}-f_0\psi _{21,j}=0\quad \text{ for }\quad j=0,\ldots ,k-1, \end{aligned}$$
(6.57)

which means that \(\varepsilon \equiv f_0\) is the unique excluded parameter of \(\textbf{L}_{\Psi }\).

We next verify equalities

$$\begin{aligned} \psi _{12,j}-f_0\psi _{22,j}=(\psi _{11,j}-f_0\psi _{21,j})f_0\quad \text{ for }\quad j=k,\ldots ,2k. \end{aligned}$$
(6.58)

To this end, we combine (6.55) with (6.19) (with \(n=2k\) instead of k) and the block decomposition (6.54) for \(\textbf{T}^f_{2k}\) to get

$$\begin{aligned}&\begin{bmatrix}1&-f_0\end{bmatrix} \begin{bmatrix}\psi _{12,2k-1}&{} \ldots &{}\psi _{12,1} &{} \psi _{12,0}\\ \psi _{22,2k-1}&{} \ldots &{}\psi _{22,1} &{} \psi _{22,0}\end{bmatrix}\\&=\begin{bmatrix}1&-f_0\end{bmatrix} \begin{bmatrix}\psi _{11,2k-1}&{} \ldots &{}\psi _{11,1} &{} \psi _{11,0}\\ \psi _{21,2k-1}&{} \ldots &{}\psi _{21,1} &{} \psi _{21,0}\end{bmatrix}{} \textbf{T}^f_{2k}\\&=\begin{bmatrix}b_{2k-1}&\ldots&b_k&0&\ldots&0\end{bmatrix}\begin{bmatrix} f_0I_k &{} 0 \\ T_{k,k} &{} f_0I_k\end{bmatrix}\\&=\begin{bmatrix}b_{2k-1}f_0&\ldots&b_k f_0&0&\ldots&0\end{bmatrix}. \end{aligned}$$

Comparing the k leftmost entries with those in (6.55), we get equalities (6.58) for \(j=k,\ldots ,2k-1\). To verify (6.58) for \(j=2k\), we use the explicit formulas (6.16) for \(\psi _{11,2k}\), \(\psi _{21,2k}\) (recall that \(n=2k\)) derived from (3.16) and the formulas

$$\begin{aligned} \psi _{12,2k}=-\textbf{e}^*(I-Z_{2k}^*)^{-1}{\mathfrak {S}}_{2k}^{-1}F_{2k}, \quad \psi _{22,2k}=1+F_{2k}^*(I-Z_{2k}^*)^{-1}{\mathfrak {S}}_{2k}^{-1}F_{2k} \end{aligned}$$

also derived from (3.16), to compute

$$\begin{aligned}&(\psi _{11,2k}-f_0\psi _{21,2k})f_0-(\psi _{12,2k}-f_0\psi _{22,2k})\nonumber \\&=f_0-\textbf{e}^*(I-Z_{2k}^*)^{-1}{\mathfrak {S}}_{2k}^{-1}{} \textbf{e}f_0+ f_0F_{2k}^*(I-Z_{2k}^*)^{-1}{\mathfrak {S}}_{2k}^{-1}{} \textbf{e}f_0\nonumber \\&\quad +\textbf{e}^*(I-Z_{2k}^*)^{-1}{\mathfrak {S}}_{2k}^{-1}F_{2k}-f_0-f_0F_{2k}^*(I-Z_{2k}^*)^{-1}{\mathfrak {S}}_{2k}^{-1}F_{2k}\nonumber \\&=(\textbf{e}^*-f_0F^*_{2k})(I-Z_{2k}^*)^{-1}{\mathfrak {S}}_{2k}^{-1}(F_{2k}-\textbf{e}f_0). \end{aligned}$$
(6.59)

Due to (6.53), the k leftmost entries in \(\textbf{e}^*-f_0F^*_{2k}\) and the k top entries in \(F_{2k}-\textbf{e}f_0\) are zeros. Since the \(k\times k\) bottom principal submatrix of \((I-Z_{2k}^*)^{-1}{\mathfrak {S}}_{2k}^{-1}\) is the zero matrix, the expression on the right side of (6.59) equals zero, thus justifying the equality (6.58) for \(j=2k\).

Due to (6.57) and (6.58), we have the polynomial identity

$$\begin{aligned} \psi _{12}(z)-f_0\psi _{22}(z)=(\psi _{11}(z)-f_0\psi _{21}(z))f_0 \end{aligned}$$

which implies that for \(\varepsilon \equiv f_0\),

$$\begin{aligned} \textbf{L}_{\Psi }[\varepsilon ]=(\psi _{11}-f_0\psi _{21})^{-1}(f_0\psi _{22}-\psi _{12})\equiv -f_0, \end{aligned}$$

thus completing Case 3 and, hence, the proof of Theorem 6.9.