Abstract
We consider the quaternion version of the Toeplitz matrix extension problem with prescribed number of negative eigenvalues. The positive semidefinite case is closely related to the Carathéodory–Schur interpolation problem (CSP) in the Schur class \(\mathcal S_{{\mathbb {H}}}\) and the Carathéodory class \({\mathcal {C}}_{{\mathbb {H}}}\) of slice-regular functions on the unit quaternionic ball \({\mathbb {B}}\) that are, respectively, bounded by one in modulus and of positive real part in \({\mathbb {B}}\). Explicit linear fractional parametrization formulas with free Schur-class parameter for the solution set of the CSP (in the indeterminate case) are given. The Carathéodory–Fejér extremal problem and the Carathéodory theorem on uniform approximation of a Schur-class function by quaternion finite Blaschke products are also derived. The indefinite version of the Toeplitz extension problem is applied to solve the CSP in the quaternion generalized Schur class. The linear fractional parametrization of the solution set for the indefinite indeterminate problem still exists, but some parameters should be excluded. These excluded parameters and the corresponding “quasi-solutions” are classified and discussed in detail.
1 Introduction
Let \({\mathcal {S}}\) denote the Schur class of complex-valued functions analytic on the open unit disk \({{\mathbb {D}}}\subset {\mathbb {C}}\) and mapping \({{\mathbb {D}}}\) into its closure, i.e., the closed unit ball of the Hardy space \(H^\infty ({{\mathbb {D}}})\). The classical Carathéodory–Schur problem [11, 21, 22] consists of finding a Schur-class function f with prescribed first n Taylor coefficients \(f_0,\ldots ,f_{n-1}\) at the origin. The answer is given in terms of the matrices
constructed from the given coefficients. Namely, the problem has a solution if and only if \({\mathfrak {S}}_n^f\) is positive semidefinite (i.e., the matrix \(\textbf{T}_n^f\) is contractive). Moreover, if \({\mathfrak {S}}_n^f\) is positive definite, the problem has infinitely many solutions that can be parametrized by a linear fractional formula. If \({\mathfrak {S}}_n^f\) is singular, then the problem has a unique solution which necessarily is a Blaschke product of degree \(\deg f=\textrm{rank} \, {\mathfrak {S}}_n^f\). As a consequence of this fact, it was shown in [11] (with further elaboration in [23]) that the set of all functions analytic on \({\mathbb {D}}\) and with fixed first n Taylor coefficients contains a unique element (a scalar multiple of a finite Blaschke product) with minimally possible \(H^\infty \)-norm. Another consequence is the Carathéodory approximation theorem asserting that any Schur-class function can be uniformly approximated on any compact subset of \({{\mathbb {D}}}\) by finite Blaschke products. Yet another consequence is the identification of Schur-class functions as power series f with Toeplitz matrices \(\textbf{T}_n^f\) being contractive for all \(n\ge 1\) and, more specifically, of Blaschke products of degree k as power series for which the matrix \({\mathfrak {S}}_n^f\) is positive semidefinite and has rank equal to \(\min (n,k)\) for all \(n\ge 1\).
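In the complex specialization (\({\mathbb {C}}\subset {\mathbb {H}}\)), the solvability criterion \({\mathfrak {S}}_n^f=I-\textbf{T}_n^f\textbf{T}_n^{f*}\succeq 0\) can be checked numerically. The sketch below is ours (the helper names `toeplitz_lower` and `csp_solvable` are not from the paper) and assumes the lower-triangular Toeplitz form of \(\textbf{T}_n^f\) recalled in Sect. 2.3:

```python
import numpy as np

def toeplitz_lower(coeffs):
    """Lower-triangular Toeplitz matrix T_n^f built from f_0, ..., f_{n-1}."""
    n = len(coeffs)
    return np.array([[coeffs[i - j] if i >= j else 0 for j in range(n)]
                     for i in range(n)], dtype=complex)

def csp_solvable(coeffs, tol=1e-12):
    """Solvability test for the CSP: S_n^f = I - T T* must be PSD."""
    T = toeplitz_lower(coeffs)
    S = np.eye(len(coeffs)) - T @ T.conj().T
    return bool(np.min(np.linalg.eigvalsh(S)) >= -tol)

print(csp_solvable([0, 1]))   # True: f(z) = z is itself a Schur function
print(csp_solvable([2, 0]))   # False: |f_0| > 1 rules out any Schur extension
```

The boundary case `[0, 1]` makes \({\mathfrak {S}}_2^f\) singular, which matches the statement above that the solution is then a unique Blaschke product.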
It turns out that the CSP is equivalent to the structured positive semidefinite matrix extension problem: given \(f_0,\ldots ,f_{n-1}\in {{\mathbb {C}}}\), find an extension \(\{f_j\}_{j\ge 0}\), so that \({\mathfrak {S}}_k^f\succeq 0\) for all \(k\ge 0\). The latter problem admits a fairly straightforward quaternion analog, with \(f_0,\ldots ,f_{n-1}\in {\mathbb {H}}\) and appropriately defined matrix positivity. This problem will be settled in Sect. 3. The Schur-complement argument used there will allow us to extend a given invertible matrix \({\mathfrak {S}}_n^f\) (not necessarily positive semidefinite) in such a way that the number of negative eigenvalues of \({\mathfrak {S}}_m^f\) will be the same as that of \({\mathfrak {S}}_n^f\) for all \(m>n\). In Sect. 4, using the power-series characterization of quaternion Schur-class functions and finite Blaschke products, we will get to the analytic version of the quaternion Carathéodory–Schur interpolation problem and establish quaternion analogs of the Carathéodory–Fejér theorem on the minimal-norm solution and the Carathéodory approximation theorem.
The CSP originally appeared in [8, 9, 26] in the setting of analytic functions with positive real part in \({{\mathbb {D}}}\) (that is, in the Carathéodory class \({\mathcal {C}}\)). The quaternionic counterpart of this class will be recalled in Sect. 5 and the interpolation results will be translated from the Schur-class setting using the Cayley transform.
In Sect. 6, we use the indefinite results from Sect. 3 to handle the CSP in the class of generalized Schur power series (such that the associated matrix \({\mathfrak {S}}_n^f\) has \(\kappa \) negative squares for all n big enough). As in the complex setting [18], the solution set of the problem is parametrized by a linear fractional formula with the Schur-class parameter. Due to possible zero cancelation (which does not occur in the definite case), some parameters give rise to only partial solutions to the problems. These special “excluded” parameters are classified in Sect. 6.3 and the corresponding quasi-solutions are studied in Sect. 6.4.
2 Preliminaries
We denote by \({\mathbb {H}}\) the skew field of quaternions
with imaginary units \(\textbf{i}, \textbf{j}, \textbf{k}\) commuting with reals and subject to equalities \(\textbf{i}^2=\textbf{j}^2=\textbf{k}^2=\textbf{ijk}=-1\). By \({\mathbb {H}}[[z]]\), we denote the ring of formal power series in one variable z commuting with quaternion coefficients with the ring operations given by
The real part, the conjugate, and the absolute value of \(\alpha \in {\mathbb {H}}\) of the form (2.1) are defined by
For any non-real element \(\alpha \in {\mathbb {H}}\), its minimal central (real) polynomial equals
Two elements \(\alpha ,\beta \in {\mathbb {H}}\) are called similar (\(\alpha \sim \beta \)) if \(\beta =\gamma \alpha \gamma ^{-1}\) for some non-zero \(\gamma \in {\mathbb {H}}\). We denote by
the similarity class of \(\alpha \). The second characterization in (2.4) follows from (2.3) and a general division-ring fact that \(\alpha \sim \beta \) if and only if \(\varvec{\mu }_\alpha =\varvec{\mu }_\beta \), and interprets \([\alpha ]\) as a 2-sphere of radius \(\sqrt{|\alpha |^2-(\Re \alpha )^2}\) around \(\Re \alpha \).
Another object associated with a non-real \(\alpha \in {\mathbb {H}}\) is its centralizer
which can be interpreted as the two-dimensional real subspace of \({\mathbb {H}}\) spanned by 1 and \(\alpha \). For \(\alpha \in {{\mathbb {R}}}\), (2.4) and (2.5) amount to \([\alpha ]=\{\alpha \}\) and \({{\mathbb {C}}}_\alpha ={\mathbb {H}}\).
2.1 Matrices over \({\mathbb {H}}\)
We denote by \({\mathbb {H}}^{n\times m}\) the set of \(n\times m\) matrices with quaternion entries, shortening this notation to \({\mathbb {H}}^n\) in case \(m=1\). An element \(\alpha \in {{\mathbb {H}}}\) is called a right eigenvalue of a matrix \(A\in {\mathbb {H}}^{n\times n}\) if \(A\textbf{x}=\textbf{x}\alpha \) for some non-zero \(\textbf{x}\in {{\mathbb {H}}}^{n}\). In this case, for any \(\beta =h^{-1}\alpha h\in [\alpha ]\), we also have \(A\textbf{x}h= \textbf{x}hh^{-1}\alpha h=\textbf{x}h\beta \), and hence, any element in the similarity class \([\alpha ]\) is a right eigenvalue of A. Therefore, the right spectrum \(\sigma _\textbf{r}(A)\) of A is the union of disjoint similarity classes (some of which may be real singletons).
Given a matrix \(A=[a_{ij}]\), its adjoint (conjugate transpose) \(A^*\) is defined as \(A^*=[{\overline{a}}_{ji}]\). If A is Hermitian (i.e., \(A=A^*\)), all its eigenvalues are real; if they are all nonnegative, the matrix A is called positive semidefinite; in notation, \(A\succeq 0\). It turns out that \(A\succeq 0\) if and only if \(\textbf{x}^*A\textbf{x}\ge 0\) for any \(\textbf{x}\in {\mathbb {H}}^n\). If all eigenvalues are positive (equivalently, \(\textbf{x}^*A\textbf{x}>0\) for any non-zero \(\textbf{x}\in {\mathbb {H}}^n\)), we say that A is positive definite and write \(A\succ 0\). We will write \(\nu _{\pm }(A)\) to denote the number of positive/negative eigenvalues (counted with multiplicities) of a Hermitian matrix A. For any matrix T, the nonnegative square roots of the right eigenvalues of \(TT^*\succeq 0\) are called singular values of T.
To deal with Hermitian structured extensions, we will need the Cauchy interlacing theorem for quaternionic Hermitian matrices, which can be proven either by suitably modifying the complex-setting arguments so as to avoid determinants [25, Lecture 8], or by applying the classical result to the \(2n\times 2n\) complex representation of a given \(n\times n\) quaternion matrix [24].
Theorem 2.1
( [24, 25]) If \(A\in {\mathbb {H}}^{n\times n}\) is a Hermitian matrix with eigenvalues \(\lambda _1\le \ldots \le \lambda _n\), and \(B\in \mathbb {H}^{m\times m}\) is a principal submatrix of A with eigenvalues \(\mu _1\le \ldots \le \mu _m\), then \(\lambda _k\le \mu _k\le \lambda _{k+n-m}\) for \(k=1,\ldots ,m\).
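Both ingredients of the quoted proofs can be illustrated numerically on a small example: the \(2n\times 2n\) complex representation of a Hermitian quaternion matrix (written as \(A+B\textbf{j}\) with A complex Hermitian and B complex skew-symmetric) is Hermitian with every eigenvalue doubled, and the interlacing inequalities can then be checked directly. This is our own sketch under one standard convention for the embedding; the helper `qmat2c` is hypothetical:

```python
import numpy as np

def qmat2c(A, B):
    """Complex adjoint representation of the quaternion matrix A + B*j:
    each entry a + b*j becomes the 2x2 block [[a, b], [-conj(b), conj(a)]]."""
    n = A.shape[0]
    M = np.zeros((2 * n, 2 * n), dtype=complex)
    for r in range(n):
        for c in range(n):
            a, b = A[r, c], B[r, c]
            M[2*r:2*r+2, 2*c:2*c+2] = [[a, b], [-np.conj(b), np.conj(a)]]
    return M

# Hermitian quaternion matrix Q = A + B*j: A Hermitian, B skew-symmetric
A = np.array([[2, 1j, 0], [-1j, 3, 1], [0, 1, -1]], dtype=complex)
B = np.array([[0, 1, 2], [-1, 0, 1j], [-2, -1j, 0]], dtype=complex)
M = qmat2c(A, B)

lam = np.linalg.eigvalsh(M)       # real eigenvalues, each one doubled
print(np.allclose(lam[::2], lam[1::2]))                               # True

# Interlacing for the leading 2x2 quaternion principal submatrix (n=3, m=2)
mu = np.linalg.eigvalsh(M[:4, :4])
lam_q, mu_q = lam[::2], mu[::2]   # one copy per quaternion eigenvalue
print(all(lam_q[k] - 1e-9 <= mu_q[k] <= lam_q[k + 1] + 1e-9
          for k in range(2)))                                         # True
```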
In what follows, we write \(I_n\) for the \(n\times n\) identity matrix and we will make use of \(Z_n\in {{\mathbb {R}}}^{n\times n}\) and \(\textbf{e}_n,\widetilde{\textbf{e}}_n\in {{\mathbb {R}}}^n\) given by
dropping the subscript n if the dimension is clear from the context.
2.2 Stein equations and Schur complements
Given a matrix \(J\in {\mathbb {H}}^{m\times m}\), such that \(J=J^*=J^{-1}\), and a sequence \(c_j\in {\mathbb {H}}^{1\times m}\) (\(j\ge 0\)), for each fixed \(n\ge 0\), let us denote by \(P_n\in {\mathbb {H}}^{n\times n}\) the unique solution of the Stein equation
The \(n^2\) entries of \(P_n\) are determined by the mn elements of \(C_n\), and the equation (2.7) determines a certain (displacement) structure of \(P_n\) in case \(m\ll n\). The integer m is called the displacement rank of \(P_n\); see [17]. An important fact is that the displacement structure of matrices is inherited by their Schur complements. Note that the matrix \(P_n\) is uniquely recovered from (2.7) by the formula \(P_n={\displaystyle \sum _{j=0}^{n-1}Z_n^jC_nJC_n^*Z_n^{*j}}\), from which we can see that \(P_n\) is the leading principal submatrix of \(P_{n+k}\) for any \(k\ge 1\). Writing \(P_{n+k}\) as
and assuming that \(P_n\) is invertible, we can factor \(P_{n+k}\) as
where
The latter matrix is called the Schur complement of the block \(P_n\) in (2.8). It follows from (2.9) that:
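In the complex specialization, the explicit formula for \(P_n\), the Stein equation (2.7), the nested structure (2.8), and the inertia additivity (2.10) for the Schur complement can all be verified numerically. This sketch is ours (the helper `inertia` is not from the paper) and takes \(Z_n\) to be the lower shift, as implied by the identity \(I_n-Z_nZ_n^*=\textbf{e}_n\textbf{e}_n^*\) used in Sect. 2.3:

```python
import numpy as np

def inertia(H, tol=1e-9):
    """(nu_+, nu_-): counts of positive/negative eigenvalues of Hermitian H."""
    ev = np.linalg.eigvalsh(H)
    return (int(np.sum(ev > tol)), int(np.sum(ev < -tol)))

n, k, m = 4, 2, 2
N = n + k
Z = np.diag(np.ones(N - 1), -1)            # lower shift Z_N
rng = np.random.default_rng(7)
C = rng.standard_normal((N, m)) + 1j * rng.standard_normal((N, m))
J = np.diag([1.0, -1.0])                   # J = J* = J^{-1}, mixed signature

# P_N = sum_j Z^j C J C* Z*^j solves the Stein equation P - Z P Z* = C J C*
P = sum(np.linalg.matrix_power(Z, j) @ C @ J @ C.conj().T
        @ np.linalg.matrix_power(Z, j).conj().T for j in range(N))
assert np.allclose(P - Z @ P @ Z.conj().T, C @ J @ C.conj().T)

# Nestedness (2.8): P_n sits as the leading principal block of P_{n+k}
Pn, B, D = P[:n, :n], P[n:, :n], P[n:, n:]

# Schur complement S_k of P_n, and the inertia additivity (2.10)
S = D - B @ np.linalg.inv(Pn) @ B.conj().T
pP, mP = inertia(Pn)
pS, mS = inertia(S)
print(inertia(P) == (pP + pS, mP + mS))    # True
```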
We next show that \(\textbf{S}_k\) satisfies the Stein identity similar to (2.7).
Lemma 2.2
The matrix \(\textbf{S}_k\) satisfies the Stein identity
where \(C^{\varvec{\prime }}_k\in {\mathbb {H}}^{k\times m}\) is given by
Furthermore, the top row \(c^{\varvec{\prime }}_0\) in \(C^{\varvec{\prime }}_k\) is non-zero.
Proof
We start with the Stein identity
(the same as (2.7) but with n replaced by \(n+k\)), from which it follows that:
Making use of the latter equality and (2.12) gives
Due to partition (2.9), we have
Substituting the latter equalities into (2.15), we get
thus verifying (2.11). To prove the last statement, let us assume, via contradiction, that \(c^{\varvec{\prime }}_0=0\). Upon letting \(k=1\) in (2.12), we then have
and therefore
By the Stein identity (2.13) (with \(k=1\))
Substituting the latter equality into (2.16) gives
Multiplying both sides by \((I-Z_n^*)^{-1}P_n^{-1}(I-Z_n)\) on the right gives
By (2.6), the rightmost entry in the row-vector \(\widetilde{\textbf{e}}_n^*-B_1P_n^{-1}Z_n\) equals one which contradicts (2.17), thus completing the proof. \(\square \)
If \(c_0\ne 0\) and \(J=\pm I_m\), then the matrix \(P_n\) defined by Eq. (2.7) is positive or negative definite. Otherwise (i.e., if \(\sigma (J)=\{1,-1\}\)), \(P_n\) may have positive, negative, and zero eigenvalues. In this case, determining the inertia of \(P_n\) in terms of \(C_n\) is a non-trivial question. Since the family \(\{P_n\}_{n\ge 1}\) is nested in the sense of (2.8), it follows by Theorem 2.1 that \(\nu _\pm (P_{n+k})\ge \nu _\pm (P_n)\) for all \(n,k\ge 1\); a follow-up question is then to extend a given \(P_n\) to \(P_{n+k}\) (by an appropriate choice of \(c_n,\ldots ,c_{n+k-1}\)) with the minimally possible negative (or positive) inertia. Below, we will examine both questions for two particular choices of \(C_n\) and J in (2.7) leading to the Hermitian matrices \({\mathfrak {S}}_n\) and \({\mathfrak {C}}_n\) defined in (2.19).
2.3 Two particular cases
With any power series \(f\in {\mathbb {H}}[[z]]\) (or a quaternion sequence \(\{f_j\}_{j\ge 0}\)), we associate lower triangular Toeplitz matrices \(\textbf{T}_n^f\) by the rule
and subsequently, Hermitian matrices
for all \(n\ge 1\). Let us note that \({\mathfrak {C}}_n^f\) is a generic Hermitian Toeplitz matrix. Furthermore, the relations
that hold for all \(f,g\in {\mathbb {H}}[[z]]\) (in the rightmost relation, f needs to be invertible in \({\mathbb {H}}[[z]]\)) follow immediately from (2.2) and (2.18).
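Although the displayed relations (2.20) are not reproduced here, the underlying correspondence between power series and lower-triangular Toeplitz matrices (sums, Cauchy products, and inverses of series match sums, products, and inverses of the matrices) can be checked in the complex case. The sketch below is ours; the helper `T_low` is hypothetical:

```python
import numpy as np

def T_low(c):
    """Lower-triangular Toeplitz matrix of a truncated power series (2.18)."""
    n = len(c)
    return np.array([[c[i - j] if i >= j else 0 for j in range(n)]
                     for i in range(n)], dtype=complex)

f = [1 + 1j, 2, -1]          # f_0 invertible, so f is invertible in H[[z]]
g = [3, 0, 1j]
n = 3

# Cauchy product: the first n coefficients of the product series fg
fg = [sum(f[k] * g[j - k] for k in range(j + 1)) for j in range(n)]
print(np.allclose(T_low(f) @ T_low(g), T_low(fg)))     # True: T^{fg} = T^f T^g

# Coefficients of the inverse series f^{-1} (complex case, order immaterial)
h = [1 / f[0]]
for j in range(1, n):
    h.append(-sum(f[k] * h[j - k] for k in range(1, j + 1)) / f[0])
print(np.allclose(np.linalg.inv(T_low(f)), T_low(h)))  # True: T^{f^{-1}} = (T^f)^{-1}
```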
Let us also note that \({\mathfrak {S}}_n={\mathfrak {S}}_n^f\) and \({\mathfrak {C}}_n={\mathfrak {C}}_n^f\) are unique solutions to the respective Stein equations
where \(Z_n\) and \(\textbf{e}_n\) are given in (2.6) and where
Indeed, since \(Z_n\textbf{T}_n^{f}=\textbf{T}_n^{f}Z_n\) and \(I_n-Z_nZ_n^*=\textbf{e}_n\textbf{e}_n^*\), we get (2.21) as follows:
Equality (2.22) is verified similarly.
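In the complex case, the verification of (2.21) can be replayed numerically: with \({\mathfrak {S}}_n^f=I-\textbf{T}_n^f\textbf{T}_n^{f*}\) (the normalization used later in Theorem 3.14) and \(F_n=\textbf{T}_n^f\textbf{e}_n\) the coefficient column, one checks \({\mathfrak {S}}_n-Z_n{\mathfrak {S}}_nZ_n^*=\textbf{e}_n\textbf{e}_n^*-F_nF_n^*\). This is our sketch; the identification of \(F_n\) with the coefficient column is an assumption consistent with Remark 2.3:

```python
import numpy as np

def T_low(c):
    n = len(c)
    return np.array([[c[i - j] if i >= j else 0 for j in range(n)]
                     for i in range(n)], dtype=complex)

f = [0.5, 0.3 - 0.2j, 0.1j, -0.4]
n = len(f)
Tf = T_low(f)
S = np.eye(n) - Tf @ Tf.conj().T        # S_n^f = I - T_n^f T_n^{f*}
Z = np.diag(np.ones(n - 1), -1)         # lower shift
e = np.eye(n)[:, :1]                    # e_n = first standard basis column
F = Tf @ e                              # F_n: the coefficient column of f

# The two ingredients used in the verification of (2.21)
assert np.allclose(Z @ Tf, Tf @ Z)                       # Z T^f = T^f Z
assert np.allclose(np.eye(n) - Z @ Z.conj().T, e @ e.T)  # I - Z Z* = e e*

# Stein identity (2.21): S - Z S Z* = e e* - F F*  (displacement rank 2)
print(np.allclose(S - Z @ S @ Z.conj().T, e @ e.T - F @ F.conj().T))  # True
```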
Remark 2.3
The rightmost expressions in Eqs. (2.21), (2.22) tell us that the latter equations are particular cases of (2.7) with \(m=2\), \(C_n=\begin{bmatrix}{} \textbf{e}_n&F_n\end{bmatrix}\) and \(J=\left[ {\begin{matrix} 1 &{} 0 \\ 0 &{}-1 \end{matrix}}\right] \) in (2.21) and \(J=\left[ {\begin{matrix} 0 &{} 1 \\ 1 &{}0 \end{matrix}}\right] \) in (2.22).
3 Extensions of \({\mathfrak {S}}_n\) and \({\mathfrak {C}}_n\) with minimal negative inertia
In this section, we consider the following structured extension problems: given \(f_0,\ldots ,f_{n-1}\) (i.e., given a matrix \({\mathfrak {S}}_n\) or \({\mathfrak {C}}_n\)) find \(\{f_j\}_{j\ge n}\), so that \(\nu _-({\mathfrak {S}}_{n+k})\) (or \(\nu _-({\mathfrak {C}}_{n+k}))\) will be minimally possible for each fixed \(k\ge 1\). The two settings are equivalent in the sense that the results for one setting translate to the other via the Cayley transform. However, for specific questions, one setting might be more convenient than the other. We start with Hermitian Toeplitz matrices \({\mathfrak {C}}_n\) (\(n\ge 1\)).
3.1 Extensions of singular \({\mathfrak {C}}_n\) with minimal negative inertia
Given a sequence \(\{f_j\}_{j\ge 0}\), let us consider conformal block-decompositions
for all \(k\ge 1\), where
and let us assume that \({\mathfrak {C}}_n\) is invertible. Upon letting
in Lemma 2.2, we conclude that \(\textbf{S}_k={\mathfrak {C}}^f_k-T_{n,k}{\mathfrak {C}}_n^{-1}T_{n,k}^*\), the Schur complement of \({\mathfrak {C}}_n\) in \({\mathfrak {C}}_{n+k}\), satisfies the Stein identity
where \(X_k,Y_k\in {\mathbb {H}}^k\) are given by the formula
and furthermore, \(\begin{bmatrix}x_0&y_0\end{bmatrix}\ne 0\). Without loss of generality, we may assume \(x_0\ne 0\). Since the formula (3.4) holds for all \(k\ge 1\), we get two sequences \(\{x_j\}_{j\ge 0}\) and \(\{y_j\}_{j\ge 0}\). Since \(x_0\ne 0\), the matrix \(\textbf{T}_k^{x}\) is invertible for all \(k\ge 1\), so we can introduce the sequence \(\{\varepsilon _j\}_{j\ge 0}\) by the formula
Multiplying both sides of (3.3) by \((\textbf{T}_k^{x})^{-1}\) on the left and by its adjoint on the right and taking into account that \(\textbf{T}_k^{x}\) commutes with \(Z_k\), we arrive at
which is the Stein equation of the form (2.22) and, hence, has a unique solution \({\mathfrak {C}}_k^\varepsilon \). Thus
By the Sylvester law of inertia (see, e.g., [20] for the quaternionic version), \(\nu _{\pm }(\textbf{S}_k)=\nu _{\pm }({\mathfrak {C}}_k^\varepsilon )\). Combining the latter equalities with the general relation (2.10), we conclude
Lemma 3.1
Given a sequence \(\{f_j\}_{j\ge 0}\), let us assume that \({\mathfrak {C}}^f_n\) is invertible and \({\mathfrak {C}}^f_{n+1}\) is singular. If \(\nu _-({\mathfrak {C}}^f_{n+k})=\nu _-({\mathfrak {C}}_n^f)\) for some \(k>1\), then
-
1.
\(\textrm{rank}({\mathfrak {C}}^f_{n+k})=\textrm{rank}({\mathfrak {C}}^f_{n})\).
-
2.
The elements \(f_{n+1},\ldots ,f_{n+k-1}\) are uniquely determined by \(f_0,\ldots ,f_n\).
Proof
Let \(\{\varepsilon _j\}_{j\ge 0}\) be the sequence constructed from \(\{f_j\}_{j\ge 0}\) as in (3.5). Since \({\mathfrak {C}}^f_{n+1}\) is singular, \(\textbf{S}_1=0\), and hence \(\varepsilon _0+{\overline{\varepsilon }}_0=0\), by (3.6) (with \(k=1\)). Since \(\nu _-({\mathfrak {C}}^f_{n+k})=\nu _-({\mathfrak {C}}_n^f)\), we have \(\nu _-({\mathfrak {C}}^\varepsilon _k)=0\). Therefore, \({\mathfrak {C}}^\varepsilon _k\) is a positive semidefinite matrix with zero diagonal entries, and hence \({\mathfrak {C}}^\varepsilon _k=0\) (for a positive semidefinite matrix, \(|a_{ij}|^2\le a_{ii}a_{jj}\), so a zero diagonal forces the zero matrix). Consequently, \(\nu _+({\mathfrak {C}}^f_{n+k})=\nu _+({\mathfrak {C}}_n^f)\) (by (3.7)), from which part (1) follows.
For part (2), let us decompose \({\mathfrak {C}}_{n+k}:={\mathfrak {C}}^f_{n+k}\) as follows:
By part (1), the Schur complement of the block \({\mathfrak {C}}_n\) is the zero matrix
In particular, we have \(\textbf{c}=T_{n,k-1}{\mathfrak {C}}_n^{-1}{} \textbf{b}\), which we write entry-wise as
The latter formula recursively recovers \(f_{n+1},\ldots ,f_{n+k-1}\) from \(f_0,\ldots ,f_n\). \(\square \)
Remark 3.2
If \({{\widehat{A}}}=\left[ {\begin{matrix} A &{} *\\ *&{}* \end{matrix}}\right] \) is any Hermitian extension of \(A=A^*\) with \(\textrm{rank}({{\widehat{A}}})=\textrm{rank}(A)\), then necessarily \(\nu _{\pm }({{\widehat{A}}})=\nu _{\pm }(A)\).
Proof
By Theorem 2.1, \(\nu _{\pm }({{\widehat{A}}})\ge \nu _{\pm }(A)\). Therefore, the rank equality \(\nu _{+}({{\widehat{A}}})+\nu _{-}({{\widehat{A}}})=\nu _{+}(A)+\nu _{-}(A)\) is possible if and only if \(\nu _{\pm }({{\widehat{A}}})=\nu _{\pm }(A)\). \(\square \)
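Remark 3.2 can be tested on a small complex example: bordering a Hermitian A by a column from its range preserves the rank, and then the inertia is preserved as well. The construction below is ours:

```python
import numpy as np

def inertia(H, tol=1e-10):
    ev = np.linalg.eigvalsh(H)
    return (int(np.sum(ev > tol)), int(np.sum(ev < -tol)))

A = np.array([[1, 2], [2, -1]], dtype=complex)   # Hermitian, inertia (1, 1)
x = np.array([[1j], [2]])

# Rank-preserving bordering: Ahat = [[I], [x*]] A [[I, x]]
col = A @ x
Ahat = np.block([[A, col], [col.conj().T, x.conj().T @ A @ x]])

print(int(np.linalg.matrix_rank(Ahat)) == int(np.linalg.matrix_rank(A)))  # True
print(inertia(Ahat) == inertia(A))                                        # True
```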
Lemma 3.3
Given a sequence \(\{f_j\}_{j\ge 0}\), let us assume that
-
1.
the matrix \({\mathfrak {C}}^f_n\) is invertible,
-
2.
the matrices \({\mathfrak {C}}^f_{n+1}, \ldots , {\mathfrak {C}}^f_{n+k}\) are all singular,
-
3.
\(\textrm{rank}({\mathfrak {C}}^f_n)< \textrm{rank}({\mathfrak {C}}^f_{n+k})=d\).
Then, \(\nu _{\pm }({\mathfrak {C}}^f_{n+k+j})=\nu _{\pm }({\mathfrak {C}}^f_{n+k})+j \;\) for all \(j=1,\ldots , n+k-d\).
Proof
As in the proof of Lemma 3.1, we consider the sequence \(\{\varepsilon _j\}_{j\ge 0}\) defined as in (3.5). By (3.7), the assumptions 3 and 2 in the lemma translate to Toeplitz matrices \({\mathfrak {C}}_j^\varepsilon \) as follows: \(\textrm{rank}({\mathfrak {C}}_k^\varepsilon )> 0\) (i.e., \({\mathfrak {C}}_k^\varepsilon \ne 0\)) and the leading principal submatrices \({\mathfrak {C}}_1^\varepsilon ,\ldots , {\mathfrak {C}}_{k-1}^\varepsilon \) of \({\mathfrak {C}}_k^\varepsilon \) are all singular. Due to the Toeplitz structure of \({\mathfrak {C}}_k^\varepsilon \), it follows that there is an integer i, such that:
Then, the matrix \({\mathfrak {C}}_k^\varepsilon \) is of the form
Since the triangular Toeplitz matrix \(B_{i,k}\in \mathbb {H}^{(k-i)\times (k-i)}\) is invertible, we have \(\textrm{rank}({\mathfrak {C}}_k^\varepsilon )=2(k-i)\). Upon considering \({\mathfrak {C}}_k^\varepsilon \) as the extension of the block \(\left[ {\begin{matrix} 0 &{} 0 \\ 0&{}0 \end{matrix}}\right] \) in (3.9), we conclude by Theorem 2.1 that \(\nu _\pm ({\mathfrak {C}}_k^\varepsilon )\le k-i\). Since
it follows that \(\nu _\pm ({\mathfrak {C}}_k^\varepsilon )=k-i\). On the other hand, due to (3.7)
and therefore, \(i=\frac{n+2k-d}{2}\). Again, making use of (3.7), we now get
Now, let us consider the matrix \({\mathfrak {C}}^f_{n+k+j}\), \(1\le j\le n+k-d\). The Schur complement of its leading principal submatrix \({\mathfrak {C}}^f_{n}\) is congruent to the Toeplitz matrix \({\mathfrak {C}}_{k+j}^\varepsilon \), the structured extension of \({\mathfrak {C}}_k^\varepsilon \). Therefore, \({\mathfrak {C}}_{k+j}^\varepsilon \) is of the form
where the matrix \(B_{i,k+j}\in {\mathbb {H}}^{(k-i+j)\times (k-i+j)}\) is defined as in (3.9) and the integer i is the same as above: \(i=\frac{n+2k-d}{2}\). Therefore
Upon invoking Theorem 2.1 as in the previous part of the proof, we derive \(\nu _\pm ({\mathfrak {C}}_{k+j}^\varepsilon )=\frac{d-n}{2}+j\), and subsequently, on account of (3.10)
which completes the proof. \(\square \)
For the sake of completeness, we include the following remark, although it will not be used in the rest of the paper.
Remark 3.4
In the setting of Lemma 3.3
and hence, \(f_{n+1},\ldots , f_{n+i-1}\) are uniquely determined by \(f_0,\ldots ,f_{n-1}\) (by Lemma 3.1). Moreover, for further extensions (with \(j\le 2i\))
Indeed, we see from (3.9) that \({\mathfrak {C}}_{i}^\varepsilon =0\), and hence, the first statement follows by (3.7). Furthermore, the formula (3.9) makes sense with k replaced by any j, \(i<j\le 2i\). Then, we may conclude exactly as in the proof of Lemma 3.3 that \(\nu _{\pm }({\mathfrak {C}}_{j}^\varepsilon )=j-i\), and consequently,
Upon letting \(N:=n+k\) in Lemmas 3.1 and 3.3, we arrive at the following extension result for singular Hermitian Toeplitz matrices; see [15] for the complex case.
Theorem 3.5
Given \(f_0,\ldots ,f_{N-1}\), let us assume that the matrix \({\mathfrak {C}}_N:={\mathfrak {C}}^f_{N}\) is singular and that \({\mathfrak {C}}_n\) (\(n<N\)) is the maximal invertible leading principal submatrix of \({\mathfrak {C}}_{N}\).
-
1.
If \(\textrm{rank}({\mathfrak {C}}_{N})=\textrm{rank}({\mathfrak {C}}_{n})=n\), then \(\nu _-({\mathfrak {C}}_{N})=\nu _-({\mathfrak {C}}_{n}):=\kappa \) and for each \(k\ge 1\), the extension \({\mathfrak {C}}_{N+k}\) with \(\nu _-({\mathfrak {C}}_{N+k})=\kappa \) is unique and satisfies \(\textrm{rank}({\mathfrak {C}}_{N+k})=n\).
-
2.
If \(\textrm{rank}({\mathfrak {C}}_{N})=d>n\), then for any choice of \(f_{N},\ldots ,f_{2N-d-1}\)
$$\begin{aligned} \nu _\pm ({\mathfrak {C}}_{N+j})=\nu _\pm ({\mathfrak {C}}_{N})+j\quad \text{ for }\quad j=1,\ldots ,N-d. \end{aligned}$$(3.45)
In particular, the matrix \({\mathfrak {C}}_{2N-d}\) is invertible.
To translate Theorem 3.5 to the setting of matrices \({\mathfrak {S}}_j\), we make the following observation.
Remark 3.6
Given a sequence \(\{f_j\}_{j\ge 0}\), there exists a sequence \(\{g_j\}_{j\ge 0}\), such that the matrix \({\mathfrak {S}}_n^f\) is congruent to the matrix \({\mathfrak {C}}_n^g\) for each \(n\ge 1\).
Proof
If \(f_0\ne 1\), we recursively define the sequence \(\{g_j\}_{j\ge 0}\) by
Then, the associated Toeplitz matrices are related by
and therefore
If \(f_0=1\), the matrix \(I-\textbf{T}_n^f\) is not invertible. In this case, we pass to the sequence \(\{-f_j\}_{j\ge 0}\) and then construct \(\{g_j\}_{j\ge 0}\) as above. Then, we have
and the desired congruence follows. \(\square \)
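Although the displayed formulas of Remark 3.6 are not reproduced here, the congruence can be checked numerically in the complex case under the natural assumptions \({\mathfrak {S}}_n^f=I-\textbf{T}_n^f\textbf{T}_n^{f*}\), \({\mathfrak {C}}_n^g=\textbf{T}_n^g+\textbf{T}_n^{g*}\) (up to an inessential positive factor), and the Cayley transform \(g=(1-f)^{-1}(1+f)\) computed at the Toeplitz-matrix level. This sketch is ours:

```python
import numpy as np

def T_low(c):
    n = len(c)
    return np.array([[c[i - j] if i >= j else 0 for j in range(n)]
                     for i in range(n)], dtype=complex)

def inertia(H, tol=1e-10):
    ev = np.linalg.eigvalsh(H)
    return (int(np.sum(ev > tol)), int(np.sum(ev < -tol)))

f = [0.3, 1.5j, -0.2]                 # f_0 != 1, so I - T^f is invertible
n = len(f)
Tf = T_low(f)
I = np.eye(n)
Sf = I - Tf @ Tf.conj().T             # S_n^f

# Cayley transform on Toeplitz matrices: T^g is again lower-triangular
# Toeplitz, and C_n^g = T^g + T^{g*} is Hermitian Toeplitz
Tg = np.linalg.inv(I - Tf) @ (I + Tf)
Cg = Tg + Tg.conj().T

# Congruence: C_n^g = 2 (I - T^f)^{-1} S_n^f (I - T^f)^{-*}
L = np.linalg.inv(I - Tf)
print(np.allclose(Cg, 2 * L @ Sf @ L.conj().T))   # True
print(inertia(Cg) == inertia(Sf))                 # True: congruent matrices
```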
Since congruent matrices have the same inertia, we arrive at the following conclusion.
Remark 3.7
Lemmas 3.1, 3.3, and Theorem 3.5 with all matrices \({\mathfrak {C}}_j\) replaced by \({\mathfrak {S}}_j\) hold true.
Theorem 3.5 is concerned with structured extensions of a given singular matrix \({\mathfrak {C}}_N\) (or \({\mathfrak {S}}_N\)). As a matter of fact, an invertible matrix \({\mathfrak {C}}_N\) (or \({\mathfrak {S}}_N\)) can always be extended without increasing the negative inertia, and there are infinitely many such extensions. In the complex setting, this result goes back to [16]. The quaternion version is given below in the setting of the matrices \({\mathfrak {S}}_N\), which seems to be more convenient here.
3.2 Extensions of invertible \({\mathfrak {S}}_n\) with minimal negative inertia
We start with \(f_0,\ldots ,f_{n+k-1}\in {\mathbb {H}}\), such that the matrix \({\mathfrak {S}}_n:={\mathfrak {S}}^f_n\) is invertible, and consider the block-decomposition of \({\mathfrak {S}}^f_{n+k}\) conformal with (3.1)
Upon letting
[where \(F_{n+k}\) is defined as in (2.23)] in Lemma 2.2, we conclude that
the Schur complement of \({\mathfrak {S}}_n\) in (3.11) satisfies the Stein identity
where \(X_k,Y_k\in {\mathbb {H}}^k\) are given by the formula
and furthermore, \(\begin{bmatrix}x_0&y_0\end{bmatrix}\ne 0\).
Remark 3.8
Let us assume that \(\nu _{-}({\mathfrak {S}}_{n+1})=\nu _{-}({\mathfrak {S}}_{n})\). Then, \(|x_0|>|y_0|\) if \({\mathfrak {S}}_{n+1}\) is invertible and \(|x_0|=|y_0|>0\) if \({\mathfrak {S}}_{n+1}\) is singular. The latter is clear from equalities
which, in turn, are particular cases (\(k=1\)) of (2.10) and (3.14), respectively.
We next introduce matrix polynomials \(\Psi =\left[ {\begin{matrix} \psi _{11} &{} \psi _{12}\\ \psi _{21}&{} \psi _{22} \end{matrix}}\right] \) and \(\Theta =\left[ {\begin{matrix} \theta _{11} &{} \theta _{12}\\ \theta _{21} &{} \theta _{22} \end{matrix}}\right] \)
where
Remark 3.9
At least one of the polynomials \(\theta _{21}\) and \(\theta _{22}\) in (3.17) and at least one of the polynomials \(\psi _{11}\) and \(\psi _{21}\) in (3.16) are invertible in \({\mathbb {H}}[[z]]\), that is
Proof
To prove the first relation in (3.18), let us assume, via contradiction, that
Then, it follows by (2.21) that:
and therefore, upon canceling invertible matrices on the right
Combining (3.21) with (3.19) gives
which together with (3.21) implies \(F_n^*{\mathfrak {S}}_n^{-1}=0\), and hence, \(F_n=0\), which contradicts (3.20) and thus completes the proof of the first relation in (3.18). To prove the second one, we use the explicit formulas for \(\psi _{11}\) and \(\psi _{21}\) (derived from (3.16)) and assume, via contradiction, that
Then, it follows by (2.21) that:
and the latter contradiction completes the proof. \(\square \)
In what follows, we will use the equality
which can be verified directly upon making use of the Stein identity (2.21), or upon specifying the general identity (2.14) to the present setting.
Lemma 3.10
The polynomials \(\Theta \) and \(\Psi \) defined in (3.17) and (3.16) are subject to identities
Proof
where
Observe that \((zI-Z_n)\textbf{Z}_n(z)=z^nI\) and therefore
Making use of the last equality along with (3.23), we simplify the rightmost term in (3.25) to
Substituting the latter expression into (3.25) leads to \(W(z)=0\), thus confirming the first equality in (3.24). The second equality is now clear. \(\square \)
Lemma 3.11
Given \(f_0,\ldots ,f_{n+k-1}\in {\mathbb {H}}\) such that the matrix \({\mathfrak {S}}_n\) is invertible, let \(F_{n+k}\) and \(\Theta \) be defined as in (2.23) and (3.17). Then
where the columns \(X_k,Y_k\in {\mathbb {H}}^k\) are the same as in (3.15) and \(\Phi _k\) is the \((n+k)\times 2\)-matrix polynomial given by
Proof
Here, we will use the equality
which follows from (2.13) (specified to the present setting) upon multiplying both sides of the latter by \(\left[ {\begin{matrix} I_n \\ 0 \end{matrix}}\right] \) on the right. Using the latter equality along with (3.17) and (3.28), we compute the polynomial (which eventually turns out to be a constant)
It is readily seen from (3.29) that the n top rows in L are zeros
while, for the k bottom rows, we have
Comparing the last expression with (2.12) implies
which together with (3.29) and (3.30) implies (3.27). \(\square \)
The next result is a consequence of formulas (3.27) and (3.15). Here, we start with an infinite sequence \(\{f_j\}_{j\ge 0}\) (or its Z-transform \(f(z)=\sum f_jz^j\)). If the first n terms (coefficients) are such that the matrix \({\mathfrak {S}}_n\) is invertible, the formulas (3.15) define the sequences \(\{x_j\}_{j\ge 0}\), \(\{y_j\}_{j\ge 0}\) and their Z-transforms x(z), y(z).
Lemma 3.12
If the first n coefficients of \(f(z)=\sum f_jz^j\in {\mathbb {H}}[[z]]\) are such that the matrix \({\mathfrak {S}}_n={\mathfrak {S}}_n^f\) (1.1) is invertible, and if \(\theta _{ij}\) are the polynomials defined as in (3.17), then
where
are the power series with coefficients defined via formulas (3.15) for \(k\ge 1\).
Proof
Let \(p_{n+k}\) be the polynomial defined by
Multiplying both sides of (3.27) by \(\begin{bmatrix}1&z&\ldots&z^{n+k-1}\end{bmatrix}\) and taking into account the structure of matrices in (2.6) and (3.15), we get
Due to (3.32), the latter equality can be rearranged as
It follows from (3.34) that the equality (3.31) holds for power series \(x,y\in {\mathbb {H}}[[z]]\) whose first k coefficients are given by formulas (3.15). Since equalities (3.15) and (3.34) hold for all \(k\ge 1\), the statement follows. \(\square \)
Theorem 3.13
Let \(f_0,\ldots ,f_{n-1}\in {\mathbb {H}}\) be such that the matrix \({\mathfrak {S}}_n={\mathfrak {S}}_n^f\) is invertible and let \(\Theta \) and \(\Psi \) be the matrix polynomials defined in (3.17) and (3.16). Given \(f\in {\mathbb {H}}[[z]]\), the following are equivalent:
-
1.
f is of the form
$$\begin{aligned} f(z)=\underbrace{f_0+f_1z+\cdots +f_{n-1}z^{n-1}}_{p_n(z)}+\cdots \end{aligned}$$(3.35) -
2.
\(\begin{bmatrix}1&\quad -f\end{bmatrix}\Theta =z^n \begin{bmatrix}x&\quad -y\end{bmatrix}\) for some \(x,y\in {\mathbb {H}}[[z]]\).
-
3.
f is of the form
$$\begin{aligned} f=(x\psi _{11}-y\psi _{21})^{-1}(y\psi _{22}-x\psi _{12}) \end{aligned}$$(3.36) for some \(x,y\in {\mathbb {H}}[[z]]\), such that \(x\psi _{11}-y\psi _{21}\) is invertible in \({\mathbb {H}}[[z]]\).
Proof
Implication \((1)\Rightarrow (2)\) follows by Lemma 3.12.
Proof of \((2)\Rightarrow (1)\): We first note that equality (3.33) makes sense even for \(k=0\), in which case it takes the form
where according to (3.28)
Assuming that (3.31) holds for some \(x,y\in {\mathbb {H}}[[z]]\), we subtract (3.31) from (3.37) to get
By Remark 3.9, the free coefficient of the polynomial \(\begin{bmatrix}\theta _{21}&\theta _{22}\end{bmatrix}\) is non-zero. Then, it follows from (3.38) that the first n coefficients of the power series \(f-p_n\) are zeros, and hence, f is of the form (3.35).
Proof of \((2)\Leftrightarrow (3)\): Assume first that equality (3.31) holds for some \(x,y\in {\mathbb {H}}[[z]]\). Multiplying both sides of (3.31) by \(\Psi (z)\) on the right, making use of (3.24) and canceling \(z^n\), we arrive at
Therefore, \(x\psi _{11}-y\psi _{21}=1\) (in particular, it is invertible in \({\mathbb {H}}[[z]]\)) and
Conversely, if f is of the form (3.36), then \((x\psi _{11}-y\psi _{21})f=y\psi _{22}-x\psi _{12}\), which can be rearranged as in (3.39). Multiplying both sides of (3.39) by \(\Theta (z)\) on the right and making use of (3.24), we get back to (3.31). \(\square \)
The next theorem parametrizes all structured extensions \({\mathfrak {S}}_{n+k}\) of a given invertible \({\mathfrak {S}}_{n}\) with minimally possible negative inertia.
Theorem 3.14
Let \(f_0,\ldots ,f_{n-1}\in {\mathbb {H}}\) be such that the matrix \({\mathfrak {S}}_n={\mathfrak {S}}_n^f\) is invertible and let \(\Psi =\left[ {\begin{matrix} \psi _{11}&{} \psi _{12}\\ \psi _{21}&{}\psi _{22} \end{matrix}}\right] \) and \(\Theta =\left[ {\begin{matrix} \theta _{11}&{} \theta _{12}\\ \theta _{21}&{}\theta _{22} \end{matrix}}\right] \) be the polynomials defined in (3.16), (3.17). Then
-
1.
The equality
$$\begin{aligned} (\psi _{11}-\varepsilon \psi _{21})^{-1}(\varepsilon \psi _{22}-\psi _{12})= (\theta _{11}\varepsilon +\theta _{12})(\theta _{21}\varepsilon +\theta _{22})^{-1} \end{aligned}$$(3.40) holds for any \(\varepsilon \in {\mathbb {H}}[[z]]\) subject to the equivalent conditions
$$\begin{aligned} \psi _{11,0}-\varepsilon _0\psi _{21,0}\ne 0 \; \; \Longleftrightarrow \; \; \theta _{21,0}\varepsilon _0+\theta _{22,0}\ne 0. \end{aligned}$$(3.41) -
2.
Conditions (3.41) are met for any \(\varepsilon \in {\mathbb {H}}[[z]]\) with \(|\varepsilon _0|\le 1\) if and only if the bottom diagonal entry of \({\mathfrak {S}}_n^{-1}\) is positive
$$\begin{aligned} \widetilde{\textbf{e}}^*_n{\mathfrak {S}}_n^{-1}\widetilde{\textbf{e}}_n>0. \end{aligned}$$(3.42) -
3.
An extended sequence \(\{f_j\}_{j\ge 0}\) satisfies equalities
$$\begin{aligned} \nu _-({\mathfrak {S}}_{n+k})=\nu _-({\mathfrak {S}}_{n})\quad \text{ for } \text{ all }\quad k\ge 1 \end{aligned}$$(3.43)if and only if its Z-transform \(f(z):=\sum f_jz^j\) is of the form
$$\begin{aligned} f=(\psi _{11}-\varepsilon \psi _{21})^{-1}(\varepsilon \psi _{22}-\psi _{12}) =(\theta _{11}\varepsilon +\theta _{21})(\theta _{21}\varepsilon +\theta _{22})^{-1}, \end{aligned}$$(3.44)
where \(\varepsilon \in {\mathbb {H}}[[z]]\) is any power series subject to conditions (3.41) and such that \({\mathfrak {S}}_{k}^\varepsilon :=I-\textbf{T}_k^{\varepsilon }\textbf{T}_k^{\varepsilon *}\succeq 0\) for all \(k\ge 1\).
Proof of (1)
By (3.24), \(\Theta _0\Psi _0=0\) and hence in particular
Let us assume that
Then, \(\psi _{21,0}\ne 0\), since otherwise \(\psi _{11,0}=0\), which is not possible, by Remark 3.9. Hence
and subsequently
The converse implication is verified similarly. This completes the justification of the equivalence (3.41). Once the conditions (3.41) are met, both sides in (3.40) make sense, and the equality (3.40) can be written as
or equivalently, as \(\begin{bmatrix}1&\quad -\varepsilon \end{bmatrix}\Psi \Theta \begin{bmatrix}\varepsilon \\ 1\end{bmatrix}=0\) which holds true due to (3.24). \(\square \)
Proof of (2)
The left condition in (3.41) holds for all \(\varepsilon _0\) subject to \(|\varepsilon _0|\le 1\) if and only if \(|\psi _{11,0}|>|\psi _{21,0}|\). The latter is equivalent to (3.42), since
The latter equality follows from explicit formulas (3.22) and the identity (3.23). Details are given in computation (6.17) below (for \(j=0\)).\(\square \)
Proof of (3)
The Z-transform of any extended sequence is of the form (3.35) and therefore (by Theorem 3.13), it is of the form (3.36) for some \(x,y\in {\mathbb {H}}[[z]]\) subject to
By Remark 3.8, equality (3.43) (for \(k=1\)) guarantees that \(x_0\ne 0\). Then, (3.47) is equivalent to the first condition in (3.41). Furthermore, x is invertible in \({\mathbb {H}}[[z]]\). Letting \(\varepsilon :=x^{-1}y\in {\mathbb {H}}[[z]]\), we can write the formula (3.36) in the form (3.44). Since \(x_0\ne 0\), the Toeplitz matrix \(\textbf{T}_k^{x}\) associated with the power series x is invertible for any \(k\ge 1\). Moreover, \(\textbf{T}_k^y=\textbf{T}_k^{x}\textbf{T}_k^\varepsilon \), \(\textbf{T}_k^{x}\textbf{e}=X_k\), \(\textbf{T}_k^{y}\textbf{e}=Y_k\). Multiplying both sides in (3.14) by \((\textbf{T}_k^{x})^{-1}\) on the left, by its adjoint on the right and commuting \((\textbf{T}_k^{x})^{-1}\) and \(Z_k\), we get
which is the Stein equation of the form (2.21). Therefore, it admits a unique solution \({\mathfrak {C}}_k^\varepsilon \). Thus, \((\textbf{T}_k^{x})^{-1}{} \textbf{S}_k(\textbf{T}_k^{x*})^{-1}={\mathfrak {C}}_k^\varepsilon \), and therefore, \(\nu _{\pm }(\textbf{S}_k)=\nu _{\pm }({\mathfrak {C}}_k^\varepsilon )\), by the Sylvester law of inertia. Then, we have by (2.10)
and hence, equalities (3.43) hold if and only if \(\nu _- ({\mathfrak {S}}^\varepsilon _k)=0\), meaning that \({\mathfrak {S}}^\varepsilon _k\succeq 0\) for all \(k\ge 1\). \(\square \)
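The inertia count in this proof rests on Sylvester's law of inertia: a congruence \(M=XSX^*\) with X invertible preserves \(\nu _{\pm }\); this is exactly the step that transfers the inertia of \(\textbf{S}_k\) to \({\mathfrak {C}}_k^\varepsilon \) via \(X=(\textbf{T}_k^{x})^{-1}\). For readers who wish to check this numerically, here is a minimal sketch in the classical complex case (an illustration only; the quaternionic setting would additionally require a complex matrix representation of \({\mathbb {H}}\)), with numpy assumed:

```python
import numpy as np

def inertia(M, tol=1e-9):
    """Return (nu_plus, nu_minus) for a Hermitian matrix M."""
    w = np.linalg.eigvalsh(M)
    return int((w > tol).sum()), int((w < -tol).sum())

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
S = A + A.conj().T                       # Hermitian, generically indefinite
X = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
M = X @ S @ X.conj().T                   # congruent to S (X is generically invertible)
assert inertia(M) == inertia(S)          # Sylvester's law of inertia
```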
4 Carathéodory–Schur problem in the Schur class \({\mathcal {S}}_{{\mathbb {H}}}\)
So far, we have made no assumptions on the convergence of formal power series over \({\mathbb {H}}\). For a fixed \(\rho >0\), we now introduce the subring \({\mathcal {H}}_\rho \) of \({\mathbb {H}}[[z]]\) of power series (absolutely) converging in the ball \(\mathbb {B}_\rho =\{\alpha \in {\mathbb {H}}: \, |\alpha |<\rho \}\).
Remark 4.1
If \(f\in {\mathcal {H}}_\rho \) and \(f_0\ne 0\), then \(f^{-1}\in \mathcal {H}_\delta \) for some \(\delta >0\).
Indeed, since f converges in \({\mathbb {B}}_\rho \) absolutely, \(\sum _{k=1}^\infty |f_k||\alpha |^k<|f_0|\) for all \(\alpha \in \mathbb {B}_\delta \) for some \(\delta \in (0,\rho )\). Then, the induction argument shows that the coefficients \(g_k\) of \(f^{-1}\) obtained recursively from the identity \(f\cdot f^{-1}=\textbf{1}\) satisfy inequalities \(|g_k|\le \frac{1}{|f_0|\delta ^{k}}\) for all \(k\ge 0\), and hence, \(f^{-1}\) converges in \({\mathbb {B}}_\delta \).
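The recursion behind Remark 4.1 is the coefficient-by-coefficient solution of \(f\cdot f^{-1}=\textbf{1}\). A short sketch in the commutative complex case (in \({\mathbb {H}}[[z]]\) the factor \(f_0^{-1}\) must stay on the left of the convolution sum), with numpy assumed:

```python
import numpy as np

def inverse_series(f, N):
    """First N coefficients of f^{-1}, from f * f^{-1} = 1 (requires f[0] != 0).
    In the noncommutative quaternion case, 1/f[0] must stay on the left."""
    g = np.zeros(N, dtype=complex)
    g[0] = 1 / f[0]
    for k in range(1, N):
        s = sum(f[j] * g[k - j] for j in range(1, min(k, len(f) - 1) + 1))
        g[k] = -g[0] * s
    return g

# f(z) = 1 - z/2, so f^{-1}(z) = sum_k (z/2)^k
g = inverse_series([1.0, -0.5], 8)
assert np.allclose(g, [0.5 ** k for k in range(8)])
```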
Note that any \(f\in {\mathcal {H}}_\rho \) can be evaluated at any \(\alpha \in {\mathbb {B}}_\rho \) on the left or on the right via (absolutely) converging series
We will write simply \(f(\alpha )\) if \(f^{\varvec{e_\ell }}(\alpha )=f^{\varvec{e_r}}(\alpha )\). This is the case, as is readily seen from (4.1) and (2.5), when \(f\in \mathbb {C}_\alpha [[z]]\); in particular, if \(f\in {\mathbb {R}}[[z]]\) or if \(\alpha \) is real.
The functions \(f^{\varvec{e_\ell }}, f^{\varvec{e_r}}: \, {\mathbb {B}}_\rho \rightarrow {\mathbb {H}}\) induced by a power series \(f\in {\mathcal {H}}_\rho \) are not only continuous on \({\mathbb {B}}_\rho \), but also holomorphic in the following sense: for each pure unit \(\alpha \) (\(\Re \alpha =0\), \(|\alpha |=1\)), the restrictions of \(f^{\varvec{e_\ell }}\) and \(f^{\varvec{e_r}}\) to the plane \({\mathbb {C}}_\alpha =\{x+\alpha y: \, x,y\in {\mathbb {R}}\}\) (more precisely, to the disk \({\mathbb {B}}_\rho \cap {\mathbb {C}}_\alpha \)) satisfy the respective Cauchy–Riemann equations
The latter left and right holomorphies were the starting point of the theory of left- and right-regular (also called slice-regular) functions initiated in [12]; we refer to [13] for a detailed exposition of more recent developments. The main advantage of the holomorphic approach is that the Cauchy–Riemann equations (4.2) define regular functions over general domains. However, if a function \(g: \, {\mathbb {B}}_\rho \rightarrow {\mathbb {H}}\) is regular, it can be expanded in a Taylor series \(\sum g_kz^k\) with \(g_k=g^{(k)}(0)/k!\), which belongs to \({\mathcal {H}}_\rho \). Evaluated on the left or on the right, the latter series brings us back to the original function and to its dual (left or right) counterpart.
4.1 Quaternionic Schur class \({\mathcal {S}}_{{\mathbb {H}}}\)
Quaternionic Schur functions were introduced in [2] as left-regular functions taking values of modulus less than one on the unit ball \({\mathbb {B}}_1\) of \({\mathbb {H}}\); see [4] for a thorough account of the subject. An attempt to capture both left and right settings within a power-series approach was undertaken in [6, Section 3]. Here, we will proceed using different and more explicit arguments.
An element \(\alpha \in {\mathbb {B}}_\rho \) is called a left or right zero of \(f\in {\mathcal {H}}_\rho \) if, respectively, \(f^{\varvec{e_\ell }}(\alpha )=0\) or \(f^{\varvec{e_r}}(\alpha )=0\). If \(V\subset \mathbb {B}_\rho \) is a similarity class, then any power series \(f\in \mathcal H_\rho \) either has no zeros in V, or it has one left and one right zero in V, or \(f^{\varvec{e_\ell }}(\alpha )=f^{\varvec{e_r}}(\alpha )=0\) for all \(\alpha \in V\). This observation goes back to [19] for the polynomial case; the power-series case is similar (see, e.g., [7, §2.2]). Therefore, for every \(f\in {\mathcal {H}}_\rho \) and every similarity class \(V\subset {\mathbb {B}}_\rho \), either \(f^{\varvec{e_\ell }}(\alpha )=c=f^{\varvec{e_r}}(\alpha )\) for all \(\alpha \in V\), or for any \(\alpha \in V\), there is a unique \(\alpha '\in V\) such that \(f^{\varvec{e_r}}(\alpha ')=f^{\varvec{e_\ell }}(\alpha )\). We thus arrive at the following observation.
Remark 4.2
If \(f\in {\mathcal {H}}_\rho \), then the images of a similarity class V under \(f^{\varvec{e_\ell }}\) and \(f^{\varvec{e_r}}\) are equal as sets: \(f^{\varvec{e_\ell }}(V)=f^{\varvec{e_r}}(V)\). Consequently, \(f^{\varvec{e_\ell }}(\mathbb {B}_{\rho '})=f^{\varvec{e_r}}({\mathbb {B}}_{\rho '})\) for any \(\rho '<\rho \).
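The mechanism behind Remark 4.2 can be made concrete for a monomial \(f(z)=cz\): the left value at \(\alpha \) is \(\alpha c=c(c^{-1}\alpha c)\), i.e., the right value at the conjugate point \(c^{-1}\alpha c\), which lies in the same similarity class. The following sketch checks this with quaternions modeled as \(2\times 2\) complex matrices (our own choice of representation; numpy assumed):

```python
import numpy as np

def quat(a, b, c, d):
    """The quaternion a + bi + cj + dk as a 2x2 complex matrix."""
    return np.array([[a + b * 1j, c + d * 1j], [-c + d * 1j, a - b * 1j]])

i, j, k = quat(0, 1, 0, 0), quat(0, 0, 1, 0), quat(0, 0, 0, 1)

def left_eval(coeffs, alpha):
    """f^{e_l}(alpha) = sum alpha^k f_k (powers of alpha on the left)."""
    out, p = np.zeros((2, 2), complex), np.eye(2, dtype=complex)
    for c in coeffs:
        out = out + p @ c
        p = p @ alpha
    return out

def right_eval(coeffs, alpha):
    """f^{e_r}(alpha) = sum f_k alpha^k (powers of alpha on the right)."""
    out, p = np.zeros((2, 2), complex), np.eye(2, dtype=complex)
    for c in coeffs:
        out = out + c @ p
        p = p @ alpha
    return out

f = [np.zeros((2, 2), complex), j]       # f(z) = j z
alpha = i
assert not np.allclose(left_eval(f, alpha), right_eval(f, alpha))   # left != right at alpha
alpha2 = np.linalg.inv(j) @ alpha @ j    # conjugate point in the same similarity class
assert np.allclose(left_eval(f, alpha), right_eval(f, alpha2))      # same image over the class
```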
We now introduce the norm
on \({\mathcal {H}}_1\) (considered as an \({\mathbb {H}}\)-bimodule), and define the Schur class \({\mathcal {S}}_{{\mathbb {H}}}\) to be
By Remark 4.2, \({\displaystyle \max _{\alpha \in \mathbb {B}_\rho }|f^{\varvec{e_\ell }}(\alpha )|}= {\displaystyle \max _{\alpha \in \mathbb {B}_\rho }|f^{\varvec{e_r}}(\alpha )|}\) for any \(f\in {\mathcal {H}}_1\) and \(\rho <1\), from which the second equality in (4.3) follows. The fact that \(\Vert \cdot \Vert _\infty \) is indeed a norm is easily verified. The coefficient characterization of the class \({\mathcal {S}}_{{\mathbb {H}}}\) is very much the same as in the complex case [22].
Theorem 4.3
([1]) A power series \(f\in {{\mathbb {H}}}[[z]]\) belongs to \({\mathcal {S}}_{{\mathbb {H}}}\) if and only if the Toeplitz matrix \(\textbf{T}_n^f\) defined as in (2.18) is contractive (i.e., \({\mathfrak {S}}_n^f=I_n-\textbf{T}_n^f\textbf{T}_n^{f*}\) is positive semidefinite) for all \(n\ge 1\).
Following [3], we define a Blaschke product of degree n to be the power-series product:
where the Blaschke factor \(\textbf{b}_\alpha \) is the power series defined by
It is easy to show (see e.g., [7, Proposition 3.2]) that
and therefore, \(\textbf{b}_{\alpha }\) and, more generally, f of the form (4.4), are in \({\mathcal {S}}_{{\mathbb {H}}}\). In what follows, we will write \({\mathcal {S}}_{{\mathbb {H}},n}\) for the set of all Blaschke products of degree n.
In certain cases, it is more convenient to deal with the power series representation of a finite Blaschke product rather than its factorization (4.4), which is largely non-unique (in general). The next result (see [7, Theorem 5.3] for the proof) is supplementary to Theorem 4.3.
Theorem 4.4
A power series \(f(z)=\sum f_kz^k\in {{\mathbb {H}}}[[z]]\) is a Blaschke product of degree k if and only if the associated matrix \({\mathfrak {S}}_n^f=I_n-\textbf{T}_{n}^f\textbf{T}_{n}^{f*}\) is positive semidefinite and \({\text {rank}}({\mathfrak {S}}_n^f)=\textrm{min} (k,n)\) for all \(n\ge 1\).
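Theorems 4.3 and 4.4 are easy to probe numerically in the classical complex case, where the Taylor coefficients of a degree-2 Blaschke product arise by convolving those of two Blaschke factors. The following sketch (an illustration only, with numpy assumed) checks the contractivity of \(\textbf{T}_n^f\) and the rank condition \({\text {rank}}({\mathfrak {S}}_n^f)=\min (k,n)\):

```python
import numpy as np

def toeplitz_lower(coeffs, n):
    """Lower-triangular Toeplitz matrix T_n^f built from f_0,...,f_{n-1}."""
    T = np.zeros((n, n), dtype=complex)
    for d, c in enumerate(coeffs[:n]):
        T += c * np.diag(np.ones(n - d), -d)
    return T

def blaschke_coeffs(a, N):
    """First N Taylor coefficients of (z - a)/(1 - a z) for real a, |a| < 1."""
    return np.array([-a] + [(1 - a * a) * a ** (j - 1) for j in range(1, N)])

N = 14
# a degree-2 Blaschke product: coefficient convolution of two factors
b = np.convolve(blaschke_coeffs(0.5, N), blaschke_coeffs(-0.3, N))[:N]
for n in range(1, N + 1):
    S = np.eye(n) - toeplitz_lower(b, n) @ toeplitz_lower(b, n).conj().T
    assert np.linalg.eigvalsh(S).min() > -1e-9               # contractive (Theorem 4.3)
    assert np.linalg.matrix_rank(S, tol=1e-8) == min(2, n)   # rank = min(k, n) (Theorem 4.4)
# by contrast, f(z) = 2z already fails the contractivity test for n = 2
S2 = np.eye(2) - toeplitz_lower([0, 2], 2) @ toeplitz_lower([0, 2], 2).conj().T
assert np.linalg.eigvalsh(S2).min() < 0
```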
Remark 4.5
The class \({\mathcal {S}}_{{\mathbb {H}}}\) and finite Blaschke products can be equivalently characterized in terms of positive semidefinite matrices
Indeed, \({\mathfrak {S}}_n^f\) and \({\mathfrak {R}}_n^f\) are both Schur complements of the identity blocks in the partitioned matrix \(\left[ {\begin{matrix} I_n &{} \textbf{T}^{f*}_n \\ \textbf{T}_n^f &{} I_n \end{matrix}}\right] \), and therefore, \(\nu _{\pm }({\mathfrak {S}}_n^f)=\nu _{\pm }({\mathfrak {R}}_n^f)\). Note also that \({\mathfrak {R}}_n^f\) is uniquely recovered from the Stein identity
and where \(\widetilde{\textbf{e}}_n\) is defined as in (2.6).
4.2 The Carathéodory–Schur problem in \({\mathcal {S}}_{{\mathbb {H}}}\)
The problem consists of finding an \(f\in {\mathcal {S}}_{{\mathbb {H}}}\) with prescribed n first coefficients
The solvability and the uniqueness criteria for this problem are given below.
Theorem 4.6
-
(1)
Given \(f_0,\ldots ,f_{n-1}\in {\mathbb {H}}\), there exists an \(f\in {\mathcal {S}}_{{\mathbb {H}}}\) of the form (4.5) if and only if \({\mathfrak {S}}_n:=I_n-\textbf{T}_n^{f}\textbf{T}_n^{f*}\succeq 0\).
-
(2)
Such f is unique if and only if \({\mathfrak {S}}_n\) is singular. In this case, f is a Blaschke product of degree \(\textrm{deg} f=\textrm{rank}({\mathfrak {S}}_n)\).
The latter result appears in [4, Section 10.4] where it is settled using quaternionic de Branges–Rovnyak spaces. A more elementary power-series proof presented here bypasses reproducing-kernel arguments. The necessity in part (1) follows by Theorem 4.3. The rest follows from more detailed Theorems 4.7 and 4.8 below treating indeterminate and determinate cases, respectively.
Theorem 4.7
Let us suppose that \({\mathfrak {S}}_n\succ 0\). Then, the formula
parametrizes all \(f\in {\mathcal {S}}_{{\mathbb {H}}}\) subject to condition (4.5). Furthermore, f of the form (4.6) is a finite Blaschke product if and only if the parameter \(\varepsilon \) is a finite Blaschke product. In this case, \(\deg f=n+\deg \varepsilon \).
Proof
If \(f\in {\mathcal {S}}_{{\mathbb {H}}}\), then \({\mathfrak {S}}^f_{n+k}\succeq 0\) for all \(k\ge 1\), by Theorem 4.3. If in addition, f satisfies (4.5), then f is of the form (3.36) for some \(\varepsilon \in {\mathbb {H}}[[z]]\), such that \({\mathfrak {S}}_k^{\varepsilon }\succeq 0\) for all \(k\ge 1\), i.e., \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\), again by Theorem 4.3.
If f is a Blaschke product of degree \(m>n\), then by Theorem 4.4 combined with (3.48), we have
for all \(k\ge 1\), and hence, \(\varepsilon \) is a Blaschke product of degree \(m-n\), again by Theorem 4.4.
For the converse statement, we first note that since \({\mathfrak {S}}_n\succ 0\), the inequality (3.42) holds, and hence, the linear fractional expressions (4.6) make sense for any \(\varepsilon \in \mathcal S_{{\mathbb {H}}}\) (by Theorem 3.14, part (2)).
Let f be of the form (4.6) for some \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\). Theorem 3.14 guarantees that the first n coefficients of f are equal to the prescribed \(f_0,\ldots ,f_{n-1}\) and equalities (3.43) hold, which in the present case amount to
Hence, \({\mathfrak {S}}^f_{n+k}\succeq 0\) for all \(k\ge 1\) and, therefore, \(f\in {\mathcal {S}}_{{\mathbb {H}}}\). Finally, if \(\varepsilon \) is a Blaschke product of degree r, it follows from (4.7) that f is a Blaschke product of degree \(n+r\). \(\square \)
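In the classical complex case, the parametrization of Theorem 4.7 is realized by the Schur algorithm: the data \(f_0,\ldots ,f_{n-1}\) determine n Schur parameters, and each choice of a Schur-class parameter \(\varepsilon \) completes them to a solution. The sketch below (our own helper names; a complex-case illustration with numpy assumed) extracts the parameters and rebuilds the central solution \(\varepsilon =0\):

```python
import numpy as np

def ser_mul(a, b, N):
    return np.convolve(a, b)[:N]

def ser_inv(a, N):
    g = np.zeros(N, dtype=complex); g[0] = 1 / a[0]
    for k in range(1, N):
        g[k] = -g[0] * sum(a[j] * g[k - j] for j in range(1, min(k, len(a) - 1) + 1))
    return g

def step_down(f, N):
    """Schur step: gamma = f(0) and f_1 = (f - gamma) / (z (1 - conj(gamma) f))."""
    g0 = f[0]
    num = f.copy(); num[0] -= g0
    den = -np.conj(g0) * f; den[0] += 1
    h = ser_mul(num, ser_inv(den, N), N)
    return g0, np.append(h[1:], 0)            # divide by z

def step_up(g, gamma, N):
    """Inverse step: f = (z g + gamma) / (1 + conj(gamma) z g)."""
    zg = np.concatenate(([0], g[:N - 1]))
    num = zg.copy(); num[0] += gamma
    den = np.conj(gamma) * zg; den[0] += 1
    return ser_mul(num, ser_inv(den, N), N)

N, n = 16, 3
target = np.zeros(N, dtype=complex); target[0], target[1] = 0.5, 0.3  # prescribed f_0, f_1, f_2

f, gammas = target.copy(), []
for _ in range(n):                        # peel off n Schur parameters from the data
    g0, f = step_down(f, N)
    gammas.append(g0)
assert all(abs(g0) < 1 for g0 in gammas)  # indeterminate case: strict contractions

sol = np.zeros(N, dtype=complex)          # rebuild with the central parameter eps = 0
for g0 in reversed(gammas):
    sol = step_up(sol, g0, N)
assert np.allclose(sol[:n], target[:n])   # the rebuilt series interpolates the data
```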
We now turn to the singular case. To use the above notation, we consider the problem (4.5) with prescribed \(n+k\) first coefficients \(f_0,\ldots ,f_{n+k-1}\) and assume that
We next let
where \(\begin{bmatrix} a_1&\ldots&a_n\end{bmatrix}=-\begin{bmatrix} f_n&f_{n-1}&\ldots&f_1\end{bmatrix} \textbf{T}_n^{f*}{\mathfrak {S}}_n^{-1}\).
Theorem 4.8
Under the assumptions (4.8), the power series
is a Blaschke product of degree n and is a unique Schur-class power series with the first \(n+k\) coefficients equal to \(f_0,\ldots ,f_{n+k-1}\).
Proof
The construction (4.10) appears in [7]. Since A is a companion matrix, it follows that \(\textbf{e}_n^*A^j=\textbf{e}_n^* Z_n^{*j}\) for \(j=1,\ldots ,n-1\). On the other hand, it is readily seen from (2.6) and (4.9) that \(\textbf{e}_n^* Z_n^{*j}B=f_{j+1}\) for \(j=1,\ldots ,n-1\). Therefore, \(\textbf{e}_n^*A^jB=f_{j+1}\) for \(j=1,\ldots ,n-1\) and hence
Furthermore, it was shown in [7, Theorem 5.4] that A and B defined in (4.9) satisfy the equality
Since \({\mathfrak {S}}_n\) is positive definite, the latter equality tells us that the matrix
is unitary. Since s of the form (4.10) can be realized also as
and since the matrix \(\left[ {\begin{matrix} {\widetilde{A}}&{}{\widetilde{B}}\\ {\widetilde{C}}&{}f_0 \end{matrix}}\right] \) is unitary, it follows by [7, Theorem 3.11] that s is a Blaschke product of degree at most n. Since \({\mathfrak {S}}_n^s={\mathfrak {S}}_n\succ 0\), it follows by Theorem 4.4 that \(\deg s=n\). It remains to show that
By (4.8), the matrix \({\mathfrak {S}}^f_{n+1}\) is singular. By Lemma 3.1 (applied to \({\mathfrak {S}}_n\) rather than \({\mathfrak {C}}_n\)), all further positive semidefinite structured extensions of \({\mathfrak {S}}^f_{n+1}\) are uniquely determined by the elements \(f_0,\ldots ,f_n\) (i.e., by \({\mathfrak {S}}^f_{n+1}\)). On the other hand, since s is a Blaschke product of degree n, the matrix \({\mathfrak {S}}_{n+k}^s\) is positive semidefinite and \(\textrm{rank}({\mathfrak {S}}_{n+k}^s)=n\) for all \(k\ge 1\). Since \(s_j=f_j\) for \(j=0,\ldots ,n\), we have \({\mathfrak {S}}^s_{n+1}={\mathfrak {S}}^f_{n+1}\). Thus, \({\mathfrak {S}}_{n+k}^s\) is another positive semidefinite structured extension of \({\mathfrak {S}}^f_{n+1}\). Since such an extension is unique, \({\mathfrak {S}}_{n+k}^s={\mathfrak {S}}^f_{n+k}\), and (4.12) follows. \(\square \)
Remark 4.9
The latter proof is independent of Theorem 4.4. Actually, it follows from Theorem 4.4 that \(s\in \mathcal {S}_{{\mathbb {H}}}\) subject to condition:
exists and arises via formula (4.6) for some \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) with the unimodular free coefficient \(\varepsilon _0=x_0^{-1}y_0\). Therefore, \(\varepsilon \equiv \varepsilon _0\), and hence
is the unique Schur-class power series subject to (4.13). It belongs to \({\mathcal {S}}_{{{\mathbb {H}}},n}\) by the last statement in Theorem 4.7. Due to the uniqueness, this s is the same as in the formula (4.10).
4.3 Carathéodory approximation theorem
As an application of Theorem 4.7, we now show that any \(f\in {\mathcal {S}}_{{\mathbb {H}}}\) can be uniformly approximated by (quaternion) finite Blaschke products on compact subsets of \(\mathbb {B}_1\). The function-theoretic (rather than power-series) context is crucial here.
Theorem 4.10
Let \(f\in {\mathcal {S}}_{{\mathbb {H}}}\). For any \(\rho <1\) and \(\epsilon >0\), there exists a finite Blaschke product B, such that
for all \(\alpha \in \overline{{\mathbb {B}}}_\rho \).
Proof
We choose n such that \(2\rho ^n<\epsilon \) and assume that \(f(z)=\sum _{j\ge 0}f_jz^j\) is not a finite Blaschke product (since otherwise we can take \(B=f\)), so that the associated matrix \({\mathfrak {S}}^f_n=I-\textbf{T}^f_n\textbf{T}^{f*}_n\) is positive definite. By Theorem 4.7, there is a finite Blaschke product B solving the problem (4.5), i.e., having the same first n coefficients as f. Then, \(g=\frac{1}{2}(f-B)\) is a Schur-class power series with first n coefficients equal to zero, i.e., \(g(z)=z^n h(z)\) for some \(h\in {\mathcal {H}}_1\). Since for any \(m\ge 1\), the Toeplitz matrices (2.18) associated with g and h are related by \(\textbf{T}^g_{n+m}=\left[ {\begin{matrix} 0 &{} 0 \\ \textbf{T}^h_{m} &{} 0 \end{matrix}}\right] \), we have
for all \(m\ge 1\), and hence, \(h\in {\mathcal {S}}_{{\mathbb {H}}}\), by Theorem 4.3. Therefore, for any \(\alpha \in \overline{\mathbb {B}}_\rho \)
which verifies the first inequality in (4.14). The second inequality follows similarly. \(\square \)
The complex-valued counterpart of Theorem 4.10, stating that any Schur function \(f\in {\mathcal {S}}\) can be uniformly approximated by finite Blaschke products on compact subsets of the unit disk \({\mathbb {D}}\subset {\mathbb {C}}\), is due to Carathéodory [10].
4.4 Carathéodory–Fejér extremal problem
It was shown in [11] (with further elaboration in [23]) that the set of all functions analytic on \({\mathbb {D}}\) and with fixed n first Taylor coefficients contains a unique element (a scalar multiple of a finite Blaschke product) with minimally possible \(H^\infty \)-norm. Due to Theorems 4.7 and 4.6, this result extends to the quaternion setting as follows. By rescaling we see that given \(g_0,\ldots ,g_{n-1}\in {\mathbb {H}}\) and \(\lambda >0\), a power series \(g\in {\mathcal {H}}_1\) satisfies
[see (4.10)] if and only if the power series \(f=\lambda ^{-1}g\) solves the problem (4.5) with \(f_j=\lambda ^{-1}g_j\) for \(j=0,\ldots ,n-1\). Then, it follows by Theorem 4.6 that \(g\in {\mathcal {H}}_1\) subject to conditions (4.15) exists if and only if:
or equivalently
The minimally possible \(\lambda >0\) for which the latter inequality holds equals \(\lambda _{\min }:=\sigma _{\textrm{max}}(\textbf{T}_n^g)\), the maximal singular value of the matrix \(\textbf{T}_n^g\). Again, by Theorem 4.6, the unique \(s\in {\mathcal {S}}_{{\mathbb {H}}}\) solving the problem (4.5) with \(f_j=\lambda _{\min }^{-1}g_j\) is a Blaschke product of degree \(k=\textrm{rank}(\lambda _{\min }^2I_n-\textbf{T}_n^g\textbf{T}_n^{g*})\) (explicitly constructed as in Theorem 4.8). Consequently, the minimally possible \(\Vert g\Vert _\infty \) for g subject to the first condition in (4.15) equals \(\lambda _{\min }=\sigma _{\textrm{max}}(\textbf{T}_n^g)\), and \(g=\lambda _{\min }s\) is the unique power series on which this minimum is attained.
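In the complex case, this extremal value is directly computable: after rescaling the coefficients by \(\sigma _{\textrm{max}}(\textbf{T}_n^g)\), the matrix \(I_n-\textbf{T}_n\textbf{T}_n^{*}\) becomes positive semidefinite and singular, which is exactly the determinate case. A minimal numerical sketch (sample data of our own choosing; numpy assumed):

```python
import numpy as np

def toeplitz_lower(coeffs):
    n = len(coeffs)
    T = np.zeros((n, n), dtype=complex)
    for d, c in enumerate(coeffs):
        T += c * np.diag(np.ones(n - d), -d)
    return T

g = [1.0, 0.5 + 0.5j, -0.25]            # sample prescribed coefficients g_0, g_1, g_2
T = toeplitz_lower(g)
lam = np.linalg.norm(T, 2)              # sigma_max(T_n^g): the minimal H-infinity norm
S = np.eye(3) - (T / lam) @ (T / lam).conj().T
w = np.linalg.eigvalsh(S)               # ascending eigenvalues
# PSD with a zero eigenvalue: the rescaled problem is determinate
assert abs(w[0]) < 1e-9 and (w >= -1e-9).all()
```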
5 The Carathéodory class \({\mathcal {C}}_{{\mathbb {H}}}\)
To define the Carathéodory class in the quaternion setting, we recall the evaluation formulas (4.1). Since \(\Re (\alpha \beta )=\Re (\beta \alpha )\) for all \(\alpha ,\beta \in {\mathbb {H}}\), it follows from (4.1) that \(\Re (f^{\varvec{e_\ell }}(\alpha ))=\Re (f^{\varvec{e_r}}(\alpha ))\), which justifies the notation
We now define the Carathéodory class
5.1 Cayley transform
The evaluation functionals \(f\rightarrow f^{\varvec{e_\ell }}(\alpha )\) are right-linear but not multiplicative, in general (unless \(\alpha \) is real). By (2.2) and (4.1), for any \(f,g\in {\mathcal {H}}_\rho \) and any \(\alpha \in {\mathbb {B}}_\rho \), we have
from which it follows that:
In case f has no zeros in \({\mathbb {B}}_\rho \), formula (5.1) suggests introducing the transformation
which turns out to be a bijection (even a homeomorphism) on \(\mathbb {B}_\rho \) preserving all similarity classes; see [13, §5.5]. Indeed, letting \(f^\sharp \) denote the power-series conjugate of f
and observing that \(ff^\sharp \in {\mathbb {R}}[[z]]\) and, therefore, \(ff^\sharp (\alpha )\) commutes with \(\alpha \) for all \(\alpha \in {\mathbb {B}}_\rho \), we apply the formula (5.1) (with \(g=f^\sharp \)) to get
which shows that the map \(\Upsilon _{f,{\varvec{\ell }}}\) is invertible with the inverse equal to \(\Upsilon _{f^\sharp ,{\varvec{\ell }}}\).
We now recall the Cayley transform connecting the classes \(\mathcal C_{{\mathbb {H}}}\) and \({\mathcal {S}}_{{\mathbb {H}}}\). Let \(\textbf{1}\) denote the power series with the free coefficient equal one and all other coefficients equal zero. If \(f\in {\mathcal {C}}_{{\mathbb {H}}}\), then the series \(\textbf{1}+f\) has non-zero left and right values over \(\mathbb {B}_1\), and hence its formal inverse belongs to \({\mathcal {H}}_1\).
Remark 5.1
The Cayley transform
establishes a one-to-one correspondence between \({\mathcal {C}}_{\mathbb {H}}\) and \({\mathcal {S}}_{{\mathbb {H}}}\backslash \{\textbf{1}\}\).
Proof
Making use of formula (5.1) and notation (5.2), we evaluate power-series equality \((\textbf{1}+g)f=g-\textbf{1}\) on the left to get
which, in turn, implies
Since \(g\in {\mathcal {C}}_{{\mathbb {H}}}\), the expression on the right side is nonnegative. Since \(\Upsilon _{\textbf{1}+g,{\varvec{\ell }}}\) is a bijection on \({\mathbb {B}}_1\), we conclude from (5.6) that \(|f^{\varvec{e_\ell }}(\beta )|\le 1\) for all \(\beta \in {\mathbb {B}}_1\) and, hence, \(f\in {\mathcal {S}}_{{\mathbb {H}}}\). It is clear from (5.5) that \(f^{\varvec{e_\ell }}(\Upsilon _{\textbf{1}+g,{\varvec{\ell }}}(\alpha ))\ne 1\) and, therefore, \(f\ne \textbf{1}\). Finally, if \(f\in {\mathcal {S}}_{\mathbb {H}}\backslash \{\textbf{1}\}\), then the formal inverse of \(\textbf{1}-f\) exists; the power series \(g={\mathfrak {T}}^{-1}[f]=(\textbf{1}+f)(\textbf{1}-f)^{-1}\) (the inverse Cayley transform of f) belongs to \({\mathcal {H}}_1\) and satisfies equality (5.6) for all \(\alpha \in {\mathbb {B}}_1\). Since the left side in (5.6) is now nonnegative, we conclude that \(g\in {\mathcal {C}}_{{\mathbb {H}}}\). \(\square \)
As an application of Remark 5.1, we characterize Carathéodory-class power series in terms of their coefficients.
Theorem 5.2
The power series \(g(z)=\sum _{k\ge 0} g_k z^k\) belongs to the Carathéodory class \({\mathcal {C}}_{{\mathbb {H}}}\) if and only if
for all \(n\ge 1\), where \(\textbf{T}^g_n\) is the Toeplitz matrix defined as in (2.18).
Proof
Let \(f={\mathfrak {T}}[g]\). Applying formulas (2.20) to the power-series equality (5.4), we see that the Toeplitz matrices \(\textbf{T}_n^g\) and \(\textbf{T}_n^f\) associated with g and f via formula (2.18) are related by
Then, we have
Since \(g\in {\mathcal {C}}_{{\mathbb {H}}}\Leftrightarrow f\in \mathcal {S}_{{\mathbb {H}}}\) (by Remark 5.1) \(\Leftrightarrow I_n-\textbf{T}_n^f\textbf{T}_n^{f*}\succeq 0\) for all \(n\ge 1\) (by Theorem 4.3) \(\Leftrightarrow \) (5.7) (by (5.9)), the statement follows. \(\square \)
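For \(g(z)=(1+z)/(1-z)\) (the complex half-plane map, with Taylor coefficients \(1,2,2,\ldots \)) the Cayley transform is \(f(z)=z\), and both matrix criteria, that of Theorem 5.2 for g and that of Theorem 4.3 for f, can be verified at once. A sketch in the commutative complex case (numpy assumed):

```python
import numpy as np

def ser_inv(a, N):
    g = np.zeros(N, dtype=complex); g[0] = 1 / a[0]
    for k in range(1, N):
        g[k] = -g[0] * sum(a[j] * g[k - j] for j in range(1, min(k, len(a) - 1) + 1))
    return g

def toeplitz_lower(coeffs, n):
    T = np.zeros((n, n), dtype=complex)
    for d, c in enumerate(coeffs[:n]):
        T += c * np.diag(np.ones(n - d), -d)
    return T

N = 10
g = np.array([1.0] + [2.0] * (N - 1), dtype=complex)  # g(z) = (1+z)/(1-z): coefficients 1, 2, 2, ...
gp1 = g.copy(); gp1[0] += 1                            # 1 + g
gm1 = g.copy(); gm1[0] -= 1                            # g - 1
f = np.convolve(ser_inv(gp1, N), gm1)[:N]              # Cayley transform f = (1+g)^{-1}(g-1)
assert np.allclose(f[1], 1) and np.allclose(np.delete(f, 1), 0)   # indeed f(z) = z
for n in range(1, N + 1):
    Tg, Tf = toeplitz_lower(g, n), toeplitz_lower(f, n)
    assert np.linalg.eigvalsh(Tg + Tg.conj().T).min() > -1e-9              # Caratheodory test for g
    assert np.linalg.eigvalsh(np.eye(n) - Tf @ Tf.conj().T).min() > -1e-9  # Schur test for f
```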
Remark 5.3
If \(g\in {\mathcal {C}}_{{\mathbb {H}}}\) and \(f\in {\mathcal {S}}_{\mathbb {H}}\) are such that \(f={\mathfrak {T}}[g]\), then
The latter equality follows from (5.9), since the matrix \(I+\textbf{T}_n^g\) is lower triangular and invertible.
Definition 5.4
We denote by \({\mathcal {C}}_{{{\mathbb {H}}},n}\) the class of all \(g\in {\mathbb {H}}[[z]]\), such that
Equivalently, \({\mathcal {C}}_{{{\mathbb {H}}},n}\) is the set of quaternion power series whose Cayley transform is a Blaschke product of degree n
The latter equivalence follows from Theorem 4.4 and Remark 5.3. As \({\mathcal {S}}_{{{\mathbb {H}}},0}\) is identified with the unit sphere of \({\mathbb {H}}\), we identify \(\mathcal C_{{{\mathbb {H}}},0}\) with the space of all pure quaternions.
5.2 The Carathéodory–Schur problem in \({\mathcal {C}}_{{\mathbb {H}}}\)
The problem of finding a power series
with preassigned first n coefficients is equivalent to the problem (4.5) in the following sense: g solves the problem (5.10) if and only if its Cayley transform (5.4) solves the problem (4.5) with \(f_0,\ldots ,f_{n-1}\) determined from \(g_0,\ldots ,g_{n-1}\) via the formula (5.8), or equivalently, by
To complete the story, it remains to spell out Theorems 4.6, 4.7, and 4.8 in terms of \(g_0,\ldots ,g_{n-1}\). The details are given below.
Theorem 5.5
Given \(g_0,\ldots ,g_{n-1}\in {\mathbb {H}}\), there exists \(g\in \mathcal C_{{\mathbb {H}}}\) of the form (5.10) if and only if \({\mathfrak {C}}_n:=\textbf{T}_n^g+\textbf{T}_n^{g*}\succeq 0\). Such g is unique if and only if \({\mathfrak {C}}_n\) is singular, in which case \(g\in {\mathcal {C}}_{{{\mathbb {H}}},d}\), where \(d=\textrm{rank} ({\mathfrak {C}}_n)\).
Proof
For \(f_0,\ldots ,f_{n-1}\) defined as in (5.11), the matrices \({\mathfrak {S}}_n=I_n-\textbf{T}_n^f\textbf{T}_n^{f*}\) and \({\mathfrak {C}}_n=\textbf{T}_n^g+\textbf{T}_n^{g*}\) are related as in (5.9), and hence, \({\mathfrak {S}}_n\succeq 0\) if and only if \({\mathfrak {C}}_n\succeq 0\), and the first statement follows by Theorem 4.6. The problem (5.10) has a unique solution g if and only if \(f={\mathfrak {T}}[g]\) is a unique solution of the associated problem (4.5), which is the case if and only if \({\mathfrak {S}}_n\) is singular or, equivalently, \({\mathfrak {C}}_n\) is singular. Since \(\textrm{rank} \, {\mathfrak {S}}_n=\textrm{rank} \, {\mathfrak {C}}_n\) (by Remark 5.3) and f is a Blaschke product of degree \(d=\textrm{rank} \, {\mathfrak {S}}_n\) (by Theorem 4.6), it follows that \(g\in {\mathcal {C}}_{{{\mathbb {H}}},d}\). \(\square \)
For the next theorem, we let \(\overline{{\mathcal {C}}}_{\mathbb {H}}:={\mathcal {C}}_{{\mathbb {H}}}\cup \{\infty \}\) and we assign \(\infty \) to the class \({\mathcal {C}}_{{{\mathbb {H}}},0}\). With this convention, the Cayley transform (5.4) extends to a bijection from \(\overline{{\mathcal {C}}}_{{\mathbb {H}}}\) to \({\mathcal {S}}_{{\mathbb {H}}}\).
Theorem 5.6
Given \(g_0,\ldots , g_{n-1}\), let us suppose that \({\mathfrak {C}}_n=\textbf{T}_n^g+\textbf{T}_n^{g*}\succ 0\) and let us introduce the \(2\times 2\)-matrix polynomial
where \(Z_n\) and \(\textbf{e}_n\) are defined in (2.6) and where
Then, the formula
establishes a bijection between \(\overline{{\mathcal {C}}}_{{\mathbb {H}}}\) and the set of all \(g\in {\mathcal {C}}_{{\mathbb {H}}}\) subject to condition (5.10). Moreover, \(\varphi \in {\mathcal {C}}_{{{\mathbb {H}}},k}\) if and only if \(g\in {\mathcal {C}}_{{{\mathbb {H}}},n+k}\).
Proof
If we use \(f_0,\ldots ,f_{n-1}\) from (5.11) to construct the polynomial \(\Theta \) as in (3.17), then by Theorem 4.7 and the discussion preceding Theorem 5.5, the formula (4.6) written as
parametrizes all solutions f of the problem (4.5) associated with the problem (5.10). Since (5.4) is a linear fractional transform based on the matrix \(\left[ {\begin{matrix} 1 &{}-1\\ 1&{} 1 \end{matrix}}\right] \), we can take the superposition of three linear fractional maps to recover g from (5.15) by formula (5.14) with
The latter formula defines the same polynomial as in (5.12). Indeed, substituting (3.17) into (5.16) gives
By (5.9) and since \(\textbf{T}_n^{f}\) commutes with \(Z_n\), we have
Plugging in the latter equality into (5.17), making use of equalities
which follow directly from (5.8), and taking into account (5.13), we arrive at
which is the same as (5.12). The last statement follows from Theorem 4.7 and the definition of the class \(\mathcal {C}_{{{\mathbb {H}}},n}\). \(\square \)
We next assume that \(g_0,\ldots ,g_n\in {\mathbb {H}}\) are such that
and let
where \(\widetilde{\textbf{e}}_n\) is given in (2.6). Note that \(A'\) is a companion matrix as in (4.9) but with the bottom row equal \(\begin{bmatrix}g_n&\ldots&g_1\end{bmatrix}{\mathfrak {C}}_n^{-1}\).
Theorem 5.7
Under the assumptions (5.18), the power series
belongs to \({\mathcal {C}}_{{{\mathbb {H}}},n}\) and is a unique Carathéodory-class power series with the first \(n+1\) coefficients equal to \(g_0,\ldots ,g_{n}\).
Proof
We use the given \(g_0,\ldots ,g_n\) to define the elements \(f_0,\ldots ,f_n\) by the formula (5.11) with n replaced by \(n+1\). Then, we introduce the matrices \(\textbf{T}^f_{n+1}\) and \({\mathfrak {S}}_{n+1}=I-\textbf{T}^f_{n+1}{} \textbf{T}^{f*}_{n+1}\). Since \(\textrm{rank}({\mathfrak {S}}_n)=\textrm{rank}({\mathfrak {S}}_{n+1})=n\), by (5.18) and Remark 5.3, there is a unique \(s\in {\mathcal {S}}_{{\mathbb {H}}}\) subject to condition (4.13), which is a Blaschke product of degree n and is given by the realization formula (4.10). Its (inverse) Cayley transform \(h={\mathfrak {T}}^{-1}[s]=(1-s)^{-1}(1+s)\) has all the desired properties. It remains to verify that \(h={\mathfrak {T}}^{-1}[s]\) can be written in the form (5.19).
Toward this end, note that for s defined as in (4.10), we have
from which we get
It remains to express the right side in terms of the original \(g_0,\ldots ,g_n\). We first use the equalities
(the first follows from relation (5.8) with \(n+1\) instead of n, and the second is immediate, since \((1+g_0)(1-f_0)=2\)) to write (5.20) as
where
It remains to show that \(U=A'\). To this end, we write the companion matrix A in (4.9) as
(\(\widetilde{\textbf{e}}_n\) is defined in (2.6)) and then make substitutions (5.8), (5.9) and
to write A in terms of \(g_0,\ldots ,g_n\) as
Substituting the latter expression into (5.21) results in
By direct inspection, one can see that
which allows us to write (5.22) as
where the last equality holds since \({\mathfrak {C}}_n-\textbf{T}_n^{g*}+I=\textbf{T}_n^g+I\). Therefore, \(U=A^\prime \), and the proof is complete. \(\square \)
The current section is included for convenience of future reference. For example, Theorem 5.6 leads to a Herglotz-type representation theorem for left- and right-regular functions generated by a Carathéodory-class power series, which in turn leads to a quite meaningful quaternionic version of the trigonometric moment problem and, subsequently, to the spectral theorem for quaternionic unitary operators that is very similar to the classical complex-valued one. These topics will be elaborated in a separate publication. In the last section, we get back to the Schur-class setting and discuss its indefinite generalization.
6 The generalized Schur class \({\mathcal {S}}^\kappa _{{\mathbb {H}}}\)
Following the classical case [18], we say that a power series \(f\in {\mathbb {H}}[[z]]\) belongs to the generalized Schur class \({\mathcal {S}}^\kappa _{{\mathbb {H}}}\) if there exists an integer \(n_0\ge 0\), such that the Hermitian matrices \({\mathfrak {S}}^f_n=I-\textbf{T}_n^{f}{} \textbf{T}_n^{f*}\) have \(\kappa \) negative eigenvalues counted with multiplicities:
Similarly, a power series \(g\in {\mathbb {H}}[[z]]\) belongs to the generalized Carathéodory class \({\mathcal {C}}^\kappa _{{\mathbb {H}}}\) if the above inequalities hold for the matrices \({\mathfrak {C}}^g_n=\textbf{T}_n^g+\textbf{T}_n^{g*}\) rather than \({\mathfrak {S}}_n\). In the complex setting, this class appeared in [16].
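A simple complex-case example of a generalized Schur function is the reciprocal of a Blaschke factor, \(f(z)=(1-az)/(z-a)\) with \(0<a<1\): it has a pole inside the disk, and one checks numerically that \(\nu _-({\mathfrak {S}}_n^f)=1\) for every n, so \(f\in {\mathcal {S}}^1\). A sketch of this computation (our own construction; numpy assumed):

```python
import numpy as np

def toeplitz_lower(coeffs, n):
    T = np.zeros((n, n), dtype=complex)
    for d, c in enumerate(coeffs[:n]):
        T += c * np.diag(np.ones(n - d), -d)
    return T

a, N = 0.5, 10
geom = np.array([-(1 / a) * a ** (-k) for k in range(N)])  # 1/(z-a) as a power series at 0
f = np.convolve(np.array([1.0, -a]), geom)[:N]             # f(z) = (1 - a z)/(z - a)
for n in range(1, N + 1):
    T = toeplitz_lower(f, n)
    w = np.linalg.eigvalsh(np.eye(n) - T @ T.conj().T)
    tol = 1e-8 * max(1.0, np.abs(w).max())                 # relative tolerance: entries grow fast
    assert (w < -tol).sum() == 1     # exactly one negative eigenvalue for every n
```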
6.1 The indefinite Carathéodory–Schur problem
The problem consists of finding an \(f\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\) with prescribed n first coefficients \(f_0,\ldots ,f_{n-1}\)
with minimally possible \(\kappa \) (which cannot be less than \(\nu _-({\mathfrak {S}}_n)\), by the eigenvalue interlacing theorem). If the matrix \({\mathfrak {S}}_n\) is invertible, then the minimal \(\kappa \) equals \(\nu _-({\mathfrak {S}}_n)\), as the next result shows.
Theorem 6.1
Let us suppose that \({\mathfrak {S}}_n\) is invertible, let \(\kappa :=\nu _-({\mathfrak {S}}_n)\), and let \(\Psi \) and \(\Theta \) be the polynomials defined as in (3.16), (3.17). Then, the formula
with free parameter \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) subject to conditions (3.41) parametrizes all \(f\in \mathcal S^\kappa _{{\mathbb {H}}}\) satisfying (6.1).
Proof
Let us assume that f belongs to \({\mathcal {S}}^\kappa _{{\mathbb {H}}}\) (i.e., \(\nu _-({\mathfrak {S}}^f_{n+k})=\nu _-({\mathfrak {S}}_n)=\kappa \) for all \(k\ge 1\)) and satisfies (6.1). Then, by Theorem 3.14, f is of the form (3.44) for some \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) subject to conditions (3.41).
Conversely, let f be of the form (4.6) for some \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) subject to conditions (3.41). Again, by Theorem 3.14, the first n coefficients of f are equal to the prescribed \(f_0,\ldots ,f_{n-1}\) and equalities (3.43) hold. Thus, \(\nu _-({\mathfrak {S}}^f_{n+k})=\nu _-({\mathfrak {S}}_n)=\kappa \) for all \(k\ge 1\), and hence, \(f\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\). \(\square \)
The singular case relies on Theorem 3.5 in the \({\mathfrak {S}}_n\)-setting.
Theorem 6.2
Let us suppose that \({\mathfrak {S}}_n\) is singular, \(\textrm{rank}({\mathfrak {S}}_n)=d<n\), and let \({\mathfrak {S}}_r\) (\(r<n\)) be the maximal invertible leading submatrix of \({\mathfrak {S}}_n\).
1. If \(d=r\) (i.e., \(\textrm{rank}({\mathfrak {S}}_n)=\textrm{rank}({\mathfrak {S}}_r)\)), then the formula (4.10) (with n replaced by r) defines a unique \(s\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\) (\(\kappa =\nu _-({\mathfrak {S}}_n)=\nu _-({\mathfrak {S}}_r)\)) with initial coefficients \(f_0,\ldots ,f_{n-1}\).
2. If \(d>r\), then the minimally possible \(\kappa \) equals
$$\begin{aligned} \kappa =\nu _-({\mathfrak {S}}_n)+n-d=\nu _-({\mathfrak {S}}_n)+\nu _0({\mathfrak {S}}_n), \end{aligned}$$ (6.3)

where \(\nu _0\) stands for the multiplicity of the zero eigenvalue.
Proof
It was verified in the proof of Theorem 4.8 that the first \(r+1\) coefficients of s of form (4.10) (with r instead of n) are equal to \(f_0,\ldots , f_r\). Therefore, \({\mathfrak {S}}_r^s={\mathfrak {S}}_r\) and \({\mathfrak {S}}_{r+1}^s={\mathfrak {S}}_{r+1}\). We next show that for all \(m>r\)
Since the matrices on both sides of (6.4) are Hermitian, it suffices to verify that the corresponding entries on and below the main diagonal are equal, that is
Using the formulas \(s_j=\textbf{e}_r^*A^{j-1}B\) for \(j\ge 1\) and equalities
which follow from (4.11), we transform the left-side expressions in (6.5):
confirming equalities (6.5) and hence (6.4). It follows from (6.4) that:
Since \({\mathfrak {S}}_r^s={\mathfrak {S}}_r\), we actually have \(\textrm{rank}({\mathfrak {S}}_m^s)=\textrm{rank}({\mathfrak {S}}_r)\) and, therefore, \(\nu _-({\mathfrak {S}}_m^s)=\nu _-({\mathfrak {S}}_r)=\kappa \) for all \(m>r\). Therefore, \(s\in {\mathcal {S}}_{{\mathbb {H}}}^\kappa \). Since both \({\mathfrak {S}}_n\) and \({\mathfrak {S}}_n^s\) are structured extensions of \({\mathfrak {S}}_r\) with \(\nu _-({\mathfrak {S}}_n)=\nu _-({\mathfrak {S}}_n^s)=\nu _-({\mathfrak {S}}_r)\), they are equal by Lemma 3.1, and hence, \(s_j=f_j\) for \(j=1,\ldots ,n-1\).
To prove (2), we recall that for any choice of \(f_n,\ldots , f_{2n-d}\), the matrix \({\mathfrak {S}}_{2n-d}\) is invertible and has \(\nu _-({\mathfrak {S}}_r)+n-d\) negative eigenvalues. Therefore, \(\kappa \) defined in (6.3) is minimally possible. To get all \(f\in \mathcal S_{{\mathbb {H}}}^\kappa \) with the first n coefficients equal to \(f_0,\ldots ,f_{n-1}\), we first choose an arbitrary tuple \(\textbf{f}=(f_n,\ldots ,f_{2n-d})\) and then apply the linear fractional formula (6.2) with the polynomial \(\Psi _\textbf{f}\) constructed via formula (3.16) but based on given \(f_0,\ldots ,f_{2n-d}\); the formula (3.16) makes sense, since the matrix \({\mathfrak {S}}_{2n-d}\) is invertible. \(\square \)
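The count (6.3) combines the negative and zero eigenvalues of \({\mathfrak {S}}_n\). As a toy illustration of the inertia bookkeeping (the matrix below is a generic Hermitian stand-in, not an actual \({\mathfrak {S}}_n\); the helper name is ours):

```python
import numpy as np

def inertia(H, tol=1e-10):
    """(nu_-, nu_0, nu_+) of a Hermitian matrix, eigenvalues counted with multiplicity."""
    w = np.linalg.eigvalsh(H)
    return (int(np.sum(w < -tol)),
            int(np.sum(np.abs(w) <= tol)),
            int(np.sum(w > tol)))

# a stand-in Hermitian matrix with nu_- = 1 and nu_0 = 2
H = np.diag([-2.0, 0.0, 0.0, 3.0])
nm, nz, npos = inertia(H)
print(nm + nz)   # minimal kappa per (6.3): nu_-(H) + nu_0(H) -> 3
```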
6.2 Regular meromorphic functions associated with \({\mathcal {S}}^\kappa _{{\mathbb {H}}}\)
To put the class \({\mathcal {S}}^\kappa _{{\mathbb {H}}}\) in the function-theoretic context, we need the power series in this class to converge absolutely in a neighborhood of the origin.
Theorem 6.3
Any power series \(f\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\) converges absolutely in a neighborhood of the origin.
Proof
For any \(f\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\), there is \(n\ge \kappa \), such that
Indeed, assuming that such n does not exist, we conclude that in particular, the matrix \({\mathfrak {S}}^f_\kappa \) has \(\kappa \) negative eigenvalues and is singular, which is not possible. We next fix such an n, define the polynomial \(\Theta \) as in (3.17) and conclude by Theorem 6.1 that f admits a representation (6.2) for some \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\), such that \(\theta _{21,0}\varepsilon _0+\theta _{22,0}\ne 0\). Therefore, the power series \(\theta _{21}\varepsilon +\theta _{22}\) has no zeros in a neighborhood of the origin and, therefore, f (of the form (6.2)) converges absolutely in this neighborhood. \(\square \)
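The last step of the proof uses that a power series with non-zero free coefficient is invertible in \({\mathbb {H}}[[z]]\); the inverse can be computed coefficient by coefficient. A commutative (complex/real) sketch of this recursion — over \({\mathbb {H}}\) the analogous recursion produces a one-sided inverse, which we do not model here:

```python
import numpy as np

def series_inverse(a, N):
    """First N coefficients of the inverse of a power series with a[0] != 0:
    b_0 = 1/a_0,  b_m = -(1/a_0) * sum_{j=1..m} a_j b_{m-j}."""
    b = [1.0 / a[0]]
    for m in range(1, N):
        s = sum(a[j] * b[m - j] for j in range(1, min(m, len(a) - 1) + 1))
        b.append(-s / a[0])
    return b

a = [2.0, 1.0, 3.0]      # a_0 != 0, so a is invertible in the ring of power series
b = series_inverse(a, 5)
# check: the convolution a*b has coefficients 1, 0, 0, 0, 0
conv = [sum(a[j] * b[m - j] for j in range(0, min(m, len(a) - 1) + 1))
        for m in range(5)]
print(np.allclose(conv, [1, 0, 0, 0, 0]))   # -> True
```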
Thus, the power series \(f\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\) can be left and right evaluated in a neighborhood of the origin, giving rise to left- and right-regular functions \(f^{\varvec{e_\ell }}\) and \(f^{\varvec{e_r}}\). Further elaboration comes from the Krein–Langer type factorization result: for any \(f\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\), there exist Schur-class power series \(S_L\), \(S_R\) and Blaschke products \(B_L\) and \(B_R\) of degree \(\kappa \) so that f admits the following power-series factorizations:
Furthermore, \(B_LB_L^\sharp =B_RB_R^\sharp \) (where \(B^\sharp \) is defined via formula (5.3)). If we denote by \({\mathcal {Z}}\) the zero set of the real Blaschke product \({\widetilde{B}}:=B_LB_L^\sharp =B_RB_R^\sharp \), then the functions \(f^{\varvec{e_\ell }}\) and \(f^{\varvec{e_r}}\) admit meromorphic (semi-regular) extensions to \({\mathbb {B}}\backslash {\mathcal {Z}}\) by the formulas
We refer to [3] and [4] for further details. Note that left-regular generalized Schur functions considered in [3, 4] are slightly more general than the ones arising from \(f\in {\mathcal {S}}^\kappa _{{\mathbb {H}}}\), as they are allowed to have a pole at the origin.
6.3 Excluded parameters
We now get back to the parametrization formula (6.2) and focus on parameters \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) that do not satisfy conditions (3.41); we will refer to them as excluded parameters. In what follows, we will use the notation:
for the left linear fractional transformation based on the matrix polynomial \(\Psi \). As a consequence of part (2) in Theorem 3.14, we get the following result.
Remark 6.4
An excluded parameter of the linear fractional transformation (6.7) exists if and only if \(\widetilde{\textbf{e}}^*_n{\mathfrak {S}}_n^{-1}\widetilde{\textbf{e}}_n\le 0\).
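The condition of Remark 6.4 (and its order-k refinement in Proposition 6.6 below) amounts to a sign test on a bottom corner block of \({\mathfrak {S}}_n^{-1}\). A minimal numerical sketch, with a generic invertible Hermitian matrix standing in for \({\mathfrak {S}}_n\) (the helper name is ours):

```python
import numpy as np

def has_excluded_parameter(Sn, k=1, tol=1e-10):
    """True iff the k x k bottom principal submatrix of Sn^{-1} is
    negative semidefinite (k = 1 reproduces the condition of Remark 6.4)."""
    B = np.linalg.inv(Sn)[-k:, -k:]
    return bool(np.all(np.linalg.eigvalsh(B) <= tol))

S = np.diag([1.0, -1.0])              # invertible Hermitian stand-in for S_n
print(has_excluded_parameter(S, 1))   # bottom entry of S^{-1} is -1  -> True
print(has_excluded_parameter(S, 2))   # eigenvalues of S^{-1} are {1,-1} -> False
```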
The notion of an excluded parameter was introduced in [14] in the context of the Nevanlinna–Pick interpolation problem (with no derivatives involved in the interpolation conditions) for generalized Schur functions. In the setting of the Carathéodory–Schur problem (see [5] for the complex case), the power series \(\psi _{11}-\varepsilon \psi _{21}\) may have a multiple zero at the origin which suggests a more detailed classification of excluded parameters.
Definition 6.5
We will say that \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) is an excluded parameter of order at least k of the linear fractional transformation (6.7) if
and is an excluded parameter of order k if, in addition, \(h_0=h(0)\ne 0\).
The next criterion is an extension of Remark 6.4.
Proposition 6.6
There exists an excluded parameter \(\varepsilon \in \mathcal S_{{\mathbb {H}}}\) of order at least \(k\le n\) of the transformation (6.7) if and only if the \(k\times k\) bottom principal submatrix of \({\mathfrak {S}}_n^{-1}\) is negative semidefinite.
Proof
By Remark 3.9, at least one of the elements \(\psi _{11,0}\) and \(\psi _{21,0}\) is non-zero. Therefore, the condition (6.8) implies in particular that \(\psi _{21,0}\ne 0\) and hence, \(\psi _{21}\) is invertible in \({\mathbb {H}}[[z]]\). Writing (6.8) in terms of associated Toeplitz matrices as \(\textbf{T}_k^{\varepsilon }{} \textbf{T}_k^{\psi _{21}}=\textbf{T}_k^{\psi _{11}}\), we conclude by Remark 4.5 that (6.8) holds for some \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) if and only if the following matrix \(R_k\) is positive semidefinite:
The desired statement will follow once we show that
Toward this end, we first observe that \(R_k\) satisfies (and is uniquely recovered from) the Stein identity
where \(M_k^*\) and \(N_k^*\) are the bottom rows of the matrices \(\textbf{T}_k^{\psi _{21}}\) and \(\textbf{T}_k^{\psi _{11}}\)
Indeed, since \(Z_k\textbf{T}_k^{\psi _{21}}=\textbf{T}_k^{\psi _{21}}Z_k\) and \(I-Z_k^*Z_k=\widetilde{\textbf{e}}_k\widetilde{\textbf{e}}_k^*\), we have
One can see from (3.16) that for \(k=n\), the formulas (6.12) can be written as
which together with (3.23) implies
Upon comparing the \(k\times k\) bottom principal blocks in (6.14), we see that the \(k\times k\) bottom principal submatrix of \({\mathfrak {S}}_n^{-1}\) satisfies the same Stein equation (6.11) as the matrix \(-R_k\). By the uniqueness of the solution, we have equality (6.10) which completes the proof. \(\square \)
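The uniqueness step above uses that a Stein equation with the nilpotent shift \(Z_k\) has exactly one solution, obtained as a finite sum \(R=\sum _{j=0}^{k-1}Z_k^{*j}CZ_k^{j}\). A sketch with a generic Hermitian right-hand side C standing in for the actual right side of (6.11), which is not reproduced here:

```python
import numpy as np

k = 4
Z = np.diag(np.ones(k - 1), -1)   # lower shift; nilpotent, Z^k = 0

def stein_solve(C):
    """Unique R with R - Z.T @ R @ Z = C; the sum is finite since Z is nilpotent."""
    R = np.zeros_like(C)
    M = np.eye(k)                 # runs through Z^0, Z^1, ..., Z^{k-1}
    for _ in range(k):
        R += M.T @ C @ M
        M = M @ Z
    return R

rng = np.random.default_rng(0)
C = rng.standard_normal((k, k))
C = C + C.T                       # Hermitian right-hand side
R = stein_solve(C)
print(np.allclose(R - Z.T @ R @ Z, C))   # -> True
```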
The next statement shows, in particular, that the assumption \(k\le n\) in Proposition 6.6 is not restrictive.
Proposition 6.7
Let \(\varepsilon \) be an excluded parameter of the transformation (6.7) of order at least k, i.e., let us assume that (6.8) holds. Then, \(k\le n\) and
Therefore, the formula \(f_\varepsilon =\textbf{L}_\Psi [\varepsilon ]\) defines a power series \(f_\varepsilon \in {\mathbb {H}}[[z]]\).
Proof
By (3.16), the leading coefficients of \(\psi _{11}\) and \(\psi _{21}\) are given by
Then, it follows by (3.23) that:
Similar computations show that for \(j=0,\ldots ,n-1\)
If we now consider the matrix \(R_{n+1}\) defined as in (6.9), then it follows from the latter computations that its leading entry is equal to:
If the condition (6.8) holds for \(k>n\), then the matrix \(R_{n+1}\) is positive semidefinite (by (6.9)), which is not the case due to (6.18).
To prove (6.15), we first observe from (3.16), (6.12) and (2.18) that
for any \(k=1,\ldots ,n\), which implies the equalities
Due to condition (6.8), we have \(\textbf{T}_k^{\varepsilon }{} \textbf{T}_k^{\psi _{21}}=\textbf{T}_k^{\psi _{11}}\), and hence
which is equivalent to (6.15). \(\square \)
We finally present a refinement of Proposition 6.6.
Proposition 6.8
There exists an excluded parameter \(\varepsilon \in \mathcal S_{{\mathbb {H}}}\) of order k of the transformation (6.7) if and only if the \(k\times k\) bottom principal submatrix of \({\mathfrak {S}}_n^{-1}\) (i.e., the matrix \(-R_k\) defined in (6.10)) is either
(1) negative definite, in which case there are infinitely many excluded parameters of order k, or
(2) the maximal negative semidefinite bottom principal submatrix of \({\mathfrak {S}}_n^{-1}\), in which case there is a unique excluded parameter \(\varepsilon \) of order k.
Proof
Equality (6.8) specifies (in terms of \(\psi _{11,j}\) and \(\psi _{21,j}\)) the first k coefficients of the unknown \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\). As was pointed out above, equality (6.8) holds for some \(\varepsilon \in \mathcal S_{{\mathbb {H}}}\) if and only if the matrix \(R_k\) is positive semidefinite. We next apply the results from Section 4. If \(R_k\) is singular, there is a unique \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) subject to condition (6.8), and this unique \(\varepsilon \) is a Blaschke product of degree \(\deg \varepsilon =\textrm{rank}(R_k)\). Moreover, the order of this excluded parameter is less than \(k+1\) if and only if the matrix \(R_{k+1}\) is not positive semidefinite.
If \(R_k\succ 0\), then the set of all \(\varepsilon \in \mathcal S_{{\mathbb {H}}}\) subject to condition (6.8) is parametrized by a linear fractional formula
with polynomial coefficients and a free parameter \(\sigma \in \mathcal S_{{\mathbb {H}}}\). The first k coefficients \(\varepsilon _0,\ldots , \varepsilon _{k-1}\) of each \(\varepsilon \) of the form (6.20) are the same and guarantee (6.8). The coefficient \(\varepsilon _k\) is uniquely determined by the free coefficient of the parameter \(\sigma \) in (6.20). Since there is a unique \(\varepsilon ^\prime _k\in {\mathbb {H}}\), such that the power series
is an excluded parameter of order at least \(k+1\), i.e., such that
and since there is a unique \(\sigma ^\prime _0\in {\mathbb {H}}\) producing the series (6.21) via formula (6.20) (more precisely, any \(\sigma \in {\mathcal {S}}_{{\mathbb {H}}}\) with the free coefficient equal to \(\sigma ^\prime _0\)), we conclude that all excluded parameters \(\varepsilon \) of order k are parametrized by the formula (6.20) with free parameter \(\sigma \in {\mathcal {S}}_{{\mathbb {H}}}\) such that \(\sigma _0\ne \sigma ^\prime _0\). Note that if the matrix \(R_{k+1}\) is not positive semidefinite, then any \(\varepsilon \) subject to condition (6.22) does not belong to \(\mathcal S_{{\mathbb {H}}}\), and hence the formula (6.20) with free parameter \(\sigma \in {\mathcal {S}}_{{\mathbb {H}}}\) describes all excluded parameters of order k. \(\square \)
6.4 Quasi-solutions arising from excluded parameters
Any excluded parameter \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) gives rise via formula (6.7) to a power series \(f_\varepsilon =\textbf{L}_\Psi [\varepsilon ]\) which is not a solution to the Carathéodory problem (6.1): such \(f_\varepsilon \) does not belong to \({\mathcal {S}}_{{\mathbb {H}}}^\kappa \) and does not satisfy the condition (6.1). However, something can still be said about this \(f_\varepsilon \). The next theorem is the main result of this section.
Theorem 6.9
Let \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) be an excluded parameter of order k. Then, the power series \(f_\varepsilon =\textbf{L}_\Psi [\varepsilon ]\) belongs to the class \({\mathcal {S}}_{\mathbb {H}}^{\kappa -k}\) and is of the form
In other words, the first \(n-k\) coefficients of \(f_\varepsilon \) are equal to the prescribed \(f_0,\ldots ,f_{n-k-1}\), but \(f_{\varepsilon ,n-k}\) differs from the prescribed \(f_{n-k}\).
Proof
Since \(\varepsilon \) is an excluded parameter of order k, it satisfies the equality (6.8) (with \(h_0\ne 0\)) and therefore, the equality (6.15), by Proposition 6.7. Combining these two equalities gives
By (6.2), \((\psi _{11}-\varepsilon \psi _{21})f_\varepsilon =(\varepsilon \psi _{22}-\psi _{12})\) which on account of (6.24) can be written as \(z^khf_\varepsilon =-z^kg\), which in turn allows us to write (6.24) as
Multiplying both sides by \(\Theta \) on the right and making use of (3.24) gives
We next cancel \(z^k\) and recall that \(h_0\ne 0\) (i.e., h is invertible in \({\mathbb {H}}[[z]]\)) to conclude that
As in the proof of the implication \((2)\Rightarrow (1)\) in Theorem 3.13, we use the polynomials (3.32) and combine the last equality with (3.37) to get
where \({\widetilde{p}}\) is the polynomial given by
Since the free coefficient of the polynomial \(\begin{bmatrix}\theta _{21}&\theta _{22}\end{bmatrix}\) is non-zero, it follows from (6.25) that the first \(n-k\) coefficients of \(f_\varepsilon -p_{n-k}\) are zero, and hence \(f_\varepsilon \) is indeed of the form:
It remains to verify the inequality in (6.23) and to show that \(f_\varepsilon \) belongs to \({\mathcal {S}}_{{\mathbb {H}}}^{\kappa -k}\). This will be done below after some needed preliminaries. \(\square \)
Let us consider the decomposition (3.11) (with \(n+k\) instead of n)
where \(\textbf{S}_k\) is the Schur complement of the block \({\mathfrak {S}}_{n-k}\), and the conformal decomposition [justified by equality (6.10)]
Lemma 6.10
Let \({\mathfrak {S}}_n\) and \({\mathfrak {S}}_n^{-1}\) be partitioned as in (6.26), (6.27).
1. If \(R_k\succeq 0\), then \(\nu _-({\mathfrak {S}}_{n-k})=\nu _-({\mathfrak {S}}_n)-k\).
2. If, moreover, \(R_k\succ 0\), then
$$\begin{aligned} R_k^{-1}=-\textbf{S}_k, \end{aligned}$$ (6.28)

and the following equality holds:
$$\begin{aligned} \begin{bmatrix}N_k&M_k\end{bmatrix}=(I-Z_k^*)\textbf{S}_k^{-1}(I-Z_k)^{-1}\begin{bmatrix}X_k&Y_k\end{bmatrix}, \end{aligned}$$ (6.29)

where the columns \(N_k,M_k\) are defined via formula (6.12) and \(X_k,Y_k\) are defined via formula (3.15) (with \(n+k\) replaced by n):
$$\begin{aligned} \begin{bmatrix}X_k&Y_k\end{bmatrix}= (I-Z_k)\begin{bmatrix}T_{n-k,k}{} \textbf{T}_{n-k}^{f*}{\mathfrak {S}}_{n-k}^{-1}&I_k\end{bmatrix}(I-Z_{n})^{-1}\begin{bmatrix} \textbf{e}_{n}&F_{n}\end{bmatrix}. \end{aligned}$$ (6.30)
Proof
Let us consider the matrix
If \(\delta \) is small enough, then \(P_\delta \) is invertible and has the same inertia as \({\mathfrak {S}}_n^{-1}\) and \({\mathfrak {S}}_n\). On the other hand, since \(R_k+\delta I_k\succ 0\), we have
Note that the inertia of \(S_\delta \) is the same for all sufficiently small \(\delta \). Furthermore, since \(P_\delta \) increases and tends to \({\mathfrak {S}}_n^{-1}\) as \(\delta \searrow 0\), it follows that \(P_\delta ^{-1}\) decreases and tends to \({\mathfrak {S}}_n\) as \(\delta \searrow 0\). Since \(P_\delta ^{-1}\) has the form \(P_\delta ^{-1}=\left[ {\begin{matrix} S_\delta ^{-1} &{} *\\ *&{} * \end{matrix}}\right] \), we see that \(S_\delta ^{-1}\) decreases and tends to \({\mathfrak {S}}_{n-k}\). Since \(\nu _-(S_\delta )\) does not depend on \(\delta \), we conclude that \(\nu _-(S_\delta )=\nu _-({\mathfrak {S}}_{n-k})\). Combining the latter equality with (6.31), we get the desired conclusion in part (1).
To prove part (2), we first invert \({\mathfrak {S}}_n\) via factorization (6.26)
Comparing the 22-blocks in (6.27) and (6.32) results in \(-R_k=\textbf{S}_k^{-1}\) which is equivalent to (6.28). We next combine (6.13) and (6.32) to get
By (6.30)
Combining the two last equalities leads us to (6.29). \(\square \)
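Part (1) of the lemma is an instance of Haynsworth inertia additivity: for an invertible leading block, \(\nu _-\) of the whole Hermitian matrix is the sum of \(\nu _-\) of that block and \(\nu _-\) of its Schur complement. A numerical check on a generic Hermitian stand-in (names ours):

```python
import numpy as np

def nu_minus(H, tol=1e-10):
    """Number of negative eigenvalues of a Hermitian matrix."""
    return int(np.sum(np.linalg.eigvalsh(H) < -tol))

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
P = A + A.T - 2.0 * np.eye(5)        # generic Hermitian stand-in for S_n
P11, P12, P22 = P[:3, :3], P[:3, 3:], P[3:, 3:]
S = P22 - P12.T @ np.linalg.inv(P11) @ P12   # Schur complement of the leading block
print(nu_minus(P) == nu_minus(P11) + nu_minus(S))   # Haynsworth additivity -> True
```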
Remark 6.11
Since for an excluded parameter \(\varepsilon \) of order k, the first \(n-k\) coefficients of \(f_\varepsilon \) are equal to \(f_0,\ldots ,f_{n-k-1}\), and since \(\nu _-({\mathfrak {S}}_{n-k})=\nu _-({\mathfrak {S}}_{n})-k=\kappa -k\), it follows that \(f_\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}^m\) with \(m\ge \kappa -k\).
Thus, it remains to show that \(m\le \kappa -k\). In the complex setting [5, Theorem 6.1], the latter was shown by combining the winding number argument with the Krein–Langer factorization (6.6) of \(f_\varepsilon \). By the winding number argument (or Rouché's theorem), it follows that for any \(\varepsilon \in {\mathcal {S}}\), the function \(\psi _{11}-\varepsilon \psi _{21}\) has exactly \(\kappa \) zeros inside the unit disk \({\mathbb {D}}\). If \(\varepsilon \) is an excluded parameter of order k, then after canceling \(z^k\) in the numerator and the denominator of \(\textbf{L}_{\Psi }[\varepsilon ]\), we get a fraction \(f_\varepsilon \) having \(\kappa -k\) zeros inside \({{\mathbb {D}}}\). Since \(f_\varepsilon \) is a generalized Schur power series, its membership in \({\mathcal {S}}^{\kappa -k}\) follows from the Krein–Langer characterization. To bypass the Krein–Langer formulas in the quaternionic case, as well as a suitable quaternionic version of Rouché's theorem (which we do not have at the moment, except for the trivial planar case), we will follow a different and substantially more computational approach. We first establish explicit formulas for the coefficients in the linear fractional formula (6.20).
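In the planar case the winding number argument alluded to above is concrete: by the argument principle, the number of zeros of an analytic function inside \(|z|<r\) equals the winding number of its boundary values around the origin. A numerical sketch (function and sample polynomial are ours):

```python
import numpy as np

def winding_number(f, r=1.0, m=4000):
    """Zeros of analytic f inside |z| < r via the argument principle:
    total change of arg f along |z| = r, divided by 2*pi
    (f is assumed non-vanishing on the circle)."""
    t = np.linspace(0, 2 * np.pi, m, endpoint=False)
    vals = f(r * np.exp(1j * t))
    # principal-branch increments between consecutive boundary values
    dphi = np.angle(vals[np.r_[1:m, 0]] / vals)
    return int(round(dphi.sum() / (2 * np.pi)))

# p(z) = (z - 0.5)(z - 2): exactly one zero inside the unit disk
print(winding_number(lambda z: (z - 0.5) * (z - 2)))   # -> 1
```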
Lemma 6.12
If \(R_k=-\textbf{S}_k^{-1}\succ 0\), then the set of all \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) subject to condition (6.8) is parametrized by the linear fractional formula
with the matrix of coefficients \({\mathfrak {D}}=\left[ {\begin{matrix} {\mathfrak {d}}_{11} &{} {\mathfrak {d}}_{12}\\ {\mathfrak {d}}_{21}&{} {\mathfrak {d}}_{22} \end{matrix}}\right] \) given by
with the columns \(X_k\) and \(Y_k\) defined in (6.30).
Proof
Let \(\varepsilon _0,\ldots ,\varepsilon _{k-1}\) denote the coefficients of \(\varepsilon \) prescribed by condition (6.8). Comparing the bottom rows in the matrix equality \(\textbf{T}_k^{\varepsilon }=\textbf{T}_k^{\psi _{11}}(\textbf{T}_k^{\psi _{21}})^{-1}\) (which is equivalent to (6.8)) gives
Multiplying both sides by the unit anti-diagonal matrix \(V_k\) and making use of notation (6.12), we get
Observe also from (6.12) that
It follows from (6.11) that:
where we have set
and used equalities
for the last step. Note that the matrix \({\mathfrak {R}}\) is positive definite. Following the formula (3.17), we introduce the matrix polynomial:
and conclude by virtue of Theorem 4.7 that the formula
parametrizes all Schur-class power series with first k coefficients equal to \({\overline{\varepsilon }}_0,\ldots , {\overline{\varepsilon }}_{k-1}\). Taking power-series conjugates (see (5.3)) and still writing \(\sigma \) instead of \(\sigma ^\sharp \), we then get all \(\varepsilon \in {\mathcal {S}}_{{\mathbb {H}}}\) subject to condition (6.8)
Comparing the latter formula with (6.33), we see that it remains to verify polynomial equalities
Toward this end, we substitute (6.35), (6.36), (6.37) into (6.39) and make use of equalities (6.38) to write \({\widetilde{\Theta }}\) in terms of \(M_k\) and \(N_k\)
We next use equalities (6.28) and (6.29) to write \({\widetilde{\Theta }}\) in terms of \(X_k\) and \(Y_k\)
Therefore
which is the same polynomial as that in (6.34). Thus, equalities (6.41) hold, which completes the proof of the lemma. \(\square \)
Remark 6.13
Note that the polynomial (6.34) has the same structure as \(\Theta \) in (3.17). If we consider the polynomial
modeled from (3.16), then we have the identity
which is verified by mimicking the proof of Lemma 3.10.
Lemma 6.14
Let \(\Psi :=\Psi _n\), \({\mathfrak {D}}\) and \({\mathfrak {K}}\) be the matrix polynomials defined in (3.16), (6.34), and (6.43), respectively. Then
where according to (3.16)
Note that \({\widetilde{\Psi }}\) is defined as in (3.16) but with n replaced by \(n-k\).
Proof
Upon replacing \(\left[ {\begin{matrix} X_k&-Y_k \end{matrix}}\right] \) in (6.34) by its expression in (6.30), we get
On the other hand, substituting the decomposition (6.32) for \({\mathfrak {S}}_n^{-1}\) into (3.16), making use of (6.30) and taking into account (6.45) gives
Multiplying the right-side expressions in (3.16), (6.46) and taking into account (6.47), we get
where
By (3.23) and due to decompositions (6.26) and (6.32)
Using the latter equality along with (3.26) and
we simplify the last term on the right side of (6.49) to
Substituting the latter equality into (6.49), we see that \(W(z)=0\), and, hence, the first equality in (6.44) follows from (6.48). The second equality follows from the first by multiplying the latter by \({\mathfrak {K}}(z)\) on the left and making use of (6.43). \(\square \)
6.5 Completion of the proof of Theorem 6.9
In the previous section, we showed that if \(R_k\succeq 0\) and \(\varepsilon \) is an excluded parameter of (6.7) of order k, then the power series \(f_\varepsilon =\textbf{L}_\Psi [\varepsilon ]\) is of the form (6.23). However, the rightmost inequality in (6.23) has not been justified yet.
Taking for granted that \(f_\varepsilon \) belongs to \(\mathcal S_{{\mathbb {H}}}^{\kappa -k}\) (i.e., that \(\nu _-({\mathfrak {S}}_{m}^{f_\varepsilon })\le \kappa -k\) for all \(m\ge 0\)), let us assume that the first \(n-k+1\) coefficients of \(f_\varepsilon \) are equal to \(f_0,\ldots ,f_{n-k}\). Then, \({\mathfrak {S}}_{n-k+1}^{f_\varepsilon }={\mathfrak {S}}_{n-k+1}\) which cannot be the case, since \(\nu _-({\mathfrak {S}}_{n-k+1})=\kappa -k+1\) by part (1) in Lemma 6.10. It remains to show that \(f_\varepsilon \in \mathcal S_{{\mathbb {H}}}^{\kappa -k}\). We first consider the definite case.
Case 1: Let us assume that \(R_k\succ 0\). By Lemma 6.12, the excluded parameter \(\varepsilon \) is of the form (6.33), that is
where \({\mathfrak {D}}\) is the matrix polynomial given by (6.34). Taking the superposition of left linear fractional transformations (6.50) and (6.7) and taking into account the identity (6.44), we get
If \(\sigma \in {\mathcal {S}}_{{\mathbb {H}}}\) were an excluded parameter of the transformation \(\textbf{L}_{\Psi _{n-k}}[\sigma ]\), then \(\varepsilon \) would be an excluded parameter of the transformation (6.7) of order greater than k, which is not the case. Hence, \(\sigma \) is not excluded, and therefore, \(f_\varepsilon =\textbf{L}_{\Psi _{n-k}}[\sigma ]\) belongs to \(\mathcal S_{{\mathbb {H}}}^{\kappa -k}\) by virtue of Theorem 6.1.
Remark 6.15
The linear fractional transformation \(\textbf{L}_{\Psi _{n-k}}\) in (6.51) is associated with the truncated \(\textbf{CSP}\) with the prescribed coefficients \(f_0,\ldots , f_{n-k-1}\) only. In the context of the original \(\textbf{CSP}\) (with n prescribed coefficients), it parametrizes all quasi-solutions f arising from excluded parameters \(\varepsilon \) of the transformation \(\textbf{L}_{\Psi }\) of order at least k. Furthermore, if \(\sigma \) is still an excluded parameter of \(\textbf{L}_{\Psi _{n-k}}\) of order r, then \(\varepsilon \) defined as in (6.50) is an excluded parameter of \(\textbf{L}_{\Psi }\) of order \(k+r\).
Case 2: Let us assume that \(-R_k\) is the maximal negative semidefinite (singular) bottom principal submatrix of \({\mathfrak {S}}_n^{-1}\). Let \(\textrm{rank}(R_k)=d\) (\(0\le d<k\)). Then, \(R_d\succ 0\). Since \(R_{k+1}\not \succeq 0\), it follows by virtue of Theorem 3.5 that the matrix \(R_{2k-d}\) is invertible and has k positive and \(k-d\) negative eigenvalues. Since \(\textbf{S}_{2k-d}=-R^{-1}_{2k-d}\) is the Schur complement of the leading principal submatrix \({\mathfrak {S}}_{n-2k+d}\) of \({\mathfrak {S}}_n\), it follows that:
which together with Lemma 6.10 (part (1)) implies
Upon invoking Case 1, we may assume without loss of generality that \(d=0\) and \(n-2k+d=0\). Indeed, since \(\varepsilon \) is an excluded parameter of order at least d, it is of the form (6.50), where \({\mathfrak {D}}\) is defined in (6.34) (with d instead of k). Then, \(f_\varepsilon \) is of the form \(f_\varepsilon =\textbf{L}_{\Psi _{n-d}}[\sigma ]\) where \(\sigma \) is an excluded parameter of \(\textbf{L}_{\Psi _{n-d}}\) of order \(k-d\) and the recalculated \(R_{k-d}\) is the zero matrix. On the other hand, keeping in mind the second equality in (6.44) and representing \(f_\varepsilon \) as
we see from (6.52) that, to justify the membership of \(f_\varepsilon \) in \({\mathcal {S}}_{{\mathbb {H}}}^{\kappa -k}\), it suffices to show that \({\widetilde{f}}_\varepsilon \) belongs to \({\mathcal {S}}_{{\mathbb {H}}}\). Thus, Case 2 will follow from the very particular Case 3 below.
Case 3: Let us assume that \(n=2k\) and \(R_k=0\). We will show that in this case, the only excluded parameter of \(\textbf{L}_{\Psi }\) is the unimodular constant \(\varepsilon \equiv f_0\) and the corresponding \(f_\varepsilon \) is equal to the unimodular constant \(-f_0\) and, hence, belongs to \({\mathcal {S}}_{{\mathbb {H}}}\).
Since \(R_k=0\), inverting \({\mathfrak {S}}_n^{-1}\) shows that \({\mathfrak {S}}_{n-k}=0\). Therefore
In this case, decompositions (3.1), (6.26), and (6.32) take the form
Making use of formulas (6.12) (for \(k=n\)) and (6.13), we next compute
where
The last equality in (6.55) follows from the block-decomposition (6.54) for \({\mathfrak {S}}_n^{-1}\) and the equality:
which in turn, holds true due to (6.53). The explicit formula (6.56) plays no role in the sequel and is given for the sake of completeness only. Comparing the k rightmost entries in (6.55) gives
which means that \(\varepsilon \equiv f_0\) is the unique excluded parameter of \(\textbf{L}_{\Psi }\).
We next verify equalities
To this end, we combine (6.55) with (6.19) (with \(n=2k\) instead of k) and the block decomposition (6.54) for \(\textbf{T}^f_{2k}\) to get
Comparing the k leftmost entries with those in (6.55), we get equalities (6.58) for \(j=k,\ldots ,2k-1\). To verify (6.58) for \(j=2k\), we use explicit formulas (6.16) for \(\psi _{11,2k}\), \(\psi _{21,2k}\) (recall that \(n=2k\)) derived from (3.16) and the formulas
also derived from (3.16), to compute
Due to (6.53), the k leftmost entries in \(\textbf{e}^*-f_0F^*_{2k}\) and the k top entries in \(F_{2k}-\textbf{e}f_0\) are zeros. Since the \(k\times k\) bottom principal submatrix of \((I-Z_{2k}^*)^{-1}{\mathfrak {S}}_{2k}^{-1}\) is the zero matrix, the expression on the right side of (6.59) equals zero, thus justifying the equality (6.58) for \(j=2k\).
Due to (6.57) and (6.58), we have the polynomial identity
which implies that for \(\varepsilon \equiv f_0\),
thus completing Case 3 and, hence, the proof of Theorem 6.9.
References
Alpay, D., Bolotnikov, V., Colombo, F., Sabadini, I.: Self-mappings of the quaternionic unit ball: multiplier properties, Schwarz-Pick inequality, and Nevanlinna-Pick interpolation problem. Indiana Univ. Math. J. 64, 151–180 (2015)
Alpay, D., Colombo, F., Sabadini, I.: Schur functions and their realizations in the slice hyperholomorphic setting. Integr. Eqn. Oper. Theory 72, 253–289 (2012)
Alpay, D., Colombo, F., Sabadini, I.: Pontryagin de Branges-Rovnyak spaces of slice hyperholomorphic functions. J. Anal. Math. 121(1), 87–125 (2013)
Alpay, D., Colombo, F., Sabadini, I.: Slice hyperholomorphic Schur analysis, Operator Theory: Advances and Applications, 256. Birkhäuser/Springer, Cham (2016)
Bolotnikov, V.: On the Carathéodory-Fejér interpolation problem for generalized Schur functions. Integr. Eqn. Oper. Theory 50(1), 9–41 (2004)
Bolotnikov, V.: Pick matrices and quaternionic power series. Integr. Eqn. Oper. Theory 80(2), 293–302 (2014)
Bolotnikov, V.: Finite Blaschke products over quaternions: unitary realizations and zero structure, Anal. Math. Phys. 10(4), Paper No. 77, 39 pp (2020)
Carathéodory, C.: Über den Variabilitätsbereich der Koeffizienten von Potenzreihen, die gegebene Werte nicht annehmen. Math. Ann. 64, 95–115 (1907)
Carathéodory, C.: Über den Variabilitätsbereich der Fourier’schen Konstanten von positiven harmonischen Funktionen. Rend. Circ. Mat. Palermo 32, 193–217 (1911)
Carathéodory, C.: Theory of functions of a complex variable, vol. II. Chelsea, New York (1960)
Carathéodory, C., Fejér, L.: Über den Zusammenhang der Extremen von harmonischen Funktionen mit ihren Koeffizienten und den Picard-Landau’schen Satz. Rend. Circ. Mat. Palermo 32, 218–239 (1911)
Gentili, G., Struppa, D.C.: A new theory of regular functions of a quaternionic variable. Adv. Math. 216(1), 279–301 (2007)
Gentili, G., Struppa, D.C., Stoppato, C.: Regular functions of a quaternionic variable, 2nd edn. Springer Monographs in Mathematics. Springer, Cham (2022)
Golinskii, L.B.: A generalization of the matrix Nevanlinna-Pick problem. Izv. Akad. Nauk Armyan. SSR Ser. Mat. 18, 187–205 (1983)
Iohvidov, I.S.: Hankel and Toeplitz Matrices and Forms. Birkhäuser-Verlag, Basel (1982)
Iohvidov, I.S., Krein, M.G.: Spectral theory of operators in spaces with indefinite metric II. Trudy Moskov. Mat. Obsc. 8, 413–496 (1959)
Kailath, T., Kung, S.Y., Morf, M.: Displacement ranks of matrices and linear equations. J. Math. Anal. Appl. 68(2), 395–407 (1979)
Krein, M.G., Langer, H.: Über einige Fortsetzungsprobleme, die eng mit der Theorie hermitescher Operatoren im Raume \(\Pi _\kappa \) zusammenhängen. I. Einige Funktionenklassen und ihre Darstellungen. Math. Nachr. 77, 187–236 (1977)
Niven, I.: Equations in quaternions. Am. Math. Monthly 48, 654–661 (1941)
Rodman, L.: Topics in Quaternion Linear Algebra. Princeton University Press, Princeton (2014)
Schur, I.: Über einen Satz von C. Carathéodory. Sitzungsberichte der Königlich Preussischen Akademie der Wissenschaften zu Berlin, pp. 4–15 (1912)
Schur, I.: Über Potenzreihen, die im Innern des Einheitskreises beschränkt sind. J. Reine Angew. Math. 147, 205–232 (1917)
Takagi, T.: On an algebraic problem related to an analytic theorem of Carathéodory and Fejér and on an allied theorem of Landau. Jpn. J. Math. 1, 83–91 (1924)
Tam, T.-Y.: Interlacing inequalities and Cartan subspaces of classical real simple Lie algebras. SIAM J. Matrix Anal. Appl. 21, 581–592 (1999)
Thompson, R.C.: Matrix Spectral Inequalities. Johns Hopkins Lecture Series (1988)
Toeplitz, O.: Über die Fourier’sche Entwicklung positiver Funktionen. Rend. Circ. Mat. Palermo 32, 191–192 (1911)
Acknowledgements
The project was partially supported by Simons Foundation under Grant No. 524539.
Additional information
Communicated by Tin-Yau Tam.
Dedicated to Professor Chi-Kwong Li on the occasion of his 65th birthday.
Bolotnikov, V. On the Carathéodory–Schur interpolation problem over quaternions. Adv. Oper. Theory 9, 30 (2024). https://doi.org/10.1007/s43036-024-00329-6