1 Introduction

Modeling of various real-world applications, e.g., biological, electrical or population dynamics, results in bilinear control systems [1, 30, 31, 34, 39]. Those bilinear systems usually inherit special structures based on their underlying physical meaning. For example, in the case of bilinear mechanical models, one obtains a second-order bilinear control system of the form

$$\begin{aligned} M \ddot{q}(t) + D \dot{q}(t) + K q(t)&= \sum \limits _{j = 1}^{m} N_{\textrm{p},j}q(t)u_{j}(t) + \sum \limits _{j = 1}^{m} N_{\textrm{v},j}\dot{q}(t)u_{j}(t) + B_{\textrm{u}}u(t),\nonumber \\ y(t)&= C_{\textrm{p}}q(t) + C_{\textrm{v}}\dot{q}(t), \end{aligned}$$
(1)

where \(q(t) \in \mathbb {R}^{n} \) are the (internal) degrees of freedom; \(u(t) \in \mathbb {R}^{m}\) and \(y(t) \in \mathbb {R}^{p}\) are, respectively, the inputs and outputs of the system; \(M, D, K, N_{\textrm{p},j}, N_{\textrm{v},j} \in \mathbb {R}^{n \times n}\) for all \(j = 1, \ldots , m\), \(B_{\textrm{u}} \in \mathbb {R}^{n \times m}\) and \(C_{\textrm{p}}, C_{\textrm{v}} \in \mathbb {R}^{p \times n}\). Because high-fidelity modeling is usually required, the number of differential equations, n, describing the dynamics of systems as in (1) quickly increases. This often results in a high demand for computational resources such as time and memory. One remedy is model order reduction: a new, reduced system is created, consisting of a significantly smaller number of differential equations than the original one while still accurately approximating the input-to-output behavior [2, 3, 14, 15, 16]. One can then use this lower-order approximation as a surrogate model for faster simulations or within algorithms for design optimization and controller synthesis.

In the case of unstructured bilinear systems with the state-space form

$$\begin{aligned} \begin{aligned} E {\dot{x}}(t)&= A x(t) + \sum \limits _{j = 1}^{m}N_{j}x(t)u_{j}(t) + B u(t),\\ y(t)&= C x(t), \end{aligned} \end{aligned}$$
(2)

where \(E, A, N_{j} \in \mathbb {R}^{n \times n}\) for all \(j = 1, \ldots , m\), \(B \in \mathbb {R}^{n \times m}\) and \(C \in \mathbb {R}^{p \times n}\), there already exist different model reduction methodologies, e.g., bilinear balanced truncation [1, 11, 27], different types of interpolation approaches for the underlying multi-variate transfer functions in the frequency domain [3, 5, 17, 19, 20], complete Volterra series interpolation [9, 21, 42], the bilinear Loewner framework [4, 25], and the Koopman operator framework with the dynamic mode decomposition [23, 24, 28, 32]. For structured bilinear control systems as in (1), recently [13] developed the structure-preserving interpolation framework where interpolation for multi-input/multi-output (MIMO) systems was enforced as for single-input/single-output (SISO) systems, i.e., using full matrix interpolation. One of our major contributions in this paper is to devise a proper interpolation framework for structured MIMO systems.

Reduction of MIMO bilinear systems is an intricate problem and only a few of the aforementioned approaches provide suitable extensions for model reduction of MIMO structured bilinear systems. The lack of a proper extension is especially persistent for subsystem interpolation since enforcing matrix interpolation results in a quickly increasing reduced-order dimension. For MIMO linear dynamical systems, the concept of tangential interpolation resolves this issue [22] by interpolating the matrix-valued transfer function along selected direction vectors. It is important to note that the optimal approximation in a specific norm, namely the \(\mathcal {H}_{2}\)-norm, satisfies tangential interpolation, not matrix interpolation [3]. For interpolatory model reduction of bilinear systems, it is not clear so far what the proper extension of tangential interpolation would be. For the unstructured case, [10, 35] provide one potential extension where only certain blocks of the subsystem transfer functions are employed in tangential interpolation. In this paper, we will introduce a new unifying framework for tangential interpolation of structured bilinear systems, inspired by the original ideas of tangential interpolation for matrix-valued functions [6]. This new framework will cover different extensions of tangential interpolation to bilinear systems under one umbrella. In particular, it will allow us to formulate a direct extension of the ideas from [10, 35] to the structured system case.

Parts of the theoretical results presented here were derived in the course of writing the dissertation of the corresponding author [40].

The rest of the paper is organized as follows: In Sect. 2, we briefly recall the theory of bilinear systems and Volterra series, introduce the structured transfer functions considered in this paper and revisit the tangential interpolation problem for linear dynamical systems. In Sect. 3, we will motivate our new unifying tangential interpolation framework and provide conditions on underlying projection spaces to satisfy interpolation conditions in this framework. Three benchmark examples are presented in Sect. 4 that illustrate the established theory by comparing different interpolatory model reduction approaches for (structured) MIMO bilinear systems, followed by the conclusions in Sect. 5.

2 Mathematical preliminaries

In this section, we briefly review various system-theoretic concepts for bilinear systems and the idea of tangential interpolation for linear systems.

2.1 Frequency-domain representation of structured bilinear systems

For the unstructured bilinear system (2), define \(N = \begin{bmatrix} N_{1}&\ldots&N_{m} \end{bmatrix}\). Assume that E is invertible and that the initial condition is zero, i.e., \(x(0) = 0\). Then, under some mild assumptions specified in [36], the output of (2) can be expressed in terms of a Volterra series, i.e.,

$$\begin{aligned} y(t)&= \sum \limits _{k = 1}^{\infty } \int \limits _{0}^{t} \int \limits _{0}^{t_{1}} \ldots \int \limits _{0}^{t_{k-1}} g_{k}(t_{1}, \ldots , t_{k}) \left( u(t - \sum \limits _{i = 1}^{k}t_{i}) \otimes \cdots \otimes u(t - t_{1}) \right) \textrm{d}t_{k} \cdots \textrm{d}t_{1}, \end{aligned}$$

where \(g_{k}\), for \(k \ge 1\), is the k-th regular Volterra kernel given by

$$\begin{aligned} \begin{aligned} g_{k}(t_{1}, \ldots , t_{k})&= Ce^{E^{-1}At_{k}} \left( \prod \limits _{j = 1}^{k-1} (I_{m^{j-1}} \otimes E^{-1}N) (I_{m^{j}} \otimes e^{E^{-1}At_{k-j}}) \right) \\&\quad {}\times {} (I_{m^{k-1}} \otimes E^{-1}B), \end{aligned} \end{aligned}$$
(3)

where \(I_{m^{j}}\) denotes the identity matrix of size \(m^{j} \times m^{j}\) and \(\otimes \) is the Kronecker product. Using the multivariate Laplace transform [36], the regular Volterra kernels (3) yield a representation of (2) in the frequency domain by the so-called multivariate regular transfer functions

$$\begin{aligned} G_{k}(s_{1},\ldots ,s_{k}){} & {} = C(s_{k}E - A)^{-1} \left( \prod \limits _{j = 1}^{k-1} (I_{m^{j-1}} \otimes N) (I_{m^{j}} \otimes (s_{k-j}E - A)^{-1}) \right) \nonumber \\{} & {} \times {} (I_{m^{k-1}} \otimes B), \end{aligned}$$
(4)

with \(s_{1}, \ldots , s_{k} \in \mathbb {C}\).

Remark 1

(Systems with differential-algebraic equations) To simplify the discussion, we have assumed that the E matrix in (2) is invertible. This, however, is not needed for the interpolation theory discussed below. Indeed, only the regularity of the matrix pencil \(sE - A\) is needed for the considered transfer function formulation to exist. For the details of interpolatory model reduction for systems with differential-algebraic equations (i.e., when E is not invertible), we refer the reader to [26] for unstructured linear and [12] for unstructured bilinear systems. Similar ideas could be employed for the more general structured tangential interpolation theory developed here; however, this is outside the scope of this work.

Motivated by the structured linear case [8] and structured bilinear systems as in (1), [13] introduced the frequency domain representation of structured bilinear systems in terms of the structured regular subsystem transfer functions of the form

$$\begin{aligned} G_{k}(s_{1}, \ldots , s_{k}){} & {} = \mathcal {C}(s_{k})\mathcal {K}(s_{k})^{-1} \left( \prod \limits _{j = 1}^{k-1} \big ( I_{m^{j-1}} \otimes \mathcal {N}(s_{k-j}) \big ) \big ( I_{m^{j}} \otimes \mathcal {K}(s_{k-j})^{-1} \big ) \right) \nonumber \\{} & {} \times {} \big ( I_{m^{k-1}} \otimes \mathcal {B}(s_{1}) \big ), \end{aligned}$$
(5)

for \(k \ge 1\), where \(\mathcal {C}(s):\mathbb {C}\rightarrow \mathbb {C}^{p \times n}\), \(\mathcal {K}(s):\mathbb {C}\rightarrow \mathbb {C}^{n \times n}\), \(\mathcal {B}(s):\mathbb {C}\rightarrow \mathbb {C}^{n \times m}\), and \(\mathcal {N}_{j}:\mathbb {C}\rightarrow \mathbb {C}^{n \times n}\) for \(j = 1, \ldots , m\) are matrix-valued functions, and we denote \(\mathcal {N}(s) = \begin{bmatrix} \mathcal {N}_{1}(s)&\ldots&\mathcal {N}_{m}(s) \end{bmatrix}\). This general framework contains the unstructured bilinear systems (2) as a special case, where

$$\begin{aligned} \begin{aligned} \mathcal {C}(s)&= C,&\mathcal {K}(s)&= sE - A,&\mathcal {B}(s)&= B,&\mathcal {N}(s)&= \begin{bmatrix} N_{1}&\ldots&N_{m} \end{bmatrix}. \end{aligned} \end{aligned}$$

Also, it recovers the bilinear second-order system (1) by choosing

$$\begin{aligned} \begin{aligned} \mathcal {C}(s)&= C_{\textrm{p}} + s C_{\textrm{v}},&\mathcal {K}(s)&= s^{2} M + s D + K,&\mathcal {B}(s)&= B_{\textrm{u}},&\mathcal {N}(s)&= N_{\textrm{p}} + s N_{\textrm{v}}, \end{aligned} \end{aligned}$$

where \(N_{\textrm{p}} = \begin{bmatrix} N_{\textrm{p}, 1}&\ldots&N_{\textrm{p}, m} \end{bmatrix}\) and \(N_{\textrm{v}} = \begin{bmatrix} N_{\textrm{v}, 1}&\ldots&N_{\textrm{v}, m} \end{bmatrix}\). We refer the reader to [13] for a more detailed derivation of structured multivariate transfer functions and other structured examples.
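As a minimal numerical sketch of (5), the following Python/NumPy snippet evaluates \(G_{k}\) for the second-order choice of \(\mathcal {C}, \mathcal {K}, \mathcal {B}, \mathcal {N}\) above by forming the Kronecker products explicitly. The function name `regular_transfer_function`, the random test matrices and the use of dense inverses are illustrative assumptions and not part of the framework in [13]; for large n, one would use (sparse) linear solves instead.

```python
import numpy as np

def regular_transfer_function(C, K, N, B, s):
    """Evaluate G_k(s_1, ..., s_k) as in (5).

    C, K, N, B are callables mapping a complex scalar to a matrix;
    N(s) is the horizontal concatenation [N_1(s), ..., N_m(s)];
    s is the sequence (s_1, ..., s_k).
    """
    k = len(s)
    m = B(s[0]).shape[1]
    G = C(s[-1]) @ np.linalg.inv(K(s[-1]))            # C(s_k) K(s_k)^{-1}
    for j in range(1, k):                             # factors j = 1, ..., k-1
        z = s[k - j - 1]                              # s_{k-j} (0-based storage)
        G = G @ np.kron(np.eye(m ** (j - 1)), N(z))
        G = G @ np.kron(np.eye(m ** j), np.linalg.inv(K(z)))
    return G @ np.kron(np.eye(m ** (k - 1)), B(s[0]))

# Hypothetical small second-order bilinear model as in (1).
rng = np.random.default_rng(0)
n, m, p = 6, 2, 3
M, D = np.eye(n), 0.1 * np.eye(n)
K0 = 2.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n))
Np = rng.standard_normal((n, n * m))                  # [N_p1, ..., N_pm]
Nv = rng.standard_normal((n, n * m))                  # [N_v1, ..., N_vm]
Bu = rng.standard_normal((n, m))
Cp = rng.standard_normal((p, n))
Cv = rng.standard_normal((p, n))

C = lambda z: Cp + z * Cv
K = lambda z: z**2 * M + z * D + K0
N = lambda z: Np + z * Nv
B = lambda z: Bu

G1 = regular_transfer_function(C, K, N, B, (1.0,))
G3 = regular_transfer_function(C, K, N, B, (1.0, 2.0j, 0.5))
print(G1.shape, G3.shape)                             # p x m and p x m^3
```

For \(k = 1\), the loop is skipped and the function reduces to the linear structured transfer function \(\mathcal {C}(s_{1}) \mathcal {K}(s_{1})^{-1} \mathcal {B}(s_{1})\).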

For the full-order structured bilinear control system with subsystem transfer functions (5), we will construct structure-preserving reduced bilinear systems using Petrov-Galerkin projection: Given two model reduction basis matrices \(W, V \in \mathbb {C}^{n \times r}\) for the test and trial spaces, respectively, with \(r \ll n\), the reduced-order quantities are given by

$$\begin{aligned} \begin{aligned} \widehat{\mathcal {C}}(s)&= \mathcal {C}(s) V,&\widehat{\mathcal {K}}(s)&= W^{\textsf{H}} \mathcal {K}(s) V,&\widehat{\mathcal {B}}(s)&= W^{\textsf{H}} \mathcal {B}(s)&\text {and}\\ \widehat{\mathcal {N}}_{j}(s)&= W^{\textsf{H}} \mathcal {N}_{j}(s) V, \end{aligned} \end{aligned}$$
(6)

for \(j = 1, \ldots , m\), where \((\cdot )^{\textsf{H}}\) denotes the conjugate transpose. The corresponding reduced-order system \(\widehat{G}\) is then given by the underlying reduced-order matrices from (6) and the corresponding multivariate transfer functions

$$\begin{aligned} \widehat{G}_{k}(s_{1}, \ldots , s_{k}){} & {} = \widehat{\mathcal {C}}(s_{k})\widehat{\mathcal {K}}(s_{k})^{-1} \left( \prod \limits _{j = 1}^{k-1} \big ( I_{m^{j-1}} \otimes \widehat{\mathcal {N}}(s_{k-j}) \big ) \big ( I_{m^{j}} \otimes \widehat{\mathcal {K}}(s_{k-j})^{-1} \big ) \right) \nonumber \\{} & {} \times {} (I_{m^{k-1}} \otimes \widehat{\mathcal {B}}(s_{1})), \end{aligned}$$
(7)

for \(k \ge 1\). For example, for the mechanical bilinear system in (1), the reduced-order model will have the form

$$\begin{aligned} \begin{aligned} \widehat{M} \ddot{\hat{q}}(t) + \widehat{D} \dot{\hat{q}}(t) + \widehat{K} \hat{q}(t)&= \sum \limits _{j = 1}^{m} \widehat{N}_{\textrm{p},j} \hat{q}(t) u_{j}(t) + \sum \limits _{j = 1}^{m} \widehat{N}_{\textrm{v},j} \dot{\hat{q}} (t)u_{j}(t) + \widehat{B}_{\textrm{u}} u(t),\\ \hat{y}(t)&= \widehat{C}_{\textrm{p}}\hat{q}(t) + \widehat{C}_{\textrm{v}} \dot{\hat{q}}(t), \end{aligned} \end{aligned}$$

where \(\widehat{M}, \widehat{D}, \widehat{K}, \widehat{N}_{\textrm{p},j}, \widehat{N}_{\textrm{v},j} \in \mathbb {R}^{r \times r}\) for \(j = 1, \ldots , m\), \(\widehat{B}_{\textrm{u}} \in \mathbb {R}^{r \times m}\), and \(\widehat{C}_{\textrm{p}}, \widehat{C}_{\textrm{v}} \in \mathbb {R}^{p \times r}\) are given by

$$\begin{aligned} \begin{aligned} \widehat{M}&= W^{\textsf{H}} M V,&\widehat{D}&= W^{\textsf{H}} D V,&\widehat{K}&= W^{\textsf{H}} K V,&\widehat{B}_{\textrm{u}}&= W^{\textsf{H}} B_{\textrm{u}}, \\ \widehat{N}_{\textrm{p},j}&= W^{\textsf{H}} {N}_{\textrm{p},j} V,&\widehat{N}_{\textrm{v},j}&= W^{\textsf{H}} {N}_{\textrm{v},j} V,&\widehat{C}_{\textrm{p}}&= {C}_{\textrm{p}} V, ~\text {and}&\widehat{C}_{\textrm{v}}&= {C}_{\textrm{v}} V. \end{aligned} \end{aligned}$$
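The projection formulas above translate directly into a few matrix products. The following Python/NumPy sketch uses hypothetical random system matrices and arbitrary real basis matrices `W` and `V` (no interpolation property is enforced here); it only illustrates that projecting the individual matrices gives the same result as projecting \(\mathcal {K}(s)\) as in (6), i.e., that the second-order structure is preserved.

```python
import numpy as np

rng = np.random.default_rng(0)
n, r, m, p = 10, 3, 2, 2

# Hypothetical full-order second-order bilinear model as in (1).
M, D = np.eye(n), 0.1 * np.eye(n)
K = 2.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n))
Np = [rng.standard_normal((n, n)) for _ in range(m)]
Nv = [rng.standard_normal((n, n)) for _ in range(m)]
Bu = rng.standard_normal((n, m))
Cp = rng.standard_normal((p, n))
Cv = rng.standard_normal((p, n))

# Arbitrary real test and trial bases, chosen only for illustration.
W = rng.standard_normal((n, r))
V = rng.standard_normal((n, r))
WH = W.conj().T                      # conjugate transpose (= transpose here)

# Reduced-order matrices of the mechanical bilinear system.
Mr, Dr, Kr = WH @ M @ V, WH @ D @ V, WH @ K @ V
Npr = [WH @ Nj @ V for Nj in Np]
Nvr = [WH @ Nj @ V for Nj in Nv]
Bur, Cpr, Cvr = WH @ Bu, Cp @ V, Cv @ V

# Structure preservation: W^H K(s) V equals s^2 Mr + s Dr + Kr for any s.
s = 0.5 + 1.0j
K_hat_projected = WH @ (s**2 * M + s * D + K) @ V
K_hat_structured = s**2 * Mr + s * Dr + Kr
print(np.allclose(K_hat_projected, K_hat_structured))
```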

We will construct the model reduction bases W and V such that the reduced-order subsystem transfer functions \(\widehat{G}_{k}\) in (7) are multivariate (tangential) interpolants to the full-order ones \(G_{k}\) at some selected frequencies \(\{\sigma _{1}, \sigma _{2}, \ldots , \sigma _{k}\} \subset \mathbb {C}\). Below, we will make precise what we mean by tangential interpolation in this setting. But first, it is worth noting the dimension of \(G_{k}\). For a MIMO bilinear system with m inputs and p outputs, \(G_{k}\), evaluated at given frequencies, is a \(p \times m^{k}\) matrix, i.e., its size grows polynomially in the input dimension m and exponentially in the level k. Hence, full matrix interpolation of \(G_{k}\) by \(\widehat{G}_{k}\) imposes a rather large number of interpolation conditions, leading to rapid growth of the reduced order. We will resolve this issue via tangential interpolation. With respect to the number of interpolation points and interpolated subsystem transfer function levels, tangential interpolation in the new framework reduces the dimensional growth of reduced-order models to scale only linearly rather than exponentially as for matrix interpolation. It will help to recall the tangential interpolation problem for the linear case first.

2.2 Tangential interpolation for linear dynamical systems

Tangential interpolation replaces the full matrix interpolation of a matrix-valued function with interpolation along selected directions and can be interpreted as adding constraints to the matrix interpolation problem [6]. For given interpolation points \(\sigma _{1}, \ldots , \sigma _{k} \in \mathbb {C}\), given function values \(h_{1}, \ldots , h_{k} \in \mathbb {C}^{p}\) and right tangential directions \(b_{1}, \ldots , b_{k} \in \mathbb {C}^{m}\), the task of right tangential interpolation is to find an interpolating function \(H:\mathbb {C}\rightarrow \mathbb {C}^{p \times m}\) such that

$$\begin{aligned} \begin{aligned} H(\sigma _{j}) b_{j}&= h_{j},&\text {for } j = 1, \ldots , k. \end{aligned} \end{aligned}$$
(8)

The left interpolation problem is defined similarly.

It was then proposed in [6] and utilized in [22] to employ tangential interpolation for model reduction of linear unstructured multi-input/multi-output systems by restricting the interpolant (8) to a rational matrix-valued function and using the system’s transfer function evaluations along certain directions as function values to interpolate. In other words, given the original linear system’s transfer function \(G(s) = C (sE - A)^{-1} B\), the goal is to construct a reduced-order system with transfer function \(\widehat{G}(s) = \widehat{C} (s \widehat{E} - \widehat{A})^{-1} \widehat{B}\) such that for given interpolation points \(\sigma _{1}, \ldots , \sigma _{k} \in \mathbb {C}\) and directions \(b^{(1)}, \ldots , b^{(k)} \in \mathbb {C}^{m}\) as well as \(c^{(1)}, \ldots , c^{(k)} \in \mathbb {C}^{p}\), the right or left tangential interpolation conditions

$$\begin{aligned} \begin{aligned} G(\sigma _{j}) b^{(j)}&= \widehat{G}(\sigma _{j}) b^{(j)}{} & {} \text {or}&\big ( c^{(j)} \big )^{\textsf{H}} G(\sigma _{j})&= \big ( c^{(j)} \big )^{\textsf{H}} \widehat{G}(\sigma _{j}), \end{aligned} \end{aligned}$$
(9)

for \(j = 1, \ldots , k\), hold. It has been shown via numerous examples that tangential interpolation yields accurate reduced-order models while allowing the size of the reduced-order model to be chosen independently of the input and output dimensions (unlike in the matrix interpolation framework), and thus results in smaller reduced-order models. Indeed, tangential interpolation, not matrix interpolation, forms the necessary conditions for optimal model reduction of linear systems in the \(\mathcal {H}_{2}\) norm [3]. The tangential interpolation problem (9) (and the projection-based solution framework) was later extended in [8] to structure-preserving Hermite interpolation of structured transfer functions of the form \(G(s) = \mathcal {C}(s) \mathcal {K}(s)^{-1} \mathcal {B}(s)\).
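The right tangential conditions in (9) can be enforced by projection: if \((\sigma _{j} E - A)^{-1} B b^{(j)}\) lies in the span of V and the reduced pencil is invertible at \(\sigma _{j}\), then \(G(\sigma _{j}) b^{(j)} = \widehat{G}(\sigma _{j}) b^{(j)}\). The following Python/NumPy sketch checks this numerically for a hypothetical random system; the test matrices, the interpolation points, the directions and the one-sided choice \(W = V\) are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, p = 8, 3, 2

# Hypothetical linear first-order test system.
E = np.eye(n)
A = -np.diag(np.arange(1.0, n + 1.0)) + 0.1 * rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))

def transfer(s, E, A, B, C):
    """Evaluate C (sE - A)^{-1} B."""
    return C @ np.linalg.solve(s * E - A, B)

sigmas = [0.5, 1.0 + 2.0j, 3.0]                      # interpolation points
dirs = [rng.standard_normal(m) for _ in sigmas]      # right directions b^(j)

# One basis vector (sigma_j E - A)^{-1} B b^(j) per interpolation condition.
V = np.column_stack([np.linalg.solve(sig * E - A, B @ b)
                     for sig, b in zip(sigmas, dirs)])
V, _ = np.linalg.qr(V)                               # orthonormalize, same span
W = V                                                # one-sided (Galerkin) choice
WH = W.conj().T

Er, Ar, Br, Cr = WH @ E @ V, WH @ A @ V, WH @ B, C @ V

errors = [np.linalg.norm(transfer(sig, E, A, B, C) @ b
                         - transfer(sig, Er, Ar, Br, Cr) @ b)
          for sig, b in zip(sigmas, dirs)]
print(V.shape, max(errors))                          # errors at round-off level
```

The reduced order equals the number of interpolation conditions (three here), independent of m and p.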

2.3 Blockwise tangential interpolation for unstructured bilinear systems

Extending tangential interpolation to unstructured bilinear systems of the form (2) was first considered in [10, 35], using the observation that multiplying out the Kronecker products in (4) yields

$$\begin{aligned} G_{k}(s_{1}, \ldots , s_{k})&= \big [ C(s_{k} E - A)^{-1} N_{1} \cdots N_{1} (s_{1} E - A)^{-1} B,\\&\qquad C(s_{k} E - A)^{-1} N_{1} \cdots N_{2} (s_{1} E - A)^{-1} B,\\&\qquad \ldots ,\\&\qquad C(s_{k} E - A)^{-1} N_{m} \cdots N_{m} (s_{1} E - A)^{-1} B \big ]. \end{aligned}$$

Each block entry in this formula is then considered as a separate transfer function, which will be interpolated along the same chosen directions. For example, with a right tangential direction \(b \in \mathbb {C}^{m}\), the blockwise evaluation of the transfer function along b is given by

$$\begin{aligned} G_{k}(s_{1}, \ldots , s_{k}) (I_{m^{k - 1}} \otimes b)&= \big [ C(s_{k} E - A)^{-1} N_{1} \cdots N_{1} (s_{1} E - A)^{-1} Bb,\\&\qquad C(s_{k} E - A)^{-1} N_{1} \cdots N_{2} (s_{1} E - A)^{-1} Bb,\\&\qquad \ldots ,\\&\qquad C(s_{k} E - A)^{-1} N_{m} \cdots N_{m} (s_{1} E - A)^{-1} Bb \big ], \end{aligned}$$

leading to the concept of the blockwise tangential interpolation problem: Given interpolation points \(\sigma _{1}, \ldots , \sigma _{k} \in \mathbb {C}\) and tangential directions \(b \in \mathbb {C}^{m}\) and \(c \in \mathbb {C}^{p}\), find a reduced-order model such that

$$\begin{aligned} G_{k}(\sigma _{1}, \ldots , \sigma _{k}) (I_{m^{k - 1}} \otimes b)&= \widehat{G}_{k}(\sigma _{1}, \ldots , \sigma _{k}) (I_{m^{k - 1}} \otimes b) \quad \text {or} \end{aligned}$$
(10)
$$\begin{aligned} c^{\textsf{H}} G_{k}(\sigma _{1}, \ldots , \sigma _{k})&= c^{\textsf{H}} \widehat{G}_{k}(\sigma _{1}, \ldots , \sigma _{k}) \end{aligned}$$
(11)

holds. Also, the bi-tangential interpolation condition,

$$\begin{aligned} c^{\textsf{H}} G_{k}(\sigma _{1}, \ldots , \sigma _{k}) (I_{m^{k - 1}} \otimes b)&= c^{\textsf{H}} \widehat{G}_{k}(\sigma _{1}, \ldots , \sigma _{k}) (I_{m^{k - 1}} \otimes b), \end{aligned}$$
(12)

will be of high interest in the bilinear system case. While, in principle, (10) and (11) imply (12), we will see later that it is possible to match subsystem transfer functions of higher level k in the sense of (12) by enforcing (10) and (11) on lower level transfer functions.

The blockwise tangential interpolation problem can be viewed as a mixture of tangential interpolation, as it is done for the linear system case (9), combined with the blocks of multivariate transfer functions. While in the case of linear systems, the tangential interpolation restricts the problem to vectors or scalars of fixed sizes to be interpolated, this is not true anymore for the blockwise approach in the bilinear system case. As already observed in [10, 35], the blockwise approach still leads to the interpolation of an exponentially increasing number of vectors or matrices, making it only marginally better than the matrix interpolation method for model reduction.
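To make the growth of the blockwise data concrete, the following Python/NumPy sketch assembles the left-hand side of (10) block by block, looping over all index tuples of the bilinear terms. The helper name `blockwise_right` and the random test system are hypothetical and serve only as an illustration of the formula above.

```python
import itertools
import numpy as np

def blockwise_right(E, A, Ns, B, C, s, b):
    """Columns of G_k(s_1, ..., s_k) (I_{m^{k-1}} kron b), one per block."""
    k, m = len(s), len(Ns)
    v0 = np.linalg.solve(s[0] * E - A, B @ b)        # (s_1 E - A)^{-1} B b
    cols = []
    # The leftmost N index varies slowest, as in the displayed block order.
    for idx in itertools.product(range(m), repeat=k - 1):
        v = v0
        # Apply the factors from right to left: N_i, then (s_l E - A)^{-1}.
        for i, z in zip(reversed(idx), s[1:]):
            v = np.linalg.solve(z * E - A, Ns[i] @ v)
        cols.append(C @ v)
    return np.column_stack(cols)

# Hypothetical small unstructured bilinear system as in (2).
rng = np.random.default_rng(2)
n, m, p = 6, 2, 2
E = np.eye(n)
A = -np.diag(np.arange(1.0, n + 1.0)) + 0.1 * rng.standard_normal((n, n))
Ns = [rng.standard_normal((n, n)) for _ in range(m)]
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
b = rng.standard_normal(m)
sigma = (1.0, 2.0, 3.0)

for k in (1, 2, 3):
    block = blockwise_right(E, A, Ns, B, C, sigma[:k], b)
    print(k, block.shape)                            # shape is (p, m**(k-1))
```

The number of columns to be matched is \(m^{k-1}\), so it still grows exponentially with the level k.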

2.4 Notation

To simplify notation in this work, we will use:

$$\begin{aligned} \partial _{s_{1}^{j_{1}} \cdots s_{k}^{j_{k}}} f(z_{1}, \ldots , z_{k})&:= \frac{\partial ^{j_{1} + \cdots + j_{k}} f}{\partial s_{1}^{j_{1}} \cdots \partial s_{k}^{j_{k}}} (z_{1}, \ldots , z_{k}) \end{aligned}$$
(13)

to denote the differentiation of an analytic function \(f:\mathbb {C}^{k} \rightarrow \mathbb {C}^{\ell }\) with respect to the complex variables \(s_{1}, \ldots , s_{k}\) and evaluated at \(z_{1}, \ldots , z_{k} \in \mathbb {C}\). We denote for matrix-valued functions \(\mathcal {K}:\mathbb {C}\rightarrow \mathbb {C}^{n \times n}\), which map complex scalars onto square matrices, the inverse of their evaluation by \(\mathcal {K}^{-1}:= \mathcal {K}(.)^{-1}\). This notation of the inverse of evaluated matrix-valued functions will occur together with the notation of partial derivatives (13). For example, given two matrix-valued functions \(\mathcal {B}:\mathbb {C}\rightarrow \mathbb {C}^{n \times m}\) and \(\mathcal {K}:\mathbb {C}\rightarrow \mathbb {C}^{n \times n}\), we denote the partial derivative of the product of \(\mathcal {B}\) with the inverse of \(\mathcal {K}\) evaluated in the points \(z_{1}, z_{2} \in \mathbb {C}\) by

$$\begin{aligned} \partial _{s_{1}^{j_{1}} s_{2}^{j_{2}}} (\mathcal {K}^{-1} \mathcal {B}) (z_{1}, z_{2})&:= \frac{\partial ^{j_{1} + j_{2}} \mathcal {K}(.)^{-1} \mathcal {B}(.)}{\partial s_{1}^{j_{1}} \partial s_{2}^{j_{2}}} (z_{1}, z_{2}). \end{aligned}$$

Also, we will use the notion of the Jacobi matrix given by

$$\begin{aligned} \nabla f&= \begin{bmatrix} \partial _{s_{1}} f&\ldots&\partial _{s_{k}} f \end{bmatrix}, \end{aligned}$$
(14)

denoting the concatenation of all partial derivatives of an analytic function \(f:\mathbb {C}^{k} \rightarrow \mathbb {C}^{\ell }\) with respect to the complex variables \(s_{1}, \ldots , s_{k}\).

For bilinear systems, we have already introduced the notation

$$\begin{aligned} \mathcal {N}(s)&= \begin{bmatrix} \mathcal {N}_{1}(s)&\ldots&\mathcal {N}_{m}(s) \end{bmatrix} \end{aligned}$$

to denote horizontally concatenated matrix functions corresponding to the bilinear terms. Additionally, we use

$$\begin{aligned} \widetilde{\mathcal {N}}(s)&= \begin{bmatrix} \mathcal {N}_{1}(s) \\ \vdots \\ \mathcal {N}_{m}(s) \end{bmatrix} \end{aligned}$$

for denoting the vertical concatenation of the matrix functions corresponding to the bilinear terms. We denote the vector of ones of length m by \(\mathbb {1}_{m}\).

3 Generalized structured tangential interpolation framework

In this section, we will start with two different interpretations of tangential interpolation (9) and their corresponding interpolation problems for bilinear systems. Motivated by these formulations, we introduce a unifying framework for tangential interpolation of structured bilinear systems and give subspace conditions for structure-preserving model reduction of the corresponding bilinear systems. As a special case of the unifying framework, we derive the theory for structure-preserving blockwise tangential interpolation as reviewed in Sect. 2.3 and previously employed in the literature for standard (unstructured) bilinear systems.

3.1 Tangential interpolation in the frequency domain

Examining the original formulation of tangential interpolation (8) and the multivariate transfer functions (5), a first natural approach to tangential interpolation for bilinear systems would be to choose an appropriately sized vector \(\hat{b}\in \mathbb {C}^{m^{k}}\), where

$$\begin{aligned} \hat{b}&= \begin{bmatrix} \left( \hat{b}^{(1 1 \ldots 1)} \right) ^{\textsf{H}}&\left( \hat{b}^{(2 1 \ldots 1)} \right) ^{\textsf{H}}\cdots&\left( \hat{b}^{\left( m m \ldots m \right) } \right) ^{\textsf{H}} \end{bmatrix}^{\textsf{H}} \end{aligned}$$

and \(\hat{b}^{(j_{1} \ldots j_{k})} \in \mathbb {C}^{m}\) for all \(1 \le j_{1}, \ldots , j_{k} \le m\), as right tangential direction and to consider interpolating

$$\begin{aligned} \begin{aligned} G_{k}(s_{1}, \ldots , s_{k}) \hat{b}&= \sum \limits _{j_{1} = 1}^{m} \cdots \sum \limits _{j_{k-1} = 1}^{m} \mathcal {C}(s_{k}) \mathcal {K}(s_{k})^{-1} \mathcal {N}_{j_{k-1}}(s_{k-1}) \mathcal {K}(s_{k-1})^{-1} \\&\quad {}\times {} \cdots \times \mathcal {N}_{j_{1}}(s_{1}) \mathcal {K}(s_{1})^{-1} \mathcal {B}(s_{1}) \hat{b}^{(j_{1} \ldots j_{k})}. \end{aligned} \end{aligned}$$
(15)

This general approach comes along with a computational drawback. For every new transfer function level k, a different part of \(\hat{b}\) is multiplied with the matrix-valued function \(\mathcal {B}(s)\) in each term of the sum (15). Then, the corresponding basis for model reduction would grow according to the different block entries of \(\hat{b}\) and, thus, even faster than for the blockwise tangential interpolation problem (Sect. 2.3). A remedy to this problem is to restrict the full direction vector \(\hat{b}\) to the repetition of an m-dimensional direction \(b \in \mathbb {C}^{m}\), i.e.,

$$\begin{aligned} \hat{b}= \mathbb {1}_{m^{k-1}} \otimes b = \begin{bmatrix} b \\ \vdots \\ b \end{bmatrix}. \end{aligned}$$
(16)

With this particular choice of \(\hat{b}\) in (16), the right tangential interpolation problem can be written as

$$\begin{aligned} G_{k}(\sigma _{1}, \ldots , \sigma _{k}) (\mathbb {1}_{m^{k-1}} \otimes b)&= \widehat{G}_{k}(\sigma _{1}, \ldots , \sigma _{k}) (\mathbb {1}_{m^{k-1}} \otimes b), \end{aligned}$$
(17)

for given interpolation points \(\sigma _{1}, \ldots , \sigma _{k} \in \mathbb {C}\). This restricts the interpolation problem to a vector of constant length with respect to the transfer function level and thus allows for an efficient construction of the projection basis. It is not clear how much approximation accuracy in the reduced-order model is lost by this specific choice of \(\hat{b}\) in (16). The most unrestricted type of interpolation is given by the matrix interpolation approach, and in Sect. 4, we will present numerical comparisons.

For the left tangential interpolation problem, a direct extension of the classical approach (8) would lead to the same results as in the blockwise tangential interpolation case (11) since the first dimension of the transfer function is constant for all transfer function levels. To consider a dual formulation of (15) for the left tangential interpolation problem (one for which the basis dimension does not grow exponentially), we choose

$$\begin{aligned} c^{\textsf{H}} G_{k}(\sigma _{1}, \ldots , \sigma _{k}) (\mathbb {1}_{m^{k-1}} \otimes I_{m})&= c^{\textsf{H}} \widehat{G}_{k}(\sigma _{1}, \ldots , \sigma _{k}) (\mathbb {1}_{m^{k-1}} \otimes I_{m}), \end{aligned}$$
(18)

for a given direction \(c \in \mathbb {C}^{p}\) and interpolation points \(\sigma _{1}, \ldots , \sigma _{k} \in \mathbb {C}\). Consequently, we consider

$$\begin{aligned} c^{\textsf{H}} G_{k}(\sigma _{1}, \ldots , \sigma _{k}) (\mathbb {1}_{m^{k-1}} \otimes b)&= c^{\textsf{H}} \widehat{G}_{k}(\sigma _{1}, \ldots , \sigma _{k}) (\mathbb {1}_{m^{k-1}} \otimes b) \end{aligned}$$
(19)

as the bi-tangential interpolation problem.
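The following small Python sketch tabulates the sizes of the quantities matched at level k by the different interpolation conditions, based only on the dimensions stated above (\(G_{k}\) is \(p \times m^{k}\)). The function name is hypothetical, and the table counts interpolated data only; it makes no statement about the resulting reduced order, which depends on the subspace conditions.

```python
def interpolation_data_sizes(p, m, k):
    """(rows, columns) of the quantity matched at transfer function level k."""
    return {
        "matrix interpolation of G_k": (p, m ** k),
        "blockwise right (10)": (p, m ** (k - 1)),
        "blockwise left (11)": (1, m ** k),
        "blockwise bi-tangential (12)": (1, m ** (k - 1)),
        "right (17)": (p, 1),
        "left (18)": (1, m),
        "bi-tangential (19)": (1, 1),
    }

# Example with p = 3 outputs and m = 4 inputs.
for k in range(1, 5):
    sizes = interpolation_data_sizes(p=3, m=4, k=k)
    print(k, sizes["matrix interpolation of G_k"],
          sizes["blockwise right (10)"], sizes["right (17)"])
```

Only the conditions (17)–(19) lead to data of constant size with respect to the level k.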

3.2 Time domain interpretation of tangential interpolation

A different way to look at tangential interpolation of transfer functions is its interpretation in the time domain. We start with the tangential interpolation problem for linear dynamical systems (9). For simplicity, we consider only the case of linear unstructured first-order systems as given in the time domain by

$$\begin{aligned} \begin{aligned} E {\dot{x}}(t)&= A x(t) + B u(t),\\ y(t)&= C x(t), \end{aligned} \end{aligned}$$
(20)

with \(E, A \in \mathbb {R}^{n \times n}\), \(B \in \mathbb {R}^{n \times m}\) and \(C \in \mathbb {R}^{p \times n}\), and in the frequency domain by the transfer function

$$\begin{aligned} G(s)&= C (sE - A)^{-1} B. \end{aligned}$$

We note that the following derivations work for all structured linear systems as well [8]. The multiplication with tangential directions in the frequency domain can be considered independent of the chosen interpolation points, which gives new systems in the frequency domain described by the transfer functions

$$\begin{aligned} \begin{aligned} \widetilde{G}_{\textrm{b}}(s)&= G(s) b{} & {} \text {and}&\widetilde{G}_{\textrm{c}}(s)&= c^{\textsf{H}} G(s), \end{aligned} \end{aligned}$$
(21)

with the tangential directions \(b \in \mathbb {C}^{m}\) and \(c \in \mathbb {C}^{p}\). Those new systems (21) allow now for re-interpretation in the time domain. In fact, the resulting tangential systems can be seen as embedding the original linear system G into single-input or single-output systems. We set the outer inputs and outputs as \(u(t) = b \tilde{u}(t)\) and \(\tilde{y}(t) = c^{\textsf{H}} y(t)\), respectively, and obtain the new systems:

$$\begin{aligned} \widetilde{G}_{\textrm{b}}: \left\{ \begin{aligned} E {\dot{x}}(t)&= A x(t) + B b \tilde{u}(t), \\ y(t)&= C x(t), \end{aligned} \right. \end{aligned}$$
(22)

for embedding the inputs, and

$$\begin{aligned} \widetilde{G}_{\textrm{c}}: \left\{ \begin{aligned} E {\dot{x}}(t)&= A x(t) + B u(t), \\ \tilde{y}(t)&= c^{\textsf{H}} C x(t), \end{aligned} \right. \end{aligned}$$
(23)

for the outputs. Thereby, in the setting of tangential interpolation, we are restricting the system inputs to a single input signal that is spread along a given direction b to be fed into the original system (20) or we restrict the output to a linear combination of the observations of the original system (20) using the direction c.

Now, we consider the bilinear unstructured systems (2) and make use of the time domain interpretation of tangential interpolation we have done for the linear case (22) and (23). Using the same tangential directions as before and the embedding strategy for the bilinear system (2), with \(b = \begin{bmatrix} b_{1}&b_{2}&\ldots&b_{m}\end{bmatrix}^{\textsf{T}}\) we obtain

$$\begin{aligned} \widetilde{G}_{\textrm{b}}: \left\{ \begin{aligned} E {\dot{x}}(t)&= A x(t) + \sum \limits _{j = 1}^{m}N_{j} x(t) b_{j} \tilde{u}(t) + B b \tilde{u}(t), \\ y(t)&= C x(t), \end{aligned} \right. \end{aligned}$$
(24)

for the embedded inputs,

$$\begin{aligned} \widetilde{G}_{\textrm{c}}: \left\{ \begin{aligned} E {\dot{x}}(t)&= A x(t) + \sum \limits _{j = 1}^{m}N_{j} x(t) u_{j}(t) + B u(t), \\ \tilde{y}(t)&= c^{\textsf{H}} C x(t), \end{aligned} \right. \end{aligned}$$
(25)

for embedding the outputs. Additionally, we consider here the fully embedded system

$$\begin{aligned} \widetilde{G}_{\textrm{cb}}: \left\{ \begin{aligned} E {\dot{x}}(t)&= A x(t) + \sum \limits _{j = 1}^{m}N_{j} x(t) b_{j} \tilde{u}(t) + B b \tilde{u}(t), \\ \tilde{y}(t)&= c^{\textsf{H}} C x(t), \end{aligned} \right. \end{aligned}$$
(26)

as it relates to the bi-tangential interpolation problem. These new bilinear systems (24)–(26) will be used to derive a new concept of tangential interpolation for bilinear systems. The corresponding regular transfer functions for the embedded systems are given as follows:

$$\begin{aligned} \widetilde{G}_{\textrm{b}, k}(s_{1}, \ldots , s_{k})&= C(s_{k} E - A)^{-1} \left( \prod \limits _{j = 1}^{k-1} \left( \sum \limits _{i = 1}^{m} b_{i} N_{i} \right) (s_{k-j} E - A)^{-1} \right) B b, \end{aligned}$$
(27)
$$\begin{aligned} \widetilde{G}_{\textrm{c}, k}(s_{1},\ldots ,s_{k})&= c^{\textsf{H}} C(s_{k}E - A)^{-1} \left( \prod \limits _{j = 1}^{k-1} (I_{m^{j-1}} \otimes N) (I_{m^{j}} \otimes (s_{k-j}E - A)^{-1}) \right) \\&\quad {}\times {} (I_{m^{k-1}} \otimes B), \end{aligned}$$
(28)
$$\begin{aligned} \widetilde{G}_{\textrm{cb},k}(s_{1}, \ldots , s_{k})&= c^{\textsf{H}} C(s_{k} E - A)^{-1} \left( \prod \limits _{j = 1}^{k-1} \left( \sum \limits _{i = 1}^{m} b_{i} N_{i} \right) (s_{k-j} E - A)^{-1} \right) B b, \end{aligned}$$
(29)

for \(k \ge 1\). These new transfer functions (27)–(29) can now be combined with our structured transfer function setting (5). For a given direction vector \(b \in \mathbb {C}^{m}\), we denote the scaled summation of the structured multivariate transfer functions by

$$\begin{aligned} \begin{aligned} \widetilde{G}_{k}(s_{1}, \ldots , s_{k})&= \mathcal {C}(s_{k}) \mathcal {K}(s_{k})^{-1} \left( \sum \limits _{j = 1}^{m} b_{j} \mathcal {N}_{j}(s_{k-1}) \right) \mathcal {K}(s_{k-1})^{-1} \\&\quad {}\times \cdots \times {} \left( \sum \limits _{j = 1}^{m} b_{j} \mathcal {N}_{j}(s_{1}) \right) \mathcal {K}(s_{1})^{-1} \mathcal {B}(s_{1}). \end{aligned} \end{aligned}$$
(30)

The bilinear terms in (30) collapse from the large concatenated matrices in (5) to simple \(n \times n\) matrices. Therefore, the Kronecker products become classical matrix multiplications such that \(\widetilde{G}_{k}:\mathbb {C}^{k} \rightarrow \mathbb {C}^{p \times m}\).

Denoting the scaled and summed transfer function of the reduced-order model by \(\widehat{\widetilde{G}}_{k}(s_{1}, \ldots , s_{k})\), the corresponding right tangential interpolation problem is given by

$$\begin{aligned} \widetilde{G}_{k}(\sigma _{1}, \ldots , \sigma _{k}) b&= \widehat{\widetilde{G}}_{k}(\sigma _{1}, \ldots , \sigma _{k}) b, \end{aligned}$$
(31)

for given interpolation points \(\sigma _{1}, \ldots , \sigma _{k} \in \mathbb {C}\). As before, motivated by duality, the left and bi-tangential interpolation problems are chosen to be

$$\begin{aligned} c^{\textsf{H}} \widetilde{G}_{k}(\sigma _{1}, \ldots , \sigma _{k})&= c^{\textsf{H}} \widehat{\widetilde{G}}_{k}(\sigma _{1}, \ldots , \sigma _{k}) \quad \text {and} \end{aligned}$$
(32)
$$\begin{aligned} c^{\textsf{H}} \widetilde{G}_{k}(\sigma _{1}, \ldots , \sigma _{k}) b&= c^{\textsf{H}} \widehat{\widetilde{G}}_{k}(\sigma _{1}, \ldots , \sigma _{k}) b, \end{aligned}$$
(33)

respectively.

Remark 2

(Relation to other control systems) The idea of a time domain interpretation of tangential interpolation can easily be extended to other types of control systems, e.g., to systems with polynomial nonlinearities. This might lead to new efficient tangential interpolation approaches for nonlinear multi-input/multi-output control systems.

3.3 Structured tangential interpolation framework

Now, employing the scaled and summed transfer functions introduced in (30), we develop a generalized framework for tangential interpolation of multivariate transfer functions that unifies the different approaches to bilinear tangential interpolation discussed in Sects. 2.3, 3.1 and 3.2. The new framework encompasses all these interpretations of bilinear tangential interpolation in a single formulation, filling an important gap in the interpolatory model reduction theory of bilinear systems. Moreover, we develop this new framework for structured bilinear dynamical systems, for which tangential interpolation has not been studied yet.

We start by defining the modified multivariate transfer functions

$$\begin{aligned} \begin{aligned}&\varvec{\textsf{G}}_{k}(s_{1}, \ldots , s_{k}\,|\,d^{(1)}, \ldots , d^{(k-1)})\\&\quad := \mathcal {C}(s_{k}) \mathcal {K}(s_{k})^{-1} \left( \prod \limits _{j = 1}^{k-1} \varvec{\textsf{N}}(s_{k-j}\,|\,d^{(k-j)}) \mathcal {K}(s_{k-j})^{-1} \right) \mathcal {B}(s_{1}), \end{aligned} \end{aligned}$$
(34)

for \(k \ge 1\), with frequency points \(s_{1}, \ldots , s_{k} \in \mathbb {C}\) and scaling vectors \(d^{(1)}, \ldots , d^{(k-1)} \in \mathbb {C}^{m}\), where

$$\begin{aligned} \varvec{\textsf{N}}(s_{j}\,|\,d^{(j)})&= \mathcal {N}(s_{j}) (d^{(j)} \otimes I_{n}) = \sum \limits _{i = 1}^{m} d^{(j)}_{i} \mathcal {N}_{i}(s_{j}) \end{aligned}$$

denotes the scaled sum of the bilinear terms. Note that the first modified transfer function does not depend on a scaling vector and it holds that

$$\begin{aligned} G_{1}(s_{1})&= \varvec{\textsf{G}}_{1}(s_{1}). \end{aligned}$$
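The definitions above can be checked numerically. The following is a minimal sketch, assuming the unstructured special case \(\mathcal {K}(s) = sE - A\), \(\mathcal {N}_{i}(s) = N_{i}\), \(\mathcal {B}(s) = B\), \(\mathcal {C}(s) = C\) with randomly generated matrices; the names `scaled_N` and `modified_tf` and all sizes are hypothetical and chosen only for illustration. It verifies the identity \(\mathcal {N}(d \otimes I_{n}) = \sum _{i} d_{i} \mathcal {N}_{i}\) and evaluates (34) by successive linear solves.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, p = 6, 3, 2  # illustrative sizes (assumption)

# Hypothetical unstructured test system: K(s) = s*E - A, N_i(s) = N_i,
# B(s) = B, C(s) = C.
E = np.eye(n)
A = -2.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))
Ns = [rng.standard_normal((n, n)) for _ in range(m)]
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))


def scaled_N(d):
    """Scaled sum of the bilinear terms, sum_i d_i * N_i."""
    return sum(di * Ni for di, Ni in zip(d, Ns))


def modified_tf(s, ds):
    """Evaluate (34) at s = [s_1, ..., s_k] with ds = [d^(1), ..., d^(k-1)]."""
    X = np.linalg.solve(s[0] * E - A, B)  # K(s_1)^{-1} B(s_1)
    for s_j, d_prev in zip(s[1:], ds):
        # apply K(s_j)^{-1} N(s_{j-1} | d^(j-1)) to the previous factor
        X = np.linalg.solve(s_j * E - A, scaled_N(d_prev) @ X)
    return C @ X


d = rng.standard_normal(m) + 1j * rng.standard_normal(m)
N_cat = np.hstack(Ns)  # N = [N_1, ..., N_m]
kron_form = N_cat @ np.kron(d.reshape(-1, 1), np.eye(n))  # N (d kron I_n)
print(np.allclose(kron_form, scaled_N(d)))

# The first modified transfer function has no scaling vector: G_1 = C K^{-1} B.
s1 = 1.0 + 2.0j
G1 = modified_tf([s1], [])
print(np.allclose(G1, C @ np.linalg.solve(s1 * E - A, B)))

G3 = modified_tf([1.0, 2.0, 3.0], [d, d])
print(G3.shape)  # a p-by-m matrix, independent of the level k
```

Evaluating (34) by a chain of solves avoids forming any Kronecker product explicitly, which reflects the observation that the modified transfer functions map into \(\mathbb {C}^{p \times m}\) for every level k.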

In this setting, \(\varvec{\mathsf {\widehat{G}}}_{k}(s_{1}, \ldots , s_{k}\,|\,d^{(1)}, \ldots , d^{(k-1)})\) denotes the modified transfer functions of the reduced-order model. For the modified transfer functions, we define the following tangential interpolation problem:

Problem 1

(Tangential modified transfer function interpolation) For given interpolation points \(\sigma _{1}, \ldots , \sigma _{k} \in \mathbb {C}\), scaling vectors \(d^{(1)}, \ldots , d^{(k-1)} \in \mathbb {C}^{m}\), and tangential directions \(b \in \mathbb {C}^{m}\) and \(c \in \mathbb {C}^{p}\), find a reduced-order model such that

$$\begin{aligned} \varvec{\textsf{G}}_{k}(\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)}) b&= \varvec{\mathsf {\widehat{G}}}_{k}(\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)}) b, \end{aligned}$$
(35)
$$\begin{aligned} c^{\textsf{H}} \varvec{\textsf{G}}_{k}(\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)})&= c^{\textsf{H}} \varvec{\mathsf {\widehat{G}}}_{k}(\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)}), \quad \text {or} \end{aligned}$$
(36)
$$\begin{aligned} c^{\textsf{H}} \varvec{\textsf{G}}_{k}(\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)}) b&= c^{\textsf{H}} \varvec{\mathsf {\widehat{G}}}_{k}(\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)}) b \end{aligned}$$
(37)

hold.

Before we present our results that show how to construct the reduced bilinear systems to solve the structure-preserving tangential interpolation problem in the new generalized framework, we formally state in the following corollary that the earlier bilinear tangential interpolation frameworks are special cases of the proposed unifying framework. Due to its significance in the literature and its more complex formulation in the unifying framework, the case of blockwise tangential interpolation is treated separately in Sect. 3.4.

Corollary 1

(Choices of the scaling vectors) Consider the proposed tangential interpolation problem (Problem 1) with the corresponding scaling vectors \(d^{(j)}\) in (34). Then:

  1. (a)

    Choosing \(d^{(1)} = \cdots = d^{(k-1)} = \mathbb {1}_{m}\) yields the extension of classical tangential interpolation to the multivariate transfer functions of bilinear systems (17)–(19) from Sect. 3.1.

  2. (b)

    Choosing \(d^{(1)} = \cdots = d^{(k-1)} = b\), with \(b \in \mathbb {C}^{m}\) as the right tangential direction, yields the re-interpretation of tangential interpolation in time domain (31)–(33) from Sect. 3.2.

The following theorem establishes the subspace conditions on the model reduction bases V and W to construct the reduced-order model (6) that satisfies the tangential interpolation conditions (35)–(37).

Theorem 1

(Modified structured tangential interpolation) Let G be a bilinear system, associated with its modified transfer functions \(\varvec{\textsf{G}}_{k}\) in (34), and \(\widehat{G}\) the reduced-order bilinear system, constructed as in (6) with its modified transfer functions \(\varvec{\mathsf {\widehat{G}}}_{k}\). Given sets of interpolation points \(\sigma _{1}, \ldots , \sigma _{k} \in \mathbb {C}\) and \(\varsigma _{1}, \ldots , \varsigma _{\kappa } \in \mathbb {C}\) such that the matrix functions \(\mathcal {C}(s)\), \(\mathcal {K}(s)^{-1}\), \(\mathcal {N}(s)\), \(\mathcal {B}(s)\), \(\widehat{\mathcal {K}}(s)^{-1}\) are defined for \(s \in \{ \sigma _{1}, \ldots , \sigma _{k}, \varsigma _{1}, \ldots , \varsigma _{\kappa } \}\), two tangential directions \(b \in \mathbb {C}^{m}\) and \(c \in \mathbb {C}^{p}\), and two sets of scaling vectors \(d^{(1)}, \ldots , d^{(k-1)} \in \mathbb {C}^{m}\) and \(\delta ^{(1)}, \ldots , \delta ^{(\kappa -1)} \in \mathbb {C}^{m}\), the following statements hold:

  1. (a)

    If V is constructed as

    $$\begin{aligned} v_{1}&= \mathcal {K}(\sigma _{1})^{-1}\mathcal {B}(\sigma _{1})b,\\ v_{j}&= \mathcal {K}(\sigma _{j})^{-1} \varvec{\textsf{N}}(\sigma _{j-1}\,|\,d^{(j-1)}) v_{j-1},&2 \le j \le k,\\ {{\,\textrm{span}\,}}(V)&\supseteq {{\,\textrm{span}\,}}\left( [v_{1}, \ldots , v_{k}]\right) , \end{aligned}$$

    then the following interpolation conditions hold true:

    $$\begin{aligned} \varvec{\textsf{G}}_{1}(\sigma _{1})b&= \varvec{\mathsf {\widehat{G}}}_{1}(\sigma _{1})b,\\ \varvec{\textsf{G}}_{2}(\sigma _{1}, \sigma _{2}\,|\,d^{(1)}) b&= \varvec{\mathsf {\widehat{G}}}_{2}(\sigma _{1}, \sigma _{2}\,|\,d^{(1)}) b,\\&\,\,\, \vdots \\ \varvec{\textsf{G}}_{k}(\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)}) b&= \varvec{\mathsf {\widehat{G}}}_{k}(\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)}) b. \end{aligned}$$
  2. (b)

    If W is constructed as

    $$\begin{aligned} w_{1}&= \mathcal {K}(\varsigma _{\kappa })^{-\textsf{H}}\mathcal {C}(\varsigma _{\kappa })^{\textsf{H}} c,\\ w_{i}&= \mathcal {K}(\varsigma _{\kappa -i+1})^{-\textsf{H}} \varvec{\textsf{N}}(\varsigma _{\kappa -i+1}\,|\,\delta ^{(\kappa -i+1)})^{\textsf{H}} w_{i-1},&2 \le i \le \kappa ,\\ \textrm{span}(W)&\supseteq \textrm{span}\left( [w_{1}, \ldots , w_{\kappa }] \right) , \end{aligned}$$

    then the following interpolation conditions hold true:

    $$\begin{aligned} c^{\textsf{H}} \varvec{\textsf{G}}_{1}(\varsigma _{\kappa })&= c^{\textsf{H}} \varvec{\mathsf {\widehat{G}}}_{1}(\varsigma _{\kappa }),\\ c^{\textsf{H}} \varvec{\textsf{G}}_{2}(\varsigma _{\kappa -1}, \varsigma _{\kappa }\,|\,\delta ^{(\kappa -1)})&= c^{\textsf{H}} \varvec{\mathsf {\widehat{G}}}_{2}(\varsigma _{\kappa -1}, \varsigma _{\kappa }\,|\,\delta ^{(\kappa -1)}), \\&\,\,\, \vdots \\ c^{\textsf{H}} \varvec{\textsf{G}}_{\kappa }(\varsigma _{1}, \ldots , \varsigma _{\kappa }\,|\,\delta ^{(1)}, \ldots , \delta ^{(\kappa -1)})&= c^{\textsf{H}} \varvec{\mathsf {\widehat{G}}}_{\kappa }(\varsigma _{1}, \ldots , \varsigma _{\kappa }\,|\,\delta ^{(1)}, \ldots , \delta ^{(\kappa -1)}). \end{aligned}$$
  3. (c)

    Let V be constructed as in Part (a) and W as in Part (b). Then, additionally to the results in (a) and (b), the following interpolation conditions hold:

    $$\begin{aligned}&c^{\textsf{H}} \varvec{\textsf{G}}_{q + \eta }(\sigma _{1}, \ldots , \sigma _{q}, \varsigma _{\kappa -\eta +1}, \ldots , \varsigma _{\kappa }\,|\,d^{(1)}, \ldots , d^{(q-1)}, z, \delta ^{(\kappa -\eta +1)}, \ldots , \delta ^{(\kappa -1)}) b \\&= c^{\textsf{H}} \varvec{\mathsf {\widehat{G}}}_{q + \eta }(\sigma _{1}, \ldots , \sigma _{q}, \varsigma _{\kappa -\eta +1}, \ldots , \varsigma _{\kappa }\,|\,d^{(1)}, \ldots , d^{(q-1)}, z, \delta ^{(\kappa -\eta +1)}, \ldots , \delta ^{(\kappa -1)}) b, \end{aligned}$$

    for \(1 \le q \le k\), \(1 \le \eta \le \kappa \) and an additional arbitrary scaling vector \(z \in \mathbb {C}^{m}\).

Proof

For brevity of the presentation, we restrict ourselves to proving Part (c) of the theorem. Parts (a) and (b) can be proven analogously using the same projectors constructed in the following. The modified transfer functions of the reduced-order model are given by

$$\begin{aligned} c^{\textsf{H}}&\varvec{\mathsf {\widehat{G}}}_{q + \eta }(\sigma _{1}, \ldots , \sigma _{q}, \varsigma _{\kappa -\eta +1}, \ldots , \varsigma _{\kappa }\,|\,d^{(1)}, \ldots , d^{(q-1)}, z, \delta ^{(\kappa -\eta +1)}, \ldots , \delta ^{(\kappa -1)}) b \\&= \underbrace{c^{\textsf{H}} \widehat{\mathcal {C}}(\varsigma _{\kappa }) \widehat{\mathcal {K}}(\varsigma _{\kappa })^{-1} \left( \prod \limits _{i = 1}^{\eta - 1} \varvec{\mathsf {\widehat{N}}}(\varsigma _{\kappa -i}\,|\,\delta ^{(\kappa -i)}) \widehat{\mathcal {K}}(\varsigma _{\kappa -i})^{-1} \right) }_{ =:\,\hat{w}_{\eta }^{\textsf{H}}} \varvec{\mathsf {\widehat{N}}}(\sigma _{q}\,|\,z) \\&\quad \quad {}\times {} \underbrace{\left( \prod \limits _{j = 0}^{q-2} \widehat{\mathcal {K}}(\sigma _{q-j})^{-1} \varvec{\mathsf {\widehat{N}}}(\sigma _{q-j-1}\,|\,d^{(q-j-1)}) \right) \widehat{\mathcal {K}}(\sigma _{1})^{-1} \widehat{\mathcal {B}}(\sigma _{1}) b}_{ =:\,\hat{v}_{q}}\\&= \hat{w}_{\eta }^{\textsf{H}} \varvec{\mathsf {\widehat{N}}}(\sigma _{q}\,|\,z) \hat{v}_{q}\\&= \hat{w}_{\eta }^{\textsf{H}} W^{\textsf{H}} \varvec{\textsf{N}}(\sigma _{q}\,|\,z) V \hat{v}_{q}, \end{aligned}$$

for \(1 \le q \le k\), \(1 \le \eta \le \kappa \), and an arbitrary vector \(z \in \mathbb {C}^{m}\). The right-most product of the right-hand side can then be rewritten using the construction of V such that

$$\begin{aligned} V \hat{v}_{q}&= V \left( \prod \limits _{j = 0}^{q-3} \widehat{\mathcal {K}}(\sigma _{q-j})^{-1} \varvec{\mathsf {\widehat{N}}}(\sigma _{q-j-1}\,|\,d^{(q-j-1)}) \right) \widehat{\mathcal {K}}(\sigma _{2})^{-1} \varvec{\mathsf {\widehat{N}}}(\sigma _{1}\,|\,d^{(1)}) \\&\quad {}\times {} \widehat{\mathcal {K}}(\sigma _{1})^{-1} \widehat{\mathcal {B}}(\sigma _{1}) b\\&= V \left( \prod \limits _{j = 0}^{q-3} \widehat{\mathcal {K}}(\sigma _{q-j})^{-1} \varvec{\mathsf {\widehat{N}}}(\sigma _{q-j-1}\,|\,d^{(q-j-1)}) \right) \widehat{\mathcal {K}}(\sigma _{2})^{-1} W^{\textsf{H}} \varvec{\textsf{N}}(\sigma _{1}\,|\,d^{(1)})\\&\quad {}\times {} \underbrace{V \widehat{\mathcal {K}}(\sigma _{1})^{-1} W^{\textsf{H}} \mathcal {K}(\sigma _{1})}_{=:\,P_{\textrm{v}_{1}}} \underbrace{\mathcal {K}(\sigma _{1})^{-1} \mathcal {B}(\sigma _{1}) b}_{=\,v_{1}}\\&= V \left( \prod \limits _{j = 0}^{q-3} \widehat{\mathcal {K}}(\sigma _{q-j})^{-1} \varvec{\mathsf {\widehat{N}}}(\sigma _{q-j-1}\,|\,d^{(q-j-1)}) \right) \widehat{\mathcal {K}}(\sigma _{2})^{-1} W^{\textsf{H}} \varvec{\textsf{N}}(\sigma _{1}\,|\,d^{(1)}) v_{1}\\&= \cdots \\&= V \widehat{\mathcal {K}}(\sigma _{q})^{-1} W^{\textsf{H}} \varvec{\textsf{N}}(\sigma _{q-1}\,|\,d^{(q-1)}) v_{q-1}\\&= \underbrace{V \widehat{\mathcal {K}}(\sigma _{q})^{-1} W^{\textsf{H}} \mathcal {K}(\sigma _{q})}_{=:\,P_{\textrm{v}_{q}}} \underbrace{\mathcal {K}(\sigma _{q})^{-1} \varvec{\textsf{N}}(\sigma _{q-1}\,|\,d^{(q-1)}) v_{q-1}}_{=\,v_{q}}\\&= v_{q}, \end{aligned}$$

where \(P_{\textrm{v}_{1}}, \ldots , P_{\textrm{v}_{q}}\) are projectors onto \({{\,\textrm{span}\,}}(V)\), i.e., it holds \(P_{\textrm{v}_{j}} v = v\) for all \(v \in {{\,\textrm{span}\,}}(V)\) and their recursive application gives the identity above. Analogously, one can show that

$$\begin{aligned} W \hat{w}_{\eta }&= w_{\eta }, \end{aligned}$$

where \(w_{1}, \ldots , w_{\eta } \in {{\,\textrm{span}\,}}(W)\). Combining this last equality together with \(V \hat{v}_q = v_q\) yields

$$\begin{aligned}&c^{\textsf{H}} \varvec{\mathsf {\widehat{G}}}_{q + \eta }(\sigma _{1}, \ldots , \sigma _{q}, \varsigma _{\kappa -\eta +1}, \ldots , \varsigma _{\kappa }\,|\,d^{(1)}, \ldots , d^{(q-1)}, z, \delta ^{(\kappa -\eta +1)}, \ldots , \delta ^{(\kappa -1)}) b \\&= \hat{w}_{\eta }^{\textsf{H}} W^{\textsf{H}} \varvec{\textsf{N}}(\sigma _{q}\,|\,z) V \hat{v}_{q}\\&= w_{\eta }^{\textsf{H}} \varvec{\textsf{N}}(\sigma _{q}\,|\,z) v_{q}\\&= c^{\textsf{H}} \varvec{\textsf{G}}_{q + \eta }(\sigma _{1}, \ldots , \sigma _{q}, \varsigma _{\kappa -\eta +1}, \ldots , \varsigma _{\kappa }\,|\,d^{(1)}, \ldots , d^{(q-1)}, z, \delta ^{(\kappa -\eta +1)}, \ldots , \delta ^{(\kappa -1)}) b, \end{aligned}$$

which proves Part (c). \(\square \)
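The subspace conditions of Theorem 1 can be illustrated with a small numerical experiment. The sketch below is not taken from the paper: it assumes the unstructured case \(\mathcal {K}(s) = sE - A\) with constant \(N_{i}\), B, C, a randomly generated test system, \(\kappa = k = 3\), and a reduced-order model obtained by the projection \(W^{\textsf{H}} (\cdot ) V\) as used in the proof. All names (`sN`, `G`, `full`, `red`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, p, k = 8, 2, 2, 3  # hypothetical sizes; kappa = k in this sketch

E = np.eye(n)
A = -2.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))
Ns = [rng.standard_normal((n, n)) for _ in range(m)]
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
full = (E, A, Ns, B, C)


def sN(d, Ns_):
    """Scaled sum of bilinear terms, sum_i d_i * N_i."""
    return sum(di * Ni for di, Ni in zip(d, Ns_))


def G(s, ds, sys):
    """Modified transfer function (34) for K(s) = s*E - A."""
    E_, A_, Ns_, B_, C_ = sys
    X = np.linalg.solve(s[0] * E_ - A_, B_)
    for s_j, d_prev in zip(s[1:], ds):
        X = np.linalg.solve(s_j * E_ - A_, sN(d_prev, Ns_) @ X)
    return C_ @ X


sigma = [1.0, 2.0, 3.0]   # sigma_1, ..., sigma_k
varsig = [1.5, 2.5, 3.5]  # varsigma_1, ..., varsigma_kappa
b, c = rng.standard_normal(m), rng.standard_normal(p)
d = [rng.standard_normal(m) for _ in range(k - 1)]
delta = [rng.standard_normal(m) for _ in range(k - 1)]

# Part (a): v_1 = K(sigma_1)^{-1} B b, v_j = K(sigma_j)^{-1} N(.|d^(j-1)) v_{j-1}
vs = [np.linalg.solve(sigma[0] * E - A, B @ b)]
for j in range(1, k):
    vs.append(np.linalg.solve(sigma[j] * E - A, sN(d[j - 1], Ns) @ vs[-1]))

# Part (b): w_1 = K(varsigma_kappa)^{-H} C^H c, then walk the left points backwards
ws = [np.linalg.solve((varsig[-1] * E - A).conj().T, C.conj().T @ c)]
for i in range(2, k + 1):
    idx = k - i  # zero-based position of varsigma_{kappa-i+1} and delta^(kappa-i+1)
    Kh = (varsig[idx] * E - A).conj().T
    ws.append(np.linalg.solve(Kh, sN(delta[idx], Ns).conj().T @ ws[-1]))

V, _ = np.linalg.qr(np.column_stack(vs))
W, _ = np.linalg.qr(np.column_stack(ws))
WH = W.conj().T
red = (WH @ E @ V, WH @ A @ V, [WH @ Ni @ V for Ni in Ns], WH @ B, C @ V)

ok_a = all(np.allclose(G(sigma[:q], d[:q - 1], full) @ b,
                       G(sigma[:q], d[:q - 1], red) @ b)
           for q in range(1, k + 1))
ok_b = all(np.allclose(c @ G(varsig[k - e:], delta[k - e:], full),
                       c @ G(varsig[k - e:], delta[k - e:], red))
           for e in range(1, k + 1))
z = rng.standard_normal(m)  # arbitrary scaling vector of Part (c)
ok_c = all(np.allclose(
    c @ G(sigma[:q] + varsig[k - e:], d[:q - 1] + [z] + delta[k - e:], full) @ b,
    c @ G(sigma[:q] + varsig[k - e:], d[:q - 1] + [z] + delta[k - e:], red) @ b)
    for q in range(1, k + 1) for e in range(1, k + 1))
print(ok_a, ok_b, ok_c)
```

Since all data are real here, \(c^{\textsf{H}}\) reduces to a plain transpose. Orthonormalizing the columns by a QR factorization only changes the basis, not the spans, so the interpolation conditions are unaffected.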

Remark 3

(Implicit realization of blockwise interpolation) Part (c) of Theorem 1 highlights an interesting interpolation property: The modified bilinear term in the middle between the interpolation by left and right projection allows for a completely arbitrary scaling vector z. In particular, by concatenation of higher-order transfer functions with respect to z, blockwise interpolation conditions hold true corresponding to the centering bilinear term. To further illustrate this point via a simple example, construct \({{\,\textrm{span}\,}}(V)\) and \({{\,\textrm{span}\,}}(W)\) as in Theorem 1 such that \(\varvec{\textsf{G}}_{1}(\sigma ) b\) and \(c^{\textsf{H}} \varvec{\textsf{G}}_{1}(\varsigma )\) are actively interpolated for chosen interpolation points \(\sigma , \varsigma \in \mathbb {C}\), and tangential directions \(b \in \mathbb {C}^{m}\) and \(c \in \mathbb {C}^{p}\). Then, by two-sided projection it holds additionally (Part (c) of Theorem 1) that

$$\begin{aligned} c^{\textsf{H}} \varvec{\textsf{G}}_{2}(\sigma , \varsigma \,|\,z) b&= c^{\textsf{H}} \varvec{\mathsf {\widehat{G}}}_{2}(\sigma , \varsigma \,|\,z) b, \end{aligned}$$

for all \(z \in \mathbb {C}^{m}\). For example, for \(m = 2\), choosing \(z = \begin{bmatrix} 1&0 \end{bmatrix}^{\textsf{T}}\) and \(z = \begin{bmatrix} 0&1 \end{bmatrix}^{\textsf{T}}\) yields the blockwise bi-tangential interpolation condition by concatenation:

$$\begin{aligned} c^{\textsf{H}} G_{2}(\sigma , \varsigma ) (I_{m} \otimes b)&= c^{\textsf{H}} \widehat{G}_{2}(\sigma , \varsigma ) (I_{m} \otimes b). \end{aligned}$$

More details on structure-preserving blockwise tangential interpolation and its relation to the unifying framework are shown later in Sect. 3.4.
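The example of this remark can be reproduced with one-dimensional projection spaces. The following sketch again assumes the unstructured case \(\mathcal {K}(s) = sE - A\) with a random test system and \(m = 2\); the helper names `bitangential_G2` and `blockwise_G2` are hypothetical, and the Kronecker expression used for \(G_{2}\) is the unstructured form analogous to (28).

```python
import numpy as np

rng = np.random.default_rng(2)
n, m, p = 8, 2, 3  # hypothetical sizes; m = 2 as in the example of the remark

E = np.eye(n)
A = -2.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))
Ns = [rng.standard_normal((n, n)) for _ in range(m)]
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
sigma, varsig = 1.0, 2.0
b, c = rng.standard_normal(m), rng.standard_normal(p)

# One right and one left vector: only G_1(sigma) b and c^H G_1(varsig)
# are enforced explicitly.
v = np.linalg.solve(sigma * E - A, B @ b)
w = np.linalg.solve((varsig * E - A).conj().T, C.conj().T @ c)
V = (v / np.linalg.norm(v)).reshape(-1, 1)
W = (w / np.linalg.norm(w)).reshape(-1, 1)
WH = W.conj().T
full = (E, A, Ns, B, C)
red = (WH @ E @ V, WH @ A @ V, [WH @ Ni @ V for Ni in Ns], WH @ B, C @ V)


def bitangential_G2(z, sys):
    """c^H G_2(sigma, varsig | z) b for K(s) = s*E - A."""
    E_, A_, Ns_, B_, C_ = sys
    x = np.linalg.solve(sigma * E_ - A_, B_ @ b)
    Nz = sum(zi * Ni for zi, Ni in zip(z, Ns_))
    return c.conj() @ C_ @ np.linalg.solve(varsig * E_ - A_, Nz @ x)


def blockwise_G2(sys):
    """c^H G_2(sigma, varsig) (I_m kron b) via the concatenated N = [N_1, ..., N_m]."""
    E_, A_, Ns_, B_, C_ = sys
    x = np.linalg.solve(sigma * E_ - A_, B_ @ b).reshape(-1, 1)
    right = np.hstack(Ns_) @ np.kron(np.eye(m), x)
    return c.conj() @ C_ @ np.linalg.solve(varsig * E_ - A_, right)


# Arbitrary (here complex) scaling vector z in the middle bilinear term.
z = rng.standard_normal(m) + 1j * rng.standard_normal(m)
match_z = np.isclose(bitangential_G2(z, full), bitangential_G2(z, red))

# Unit vectors z = e_1, e_2 and concatenation give the blockwise condition.
units = np.eye(m)
concat_full = np.array([bitangential_G2(units[i], full) for i in range(m)])
match_block = (np.allclose(concat_full, blockwise_G2(full))
               and np.allclose(blockwise_G2(full), blockwise_G2(red)))
print(match_z, match_block)
```

In this sketch a reduced-order model of order one already matches the bi-tangential second-level value for every z, which is the implicit blockwise property described above.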

In addition to matching transfer function values, in practice, the interpolation of sensitivities with respect to the frequency points, i.e., partial derivatives, is crucial. The following theorem extends the interpolation results for modified transfer functions to Hermite interpolation.

Theorem 2

(Modified structured tangential Hermite interpolation) Let G be a bilinear system, associated with the modified transfer functions \(\varvec{\textsf{G}}_{k}\) in (34), and \(\widehat{G}\) the reduced-order bilinear system, constructed by (6) with its modified transfer functions \(\varvec{\mathsf {\widehat{G}}}_{k}\). Given sets of interpolation points \(\sigma _{1}, \ldots , \sigma _{k} \in \mathbb {C}\) and \(\varsigma _{1}, \ldots , \varsigma _{\kappa } \in \mathbb {C}\) such that the matrix functions \(\mathcal {C}(s)\), \(\mathcal {K}(s)^{-1}\), \(\mathcal {N}(s)\), \(\mathcal {B}(s), \widehat{\mathcal {K}}(s)^{-1}\) are analytic in \(s \in \{ \sigma _{1}, \ldots , \sigma _{k}, \varsigma _{1}, \ldots , \varsigma _{\kappa } \}\), two tangential directions \(b \in \mathbb {C}^{m}\) and \(c \in \mathbb {C}^{p}\), and two sets of scaling vectors \(d^{(1)}, \ldots , d^{(k-1)} \in \mathbb {C}^{m}\) and \(\delta ^{(1)}, \ldots , \delta ^{(\kappa -1)} \in \mathbb {C}^{m}\), the following statements hold:

  1. (a)

    If V is constructed as

    $$\begin{aligned} v_{1, j_{1}}&= \partial _{s^{j_{1}}} (\mathcal {K}^{-1} \mathcal {B}) (\sigma _{1}) b,&j_{1}&= 0, \ldots , \ell _{1},\\ v_{2, j_{2}}&= \partial _{s^{j_{2}}} \mathcal {K}^{-1} (\sigma _{2}) \partial _{s^{\ell _{1}}} (\varvec{\textsf{N}}(.\,|\,d^{(1)}) \mathcal {K}^{-1} \mathcal {B}) (\sigma _{1}) b,&j_{2}&= 0, \ldots , \ell _{2},\\&\,\,\,\vdots \\ v_{k, j_{k}}&= \partial _{s^{j_{k}}} \mathcal {K}^{-1} (\sigma _{k}) \\&\quad {}\times {} \left( \prod \limits _{j = 1}^{k - 2} \partial _{s^{\ell _{k-j}}} (\varvec{\textsf{N}}(.\,|\,d^{(k-j)}) \mathcal {K}^{-1}) (\sigma _{k-j}) \right) \\&\quad {}\times {} \partial _{s^{\ell _{1}}} \left( \varvec{\textsf{N}}(.\,|\,d^{(1)}) \mathcal {K}^{-1} \mathcal {B}\right) (\sigma _{1}) b,&j_{k}&= 0,\ldots ,\ell _{k},\\ {{\,\textrm{span}\,}}(V)&\supseteq {{\,\textrm{span}\,}}([v_{1,0}, \ldots , v_{k, \ell _{k}}]), \end{aligned}$$

    then the following interpolation conditions hold true:

  2. (b)

    If W is constructed as

    $$\begin{aligned} w_{1,i_{\kappa }}&= \partial _{s^{i_{\kappa }}} \left( \mathcal {K}^{-\textsf{H}} \mathcal {C}^{\textsf{H}}\right) (\varsigma _{\kappa }) c,&i_{\kappa }&= 0,\ldots ,\nu _{\kappa },\\ w_{2,i_{\kappa -1}}&= \partial _{s^{i_{\kappa -1}}}\left( \mathcal {K}^{-\textsf{H}} \varvec{\textsf{N}}(.\,|\,\delta ^{(\kappa -1)})^{\textsf{H}}\right) (\varsigma _{\kappa -1})\\&\quad {}\times {} \partial _{s^{\nu _{\kappa }}} (\mathcal {K}^{-\textsf{H}} \mathcal {C}^{\textsf{H}}) (\varsigma _{\kappa }) c,&i_{\kappa -1}&= 0,\ldots ,\nu _{\kappa -1},\\&\,\,\, \vdots \\ w_{\kappa ,i_{1}}&= \partial _{s^{i_{1}}} \left( \mathcal {K}^{-\textsf{H}} \varvec{\textsf{N}}(.\,|\,\delta ^{(1)})^{\textsf{H}}\right) (\varsigma _{1})\\&\quad {}\times {} \left( \prod \limits _{i = 2}^{\kappa - 1} \partial _{s^{\nu _{i}}} \left( \mathcal {K}^{-\textsf{H}} \varvec{\textsf{N}}(.\,|\,\delta ^{(i)})^{\textsf{H}}\right) (\varsigma _{i}) \right) \\&\quad {}\times {} \partial _{s^{\nu _{\kappa }}} \left( \mathcal {K}^{-\textsf{H}} \mathcal {C}^{\textsf{H}}\right) (\varsigma _{\kappa }) c,&i_{1}&= 0,\ldots ,\nu _{1},\\ \textrm{span}(W)&\supseteq {{\,\textrm{span}\,}}([w_{1,0}, \ldots , w_{\kappa , \nu _{\kappa }}]), \end{aligned}$$

    then the following interpolation conditions hold true:

  3. (c)

    Let V be constructed as in Part (a) and W as in Part (b). Then, additionally to the results in (a) and (b), the following conditions hold:

    $$\begin{aligned}&c^{\textsf{H}} \partial _{s_{1}^{\ell _{1}} \cdots s_{q-1}^{\ell _{q-1}} s_{q}^{j_{q}} s_{q+1}^{i_{\kappa - \eta + 1}} s_{q+2}^{\nu _{\kappa - \eta + 2}} \cdots s_{q + \eta }^{\nu _{\kappa }}} \varvec{\textsf{G}}_{q + \eta } (\sigma _{1}, \ldots , \sigma _{q}, \varsigma _{\kappa -\eta +1}, \ldots , \varsigma _{\kappa } \,|\, \\&\qquad \quad d^{(1)}, \ldots , d^{(q-1)}, z, \delta ^{(\kappa -\eta +1)}, \ldots , \delta ^{(\kappa -1)}) b\\&\quad = c^{\textsf{H}} \partial _{s_{1}^{\ell _{1}} \cdots s_{q-1}^{\ell _{q-1}} s_{q}^{j_{q}} s_{q+1}^{i_{\kappa - \eta + 1}} s_{q+2}^{\nu _{\kappa - \eta + 2}} \cdots s_{q + \eta }^{\nu _{\kappa }}} \varvec{\mathsf {\widehat{G}}}_{q + \eta } (\sigma _{1}, \ldots , \sigma _{q}, \varsigma _{\kappa -\eta +1}, \ldots , \varsigma _{\kappa } \,|\, \\&\qquad \quad d^{(1)}, \ldots , d^{(q-1)}, z, \delta ^{(\kappa -\eta +1)}, \ldots , \delta ^{(\kappa -1)}) b, \end{aligned}$$

    for \(j_{q} = 0, \ldots , \ell _{q}\); \(i_{\kappa - \eta + 1} = 0, \ldots , \nu _{\kappa - \eta + 1}\); \(1 \le q \le k\), \(1 \le \eta \le \kappa \), and an additional arbitrary scaling vector \(z \in \mathbb {C}^{m}\).

Proof

The proof works analogously to Theorem 1, using appropriate projectors onto \({{\,\textrm{span}\,}}(V)\) or \({{\,\textrm{span}\,}}(W)\) and the ideas from the proof of [13, Thm. 9] for fixed scaling vectors \(d^{(1)}, \ldots , d^{(k-1)}\) and \(\delta ^{(1)}, \ldots , \delta ^{(\kappa -1)}\). \(\square \)

To complete the theory for our new unifying interpolation framework, we consider the special cases of Theorems 1 and 2 in which identical sets of interpolation points and scaling vectors are used in the bi-tangential interpolation case. As in [8, 13], this allows the implicit interpolation of partial derivatives. Due to the dependency of the modified transfer functions on the scaling vectors, we will now also interpolate derivatives with respect to those scaling vectors. Therefore, the notion of the Jacobian matrix (14) for the modified transfer functions will be given as

$$\begin{aligned} \nabla \varvec{\textsf{G}}_{k}&= \left[ \partial _{s_{1}} \varvec{\textsf{G}}_{k}, \ldots , \partial _{s_{k}} \varvec{\textsf{G}}_{k}, \partial _{d^{(1)}_{1}} \varvec{\textsf{G}}_{k}, \ldots , \partial _{d^{(1)}_{m}} \varvec{\textsf{G}}_{k}, \ldots , \partial _{d^{(k-1)}_{1}} \varvec{\textsf{G}}_{k}, \ldots , \partial _{d^{(k-1)}_{m}} \varvec{\textsf{G}}_{k} \right] . \end{aligned}$$

Theorem 3

(Modified structured bi-tangential interpolation with identical point sets) Let G be a bilinear system, associated with the modified transfer functions \(\varvec{\textsf{G}}_{k}\) in (34), and \(\widehat{G}\) the reduced-order bilinear system, constructed by (6) with its modified transfer functions \(\varvec{\mathsf {\widehat{G}}}_{k}\). Given a set of interpolation points \(\sigma _{1}, \ldots , \sigma _{k} \in \mathbb {C}\) such that the matrix functions \(\mathcal {C}(s)\), \(\mathcal {K}(s)^{-1}\), \(\mathcal {N}(s)\), \(\mathcal {B}(s)\), \(\widehat{\mathcal {K}}(s)^{-1}\) are analytic in \(s \in \{ \sigma _{1}, \ldots , \sigma _{k} \}\), two tangential directions \(b \in \mathbb {C}^{m}\) and \(c \in \mathbb {C}^{p}\), and scaling vectors \(d^{(1)}, \ldots , d^{(k-1)} \in \mathbb {C}^{m}\), the following statements hold:

  1. (a)

    Let V and W be constructed as in Theorem 1 Parts (a) and (b) for the interpolation points \(\sigma _{1} = \varsigma _{1}\), \(\ldots \), \(\sigma _{k} = \varsigma _{k}\) and the scaling vectors \(d^{(1)} = \delta ^{(1)}\), \(\ldots \), \(d^{(k-1)} = \delta ^{(k-1)}\). Then, in addition to the interpolation conditions in Theorem 1, it holds

    $$\begin{aligned}&\nabla \left( c^{\textsf{H}} \varvec{\textsf{G}}_{k} b \right) (\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)}) \\&\qquad = \nabla \left( c^{\textsf{H}} \varvec{\mathsf {\widehat{G}}}_{k} b \right) (\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)}). \end{aligned}$$
  2. (b)

    Let V and W be constructed as in Theorem 2 Parts (a) and (b) for the interpolation points \(\sigma _{1} = \varsigma _{1}\), \(\ldots \), \(\sigma _{k} = \varsigma _{k}\), the derivative orders \(\ell _{1} = \nu _{1}\), \(\ldots \), \(\ell _{k} = \nu _{k}\), and the scaling vectors \(d^{(1)} = \delta ^{(1)}\), \(\ldots \), \(d^{(k-1)} = \delta ^{(k-1)}\). Then, in addition to the interpolation conditions in Theorem 2, it holds

    $$\begin{aligned}&\nabla \left( c^{\textsf{H}} \partial _{s_{1}^{\ell _{1}} \cdots s_{k}^{\ell _{k}}} \varvec{\textsf{G}}_{k} b \right) (\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)})\\&\qquad = \nabla \left( c^{\textsf{H}} \partial _{s_{1}^{\ell _{1}} \cdots s_{k}^{\ell _{k}}} \varvec{\mathsf {\widehat{G}}}_{k} b \right) (\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)}). \end{aligned}$$

Proof

First, we consider the partial derivatives with respect to the scaling vectors. For arbitrary \(1 \le j \le k-1\) and \(1 \le i \le m\), we obtain

$$\begin{aligned}&\partial _{d^{(j)}_{i}} \left( c^{\textsf{H}} \varvec{\mathsf {\widehat{G}}}_{k} b \right) (\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)}) \\&= \underbrace{c^{\textsf{H}} \widehat{\mathcal {C}}(\sigma _{k}) \widehat{\mathcal {K}}(\sigma _{k})^{-1} \left( \prod \limits _{\ell = 1}^{k-j-1} \varvec{\mathsf {\widehat{N}}}(\sigma _{k-\ell }\,|\,d^{(k-\ell )}) \widehat{\mathcal {K}}(\sigma _{k-\ell })^{-1} \right) }_{ =:\,\hat{w}_{k-j}^{\textsf{H}}} \left( \partial _{d^{(j)}_{i}} \varvec{\mathsf {\widehat{N}}}(\sigma _{j}\,|\,d^{(j)}) \right) \\&\quad {}\times {} \underbrace{\widehat{\mathcal {K}}(\sigma _{j})^{-1} \left( \prod \limits _{\ell = 1}^{j-1} \varvec{\mathsf {\widehat{N}}}(\sigma _{j-\ell }\,|\,d^{(j-\ell )}) \widehat{\mathcal {K}}(\sigma _{j-\ell })^{-1} \right) \widehat{\mathcal {B}}(\sigma _{1}) b}_{ =:\,\hat{v}_{j}}\\&= \hat{w}_{k-j}^{\textsf{H}} \left( \partial _{d^{(j)}_{i}} \varvec{\mathsf {\widehat{N}}}(\sigma _{j}\,|\,d^{(j)}) \right) \hat{v}_{j}\\&= \hat{w}_{k-j}^{\textsf{H}} W^{\textsf{H}} \left( \partial _{d^{(j)}_{i}} \varvec{\textsf{N}}(\sigma _{j}\,|\,d^{(j)}) \right) V \hat{v}_{j} \end{aligned}$$

such that only the modified bilinear term corresponding to the scaling vector \(d^{(j)}\) needs to be differentiated. Using the same approach as in the proof of Theorem 1 and the construction of \({{\,\textrm{span}\,}}(V)\) and \({{\,\textrm{span}\,}}(W)\) yields the two equalities

$$\begin{aligned} \begin{aligned} V \hat{v}_{j}&= v_{j}&\text {and}{} & {} W \hat{w}_{k-j}&= w_{k-j}, \end{aligned} \end{aligned}$$

which gives

$$\begin{aligned}&\partial _{d^{(j)}_{i}} \left( c^{\textsf{H}} \varvec{\mathsf {\widehat{G}}}_{k} b \right) (\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)})\\&\qquad = \hat{w}_{k-j}^{\textsf{H}} W^{\textsf{H}} \left( \partial _{d^{(j)}_{i}} \varvec{\textsf{N}}(\sigma _{j}\,|\,d^{(j)}) \right) V \hat{v}_{j}\\&\qquad = w_{k-j}^{\textsf{H}} \left( \partial _{d^{(j)}_{i}} \varvec{\textsf{N}}(\sigma _{j}\,|\,d^{(j)}) \right) v_{j}\\&\qquad = \partial _{d^{(j)}_{i}} \left( c^{\textsf{H}} \varvec{\textsf{G}}_{k} b \right) (\sigma _{1}, \ldots , \sigma _{k}\,|\,d^{(1)}, \ldots , d^{(k-1)}), \end{aligned}$$

for all \(1 \le j \le k-1\) and \(1 \le i \le m\). Therefore, the interpolation condition holds for all partial derivatives with respect to the scaling vectors. The results for the partial derivatives with respect to the frequency arguments can be proven analogously and in principle follow the ideas from [13, Cor. 2]. This proves Part (a). Part (b) can be proven analogously to Part (a) by replacing the simple interpolation by the Hermite version from Theorem 2. For brevity of the paper, we skip those details. \(\square \)
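Part (a) of Theorem 3 can be checked numerically by assembling the Jacobian entries from left and right vectors, mirroring the factorization used in the proof. The sketch below assumes the unstructured case \(\mathcal {K}(s) = sE - A\) with constant \(N_{i}\), B, C, so that \(\partial _{s} \mathcal {K}(s)^{-1} = -\mathcal {K}(s)^{-1} E \mathcal {K}(s)^{-1}\) and \(\partial _{d_{i}} \varvec{\textsf{N}}(\cdot \,|\,d) = N_{i}\); in the structured case additional terms from \(\mathcal {N}(s)\), \(\mathcal {B}(s)\), \(\mathcal {C}(s)\) would appear. The helper names `right`, `left`, and `jacobian` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, p, k = 8, 2, 2, 3  # hypothetical sizes

E = np.eye(n)
A = -2.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))
Ns = [rng.standard_normal((n, n)) for _ in range(m)]
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
full = (E, A, Ns, B, C)
sigma = [1.0, 2.0, 3.0]  # identical point sets: sigma_j = varsigma_j
b, c = rng.standard_normal(m), rng.standard_normal(p)
d = [rng.standard_normal(m) for _ in range(k - 1)]  # d^(j) = delta^(j)


def sN(dv, Ns_):
    return sum(di * Ni for di, Ni in zip(dv, Ns_))


def right(j, sys):
    """v_j = K(sigma_j)^{-1} N(.|d^(j-1)) ... K(sigma_1)^{-1} B b."""
    E_, A_, Ns_, B_, _ = sys
    x = np.linalg.solve(sigma[0] * E_ - A_, B_ @ b)
    for q in range(1, j):
        x = np.linalg.solve(sigma[q] * E_ - A_, sN(d[q - 1], Ns_) @ x)
    return x


def left(eta, sys):
    """w_eta, with w_eta^H = c^H C K(sigma_k)^{-1} N(.|d^(k-1)) K(sigma_{k-1})^{-1} ..."""
    E_, A_, Ns_, _, C_ = sys
    y = np.linalg.solve((sigma[-1] * E_ - A_).conj().T, C_.conj().T @ c)
    for i in range(2, eta + 1):
        idx = k - i  # zero-based position of sigma_{k-i+1} and d^(k-i+1)
        y = np.linalg.solve((sigma[idx] * E_ - A_).conj().T,
                            sN(d[idx], Ns_).conj().T @ y)
    return y


def jacobian(sys):
    """Entries of the Jacobian of c^H G_k b: first w.r.t. s_1..s_k, then d^(j)_i."""
    E_, _, Ns_, _, _ = sys
    ds = [-np.vdot(left(k - j + 1, sys), E_ @ right(j, sys))
          for j in range(1, k + 1)]
    dd = [np.vdot(left(k - j, sys), Ni @ right(j, sys))
          for j in range(1, k) for Ni in Ns_]
    return np.array(ds + dd)


V, _ = np.linalg.qr(np.column_stack([right(j, full) for j in range(1, k + 1)]))
W, _ = np.linalg.qr(np.column_stack([left(i, full) for i in range(1, k + 1)]))
WH = W.conj().T
red = (WH @ E @ V, WH @ A @ V, [WH @ Ni @ V for Ni in Ns], WH @ B, C @ V)

J_full, J_red = jacobian(full), jacobian(red)
print(np.allclose(J_full, J_red))
```

The vector returned by `jacobian` has \(k + (k-1)m\) entries, ordered as in the definition of \(\nabla \varvec{\textsf{G}}_{k}\) above.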

Remark 4

(Using multiple sets of interpolation points) While all results in this section are formulated for a single set of interpolation points \(\sigma _{1}, \ldots , \sigma _{k} \in \mathbb {C}\), they can be extended to multiple sets by concatenation of the model reduction bases. Consider, for example, Part (a) of Theorem 1. Let \(\sigma _{1}^{(1)}, \ldots , \sigma _{k}^{(1)}\), \(\ldots \), \(\sigma _{1}^{(n_{\textrm{s}})}, \ldots , \sigma _{k}^{(n_{\textrm{s}})} \in \mathbb {C}\) be \(n_{\textrm{s}}\) sets of interpolation points and \(V^{(1)}, \ldots , V^{(n_{\textrm{s}})}\) be the corresponding basis matrices such that the corresponding reduced-order models (tangentially) interpolate the original model for the given sets of interpolation points. Then, another reduced-order model can be constructed to satisfy all interpolation conditions associated with \(V^{(1)}, \ldots , V^{(n_{\textrm{s}})}\) by choosing

$$\begin{aligned} {{\,\textrm{span}\,}}(V) \supseteq {{\,\textrm{span}\,}}([V^{(1)}, \ldots , V^{(n_{\textrm{s}})}]), \end{aligned}$$

as the new truncation matrix V, and any W of appropriate dimension and full column rank. Analogously, multiple sets of scaling vectors and tangential directions can be incorporated.
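The concatenation described in this remark can be sketched as follows. The example assumes the unstructured case \(\mathcal {K}(s) = sE - A\), a random test system, \(n_{\textrm{s}} = 2\) sets of interpolation points with \(k = 2\), a separate scaling vector per set, and the particular choice \(W = V\); none of these choices is prescribed by the remark, and all names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, p = 10, 2, 2  # hypothetical sizes

E = np.eye(n)
A = -2.0 * np.eye(n) + 0.3 * rng.standard_normal((n, n))
Ns = [rng.standard_normal((n, n)) for _ in range(m)]
B = rng.standard_normal((n, m))
C = rng.standard_normal((p, n))
full = (E, A, Ns, B, C)
b = rng.standard_normal(m)


def sN(d, Ns_):
    return sum(di * Ni for di, Ni in zip(d, Ns_))


def G(s, ds, sys):
    """Modified transfer function (34) for K(s) = s*E - A."""
    E_, A_, Ns_, B_, C_ = sys
    X = np.linalg.solve(s[0] * E_ - A_, B_)
    for s_j, d_prev in zip(s[1:], ds):
        X = np.linalg.solve(s_j * E_ - A_, sN(d_prev, Ns_) @ X)
    return C_ @ X


def right_basis(sigma, d):
    """Columns v_1, ..., v_k of Theorem 1, Part (a), for one set of points."""
    vs = [np.linalg.solve(sigma[0] * E - A, B @ b)]
    for j in range(1, len(sigma)):
        vs.append(np.linalg.solve(sigma[j] * E - A, sN(d[j - 1], Ns) @ vs[-1]))
    return np.column_stack(vs)


point_sets = [[1.0, 2.0], [0.5, 4.0]]                        # n_s = 2 sets
scalings = [[rng.standard_normal(m)], [rng.standard_normal(m)]]

# Concatenate V^(1), V^(2) and orthonormalize; the span is what matters.
V_cat = np.hstack([right_basis(s, d) for s, d in zip(point_sets, scalings)])
V, _ = np.linalg.qr(V_cat)
W = V  # one admissible choice of a full-column-rank W (an assumption here)
WH = W.conj().T
red = (WH @ E @ V, WH @ A @ V, [WH @ Ni @ V for Ni in Ns], WH @ B, C @ V)

all_match = all(
    np.allclose(G(s[:q], d[:q - 1], full) @ b, G(s[:q], d[:q - 1], red) @ b)
    for s, d in zip(point_sets, scalings) for q in (1, 2))
print(all_match)
```

If the concatenated matrix is numerically rank deficient, a rank-revealing factorization (for example a truncated singular value decomposition) could replace the plain QR factorization used here.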

3.4 Special case: structured blockwise tangential interpolation

As mentioned in Sect. 3.3, the new unifying tangential interpolation framework can also be used to obtain results for blockwise tangential interpolation. Due to its relevance and common use in model reduction of MIMO bilinear systems, we will state the corresponding results in this section in more detail.

First, we will generalize the idea of blockwise tangential interpolation introduced in Sect. 2.3 to the structured case. Therefore, we start by analyzing the multivariate transfer functions (5). Multiplying out the Kronecker products, we observe that (5) is actually given as concatenation of products of the linear dynamics and the bilinear terms

$$\begin{aligned} \begin{aligned}&G_{k}(s_{1}, \ldots , s_{k}) \\&\quad = \big [ \mathcal {C}(s_{k}) \mathcal {K}(s_{k})^{-1} \mathcal {N}_{1}(s_{k-1}) \mathcal {K}(s_{k-1})^{-1} \cdots \mathcal {N}_{1}(s_{1}) \mathcal {K}(s_{1})^{-1} \mathcal {B}(s_{1}),\\&\qquad \qquad \mathcal {C}(s_{k}) \mathcal {K}(s_{k})^{-1} \mathcal {N}_{1}(s_{k-1}) \mathcal {K}(s_{k-1})^{-1} \cdots \mathcal {N}_{2}(s_{1}) \mathcal {K}(s_{1})^{-1} \mathcal {B}(s_{1}), \\&\qquad \qquad \ldots \\&\qquad \qquad \mathcal {C}(s_{k}) \mathcal {K}(s_{k})^{-1} \mathcal {N}_{m}(s_{k-1}) \mathcal {K}(s_{k-1})^{-1} \cdots \mathcal {N}_{m}(s_{1}) \mathcal {K}(s_{1})^{-1} \mathcal {B}(s_{1}) \big ]. \end{aligned} \end{aligned}$$
(38)

Extending the ideas from Sect. 2.3, we consider each block entry of (38) as a separate transfer function and use tangential interpolation with the same directions for each of them. In other words, given the right tangential direction \(b \in \mathbb {C}^{m}\), we consider

$$\begin{aligned} \begin{aligned}&G_{k}(s_{1}, \ldots , s_{k}) (I_{m^{k - 1}} \otimes b) \\&\quad = \big [ \mathcal {C}(s_{k}) \mathcal {K}(s_{k})^{-1} \mathcal {N}_{1}(s_{k-1}) \mathcal {K}(s_{k-1})^{-1} \cdots \mathcal {N}_{1}(s_{1}) \mathcal {K}(s_{1})^{-1} \mathcal {B}(s_{1}) b, \\&\qquad \qquad \mathcal {C}(s_{k}) \mathcal {K}(s_{k})^{-1} \mathcal {N}_{1}(s_{k-1}) \mathcal {K}(s_{k-1})^{-1} \cdots \mathcal {N}_{2}(s_{1}) \mathcal {K}(s_{1})^{-1} \mathcal {B}(s_{1}) b, \\&\qquad \qquad \ldots \\&\qquad \qquad \mathcal {C}(s_{k}) \mathcal {K}(s_{k})^{-1} \mathcal {N}_{m}(s_{k-1}) \mathcal {K}(s_{k-1})^{-1} \cdots \mathcal {N}_{m}(s_{1}) \mathcal {K}(s_{1})^{-1} \mathcal {B}(s_{1}) b \big ] \end{aligned} \end{aligned}$$

as blockwise evaluation of the transfer function in the direction b. This formulation extends the blockwise tangential interpolation problem from (10)–(12) to the structure-preserving setting.

The modified tangential interpolation framework can now be used to obtain the subspace conditions on the blockwise tangential interpolation. Choose the scaling vectors \(d^{(j)}\) in (34) to be columns of the m-dimensional identity matrix. Then, the single block entries of (38) are given as the modified transfer functions (34) for specific choices of scaling vectors. For example, choosing \(d^{(1)} = \cdots = d^{(k-1)} = e_{1}\) to be the first column of the m-dimensional identity matrix yields

$$\begin{aligned}&\varvec{\textsf{G}}_{k}(s_{1}, \ldots , s_{k}\,|\,e_{1}, \ldots , e_{1}) \\&\quad = \mathcal {C}(s_{k}) \mathcal {K}(s_{k})^{-1} \mathcal {N}_{1}(s_{k-1}) \mathcal {K}(s_{k-1})^{-1} \cdots \mathcal {N}_{1}(s_{1}) \mathcal {K}(s_{1})^{-1} \mathcal {B}(s_{1}), \end{aligned}$$

which is the first block in (38). By column concatenation of these modified transfer functions, (38) can be completely recovered:

$$\begin{aligned} \begin{aligned} G_{k}(s_{1}, \ldots , s_{k}) =&\; \big [ \varvec{\textsf{G}}_{k}(s_{1}, \ldots , s_{k}\,|\,e_{1}, \ldots , e_{1}),\\&\quad \varvec{\textsf{G}}_{k}(s_{1}, \ldots , s_{k}\,|\,e_{1}, \ldots , e_{2}), \\&\quad \ldots ,\\&\quad \varvec{\textsf{G}}_{k}(s_{1}, \ldots , s_{k}\,|\,e_{m}, \ldots , e_{m}) \big ]. \end{aligned} \end{aligned}$$
(39)

Consequently, the blockwise interpolation results are obtained by concatenating the model reduction bases constructed for all necessary modified transfer functions and the tangential directions. Due to the significance of blockwise tangential interpolation in the literature [9, 35] and the complexity of its recovery from the unifying framework, we state the structure-preserving interpolation results for blockwise tangential interpolation explicitly in the following. The proofs follow directly from the previous section and the concatenation argument discussed above.

Remark 5

(Matrix interpolation) It should be noted that the matrix interpolation results from [13] can also be recovered from the modified tangential interpolation framework. As the relation (39) shows, removing the tangential directions in the construction of the projection spaces will yield the matrix interpolation results. Thus matrix interpolation is also a special case of the modified tangential interpolation framework.

The first result follows from Theorem 1.

Corollary 2

(Structured blockwise tangential interpolation) Let G be a bilinear system, described by its subsystem transfer functions in (5), and \(\widehat{G}\) the reduced-order bilinear system, constructed by (6) with the corresponding subsystem transfer functions \(\widehat{G}_{k}\). Given sets of interpolation points \(\sigma _{1}, \ldots , \sigma _{k} \in \mathbb {C}\) and \(\varsigma _{1}, \ldots , \varsigma _{\kappa } \in \mathbb {C}\) such that the matrix functions \(\mathcal {C}(s)\), \(\mathcal {K}(s)^{-1}\), \(\mathcal {N}(s)\), \(\mathcal {B}(s), \widehat{\mathcal {K}}(s)^{-1}\) are defined for \(s \in \{ \sigma _{1}, \ldots , \sigma _{k}, \varsigma _{1}, \ldots , \varsigma _{\kappa } \}\), and two tangential directions \(b \in \mathbb {C}^{m}\) and \(c \in \mathbb {C}^{p}\), the following statements hold:

  1. (a)

    If V is constructed as

    $$\begin{aligned} V_{1}&= \mathcal {K}(\sigma _{1})^{-1}\mathcal {B}(\sigma _{1})b,\\ V_{j}&= \mathcal {K}(\sigma _{j})^{-1}\mathcal {N}(\sigma _{j-1}) (I_{m} \otimes V_{j-1}),&2 \le j \le k,\\ {{\,\textrm{span}\,}}(V)&\supseteq {{\,\textrm{span}\,}}\left( [V_{1}, \ldots , V_{k}]\right) , \end{aligned}$$

    then the following interpolation conditions hold true:

    $$\begin{aligned} \begin{aligned} G_{1}(\sigma _{1})b&= \widehat{G}_{1}(\sigma _{1})b,\\ G_{2}(\sigma _{1}, \sigma _{2}) (I_{m} \otimes b)&= \widehat{G}_{2}(\sigma _{1}, \sigma _{2}) (I_{m} \otimes b),\\&\,\,\,\vdots \\ G_{k}(\sigma _{1}, \ldots , \sigma _{k}) (I_{m^{k-1}} \otimes b)&= \widehat{G}_{k}(\sigma _{1}, \ldots , \sigma _{k}) (I_{m^{k-1}} \otimes b). \end{aligned} \end{aligned}$$
  2. (b)

    If W is constructed as

    $$\begin{aligned} W_{1}&= \mathcal {K}(\varsigma _{\kappa })^{-\textsf{H}}\mathcal {C}(\varsigma _{\kappa })^{\textsf{H}} c,\\ W_{i}&= \mathcal {K}(\varsigma _{\kappa -i+1})^{-\textsf{H}} \widetilde{\mathcal {N}}(\varsigma _{\kappa -i+1})^{\textsf{H}} (I_{m} \otimes W_{i-1}),&2 \le i \le \kappa ,\\ \textrm{span}(W)&\supseteq \textrm{span}\left( [W_{1}, \ldots , W_{\kappa }] \right) , \end{aligned}$$

    then the following interpolation conditions hold true:

    $$\begin{aligned} \begin{aligned} c^{\textsf{H}} G_{1}(\varsigma _{\kappa })&= c^{\textsf{H}} \widehat{G}_{1}(\varsigma _{\kappa }),\\ c^{\textsf{H}} G_{2}(\varsigma _{\kappa -1}, \varsigma _{\kappa })&= c^{\textsf{H}} \widehat{G}_{2}(\varsigma _{\kappa -1}, \varsigma _{\kappa }),\\&\,\,\,\vdots \\ c^{\textsf{H}} G_{\kappa }(\varsigma _{1}, \ldots , \varsigma _{\kappa })&= c^{\textsf{H}} \widehat{G}_{\kappa }(\varsigma _{1}, \ldots , \varsigma _{\kappa }). \end{aligned} \end{aligned}$$
  3. (c)

    Let V be constructed as in Part (a) and W as in Part (b). Then, in addition to the results in (a) and (b), the following conditions hold:

    $$\begin{aligned}&c^{\textsf{H}} G_{q + \eta }(\sigma _{1}, \ldots , \sigma _{q}, \varsigma _{\kappa -\eta +1}, \ldots , \varsigma _{\kappa }) (I_{m^{q+\eta -1}} \otimes b)\\&\qquad = c^{\textsf{H}} \widehat{G}_{q + \eta }(\sigma _{1}, \ldots , \sigma _{q}, \varsigma _{\kappa -\eta +1}, \ldots , \varsigma _{\kappa }) (I_{m^{q+\eta -1}} \otimes b), \end{aligned}$$

    for \(1 \le q \le k\) and \(1 \le \eta \le \kappa \).
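The recursion in Part (a) can be illustrated by a minimal pure-Python sketch for the first two levels. It assumes the unstructured special case \(\mathcal {K}(s) = sE - A\), \(\mathcal {N}(s) = [N_{1}, \ldots , N_{m}]\), \(\mathcal {B}(s) = B\), \(\mathcal {C}(s) = C\), real interpolation points, and a one-sided projection \(W = V\). The helper names (`matmul`, `hcat`, `solve`, `levels`) and the small \(4 \times 4\) test data are hypothetical and chosen only for illustration.

```python
def matmul(X, Y):
    """Dense matrix product for matrices stored as lists of rows."""
    return [[sum(X[i][l] * Y[l][j] for l in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(col) for col in zip(*X)]

def hcat(blocks):
    """Concatenate matrices with equal row counts column-wise."""
    return [sum((blk[i] for blk in blocks), []) for i in range(len(blocks[0]))]

def solve(K, R):
    """Solve K X = R by Gauss-Jordan elimination with partial pivoting."""
    n = len(K)
    M = [list(K[i]) + list(R[i]) for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(n):
            if i != c:
                f = M[i][c] / M[c][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[c])]
    return [[M[i][j] / M[i][i] for j in range(n, len(M[0]))] for i in range(n)]

def levels(s1, s2, E, A, Ns, B, C, b):
    """Return V_1, V_2, G_1(s1) b and G_2(s1, s2) (I_m kron b) for K(s) = s E - A."""
    def K(s):
        return [[s * e - a for e, a in zip(er, ar)] for er, ar in zip(E, A)]
    V1 = solve(K(s1), matmul(B, b))                       # K(s1)^{-1} B b
    V2 = solve(K(s2), hcat([matmul(N, V1) for N in Ns]))  # K(s2)^{-1} N (I_m kron V_1)
    return V1, V2, matmul(C, V1), matmul(C, V2)

# Hypothetical test system with n = 4 states, m = 2 inputs, p = 2 outputs.
n = 4
E = [[float(i == j) for j in range(n)] for i in range(n)]
A = [[-(i + 1.0) if i == j else 0.0 for j in range(n)] for i in range(n)]
N1 = [[float(j == i + 1) for j in range(n)] for i in range(n)]
N2 = [[1.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0],
      [0.0, 0.0, -1.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
B = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 1.0]]
C = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
b = [[1.0], [2.0]]      # right tangential direction as a column
s1, s2 = 1.0, 2.0       # interpolation points sigma_1, sigma_2

# Full-order quantities and the basis V = [V_1, V_2] with 1 + m columns.
V1, V2, g1, g2 = levels(s1, s2, E, A, [N1, N2], B, C, b)
V = hcat([V1, V2])
Vt = transpose(V)       # real data, so the plain transpose suffices

def reduce_square(X):
    return matmul(Vt, matmul(X, V))

# Reduced-order model by one-sided projection (W = V) and its transfer functions.
_, _, g1_red, g2_red = levels(
    s1, s2, reduce_square(E), reduce_square(A),
    [reduce_square(N1), reduce_square(N2)], matmul(Vt, B), matmul(C, V), b)

# Largest entrywise mismatch of the two tangential interpolation conditions.
err = max(abs(x - y)
          for G, G_red in ((g1, g1_red), (g2, g2_red))
          for row, row_red in zip(G, G_red)
          for x, y in zip(row, row_red))
print(err)
```

In this sketch the mismatch `err` should be at the level of rounding errors, which mirrors the first two interpolation conditions of Part (a). For large-scale problems one would replace the dense helpers by sparse solvers and orthogonalize V.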

The next corollary corresponds to Theorem 2 stating the results for Hermite interpolation.

Corollary 3

(Structured blockwise tangential Hermite interpolation) Let G be a bilinear system, described by its subsystem transfer functions in (5), and \(\widehat{G}\) the reduced-order bilinear system, constructed by (6) with the corresponding subsystem transfer functions \(\widehat{G}_{k}\). Given sets of interpolation points \(\sigma _{1}, \ldots , \sigma _{k} \in \mathbb {C}\) and \(\varsigma _{1}, \ldots , \varsigma _{\kappa } \in \mathbb {C}\) such that the matrix functions \(\mathcal {C}(s)\), \(\mathcal {K}(s)^{-1}\), \(\mathcal {N}(s)\), \(\mathcal {B}(s)\), \(\widehat{\mathcal {K}}(s)^{-1}\) are analytic in \(s \in \{ \sigma _{1}, \ldots , \sigma _{k}, \varsigma _{1}, \ldots , \varsigma _{\kappa } \}\), and two tangential directions \(b \in \mathbb {C}^{m}\) and \(c \in \mathbb {C}^{p}\), the following statements hold:

  1. (a)

    If V is constructed as

    $$\begin{aligned} V_{1, j_{1}}&= \partial _{s^{j_{1}}} (\mathcal {K}^{-1} \mathcal {B}b) (\sigma _{1}),&j_{1}&= 0, \ldots , \ell _{1},\\ V_{2, j_{2}}&= \partial _{s^{j_{2}}} \mathcal {K}^{-1} (\sigma _{2}) \partial _{s^{\ell _{1}}} (\mathcal {N}(I_{m} \otimes \mathcal {K}^{-1} \mathcal {B}b)) (\sigma _{1}),&j_{2}&= 0, \ldots , \ell _{2},\\&\,\,\,\vdots \\ V_{k,j_{k}}&= \partial _{s^{j_{k}}} \mathcal {K}^{-1} (\sigma _{k}) \left( \prod \limits _{j = 1}^{k-2} \partial _{s^{\ell _{k -j}}} ( (I_{m^{j-1}} \otimes \mathcal {N}) (I_{m^{j}} \otimes \mathcal {K}^{-1}) ) (\sigma _{k-j}) \right) \\&\quad {}\times {} \partial _{s^{\ell _{1}}} ((I_{m^{k-2}} \otimes \mathcal {N})(I_{m^{k-1}} \otimes \mathcal {K}^{-1} \mathcal {B}b)) (\sigma _{1}),&j_{k}&= 0, \ldots , \ell _{k},\\ {{\,\textrm{span}\,}}(V)&\supseteq {{\,\textrm{span}\,}}([V_{1,0}, \ldots , V_{k, \ell _{k}}]), \end{aligned}$$

    then the following interpolation conditions hold true:

    $$\begin{aligned}&\partial _{s_{1}^{j_{1}}} G_{1} (\sigma _{1}) b = \partial _{s_{1}^{j_{1}}} \widehat{G}_{1} (\sigma _{1}) b,&j_{1}&= 0, \ldots , \ell _{1}, \\&\hspace{5.5em} \vdots \\&\partial _{s_{1}^{\ell _{1}} \cdots s_{k-1}^{\ell _{k-1}} s_{k}^{j_{k}}} G_{k} (\sigma _{1}, \ldots , \sigma _{k}) (I_{m^{k-1}} \otimes b) \\&\qquad = \partial _{s_{1}^{\ell _{1}} \cdots s_{k-1}^{\ell _{k-1}} s_{k}^{j_{k}}} \widehat{G}_{k} (\sigma _{1}, \ldots , \sigma _{k}) (I_{m^{k-1}} \otimes b),&j_{k}&= 0, \ldots , \ell _{k}. \end{aligned}$$
  2. (b)

    If W is constructed as

    $$\begin{aligned} W_{1, i_{\kappa }}&= \partial _{s^{i_{\kappa }}} (\mathcal {K}^{-\textsf{H}} \mathcal {C}^{\textsf{H}} c) (\varsigma _{\kappa }),&i_{\kappa }&= 0,\ldots ,\nu _{\kappa },\\ W_{2,i_{\kappa -1}}&= \partial _{s^{i_{\kappa -1}}} (\mathcal {K}^{-\textsf{H}} \widetilde{\mathcal {N}}^{\textsf{H}}) (\varsigma _{\kappa -1})\\&\quad {}\times {} \left( I_{m} \otimes \partial _{s^{\nu _{\kappa }}} (\mathcal {K}^{-\textsf{H}} \mathcal {C}^{\textsf{H}} c) (\varsigma _{\kappa }) \right) ,&i_{\kappa -1}&= 0,\ldots ,\nu _{\kappa -1},\\&\,\,\,\vdots \\ W_{\kappa ,i_{1}}&= \partial _{s^{i_{1}}} (\mathcal {K}^{-\textsf{H}} \widetilde{\mathcal {N}}^{\textsf{H}}) (\varsigma _{1}) \\&\quad {}\times {} \left( \prod \limits _{i = 2}^{\kappa - 1} \partial _{s^{\nu _{i}}} (I_{m^{i-1}} \otimes \mathcal {K}^{-\textsf{H}} \widetilde{\mathcal {N}}^{\textsf{H}}) (\varsigma _{i}) \right) \\&\quad {}\times {} \left( I_{m^{\kappa -1}} \otimes \partial _{s^{\nu _{\kappa }}} (\mathcal {K}^{-\textsf{H}} \mathcal {C}^{\textsf{H}} c) (\varsigma _{\kappa }) \right) ,&i_{1}&= 0,\ldots ,\nu _{1},\\ \textrm{span}(W)&\supseteq {{\,\textrm{span}\,}}([W_{1,0}, \ldots , W_{\kappa , \nu _{\kappa }}]), \end{aligned}$$

    then the following interpolation conditions hold true:

    $$\begin{aligned} c^{\textsf{H}} \partial _{s_{1}^{i_{\kappa }}} G_{1} (\varsigma _{\kappa })&= c^{\textsf{H}} \partial _{s_{1}^{i_{\kappa }}} \widehat{G}_{1} (\varsigma _{\kappa }),&i_{\kappa } = 0,\ldots ,\nu _{\kappa },\\&\,\,\,\vdots \\ c^{\textsf{H}} \partial _{s_{1}^{i_{1}} s_{2}^{\nu _{2}} \cdots s_{\kappa }^{\nu _{\kappa }}} G_{\kappa } (\varsigma _{1}, \ldots , \varsigma _{\kappa })&= c^{\textsf{H}} \partial _{s_{1}^{i_{1}} s_{2}^{\nu _{2}} \cdots s_{\kappa }^{\nu _{\kappa }}} \widehat{G}_{\kappa } (\varsigma _{1}, \ldots , \varsigma _{\kappa }),&i_{1} = 0,\ldots ,\nu _{1}. \end{aligned}$$
  3. (c)

    Let V be constructed as in Part (a) and W as in Part (b). Then, in addition to the interpolation conditions in (a) and (b), the following conditions hold:

    $$\begin{aligned}&c^{\textsf{H}} \partial _{s_{1}^{\ell _{1}} \cdots s_{q-1}^{\ell _{q-1}} s_{q}^{j_{q}} s_{q+1}^{i_{\kappa - \eta + 1}} s_{q+2}^{\nu _{\kappa - \eta + 2}} \cdots s_{q + \eta }^{\nu _{\kappa }}} G_{q + \eta } (\sigma _{1}, \ldots , \sigma _{q}, \varsigma _{\kappa - \eta + 1}, \ldots , \varsigma _{\kappa })\\&\quad {}\times {} (I_{m^{q+\eta -1}} \otimes b)\\&= c^{\textsf{H}} \partial _{s_{1}^{\ell _{1}} \cdots s_{q-1}^{\ell _{q-1}} s_{q}^{j_{q}} s_{q+1}^{i_{\kappa - \eta + 1}} s_{q+2}^{\nu _{\kappa - \eta + 2}} \cdots s_{q + \eta }^{\nu _{\kappa }}} \widehat{G}_{q + \eta } (\sigma _{1}, \ldots , \sigma _{q}, \varsigma _{\kappa - \eta + 1}, \ldots , \varsigma _{\kappa })\\&\quad {}\times {} (I_{m^{q+\eta -1}} \otimes b), \end{aligned}$$

    for \(j_{q} = 0, \ldots , \ell _{q}\); \(i_{\kappa - \eta + 1} = 0, \ldots , \nu _{\kappa - \eta + 1}\); \(1 \le q \le k\) and \(1 \le \eta \le \kappa \).

Last, we give the results on implicit blockwise tangential interpolation of additional partial derivatives by using two-sided projection, corresponding to Theorem 3.

Corollary 4

(Structured blockwise bi-tangential interpolation with identical point sets) Let G be a bilinear system, described by its subsystem transfer functions in (5), and \(\widehat{G}\) the reduced-order bilinear system, constructed by (6) with the corresponding subsystem transfer functions \(\widehat{G}_{k}\). Given a set of interpolation points \(\sigma _{1}, \ldots , \sigma _{k} \in \mathbb {C}\) such that the matrix functions \(\mathcal {C}(s)\), \(\mathcal {K}(s)^{-1}\), \(\mathcal {N}(s)\), \(\mathcal {B}(s)\), \(\widehat{\mathcal {K}}(s)^{-1}\) are analytic in \(s \in \{ \sigma _{1}, \ldots , \sigma _{k} \}\), and two tangential directions \(b \in \mathbb {C}^{m}\) and \(c \in \mathbb {C}^{p}\), the following statements hold:

  1. (a)

    Let V and W be constructed as in Corollary 2 Parts (a) and (b) for the interpolation points \(\sigma _{1} = \varsigma _{1}\), \(\ldots \), \(\sigma _{k} = \varsigma _{k}\). Then, in addition to the interpolation conditions in Corollary 2, it holds

    $$\begin{aligned}&\nabla \big ( c^{\textsf{H}} G_{k} (I_{m^{k-1}} \otimes b) \big ) (\sigma _{1}, \ldots , \sigma _{k}) = \nabla \big ( c^{\textsf{H}} \widehat{G}_{k} (I_{m^{k-1}} \otimes b) \big ) (\sigma _{1}, \ldots , \sigma _{k}). \end{aligned}$$
  2. (b)

    Let V and W be constructed as in Corollary 3 Parts (a) and (b) for the interpolation points \(\sigma _{1} = \varsigma _{1}\), \(\ldots \), \(\sigma _{k} = \varsigma _{k}\) and derivative orders \(\ell _{1} = \nu _{1}\), \(\ldots \), \(\ell _{k} = \nu _{k}\). Then, in addition to the interpolation conditions in Corollary 3, it holds

    $$\begin{aligned}&\nabla \left( c^{\textsf{H}} \partial _{s_{1}^{\ell _{1}} \cdots s_{k}^{\ell _{k}}} G_{k} (I_{m^{k-1}} \otimes b) \right) (\sigma _{1}, \ldots , \sigma _{k}) \\&\qquad = \nabla \left( c^{\textsf{H}} \partial _{s_{1}^{\ell _{1}} \cdots s_{k}^{\ell _{k}}} \widehat{G}_{k} (I_{m^{k-1}} \otimes b) \right) (\sigma _{1}, \ldots , \sigma _{k}). \end{aligned}$$

Remark 6

(Projection space dimensions) It will be useful to understand the growth of the size of the model reduction bases and thus the order of the resulting interpolatory reduced-order model for the different interpolation approaches. Let \(n_{\textrm{s}}\) be the number of sets of interpolation points and tangential directions at which we want to enforce interpolation. Also, assume w.l.o.g. the recursively generated columns in V and W are all linearly independent (since otherwise, the dimensions of the corresponding projection spaces can be reduced while still enforcing interpolation). Then, for the matrix interpolation approach from [13, Thm. 8], we obtain

$$\begin{aligned} \begin{aligned} \dim ({{\,\textrm{span}\,}}(V_{\textrm{mtx}}))&\ge n_{\textrm{s}} \left( \sum _{j = 1}^{k} m^{j} \right)&\text {and}{} & {} \dim ({{\,\textrm{span}\,}}(W_{\textrm{mtx}}))&\ge n_{\textrm{s}} \left( \sum _{j = 1}^{k} p m^{j-1} \right) \end{aligned} \end{aligned}$$
(40)

for the right and left projection spaces, respectively. The blockwise tangential approach from Corollary 2 reduces those dimensions to

$$\begin{aligned} \dim ({{\,\textrm{span}\,}}(V_{\textrm{bwt}})) = \dim ({{\,\textrm{span}\,}}(W_{\textrm{bwt}}))&\ge n_{\textrm{s}} \left( \sum _{j = 1}^{k} m^{j-1} \right) . \end{aligned}$$
(41)

Comparing (40) and (41) shows that the blockwise tangential interpolation approach, similar to matrix interpolation, has exponentially growing dimensions of the projection spaces. In contrast, the new modified tangential interpolation approach as in Theorem 1 yields

$$\begin{aligned} \dim ({{\,\textrm{span}\,}}(V_{\textrm{st}})) = \dim ({{\,\textrm{span}\,}}(W_{\textrm{st}}))&\ge n_{\textrm{s}} k, \end{aligned}$$

which now grows only linearly. This gives more freedom in the choice of the order of interpolating reduced-order models, as well as more possibilities to adapt the choice of interpolation points to the problem.
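The different growth rates can be made concrete with a short Python sketch. It counts the generic number of basis columns per approach, assuming that level j contributes \(m^{j}\) columns to V (and \(p m^{j-1}\) columns to W) for matrix interpolation, \(m^{j-1}\) columns for blockwise tangential interpolation, and a single column for the modified tangential approach. The function name `basis_dimensions` is hypothetical, all columns are assumed to be linearly independent, and a possible splitting of complex columns into real and imaginary parts is not counted.

```python
def basis_dimensions(m, p, k, n_s=1):
    """Generic column counts of the projection bases for k transfer function levels."""
    v_mtx = n_s * sum(m**j for j in range(1, k + 1))            # V, matrix interpolation
    w_mtx = n_s * sum(p * m**(j - 1) for j in range(1, k + 1))  # W, matrix interpolation
    bwt = n_s * sum(m**(j - 1) for j in range(1, k + 1))        # V and W, blockwise tangential
    st = n_s * k                                                # V and W, modified tangential
    return {"V_mtx": v_mtx, "W_mtx": w_mtx, "bwt": bwt, "st": st}

# Example: m = 7 inputs, p = 6 outputs, two transfer function levels, one point set.
dims = basis_dimensions(m=7, p=6, k=2)
print(dims)  # {'V_mtx': 56, 'W_mtx': 48, 'bwt': 8, 'st': 2}
```

For a fixed reduced order, the smaller per-point cost of the modified tangential approach translates into correspondingly more interpolation points.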

Concerning the computational complexity of the different interpolation approaches, we note that each column of the basis matrices V and W is obtained by solving a linear system of equations. However, for matrix and blockwise tangential interpolation, many of these linear systems have the same system matrix but different right-hand sides, which can be bundled and solved all at once to reduce the computational cost. Therefore, it is computationally more efficient to construct reduced-order models of a fixed order r with matrix interpolation than with blockwise tangential interpolation, and both are more efficient than the new tangential framework. This increase in computational complexity for the tangential approaches leads to the typical trade-off between efficiency of the computations and approximation quality of the reduced-order model: to compute a reduced-order model of size r, the new tangential approach allows interpolation conditions to be imposed at more distinct frequency points than matrix or blockwise tangential interpolation; see, for example, [6, 22] and our numerical experiments in Sect. 4. While sparse direct solvers are sufficient for the solution of the linear systems in most cases, it has been shown that the computational complexity can be further reduced using iterative (inexact) solution techniques; see [7] for the linear structured case and [18] for unstructured bilinear systems.

4 Numerical examples

In this section, we will compare different structure-preserving interpolation frameworks. We compute reduced-order models by:

\(\varvec{\textsf{MtxInt}{}}\):

the structure-preserving matrix interpolation from [13],

\(\varvec{\textsf{BwtInt}{}}\):

the structure-preserving blockwise tangential interpolation as in Sect. 3.4,

\(\varvec{\textsf{SftInt}{}}\):

the modified structure-preserving tangential interpolation framework motivated in the frequency domain (Sect. 3.1), and

\(\varvec{\textsf{SttInt}{}}\):

the generalized structure-preserving tangential interpolation framework motivated in the time domain (Sect. 3.2).

In the experiments, we use MATLAB notation to define the interpolation points: We write logspace(a, b, k) to denote k logarithmically equidistant points in the interval \([10^{a}, 10^{b}]\).
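For readers without MATLAB at hand, the notation can be mimicked by the following minimal Python sketch. The helper `imaginary_pairs`, which forms the conjugate pairs \(\pm \omega \mathfrak {i}\) used in the examples below, is a hypothetical name introduced only for this illustration.

```python
def logspace(a, b, k):
    """Return k logarithmically equidistant points in [10**a, 10**b] (k >= 2)."""
    if k < 2:
        raise ValueError("this sketch expects at least two points")
    step = (b - a) / (k - 1)
    return [10.0 ** (a + i * step) for i in range(k)]

def imaginary_pairs(points):
    """Return the complex conjugate pairs +i*w, -i*w for the given frequencies w."""
    return [s for w in points for s in (1j * w, -1j * w)]

freqs = logspace(-8, 2, 3)      # approximately 1e-8, 1e-3, 1e2
sigma = imaginary_pairs(freqs)  # six interpolation points on the imaginary axis
```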

For the qualitative analysis of the computed reduced-order models, we will consider approximation errors in time and frequency domains. In time domain, we consider the pointwise relative output error for a given input signal, i.e.,

$$\begin{aligned} \frac{\Vert y(t) - \hat{y}(t) \Vert _{2}}{\Vert y(t) \Vert _{2}}, \end{aligned}$$

where y and \(\hat{y}\) denote the original and reduced-order system outputs, respectively, in the time range \(t \in [0, t_{\textrm{f}}]\). Additionally, we compute the maximum error over time by

$$\begin{aligned} {{\,\textrm{err}\,}}_{\textrm{sim}}&:= \max \limits _{t \in [0, t_{\textrm{f}}]}\frac{\Vert y(t) - \hat{y}(t) \Vert _{2}}{\Vert y(t) \Vert _{2}}. \end{aligned}$$

In frequency domain, the pointwise relative error of the first and second transfer functions on the imaginary axis in the spectral norm is considered, i.e.,

$$\begin{aligned} \begin{aligned} \frac{\Vert G_{1}(\mathfrak {i}\,\omega _{1}) - \widehat{G}_{1}(\mathfrak {i}\,\omega _{1}) \Vert _{2}}{\Vert G_{1}(\mathfrak {i}\,\omega _{1}) \Vert _{2}}{} & {} \text {and}{} & {} \frac{\Vert G_{2}(\mathfrak {i}\,\omega _{1}, \mathfrak {i}\,\omega _{2}) - \widehat{G}_{2}(\mathfrak {i}\,\omega _{1}, \mathfrak {i}\,\omega _{2}) \Vert _{2}}{\Vert G_{2}(\mathfrak {i}\,\omega _{1}, \mathfrak {i}\,\omega _{2}) \Vert _{2}}, \end{aligned} \end{aligned}$$

in the frequency range \(\omega _{1}, \omega _{2} \in [\omega _{\min }, \omega _{\max }]\) together with the corresponding maximum errors over the frequency of interest defined as

$$\begin{aligned} {{\,\textrm{err}\,}}_{\textrm{G}_{1}}&:= \max \limits _{\omega _{1} \in [\omega _{\min }, \omega _{\max }]} \frac{\Vert G_{1}(\mathfrak {i}\,\omega _{1}) - \widehat{G}_{1}(\mathfrak {i}\,\omega _{1}) \Vert _{2}}{\Vert G_{1}(\mathfrak {i}\,\omega _{1}) \Vert _{2}}, \\ {{\,\textrm{err}\,}}_{\textrm{G}_{2}}&:= \max \limits _{\omega _{1}, \omega _{2} \in [\omega _{\min }, \omega _{\max }]} \frac{\Vert G_{2}(\mathfrak {i}\,\omega _{1}, \mathfrak {i}\,\omega _{2}) - \widehat{G}_{2}(\mathfrak {i}\,\omega _{1}, \mathfrak {i}\,\omega _{2}) \Vert _{2}}{\Vert G_{2} (\mathfrak {i}\,\omega _{1}, \mathfrak {i}\,\omega _{2}) \Vert _{2}}. \end{aligned}$$

Note that the time and frequency domain errors reported are actually approximated by evaluating the above expressions on a fine grid covering \([0,t_{\textrm{f}}]\) or \([\omega _{\min }, \omega _{\max } ]\), respectively.
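A minimal Python sketch of the grid-based time domain error measure is given below. The output trajectories are hypothetical toy data with \(p = 2\) outputs and \(t_{\textrm{f}} = 1\), and the function names are chosen only for this illustration; in practice y and \(\hat{y}\) would come from time integration of the full- and reduced-order models.

```python
import math

def norm2(v):
    """Euclidean norm of a (possibly complex) vector."""
    return math.sqrt(sum(abs(x) ** 2 for x in v))

def pointwise_relative_error(y, y_hat):
    """Relative output error ||y(t) - y_hat(t)||_2 / ||y(t)||_2 at every grid point."""
    return [norm2([a - b for a, b in zip(yt, yht)]) / norm2(yt)
            for yt, yht in zip(y, y_hat)]

# Hypothetical toy trajectories sampled on a uniform grid of [0, 1].
t_grid = [i / 100 for i in range(101)]
y = [[1.0 + t, math.cos(t)] for t in t_grid]
y_hat = [[(1.0 + t) * (1.0 + 1e-3 * t), math.cos(t)] for t in t_grid]

rel = pointwise_relative_error(y, y_hat)
err_sim = max(rel)  # grid approximation of the maximum over [0, t_f]
```

The frequency domain measures follow the same pattern, with the Euclidean vector norm replaced by the spectral norm of the transfer function values.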

The experiments reported here have been executed on machines with 2 Intel(R) Xeon(R) Silver 4110 CPU processors running at 2.10 GHz and equipped with either 192 GB or 384 GB total main memory. The computers run on CentOS Linux release 7.5.1804 (Core) with MATLAB 9.9.0.1467703 (R2020b). The source code, data and results of the numerical experiments are open source/open access and available at [41].

4.1 Cooling of steel profiles

We first consider a classical, unstructured bilinear system as in (2). For the optimal cooling of steel profiles, the heat transfer process is described by the two dimensional heat equation

$$\begin{aligned} c \rho \partial _{t} v(t, \zeta ) - \lambda \varDelta v(t, \zeta )&= 0,\\ v(0, \zeta )&= v_{0}(\zeta ), \end{aligned}$$

with \((t, \zeta ) \in (0, t_{\textrm{f}}) \times \varOmega \), the initial value \(v_{0}(\zeta )\) for \(\zeta \in \varOmega \), and the Robin boundary conditions

$$\begin{aligned} \lambda \partial _{\nu } v(t, \zeta )&= \left\{ \begin{aligned} q_{i} u_{i}(t) \big (1 - v(t, \zeta )\big ),{} & {} {}&\text {on}~\varGamma _{i}, i = 1, \ldots , 6,\\ q_{7} \big (u_{7}(t) - v(t, \zeta )\big ),{} & {} {}&\text {on}~\varGamma _{7}, \end{aligned} \right. \end{aligned}$$

such that \(\bigcup _{i = 1}^{7} \varGamma _{i} = \partial \varOmega \) and \(\varGamma _{i} \cap \varGamma _{j} = \emptyset \) for \(i \ne j\), where \(\partial _{\nu }\) denotes the derivative in the direction of the outer normal \(\nu \) and \(u_{i}(t)\) are the exterior cooling fluid temperatures used as controls. The spatial discretization of the rail-shaped domain and the parameters are chosen as described in [33, 37]. As a result, we consider a system of structure (2) with \(n = 5{,}054{,}209\) states, \(m = 7\) inputs, non-zero bilinear terms corresponding to the first 6 inputs, and \(p = 6\) outputs. The data for this example is available in [38].

The reduced-order models are constructed as follows:

\(\varvec{\textsf{MtxInt}{}}\):

with the interpolation points \(\pm \texttt {logspace(-8, 2, 3)} \mathfrak {i}\) for the first and second subsystem transfer functions. Due to the rank deficiency in the generated columns, a rank truncation is performed to compress the model reduction basis, which yields a reduced-order model size of \(r_{\textsf{mtx}} = 146\).

\(\varvec{\textsf{BwtInt}{}}\):

with the interpolation points \(\pm \texttt {logspace(-8, 2, 8)} \mathfrak {i}\) for the first and second subsystem transfer functions resulting in the reduced order \(r_{\textsf{bwt}} = 112\).

\(\varvec{\textsf{SftInt}{}}\):

with the interpolation points \(\pm \texttt {logspace(-8, 2, 28)} \mathfrak {i}\) and the scaling vectors \(d^{(i)} = \mathbb {1}_{m}\) for the first and second subsystem transfer functions resulting in the reduced order \(r_{\textsf{sft}} = 112\).

\(\varvec{\textsf{SttInt}{}}\):

with the interpolation points \(\pm \texttt {logspace(-8, 2, 28)} \mathfrak {i}\) and the scaling vectors \(d^{(i)} = b^{(i)}\) for the first and second subsystem transfer functions such that the reduced-order model size is \(r_{\textsf{stt}} = 112\).

For all reduced-order models, we have chosen the same interval for the interpolation points. However, since the reduced-order dimension grows differently for the different approaches, the number of interpolation points over this interval differs so that the reduced-order models have the same (or at least comparable) order. For all directions, normalized random vectors from a uniform distribution on [0, 1] have been used. For all reduced-order models, only one-sided projections (i.e., \(W = V\)) have been applied, resulting in reduced-order models with asymptotically stable linear parts. As anticipated, matrix interpolation yields a much larger reduced order.
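The choice of the tangential directions can be sketched in a few lines of Python. The sketch assumes that "normalized" refers to the Euclidean norm, which is not specified above; the function name `random_direction` and the fixed seed are likewise assumptions made for reproducibility of the illustration.

```python
import math
import random

def random_direction(dim, rng):
    """Draw entries uniformly from [0, 1) and scale to unit Euclidean norm (assumed)."""
    v = [rng.random() for _ in range(dim)]
    nrm = math.sqrt(sum(x * x for x in v))
    return [x / nrm for x in v]

rng = random.Random(0)        # fixed seed only for reproducibility of the sketch
b = random_direction(7, rng)  # one right tangential direction for m = 7 inputs
```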

Figure 1 shows the results for a time simulation using a unit step signal as input. All reduced-order models yield accurate approximations. The relative errors reveal that overall, \(\textsf{SftInt}\) and \(\textsf{SttInt}\) perform best, while \(\textsf{MtxInt}\) and \(\textsf{BwtInt}\) are several orders of magnitude worse in accuracy over the whole time interval. The maximum errors attained are given in Table 1. There, the two new tangential approaches \(\textsf{SftInt}\) and \(\textsf{SttInt}\) are both three orders of magnitude better than \(\textsf{MtxInt}\) for the time domain simulation.

The frequency domain analysis (Figs. 2 and 3) leads to similar conclusions. In the case of the first subsystem transfer function, \(\textsf{MtxInt}\) performs worst overall, followed by \(\textsf{BwtInt}\). The new approaches \(\textsf{SftInt}\) and \(\textsf{SttInt}\) again show the smallest errors over the full frequency range. For the second transfer function level, all approaches behave comparably. The tangential approaches provide smaller errors than \(\textsf{MtxInt}\) if both frequency arguments are close to each other, while \(\textsf{MtxInt}\) is more accurate for very small frequencies. For both transfer function levels, the maximum errors are given in Table 1. While for the time simulations and the first subsystem transfer functions, the worst-case relative approximation errors are as expected, for the second subsystem transfer functions these exceed 1 for all methods. This results from the fast decay of the original second subsystem transfer function for large frequency points and the lack of interpolation points for the second subsystem transfer function with mixed orders of magnitude in the different arguments. These large errors could easily be fixed by adding further interpolation points in the corresponding frequency regions. However, there is no known interpretation of the worst-case approximation error of the second subsystem transfer function in terms of the overall approximation quality of the reduced-order models, and the time simulations already perform sufficiently well.

Fig. 1 Steel profile: time simulations of the full- and reduced-order models. The new approaches perform around two orders of magnitude better than the classical \(\textsf{MtxInt}\) and \(\textsf{BwtInt}\) in terms of pointwise relative errors

Table 1 Steel profile: Maxima of the pointwise relative errors in time and frequency domain. In the time simulation and for the first subsystem transfer function, the new tangential approaches \(\textsf{SftInt}\) and \(\textsf{SttInt}\) outperform the classical methods by at least one order of magnitude. For the second subsystem transfer functions, all methods provide a worst-case error larger than one due to the fast decay of the transfer function for large frequencies and the lack of interpolation points in those regions. These pointwise large errors apparently do not affect the overall good accuracy of the reduced-order model for time domain simulations

Fig. 2 Steel profile: frequency domain results for the first subsystem transfer functions. The new approaches provide the smallest approximation errors over the whole frequency domain. \(\textsf{MtxInt}\) yields the worst approximation and even diverges visibly from the full-order model around 1 rad/s

Fig. 3 Steel profile: the plots show the pointwise relative approximation errors of the second subsystem transfer functions. All methods provide comparable errors, where \(\textsf{MtxInt}\), \(\textsf{BwtInt}\) and \(\textsf{SttInt}\) are more accurate for small frequencies in both arguments than \(\textsf{SftInt}\)

4.2 Time-delayed heated rod

Here, we consider the single-input/single-output structured bilinear system from [13, 25] that models a heated rod with distributed control and homogeneous Dirichlet boundary conditions, which is cooled by a delayed feedback. The underlying dynamics are described by the one dimensional heat equation

$$\begin{aligned} \partial _{t} v(t, \zeta )&= \varDelta v(t, \zeta ) - 2 \sin (\zeta ) v(t, \zeta ) + 2 \sin (\zeta ) v(t - 1, \zeta ) + u(t), \end{aligned}$$
(42)

with \((t, \zeta ) \in (0, t_{\textrm{f}}) \times (0, \pi )\) and boundary conditions \(v(t, 0) = v(t, \pi ) = 0\) for all \(t \in [0, t_{\textrm{f}}]\). As an extension of (42) to the MIMO case, we consider independent control signals on equally sized sections of the rod as well as analogous measurements. Using centered differences for the spatial discretization, we obtain the bilinear time-delay system

$$\begin{aligned} {\dot{x}}(t)&= Ax(t) + A_{\textrm{d}} x(t - 1) + \sum \limits _{k = 1}^{m} N_{k} x(t) u_{k}(t) + B u(t), \\ y(t)&= C x(t), \end{aligned}$$

with \(A, A_{\textrm{d}}, N_{k} \in \mathbb {R}^{n \times n}\), for \(k = 1, \ldots , m\), \(B \in \mathbb {R}^{n \times m}\) and \(C \in \mathbb {R}^{p \times n}\). For our experiments, we have chosen \(n = 5000\), \(m = 5\) and \(p = 2\).
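For orientation, and under the assumption that the structured notation of Sect. 3 is applied in the usual way for time-delay systems (this identification is not stated explicitly at this point and should be read as an assumption), a formal Laplace transform of the state equation with zero initial history suggests the matrix functions

$$\begin{aligned} \mathcal {K}(s)&= s I_{n} - A - e^{-s} A_{\textrm{d}},&\mathcal {N}(s)&= [N_{1}, \ldots , N_{m}],&\mathcal {B}(s)&= B,&\mathcal {C}(s)&= C, \end{aligned}$$

where the factor \(e^{-s}\) stems from the unit delay in \(x(t - 1)\).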

Fig. 4 Time-delay system: time simulations of the full- and reduced-order models. The \(\textsf{MtxInt}\) method performs best for this example with around one order of magnitude smaller errors than \(\textsf{SttInt}\), closely followed by the other approaches

The reduced-order models are constructed as follows:

\(\varvec{\textsf{MtxInt}{}}\):

with the interpolation points \(\pm 1 \mathfrak {i}\) for the first and second subsystem transfer functions. To overcome stability issues, only a one-sided projection was applied. The generated columns for the basis are rank deficient; therefore, a rank truncation has been performed to compress the model reduction basis, resulting in the reduced order \(r_{\textsf{mtx}} = 36\).

\(\varvec{\textsf{BwtInt}{}}\):

with the interpolation points \(\pm \texttt {logspace(-4, 4, 3)} \mathfrak {i}\) for the first and second subsystem transfer functions with two-sided projection yielding the reduced order \(r_{\textsf{bwt}} = 36\).

\(\varvec{\textsf{SftInt}{}}\):

with the interpolation points \(\pm \texttt {logspace(-4, 4, 9)} \mathfrak {i}\) and the scaling vectors \(d^{(i)} = \mathbb {1}_{m}\) for the first and second subsystem transfer functions with two-sided projection to get a reduced-order model of size \(r_{\textsf{sft}} = 36\).

\(\varvec{\textsf{SttInt}{}}\):

with the interpolation points \(\pm \texttt {logspace(-4, 4, 9)} \mathfrak {i}\) and the scaling vectors \(d^{(i)} = b^{(i)}\) for the first and second subsystem transfer functions with two-sided projection to get a reduced-order model of size \(r_{\textsf{stt}} = 36\).

For all directions, normalized random vectors from a uniform distribution on [0, 1] have been used. Note that all reduced-order models have the same time-delay structure as the original system (42). All reduced-order models are chosen to be of the same size.

Fig. 5 Time-delay system: frequency domain results for the first subsystem transfer functions. For small frequencies, \(\textsf{BwtInt}\) has around two orders of magnitude larger errors than the other approaches. Overall \(\textsf{SftInt}\) performs best

Fig. 6 Time-delay system: the plots show the pointwise relative approximation errors of the second subsystem transfer functions. \(\textsf{MtxInt}\) has the most accurate approximation behavior, but all methods provide very similar results

Figure 4 shows the results in time domain for the input signal

$$\begin{aligned} u(t)&= \begin{bmatrix} 0.05(\cos (10 t) + \cos (5 t))&0.05(\sin (10 t) + \sin (5 t))&0.01&0.01&0.01 \end{bmatrix}^{\hspace{-0.83328pt}\textsf{T}}. \end{aligned}$$

This time, \(\textsf{MtxInt}\) is a few orders of magnitude better than the other methods in the overall behavior, closely followed by \(\textsf{SttInt}\), then \(\textsf{BwtInt}\) and \(\textsf{SftInt}\). In terms of the maximum errors (Table 2), however, \(\textsf{SftInt}\) and \(\textsf{SttInt}\) are almost one order of magnitude better than \(\textsf{MtxInt}\). The results are different in the frequency domain. Figure 5 shows the results for the first subsystem transfer functions. While \(\textsf{BwtInt}\) still performs worst, \(\textsf{SftInt}\) now performs better than \(\textsf{MtxInt}\), which is also shown in Table 2. The error of \(\textsf{SttInt}\) mainly follows that of \(\textsf{MtxInt}\) over the whole frequency range and diverges only slightly at the end. This changes for the second transfer functions in Fig. 6. Here, \(\textsf{MtxInt}\) performs best, with \(\textsf{SttInt}\) having comparable accuracy. \(\textsf{BwtInt}\) and \(\textsf{SftInt}\) are worse than the other two approaches, with errors comparable to each other. In terms of the maximum errors (Table 2), \(\textsf{BwtInt}\) and \(\textsf{MtxInt}\) perform the best.

Further results for tangential interpolation of a related example using different choices of interpolation points can be found in [40, Sect. 5.6.5.2].

Table 2 Time-delay system: Maxima of the pointwise relative errors in time and frequency domain. Except for the second subsystem transfer function, \(\textsf{SftInt}\) provides the best approximation out of all the employed methods. For the second subsystem transfer functions, \(\textsf{MtxInt}\) produces a better error than \(\textsf{SftInt}\) by half an order of magnitude

4.3 Damped mass-spring system with bilinear springs

As the third and final example, we consider the MIMO bilinear damped mass-spring system from [13]. The system has a mechanical second-order structure as in (1) and takes the form

$$\begin{aligned} \begin{aligned} M \ddot{x}(t) + D {\dot{x}}(t) + K x(t)&= N_{\textrm{p}, 1} x(t) u_{1}(t) + N_{\textrm{p}, 2} x(t) u_{2}(t) + B_{\textrm{u}} u(t), \\ y(t)&= C_{\textrm{p}} x(t), \end{aligned} \end{aligned}$$
(43)

where \(M, D, K \in \mathbb {R}^{n \times n}\) are symmetric positive definite matrices chosen as in [29]. The external forces are applied to the first and last masses, \(B_{\textrm{u}} = [e_{1}, -e_{n}]\), and the displacements of the second and fifth masses are observed, \(C_{\textrm{p}} = [e_{2}, e_{5}]^{\hspace{-0.83328pt}\textsf{T}}\); thus the system has \(m = p = 2\) inputs and outputs. The bilinear springs are chosen to be

$$\begin{aligned} \begin{aligned} N_{\textrm{p}, 1}&= -S_{1} K S_{1}&\text {and}{} & {} N_{\textrm{p}, 2}&= S_{2} K S_{2}, \end{aligned} \end{aligned}$$

where \(S_{1}\) is a diagonal matrix with entries \(\texttt {linspace(0.2,0,n)}\) and \(S_{2}\) a diagonal matrix with \(\texttt {linspace(0,0.2,n)}\). For the experiments, we chose \(n = 1000\).
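The construction of the bilinear terms and of the input and output maps can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the stiffness matrix `K` below is a small tridiagonal placeholder and not the matrix from [29], the dimension `n = 10` is chosen for readability rather than the \(n = 1000\) used in the experiments, and the helper names `bilinear_springs` and `input_output_maps` are hypothetical.

```python
import numpy as np


def bilinear_springs(K):
    """Return (N_p1, N_p2) = (-S1 K S1, S2 K S2) for a stiffness matrix K."""
    n = K.shape[0]
    s1 = np.linspace(0.2, 0.0, n)  # diagonal of S1
    s2 = np.linspace(0.0, 0.2, n)  # diagonal of S2
    # (S K S)_{ij} = s_i * K_{ij} * s_j, so no dense diagonal matrices are needed.
    N_p1 = -(s1[:, None] * K * s1[None, :])
    N_p2 = s2[:, None] * K * s2[None, :]
    return N_p1, N_p2


def input_output_maps(n):
    """Return B_u = [e_1, -e_n] and C_p = [e_2, e_5]^T (1-based unit vectors)."""
    I = np.eye(n)
    B_u = np.column_stack([I[:, 0], -I[:, n - 1]])
    C_p = np.vstack([I[1, :], I[4, :]])
    return B_u, C_p


# ASSUMPTION: placeholder stiffness matrix, not the one from [29].
n = 10
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

N_p1, N_p2 = bilinear_springs(K)
B_u, C_p = input_output_maps(n)
```

Since \(K\) is symmetric and \(S_{1}\), \(S_{2}\) are diagonal, both bilinear terms are symmetric as well, which is the property the one-sided projections are meant to preserve.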

Fig. 7

Damped mass-spring system: time simulations of the full- and reduced-order models. All approaches provide a similar error behavior

It has already been shown in [13] that only the structure-preserving approximations give reasonable results for this example. Therefore, we only compare the structured approaches in this paper, i.e., all reduced-order models also have the mechanical system structure as in (43). The reduced-order models are constructed as follows:

\(\varvec{\textsf{MtxInt}{}}\):

with the interpolation points \(\pm \texttt {logspace(-4, 4, 2)} \mathfrak {i}\) for the first and second subsystem transfer functions, which yields the reduced order \(r_{\textsf{mtx}} = 24\).

\(\varvec{\textsf{BwtInt}{}}\):

with the interpolation points \(\pm \texttt {logspace(-4, 4, 4)} \mathfrak {i}\) for the first and second subsystem transfer functions such that the reduced order is \(r_{\textsf{bwt}} = 24\).

\(\varvec{\textsf{SftInt}{}}\):

with the interpolation points \(\pm \texttt {logspace(-4, 4, 6)} \mathfrak {i}\) and the scaling vectors \(d^{(i)} = \mathbb {1}_{m}\) for the first and second subsystem transfer functions such that the reduced order is \(r_{\textsf{sft}} = 24\).

\(\varvec{\textsf{SttInt}{}}\):

with interpolation points \(\pm \texttt {logspace(-4, 4, 6)} \mathfrak {i}\) and the scaling vectors \(d^{(i)} = b^{(i)}\) for the first and second subsystem transfer functions such that the reduced order is \(r_{\textsf{stt}} = 24\).

To preserve the symmetry of the system matrices, only one-sided projections have been used for the construction. For all directions, normalized random vectors have been generated by drawing their entries from a uniform distribution on [0, 1]. All reduced-order models have the same order.
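As a rough illustration of this setup, the interpolation points \(\pm \texttt {logspace(-4, 4, k)} \mathfrak {i}\) and the random directions could be generated as follows. The sketch rests on assumptions that are not stated in the text: the directions are normalized in the Euclidean norm, one direction of length \(m = 2\) is drawn per interpolation point, and the function names and the random seed are hypothetical.

```python
import numpy as np


def interpolation_points(k, lo=-4, hi=4):
    """Return the 2k points +/- logspace(lo, hi, k) * i on the imaginary axis."""
    omega = np.logspace(lo, hi, k)
    return np.concatenate([1j * omega, -1j * omega])


def random_directions(m, count, rng):
    """Draw `count` vectors of length m with entries uniform on [0, 1].

    ASSUMPTION: each column is normalized in the Euclidean norm.
    """
    V = rng.uniform(0.0, 1.0, size=(m, count))
    return V / np.linalg.norm(V, axis=0)


# Number of logspace samples used for each method in this example.
samples = {"MtxInt": 2, "BwtInt": 4, "SftInt": 6, "SttInt": 6}
points = {name: interpolation_points(k) for name, k in samples.items()}

rng = np.random.default_rng(0)  # fixed seed only for reproducibility
b = random_directions(2, points["SftInt"].size, rng)
```

Because the points come in complex conjugate pairs, a real-valued projection basis can be obtained from them, which is a common choice when real reduced-order matrices are wanted.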

Figure 7 shows the time simulation results for

$$\begin{aligned} u(t)&= \begin{bmatrix} \sin (200 t) + 200 \\ -\cos (200 t) - 200 \end{bmatrix}. \end{aligned}$$

All reduced-order models yield accurate results with practically the same approximation quality. As Table 3 shows, the new tangential approaches perform slightly better than \(\textsf{MtxInt}\) but still have the same order of accuracy. In the frequency domain, tangential and matrix interpolation also behave essentially the same, with matrix interpolation again slightly worse than the tangential approaches, as can be seen in Figs. 8 and 9 and in Table 3.
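For reference, a pointwise relative error and its maximum, as reported in Table 3, could be evaluated from sampled outputs along the following lines. This is a minimal sketch that assumes the error at each sample is measured in the Euclidean norm over the output components; the function name `pointwise_relative_error` and the toy data are hypothetical and not taken from the experiments.

```python
import numpy as np


def pointwise_relative_error(Y, Y_r):
    """Relative error per sample for outputs of shape (p, N).

    ASSUMPTION: Euclidean norm over the p output components at each sample.
    """
    return np.linalg.norm(Y - Y_r, axis=0) / np.linalg.norm(Y, axis=0)


# Toy data: a "reduced" output that is off by exactly 1 percent everywhere.
t = np.linspace(0.0, 1.0, 5)
Y = np.vstack([np.ones_like(t), 2.0 * np.ones_like(t)])
Y_r = 1.01 * Y

err = pointwise_relative_error(Y, Y_r)
max_err = err.max()  # worst-case value over all samples
```

The same function applies to frequency-domain data once the transfer function values at each frequency are flattened into a column.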

Further results for tangential interpolation of a related example using different choices of interpolation points can be found in [40, Sect. 5.6.5.1].

Fig. 8

Damped mass-spring system: frequency domain results for the first subsystem transfer functions. All methods provide a similar error behavior

Fig. 9

Damped mass-spring system: the plots show the pointwise relative approximation errors of the second subsystem transfer functions. All methods provide a similar error behavior

Table 3 Damped mass-spring system: Maxima of the pointwise relative errors in time and frequency domain. All methods provide similar worst-case errors compared to each other in time and frequency domain. \(\textsf{SttInt}\) performs best by around a factor of 1.5 compared to the rest

5 Conclusions

We developed the tangential interpolation framework for structure-preserving interpolation of multi-input/multi-output bilinear control systems. By revisiting the classical tangential interpolation in frequency domain and its interpretation in time domain, we developed a new unifying tangential interpolation framework for structure-preserving model reduction of MIMO bilinear systems and proved conditions on the model reduction subspaces to satisfy interpolation conditions in this new framework. We also used the new framework to obtain results on the blockwise tangential interpolation approach and extended the theory from the literature to structured bilinear systems. The generality of the new approach extends beyond the results explored in this paper: the construction of interpolating structured reduced-order models is now fully independent of the system dimensions, which adds more flexibility in choosing interpolation conditions than in previous approaches. The numerical examples illustrate that the new approach is as good as, and in many situations even better than, the full matrix or the blockwise tangential interpolation methods. In other words, the new approach gives sufficiently accurate results while allowing more freedom in choosing the order of the reduced-order model compared to the existing approaches.

While we used a rather simple choice for interpolation points (logarithmically equidistant on the imaginary axis), the question of better or even optimal choices of interpolation points remains open. Other choices of interpolation points, heuristically inspired by the linear system case, have been used in numerical examples in [40]. Also, in the setting of tangential interpolation, the question of appropriate tangential directions needs to be answered. For our new framework, we gave two approaches for choosing the scaling vectors. Still, the influence of the choice of the scaling vectors needs to be investigated, as does the question of an optimal choice. These issues will be considered in future work.