1 Introduction

In a number of recent papers [7,8,9,10], convergence results have been established for an overdetermined polynomial least-squares collocation method applied to two-point boundary value problems for higher-index differential-algebraic equations (DAEs). This method is comparable in computational efficiency with the widely used collocation methods for ordinary differential equations using piecewise polynomials. For initial value problems (IVPs), a considerable increase in numerical efficiency of overdetermined polynomial least-squares collocation is expected if time-stepping or windowing techniques can be constructed. Below, we consider some key issues in this respect. Our ultimate goal is to apply overdetermined collocation on successive individual time-windows, though we emphasize that the present note deals only with very first attempts in this direction.

We are interested in general initial-value problems (IVPs),

$$\displaystyle \begin{aligned} f((Dx)'(t), x(t),t) =0,\quad t\in[a,b],\quad G_{a}x(a) =r. \end{aligned} $$
(1.1)

Here, \(x:[a,b]\rightarrow \mathbb {R}^{m}\) is the unknown vector-valued function defined on the finite interval \([a,b]\subset \mathbb {R}\). We assume an explicit partitioning of the unknowns into differentiated and nondifferentiated (also called algebraic) components by selecting

$$\displaystyle \begin{aligned} D\in\mathbb{R}^{k\times m}, \quad D=[I_{k}\;0] \end{aligned} $$

with the identity matrix \(I_{k}\in \mathbb {R}^{k\times k}\). The function \(f:\mathbb {R}^{k}\times \mathbb {R}^{m}\times \mathbb {R}\rightarrow \mathbb {R}^{m} \) is assumed to be sufficiently smooth, at least continuous and with continuous partial derivatives with respect to the first and second arguments.

The initial values deserve special attention: for a solution to exist, they must be consistent. We will ensure this by requiring special properties of the matrix \(G_{a}\). It is reasonable to assume that at most the differentiated components \(x_{1},\ldots,x_{k}\) are fixed by initial conditions, which leads to the requirement

$$\displaystyle \begin{aligned} G_{a}\in \mathbb{R}^{l\times m}, \quad \ker G_{a}\supseteq\ker D, \end{aligned} $$

such that \(G_{a}x(a)=G_{a}D^{+}Dx(a)\). Moreover, we will assume that the initial conditions are independent of each other, that is, \( \operatorname {\mathrm {rank}} G_a = l\), where l denotes the actual dynamical degree of freedom. Later on, more detailed requirements depending on the DAE will be posed.

Let the interval [a, b] be decomposed into L subintervals,

$$\displaystyle \begin{aligned} a=w_{0}< w_{1}<\cdots< w_{L}=b, \end{aligned} $$

with lengths \(H_{\lambda}=w_{\lambda}-w_{\lambda-1}\), \(\lambda=1,\ldots,L\). First, for \(\lambda=1\), we provide an approximating segment \(\tilde {x}^{[1]}:[w_{0},w_{1}]\rightarrow \mathbb {R}^{m}\) by applying overdetermined collocation to the IVP

$$\displaystyle \begin{aligned} f((D\tilde x^{[1]})'(t),\tilde x^{[1]}(t),t) =0,\quad t\in [w_{0},w_{1}],\quad G_a\tilde x^{[1]}(a)=r. \end{aligned} $$
(1.2)

For λ > 1, having already the segment \(\tilde x^{[\lambda -1]}:[w_{\lambda -2},w_{\lambda -1}]\rightarrow \mathbb {R}^{m}\), we intend to determine the next segment \(\tilde x^{[\lambda ]}:[w_{\lambda -1},w_{\lambda }]\rightarrow \mathbb {R}^{m}\) by solving the DAE

$$\displaystyle \begin{aligned} f((D\tilde x^{[\lambda]})'(t),\tilde x^{[\lambda]}(t),t)=0, \quad t\in [w_{\lambda-1}, w_{\lambda}]. \end{aligned} $$
(1.3)

In order to obtain an appropriate approximation to the solution of (1.1), we need to compensate the now unavailable initial conditions by certain transfer conditions using \(\tilde x^{[\lambda -1]}\). Below we investigate two different approaches, namely,

$$\displaystyle \begin{aligned} G(w_{\lambda-1})\tilde x^{[\lambda]}(w_{\lambda-1})=G(w_{\lambda-1})\tilde x^{[\lambda-1]}(w_{\lambda-1}), \end{aligned} $$
(1.4)

with a suitably prescribed matrix function \(G:[a,b]\rightarrow \mathbb {R}^{l\times m}\), and

$$\displaystyle \begin{aligned} D\tilde x^{[\lambda]}(w_{\lambda-1})=D\tilde x^{[\lambda-1]}(w_{\lambda-1}). \end{aligned} $$
(1.5)

The construction of appropriate transfer conditions is crucial for the success of the method.
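The windowing procedure (1.2), (1.3) with the transfer condition (1.5) can be sketched as the following Python skeleton. Here `solve_window` is a hypothetical stand-in for the overdetermined least-squares collocation solve on one window; to keep the sketch runnable, it integrates the scalar test equation \(x'=-x\) exactly instead.

```python
# Sketch of the time-windowing skeleton (1.2)-(1.3) with transfer condition
# (1.5); "solve_window" is a hypothetical placeholder for the local
# overdetermined least-squares collocation solver.
import math

def solve_window(t0, t1, x0):
    # dummy local solver: exact solution of x' = -x with x(t0) = x0,
    # returned as a callable segment on [t0, t1]
    return lambda t: x0 * math.exp(-(t - t0))

def windowed_solve(a, b, L, x_a):
    ws = [a + i * (b - a) / L for i in range(L + 1)]   # window boundaries
    segments = []
    xr = x_a                       # value transferred at the window boundary
    for lam in range(1, L + 1):
        seg = solve_window(ws[lam - 1], ws[lam], xr)
        segments.append(seg)
        # transfer condition (1.5): next segment starts from D x at w_lam
        xr = seg(ws[lam])
    return ws, segments

ws, segs = windowed_solve(0.0, 1.0, 4, 1.0)
print(segs[-1](1.0))   # approximately exp(-1)
```

The segments agree at the window boundaries by construction of the transfer step, which is exactly what (1.5) enforces for the differentiated components.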

In the present note we merely deal with the linear version of the IVP,

$$\displaystyle \begin{aligned} A(t)(Dx)'(t)+B(t)x(t) -q(t)&=0,\quad t\in[a,b],{} \end{aligned} $$
(1.6)
$$\displaystyle \begin{aligned} G_{a}x(a) & =r,{} \end{aligned} $$
(1.7)

in which the right-hand side \(q:[a,b]\rightarrow \mathbb {R}^{m}\) and the matrix coefficients \(A:[a,b]\rightarrow \mathbb {R}^{m\times k}\) and \(B:[a,b]\rightarrow \mathbb {R}^{m\times m}\) are assumed to be sufficiently smooth, at least continuous, and thus uniformly bounded.

As is well known, conventional time-stepping methods such as the BDF methods in the famous DAE solver DASSL work well only when applied to index-1 DAEs and index-2 DAEs of special form. The time-stepping solvers available so far for more general higher-index DAEs are tied to the construction and evaluation of so-called derivative array systems, e.g., [3, 4, 12, 16, 17], which accounts for a serious limitation in view of applications. The recently discussed ansatz of overdetermined least-squares collocation [7,8,9,10] fully avoids the use of derivative arrays, and no reduction procedures are incorporated, which is highly beneficial. However, it is a global ansatz over the entire interval, not a time-stepping method, and large ill-conditioned discrete systems may arise. For this reason, a time-stepping version would eventually be much more advantageous. Recall that we present only very first related ideas here.

The paper is organized as follows: We describe the global overdetermined collocation procedure in Sect. 2 and collect there the relevant convergence results. In Sect. 3 we derive basic error estimates for overdetermined collocation on arbitrary individual subintervals corresponding to both procedures (1.2)–(1.3) and (1.4). A corresponding result for the approach (1.2)–(1.3) and (1.5) is provided in Sect. 4. We study the simpler time-stepping version with uniform window-size H and the same uniform stepsize on all subintervals in Sect. 5. Convergence of the method using the transfer condition (1.4) is shown in Sect. 5.1. However, our estimates in Sect. 4 are not sufficient to show convergence for the case (1.3), (1.5). Therefore, an investigation of a very special system in Sect. 5.3 provides some hints on what could be expected in that case. In order to demonstrate the behavior of the method, we provide a more complex example in Sect. 6 using both approaches, (1.4) as well as (1.5).

2 Global Overdetermined Collocation

2.1 The Global Procedure

Let us first consider the case of global overdetermined collocation, that is, L = 1 and H = b − a. For a given \(n\in \mathbb {N}\), let a grid π on the interval [a, b] be defined by

$$\displaystyle \begin{aligned} \pi:\quad a=t_{0}<\cdots<t_{n}=b, \end{aligned}$$

where \(t_{j}=a+jh\) and \(h=(b-a)/n\).

In order to introduce collocation conditions we will need a space of piecewise continuous functions. Let \(C_{\pi }([a,b],\mathbb {R}^{m})\) denote the space of all functions \(x:[a,b]\rightarrow \mathbb {R}^{m}\) which are continuous on each subinterval \((t_{j-1},t_{j})\) and feature continuous extensions onto \([t_{j-1},t_{j}]\), \(j=1,\ldots,n\). Furthermore, let \(\mathcal {P}_{N}\) denote the set of polynomials of degree at most N, N ≥ 1. We define the ansatz space

$$\displaystyle \begin{aligned} X_{\pi}=\{p\in C_{\pi}([a,b],\mathbb{R}^{m})&\vert Dp\in C([a,b],\mathbb{R}^{k} ),\\ & p_{\kappa}\vert_{(t_{j-1},t_{j})}\in\mathcal{P}_{N},\kappa=1,\ldots,k, \\ & p_{\kappa}\vert_{(t_{j-1},t_{j})}\in\mathcal{P}_{N-1},\;\kappa=k+1,\ldots,m,\\ & j=1,\ldots,n\}. \end{aligned} $$

Let now M points \(\tau_{i}\) be given such that \(0<\tau_{1}<\cdots<\tau_{M}<1\). The set of collocation points is given by

$$\displaystyle \begin{aligned} S_{\pi,M}=\{t_{ji}=t_{j-1}+\tau_{i}h\vert\; j=1,\ldots,n,\;i=1,\ldots,M\}. \end{aligned} $$
(2.1)
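In code, the grid π and the set \(S_{\pi,M}\) of (2.1) might be generated as follows; the equidistant interior nodes \(\tau_i=(i-1/2)/M\) are an illustrative assumption, not a prescription of the method.

```python
# Minimal sketch: uniform grid pi on [a,b] and collocation set S_{pi,M}
# of (2.1); the interior nodes tau_i are assumed equidistant here.
def collocation_points(a, b, n, M):
    h = (b - a) / n
    taus = [(i + 0.5) / M for i in range(M)]       # 0 < tau_1 < ... < tau_M < 1
    grid = [a + j * h for j in range(n + 1)]       # t_0 < t_1 < ... < t_n
    S = [grid[j - 1] + tau * h for j in range(1, n + 1) for tau in taus]
    return grid, S

grid, S = collocation_points(0.0, 1.0, 4, 3)
print(len(S))    # n*M = 12 collocation points
```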

Using this set \(S_{\pi,M}\), an interpolation operator \(R_{\pi ,M}:C_{\pi }([a,b],\mathbb {R}^{m})\rightarrow C_{\pi }([a,b],\mathbb {R}^{m})\) is given by assigning, to each \(w\in C_{\pi }([a,b],\mathbb {R}^{m})\), the piecewise polynomial \(R_{\pi,M}w\) with

$$\displaystyle \begin{aligned} R_{\pi,M}w\vert_{(t_{j-1},t_{j})}\in\mathcal{P}_{M-1},\;j=1,\ldots,n,\quad R_{\pi,M}w(t)=w(t),\,t\in S_{\pi,M}. \end{aligned} $$

The functional

$$\displaystyle \begin{aligned} \varPhi_{\pi,M}(x) =\lVert R_{\pi,M}(f((Dx)'(\cdot),x(\cdot),\cdot))\rVert_{L^{2}}^{2}+\lvert G_{a}x(a)-r\rvert^{2},\quad x\in X_{\pi}, \end{aligned} $$

can be represented as (cf. [10, Subsection 2.3], also [8, 9])

$$\displaystyle \begin{aligned} \varPhi_{\pi,M}(x) =W^{T}\mathcal L W+\lvert G_{a}x(a)-r\rvert^{2},\quad x\in X_{\pi}, \end{aligned} $$

with the vector \(W\in \mathbb {R}^{mMn}\),

$$\displaystyle \begin{aligned} W=\begin{bmatrix} W_{1}\\\vdots\\W_{n} \end{bmatrix}\in \mathbb{R}^{mMn},\quad W_{j}=\left(\frac{h}{M}\right)^{1/2} \begin{bmatrix} f((Dx)'(t_{j1}),x(t_{j1}),t_{j1})\\\vdots\\f((Dx)'(t_{jM}),x(t_{jM}),t_{jM}) \end{bmatrix}\in \mathbb{R}^{mM}, \end{aligned} $$

with the matrix \(\mathcal L\) being symmetric, positive definite, and independent of h. Moreover, there are constants \(\kappa_{l},\kappa_{u}>0\) such that

$$\displaystyle \begin{aligned} \kappa_{l}\;\lvert V\rvert^{2}\leq V^{T}\mathcal L V\leq \kappa_{u}\;\lvert V\rvert^{2},\quad V\in \mathbb{R}^{mMn}. \end{aligned} $$
(2.2)

If the DAE in (1.1) is regular with index one, l = k, and M = N, then there is an element \(\tilde x_{\pi }\in X_{\pi }\) such that \(\varPhi _{\pi ,M}(\tilde x_{\pi })=0\); this corresponds to the classical collocation method, resulting in a system of nMm + l equations in nNm + k = nMm + l unknowns. Though classical collocation works well for regular index-1 DAEs (e.g., [14]), it is known to be useless for higher-index DAEs.

Reasonably, one applies l initial conditions in compliance with the dynamical degree of freedom of the DAE. For higher-index DAEs, the dynamical degree of freedom is always less than k. For 0 ≤ l ≤ k and M ≥ N + 1, an overdetermined collocation system necessarily results, since nMm + l > nNm + k. Overdetermined least-squares collocation consists of choosing M ≥ N + 1 and then determining an element \(\tilde x_{\pi }\in X_{\pi }\) which minimizes the functional \(\varPhi_{\pi,M}\), i.e.,

$$\displaystyle \begin{aligned} \tilde{x}_{\pi} \in\operatorname{\mathrm{argmin}}\{\varPhi_{\pi,M}(x)\vert x\in X_{\pi}\}. \end{aligned} $$

This runs astonishingly well [9, 10], see also Sect. 6.
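To make the least-squares step concrete, the following sketch applies overdetermined collocation to a hypothetical linear index-2 system \(x_1'+x_2=q_1\), \(x_1=q_2\) (so m = 2, k = 1, D = [1 0], and l = 0, i.e., no initial condition is imposed), with one subinterval, N = 2, M = 4. The data are chosen so that \(x_1=t^2\), \(x_2=1-2t\) is the exact solution; this is an illustrative toy problem, not one of the paper's test examples.

```python
# Sketch: overdetermined least-squares collocation for the toy index-2 DAE
#   x1'(t) + x2(t) = q1(t),   x1(t) = q2(t),
# on [0,1] with one subinterval, ansatz degrees N=2 (x1) and N-1=1 (x2),
# and M=4 collocation points, solved with numpy's lstsq.
import numpy as np

M = 4
q1 = lambda t: 1.0        # chosen so that x1 = t^2, x2 = 1 - 2t is exact
q2 = lambda t: t**2
taus = [(i + 0.5) / M for i in range(M)]      # collocation points in (0,1)

rows, rhs = [], []
for t in taus:
    # unknowns u = (a0, a1, a2, b0, b1): x1 = a0 + a1 t + a2 t^2, x2 = b0 + b1 t
    rows.append([0.0, 1.0, 2.0 * t, 1.0, t]); rhs.append(q1(t))  # x1' + x2 = q1
    rows.append([1.0, t, t**2, 0.0, 0.0]);    rhs.append(q2(t))  # x1 = q2

u, *_ = np.linalg.lstsq(np.array(rows), np.array(rhs), rcond=None)
a0, a1, a2, b0, b1 = u
print(a2, b1)   # close to 1 and -2
```

Since the exact solution lies in the ansatz space, the 8-by-5 least-squares system is consistent and the minimizer reproduces it up to rounding.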

2.2 Convergence Results for the Global Overdetermined Collocation Applied to Linear IVPs

We now specify results obtained for boundary value problems in [8,9,10] for a customized application to IVPs. Even though we always assume a sufficiently smooth classical solution \(x_{*}:[a,b]\rightarrow \mathbb {R}^{m}\) of the IVP (1.6), (1.7) to exist, for the following, an operator setting in Hilbert spaces will be convenient. The spaces to be used are:

$$\displaystyle \begin{aligned} L^{2}=L^{2}((a,b),\mathbb{R}^{m}),\quad H_{D}^{1}=\{x\in L^{2}\vert Dx\in H^{1}((a,b),\mathbb{R}^{k})\},\quad Y=L^{2}\times\mathbb{R}^{l}. \end{aligned} $$

The operator \(T:H_{D}^{1}\rightarrow L^{2}\) given by

$$\displaystyle \begin{aligned} (Tx)(t)=A(t)(Dx)'(t)+B(t)x(t),\quad a.e.\; t\in(a,b),\quad x\in H^{1}_{D}, \end{aligned} $$

is bounded. Since, for \(x\in H_{D}^{1}\), the values Dx(a) and thus \(G_{a}x(a)=G_{a}D^{+}Dx(a)\) are well-defined, the composed operator \(\mathcal {T}:H^{1}_{D}\rightarrow Y\) given by

$$\displaystyle \begin{aligned} \mathcal{T}x=\left[\begin{array}{c} Tx\\ G_{a}x(a) \end{array}\right],\quad x\in H^{1}_{D}, \end{aligned}$$

is well-defined and also bounded.

Let \(U_{\pi }:H^{1}_{D}\rightarrow H^{1}_{D}\) denote the orthogonal projector of the Hilbert space \(H^{1}_{D}\) onto X π.

For a more concise notation later on, we introduce the composed interpolation operator \(\mathcal {R}_{\pi ,M}: C_{\pi }([a,b],\mathbb {R}^{m})\times \mathbb {R}^{l} \rightarrow Y\),

$$\displaystyle \begin{aligned} \mathcal{R}_{\pi,M}\left[\begin{array}{c} w\\ r \end{array}\right]=\left[\begin{array}{cc} R_{\pi,M} & 0\\ 0 & I \end{array}\right]\left[\begin{array}{c} w\\ r \end{array}\right]. \end{aligned}$$

With these settings, overdetermined least-squares collocation reduces to the minimization of

$$\displaystyle \begin{aligned} \varPhi_{\pi,M}(x)=\lVert R_{\pi,M}(Tx-q)\rVert_{L^{2}}^{2}+\lvert G_{a}x(a)-r\rvert^{2}=\lVert \mathcal R_{\pi,M}(\mathcal Tx-y)\rVert_{Y}^{2},\, x\in X_{\pi}, \end{aligned} $$

that is, to find

$$\displaystyle \begin{aligned} \tilde{x}_{\pi} & \in\operatorname{\mathrm{argmin}}\{\varPhi_{\pi,M}(x)\vert x\in X_{\pi}\}. \end{aligned} $$

Later on, we will provide conditions which ensure that \(\ker \mathcal R_{\pi ,M}\mathcal TU_{\pi }=X_{\pi }^{\bot }\) such that \( \tilde {x}_{\pi }\) is uniquely defined. Therefore,

$$\displaystyle \begin{aligned} \tilde{x}_{\pi}= (\mathcal R_{\pi,M}\mathcal TU_{\pi})^{+}\mathcal R_{\pi,M}\,y. \end{aligned}$$

We consider also the related functional

$$\displaystyle \begin{aligned} \varPhi(x) =\lVert Tx-q\rVert_{L^{2}}^{2}+\lvert G_{a}x(a)-r\rvert^{2}=\lVert \mathcal Tx-y\rVert_{Y}^{2},\quad x\in H^{1}_{D}, \end{aligned} $$

and the corresponding method for approximating the solution \(x_{*}\) by determining

$$\displaystyle \begin{aligned} x_{\pi} & \in\operatorname{\mathrm{argmin}}\{\varPhi(x)\vert x\in X_{\pi}\}. \end{aligned} $$

As before, the conditions assumed below will guarantee that the minimizer x π is unique such that

$$\displaystyle \begin{aligned} x_\pi=(\mathcal TU_{\pi})^{+}y. \end{aligned}$$

Below, the operator \(\mathcal T\) is ensured to be injective. Since \(\mathcal T\) is associated with a higher-index DAE, the inverse \(\mathcal T^{-1}\) is unbounded and the IVP is essentially ill-posed in the sense of Tikhonov. Following ideas to treat ill-posed problems, e.g., [11], the proofs in [8,9,10] are based on estimates of the type

$$\displaystyle \begin{aligned} \lVert x_{\pi}-x_{\ast}\rVert_{H_{D}^{1}} & \leq\frac{\beta_{\pi}}{\gamma_{\pi}}+\alpha_{\pi},\\ \lVert\tilde{x}_{\pi}-x_{\ast}\rVert_{H_{D}^{1}} & \leq\frac{\tilde{\beta}_{\pi}}{\tilde{\gamma}_{\pi}}+\alpha_{\pi}, \end{aligned} $$

in which

$$\displaystyle \begin{aligned} \alpha_{\pi} & =\lVert (I-U_{\pi})x_{\ast}\rVert_{H_{D}^{1}},\\ \beta_{\pi} & =\lVert\mathcal{T}(I-U_{\pi})x_{\ast}\rVert_{Y},\\ \tilde{\beta}_{\pi} & =\lVert\mathcal{R}_{\pi,M}\mathcal T(I-U_{\pi})x_{\ast}\rVert_{Y},\\ \gamma_{\pi} & =\inf_{p\in X_{\pi},p\neq0}\frac{\lVert\mathcal{T}p\rVert_{Y}}{\lVert p\rVert_{H^{1}_{D}}}=\inf_{p\in X_{\pi},p\neq0}\left(\frac{\lVert Tp\rVert_{L^{2}}^{2}+\lvert G_{a}p(a)\rvert^{2}}{\lVert p\rVert_{H_{D}^{1}}^{2}}\right)^{1/2},\\ \tilde{\gamma}_{\pi} & =\inf_{p\in X_{\pi},p\neq0}\frac{\lVert\mathcal{R}_{\pi,M}\mathcal Tp\rVert_{Y}}{\lVert p\rVert_{H^{1}_{D}}}=\inf_{p\in X_{\pi},p\neq0}\left(\frac{\lVert R_{\pi,M}Tp\rVert_{L^{2}}^{2}+\lvert G_{a}p(a)\rvert^{2}}{\lVert p\rVert_{H_{D}^{1}}^{2}}\right)^{1/2}. \end{aligned} $$

The most challenging task in this context is to provide suitable positive lower bounds of the instability thresholds \(\gamma_{\pi}\) and \(\tilde{\gamma}_{\pi}\) [8,9,10] or, equivalently, upper bounds for the Moore-Penrose inverses

$$\displaystyle \begin{aligned} \lVert (\mathcal TU_{\pi})^{+}\rVert =\frac{1}{\gamma_{\pi}},\quad \lVert (\mathcal R_{\pi,M}\mathcal TU_{\pi})^{+}\rVert =\frac{1}{\tilde \gamma_{\pi}}. \end{aligned} $$

It should be noted that \(\mathcal {T}\) and \(\mathcal R_{\pi ,M}\mathcal T\) are of a very different nature: while \(\mathcal {T}\) is bounded, \(\mathcal R_{\pi ,M}\mathcal T\) is unbounded owing to the fact that \(R_{\pi,M}\) is an unbounded operator in \(L^{2}\), see [8].

We now briefly summarize the relevant estimates for IVPs resulting from [8, 9], to which we refer for details.

The general assumptions with respect to the DAE and the initial conditions are:

  1. The operator T is fine with tractability index μ ≥ 2 and characteristic values \(0<r_{0}\leq\cdots\leq r_{\mu-1}<r_{\mu}=m\).

  2. The initial conditions are accurately stated such that \(l=m-\sum _{i=0}^{\mu -1}(m-r_{i})\) and \(G_{a}=G_{a}\varPi_{can}(a)\), with the canonical projector \(\varPi_{can}\). This implies \( \operatorname {\mathrm {im}} \mathcal T= \operatorname {\mathrm {im}} T \times \mathbb {R}^{l}\), see [14, Theorem 2.1].

  3. The coefficients A, B, the right-hand side \(q\in \operatorname {\mathrm {im}} T\), and the solution \(x_{*}\) are sufficiently smooth.

Result (a), see [9]:

Assume M ≥ N + 1. Then there are positive constants \(c_{\alpha}\), \(c_{\beta}\), \(c_{\gamma}\), and c such that, for all sufficiently small stepsizes h > 0,

$$\displaystyle \begin{aligned} \gamma_{\pi}\geq c_{\gamma}h^{\mu-1},\quad \alpha_{\pi}\leq c_{\alpha}h^{N},\quad \beta_{\pi}\leq c_{\beta}h^{N}, \end{aligned} $$

and eventually

$$\displaystyle \begin{aligned} \lVert x_{\pi}-x_{\ast}\rVert_{H_{D}^{1}}\leq c\,h^{N-\mu+1}. \end{aligned} $$
Result (b), see [8]:

Assume M ≥ N + μ. Then there are positive constants \(c_{\alpha },\tilde {c}_{\beta }\), \(\tilde {c}_{\gamma }\), and \(\tilde c\) such that, for all sufficiently small stepsizes h > 0,

$$\displaystyle \begin{aligned} \tilde{\gamma}_{\pi}\geq \tilde{c}_{\gamma}h^{\mu-1},\quad \alpha_{\pi}\leq c_{\alpha}h^{N},\quad \tilde{\beta}_{\pi}\leq \tilde{c}_{\beta}h^{N}, \end{aligned} $$

and eventually

$$\displaystyle \begin{aligned} \lVert \tilde{x}_{\pi}-x_{\ast}\rVert_{H_{D}^{1}}\leq \tilde{c}\,h^{N-\mu+1}. \end{aligned} $$

By [8], one can do with \(\tilde c_{\gamma }=c_{\gamma }/2\). We refer to [9, 10] for a series of tests which confirm these estimates or even perform better. Recall that, so far, IVPs for higher-index DAEs are integrated by techniques which evaluate derivative arrays, e.g., [5]. Compared with those methods, even the global overdetermined collocation method features beneficial properties. However, a time-stepping version could be much more advantageous.
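In numerical tests, the orders \(h^{N-\mu+1}\) predicted by Results (a) and (b) are typically confirmed by comparing errors on two grids; a minimal sketch of such an order estimate follows, with synthetic error values rather than data from [9, 10].

```python
# Sketch: estimating an observed convergence order from errors e1, e2
# obtained with stepsizes h1, h2 (the standard log-ratio formula).
import math

def observed_order(e1, h1, e2, h2):
    return math.log(e1 / e2) / math.log(h1 / h2)

# synthetic errors behaving like c*h^2, e.g. N = 3, mu = 2 in Result (a):
c = 0.7
print(observed_order(c * 0.1**2, 0.1, c * 0.05**2, 0.05))   # close to 2
```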

3 Overdetermined Collocation on an Arbitrary Subinterval \([\bar {t},\bar {t}+H]\subset [a,b]\)

3.1 Preliminaries

We continue to consider the IVP (1.6), (1.7) as described above, but instead of the global approach immediately capturing the entire interval [a, b] we now aim at stepping forward by means of consecutive time-windows applying overdetermined least-squares collocation on each window. As special cases, we have in mind the two windowing procedures outlined by (1.2), (1.3), and (1.4), and by (1.2), (1.3), and (1.5). At the outset we ask how overdetermined collocation works on an arbitrary subinterval,

$$\displaystyle \begin{aligned}{}[\bar t,\bar t+H]\subseteq [a,b]. \end{aligned} $$

It will become important to relate global quantities (valid for overdetermined least-squares collocation on [a, b]) to their local counterparts (appropriate on subintervals of length H). We introduce the function spaces related to this subinterval,

$$\displaystyle \begin{aligned} &L^{2}_{sub}=L^{2}((\bar t,\bar t+H),\mathbb{R}^{m}),\quad H^{1}_{sub}=H^{1}((\bar{t},\bar{t}+H),\mathbb{R}^{k}),\\ &H_{D,sub}^{1}=\{x\in L^{2}_{sub}\vert Dx\in H^{1}_{sub}\},\quad Y_{sub}=L^{2}_{sub}\times\mathbb{R}^{l},\quad \quad \hat Y_{sub}=L^{2}_{sub}\times\mathbb{R}^{k}, \end{aligned} $$

equipped with natural norms, in particular,

$$\displaystyle \begin{aligned} \lVert x\rVert_{H^{1}_{D,sub}}=(\lVert x\rVert_{L^{2}_{sub}}^{2}+\lVert(Dx)'\rVert_{L^{2}_{sub}}^{2})^{1/2},\quad x\in H^{1}_{D,sub}. \end{aligned} $$

Note that we indicate quantities associated with the subinterval by the extra subscript sub only where necessary to avoid misunderstandings.

Now we assume that the grid π is related to the subinterval only,

$$\displaystyle \begin{aligned} \pi:\quad \bar t=t_{0}<\cdots<t_{n}=\bar t+H, \end{aligned}$$

where \(t_{j}=\bar t+jh\) and \(h=H/n\). The ansatz space now reads

$$\displaystyle \begin{aligned} X_{\pi}=\{p\in C_{\pi}([\bar t,\bar t+H],\mathbb{R}^{m})&\vert \;Dp\in C([\bar t,\bar t+H],\mathbb{R}^{k} ),\\ p_{\kappa}\vert_{(t_{j-1},t_{j})}\in\mathcal{P}_{N},\;\kappa=1,\ldots,&k,\; p_{\kappa}\vert_{(t_{j-1},t_{j})}\in\mathcal{P}_{N-1},\;\kappa=k+1,\ldots,m,\\ & j=1,\ldots,n\}. \end{aligned} $$

With 0 < τ 1 < ⋯ < τ M < 1, the set of collocation points

$$\displaystyle \begin{aligned} S_{\pi,M}=\{t_{ji}=t_{j-1}+\tau_{i}h\vert\; j=1,\ldots,n,\;i=1,\ldots,M\} \end{aligned} $$
(3.1)

belongs to the subinterval \([\bar t,\bar t+H]\). Correspondingly, the interpolation operator R π,M acts on \(C_{\pi }([\bar t,\bar t+H],\mathbb {R}^{m})\). We introduce the operator \(T_{sub}:H_{D,sub}^{1}\rightarrow L^{2}_{sub}\),

$$\displaystyle \begin{aligned} (T_{sub}x)(t)=A(t)(Dx)'(t)+B(t)x(t),\; a.e.\; t\in(\bar t,\bar t+H),\; x\in H^{1}_{D,sub}, \end{aligned} $$

and the composed operators \(\mathcal {T}_{sub}:H^{1}_{D,sub}\rightarrow Y_{sub}\) and \(\hat {\mathcal {T}}_{sub}:H^{1}_{D,sub}\rightarrow \hat {Y}_{sub}\),

$$\displaystyle \begin{aligned} \mathcal{T}_{sub}x=\left[\begin{array}{c} T_{sub}x\\ G(\bar t)x(\bar t) \end{array}\right],\quad \hat{\mathcal{T}}_{sub}x=\left[\begin{array}{c} T_{sub}x\\ Dx(\bar t) \end{array}\right],\quad x\in H^{1}_{D,sub}. \end{aligned}$$

Occasionally, we also use the operators \(T_{IC,sub}:H^{1}_{D,sub}\rightarrow \mathbb {R}^{l}\) and \(T_{ICD,sub}:H^{1}_{D,sub}\rightarrow \mathbb {R}^{k}\) given by

$$\displaystyle \begin{aligned} T_{IC,sub}x=G(\bar t)x(\bar t) ,\quad T_{ICD,sub}x=Dx(\bar t) ,\quad x\in H^{1}_{D,sub}, \end{aligned} $$

which are associated with the initial condition posed at \(\bar t\). Here, aiming for injective composed operators, we suppose a function \(G:[a,b]\rightarrow \mathbb{R}^{l\times m}\) such that

$$\displaystyle \begin{aligned} \ker G(t)=\ker {\varPi}_{can}(t),\; \operatorname{\mathrm{im}} G(t)=\mathbb{R}^l,\; \lvert G(t)\rvert\leq c_{G},\; t\in [a,b]. \end{aligned} $$
(3.2)

Since \(T_{sub}\) inherits the tractability index, the characteristic values of T, and also the canonical projector (restricted to the subinterval, see [13, Section 2.6]), the local initial condition at \(\bar t\), \(G(\bar t)x(\bar t)=r\), is accurately stated. Then \( \operatorname {\mathrm {im}} \mathcal T_{sub}= \operatorname {\mathrm {im}} T_{sub}\times \mathbb {R}^{l}\) and \(\ker \mathcal T_{sub}=\{0\}\), so that overdetermined least-squares collocation on \([\bar t,\bar t+H]\) works analogously to the global version described in Sect. 2.

The composed interpolation operators \(\mathcal {R}_{\pi ,M}\) and \(\hat {\mathcal {R}}_{\pi ,M}\) act now on \(C_{\pi }([\bar t,\bar t+H],\mathbb {R}^{m})\times \mathbb {R}^{l}\) and \(C_{\pi }([\bar t,\bar t+H],\mathbb {R}^{m})\times \mathbb {R}^{k}\),

$$\displaystyle \begin{aligned} \mathcal{R}_{\pi,M}\left[\begin{array}{c} w\\ r \end{array}\right]=\left[\begin{array}{cc} R_{\pi,M} & 0\\ 0 & I_{l} \end{array}\right]\left[\begin{array}{c} w\\ r \end{array}\right],\quad \hat{\mathcal{R}}_{\pi,M}\left[\begin{array}{c} w\\ \hat{r} \end{array}\right]=\left[\begin{array}{cc} R_{\pi,M} & 0\\ 0 & I_{k} \end{array}\right]\left[\begin{array}{c} w\\ \hat{r} \end{array}\right]. \end{aligned}$$

Let \(U_{\pi ,sub}:H^{1}_{D,sub}\rightarrow H^{1}_{D,sub}\) be the orthogonal projector of \(H^{1}_{D,sub}\) onto \(X_{\pi }\subset H^{1}_{D,sub}\).

Accordingly, we define \(\alpha_{\pi,sub}\) and, furthermore, \(\beta_{\pi,sub}\), \(\gamma_{\pi,sub}\), \(\tilde {\beta }_{\pi ,sub},\) \(\tilde {\gamma }_{\pi ,sub}\), associated with the operator \(\mathcal T_{sub}\) and, similarly, \(\hat {\beta }_{\pi ,sub},\) \(\hat {\gamma }_{\pi ,sub},\) \(\tilde {\hat {\beta }}_{\pi ,sub},\) \(\tilde {\hat {\gamma }}_{\pi ,sub}\) associated with \(\hat {\mathcal T}_{sub}\).

The following lemma provides conditions for the existence of a function \(G:[a,b]\rightarrow \mathbb {R}^{l\times m}\) having the properties (3.2). The latter is a necessary prerequisite for the transfer condition (1.4).

Lemma 3.1

Let the operator T be fine with tractability index μ ≥ 2, characteristic values \(0<r_{0}\leq\cdots\leq r_{\mu-1}<r_{\mu}=m\), \(l=m-\sum _{i=0}^{\mu -1}(m-r_{i})\), and the canonical projector function \(\varPi_{can}\).

Then there are continuously differentiable functions \(G:[a,b]\rightarrow \mathbb {R}^{l\times m}\) and \(K:[a,b]\rightarrow \mathbb {R}^{k\times k}\) such that

$$\displaystyle \begin{aligned} \operatorname{\mathrm{im}} G(t)=\mathbb{R}^{l},\quad \ker G(t)=\ker {\varPi}_{can}(t),\quad [I_{l}\;0] K(t)D=G(t),\quad t\in [a,b], \end{aligned} $$

K(t) remains nonsingular on [a, b], and, with \(\kappa =(\max _{a\leq t\leq b}\lvert K(t)\rvert )^{-1}\),

$$\displaystyle \begin{aligned} \lvert Dz\rvert=\lvert K(t)^{-1}K(t)Dz\rvert \geq \kappa \lvert K(t)Dz \rvert \geq \kappa \lvert G(t)z\rvert,\quad z\in \mathbb{R}^{k},\; t\in[a,b]. \end{aligned} $$

Proof

We choose an admissible matrix function sequence with admissible projector functions \(Q_{0},\ldots,Q_{\mu-1}\), see [13, Section 2.2]. Denote \(P_{i}=I-Q_{i}\), \(\varPi_{i}=P_{0}\cdots P_{i}\). Then \(\varPi_{\mu-1}\) and \(D\varPi_{\mu-1}D^{+}\) are also projector functions, both with constant rank l. Since \(D\varPi_{\mu-1}D^{+}\) is continuously differentiable, we find a continuously differentiable matrix function \(\varGamma _{dyn}:[a,b]\rightarrow \mathbb {R}^{l\times k}\) so that

$$\displaystyle \begin{aligned} \operatorname{\mathrm{im}} \varGamma_{dyn}(t)=\mathbb{R}^{l},\quad \ker \varGamma_{dyn}(t)=\ker (D {\varPi}_{\mu-1}D^{+})(t),\quad t\in [a,b]. \end{aligned} $$

Furthermore, there is a pointwise reflexive generalized inverse \(\varGamma _{dyn}^{-}:[a,b]\rightarrow \mathbb {R}^{k\times l}\), also continuously differentiable, such that \(\varGamma _{dyn}\varGamma _{dyn}^{-}=I\) and \(\varGamma _{dyn}^{-}\varGamma _{dyn}=D {\varPi }_{\mu -1}D^{+}\). Similarly, we find constant-rank continuously differentiable matrix functions \(\varGamma _{nil,i}:[a,b]\rightarrow \mathbb {R}^{(m-r_{i})\times k}\) and pointwise generalized inverses \(\varGamma _{nil,i}^{-}:[a,b]\rightarrow \mathbb {R}^{k\times (m-r_{i})}\) such that

$$\displaystyle \begin{aligned} \varGamma_{nil,i}\varGamma_{nil,i}^{-}=I,\quad \varGamma_{nil,i}^{-}\varGamma_{nil,i}=D {\varPi}_{i-1}Q_{i}D^{+},\quad i=1,\ldots,\mu-1. \end{aligned} $$

The resulting k × k matrix function

$$\displaystyle \begin{aligned} K=\begin{bmatrix} \varGamma_{dyn}\\\varGamma_{nil,1}\\ \vdots\\ \varGamma_{nil,\mu-1} \end{bmatrix}= \begin{bmatrix} \varGamma_{dyn}\\\varGamma_{nil} \end{bmatrix} \end{aligned} $$

remains nonsingular on [a, b] owing to the decomposition \(I_{k}=DD^{+}=D\varPi_{0}Q_{1}D^{+}+\cdots+D\varPi_{\mu-2}Q_{\mu-1}D^{+}+D\varPi_{\mu-1}D^{+}\).

Set \(G=\varGamma_{dyn}D=[I_{l}\;0]KD\). This implies \(\ker G(t)=\ker {\varPi }_{\mu -1}(t)\). Taking into account the fact that \(\ker {\varPi }_{\mu -1}=\ker {\varPi }_{can}\), see [13, Theorem 2.8], one actually has \(\ker G(t)=\ker {\varPi }_{can}(t)\).

Finally, we derive for \(z\in \mathbb {R}^{k},\; t\in [a,b]\),

$$\displaystyle \begin{aligned} \lvert Dz\rvert^{2}&=\lvert K(t)^{-1}K(t)Dz\rvert^{2} \geq \kappa^{2} \lvert K(t)Dz \rvert^{2} = \kappa^{2} ( \lvert G(t)z\rvert^{2} + \lvert \varGamma_{nil}(t)Dz\rvert^{2})\\ &\geq \kappa^{2}\lvert G(t)z\rvert^{2}, \end{aligned} $$

which completes the proof. □

Lemma 3.2

For \(\bar t\in [a,b]\), \(0<H\leq b-\bar t\), and

$$\displaystyle \begin{aligned} C_H = \left(\max\left(\frac{2}{H},2H\right)\right)^{1/2} \end{aligned} $$

it holds that

$$\displaystyle \begin{aligned} \lvert Dx(t)\rvert\leq C_H\lVert Dx\rVert_{H^1_{sub}}\leq C_H\lVert x\rVert_{H^1_{D,sub}},\quad t\in [\bar t,\bar t+H],\quad x\in H^{1}_{D,sub}. \end{aligned} $$

Proof

By definition, \(x\in H^{1}_{D,sub}\) implies \(u=Dx\in H^{1}_{sub}\). Since \(H^{1}_{sub}\) is continuously embedded in \(C_{sub}\), it follows that

$$\displaystyle \begin{aligned} u(t)=u(s)+\int_{s}^{t}u'(\tau)\mathrm{d}\tau, \quad t,s\in[\bar{t},\bar{t}+H], \end{aligned} $$

which gives

$$\displaystyle \begin{aligned} \lvert u(t)\rvert^{2} \leq2\lvert u(s)\rvert^{2}+2\left(\int_{s}^{t}\lvert u'(\tau)\rvert\mathrm{d}\tau\right)^{2} \leq2\lvert u(s)\rvert^{2}+2H\int_{\bar{t}}^{\bar{t}+H}\lvert u'(\tau)\rvert^{2} \mathrm{d}\tau. \end{aligned} $$

Integrating this inequality with respect to s leads to

$$\displaystyle \begin{aligned} H\lvert u(t)\rvert^{2}\leq2\int_{\bar{t}}^{\bar{t}+H}\lvert u(s)\rvert^{2}\mathrm{d}s+2H^{2}\int_{\bar{t}}^{\bar{t}+H}\lvert u'(\tau)\rvert^{2}\mathrm{d}\tau. \end{aligned}$$

Finally, with C H as defined in the assertion, it holds that

$$\displaystyle \begin{aligned} \lVert u\rVert_{C_{sub}}^{2}\leq C_H^2\lVert u\rVert^{2}_{H^{1}_{sub}}\leq C_H^2\lVert x\rVert^{2}_{H^{1}_{D,sub}} \end{aligned}$$

and the assertion follows. □
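As a quick sanity check of Lemma 3.2, the bound with \(C_H\) can be verified numerically for a sample function; the following sketch uses simple Riemann sums for \(u(t)=\sin t\) on [0, 1] and is illustrative only, not part of the proof.

```python
# Numerical illustration of Lemma 3.2: max |u(t)| <= C_H * ||u||_{H^1}
# for u(t) = sin(t) on [0, H] with H = 1, using Riemann sums.
import math

H, n = 1.0, 100000
ts = [i * H / n for i in range(n + 1)]
u = [math.sin(t) for t in ts]         # u = Dx
du = [math.cos(t) for t in ts]        # u'

l2 = lambda v: math.sqrt(sum(x * x for x in v[:-1]) * (H / n))
h1_norm = math.sqrt(l2(u)**2 + l2(du)**2)
C_H = math.sqrt(max(2.0 / H, 2.0 * H))

print(max(abs(x) for x in u) <= C_H * h1_norm)   # True
```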

Lemma 3.3

Let the function G fulfilling (3.2) with the bound \(c_{G}\) be given, and denote \(c_{T}=(2 \max \{\lVert A\rVert ^{2}_{\infty },\lVert B\rVert ^{2}_{\infty }\})^{1/2}\).

  (1)

    Then, for each subinterval, the inequalities

    $$\displaystyle \begin{aligned} &\lVert T_{sub}x\rVert_{L^{2}_{sub}}\leq c_{T}\lVert x\rVert_{H^{1}_{D,sub}},\quad x\in H^{1}_{D,sub},\\ &\lvert T_{IC,sub}x\rvert\leq c_{G} C_{H}\lVert x\rVert_{H^{1}_{D,sub}},\quad \lvert T_{ICD,sub}x\rvert\leq C_{H}\lVert x\rVert_{H^{1}_{D,sub}},\quad x\in H^{1}_{D,sub},{} \end{aligned} $$
    (3.3)

    are valid.

  (2)

    If M ≥ N + 1 and A, B are of class \(C^{M}\), then there are constants \(C_{AB1}\) and \(C_{AB2}\), both independent of the size H of the subinterval, such that

    $$\displaystyle \begin{aligned} &\lVert R_{\pi,M}T_{sub}U_{\pi}x\rVert_{L^{2}_{sub}}\leq C_{AB1}\lVert x\rVert_{H^{1}_{D,sub}},\quad x\in H^{1}_{D,sub},\\ &\lVert R_{\pi,M}T_{sub}U_{\pi}x-T_{sub}U_{\pi}x\rVert_{L^{2}_{sub}}\leq C_{AB2} h^{M-N-1/2}\lVert x\rVert_{H^{1}_{D,sub}},\quad x\in H^{1}_{D,sub}. \end{aligned} $$

Proof

  (1)

    Regarding that A, B are given on [a, b], by straightforward computation we obtain

    $$\displaystyle \begin{aligned} \lVert T_{sub}x\rVert^{2}_{L^{2}_{sub}}\leq 2 \max\{\lVert A\rVert^{2}_{\infty,sub},\lVert B\rVert^{2}_{\infty,sub}\}\lVert x\rVert^{2}_{H^{1}_{D,sub}}\leq c_{T}^{2}\lVert x\rVert^{2}_{H^{1}_{D,sub}}. \end{aligned} $$

    Applying Lemma 3.2 we find the inequalities (3.3).

  (2)

    These inequalities can be verified analogously to the first two items of [8, Proposition 4.2]. □

We are now prepared to estimate the values \(\alpha_{\pi,sub}\), \(\beta_{\pi,sub}\), \(\tilde {\beta }_{\pi ,sub}\), \(\hat {\beta }_{\pi ,sub}\), and \(\tilde {\hat {\beta }}_{\pi ,sub}\).

Theorem 3.4

Let the operator T described in Sect. 2 be fine with tractability index μ ≥ 2 and characteristic values \(0<r_{0}\leq\cdots\leq r_{\mu-1}<r_{\mu}=m\), \(l=m-\sum _{i=0}^{\mu -1}(m-r_{i})\). Let the coefficients A, B, as well as the solution \(x_{*}\) of the IVP (1.6), (1.7), be sufficiently smooth. Let the function G with (3.2) be given and \([\bar t,\bar t+H]\subset [a,b]\).

Then there are positive constants \(C_{\alpha}\), \(C_{\beta }\), \(\tilde {C}_{\beta }\), \(\hat {C}_{\beta }\), \(\tilde {\hat {C}}_{\beta }\) such that

$$\displaystyle \begin{aligned} \alpha_{\pi,sub}\leq C_{\alpha}H^{1/2}h^{N},\\ \beta_{\pi,sub}\leq C_{\beta}h^{N},\; \tilde{\beta}_{\pi,sub}\leq \tilde{C}_{\beta}h^{N},\\ \hat{\beta}_{\pi,sub}\leq \hat{C}_{\beta}h^{N},\;\tilde{\hat{\beta}}_{\pi,sub}\leq \tilde{\hat{C}}_{\beta}h^{N}, \end{aligned} $$

uniformly for all individual subintervals \([\bar t,\bar t+H]\) and all sufficiently fine grids π.

Proof

First we choose N nodes \(0<\tau_{*,1}<\cdots<\tau_{*,N}<1\) and construct the interpolating function \(p_{*,int}\in X_{\pi}\) so that

$$\displaystyle \begin{aligned} Dp_{*,int}(\bar t)=Dx_{*}(\bar t),\quad p_{*,int}(t_{j}+\tau_{*,i}h)=x_{*}(t_{j}+\tau_{*,i}h), \;i=1,\ldots,N,\; j=1,\ldots,n, \end{aligned} $$

yielding

$$\displaystyle \begin{aligned} \lVert x_{*}-p_{*,int}\rVert_{\infty,sub}+\lVert (Dx_{*})'-(Dp_{*,int})'\rVert_{\infty,sub}\leq C_{*}h^{N}, \end{aligned} $$

with a uniform constant \(C_{*}\) for all subintervals; \(C_{*}\) is determined by \(x_{*}\) and its derivatives on [a, b]. Now we also have

$$\displaystyle \begin{aligned} \lVert x_{*}-p_{*,int}\rVert_{H^{1}_{D,sub}}\leq C_{*}\sqrt{2H}h^{N}, \end{aligned} $$

and therefore, with \(C_{\alpha }=C_{*}\sqrt {2}\),

$$\displaystyle \begin{aligned} \alpha_{\pi,sub}=\lVert (I-U_{\pi,sub})x_{*}\rVert_{H^{1}_{D,sub}}=\lVert (I-U_{\pi,sub})(x_{*}-p_{*,int})\rVert_{H^{1}_{D,sub}}\leq C_{\alpha}\sqrt{H}h^{N}. \end{aligned} $$

Set \(C_{D}= \sqrt {2}\max \{1,b-a\}C_{\alpha }\) such that \(C_{H}\sqrt {H}C_{\alpha }\leq C_{D}\) for all H. Using Lemma 3.2 we derive

$$\displaystyle \begin{aligned} \lvert D((I-U_{\pi,sub})x_{*})(\bar t)\rvert\leq C_{H}\alpha_{\pi,sub}\leq C_{D} h^{N}. \end{aligned} $$

We derive further

$$\displaystyle \begin{aligned} \beta^{2}_{\pi,sub}&=\lVert\mathcal T_{sub}(I-U_{\pi,sub})x_{*}\rVert^{2}_{Y_{sub}}\\ &=\lVert T_{sub}(I-U_{\pi,sub})x_{*}\rVert^{2}_{L^{2}_{sub}}+\lvert G(\bar t)D^{+}D((I-U_{\pi,sub})x_{*})(\bar t)\rvert^{2}\\ &\leq \lVert T_{sub}\rVert^{2} \alpha_{\pi,sub}^{2}+ c_{G}^{2}C_{D}^{2}h^{2N}\leq (c_{T}^{2}C_{\alpha}^{2}(b-a)+c_{G}^{2}C_{D}^{2})h^{2N}= C_{\beta}^{2}h^{2N}, \end{aligned} $$
$$\displaystyle \begin{aligned} \hat{\beta}^{2}_{\pi,sub}&=\lVert\hat{\mathcal T}_{sub}(I-U_{\pi,sub})x_{*}\rVert^{2}_{\hat Y_{sub}}\\ &=\lVert T_{sub}(I-U_{\pi,sub})x_{*}\rVert^{2}_{L^{2}_{sub}}+\lvert D((I-U_{\pi,sub})x_{*})(\bar t)\rvert^{2}\\ &\leq \lVert T_{sub}\rVert^{2} \alpha_{\pi,sub}^{2}+ C_{D}^{2}h^{2N}\leq (c_{T}^{2}C_{\alpha}^{2}(b-a)+C_{D}^{2})h^{2N}= \hat{C}_{\beta}^{2}h^{2N}. \end{aligned} $$

Following [8, Section 2.3], we investigate also \(w_{*}=T_{sub}(x_{*}-p_{*,int})\in C_{\pi }([\bar t,\bar t+H],\mathbb {R}^{m})\) and use the estimate (cf. [8, Section 2.3])

$$\displaystyle \begin{aligned} H^{-1/2}\lVert R_{\pi,M}w_{*}\rVert_{L^{2}_{sub}}\leq \lVert R_{\pi,M}w_{*}\rVert_{\infty,sub}\leq C_{L}\lVert w_{*}\rVert_{\infty,sub}\leq \max\{\lVert A\rVert_{\infty},\lVert B\rVert_{\infty}\} C_{L}h^{N}. \end{aligned}$$

Here, \(C_{L}\) denotes a constant that depends only on the choice of the interpolation nodes \(\tau_{*,1},\ldots,\tau_{*,N}\). Then we derive

$$\displaystyle \begin{aligned} \lVert R_{\pi,M}T_{sub}(I-U_{\pi,sub})x_{*}\rVert_{L^{2}_{sub}}\leq C_{RT}H^{1/2}h^{N}, \end{aligned} $$

where \(C_{RT}=C_L \max \{\lVert A\rVert _{\infty },\lVert B\rVert _{\infty }\} + \sqrt {2} C_{*}C_{AB1}\). Therefore,

$$\displaystyle \begin{aligned} \tilde{\beta}^{2}_{\pi,sub}&=\lVert\mathcal R_{\pi,M}\mathcal T_{sub}(I-U_{\pi,sub})x_{*}\rVert^{2}_{\hat{Y}_{sub}}\\ &=\lVert R_{\pi,M}T_{sub}(I-U_{\pi,sub})x_{*}\rVert^{2}_{L^{2}_{sub}}+\lvert G(\bar t)D^{+}D((I-U_{\pi,sub})x_{*})(\bar t)\rvert^{2}\\ &\leq C_{RT}^{2}H h^{2N}+c_{G}^{2}C_{D}^{2}h^{2N}\leq C_{RT}^{2}(b-a) h^{2N}+c_{G}^{2}C_{D}^{2}h^{2N}= \tilde{C}_{\beta}^{2} h^{2N},\end{aligned} $$
$$\displaystyle \begin{aligned} \tilde{\hat{\beta}}^{2}_{\pi,sub}&=\lVert\mathcal R_{\pi,M}\hat{\mathcal T}_{sub}(I-U_{\pi,sub})x_{*}\rVert^{2}_{\hat{Y}_{sub}}\\ &=\lVert R_{\pi,M}T_{sub}(I-U_{\pi,sub})x_{*}\rVert^{2}_{L^{2}_{sub}}+\lvert D((I-U_{\pi,sub})x_{*})(\bar t)\rvert^{2}\\ &\leq C_{RT}^{2}H h^{2N}+C_{D}^{2}h^{2N}\leq C_{RT}^{2}(b-a) h^{2N}+C_{D}^{2}h^{2N}= \tilde{\hat{C}}_{\beta}^{2} h^{2N}. \end{aligned} $$

3.2 Overdetermined Collocation on \([\bar {t},\bar {t}+H]\subset [a,b]\), with Accurately Stated Initial Condition at \(\bar {t}\)

We ask whether there are positive constants \(c_{\gamma}\) and \(\tilde {c}_{\gamma }\) serving as lower bounds for the individual constants that characterize the instability thresholds associated with an arbitrary subinterval \([\bar t,\bar t+H]\subset [a,b]\).

Theorem 3.5

Let the operator T described in Sect. 2 be fine with tractability index μ ≥ 2 and characteristic values \(0<r_{0}\leq\cdots\leq r_{\mu-1}<r_{\mu}=m\), \(l=m-\sum _{i=0}^{\mu -1}(m-r_{i})\). Let the coefficients A, B, the right-hand side \(q\in \operatorname {\mathrm {im}} T\), as well as the solution \(x_{*}\) of the IVP (1.6), (1.7) be sufficiently smooth. Let \(q_{sub}\) denote the restriction of q to the subinterval \([\bar t,\bar t+H]\subset [a,b]\).

Let a function G with (3.2) be given.

  (1)

    Then, for each \(r\in \mathbb {R}^{l}\), there is exactly one solution \(x_{[r]}\) of the equation \(\mathcal T_{sub}x=(q_{sub}, r)\), and

    $$\displaystyle \begin{aligned} \lVert x_{[r]}-x_{*}\rVert_{H^{1}_{D,sub}}\leq c_{sub}\, \lvert r- G(\bar t)x_{*}(\bar t)\rvert. \end{aligned} $$

    \(x_{[r]}\) coincides with \(x_{*}\) on the subinterval if and only if \(r=G(\bar t)x_{*}(\bar t)\). Furthermore, there is a bound \(C_{p}\) such that \(c_{sub}\leq C_{p}\) holds for all subintervals.

  (2)

    If \(M\geq N+1\), there is a constant \(C_{\gamma}>0\) such that

    $$\displaystyle \begin{aligned} \gamma_{\pi,sub}\geq C_{\gamma} h^{\mu-1},\quad \lVert (\mathcal T_{sub}U_{\pi,sub})^{+}\rVert_{Y_{sub}\rightarrow H^{1}_{D,sub}}=\frac{1}{\gamma_{\pi,sub}}\leq \frac{1}{C_{\gamma}h^{\mu-1}} \end{aligned} $$

    uniformly for all subintervals and sufficiently small stepsizes h > 0.

  (3)

    If \(M\geq N+\mu\), there is a positive constant \(\tilde C_{\gamma }=\frac {C_{\gamma }}{2}\) such that

    $$\displaystyle \begin{aligned} \lVert (\mathcal R_{\pi,M}\mathcal T_{sub}U_{\pi,sub})^{+}\rVert_{Y_{sub}\rightarrow H^{1}_{D,sub}}=\frac{1}{\tilde{\gamma}_{\pi,sub}}\leq \frac{1}{\tilde C_{\gamma}h^{\mu-1}} \end{aligned} $$

    uniformly for all subintervals and sufficiently small stepsizes h > 0.

Proof

  (1)

    This is a consequence of Proposition A.1 in the Appendix.

  (2)

    The constant \(C_{\gamma}\) can be obtained by a careful inspection and an adequate modification of the proof of [9, Theorem 4.1], on the basis of Proposition A.1 below instead of [9, Proposition 4.3]. Similarly to [9, Lemma 4.4], we obtain the inequality

    $$\displaystyle \begin{aligned} \lVert q\rVert^{2}_{Z_{sub}}\leq \lVert q\rVert^{2}_{\pi}:=\lVert q\rVert^{2}_{L^{2}_{sub}}+\sum_{i=1}^{\mu-1}\sum_{s=0}^{\mu-i} d_{i,s}\lVert (D\mathcal L_{\mu-i}q)^{(s)}\rVert^{2}_{L^{2}_{sub}}, \; q\in Z_{\pi}, \end{aligned} $$

    with \(Z_{\pi }=\{q\in L^{2}_{sub}\vert D\mathcal L_{\mu -i}q\in C_{\pi }^{\mu -i}([\bar t,\bar t+H],\mathbb {R}^{k}),i=1,\ldots ,\mu -1\}\subset T_{sub}X_{\pi }\), with coefficients d i,s being independent of the subinterval.

  (3)

    This statement follows by a slight modification of [8, Proposition 4.2].

Theorem 3.5 allows us to apply uniform error estimates on all subintervals. Note that the constants involved, \(C_{\alpha}\) etc., may depend on N and M. To provide the function G with (3.2), the canonical nullspace \(N_{can}=\ker {\varPi }_{can}\) must be available, though not necessarily the canonical projector itself. Owing to [13, Theorem 2.8], it holds that \(N_{can}=\ker {\varPi }_{\mu -1}\) for any admissible matrix function sequence, which makes \(N_{can}\) more easily accessible. Nevertheless, though the function G is very useful in theory, it is hardly available in practice.

For problems with dynamical degree of freedom l = 0, the canonical projector \(\varPi_{can}\) vanishes identically, that is, the initial condition is absent, and \(T_{sub}\) itself is injective. This happens, for example, for Jordan systems; see also Sect. 5.3. In those cases, with no initial conditions and no transfer, the window-wise forward stepping works well.

Let \( \tilde {x}_{\pi ,old}\) be an already computed approximation of the solution \(x_{*}\) on a previous subinterval of length \(H_{old}\) immediately preceding the current one \([\bar t,\bar t+H]\). Motivated by Theorems 3.4 and 3.5, assume

$$\displaystyle \begin{aligned} \lVert \tilde x_{\pi,old}-x_{*}\rVert_{H^{1}_{D,sub,old}}\leq C h_{old}^{N-\mu+1} \end{aligned}$$

for a sufficiently small stepsize \(h_{old}\). Applying Lemma 3.2, we obtain

$$\displaystyle \begin{aligned} \lvert D \tilde x_{\pi,old}(\bar t)-Dx_{*}(\bar t)\rvert\leq C_{H_{old}}C h_{old}^{N-\mu+1}. \end{aligned}$$

Next we apply overdetermined least-squares collocation on the current subinterval \([\bar t,\bar t+H]\). We use the transfer condition \(r=G(\bar t)\tilde x_{\pi ,old}(\bar t)\) to state the initial condition for the current subinterval. The overdetermined collocation generates the new segment \(\tilde x_{\pi }\),

$$\displaystyle \begin{aligned} \tilde x_{\pi}=\operatorname{\mathrm{argmin}}\{\lVert R_{\pi,M} (T_{sub}x-q)\rVert^{2}_{L^{2}_{sub}}+\lvert G(\bar t)x(\bar t)-G(\bar t)\tilde x_{\pi,old}(\bar t)\rvert^{2}\vert x\in X_{\pi}\}, \end{aligned} $$

which is actually an approximation of \(x_{[r]}\), a neighbor of \(x_{*}\), such that

$$\displaystyle \begin{aligned} \lVert \tilde x_{\pi}-x_{[r]}\rVert_{H^{1}_{D,sub}}\leq \tilde c h^{N-\mu+1}. \end{aligned} $$

Owing to Theorem 3.5 we have also

$$\displaystyle \begin{aligned} \lVert x_{[r]}-x_{*}\rVert_{H^{1}_{D,sub}}&\leq c_{sub} \lvert r- G(\bar t)x_{*}(\bar t)\rvert = c_{sub}\lvert G(\bar t)\tilde x_{\pi,old}(\bar t) - G(\bar t)x_{*}(\bar t)\rvert\\ &\leq c_{sub} c_{G}C_{H_{old}} C h_{old}^{N-\mu+1}. \end{aligned} $$

If h = h old, it follows that

$$\displaystyle \begin{aligned} \lVert \tilde x_{\pi}-x_{*}\rVert_{H^{1}_{D,sub}}\leq C_{sub} h^{N-\mu+1} \end{aligned} $$

with \(C_{sub}=c_{sub} c_{G}C_{H_{old}} C+\tilde {c}\). This is the background ensuring that the windowing procedure (1.2), (1.3), (1.4) works.

4 Overdetermined Collocation on a Subinterval \([\bar {t},\bar {t}+H]\subset [a,b]\), with Initial Conditions Related to \(Dx(\bar {t})\)

Here we proceed as in the previous section, but now we use the initial condition \(Dx(\bar t)= \hat r\) instead of \(G(\bar t)x(\bar t)=r\), to avoid the use of the function G. Obviously, this formulation is easier to use in practice since D is given. However, in contrast to the situation in Theorem 3.5, the equation \(\hat {\mathcal T}_{sub}x=(q_{sub},\hat r)\) is no longer solvable for arbitrary \(\hat r\in \mathbb {R}^{k}\). For solvability, \(\hat r\) must be consistent.

Theorem 4.1

Let the operator T described in Sect. 2.2 be fine with tractability index μ ≥ 2 and characteristic values \(0<r_{0}\leq\cdots\leq r_{\mu-1}<r_{\mu}=m\), \(l=m-\sum _{i=0}^{\mu -1}(m-r_{i})\). Let the coefficients A, B, the right-hand side \(q\in \operatorname {\mathrm {im}} T\), as well as the solution \(x_{*}\) of the IVP (1.6), (1.7) be sufficiently smooth. Then the following holds:

  (1)

    \(\hat {\mathcal T}_{sub}\) is injective.

  (2)

    If \(M\geq N+1\), there is a constant \(\hat C_{\gamma }>0\), uniform for all subintervals and sufficiently small stepsizes h > 0, such that

    $$\displaystyle \begin{aligned} \hat{\gamma}_{\pi,sub}\geq \hat C_{\gamma} h^{\mu-1}. \end{aligned} $$

    and hence

    $$\displaystyle \begin{aligned} \lVert (\hat{\mathcal T}_{sub}U_{\pi,sub})^{+}\rVert_{\hat Y_{sub}\rightarrow H^{1}_{D,sub}}=\frac{1}{\hat{\gamma}_{\pi,sub}}\leq \frac{1}{\hat C_{\gamma}h^{\mu-1}}. \end{aligned} $$
  (3)

    If \(M\geq N+\mu\), there is a constant \(\tilde {\hat C}_{\gamma }>0\), uniform for all subintervals and sufficiently small stepsizes h > 0, such that

    $$\displaystyle \begin{aligned} \lVert (\hat{\mathcal R}_{\pi,M}\hat{\mathcal T}_{sub}U_{\pi,sub})^{+}\rVert_{\hat Y_{sub}\rightarrow H^{1}_{D,sub}}=\frac{1}{\tilde{\hat{\gamma}}_{\pi,sub}}\leq \frac{1}{\tilde{\hat C}_{\gamma}h^{\mu-1}}. \end{aligned} $$

Proof

The assertions are straightforward consequences of Theorem 3.5 and Lemma 3.1.

\(\hat {\mathcal T}x=0\) means Tx = 0 and \(Dx(\bar t)=0\), thus also \(G(\bar t)x(\bar t)=[I_{l}\,0]K(\bar t)Dx(\bar t)=0\), finally \(\mathcal Tx=0\). Since \(\mathcal T\) is injective it follows that x = 0. For p ∈ X π,

$$\displaystyle \begin{aligned} \lVert \hat{\mathcal T}_{sub} p\rVert_{\hat Y_{sub}}^{2}&=\lVert T_{sub} p\rVert_{ L^{2}_{sub}}^{2}+\lvert Dp(\bar t)\rvert^{2}\geq \lVert T_{sub} p\rVert_{ L^{2}_{sub}}^{2}+\kappa^{2}\lvert G(\bar t)p(\bar t)\rvert^{2}\\ &\geq \min\{1,\kappa^2\}\lVert \mathcal T_{sub} p\rVert_{Y_{sub}}^{2}\geq \min\{1,\kappa^2\} \left(C_{\gamma} h^{\mu-1} \lVert p\rVert_{H^{1}_{D,sub}}\right)^2, \end{aligned} $$

and

$$\displaystyle \begin{aligned} \lVert \hat{\mathcal R}_{\pi,M}\hat{\mathcal T}_{sub} p\rVert_{\hat Y_{sub}}^{2}&=\lVert R_{\pi,M}T_{sub} p\rVert_{L^{2}_{sub}}^{2}+\lvert Dp(\bar t)\rvert^{2}\geq \lVert R_{\pi,M}T_{sub} p\rVert_{ L^{2}_{sub}}^{2}+\kappa^{2}\lvert G(\bar t)p(\bar t)\rvert^{2}\\ &\geq \min\{1,\kappa^2\}\lVert\mathcal R_{\pi,M} \mathcal T_{sub} p\rVert_{Y_{sub}}^{2}\geq \min\{1,\kappa^2\} \left(\tilde{C}_{\gamma} h^{\mu-1} \lVert p\rVert_{H^{1}_{D,sub}}\right)^2. \end{aligned} $$

In contrast to the situation in Sect. 3.2, the equation \(\hat {\mathcal T}_{sub}x=(q_{sub},\hat r)\) is no longer solvable for all \(\hat r\in \mathbb {R}^{k}\). Recall that \(q_{sub}\) is the restriction of \(q=Tx_{*}\) so that \(q_{sub}\in \operatorname {\mathrm {im}} T_{sub}\). Denote

$$\displaystyle \begin{aligned} \hat y=\begin{bmatrix} q_{sub}\\Dx_{*}(\bar t) \end{bmatrix},\; \hat y^{[\delta]}=\begin{bmatrix} q_{sub}\\\hat{r} \end{bmatrix},\;\delta=\lVert \hat y-\hat y^{[\delta]}\rVert=\lvert Dx_{*}(\bar t)-\hat{r}\rvert, \end{aligned} $$

and, following [11], we take \(\hat y^{[\delta ]}\) as noisy data and compute

$$\displaystyle \begin{aligned} \tilde{\hat x}_{\pi}^{[\delta]}&=\operatorname{\mathrm{argmin}}\{\lVert \hat{\mathcal{R}}_{\pi,M} (\hat{\mathcal T}_{sub}x-\hat y^{[\delta]})\rVert^{2}_{L^{2}_{sub}\times\mathbb{R}^{k}}\vert x\in X_\pi\}\\ &=\operatorname{\mathrm{argmin}}\{\lVert R_{\pi,M} (T_{sub}x-q_{sub})\rVert^{2}_{L^{2}_{sub}}+\lvert Dx(\bar t)-\hat{r}\rvert^{2}\vert x\in X_{\pi}\} \end{aligned} $$

and similarly,

$$\displaystyle \begin{aligned} \hat{x}_{\pi}^{[\delta]}&=\operatorname{\mathrm{argmin}}\{\lVert \hat{\mathcal T}_{sub}x-\hat y^{[\delta]}\rVert^{2}_{L^{2}_{sub}\times\mathbb{R}^{k}}\vert x\in X_\pi\}\\ &=\operatorname{\mathrm{argmin}}\{\lVert T_{sub}x-q_{sub}\rVert^{2}_{L^{2}_{sub}}+\lvert Dx(\bar t)-\hat{r}\rvert^{2}\vert x\in X_{\pi}\}. \end{aligned} $$

Applying the error representation [11, Equation (2.9)], we arrive at

$$\displaystyle \begin{aligned} \tilde{\hat x}_{\pi}^{[\delta]}-x_{*}=(\hat{\mathcal R}_{\pi,M}\hat{\mathcal T}U_{\pi})^{+}\hat{\mathcal R}_{\pi,M}(\hat y^{[\delta]}-\hat y)+(\hat{\mathcal R}_{\pi,M}\hat{\mathcal T}U_{\pi})^{+}\hat{\mathcal R}_{\pi,M}\hat{\mathcal T}_{sub}(I-U_{\pi})x_{*}-(I-U_{\pi})x_{*} \end{aligned} $$

and, correspondingly,

$$\displaystyle \begin{aligned} \hat{x}_{\pi}^{[\delta]}-x_{*}=(\hat{\mathcal T}U_{\pi})^{+}(\hat y^{[\delta]}-\hat y)+(\hat{\mathcal T}U_{\pi})^{+}\hat{\mathcal T}_{sub}(I-U_{\pi})x_{*}-(I-U_{\pi})x_{*}. \end{aligned} $$

Thus,

$$\displaystyle \begin{aligned} \lVert \hat x_{\pi}^{[\delta]}-x_{*}\rVert_{H^{1}_{D,sub}}&\leq \frac{1}{\hat C_{\gamma}h^{\mu-1}}\{\lVert\hat y^{[\delta]}-\hat y\rVert+\hat{\beta}_{\pi,sub}\}+\alpha_{\pi,sub}=\frac{1}{\hat C_{\gamma}h^{\mu-1}}\{\delta+\hat{\beta}_{\pi,sub}\}+\alpha_{\pi,sub}, \\ \lVert \tilde{\hat{x}}_{\pi}^{[\delta]}-x_{*}\rVert_{H^{1}_{D,sub}}&\leq \frac{1}{\tilde{\hat C}_{\gamma}h^{\mu-1}}\{\lVert\hat y^{[\delta]}-\hat y\rVert+\tilde{\hat\beta}_{\pi,sub}\}+\alpha_{\pi,sub}=\frac{1}{\tilde{\hat{C}}_{\gamma}h^{\mu-1}}\{\delta+\tilde{\hat\beta}_{\pi,sub}\}+\alpha_{\pi,sub}. \end{aligned} $$

All these estimates can be combined to arrive at a recursive error estimate for the application of (1.3), (1.5). Unfortunately, this estimate is not sufficient for proving convergence of the windowing technique, in contrast to the approach using accurately stated initial conditions of Sect. 3.2!

5 Time-Stepping with b − a = LH and H = nh

We now set H = (b − a)∕L, \(w_{\lambda }=a+\lambda H\), λ = 0, …, L, and h = H∕n, and study the resulting uniform time-stepping procedures.

5.1 Time-Stepping with Accurate Transfer Conditions

In the time-stepping approach corresponding to (1.3)–(1.4), the transfer conditions are stated so that G is chosen according to (3.2). Let \(\tilde {x}_{\pi}^{[\lambda ]}\) be the approximation provided by the overdetermined least-squares collocation on the subinterval [a + (λ − 1)H, a + λH] subject to the initial and transfer conditions

$$\displaystyle \begin{aligned} G_a\tilde{x}_\pi^{[1]}(a) &= r,\\ G(w_{\lambda-1}) \tilde{x}_{\pi}^{[\lambda]}(a+(\lambda-1)H) &= G(w_{\lambda-1})\tilde{x}_{\pi}^{[\lambda-1]}(a+(\lambda-1)H), \quad \lambda>1. \end{aligned} $$

Then we obtain from Theorem 3.5 and Lemma 3.2, for λ = 1,

$$\displaystyle \begin{aligned} \lVert \tilde{x}_\pi^{[1]}-x_\ast\rVert_{H^1_{D,sub}} \leq \tilde{C} h^{N-\mu+1} =: d_1. \end{aligned}$$

For λ > 1, let \(r=G(w_{\lambda-1})\tilde {x}_\pi ^{[\lambda -1]}(a+(\lambda -1)H)\). Then it holds that

$$\displaystyle \begin{aligned} \lVert \tilde{x}_\pi^{[\lambda]}-x_\ast\rVert_{H^1_{D,sub}} &\leq \lVert \tilde{x}_\pi^{[\lambda]} - x_{[r]}\rVert_{H^1_{D,sub}} + \lVert x_{[r]}-x_\ast\rVert_{H^1_{D,sub}} \\ &\leq \tilde{C} h^{N-\mu+1}+C_p\lvert r-G(w_{\lambda-1}) x_\ast(a+(\lambda-1)H)\rvert \\ &\leq \tilde{C}h^{N-\mu+1}+C_p c_G C_H \lVert \tilde{x}_\pi^{[\lambda-1]}-x_\ast \rVert_{H^1 _{D,sub}} \\ &\leq \bar{C}(h^{N-\mu+1}+ C_H\lVert \tilde{x}_\pi^{[\lambda-1]}-x_\ast \rVert_{H^1 _{D,sub}}) =: d_\lambda \end{aligned} $$

where \(\bar {C}=\max \{C_pc_G,\tilde {C}\}\). Hence,

$$\displaystyle \begin{aligned} d_1 \leq \bar{C}h^{N-\mu+1},\quad d_\lambda \leq \bar{C}(C_Hd_{\lambda-1}+h^{N-\mu+1}). \end{aligned}$$

A solution of this recursion provides us with

$$\displaystyle \begin{aligned} d_\lambda \leq \sum_{\iota=0}^{\lambda-1} \bar{C}(\bar{C}C_H)^\iota h^{N-\mu+1} = \bar{C}\frac{1-(\bar{C}C_H)^\lambda}{1-\bar{C}C_H}h^{N-\mu+1}. \end{aligned}$$

A similar estimation can be derived for the least-squares approximations using the operator \((\mathcal T_{sub}U_{\pi ,sub})^+\).
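The geometric accumulation of the transfer error can be checked numerically. The following sketch is our own illustration (not part of the original derivation): it iterates the recursion with equality, \(d_1=\bar C h^{N-\mu+1}\), \(d_\lambda=\bar C(C_H d_{\lambda-1}+h^{N-\mu+1})\), and compares the result with the closed-form geometric sum; the constants are arbitrary test values.

```python
import math

def d_recursive(lam, C_bar, C_H, rhs):
    # Iterate d_1 = C_bar*rhs, d_l = C_bar*(C_H*d_{l-1} + rhs) with equality;
    # rhs plays the role of h**(N - mu + 1).
    d = C_bar * rhs
    for _ in range(lam - 1):
        d = C_bar * (C_H * d + rhs)
    return d

def d_closed_form(lam, C_bar, C_H, rhs):
    # d_lam = C_bar * (1 - (C_bar*C_H)**lam) / (1 - C_bar*C_H) * rhs
    q = C_bar * C_H
    return C_bar * (1.0 - q**lam) / (1.0 - q) * rhs

C_bar, C_H, rhs = 1.5, 0.4, 1e-3   # arbitrary illustrative values
for lam in range(1, 8):
    assert math.isclose(d_recursive(lam, C_bar, C_H, rhs),
                        d_closed_form(lam, C_bar, C_H, rhs), rel_tol=1e-12)
```

For \(\bar C C_H<1\) the closed form stays bounded as λ grows, which is the mechanism behind the uniform bound above.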

Example 5.1

The index-2 DAE with k = 2, m = 3, l = 1,

$$\displaystyle \begin{aligned} \begin{bmatrix} 1 & 0\\ 0 & 1\\ 0 & 0 \end{bmatrix}( \begin{bmatrix} 1 & 0 & 0\\ 0 & 1 & 0 \end{bmatrix}x)'(t)+ \begin{bmatrix} \theta & -1 & -1\\ \eta t(1-\eta t)-\eta & \theta & -\eta t\\ 1-\eta t & 1 & 0 \end{bmatrix}x(t)=q(t), \end{aligned} $$
(5.1)

is taken from [10, Example 1.1]. One has \(N_{can}(t)=\{z\in \mathbb {R}^{3}\vert \; \eta tz_{1}-z_{2}=0\}\) so that

$$\displaystyle \begin{aligned} G(t)=\left[\begin{array}{ccc} \eta t & -1 & 0 \end{array}\right] \end{aligned}$$

will do. We consider the DAE on the interval (0,1). The right-hand side q is chosen in such a way that

$$\displaystyle \begin{aligned} x_1(t) &= e^{-t}\sin t, \\ x_2(t) &= e^{-2t}\sin t, \\ x_3(t) &= e^{-t}\cos t \end{aligned} $$

is a solution. This solution becomes unique if an appropriate initial condition is added. With G a = G(0), the initial condition becomes

$$\displaystyle \begin{aligned} G_ax(0) = G_a \left[\begin{array}{ccc} 0 & 0 & 1 \end{array}\right]^T = 0. \end{aligned}$$

In the following experiments, η = −25 and θ = −1 have been chosen. This allows for a comparison with the experiments in [10].
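The consistency of the stated initial condition can be checked directly from the data above; the following plain-Python sketch (our own illustration, with ad-hoc function names) evaluates \(G_a x_*(0)\) for the manufactured solution:

```python
import math

eta = -25.0  # parameter choice used in the experiments

def G(t):
    # G(t) = [eta*t, -1, 0], annihilating N_can(t) = {z : eta*t*z1 - z2 = 0}
    return [eta * t, -1.0, 0.0]

def x_exact(t):
    # the manufactured solution of (5.1)
    return [math.exp(-t) * math.sin(t),
            math.exp(-2.0 * t) * math.sin(t),
            math.exp(-t) * math.cos(t)]

x0 = x_exact(0.0)                                   # x(0) = (0, 0, 1)^T
Ga_x0 = sum(g * xi for g, xi in zip(G(0.0), x0))    # G_a x(0)
assert abs(Ga_x0) < 1e-14                           # consistent: G_a x(0) = 0
```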

This problem is solved on equidistant grids using, for each polynomial degree N, M = N + 1 Gaussian collocation points scaled to (0, 1). The tables show the errors of the approximate solutions in \(H^1_D(0,1)\). The columns labeled order contain an estimate \(k_{\mathrm{est}}\) of the order

$$\displaystyle \begin{aligned} k_{\mathrm{est}} = \log(\lVert x_\pi - x_\ast\rVert_{H_D^1(0,1)}/ \lVert x_{\pi'} - x_\ast\rVert_{H_D^1(0,1)})/\log 2. \end{aligned}$$

Here, π′ is obtained from π by stepsize halving. It should be noted that the norm is taken for the complete interval (0, 1) even in the windowing approach. In order to enable a comparison, we provide the results for solving the problem without windowing in Table 1. This corresponds to \(\bar {t}=0\) and H = 1.
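The order estimate reduces to a one-line computation; a minimal sketch (our own illustration, with made-up error values of a method of exact order k):

```python
import math

def order_estimate(err_h, err_h_half):
    """k_est = log(err(h)/err(h/2)) / log 2, for stepsize halving."""
    return math.log(err_h / err_h_half) / math.log(2.0)

# For a method of exact order k, err(h) = C*h**k, so halving h divides the
# error by 2**k and the estimate recovers k:
C, k, h = 3.0, 4, 0.1
assert math.isclose(order_estimate(C * h**k, C * (h / 2.0)**k), k)
```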

Table 1 Errors and estimation of the convergence order for (5.1) and \(\bar {t}=0\), H = 1 using M = N + 1

In the next experiment, the time-stepping approach using accurately stated transfer conditions has been tested with n = 1. The results are shown in Table 2. □

Table 2 Errors and estimation of the convergence order for (5.1) and n = 1 using H = 1∕L

A more complex example is presented in Sect. 6.

5.2 Time-Stepping with Transfer Conditions Based on D

In our experiments, the situation is in fact much better than indicated by the estimates in Sect. 4. The latter are not sufficient to show convergence of the present time-stepping approach when the transfer conditions are based on D; see (1.5).

Example 5.2 (Continuation of Example 5.1)

We apply the time-stepping procedure under the same conditions as in Example 5.1, however, this time the transfer conditions are chosen as

$$\displaystyle \begin{aligned} \tilde x_{i}^{[\lambda]}(\bar{t}) = \tilde x_{i}^{[\lambda-1]}(\bar{t}), \quad i=1,2.\end{aligned} $$

The results are presented in Table 3. The errors are slightly worse than those in Table 2, where accurately stated transfer conditions are used. However, the observed orders of convergence are similar, at least for N ≥ 2 = μ. The values for n = 2 and n = 3 have also been checked. The orders are identical to those of Table 3, even though the errors are smaller due to the smaller stepsize h. For N = 1, divergent approximations are obtained. However, this case is beyond the scope of our theoretical results even for accurate transfer conditions. □

Table 3 Errors and estimation of the convergence order for (5.1) and n = 1 using H = 1∕L

5.3 Studying the Damping of Inconsistent Transition Values

The results of the previous sections show that the windowing method converges if the transfer conditions refer only to the dynamic components. The latter are, in general, not easily available unless a detailed analysis of the DAE has been carried out. However, so far we do not know any conditions for convergence if the practically accessible values of the differentiated components Dx are used in the transfer conditions. Example 5.2 indicates, however, that the use of (1.5) may be possible. In order to gain some more insight into what can be expected in the setting of Sect. 5.2, we consider a simple special case in this section.

The model problem in question here is a simple system featuring only one Jordan block,

$$\displaystyle \begin{aligned} J(Dx)'+x & =0,\\ Dx(\bar{t}) & = r. \end{aligned} $$

Here, \(J\in \mathbb {R}^{\mu \times (\mu -1)}\), \(D\in \mathbb {R}^{(\mu -1)\times \mu }\) where

$$\displaystyle \begin{aligned} J=\left[\begin{array}{cccc} 0\\ 1 & 0\\ & \ddots & \ddots\\ & & 1 & \end{array}\right],\quad D=\left[\begin{array}{cccc} 1 & 0 & &\\ & \ddots & \ddots & \\ & & 1 & 0 \end{array}\right]. \end{aligned}$$

This system has index μ and no dynamic components, l = 0. The system is solvable only for r = 0, leading to the unique solution \(x_{*}(t)\equiv 0\). When trying to solve the system using the proposed windowing technique, the only information transferred from the subinterval \([\bar {t},\bar {t}+H]\) to the next one is the value of the approximate solution \(x_\pi\) at the end of the interval, \(Dx_\pi (\bar {t}+H)\). The latter is an approximation to the exact value \(Dx_\ast (\bar {t}+H)\equiv 0\) and cannot be guaranteed to be consistent with the DAE. Therefore, we ask how \(Dx_\pi (\bar {t}+H)\) depends on r.

Let

$$\displaystyle \begin{aligned} x_{[r],\pi} &= \operatorname{\mathrm{argmin}}\{\lVert\hat{\mathcal{T}}_{sub}x\rVert^2_{L^2_{sub}\times\mathbb{R}^k}\vert x\in X_\pi\} \\ &= \operatorname{\mathrm{argmin}}\{\lVert T_{sub}x \rVert^2_{L^2_{sub}} + \lvert Dx(\bar{t})-r \rvert^2_{\mathbb{R}^k}\vert x\in X_\pi\} \end{aligned} $$

where \(Tx = J(Dx)' + x\). Obviously, \(Dx_{[r],\pi }(\bar {t}+H)\) depends linearly on r. Hence there exists a matrix S = S(N, H, n) such that \(Dx_{[r],\pi }(\bar {t}+H) = Sr\), which we call the transfer matrix. For convergence of the method, it is necessary that the spectral radius ρ(S) of the transfer matrix is bounded by 1.

The analytical computation of S is rather tedious. After some lengthy calculations we found that, for μ = 2, it holds, with \(\eta =(N+1)^{-1}\),

$$\displaystyle \begin{aligned} \rho(S(N,H,n)) & =\eta^{n}\left|\frac{2}{\left(-1+\sqrt{1-\eta^{2}}\right)^{n}+\left(-1-\sqrt{1-\eta^{2}}\right)^{n}}\right| \\ &\approx \eta^n2^{1-n}. \end{aligned} $$

In particular, ρ(S) is independent of H, and n can be chosen arbitrarily. Moreover, the damping of the inconsistent value r improves as n grows. This result can be compared with the experiments in Example 5.2 (an index-2 problem), where we could not identify any influence of an inaccuracy due to inconsistent transfer conditions.
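The closed-form expression for μ = 2 can be evaluated directly. The following small script (our own check, not part of the original derivation) confirms that for n = 1 the spectral radius equals η exactly, and that the damping improves monotonically with n:

```python
import math

def rho_S(N, n):
    """Spectral radius of the transfer matrix for mu = 2, eta = 1/(N+1)."""
    eta = 1.0 / (N + 1)
    s = math.sqrt(1.0 - eta**2)
    return eta**n * abs(2.0 / ((-1.0 + s)**n + (-1.0 - s)**n))

# n = 1: the denominator collapses to -2, so rho = eta exactly
assert math.isclose(rho_S(3, 1), 0.25)
# n = 2: (-1+s)^2 + (-1-s)^2 = 2(2 - eta^2), so rho = eta^2/(2 - eta^2)
assert math.isclose(rho_S(3, 2), 0.25**2 / (2.0 - 0.25**2))
# damping improves with n, consistent with the approximation eta**n * 2**(1-n)
assert rho_S(3, 3) < rho_S(3, 2) < rho_S(3, 1) < 1.0
```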

For larger values of μ, we determined ρ(S) numerically. Results are shown in Tables 4, 5 and 6. We observe that, for index μ > 2, n must be chosen larger than 1 in order to ensure ρ(S) < 1. Moreover, ρ(S) depends on H only marginally in the investigated cases.

Table 4 Spectral radius of the transfer matrix S(N, H, n) for n = 1 and H = 0.1 (left panel) and H = 0.01 (right panel). The column headings show the index μ
Table 5 Spectral radius of the transfer matrix S(N, H, n) for n = 2 and H = 0.1 (left panel) and H = 0.01 (right panel). The column headings show the index μ
Table 6 Spectral radius of the transfer matrix S(N, H, n) for n = 3 and H = 0.1 (left panel) and H = 0.01 (right panel). The column headings show the index μ

Details of the derivations are collected in the appendix.

6 A More Complex Example

In order to show the merits of the windowing technique, we will continue to use the example considered in [9]. This example is the linearized version of a test example from [5]. We consider an initial value problem for the DAE

$$\displaystyle \begin{aligned} A(Dx)'(t)+B(t)x(t)=y(t),\quad t\in [0,5]\end{aligned} $$
(6.1)

with

$$\displaystyle \begin{aligned} A=\begin{bmatrix} 1&0&0&0&0&0\\ 0&1&0&0&0&0\\ 0&0&1&0&0&0\\ 0&0&0&1&0&0\\ 0&0&0&0&1&0\\ 0&0&0&0&0&1\\ 0&0&0&0&0&0 \end{bmatrix}, D=\begin{bmatrix} 1&0&0&0&0&0&0\\ 0&1&0&0&0&0&0\\ 0&0&1&0&0&0&0\\ 0&0&0&1&0&0&0\\ 0&0&0&0&1&0&0\\ 0&0&0&0&0&1&0 \end{bmatrix}, \end{aligned} $$

the smooth matrix coefficient

$$\displaystyle \begin{aligned} B(t)= \begin{bmatrix} 0&0&0&-1&0&0&0\\ 0&0&0&0&-1&0&0\\ 0&0&0&0&0&-1&0\\ 0&0&\sin t&0&1&-\cos t&-2\rho \cos^{2}t\\ 0&0&-\cos t&-1&0&-\sin t&-2\rho \sin t\cos t\\ 0&0&1&0&0&0&2\rho \sin t\\ 2\rho \cos^{2}t&2\rho \sin t \cos t&-2\rho\sin t&0&0&0&0 \end{bmatrix},\quad \rho=5. \end{aligned} $$

This DAE is obtained if the test example from [5] is linearized in the solution \(x_{*}(t)\) considered there. It has tractability index μ = 3 and dynamical degree of freedom l = 4. In order to use the windowing technique with accurately stated initial conditions, we will need a function \(G:[0,5]\rightarrow \mathbb {R}^{4\times 7}\) fulfilling the assumptions of Theorem 3.5. The nullspace of the projector Π 2 has the representation

$$\displaystyle \begin{aligned} \ker \varPi_2 = \ker \left[\begin{array}{ccc} I-\varOmega & 0 & 0 \\ \varOmega'\varOmega & I-\varOmega & 0 \\ 0 & 0 & 0 \end{array}\right], \quad \varOmega = b(t)b(t)^T,\quad b(t) = \left[\begin{array}{c} -\cos^2 t \\ -\cos t\sin t \\ \sin t \end{array}\right]. \end{aligned}$$

Based on this representation, we can use

$$\displaystyle \begin{aligned} G(t) = \left[\begin{array}{ccccccc} \sin t & -\cos t & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & \cos t & 0 & 0 & 0 & 0 \\ -\cos^3 t & -\sin t \cos^2 t & \sin t \cos t & \sin t & -\cos t & 0 & 0 \\ -(\sin t \cos t)^2 & -\sin^3 t \cos t & \sin^3 t & 0 & 1 & \cos t & 0 \end{array}\right]. \end{aligned} $$
(6.2)
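In the representation of ker Π₂ above, b(t) is a unit vector (cos⁴t + sin²t cos²t + sin²t = 1), so Ω(t) = b(t)b(t)ᵀ is an orthogonal projector. A quick plain-Python sanity check (our own illustration; t = 0.7 is an arbitrary test point):

```python
import math

def b(t):
    return [-math.cos(t)**2, -math.cos(t) * math.sin(t), math.sin(t)]

def Omega(t):
    # Omega = b b^T, a rank-one orthogonal projector since |b(t)| = 1
    bt = b(t)
    return [[bi * bj for bj in bt] for bi in bt]

t = 0.7
bt = b(t)
assert math.isclose(sum(bi * bi for bi in bt), 1.0)      # |b(t)| = 1
Om = Omega(t)
Om2 = [[sum(Om[i][k] * Om[k][j] for k in range(3)) for j in range(3)]
       for i in range(3)]
for i in range(3):
    for j in range(3):
        assert math.isclose(Om2[i][j], Om[i][j], abs_tol=1e-14)  # Omega^2 = Omega
```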

In the following numerical experiments we choose the exact solution

$$\displaystyle \begin{aligned}\begin{array}{r*{20}l} x_1 &= \sin t, & x_4 &= \cos t, \\ x_2 &= \cos t, & x_5 &= -\sin t, \\ x_3 &= 2\cos^2 t, & x_6 &= -2\sin 2t, \\ x_7 &= -\rho^{-1}\sin t, & & \end{array}\end{aligned} $$

which is also the one used in [9]. Setting \(G_a = G(0)\), this provides us with the initial condition

$$\displaystyle \begin{aligned} G_a x(0) = \left[\begin{array}{c} -1 \\ 3 \\ 0 \\ 0 \end{array}\right]. \end{aligned}$$
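The stated initial vector can be verified directly from (6.2) and the exact solution; a short plain-Python check (our own illustration, with ρ = 5 as above):

```python
import math

rho = 5.0

def G(t):
    # the function G from (6.2)
    s, c = math.sin(t), math.cos(t)
    return [[s, -c, 0.0, 0.0, 0.0, 0.0, 0.0],
            [0.0, 1.0, c, 0.0, 0.0, 0.0, 0.0],
            [-c**3, -s * c**2, s * c, s, -c, 0.0, 0.0],
            [-(s * c)**2, -s**3 * c, s**3, 0.0, 1.0, c, 0.0]]

def x_exact(t):
    s, c = math.sin(t), math.cos(t)
    return [s, c, 2.0 * c**2, c, -s, -2.0 * math.sin(2.0 * t), -s / rho]

x0 = x_exact(0.0)
r = [sum(g * xi for g, xi in zip(row, x0)) for row in G(0.0)]
# G_a x(0) = (-1, 3, 0, 0)^T, as stated
for ri, expected in zip(r, [-1.0, 3.0, 0.0, 0.0]):
    assert math.isclose(ri, expected, abs_tol=1e-14)
```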

The problem is solved on equidistant grids using, for each polynomial degree N, M = N + 3 Gaussian collocation points scaled to (0, 1). This number of collocation points has been chosen such that the assumptions of Theorem 3.5(3) are fulfilled. The tables show the errors of the approximate solutions in \(H^1_D(0,5)\). As in the previous examples, the columns labeled order contain an estimate \(k_{\mathrm{est}}\) of the order

$$\displaystyle \begin{aligned} k_{\mathrm{est}} = \log(\lVert x_\pi - x_\ast\rVert_{H_D^1(0,5)}/ \lVert x_{\pi'} - x_\ast\rVert_{H_D^1(0,5)})/\log 2. \end{aligned}$$

Here, π′ is obtained from π by stepsize halving.

In order to enable a comparison, we provide the results for solving the problem without windowing in Table 7. This corresponds to \(\bar {t}=0\) and H = 5. Note that the results are almost identical to those obtained in [9] using a slightly different formulation of the initial condition and a different number of collocation points.

Table 7 Errors and estimation of the convergence order for (6.1) and \(\bar {t}=0\), H = 5 using M = N + 3

In Tables 8, 9 and 10, the results using the windowing technique with transfer conditions (1.5) for different numbers of subdivisions n of the individual windows \([\bar {t},\bar {t}+H]\) are shown. Since the transfer condition is based on all of the differentiated components Dx, the transferred values are expected to be inconsistent away from the initial point t = 0. For n = 1 and N ≤ 3, the method delivers exponentially divergent approximations.

Table 8 Errors and estimation of the convergence order for (6.1) and n = 1, H = 5∕L using M = N + 3
Table 9 Errors and estimation of the convergence order for (6.1) and n = 2, H = 5∕L using M = N + 3
Table 10 Errors and estimation of the convergence order for (6.1) and n = 3, H = 5∕L using M = N + 3

Finally, we consider the case of accurately stated initial conditions used as transfer conditions; they correspond to choosing \(G(\bar {t})\) according to (6.2). The results are collected in Table 11 and can be compared with the behavior of the global method shown in Table 7. The results are rather close to each other.

Table 11 Errors and estimation of the convergence order for (6.1) and accurately posed transfer conditions with n = 1, H = 5∕L and M = N + 3

7 Conclusions

We continued the investigation of overdetermined least-squares collocation using piecewise polynomial ansatz functions. This method is known to efficiently produce accurate numerical approximations of solutions of two-point boundary value problems for higher-index DAEs, including IVPs as a special case. Since a further increase in computational efficiency is expected from a version customized to IVPs, we considered time-stepping techniques for IVPs in this paper. It turned out that the success of such techniques depends strongly on the transfer conditions used. If the intrinsic structure is available, meaning in particular that the dynamic solution components are known, the time-stepping method has convergence properties similar to those of the boundary value approach. However, if only the information about the differentiated components of the DAE is used, our estimates so far do not secure convergence of the time-stepping approach. Investigations of a model problem indicate that even in this case convergence can be obtained, provided that the method parameters are chosen appropriately.

The overdetermined least-squares collocation method shows impressive convergence in our experiments. On the one hand, the accuracy is remarkable; on the other hand, the computational efficiency is comparable to that of widely used collocation methods for ordinary differential equations. Opposed to that, there are severe difficulties in justifying these methods theoretically. The underlying reason is the ill-posedness of higher-index DAEs. To the best of our knowledge, available convergence results are rather sparse, and important questions of practical relevance for the construction of efficient algorithms are completely open, e.g., a-posteriori error estimates, the choice of grids, polynomial orders, collocation points, etc. However, the results so far are encouraging.