1 Introduction

Simulation is a well-established and indispensable tool in scientific research as well as in industrial development processes. Efficient tools are needed that are capable of simulating complex processes in, e.g., mechanical engineering, process engineering, or electrical engineering. Many such processes (where appropriate after a spatial discretization of a partial differential equation) can be modeled as differential-algebraic equations (DAEs), which are implicit differential equations that typically consist of ordinary differential equations as well as algebraic equations. Often, DAEs are formulated automatically by software packages such as MODELICA or SIMPACK. In its general form, the initial value problem for a DAE on the compact interval I = [a, b] reads as

$$\displaystyle{ F(t,z(t),z^{{\prime}}(t)) = 0,\qquad z(a) = z_{ a}, }$$
(1.1)

where \(F: I \times \mathbb{R}^{n} \times \mathbb{R}^{n}\longrightarrow \mathbb{R}^{n}\) is a given function and \(z_{a} \in \mathbb{R}^{n}\) is an appropriate initial value at t = a. The task is to find a solution \(z: I\longrightarrow \mathbb{R}^{n}\) of (1.1). Throughout it is assumed that F is sufficiently smooth, i.e., it possesses continuous partial derivatives up to any required order.

Please note that (1.1) is not just an ordinary differential equation in implicit notation, since we permit the Jacobian of F with respect to z′, i.e., \(F_{z^{{\prime}}}^{{\prime}}\), to be singular along a solution. In such a situation, (1.1) cannot be solved directly for z′. Particular examples with singular Jacobian are semi-explicit DAEs of type

$$\displaystyle{ F(t,z,z^{{\prime}}) = \left (\begin{array}{c} M(t,x)x^{{\prime}}- f(t,x,y) \\ g(t,x,y) \end{array} \right ),\qquad z:= (x,y)^{\top }, }$$
(1.2)

with a non-singular matrix M, the so-called differential state vector x, and the algebraic state vector y. Such systems occur, e.g., in process engineering and mechanical multi-body systems. More generally, quasi-linear DAEs of type

$$\displaystyle{F(t,z,z^{{\prime}}) = Q(t,z)z^{{\prime}}- f(t,z)}$$

with a possibly singular matrix function Q frequently occur in electrical engineering.

The potential singularity of the Jacobian \(F_{z^{{\prime}}}^{{\prime}}\) has implications with regard to theoretical properties (existence and uniqueness of solutions, smoothness properties, structural properties, …) and with regard to the design of numerical methods (consistent initial values, order of convergence, stability, …). A survey on the solution theory for linear DAEs can be found in the recent survey paper [141]. A comprehensive structural analysis of linear and nonlinear DAEs can be found in the monographs [90] and [92]. While explicit ordinary differential equations (ODEs) can be viewed as well-behaved systems, DAEs are inherently ill-conditioned and the degree of ill-conditioning increases with the so-called (perturbation) index, compare [75, Definition 1.1]. As such, DAEs require suitable techniques for their numerical treatment.

To this end, the paper aims to provide an overview on the numerical treatment of the initial value problem. The intention is to cover the main ideas without too many technical details, which, if required, can be found in full detail in a large number of publications and excellent textbooks. Naturally not all developments can be covered, so we focus on a choice of methods and concepts that are relevant in industrial simulation environments for coupled systems of potentially large size. These concepts enhance basic integration schemes by adding features like sensitivity analysis (needed, e.g., in optimization procedures), contact dynamics, real-time schemes, or co-simulation techniques. Still, the core challenges with DAEs, that is, ill-conditioning, consistent initial values, and index reduction, will be covered as well.

The outline of this paper is as follows. Section 2 introduces index concepts and summarizes stabilization techniques for certain classes of DAEs. Section 3 deals with the computation of the so-called consistent initial values for DAEs and their influence on parameters. Note in this respect that DAEs, in contrast to ODEs, do not permit solutions for arbitrary initial values and thus techniques are required to find suitable initial values. The basics of the most commonly used numerical discretization schemes are discussed in Sect. 4, amongst them BDF methods, Runge–Kutta methods, and ROW methods. Co-simulation techniques for the interaction of different subsystems are presented in Sect. 5. Herein, the stability and convergence of the overall scheme are of particular importance. Section 6 discusses approaches for the simulation of time-critical systems in real time. The influence of parameters on the (discrete and continuous) solution of an initial value problem is studied in Sect. 7. Hybrid systems and mechanical contact problems are discussed in Sect. 8.

Notation

We use the following notation. The derivative w.r.t. time of a function z(t) is denoted by z′(t). The partial derivative of a function f with respect to a variable x is denoted by f′ x  = ∂ f∕∂ x. As an abbreviation of a function of type f(t, x(t)) we use the notation f[t].

2 Error Influence and Stabilization Techniques

DAEs are frequently characterized and classified according to their index. Various index definitions exist, for instance the differentiation index [62], the structural index [45], the strangeness index [90], the tractability index [92], and the perturbation index [75]. These index definitions are not equivalent for general DAEs (1.1), but they coincide for certain subclasses thereof, for instance semi-explicit DAEs in Hessenberg form. For our purposes we will focus on the differentiation index and the perturbation index only.

The differentiation index is one of the earliest index definitions for (1.1) and is based on a structural investigation of the DAE. It aims to identify the so-called underlying ordinary differential equation. To this end let the functions \(F^{(\,j)}: [t_{0},t_{f}] \times \mathbb{R}^{(\,j+2)n}\longrightarrow \mathbb{R}^{n}\) for the variables \(z,z^{{\prime}},\ldots,z^{(\,j+1)} \in \mathbb{R}^{n}\) for j = 0, 1, 2, … be defined by the recursion

$$\displaystyle\begin{array}{rcl} F^{(0)}(t,z,z^{{\prime}})&:=& F(t,z,z^{{\prime}}),{}\end{array}$$
(2.1)
$$\displaystyle\begin{array}{rcl} F^{(\,j)}(t,z,z^{{\prime}},\ldots,z^{(\,j+1)})&:=& \frac{\partial F^{(\,j-1)}} {\partial t} (t,z,z^{{\prime}},\ldots,z^{(\,j)}){}\end{array}$$
(2.2)
$$\displaystyle\begin{array}{rcl} & & +\sum _{\ell=0}^{j}\frac{\partial F^{(\,j-1)}} {\partial z^{(\ell)}} (t,z,z^{{\prime}},\ldots,z^{(\,j)})z^{(\ell+1)},\quad j = 1,2,\ldots.{}\end{array}$$
(2.3)

Herein, F is supposed to be sufficiently smooth such that the functions F ( j) are well defined.

The differentiation index is defined as follows:

Definition 2.1 (Differentiation Index, Compare [62])

The DAE (1.1) has differentiation index \(d \in \mathbb{N}_{0}\), if d is the smallest number in \(\mathbb{N}_{0}\) such that the so-called derivative array

$$\displaystyle{ \begin{array}{rcl} F^{(\,j)}(t,z,z^{{\prime}},\ldots,z^{(\,j+1)})& =&0,\qquad j = 0,1,\ldots,d, \end{array} }$$
(2.4)

allows one to deduce a relation of the type z′ = f(t, z) by algebraic manipulations.

If such a relation exists, then the corresponding ordinary differential equation (ODE) z′(t) = f(t, z(t)) is called the underlying ODE of the DAE (1.1).

The definition leaves some space for interpretation as it is not entirely clear what is meant by “algebraic manipulations.” However, for semi-explicit DAEs it provides a guideline to determine the differentiation index. Note that the special structure of semi-explicit DAEs is often exploited in the design of numerical schemes and stabilization techniques.

Definition 2.2 (Semi-Explicit DAE)

A DAE of type

$$\displaystyle\begin{array}{rcl} x^{{\prime}}(t)& =& f(t,x(t),y(t)),{}\end{array}$$
(2.5)
$$\displaystyle\begin{array}{rcl} 0& =& g(t,x(t),y(t)),{}\end{array}$$
(2.6)

is called semi-explicit DAE. Herein, x(⋅ ) is referred to as differential variable and y(⋅ ) is called algebraic variable. Correspondingly, (2.5) is called differential equation and (2.6) algebraic equation.

For semi-explicit DAEs the common approach is to differentiate the algebraic equation w.r.t. time and to substitute the occurring derivatives of x by the right-hand side of the differential equation. This procedure is repeated until the resulting equation can be solved for y′.

Example 2.1 (Semi-Explicit DAE with Differentiation Index One)

Consider (2.5)–(2.6). Differentiation of the algebraic equation w.r.t. time yields

$$\displaystyle\begin{array}{rcl} 0& =& g_{t}^{{\prime}}[t] + g_{ x}^{{\prime}}[t]x^{{\prime}}(t) + g_{ y}^{{\prime}}[t]y^{{\prime}}(t) {}\\ & =& g_{t}^{{\prime}}[t] + g_{ x}^{{\prime}}[t]\,f[t] + g_{ y}^{{\prime}}[t]y^{{\prime}}(t). {}\\ \end{array}$$

Herein, we used the abbreviation f[t] for f(t, x(t), y(t)) and likewise for the partial derivatives of g.

Now, if the Jacobian matrix g y [t] is non-singular with a bounded inverse along a solution of the DAE, then the above equation can be solved for y′ by the implicit function theorem and together with the differential equation (2.5) we obtain the underlying ODE

$$\displaystyle\begin{array}{rcl} x^{{\prime}}(t)& =& f(t,x(t),y(t)), {}\\ y^{{\prime}}(t)& =& -g_{ y}^{{\prime}}[t]^{-1}\left (g_{ t}^{{\prime}}[t] + g_{ x}^{{\prime}}[t]\,f[t]\right ), {}\\ \end{array}$$

and the differentiation index is d = 1.
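
The construction in Example 2.1 can be mimicked numerically. The following listing is a minimal sketch (Python with NumPy assumed; the function name underlying_ode_rhs and the small test problem are purely illustrative and not taken from the text): for user-supplied functions f and g of an index-one semi-explicit DAE it evaluates the right-hand side of the underlying ODE, approximating the required partial derivatives of g by forward differences.

    import numpy as np

    def underlying_ode_rhs(t, x, y, f, g, eps=1e-7):
        # Right-hand side (x', y') of the underlying ODE of the index-one
        # semi-explicit DAE  x' = f(t,x,y), 0 = g(t,x,y)  (cf. Example 2.1);
        # the partial derivatives of g are approximated by forward differences.
        x, y = np.asarray(x, float), np.asarray(y, float)
        fx = np.asarray(f(t, x, y), float)
        g0 = np.asarray(g(t, x, y), float)
        gt = (np.asarray(g(t + eps, x, y), float) - g0) / eps
        gx = np.column_stack([(np.asarray(g(t, x + eps*e, y), float) - g0) / eps
                              for e in np.eye(x.size)])
        gy = np.column_stack([(np.asarray(g(t, x, y + eps*e), float) - g0) / eps
                              for e in np.eye(y.size)])
        yp = -np.linalg.solve(gy, gt + gx @ fx)      # y' = -g_y^{-1} (g_t + g_x f)
        return fx, yp

    # small test problem:  x' = y,  0 = x + y - sin(t)  (g_y = 1 is non-singular)
    f = lambda t, x, y: y
    g = lambda t, x, y: x + y - np.sin(t)
    print(underlying_ode_rhs(0.0, [0.0], [0.0], f, g))   # x' = 0, y' = 1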

In the above example, the situation becomes more involved, if the Jacobian matrix g y [t] is singular. If it actually vanishes, then one can proceed as in the following example.

Example 2.2 (Semi-Explicit DAE with Differentiation Index Two)

Consider (2.5)–(2.6). Suppose g does not depend on y and thus g y [t] ≡ 0. By differentiation of the algebraic equation we obtain

$$\displaystyle{0 = g_{t}^{{\prime}}[t] + g_{ x}^{{\prime}}[t]x^{{\prime}}(t) = g_{ t}^{{\prime}}[t] + g_{ x}^{{\prime}}[t]\,f[t] =: g^{(1)}(t,x(t),y(t))}$$

A further differentiation w.r.t. time yields

$$\displaystyle{0 = (g^{(1)})_{ t}^{{\prime}}[t] + (g^{(1)})_{ x}^{{\prime}}[t]\,f[t] + (g^{(1)})_{ y}^{{\prime}}[t]y^{{\prime}}(t)}$$

with (g (1)) y [t] = g x [t] f y [t]. Now, if the matrix g x [t] f y [t] is non-singular with a bounded inverse along a solution of the DAE, then the above equation can be solved for y′ by the implicit function theorem and together with the differential equation (2.5) we obtain the underlying ODE

$$\displaystyle\begin{array}{rcl} x^{{\prime}}(t)& =& f(t,x(t),y(t)), {}\\ y^{{\prime}}(t)& =& -(g_{ x}^{{\prime}}[t]\,f_{ y}^{{\prime}}[t])^{-1}\left ((g^{(1)})_{ t}^{{\prime}}[t] + (g^{(1)})_{ x}^{{\prime}}[t]\,f[t]\right ), {}\\ \end{array}$$

and the differentiation index is d = 2.

The procedure of the preceding examples works for semi-explicit Hessenberg DAEs, which are defined as follows:

Definition 2.3 (Hessenberg DAE)

  1. (a)

    For a given k ≥ 2 the DAE

    $$\displaystyle{ \begin{array}{lcrllclll} x_{1}^{{\prime}}(t) & =& f_{1}(t,&y(t),&x_{1}(t),&x_{2}(t),&\ldots,&x_{k-2}(t),&x_{k-1}(t)), \\ x_{2}^{{\prime}}(t) & =& f_{2}(t,& &x_{1}(t),&x_{2}(t),&\ldots,&x_{k-2}(t),&x_{k-1}(t)),\\ & \vdots & & & & \ddots \\ x_{k-1}^{{\prime}}(t)& =&f_{k-1}(t,& & & & &x_{k-2}(t),&x_{k-1}(t)), \\ 0 & =& g(t,& & & & & &x_{k-1}(t))\end{array} }$$
    (2.7)

    is called Hessenberg DAE of order k, if the matrix

    $$\displaystyle{ R(t):= g_{x_{k-1}}^{{\prime}}[t] \cdot f_{ k-1,x_{k-2}}^{{\prime}}[t]\cdots f_{ 2,x_{1}}^{{\prime}}[t] \cdot f_{ 1,y}^{{\prime}}[t] }$$
    (2.8)

    is non-singular for all t ∈ [t 0, t f ] with a uniformly bounded inverse ∥ R −1(t) ∥ ≤ C in [t 0, t f ], where C is a constant independent of t.

  2. (b)

    The DAE

    $$\displaystyle{ \begin{array}{lcr} x^{{\prime}}(t)& =&f(t,x(t),y(t)), \\ 0 & =& g(t,x(t),y(t))\end{array} }$$
    (2.9)

    is called Hessenberg DAE of order 1, if the matrix g y [t] is non-singular with ∥ g y [t]−1 ∥ ≤ C for all t ∈ [t 0, t f ] and some constant C independent of t.

Herein, y is called the algebraic variable, and x = (x 1, …, x k−1) in (a) and x in (b), respectively, is called the differential variable.

By repeated differentiation of the algebraic constraint 0 = g(t, x k−1(t)) w.r.t. time and simultaneous substitution of the derivatives of the differential variable by the corresponding differential equations, it is straightforward to show that the differentiation index of a Hessenberg DAE of order k is equal to k, provided the functions g and f j , j = 1, …, k − 1, are sufficiently smooth. In order to formalize this procedure, define

$$\displaystyle{ g^{(0)}(t,x_{ k-1}(t)):= g(t,x_{k-1}(t)). }$$
(2.10)

Differentiation of g (0) with respect to time and substitution of

$$\displaystyle{x_{k-1}^{{\prime}}(t) = f_{ k-1}(t,x_{k-2}(t),x_{k-1}(t))}$$

leads to the equation

$$\displaystyle\begin{array}{rcl} 0& =& g_{t}^{{\prime}}(t,x_{ k-1}(t)) + g_{x_{k-1}}^{{\prime}}(t,x_{ k-1}(t)) \cdot f_{k-1}(t,x_{k-2}(t),x_{k-1}(t)) {}\\ & =:& g^{(1)}(t,x_{ k-2}(t),x_{k-1}(t)), {}\\ \end{array}$$

which is satisfied implicitly as well. Recursive application of this differentiation and substitution process leads to the algebraic equations

$$\displaystyle{ 0 = g^{(\,j)}(t,x_{ k-1-j}(t),\ldots,x_{k-1}(t)),\qquad j = 1,2,\ldots,k - 2, }$$
(2.11)

and

$$\displaystyle{ 0 = g^{(k-1)}(t,y(t),x_{ 1}(t),\ldots,x_{k-1}(t)). }$$
(2.12)

Since Eqs. (2.11)–(2.12) do not occur explicitly in the original system (2.7), these equations are called hidden constraints of the Hessenberg DAE. Note that the matrix R in (2.8) is given by ∂ g (k−1)∕∂ y.
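
The hidden constraints can also be generated symbolically. The following listing is a small sketch using the SymPy package (not referenced in the text; the data are purely illustrative): it computes the first hidden constraint g (1) = g′ t  + g′ x  f for a hypothetical semi-explicit DAE with g independent of y, i.e., the situation of Example 2.2, and inspects the matrix (g (1))′ y  = g′ x  f′ y that decides whether y′ can be isolated.

    import sympy as sp

    t, x1, x2, y = sp.symbols('t x1 x2 y')
    x = sp.Matrix([x1, x2])

    # hypothetical semi-explicit DAE: x' = f(t,x,y), 0 = g(t,x) (g independent of y)
    f = sp.Matrix([x2 + y, -x1])
    g = sp.Matrix([x1 + x2 - sp.sin(t)])

    # first hidden constraint g^(1) = g'_t + g'_x f, cf. (2.11) and Example 2.2
    g1 = sp.diff(g, t) + g.jacobian(x) * f
    # the Jacobian (g^(1))'_y = g'_x f'_y decides whether y' can be isolated
    R = g1.jacobian(sp.Matrix([y]))
    print(sp.simplify(g1[0]))   # -x1 + x2 + y - cos(t)
    print(R)                    # Matrix([[1]]): non-singular, differentiation index 2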

A practically important subclass of Hessenberg DAEs are mechanical multibody systems in descriptor form given by

$$\displaystyle\begin{array}{rcl} q^{{\prime}}(t)& =& v(t), \\ M(t,q(t))v^{{\prime}}(t)& =& f(t,q(t),v(t)) - g_{ q}^{{\prime}}(t,q(t))^{\top }\lambda (t), \\ 0& =& g(t,q(t)), {}\end{array}$$
(2.13)

where \(q(\cdot ) \in \mathbb{R}^{n}\) denotes the vector of generalized positions, \(v(\cdot ) \in \mathbb{R}^{n}\) the vector of generalized velocities, and \(\lambda (\cdot ) \in \mathbb{R}^{m}\) are Lagrange multipliers. The mass matrix M is supposed to be symmetric and positive definite with a bounded inverse M −1 and thus, the second equation in (2.13) can be multiplied by M(t, q(t))−1. The vector f denotes the generalized forces and torques. The term g q (t, q)⊤ λ can be interpreted as a force that keeps the system on the algebraic constraint g(t, q) = 0.

The constraint g(t, q(t)) = 0 is called constraint on position level. Differentiation with respect to time of this algebraic constraint yields the constraint on velocity level

$$\displaystyle{g_{t}^{{\prime}}(t,q(t)) + g_{ q}^{{\prime}}(t,q(t)) \cdot v(t) = 0}$$

and the constraint on acceleration level

$$\displaystyle{g_{tt}^{{\prime\prime}}(t,q(t))+g_{ tq}^{{\prime\prime}}(t,q(t))\cdot v(t)+g_{ q}^{{\prime}}(t,q(t))\cdot v^{{\prime}}(t)+g_{ qq}^{{\prime\prime}}(t,q(t))(v(t),v(t)) = 0.}$$

Replacing v′ by

$$\displaystyle{v^{{\prime}}(t) = M(t,q(t))^{-1}\left (f(q(t),v(t)) - g_{ q}^{{\prime}}(t,q(t))^{\top }\lambda (t)\right )}$$

yields

$$\displaystyle\begin{array}{rcl} 0& =& g_{tt}^{{\prime\prime}}(t,q(t)) + g_{ tq}^{{\prime\prime}}(t,q(t)) \cdot v(t) {}\\ & & +g_{q}^{{\prime}}(t,q(t)) \cdot M(t,q(t))^{-1}\left (f(q(t),v(t)) - g_{ q}^{{\prime}}(t,q(t))^{\top }\lambda (t)\right ) {}\\ & & +g_{qq}^{{\prime\prime}}(t,q(t))(v(t),v(t)). {}\\ \end{array}$$

If g q (t, q) has full rank, then the matrix g q (t, q)M(t, q)−1 g q (t, q)⊤ is non-singular and the latter equation can be solved for the algebraic variable λ. Thus, the differentiation index is three.

Remark 2.1

Note that semi-explicit DAEs are more general than Hessenberg DAEs since no regularity assumptions are imposed in Definition 2.2. In fact, without additional regularity assumptions, the class of semi-explicit DAEs is essentially as large as the class of general DAEs (1.1), since the settings z′(t) = y(t) and 0 = F(t, z(t), y(t)) transform the DAE (1.1) into a semi-explicit DAE (some care has to be taken with regard to the smoothness of solutions, though).

2.1 Error Influence and Perturbation Index

The differentiation index is based on a structural analysis of the DAE, but it does not indicate how perturbations influence the solution. In contrast, the perturbation index addresses the influence of perturbations on the solution and thus it is concerned with the stability of DAEs. Note that perturbations frequently occur, for instance they are introduced by numerical discretization schemes.

Definition 2.4 (Perturbation Index, See [75])

The DAE (1.1) has perturbation index \(p \in \mathbb{N}\) along a solution z on [t 0, t f ], if \(p \in \mathbb{N}\) is the smallest number such that for all functions \(\tilde{z}\) satisfying the perturbed DAE

$$\displaystyle{ F(t,\tilde{z}(t),\tilde{z}^{{\prime}}(t)) =\delta (t), }$$
(2.14)

there exists a constant S depending on F and t f  − t 0 with

$$\displaystyle{ \|z(t) -\tilde{ z}(t)\| \leq S\left (\|z(t_{0}) -\tilde{ z}(t_{0})\| +\max _{t_{0}\leq \tau \leq t}\|\delta (\tau )\| +\ldots +\max _{t_{0}\leq \tau \leq t}\|\delta ^{(\,p-1)}(\tau )\|\right ) }$$
(2.15)

for all t ∈ [t 0, t f ], whenever the expression on the right is less than or equal to a given bound.

The perturbation index is p = 0, if the estimate

$$\displaystyle{ \|z(t) -\tilde{ z}(t)\| \leq S\left (\|z(t_{0}) -\tilde{ z}(t_{0})\| +\max _{t_{0}\leq \tau \leq t_{f}}\left \|\int _{t_{0}}^{\tau }\delta (s)ds\right \|\right ) }$$
(2.16)

holds. The DAE is said to be of higher index, if p ≥ 2.

According to the definition of the perturbation index, higher index DAEs are ill-conditioned in the sense that small perturbations with high frequencies, i.e., with large derivatives, can have a considerable influence on the solution of a higher index DAE, as can be seen from (2.15). For some time it was believed that the difference between perturbation index and differentiation index is at most one, until it was shown in [34] that the difference between perturbation index and differentiation index can be arbitrarily large. However, for the subclass of Hessenberg DAEs as defined in Definition 2.3 both index concepts (and actually all other relevant index concepts) coincide.

The definition of the perturbation index shows that the degree of ill-conditioning increases with the perturbation index. Hence, in order to make a higher index DAE accessible to numerical methods it is advisable and common practice to reduce the perturbation index of a DAE. A straightforward idea is to replace the original DAE by a mathematically equivalent DAE with lower perturbation index. The index reduction process itself is nontrivial for general DAEs, since one has to ensure that it is actually the perturbation index, which is being reduced (and not some other index like the differentiation index).

For Hessenberg DAEs, however, the index reduction process is straightforward as perturbation index and differentiation index coincide. Consider a Hessenberg DAE of order k as in (2.7). Then, by replacing the algebraic constraint 0 = g(t, x k−1(t)) by one of the hidden constraints g ( j), j ∈ { 1, …, k − 1}, defined in (2.11) or (2.12) we obtain the Hessenberg DAE

$$\displaystyle{ \begin{array}{lcrllcccl} x_{1}^{{\prime}}(t) & =& f_{1}(t,&y(t),&x_{1}(t),&x_{2}(t),& \ldots, &x_{k-2}(t),&x_{k-1}(t)), \\ x_{2}^{{\prime}}(t) & =& f_{2}(t,& &x_{1}(t),&x_{2}(t),& \ldots, &x_{k-2}(t),&x_{k-1}(t)),\\ &\vdots&&&&\ddots \\ x_{k-1}^{{\prime}}(t)& =&f_{k-1}(t,& & & & &x_{k-2}(t),&x_{k-1}(t)), \\ 0 & =& g^{(\,j)}(t,& & & &x_{k-1-j}(t),& \ldots, &x_{k-1}(t)),\end{array} }$$
(2.17)

where we use the setting x 0: = y for notational convenience. The Hessenberg DAE in (2.17) has perturbation index k − j. Hence, this simple index reduction strategy actually reduces the perturbation index, and it leads to a mathematically equivalent DAE with the same solution as the original DAE, if the initial values x(t 0) and y(t 0) satisfy the algebraic constraints g (ℓ)(t 0, x k−1−ℓ (t 0), …, x k−1(t 0)) = 0 for all ℓ = 0, …, k − 1.

On the other hand, the index reduced DAE (2.17) in general permits additional solutions for those initial values x(t 0) and y(t 0), which merely satisfy the algebraic constraints g (ℓ)(t 0, x k−1−ℓ (t 0), …, x k−1(t 0)) = 0 for all ℓ = j, …, k − 1, but not the neglected algebraic constraints with index ℓ = 0, …, j − 1. In the most extreme case j = k − 1 (the reduced DAE has index one) x(t 0) can be chosen arbitrarily (assuming that the remaining algebraic constraint can be solved for y(t 0) given the value of x(t 0)). The following theorem shows that the use of inconsistent initial values leads to a polynomial drift off the neglected algebraic constraints in time, compare [73, Sect. VII.2].

Theorem 2.1

Consider the Hessenberg DAE of order k in ( 2.7 ) and the index reduced DAE in ( 2.17 ) with j ∈{ 1,…,k − 1}. Let x(t) and y(t) be a solution of ( 2.17 ) such that the initial values x(t 0 ) and y(t 0 ) satisfy the algebraic constraints g (ℓ) (t 0 ,x k−1−ℓ (t 0 ),…,x k−1 (t 0 )) = 0 for all ℓ = j,…,k − 1. Then for ℓ = 1,…,j and t ≥ t 0 we have

$$\displaystyle{ g^{(\,j-\ell)}(t,x_{ k-1-(j-\ell)}(t),\ldots,x_{k-1}(t)) =\sum _{ \nu =0}^{\ell-1}\frac{1} {\nu !} (t - t_{0})^{\nu }g^{(\,j-\ell+\nu )}[t_{ 0}]. }$$
(2.18)

with g ( j−ℓ+ν) [t 0 ]:= g ( j−ℓ+ν) (t 0 ,x k−1−(j−ℓ+ν) (t 0 ),…,x k−1 (t 0 )).

Proof

We use the abbreviation g (ℓ)[t] for g (ℓ)(t, x k−1−ℓ (t), …, x k−1(t)) for notational convenience. Observe that

$$\displaystyle{g^{(\,j-\ell+1)}[t] = \frac{d} {dt}g^{(\,j-\ell)}[t],\qquad \ell = 1,\ldots,j,}$$

and thus

$$\displaystyle{g^{(\,j-\ell)}[t] = g^{(\,j-\ell)}[t_{ 0}] +\int _{ t_{0}}^{t}g^{(\,j-\ell+1)}[\tau ]d\tau.}$$

We have g ( j)[t] = 0 and thus for ℓ = 1:

$$\displaystyle{g^{(\,j-1)}[t] = g^{(\,j-1)}[t_{ 0}] +\int _{ t_{0}}^{t}g^{(\,j)}[\tau ]d\tau = g^{(\,j-1)}[t_{ 0}].}$$

This proves (2.18) for ℓ = 1. Inductively we obtain

$$\displaystyle\begin{array}{rcl} g^{(\,j-(\ell+1))}[t]& =& g^{(\,j-(\ell+1))}[t_{ 0}] +\int _{ t_{0}}^{t}g^{(\,j-\ell)}[\tau ]d\tau {}\\ & =& g^{(\,j-(\ell+1))}[t_{ 0}] +\int _{ t_{0}}^{t}\sum _{ \nu =0}^{\ell-1}\frac{1} {\nu !} (\tau -t_{0})^{\nu }g^{(\,j-\ell+\nu )}[t_{ 0}]d\tau {}\\ & =& g^{(\,j-(\ell+1))}[t_{ 0}] +\sum _{ \nu =0}^{\ell-1} \frac{1} {(\nu +1)!}(t - t_{0})^{\nu +1}g^{(\,j-\ell+\nu )}[t_{ 0}] {}\\ & =& g^{(\,j-(\ell+1))}[t_{ 0}] +\sum _{ \nu =1}^{\ell}\frac{1} {\nu !} (t - t_{0})^{\nu }g^{(\,j-\ell+\nu -1)}[t_{ 0}] {}\\ & =& \sum _{\nu =0}^{\ell}\frac{1} {\nu !} (t - t_{0})^{\nu }g^{(\,j-(\ell+1)+\nu )}[t_{ 0}], {}\\ \end{array}$$

which proves the assertion. □ 

We investigate the practically relevant index-three case in more detail and consider the reduction to index one (i.e., k = 3 and j = 2). In this case Theorem 2.1 yields

$$\displaystyle\begin{array}{rcl} g^{(0)}(t,x_{ 2}(t))& =& g^{(0)}[t_{ 0}] + (t - t_{0})g^{(1)}[t_{ 0}],{}\end{array}$$
(2.19)
$$\displaystyle\begin{array}{rcl} g^{(1)}(t,x_{ 1}(t),x_{2}(t))& =& g^{(1)}[t_{ 0}].{}\end{array}$$
(2.20)

The drift-off property of the index reduced DAE causes difficulties for numerical discretization methods as the subsequent result shows, compare [73, Sect. VII.2].

Theorem 2.2

Consider the DAE ( 2.7 ) with k = 3 and the index reduced problem ( 2.17 ) with j = 2. Let z(t;t m ,z m ) denote the solution of the latter at time t with initial value z m at t m , where z = (x 1 ,x 2 ,y) denotes the vector of differential and algebraic states. Suppose the initial value z 0 at t 0 satisfies g (0) [t 0 ] = 0 and g (1) [t 0 ] = 0.

Let a numerical method generate approximations z n = (x 1,n ,x 2,n ,y n ) of z(t n ;t 0 ,z 0 ) at time points t n = t 0 + nh, \(n \in \mathbb{N}_{0}\) , with stepsize h > 0. Suppose the numerical method is of order \(p \in \mathbb{N}\) , i.e., the local error satisfies

$$\displaystyle{\|z_{n+1} - z(t_{n+1};t_{n},z_{n})\| = \mathcal{O}(h^{p+1}),\qquad n \in \mathbb{N}_{ 0}.}$$

Then, for \(n \in \mathbb{N}\) the algebraic constraint g (0) = g satisfies the estimate

$$\displaystyle{ \|g(t_{n},x_{2,n})\| \leq Ch^{p}\left (L_{ 0}(t_{n} - t_{0}) + \frac{L_{1}} {2} (t_{n} - t_{0})^{2}\right ) }$$
(2.21)

with constants C,L 0 , and L 1.

Proof

Since z 0 satisfies g (0)[t 0] = 0 and g (1)[t 0] = 0, the solution z(t; t 0, z 0) satisfies these constraints for every t. For notational convenience we use the notation g (0)(t, z(t)) instead of g (0)(t, x 2(t)) and likewise for g (1). Then, for a given t n we have

$$\displaystyle\begin{array}{rcl} \|g^{(0)}(t_{ n},z_{n})\|& =& \|g^{(0)}(t_{ n},z_{n}) - g^{(0)}(t_{ n},z(t_{n};t_{0},z_{0}))\| \\ & =& \|\sum _{m=0}^{n-1}\left (g^{(0)}(t_{ n},z(t_{n};t_{m+1},z_{m+1})) - g^{(0)}(t_{ n},z(t_{n};t_{m},z_{m}))\right )\| \\ & \leq & \sum _{m=0}^{n-1}\|g^{(0)}(t_{ n},z(t_{n};t_{m+1},z_{m+1})) - g^{(0)}(t_{ n},z(t_{n};t_{m},z_{m}))\|.\qquad {}\end{array}$$
(2.22)

Exploitation of (2.19)–(2.20) with t 0 replaced by t m and t m+1, respectively, yields

$$\displaystyle\begin{array}{rcl} & & \|g^{(0)}(t_{ n},z(t_{n};t_{m+1},z_{m+1})) - g^{(0)}(t_{ n},z(t_{n};t_{m},z_{m}))\| {}\\ & & =\| g^{(0)}(t_{ m+1},z_{m+1}) + (t_{n} - t_{m+1})g^{(1)}(t_{ m+1},z_{m+1}) {}\\ & & \phantom{=\|} -g^{(0)}(t_{ m},z_{m}) - (t_{n} - t_{m})g^{(1)}(t_{ m},z_{m})\| {}\\ & & =\| g^{(0)}(t_{ m+1},z_{m+1}) + (t_{n} - t_{m+1})g^{(1)}(t_{ m+1},z_{m+1}) - g^{(0)}(t_{ m+1},z(t_{m+1};t_{m},z_{m})) {}\\ & & \phantom{=\|} +g^{(0)}(t_{ m+1},z(t_{m+1};t_{m},z_{m})) - g^{(0)}(t_{ m},z_{m}) - (t_{n} - t_{m})g^{(1)}(t_{ m},z_{m})\| {}\\ & & =\| g^{(0)}(t_{ m+1},z_{m+1}) + (t_{n} - t_{m+1})g^{(1)}(t_{ m+1},z_{m+1}) - g^{(0)}(t_{ m+1},z(t_{m+1};t_{m},z_{m})) {}\\ & & \phantom{=\|} +g^{(0)}(t_{ m},z_{m}) + (t_{m+1} - t_{m})g^{(1)}(t_{ m},z_{m}) - g^{(0)}(t_{ m},z_{m}) - (t_{n} - t_{m})g^{(1)}(t_{ m},z_{m})\| {}\\ & & =\| g^{(0)}(t_{ m+1},z_{m+1}) - g^{(0)}(t_{ m+1},z(t_{m+1};t_{m},z_{m})) {}\\ & & \phantom{=\|} +(t_{n} - t_{m+1})\left (g^{(1)}(t_{ m+1},z_{m+1}) - g^{(1)}(t_{ m},z_{m})\right )\| {}\\ & & \leq L_{0}Ch^{p+1} + (t_{ n} - t_{m+1})\|g^{(1)}(t_{ m+1},z_{m+1}) - g^{(1)}(t_{ m+1},z(t_{m+1};t_{m},z_{m}))\| {}\\ & & \phantom{=\|} +(t_{n} - t_{m+1})\|g^{(1)}(t_{ m+1},z(t_{m+1};t_{m},z_{m})) - g^{(1)}(t_{ m},z_{m})\| {}\\ & & \leq Ch^{p+1}\left (L_{ 0} + L_{1}(t_{n} - t_{m+1})\right ) + (t_{n} - t_{m+1})\|g^{(1)}(t_{ m},z_{m}) - g^{(1)}(t_{ m},z_{m})\| {}\\ & & = Ch^{p+1}\left (L_{ 0} + L_{1}(t_{n} - t_{m+1})\right ), {}\\ \end{array}$$

where L 0 and L 1 are Lipschitz constants of g (0) and g (1). Together with (2.22) we thus proved the estimate

$$\displaystyle\begin{array}{rcl} \|g^{(0)}(t_{ n},z_{n})\|& \leq & \sum _{m=0}^{n-1}Ch^{p+1}\left (L_{ 0} + L_{1}(t_{n} - t_{m+1})\right ) {}\\ & \leq & Ch^{p}\left (L_{ 0}(t_{n} - t_{0}) + \frac{L_{1}} {2} (t_{n} - t_{0})^{2}\right ). {}\\ \end{array}$$

 □ 

The estimate (2.21) shows that the numerical solution may violate the algebraic constraint with a quadratic drift term in t n for the setting in Theorem 2.2. This drift-off effect may lead to useless numerical simulation results, especially on long time horizons. For DAEs with even higher index, the situation becomes worse as the degree of the polynomial drift term depends on the j in (2.17), i.e., on the number of differentiations used in the index reduction.
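
The drift-off effect is easily observed numerically. The following listing is a purely illustrative sketch (Python with NumPy/SciPy assumed, hypothetical data): the planar pendulum of Example 3.2 below is integrated in its index-one formulation, i.e., the Lagrange multiplier is eliminated by means of the acceleration-level constraint, and the violation of the neglected position-level constraint is monitored. In accordance with the estimate (2.21), the violation grows with the length of the time horizon; tightening the tolerances delays, but does not remove, the drift.

    import numpy as np
    from scipy.integrate import solve_ivp

    m, l, grav = 1.0, 1.0, 9.81          # illustrative data

    def rhs(t, z):
        # index-one formulation: lambda is obtained from the acceleration-level
        # constraint  (v1^2 + v2^2) - grav*q2 - (2/m)*(q1^2 + q2^2)*lam = 0
        q1, q2, v1, v2 = z
        lam = m*(v1**2 + v2**2 - grav*q2) / (2.0*(q1**2 + q2**2))
        return [v1, v2, -2.0*q1*lam/m, -grav - 2.0*q2*lam/m]

    z0 = [l, 0.0, 0.0, 0.0]              # consistent: g = 0 and g' = 0 at t0
    sol = solve_ivp(rhs, (0.0, 20.0), z0, rtol=1e-6, atol=1e-9, dense_output=True)
    for T in (5.0, 10.0, 20.0):
        q1, q2 = sol.sol(T)[:2]
        print(T, q1**2 + q2**2 - l**2)   # |g| grows with T: drift off the constraint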

2.2 Stabilization Techniques

The basic index reduction approach in the previous section may lead to unsatisfactory numerical results. One possibility to avoid the drift-off on the numerical level is to perform a projection step onto the neglected algebraic constraints after each successful integration step for the index reduced system, see [18, 47].

Another idea is to use stabilization techniques to stabilize the index reduced DAE itself. The common approaches are Baumgarte stabilization, Gear–Gupta–Leimkuhler stabilization, and the use of overdetermined DAEs.

2.2.1 Baumgarte Stabilization

The Baumgarte stabilization [22] was originally introduced for mechanical multibody systems (2.13). It can be extended to Hessenberg DAEs in a formal way. The idea is to replace the algebraic constraint in (2.7) by a linear combination of the original and the hidden algebraic constraints g (ℓ), ℓ ∈ { 0, 1, …, k − 1}. With the setting x 0: = y, the resulting DAE reads as follows:

$$\displaystyle{ \begin{array}{lcrllcccl} x_{1}^{{\prime}}(t) & =& f_{1}(t,&y(t),&x_{1}(t),&x_{2}(t),& \ldots, &x_{k-2}(t),&x_{k-1}(t)), \\ x_{2}^{{\prime}}(t) & =& f_{2}(t,& &x_{1}(t),&x_{2}(t),& \ldots, &x_{k-2}(t),&x_{k-1}(t)),\\ & \vdots & & & & \ddots \\ x_{k-1}^{{\prime}}(t)& =& f_{k-1}(t,& & & & &x_{k-2}(t),&x_{k-1}(t)), \\ 0 & =&\sum \limits _{\ell=0}^{k-1}\alpha _{\ell}g^{(\ell)}(t,& & & &x_{k-1-\ell}(t),& \ldots, &x_{k-1}(t)). \end{array} }$$
(2.23)

The DAE (2.23) has index one. The weights α ℓ, ℓ = 0, 1, …, k − 1, with α k−1 = 1 have to be chosen such that the associated differential equation

$$\displaystyle{0 =\sum \limits _{ \ell=0}^{k-1}\alpha _{ \ell}\eta ^{(\ell)}(t)}$$

is asymptotically stable with ∥ η (ℓ)(t) ∥ ⟶ 0 for ℓ ∈ { 0, …, k − 2} as t ⟶ ∞, compare [73, Sect. VII.2]. A proper choice of the weights is crucial since a balance between quick damping and a low degree of stiffness has to be found.
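
For the mechanical multibody system (2.13), i.e., k = 3, the classical choice corresponds to replacing the acceleration-level constraint by g″ + 2α g′ + β g = 0 and solving this relation for λ. The following listing sketches this for the pendulum of Example 3.2 (illustrative data and weights, SciPy assumed); in contrast to the plain index-one reduction, the constraint violation now remains bounded, at the price of a possibly stiff differential equation for large weights.

    import numpy as np
    from scipy.integrate import solve_ivp

    m, l, grav = 1.0, 1.0, 9.81
    alpha, beta = 5.0, 25.0              # Baumgarte weights (to be tuned)

    def rhs(t, z):
        q1, q2, v1, v2 = z
        g0 = q1**2 + q2**2 - l**2                      # g
        g1 = 2.0*(q1*v1 + q2*v2)                       # g'
        # solve  g'' + 2*alpha*g' + beta*g = 0  for lambda, using
        # g'' = 2*(v1^2 + v2^2) - (4/m)*(q1^2 + q2^2)*lam - 2*grav*q2
        lam = m*(2.0*(v1**2 + v2**2) - 2.0*grav*q2 + 2.0*alpha*g1 + beta*g0) \
              / (4.0*(q1**2 + q2**2))
        return [v1, v2, -2.0*q1*lam/m, -grav - 2.0*q2*lam/m]

    sol = solve_ivp(rhs, (0.0, 20.0), [l, 0.0, 0.0, 0.0], rtol=1e-6, atol=1e-9)
    print(sol.y[0, -1]**2 + sol.y[1, -1]**2 - l**2)    # remains close to zero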

The Baumgarte stabilization was used for real-time simulations in [14, 31], but on the index-two level and not on the index-one level.

2.2.2 Gear–Gupta–Leimkuhler Stabilization

The Gear–Gupta–Leimkuhler (GGL) stabilization [64] does not neglect algebraic constraints but couples them to the index reduced DAE using an additional multiplier. Consider the mechanical multibody system (2.13). The GGL stabilization reads as follows:

$$\displaystyle\begin{array}{rcl} q^{{\prime}}(t)& =& v(t) - g_{ q}^{{\prime}}(t,q(t))^{\top }\mu (t), \\ M(t,q(t))v^{{\prime}}(t)& =& f(t,q(t),v(t)) - g_{ q}^{{\prime}}(t,q(t))^{\top }\lambda (t), \\ 0& =& g(t,q(t)), \\ 0& =& g_{t}^{{\prime}}(t,q(t)) + g_{ q}^{{\prime}}(t,q(t)) \cdot v(t){}\end{array}$$
(2.24)

The DAE is of Hessenberg type (if multiplied by M −1) and it has index two, if M is symmetric and positive definite and g q has full rank. Differentiation of the first algebraic equation yields

$$\displaystyle{0 = g_{t}^{{\prime}}(t,q(t))+g_{ q}^{{\prime}}(t,q(t))\cdot \left (v(t) - g_{ q}^{{\prime}}(t,q(t))^{\top }\mu (t)\right ) = -g_{ q}^{{\prime}}(t,q(t))g_{ q}^{{\prime}}(t,q(t))^{\top }\mu (t).}$$

Since g q is supposed to be of full rank, the matrix g q [t]g q [t]⊤ is non-singular and the equation implies μ ≡ 0.
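
For illustration, the following listing states the GGL-stabilized pendulum, i.e., system (2.24) specialized to the pendulum of Example 3.2, as a residual function F(t, z, z′) = 0 with z = (q 1, q 2, v 1, v 2, λ, μ). This is the form expected by residual-based DAE integrators such as IDA from the SUNDIALS suite; the integrator call itself is omitted and the data are purely illustrative. As a plausibility check, the residual is evaluated at a consistent equilibrium point.

    import numpy as np

    m, l, grav = 1.0, 1.0, 9.81

    def residual(t, z, zp):
        q1, q2, v1, v2, lam, mu = z
        q1p, q2p, v1p, v2p = zp[:4]          # derivatives of lam and mu do not enter
        return np.array([
            q1p - v1 + 2.0*q1*mu,            # q'  = v - g_q^T mu
            q2p - v2 + 2.0*q2*mu,
            m*v1p + 2.0*q1*lam,              # M v' = f - g_q^T lam
            m*v2p + m*grav + 2.0*q2*lam,
            q1**2 + q2**2 - l**2,            # position-level constraint
            2.0*(q1*v1 + q2*v2),             # velocity-level constraint
        ])

    # consistent equilibrium: pendulum at rest hanging down, mu = 0
    z  = np.array([0.0, -l, 0.0, 0.0, m*grav/(2.0*l), 0.0])
    zp = np.zeros(6)
    print(residual(0.0, z, zp))              # all residual components vanish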

The idea of the GGL stabilization can be extended to Hessenberg DAEs. To this end consider (2.7) and the index reduced DAE (2.17) with j ∈ { 1, …, k − 1} fixed. Define

$$\displaystyle{G(t,x_{1},\ldots,x_{k-1}):= \left (\begin{array}{c} g^{(0)}(t,x_{k-1}) \\ g^{(1)}(t,x_{k-2},x_{k-1})\\ \vdots \\ g^{(\,j-1)}(t,x_{k-j},\ldots,x_{k-1}) \end{array} \right )}$$

and suppose the Jacobian G x (t, x 1, …, x k−1) has full rank. A stabilized version of (2.17) is given by

$$\displaystyle\begin{array}{rcl} x^{{\prime}}(t)& =& f(t,x(t),y(t)) - G_{ x}^{{\prime}}(t,x(t))^{\top }\mu (t), \\ 0& =& G(t,x(t)), \\ 0& =& g^{(\,j)}(t,x_{ k-j-1}(t),\ldots,x_{k-1}(t)), {}\end{array}$$
(2.25)

where μ is an additional algebraic variable, x = (x 1, …, x k−1), and f = ( f 1, …, f k−1). The stabilized DAE has index max{2, k − j}. Note that

$$\displaystyle\begin{array}{rcl} G_{t}^{{\prime}}[t] + G_{ x}^{{\prime}}[t]\,f[t]& =& \left (\begin{array}{c} (g^{(0)})_{ t}^{{\prime}}[t] + (g^{(0)})_{ x_{k-1}}^{{\prime}}[t]f_{ k-1}[t]\\ \vdots \\ (g^{(\,j-1)})_{t}^{{\prime}}[t] +\sum \limits _{ \ell=1}^{j}(g^{(\,j-1)})_{x_{k-\ell}}^{{\prime}}[t]f_{k-\ell}[t] \end{array} \right ) {}\\ & =& \left (\begin{array}{c} g^{(1)}[t]\\ \vdots \\ g^{(\,j)}[t] \end{array} \right ) = 0.{}\\ \end{array}$$

Moreover,

$$\displaystyle\begin{array}{rcl} 0& =& \frac{d} {dt}G(t,x_{1}(t),\ldots,x_{k-1}(t)) {}\\ & =& G_{t}^{{\prime}}[t] + G_{ x}^{{\prime}}[t]\left (f[t] - G_{ x}^{{\prime}}[t]^{\top }\mu (t)\right ) {}\\ & =& G_{t}^{{\prime}}[t] + G_{ x}^{{\prime}}[t]f[t] - G_{ x}^{{\prime}}[t]G_{ x}^{{\prime}}[t]^{\top }\mu (t) {}\\ & =& -G_{x}^{{\prime}}[t]G_{ x}^{{\prime}}[t]^{\top }\mu (t) {}\\ \end{array}$$

and thus μ ≡ 0 since G x was supposed to have full rank.

2.2.3 Stabilization by Over-Determination

The GGL stabilization approaches for the mechanical multibody system in (2.24) and the Hessenberg DAE in (2.25) are mathematically equivalent to the overdetermined DAEs

$$\displaystyle\begin{array}{rcl} q^{{\prime}}(t)& =& v(t), {}\\ M(t,q(t))v^{{\prime}}(t)& =& f(t,q(t),v(t)) - g_{ q}^{{\prime}}(t,q(t))^{\top }\lambda (t), {}\\ 0& =& g(t,q(t)), {}\\ 0& =& g_{t}^{{\prime}}(t,q(t)) + g_{ q}^{{\prime}}(t,q(t)) \cdot v(t) {}\\ \end{array}$$

and

$$\displaystyle\begin{array}{rcl} x^{{\prime}}(t)& =& f(t,x(t),y(t)), {}\\ 0& =& G(t,x(t)), {}\\ 0& =& g^{(\,j)}(t,x_{ k-j-1}(t),\ldots,x_{k-1}(t)), {}\\ \end{array}$$

respectively, because the additional algebraic variable μ vanishes in either case. Hence, from an analytical point of view there is no difference between the respective systems. A different treatment is necessary from the numerical point of view, though. The GGL stabilized DAEs in (2.24) and (2.25) can be solved by standard discretization schemes, like BDF methods or methods of Runge–Kutta type, provided those are suitable for higher index DAEs. In contrast, the overdetermined DAEs require tailored numerical methods that are capable of dealing with overdetermined linear equations, which arise internally in each integration step. Typically, such overdetermined equations are solved in a least-squares sense, compare [56, 57] for details.

3 Consistent Initialization and Influence of Parameters

One of the crucial issues when dealing with DAEs is that a DAE in general only permits a solution for properly defined initial values, the so-called consistent initial values. The initial values not only have to satisfy those algebraic constraints that are explicitly present in the DAE, but hidden constraints have to be satisfied as well.

3.1 Consistent Initial Values

For the Hessenberg DAE (2.7) consistency is defined as follows.

Definition 3.1 (Consistent Initial Value for Hessenberg DAEs)

The initial values x(t 0) = (x 1(t 0), , x k−1(t 0)) and y(t 0) are consistent with (2.7), if the equations

$$\displaystyle\begin{array}{rcl} 0& =& g^{(\,j)}(t_{ 0},x_{k-1-j}(t_{0}),\ldots,x_{k-1}(t_{0})),\qquad j = 0,1,\ldots,k - 2,{}\end{array}$$
(3.1)
$$\displaystyle\begin{array}{rcl} 0& =& g^{(k-1)}(t_{ 0},y(t_{0}),x_{1}(t_{0}),\ldots,x_{k-1}(t_{0})){}\end{array}$$
(3.2)

hold.

Finding a consistent initial value for a Hessenberg DAE typically consists of two steps. Firstly, a suitable x(t 0) subject to the constraints (3.1) has to be determined. Secondly, given x(t 0) with (3.1), Eq. (3.2) can be solved for y 0 = y(t 0) by Newton’s method, if the matrix R 0 = ∂ g (k−1)∕∂ y is non-singular at a solution (assuming that a solution exists). For mechanical multibody systems even a linear equation arises in the second step.

Example 3.1

Consider the mechanical multibody system (2.13). A consistent initial value (q 0, v 0, λ 0) at t 0 must satisfy

$$\displaystyle\begin{array}{rcl} 0& =& g(t_{0},q_{0}), {}\\ 0& =& g_{t}^{{\prime}}(t_{ 0},q_{0}) + g_{q}^{{\prime}}(t_{ 0},q_{0}) \cdot v_{0}, {}\\ 0& =& g_{tt}^{{\prime\prime}}(t_{ 0},q_{0}) + g_{tq}^{{\prime\prime}}(t_{ 0},q_{0}) \cdot v_{0} + g_{q}^{{\prime}}(t_{ 0},q_{0}) \cdot v_{0}^{{\prime}} + g_{ qq}^{{\prime\prime}}(t_{ 0},q_{0})(v_{0},v_{0}) {}\\ \end{array}$$

with

$$\displaystyle{M(t_{0},q_{0})v_{0}^{{\prime}} = f(t_{ 0},q_{0},v_{0}) - g_{q}^{{\prime}}(t_{ 0},q_{0})^{\top }\lambda _{ 0}.}$$

The latter two equations yield a linear equation for v 0 and λ 0:

$$\displaystyle\begin{array}{rcl} & & \left (\begin{array}{cc} M(t_{0},q_{0}) &g_{q}^{{\prime}}(t_{0},q_{0})^{\top } \\ g_{q}^{{\prime}}(t_{0},q_{0})& 0 \end{array} \right )\left (\begin{array}{c} v_{0}^{{\prime}} \\ \lambda _{0} \end{array} \right ) {}\\ & & \qquad = \left (\begin{array}{c} f(t_{0},q_{0},v_{0}) \\ - g_{tt}^{{\prime\prime}}(t_{0},q_{0}) - g_{tq}^{{\prime\prime}}(t_{0},q_{0}) \cdot v_{0} - g_{qq}^{{\prime\prime}}(t_{0},q_{0})(v_{0},v_{0}) \end{array} \right ). {}\\ \end{array}$$

The matrix on the left-hand side is non-singular, if M is symmetric and positive definite and g q (t 0, q 0) is of full rank.
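
The following listing assembles and solves this linear system for the planar pendulum treated in Example 3.2 below (illustrative data, NumPy assumed); the resulting v 0 ′ and λ 0 complete a consistent initial value.

    import numpy as np

    m, l, grav = 1.0, 1.0, 9.81
    q0 = np.array([l, 0.0])                  # satisfies g(q0) = 0 (position level)
    v0 = np.array([0.0, 1.0])                # satisfies g_q(q0) @ v0 = 0 (velocity level)

    M   = m * np.eye(2)                      # mass matrix
    gq  = 2.0 * q0.reshape(1, 2)             # g_q(q0)
    f0  = np.array([0.0, -m*grav])           # applied forces
    gvv = 2.0 * float(v0 @ v0)               # g_qq(q0)(v0, v0); g_tt = g_tq = 0 here

    A = np.block([[M, gq.T], [gq, np.zeros((1, 1))]])
    b = np.concatenate([f0, [-gvv]])
    sol = np.linalg.solve(A, b)
    v0p, lam0 = sol[:2], sol[2]
    print(v0p, lam0)                         # consistent acceleration and multiplier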

Definition 3.2 (Consistent Initial Value for General DAEs, Compare [29, Sect. 5.3.4])

For a general DAE (1.1) with differentiation index d the initial value z 0 = z(t 0) is said to be consistent at t 0, if the derivative array

$$\displaystyle{ F^{(\,j)}(t_{ 0},z_{0},z_{0}^{{\prime}},\ldots,z_{ 0}^{(\,j+1)}) = 0,\qquad j = 0,1,\ldots,d, }$$
(3.3)

in (2.4) has a solution (z 0, z 0 ′, …, z 0 (d+1)).

Note that the system of nonlinear equations (3.3) in general has many solutions and additional conditions are required to obtain a particular consistent initial value, which might be relevant for a particular application. This can be achieved for instance by imposing additional constraints

$$\displaystyle{ G(t_{0},z_{0},z_{0}^{{\prime}}) = 0, }$$
(3.4)

which are known to hold for a specific application, compare [29, Sect. 5.3.4]. Of course, such additional constraints must not contradict the equations in (3.3).

If the user is not able to formulate relations in (3.4) such that the combined system of equations (3.3) and (3.4) returns a unique solution, then a least-squares approach could be used to find a consistent initial value closest to a ‘desired’ initial value, compare [33]:

$$\displaystyle{\begin{array}{ll} \mbox{ Minimize}&\frac{1} {2}\|G(t_{0},z_{0},z_{0}^{{\prime}})\|^{2} + \frac{1} {2}\sum _{\ell=2}^{d+1}\|z_{ 0}^{(\ell)}\|^{2} \\ \mbox{ w.r.t.} &(z_{0},z_{0}^{{\prime}},\ldots,z_{0}^{(d+1)})^{\top } \\ \mbox{ s.t.} &F^{(\,j)}(t_{0},z_{0},z_{0}^{{\prime}},\ldots,z_{0}^{(\,j+1)}) = 0,\qquad j = 0,1,\ldots,d.\end{array} }$$

In practical computations the major challenge for higher index DAEs is to obtain analytical expressions or numerical approximations of the derivatives in F ( j), j = 1, …, d. For this purpose computer algebra packages like MAPLE, MATHEMATICA, or the symbolic toolbox of MATLAB can be used. Algorithmic differentiation tools are suitable as well, compare [72] for an overview. A potential issue is redundancy in the constraints (3.3) and the identification of the relevant equations in the derivative array. Approaches for the consistent initialization of general DAEs can be found in [1, 30, 35, 51, 71, 78, 93, 108]. A different approach is used in [127] in the context of shooting methods for parameter identification problems or optimal control problems. Herein, the algebraic constraints of the DAE are relaxed such that they are satisfied for any initial value. Then, the relaxation terms are driven to zero in the superordinate optimization problem in order to ensure consistency with the original DAE.

3.2 Dependence on Parameters

Initial values may depend on parameters that are present in the DAE. To this end the recomputation of consistent initial values for perturbed parameters becomes necessary or a parametric sensitivity analysis has to be performed, compare [66, 69]. Such issues frequently arise in the context of optimal control problems or parameter identification problems subject to DAEs, compare [68].

Example 3.2

Consider the equations of motion of a pendulum of mass m and length ℓ in the plane:

$$\displaystyle\begin{array}{rcl} q_{1}^{{\prime}}(t)& =& v_{ 1}(t), {}\\ q_{2}^{{\prime}}(t)& =& v_{ 2}(t), {}\\ mv_{1}^{{\prime}}(t)& =& \phantom{-mg} - 2q_{ 1}(t)\lambda (t), {}\\ mv_{2}^{{\prime}}(t)& =& -mg - 2q_{ 2}(t)\lambda (t), {}\\ 0& =& q_{1}(t)^{2} + q_{ 2}(t)^{2} -\ell^{2}. {}\\ \end{array}$$

Herein, (q 1, q 2) denotes the pendulum’s position, (v 1, v 2) its velocity, and λ the stress in the bar. A consistent initial value (q 1, 0, q 2, 0, v 1, 0, v 2, 0, λ 0) has to satisfy the equations

$$\displaystyle\begin{array}{rcl} 0& =& q_{1,0}^{2} + q_{ 2,0}^{2} -\ell^{2},{}\end{array}$$
(3.5)
$$\displaystyle\begin{array}{rcl} 0& =& q_{1,0}v_{1,0} + q_{2,0}v_{2,0},{}\end{array}$$
(3.6)
$$\displaystyle\begin{array}{rcl} 0& =& -\frac{2\ell^{2}} {m}\lambda _{0} + (v_{1,0}^{2} + v_{ 2,0}^{2}) - q_{ 2,0}g.{}\end{array}$$
(3.7)

Apparently, the algebraic component λ 0 depends on the parameter p = (m, ℓ, g) according to

$$\displaystyle{\lambda _{0} =\lambda _{0}(\,p) = \frac{m} {2\ell^{2}} \left (v_{1,0}^{2} + v_{ 2,0}^{2} - q_{ 2,0}g\right ).}$$

But in addition, the positions q 1, 0 and q 2, 0 depend on ℓ through the relation (3.5). So, if ℓ changes, then q 1, 0 and/or q 2, 0 have to change as well subject to (3.5) and (3.6). However, those equations in general do not uniquely define q 1, 0, q 2, 0, v 1, 0, v 2, 0 and the question arises which set of values one should choose.

Firstly, we focus on the recomputation of an initial value for perturbed parameters. As the previous example shows, there is not a unique way to determine such a consistent initial value. A common approach is to use a projection technique, compare, e.g., [69] for a class of index-two DAEs, [68, Sect. 4.5.1] for Hessenberg DAEs, or [33] for general DAEs.

Consider the general parametric DAE

$$\displaystyle{ F(t,z(t),z^{{\prime}}(t),p) = 0 }$$
(3.8)

and the corresponding derivative array

$$\displaystyle{F^{(\,j)}(t,z,z^{{\prime}},\ldots,z^{(\,j+1)},p) = 0,\qquad j = 0,1,\ldots,d.}$$

Remark 3.1

Please note that the differentiation index d of the general parametric DAE (3.8) may depend on p. For simplicity, we assume throughout that this is not the case (at least locally around a fixed nominal parameter).

Let \(\tilde{p}\) be a given parameter. Suppose \(\tilde{z}_{0} = z_{0}(\,\tilde{p})\) with \(\tilde{z}_{0}^{{\prime}} = z_{0}^{{\prime}}(\,\tilde{p}),\ldots,\tilde{z}_{0}^{(d+1)} = z_{0}^{(d+1)}(\,\tilde{p})\) is consistent. In order to find a consistent initial value for p, which is supposed to be close to \(\tilde{p}\), solve the following parametric constrained least-squares problem:

LSQ(p): Minimize

$$\displaystyle{\frac{1} {2}\|(\xi _{0},\xi _{0}^{{\prime}},\ldots,\xi _{ 0}^{(d+1)})^{\top }- (\tilde{z}_{ 0},\tilde{z}_{0}^{{\prime}},\ldots,\tilde{z}_{ 0}^{(d+1)})^{\top }\|^{2}}$$

with respect to (ξ 0, ξ 0 ′, …, ξ 0 (d+1)) subject to the constraints

$$\displaystyle{F^{(\,j)}(t_{ 0},\xi _{0},\xi _{0}^{{\prime}},\ldots,\xi _{ 0}^{(\,j+1)},p) = 0,\qquad j = 0,1,\ldots,d.}$$

Remark 3.2

In case of a parametric Hessenberg DAE it would be sufficient to consider the hidden constraints up to order k − 2 as the constraints in LSQ(p) and to compute a consistent algebraic component afterwards. Moreover, the quantities t 0 and \(\tilde{z}_{0}^{(\,j)}\), j = 0, , d + 1, could be considered as parameters of LSQ(p) as well, but here we are only interested in p.
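
The following listing sketches LSQ(p) for the pendulum of Example 3.2 with the length ℓ playing the role of the parameter; the hidden constraints of the pendulum are written down explicitly instead of using the general derivative array, and the SLSQP solver of SciPy serves as the constrained least-squares engine. All names and data are purely illustrative.

    import numpy as np
    from scipy.optimize import minimize

    m, grav = 1.0, 9.81

    def constraints(w, l):
        q1, q2, v1, v2, lam = w
        return np.array([
            q1**2 + q2**2 - l**2,                                 # position level (3.5)
            q1*v1 + q2*v2,                                        # velocity level (3.6)
            v1**2 + v2**2 - grav*q2 - 2.0*(q1**2 + q2**2)*lam/m,  # acceleration level
        ])

    w_nominal = np.array([1.0, 0.0, 0.0, 1.0, 0.5])               # consistent for l = 1
    l_new = 1.1
    res = minimize(lambda w: 0.5*np.sum((w - w_nominal)**2), w_nominal,
                   method='SLSQP',
                   constraints={'type': 'eq', 'fun': constraints, 'args': (l_new,)})
    print(res.x)     # consistent initial value for l = 1.1 closest to w_nominal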

The least-squares problem LSQ(p) is a parametric nonlinear optimization problem and allows for a sensitivity analysis in the spirit of [54] in order to investigate the sensitivity of a solution of LSQ(p) for p close to some nominal value \(\hat{p}\). Let

$$\displaystyle{ L(\xi,\mu,p) = \frac{1} {2}\|\xi -\tilde{ z}\|^{2} +\mu ^{\top }G(\xi,p) }$$
(3.9)

with ξ = (ξ 0, ξ 0 ′, …, ξ 0 (d+1)), \(\tilde{z} = (\tilde{z}_{0},\tilde{z}_{0}^{{\prime}},\ldots,\tilde{z}_{0}^{(d+1)})^{\top }\), and

$$\displaystyle{ G(\xi,p) = \left (\begin{array}{c} F^{(\,j)}(t_{0},\xi _{0},\xi _{0}^{{\prime}},\ldots,\xi _{0}^{(\,j+1)},p) \end{array} \right )_{j=0,1,\ldots,d} }$$
(3.10)

denote the Lagrange function of LSQ(p).

Theorem 3.1 (Sensitivity Theorem, Compare [54])

Let G in ( 3.10 ) be twice continuously differentiable and \(\hat{p}\) a nominal parameter. Let \(\hat{\xi }\) be a local minimum of LSQ( \(\hat{p}\) ) with Lagrange multiplier \(\hat{\mu }\) such that the following assumptions hold:

  1. (a)

    Linear independence constraint qualification: \(G_{\xi }^{{\prime}}(\hat{\xi },\hat{p})\) has full rank.

  2. (b)

    KKT conditions: \(0 = \nabla _{\xi }L(\hat{\xi },\hat{\mu },\hat{p})\) with L from ( 3.9 )

  3. (c)

    Second-order sufficient condition:

    $$\displaystyle{L_{\xi \xi }^{{\prime\prime}}(\hat{\xi },\hat{\mu },\hat{p})(h,h) > 0\qquad \forall h\not =0\:\ G_{\xi }^{{\prime}}(\hat{\xi },\hat{p})h = 0.}$$

Then there exist neighborhoods \(B_{\epsilon }(\hat{p})\) and \(B_{\delta }(\hat{\xi },\hat{\mu })\) , such that LSQ(p) has a unique local minimum

$$\displaystyle{(\xi (\,p),\mu (\,p)) \in B_{\delta }(\hat{\xi },\hat{\mu })}$$

for each \(p \in B_{\epsilon }(\hat{p})\) . In addition, (ξ( p),μ( p)) is continuously differentiable with respect to p with

$$\displaystyle{ \left (\begin{array}{cc} L_{\xi \xi }^{{\prime\prime}}(\hat{\xi },\hat{\mu },\hat{p})&G_{\xi }^{{\prime}}(\hat{\xi },\hat{p})^{\top } \\ G_{\xi }^{{\prime}}(\hat{\xi },\hat{p}) & 0 \end{array} \right )\left (\begin{array}{c} \xi ^{{\prime}}(\hat{p}) \\ \mu ^{{\prime}}(\hat{p}) \end{array} \right ) = -\left (\begin{array}{c} L_{\xi p}^{{\prime\prime}}(\hat{\xi },\hat{\mu },\hat{p}) \\ G_{p}^{{\prime}}(\hat{\xi },\hat{p}) \end{array} \right ). }$$
(3.11)

The second equation in (3.11) reads

$$\displaystyle{G_{\xi }^{{\prime}}(\hat{\xi },\hat{p})\xi ^{{\prime}}(\hat{p}) + G_{ p}^{{\prime}}(\hat{\xi },\hat{p}) = 0}$$

and in more detail using (3.10),

$$\displaystyle{\sum _{\ell=0}^{j+1}\frac{\partial F^{(\,j)}} {\partial z^{(\ell)}} (t_{0},\hat{\xi }_{0},\ldots,\hat{\xi }_{0}^{(\,j+1)},\hat{p}) \cdot (\xi _{ 0}^{(\ell)})^{{\prime}}(\hat{p}) + \frac{\partial F^{(\,j)}} {\partial p} (t_{0},\hat{\xi }_{0},\ldots,\hat{\xi }_{0}^{(\,j+1)},\hat{p}) = 0}$$

for j = 0, 1, …, d. Let us define \(S_{0}^{(\ell)}:= (\xi _{0}^{(\ell)})^{{\prime}}(\hat{p})\) for ℓ = 0, …, d + 1. Then, we obtain

$$\displaystyle{ \sum _{\ell=0}^{j+1}\frac{\partial F^{(\,j)}} {\partial z^{(\ell)}} (t_{0},\hat{\xi }_{0},\ldots,\hat{\xi }_{0}^{(\,j+1)},\hat{p}) \cdot S_{ 0}^{(\ell)} + \frac{\partial F^{(\,j)}} {\partial p} (t_{0},\hat{\xi }_{0},\ldots,\hat{\xi }_{0}^{(\,j+1)},\hat{p}) = 0 }$$
(3.12)

and in particular for j = 0,

$$\displaystyle{\frac{\partial F} {\partial z} (t_{0},\hat{\xi }_{0},\hat{\xi }_{0}^{{\prime}},\hat{p}) \cdot S_{ 0} + \frac{\partial F} {\partial z^{{\prime}}}(t_{0},\hat{\xi }_{0},\hat{\xi }_{0}^{{\prime}},\hat{p}) \cdot S_{ 0}^{{\prime}} + \frac{\partial F} {\partial p} (t_{0},\hat{\xi }_{0},\hat{\xi }_{0}^{{\prime}},\hat{p}) = 0,}$$

which is the linearization of (3.8) around \((t_{0},\xi _{0},\xi _{0}^{{\prime}},\hat{p})\) with respect to p. Taking into account the definition of the further components F ( j), j = 1, …, d, of the derivative array, compare (2.3), we recognize that (3.12) provides a linearization of (2.3) with respect to p. Hence, the settings

$$\displaystyle{S(t_{0}) = S_{0},\quad S^{{\prime}}(t_{ 0}) = S_{0}^{{\prime}},\ldots,S^{(d+1)}(t_{ 0}) = S_{0}^{(d+1)}}$$

provide consistent initial values for the sensitivity DAE

$$\displaystyle{F_{z}^{{\prime}}(t,z(t),z^{{\prime}}(t),p)S(t) + F_{ z^{{\prime}}}^{{\prime}}(t,z(t),z^{{\prime}}(t),p)S^{{\prime}}(t) + F_{ p}^{{\prime}}(t,z(t),z^{{\prime}}(t),p) = 0,}$$

where S(t): = ∂ z(t; p)∕∂ p denotes the sensitivity of the solution of (3.8) with respect to the parameter p, compare Sect. 7. Herein, it is assumed that F is sufficiently smooth with respect to all arguments.

In summary, the projection approach using LSQ(p) has several benefits: Firstly, it allows one to compute consistent initial values for the DAE itself. Secondly, the sensitivity analysis provides consistent initial values for the sensitivity DAE. Finally, the sensitivity analysis can be used to predict consistent initial values under perturbations through the Taylor expansion

$$\displaystyle{\xi (\,p) =\xi (\hat{p}) +\xi ^{{\prime}}(\hat{p})(p -\hat{ p}) + o(\|p -\hat{ p}\|)}$$

for \(p \in B_{\varepsilon }(\hat{p})\).
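
As a simple illustration of the Taylor prediction, consider the dependence of the consistent position component q 1, 0 on the length ℓ via (3.5), with q 2, 0 kept fixed; the sensitivity is approximated by a finite difference (illustrative data, NumPy assumed).

    import numpy as np

    q20 = 0.6                                           # kept fixed
    q10 = lambda l: np.sqrt(l**2 - q20**2)              # from (3.5)
    l_hat, dl, eps = 1.0, 0.05, 1e-7
    sens = (q10(l_hat + eps) - q10(l_hat)) / eps        # approximates dq10/dl at l_hat
    print(q10(l_hat + dl), q10(l_hat) + sens*dl)        # exact value vs. Taylor prediction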

4 Integration Methods

A vast number of numerical discretization schemes exists for DAEs, most of which were originally designed for ODEs, such as BDF methods or Runge–Kutta methods. The methods for DAEs are typically (at least in part) implicit methods owing to the presence of algebraic equations. It is beyond the scope of the paper to provide a comprehensive overview on all the existing numerical discretization schemes for DAEs, since excellent textbooks with convergence results and many details are available, for instance [29, 73, 75–77, 90, 92, 134]. Our intention is to discuss the most commonly used methods, their construction principles, and some of their features. Efficient implementations employ a number of additional ideas to improve efficiency.

All methods work on a grid

$$\displaystyle{\mathbb{G}_{h} =\{ t_{0} < t_{1} <\ldots < t_{N-1} < t_{N} = t_{f}\}}$$

with \(N \in \mathbb{N}\) and step-sizes h k  = t k+1 − t k , k = 0, …, N − 1. The maximum step-size is denoted by \(h =\max \limits _{k=0,\ldots,N-1}h_{k}\). The methods generate a grid function \(z_{h}: \mathbb{G}_{h}\longrightarrow \mathbb{R}^{n}\) with z h (t) ≈ z(t) for \(t \in \mathbb{G}_{h}\), where z(t) denotes the solution of (1.1) with a consistent initial value z 0. The discretization schemes can be grouped into one-step methods with

$$\displaystyle{ z_{h}(t_{i+1}) = z_{h}(t_{i}) + h_{i}\varPhi (t_{i},z_{h}(t_{i}),h_{i}),\qquad i = 0,\ldots,N - 1, }$$
(4.1)

for a given consistent initial value z h (t 0) = z 0 and s-stage multi-step methods with

$$\displaystyle{ z_{h}(t_{i+s}) =\varPsi (t_{i},\ldots,t_{i+s},z_{h}(t_{i}),\ldots,z_{h}(t_{i+s-1}),h_{i},\ldots,h_{i+s-1}),\quad i = 0,\ldots,N - s, }$$
(4.2)

for given consistent initial values z h (t 0) = z 0, …, z h (t s−1) = z s−1. Note that multi-step methods with s > 1 require an initialization procedure to compute z 1, …, z s−1. This can be realized by performing s − 1 steps of a suitable one-step method or by using multi-step methods with 1, 2, …, s − 1 stages successively for the first s − 1 steps.

The aim is to construct convergent methods such that the global error \(e_{h}: \mathbb{G}_{h}\longrightarrow \mathbb{R}^{n}\) defined by

$$\displaystyle{e_{h}:= z_{h} -\varDelta _{h}(z),\qquad e_{h}(t) = z_{h}(t) -\varDelta _{h}(z)(t),\ t \in \mathbb{G}_{h},}$$

satisfies

$$\displaystyle{\lim _{h\rightarrow 0}\|e_{h}\|_{\infty } = 0}$$

or even exhibits the order of convergence \(p \in \mathbb{N}\), i.e.

$$\displaystyle{\|e_{h}\|_{\infty } = \mathcal{O}(h^{p})\ \mbox{ as}\ h \rightarrow 0.}$$

Herein, \(\varDelta _{h}:\{ z: [t_{0},t_{f}]\longrightarrow \mathbb{R}^{n}\}\longrightarrow \{z_{h}: \mathbb{G}_{h}\longrightarrow \mathbb{R}^{n}\}\) denotes the restriction operator onto the set of grid functions on \(\mathbb{G}_{h}\) defined by Δ h (z)(t) = z(t) for \(t \in \mathbb{G}_{h}\).

A convergence proof for a specific discretization scheme typically resembles the reasoning

$$\displaystyle{\mathrm{consistency}\quad + \quad \mathrm{stability}\qquad \Longrightarrow\qquad \mathrm{convergence},}$$

compare [131]. Herein, consistency is not to be confused with consistent initial values. Instead, consistency of a discretization method measures how well the exact solution satisfies the discretization scheme. Detailed definitions of consistency and stability and convergence proofs for various classes of DAEs (index-one, Hessenberg DAEs up to order 3, constant/variable step-sizes) can be found in the above-mentioned textbooks [29, 73, 75–77, 90, 92, 134]. As a rule, one cannot expect the same order of convergence for the differential and the algebraic variables for higher index DAEs.

4.1 BDF Methods

The Backward Differentiation Formulas (BDF) are implicit multi-step methods and were introduced in [40, 61]. A BDF method with \(s \in \mathbb{N}\) stages is based on the construction of interpolating polynomials, compare Fig. 1. Suppose the method has produced approximations z h (t i+k ), k = 0, …, s − 1, of z at the grid points t i+k , k = 0, …, s − 1. The aim is to determine an approximation z h (t i+s ) of z(t i+s ), where i ∈ { 0, …, N − s}.

Fig. 1: Idea of BDF method: polynomial interpolation of approximations

To this end, let P(t) be the interpolating polynomial of degree at most s with

$$\displaystyle{P(t_{i+k}) = z_{h}(t_{i+k}),\qquad k = 0,\ldots,s.}$$

The polynomial P can be expressed as

$$\displaystyle{P(t) =\sum _{ k=0}^{s}z_{ h}(t_{i+k})L_{k}(t),\qquad L_{k}(t) =\prod _{ \ell=0,\ell\not =k}^{s} \frac{t - t_{i+\ell}} {t_{i+k} - t_{i+\ell}},}$$

where the L k ’s denote the Lagrange polynomials. Note that P interpolates the unknown vector z h (t i+s ), which is determined implicitly by requiring that P satisfies the DAE (1.1) at t i+s , i.e.

$$\displaystyle{ F(t_{i+s},z_{h}(t_{i+s}),P^{{\prime}}(t_{ i+s})) = 0. }$$
(4.3)

The above representation of P yields

$$\displaystyle{P^{{\prime}}(t_{ i+s}) =\sum _{ k=0}^{s}z_{ h}(t_{i+k})L_{k}^{{\prime}}(t_{ i+s}) =: \frac{1} {h_{i+s-1}}\sum \limits _{k=0}^{s}\alpha _{ k}z_{h}(t_{i+k}),}$$

with α k  = h i+s−1 L k ′(t i+s ), k = 0, …, s.

Example 4.1

The BDF methods with s ≤ 6 and a constant step-size h read as follows, see [134, p. 335]:

$$\displaystyle\begin{array}{rcl} s = 1&:& hP^{{\prime}}(t_{ i+1}) = z_{i+1} - z_{i}\quad \mbox{ (implicit Euler method)} {}\\ s = 2&:& hP^{{\prime}}(t_{ i+2}) = \frac{1} {2}\left (3z_{i+2} - 4z_{i+1} + z_{i}\right ) {}\\ s = 3&:& hP^{{\prime}}(t_{ i+3}) = \frac{1} {6}\left (11z_{i+3} - 18z_{i+2} + 9z_{i+1} - 2z_{i}\right ) {}\\ s = 4&:& hP^{{\prime}}(t_{ i+4}) = \frac{1} {12}\left (25z_{i+4} - 48z_{i+3} + 36z_{i+2} - 16z_{i+1} + 3z_{i}\right ) {}\\ s = 5&:& hP^{{\prime}}(t_{ i+5}) = \frac{1} {60}\left (137z_{i+5} - 300z_{i+4} + 300z_{i+3} - 200z_{i+2} + 75z_{i+1} - 12z_{i}\right ) {}\\ s = 6&:& hP^{{\prime}}(t_{ i+6}) = \frac{1} {60}\left (147z_{i+6} - 360z_{i+5} + 450z_{i+4} - 400z_{i+3} + 225z_{i+2}\right. {}\\ & & \hspace{71.13188pt} \left.-72z_{i+1} + 10z_{i}\right ). {}\\ \end{array}$$

Abbreviations: z i+k  = z h (t i+k ), k = 0, …, 6.
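
The coefficients α k can be generated automatically from the derivatives of the Lagrange polynomials. The following listing is a small sketch using SymPy (not used in the text; constant step-size normalized to h = 1); it reproduces the coefficients of Example 4.1.

    import sympy as sp

    def bdf_coefficients(s):
        # alpha_k = h * L_k'(t_{i+s}) for constant step-size; nodes normalized to 0, ..., s
        tau = sp.symbols('tau')
        nodes = list(range(s + 1))
        alphas = []
        for k in nodes:
            L_k = sp.Integer(1)
            for j in nodes:
                if j != k:
                    L_k *= (tau - j) / sp.Integer(k - j)
            alphas.append(sp.diff(L_k, tau).subs(tau, s))
        return alphas

    print(bdf_coefficients(2))   # [1/2, -2, 3/2]  <->  (z_i - 4 z_{i+1} + 3 z_{i+2})/2
    print(bdf_coefficients(3))   # [-1/3, 3/2, -3, 11/6]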

Introducing the expression for P ′(t i+s ) into (4.3) yields the nonlinear equation

$$\displaystyle{ F\left (t_{i+s},z_{h}(t_{i+s}), \frac{1} {h_{i+s-1}}\sum \limits _{k=0}^{s}\alpha _{ k}z_{h}(t_{i+k})\right ) = 0 }$$
(4.4)

for z h (t i+s ). Suppose (4.4) possesses a solution z h (t i+s ) and the matrix pencil

$$\displaystyle{ F_{z}^{{\prime}} + \frac{\alpha _{s}} {h_{i+s-1}}F_{z^{{\prime}}}^{{\prime}} }$$
(4.5)

is regular at this solution, i.e., there exists a step-size h i+s−1 such that the matrix (4.5) is non-singular. Then the implicit function theorem allows one to solve Eq. (4.4) locally for z h (t i+s ) and to express it in the form (4.2). In practice Newton’s method or the simplified Newton method is used to solve Eq. (4.4) numerically, which requires the non-singularity of the matrix in (4.5) at the Newton iterates.
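
As an illustration, the following listing performs BDF steps with s = 1 (the implicit Euler method) for a small hypothetical index-one test DAE by solving the nonlinear equation (4.4) with a black-box root finder (SciPy assumed; a production code would use Newton’s method with reused Jacobians). The test problem and all names are purely illustrative.

    import numpy as np
    from scipy.optimize import fsolve

    def F(t, z, zp):
        # hypothetical index-one DAE:  x' = -x + y,  0 = x + y - sin(t),  z = (x, y)
        x, y = z
        xp, _ = zp
        return np.array([xp + x - y, x + y - np.sin(t)])

    def bdf1_step(t_new, z_old, h):
        # solve F(t_{i+1}, z_{i+1}, (z_{i+1} - z_i)/h) = 0 for z_{i+1}, cf. (4.4) with s = 1
        return fsolve(lambda z: F(t_new, z, (z - z_old)/h), z_old)

    h, t, z = 0.01, 0.0, np.array([0.0, 0.0])   # consistent initial value at t = 0
    for _ in range(100):
        t += h
        z = bdf1_step(t, z, h)
    print(t, z)                                  # approximation of z(1)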

BDF methods are appealing amongst implicit methods since the effort per integration step amounts to solving just one nonlinear equation of dimension n, whereas a fully implicit s-stage Runge–Kutta method requires the solution of a nonlinear equation of dimension n ⋅ s. For numerical purposes only the BDF methods with s ≤ 6 are relevant, since the methods for s > 6 are unstable. The maximal attainable order of convergence of an s-stage BDF method is s.

Convergence results assuming fixed step-sizes for BDF methods for certain subclasses of the general DAE (1.1), amongst them index-one problems and Hessenberg DAEs, can be found in [27, 63, 99, 111]. Variable step-sizes may result in non-convergent components of the algebraic variables for index-three Hessenberg DAEs, compare [64]. This is another motivation to use an index reducing stabilization technique as in Sect. 2.2.

The famous code DASSL, [29, 109], is based on BDF methods, but adds several features like an automatic step-size selection strategy, a variable order selection strategy, a root finding strategy, and a parametric sensitivity module to the basic BDF method. Moreover, the re-use of Jacobians for one or more integration steps and numerically efficient divided difference schemes for the calculation of the interpolating polynomial P increase the efficiency of the code. The code ODASSL by Führer [56] and Führer and Leimkuhler [57] extends DASSL to overdetermined DAEs, which occur, e.g., for the GGL stabilization in Sect. 2.2. In these codes, the error tolerances for the algebraic variables of higher index DAEs have to be scaled by powers of 1∕h compared to those of the differential states since otherwise the automatic step-size selection algorithm breaks down frequently, compare [110]. An enhanced version of DASSL is available in the package SUNDIALS, [80], which provides several methods (Runge–Kutta, Adams, BDF) for ODEs and DAEs in one software package.

4.2 Runge–Kutta Methods

A Runge–Kutta method with \(s \in \mathbb{N}\) stages for (1.1) is a one-step method of type

$$\displaystyle{ z_{h}(t_{i+1}) = z_{h}(t_{i}) + h_{i}\varPhi (t_{i},z_{h}(t_{i}),h_{i}) }$$
(4.6)

with the increment function

$$\displaystyle{ \varPhi (t,z,h):=\sum _{ j=1}^{s}b_{ j}k_{j}(t,z,h) }$$
(4.7)

and the stage derivatives \(k_{j}(t,z,h)\), \(j = 1,\ldots,s\). The stage derivatives \(k_{j}\) are implicitly defined by the system of n ⋅ s nonlinear equations

$$\displaystyle\begin{array}{rcl} F\left (t_{i} + c_{1}h_{i},z_{i+1}^{(1)},k_{ 1}\right )& =& 0,{}\end{array}$$
(4.8)
$$\displaystyle\begin{array}{rcl} & \vdots & \\ F\left (t_{i} + c_{s}h_{i},z_{i+1}^{(s)},k_{ s}\right )& =& 0,{}\end{array}$$
(4.9)

where

$$\displaystyle{ z_{i+1}^{(\ell)}:= z_{ h}(t_{i}) + h_{i}\sum \limits _{j=1}^{s}a_{\ell j}\,k_{j},\qquad \ell = 1,\ldots,s, }$$
(4.10)

are approximations of z at the intermediate time points \(t_{i} + c_{\ell}h_{i}\), \(\ell = 1,\ldots,s\). The coefficients in the Runge–Kutta method are collected in the Butcher array

$$\displaystyle{\begin{array}{c | c c c c } c_{1} & a_{11} & a_{12} & \cdots &a_{1s} \\ c_{2} & a_{21} & a_{22} & \cdots &a_{2s}\\ \vdots&\vdots&\vdots&\ddots&\vdots \\ c_{s} &a_{s1} & a_{s2} & \cdots &a_{ss} \\ \hline & b_{1} & b_{2} & \cdots & b_{s}\end{array} }$$

Commonly used Runge–Kutta methods for DAEs are the RADAU IIA methods and the Lobatto IIIA and IIIC methods. These methods are stiffly accurate, i.e., they satisfy \(c_{s} = 1\) and \(a_{sj} = b_{j}\) for \(j = 1,\ldots,s\). This is a very desirable property for DAEs since it implies that (4.9) and \(z_{i+1}^{(s)} = z_{h}(t_{i+1})\) hold at \(t_{i+1} = t_{i} + c_{s}h_{i}\). Runge–Kutta methods that are not stiffly accurate can be used as well. However, for those it has to be enforced that the approximation \(z_{h}(t_{i+1})\) satisfies the algebraic constraints of the DAE at \(t_{i+1}\). This can be achieved by projecting the output of the Runge–Kutta method onto the algebraic constraints, compare [18].

Example 4.2 (RADAU IIA)

The RADAU IIA methods with s = 1, 2, 3 are defined by the following Butcher arrays, compare [134, Beispiel 6.1.5]:

$$\displaystyle{\begin{array}{c | c } 1 &1\\ \hline &1\end{array} \qquad \begin{array}{c | c c } 1/3 &5/12& - 1/12 \\ 1 & 3/4 & 1/4 \\ \hline & 3/4 & 1/4\end{array} \qquad \begin{array}{c | c c c } \frac{4-\sqrt{6}} {10} & \frac{88-7\sqrt{6}} {360} & \frac{296-169\sqrt{6}} {1800} & \frac{-2+3\sqrt{6}} {225} \\ \frac{4+\sqrt{6}} {10} & \frac{296+169\sqrt{6}} {1800} & \frac{88+7\sqrt{6}} {360} & \frac{-2-3\sqrt{6}} {225} \\ 1 & \frac{16-\sqrt{6}} {36} & \frac{16+\sqrt{6}} {36} & \frac{1} {9} \\ \hline & \frac{16-\sqrt{6}} {36} & \frac{16+\sqrt{6}} {36} & \frac{1} {9}\end{array} }$$

The maximal attainable order of convergence is 2s − 1.

Example 4.3 (Lobatto IIIA and Lobatto IIIC)

The Lobatto IIIA methods with s = 2, 3 are defined by the following Butcher arrays, compare [134, Beispiel 6.1.6]:

$$\displaystyle{\begin{array}{c | c c } 0 & 0 & 0 \\ 1 &1/2&1/2 \\ \hline &1/2&1/2\end{array} \qquad \begin{array}{c | c c c } 0 & 0 & 0 \\ 1/2 &5/24&1/3& - 1/24 \\ 1 & 1/6 &2/3& 1/6 \\ \hline & 1/6 &2/3& 1/6\end{array} }$$

The Lobatto IIIC methods with s = 2, 3 are defined by the following Butcher arrays, compare [134, Beispiel 6.1.8]:

$$\displaystyle{\begin{array}{c | c c } 0 &1/2& - 1/2 \\ 1 &1/2& 1/2 \\ \hline &1/2& 1/2\end{array} \qquad \begin{array}{c | c c c } 0 &1/6& - 1/3& 1/6 \\ 1/2 &1/6& 5/12 & - 1/12 \\ 1 &1/6& 2/3 & 1/6 \\ \hline &1/6& 2/3 & 1/6\end{array} }$$

The maximal attainable order of convergence is 2s − 2. A combined method of Lobatto IIIA and IIIC methods for mechanical multibody systems can be found in [124].

The main effort per integration step is to solve the system of nonlinear equations (4.8)–(4.9) for the unknown vector of stage derivatives \(k = (k_{1},\ldots,k_{s})\) by Newton’s method or by a simplified version of it, where the Jacobian matrix is kept constant for a couple of iterations or integration steps. Another way to reduce the computational effort is to consider ROW methods or half-explicit Runge–Kutta methods as in Sects. 4.3 and 4.4.

Convergence results and order conditions for Runge–Kutta methods applied to DAEs can be found in, e.g., [28, 75, 84, 85].
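
For illustration, the following sketch carries out steps of the 2-stage RADAU IIA method from Example 4.2 for a semi-explicit index-one DAE \(x^{{\prime}} = f(x,y)\), \(0 = g(x,y)\), i.e., it solves the coupled stage equations (4.8)–(4.10) for the stage derivatives. A generic nonlinear solver (SciPy's fsolve) is used here instead of the (simplified) Newton method discussed above; the functions f and g and all numerical data are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import fsolve

# Butcher array of the 2-stage RADAU IIA method (order 3, stiffly accurate)
A = np.array([[5/12, -1/12],
              [ 3/4,   1/4]])
b = np.array([3/4, 1/4])

def radau2_step(f, g, x, y, h):
    """One RADAU IIA (s = 2) step for x' = f(x, y), 0 = g(x, y)."""
    nx, ny = x.size, y.size
    s, n = 2, nx + ny

    def stage_equations(K):
        K = K.reshape(s, n)                   # rows: stage derivatives k_l
        res = np.empty((s, n))
        for l in range(s):
            Z = np.concatenate([x, y]) + h * A[l] @ K   # stage value z_{i+1}^{(l)}
            X, Y = Z[:nx], Z[nx:]
            res[l, :nx] = K[l, :nx] - f(X, Y)           # differential part of F = 0
            res[l, nx:] = g(X, Y)                       # algebraic part of F = 0
        return res.ravel()

    K = fsolve(stage_equations, np.zeros(s * n)).reshape(s, n)
    z_new = np.concatenate([x, y]) + h * b @ K          # equals the last stage (stiff accuracy)
    return z_new[:nx], z_new[nx:]

# illustrative index-one DAE:  x' = -x + y,  0 = y - x**3
f = lambda X, Y: -X + Y
g = lambda X, Y: Y - X**3
x, y = np.array([0.5]), np.array([0.125])               # consistent at t = 0
for _ in range(100):
    x, y = radau2_step(f, g, x, y, 0.01)
print(x, y)
```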

4.3 Rosenbrock-Wanner (ROW) Methods

In this section, we introduce and discuss the so-called Rosenbrock-Wanner (ROW) methods for DAEs, cf. [121], where H.H. Rosenbrock introduced this method class. ROW methods are one-step methods, which are based on implicit Runge–Kutta methods. In the literature, these methods are also called Rosenbrock methods, linearly-implicit or semi-implicit Runge–Kutta methods, cf. [73].

The motivation to introduce an additional class of integration methods is to avoid solving a fully nonlinear system of dimension n ⋅ s and to solve only linear systems instead. Thus, the key idea for the derivation of Rosenbrock-Wanner methods is to perform one Newton step to solve Eqs. (4.8)–(4.9) for a Runge–Kutta method with \(a_{ij} = 0\) for i < j (diagonally implicit RK method, cf. [73]). We rewrite these equations for such a method and an autonomous implicit DAE,

$$\displaystyle{ F(z,z^{{\prime}}) = 0, }$$
(4.11)

as follows:

$$\displaystyle{ F\left (z_{h}(t_{i}) + h_{i}\left (\sum \limits _{j=1}^{\ell-1}a_{\ell j}\,k_{j} + a_{\ell\ell}\,k_{\ell}\right ),k_{\ell}\right ) = 0,\quad \ell = 1,\ldots,s. }$$
(4.12)

Due to the fact that we consider a diagonally implicit RK method, the above equations are decoupled and can be solved successively. Then, performing one Newton step with starting value \(k_{\ell}^{(0)}\) leads to

$$\displaystyle{ \left (F_{z}^{{\prime}}\left (z_{ i+1}^{(\ell-1)},k_{\ell}^{(0)}\right )h_{ i} \cdot a_{\ell\ell} + F_{z^{{\prime}}}^{{\prime}}\left (z_{ i+1}^{(\ell-1)},k_{\ell}^{(0)}\right )\right )\left (k_{\ell} - k_{\ell}^{(0)}\right ) = -F\left (z_{ i+1}^{(\ell-1)},k_{\ell}^{(0)}\right ), }$$
(4.13)

for \(\ell = 1,\ldots,s\).

We come to the general class of Rosenbrock-Wanner methods by proceeding with the following steps. First, we take as starting value \(k_{\ell}^{(0)} = 0\) for \(\ell = 1,\ldots,s\). Then, the Jacobians are evaluated at the fixed point \(z_{h}(t_{i})\) instead of \(z_{i+1}^{(\ell-1)}\), which saves computational costs substantially. Moreover, linear combinations of the previous stages \(k_{j}\), \(j = 1,\ldots,\ell\), are introduced. And last but not least, the method is extended to general non-autonomous implicit DAEs as in Eq. (1.1). We obtain the following class of Rosenbrock methods

$$\displaystyle\begin{array}{rcl} F\left (t_{i} + c_{\ell}h_{i},z_{i+1}^{(\ell-1)},0\right ) + h_{ i}J_{z}\sum _{j=1}^{\ell}\gamma _{ \ell j}k_{j} + J_{z^{{\prime}}}k_{\ell} +\gamma _{\ell}J_{t} = 0,\quad \ell = 1,\ldots,s,\qquad \quad & &{}\end{array}$$
(4.14)

with

$$\displaystyle\begin{array}{rcl} J_{z}&:=& F_{z}(t_{i},z_{h}(t_{i}),0),{}\end{array}$$
(4.15)
$$\displaystyle\begin{array}{rcl} J_{z^{{\prime}}}&:=& F_{z^{{\prime}}}(t_{i},z_{h}(t_{i}),0),{}\end{array}$$
(4.16)
$$\displaystyle\begin{array}{rcl} J_{t}&:=& F_{t}(t_{i},z_{h}(t_{i}),0).{}\end{array}$$
(4.17)

The solution at the next time point t i+1 is computed exactly as in the case of Runge–Kutta methods:

$$\displaystyle{ z_{h}(t_{i+1}) = z_{h}(t_{i}) + h_{i}\varPhi (t_{i},z_{h}(t_{i}),h_{i}),\qquad \varPhi (t,z,h):=\sum _{ j=1}^{s}b_{ j}k_{j}(t,z,h), }$$
(4.18)

with the stage derivatives \(k_{j}(t,z,h)\) defined by the linear system (4.14). An example of a ROW method is the linearly implicit Euler method (s = 1), for which the stage derivative is defined as follows

$$\displaystyle{ F\left (t_{i},z_{i},0\right ) + h_{i}J_{z}k_{1} + J_{z^{{\prime}}}k_{1} = 0. }$$
(4.19)
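
A minimal sketch of the linearly implicit Euler method (4.19): only one linear system is solved per step, with the Jacobians \(J_{z}\) and \(J_{z^{{\prime}}}\) supplied by the user. The example DAE and the step-size are illustrative assumptions.

```python
import numpy as np

def lin_implicit_euler_step(F, Jz, Jzp, t, z, h):
    """One linearly implicit Euler (ROW, s = 1) step for F(t, z, z') = 0,
    cf. (4.19):  (h*Jz + Jzp) k1 = -F(t, z, 0),  z_new = z + h*k1."""
    k1 = np.linalg.solve(h * Jz(t, z) + Jzp(t, z), -F(t, z, np.zeros_like(z)))
    return z + h * k1

# illustrative semi-explicit index-one DAE:  x' = -x + y,  0 = y - x**2
def F(t, z, zp):
    x, y = z
    return np.array([zp[0] - (-x + y), y - x**2])

def Jz(t, z):      # partial derivative of F with respect to z
    x, y = z
    return np.array([[1.0, -1.0], [-2.0 * x, 1.0]])

def Jzp(t, z):     # partial derivative of F with respect to z'
    return np.array([[1.0, 0.0], [0.0, 0.0]])

z, h = np.array([0.5, 0.25]), 0.01     # consistent initial value (y = x**2)
for i in range(100):
    z = lin_implicit_euler_step(F, Jz, Jzp, i * h, z, h)
print(z)
```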

For semi-explicit DAEs of the form (2.5)–(2.6), a ROW method as defined above reads as

$$\displaystyle\begin{array}{rcl} z_{h}^{x}(t_{ i+1})& =& z_{h}^{x}(t_{ i}) +\sum _{ j=1}^{s}b_{ j}k_{j}^{x}(t_{ i},z_{h}^{x}(t_{ i}),z_{h}^{y}(t_{ i}),h_{i}){}\end{array}$$
(4.20)
$$\displaystyle\begin{array}{rcl} z_{h}^{y}(t_{ i+1})& =& z_{h}^{y}(t_{ i}) +\sum _{ j=1}^{s}b_{ j}k_{j}^{y}(t_{ i},z_{h}^{x}(t_{ i}),z_{h}^{y}(t_{ i}),h_{i}),{}\end{array}$$
(4.21)

with

$$\displaystyle{ \begin{array}{ll} \left (\begin{array}{*{10}c} f\left (t_{i} + c_{\ell}h_{i},z_{i+1}^{x,(\ell-1)},z_{i+1}^{y,(\ell-1)}\right ) \\ g\left (t_{i} + c_{\ell}h_{i},z_{i+1}^{x,(\ell-1)},z_{i+1}^{y,(\ell-1)}\right ) \end{array} \right )& + h_{i} \cdot \left (\begin{array}{*{10}c} f_{z^{x}}&f_{z^{y}} \\ g_{z^{x}}&g_{z^{y}} \end{array} \right )\sum _{j=1}^{\ell}\gamma _{\ell j}\left (\begin{array}{*{10}c} k_{j}^{x} \\ k_{j}^{y} \end{array} \right ) \\ & + \left (\begin{array}{*{10}c} -k_{\ell}^{x} \\ 0\end{array} \right ) +\gamma _{\ell}\left (\begin{array}{*{10}c} f_{t} \\ g_{t} \end{array} \right ) = 0, \end{array} }$$
(4.22)

for \(\ell = 1,\ldots,s\). Herein, we have set \(z = ((z^{x})^{\top },(z^{y})^{\top })^{\top } = (x^{\top },y^{\top })^{\top }\) and

$$\displaystyle{ F(t,z,z^{{\prime}}) = \left (\begin{array}{*{10}c} f(t,x,y) - x^{{\prime}} \\ g(t,x,y) \end{array} \right ). }$$
(4.23)

Up to now, we have considered ROW methods with exact Jacobian matrices \(J_{z} = F_{z}\), \(J_{z^{{\prime}}} = F_{z^{{\prime}}}\). There is an additional class of integration methods, which uses arbitrary matrices for \(J_{z}\) (‘inexact Jacobians’); such methods are called W-methods, see [73, 146, 147].

We further remark that related integration methods can be derived if other starting values are used for the stage derivatives, instead of \(k_{\ell}^{(0)} = 0\) as is done to derive Eq. (4.14), cf. [67, 68]; the methods derived there as well as ROW and W-methods can be seen to belong to the common class of linearized implicit Runge–Kutta methods.

An introduction and more detailed discussion of ROW methods can be found in [73]; convergence results for general one-step methods (including ROW methods) applied to DAEs are available in [41]. Moreover, ROW methods for index-one DAEs in semi-explicit form are studied in [53, 117, 120, 135]; index-one problems and singularly perturbed problems are discussed in [23, 24, 74]. Analysis results and specific methods for the equations of motion of mechanical multibody systems, i.e., index-three DAEs in semi-explicit form are derived in [145, 146]; compare also the results in [14, 106, 123].

4.4 Half-Explicit Methods

In this section, we briefly discuss the so-called half-explicit Runge–Kutta methods, here for autonomous index-two DAEs in semi-explicit form. That is, we consider DAE systems of the form

$$\displaystyle\begin{array}{rcl} x^{{\prime}}(t)& =& f(x(t),y(t)),{}\end{array}$$
(4.24)
$$\displaystyle\begin{array}{rcl} 0& =& g(x(t)),{}\end{array}$$
(4.25)

with initial values \((x_{0},y_{0})\) that are assumed to be consistent. To derive the class of half-explicit Runge–Kutta methods, it is more convenient to use stages rather than the stage derivatives \(k_{\ell}\) as before. In particular, for the semi-explicit DAE (4.24), (4.25), we define stages for the differential and the algebraic variables as

$$\displaystyle{ X_{i\ell}:= x_{h}(t_{i}) + h_{i}\sum _{j=1}^{s}a_{\ell j}k_{j}^{x},\quad Y _{ i\ell}:= y_{h}(t_{i}) + h_{i}\sum _{j=1}^{s}a_{\ell j}k_{j}^{y},\quad \ell = 1,\ldots,s. }$$
(4.26)

Then, it holds

$$\displaystyle{ \begin{array}{lll} X_{i\ell}& =&x_{h}(t_{i}) + h_{i}\sum _{j=1}^{s}a_{\ell j}k_{j}^{x} \\ & =&x_{h}(t_{i}) + h_{i}\sum _{j=1}^{s}a_{\ell j}f\left (x_{h}(t_{i}) + h_{i}\sum _{ m=1}^{s}a_{jm}k_{m}^{x},\ y_{h}(t_{i}) + h_{i}\sum _{ m=1}^{s}a_{jm}k_{m}^{y}\right ) \\ & =&x_{h}(t_{i}) + h_{i}\sum _{j=1}^{s}a_{\ell j}f(X_{ij},Y _{ij}). \end{array} }$$
(4.27)

Using this notation and the coefficients of an explicit Runge–Kutta scheme, half-explicit Runge–Kutta methods, as first introduced in [75], are defined as follows

$$\displaystyle\begin{array}{rcl} X_{i\ell}& =& x_{h}(t_{i}) + h_{i}\sum _{j=1}^{\ell-1}a_{\ell j}f(X_{ij},Y _{ij}),\qquad \ell = 1,\ldots,s,{}\end{array}$$
(4.28)
$$\displaystyle\begin{array}{rcl} 0& =& g(X_{i\ell}),{}\end{array}$$
(4.29)
$$\displaystyle\begin{array}{rcl} x_{h}(t_{i+1})& =& x_{h}(t_{i}) + h_{i}\sum _{\ell=1}^{s}b_{\ell}f(X_{ i\ell},Y _{i\ell}),{}\end{array}$$
(4.30)
$$\displaystyle\begin{array}{rcl} 0& =& g(x_{h}(t_{i+1})).{}\end{array}$$
(4.31)

The algorithmic procedure is as follows: We start with \(X_{i1} = x_{h}(t_{i})\), assumed to be consistent. Then, taking Eq. (4.28) for \(X_{i2}\) and inserting it into Eq. (4.29) leads to

$$\displaystyle{ 0 = g(X_{i2}) = g\left (x_{h}(t_{i}) + a_{21}h_{i}f(X_{i1},Y _{i1})\right ), }$$
(4.32)

this is a nonlinear equation that can be solved for \(Y _{i1}\). Next, we calculate \(X_{i2}\) from Eq. (4.28) and, accordingly, \(Y _{i2}\), etc. For methods with \(c_{s} = 1\), one obtains an approximation for the algebraic variable at the next time point by \(y_{h}(t_{i+1}) = Y _{is}\). The key idea behind this kind of integration scheme is to apply an explicit Runge–Kutta scheme for the differential variable and to solve for the algebraic variable implicitly.
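
The following sketch implements the simplest member of this class, a half-explicit Euler step for \(x^{{\prime}} = f(x,y)\), \(0 = g(x)\): the constraint at the new point is solved for the algebraic variable, and the differential variable is then advanced explicitly. The pendulum example on velocity level, the use of SciPy's fsolve, and the step-size are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import fsolve

def half_explicit_euler_step(f, g, x, h):
    """One half-explicit Euler step for x' = f(x, y), 0 = g(x) (index two):
    solve g(x + h*f(x, y)) = 0 for y, then advance x explicitly."""
    y = fsolve(lambda y: g(x + h * f(x, y)), np.zeros(1))
    return x + h * f(x, y), y

# illustrative example: planar pendulum on velocity level (index two),
# x = (p1, p2, v1, v2), algebraic variable y (scaled constraint force)
grav = 9.81
def f(x, y):
    p, v = x[:2], x[2:]
    a = np.array([0.0, -grav]) - y[0] * p   # acceleration = gravity - constraint force
    return np.concatenate([v, a])

def g(x):
    p, v = x[:2], x[2:]
    return np.array([p @ v])                # velocity-level constraint

x, h = np.array([1.0, 0.0, 0.0, 0.0]), 1e-3   # consistent: |p| = 1 and p.v = 0
for _ in range(1000):
    x, lam = half_explicit_euler_step(f, g, x, h)
print(x, lam)   # note: the position-level constraint |p| = 1 is not enforced and may drift
```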

Convergence studies for this method class applied to index-two DAEs can be found in [26, 75]. In [7, 12, 105] the authors introduce a slight modification of the above stated scheme, which improves the method class concerning order conditions and computational efficiency. To be more precise, partitioned half-explicit Runge–Kutta methods for index-two DAEs in semi-explicit form are defined in the following way:

$$\displaystyle{ \begin{array}{rll} X_{i1} & =&x_{h}(t_{i}),\quad Y _{i1} = y_{h}(t_{i}), \\ X_{i\ell}& =&x_{h}(t_{i}) + h_{i}\sum _{j=1}^{\ell-1}a_{\ell j}f(X_{ij},Y _{ij}), \\ \bar{X}_{i\ell}& =&x_{h}(t_{i}) + h_{i}\sum _{j=1}^{\ell}\bar{a}_{\ell j}f(X_{ij},Y _{ij}), \\ 0& =&g(\bar{X}_{i\ell}), \\ \ell& =&2,\ldots s + 1, \\ x_{h}(t_{i+1})& =&X_{i,s+1},\qquad y_{h}(t_{i+1}) = Y _{i,s+1}.\end{array} }$$
(4.33)

Results concerning the application of half-explicit methods to index-one DAEs are available in [13]; the application to index-three DAEs is discussed in [107].

4.5 Examples

Some illustrative examples with DAEs are discussed. Example 4.4 addresses an index-three mechanical multibody system of a car on a bumpy road. A docking maneuver of a satellite to a tumbling target is investigated in Example 4.5. Herein, the use of quaternions leads to a formulation with an index-one DAE.

Example 4.4

We consider a vehicle simulation for the ILTIS on a bumpy road section. A detailed description of the mechanical multibody system is provided in [128]. The system was modeled by SIMPACK [81] and the simulation results were obtained using the code export feature of SIMPACK and the BDF method DASSL [29]. The mechanical multibody system consists of 11 rigid bodies with a total of 25 degrees of freedom (DOF) (chassis with 6 DOF, wheel suspension with 4 DOF in total, wheels with 12 DOF in total, steering rod with 1 DOF, camera with 2 DOF). The motion is restricted by 9 algebraic constraints. Figure 2 illustrates the test track with bumps and the resulting pitch and roll angles, and the vertical excitation of the chassis. The integration tolerance within DASSL is set to \(10^{-4}\) for the differential states and to \(10^{8}\) for the algebraic states (i.e., no error control was performed for the algebraic states).

Fig. 2 Simulation results of the ILTIS on a bumpy road: roll angle, pitch angle, vertical excitation of chassis

Example 4.5

We consider a docking maneuver of a service satellite (S) to a tumbling object (T) on an orbit around the earth, compare [103]. Both objects are able to rotate freely in space and quaternions are used to parametrize their orientation. Note that, in contrast to Euler angles, quaternions lead to a continuous parametrization of the orientation without singularities.

The relative dynamics of S and T are approximately given by the Clohessy–Wiltshire equations

$$\displaystyle\begin{array}{rcl} x^{{\prime\prime}}(t)& =& 2ny^{{\prime}}(t) - 3n^{2}x(t) + a_{ x}(t), {}\\ y^{{\prime\prime}}(t)& =& -2nx^{{\prime}}(t) + a_{ y}(t), {}\\ z^{{\prime\prime}}(t)& =& -n^{2}z(t) + a_{ z}(t), {}\\ \end{array}$$

where (x, y, z) is the relative position of S and T, \(a = (a_{x},a_{y},a_{z})\) is a given control input (thrust) to S, \(n = \sqrt{\mu /a_{t }^{3}}\), \(a_{t}\) is the semi-major axis of the orbit (assumed to be circular), and μ is the gravitational constant.

The direction cosine matrix using quaternions \(q = (q_{1},q_{2},q_{3},q_{4})\) is defined by

$$\displaystyle{ R(q)^{\top } = \left (\begin{array}{ccc} q_{1}^{2} - q_{2}^{2} - q_{3}^{2} + q_{4}^{2} & 2\left (q_{1}q_{2} + q_{3}q_{4}\right ) & 2\left (q_{1}q_{3} - q_{2}q_{4}\right ) \\ 2\left (q_{1}q_{2} - q_{3}q_{4}\right ) & - q_{1}^{2} + q_{2}^{2} - q_{3}^{2} + q_{4}^{2} & 2\left (q_{2}q_{3} + q_{1}q_{4}\right ) \\ 2\left (q_{1}q_{3} + q_{2}q_{4}\right ) & 2\left (q_{2}q_{3} - q_{1}q_{4}\right ) & - q_{1}^{2} - q_{2}^{2} + q_{3}^{2} + q_{4}^{2} \end{array} \right ). }$$

The matrix R(q) represents the rotation matrix from the rotated to the non-rotated state. The orientation of S and T with respect to an unrotated reference coordinate system is described by quaternions \(q^{S} = (q_{1}^{S},q_{2}^{S},q_{3}^{S},q_{4}^{S})\) for S and \(q^{T} = (q_{1}^{T},q_{2}^{T},q_{3}^{T},q_{4}^{T})\) for T. With the angular velocities \(\omega ^{S} = (\omega _{1}^{S},\omega _{2}^{S},\omega _{3}^{S})\) and \(\omega ^{T} = (\omega _{1}^{T},\omega _{2}^{T},\omega _{3}^{T})\) the quaternions obey the differential equations

$$\displaystyle{ (q^{\alpha })^{{\prime}}(t) = \frac{1} {2}\left (\begin{array}{c} \omega ^{\alpha }(t) \\ 0 \end{array} \right )\otimes q^{\alpha }(t),\qquad \alpha \in \{ S,T\}, }$$
(4.34)

where the operator ⊗ is defined by

$$\displaystyle{\left (\begin{array}{c} \omega \\ 0 \end{array} \right )\otimes q = \left (\begin{array}{cccc} 0 & \omega _{3} & -\omega _{2} & \omega _{1} \\ -\omega _{3} & 0 & \omega _{1} & \omega _{2} \\ \omega _{2} & -\omega _{1} & 0 & \omega _{3} \\ -\omega _{1} & -\omega _{2} & -\omega _{3} & 0 \end{array} \right )\left (\begin{array}{c} q_{1} \\ q_{2} \\ q_{3} \\ q_{4}\end{array} \right ).}$$

Assuming a constant mass distribution and body-fixed coordinate systems that coincide with the principal axes, S and T obey the gyroscopic equations

$$\displaystyle\begin{array}{rcl} (\omega _{1}^{S})^{{\prime}}(t)& =& \frac{1} {J_{11}^{S}}\left (\omega _{2}^{S}(t)\omega _{ 3}^{S}(t)\left (J_{ 22}^{S} - J_{ 33}^{S}\right ) + u_{ 1}(t)\right ), {}\\ (\omega _{2}^{S})^{{\prime}}(t)& =& \frac{1} {J_{22}^{S}}\left (\omega _{1}^{S}(t)\omega _{ 3}^{S}(t)\left (J_{ 33}^{S} - J_{ 11}^{S}\right ) + u_{ 2}(t)\right ), {}\\ (\omega _{3}^{S})^{{\prime}}(t)& =& \frac{1} {J_{33}^{S}}\left (\omega _{2}^{S}(t)\omega _{ 1}^{S}(t)\left (J_{ 11}^{S} - J_{ 22}^{S}\right ) + u_{ 3}(t)\right ), {}\\ (\omega _{1}^{T})^{{\prime}}(t)& =& \frac{1} {J_{11}^{T}}\left (\omega _{2}^{T}(t)\omega _{ 3}^{T}(t)\left (J_{ 22}^{T} - J_{ 33}^{T}\right )\right ), {}\\ (\omega _{2}^{T})^{{\prime}}(t)& =& \frac{1} {J_{22}^{T}}\left (\omega _{1}^{T}(t)\omega _{ 3}^{T}(t)\left (J_{ 33}^{T} - J_{ 11}^{T}\right )\right ), {}\\ (\omega _{3}^{T})^{{\prime}}(t)& =& \frac{1} {J_{33}^{T}}\left (\omega _{2}^{T}(t)\omega _{ 1}^{T}(t)\left (J_{ 11}^{T} - J_{ 22}^{T}\right )\right ). {}\\ \end{array}$$

Herein, \(u = (u_{1},u_{2},u_{3})\) denotes a time-dependent torque input to S.

The quaternions are normalized to one by the algebraic constraints

$$\displaystyle{0 = (q_{1}^{\alpha })^{2} + (q_{ 2}^{\alpha })^{2} + (q_{ 3}^{\alpha })^{2} + (q_{ 4}^{\alpha })^{2} - 1,\qquad \alpha \in \{ S,T\},}$$

which has to be obeyed since otherwise a drift-off would occur owing to numerical discretization errors. In order to incorporate these algebraic constraints, we treat \((q_{4}^{S},q_{4}^{T})\) as algebraic variables and drop the differential equations for \(q_{4}^{S}\) and \(q_{4}^{T}\) in (4.34). In summary, we obtain an index-one DAE with differential state \((x,y,z,x^{{\prime}},y^{{\prime}},z^{{\prime}},\omega _{1}^{S},\omega _{2}^{S},\omega _{3}^{S},q_{1}^{S},q_{2}^{S},q_{3}^{S},\omega _{1}^{T},\omega _{2}^{T},\omega _{3}^{T},q_{1}^{T},q_{2}^{T},q_{3}^{T})^{\top }\in \mathbb{R}^{18}\), algebraic state \((q_{4}^{S},q_{4}^{T})^{\top }\in \mathbb{R}^{2}\), and time-dependent control input \((a,u)^{\top }\in \mathbb{R}^{6}\) for S.
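
For illustration, the following sketch encodes the quaternion kinematics (4.34) together with the unit-norm constraint as one building block of the resulting index-one DAE residual. The function names and the chosen splitting into differential and algebraic parts follow the description above but are otherwise illustrative assumptions.

```python
import numpy as np

def omega_otimes_q(omega, q):
    """Matrix-vector product realizing the operator (omega, 0) ⊗ q defined above."""
    w1, w2, w3 = omega
    W = np.array([[ 0.0,  w3, -w2,  w1],
                  [-w3,  0.0,  w1,  w2],
                  [ w2, -w1,  0.0,  w3],
                  [-w1, -w2, -w3,  0.0]])
    return W @ q

def quaternion_residual(q, q123_dot, omega):
    """Residual of the quaternion part of the index-one DAE: the first three
    components of (4.34) are kept as differential equations, while the unit-norm
    constraint replaces the differential equation for q4."""
    rhs = 0.5 * omega_otimes_q(omega, q)
    res_diff = q123_dot - rhs[:3]           # differential equations for q1, q2, q3
    res_alg = np.array([q @ q - 1.0])       # algebraic constraint (unit norm)
    return np.concatenate([res_diff, res_alg])

# quick check at the (rounded) initial values of the tumbling target T;
# the small norm defect reflects the rounding of the printed data
qT = np.array([-0.05, 0.0, 0.0, 0.99875])
omegaT = np.array([0.0, 0.0349, 0.017453])
print(quaternion_residual(qT, 0.5 * omega_otimes_q(omegaT, qT)[:3], omegaT))
```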

Figure 3 shows some snapshots of a docking maneuver on the time interval [0, 667] with initial states

$$\displaystyle\begin{array}{rcl} q^{S}(0)& =& (0,0,0,1)^{\top },\qquad q^{T}(0) = (-0.05,0,0,0.99875)^{\top }, {}\\ \omega ^{S}(0)& =& (0,0,0)^{\top },\qquad \omega ^{T}(0) = (0,0.0349,0.017453)^{\top }, {}\\ (x(0),y(0),z(0))^{\top }& =& (0,-100,0)^{\top },\qquad (x^{{\prime}}(0),y^{{\prime}}(0),z^{{\prime}}(0))^{\top } = (0,0,0)^{\top }, {}\\ \end{array}$$

and parameters \(a_{t} = 7071000\), \(\mu = 398 \cdot 10^{12}\), \(J_{11}^{\alpha } = 1000\), \(J_{22}^{\alpha } = 2000\), \(J_{33}^{\alpha } = 1000\), α ∈ { S, T}. The integration tolerance within DASSL is set to \(10^{-10}\) for the differential states and to \(10^{-4}\) for the algebraic states. Figure 4 depicts the control inputs \(m \cdot a = m \cdot (a_{x},a_{y},a_{z})\) with satellite mass m = 100 and \(u = (u_{1},u_{2},u_{3})\). Finally, Fig. 5 shows the angular velocities \(\omega ^{S}\) and \(\omega ^{T}\).

Fig. 3 Snapshots for the docking maneuver

Fig. 4 Control input for the docking maneuver: \(ma_{x}\), \(ma_{y}\), \(ma_{z}\) with m = 100 (top from left to right), \(u_{1}\), \(u_{2}\), \(u_{3}\) (bottom from left to right)

Fig. 5 Angular velocities of the service satellite S and the tumbling target T: \(\omega _{1}^{S}\), \(\omega _{2}^{S}\), \(\omega _{3}^{S}\) (top from left to right), \(\omega _{1}^{T}\), \(\omega _{2}^{T}\), \(\omega _{3}^{T}\) (bottom from left to right)

5 Co-simulation

In numerical system simulation, it is an essential task to simulate the dynamic interaction of different subsystems, possibly from different physical domains, modeled with different approaches, and to be solved with different numerical solvers (multiphysical system models). In vehicle engineering especially, this becomes more and more important, because in a mathematical model of a modern passenger car or commercial vehicle, mechanical subsystems have to be coupled with flexible components, hydraulic subsystems, electronic and electric devices, and other control units. The mathematical models for all these subsystems are often given as DAEs, but, typically, they differ substantially in their complexities, time constants, and scales; hence, it is not advisable to combine all model equations into one entire DAE and to solve it numerically with one integration scheme. In contrast, modern co-simulation strategies aim at using a specific numerical solver, i.e., DAE integration method, for each subsystem and at exchanging only a limited number of coupling quantities at certain communication time points. Thus, it is important to analyze the behavior of such coupled simulation strategies (‘co-simulation’), where the coupled subsystems are mathematically described as DAEs.

In addition, the coupling itself may be described by an algebraic constraint equation; that is, DAE-related aspects and properties also arise here. Typical examples for such situations are network modeling approaches in general and, in particular, the modeling of coupled electric circuits and coupled substructures of mechanical multibody systems, see [11].

Co-simulation techniques and their theoretical background have been studied for a long time, see, for instance, the survey papers [83, 143]. In recent years, a new interface standard has been developed, the ‘Functional Mock-Up Interface (FMI) for Model-Exchange and Co-Simulation’ (https://www.fmi-standard.org/). This interface is supported by more and more commercial CAE software tools and attracts increasing interest in industry for application projects. Additionally, the development of that standard and its release has also stimulated new research activities concerning co-simulation.

A coupled system of r ≥ 2 fully implicit DAE initial value problems reads as

$$\displaystyle{ F_{i}(t,z_{i}(t),z_{i}^{{\prime}}(t),u_{ i}(t)) = 0,\ t \in [t_{0},t_{f}],\ z_{i}(t_{0}) = z_{i,0},\quad i = 1,\ldots,r }$$
(5.1)

with initial values assumed to be consistent and the (subsystem-) outputs

$$\displaystyle{\xi _{i}(t):=\varXi _{i}(t,z_{i}(t),u_{i}(t)),}$$

and the (subsystem-) inputs u i that are given by coupling conditions

$$\displaystyle{u_{i}(t) = h_{i}(\xi _{1},\ldots,\xi _{r}),\ i = 1,\ldots,r,\mbox{ i.e., }u = h(\xi ),}$$

where we have set \(u:= (u_{1}^{\top },\ldots,u_{r}^{\top })^{\top }\), \(\xi:= (\xi _{1}^{\top },\ldots,\xi _{r}^{\top })^{\top }\) and \(h:= (h_{1}^{\top },\ldots,h_{r}^{\top })^{\top }: \mathbb{R}^{n_{y}} \rightarrow \mathbb{R}^{n_{u}}\), with \(n_{y} = n_{y_{1}} +\ldots +n_{y_{r}}\), \(n_{u} = n_{u_{1}} +\ldots +n_{u_{r}}\). Moreover, we assume here

$$\displaystyle{\dfrac{\partial h_{i}} {\partial \xi _{i}} = 0,}$$

that is, the inputs of system i do not depend on its own output. If the subsystem DAEs are in semi-explicit form, Eq. (5.1) has to be replaced by

$$\displaystyle{\begin{array}{lll} \dot{x}_{i}(t)& =&f_{i}(t,x_{i}(t),y_{i}(t),u_{i}(t)), \\ 0 & =&g_{i}(t,x_{i}(t),y_{i}(t),u_{i}(t)),\end{array} }$$

with \(t \in [t_{0},t_{f}]\) and \((x_{i}(t_{0}),y_{i}(t_{0})) = (x_{i,0},y_{i,0})\) with consistent initial values. This representation is called block-oriented; it describes the subsystems as blocks with inputs and outputs that are coupled.

In principle, it is possible to set up one monolithic system including the coupling conditions and output equations as additional algebraic equations:

$$\displaystyle\begin{array}{rcl} \dot{x}_{i}& =& f_{i}(t,x_{i}(t),y_{i}(t),u_{i}(t)), {}\\ 0& =& g_{i}(t,x_{i}(t),y_{i}(t),u_{i}(t)), {}\\ 0& =& u_{i}(t) - h_{i}(\xi (t)), {}\\ 0& =& \xi _{i}(t) -\varXi _{i}(t,x_{i}(t),u_{i}(t)),\quad i = 1,\ldots,r. {}\\ \end{array}$$

This entire system could be solved with one single integration scheme, which is, however, as indicated above, typically not advisable. In contrast, in co-simulation strategies, also referred to as modular time integration [125] or distributed time integration [11], the subsystem equations are solved separately on consecutive time-windows. Herein, the time integration of each subsystem within one time-window or macro step can be realized with a different step-size adapted to the subsystem (multirate approach), or even with different appropriate integration schemes (multimethod approach). During the integration process of one subsystem, the needed coupling quantities, i.e., inputs from other subsystems, are approximated, usually based on previous results. At the end of each macro step, coupling data is exchanged. To be more precise, for the considered time interval, we introduce a (macro) time grid \(\mathbb{G}:= \{T_{0},\ldots,T_{N}\}\) with \(t_{0} = T_{0} < T_{1} < \ldots < T_{N} = t_{f}\). Then, the mentioned time-windows or macro steps are given by \([T_{n},T_{n+1}]\), \(n = 0,\ldots,N - 1\), and each subsystem is integrated independently from the others in each macro step \(T_{n} \rightarrow T_{n+1}\), only using a typically limited number of coupling quantities as information from the other subsystems. The macro time points \(T_{n}\) are also called communication points, since here, typically, coupling data is exchanged between the subsystems.

5.1 Jacobi, Gauss-Seidel, and Dynamic-Iteration Schemes

An overview of co-simulation schemes and strategies can be found, e.g., in [11, 104, 125]. There are, however, two main approaches to realizing the co-simulation sketched above. The crucial differences are the order in which the subsystems are integrated within the macro steps and, accordingly, how coupling quantities are handled and approximated. The first possible approach is a completely parallel scheme and is called the Jacobi scheme (or co-simulation/coupling of Jacobi type). As the name indicates, the subsystems are integrated here in parallel and, thus, they have to use extrapolated input quantities during the current macro step, cf. Fig. 6. In contrast to this, the second approach is a sequential one; it is called the Gauss-Seidel scheme (or co-simulation/coupling of Gauss-Seidel type). For the special case of two coupled subsystems, r = 2, this looks as follows: one subsystem is integrated first on the current macro step using extrapolated input data, yielding a (numerical) solution for this first system. Then, the second subsystem is integrated on the current macro step, now using the already computed results from the first subsystem for the coupling quantities (since results from the first subsystem for the current macro step are, in fact, available). The results from the first subsystem may be available on a fine micro time grid within the macro step, or even as a function of time, e.g., as dense output from the integration method; additionally, (polynomial) interpolation may also be used, cf. Fig. 6.

Fig. 6 Jacobi (upper diagram) and Gauss-Seidel (lower diagram) co-simulation schemes

The sequential Gauss-Seidel scheme can be generalized straightforwardly to r > 2 coupled subsystems: The procedure is sequential, i.e., the subsystems are numerically integrated one after another, and for the integration of the i-th subsystem results from the subsystems \(1,\ldots,i - 1\) are available for the coupling quantities, whereas data from the chronologically upcoming subsystems \(i + 1,\ldots,r\) have to be extrapolated based on information from previous communication points.

The extrapolation and interpolation, respectively, are realized using data from previous communication points and, typically, polynomial approaches are taken. That is, in the macro step \(T_{n} \rightarrow T_{n+1}\), the input of subsystem i is extrapolated using data from the communication points \(T_{n-k},\ldots,T_{n}\),

$$\displaystyle{\tilde{u}_{i}(t) =\varPsi _{i}(t;u_{i}(T_{n-k}),\ldots,u_{i}(T_{n})) =\sum _{ j=0}^{k}u_{ i}(T_{n-j})\prod _{l=0,l\neq j}^{k} \dfrac{t - T_{n-l}} {T_{n-j} - T_{n-l}},}$$

\(t \in [T_{n},T_{n+1}]\), with the extrapolation polynomial \(\varPsi _{i}\) of degree ≤ k; for interpolation, e.g., for Gauss-Seidel schemes, we have correspondingly

$$\displaystyle{\tilde{u}_{i}(t) =\varPsi _{i}(t;u_{i}(T_{n-k}),\ldots,u_{i}(T_{n+1})) =\sum _{ j=0}^{k+1}u_{ i}(T_{n+1-j})\prod _{l=0,l\neq j}^{k+1} \dfrac{t - T_{n+1-l}} {T_{n+1-j} - T_{n+1-l}}.}$$

The simplest extrapolation is that of zero order, k = 0, leading to ‘frozen’ coupling quantities

$$\displaystyle\begin{array}{rcl} u_{i}(t) = u_{i}(T_{n}),\quad t \in [T_{n},T_{n+1}].& & {}\\ \end{array}$$

A third approach to establish a simulation of coupled systems is given by the so-called dynamic iteration schemes, [11, 20, 21], also referred to as waveform relaxation methods, [82, 94]. Here, the basic idea is to solve the subsystems iteratively on each macro step using coupling data from previous iteration steps, in order to decrease simulation errors. The subsystems can be solved in each iteration step either in a sequential fashion (Gauss-Seidel) or all in parallel (Jacobi or Picard), cf. [11, 20]. The schemes defined above are recovered from a corresponding dynamic iteration scheme by performing exactly one iteration step.
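
The following sketch illustrates a Jacobi-type co-simulation of two coupled (here purely differential) subsystems with zero-order (‘frozen’) extrapolation of the coupling inputs: within each macro step both subsystems are integrated independently with explicit Euler micro steps, and coupling data is exchanged only at the communication points. The linear test system and all step-sizes are illustrative assumptions.

```python
import numpy as np

# two coupled scalar subsystems:  x1' = -x1 + u1 with u1 = x2,
#                                 x2' = -2*x2 + u2 with u2 = x1
f1 = lambda x1, u1: -x1 + u1
f2 = lambda x2, u2: -2.0 * x2 + u2

def jacobi_cosim(x1, x2, t_end, H, h):
    """Jacobi co-simulation with macro step H, micro step h and
    zero-order hold of the inputs within each macro step."""
    n_macro, n_micro = int(round(t_end / H)), int(round(H / h))
    for _ in range(n_macro):
        u1, u2 = x2, x1             # exchange coupling data at the communication point
        for _ in range(n_micro):    # both subsystems are integrated independently (in parallel)
            x1 = x1 + h * f1(x1, u1)
            x2 = x2 + h * f2(x2, u2)
    return x1, x2

x1, x2 = jacobi_cosim(1.0, 0.0, t_end=2.0, H=0.1, h=0.01)
print(x1, x2)    # compare with a monolithic simulation to assess the coupling error
```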

5.2 Stability and Convergence

First of all, we point out that there is a decisive difference between convergence and stability issues for coupled ODEs on the one hand and for coupled DAEs on the other hand. The stability problems that may appear for coupled ODEs with stiff coupling terms resemble the potential problems when applying an explicit integration method to stiff ODEs; thus, these difficulties can be avoided by using sufficiently small macro step-sizes \(H_{n} = T_{n+1} - T_{n}\), cf. [9, 11, 104]. In the DAE case, however, reducing the macro steps does not generally lead to an improvement; here, it is additionally essential that a certain contractivity condition is satisfied, see [9, 11, 21, 125].

5.2.1 The ODE-Case

For problems with coupled ODEs, convergence is studied, e.g., in [8, 10, 16, 17]. For coupled ODE systems that are free of algebraic loops (this is guaranteed, for instance, provided that there is no direct feed-through, i.e., \(\partial \varXi _{i}/\partial u_{i} = 0\), \(i = 1,\ldots,r\); for a precise definition see [10, 16]), we have the following global error estimate for a co-simulation with a Jacobi scheme with constant macro step-size H > 0, assumed to be sufficiently small,

$$\displaystyle{ \varepsilon ^{x} \leq C\left (\sum _{ i=1}^{r}\varepsilon _{ i}^{x} + H^{k+1}\right ), }$$
(5.2)

where k denotes the order of the extrapolation, \(\varepsilon _{i}^{x}\) is the global error in subsystem i, and \(\varepsilon ^{x}\) is the overall global error, cf. [8, 10]. That is, the errors from the subsystems contribute to the global error, as does the error from extrapolation or interpolation, \(\mathcal{O}(H^{k+1})\). These results can be deduced straightforwardly following the classical convergence analysis for ODE time integration schemes.

5.2.2 The DAE-Case

For detailed analysis and both convergence and stability results for coupled DAE systems, we refer the reader to [9, 11, 20, 125] and the literature cited therein. In the sequel we summarize and sketch some aspects from these research papers.

As already said, in the DAE case, the situation becomes more difficult. Following the lines of [20], we consider the following coupled DAE-IVP representation

$$\displaystyle\begin{array}{rcl} x_{i}^{{\prime}}(t)& =& f_{ i}(x(t),y(t)),{}\end{array}$$
(5.3)
$$\displaystyle\begin{array}{rcl} 0& =& g_{i}(x(t),y(t)),\quad i = 1,\ldots,r,{}\end{array}$$
(5.4)

with \(x = (x_{1}^{\top },\ldots,x_{r}^{\top })^{\top }\), \(y = (y_{1}^{\top },\ldots,y_{r}^{\top })^{\top }\) and initial conditions \((x_{i}(t_{0}),y_{i}(t_{0})) = (x_{i,0},y_{i,0})\), \(i = 1,\ldots,r\). For the following considerations, we assume that the IVP(s) possess a unique global solution, that the right-hand side functions \(f_{i}\), \(g_{i}\) are sufficiently often continuously differentiable, and, moreover, that

$$\displaystyle{\dfrac{\partial g_{i}} {\partial y_{i}}}$$

is non-singular for \(i = 1,\ldots,r\) in a neighborhood of a solution (index-one condition for each subsystem). Notice that this representation differs from the previously stated block-oriented form. Equations (5.3)–(5.4) are, however, more convenient for deriving and stating the mentioned stability conditions; the coupling here is realized by the fact that the right-hand side functions \(f_{i}\), \(g_{i}\) of each subsystem may depend on the entire differential and algebraic variables.

As before, we mark with a \(\tilde{\cdot }\) quantities that are only available as extrapolated or interpolated values. Thus, establishing a co-simulation scheme of Jacobi type yields for the i-th subsystem in macro step \(T_{n} \rightarrow T_{n+1}\)

$$\displaystyle{\begin{array}{lll} x_{i,n}^{{\prime}}& =&f_{i}(\tilde{x}_{1,n},\ldots,\tilde{x}_{i-1,n},x_{i,n},\tilde{x}_{i+1,n},\ldots,\tilde{x}_{r,n}, \\ & &\qquad \tilde{y}_{1,n},\ldots,\tilde{y}_{i-1,n},y_{i,n},\tilde{y}_{i+1,n},\ldots,\tilde{y}_{r,n}), \\ 0 & =&g_{i}(\tilde{x}_{1,n},\ldots,\tilde{x}_{i-1,n},x_{i,n},\tilde{x}_{i+1,n},\ldots,\tilde{x}_{r,n}, \\ & &\qquad \tilde{y}_{1,n},\ldots,\tilde{y}_{i-1,n},y_{i,n},\tilde{y}_{i+1,n},\ldots,\tilde{y}_{r,n}).\\ \end{array} }$$

Accordingly, for a Gauss-Seidel-type scheme, we obtain

$$\displaystyle{\begin{array}{lll} x_{i,n}^{{\prime}}& =&f_{i}(x_{1,n},\ldots,x_{i,n},\tilde{x}_{i+1,n},\ldots,\tilde{x}_{r,n}, \\ & &\qquad y_{1,n},\ldots,y_{i,n},\tilde{y}_{i+1,n},\ldots,\tilde{y}_{r,n}), \\ 0 & =&g_{i}(x_{1,n},\ldots,x_{i,n},\tilde{x}_{i+1,n},\ldots,\tilde{x}_{r,n}, \\ & &\qquad y_{1,n},\ldots,y_{i,n},\tilde{y}_{i+1,n},\ldots,\tilde{y}_{r,n}).\\ \end{array} }$$

With \(g = (g_{1},\ldots,g_{r})\), a sufficient (not generally necessary) contractivity condition for stability is derived and proven in [20]. The condition is given by

$$\displaystyle{\alpha:=\Vert g_{y}^{-1}g_{\tilde{ y}}\Vert < 1,}$$

with \(\tilde{y} = (\tilde{y}_{1},\ldots,\tilde{y}_{r})\). For a detailed list of requirements and assumptions to be taken as well as for a proof and consequences, the reader is referred to [20, 21]. For the special case r = 2, the above condition leads to the following for the Jacobi-type scheme:

$$\displaystyle{\alpha = \left \Vert \left (\begin{array}{*{10}c} 0 &g_{1,y_{1}}^{-1}g_{1,y_{2}} \\ g_{2,y_{2}}^{-1}g_{2,y_{1}} & 0 \end{array} \right )\right \Vert < 1,}$$

whereas, for the Gauss-Seidel-type scheme, we obtain

$$\displaystyle{\alpha = \left \Vert \left (\begin{array}{*{10}c} g_{1,y_{1}} & 0\\ g_{ 2,y_{1}} & g_{2,y_{2}} \end{array} \right )^{-1}\left (\begin{array}{*{10}c} 0&g_{1,y_{2}} \\ 0& 0 \end{array} \right )\right \Vert < 1.}$$

An immediate consequence is that for a Jacobi scheme of two coupled DAEs with no coupling in the algebraic equations, i.e., \(g_{i,y_{j}} = 0\) for \(i\neq j\), we have α = 0.

As a further example, we discuss two mechanical multibody systems coupled via a kinematic constraint:

$$\displaystyle{\begin{array}{lll} M_{i}(q_{i})q_{i}^{{\prime\prime}}& =&\psi _{i}(q_{i},q_{i}^{{\prime}}) - G_{i}^{\top }(q)\lambda,\quad i = 1,2, \\ 0 & =&\gamma (q), \end{array} }$$

with \(q = (q_{1}^{\top },q_{2}^{\top })^{\top }\), \(G(q):= \partial \gamma /\partial q\) and \(G_{i}(q):= \partial \gamma /\partial q_{i}\), i = 1, 2. Performing an index reduction by twice differentiating the coupling constraint and setting \(v_{i}:= q_{i}^{{\prime}}\), \(a_{i}:= v_{i}^{{\prime}}\) as well as \(x_{i}:= (q_{i}^{\top },v_{i}^{\top })^{\top }\) and \(y_{1}:= a_{1}\), \(y_{2}:= (a_{2}^{\top },\lambda ^{\top })^{\top }\), \(f_{i}:= (v_{i}^{\top },a_{i}^{\top })^{\top }\), we are in the previously stated general framework:

$$\displaystyle\begin{array}{rcl} x_{1}^{{\prime}}& =& f_{ 1}\qquad \qquad \qquad \qquad \qquad \ x_{2}^{{\prime}} = f_{ 2} {}\\ 0& =& M_{1}a_{1} -\psi _{1} + G_{1}^{\top }\lambda \qquad 0 = \left [\begin{array}{*{10}c} M_{2}a_{2} -\psi _{2} + G_{2}^{\top }\lambda \\ G_{1}a_{1} + G_{2}a_{2} +\gamma ^{(II)} \end{array} \right ] {}\\ & =:& g_{1}(x_{1},x_{2},y_{1},y_{2})\qquad \quad =: g_{2}(x_{1},x_{2},y_{1},y_{2}). {}\\ \end{array}$$

Herein, \(\gamma ^{(II)}\) contains the remainder of the second derivative of γ without the term \(G_{1}a_{1} + G_{2}a_{2}\).

That is, the only coupling is via algebraic variables and in algebraic equations. If we set up a Jacobi scheme, in macro step \(T_{n} \rightarrow T_{n+1}\), in subsystem 1 we have to use extrapolated values from subsystem 2, i.e., \(y_{2}\) is replaced by

$$\displaystyle{\tilde{y}_{2}(t) =\varPsi _{ 1}^{y}(t;y_{ 2}(T_{n-k}),\ldots,y_{2}(T_{n}))}$$

and in subsystem 2, accordingly, \(\tilde{y}_{1}(t) =\varPsi _{ 2}^{y}(t;y_{1}(T_{n-k}),\ldots,y_{1}(T_{n}))\). The above contractivity condition in this case reads

$$\displaystyle\begin{array}{rcl} \left \Vert \left (\begin{array}{*{10}c} 0 &0&M_{1}^{-1}G_{1}^{\top } \\ M_{2}^{-1}G_{2}^{\top }R_{2}^{-1}G_{1} & 0& 0 \\ -R_{2}^{-1}G_{1} & 0& 0 \end{array} \right )\right \Vert < 1,& & {}\\ \end{array}$$

with \(R_{i}:= G_{i}M_{i}^{-1}G_{i}^{\top }\), i = 1, 2.

Analogously, we can consider a Gauss-Seidel-type scheme. Starting with subsystem 1, we have to extrapolate \(y_{2}\) from previous macro steps, yielding \(x_{1}\), \(y_{1}\), which can then be evaluated during the time integration of subsystem 2. Stating the contractivity condition for this case and noticing that only the algebraic variable λ has to be extrapolated from previous time points, the relevant (λ-)part of the condition requires

$$\displaystyle{ \Vert R_{2}^{-1}R_{ 1}\Vert =\Vert (G_{2}M_{2}^{-1}G_{ 2}^{\top })^{-1}(G_{ 1}M_{1}^{-1}G_{ 1}^{\top })\Vert < 1. }$$
(5.5)

We observe in both cases that mass and inertia properties of the coupled systems may strongly influence the stability of the co-simulation. In particular for the latter sequential Gauss-Seidel scheme, the order of integration has an essential impact on stability, i.e., the labeling of the subsystems as 1 and 2, respectively, should be chosen such that the left-hand side of (5.5) is as small as possible.
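
The condition (5.5) can be checked numerically before fixing the integration order of a Gauss-Seidel co-simulation. The following sketch does so for two illustrative constrained subsystems; the mass matrices and constraint Jacobians are assumed data and serve only to show that the ordering matters.

```python
import numpy as np

def gauss_seidel_contractivity(G1, M1, G2, M2):
    """Evaluates the left-hand side of (5.5) for the ordering (1, 2),
    with R_i = G_i M_i^{-1} G_i^T (spectral norm)."""
    R1 = G1 @ np.linalg.solve(M1, G1.T)
    R2 = G2 @ np.linalg.solve(M2, G2.T)
    return np.linalg.norm(np.linalg.solve(R2, R1), 2)

# illustrative data: a light subsystem 1 coupled to a heavy subsystem 2
M1 = 1.0 * np.eye(3)
M2 = 50.0 * np.eye(3)
G1 = np.array([[1.0, 0.0, 0.0]])
G2 = np.array([[0.0, 1.0, 0.0]])

a12 = gauss_seidel_contractivity(G1, M1, G2, M2)   # integrate subsystem 1 first
a21 = gauss_seidel_contractivity(G2, M2, G1, M1)   # integrate subsystem 2 first
print(a12, a21)   # choose the ordering with the smaller value (< 1 required)
```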

This result was developed and proven earlier in [11] for a more general framework, which is slightly different from our setup and for which the coupled mechanical systems are also a special case. In that paper, a method for stabilization (reducing α) is suggested. In [125], the authors also study stability and convergence of coupled DAE systems in a rather general framework and propose a strategy for stabilization as well.

For the specific application field of electric circuit simulation, the reader is referred to [20, 21] and the references therein. A specific consideration of coupled mechanical multibody systems is provided in [8, 9] and in [126], where the coupling of a multibody system and a flexible structure is investigated and an innovative coupling strategy is proposed. Lately, analysis results on coupled DAE systems solved with different co-simulation strategies and stabilization approaches are provided by the authors of [129, 130]. In [19], a multibody system model of a wheel-loader described as index-three DAE in a commercial software package is coupled with a particle code for soft-soil modeling, in order to establish a coupled digging simulation.

The general topic of coupled DAE systems is additionally discussed in the early papers [82, 89, 94].

A multirate integrator for constrained dynamical systems is derived in [96], which is based on a discrete variational principle. The resulting integrator is symplectic and momentum preserving.

6 Real-Time Simulation

An important field in modern numerical system simulation is the treatment of real-time scenarios. Here, a numerical model is coupled with the real world and both interact dynamically. A typical area in which such couplings are employed is interactive simulators (‘human/man-in-the-loop’), such as driving simulators or flight simulators, see [58], but also interactively used software (simulators), e.g., for training purposes, cf. [98]. Apart from that, real-time couplings are used in tests for electronic control units (ECU tests) and devices (‘hardware-in-the-loop’, HiL), see, e.g., [15, 122], and in the field of model-based controllers (‘model/software-in-the-loop’, MiL/SiL), see, e.g., [42, 43].

It is characteristic for all the mentioned fields that a numerical model replaces a part of the real world. In case of an automotive control unit test, the real control unit hardware is coupled with a numerical model of the rest of the considered vehicle; in case of an interactive driving simulator, the simulator hardware and, by that, the driver or the operator, respectively, is also coupled with a virtual vehicle. The benefits of such couplings are tremendous: tests and studies can be performed under fully accessible and reproducible conditions in the laboratory. Investigations and test runs with real cars and drivers can be reduced and partially avoided, which can save time, costs, and effort substantially. From the perspective of the numerical model, it receives signals from the real-world environment as inputs (e.g., the steering-wheel angle from the human driver in a simulator) and gives back its dynamical behavior as output (e.g., the car’s reaction is transmitted to the simulator hardware, which, in turn, follows that motion, making the driver feel as if he were sitting in a real car). It is crucial for a realistic realization of such a coupling that the simulation as well as the communication are sufficiently fast. That is, after delivering an input to the numerical model, the real-world component expects a response after a fixed time \(\varDelta T\), and the numerical model has to be simulated for that time span and has to feed back the response on time. Necessary for that is that the considered numerical simulation satisfies the real-time condition: the computation (or simulation) time \(\varDelta T_{\mathrm{comp}}\) has to be smaller than or equal to the simulated time \(\varDelta T\).

Physical models are often described by differential equations (e.g., mechanical multibody systems that represent a vehicle model). Satisfying the real-time condition here accordingly means that the numerical time integration of the IVP

$$\displaystyle\begin{array}{rcl} F(t,z(t),z^{{\prime}}(t),u(t))& =& 0,\quad t \in [T_{ i};T_{i} +\varDelta T] {}\\ z(T_{i})& =& z_{0,i}, {}\\ \end{array}$$

is executed with a total computation time that is smaller than or equal to \(\varDelta T\). If a complete real-time simulation shall be run on a time horizon \([t_{0},t_{f}]\) which is divided by an equidistant time grid \(\{T_{0},\ldots,T_{N}\}\), \(t_{0} = T_{0}\), \(t_{f} = T_{N}\), \(T_{i+1} - T_{i} =\varDelta T\), the real-time condition must be guaranteed for any subinterval of length \(\varDelta T\). In fact, this is a coupling exactly as in classical co-simulation, with the decisive difference that one partner is not a numerical model but a real-world component and, thus, the numerical model simulation must satisfy the real-time condition. Obviously, whether or not the real-time condition can be satisfied strongly depends on both the numerical time integration method and the differential equation and its properties. In principle, any time integration method can be applied, provided that the resulting simulation satisfies the real-time condition.

The fulfillment of the real-time condition as stated above has, however, to be assured deterministically in each macro time step \(T_{i} \rightarrow T_{i} +\varDelta T\), at least in applications where breaking this condition leads to a critical system shutdown (e.g., hardware simulators, HiL tests). Hence, the chosen integration methods should not have non-deterministic elements like step-size control or iterative inner methods (solution of nonlinear systems by Newton-like methods): varying iteration numbers lead to a varying computation time. Consequently, for real-time applications, time integration methods with fixed time-steps and with a fixed number of possible iterations are preferred. Additionally, to save computation time, typically low-order methods are in use, which is also justified by the fact that in the mentioned application situations the coupled simulation need not be highly accurate, but it must be stable.

6.1 Real-Time Integration of DAEs

For non-stiff ODE models, which have to be simulated under real-time conditions, even the simple explicit Euler scheme is frequently used. For stiff ODEs, the linearly implicit methods as discussed in Sect. 4.3 are an obvious choice, since for this method class only linear systems have to be solved internally, which leads to an a priori known, fixed, and moderate computational effort, see [14, 15, 49, 118] and the references therein.

Since all typical and well-proven DAE time integration methods are at least partially implicit, leading to the need for iterative computations, it is a common approach to avoid DAE models for real-time applications already in the modeling process whenever possible (generally, for real-time applications, specific modeling techniques are often applied). However, this is impossible in many application cases of practical relevance. For instance, the above-mentioned examples from the automotive area require a mechanical vehicle model, which is usually realized as a mechanical multibody system model, whose underlying equations of motion are often a DAE as stated in Eq. (2.13). Thus, there is a need for DAE time integration schemes that are stable and highly efficient also for DAEs of realistic complexity.

Time integration methods for DAEs with a special focus on real-time applications and the fulfillment of the real-time condition are addressed, e.g., in [14, 15, 31, 32, 39, 44, 49, 50, 119]. In the sequel, we present a specific integration method for the MBS equations of motion (2.13) in its index-two formulation on velocity-level.

For the special case of the semi-explicit DAE describing a mechanical multibody system, compare (2.13), the following linearly implicit method can be applied, which is based on the linearly implicit Euler scheme. The first step is to reduce the index from three to two by replacing the original algebraic equations by their first time derivative,

$$\displaystyle{G(q)v = 0,}$$

which is linear in v. The numerical scheme proposed in [14, 31] consists in handling the time-step for the position coordinates explicitly and requiring that the algebraic equation on velocity level is satisfied, i.e.,

$$\displaystyle{G(q_{i+1})v_{i+1} = 0.}$$

In particular, this leads to the following set of linear equations

$$\displaystyle\begin{array}{rcl} q_{i+1}& =& q_{i} + h_{i}v_{i}, {}\\ \left (\begin{array}{*{10}c} M - h_{i}J_{v} - h_{i}^{2}J_{q}&G^{\top }(q_{i}) \\ G(q_{i+1}) & 0 \end{array} \right )\left (\begin{array}{*{10}c} v_{i+1} - v_{i} \\ h_{i}\lambda _{i} \end{array} \right )& =& \left (\begin{array}{*{10}c} h_{i}f_{i} + h_{i}^{2}J_{q}v_{i} \\ -G(q_{i+1})v_{i} \end{array} \right ),{}\\ \end{array}$$

where \(f_{i}:= f(q_{i},v_{i})\) and \(J_{q}:= \partial f/\partial q\,(q_{i},v_{i})\), \(J_{v}:= \partial f/\partial v\,(q_{i},v_{i})\) denote the partial derivatives of f evaluated at \((q_{i},v_{i})\).
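
A minimal sketch of one step of this scheme for a planar pendulum of length one in Cartesian coordinates (constant mass matrix, applied force independent of q and v, hence \(J_{q} = J_{v} = 0\)); the assembled saddle-point matrix follows the linear system above. All data and the step-size are illustrative assumptions, and no projection or Baumgarte stabilization of the position constraint is included.

```python
import numpy as np

def realtime_mbs_euler_step(q, v, h, M, f, G, Jq=None, Jv=None):
    """One linearly implicit real-time step for the index-two MBS DAE:
    explicit position update, then one linear saddle-point solve for
    the velocity increment and h*lambda (cf. the scheme above)."""
    n = q.size
    Jq = np.zeros((n, n)) if Jq is None else Jq
    Jv = np.zeros((n, n)) if Jv is None else Jv
    q_new = q + h * v
    A = np.block([[M - h * Jv - h**2 * Jq, G(q).T],
                  [G(q_new), np.zeros((G(q_new).shape[0],) * 2)]])
    rhs = np.concatenate([h * f(q, v) + h**2 * Jq @ v, -G(q_new) @ v])
    sol = np.linalg.solve(A, rhs)
    v_new = v + sol[:n]
    lam = sol[n:] / h
    return q_new, v_new, lam

# planar pendulum, |q| = 1:  M = I, f = (0, -grav), g(q) = (q.q - 1)/2, G(q) = q^T
grav = 9.81
M = np.eye(2)
f = lambda q, v: np.array([0.0, -grav])
G = lambda q: q.reshape(1, 2)

q, v, h = np.array([1.0, 0.0]), np.array([0.0, 0.0]), 1e-3
for _ in range(1000):
    q, v, lam = realtime_mbs_euler_step(q, v, h, M, f, G)
print(q, v, lam)
```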

An important issue is naturally the drift-off, cf. Sect. 2, in the neglected algebraic constraints: here, in the above method for the index-two version of the MBS DAE, the error in the algebraic equation on position level, i.e., 0 = g(q), may grow linearly in time; this effect is even more severe since a low-order method is in use. Classical strategies to stabilize this drifting are projection approaches, cf., e.g., [73, 100], which are usually of adaptive and iterative character. The authors in [14, 31] propose and discuss a non-iterative projection strategy, which in fact consists of one special Newton step for the KKT conditions related to the constrained optimization problem that is used for projection; thus, only one additional linear equation has to be solved in each time-step. The authors show that using this technique leads to a bound for the error on position level which is independent of time. An alternative way to stabilize the drift-off effect without substantially increasing the computational effort is the Baumgarte stabilization, cf. Sect. 2 and [31, 48, 122].

7 Parametric Sensitivity Analysis and Adjoints

The parametric sensitivity analysis is concerned with parametric initial value problems subject to DAEs on the interval [t 0, t f ] given by

$$\displaystyle\begin{array}{rcl} F(t,z(t),z^{{\prime}}(t),p)& =& 0,{}\end{array}$$
(7.1)
$$\displaystyle\begin{array}{rcl} z(t_{0})& =& z_{0}(\,p),{}\end{array}$$
(7.2)

where \(p \in \mathbb{R}^{m}\) is a parameter vector and the mapping \(z_{0}: \mathbb{R}^{m}\longrightarrow \mathbb{R}^{n}\) is at least continuously differentiable. We assume that the initial value problem possesses a solution for every p in some neighborhood of a given nominal parameter \(\hat{p}\) and denote the solution by z(t; p). In order to quantify the influence of the parameter on the solution, we are interested in the so-called sensitivities (sensitivity matrices)

$$\displaystyle{ S(t):= \frac{\partial z} {\partial p}(t;\hat{p})\qquad \mbox{ for }t \in [t_{0},t_{f}]. }$$
(7.3)

Throughout we tacitly assume that the sensitivities actually exist.

In many applications, e.g., from optimal control or optimization problems involving DAEs, one is not directly interested in the sensitivities S(⋅ ) themselves but in the gradient of some function \(g: \mathbb{R}^{m}\longrightarrow \mathbb{R}\) defined by

$$\displaystyle{ g(\,p):=\varphi (z(t_{f};p),p), }$$
(7.4)

where \(\varphi: \mathbb{R}^{n} \times \mathbb{R}^{m}\longrightarrow \mathbb{R}\) is continuously differentiable. Of course, if the sensitivities S(⋅ ) are available, the gradient of g at \(\hat{p}\) can easily be computed by the chain rule as

$$\displaystyle{ \nabla g(\hat{p}) = S(t_{f})^{\top }\nabla _{ z}\varphi (z(t_{f};\hat{p}),\hat{p}) + \nabla _{p}\varphi (z(t_{f};\hat{p}),\hat{p}). }$$
(7.5)

However, the explicit computation of S is often costly and should be avoided. Then the question arises whether there are alternative representations of the gradient \(\nabla g(\hat{p})\) that avoid the explicit computation of S. Such an alternative representation can be derived using an adjoint DAE. Both approaches are analytical in the sense that they provide the correct gradient, if round-off errors are not taken into account.

Remark 7.1

The computation of the gradient using S is often referred to as the forward mode and the computation using adjoints as the backward or reverse mode in the context of automatic differentiation, compare [72]. Using automatic differentiation is probably the most convenient way to compute the above gradient, since powerful tools are available, see the web page www.autodiff.org.

The same kind of sensitivity investigations can be performed either for the problem (7.1)–(7.2) in continuous time or for discretizations thereof by means of one-step or multi-step methods.

7.1 Sensitivity Analysis in Discrete Time

7.1.1 The Forward Mode

Suppose a suitable discretization scheme of (7.1)–(7.2) is given, which provides approximations z h (t i ; p) at the grid points \(t_{i} \in \mathbb{G}_{h}\) in dependence on the parameter p. We are interested in the sensitivities

$$\displaystyle{S_{h}(t_{i}):= \frac{\partial z_{h}} {\partial p} (t_{i};\hat{p}) \in \mathbb{R}^{n\times m}\qquad \mbox{ for }t_{ i} \in \mathbb{G}_{h}}$$

for a nominal parameter \(\hat{p} \in \mathbb{R}^{m}\). As the computations are performed on a finite grid, these sensitivities can be obtained by differentiating the discretization scheme with respect to p. This procedure is called internal numerical differentiation (IND) and was introduced in [25].

To be more specific, let \(\hat{p}\) be a given nominal parameter and consider the one-step method

$$\displaystyle\begin{array}{rcl} z_{h}(t_{0};\hat{p})& =& z_{0}(\hat{p}),{}\end{array}$$
(7.6)
$$\displaystyle\begin{array}{rcl} z_{h}(t_{i+1};\hat{p})& =& z_{h}(t_{i};\hat{p}) + h_{i}\varPhi (t_{i},z_{h}(t_{i};\hat{p}),h_{i},\hat{p}),\quad i = 0,1,\ldots,N - 1.{}\end{array}$$
(7.7)

Differentiating both equations with respect to p and evaluating the equations at \(\hat{p}\) yields

$$\displaystyle\begin{array}{rcl} S_{h}(t_{0})& =& z_{0}^{{\prime}}(\hat{p}),{}\end{array}$$
(7.8)
$$\displaystyle\begin{array}{rcl} S_{h}(t_{i+1})& =& S_{h}(t_{i}) + h_{i}\left ( \frac{\partial \varPhi } {\partial z}[t_{i}]S_{h}(t_{i}) + \frac{\partial \varPhi } {\partial p}[t_{i}]\right ),\quad i = 0,1,\ldots,N - 1.{}\end{array}$$
(7.9)

Herein, we used the abbreviation \([t_{i}]\) for \((t_{i},z_{h}(t_{i};\hat{p}),h_{i},\hat{p})\). Evaluation of (7.8)–(7.9) yields the desired sensitivities \(S_{h}(t_{i})\) of \(z_{h}(t_{i};\hat{p})\) at the grid points, if the increment function Φ of the one-step method and the function \(z_{0}\) are differentiable with respect to z and p, respectively. Note that the function \(z_{0}\) can be realized by the projection method in LSQ(p) in Sect. 3.2, and sufficient conditions for its differentiability are provided by Theorem 3.1.
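
A minimal sketch of the IND recursion (7.8)–(7.9) for the special case of an explicit one-step method applied to an ODE, so that the increment function Φ and its derivatives are available in closed form; for implicit DAE integrators these derivatives have to be obtained by differentiating the stage or corrector equations, as discussed below. The scalar test problem is an illustrative assumption.

```python
import numpy as np

def ind_forward(Phi, dPhi_dz, dPhi_dp, z0, dz0_dp, p, t0, h, N):
    """Propagates z_h and S_h = dz_h/dp jointly, cf. (7.6)-(7.9)."""
    z, S, t = np.atleast_1d(z0(p)), np.atleast_2d(dz0_dp(p)), t0
    for _ in range(N):
        z_new = z + h * Phi(t, z, h, p)
        S = S + h * (dPhi_dz(t, z, h, p) @ S + dPhi_dp(t, z, h, p))
        z, t = z_new, t + h
    return z, S

# test problem: z' = -p*z, z(0) = 1  (explicit Euler: Phi = f = -p*z)
Phi      = lambda t, z, h, p: -p * z
dPhi_dz  = lambda t, z, h, p: np.array([[-p]])
dPhi_dp  = lambda t, z, h, p: np.array([[-z[0]]])
z0       = lambda p: np.array([1.0])
dz0_dp   = lambda p: np.array([[0.0]])

p_hat, h, N = 2.0, 1e-3, 1000
z, S = ind_forward(Phi, dPhi_dz, dPhi_dp, z0, dz0_dp, p_hat, 0.0, h, N)
print(z, S)   # reference values at t = 1: z = exp(-p), dz/dp = -exp(-p)
```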

The computation of the partial derivatives of Φ is more involved. For a Runge–Kutta method Eqs. (4.7)–(4.9) (with an additional dependence on the parameter p) have to be differentiated with respect to z and p. Details can be found in [68, Sect. 5.3.2].

The same IND approach can be applied to multi-step methods. Differentiation of the scheme (4.2) and the consistent initial values

$$\displaystyle{z_{h}(t_{0};p) = z_{0}(\,p),\ z_{h}(t_{1};p) = z_{1}(\,p),\ \ldots,\ z_{h}(t_{s-1};p) = z_{s-1}(\,p)}$$

with respect to p and evaluation at \(\hat{p}\) yields the formal scheme

$$\displaystyle\begin{array}{rcl} S_{h}(t_{\ell})& =& z_{\ell}^{{\prime}}(\hat{p}),\qquad \ell = 0,\ldots,s - 1, {}\\ S_{h}(t_{i+s})& =& \sum _{\ell=0}^{s-1} \frac{\partial \varPsi } {\partial z_{i+\ell}} \cdot S_{h}(t_{i+\ell}) + \frac{\partial \varPsi } {\partial p},\qquad i = 0,\ldots,N - s. {}\\ \end{array}$$

More specifically, for an s-stage BDF method the function ψ is implicitly given by (4.4) (with an additional dependence on the parameter p). Differentiation of (4.4) with respect to p yields

$$\displaystyle{ \left (\frac{\partial F} {\partial z} [t_{i+s}]+ \frac{\alpha _{s}} {h_{i+s-1}} \frac{\partial F} {\partial z^{{\prime}}}[t_{i+s}]\right )S_{h}(t_{i+s})+\sum _{\ell=0}^{s-1} \frac{\alpha _{\ell}} {h_{i+s-1}} \frac{\partial F} {\partial z^{{\prime}}}[t_{i+s}]S_{h}(t_{i+\ell})+\frac{\partial F} {\partial p} [t_{i+s}] = 0 }$$
(7.10)

and, if the iteration matrix \(M:= F_{z}^{{\prime}}[t_{i+s}] + \frac{\alpha _{s}} {h_{i+s-1}} F_{z^{{\prime}}}^{{\prime}}[t_{i+s}]\) is non-singular,

$$\displaystyle{S_{h}(t_{i+s}) = -M^{-1} \cdot \left (\sum _{\ell =0}^{s-1} \frac{\alpha _{\ell}} {h_{i+s-1}} \frac{\partial F} {\partial z^{{\prime}}}[t_{i+s}]S_{h}(t_{i+\ell}) + \frac{\partial F} {\partial p} [t_{i+s}]\right ).}$$

Herein, we used the abbreviation \([t_{i+s}] = \left (t_{i+s},z_{h}(t_{i+s}), \frac{1} {h_{i+s-1}} \sum \limits _{k=0}^{s}\alpha _{k}z_{h}(t_{i+k})\right )\).

7.1.2 The Backward Mode and Adjoints

Consider the function g in (7.4) subject to a discretization scheme, i.e.

$$\displaystyle{ g_{h}(\,p):=\varphi (z_{h}(t_{N};p),p). }$$
(7.11)

We intend to compute the gradient of g h at \(\hat{p}\). Using the sensitivity S h (t N ) the gradient is given by

$$\displaystyle{\nabla g_{h}(\hat{p}) = S_{h}(t_{N})^{\top }\nabla _{ z}\varphi (z_{h}(t_{N};\hat{p}),\hat{p}) + \nabla _{p}\varphi (z_{h}(t_{N};\hat{p}),\hat{p}).}$$

Now we are interested in an alternative representation of the gradient without the sensitivity S h (t N ). To this end consider the one-step method in (7.6)–(7.7). Following [68, Sect. 5.3.2] define the auxiliary functional

$$\displaystyle{g_{h}^{a}(\,p):= g_{ h}(\,p)+\sum _{i=0}^{N-1}\lambda _{ h}(t_{i+1})^{\top }\left (z_{ h}(t_{i+1};p) - z_{h}(t_{i};p) - h_{i}\varPhi (t_{i},z_{h}(t_{i};p),h_{i},p)\right )}$$

with multipliers \(\lambda _{h}(t_{1}),\ldots,\lambda _{h}(t_{N})\) that will be specified later. Note that \(g_{h}^{a} \equiv g_{h}\) for all discrete trajectories satisfying (7.6)–(7.7). The gradient of \(g_{h}^{a}\) at \(\hat{p}\) computes to

$$\displaystyle\begin{array}{rcl} \nabla g_{h}^{a}(\hat{p})& =& S_{ h}(t_{N})^{\top }\nabla _{ z}\varphi (z_{h}(t_{N};\hat{p}),\hat{p}) + \nabla _{p}\varphi (z_{h}(t_{N};\hat{p}),\hat{p}) {}\\ & & +\sum _{i=0}^{N-1}\left (S_{ h}(t_{i+1}) - S_{h}(t_{i}) - h_{i} \frac{\partial \varPhi } {\partial z}[t_{i}]S_{h}(t_{i}) - h_{i} \frac{\partial \varPhi } {\partial p}[t_{i}]\right )^{\top }\lambda _{ h}(t_{i+1}) {}\\ & =& S_{h}(t_{N})^{\top }\nabla _{ z}\varphi (z_{h}(t_{N};\hat{p}),\hat{p}) + \nabla _{p}\varphi (z_{h}(t_{N};\hat{p}),\hat{p}) {}\\ & & +\sum _{i=1}^{N}S_{ h}(t_{i})^{\top }\lambda _{ h}(t_{i}) -\sum _{i=0}^{N-1}\left (S_{ h}(t_{i}) + h_{i} \frac{\partial \varPhi } {\partial z}[t_{i}]S_{h}(t_{i})\right )^{\top }\lambda _{ h}(t_{i+1}) {}\\ & & -\sum _{i=0}^{N-1}h_{ i} \frac{\partial \varPhi } {\partial p}[t_{i}]^{\top }\lambda _{ h}(t_{i+1}) {}\\ & =& S_{h}(t_{N})^{\top }\left (\lambda _{ h}(t_{N}) + \nabla _{z}\varphi (z_{h}(t_{N};\hat{p}),\hat{p})\right ) + \nabla _{p}\varphi (z_{h}(t_{N};\hat{p}),\hat{p}) {}\\ & & +\sum _{i=1}^{N-1}S_{ h}(t_{i})^{\top }\left (\lambda _{ h}(t_{i}) -\lambda _{h}(t_{i+1}) - h_{i} \frac{\partial \varPhi } {\partial z}[t_{i}]^{\top }\lambda _{ h}(t_{i+1})\right ) {}\\ & & -S_{h}(t_{0})^{\top }\left (\lambda _{ h}(t_{1}) + h_{0} \frac{\partial \varPhi } {\partial z}[t_{0}]^{\top }\lambda _{ h}(t_{1})\right ) -\sum _{i=0}^{N-1}h_{ i} \frac{\partial \varPhi } {\partial p}[t_{i}]^{\top }\lambda _{ h}(t_{i+1}) {}\\ \end{array}$$

In order to eliminate the sensitivities, we choose the multipliers λ h such that they satisfy the adjoint equations

$$\displaystyle\begin{array}{rcl} \lambda _{h}(t_{N})& =& -\nabla _{z}\varphi (z_{h}(t_{N};\hat{p}),\hat{p}),{}\end{array}$$
(7.12)
$$\displaystyle\begin{array}{rcl} \lambda _{h}(t_{i})& =& \lambda _{h}(t_{i+1}) + h_{i} \frac{\partial \varPhi } {\partial z}[t_{i}]^{\top }\lambda _{ h}(t_{i+1}),\qquad i = 0,\ldots,N - 1.{}\end{array}$$
(7.13)

The adjoint equations have to be solved backwards in time starting at t N . With this choice the gradient of g h a reduces to

$$\displaystyle\begin{array}{rcl} \nabla g_{h}^{a}(\hat{p})& =& \nabla _{ p}\varphi (z_{h}(t_{N};\hat{p}),\hat{p}) - S_{h}(t_{0})^{\top }\lambda _{ h}(t_{0}) -\sum _{i=0}^{N-1}h_{ i} \frac{\partial \varPhi } {\partial p}[t_{i}]^{\top }\lambda _{ h}(t_{i+1}) {}\\ \end{array}$$

with \(S_{h}(t_{0}) = z_{0}^{{\prime}}(\hat{p})\). Since g h and g h a coincide for all discrete trajectories satisfying (7.6)–(7.7), the following theorem holds, see [68, Theorems 5.3.2, 5.3.3] for a proof:

Theorem 7.1

We have

$$\displaystyle{\nabla g_{h}(\hat{p}) = \nabla g_{h}^{a}(\hat{p}) = \nabla _{ p}\varphi (z_{h}(t_{N};\hat{p}),\hat{p})-S_{h}(t_{0})^{\top }\lambda _{ h}(t_{0})-\sum _{i=0}^{N-1}h_{ i} \frac{\partial \varPhi } {\partial p}[t_{i}]^{\top }\lambda _{ h}(t_{i+1}),}$$

where λ h (⋅) satisfies the adjoint Eqs. ( 7.12 )–( 7.13 ). Moreover, the combined discretization scheme ( 7.7 ) and ( 7.13 ) for z h and λ h is symplectic.

Remark 7.2

Computing the gradient of g h via the adjoint approach is more efficient than using the sensitivities, because the adjoint equations do not depend on the dimension of p, whereas the sensitivity equations (7.8)–(7.9) are matrix difference equations for the n × m-matrices S h (⋅ ).
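The forward sweep (7.6)–(7.7), the backward sweep (7.12)–(7.13), and the gradient formula of Theorem 7.1 can be sketched as follows for an explicit one-step increment Φ (here the explicit Euler increment of a scalar ODE, a deliberately simple setting; all problem data and names are illustrative assumptions of this example).

```python
import numpy as np

# Toy scalar ODE z' = f(z, p) = -p*z with objective phi = z(t_N)^2
# (illustrative data; the one-step increment is Phi = f, i.e. explicit Euler).
def f(z, p):     return -p * z
def f_z(z, p):   return -p              # dPhi/dz
def f_p(z, p):   return -z              # dPhi/dp
def phi_z(z):    return 2.0 * z
def phi_p(z):    return 0.0

def gradient_discrete_adjoint(z0, p, h, N):
    # forward sweep (7.6)-(7.7): store the discrete trajectory
    z = np.empty(N + 1)
    z[0] = z0
    for i in range(N):
        z[i + 1] = z[i] + h * f(z[i], p)
    # backward sweep (7.12)-(7.13) and gradient formula of Theorem 7.1
    lam = -phi_z(z[N])                   # lambda_h(t_N)
    grad = phi_p(z[N])
    for i in range(N - 1, -1, -1):
        grad -= h * f_p(z[i], p) * lam   # accumulates -sum_i h (dPhi/dp)^T lambda_{i+1}
        lam += h * f_z(z[i], p) * lam    # lambda_h(t_i) from (7.13)
    return grad                          # z0 is independent of p, so S_h(t_0) = 0

def g_h(p, z0=1.0, h=1e-3, N=1000):      # discretized objective (7.11)
    z = z0
    for _ in range(N):
        z += h * f(z, p)
    return z**2

p_hat = 2.0
print(gradient_discrete_adjoint(1.0, p_hat, 1e-3, 1000))
print((g_h(p_hat + 1e-6) - g_h(p_hat - 1e-6)) / 2e-6)    # finite-difference check
```

The finite-difference value agrees with the adjoint gradient up to rounding, illustrating that the discrete adjoint approach differentiates the discretized functional g h exactly.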

7.2 Sensitivity Analysis in Continuous Time

7.2.1 The Forward Mode

The IND approach is based on the differentiation of the discretization scheme. Applying the same idea to the parametric DAE in continuous time (7.1)–(7.2) yields the sensitivity DAE

$$\displaystyle\begin{array}{rcl} \frac{\partial F} {\partial z} [t] \cdot S(t) + \frac{\partial F} {\partial z^{{\prime}}}[t] \cdot S^{{\prime}}(t) + \frac{\partial F} {\partial p} [t]& =& 0,\qquad t \in [t_{0},t_{f}],{}\end{array}$$
(7.14)
$$\displaystyle\begin{array}{rcl} S(t_{0})& =& z_{0}^{{\prime}}(\hat{p}){}\end{array}$$
(7.15)

for the sensitivities S(t) in (7.3). We used the abbreviation \([t] = (t,z(t;\hat{p}),z^{{\prime}}(t;\hat{p}),\hat{p})\) and assumed that

$$\displaystyle{S^{{\prime}}(t) = \frac{\partial ^{2}z} {\partial p\partial t}(t;\hat{p}).}$$

Note that the derivative \(z_{0}^{{\prime}}(\hat{p})\) can be obtained by a sensitivity analysis of the least-squares problem LSQ(p) in Sect. 3.2. Moreover, the sensitivity analysis in Theorem 3.1 provides a consistent initial value for the sensitivity DAE (7.14)–(7.15).

Now, the initial value problems for z and S in (7.1)–(7.2) and (7.14)–(7.15) can be solved simultaneously using some suitable one-step or multi-step method. Since efficient implementations often use approximate Jacobians, automatic step-size algorithms, or order selection strategies, the resulting numerical solutions \(z_{h}(\cdot;\hat{p})\) and S h (⋅ ) satisfy \(S_{h}(\cdot ) \approx \partial z_{h}/\partial p(\cdot;\hat{p})\) only up to some tolerance. As a result, the gradient of g in (7.5) is accurate only up to the prescribed integration tolerance. The forward approach using sensitivities is investigated in more detail, e.g., in [29, 37, 79, 87, 101] and a comparison is provided in [52].

A connection to the IND approach arises if the same discretization scheme and the same step-sizes are used for both DAEs. For the BDF method we obtain

$$\displaystyle{F\left (t_{i+s},z_{h}(t_{i+s}), \frac{1} {h_{i+s-1}}\sum \limits _{\ell=0}^{s}\alpha _{ \ell}z_{h}(t_{i+\ell}),\hat{p}\right ) = 0}$$

for i = 0, …, N − s. Application of the same BDF method with the same step-sizes to the sensitivity DAE (7.14) yields

$$\displaystyle\begin{array}{rcl} \frac{\partial F} {\partial z} [t_{i+s}] \cdot S_{h}(t_{i+s}) +\sum \limits _{ \ell=0}^{s} \frac{\alpha _{\ell}} {h_{i+s-1}} \frac{\partial F} {\partial z^{{\prime}}}[t_{i+s}] \cdot S_{h}(t_{i+\ell}) + \frac{\partial F} {\partial p} [t_{i+s}]& =& 0, {}\\ \end{array}$$

for i = 0, …, N − s. The latter coincides with the IND approach in (7.10). Hence, the discrete forward mode (IND) and the discretized continuous forward mode coincide if the same method and the same step-sizes are used. The same is true for the Runge–Kutta method (4.7)–(4.10) applied to (7.1), i.e.,

$$\displaystyle{ z_{h}(t_{i+1}) = z_{h}(t_{i}) + h_{i}\sum _{j=1}^{s}b_{ j}k_{j}(t_{i},z_{h}(t_{i}),h_{i},\hat{p}), }$$
(7.16)

where \(k_{j}(t_{i},z_{h}(t_{i}),h_{i},\hat{p})\), j = 1, …, s, are implicitly defined by

$$\displaystyle{ F\left (t_{i} + c_{\ell}h_{i},z_{h}(t_{i}) + h_{i}\sum \limits _{j=1}^{s}a_{\ell j}\,k_{j},k_{\ell},\hat{p}\right ) = 0,\qquad \ell = 1,\ldots,s. }$$
(7.17)

Application of the same Runge–Kutta method with the same step-sizes to the sensitivity DAE (7.14) yields

$$\displaystyle{S_{h}(t_{i+1}) = S_{h}(t_{i}) + h_{i}\sum _{j=1}^{s}b_{ j}K_{j},}$$

where K j , j = 1, …, s, are implicitly given by the system of linear equations

$$\displaystyle\begin{array}{rcl} \frac{\partial F} {\partial z} [t_{i} + c_{\ell}h_{i}]\left (S_{h}(t_{i}) + h_{i}\sum \limits _{j=1}^{s}a_{\ell j}\,K_{j}\right ) + \frac{\partial F} {\partial z^{{\prime}}}[t_{i} + c_{\ell}h_{i}] \cdot K_{\ell} + \frac{\partial F} {\partial p} [t_{i} + c_{\ell}h_{i}]& =& 0 {}\\ \end{array}$$

for ℓ = 1, …, s. With

$$\displaystyle{K_{j} = \frac{\partial k_{j}} {\partial z} [t_{i}]S_{h}(t_{i}) + \frac{\partial k_{j}} {\partial p} [t_{i}],\qquad j = 1,\ldots,s,}$$

the latter coincides with the IND approach for (7.16)–(7.17).

7.2.2 The Backward Mode and Adjoints

Consider the function g in (7.4), i.e., g( p) = φ(z(t f ; p), p). Using the sensitivity S(t f ) the gradient is given by

$$\displaystyle{\nabla g(\hat{p}) = S(t_{f})^{\top }\nabla _{ z}\varphi (z(t_{f};\hat{p}),\hat{p}) + \nabla _{p}\varphi (z(t_{f};\hat{p}),\hat{p}).}$$

As in the discrete case we are interested in an alternative representation of the gradient without the sensitivity S(t f ). To this end we define the auxiliary functional

$$\displaystyle{g^{a}(\,p):= g(\,p) +\int _{ t_{0}}^{t_{f} }\lambda (t)^{\top }F(t,z(t;p),z^{{\prime}}(t;p),p)dt,}$$

where λ is a suitable function to be defined later. Differentiation with respect to p, evaluation at \(\hat{p}\), and integration by parts yield

$$\displaystyle\begin{array}{rcl} \nabla g^{a}(\hat{p})& =& S(t_{ f})^{\top }\nabla _{ z}\varphi (z(t_{f};\hat{p}),\hat{p}) + \nabla _{p}\varphi (z(t_{f};\hat{p}),\hat{p}) {}\\ & & +\int _{t_{0}}^{t_{f} }\left (F_{z}^{{\prime}}[t] \cdot S(t) + F_{ z^{{\prime}}}^{{\prime}}[t] \cdot S^{{\prime}}(t) + F_{ p}^{{\prime}}[t]\right )^{\top }\lambda (t)dt {}\\ & =& S(t_{f})^{\top }\left (F_{ z^{{\prime}}}^{{\prime}}[t_{ f}]^{\top }\lambda (t_{ f}) + \nabla _{z}\varphi (z(t_{f};\hat{p}),\hat{p})\right ) - S(t_{0})^{\top }F_{ z^{{\prime}}}^{{\prime}}[t_{ 0}]^{\top }\lambda (t_{ 0}) {}\\ & & +\nabla _{p}\varphi (z(t_{f};\hat{p}),\hat{p}) +\int _{ t_{0}}^{t_{f} }F_{p}^{{\prime}}[t]^{\top }\lambda (t)dt {}\\ & & +\int _{t_{0}}^{t_{f} }S(t)^{\top }\left (F_{ z}^{{\prime}}[t]^{\top }\lambda (t) - \frac{d} {dt}\left (F_{z^{{\prime}}}^{{\prime}}[t]^{\top }\lambda (t)\right )\right )dt. {}\\ \end{array}$$

Since we want to avoid the sensitivities S(t) and S(t f ), we define the adjoint DAE

$$\displaystyle\begin{array}{rcl} F_{z^{{\prime}}}^{{\prime}}[t_{ f}]^{\top }\lambda (t_{ f}) + \nabla _{z}\varphi (z(t_{f};\hat{p}),\hat{p})& =& 0,{}\end{array}$$
(7.18)
$$\displaystyle\begin{array}{rcl} F_{z}^{{\prime}}[t]^{\top }\lambda (t) - \frac{d} {dt}\left (F_{z^{{\prime}}}^{{\prime}}[t]^{\top }\lambda (t)\right )& =& 0.{}\end{array}$$
(7.19)

Please note that this derivation is purely formal: it is not clear whether the adjoint DAE (7.18)–(7.19) possesses a solution, and in general it need not. The existence and stability of solutions of the adjoint DAE subject to structural assumptions were investigated in [36]. Details can be found in [68, Sect. 5.3.3] as well.
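As an illustration (a formal computation under an additional structural assumption, not a result from the cited references), consider the semi-explicit DAE \(F(t,z,z^{{\prime}},p) = (x^{{\prime}} - f(t,x,y,p),\,g(t,x,y,p))^{\top }\) with z = (x, y). Partitioning λ = (λ x , λ y ) and inserting the block structure of \(F_{z}^{{\prime}}\) and \(F_{z^{{\prime}}}^{{\prime}}\) into (7.18)–(7.19) formally yields

$$\displaystyle\begin{array}{rcl} -\frac{d} {dt}\lambda _{x}(t) - f_{x}^{{\prime}}[t]^{\top }\lambda _{x}(t) + g_{x}^{{\prime}}[t]^{\top }\lambda _{y}(t)& =& 0, {}\\ -f_{y}^{{\prime}}[t]^{\top }\lambda _{x}(t) + g_{y}^{{\prime}}[t]^{\top }\lambda _{y}(t)& =& 0, {}\\ \end{array}$$

together with the terminal condition λ x (t f ) = −∇ x φ(z(t f ; p̂), p̂). The algebraic part of (7.18) additionally requires ∇ y φ(z(t f ; p̂), p̂) = 0, which already indicates why a solution need not exist whenever φ depends on the algebraic variable.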

If the adjoint DAE possesses a solution, then the gradient of g a is represented by

$$\displaystyle\begin{array}{rcl} \nabla g^{a}(\hat{p})& =& -S(t_{ 0})^{\top }F_{ z^{{\prime}}}^{{\prime}}[t_{ 0}]^{\top }\lambda (t_{ 0}) + \nabla _{p}\varphi (z(t_{f};\hat{p}),\hat{p}) +\int _{ t_{0}}^{t_{f} }F_{p}^{{\prime}}[t]^{\top }\lambda (t)dt {}\\ \end{array}$$

with \(S(t_{0}) = z_{0}^{{\prime}}(\hat{p})\) and, as in the discrete case, it coincides with \(\nabla g(\hat{p})\), compare [68, Sect. 5.3.3].

Remark 7.3

Solving the DAE (7.1)–(7.2) and (7.18)–(7.19) simultaneously by some suitable one-step or multi-step method in general does not commute with the discrete adjoint approach.

7.3 Example

Example 7.1 is concerned with a trolley moving on a surface, which leads to an index-three DAE. Herein, a parametric sensitivity analysis is performed and the sensitivity of the states w.r.t. some parameters is computed using the forward mode.

Example 7.1

Consider the motion of a trolley of mass m 1 on a one-dimensional surface described by the function h(x), which is supposed to be at least twice continuously differentiable, see Fig. 7.

Fig. 7  Configuration of the trolley

Let a load of mass m 2 be attached to the trolley’s center of gravity with a mass-less rod of length ℓ > 0. The equations of motion are given by the following index-three DAE:

$$\displaystyle\begin{array}{rcl} m_{1}x_{1}^{{\prime\prime}}(t)& =& u(t) - 2\lambda _{ 1}(t)(x_{1}(t) - x_{3}(t)) +\lambda _{2}(t)h^{{\prime}}(x_{ 1}(t)), {}\\ m_{1}x_{2}^{{\prime\prime}}(t)& =& -m_{ 1}g - 2\lambda _{1}(t)(x_{2}(t) - x_{4}(t)) -\lambda _{2}(t), {}\\ m_{2}x_{3}^{{\prime\prime}}(t)& =& 2\lambda _{ 1}(t)(x_{1}(t) - x_{3}(t)), {}\\ m_{2}x_{4}^{{\prime\prime}}(t)& =& -m_{ 2}g + 2\lambda _{1}(t)(x_{2}(t) - x_{4}(t)), {}\\ 0& =& (x_{1}(t) - x_{3}(t))^{2} + (x_{ 2}(t) - x_{4}(t))^{2} -\ell^{2}, {}\\ 0& =& x_{2}(t) - h(x_{1}(t)). {}\\ \end{array}$$

Herein, (x 1, x 2) denotes the trolley’s center of gravity, (x 3, x 4) the load’s position, λ 1, λ 2 the algebraic variables, and u(t) a given control input.

Figure 8 shows the results of a simulation using the software OCPID-DAE1, see http://www.optimal-control.de, on the interval [0, 2.79] (scaled to the normalized interval [0, 1]) with m 1 = 0.3, m 2 = 0.5, ℓ = 0.75, g = 9.81, and h(x) = 0.02 sin(2π x). Figure 9 shows the control input u and the algebraic variables λ 1 and λ 2. The computations were performed for the GGL-stabilized system.

Fig. 8  Positions of trolley (top) and velocities of load (bottom) (normalized time interval [0, 1])

Fig. 9  Algebraic variables (λ 1, λ 2) (top) and control input u (bottom) (normalized time interval [0, 1])

Figure 10 shows the sensitivities of some states w.r.t. m 1.

Fig. 10  Sensitivities of positions of trolley (top) and load (bottom) w.r.t. m 1 (normalized time interval [0, 1])

Figure 11 shows the sensitivities of some states w.r.t. ℓ.

Fig. 11  Sensitivities of positions of trolley (top) and load (bottom) w.r.t. ℓ (normalized time interval [0, 1])

8 Switched Systems and Contact Problems

Many applications lead to DAE models with piecewise defined dynamics. Herein, the different DAE models are only valid in defined regions of the state space. Those regions are separated and bounded by manifolds, which are typically implicitly defined by state-dependent switching functions. A transition from one region (i.e., one DAE) to another (with another DAE) occurs, if the switching function changes its sign, i.e., the switching function indicates a switch in the dynamic system. Moreover, a transition from one region to another may come along with a discontinuity of some state components. For instance, contact and friction forces acting between two or more colliding rigid bodies typically lead to discontinuities in the velocity components of the state of a mechanical multibody system.

More general classes of switched DAEs and the existence and stability of solutions are discussed in [97]. The controllability of switched DAEs is investigated in [91]. Hybrid optimal control problems and necessary conditions can be found in [59, 133, 136].

8.1 Hybrid Systems and Switching Functions

It is convenient to view the dynamic process as a hybrid system, compare [114, 142]. To this end, the status of the system is characterized by a finite set of modes M = { 1, …, P}. In mode m ∈ M, the state evolves according to the DAE

$$\displaystyle\begin{array}{rcl} x^{{\prime}}(t)& =& f^{m}(x(t),y(t)), {}\\ 0& =& g^{m}(x(t),y(t)). {}\\ \end{array}$$

The system remains in mode m as long as the trajectory z(t) = (x(t), y(t)) stays within the set

$$\displaystyle{\mathcal{S}^{m}:=\{ x \in X\;\vert \;s^{m}(x) \geq 0\},}$$

where for each m ∈ M, \(s^{m}: \mathbb{R}^{n}\longrightarrow \mathbb{R}\) is called the switching function of mode m. For simplicity we exclude vector-valued switching functions in order to avoid situations with multiple active switching functions, which are difficult to resolve. \(Z = X \times Y \subseteq \mathbb{R}^{n} \times \mathbb{R}^{m}\) defines the space of possible differential and algebraic states.

A transition from mode m to another mode \(\tilde{m}\) becomes possible only in the event that x is about to cross the boundary of \(\mathcal{S}^{m}\) at some time point \(\hat{t}\), i.e., if \(s^{m}(x(\hat{t}^{-})) = 0\) and s m(x(t)) < 0 for some \(t >\hat{ t}\), provided the process were continued with the dynamics of mode m. Herein, \(x(\hat{t}^{\pm })\) denote the left- and right-sided limits of x at \(\hat{t}\), respectively. The time point \(\hat{t}\) in the above situation is called a switching point.

In case of a transition from mode m to \(\tilde{m}\) at time \(\hat{t}\), the following jump condition applies to the differential state:

$$\displaystyle{ x(\hat{t}^{+}) = x(\hat{t}^{-}) + d^{m\rightarrow \tilde{m}}(x(\hat{t}^{-})). }$$
(8.1)

Herein, \(d^{m\rightarrow \tilde{m}}: X\longrightarrow X\) denotes the jump function for a transition from mode m to mode \(\tilde{m}\). The transition from mode m to mode \(\tilde{m}\) is possible only if the state \(x(\hat{t}^{-})\) belongs to some set \(X^{m\rightarrow \tilde{m}} \subseteq X\). Moreover, \(x(\hat{t}^{+})\) is supposed to be consistent with the DAE.

The following assumption provides a sufficient condition for a proper crossing of the switching manifold {x ∈ X  |  s m(x) = 0} in mode m.

Assumption 8.1

Let the condition

$$\displaystyle{x^{{\prime}}(\hat{t}^{-})^{\top }\nabla s^{m}(x(\hat{t}^{-})) < 0}$$

be satisfied whenever the system is in mode m ∈ M and \(\hat{t}\) is a point with \(s^{m}(x(\hat{t}^{-}))=0\).

In the case \(x^{{\prime}}(\hat{t}^{-})^{\top }\nabla s^{m}(x(\hat{t}^{-})) = 0\), the trajectory is tangential to the manifold \(\mathcal{S}^{m}\) and it may or may not cross the manifold or it may even stay on the manifold. These cases are difficult to handle in general and bifurcation and non-uniqueness issues may occur. Even if Assumption 8.1 holds, infinitely many switches (Zeno phenomenon) may occur with \(\lim _{i\rightarrow \infty }(\hat{t}_{i+1} -\hat{ t}_{i}) = 0\), where the \(\hat{t}_{i}\)’s denote the switching times. The continuation of the trajectory beyond such an accumulation point (the Zeno point) is nontrivial in general. Often, the trajectory is continued such that it stays on the switching manifold.

The simulation of a hybrid system subject to Assumption 8.1 can be performed as follows:

Algorithm 1 (Hybrid System Simulation Using Switching Functions)

  (0)

    Init: Choose a consistent initial value z h (t 0) = (x h (t 0), y h (t 0)) at t = t 0, an initial mode m 0 ∈ M with \(s^{m_{0}}(x_{h}(t_{0})) > 0\), a final time t f  > t 0, and set k = 0.

  (1)

    Stop the integration, if t k  = t f .

  (2)

    Perform one step of a numerical integration scheme with a suitable step-size h to the DAE

    $$\displaystyle\begin{array}{rcl} x^{{\prime}}(t)& =& f^{m_{k} }(x(t),y(t)), {}\\ 0& =& g^{m_{k} }(x(t),y(t)), {}\\ \end{array}$$

    and compute the approximation z h (t k+1) = (x h (t k+1), y h (t k+1)) at time t k+1 = min{t f , t k + h}.

  (3)

    If \(s^{m_{k}}(x_{ h}(t_{k+1})) > 0\), set m k+1 = m k , k ← k + 1, and go to (1). Otherwise go to (4).

  (4)

    If \(s^{m_{k}}(x_{ h}(t_{k+1})) = 0\), find \(\tilde{m}\) with \(x_{h}(t_{k+1}) \in X^{m_{k}\rightarrow \tilde{m}}\), update the state by

    $$\displaystyle{x_{h}(t_{k+1}) = x_{h}(t_{k+1}) + d^{m_{k}\rightarrow \tilde{m}}(x_{ h}(t_{k+1})),}$$

    compute a corresponding consistent initial value y h (t k+1), update the mode by \(m_{k+1} =\tilde{ m}\), set k ← k + 1, and go to (1). Otherwise go to (5).

  (5)

    If \(s^{m_{k}}(x_{ h}(t_{k+1})) < 0\), determine a step-size τ ∈ [0, 1] such that \(s^{m_{k}}(x_{ h}(t_{k} +\tau h)) = 0\) using the integration scheme in (2), set t k+1 = t k +τ h and z h (t k+1) = (x h (t k+1), y h (t k+1)), and go to (4).

In order to determine the step-size τ in step (5) of Algorithm 1, a root of the function

$$\displaystyle{r(\tau ):= s^{m_{k} }(x_{h}(t_{k} +\tau h))}$$

has to be found. If \(r(0) = s^{m_{k}}(x_{ h}(t_{k})) > 0\) and \(r(1) = s^{m_{k}}(x_{ h}(t_{k} + h)) < 0\), a root can be located by the bisection method with a linear rate of convergence. Such a root τ is guaranteed to exist in [0, 1] by the intermediate value theorem, if the mapping \(\zeta: [0,1]\longrightarrow \mathbb{R}^{n}\) with ζ(τ): = x h (t k +τ h) is continuous. If ζ is continuously differentiable, then we may apply Newton’s method in order to find a root of r and hope for an at least super-linear convergence rate. The Newton iteration reads

$$\displaystyle{\tau _{\ell+1} =\tau _{\ell} - \frac{r(\tau _{\ell})} {r^{{\prime}}(\tau _{\ell})},\qquad \ell = 0,1,2,\ldots,}$$

where

$$\displaystyle{r^{{\prime}}(\tau ) =\zeta ^{{\prime}}(\tau )^{\top }\nabla s^{m_{k} }(\zeta (\tau )).}$$

Note that the iteration is well defined if Assumption 8.1 holds. In fact, the weaker condition \(\zeta ^{{\prime}}(\hat{\tau })^{\top }\nabla s^{m_{k}}(\zeta (\hat{\tau }))\not =0\) at a root \(\hat{\tau }\) of r would be sufficient for a locally super-linear convergence of Newton’s method. Often, interpolation techniques are used to compute ζ(τ) and ζ ′(τ) approximately in order to avoid the frequent evaluation of the discretization scheme, see [29, Sect. 5.3.3] for further details.
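The following Python sketch combines the loop of Algorithm 1 with a bisection on r(τ). It is an illustration only: all names are invented, the one-dimensional bouncing-ball test case is a pure ODE with a single mode, and a production code would employ a DAE integrator together with a consistent re-initialization of the algebraic variables after each mode transition.

```python
import numpy as np

# Bouncing-ball test case (a pure ODE, used only to illustrate Algorithm 1):
# single mode, state x = (height, velocity), switching function s(x) = x[0],
# jump d(x) = (0, -(1+eps)*x[1]) modelling a partially elastic bounce.
g, eps = 9.81, 0.9

def f(x):             return np.array([x[1], -g])
def s(x):             return x[0]
def jump(x):          return np.array([0.0, -(1.0 + eps) * x[1]])
def euler_step(x, h): return x + h * f(x)       # stand-in for a DAE integrator step

def simulate(x0, t0, tf, h):
    t, x = t0, x0.copy()
    trajectory = [(t, x.copy())]
    while t < tf:
        h_eff = min(h, tf - t)
        x_trial = euler_step(x, h_eff)
        if s(x_trial) > 0.0:                    # step (3): remain in the current mode
            t, x = t + h_eff, x_trial
        else:                                   # step (5): locate tau with r(tau) = 0
            lo, hi = 0.0, 1.0                   # bisection on r(tau) = s(x_h(t_k + tau*h))
            for _ in range(60):
                mid = 0.5 * (lo + hi)
                if s(euler_step(x, mid * h_eff)) > 0.0:
                    lo = mid
                else:
                    hi = mid
            t, x = t + lo * h_eff, euler_step(x, lo * h_eff)
            x = x + jump(x)                     # step (4): apply the jump condition
        trajectory.append((t, x.copy()))
    return trajectory

traj = simulate(np.array([1.0, 0.0]), 0.0, 3.0, 1e-3)
print(traj[-1])
```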

Example 8.1

Let x = (x 1, x 2, x 3, x 4) and y = (y 1, y 2) be the differential and algebraic states of a pendulum of mass 1 and length 1 with a wall described by the switching function \(s^{1}(x) = x_{2} + \frac{1} {2}\) for mode 1 and the set

$$\displaystyle{\mathcal{S}^{1} =\{ (x_{ 1},x_{2},x_{3},x_{4})^{\top }\;\vert \;s^{1}(x_{ 2}) \geq 0\}.}$$

In mode 1 (free mode) the pendulum moves according to the DAE (GGL-stabilization)

$$\displaystyle\begin{array}{rcl} x_{1}^{{\prime}}(t)& =& x_{ 3}(t) - 2x_{1}(t)y_{2}(t), {}\\ x_{2}^{{\prime}}(t)& =& x_{ 4}(t) - 2x_{2}(t)y_{2}(t), {}\\ x_{3}^{{\prime}}(t)& =& \phantom{-g} - 2x_{ 1}(t)y_{1}(t), {}\\ x_{4}^{{\prime}}(t)& =& -g - 2x_{ 2}(t)y_{1}(t), {}\\ 0& =& x_{1}(t)^{2} + x_{ 2}(t)^{2} - 1, {}\\ 0& =& x_{1}(t)x_{3}(t) + x_{2}(t)x_{4}(t). {}\\ \end{array}$$

If the position (x 1, x 2) hits the boundary of \(\mathcal{S}^{1}\) at some \(\hat{t}\), i.e., if \(x_{2}(\hat{t}) = -\frac{1} {2}\), the jump condition

$$\displaystyle{\left (\begin{array}{c} x_{3}(\hat{t}^{+}) \\ x_{4}(\hat{t}^{+}) \end{array} \right ) = \left (\begin{array}{c} x_{3}(\hat{t}^{-}) \\ x_{4}(\hat{t}^{-}) \end{array} \right )-(1+\varepsilon )\left (\begin{array}{c} x_{3}(\hat{t}^{-}) \\ x_{4}(\hat{t}^{-}) \end{array} \right ) = -\varepsilon \left (\begin{array}{c} x_{3}(\hat{t}^{-}) \\ x_{4}(\hat{t}^{-}) \end{array} \right )}$$

applies and the mode remains unchanged. Herein, ɛ ∈ [0, 1] is the elasticity constant.

Figure 12 shows the results of a simulation of the contact problem with the code DASRT using switching functions in the time interval [0, 6] with ɛ = 0.9, initial state x(0) = (1, 0, 0, 0), y(0) = (0, 0), and error tolerance \(10^{-10}\) for the differential states.

Fig. 12  Numerical simulation of pendulum with contact surface and switching functions

The results illustrate the Zeno phenomenon since the sequence of contact points accumulates. A natural continuation beyond the accumulation point is the constant solution with the pendulum being at rest on the switching manifold.

8.2 Parametric Sensitivity Analysis for Switched Systems

We add a parameter vector \(p \in \mathbb{R}^{q}\) to the problem setting in Sect. 8.1 and investigate the sensitivity of a solution of the hybrid system with respect to the parameter vector. To this end, let the state evolve according to the parameter-dependent DAE

$$\displaystyle\begin{array}{rcl} x^{{\prime}}(t)& =& f^{m}(x(t),y(t),p), {}\\ 0& =& g^{m}(x(t),y(t),p) {}\\ \end{array}$$

in mode m ∈ M within the set

$$\displaystyle{\mathcal{S}^{m}(\,p):=\{ x \in X\;\vert \;s^{m}(x,p) \geq 0\},\qquad s^{m}: \mathbb{R}^{n} \times \mathbb{R}^{q}\longrightarrow \mathbb{R}.}$$

In case of a transition from mode m to \(\tilde{m}\) at time \(\hat{t}\), the following jump condition applies to the differential state:

$$\displaystyle{ x(\hat{t}^{+}) = x(\hat{t}^{-}) + d^{m\rightarrow \tilde{m}}(x(\hat{t}^{-}),p). }$$
(8.2)

Herein, \(d^{m\rightarrow \tilde{m}}: X \times \mathbb{R}^{q}\longrightarrow X\) denotes the parametric jump function for a transition from mode m to mode \(\tilde{m}\), where a transition is possible if \(x(\hat{t}^{-})\) belongs to some set \(X^{m\rightarrow \tilde{m}}(\,p) \subseteq X\). The jump function d in (8.2) has to be chosen such that it provides consistent differential states \(x(\hat{t}^{+})\).

The functions f m, g m, s m, m ∈ M, and \(d^{m\rightarrow \tilde{m}}\), \(m,\tilde{m} \in M\), are supposed to be at least continuously differentiable with respect to all arguments.

Let z(t; p) = (x(t; p), y(t; p)) for t ∈ [t 0, t f ] denote a solution of the switched system for the parameter p with a consistent initial value z(t 0; p) = z 0( p) = (x 0( p), y 0( p)) in mode m with s m(x 0( p), p) > 0. In particular, let \(\hat{z}(t):= (\hat{x}(t),\hat{y}(t))^{\top }\) with \(\hat{z}(t) = z(t;\hat{p})\) and \(\hat{z}_{0} = z_{0}(\hat{p})\) denote the solution for a fixed nominal parameter \(\hat{p} \in \mathbb{R}^{q}\).

We are interested in computing the sensitivities

$$\displaystyle{S_{x}(t):= \frac{\partial x} {\partial p}(t;\hat{p}),\qquad S_{y}(t):= \frac{\partial y} {\partial p}(t;\hat{p})}$$

assuming their existence in [t 0, t f ].

While in mode m with \(s^{m}(x(t;\hat{p}),\hat{p}) > 0\), the sensitivities solve the sensitivity DAE

$$\displaystyle\begin{array}{rcl} S_{x}^{{\prime}}(t)& =& A^{m}(t)S_{ x}(t) + B^{m}(t)S_{ y}(t) + c^{m}(t), {}\\ 0& =& G^{m}(t)S_{ x}(t) + H^{m}(t)S_{ y}(t) + k^{m}(t), {}\\ \end{array}$$

with

$$\displaystyle\begin{array}{rcl} A^{m}(t)&:=& \frac{\partial f^{m}} {\partial x} (\hat{x}(t),\hat{y}(t),\hat{p}),\qquad G^{m}(t):= \frac{\partial g^{m}} {\partial x} (\hat{x}(t),\hat{y}(t),\hat{p}), {}\\ B^{m}(t)&:=& \frac{\partial f^{m}} {\partial y} (\hat{x}(t),\hat{y}(t),\hat{p}),\qquad H^{m}(t):= \frac{\partial g^{m}} {\partial y} (\hat{x}(t),\hat{y}(t),\hat{p}), {}\\ c^{m}(t)&:=& \frac{\partial f^{m}} {\partial p} (\hat{x}(t),\hat{y}(t),\hat{p}),\qquad k^{m}(t):= \frac{\partial g^{m}} {\partial p} (\hat{x}(t),\hat{y}(t),\hat{p}), {}\\ \end{array}$$

compare Sect. 7.

We investigate what happens at a switching point \(\hat{t}\) in mode m with parameter \(\hat{p}\). Then, we have

$$\displaystyle{ s^{m}(x(\hat{t}^{-};\hat{p});\hat{p}) = 0. }$$
(8.3)

In general, this switching point will depend on the parameter. Define

$$\displaystyle{r(t,p):= s^{m}(x(t^{-};p),p)}$$

for p close to \(\hat{p}\). Equation (8.3) implies

$$\displaystyle{\hat{r}:= r(\hat{t},\hat{p}) = 0.}$$

Assumption 8.2

The switching point \(\hat{t}\) in mode m satisfies

$$\displaystyle{0\not =\frac{\partial r} {\partial t} (\hat{t},\hat{p}) = \frac{d} {dt}s^{m}(x(\hat{t}^{-};\hat{p});\hat{p}) = x^{{\prime}}(\hat{t}^{-};\hat{p})^{\top }\nabla _{ x}s^{m}(x(\hat{t}^{-};\hat{p});\hat{p}).}$$

If Assumption 8.2 holds, then, by the implicit function theorem, there exist neighborhoods \(B_{\delta }(\hat{p})\), δ > 0, \(B_{\varepsilon }(\hat{t})\), ɛ > 0, and a continuously differentiable mapping \(T: B_{\delta }(\hat{p})\longrightarrow B_{\varepsilon }(\hat{t})\) with

$$\displaystyle{\hat{t} = T(\hat{p})\qquad \mbox{ and}\qquad r(T(\,p),p) = 0\quad \forall p \in B_{\delta }(\hat{p}),}$$

and

$$\displaystyle\begin{array}{rcl} T^{{\prime}}(\hat{p})& =& -\left (\frac{\partial r} {\partial t} (\hat{t},\hat{p})\right )^{-1}\frac{\partial r} {\partial p}(\hat{t},\hat{p}) {}\\ \end{array}$$

with

$$\displaystyle\begin{array}{rcl} \frac{\partial r} {\partial t} (\hat{t},\hat{p})& =& x^{{\prime}}(\hat{t}^{-};\hat{p})^{\top }\nabla _{ x}s^{m}(x(\hat{t}^{-};\hat{p});\hat{p}), {}\\ \frac{\partial r} {\partial p}(\hat{t},\hat{p})& =& \nabla _{x}s^{m}(\hat{x}(\hat{t}^{-});\hat{p})^{\top }S_{ x}(\hat{t}^{-}) + \nabla _{ p}s^{m}(\hat{x}(\hat{t}^{-});\hat{p}). {}\\ \end{array}$$

Introducing T( p) into (8.2) yields the relation

$$\displaystyle{ x(T(\,p)^{+};p) = x(T(\,p)^{-};p) + d^{m\rightarrow \tilde{m}}(x(T(\,p)^{-};p),p). }$$
(8.4)

Herein, we assume that the transition \(m \rightarrow \tilde{ m}\) is stable under small perturbations in p. Differentiation of (8.4) with respect to p and evaluation at \(\hat{p}\) yields

$$\displaystyle\begin{array}{rcl} x^{{\prime}}(\hat{t}^{+};\hat{p})T^{{\prime}}(\hat{p}) + S_{ x}(\hat{t}^{+})& =& x^{{\prime}}(\hat{t}^{-};\hat{p})T^{{\prime}}(\hat{p}) + S_{ x}(\hat{t}^{-}) {}\\ & & +\nabla _{x}d^{m\rightarrow \tilde{m}}(\hat{x}(t^{-}),\hat{p})^{\top }\left (x^{{\prime}}(\hat{t}^{-};\hat{p})T^{{\prime}}(\hat{p}) + S_{ x}(\hat{t}^{-})\right ) {}\\ & & +\nabla _{p}d^{m\rightarrow \tilde{m}}(\hat{x}(t^{-}),\hat{p})^{\top }. {}\\ \end{array}$$

Rearranging terms leads to an update rule for the sensitivity S x at the switching point \(\hat{t}\) according to

$$\displaystyle\begin{array}{rcl} S_{x}(\hat{t}^{+})& =& \left (x^{{\prime}}(\hat{t}^{-};\hat{p}) - x^{{\prime}}(\hat{t}^{+};\hat{p}) + \nabla _{ x}d^{m\rightarrow \tilde{m}}(\hat{x}(t^{-}),\hat{p})^{\top }x^{{\prime}}(\hat{t}^{-};\hat{p})\right )T^{{\prime}}(\hat{p}) \\ & & +\left (I + \nabla _{x}d^{m\rightarrow \tilde{m}}(\hat{x}(t^{-}),\hat{p})^{\top }\right )S_{ x}(\hat{t}^{-}) \\ & & +\nabla _{p}d^{m\rightarrow \tilde{m}}(\hat{x}(t^{-}),\hat{p})^{\top }. {}\end{array}$$
(8.5)

If x is continuous at \(\hat{t}\), i.e., d ≡ 0, then the update rule for S x reduces to

$$\displaystyle\begin{array}{rcl} S_{x}(\hat{t}^{+})& =& S_{ x}(\hat{t}^{-}) + \left (x^{{\prime}}(\hat{t}^{-};\hat{p}) - x^{{\prime}}(\hat{t}^{+};\hat{p})\right )T^{{\prime}}(\hat{p}). {}\\ \end{array}$$

If, in addition, x ′ is continuous at \(\hat{t}\), then S x is continuous at \(\hat{t}\) as well.

After \(x(\hat{t}^{+})\) and \(S_{x}(\hat{t}^{+})\) have been computed by (8.2) and (8.5), the algebraic component \(S_{y}(\hat{t}^{+})\) has to be computed consistently.

Note that this update rule is only valid under Assumption 8.2, which ensures a proper crossing of the switching manifold. If Assumption 8.2 does not hold at \(\hat{t}\), it is not clear how to update the sensitivity S x and, in general, the state need not be continuously differentiable with respect to p.
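Once the one-sided limits x′(t̂ ∓; p̂), the gradients of s m, the Jacobians of \(d^{m\rightarrow \tilde{m}}\), and S x (t̂ −) are available, the update (8.5) is plain linear algebra. The following sketch (illustrative names, dense NumPy arrays, Assumption 8.2 assumed to hold) assembles T′(p̂) and S x (t̂ +):

```python
import numpy as np

def sensitivity_jump(xdot_minus, xdot_plus, Sx_minus,
                     grad_x_s, grad_p_s, dd_dx, dd_dp):
    """Sensitivity update (8.5) across a switching point (illustrative sketch).

    xdot_minus, xdot_plus : x'(t^-), x'(t^+)                      shape (n,)
    Sx_minus              : S_x(t^-)                              shape (n, q)
    grad_x_s, grad_p_s    : gradients of s^m w.r.t. x and p       shapes (n,), (q,)
    dd_dx, dd_dp          : Jacobians of the jump function d      shapes (n, n), (n, q)
    """
    # T'(p) via the implicit function theorem; Assumption 8.2 ensures dr_dt != 0
    dr_dt = xdot_minus @ grad_x_s
    dr_dp = Sx_minus.T @ grad_x_s + grad_p_s
    T_prime = -dr_dp / dr_dt
    n = xdot_minus.size
    Sx_plus = (np.outer(xdot_minus - xdot_plus + dd_dx @ xdot_minus, T_prime)
               + (np.eye(n) + dd_dx) @ Sx_minus
               + dd_dp)
    return Sx_plus

# With dd_dx = 0 and dd_dp = 0 (continuous x), the result reduces to
# S_x(t^+) = S_x(t^-) + (x'(t^-) - x'(t^+)) T'(p), as stated above.
```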

A related parametric sensitivity analysis for mechanical multibody systems using switching functions can be found in [144, Sect. 3.9] and [112]. An adjoint calculus for switched DAEs is derived in [114].

Remark 8.1

Please note that Assumptions 8.1 and 8.2 are crucial in the above analysis. They are often required, explicitly or implicitly, by standard integrators like DASRT or SCILAB/DASKR. The user needs to be aware of this since codes may fail if the assumptions are not met. As pointed out earlier, it is in general not clear how to continue the integration (especially in the context of sensitivity analysis) if the assumptions are not satisfied. In case of the Zeno phenomenon, it is often assumed that the solution stays on the switching manifold. Modifications in the codes are necessary in such cases.

8.3 Contact and Friction in Mechanical Multibody Systems

Mechanical multibody dynamics taking into account contact forces and friction forces are, beyond doubt, the most important examples of switched systems and include particular impact and friction models, compare, e.g., [3, 139]. Suitable discretization schemes for such systems, which typically do not locate impact points exactly but work with a fixed step-size instead, are introduced in [2, 6, 113]. Extensions towards large-scale systems and tailored algorithms for complementarity problems can be found in [5, 137, 138, 140]. Impact models and the interpretation of the mechanical multibody system as a measure differential equation can be found in [65, 88, 102].

The equations of motion of a mechanical multibody system with contact and friction are given by

$$\displaystyle\begin{array}{rcl} q^{{\prime}}(t)& =& v(t), {}\\ M(q(t))v^{{\prime}}(t)& =& f(q(t),v(t)) + F_{ C}(q(t)), {}\\ \end{array}$$

In the above model, \(q(t) \in \mathbb{R}^{n}\) denotes the vector of generalized coordinates, v its velocity, f(q, v) the vector of generalized forces, and M(q) the non-singular mass matrix.

The above multibody system is augmented by an impact model that relates the velocity \(v(\hat{t}^{-})\) right before an impact to the velocity \(v(\hat{t}^{+})\) right after the impact. The impact model typically leads to a discontinuity of some components of the velocity vector v at a contact point \(\hat{t}\); hence, the velocity components are in general only of bounded variation. The vector F C (q) contains the contact and friction forces, which apply only in the case of a contact between the rigid bodies of the multibody system, compare [60, 132]. Whether or not a contact between bodies occurs is measured by distance functions \(s_{k}: \mathbb{R}^{n}\longrightarrow \mathbb{R}\) with

$$\displaystyle{s_{k}(q) \geq 0,\qquad k = 1,\ldots,m.}$$

Herein, a contact at time \(\hat{t}\) occurs if \(s_{k}(q(\hat{t})) = 0\) for some k ∈ { 1, …, m}. In case of a contact, the resulting contact and friction force F C, k (q) is an element of the friction cone

$$\displaystyle{FC_{k}(q):=\{ F^{n} + F^{t}\;\vert \;F^{n} = n_{ k}(q)\lambda,F^{t} = D_{ k}(q)\beta,\lambda \geq 0,\psi (\beta ) \leq \mu _{k}\lambda \}.}$$

Herein, F n = F k n(q) denotes the contact force in the normal direction of the contact surface, which can be expressed as F k n(q) = n k (q)λ k with the normal vector n k (q) = ∇s k (q) to the contact manifold \(\mathcal{S}_{k}(q) =\{ q \in \mathbb{R}^{n}\;\vert \;s_{k}(q) = 0\}\). The multiplier λ k satisfies the Signorini contact conditions

$$\displaystyle{0 \leq s_{k}(q)\qquad \perp \qquad \lambda _{k} \geq 0,}$$

which is a complementarity system for λ k . The operator ⊥ in 0 ≤ a ⊥ b ≥ 0 is defined by a ≥ 0, b ≥ 0, and ab = 0.

The force F t = F k t(q) is the tangential force owing to friction in the contact manifold, which can be expressed as F k t(q) = D k (q)β k , where the columns of the matrix D k (q) span the friction space. For isotropic Coulomb friction, which we assume throughout, the function ψ is given by ψ(β): =  ∥ β ∥ 2 and μ k  ≥ 0 is the friction coefficient at the contact manifold \(\mathcal{S}_{k}(q)\). The norm ∥ ⋅ ∥ 2 causes some difficulties as ∥ β k  ∥ 2 ≤ μ k λ k leads to a non-smooth constraint. To overcome this difficulty, the norm ∥ ⋅ ∥ 2 is typically approximated by ∥ ⋅ ∥ 1, which leads to the following polyhedral approximation of the friction cone:

$$\displaystyle{\overline{FC}_{k}(q):=\{ F^{n} + F^{t}\;\vert \;F^{n} = n_{ k}(q)\lambda,F^{t} = D_{ k}(q)\beta,\lambda \geq 0,\|\beta \|_{1} \leq \mu _{k}\lambda \},}$$

compare [60, 132].

Depending on the choice of the friction cone, the total contact force is then defined by

$$\displaystyle{F_{C}(q) =\sum _{k:s_{k}(q)=0}F_{C,k}(q)}$$

with F C (q) being an element either of the total friction cone

$$\displaystyle{FC(q) =\sum _{k:s_{k}(q)=0}FC_{k}(q)}$$

or its polyhedral approximation

$$\displaystyle{\overline{FC}(q) =\sum _{k:s_{k}(q)=0}\overline{FC}_{k}(q).}$$

If a contact occurs at time \(\hat{t}\), i.e., \(s_{k}(q(\hat{t})) = 0\) for some k ∈ { 1, …, m}, the impact model

$$\displaystyle\begin{array}{rcl} \nabla s_{k}(q(\hat{t}))^{\top }(v(\hat{t}^{+}) +\varepsilon _{ k}v(\hat{t}^{-}))& =& 0 {}\\ \end{array}$$

applies. Herein, ɛ k  ∈ [0, 1] denotes the elasticity constant. A fully elastic contact occurs if ɛ k  = 1. An inelastic contact occurs if ɛ k  = 0.

An approach to determine the friction force \(F_{k}^{t} = D_{k}(q(\hat{t}))\beta _{k}\) at a contact is based on the maximum dissipation principle, compare [132]. Herein, the corresponding friction force F k t maximizes the rate of energy dissipation for a given normal force \(F_{k}^{n} = n_{k}(q(\hat{t}))\lambda _{k}\) at a contact. This principle leads to the following convex optimization problem for β:

$$\displaystyle{ \mbox{ Maximize}\qquad - v(\hat{t}^{+})^{\top }D_{ k}(q(\hat{t}))\beta \qquad \mbox{ s.t.}\qquad \psi (\beta ) \leq \mu _{k}\lambda _{k}. }$$
(8.6)

Let β k be a solution of (8.6). If λ k  > 0, then β = 0 satisfies the Slater condition for this convex optimization problem, and a necessary and sufficient condition for the solution β k reads as follows, compare [38, Theorem 6.1.1, Proposition 6.3.1]: There exists a multiplier \(\zeta _{k} \in \mathbb{R}\) such that

$$\displaystyle\begin{array}{rcl} 0& \in & D_{k}(q(\hat{t}))^{\top }v(\hat{t}^{+}) +\zeta _{ k}\partial \psi (\beta _{k}),{}\end{array}$$
(8.7)
$$\displaystyle\begin{array}{rcl} 0& \leq & \zeta _{k}\quad \perp \quad \mu _{k}\lambda _{k} -\psi (\beta _{k}) \geq 0.{}\end{array}$$
(8.8)

Herein, ∂ ψ = ∂ ( ∥ ⋅ ∥ 2) denotes the generalized gradient of the locally Lipschitz continuous function ψ, which is given by

$$\displaystyle{\partial \psi (\beta ) = \left \{\begin{array}{ll} \frac{1} {\|\beta \|_{2}} \beta, &\mbox{ if }\beta \not =0, \\ \{\alpha \;\vert \;\|\alpha \|_{2} \leq 1\},&\mbox{ if }\beta = 0. \end{array} \right.}$$

If λ k  = 0, then β = 0 is the only feasible point in (8.6) and the conditions (8.7)–(8.8) are satisfied, e.g., by choosing

$$\displaystyle{\zeta _{k} =\| D_{k}(q(\hat{t}))^{\top }v(\hat{t}^{+})\|_{ 2}\quad \mbox{ and}\quad \alpha = \left \{\begin{array}{ll} -\frac{1} {\zeta _{k}}D_{k}(q(\hat{t}))^{\top }v(\hat{t}^{+}),&\ \mbox{ if}\ \zeta _{k} > 0, \\ 0, &\ \mbox{ if}\ \zeta _{k} = 0. \end{array} \right.}$$

Note that in either case α ∈ ∂ ψ(0). Instead of ψ(β) =  ∥ β ∥ 2 we may use the approximation ∥ β ∥ 1 in (8.6), which in fact transforms the convex optimization problem into a linear program. To this end, β is replaced by β = β + − β − with β +, β − ≥ 0:

$$\displaystyle{ \begin{array}{ll} \mbox{ Maximize}& - v(\hat{t}^{+})^{\top }D_{k}(q(\hat{t}))(\beta ^{+} -\beta ^{-}) \\ \mbox{ s.t.} &e^{\top }(\beta ^{+} +\beta ^{-}) \leq \mu _{k}\lambda _{k},\ \beta ^{+} \geq 0,\ \beta ^{-}\geq 0.\end{array} }$$
(8.9)

Herein, we exploited the relation ∥ β ∥ 1 = e ⊤(β + + β −) with the vector e = (1, …, 1) of all ones of appropriate dimension. First order necessary and sufficient conditions for a solution β k  = β k + − β k − of (8.9) yield

$$\displaystyle\begin{array}{rcl} 0& =& \left [D_{k}(q(\hat{t})),-D_{k}(q(\hat{t}))\right ]^{\top }v(\hat{t}^{+}) +\zeta _{ k}\left (\begin{array}{c} e\\ e \end{array} \right ) -\left (\begin{array}{c} \eta _{k}^{+} \\ \eta _{k}^{-} \end{array} \right ), {}\\ 0& \leq & \zeta _{k}\;\;\quad \perp \quad \mu _{k}\lambda _{k} - e^{\top }(\beta _{ k}^{+} +\beta _{ k}^{-}) \geq 0, {}\\ 0& \leq & \beta _{k}^{\pm }\quad \perp \quad \eta _{ k}^{\pm }\geq 0. {}\\ \end{array}$$

Multiplication of the first equation by (β k +, β k −) from the left and exploitation of the complementarity conditions yield

$$\displaystyle\begin{array}{rcl} 0& \leq & \tilde{\beta }_{k}\quad \perp \quad \tilde{D}_{k}(q(\hat{t}))^{\top }v(\hat{t}^{+}) +\zeta _{ k}e \geq 0, {}\\ 0& \leq & \zeta _{k}\quad \perp \quad \mu _{k}\lambda _{k} - e^{\top }\tilde{\beta }_{ k} \geq 0 {}\\ \end{array}$$

with

$$\displaystyle{\tilde{D}_{k}(q(\hat{t})):= \left [D_{k}(q(\hat{t})),-D_{k}(q(\hat{t}))\right ],\quad \tilde{\beta }_{k} = [\beta _{k}^{+},\beta _{ k}^{-}]^{\top }.}$$

Note that the matrix \(\tilde{D}_{k}\) is balanced, i.e., if \(\tilde{D}_{k}\) contains a column c, then it contains − c as well. Summarizing, the equations of motion with contact and friction forces satisfy the following complementarity system:

$$\displaystyle\begin{array}{rcl} q^{{\prime}}(t)& =& v(t), {}\\ M(q(t))v^{{\prime}}(t)& =& f(q(t),v(t)) +\sum _{ k=1}^{m}n_{ k}(q(t))\lambda _{k}(t) + D_{k}(q(t))\beta _{k}(t), {}\\ 0& \leq & s_{k}(q(t))\quad \perp \quad \lambda _{k}(t) \geq 0, {}\\ 0& \leq & \tilde{\beta }_{k}(t)\quad \perp \quad \tilde{D}_{k}(q(t))^{\top }v(t^{+}) +\zeta _{ k}(t)e \geq 0, {}\\ 0& \leq & \zeta _{k}(t)\quad \perp \quad \mu _{k}\lambda _{k}(t) - e^{\top }\tilde{\beta }_{ k}(t) \geq 0 {}\\ \end{array}$$

and

$$\displaystyle\begin{array}{rcl} \nabla s_{k}(q(t))^{\top }(v(t^{+}) +\varepsilon _{ k}v(t^{-}))& =& 0\qquad \mbox{ if }s_{ k}(q(t)) = 0 {}\\ \end{array}$$

with k = 1, …, m. A semi-implicit discretization scheme for the system was suggested in [6, 132]. Let z ℓ = (q ℓ, v ℓ, λ ℓ, β ℓ, ζ ℓ) be the state at time t ℓ and h > 0 a step-size. Let the index set of active contacts at t ℓ be defined by

$$\displaystyle{A^{\ell}:=\{ k \in \{ 1,\ldots,m\}\;\vert \;s_{k}(q^{\ell} + hv^{\ell}) \leq 0\}.}$$

Let \(\lambda _{A^{\ell}}^{\ell+1} = (\lambda _{k}^{\ell+1})_{k\in A^{\ell}}\) and likewise for \(\tilde{\beta }_{A^{\ell}}^{\ell+1}\) and \(\zeta _{A^{\ell}}^{\ell+1}\).

Then \(z^{\ell+1} = (q^{\ell+1},v^{\ell+1},\lambda _{A^{\ell}}^{\ell+1},\tilde{\beta }_{A^{\ell}}^{\ell+1},\zeta _{A^{\ell}}^{\ell+1})\) solves the following complementarity problem:

$$\displaystyle\begin{array}{rcl} q^{\ell+1} - q^{\ell}& =& hv^{\ell+1},{}\end{array}$$
(8.10)
$$\displaystyle\begin{array}{rcl} M(q^{\ell+1})\left (v^{\ell+1} - v^{\ell}\right )& =& h\left (f(q^{\ell},v^{\ell}) +\sum _{ k\in A^{\ell}}n_{k}(q^{\ell})\lambda _{ k}^{\ell+1} +\tilde{ D}_{ k}(q^{\ell})\tilde{\beta }_{ k}^{\ell+1}\right ),\qquad \quad {}\end{array}$$
(8.11)
$$\displaystyle\begin{array}{rcl} 0 \leq \lambda _{k}^{\ell+1}& \perp & \nabla s_{ k}(q^{\ell})^{\top }(v^{\ell+1} +\varepsilon _{ k}v^{\ell}) \geq 0,\quad (k \in A^{\ell}){}\end{array}$$
(8.12)
$$\displaystyle\begin{array}{rcl} 0 \leq \tilde{\beta }_{k}^{\ell+1}& \perp & \tilde{D}_{ k}(q^{\ell})^{\top }v^{\ell+1} +\zeta _{ k}^{\ell+1}e \geq 0,\quad (k \in A^{\ell}){}\end{array}$$
(8.13)
$$\displaystyle\begin{array}{rcl} 0 \leq \zeta _{k}^{\ell+1}& \perp & \mu _{ k}\lambda _{k}^{\ell+1} - e^{\top }\tilde{\beta }_{ k}^{\ell+1} \geq 0.\quad (k \in A^{\ell}){}\end{array}$$
(8.14)

Convergence results and alternative discretizations are discussed in [4, 6, 60, 113].

It remains to discuss how the nonlinear complementarity problem can be solved. If M is independent of q or if M(q ℓ+1) were replaced by M(q ℓ), then the problem is actually a linear complementarity problem, compare [3, 46, 138, 139], and it could be solved by Lemke’s algorithm [95] or as in [5, 137]. Another approach is to use a semi-smooth Newton method, compare [86, 115, 116]. Herein, the complementarity system (8.10)–(8.14) is rewritten as the nonlinear and non-smooth equation

$$\displaystyle\begin{array}{rcl} 0& =& G(z^{\ell+1}) \\ & =& \left (\begin{array}{c} q^{\ell+1} - q^{\ell} - hv^{\ell+1} \\ M(q^{\ell+1})\left (v^{\ell+1} - v^{\ell}\right ) - h\left (f(q^{\ell},v^{\ell}) +\sum _{k\in A^{\ell}}n_{k}(q^{\ell})\lambda _{k}^{\ell+1} +\tilde{ D}_{k}(q^{\ell})\tilde{\beta }_{k}^{\ell+1}\right ) \\ \phi _{FB}(\lambda _{k}^{\ell+1},\nabla s_{k}(q^{\ell})^{\top }(v^{\ell+1} +\varepsilon _{k}v^{\ell}))\quad (k \in A^{\ell}) \\ \phi _{FB}(\tilde{\beta }_{k}^{\ell+1},\tilde{D}_{k}(q^{\ell})^{\top }v^{\ell+1} +\zeta _{ k}^{\ell+1}e)\quad (k \in A^{\ell}) \\ \phi _{FB}(\zeta _{k}^{\ell+1},\mu _{k}\lambda _{k}^{\ell+1} - e^{\top }\tilde{\beta }_{k}^{\ell+1})\quad (k \in A^{\ell}) \end{array} \right )\!,{}\end{array}$$
(8.15)

where \(\phi _{FB}(a,b):= \sqrt{a^{2 } + b^{2}} - a - b\) denotes the Lipschitz continuous Fischer-Burmeister function, see [55]. Let ∂ G(z) denote Clarke’s generalized Jacobian of G, compare [38] and [70] for details on how to compute it. Then, a root of G can be obtained by the following basic version of the semi-smooth Newton method.

Algorithm 2

Semi-Smooth Newton Method

  (0)

    Init: Choose tolerance tol > 0 and an initial guess for z ℓ+1, e.g., z (0) = (q ℓ + hv ℓ, v ℓ, 0, 0, 0). Set j = 0.

  (1)

    If ∥ G(z ( j)) ∥ ≤ tol, set z ℓ+1 = z ( j) and STOP.

  (2)

    Compute an element V ( j) ∈ ∂ G(z ( j)) and the Newton direction d ( j) by solving the linear equation

    $$\displaystyle{V ^{(\,j)}d = -G(z^{(\,j)}).}$$
  (3)

    Set z ( j+1) = z ( j) + d ( j), j ← j + 1, and go to (1).
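To illustrate the core mechanism of Algorithm 2 without the full contact assembly (8.15), the following sketch applies the semi-smooth Newton method with the Fischer-Burmeister function to a small linear complementarity problem 0 ≤ x ⟂ Mx + q ≥ 0. The problem data and the particular choice of the generalized-Jacobian element at the kink are illustrative assumptions; they do not reproduce the discretization (8.10)–(8.14).

```python
import numpy as np

def phi_fb(a, b):
    """Fischer-Burmeister function phi_FB(a, b) = sqrt(a^2 + b^2) - a - b."""
    return np.sqrt(a**2 + b**2) - a - b

def semismooth_newton_lcp(M, q, x0, tol=1e-10, itmax=50):
    """Solve 0 <= x  perp  M x + q >= 0 via G(x)_i = phi_FB(x_i, (M x + q)_i) = 0."""
    x = x0.copy()
    for _ in range(itmax):
        w = M @ x + q
        G = phi_fb(x, w)
        if np.linalg.norm(G) <= tol:
            break
        r = np.sqrt(x**2 + w**2)
        safe = np.where(r > 1e-14, r, 1.0)
        da = x / safe - 1.0                 # d phi_FB / d a
        db = w / safe - 1.0                 # d phi_FB / d b
        da[r <= 1e-14] = -1.0               # a valid element at the kink (a, b) = (0, 0)
        db[r <= 1e-14] = -1.0
        V = np.diag(da) + np.diag(db) @ M   # element of the generalized Jacobian of G
        x = x + np.linalg.solve(V, -G)      # semi-smooth Newton step
    return x

# small test: M positive definite, so the LCP has a unique solution
M = np.array([[2.0, 1.0], [1.0, 2.0]])
q = np.array([-1.0, 1.0])
x = semismooth_newton_lcp(M, q, np.zeros(2))
print(x, M @ x + q)     # both vectors nonnegative with x_i * (M x + q)_i = 0
```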

The following examples summarize results that have been obtained by applying Algorithm 2 to mechanical multibody systems with contact and friction.

Example 8.2

Consider a bouncing and rotating ball with radius r = 1, mass m = 1, and moment of inertia J = 1 in the xz-plane with q = (x, z, θ), M = diag(m, m, J), f(q, q′) = (0, −mg, 0), g = 9.81, and s(q) = z − r. The friction space is spanned by

$$\displaystyle{\tilde{D}(q) = \left (\begin{array}{rr} - 1& 1\\ 0 & 0 \\ r& - r \end{array} \right ).}$$

Figure 13 shows a simulation of the bouncing ball in the time interval [0, 10] with initial state q(0) = (0, 10, 0), v(0) = (1, 10, −5), friction coefficient μ = 0.2, and elasticity parameter ɛ = 0.675. The states q, v, λ, β, and ζ as functions of time are depicted in Fig. 14.

Fig. 13  Snapshot of a bouncing and rotating ball with contact and friction

Fig. 14  States q, v, λ, β, and ζ of the bouncing ball as functions of time

Example 8.3

Consider a billiard table and the motion of a sphere on the table in the xy-plane. For simplicity, the friction on the table is neglected, but friction forces and contact forces at the boundaries of the table are taken into account with elasticity constant ɛ = 0.9 and friction coefficient μ = 0.5. The radius of the sphere is r = 0.04 [m], its mass is m = 0.1 [kg], and its moment of inertia is J = 1. The generalized coordinates are q = (x, y, θ), the mass matrix is M = diag(m, m, J), the generalized forces are f(q, q′) = (0, 0, 0), and the switching function for the opposite boundary of the table is s(q) = y − r. The friction space is spanned by

$$\displaystyle{D(q) = \left (\begin{array}{rr} - 1& 1\\ 0 & 0 \\ r& - r \end{array} \right ).}$$

Figure 15 shows some snapshots of a simulation of the billiard problem in the time interval [0, 2.05] with initial state q(0) = (0, 3ℓ∕4, 0), v(0) = (0, −2, −11), where ℓ = 2.24 denotes the length of the table in [m]. The states q, v, λ, β, and ζ as functions of time are depicted in Fig. 16.

Fig. 15  Snapshot of a billiard table with contact and friction at the borders of the table. For better visibility the sphere was enlarged by a factor of two in the pictures

Fig. 16  States q, v, λ, β, and ζ of the billiard problem as functions of time

9 Conclusions

Simulation is a well-established and indispensable tool in industrial design procedures. Moreover, efficient simulation techniques are required in other disciplines such as controller design, parameter identification, or optimal control. This paper aims to provide an overview on different aspects in the simulation of DAE initial value problems. The focus was set on a choice of methods and concepts that are relevant in industrial simulation environments for coupled systems of potentially large size. These concepts build upon basic integration schemes and add features like sensitivity analysis (needed, e.g., in optimization procedures), contact dynamics, real-time schemes, or co-simulation techniques. Each of these topics is a field of research in its own right with many contributions. Only some of the many contributions could be mentioned in this overview paper and we refer the interested reader to the specialized literature in the bibliography.