1 Introduction

The analysis and structural optimization of flexible multibody systems can be actively supported and facilitated by providing sensitivity information. A simple and easy-to-implement way to compute gradients is numerical differentiation. With this method, the flexible multibody system can be treated as a black-box model, and no further information about the system is required. However, finite difference methods suffer from several deficiencies. For instance, the gradient is only an approximation, a suitable perturbation of the design variables is not known a priori, and the computational effort increases proportionally with the number of design variables. The latter point is of special importance for structural optimization. In particular, in large-scale topology optimization problems, the computational cost of finite differences is prohibitive.
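The cost argument above can be illustrated with a minimal sketch. It assumes a forward difference scheme and uses a hypothetical quadratic `objective` as a stand-in for a full multibody simulation; the function names and the step size `eps` are illustrative choices, not part of the method described in this paper.

```python
def forward_difference_gradient(f, x, eps=1e-6):
    """Approximate the gradient of f at x by forward differences.

    Returns the gradient and the number of calls to f.
    """
    f0 = f(x)                      # one baseline evaluation
    calls = 1
    grad = []
    for k in range(len(x)):
        xp = list(x)
        xp[k] += eps               # perturb one design variable at a time
        grad.append((f(xp) - f0) / eps)
        calls += 1
    return grad, calls


def objective(x):
    # Hypothetical stand-in for a complete time simulation of the system.
    return sum((k + 1) * xk ** 2 for k, xk in enumerate(x))


x = [1.0, 2.0, 3.0]
grad, calls = forward_difference_gradient(objective, x)
print(grad)   # close to the exact gradient [2.0, 8.0, 18.0], but not equal to it
print(calls)  # h + 1 = 4 evaluations for h = 3 design variables
```

Each additional design variable costs one more evaluation of the objective, which in the present context would mean one more time simulation of the flexible multibody system.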

Besides finite difference methods, analytical approaches such as the direct method and the adjoint variable method have been developed for the gradient computation of rigid and flexible multibody systems; see [2, 5]. The basic idea is to deduce from the variation of the objective function and the dynamic problem a set of additional equations that allow an exact gradient computation. Thereby, for systems with a large number of design variables, the adjoint variable method is often computationally more efficient than the direct method. For this reason, it is frequently used for the gradient computation in rigid multibody systems and flexible multibody systems, which are modeled using nonlinear finite element methods; see [4, 12, 16] for a recent survey. In this paper, contrary to previous works, the adjoint variable method is applied for large-scale sensitivity analysis of flexible multibody systems, which are modeled using the floating frame of reference approach. Such large-scale problems arise, for instance, in the topology optimization of flexible members of multibody systems; see [10].

In the derivation of the adjoint variable method, the type and structure of the dynamic problem determine the type and structure of the adjoint equations; see [3]. For instance, if the equations of motion are ODEs and if they can be explicitly expressed in terms of the generalized position and velocity coordinates, then applying the adjoint variable method also yields a set of adjoint ODEs; see [6]. In contrast, if the bodies of the multibody system are subjected to implicit constraint equations, for example, in the presence of kinematic loops, then algebraic equations have to be considered in the adjoint problem.

In [8] these constraint equations are taken into account at the position level, which leads to differential-algebraic equations (DAEs) for the adjoint problem. Alternatively, in [3] it is shown that considering the constraint equations at acceleration level leads to a system of adjoint ODEs and a set of auxiliary algebraic equations.

In this work, a further possibility for treating constraint equations in sensitivity analyses is described. Provided that the multibody system with kinematic loops is transferred to minimal coordinates by defining dependent and independent coordinates and applying a coordinate partitioning, the constraints can be incorporated directly in the adjoint problem. To this end, the variations of the dependent coordinates are systematically eliminated using the variations of the constraint equations at position and velocity level. In this way, both the equations of motion and the adjoint equations are ODEs only. The presented approach can, however, also be applied to systems without kinematic loops, such as systems in chain or tree structure.

The paper is organized in the following way. Section 2 addresses the structural analysis of flexible multibody systems. After a brief review of the floating frame of reference formulation, the dynamic problem is formulated, whereby the equations of motion are represented in minimal coordinates. Moreover, the dependencies of the dynamic problem on the design variables in case of structural parameterization of the flexible bodies are shown. In Sect. 3, the adjoint equations are derived for the given dynamic system. The procedure is tested by means of a flexible slider–crank mechanism in Sect. 4. Finally, Sect. 5 concludes with a brief summary and discussion.

2 Structural parameterized flexible multibody systems

The method of flexible multibody systems is a well-established approach to model and analyze mechanisms in which the single bodies undergo large rigid body motions and deformations. Next to rigid and flexible bodies, these systems are assembled from spring and damper elements, actuators, and ideal joints; see Fig. 1. If the deformations are comparatively small, then the floating frame of reference formulation can be used to efficiently incorporate flexible bodies into the multibody system; see [18, 19]. In the following, the basic equations of the floating frame of reference formulation are briefly reviewed, and the dynamic problem is formulated in minimal coordinates using a coordinate partitioning. Thereby, we assume that the flexible bodies are parameterized by the independent design variables \(\boldsymbol{x}\in \mathbb{R}^{h}\), whose influence on a scalar objective function \(\psi \in \mathbb{R}\) will be identified in the course of this work.

Fig. 1 Flexible multibody system

2.1 The floating frame of reference formulation

In the floating frame of reference formulation, the deformation of a flexible body \(i\) is described with regard to a reference frame \({\mathrm{K}}^{i}_{\mathrm{R}}\), which undergoes large translational and rotational motions; see Fig. 2. The absolute position vector \(\boldsymbol{r} _{\mathrm{IP}}^{i}\) of a point P of the flexible body can be written as

$$ \boldsymbol{r}_{\mathrm{IP}}^{i} = \boldsymbol{r}_{\mathrm{IR}}^{i} + \boldsymbol{c}_{\mathrm{RP}}^{i} + \boldsymbol{u}_{\mathrm{P}}^{i}. $$
(1)

Thereby, \(\boldsymbol{r}_{\mathrm{IR}}^{i}\) represents the motion of the reference frame, \(\boldsymbol{c}_{\mathrm{RP}}^{i}\) is the position with regard to the undeformed configuration, and \(\boldsymbol{u}_{\mathrm{P}}^{i}\) is the elastic displacement.

Fig. 2 Kinematics of a flexible body using the floating frame of reference formulation

The rotation of a frame fixed in \({\mathrm{P}}\) is given by the rotation matrix \(\boldsymbol{S}_{\text{IP}}^{i}\) and can be represented by

$$ \boldsymbol{S}_{\mathrm{IP}}^{i} = \boldsymbol{S}_{\mathrm{IR}}^{i} \bigl( \boldsymbol{\beta}^{i}_{\mathrm{IR}} \bigr) \boldsymbol{S}^{i}_{\mathrm{RP}}\quad \text{with }\boldsymbol{ \beta}_{\mathrm{IR}}^{i}\in\mathbb{R}^{3}. $$
(2)

Thereby, \(\boldsymbol{S}_{\mathrm{IR}}^{i}\) and \(\boldsymbol{S}_{\mathrm{RP}}^{i}\) describe the rotation of the reference frame \({\mathrm{K}}^{i}_{\mathrm{R}}\) with respect to the inertial frame and the rotation of \({\mathrm{K}}^{i}_{\mathrm{P}}\) with respect to the reference frame \({\mathrm{K}}^{i}_{\mathrm{R}}\), respectively. Provided that the rotations due to the deformation of the body are small, \(\boldsymbol{S}^{i}_{\mathrm{RP}}\) can be expressed by the rotation matrix \(\boldsymbol{S}_{\mathrm{RP}}^{0^{i}}\), which represents the orientation of \({\mathrm{K}}^{i}_{\mathrm{P}}\) in the undeformed configuration, and the elastic rotational vector \(\boldsymbol{\theta}_{\mathrm{P}}^{i}\) such that

$$ \boldsymbol{S}_{\mathrm{RP}}^{i} = \boldsymbol{S}_{\mathrm{RP}}^{0^{i}} \bigl( \textbf{E} + \tilde{\boldsymbol{\theta}}_{\mathrm{P}}^{i} \bigr). $$
(3)

Here, \(\textbf{E}\) is the identity matrix, and the tilde defines a skew-symmetric matrix from the elastic rotational vector \(\boldsymbol {\theta}^{i}_{\mathrm{P}}\). Both the elastic displacement \(\boldsymbol{u}_{\mathrm{P}}^{i}\) and rotation \(\boldsymbol{\theta} _{\mathrm{P}}^{i}\) are approximated using a global Ritz approach,

$$ \begin{aligned} \boldsymbol{u}_{\mathrm{P}}^{i} \bigl(\boldsymbol{c}_{\mathrm{RP}}^{i}, t \bigr) &= \boldsymbol{ \varPhi }^{i} \bigl( \boldsymbol{c}_{\mathrm{RP}}^{i} \bigr) \boldsymbol{q}_{\mathrm{e}}^{i}(t), \\ \boldsymbol{\theta}_{\mathrm{P}}^{i} \bigl(\boldsymbol{c}_{\mathrm{RP}}^{i}, t \bigr) &= \boldsymbol{\varPsi}^{i} \bigl(\boldsymbol{c}_{\mathrm{RP}}^{i} \bigr) \boldsymbol{q}_{\mathrm{e}}^{i}(t), \end{aligned} $$
(4)

as the product of global shape functions, which are gathered in the matrices \(\boldsymbol{\varPhi}^{i}\) and \(\boldsymbol{\varPsi}^{i}\), and time dependent elastic coordinates \(\boldsymbol{q}_{\mathrm{e}}^{i}\). If in structural optimization the geometry and material properties of the flexible bodies are parameterized, then the global shape functions depend explicitly on the design variables \(\boldsymbol{x}\), and, thus, \(\boldsymbol{\varPhi}^{i} = \boldsymbol {\varPhi}^{i}(\boldsymbol{c}^{i}_{\mathrm{RP}}, \boldsymbol{x})\) and \(\boldsymbol{\varPsi}^{i} =\boldsymbol{\varPsi}^{i}(\boldsymbol{c}^{i}_{\mathrm{RP}}, \boldsymbol{x})\).
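Equations (3) and (4) can be sketched in a few lines of code. The example below is only an illustration under stated assumptions: the shape function matrices `Phi` and `Psi` are evaluated at a single point and filled with hypothetical numbers, two elastic coordinates are used, and the undeformed orientation `S0` is taken as the identity.

```python
def tilde(theta):
    """Skew-symmetric matrix: tilde(theta) times a equals the cross product theta x a."""
    t1, t2, t3 = theta
    return [[0.0, -t3, t2],
            [t3, 0.0, -t1],
            [-t2, t1, 0.0]]


def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def matvec(A, v):
    """Product of a matrix (nested lists) and a vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in A]


def linearized_rotation(S0, theta):
    """Eq. (3): S_RP = S0 (E + tilde(theta)) for small elastic rotations."""
    T = tilde(theta)
    E_plus_T = [[(1.0 if i == j else 0.0) + T[i][j] for j in range(3)]
                for i in range(3)]
    return matmul(S0, E_plus_T)


# Hypothetical shape function values at one point P (3 x 2 matrices).
Phi = [[0.0, 0.0], [1.0, 0.5], [0.0, 0.0]]
Psi = [[0.0, 0.0], [0.0, 0.0], [2.0, 1.0]]
q_e = [0.01, 0.02]                     # elastic coordinates at one time instant

u_P = matvec(Phi, q_e)                 # Eq. (4), elastic displacement
theta_P = matvec(Psi, q_e)             # Eq. (4), elastic rotation
S0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
S_RP = linearized_rotation(S0, theta_P)
print(u_P, theta_P)
print(S_RP)
```

In a structural optimization, `Phi` and `Psi` would change with the design variables, which is the explicit dependency on \(\boldsymbol{x}\) noted above.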

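The position, velocity, and acceleration of each point P of the body are uniquely determined by the variables

$$ \boldsymbol{y}_{\mathrm{r}}^{i} = \left [ \textstyle\begin{array}{c} \boldsymbol{r}_{\mathrm{IR}}^{i} \\ \boldsymbol{\beta}_{\mathrm{IR}}^{i} \\ \boldsymbol{q}_{\mathrm{e}}^{i} \end{array}\displaystyle \right ], \quad \boldsymbol{z}_{\mathrm{r}}^{i} = \left [ \textstyle\begin{array}{c} \boldsymbol{v}_{\mathrm{IR}}^{i} \\ \boldsymbol{\omega}_{\mathrm{IR}}^{i} \\ \dot{\boldsymbol{q}}_{\mathrm{e}}^{i} \end{array}\displaystyle \right ], \quad \dot{\boldsymbol{z}}_{\mathrm{r}}^{i} = \left [ \textstyle\begin{array}{c} \dot{\boldsymbol{v}}_{\mathrm{IR}}^{i} \\ \dot{\boldsymbol{\omega}}_{\mathrm{IR}}^{i} \\ \ddot{\boldsymbol{q}}_{\mathrm{e}}^{i} \end{array}\displaystyle \right ]. $$
(5)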

Thereby, \(\boldsymbol{v}_{\mathrm{IR}}^{i}\) and \(\boldsymbol{\omega}_{\mathrm{IR}}^{i}\) are the velocity and angular velocity of the reference frame, \(\dot{\boldsymbol{v}}_{\mathrm{IR}}^{i}\) and \(\dot{\boldsymbol{\omega}} _{\mathrm{IR}}^{i}\) are time derivatives with regard to the reference frame, and, finally, \(\dot{\boldsymbol{q}}_{\mathrm{e}}^{i}\) and \(\ddot {\boldsymbol{q}}_{\mathrm{e}}^{i}\) represent the velocity and acceleration of the elastic coordinates. The connection between the time derivatives of the redundant position coordinates \(\dot {\boldsymbol{y}}_{\mathrm{r}}^{i}\) and the redundant velocity coordinates \(\boldsymbol{z}_{\mathrm{r}}^{i}\) is given by

$$ \dot{\boldsymbol{y}}_{\mathrm{r}}^{i} = \boldsymbol{Z}^{i} \bigl( \boldsymbol{y}_{\mathrm{r}}^{i} \bigr)\boldsymbol{z}_{\mathrm{r}}^{i}; $$
(6)

see [17] for details.

Applying Jourdain’s principle, the virtual power of a free single flexible body \(i\) yields

$$ \delta\boldsymbol{z}_{\mathrm{r}}^{i^{\mathrm{T}}} \bigl\{ \boldsymbol {M}^{i}\dot{\boldsymbol{z}}^{i}_{\mathrm{r}} + \boldsymbol{h}_{\omega}^{i} + \boldsymbol{h}_{\mathrm{e}}^{i} - \boldsymbol{h}_{\mathrm{p}}^{i} - \boldsymbol{h}_{\mathrm{b}}^{i} \bigr\} = 0, \quad\forall\delta\boldsymbol{z}_{\mathrm{r}}^{i}. $$
(7)

Thereby, \(\boldsymbol{M}^{i}\) is the mass matrix of the body, \(\boldsymbol{h}_{\omega}^{i}\) is the vector of Coriolis and centrifugal forces, \(\boldsymbol{h}_{\mathrm{e}}^{i}\) is the vector of inner forces, and \(\boldsymbol{h}_{\mathrm{p}}^{i}\) and \(\boldsymbol{h}_{\mathrm{b}}^{i}\) describe the applied surface and body forces. Since Eq. (7) holds for all possible variations \(\delta \boldsymbol{z}_{\mathrm{r}}^{i}\), the equations of motion of the single body are obtained as

$$ \boldsymbol{M}^{i} \bigl(\boldsymbol{y}_{\mathrm{r}}^{i}, \boldsymbol {x} \bigr)\dot{\boldsymbol{z}}^{i}_{\mathrm{r}} = \boldsymbol{h}_{\mathrm{p}}^{i} + \boldsymbol{h}_{\mathrm{b}}^{i} -\boldsymbol{h} _{\omega}^{i} - \boldsymbol{h}_{\mathrm{e}}^{i} = \boldsymbol{h}_{\mathrm{a}}^{i} \bigl(t, \boldsymbol{y}_{\mathrm{r}}^{i}, \boldsymbol{z}_{\mathrm{r}}^{i}, \boldsymbol{x} \bigr), $$
(8)

whereby all right-hand-side vectors are gathered in \(\boldsymbol{h}_{\mathrm{a}}^{i}\).

Regarding the structural parameterization, it should be mentioned that besides the explicit dependencies of the mass matrix and the right-hand-side vectors on the design variables, the state variables \(\boldsymbol{y}_{\mathrm{r}}^{i}\), \(\boldsymbol{z}_{\mathrm{r}}^{i}\), and \(\dot {\boldsymbol{z}}_{\mathrm{r}}^{i}\) depend implicitly on \(\boldsymbol{x}\), too.

2.2 Flexible multibody systems in minimal coordinates

Flexible multibody systems generally consist of \(p\) bodies, which are connected to each other or to the inertial frame via joints or bearings. Therefore, in holonomic systems, the redundant position and velocity coordinates \(\boldsymbol{y}_{\mathrm{r}}= [\boldsymbol {y}_{\mathrm{r}}^{1^{\mathrm{T}}}, \boldsymbol{y}_{\mathrm{r}}^{2^{\mathrm{T}}}, \dots, \boldsymbol{y}_{\mathrm{r}}^{p^{\mathrm{T}}}]^{\mathrm{T}}\) and \(\boldsymbol {z}_{\mathrm{r}}= [\boldsymbol{z}_{\mathrm{r}}^{1^{\mathrm{T}}}, \boldsymbol{z}_{\mathrm{r}}^{2^{\mathrm{T}}}, \dots, \boldsymbol{z}_{\mathrm{r}}^{p^{\mathrm{T}}}]^{\mathrm{T}}\) of the overall system are subjected to \(n_{\mathrm{c}}\) constraints, which can be formulated implicitly as

$$ \boldsymbol{c}(\boldsymbol{y}_{\mathrm{r}}, t, \boldsymbol{x}) = \boldsymbol {0},\quad \boldsymbol{c}\in\mathbb {R}^{n_{\mathrm{c}}}. $$
(9)

It is important to point out that the constraint equations (9) do not only depend on the redundant position coordinates \(\boldsymbol{y}_{\mathrm{r}}\) and the time \(t\) but also on the design variables \(\boldsymbol{x}\). This is due to the dependency of the global shape functions (4) on \(\boldsymbol{x}\). Together with the equations of motion (8) of the \(p\) bodies, they form a system of DAEs that describe the dynamics of the flexible multibody system.

In order to avoid the solution of the DAEs and to solve a system of ODEs instead, a coordinate partitioning can be performed; see [21]. Thereby, the redundant position and velocity coordinates \(\boldsymbol{y}_{\mathrm{r}}\) and \(\boldsymbol {z}_{\mathrm{r}}\) are split into generalized coordinates \(\boldsymbol{y}, \boldsymbol{z}\in\mathbb{R}^{f}\) and dependent coordinates \(\boldsymbol{y}_{\mathrm{d}}, \boldsymbol{z}_{\mathrm {d}}\in \mathbb{R}^{n_{\mathrm{c}}}\) as

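$$ \boldsymbol{y}_{\mathrm{r}} = \boldsymbol{B}\left [ \textstyle\begin{array}{c} \boldsymbol{y} \\ \boldsymbol{y}_{\mathrm{d}} \end{array}\displaystyle \right ], \quad \boldsymbol{z}_{\mathrm{r}} = \boldsymbol{B}\left [ \textstyle\begin{array}{c} \boldsymbol{z} \\ \boldsymbol{z}_{\mathrm{d}} \end{array}\displaystyle \right ], $$
(10)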

where \(\boldsymbol{B}\) is a Boolean matrix. In the current work, \(\boldsymbol{B}\) is determined manually and remains constant throughout the simulation. However, there are more advanced methods to determine \(\boldsymbol{B}\), such as those discussed in [11].

In addition to the splitting of the redundant coordinates, the constraint equations at velocity and acceleration level are needed. For the constraint equations at velocity level, it holds

$$ \dot{\boldsymbol{c}}= \dfrac{\partial\boldsymbol{c}}{\partial \boldsymbol{y}_{\mathrm{r}}}\dot{\boldsymbol{y}}_{\mathrm{r}} + \dfrac {\partial\boldsymbol{c}}{\partial t} = \boldsymbol{C} \dot{\boldsymbol{y}}_{\mathrm{r}} + \overline{\boldsymbol{c}} = \boldsymbol{0} $$
(11)

with \(\boldsymbol{C}\) being the Jacobian matrix of the constraints. With the kinematic relation (6), the constraint equations at acceleration level read

$$ \ddot{\boldsymbol{c}}= \boldsymbol{C}\boldsymbol{Z}\dot {\boldsymbol{z}}_{\mathrm{r}} + \dot{ (\boldsymbol{C}\boldsymbol {Z})}\boldsymbol{z}_{\mathrm{r}}+ \dfrac{\partial\overline{\boldsymbol{c}}}{\partial t} =\boldsymbol{0}. $$
(12)

Using the latter equation to express the dependent accelerations \(\dot {\boldsymbol{z}} _{\mathrm{d}}\) in terms of the independent accelerations \(\dot{\boldsymbol {z}}\), the redundant accelerations \(\dot{\boldsymbol{z}}_{\mathrm{r}}\) can be written as

$$ \dot{\boldsymbol{z}}_{\mathrm{r}} = \underbrace{\boldsymbol{B}\left [ \textstyle\begin{array}{c} \textbf{E} \\ -\boldsymbol{\varGamma}_{\mathrm{d}}^{-1}\boldsymbol{\varGamma}_{\mathrm{i}} \end{array}\displaystyle \right ]} _{\boldsymbol{J}}\dot{ \boldsymbol{z}}+ \underbrace {\boldsymbol{B}\left [ \textstyle\begin{array}{c} \boldsymbol{0} \\ -\boldsymbol{\varGamma}_{\mathrm{d}}^{-1}\boldsymbol{\gamma} \end{array}\displaystyle \right ]} _{\overline{\boldsymbol{\gamma}}} $$
(13)

with the partitioning \((\boldsymbol{C}\boldsymbol{Z}\boldsymbol{B}) = [\boldsymbol{\varGamma}_{\mathrm{i}}, \boldsymbol{\varGamma} _{\mathrm{d}}]\) and the abbreviation \(\boldsymbol{\gamma}= \dot{ (\boldsymbol{C}\boldsymbol{Z})}\boldsymbol{z}_{\mathrm{r}} + \partial{\overline{\boldsymbol{c}}}/\partial{t}\). In the last step, the redundant accelerations \(\dot{\boldsymbol {z}}_{\mathrm{r}}\) and the variations of the redundant velocities \(\delta\boldsymbol{z}_{\mathrm {r}}\) are substituted into the local equations of motion (7) for each body using Eq. (13) and the variation of Eq. (11), respectively. As a result, the flexible multibody system can be represented in minimal coordinates as

$$\begin{aligned} \dot{\boldsymbol{y}} = & \boldsymbol{Z}(\boldsymbol{y}_{\mathrm{r}}) \boldsymbol{z}_{\mathrm{r}}, \end{aligned}$$
(14)
$$\begin{aligned} \overline{\boldsymbol{M}}(t,\boldsymbol{y}_{\mathrm{r}}, \boldsymbol{x})\dot{\boldsymbol{z}}+ \overline{\boldsymbol{\gamma}}(t,\boldsymbol{y}_{\mathrm{r}}, \boldsymbol{z}_{\mathrm{r}}, \boldsymbol{x}) = & \overline{\boldsymbol{f}}(t, \boldsymbol{y}_{\mathrm{r}}, \boldsymbol{z}_{\mathrm{r}}, \boldsymbol{x}), \end{aligned}$$
(15)

with the global mass matrix

$$ \overline{\boldsymbol{M}}(t,\boldsymbol{y}_{\mathrm{r}}, \boldsymbol{x}) = \sum\limits _{i=1}^{p} \boldsymbol{J}^{i^{\mathrm{T}}}(t, \boldsymbol{y}_{\mathrm{r}}, \boldsymbol{x})\boldsymbol{M}^{i} \bigl( \boldsymbol{y}_{\mathrm {r}}^{i}, \boldsymbol{x} \bigr) \boldsymbol{J}^{i}(t, \boldsymbol{y}_{\mathrm{r}}, \boldsymbol{x}), $$
(16)

the vector of generalized local accelerations

$$ \overline{\boldsymbol{\gamma}}(t, \boldsymbol{y}_{\mathrm{r}}, \boldsymbol {z}_{\mathrm{r}}, \boldsymbol{x}) = \sum\limits _{i=1}^{p} \boldsymbol{J}^{i^{\mathrm{T}}}(t, \boldsymbol{y}_{\mathrm{r}}, \boldsymbol{x})\boldsymbol {\gamma}^{i}(t, \boldsymbol{y}_{\mathrm{r}}, \boldsymbol{z}_{\mathrm {r}}, \boldsymbol{x}), $$
(17)

and the generalized right-hand-side vector

$$ \overline{\boldsymbol{f}}(t,\boldsymbol{y}_{\mathrm{r}}, \boldsymbol{z}_{\mathrm{r}}, \boldsymbol{x}) = \sum\limits _{i=1}^{p} \boldsymbol{J}^{i^{\mathrm{T}}}(t, \boldsymbol{y}_{\mathrm{r}}, \boldsymbol{x})\boldsymbol {h}^{i}_{\mathrm{a}}(t, \boldsymbol{y}_{\mathrm{r}}, \boldsymbol {z}_{\mathrm{r}}, \boldsymbol{x}). $$
(18)

It can be seen that in addition to the local equations of motion (8), the Jacobian matrices \(\boldsymbol{J}^{i}\) and local acceleration vectors \(\boldsymbol{\gamma}^{i}\) also depend on \(\boldsymbol{x}\). This is due to the explicit dependency of the constraint equations (9) on the design variables \(\boldsymbol{x}\).

The redundant coordinates \(\boldsymbol{y}_{\mathrm{r}}\) and \(\boldsymbol{z}_{\mathrm{r}}\) remain as auxiliary variables in the system and have to be computed in each time step from the constraint equations at position and velocity level. Therefore, the flexible multibody system is completely described by Eqs. (9), (11), (14), and (15) and the initial conditions

$$ \begin{aligned} \boldsymbol{\phi}^{0} \bigl(t^{0}, \boldsymbol{y}^{0} \bigr) & = \boldsymbol{0}, \\ \dot{\boldsymbol{\phi}}^{0} \bigl(t^{0}, \boldsymbol{y}^{0}, \boldsymbol{z}^{0} \bigr) & = \boldsymbol{0} \end{aligned} $$
(19)

for the generalized position and velocity coordinates.
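The role of the dependent coordinates as auxiliary variables can be sketched for a single kinematic loop. The example below is a strongly simplified illustration and not the model used in this paper: it assumes a rigid planar slider-crank loop with hypothetical lengths `L1` and `L2`, the crank angle `phi` as the only generalized coordinate, the rod angle `beta` as the only dependent coordinate, \(\boldsymbol{Z}\) and \(\boldsymbol{B}\) equal to the identity, and a time-independent constraint.

```python
import math

L1, L2 = 0.1, 0.3   # hypothetical crank and rod lengths


def c(phi, beta):
    """Loop-closure constraint at position level, cf. Eq. (9)."""
    return L1 * math.sin(phi) - L2 * math.sin(beta)


def gamma_i(phi, beta):
    """Derivative of c with respect to the independent coordinate phi."""
    return L1 * math.cos(phi)


def gamma_d(phi, beta):
    """Derivative of c with respect to the dependent coordinate beta."""
    return -L2 * math.cos(beta)


def solve_position(phi, beta_guess=0.0, tol=1e-12, max_iter=20):
    """Newton iteration on c(phi, beta) = 0 for the dependent angle beta."""
    beta = beta_guess
    for _ in range(max_iter):
        res = c(phi, beta)
        if abs(res) < tol:
            break
        beta -= res / gamma_d(phi, beta)
    return beta


def jacobian(phi, beta):
    """J = [E; -Gamma_d^{-1} Gamma_i] of Eq. (13), here a 2 x 1 column."""
    return [1.0, -gamma_i(phi, beta) / gamma_d(phi, beta)]


phi, phi_dot = 0.7, 2.0                  # generalized position and velocity
beta = solve_position(phi)               # dependent position from Eq. (9)
J = jacobian(phi, beta)
z_r = [J[0] * phi_dot, J[1] * phi_dot]   # redundant velocities, satisfies Eq. (11)
print(beta, z_r)
```

In a time integration, these two steps would be repeated in every evaluation of the right-hand side, so that the integrator itself only sees the ODEs in the generalized coordinates.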

3 Sensitivity analysis using the adjoint variable method

In the following, the adjoint variable method is applied to flexible multibody systems modeled with the floating frame of reference approach and given in the form of Eqs. (9), (11), and (14)–(19). First, the key idea is briefly introduced, and both the objective function and the dynamic equations are given in a variational form. Then, the adjoint differential equations and the gradient equation are derived. Finally, the efficient evaluation of the gradient equation is discussed.

3.1 Variation of objective function and dynamic equations

The key idea of analytical approaches, such as the adjoint variable method, is to use variational calculus to unveil all explicit and implicit dependencies of the objective function \(\psi\) on the design variables \(\boldsymbol{x}\). In the current work, we consider integral-type objective functions of the form

$$ \psi(\boldsymbol{x}) = \int\limits _{t^{0}}^{t^{1}} F(\boldsymbol{y}_{\mathrm{r}}, \boldsymbol{x}) \,{\mathrm{d}}t. $$
(20)

They can be used to formulate, for instance, minimal compliance or tracking error problems. Even though the structure of \(\psi\) is comparatively simple, it is well suited to demonstrate the basic procedure. A more general objective function formulation can be found, for example, in [2].

Provided that the initial time \(t^{0}\) and the final time \(t^{1}\) are constant, the variation of objective functions in the form of Eq. (20) yields

$$ \delta\psi= \int\limits _{t^{0}}^{t^{1}} \biggl(\dfrac{\partial F}{\partial\boldsymbol{y}}\delta \boldsymbol{y}+ \dfrac{\partial F}{\partial\boldsymbol{y}_{\mathrm {d}}}\delta\boldsymbol{y}_{\mathrm{d}}+ \dfrac{\partial F}{\partial \boldsymbol{x}}\delta\boldsymbol{x} \biggr)\,{\mathrm{d}}t. $$
(21)

Thus, the variation of \(\psi\) depends, on the one hand, on the variations of the design variables \(\delta\boldsymbol{x}\) and, on the other hand, on the variations of the redundant position coordinates \(\delta \boldsymbol{y}_{\mathrm{r}}\). The latter are split into the variations of the generalized position coordinates \(\delta\boldsymbol{y}\) and the variations of the dependent position coordinates \(\delta\boldsymbol{y}_{\mathrm{d}}\).

There are different ways to handle the variations of the dependent position coordinates \(\delta\boldsymbol{y}_{\mathrm{d}}\). In [8] the adjoint system is augmented by the constraint equations at position level, which leads to a set of adjoint DAEs. In contrast, considering the constraint equations at acceleration level as in [3], the adjoint dynamics is represented by ODEs and a set of auxiliary algebraic equations.

In the current work, the variations of the dependent coordinates are eliminated instead of augmenting the adjoint system. To this end, the constraint equations at position level are varied,

$$ \dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol{y}}\delta \boldsymbol{y}+ \dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol {y}_{\mathrm{d}}}\delta \boldsymbol{y}_{\mathrm{d}}+ \dfrac {\partial\boldsymbol{c}}{\partial\boldsymbol{x}} \delta\boldsymbol{x}= \boldsymbol{0}, $$
(22)

and used to express the variations of the dependent position coordinates as

$$ \delta\boldsymbol{y}_{\mathrm{d}}= - \biggl(\dfrac{\partial\boldsymbol {c}}{\partial\boldsymbol{y}_{\mathrm{d}}} \biggr)^{-1}\dfrac{\partial \boldsymbol{c}}{\partial\boldsymbol{y}}\delta\boldsymbol{y}- \biggl( \dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol{y}_{\mathrm {d}}} \biggr)^{-1}\dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol {x}}\delta\boldsymbol{x}. $$
(23)

Thus, \(\delta\boldsymbol{y}_{\mathrm{d}}\) is expressed in terms of the variations of the minimal coordinates \(\delta\boldsymbol{y}\) and the design variables \(\delta\boldsymbol{x}\) only. Then substituting \(\delta\boldsymbol{y}_{\mathrm{d}}\) into Eq. (21) yields

$$ \begin{aligned} \delta\psi= & \int\limits _{t^{0}}^{t^{1}} ( \boldsymbol{R}_{1} \delta \boldsymbol{y}+ \boldsymbol{R}_{2}\delta\boldsymbol{x})\,{\mathrm{d}}t, \end{aligned} $$
(24)

where \(\boldsymbol{R}_{1}\in\mathbb{R}^{1\times f}\) and \(\boldsymbol {R}_{2}\in\mathbb {R}^{1\times h}\) are defined as

$$ \begin{aligned} \boldsymbol{R}_{1} &= \dfrac{\partial F}{\partial \boldsymbol{y}} - \dfrac{\partial F}{\partial\boldsymbol {y}_{\mathrm{d}}} \biggl(\dfrac{\partial\boldsymbol{c}}{\partial \boldsymbol{y}_{\mathrm{d}}} \biggr)^{-1} \dfrac{\partial\boldsymbol {c}}{\partial\boldsymbol{y}}\quad\text{and} \\ \boldsymbol{R}_{2} &= \dfrac{\partial F}{\partial\boldsymbol{x}} - \dfrac{\partial F}{\partial\boldsymbol{y}_{\mathrm{d}}} \biggl( \dfrac {\partial\boldsymbol{c}}{\partial\boldsymbol{y}_{\mathrm{d}}} \biggr)^{-1}\dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol{x}}. \end{aligned} $$
(25)
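The elimination in Eqs. (23) to (25) can be checked numerically on a small example. The sketch below assumes scalar quantities \(y\), \(y_{\mathrm{d}}\), and \(x\), a hypothetical constraint \(c = x\,y_{\mathrm{d}} - \sin y = 0\), and a hypothetical integrand \(F = y_{\mathrm{d}}^{2} + x\,y\); none of these functions is taken from the paper.

```python
import math


def solve_dependent(y, x):
    """Dependent coordinate from the constraint c = x*y_d - sin(y) = 0."""
    return math.sin(y) / x


def F(y, y_d, x):
    """Hypothetical integrand of the objective function."""
    return y_d ** 2 + x * y


def reduced_sensitivities(y, x):
    """R1 and R2 of Eq. (25) for scalar y, y_d, and x."""
    y_d = solve_dependent(y, x)
    # Partial derivatives of F and c, treating y, y_d, and x as independent.
    F_y, F_yd, F_x = x, 2.0 * y_d, y
    c_y, c_yd, c_x = -math.cos(y), x, y_d
    R1 = F_y - F_yd / c_yd * c_y
    R2 = F_x - F_yd / c_yd * c_x
    return R1, R2


def F_reduced(y, x):
    """F with the dependent coordinate eliminated through the constraint."""
    return F(y, solve_dependent(y, x), x)


y, x, eps = 0.8, 1.5, 1e-6
R1, R2 = reduced_sensitivities(y, x)
# Central differences of the reduced function serve as a reference.
fd_y = (F_reduced(y + eps, x) - F_reduced(y - eps, x)) / (2 * eps)
fd_x = (F_reduced(y, x + eps) - F_reduced(y, x - eps)) / (2 * eps)
print(R1, fd_y)
print(R2, fd_x)
```

For vector-valued coordinates, the division by `c_yd` would become the solution of a linear system with the matrix \(\partial\boldsymbol{c}/\partial\boldsymbol{y}_{\mathrm{d}}\).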

Even though the variations of the dependent position coordinates are eliminated in Eq. (24), the latter still depends on the variations of the generalized position coordinates \(\delta\boldsymbol{y}\). These variations have to be eliminated next using either a direct or an adjoint approach; see [3]. In both approaches, the variations of the kinematic relation (14) and the equations of motion (15) are required.

The variation of the kinematic relation (14) in implicit form yields

$$ \delta\dot{\boldsymbol{y}}- \dfrac{\partial\boldsymbol {v}}{\partial\boldsymbol{y}}\delta\boldsymbol{y}- \dfrac{\partial \boldsymbol{v}}{\partial\boldsymbol{y}_{\mathrm{d}}}\delta \boldsymbol{y}_{\mathrm{d}}- \dfrac{\partial\boldsymbol{v}}{\partial\boldsymbol{z}}\delta \boldsymbol{z}- \dfrac{\partial\boldsymbol{v}}{\partial\boldsymbol {z}_{\mathrm{d}}}\delta\boldsymbol{z}_{\mathrm{d}}= \boldsymbol{0}, $$
(26)

where \(\boldsymbol{v}= \boldsymbol{Z}(\boldsymbol{y}_{\mathrm {r}})\boldsymbol{z}_{\mathrm{r}}\). It can be seen that Eq. (26) depends not only on the variations of the dependent position variables \(\delta\boldsymbol{y}_{\mathrm{d}}\) but also on the variations of the dependent velocity variables \(\delta\boldsymbol{z}_{\mathrm{d}}\). However, the latter can be replaced using the variation of the constraints at velocity level (11), which reads

$$ \dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol {y}}\delta\boldsymbol{y}+ \dfrac{\partial\dot{\boldsymbol {c}}}{\partial\boldsymbol{y}_{\mathrm{d}}}\delta\boldsymbol {y}_{\mathrm{d}}+ \dfrac{\partial\dot{\boldsymbol{c}}}{\partial \boldsymbol{z}}\delta\boldsymbol{z}+ \dfrac{\partial\dot {\boldsymbol{c}}}{\partial\boldsymbol{z}_{\mathrm{d}}} \delta \boldsymbol{z}_{\mathrm{d}}+ \dfrac{\partial\dot{\boldsymbol {c}}}{\partial\boldsymbol{x}}\delta\boldsymbol{x}= \boldsymbol{0}. $$
(27)

Substituting into Eq. (27) the variations of the dependent position coordinates using Eq. (23) and solving for the variations of the dependent velocity coordinates yield

$$ \begin{aligned} \delta\boldsymbol{z}_{\mathrm{d}}= & - \biggl( \biggl(\dfrac {\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol{z}_{\mathrm{d}}} \biggr)^{-1}\dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol {y}} - \biggl( \dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol {z}_{\mathrm{d}}} \biggr)^{-1} \dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol {y}_{\mathrm{d}}} \biggl( \dfrac{\partial\boldsymbol{c}}{\partial \boldsymbol{y}_{\mathrm{d}}} \biggr)^{-1}\dfrac{\partial\boldsymbol {c}}{\partial\boldsymbol{y}} \biggr) \delta \boldsymbol{y} \\ & - \biggl(\dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol {z}_{\mathrm{d}}} \biggr)^{-1}\dfrac{\partial\dot{\boldsymbol {c}}}{\partial\boldsymbol{z}}\delta \boldsymbol{z} \\ & - \biggl( \biggl(\dfrac{\partial\dot{\boldsymbol{c}}}{\partial \boldsymbol{z}_{\mathrm{d}}} \biggr)^{-1}\dfrac{\partial\dot{\boldsymbol {c}}}{\partial\boldsymbol{x}} - \biggl(\dfrac{\partial\dot{\boldsymbol {c}}}{\partial\boldsymbol{z}_{\mathrm{d}}} \biggr)^{-1}\dfrac{\partial \dot{\boldsymbol{c}}}{\partial\boldsymbol{y}_{\mathrm{d}}} \biggl( \dfrac {\partial\boldsymbol{c}}{\partial\boldsymbol{y}_{\mathrm{d}}} \biggr)^{-1}\dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol{x}} \biggr)\delta \boldsymbol{x}. \end{aligned} $$
(28)
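As a side remark, the matrix that has to be inverted in Eq. (28) appears to be the same matrix \(\boldsymbol{\varGamma}_{\mathrm{d}}\) that is already used in Eq. (13). The following short derivation is a sketch under two assumptions: the redundant velocities are partitioned as \(\boldsymbol{z}_{\mathrm{r}} = \boldsymbol{B}[\boldsymbol{z}^{\mathrm{T}}, \boldsymbol{z}_{\mathrm{d}}^{\mathrm{T}}]^{\mathrm{T}}\), as implied by Eq. (13), and \(\overline{\boldsymbol{c}}\) does not depend on the velocities. Inserting the kinematic relation (6) into Eq. (11) then gives

$$ \dot{\boldsymbol{c}} = \boldsymbol{C}\boldsymbol{Z}\boldsymbol{z}_{\mathrm{r}} + \overline{\boldsymbol{c}} = \boldsymbol{\varGamma}_{\mathrm{i}}\boldsymbol{z} + \boldsymbol{\varGamma}_{\mathrm{d}}\boldsymbol{z}_{\mathrm{d}} + \overline{\boldsymbol{c}} \quad\Rightarrow\quad \dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol{z}} = \boldsymbol{\varGamma}_{\mathrm{i}}, \quad \dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol{z}_{\mathrm{d}}} = \boldsymbol{\varGamma}_{\mathrm{d}}. $$

Under these assumptions, the coefficient of \(\delta\boldsymbol{z}\) in Eq. (28) reduces to \(-\boldsymbol{\varGamma}_{\mathrm{d}}^{-1}\boldsymbol{\varGamma}_{\mathrm{i}}\), which is the lower block of \(\boldsymbol{J}\) in Eq. (13), so a factorization of \(\boldsymbol{\varGamma}_{\mathrm{d}}\) computed for the forward dynamics could be reused in the sensitivity analysis.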

Then, substituting the variations of the dependent positions and velocities into Eq. (26) using (23) and (28), respectively, gives

$$ \delta\dot{\boldsymbol{y}}+ \boldsymbol{S}_{1}\delta\boldsymbol {y}+ \boldsymbol{S}_{2}\delta\boldsymbol{z}+ \boldsymbol{S}_{3} \delta\boldsymbol{x}= \boldsymbol{0} $$
(29)

with three auxiliary matrices \(\boldsymbol{S}_{1}\in\mathbb {R}^{f\times f}\), \(\boldsymbol{S} _{2}\in\mathbb{R}^{f\times f}\), and \(\boldsymbol{S}_{3}\in\mathbb {R}^{f\times h}\) defined as

$$\begin{aligned} \boldsymbol{S}_{1} =& -\dfrac{\partial\boldsymbol {v}}{\partial\boldsymbol{y}} + \dfrac{\partial\boldsymbol {v}}{\partial\boldsymbol{y}_{\mathrm{d}}} \biggl(\dfrac{\partial \boldsymbol{c}}{\partial\boldsymbol{y}_{\mathrm{d}}} \biggr)^{-1} \dfrac {\partial\boldsymbol{c}}{\partial\boldsymbol{y}} + \dfrac{\partial \boldsymbol{v}}{\partial\boldsymbol{z}_{\mathrm{d}}} \biggl(\dfrac {\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol{z}_{\mathrm {d}}} \biggr)^{-1}\dfrac{\partial\dot{\boldsymbol{c}}}{\partial \boldsymbol{y}} \\ &{} - \dfrac{\partial\boldsymbol{v}}{\partial\boldsymbol{z}_{\mathrm {d}}} \biggl(\dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol {z}_{\mathrm{d}}} \biggr)^{-1} \dfrac{\partial\dot{\boldsymbol {c}}}{\partial\boldsymbol{y}_{\mathrm{d}}} \biggl(\dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol{y}_{\mathrm {d}}} \biggr)^{-1} \dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol{y}}, \\ \boldsymbol{S}_{2} =& -\dfrac{\partial\boldsymbol{v}}{\partial \boldsymbol{z}} + \dfrac{\partial\boldsymbol{v}}{\partial \boldsymbol{z}_{\mathrm{d}}} \biggl(\dfrac{\partial\dot{\boldsymbol {c}}}{\partial\boldsymbol{z}_{\mathrm{d}}} \biggr)^{-1}\dfrac{\partial \dot{\boldsymbol{c}}}{\partial\boldsymbol{z}},\quad\text{and} \\ \boldsymbol{S}_{3} =& \dfrac{\partial\boldsymbol{v}}{\partial \boldsymbol{y}_{\mathrm{d}}} \biggl(\dfrac{\partial\boldsymbol {c}}{\partial\boldsymbol{y}_{\mathrm{d}}} \biggr)^{-1} \dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol{x}} + \dfrac {\partial\boldsymbol{v}}{\partial\boldsymbol{z}_{\mathrm{d}}} \biggl( \dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol {z}_{\mathrm{d}}} \biggr)^{-1} \dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol{x}} \\ &{}- \dfrac{\partial\boldsymbol{v}}{\partial\boldsymbol{z}_{\mathrm {d}}} \biggl(\dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol {z}_{\mathrm{d}}} \biggr)^{-1} \dfrac{\partial\dot{\boldsymbol {c}}}{\partial\boldsymbol{y}_{\mathrm{d}}} 
\biggl(\dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol{y}_{\mathrm {d}}} \biggr)^{-1} \dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol{x}}. \end{aligned}$$
(30)

By analogy the variation of the equations of motion is determined, and the variations of the dependent position and velocity coordinates are substituted. Defining \(\mathrm{ODE}:=\overline{\boldsymbol{M}}(t, \boldsymbol {y}_{\mathrm{r}}, \boldsymbol{x})\dot{\boldsymbol{z}}+ \overline{\boldsymbol{\gamma}}(t, \boldsymbol{y}_{\mathrm{r}}, \boldsymbol{z}_{\mathrm{r}}, \boldsymbol{x}) - \overline{\boldsymbol{f}}(t, \boldsymbol{y}_{\mathrm{r}}, \boldsymbol{z}_{\mathrm{r}}, \boldsymbol{x} )\), for the variation of the kinetic equation (15), we have

$$ \overline{\boldsymbol{M}}\delta \dot{\boldsymbol{z}}+ \dfrac{\partial\text{ODE}}{\partial\boldsymbol{z}}\delta\boldsymbol{z}+ \dfrac {\partial\text{ODE}}{\partial\boldsymbol{z}_{\mathrm{d}}} \delta \boldsymbol{z}_{\mathrm{d}}+ \dfrac{\partial\text{ODE}}{\partial \boldsymbol{y}}\delta\boldsymbol{y}+ \dfrac{\partial\text{ODE}}{\partial\boldsymbol{y}_{\mathrm{d}}}\delta\boldsymbol {y}_{\mathrm{d}}+ \dfrac{\partial\text{ODE}}{\partial\boldsymbol {x}}\delta \boldsymbol{x}= \boldsymbol{0}. $$
(31)

Substituting the variations of the dependent position and velocity coordinates yields

$$ \overline{\boldsymbol{M}}\delta \dot{\boldsymbol{z}}+ \boldsymbol{T}_{1}\delta \boldsymbol{y}+ \boldsymbol{T}_{2}\delta \boldsymbol{z}+\boldsymbol{T}_{3} \delta\boldsymbol{x}= \boldsymbol{0} $$
(32)

with the auxiliary matrices \(\boldsymbol{T}_{1}\in\mathbb{R}^{f\times f}\), \(\boldsymbol{T}_{2}\in \mathbb{R}^{f\times f}\), and \(\boldsymbol{T}_{3}\in\mathbb {R}^{f\times h}\) defined as

$$\begin{aligned} \boldsymbol{T}_{1} =& \dfrac{\partial\text{ODE}}{\partial\boldsymbol{y}} - \dfrac{\partial\text{ODE}}{\partial\boldsymbol{y}_{\mathrm{d}}} \biggl(\dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol{y}_{\mathrm {d}}} \biggr)^{-1} \dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol{y}} - \dfrac{\partial\text{ODE}}{\partial\boldsymbol{z}_{\mathrm{d}}} \biggl(\dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol {z}_{\mathrm{d}}} \biggr)^{-1}\dfrac{\partial\dot{\boldsymbol {c}}}{\partial\boldsymbol{y}} \\ &{} + \dfrac{\partial\text{ODE}}{\partial\boldsymbol{z}_{\mathrm {d}}} \biggl(\dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol {z}_{\mathrm{d}}} \biggr)^{-1} \dfrac{\partial\dot{\boldsymbol {c}}}{\partial\boldsymbol{y}_{\mathrm{d}}} \biggl(\dfrac{\partial \boldsymbol{c}}{\partial\boldsymbol{y}_{\mathrm{d}}} \biggr)^{-1} \dfrac {\partial\boldsymbol{c}}{\partial\boldsymbol{y}}, \\ \boldsymbol{T}_{2} =& \dfrac{\partial\text{ODE}}{\partial \boldsymbol{z}} - \dfrac{\partial\text{ODE}}{\partial\boldsymbol {z}_{\mathrm{d}}} \biggl( \dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol {z}_{\mathrm{d}}} \biggr)^{-1}\dfrac{\partial\dot{\boldsymbol {c}}}{\partial\boldsymbol{z}},\quad \text{and} \\ \boldsymbol{T}_{3} =& \dfrac{\partial\text{ODE}}{\partial \boldsymbol{x}} - \dfrac{\partial\text{ODE}}{\partial\boldsymbol {y}_{\mathrm{d}}} \biggl( \dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol{y}_{\mathrm {d}}} \biggr)^{-1}\dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol{x}} - \dfrac{\partial\text{ODE}}{\partial\boldsymbol{z}_{\mathrm{d}}} \biggl(\dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol {z}_{\mathrm{d}}} \biggr)^{-1} \dfrac{\partial\dot{\boldsymbol {c}}}{\partial\boldsymbol{x}} \\ &{} + \dfrac{\partial\text{ODE}}{\partial\boldsymbol{z}_{\mathrm {d}}} \biggl(\dfrac{\partial\dot{\boldsymbol{c}}}{\partial\boldsymbol {z}_{\mathrm{d}}} \biggr)^{-1} \dfrac{\partial\dot{\boldsymbol {c}}}{\partial\boldsymbol{y}_{\mathrm{d}}} 
\biggl(\dfrac{\partial \boldsymbol{c}}{\partial\boldsymbol{y}_{\mathrm{d}}} \biggr)^{-1} \dfrac{\partial\boldsymbol{c}}{\partial\boldsymbol{x}}. \end{aligned}$$
(33)

3.2 Adjoint variable method

In the adjoint variable method, the dependent variations \(\delta \boldsymbol{y}\) and \(\delta\boldsymbol{z}\) are eliminated by augmenting the varied objective function (24) with two zero terms, which are multiplied by arbitrary time-dependent adjoint variables \(\boldsymbol{\mu}(t)\in\mathbb {R}^{f}\) and \(\boldsymbol{\nu} (t)\in\mathbb{R}^{f}\). The first zero term is the varied kinematic relation (29), and the second one is the variation of the equations of motion (32). Then, all terms in \(\delta \boldsymbol{y}\) and \(\delta\boldsymbol{z}\) can be eliminated by choosing the adjoint variables properly.

However, the product of the adjoint variables \(\boldsymbol{\mu}^{\mathrm{T}}(t)\) and the variation of the kinematic relation (29)

$$ \boldsymbol{\mu}^{\mathrm{T}} [\delta\dot{\boldsymbol{y}}+ \boldsymbol {S}_{1}\delta\boldsymbol{y}+ \boldsymbol{S}_{2}\delta \boldsymbol{z}+\boldsymbol{S}_{3}\delta \boldsymbol{x}] = 0 $$
(34)

is in a different form compared to Eq. (24). Therefore, Eq. (34) has to be integrated over the simulation time, yielding

$$ \int\limits _{t^{0}}^{t^{1}}\boldsymbol{\mu}^{\mathrm{T}} [\delta \dot{\boldsymbol{y}}+ \boldsymbol{S}_{1}\delta\boldsymbol{y}+ \boldsymbol{S}_{2}\delta \boldsymbol{z}+\boldsymbol{S}_{3} \delta\boldsymbol{x}]\,{\mathrm{d}}t = 0. $$
(35)

Moreover, since \(\delta\dot{\boldsymbol{y}}\) is not included in the varied objective function (24), the time derivative in \(\delta \dot{\boldsymbol{y}}\) has to be moved to the adjoint variables. Thus, using integration by parts, the first term of the integral can be transformed to

$$ \int\limits _{t^{0}}^{t^{1}}\boldsymbol{\mu}^{\mathrm{T}}\delta\dot{ \boldsymbol{y}}\,{\mathrm{d}}t = \boldsymbol{\mu}^{1^{\mathrm{T}}}\delta \boldsymbol{y}^{1} - \boldsymbol{\mu}^{0^{\mathrm{T}}}\delta \boldsymbol{y}^{0} - \int\limits _{t^{0}}^{t^{1}}\dot{\boldsymbol{\mu}}^{\mathrm{T}} \delta\boldsymbol{y}\,{\mathrm{d}}t. $$
(36)

Since the initial conditions (19) are assumed to be design independent, the variations of the initial generalized coordinates \(\delta\boldsymbol{y}^{0}\) are zero. Therefore, Eq. (35) can be written as

$$ \boldsymbol{\mu}^{1^{\mathrm{T}}}\delta\boldsymbol{y}^{1} + \int\limits _{t^{0}}^{t^{1}} \bigl[ \bigl(-\dot{\boldsymbol{ \mu}}^{\mathrm{T}} + \boldsymbol{\mu}^{\mathrm{T}}\boldsymbol{S}_{1} \bigr)\delta\boldsymbol {y}+ \boldsymbol{\mu}^{\mathrm{T}} \boldsymbol{S}_{2}\delta\boldsymbol{z}+ \boldsymbol{\mu}^{\mathrm{T}} \boldsymbol{S}_{3}\delta\boldsymbol{x} \bigr] \,{\mathrm{d}}t = 0. $$
(37)

In the same way, the second augmentation term is obtained. Here, the variation of the equations of motion (32) is multiplied from the left by arbitrary adjoint variables \(\boldsymbol{\nu}^{\mathrm{T}}(t)\):

$$ \boldsymbol{\nu}^{\mathrm{T}} [ \overline{\boldsymbol{M}}\delta\dot{\boldsymbol {z}}+ \boldsymbol{T}_{1}\delta\boldsymbol{y}+ \boldsymbol{T}_{2} \delta\boldsymbol{z}+\boldsymbol{T}_{3}\delta \boldsymbol{x}] = 0. $$
(38)

Integrating over the time domain,

$$ \int\limits _{t^{0}}^{t^{1}} \boldsymbol{\nu}^{\mathrm{T}} [ \overline{\boldsymbol{M}}\delta \dot{\boldsymbol{z}}+ \boldsymbol{T}_{1}\delta\boldsymbol{y}+ \boldsymbol{T}_{2}\delta\boldsymbol{z}+ \boldsymbol{T}_{3} \delta\boldsymbol{x}]\,{\mathrm{d}}t = 0, $$
(39)

and applying integration by parts to remove the time derivative of the variations of the velocity variables \(\delta\dot{\boldsymbol{z}}\) yields

$$ \int\limits _{t^{0}}^{t^{1}}\boldsymbol{\nu}^{\mathrm{T}}\overline{\boldsymbol{M}}\delta \dot{ \boldsymbol{z}}\,{\mathrm{d}}t = \boldsymbol{\nu}^{1^{\mathrm{T}}}\overline{\boldsymbol{M}}^{1}\delta \boldsymbol{z}^{1} - \boldsymbol{\nu}^{0^{\mathrm{T}}}\overline{\boldsymbol{M}}^{0}\delta \boldsymbol{z}^{0} - \int\limits _{t^{0}}^{t^{1}} \bigl(\dot{\boldsymbol{ \nu}}^{\mathrm{T}}\overline{\boldsymbol{M}} + \boldsymbol{\nu}^{\mathrm{T}} \dot{\overline{\boldsymbol{M}}} \bigr)\delta \boldsymbol{z}\,{\mathrm{d}}t. $$
(40)

Considering that the initial conditions are independent of the design variables and, hence, \(\delta\boldsymbol{z}^{0}=\boldsymbol{0}\), it is possible to rewrite Eq. (39) as

$$ \begin{aligned} & \boldsymbol{\nu}^{1^{\mathrm{T}}}\overline{\boldsymbol{M}}^{1}\delta \boldsymbol{z}^{1}- \int\limits _{t^{0}}^{t^{1}} \bigl[ \bigl(\dot{\boldsymbol{ \nu}}^{\mathrm{T}} \overline{\boldsymbol{M}} + \boldsymbol{\nu}^{\mathrm{T}} \dot{\overline{\boldsymbol{M}}} - \boldsymbol{\nu}^{\mathrm{T}} \boldsymbol{T}_{2} \bigr)\delta\boldsymbol{z}- \boldsymbol{\nu }^{\mathrm{T}}\boldsymbol{T}_{1}\delta \boldsymbol{y}- \boldsymbol{\nu}^{\mathrm{T}}\boldsymbol{T}_{3}\delta \boldsymbol{x} \bigr] \,{\mathrm{d}}t = 0, \end{aligned} $$
(41)

which is a suitable form to augment the variation of the objective function \(\delta\psi\).

Subtracting Eqs. (37) and (41) from Eq. (24) and rearranging the result in terms of the dependent variations \(\delta\boldsymbol{y}\) and \(\delta\boldsymbol {z}\) and the independent variations \(\delta\boldsymbol{x}\) yields

$$\begin{aligned} \delta\psi =& - \boldsymbol{ \mu}^{1^{\mathrm{T}}}\delta\boldsymbol{y}^{1} \\ &{} + \int\limits _{t^{0}}^{t^{1}} \bigl(\boldsymbol{R}_{1} + \dot{\boldsymbol{\mu}}^{\mathrm{T}} - \boldsymbol{\mu}^{\mathrm{T}} \boldsymbol{S}_{1} - \boldsymbol{\nu}^{\mathrm{T}} \boldsymbol{T}_{1} \bigr)\delta\boldsymbol{y}\,{\mathrm{d}}t \\ &{} - \boldsymbol{\nu}^{1^{\mathrm{T}}}\overline{\boldsymbol{M}}^{1}\delta\boldsymbol{z}^{1} \\ &{} + \int\limits _{t^{0}}^{t^{1}} \bigl(-\boldsymbol{\mu}^{\mathrm{T}} \boldsymbol{S}_{2} + \dot{\boldsymbol{\nu}}^{\mathrm{T}}\overline{\boldsymbol{M}} + \boldsymbol{\nu}^{\mathrm{T}}\dot{\overline{\boldsymbol{M}}} - \boldsymbol{\nu}^{\mathrm{T}}\boldsymbol{T}_{2} \bigr)\delta \boldsymbol{z}\,{\mathrm{d}}t \\ &{} + \biggl\{ \int\limits _{t^{0}}^{t^{1}} \bigl( \boldsymbol{R}_{2} - \boldsymbol{\mu}^{\mathrm{T}}\boldsymbol{S}_{3} - \boldsymbol{ \nu}^{\mathrm{T}}\boldsymbol{T}_{3} \bigr)\,{\mathrm{d}}t \biggr\} \delta\boldsymbol{x}. \end{aligned}$$
(42)

It can be seen that the term in curly brackets corresponds to the sought gradient \(\nabla\psi\), provided that the adjoint variables are chosen such that all terms multiplying the dependent variations \(\delta\boldsymbol{y}\) and \(\delta\boldsymbol{z}\) vanish at all times, including the boundary terms at the final time \(t^{1}\). From this condition and provided that the global mass matrix \(\overline{\boldsymbol{M}}\) is symmetric, the following equations can be derived for the adjoint variables:

$$ \begin{aligned} \boldsymbol{\mu}^{1} & = \boldsymbol{0}, \\ \overline{\boldsymbol{M}}^{1}\boldsymbol{\nu}^{1} & = \boldsymbol{0}, \\ \dot{ \boldsymbol{\mu}}& = \boldsymbol{S}_{1}^{\mathrm{T}}\boldsymbol{\mu}+ \boldsymbol{T}_{1}^{\mathrm{T}}\boldsymbol{\nu}- \boldsymbol{R}_{1}^{\mathrm{T}}, \\ \overline{\boldsymbol{M}}\dot{ \boldsymbol{\nu}}& = \boldsymbol{S}_{2}^{\mathrm{T}}\boldsymbol{\mu}- \dot{\overline{\boldsymbol{M}}} \boldsymbol{\nu}+ \boldsymbol{T}_{2}^{\mathrm{T}}\boldsymbol{\nu}. \end{aligned} $$
(43)
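As an intermediate step, which is only implied by the derivation, the last line of Eq. (43) can be recovered by setting the coefficient of \(\delta\boldsymbol{z}\) in Eq. (42) to zero and transposing the result:

$$ \begin{aligned} -\boldsymbol{\mu}^{\mathrm{T}}\boldsymbol{S}_{2} + \dot{\boldsymbol{\nu}}^{\mathrm{T}}\overline{\boldsymbol{M}} + \boldsymbol{\nu}^{\mathrm{T}}\dot{\overline{\boldsymbol{M}}} - \boldsymbol{\nu}^{\mathrm{T}}\boldsymbol{T}_{2} = \boldsymbol{0}^{\mathrm{T}} \quad\Rightarrow\quad \overline{\boldsymbol{M}}^{\mathrm{T}}\dot{\boldsymbol{\nu}} = \boldsymbol{S}_{2}^{\mathrm{T}}\boldsymbol{\mu} - \dot{\overline{\boldsymbol{M}}}^{\mathrm{T}}\boldsymbol{\nu} + \boldsymbol{T}_{2}^{\mathrm{T}}\boldsymbol{\nu}, \end{aligned} $$

where the symmetry \(\overline{\boldsymbol{M}}^{\mathrm{T}}=\overline{\boldsymbol{M}}\), and hence \(\dot{\overline{\boldsymbol{M}}}^{\mathrm{T}}=\dot{\overline{\boldsymbol{M}}}\), removes the transposes on the mass matrix. The equation for \(\dot{\boldsymbol{\mu}}\) follows in the same way from the coefficient of \(\delta\boldsymbol{y}\), and the two final conditions follow from the boundary terms at \(t^{1}\).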

The adjoint system (43) is a final value problem for ODEs, which has to be solved for the adjoint variables \(\boldsymbol{\mu}\) and \(\boldsymbol{\nu}\) by integrating backward in time starting at the final conditions \(\boldsymbol{\mu}^{1}=\boldsymbol{0}\) and \(\boldsymbol{\nu}^{1}=\boldsymbol{0}\), where the latter follows from \(\overline{\boldsymbol{M}}^{1}\boldsymbol{\nu}^{1}=\boldsymbol{0}\) if the mass matrix is regular. Thereafter, the gradient can be evaluated by

$$ \nabla\psi= \int\limits _{t^{0}}^{t^{1}} \bigl( \boldsymbol{R}_{2} - \boldsymbol{\mu}^{\mathrm{T}}\boldsymbol{S}_{3} - \boldsymbol{ \nu}^{\mathrm{T}}\boldsymbol{T}_{3} \bigr)\,{\mathrm{d}}t. $$
(44)

It is worth mentioning that the dimension of the adjoint differential equations and hence the computational effort to solve them do not depend on the number of design variables. Therefore, the adjoint variable method is favorable for structural optimization problems with a large number of design variables \(\boldsymbol{x}\).
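The procedure of Eqs. (43) and (44) can be illustrated with a minimal sketch on a scalar toy problem. Everything in it is an assumption made for illustration and is not the flexible multibody model of this paper: the system \(\dot{y}=z\), \(\dot{z}=-xy\) with \(y(0)=1\), \(z(0)=0\), unit mass, the single design variable \(x\) acting as a stiffness, and the objective \(\psi=\int y^{2}\,\mathrm{d}t\). For this choice, \(S_{1}=0\), \(S_{2}=-1\), \(S_{3}=0\), \(T_{1}=x\), \(T_{2}=0\), \(T_{3}=y\), \(R_{1}=2y\), and \(R_{2}=0\). A fixed-step Runge-Kutta scheme replaces the stiff solver, and the function names are hypothetical.

```python
import math

def rk4(rhs, t, s, h, n):
    """Integrate s' = rhs(t, s) with n classical Runge-Kutta steps of size h.
    A negative h integrates backward in time."""
    for _ in range(n):
        k1 = rhs(t, s)
        k2 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(s, k1)])
        k3 = rhs(t + h / 2, [a + h / 2 * b for a, b in zip(s, k2)])
        k4 = rhs(t + h, [a + h * b for a, b in zip(s, k3)])
        s = [a + h / 6 * (p + 2 * q + 2 * r + w)
             for a, p, q, r, w in zip(s, k1, k2, k3, k4)]
        t += h
    return s

def adjoint_gradient(x, t1=2.0, n=4000):
    """Gradient of psi = int_0^t1 y^2 dt with respect to the stiffness x (toy problem)."""
    h = t1 / n
    # Forward pass: y' = z, z' = -x*y, starting from y(0) = 1, z(0) = 0.
    y1, z1 = rk4(lambda t, s: [s[1], -x * s[0]], 0.0, [1.0, 0.0], h, n)

    # Backward pass on the state [y, z, mu, nu, g]:
    #   mu' = S1*mu + T1*nu - R1 = x*nu - 2*y
    #   nu' = S2*mu + T2*nu      = -mu        (unit mass, so M' = 0)
    #   g'  = R2 - mu*S3 - nu*T3 = -nu*y      (integrand of the gradient)
    # y and z are re-integrated backward so that no trajectory has to be stored.
    def backward_rhs(t, s):
        y, z, mu, nu, g = s
        return [z, -x * y, x * nu - 2.0 * y, -mu, -nu * y]

    # Final conditions mu(t1) = nu(t1) = 0; g(t1) = 0 is an arbitrary reference.
    s0 = rk4(backward_rhs, t1, [y1, z1, 0.0, 0.0, 0.0], -h, n)
    # The integral over [0, t1] of g' equals g(t1) - g(0) = -g(0).
    return -s0[4]

def exact_gradient(x, t1=2.0):
    """Closed-form d(psi)/dx for y = cos(sqrt(x) t), used only as a reference."""
    w = math.sqrt(x)
    dpsi_dw = t1 * math.cos(2 * w * t1) / (2 * w) - math.sin(2 * w * t1) / (4 * w * w)
    return dpsi_dw / (2 * w)
```

Calling `adjoint_gradient(4.0)` and `exact_gradient(4.0)` allows the two values to be compared directly. Note that the backward pass has the same size regardless of how many design variables enter through \(\boldsymbol{T}_{3}\); only the quadrature of the gradient integrand grows with the number of design variables.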

3.3 Augmented standard input data

For the evaluation of Eq. (44), among others, the derivatives of the equations of motion (15) with regard to the design variables \(\boldsymbol{x}\) are required repeatedly. To this end, on the one hand, the derivatives of the Jacobian matrices \(\boldsymbol{J}^{i}\) and the local acceleration vectors \(\boldsymbol {\gamma}^{i}\) with regard to the design variables \(\boldsymbol{x}\) have to be computed. On the other hand, the derivatives of the local equations of motion (8) with regard to \(\boldsymbol{x}\) are needed.

In order to compute the latter derivatives efficiently, it is recommended to augment the so-called standard input data (SID). The concept of the SID is suggested in [20] and used to facilitate the evaluation of the equations of motion of flexible bodies. To this end, a set of body integrals, which do not depend on the elastic coordinates, is computed before the actual time simulation. In the same way, in the current work, the SID are augmented by their derivatives with regard to the design variables \(\boldsymbol{x}\) to facilitate the computation of Eq. (44). The computational effort, however, is still considerable for large-scale problems.

This is because a large number of body integrals have to be computed and, in order to differentiate the SID, the derivatives of the global shape functions with regard to the design variables \(\partial\boldsymbol {\varPhi}^{i}/\partial \boldsymbol{x}\) and \(\partial\boldsymbol{\varPsi}^{i}/\partial \boldsymbol{x}\) have to be provided. In the following application example, the global shape functions are found from modal truncation and are thus the first eigenmodes of the underlying finite element model of the flexible body. For the precise and efficient differentiation of the eigenmodes, Nelson’s method [14] can be used, as also proposed in [5].

4 Application example

In order to test the gradient computation procedure, the structural sensitivity of a flexible piston rod of a slider–crank mechanism is analyzed. The sensitivity information can be used, for instance, in the topology optimization of the piston rod; see [10].

4.1 Flexible slider–crank mechanism

The flexible slider–crank mechanism, which is used as an application example, consists of a rigid crank and a flexible piston rod; see Fig. 3. The eccentricity \(\varepsilon\) of the crank is 0.1 m and the distance \(l\) between the bearings of the piston rod is 1 m. Since no sliding-block is attached to the system, the loading on the piston rod in motion originates only from its own inertia. As shown in [9], in this case, it is crucial to provide exact gradients in order to obtain viable optimization results.

Fig. 3 Flexible slider–crank mechanism

The motion of the system is composed of two phases and is prescribed via a rheonomic constraint on the crank angle \(\varphi\) as

$$ \varphi(t) = \textstyle\begin{cases} \sum\limits_{i=0}^{7} a_{i} t^{i}, & 0~\mbox{s} \le t \le2~\mbox{s}, \\ \varOmega^{1} (t - 2~\mbox{s}) + \varphi^{1}, & 2~\mbox{s} < t \le3~\mbox{s}. \end{cases} $$
(45)

In the first phase, the crank is accelerated within two seconds from a resting position until a constant angular velocity is reached. Then, in the second phase, the angular velocity is kept constant for another second. For a jerk-free transition at the beginning and the end of the first phase, the polynomial coefficients \(a_{i}\) are chosen such that the following boundary conditions hold:

$$ \begin{aligned} t^{0}=0~\mbox{s}\colon\quad& \varphi= \dot{ \varphi} = \ddot{\varphi} = \dddot{\varphi} = 0, \\ t^{1}=2~\mbox{s}\colon\quad& \varphi= \varphi^{1} = 12\pi~ \mbox{rad}, \\ & \dot{\varphi} = \varOmega^{1} = 12\pi~\mbox{rad}/\mbox{s}, \\ & \ddot{\varphi} = \dddot{\varphi} = 0. \end{aligned} $$
(46)
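The eight boundary conditions (46) determine the eight coefficients \(a_{0},\dots,a_{7}\) uniquely. The following sketch shows one way to compute them; it is not taken from the original implementation. It assumes that the final angular velocity is read as \(12\pi\) rad/s, and the function names are hypothetical. Since \(\varphi\) and its first three derivatives vanish at \(t=0\), the coefficients \(a_{0}\) to \(a_{3}\) are zero, and only a \(4\times4\) linear system for \(a_{4}\) to \(a_{7}\) remains.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def falling(i, k):
    """Factor i (i-1) ... (i-k+1) produced by differentiating t**i k times."""
    out = 1
    for j in range(k):
        out *= i - j
    return out

def crank_coefficients(t_end=2.0, phi_end=12 * math.pi, omega_end=12 * math.pi):
    """Coefficients a_0..a_7 of the degree-7 crank angle polynomial."""
    powers = [4, 5, 6, 7]  # a_0..a_3 vanish due to the conditions at t = 0
    # Row k enforces the k-th time derivative of phi at t_end.
    rows = [[falling(i, k) * t_end ** (i - k) for i in powers] for k in range(4)]
    rhs = [phi_end, omega_end, 0.0, 0.0]
    return [0.0] * 4 + solve_linear(rows, rhs)

def crank_derivative(a, t, k=0):
    """k-th time derivative of phi(t) = sum_i a_i t**i."""
    return sum(falling(i, k) * a[i] * t ** (i - k) for i in range(k, len(a)))
```

Under these assumptions, `crank_coefficients()` returns, up to rounding, \(a_{4}=3.75\pi\), \(a_{5}=-2.25\pi\), \(a_{6}=0.375\pi\), and \(a_{7}=0\), so the prescribed motion is effectively a polynomial of degree six.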

4.2 Parameterization and formulation of the objective function

In the slider–crank mechanism, the piston rod is assumed to be flexible. To determine a set of global shape functions \(\boldsymbol {\varPhi}\), which describe the elastic deformations of the rod, a finite element model is generated. To this end, a reference domain with dimensions \((1.0\times0.06\times0.01 )~\mbox{m}\) is meshed using \(6\times100\) planar 4-node bilinear elements; see Fig. 4.

Fig. 4 SIMP parameterized FE model of the design domain

In order to analyze the design of the rod, the density \(\rho\) and the stiffness \(E\) of the elements are not constant but are parameterized using the solid isotropic material with penalization (SIMP) approach; see [1]. In this popular topology optimization strategy, continuous density-like design variables \(x_{i}\in(0,1]\) are introduced for each finite element. Then, following, for instance, the SIMP approach suggested in [15], the density and stiffness of an element \(i\) are computed as

$$ \begin{aligned} \rho_{i} & = \textstyle\begin{cases} c x_{i}^{q}\rho_{0} & \text{for } x_{\mathrm{min}} = 0.01 \le x_{i} < 0.1, \\ x_{i}\rho_{0} & \text{for } 0.1 \le x_{i} \le1, \end{cases}\displaystyle \\ E_{i} & = x_{i}^{p} E_{0}, \end{aligned} $$
(47)

where \(\rho_{0}\) and \(E_{0}\) are the density and the stiffness of the solid material, whereas \(c\), \(p\), and \(q\) are scalar parameters. For the modeling of the flexible piston rod, the parameters are chosen as \(\rho_{0} = 8750~\mbox{kg}/\mbox{m}^{3}\), \(E_{0} = 0.5\cdot10^{11}~\mbox{N}/\mbox{m}^{2}\), \(c = 10^{5}\), \(p = 3\), and \(q = 6\). With the exception of the first and last columns, which are assumed to be rigid interfaces, all elements of the model are parameterized; compare Fig. 4. Hence, the number of design variables \(\boldsymbol{x}\) is 588.
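The interpolation (47) and its derivatives with regard to \(x_{i}\), which enter the sensitivity analysis, can be sketched as follows. The parameter values are those stated above; the function and constant names are hypothetical and chosen for illustration only.

```python
RHO_0 = 8750.0      # density of the solid material in kg/m^3
E_0 = 0.5e11        # stiffness of the solid material in N/m^2
C, P, Q = 1.0e5, 3, 6
X_MIN, X_SWITCH = 0.01, 0.1

def simp_density(x):
    """Return (rho, d rho / d x) of one element according to Eq. (47)."""
    if not X_MIN <= x <= 1.0:
        raise ValueError("design variable outside [x_min, 1]")
    if x < X_SWITCH:
        # Penalized branch for low densities.
        return C * x ** Q * RHO_0, Q * C * x ** (Q - 1) * RHO_0
    # Linear branch.
    return x * RHO_0, RHO_0

def simp_stiffness(x):
    """Return (E, d E / d x) of one element according to Eq. (47)."""
    if not X_MIN <= x <= 1.0:
        raise ValueError("design variable outside [x_min, 1]")
    return x ** P * E_0, P * x ** (P - 1) * E_0
```

With \(c=10^{5}\) and \(q=6\), both density branches give \(0.1\,\rho_{0}\) at \(x_{i}=0.1\), so the density is continuous there, whereas its slope changes from \(6\rho_{0}\) to \(\rho_{0}\). At the evaluation point \(x_{i}=0.5\) used later, the sketch returns \(\rho_{i}=4375~\mbox{kg}/\mbox{m}^{3}\) and \(E_{i}=6.25\cdot10^{9}~\mbox{N}/\mbox{m}^{2}\).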

Assembling the global mass matrix \(\boldsymbol{M}_{\mathrm{e}}\), the global stiffness matrix \(\boldsymbol{K}_{\mathrm{e}}\), and the global vector of applied forces \(\boldsymbol{f}_{\mathrm{e}}\), the linear equations of motion of the finite element model can be obtained as

$$ \boldsymbol{M}_{\mathrm{e}}\ddot{\boldsymbol{u}}+ \boldsymbol{K}_{\mathrm{e}} \boldsymbol{u}= \boldsymbol{f}_{\mathrm{e}}. $$
(48)

Even though the number of nodal degrees of freedom \(\boldsymbol{u}\) is comparatively small in the current model, it is not efficient to consider all of them in the multibody simulation. Therefore, the dimension of Eq. (48) is reduced to \(n_{\mathrm{q}} = 6\) elastic degrees of freedom \(\boldsymbol{q}_{\mathrm{e}}\) by applying modal truncation; see [7] for details. To each of these six elastic degrees of freedom a global shape function \(\boldsymbol{\phi}^{i}\) is assigned. It should be noted that since the finite element model depends on the vector of design variables \(\boldsymbol{x}\), the global shape functions, which are gathered in \(\boldsymbol{\varPhi}(\boldsymbol{x}) = [\boldsymbol{\phi }^{1}(\boldsymbol{x}), \dots, \boldsymbol{\phi}^{n_{\mathrm{q}}}(\boldsymbol{x})]\), depend on the design variables, too.

With the flexible body parameterized, an appropriate objective function has to be chosen. In this work, the integral compliance, which is a typical objective function in topology optimization [1, 9],

$$ \psi(\boldsymbol{x}) = \int\limits _{t^{0}}^{t^{1}} (\boldsymbol{\varPhi} \boldsymbol{q}_{\mathrm{e}} )^{\mathrm{T}} \boldsymbol{K}_{\mathrm{e}} \boldsymbol{\varPhi}\boldsymbol{q}_{\mathrm{e}}\,\mathrm{d}t $$
(49)

is chosen to assess the performance of the design. The structure of Eq. (49) is the same as that of the general objective function (20). The integrand depends via the elastic coordinates \(\boldsymbol{q}_{\mathrm{e}}\) on the redundant coordinates \(\boldsymbol{y}_{\mathrm{r}}\) and via the stiffness matrix \(\boldsymbol{K}_{\mathrm{e}}\) and the global shape functions \(\boldsymbol{\varPhi} \) on the design variables \(\boldsymbol{x}\).

4.3 Gradient evaluation

The gradient \(\nabla\psi\) of the objective function (49) is evaluated at the point \(x_{i} = 0.5\), \(i=1\dots588\), using two different approaches. On the one hand, the adjoint variable method is employed as described in this work. On the other hand, in order to verify the results, the central finite difference method is used, whereby each design variable is perturbed by 0.1. In both cases, the forward integration of the equations of motion is performed using the implicit MATLAB solver ode15s; see [13]. The absolute and relative integration tolerances are \(10^{-12}\) and \(10^{-10}\), respectively. The backward integration of the adjoint ODEs is also performed using ode15s, with absolute and relative integration tolerances of \(10^{-10}\) and \(10^{-8}\), respectively.
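For reference, the central finite difference scheme used for verification can be sketched as follows. This is a generic illustration and not the original MATLAB implementation. The objective `f` stands in for a full forward simulation, the perturbation is read as an absolute step `h` (an assumption), and the function name is hypothetical.

```python
def central_difference_gradient(f, x, h):
    """Approximate the gradient of f at x with central differences.

    Every design variable is perturbed by +h and -h in turn, so that
    2 * len(x) evaluations of f are needed.
    """
    grad = []
    for i in range(len(x)):
        x_plus = list(x)
        x_minus = list(x)
        x_plus[i] += h
        x_minus[i] -= h
        grad.append((f(x_plus) - f(x_minus)) / (2.0 * h))
    return grad
```

For 588 design variables, this amounts to \(2\cdot588 = 1176\) evaluations of the objective function, each of which corresponds to one forward time integration in the setting considered here. The result is also only approximate: for a cubic test function \(f=\sum_{i} x_{i}^{3}\), the scheme returns \(3x_{i}^{2}+h^{2}\) instead of \(3x_{i}^{2}\).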

In Figs. 5 and 6, the results of both methods are visualized as surfaces above the reference domain of the flexible piston rod. Thus, the \(x\)-axis ranges from 0.01 to \(0.99~\mbox{m}\) and the \(y\)-axis from \(-0.03 \mbox{ to } 0.03~\mbox{m}\).

Fig. 5 Topological gradient using the adjoint variable method

Fig. 6 Topological gradient using the finite difference method

First of all, it can be noted that the resulting gradients are in good agreement and are reasonable. Whereas the gradients are negative for the upper and lower elements of the piston rod, they are positive for the inner elements of the structure. This is due to the fact that the piston rod is loaded by its own inertia only. The upper and lower elements support the piston rod against bending. Here, increasing the amount of material reduces the compliance of the piston rod. In contrast, the inner elements in the domain \(-0.01~\mbox{m} \le e_{2} \le 0.01~\mbox{m}\) contribute little to the stiffness but cause loading due to their inertia. As a consequence, the gradient is positive in this area.

The maximal absolute error between the solutions obtained with the adjoint variable and the finite difference method \(\max(|\nabla\psi_{\mathrm{avm}}| - |\nabla\psi_{\mathrm{fd}}| )\) is \(2.5\cdot10^{-4}\) and, hence, only about \(3.4~\%\) of the maximal absolute value \(\max(|\nabla\psi_{\mathrm{avm}}| ) = 7.4\cdot10^{-3}\). Whereas both methods suffer from numerical errors made, for instance, in the modal reduction, the computation of the standard input data, or the time integration, there is an additional approximation error in the finite difference method. As a consequence, the adjoint variable method returns a smooth gradient, whereas this is not the case for the finite difference method. Also, comparing the computational times, we can see the superiority of the adjoint variable method. The overall gradient computation using the adjoint approach requires less than 5 min, whereas it takes roughly 12 h with the central finite difference method.

5 Summary and conclusion

Employing the adjoint variable method, it is possible to perform large-scale sensitivity analyses in flexible multibody systems modeled with the floating frame of reference approach, as they occur, for instance, in topology optimization. For the general case of flexible multibody systems with kinematic loops, it is shown that both the equations of motion and the adjoint differential equations can be derived and solved as ODEs only. To this end, on the one hand, a coordinate partitioning is used to represent the equations of motion in minimal coordinates. On the other hand, in deriving the adjoint equations, the variations of the dependent position and velocity coordinates have to be substituted. The necessary equations are found from the variations of the constraint equations at position and velocity level.

The procedure is tested by evaluating the structural sensitivity of a flexible slider–crank mechanism, which is parameterized using the SIMP approach. The results show that, in contrast to the finite difference method, the adjoint variable method is able to provide exact and smooth gradient information in reasonable computing times. It should therefore be preferred over the finite difference method for the large-scale sensitivity analysis of flexible multibody systems.