Semi-analytical sensitivity analysis for nonlinear transient problems

Fernandez, Felipe; Tortorelli, Daniel A.

doi:10.1007/s00158-018-2096-y

Research Paper
Published: 28 September 2018

Volume 58, pages 2387–2410, (2018)

Structural and Multidisciplinary Optimization

Abstract

Efficient analytical sensitivity computations are essential elements of gradient-based optimization schemes; unfortunately, they can be difficult to implement. This implementation issue is often resolved by adopting the semi-analytical method which exhibits the efficiency of the analytical methods and the ease of implementation of the finite difference method. However, care must be taken as semi-analytical sensitivities may exhibit errors due to truncation and round-off. Additional errors are introduced if the convergence tolerance of the primal analysis is not sufficiently small. This paper gives a general overview and some new developments of the analytical and semi-analytical sensitivity analyses for nonlinear steady-state, transient, and dynamic problems. We discuss the restrictive assumptions, accuracy, and consistency of these methods. Both adjoint and direct differentiation methods are studied. Numerical examples are provided.

1 Introduction

Both analyses and design sensitivity analyses are crucial in gradient-based optimization, wherein analyses are performed to predict the performance of proposed designs, while design sensitivity analyses are performed to quantify the performance changes with respect to design changes. Since the optimization is iterative and because it relies on accurate values of the gradients, efficient and accurate sensitivity analyses are essential. The finite difference sensitivity method requires one re-analysis to compute the sensitivities of the performance functions with respect to each design variable, so this method is extremely inefficient, especially when the primal analysis is time-consuming. On the other hand, analytical direct differentiation and adjoint sensitivity analyses are very efficient. Unfortunately, this efficiency requires the analytical evaluation of various derivatives which may be difficult to compute since they require detailed knowledge of the analysis program. Indeed, analytical sensitivities require the differentiation of specific element formulations and material models with respect to a variety of design variables (Cheng and Olhoff 1993; Kiendl et al. 2014). To alleviate these implementation issues, the semi-analytical method approximates these derivatives with finite differences; as such, little knowledge of the analysis program is required. However, care must be exercised as the accuracy of the semi-analytical method depends on the finite difference perturbation size. For a thorough review of sensitivity analyses and the semi-analytical method see Haftka and Adelman (1989), Tortorelli and Michaleris (1994), Gunzburger (2003), van Keulen et al. (2005), Haftka and Gürdal (2012).

Much work has been focused on the semi-analytical method for linear static structural problems (Gallagher and Zienkiewicz 1973; Botkin 1982; Camarda and Adelman 1984; Esping 1984; Cheng and Liu 1987; Barthelemy et al. 1988; Pedersen et al. 1989; Barthelemy and Haftka 1990; Haftka and Adelman 1989; Fenyes and Lust 1991; Olhoff and Rasmussen 1991; Bestle and Seybold 1992), especially its application to shape sensitivity analysis.

Our response functions are integrals over the domain. In shape sensitivity analysis, the design variables include geometric parameters that define this domain. Thus, analytical shape sensitivity analyses require the use of the material derivative from continuum mechanics and such computations can be onerous. For this reason, the semi-analytical method of shape sensitivity analyses may be preferable for its ease of implementation; moreover, it is fully reliable for most problems in which the structural displacement field entails small rigid-body rotations relative to deformations of the finite elements (Olhoff et al. 1993). However, large errors attributed to rigid body rotations of the finite elements have been found in shape sensitivities computed with the semi-analytical method (Barthelemy et al. 1988; Cheng et al. 1989; Pedersen et al. 1989; Fenyes and Lust 1991; Olhoff and Rasmussen 1991; Cheng and Olhoff 1993).

Different approaches have been suggested to improve the accuracy of the semi-analytical method. For example, improved accuracy is obtained by using the second-order central differences scheme, instead of first-order accurate forward differences (Barthelemy et al. 1988; Cheng et al. 1989; Haftka and Adelman 1989; Pedersen et al. 1989; Fenyes and Lust 1991). This method incurs additional computational cost and unfortunately does not completely eliminate the errors caused by large rigid body motions in shape sensitivity analysis. To circumvent this, the natural approach retains consistency conditions for rigid body modes and their derivatives (Mlejnek 1992). Alternatively, the analytical derivatives of the element rigid body modes are incorporated in the refined semi-analytical design sensitivities approach to alleviate inaccuracies (Van Keulen and De Boer 1998). Utilizing specific characteristics of the element stiffness matrices to compute correction factors, the so-called exact semi-analytical method eliminates truncation error (Olhoff et al. 1993). A proposed improved semi-analytical method obtains better accuracy by using the von Neumann series (Oral 1996).

Kiendl et al. (2014) use the isogeometric finite element method in which non-uniform rational B-splines (NURBS) are used to parameterize both the finite element response and the domain geometry. A multilevel approach allows for a coarser, i.e., smoother, design parameterization than that of the finite element response. The semi-analytical method is combined with a sensitivity weighting scheme to compute the design updates for their optimization example problems.

Semi-analytical methods have been applied for nonlinear static structures. Haftka (1993) and Mróz and Haftka (1994) use it to compute sensitivities of limit loads and show that the semi-analytical method is equivalent to the overall finite difference method when a single Newton iteration is used. A more thorough formulation of the refined semi-analytical method was presented for linear, linearized buckling, geometrically nonlinear and limit point analyses in de Boer and van Keulen (2000). The exact semi-analytical method has also been extended to geometric nonlinearities in Wang et al. (2015). Curiously, this formulation uses the secant stiffness matrix and incorporates correction terms to eliminate truncation errors.

The refined semi-analytical approach was also extended to obtain second-order derivatives (de Boer et al. 2002). The higher-order semi-analytical derivatives studied by Bernard et al. (1993) use cubic polynomials to develop surrogate models of the mass and stiffness matrices so that higher-order derivatives can be easily computed.

Sensitivity analysis for transient problems has been extensively studied (Adelman and Haftka 1986; Haug 1987; Haftka and Gürdal 2012). These studies included nonlinearities (Ray et al. 1978; Michaleris et al. 1994; Kreissl et al. 2011; Deng et al. 2011), and shape sensitivities (Meric 1988; Tortorelli et al. 1991). The semi-analytical method has been applied to linear transient structural problems using a reduced order modal model (Camarda and Adelman 1984; Greene and Haftka 1991; Hooijkamp and van Keulen 2018). As such, these methods are restricted to linear systems.

The semi-analytical method has been applied to transient heat conduction problems (Gu and Grandhi 1998), including nonlinear behaviour (Gu et al. 2002), and nonlinear coupled with structural dynamics (Chen et al. 2003). It is unclear how their use of the Precise Time Integration scheme (Zhong and Williams 1994) which is limited to time varying linear systems affects the accuracy of their nonlinear analyses and subsequent sensitivity analysis.

Semi-analytical sensitivity analysis via direct differentiation has been applied to dynamic systems with large rotations (Brüls and Eberhard 2008), and to flexible multibody systems (Tromme et al. 2015). In the latter, to ease the computation, the pseudo load is approximated using the perturbation of the residual. This approximation is easy to implement, since simulation codes usually have a function to compute the residual (Tromme et al. 2015). The goal of this paper is to study this formulation and extend it to the adjoint method.

In the following, we study the semi-analytical method to facilitate the sensitivity analyses for transient nonlinear systems. The transient problems are treated as general as possible. To do this, we use both an implicit-explicit time integration algorithm and the popular Newmark time stepping method. Additionally, we use a general formulation, so the methods can be applied to any type of transient problems (e.g., thermal, structural, multibody, etc.). We systematically develop direct and adjoint sensitivity analysis approaches. Furthermore, for transient and dynamic problems, we study the adjoint differentiate-then-discretize and the adjoint discretize-then-differentiate approaches. The adjoint semi-analytical sensitivity analysis approaches require restrictive assumptions. In particular, we show that the adjoint differentiate-then-discretize method exhibits consistency error and requires some terms to be constant in order to reuse the tangent stiffness matrix from the primal analysis. We also show that the semi-analytical adjoint differentiate-then-discretize method, for nonlinear transient and nonlinear dynamic systems is limited to systems with symmetric stiffness and damping matrices. Fortunately, we show that by using an implicit time integration, the discretize-then-differentiate adjoint method can accommodate asymmetric stiffness matrices.

The major contributions of this paper are (1) an overview of analytical sensitivity analysis for nonlinear transient problems, (2) the development of novel efficient semi-analytical formulations, (3) the identification of restrictions for semi-analytical adjoint methods, and (4) a discussion of the consistency and accuracy of the methods.

This paper gives a general overview of the finite difference method (Section 2.2) and the analytical and semi-analytical sensitivity analyses for nonlinear steady state (Section 2), transient (Section 3) and dynamic (Section 4) systems. Numerical examples are provided in Sections 3.6 and 4.6, wherein the accuracy of the methods is discussed. To quantify the accuracy, we introduce the relative percentage error between the sensitivities obtained by finite differences δF_f and the analytical sensitivities δF as

$$ e_{f}= \left| \frac{\delta F_{f}-\delta F}{\delta F} \right| 100\% . $$

(1)

Similarly, we compute the relative error of the semi-analytical sensitivities δF_s with respect to the analytical sensitivities δF as

$$ e_{s}=\left| \frac{\delta F_{s}-\delta F}{\delta F} \right| 100\% . $$

(2)

2 Steady-state nonlinear problems

After finite element discretization, the steady-state nonlinear problem is expressed in terms of the residual function R via the equation

$$ \textbf{R}(\textbf{U})=\textbf{0} , $$

(3)

where U is the response vector, e.g., displacement. This nonlinear problem is solved using the iterative Newton-Raphson method. If the current iterate U_j is not a solution, i.e., R(U_j) ≠ 0, then the next iterate U_{j+1} = U_j + ΔU_j is computed by equating the first-order Taylor series expansion of R about U_j to zero, i.e.,

$$ \textbf{R}(\textbf{U}_{j + 1})=\textbf{R}(\textbf{U}_{j}+{\Delta} \textbf{U}_{j})\approx \textbf{R}(\textbf{U}_{j})+{\textbf{K}} {\Delta} \textbf{U}_{j}=\textbf{0} , $$

(4)

where K = ∂R/∂U is the tangent matrix. The incremental response update ΔU_j is obtained by solving the linear equation

$$ \textbf{K}(\textbf{U}_{j}){\Delta} \textbf{U}_{j}=-\textbf{R}(\textbf{U}_{j}) , $$

(5)

whereafter the next iterate

$$ \textbf{U}_{j + 1}=\textbf{U}_{j}+{\Delta} \textbf{U}_{j} , $$

(6)

is computed. The steps of evaluating the residual R and updating the response U are repeated until the solution converges to within a user-specified tolerance, i.e., until |R(U)|≤ 𝜖_R.
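The iteration of (5) and (6) can be sketched in a few lines. The following is a minimal illustration, assuming a hypothetical two-degree-of-freedom residual with cubic stiffening; the functions `residual` and `tangent`, the load, and the tolerance are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def newton_raphson(residual, tangent, U0, tol=1e-10, max_iter=50):
    """Solve residual(U) = 0 with the updates of (5) and (6)."""
    U = np.array(U0, dtype=float)
    for _ in range(max_iter):
        R = residual(U)
        if np.linalg.norm(R) <= tol:          # convergence check |R(U)| <= eps_R
            return U
        dU = np.linalg.solve(tangent(U), -R)  # K(U_j) dU_j = -R(U_j)
        U = U + dU                            # U_{j+1} = U_j + dU_j
    raise RuntimeError("Newton-Raphson did not converge")

# Hypothetical residual: linear spring chain plus cubic hardening, unit load on dof 0.
def residual(U):
    return np.array([2.0*U[0] - U[1] + U[0]**3 - 1.0,
                     -U[0] + 2.0*U[1] + U[1]**3])

# Tangent matrix K = dR/dU of the hypothetical residual above.
def tangent(U):
    return np.array([[2.0 + 3.0*U[0]**2, -1.0],
                     [-1.0, 2.0 + 3.0*U[1]**2]])

U = newton_raphson(residual, tangent, np.zeros(2))
print(U, np.linalg.norm(residual(U)))
```

In a finite element code the dense `np.linalg.solve` call would typically be replaced by a sparse factorization, which is what makes the later reuse of the factored tangent matrix attractive.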

2.1 Sensitivity analysis of steady-state nonlinear systems

For the sensitivity analysis we treat the residual R and the response U as functions of the n_d vector of design variables $\textbf {d}=[d_{1},d_{2},\ldots,d_{n_{d}}]^{\top }$, i.e., we now express (3) as

$$ \textbf{R}(\textbf{U}(\textbf{d}),\textbf{d})=\textbf{0} . $$

(7)

After completing the primal analysis of (3), we can evaluate any number of response functions F. For our purposes, the response function depends on the response U(d) to the problem in (7) whereby we express

$$ F(\textbf{d})=G(\textbf{U}(\textbf{d}),\textbf{d}) . $$

(8)

Using the chain rule, the derivative of the response functional of (8) with respect to each d_i is

$$ \frac{\mathrm{D}F}{\mathrm{D}d_{i}}=\frac{\mathrm{D}G}{\mathrm{D}d_{i}}=\frac{\partial G}{\partial \textbf{U}} \frac{\partial \textbf{U}}{\partial d_{i}}+\frac{\partial G}{\partial d_{i}} , $$

(9)

where ∂U/∂d_i is implicitly defined through (7).

2.2 Finite difference method

The forward finite difference method approximates the derivatives of a response function F using a truncated Taylor series expansion

$$ \frac{\mathrm{D}F(\textbf{d})}{\mathrm{D}d_{i}}\approx\frac{F(\textbf{d}+\epsilon \textbf{e}_i)-F(\textbf{d})}{\epsilon} , $$

(10)

where e_i = [0,0,...,1,...,0,0]^⊤ is the unit vector of component i, and 𝜖 is the perturbation. The approximation (DF(d)/Dd_i) 𝜖 ≈ F(d + 𝜖e_i) − F(d) exhibits truncation error o(𝜖), where o is a function defined such that o(𝜖) tends to zero faster than 𝜖, i.e., $\lim _{\epsilon \to 0} o(\epsilon )/\epsilon = 0$. To reduce the truncation error o(𝜖) it is desirable to choose a small 𝜖; however, numerical round-off error will erode the accuracy of the approximation if 𝜖 is too small.

Since the response function depends on the response U(d), the approximation of (10) is expressed by

$$ \frac{\mathrm{D}F(\textbf{d})}{\mathrm{D}d_{i}}\approx\frac{G(\textbf{U}(\textbf{d}+\epsilon \textbf{e}_i),\textbf{d}+\epsilon \textbf{e}_i)-G(\textbf{U}(\textbf{d}),\textbf{d})}{\epsilon} . $$

(11)

As seen above, the response U(d + 𝜖e_i) must be calculated for each design variable d_i; this is easily obtained by modifying the finite element model, but computationally inefficient because it requires n_d additional simulations to compute the U(d + 𝜖e_i). Note that second-order accurate approximations which are accurate to o(𝜖²) can be obtained by central differences, but this requires two re-analyses for U(d ± 𝜖e_i) which is even more costly. As seen here, the finite difference method is easy to implement, computationally inefficient, and subjected to truncation and round-off errors.
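The trade-off between truncation and round-off error can be observed on a scalar function. The sketch below assumes a hypothetical response F(d) = sin(d), chosen only because its derivative is known exactly; the perturbation sizes are likewise illustrative.

```python
import math

def forward_diff(F, d, eps):
    # First-order forward difference, cf. (10)
    return (F(d + eps) - F(d)) / eps

def central_diff(F, d, eps):
    # Second-order central difference; needs two perturbed evaluations
    return (F(d + eps) - F(d - eps)) / (2.0 * eps)

F = math.sin          # hypothetical response function
d = 1.0
exact = math.cos(d)   # known derivative, used only to measure the error

# Large eps: truncation dominates. Tiny eps: round-off dominates.
errors_forward = {eps: abs(forward_diff(F, d, eps) - exact)
                  for eps in (1e-2, 1e-8, 1e-14)}
error_central = abs(central_diff(F, d, 1e-2) - exact)

for eps, err in errors_forward.items():
    print(f"forward, eps={eps:g}: error={err:.2e}")
print(f"central, eps=0.01: error={error_central:.2e}")
```

In double precision the forward-difference error is smallest for the intermediate perturbation, and the central difference is more accurate than the forward difference at the same large perturbation.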

2.3 Direct differentiation for steady-state nonlinear systems

In the direct differentiation approach, the implicit derivative ∂U/∂d_i, i.e., the pseudo response, is obtained by differentiating (7) with respect to d_i, which after some rearranging defines the so-called pseudo problem

$$ \textbf{K} \frac{\partial \textbf{U}}{\partial d_{i}}=-\frac{\partial \textbf{R}}{\partial d_{i}} , $$

(12)

where − ∂R/∂d_i is the pseudo load. Notice that the tangent operator K from the primal analysis appears in the pseudo problem; moreover, it is already factored, assuming the use of direct solvers in the primal analysis. Thus, the evaluation of the implicit derivative ∂U/∂d_i only requires the formation of the pseudo load vector − ∂R/∂d_i and a back substitution. Once the implicit derivative ∂U/∂d_i is obtained, (9) is evaluated to obtain the sensitivities for any number of functions F. As seen here, the direct method is computationally efficient because it solves one pseudo problem using the previously factored tangent matrix for each design variable regardless of the number of response functions. In addition, the computed sensitivities are numerically exact.

In the semi-analytical formulation, the derivatives ∂R/∂d_i and DG/Dd_i of (12) and (9) are approximated to within o(𝜖) via finite differences, i.e., ∂R/∂d_i ≈ R(U(d),d + 𝜖e_i)/𝜖 and

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} G(\textbf{U}(\textbf{d}),\textbf{d})}{\mathrm{D} d_{i}} &\approx& \frac{1}{\epsilon}\left( G \left( \textbf{U}(\textbf{d})+\epsilon \frac{\partial \textbf{U} (\textbf{d})}{\partial d_{i}},\textbf{d}+\epsilon \textbf{e}_i\right) -G(\textbf{U}(\textbf{d}),\textbf{d})\right) . \end{array} $$

(13)

In the approximation of ∂R/∂d_i, we assume the residual R(U(d),d) = 0, so the unperturbed residual is omitted from the difference quotient; however, we solve the primal analysis until the solution converges to a user defined tolerance, i.e., |R(U(d),d)|≤ 𝜖_R. This tolerance imposes a new source of error in addition to the truncation and round-off errors.

Since the function G is known, the derivatives ∂G/∂d_i and ∂G/∂U in the sensitivity DF/Dd_i can be computed exactly as in (9) or approximated as in (13). We assume the former.

The approximations of ∂R/∂d_i and DG/Dd_i are easy to implement because they only require the generation of the perturbed design d + 𝜖e_i followed by the evaluations of the perturbed residual R(U(d),d + 𝜖e_i) and response function G(U(d) + 𝜖∂U/∂d_i,d + 𝜖e_i), which are readily computed by the subroutines that are used to compute R(U(d),d) and G(U(d),d). Thus, the semi-analytical method shares the simplicity of the finite difference method and the efficiency of the analytical methods. It is noted, however, that the tolerance 𝜖_R, truncation, and round-off errors may pollute the results. In most cases, a design perturbation will not affect all of the element internal force vectors. As such, we only need to evaluate the elemental residual R(U(d),d + 𝜖e_i) of the affected elements. An extreme case of this occurs in topology optimization where each volume fraction design variable only affects a single element. Less extreme cases occur in shape optimization where each dimensional change may only affect a subset of the boundary elements.
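A minimal sketch of the semi-analytical direct method for a steady-state problem follows. It assumes a hypothetical two-spring chain with cubic hardening whose linear stiffnesses are the design variables, and a response G = U·U with no explicit design dependence; all function names and parameter values are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

def residual(U, d):
    # Hypothetical two-spring chain with cubic hardening; d holds the linear stiffnesses.
    return np.array([d[0]*U[0] + d[1]*(U[0] - U[1]) + U[0]**3,
                     d[1]*(U[1] - U[0]) + U[1]**3 - 1.0])

def tangent(U, d):
    return np.array([[d[0] + d[1] + 3.0*U[0]**2, -d[1]],
                     [-d[1], d[1] + 3.0*U[1]**2]])

def solve_primal(d, tol=1e-12, max_iter=50):
    U = np.zeros(2)
    for _ in range(max_iter):
        R = residual(U, d)
        if np.linalg.norm(R) <= tol:
            break
        U = U - np.linalg.solve(tangent(U, d), R)
    return U

def response(U):
    return float(U @ U)          # hypothetical G(U, d) = U . U

def direct_semi_analytical(d, eps=1e-6):
    U = solve_primal(d)
    K = tangent(U, d)            # a production code would reuse the primal factorization
    dG_dU = 2.0 * U              # known analytically
    grad = np.zeros(len(d))
    for i in range(len(d)):
        e_i = np.zeros(len(d))
        e_i[i] = 1.0
        # Pseudo load -dR/dd_i from one perturbed residual, assuming R(U, d) = 0
        pseudo_load = -residual(U, d + eps*e_i) / eps
        dU_ddi = np.linalg.solve(K, pseudo_load)   # pseudo problem (12)
        grad[i] = dG_dU @ dU_ddi                   # (9) with dG/dd_i = 0
    return grad

d = np.array([2.0, 1.0])
grad = direct_semi_analytical(d)
print(grad)
```

Only the residual routine is called with a perturbed design, so no element-level derivative code is needed. One linear solve is performed per design variable.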

2.4 Adjoint method for steady-state nonlinear systems

In the adjoint method, the derivative ∂U/∂d_i is annihilated. This formulation uses the identity

$$ \frac{\mathrm{D}{F}}{\mathrm{D}d_{i}}=\frac{\partial G}{\partial \textbf{U}} \frac{\partial \textbf{U}}{\partial d_{i}}+\frac{\partial G}{\partial d_{i}}+{\boldsymbol{\Lambda}}^{\top}\left( \textbf{K} \frac{\partial \textbf{U}}{\partial d_{i}}+\frac{\partial \textbf{R}}{\partial d_{i}}\right) , $$

(14)

which follows from (9) and (12). In the above, Λ is the arbitrary adjoint vector. Rearranging (14) yields

$$ \frac{\mathrm{D}{F}}{\mathrm{D}d_{i}}=\left( \frac{\partial G}{\partial \textbf{U}}+{\boldsymbol{\Lambda}}^{\top} \textbf{K} \right) \frac{\partial \textbf{U}}{\partial d_{i}}+\frac{\partial G}{\partial d_{i}}+{\boldsymbol{\Lambda}}^{\top}\frac{\partial \textbf{R}}{\partial d_{i}} , $$

(15)

from which we identify the adjoint problem that we solve for the heretofore arbitrary Λ, i.e.,

$$ \textbf{K}^{\top} \boldsymbol{\Lambda}=-\frac{\partial G}{\partial\textbf{U}}^{\top} . $$

(16)

In this way, the term containing ∂U/∂d_i is annihilated from (15) reducing the sensitivity to

$$ \frac{\mathrm{D}{F}}{\mathrm{D}d_{i}}=\frac{\partial G}{\partial d_{i}}+\boldsymbol{\Lambda}^{\top}\frac{\partial \textbf{R}}{\partial d_{i}} . $$

(17)

The adjoint method requires the solution of one adjoint problem (cf. (16)) for each response function F regardless of the number of design variables. And like the direct method, the adjoint problem utilizes the tangent matrix from the primal analysis, so it is also computationally efficient and numerically exact. Furthermore, the tangent stiffness matrix may be already factored, if a direct solver is used in the primal analysis.

In the semi-analytical formulation, the derivative ∂R/∂d_i is approximated via finite differences, as in Section 2.3, and (17) is used to obtain the sensitivities. As previously mentioned, the derivative ∂G/∂U is obtained analytically using our knowledge of the function G.
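The adjoint counterpart can be sketched in the same way. The example below assumes a hypothetical two-degree-of-freedom residual with a deliberately asymmetric tangent matrix, so that the transpose in (16) matters, and a response G = U·U; the names and values are illustrative assumptions only.

```python
import numpy as np

def residual(U, d):
    # Hypothetical system; the 0.2*U[0] term makes the tangent matrix asymmetric.
    return np.array([d[0]*U[0] + d[1]*(U[0] - U[1]) + U[0]**3,
                     d[1]*(U[1] - U[0]) + 0.2*U[0] + U[1]**3 - 1.0])

def tangent(U, d):
    return np.array([[d[0] + d[1] + 3.0*U[0]**2, -d[1]],
                     [0.2 - d[1], d[1] + 3.0*U[1]**2]])

def solve_primal(d, tol=1e-12, max_iter=50):
    U = np.zeros(2)
    for _ in range(max_iter):
        R = residual(U, d)
        if np.linalg.norm(R) <= tol:
            break
        U = U - np.linalg.solve(tangent(U, d), R)
    return U

def response(U):
    return float(U @ U)          # hypothetical G(U, d) = U . U

def adjoint_semi_analytical(d, eps=1e-6):
    U = solve_primal(d)
    K = tangent(U, d)
    dG_dU = 2.0 * U                               # known analytically
    Lam = np.linalg.solve(K.T, -dG_dU)            # adjoint problem (16)
    grad = np.zeros(len(d))
    for i in range(len(d)):
        e_i = np.zeros(len(d))
        e_i[i] = 1.0
        dR_ddi = residual(U, d + eps*e_i) / eps   # finite difference, assuming R(U, d) = 0
        grad[i] = Lam @ dR_ddi                    # (17) with dG/dd_i = 0
    return grad

d = np.array([2.0, 1.0])
grad = adjoint_semi_analytical(d)
print(grad)
```

Here a single adjoint solve serves all design variables; each additional design variable costs only one perturbed residual evaluation and a dot product.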

3 Transient nonlinear problems

A first-order transient problem is expressed in residual form as

$$\begin{array}{@{}rcl@{}} \textbf{R}(\textbf{U}(t,\textbf{d}),\dot{\textbf{U}}(t,\textbf{d}), \textbf{d}) &=&\textbf{0} , \end{array} $$

(19a)

$$\begin{array}{@{}rcl@{}} \textbf{U}(0) &=& \textbf{U}^{0} , \end{array} $$

(19b)

where we note the design dependencies as in (7), t ∈ [0,t_f] denotes time and t_f the terminal analysis time. The response function for this system is expressed as

$$ F(\textbf{d})= {\int}_{0}^{t_{f}} G({\textbf{U}}(t,\textbf{d}),\dot{\textbf{U}}(t,\textbf{d}),\textbf{d}) \mathrm{d}t . $$

(20)

Our goal is to compute the sensitivity in an efficient, accurate and easy manner, i.e., we want to compute

$$ \frac{\mathrm{D} F}{\mathrm{D} d_{i}}= {\int}_{0}^{t_{f}} \left( \frac{\partial G}{\partial\textbf{U}} \frac{\partial \textbf{U}}{\partial d_i} + \frac{\partial G}{\partial \dot{\textbf{U}}} \frac{\partial \dot{\textbf{U}}}{\partial d_i} + \frac{\partial G}{\partial d_{i}} \right) \mathrm{d}t . $$

(21)

For the sensitivity analysis, we can implement the direct method whereby we differentiate (19a) and (19b) to define the pseudo problem

$$\begin{array}{@{}rcl@{}} \frac{\partial \textbf{R}}{\partial\dot{\textbf{U}}} \frac{\partial \dot{\textbf{U}}}{\partial d_i} + \frac{\partial \textbf{R}}{\partial{\textbf{U}}} \frac{\partial \textbf{U}}{\partial d_i} & =& - \frac{\partial \textbf{R}}{\partial d_{i}} , \end{array} $$

(22a)

$$\begin{array}{@{}rcl@{}} \frac{\partial \textbf{U}(0)}{\partial d_i} &=& \frac{\partial \textbf{U}^{0}}{\partial d_i} , \end{array} $$

(22b)

which we solve for ∂U/∂d_i and $\partial \dot {\textbf {U}}/\partial d_{i}$ and then we evaluate (21). Alternatively, we can implement the adjoint approach, whereby we utilize (22a) to write (21) as

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}} &=& {\int}_{0}^{t_{f}} \left( \frac{\partial G}{\partial\textbf{U}} \frac{\partial \textbf{U}}{\partial d_i} + \frac{\partial G}{\partial \dot{\textbf{U}}} \frac{\partial \dot{\textbf{U}}}{\partial d_i} + \frac{\partial G}{\partial d_{i}} \right) \mathrm{d}t \\ &&+ {\int}_{0}^{t_{f}} \boldsymbol{ \lambda}^{\top} \left( \frac{\partial \textbf{R}}{\partial{\textbf{U}}} \frac{\partial \textbf{U}}{\partial d_i} + \frac{\partial \textbf{R}}{\partial\dot{\textbf{U}}} \frac{\partial \dot{\textbf{U}}}{\partial d_i} + \frac{\partial \textbf{R}}{\partial d_{i}}\right) \mathrm{d}t , \end{array} $$

(23)

where again λ is the arbitrary adjoint vector. Integrating by parts and rearranging (23) yields

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}}&=& \left.{\int}_{0}^{t_{f}} \left( \frac{\partial G}{\partial d_{i}} + {\boldsymbol{ \lambda}^{\top}} \frac{\partial \textbf{R}}{\partial d_{i}} \right) \mathrm{d}t - \left( \frac{\partial G}{\partial \dot{\textbf{U}}} + \boldsymbol{ \lambda}^{\top} \frac{\partial \textbf{R}}{\partial\dot{\textbf{U}}}\right) \frac{\partial \textbf{U}}{\partial d_i} \right\rvert_{t = 0}\\ &&+ {\int}_{0}^{t_{f}} \frac{\partial \textbf{U}}{\partial d_i}^{\top} \left( \frac{\partial {G}}{\partial \textbf{U}}^{\top} -\frac{\mathrm{d}}{\mathrm{d}{t}} \left( \frac{\partial {G}^{\top}}{\partial \dot{\textbf{U}}}\right) + \frac{\partial \textbf{R}^{\top}}{\partial \textbf{U}} \boldsymbol{ \lambda}\right.\\ &&-\left.\frac{\text{d}}{\text{d}t} \left( \frac{\partial \textbf{R}^{\top}}{\partial \dot{\textbf{U}}} \boldsymbol{ \lambda}\right) \right) \mathrm{d}t {\left.+ \frac{\partial \textbf{U}}{\partial d_i}^{\top} \left( \frac{\partial G}{\partial \dot{\textbf{U}}}^{\top} + \frac{\partial \textbf{R}}{\partial \dot{\textbf{U}}}^{\top} \boldsymbol{ \lambda}\right)\right\rvert_{t=t_{f}}} . \end{array} $$

(24)

Next, a time mapping is introduced, i.e., we define Λ such that

$$ \boldsymbol{\Lambda}(t_{f}-t)=\boldsymbol{\lambda}(t) , $$

(25)

and hence

$$ -\dot{\boldsymbol{\Lambda}}(t_{f}-t)=\dot{\boldsymbol{\lambda}}(t) , $$

(26)

substituting the above into (24) renders

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}}&=&\left. {\int}_{0}^{t_{f}} \left( \frac{\partial G}{\partial d_{i}} + {\boldsymbol{\Lambda}^{\top}}\frac{\partial \textbf{R}}{\partial d_{i}} \right) \mathrm{d}t - \left( \frac{\partial G}{\partial \dot{\textbf{U}}} + \boldsymbol{\Lambda}^{\top} \frac{\partial \textbf{R}}{\partial\dot{\textbf{U}}}\right) \frac{\partial \textbf{U}}{\partial d_i} \right\rvert_{t = 0}\\ &&+ {\int}_{0}^{t_{f}} \frac{\partial \textbf{U}}{\partial d_i}^{\top} \left( {\frac{\partial G}{\partial\textbf{U}}^{\top}} -\frac{\text{d}}{\text{d}t} \left( \frac{\partial {G}^{\top}}{\partial \dot{\textbf{U}}}\right) + \frac{\partial \textbf{R}^{\top}}{\partial \textbf{U}} \boldsymbol{\Lambda} -\frac{\text{d}}{\text{d}t} \left( \frac{\partial \textbf{R}^{\top}}{\partial \dot{\textbf{U}}}\right) \boldsymbol{\Lambda} \right. \\ &&+\left.\left. \frac{\partial \textbf{R}^{\top}}{\partial \dot{\textbf{U}}} \dot{\boldsymbol{\Lambda}} \right) \mathrm{d}t+ \frac{\partial \textbf{U}}{\partial d_i}^{\top} \left( \frac{\partial G}{\partial \dot{\textbf{U}}}^{\top} + \frac{\partial \textbf{R}}{\partial \dot{\textbf{U}}}^{\top} \boldsymbol{\Lambda}\right)\right\rvert_{t=t_{f}} . \end{array} $$

(27)

where all quantities are evaluated at time t except Λ which is evaluated at t_f − t. We can annihilate the terms containing the implicitly defined derivative ∂U/∂d_i by requiring Λ to solve

$$\begin{array}{@{}rcl@{}} &&\frac{\partial \textbf{R}^{\top}}{\partial \dot{\textbf{U}}} \dot{ \boldsymbol{\Lambda} } + \left( \frac{\partial \textbf{R}}{\partial \textbf{U}}^{\top} -\frac{\text{d}}{\text{d}t} \left( \frac{\partial \textbf{R}^{\top}}{\partial \dot{\textbf{U}}}\right)\right) \boldsymbol{\Lambda} \\ &=& - \frac{\partial {G}}{\partial \textbf{U}}^{\top} +\frac{\text{d}}{\text{d}t} \left( \frac{\partial {G}^{\top}}{\partial \dot{\textbf{U}}}\right) , \end{array} $$

(28a)

$$ \left.\frac{\partial \textbf{R}}{\partial \dot{\textbf{U}}}^{\top}\right\rvert_{t=t_{f}} \boldsymbol{\Lambda} (0) = \left.-\frac{\partial G}{\partial \dot{\textbf{U}}}^{\top} \right\rvert_{t=t_{f}} . $$

(28b)

Using this Λ, DF/Dd_i reduces to the known quantity

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}}&=& {\int}_{0}^{t_{f}} \left( \frac{\partial G}{\partial d_{i}} + \boldsymbol{\Lambda}^{\top} \frac{\partial \textbf{R}}{\partial d_{i}} \right) \mathrm{d}t \\ &&-\left. \left( \frac{\partial G}{\partial \dot{\textbf{U}}} + \boldsymbol{\Lambda}^{\top} \frac{\partial \textbf{R}}{\partial\dot{\textbf{U}}}\right) \frac{\partial \textbf{U}}{\partial d_i} \right\rvert_{t = 0} . \end{array} $$

(29)

where again all quantities are evaluated at time t except Λ which is evaluated at t_f − t.
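As a small worked illustration of (28) and (29), which is not part of the original text, consider a hypothetical scalar decay problem with residual $R = \dot{U} + d\,U$, initial condition $U(0)=U^{0}$ independent of $d$, and response integrand $G = U$. Then ∂R/∂U̇ = 1, ∂R/∂U = d, ∂R/∂d = U, ∂G/∂U = 1 and ∂G/∂U̇ = ∂G/∂d = 0, so the adjoint problem (28a) and (28b) reduces to

$$ \dot{\Lambda} + d\,\Lambda = -1 , \qquad \Lambda(0) = 0 \quad\Rightarrow\quad \Lambda(s) = -\frac{1-e^{-d s}}{d} . $$

With the primal solution $U(t)=U^{0}e^{-d t}$ and ∂U(0)/∂d = 0, the sensitivity (29) becomes

$$ \frac{\mathrm{D} F}{\mathrm{D} d} = {\int}_{0}^{t_{f}} \Lambda(t_{f}-t)\, U(t)\, \mathrm{d}t = -\frac{U^{0}}{d} {\int}_{0}^{t_{f}} \left( e^{-d t} - e^{-d t_{f}} \right) \mathrm{d}t = \frac{U^{0}}{d^{2}} \left( d\, t_{f}\, e^{-d t_{f}} - 1 + e^{-d t_{f}} \right) , $$

which agrees with differentiating the closed-form response $F = U^{0}(1-e^{-d t_{f}})/d$ directly with respect to $d$.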

3.1 Discretization

To solve the above, we discretize in time using an explicit/implicit parameter 0 ≤ α ≤ 1 so that

$$ \textbf{U}^{n}=\textbf{U}^{n-1} + \left( \alpha \dot{\textbf{U}}^{n} + (1-\alpha)\dot{\textbf{U}}^{n-1}\right){\Delta} t , $$

(30)

where Uⁿ = U(t_n) and $\dot {\textbf {U}}^{n}=\dot {\textbf {U}}(t_{n})$.^{Footnote 1} We then solve (19a) at the discrete times t_n. Finally, the integrals in (20) and (21) are evaluated as

$$ F=\sum\limits_{n = 0}^{N} \mu_{n} G^{n}(\textbf{U}^{n},\dot{\textbf{U}}^{n}, \textbf{d}) , $$

(31)

$$ \frac{\mathrm{D} F}{\mathrm{D} d_{i}}=\sum\limits_{n = 0}^{N} \mu_{n} \left( \frac{\partial G^{n}}{\partial \textbf{U}} \frac{\partial \textbf{U}^{n}}{\partial d_i} + \frac{\partial G^{n}}{\partial \dot{\textbf{U}}} \frac{\partial \dot{\textbf{U}}^{n}}{\partial d_{i}} + \frac{\partial G^{n}} {\partial d_{i}}\right) , $$

(32)

where, e.g., $G^{n}=G(\textbf {U}^{n}, \dot {\textbf {U}}^{n}, \textbf {d})$ and the coefficient μ_n depends on the summation scheme, e.g., for the trapezoidal rule 2μ₀ = μ₁ = μ₂ = ... = μ_{N−1} = 2μ_N = Δt.

3.2 Primal analysis

The initial condition U⁰ is given, but $\dot {\textbf {U}}^{0}$ is needed in (30) to obtain U¹. To these ends, we use (19a), i.e., we use the Newton-Raphson method to solve

$$ \textbf{R}^{0}(\textbf{U}^{0},\dot{\textbf{U}}^{0},\textbf{d}) =\textbf{0} , $$

(33)

for $\dot {\textbf {U}}^{0}$. The procedure is akin to that which we use to evaluate U in Section 2. Here $\textbf {K}^{0}={\partial \textbf {R}^{0}}/{\partial \dot {\textbf {U}}}$ is the tangent matrix. Having U⁰ and $\dot {\textbf {U}}^{0}$, we compute the first term in (31), i.e., $F= \mu _{0} G^{0}(\textbf {U}^{0},\dot {\textbf {U}}^{0}, \textbf {d})$.

Now we commence our analysis. At each time step t_n, we insert Uⁿ of (30), in (19a) and solve the resulting equation for $\dot {\textbf {U}}^{n}$. Again, we use Newton’s method for this solution, cf. Section 2, where we introduce the tangent stiffness matrix $\textbf {K}^{n}={\partial \textbf {R}^{n}}/{\partial \dot {\textbf {U}}}+\alpha {\Delta } t {\partial \textbf {R}^{n}}/{\partial \textbf {U}}$. After convergence, Uⁿ is updated as per (30) and F is updated as per (31), i.e.,

$$ F \leftarrow F+ \mu_{n} G^{n}(\textbf{U}^{n},\dot{\textbf{U}}^{n}, \textbf{d}) , $$

(34)

where the symbol ← represents the update assignment.

The time is then incremented and the process repeats until the terminal time t_f. A flow chart describing these computations is provided in Fig. 1 wherein multiple functions F are evaluated for n = 1,2,...,N.
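The primal time march of Sections 3.1 and 3.2 can be sketched as follows. This is a minimal illustration, assuming a hypothetical first-order system R = U̇ + d₀U + d₁U³ (componentwise) and a response integrand G = U·U; the function names, step size, and the symbol `V` for the rate U̇ are assumptions made for the example.

```python
import numpy as np

def residual(U, V, d):
    # Hypothetical first-order system; V stands for dU/dt.
    return V + d[0]*U + d[1]*U**3

def dR_dU(U, V, d):
    return np.diag(d[0] + 3.0*d[1]*U**2)

def dR_dV(U, V, d):
    return np.eye(len(U))

def G(U, V, d):
    return float(U @ U)

def newton(res, tan, V, tol=1e-12, max_iter=30):
    for _ in range(max_iter):
        r = res(V)
        if np.linalg.norm(r) <= tol:
            break
        V = V - np.linalg.solve(tan(V), r)
    return V

def primal(d, U0, alpha=0.5, dt=0.01, N=100):
    U = np.array(U0, dtype=float)
    # Initial rate from R(U^0, V^0, d) = 0, cf. (33), with tangent K^0 = dR/dV.
    V = newton(lambda W: residual(U, W, d), lambda W: dR_dV(U, W, d), np.zeros_like(U))
    mu = np.full(N + 1, dt)
    mu[0] = mu[-1] = 0.5*dt                       # trapezoidal weights
    F = mu[0]*G(U, V, d)
    for n in range(1, N + 1):
        U_known = U + (1.0 - alpha)*dt*V          # part of (30) known from step n-1
        res = lambda W: residual(U_known + alpha*dt*W, W, d)
        # Tangent K^n = dR/dV + alpha*dt*dR/dU
        tan = lambda W: (dR_dV(U_known + alpha*dt*W, W, d)
                         + alpha*dt*dR_dU(U_known + alpha*dt*W, W, d))
        V = newton(res, tan, V)
        U = U_known + alpha*dt*V                  # update (30)
        F += mu[n]*G(U, V, d)                     # update (34)
    return F, U

F_nl, U_nl = primal(np.array([1.0, 0.5]), [1.0, 0.5])
print(F_nl, U_nl)
```

Setting d₁ = 0 recovers linear decay, for which the scheme with α = 0.5 can be checked against the exact exponential solution.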

3.3 Direct differentiation

For the direct differentiation, we discretize ∂U/∂d_i like U, i.e.,

$$ \frac{\partial \textbf{U}^{n}}{\partial d_i}= \frac{\partial \textbf{U}^{n-1}}{\partial d_i} + \left( \alpha \frac{\partial \dot{\textbf{U}}^{n}}{\partial d_i} + (1-\alpha) \frac{\partial \dot{\textbf{U}}^{n-1}}{\partial d_i}\right){\Delta} t . $$

(35)

Note that the initial condition ∂U⁰/∂d_i is known, but ${\partial \dot {\textbf {U}}^{0}}/{\partial d_{i}}$ is not. So before commencing, we must obtain ${\partial \dot {\textbf {U}}^{0}}/{\partial d_{i}}$ like we did $\dot {\textbf {U}}^{0}$. To these ends, we differentiate (33) to obtain the linear equation

$$ \textbf{K}^{0} \frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i} = -\left( \frac{\partial {\textbf{R}}^{0}}{\partial \textbf{U}} \frac{\partial \textbf{U}^{0}}{\partial d_i} + \frac{\partial {\textbf{R}}^{0}}{\partial d_{i}} \right) , $$

(36)

which we solve for ${\partial \dot {\textbf {U}}^{0}}/{\partial d_{i}}$. Having ∂U⁰/∂d_i and ${\partial \dot {\textbf {U}}^{0}}/{\partial d_{i}}$, we update DF/Dd_i as per (32), i.e.,

$$ \frac{\mathrm{D} F}{\mathrm{D} d_{i}} = \mu_{0} \left( \frac{\partial G^{0}}{\partial \textbf{U}} \frac{\partial \textbf{U}^{0}}{\partial d_i} + \frac{\partial G^{0}}{\partial \dot{\textbf{U}}}\frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i} + \frac{\partial G^{0}}{\partial d_{i}} \right) . $$

(37)

Now we march in time evaluating ∂Uⁿ/∂d_i and ${\partial \dot {\textbf {U}}^{n}}/{\partial d_{i}}$ as we did to compute Uⁿ and $\dot {\textbf {U}}^{n}$. Equations (22a) and (35) render the linear equation

$$\begin{array}{@{}rcl@{}} \textbf{K}^{n}(\textbf{U}^{n},\dot{\textbf{U}}^{n}, \textbf{d}) \frac{\partial \dot{\textbf{U}}^{n}}{\partial d_i} &=&-\left( \frac{\partial \textbf{R}^{n}}{\partial \textbf{U}} \left( \frac{\partial \textbf{U}^{n-1}}{\partial d_i}\right.\right.\\ &&+\left.\left.(1-\alpha){\Delta} t\frac{\partial \dot{\textbf{U}}^{n-1}}{\partial d_i}\right)+ \frac{\partial \textbf{R}^{n}}{\partial d_{i}} \right) , \end{array} $$

(38)

which we solve for ${\partial \dot {\textbf {U}}^{n}}/{\partial d_{i}}$. Next, we update ∂Uⁿ/∂d_i as per (35) and DF/Dd_i as per (32)

$$ \frac{\mathrm{D} F}{\mathrm{D} d_{i}} \leftarrow \frac{\mathrm{D} F}{\mathrm{D} d_{i}}+ \mu_{n} \left( \frac{\partial G^{n}}{\partial \textbf{U}} \frac{\partial \textbf{U}^{n}}{\partial d_i} + \frac{\partial G^{n}}{\partial \dot{\textbf{U}}} \frac{\partial \dot{\textbf{U}}^{n}}{\partial d_{i}} + \frac{\partial G^{n}}{\partial d_{i}} \right) . $$

(39)

We continue marching in this manner for all t_n. Insofar as our sensitivity analysis algorithm is concerned, we insert nodes A and B from Fig. 2 into the primal analysis flowchart of Fig. 1.
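The marching scheme of (35)–(39) can be sketched on a small model problem. The following is a minimal illustration, not the paper's implementation: the linear residual R = M U̇ + d A U, the matrices `M` and `A`, the initial state `U_INIT`, the integrand G = (U·U + U̇·U̇)/2, the trapezoidal weights `MU`, and the tangent form Kⁿ = ∂R/∂U̇ + αΔt ∂R/∂U are all assumptions made for this example. The result is compared with a central finite difference of F.

```python
import numpy as np

# Hypothetical model problem (not the paper's fin example):
# R(U, Udot, d) = M Udot + d A U = 0 with a scalar design variable d.
M = np.eye(2)
A = np.array([[2.0, -1.0], [-1.0, 2.0]])
U_INIT = np.array([1.0, 0.5])          # U^0, independent of d here
ALPHA, N_STEPS, T_END = 0.5, 200, 1.0
DT = T_END / N_STEPS
MU = np.full(N_STEPS + 1, DT)          # trapezoidal weights mu_n (assumed)
MU[0] = MU[-1] = 0.5 * DT


def direct_sensitivity(d):
    """Return (F, DF/Dd) for G = (U.U + Udot.Udot)/2 by marching (35)-(39)."""
    dR_dU, dR_dUdot = d * A, M
    K = dR_dUdot + ALPHA * DT * dR_dU  # assumed form of the tangent K^n
    U = U_INIT.copy()
    Udot = np.linalg.solve(dR_dUdot, -dR_dU @ U)              # from R^0 = 0
    dU = np.zeros_like(U)                                      # dU^0/dd = 0
    dUdot = np.linalg.solve(dR_dUdot, -(dR_dU @ dU + A @ U))   # (36)
    F = MU[0] * 0.5 * (U @ U + Udot @ Udot)
    DF = MU[0] * (U @ dU + Udot @ dUdot)                       # (37)
    for n in range(1, N_STEPS + 1):
        U_pred = U + (1.0 - ALPHA) * DT * Udot
        dU_pred = dU + (1.0 - ALPHA) * DT * dUdot
        Udot = np.linalg.solve(K, -dR_dU @ U_pred)             # primal step
        U = U_pred + ALPHA * DT * Udot
        dUdot = np.linalg.solve(K, -(dR_dU @ dU_pred + A @ U))  # (38)
        dU = dU_pred + ALPHA * DT * dUdot                       # (35)
        F += MU[n] * 0.5 * (U @ U + Udot @ Udot)
        DF += MU[n] * (U @ dU + Udot @ dUdot)                   # (39)
    return F, DF


F, DF = direct_sensitivity(1.0)
h = 1e-6
DF_fd = (direct_sensitivity(1.0 + h)[0] - direct_sensitivity(1.0 - h)[0]) / (2 * h)
print(F, DF, DF_fd)
```

Because the sensitivity recursion differentiates the discrete update exactly, the two derivative values should agree up to the finite difference error. For a nonlinear R, the primal step would require Newton iterations, but the sensitivity solve would remain linear and reuse the converged tangent.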

For the semi-analytical method we use the approximations

$$ \frac{\partial {\textbf{R}}^{0}}{\partial \textbf{U}} \frac{\partial \textbf{U}^{0}}{\partial d_i} + \frac{\partial {\textbf{R}}^{0}}{\partial d_{i}} \approx \frac{1}{\epsilon} \textbf{R}^{0} \left( \textbf{U}^{0}+\epsilon \frac{\partial \textbf{U}^{0}}{\partial d_i}, \dot{\textbf{U}}^{0}, \textbf{d}+\epsilon \textbf{e}_i \right) , $$

(40)

$$\begin{array}{@{}rcl@{}} \frac{\partial \textbf{R}^{n}}{\partial \textbf{U}} \left( \frac{\partial \textbf{U}^{n-1}}{\partial d_i}+(1-\alpha){\Delta} t\frac{\partial \dot{\textbf{U}}^{n-1}}{\partial d_i}\right)+ \frac{\partial \textbf{R}^{n}}{\partial d_{i}} \approx \\ \quad \frac{1}{\epsilon} \textbf{R}^{n} \left( \textbf{U}^{n}+\epsilon\left( \frac{\partial \textbf{U}^{n-1}}{\partial d_i}+(1-\alpha){\Delta} t\frac{\partial \dot{\textbf{U}}^{n-1}}{\partial d_i}\right), \right.\\ \left. \vphantom{\frac{1}{\epsilon} \textbf{R}^{n} \left( \textbf{U}^{n}+\epsilon\left( \frac{\partial \textbf{U}^{n-1}}{\partial d_i}+(1-\alpha){\Delta} t\frac{\partial \dot{\textbf{U}}^{n-1}}{\partial d_i}\right), \right)} \dot{\textbf{U}}^{n}, \textbf{d}+\epsilon \textbf{e}_i \right) , \end{array} $$

(41)

$$\begin{array}{@{}rcl@{}} &&\frac{\partial G^{n}}{\partial \textbf{U}} \frac{\partial \textbf{U}^{n}}{\partial d_i} + \frac{\partial G^{n}}{\partial \dot{\textbf{U}}}\frac{\partial \dot{\textbf{U}}^{n}}{\partial d_{i}} + \frac{\partial G^{n}}{\partial d_{i}} \\ \quad &\approx&\frac{1}{\epsilon} \left( G^{n} \left( \textbf{U}^{n}+\epsilon \frac{\partial \textbf{U}^{n}}{\partial d_i} , \dot{\textbf{U}}^{n}+\epsilon \frac{\partial \dot{\textbf{U}}^{n}}{\partial d_i} , \textbf{d}+\epsilon \textbf{e}_i \right) \right.\\ &&-\left.G^{n}(\textbf{U}^{n},\dot{\textbf{U}}^{n}, \textbf{d})\right) , \end{array} $$

(42)

in (36), (37), (38), and (39). Again, we assume the user can code ∂Gⁿ/∂U, $\partial G^{n}/ \partial \dot {\textbf {U}}$ and ∂Gⁿ/∂d, so we do not use (42). As mentioned before, semi-analytical sensitivities carry the error due to 𝜖_R, truncation, and round-off.
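The pseudo load approximation of (41) amounts to a single perturbed residual evaluation. The sketch below illustrates this under stated assumptions: the nonlinear `residual`, the matrix `A`, the state values, and the direction `v` (which stands in for ∂Uⁿ⁻¹/∂d_i + (1−α)Δt ∂U̇ⁿ⁻¹/∂d_i) are hypothetical and chosen only so that the analytical value is easy to write down.

```python
import numpy as np

A = np.array([[2.0, -1.0], [-1.0, 2.0]])


def residual(U, Udot, d):
    """Hypothetical nonlinear residual R(U, Udot, d), for illustration only."""
    return Udot + d[0] * (A @ U) + d[1] * U**3


def pseudo_load_sa(U, Udot, d, i, v, eps=1e-6):
    """Forward-difference estimate of (dR/dU) v + dR/dd_i, cf. (41).

    As in (41), the unperturbed residual is not subtracted because it is
    assumed to vanish at the converged state (U, Udot, d).
    """
    d_pert = d.copy()
    d_pert[i] += eps
    return residual(U + eps * v, Udot, d_pert) / eps


U = np.array([0.8, 0.6])
d = np.array([1.0, 0.5])
Udot = -(d[0] * (A @ U) + d[1] * U**3)   # chosen so that R(U, Udot, d) = 0
v = np.array([0.3, -0.2])

# Analytical pseudo load for i = 0: (dR/dU) v + dR/dd_0
exact = (d[0] * A + 3.0 * d[1] * np.diag(U**2)) @ v + A @ U
approx = pseudo_load_sa(U, Udot, d, 0, v)
err = np.max(np.abs(approx - exact))
print(err)
```

The error is proportional to 𝜖 (truncation) for moderate perturbations. If the primal residual is not converged, the omitted term R/𝜖 is no longer negligible, which is the 𝜖_R contribution mentioned above.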

3.4 Adjoint method using differentiate-then-discretize

In the differentiate-then-discretize approach, one obtains the adjoint problem (cf. (28a) and (28b)) and the sensitivity (cf. (29)) at the continuous time level. Now we use numerical time integration to compute

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}}&=& \sum\limits_{n = 0}^{N}\mu_{N-n} \left( \frac{\partial G^{N-n}}{\partial d_{i}} + \boldsymbol{\Lambda}^{n\top} \frac{\partial \textbf{R}^{N-n}}{\partial d_{i}} \right) \\ &&- \left( \frac{\partial G^{0}}{\partial \dot{\textbf{U}}} + {\boldsymbol{\Lambda}^{N}}^{\top} \frac{\partial {\textbf{R}}^{0}}{\partial \dot{\textbf{U}}}\right) \frac{\partial \textbf{U}^{0}}{\partial d_i} . \end{array} $$

(43)

Before we evaluate the above, we must solve the adjoint problem of (28a) and (28b). To do this, we discretize the adjoint variable Λ in the same way as U, i.e.,

$$ \boldsymbol{\Lambda}^{n}=\boldsymbol{\Lambda}^{n-1} + \left( \alpha \dot{\boldsymbol{\Lambda}}^{n} + (1-\alpha) \dot{\boldsymbol{\Lambda}}^{n-1}\right){\Delta} t . $$

(44)

To reuse Kⁿ as in the direct method, we restrict our adjoint discussion to those R such that

$$ \frac{\text{d}}{\text{d}t}\left( \frac{\partial \textbf{R}}{{\partial \dot{\textbf{U}}}}\right) =\textbf{0} . $$

(45)

Notably, $\partial \textbf {R}/ \partial \dot {\textbf {U}}$ is typically interpreted as a mass matrix, so this restriction requires a constant mass matrix, which is fairly common.

Referring to (28b), we initially solve the adjoint problem

$$ {\frac{\partial \textbf{R}^{N}}{\partial \dot{\textbf{U}}}}^{\top} \boldsymbol{\Lambda}^{0} = -\frac{\partial G^{N}}{\partial \dot{\textbf{U}}}^{\top} , $$

(46)

for Λ⁰ and then solve (28a) with Λ⁰ to evaluate $\dot {\boldsymbol {\Lambda }}^{0}$, i.e.,

$$\begin{array}{@{}rcl@{}} {\frac{\partial \textbf{R}^{N}}{\partial \dot{\textbf{U}}}}^{\top} \dot{\boldsymbol{\Lambda}}^{0}=- \frac{\partial \textbf{R}^{N}}{\partial \textbf{U}}^{\top} \boldsymbol{\Lambda}^{0} - \frac{\partial {G}^{N}}{\partial \textbf{U}}^{\top} \\+ \left( \frac{\partial^{2} {G}^{N}}{\partial \dot{\textbf{U}} \partial \textbf{U}} \dot{\textbf{U}}^{N} \right)^{\top} + \left( \frac{\partial^{2} {G}^{N}}{\partial \dot{\textbf{U}}^{2}} \ddot{\textbf{U}}^{N} \right)^{\top} . \end{array} $$

(47)

Note that (46) and (47) do not use the tangent stiffness matrix from the primal problem. Next, we compute

$$ \frac{\mathrm{D} F}{\mathrm{D} d_{i}} = \mu_{N} \left( \frac{\partial G^{N}}{\partial d_{i}} + \boldsymbol{\Lambda}^{0\top} \frac{\partial \textbf{R}^{N}}{\partial d_{i}} \right) , $$

(48)

cf. (29), (31), and (32). Time marching now commences for the remaining time steps, i.e., for n = 1,2,...,N − 1 we evaluate $\dot {\boldsymbol {\Lambda }}^{n}$ by solving

$$\begin{array}{@{}rcl@{}} &&{\textbf{K}^{N-n}}^{\top} \dot{\boldsymbol{\Lambda}}^{n} =- \frac{\partial {G}^{N-n}}{\partial \textbf{U}}^{\top} \\ &&+ \left( \frac{\partial^{2} {G}^{N-n}}{\partial \dot{\textbf{U}} \partial \textbf{U}} \dot{\textbf{U}}^{N-n} \right)^{\top}+ \left( \frac{\partial^{2} {G}^{N-n}}{\partial \dot{\textbf{U}}^{2}} \ddot{\textbf{U}}^{N-n} \right)^{\top} \\ &&-\frac{\partial \textbf{R}^{N-n}}{\partial \textbf{U}}^{\top} \left( \boldsymbol{\Lambda}^{n-1} + (1-\alpha) {\Delta} t \dot{\boldsymbol{\Lambda}}^{n-1}\right) , \end{array} $$

(49)

where $\textbf{K}^{N-n}$ is the tangent stiffness matrix of the primal problem. Then, we compute Λⁿ as per (44) and update

$$ \frac{\mathrm{D} F}{\mathrm{D} d_{i}} \leftarrow \frac{\mathrm{D} F}{\mathrm{D} d_{i}} + \mu_{N-n} \left( \frac{\partial G^{N-n}}{\partial d_{i}} + \boldsymbol{\Lambda}^{n\top} \frac{\partial \textbf{R}^{N-n}}{\partial d_{i}} \right) . $$

(50)

Finally, we solve

$$\begin{array}{@{}rcl@{}} &&\left( \frac{\partial {\textbf{R}}^{0}}{\partial \dot{\textbf{U}}} +\alpha{\Delta} t \frac{\partial {\textbf{R}}^{0}}{\partial \textbf{U}}\right)^{\top} \dot{\boldsymbol{\Lambda}}^{N} =- \frac{\partial {G}^{0}}{\partial \textbf{U}}^{\top} \\ &&+ \left( \frac{\partial^{2} {G}^{0}}{\partial \dot{\textbf{U}} \partial \textbf{U}} \dot{\textbf{U}}^{0} \right)^{\top}+ \left( \frac{\partial^{2} {G}^{0}}{\partial \dot{\textbf{U}}^{2}} \ddot{\textbf{U}}^{0} \right)^{\top} \\ &&-\frac{\partial {\textbf{R}}^{0}}{\partial \textbf{U}}^{\top} \left( \boldsymbol{\Lambda}^{N-1} + (1-\alpha) {\Delta} t \dot{\boldsymbol{\Lambda}}^{N-1}\right) , \end{array} $$

(51)

for $\dot {\boldsymbol {\Lambda }}^{N}$, evaluate Λ^N with (44), and update

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}} \leftarrow \frac{\mathrm{D} F}{\mathrm{D} d_{i}} &+& \mu_{0} \frac{\partial G^{0}}{\partial d_{i}} - \frac{\partial G^{0}}{\partial \dot{\textbf{U}}}\frac{\partial \textbf{U}^{0}}{\partial d_i} \\ &+& {\boldsymbol{\Lambda}^{N}}^{\top}\left( \mu_{0} \frac{\partial {\textbf{R}}^{0}}{\partial d_{i}} - \frac{\partial {\textbf{R}}^{0}}{\partial \dot{\textbf{U}}}\frac{\partial \textbf{U}^{0}}{\partial d_i} \right) . \end{array} $$

(52)

As with (47), (51) does not use the tangent stiffness matrix from the primal analysis.

The second derivatives $\ddot {\textbf {U}}^{n}$ in (47), (49), and (51) can be computed using the known first derivatives $\dots \dot {\textbf {U}}^{n-1},\dot {\textbf {U}}^{n},\dot {\textbf {U}}^{n + 1},\dots $ and Δt, with a second-order forward difference for $\ddot {\textbf {U}}^{0}$, a backward difference for $\ddot {\textbf {U}}^{N}$, and central differences for any other $\ddot {\textbf {U}}^{n}$ (cf. Figure 3). The adjoint sensitivity analysis is executed after the primal analysis is concluded. Thus, we describe this algorithm by inserting node C of Fig. 3 into the flowchart of Fig. 1.
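The finite difference evaluation of $\ddot {\textbf {U}}^{n}$ from the stored rates can be sketched as follows. The specific three-point stencils are an assumption (the standard second-order one-sided and central formulas on a uniform time grid); the function name `second_rates` and the test signal are hypothetical.

```python
import numpy as np


def second_rates(Udot, dt):
    """Estimate Uddot^n from stored Udot^n of shape (N + 1, ndof), N >= 2."""
    Uddot = np.empty_like(Udot, dtype=float)
    # Second-order forward difference at n = 0
    Uddot[0] = (-3.0 * Udot[0] + 4.0 * Udot[1] - Udot[2]) / (2.0 * dt)
    # Central differences at interior steps
    Uddot[1:-1] = (Udot[2:] - Udot[:-2]) / (2.0 * dt)
    # Second-order backward difference at n = N
    Uddot[-1] = (3.0 * Udot[-1] - 4.0 * Udot[-2] + Udot[-3]) / (2.0 * dt)
    return Uddot


t = np.linspace(0.0, 1.0, 11)
Udot = np.column_stack([t**2, np.sin(t)])   # two made-up rate histories
Uddot = second_rates(Udot, t[1] - t[0])
print(np.max(np.abs(Uddot[:, 1] - np.cos(t))))
```

These stencils are exact for rates that are quadratic in time, so the first column reproduces 2t to round-off, while the second column carries an O(Δt²) error.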

In the semi-analytical method, we impose the further restriction that ∂R/∂U is symmetric, so the term in the adjoint load of (47) can be approximated as

$$ \frac{\partial \textbf{R}^{N}}{\partial \textbf{U}}^{\top} \boldsymbol{\Lambda}^{0} \approx \frac{1}{\epsilon} \textbf{R}^{N} \left( \textbf{U}^{N}+\epsilon\boldsymbol{\Lambda}^{0} , \dot{\textbf{U}}{}^{N} , \textbf{d}\right) , $$

(53)

and the term in the adjoint load of (49) can be approximated as

$$\begin{array}{@{}rcl@{}} &&\frac{\partial \textbf{R}^{N-n}}{\partial \textbf{U}}^{\top} \left( \boldsymbol{\Lambda}^{n-1} + (1-\alpha) {\Delta} t \dot{\boldsymbol{\Lambda}}^{n-1}\right) \approx\\ &&~~\frac{1}{\epsilon} \textbf{R}^{N-n} \left( \textbf{U}^{N-n}+\epsilon \left( \boldsymbol{\Lambda}^{n-1} + (1-\alpha) {\Delta} t \dot{\boldsymbol{\Lambda}}^{n-1}\right), \right.\\ &&~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\left. \vphantom{\frac{1}{\epsilon} \textbf{R}^{N-n} \left( \textbf{U}^{N-n}+\epsilon \left( \boldsymbol{\Lambda}^{n-1} + (1-\alpha) {\Delta} t \dot{\boldsymbol{\Lambda}}^{n-1}\right), \right)} \dot{\textbf{U}}{}^{N-n} , \textbf{d}\right) . \end{array} $$

(54)

In regard to DF/Dd_i of (52), we use the approximation

$$\begin{array}{@{}rcl@{}} &&\mu_{0} \frac{\partial {\textbf{R}}^{0}}{\partial d_{i}} - \frac{\partial {\textbf{R}}^{0}}{\partial \dot{\textbf{U}}}\frac{\partial \textbf{U}^{0}}{\partial d_i} \\ &\approx&\frac{1}{\epsilon} \textbf{R} \left( \textbf{U}^{0}, \dot{\textbf{U}}{}^{0}-\epsilon \frac{\partial \textbf{U}^{0}}{\partial d_i} , \textbf{d} +\mu_{0}\epsilon \textbf{e}_i \right) . \end{array} $$

(55)

Finally, the derivative ∂Rⁿ/∂d_i in (48) and (50) is approximated as

$$ \frac{\partial \textbf{R}^{n}}{\partial d_{i}} \approx \frac{1}{\epsilon} \textbf{R} \left( \textbf{U}^{n}, \dot{\textbf{U}}{}^{n} , \textbf{d}+\epsilon \textbf{e}_i\right) . $$

(56)

Of course, the semi-analytical approximations exhibit the previously discussed errors.

Again, we assume the user can code ∂G/∂U, etc., as these would be time consuming to compute by finite differences.
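The role of the symmetry restriction in (53) and (54) can be seen in a small numerical sketch. Perturbing U along Λ yields the product (∂R/∂U)Λ, whereas the adjoint load requires (∂R/∂U)ᵀΛ; the two coincide only for a symmetric ∂R/∂U. The linear residual built by `make_residual`, the matrices `J_sym` and `J_asym`, and all numerical values below are hypothetical.

```python
import numpy as np


def make_residual(J):
    """Hypothetical linear residual R = Udot + d * J U with constant block J."""
    return lambda U, Udot, d: Udot + d * (J @ U)


def adjoint_load_sa(residual, U, Udot, d, lam, eps=1e-6):
    """Estimate (dR/dU)^T lam by perturbing U along lam, cf. (53)-(54).

    The finite difference actually returns (dR/dU) lam, which equals the
    transposed product only when dR/dU is symmetric. The subtracted term is
    zero at a converged state; it is kept so the sketch works at any state.
    """
    return (residual(U + eps * lam, Udot, d) - residual(U, Udot, d)) / eps


U, Udot, d = np.array([0.8, 0.6]), np.array([0.1, -0.2]), 1.0
lam = np.array([1.0, -2.0])
J_sym = np.array([[2.0, -1.0], [-1.0, 2.0]])
J_asym = np.array([[2.0, -1.5], [-0.5, 2.0]])

err_sym = np.max(np.abs(
    adjoint_load_sa(make_residual(J_sym), U, Udot, d, lam) - J_sym.T @ lam))
err_asym = np.max(np.abs(
    adjoint_load_sa(make_residual(J_asym), U, Udot, d, lam) - J_asym.T @ lam))
print(err_sym, err_asym)
```

In the symmetric case the error is at round-off level, while in the asymmetric case the error equals the action of the skew part of ∂R/∂U on Λ and does not shrink with 𝜖.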

3.5 Adjoint method using discretize-then-differentiate

In this second option of the adjoint method, we use (22a) and (35) to equivalently write (32) as

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}}&=&\sum\limits_{n = 0}^{N} \mu_{n} \left( \frac{\partial G^{n}}{\partial \textbf{U}} \frac{\partial \textbf{U}^{n}}{\partial d_i} + \frac{\partial G^{n}}{\partial \dot{\textbf{U}}}\frac{\partial \dot{\textbf{U}}^{n}}{\partial d_{i}} + \frac{\partial G^{n}}{\partial d_{i}} \right) \\ &&+\sum\limits_{n = 0}^{N}{\boldsymbol{\Lambda}^{n}}^{\top} \left( \frac{\partial \textbf{R}^{N-n}}{\partial \dot{\textbf{U}}}\frac{\partial \dot{\textbf{U}}^{N-n}}{\partial d_i} \right. \\ &&+\left.\frac{\partial \textbf{R}^{N-n}}{\partial \textbf{U}} \frac{\partial \textbf{U}^{N-n}}{\partial d_i} + \frac{\partial \textbf{R}}{\partial d_{i}}^{N-n} \right) \\ &&+\sum\limits_{n = 0}^{N-1}{\boldsymbol{\Phi}^{n}}^{\top}\left( \frac{\partial \textbf{U}^{N-n}}{\partial d_i}-\frac{\partial \textbf{U}^{N-n-1}}{\partial d_i}\right. \\ &&-\left. \left( \alpha \frac{\partial \dot{\textbf{U}}^{N-n}}{\partial d_i} + (1-\alpha) \frac{\partial \dot{\textbf{U}}^{N-n-1}}{\partial d_i}\right){\Delta} t \right) ,\\ \end{array} $$

(57)

where Λⁿ and Φⁿ are arbitrary adjoint vectors. Rearrangement subsequently yields

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}}&=&\sum\limits_{n = 0}^{N} \left( \mu_{N-n} \frac{\partial G^{N-n}}{\partial d_{i}} + {\boldsymbol{\Lambda}^{n}}^{\top} \frac{\partial \textbf{R}}{\partial d_{i}}^{N-n} \right) \\ &&+ \left( \mu_{0} \frac{\partial G^{0}}{\partial \textbf{U}} +{\boldsymbol{\Lambda}^{N}}^{\top} \frac{\partial {\textbf{R}}^{0}}{\partial \textbf{U}} -{\boldsymbol{\Phi}^{N-1}}^{\top} \right) \frac{\partial \textbf{U}^{0}}{\partial d_i} \\ &&+ \left( \mu_{0} \frac{\partial G^{0}}{\partial \dot{\textbf{U}}} +{\boldsymbol{\Lambda}^{N}}^{\top} \frac{\partial {\textbf{R}}^{0}}{\partial \dot{\textbf{U}}} - (1-\alpha){\Delta} t {\boldsymbol{\Phi}^{N-1}}^{\top} \right) \frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i} \\ &&+\sum\limits_{n = 1}^{N-1} \left( \mu_{N-n} \frac{\partial G^{N-n}}{\partial \textbf{U}} +{\boldsymbol{\Lambda}^{n}}^{\top} \frac{\partial \textbf{R}}{\partial \textbf{U}}^{N-n} \right. \\ &&\left. \vphantom{+{\sum}_{n = 1}^{N-1} \left( \mu_{N-n} \frac{\partial G^{N-n}}{\partial \textbf{U}} +{\boldsymbol{\Lambda}^{n}}^{\top} \frac{\partial \textbf{R}}{\partial \textbf{U}}^{N-n} \right)} -{\boldsymbol{\Phi}^{n-1}}^{\top} +{\boldsymbol{\Phi}^{n}}^{\top} \right) \frac{\partial \textbf{U}^{N-n}}{\partial d_i} \\ &&+ \sum\limits_{n = 1}^{N-1}\left( \mu_{N-n} \frac{\partial G^{N-n}}{\partial \dot{\textbf{U}}} +{\boldsymbol{\Lambda} ^{n}}^{\top} \frac{\partial \textbf{R}}{\partial \dot{\textbf{U}}}^{N-n} \right. \\ &&\left. 
\vphantom{+ {\sum}_{n = 1}^{N-1}\left( \mu_{N-n} \frac{\partial G^{N-n}}{\partial \dot{\textbf{U}}} +{\boldsymbol{\Lambda}^{n}}^{\top} \frac{\partial \textbf{R}}{\partial \dot{\textbf{U}}}^{N-n} \right)}- (1-\alpha){\Delta} t {\boldsymbol{\Phi}^{n-1}}^{\top}- \alpha {\Delta} t {\boldsymbol{\Phi}^{n}}^{\top} \right) \frac{\partial \dot{\textbf{U}}^{N-n}}{\partial d_i} \\ && +\left( \mu_{N} \frac{\partial G^{N}}{\partial \textbf{U}} +{\boldsymbol{\Lambda} ^{0}}^{\top} \frac{\partial \textbf{R}}{\partial \textbf{U}}^{N} +{\boldsymbol{\Phi}^{0}}^{\top} \right) \frac{\partial \textbf{U}^{N}}{\partial d_i} \\ &&+\left( \mu_{N} \frac{\partial G^{N}}{\partial \dot{\textbf{U}}} +{\boldsymbol{\Lambda}^{0}}^{\top} \frac{\partial \textbf{R}}{\partial \dot{\textbf{U}}}^{N} - \alpha {\Delta} t {\boldsymbol{\Phi}^{0}}^{\top} \right) \frac{\partial \dot{\textbf{U}}^{N}}{\partial d_i} . \end{array} $$

(58)

To annihilate the implicitly defined derivatives ∂U^N/∂d_i and $\partial \dot {\textbf {U}}^{N} / \partial d_{i}$, we first solve the adjoint problem

$$ {\textbf{K}^{N}}^{\top} {\boldsymbol{\Lambda}^{0}}=-\mu_{N} \alpha {\Delta} t \frac{\partial G^{N}}{\partial \textbf{U}}^{\top} -\mu_{N} \frac{\partial G^{N}}{\partial \dot{\textbf{U}}}^{\top} , $$

(59)

for Λ⁰ and evaluate Φ⁰ from either of the following expressions

$$\begin{array}{@{}rcl@{}} {\boldsymbol{\Phi}^{0}} &=& -\mu_{N} \frac{\partial G^{N}}{\partial \textbf{U}}^{\top} - {\frac{\partial \textbf{R}}{\partial \textbf{U}}^{N}}^{\top} {\boldsymbol{\Lambda}^{0}} \end{array} $$

(60)

$$\begin{array}{@{}rcl@{}} &=&\frac{1}{\alpha {\Delta} t}\left( \mu_{N} \frac{\partial G^{N}}{\partial \dot{\textbf{U}}}^{\top} + {\frac{\partial \textbf{R}}{\partial \dot{\textbf{U}}}^{N}}^{\top} {\boldsymbol{\Lambda}^{0}} \right) . \end{array} $$

(61)

Note that for an explicit method, i.e., α = 0, we must use (60) to evaluate Φ⁰. We next evaluate

$$ \frac{\mathrm{D} F}{\mathrm{D} d_{i}}= \mu_{N} \frac{\partial G^{N}}{\partial d_{i}} + {\boldsymbol{\Lambda}^{0}}^{\top} \frac{\partial \textbf{R}}{\partial d_{i}}^{N} . $$

(62)

To annihilate ∂Uⁿ/∂d_i and $\partial \dot {\textbf {U}}^{n} / \partial d_{i}$, we march in time computing Λⁿ from

$$\begin{array}{@{}rcl@{}} {\textbf{K}^{N-n}}^{\top} {\boldsymbol{\Lambda}^{n}}&=&-\mu_{N-n} \alpha {\Delta} t \frac{\partial G^{N-n}}{\partial \textbf{U}}^{\top} \\&&-\mu_{N-n} \frac{\partial G^{N-n}}{\partial \dot{\textbf{U}}}^{\top} + {\Delta} t {\boldsymbol{\Phi}^{n-1}} , \end{array} $$

(63)

and updating Φⁿ from either of the following equations

$$\begin{array}{@{}rcl@{}} {\boldsymbol{\Phi}^{n}} &= &{\boldsymbol{\Phi}^{n-1}} -\mu_{N-n} \frac{\partial G^{N-n}}{\partial \textbf{U}}^{\top} - {\frac{\partial \textbf{R}}{\partial \textbf{U}}^{N-n}}^{\top} {\boldsymbol{\Lambda}^{n}} \end{array} $$

(64)

$$\begin{array}{@{}rcl@{}} &= &-\frac{1-\alpha}{\alpha} {\boldsymbol{\Phi}^{n-1}}\\ &&+\frac{1}{\alpha {\Delta} t}\left( \mu_{N-n} \frac{\partial G^{N-n}}{\partial \dot{\textbf{U}}}^{\top} + {\frac{\partial \textbf{R}}{\partial \dot{\textbf{U}}}^{N-n}}^{\top} {\boldsymbol{\Lambda}^{n}} \right) . \end{array} $$

(65)

Again, (65) is restricted to the α ≠ 0 case. Because of the different Φ updates, we define option 1 as the use of (60) and (64), and option 2 as the use of (61) and (65). After each of these t_n computations, we update

$$ \frac{\mathrm{D} F}{\mathrm{D} d_{i}} \leftarrow \frac{\mathrm{D} F}{\mathrm{D} d_{i}} + \mu_{N-n} \frac{\partial G^{N-n}}{\partial d_{i}} + {\boldsymbol{\Lambda}^{n}}^{\top} \frac{\partial \textbf{R}}{\partial d_{i}}^{N-n} . $$

(66)

Finally, to annihilate $\partial \dot {\textbf {U}}^{0} / \partial d_{i}$, we solve the linear problem

$$ {\textbf{K}^{0}}^{\top} {\boldsymbol{\Lambda}^{N}} =-\mu_{0} \frac{\partial G^{0}}{\partial \dot{\textbf{U}}}^{\top}+ (1-\alpha){\Delta} t {\boldsymbol{\Phi}^{N-1}} , $$

(67)

for Λ^N and update

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}} &\leftarrow& \frac{\mathrm{D} F}{\mathrm{D} d_{i}} + \mu_{0} \frac{\partial G^{0}}{\partial d_{i}} + \mu_{0} \frac{\partial G^{0}}{\partial \textbf{U}}\frac{\partial \textbf{U}^{0}}{\partial d_i} -{\boldsymbol{\Phi}^{N-1}}^{\top}\frac{\partial \textbf{U}^{0}}{\partial d_i} \\ &&+ {\boldsymbol{\Lambda}^{N}}^{\top} \left( \frac{\partial {\textbf{R}}^{0}}{\partial d_{i}} + \frac{\partial {\textbf{R}}^{0}}{\partial \textbf{U}} \frac{\partial \textbf{U}^{0}}{\partial d_i}\right) . \end{array} $$

(68)

All of the computations in (59)–(67) are performed after the primal analysis is terminated; thus, we insert node C from Fig. 4 into the flowchart of Fig. 1.
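The backward sweep of (59)–(68) with option 1 can be sketched on a small model problem and checked against a central finite difference of F. This is an illustration under stated assumptions, not the paper's code: the residual R = M U̇ + d A U with a deliberately asymmetric `A`, the design-dependent initial state U⁰ = d·`u_ref`, the integrand G = (U·U + U̇·U̇)/2, the trapezoidal weights, and the tangent forms Kⁿ = ∂R/∂U̇ + αΔt ∂R/∂U and K⁰ = ∂R⁰/∂U̇ are all hypothetical choices. The right-hand side for Λ^N uses ∂G⁰/∂U̇, which is what annihilating the coefficient of ∂U̇⁰/∂d_i in (58) requires.

```python
import numpy as np

# Hypothetical model problem: R = M Udot + d A U = 0, U^0 = d * u_ref.
M = np.array([[2.0, 0.5], [0.5, 1.0]])
A = np.array([[2.0, -1.5], [-0.5, 2.0]])      # deliberately asymmetric
u_ref = np.array([1.0, 0.5])
alpha, N, t_end = 0.5, 100, 1.0
dt = t_end / N
mu = np.full(N + 1, dt)                        # trapezoidal weights (assumed)
mu[0] = mu[-1] = 0.5 * dt


def primal(d):
    """March the primal problem; return histories U, Udot and the response F."""
    K = M + alpha * dt * d * A                 # assumed tangent K^n
    U, Udot = np.zeros((N + 1, 2)), np.zeros((N + 1, 2))
    U[0] = d * u_ref
    Udot[0] = np.linalg.solve(M, -d * A @ U[0])
    for n in range(1, N + 1):
        U_pred = U[n - 1] + (1.0 - alpha) * dt * Udot[n - 1]
        Udot[n] = np.linalg.solve(K, -d * A @ U_pred)
        U[n] = U_pred + alpha * dt * Udot[n]
    F = sum(mu[n] * 0.5 * (U[n] @ U[n] + Udot[n] @ Udot[n]) for n in range(N + 1))
    return U, Udot, F


def adjoint_sensitivity(d):
    """DF/Dd via discretize-then-differentiate, option 1; dG/dU = U, dG/dUdot = Udot."""
    U, Udot, _ = primal(d)
    KT = (M + alpha * dt * d * A).T
    dR_dU_T = (d * A).T
    lam = np.linalg.solve(KT, -mu[N] * (alpha * dt * U[N] + Udot[N]))   # (59)
    phi = -mu[N] * U[N] - dR_dU_T @ lam                                  # (60)
    DF = lam @ (A @ U[N])                                                # (62), dG/dd = 0
    for n in range(1, N):
        m = N - n
        rhs = -mu[m] * (alpha * dt * U[m] + Udot[m]) + dt * phi
        lam = np.linalg.solve(KT, rhs)                                   # (63)
        phi = phi - mu[m] * U[m] - dR_dU_T @ lam                         # (64)
        DF += lam @ (A @ U[m])                                           # (66)
    # Final step: K^0 = dR^0/dUdot = M and right-hand side with dG^0/dUdot
    lam = np.linalg.solve(M.T, -mu[0] * Udot[0] + (1.0 - alpha) * dt * phi)
    # (68) with dU^0/dd = u_ref and dR^0/dd = A U^0
    DF += (mu[0] * U[0] - phi) @ u_ref + lam @ (A @ U[0] + d * A @ u_ref)
    return DF


d0, h = 1.0, 1e-6
DF_adj = adjoint_sensitivity(d0)
DF_fd = (primal(d0 + h)[2] - primal(d0 - h)[2]) / (2 * h)
print(DF_adj, DF_fd)
```

Since this approach differentiates the discrete equations exactly, the adjoint value should match the finite difference up to the finite difference error, independently of N and of the asymmetry of ∂R/∂U. One sweep yields Λⁿ for all design variables; only the products with ∂R/∂d_i change per variable.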

The sensitivities using the differentiate-then-discretize and discretize-then-differentiate adjoint approaches are different because the discretization and differentiation steps do not commute. As seen shortly, the latter approach yields more accurate results. However, for a large number of time steps, the time discretization error shrinks and the two methods converge.

For the semi-analytical method, if we use option 1, we again require ∂Rⁿ/∂U to be symmetric and we approximate the adjoint load terms of (60) and (64) as

$$ {\frac{\partial \textbf{R}}{\partial \textbf{U}}^{N-n}}^{\top} {\boldsymbol{\Lambda}^{n}} \approx \frac{1}{\epsilon} \textbf{R} \left( \textbf{U}^{N-n}+\epsilon \boldsymbol{\Lambda}^{n}, \dot{\textbf{U}}{}^{N-n}, \textbf{d}\right) . $$

(69)

Fortunately, we have option 2 to approximate Φⁿ if ∂Rⁿ/∂U is asymmetric and we cannot use (69). We consider the restriction for which $\partial \textbf {R} / \partial \dot {\textbf {U}}$ is symmetric, which is common, and α≠ 0. In this case, the adjoint load terms of (61) and (65) are approximated as

$$ {\frac{\partial \textbf{R}}{\partial \dot{\textbf{U}}}^{N-n}}^{\top} \boldsymbol{\Lambda}^{n} \approx \frac{1}{\epsilon} \textbf{R} \left( \textbf{U}^{N-n}, \dot{\textbf{U}}{}^{N-n}+{\epsilon} \boldsymbol{\Lambda}^{n} , \textbf{d}\right) , $$

(70)

and in DF/Dd_i of (68) we approximate the sum

$$ \frac{\partial {\textbf{R}}^{0}}{\partial d_{i}} + \frac{\partial {\textbf{R}}^{0}}{\partial \textbf{U}} \frac{\partial \textbf{U}}{\partial d_i}^{0} \approx \frac{1}{\epsilon} \textbf{R} \left( \textbf{U}^{0} +\epsilon \frac{\partial \textbf{U}}{\partial d_i}^{0}, \dot{\textbf{U}}{}^{0} , \textbf{d} +\epsilon \textbf{e}_i\right) . $$

(71)

The derivatives ∂Rⁿ/∂d_i of (62), (66), and (68) are approximated via finite differences using (56). Again, these semi-analytical approximations are susceptible to the previously discussed errors.

3.6 Transient example

Consider the transient heat conduction problem of a straight one-dimensional fin with constant cross-sectional area (Kramer and Stockman 1963) expressed in non-dimensional form as

$$\begin{array}{@{}rcl@{}} \dot{\theta} -\frac{\mathrm{d}}{\mathrm{d}x}\left[ k(\theta) \frac{\mathrm{d}\theta}{\mathrm{d}x}\right]+M^{2}\theta^{p + 1} = 0, & \enskip \text{in} \enskip 0<x<1 , \\ \frac{\mathrm{d}\theta}{\mathrm{d}x} = 0, &\enskip\text{at} \enskip x = 0, t>0 ,\\ \theta = 1, &\enskip\text{at} \enskip x = 1, t>0 ,\\ \theta = 1, &\enskip\text{at} \enskip t = 0 , \end{array} $$

(72)

where x, t, and 𝜃 are the non-dimensional position, time, and temperature, respectively. k(𝜃) = 1 + ξ𝜃 is the non-dimensional thermal conductivity, ξ and M = 1 are fin parameters, and the exponent p = 1/3 models the removal of heat by turbulent natural convection along the fin. The bar is discretized by 5 equal-length linear finite elements and the time domain [0,1] is discretized into N equal time steps. The Newton-Raphson convergence criterion is |R| < 𝜖_R = 10^−14. The various sensitivity methods are illustrated for the following response function

$$ F={{\int}_{0}^{2}} {{\int}_{0}^{1}} \left( \zeta\theta^{2}(x,t) + (1-\zeta)\dot{\theta}^{2}(x,t)\right) \mathrm{d}x \mathrm{d}t $$

(73)

where the integral is approximated by using the trapezoidal rule in time and element-wise 2-point Gaussian quadrature in space. We use ζ = 0.5 and compute the sensitivities with respect to the parameter d = M. The perturbation 𝜖 = 10^−6 is used in the finite difference and semi-analytical approaches, unless otherwise stated.
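The quadrature of the response function described above can be sketched as follows. This is a minimal illustration assuming nodal temperature histories on equal-length linear elements over 0 < x < 1; the function names, the array layout, and the argument `t_end` are hypothetical and not taken from the paper.

```python
import numpy as np


def trapezoid_weights(n_steps, t_end):
    """Time weights mu_n of the trapezoidal rule on n_steps equal steps."""
    mu = np.full(n_steps + 1, t_end / n_steps)
    mu[0] = mu[-1] = 0.5 * t_end / n_steps
    return mu


def spatial_integral(theta, theta_dot, zeta=0.5):
    """Integrate zeta*theta^2 + (1 - zeta)*theta_dot^2 over 0 < x < 1 using
    2-point Gauss quadrature on equal-length linear elements (nodal inputs)."""
    n_el = len(theta) - 1
    h = 1.0 / n_el
    total = 0.0
    for e in range(n_el):
        for xi in (-1.0 / np.sqrt(3.0), 1.0 / np.sqrt(3.0)):   # unit weights
            N1, N2 = 0.5 * (1.0 - xi), 0.5 * (1.0 + xi)         # linear shape functions
            th = N1 * theta[e] + N2 * theta[e + 1]
            thd = N1 * theta_dot[e] + N2 * theta_dot[e + 1]
            total += 0.5 * h * (zeta * th**2 + (1.0 - zeta) * thd**2)
    return total


def response(theta_hist, theta_dot_hist, t_end, zeta=0.5):
    """F ~ sum_n mu_n G^n for histories of shape (N + 1, n_nodes)."""
    mu = trapezoid_weights(len(theta_hist) - 1, t_end)
    return sum(m * spatial_integral(th, thd, zeta)
               for m, th, thd in zip(mu, theta_hist, theta_dot_hist))


x = np.linspace(0.0, 1.0, 6)                      # 5 elements
print(spatial_integral(x, np.zeros(6), zeta=1.0))  # integral of x^2 on (0, 1)
```

The 2-point rule integrates the piecewise quadratic integrand exactly on each element, so the only quadrature error in F comes from the trapezoidal rule in time.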

3.6.1 Symmetric ∂ R/∂ U and $\partial \mathbf {R} / \partial \dot {\mathbf {U}}$

We first consider the linear thermal conductivity case, i.e., ξ = 0, for which ∂R/∂U and $\partial \textbf {R} / \partial \dot {\textbf {U}}$ are symmetric. The computations performed with the various methods yield similar results, cf. Table 1. For N = 100 and α = 0, the explicit integration scheme is not stable. Also, for α = 0, we cannot use the semi-analytical adjoint discretize-then-differentiate option 2, cf. (61) and (65).

Table 1 Sensitivities for the symmetric problem


To examine the consistency of the methods, we show the error e_f, cf. (1), for different perturbation sizes 𝜖 for the N = 1000 and α = 0.5 case, cf. Figure 5. As the perturbation 𝜖 decreases, the sensitivities obtained by finite differences converge to those obtained analytically. However, the finite difference sensitivities erode for small perturbations due to round-off error.

We also show the error e_f for different time discretizations N for the 𝜖 = 10^−6 and α = 0.5 case, cf. Figure 6. The errors of the sensitivities obtained by the different methods show no dependency on N, with the exception of the adjoint differentiate-then-discretize scheme. As expected, this sensitivity has a consistency error that decreases as the number of time steps increases (Gunzburger 2003; Jensen et al. 2014).

To examine the accuracy of the semi-analytical sensitivities, we compare them to their respective analytical sensitivities via the error e_s of (2) for different perturbation sizes 𝜖 and the N = 1000 and α = 0.5 case, cf. Figure 7. As expected, the error is smaller as the perturbation size decreases until round-off error pollutes the computations.

In Fig. 8, we show the error e_s for different time discretizations N using the 𝜖 = 10^−6 and α = 0.5 case. The errors of the semi-analytical sensitivities show no dependency on N because the semi-analytical approximations are independent of the time discretization, i.e., the error is solely due to the perturbation size 𝜖.

3.6.2 Asymmetric ∂ R/∂ U

We consider the nonlinear thermal conductivity case where ξ = 0.5, for which only $\partial \textbf {R} / \partial \dot {\textbf {U}}$ is symmetric and ∂R/∂U is not. The computations performed with the various methods yield similar results, cf. Table 2, with the exception of the semi-analytical adjoint differentiate-then-discretize and semi-analytical adjoint discretize-then-differentiate option 1 schemes, which exhibit errors of approximately 0.1% with respect to their analytical counterparts. We attribute this error to the asymmetric ∂R/∂U. Again, for N = 100, the explicit α = 0 scheme is not stable.

Table 2 Sensitivities for the asymmetric problem


To examine the consistency of the methods, we show the error e_f for different perturbation sizes and time steps in Figs. 9 and 10. As in the previous example, the adjoint differentiate-then-discretize method has a consistency error that decreases as the number of time steps increases.

Now we examine the accuracy of the semi-analytical sensitivities, computing the error e_s for different perturbation sizes 𝜖 with N = 1000 and α = 0.5, cf. Figure 11. Since ∂R/∂U is not symmetric, (54) and (69) do not hold, resulting in appreciable error in both the semi-analytical adjoint differentiate-then-discretize and the semi-analytical adjoint discretize-then-differentiate option 1 schemes. The other semi-analytical methods do not exhibit this error. Again as the perturbation size decreases, the error lessens until round-off error pollutes the computations.

In Fig. 12, we show the error e_s for different time discretizations N using the 𝜖 = 10^−6 and α = 0.5 case. The error for the semi-analytical adjoint differentiate-then-discretize and the semi-analytical adjoint discretize-then-differentiate option 1 schemes is evident.

4 Nonlinear dynamic problems

A nonlinear dynamic problem can be expressed through a residual as

$$\begin{array}{@{}rcl@{}} \textbf{R}(\textbf{U}(t, \textbf{d}),\dot{\textbf{U}}(t, \textbf{d}),\ddot{\textbf{U}}(t, \textbf{d}),\textbf{d}) &=&\textbf{0} , \end{array} $$

(74a)

$$\begin{array}{@{}rcl@{}} \dot{\textbf{U}}(0) &=& \dot{\textbf{U}}^{0} , \end{array} $$

(74b)

$$\begin{array}{@{}rcl@{}} \textbf{U}(0) &=& \textbf{U}^{0} , \end{array} $$

(74c)

where we note the design dependencies as in (7). The response function for this system is again expressed by (20) and its sensitivity computed by (21).

For the sensitivity analysis, we implement the direct method by differentiating (74a), (74b), and (74c)

$$\begin{array}{@{}rcl@{}} \frac{\partial \textbf{R}}{\partial \ddot{\textbf{U}}} \frac{\partial \ddot{\textbf{U}}}{\partial d_i} +\frac{\partial \textbf{R}}{\partial\dot{\textbf{U}}} \frac{\partial \dot{\textbf{U}}}{\partial d_i} + \frac{\partial \textbf{R}}{\partial{\textbf{U}}} \frac{\partial \textbf{U}}{\partial d_i} & =& - \frac{\partial {\textbf{R}}}{\partial d_i} , \end{array} $$

(75a)

$$\begin{array}{@{}rcl@{}} \frac{\partial \dot{\textbf{U}}(0)}{\partial d_i} &=& \frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i} , \end{array} $$

(75b)

$$\begin{array}{@{}rcl@{}} \frac{\partial \textbf{U}(0)}{\partial d_i} &=& \frac{\partial \textbf{U}^{0}}{\partial d_i} , \end{array} $$

(75c)

and solve the resulting pseudo problem for $\partial \ddot {\textbf {U}}/\partial d_{i}$, $\partial \dot {\textbf {U}}/\partial d_{i}$, and ∂U/∂d_i whereupon we evaluate (21).

Alternatively, we can implement the adjoint method whereby we insert (75a) into (21) to obtain the equivalent sensitivity

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}}& = & {\int}_{0}^{t_{f}} \left( \frac{\partial G}{\partial\textbf{U}} \frac{\partial \textbf{U}}{\partial d_i} + \frac{\partial G}{\partial \dot{\textbf{U}}} \frac{\partial \dot{\textbf{U}}}{\partial d_i} + \frac{\partial G}{\partial d_{i}} \right) \mathrm{d}t \\ && + {\int}_{0}^{t_{f}} {\boldsymbol{ \lambda}}^{\top} \left( \frac{\partial \textbf{R}}{\partial \ddot{\textbf{U}}} \frac{\partial \ddot{\textbf{U}}}{\partial d_i} +\frac{\partial \textbf{R}}{\partial\dot{\textbf{U}}} \frac{\partial \dot{\textbf{U}}}{\partial d_i} + \frac{\partial \textbf{R}}{\partial{\textbf{U}}} \frac{\partial \textbf{U}}{\partial d_i} + \frac{\partial \textbf{R}}{\partial d_{i}} \right) \mathrm{d}t .\\ \end{array} $$

(76)

Here, λ is again an arbitrary adjoint vector. Integrating by parts and rearranging (76) yields

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}}\!\!&=&\!\! {\int}_{0}^{t_{f}} \left( \frac{\partial G}{\partial d_{i}} + {\boldsymbol{ \lambda}^{\top}} \frac{\partial \textbf{R}}{\partial d_{i}} \right) \mathrm{d}t \\ &&-\left. \frac{\partial \textbf{U}^{0}}{\partial d_i}^{\top} \left( \frac{\partial G}{\partial \dot{\textbf{U}}}^{\top} + \frac{\partial \textbf{R}}{\partial \dot{\textbf{U}}}^{\top} \boldsymbol{ \lambda} -\frac{\text{d}}{\text{d}t}\left( \frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top}\boldsymbol{ \lambda}\right)\right)\right\rvert_{t = 0} \\ &&-\!\left. \frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i}^{\top} \left( \frac{\partial \textbf{R}}{\partial \ddot{\textbf{U}}}^{\top} \boldsymbol{ \lambda}\right)\right\rvert_{t = 0} + {\int}_{0}^{t_{f}} \frac{\partial \textbf{U}}{\partial d_i}^{\top} \left( \frac{\partial {G}}{\partial \textbf{U}}^{\top} \! - \frac{\text{d}}{\text{d}t}\left( \frac{\partial {G}^{\top}}{\partial \dot{\textbf{U}}}\right) \right. 
\\ &&\left.\vphantom{+ {\int}_{0}^{t_{f}} \frac{\partial \textbf{U}}{\partial d_i}^{\top} \left( \frac{\partial {G}}{\partial \textbf{U}}^{\top} \right)} +\frac{\partial \textbf{R}^{\top}}{\partial \textbf{U}} \boldsymbol{ \lambda} - \frac{\text{d}}{\text{d}t}\left( \frac{\partial \textbf{R}^{\top}}{\partial \dot{\textbf{U}}} \boldsymbol{ \lambda}\right) +\frac{\text{d}^{2}}{\text{d}t^{2}} \left( {\frac{\partial \textbf{R}^{\top}}{\partial{\ddot{\textbf{U}}}} \boldsymbol{ \lambda}}\right) \right) \mathrm{d}t \\&&+\left.\frac{\partial \textbf{U}}{\partial d_i}^{\top} \left( \frac{\partial G}{\partial \dot{\textbf{U}}}^{\top} + \frac{\partial \textbf{R}}{\partial \dot{\textbf{U}}}^{\top} \boldsymbol{ \lambda}-\frac{\text{d}}{\text{d}t} \left( \frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top}\boldsymbol{ \lambda}\right)\right)\right\rvert_{t=t_{f}} \\&&+\left.\frac{\partial \dot{\textbf{U}}}{\partial d_i}^{\top} \left( \frac{\partial \textbf{R}}{\partial \ddot{\textbf{U}}}^{\top} \boldsymbol{\lambda}\right)\right\rvert_{t=t_{f}} .\\ \end{array} $$

(77)

Next, we introduce the time mapping of (25) and substitute it into the above (77) to obtain

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}}&=& {\int}_{0}^{t_{f}} \left( \frac{\partial G}{\partial d_{i}} + {\boldsymbol{\Lambda}^{\top}} \frac{\partial \textbf{R}}{\partial d_{i}} \right) \mathrm{d}t \\ &&-\left. \frac{\partial \textbf{U}^{0}}{\partial d_i}^{\top} \left( \frac{\partial G}{\partial \dot{\textbf{U}}}^{\top} + \left( \frac{\partial \textbf{R}}{\partial \dot{\textbf{U}}}^{\top} -\frac{\text{d}}{\text{d}t}\left( \frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top}\right)\right) \boldsymbol{\Lambda} + \frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top} \dot{\boldsymbol{\Lambda}} \right)\right\rvert_{t = 0} \\&&-\left. \frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i}^{\top} \left( \frac{\partial \textbf{R}}{\partial \ddot{\textbf{U}}}^{\top} \boldsymbol{\Lambda} \right)\right\rvert_{t = 0} + {\int}_{0}^{t_{f}} \frac{\partial \textbf{U}}{\partial d_i}^{\top} \left( \frac{\partial {G}}{\partial \textbf{U}}^{\top} - \frac{\text{d}}{\text{d}t}\left( \frac{\partial {G}^{\top}}{\partial \dot{\textbf{U}}}\right) \right. \\&&+\left. \left( \frac{\partial \textbf{R}}{\partial \textbf{U}}^{\top} - \frac{\text{d}}{\text{d}t}\left( \frac{\partial \textbf{R}^{\top}}{\partial \dot{\textbf{U}}}\right) +\frac{\text{d}^{2}}{\text{d}t^{2}} \left( \frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top}\right) \right) \boldsymbol{\Lambda} \right. 
\\ &&+\left.\left( \frac{\partial {\textbf{R}}}{\partial \dot{\textbf{U}}}^{\top} -2\frac{\text{d}}{\text{d}t}\left( \frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top}\right)\right)\dot{\boldsymbol{\Lambda}} +\frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top} \ddot{\boldsymbol{\Lambda}} \right) \mathrm{d}t \\&&+\left.\frac{\partial \textbf{U}}{\partial d_i}^{\top} \left( \frac{\partial G}{\partial \dot{\textbf{U}}}^{\top} + \left( \frac{\partial \textbf{R}}{\partial \dot{\textbf{U}}}^{\top} -\frac{\text{d}}{\text{d}t}\left( \frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top}\right) \right) \boldsymbol{\Lambda} + \frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top} \dot{\boldsymbol{\Lambda}} \right)\right\rvert_{t=t_{f}} \\&&+\left. \frac{\partial \dot{\textbf{U}}}{\partial d_i}^{\top} \left( \frac{\partial \textbf{R}}{\partial \ddot{\textbf{U}}}^{\top} \boldsymbol{\Lambda} \right)\right\rvert_{t=t_{f}} . \end{array} $$

(78)

where all quantities are evaluated at time t except for Λ which is evaluated at t_f − t. We annihilate the terms containing the implicitly defined derivative ∂U/∂d_i by requiring Λ to solve

$$\begin{array}{@{}rcl@{}} &&\frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top} \ddot{\boldsymbol{\Lambda}} +\left( \frac{\partial {\textbf{R}}}{\partial\dot{\textbf{U}}}^{\top} -2\frac{\text{d}}{\text{d}t}\left( \frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top}\right)\right)\dot{\boldsymbol{\Lambda}} \\ &&+\left( \frac{\partial \textbf{R}}{\partial \textbf{U}}^{\top} - \frac{\text{d}}{\text{d}t}\left( \frac{\partial \textbf{R}^{\top}}{\partial \dot{\textbf{U}}}\right) +\frac{\text{d}^{2}}{\text{d}t^{2}}\left( \frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top}\right)\right) \boldsymbol{\Lambda} \\ &=& - \frac{\partial {G}}{\partial \textbf{U}}^{\top} + \frac{\text{d}}{\text{d}t}\left( \frac{\partial {G}^{\top}}{\partial \dot{\textbf{U}}}\right) , \end{array} $$

(79a)

$$\begin{array}{@{}rcl@{}} \left.\frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top} \right\rvert_{t=t_{f}} \dot{\boldsymbol{\Lambda}} (0) &=&\left. -\frac{\partial G}{\partial \dot{\textbf{U}}}^{\top} \right\rvert_{t=t_{f}} , \end{array} $$

(79b)

$$\begin{array}{@{}rcl@{}} \boldsymbol{\Lambda}(0) &=& \textbf{0} . \end{array} $$

(79c)

Using this Λ, the sensitivity reduces to

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}}&=& {\int}_{0}^{t_{f}} \left( \frac{\partial G}{\partial d_{i}} + {\boldsymbol{\Lambda}^{\top}} \frac{\partial \textbf{R}}{\partial d_{i}} \right) \mathrm{d}t \\ &&- \frac{\partial \textbf{U}^{0}}{\partial d_i}^{\top} \left( \frac{\partial G}{\partial \dot{\textbf{U}}}^{\top} + \left( \frac{\partial \textbf{R}}{\partial \dot{\textbf{U}}}^{\top} -\frac{\text{d}}{\text{d}t}\left( \frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top}\right)\right) \boldsymbol{\Lambda} \right. \\ &&+\left.\left. \frac{\partial {\textbf{R}}}{\partial \ddot{\textbf{U}}}^{\top} \dot{\boldsymbol{\Lambda}} \right)\right\rvert_{t = 0} -\left. \frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i}^{\top} \left( \frac{\partial \textbf{R}}{\partial \ddot{\textbf{U}}}^{\top} \boldsymbol{\Lambda} \right)\right\rvert_{t = 0} . \end{array} $$

(80)

where again all quantities are evaluated at time t except for Λ which is evaluated at t_f − t.

4.1 Discretization

To solve the above, we discretize in time using the Newmark method so that

$$\begin{array}{@{}rcl@{}} \dot{\textbf{U}}^{n} =\dot{\textbf{U}}^{n-1}+(1-\gamma){\Delta} t \ddot{\textbf{U}}^{n-1}+\gamma{\Delta} t \ddot{\textbf{U}}^{n} , \end{array} $$

(81)

$$\begin{array}{@{}rcl@{}} \textbf{U}^{n} &=& \textbf{U}^{n-1}+{\Delta} t\dot{\textbf{U}}^{n-1}\\ &&+\left( \frac{1}{2}-\beta\right){\Delta} t^{2} \ddot{\textbf{U}}^{n-1}+\beta{\Delta} t^{2} \ddot{\textbf{U}}^{n} , \end{array} $$

(82)

where Uⁿ = U(t_n), $\dot {\textbf {U}}^{n}=\dot {\textbf {U}}(t_{n})$ and $\ddot {\textbf {U}}^{n}=\ddot {\textbf {U}}(t_{n})$. To simplify the ensuing developments, we define coefficients a = (1 − γ)Δt, b = γΔt, c = Δt, d = (1/2 − β)Δt², and e = βΔt².
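The Newmark relations (81) and (82) and the coefficients a through e can be illustrated with a short Python sketch. The function names `newmark_coefficients` and `newmark_update` are assumptions introduced here for illustration, and scalar states are used for brevity.

```python
def newmark_coefficients(dt, gamma=0.5, beta=0.25):
    """Return (a, b, c, d, e) as defined in the text for a time step dt."""
    a = (1.0 - gamma) * dt
    b = gamma * dt
    c = dt
    d = (0.5 - beta) * dt**2
    e = beta * dt**2
    return a, b, c, d, e


def newmark_update(u_prev, v_prev, acc_prev, acc_new, coeffs):
    """Evaluate (82) and (81) for scalar states once acc_new is known."""
    a, b, c, d, e = coeffs
    v_new = v_prev + a * acc_prev + b * acc_new
    u_new = u_prev + c * v_prev + d * acc_prev + e * acc_new
    return u_new, v_new
```

For a constant acceleration the update reproduces the exact kinematics for any γ and β, which makes a convenient sanity check of the coefficients.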

4.2 Primal analysis

In the primal analysis, we are given the initial conditions U⁰ and $\dot {\textbf {U}}^{0}$, so first we use (74a) and solve

$$ \textbf{R}^{0}(\textbf{U}^{0},\dot{\textbf{U}}^{0},\ddot{\textbf{U}}^{0},\textbf{d}) =\textbf{0} , $$

(83)

for $\ddot {\textbf {U}}^{0}$ by Newton-Raphson. The updates ${\Delta }\ddot {\textbf {U}}^{0}$ for $\ddot {\textbf {U}}^{0}$ are obtained by solving

$$ \textbf{K}^{0}(\textbf{U}^{0},\dot{\textbf{U}}^{0},\ddot{\textbf{U}}^{0}, \textbf{d}){\Delta}\ddot{\textbf{U}}^{0}=-\textbf{R}^{0}(\textbf{U}^{0},\dot{\textbf{U}}^{0},\ddot{\textbf{U}}^{0}, \textbf{d}) , $$

(84)

where $ \textbf {K}^{0}={\partial \textbf {R}^{0}}/{\partial \ddot {\textbf {U}}}$ is the tangent matrix. We continue updating until convergence.

Having U⁰, $\dot {\textbf {U}}^{0}$ and $\ddot {\textbf {U}}^{0}$, we compute the first term in (31), i.e., $F= \mu _{0} G^{0}(\textbf {U}^{0},\dot {\textbf {U}}^{0}, \textbf {d})$.

Now we commence our analysis. At each time step t_n, we replace $\dot {\textbf {U}}^{n}$ and Uⁿ with the right-hand side (RHS) of (81) and (82), solve (74a) for $\ddot {\textbf {U}}^{n}$ and then evaluate $\dot {\textbf {U}}^{n}$ and Uⁿ from (81) and (82). Newton’s method is also used for these solves, whereupon we calculate the update ${\Delta }\ddot {\textbf {U}}^{n}$ from the linear equation

$$ \textbf{K}^{n}(\textbf{U}^{n},\dot{\textbf{U}}^{n},\ddot{\textbf{U}}^{n}, \textbf{d}){\Delta}\ddot{\textbf{U}}^{n}=-\textbf{R}^{n}(\textbf{U}^{n},\dot{\textbf{U}}^{n},\ddot{\textbf{U}}^{n}, \textbf{d}) , $$

(85)

where $ \textbf {K}^{n}={\partial \textbf {R}^{n}}/{\partial \ddot {\textbf {U}}}+b {\partial \textbf {R}^{n}}/{\partial \dot {\textbf {U}}} +e {\partial \textbf {R}^{n}}/{\partial \textbf {U}}$ is the tangent stiffness matrix. After convergence, we update F as per (34). A flowchart of these computations appears in Fig. 13.
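The primal march of (83) to (85) can be sketched in Python for a single unit mass with a cubic spring and a linear damper. This one-mass residual, the names `residual` and `primal`, the tolerance of 1e-13, and the iteration cap are assumptions made for this sketch; they are not taken from the analysis program discussed in the paper.

```python
def residual(u, v, acc, k_d, k_c, m=1.0):
    """Assumed scalar residual R = m*acc + k_c*v + u + k_d*u**3."""
    return m * acc + k_c * v + u + k_d * u**3


def primal(k_d, k_c=0.1, u0=1.0, v0=0.0, t_f=10.0, n_steps=1000,
           gamma=0.5, beta=0.25, tol=1e-13, max_iter=50, m=1.0):
    """March in time; return a list of (u, v, acc) for n = 0, ..., n_steps."""
    dt = t_f / n_steps
    a, b = (1.0 - gamma) * dt, gamma * dt
    c, d, e = dt, (0.5 - beta) * dt**2, beta * dt**2

    # Step 0: solve R^0 = 0 for the initial acceleration with K^0 = dR/dacc = m.
    acc = 0.0
    for _it in range(max_iter):
        r = residual(u0, v0, acc, k_d, k_c, m)
        if abs(r) < tol:
            break
        acc -= r / m
    history = [(u0, v0, acc)]

    for _step in range(n_steps):
        u_prev, v_prev, acc_prev = history[-1]
        # Predictors: the parts of (82) and (81) that do not involve acc^n.
        u_pred = u_prev + c * v_prev + d * acc_prev
        v_pred = v_prev + a * acc_prev
        acc = acc_prev  # initial Newton guess
        for _it in range(max_iter):
            u, v = u_pred + e * acc, v_pred + b * acc
            r = residual(u, v, acc, k_d, k_c, m)
            if abs(r) < tol:
                break
            # K^n = dR/dacc + b*dR/dv + e*dR/du for the assumed residual.
            k_tan = m + b * k_c + e * (1.0 + 3.0 * k_d * u**2)
            acc -= r / k_tan
        history.append((u_pred + e * acc, v_pred + b * acc, acc))
    return history
```

With γ = 1/2 and β = 1/4, the linear undamped case (k_d = 0, k_c = 0) conserves u² + v² up to round-off, which offers a simple check of the time stepping.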

4.3 Direct differentiation

For the direct differentiation sensitivity analysis, we discretize ∂U/∂d_i like U, i.e.,

$$\begin{array}{@{}rcl@{}} \frac{\partial \dot{\textbf{U}}^{n}}{\partial d_i} &=&\frac{\partial \dot{\textbf{U}}^{n-1}}{\partial d_i}+a \frac{\partial \ddot{\textbf{U}}^{n-1}}{\partial d_{i}}+b \frac{\partial \ddot{\textbf{U}}^{n}}{\partial d_{i}} , \end{array} $$

(86)

$$\begin{array}{@{}rcl@{}} \frac{\partial \textbf{U}^{n}}{\partial d_i} &=& \frac{\partial \textbf{U}^{n-1}}{\partial d_i}+c \frac{\partial \dot{\textbf{U}}^{n-1}}{\partial d_i}+d \frac{\partial \ddot{\textbf{U}}^{n-1}}{\partial d_{i}}+e \frac{\partial \ddot{\textbf{U}}^{n}}{\partial d_{i}} . \end{array} $$

(87)

Note that the initial conditions ∂U⁰/∂d_i and $\partial \dot {\textbf {U}}^{0} /\partial d_{i}$ are known, but $\partial \ddot {\textbf {U}}^{0} /\partial d_{i}$ is not. So before commencing, we must obtain $\partial \ddot {\textbf {U}}^{0} /\partial d_{i}$ as we did ${\ddot {\textbf {U}}}^{0}$. To this end, we differentiate (83) to obtain the linear equation

$$ \textbf{K}^{0} \frac{\partial \ddot{\textbf{U}}^{0}}{\partial d_{i}} = -\left( \frac{\partial {\textbf{R}}^{0}}{\partial \textbf{U}}\frac{\partial \textbf{U}^{0}}{\partial d_i} +\frac{\partial {\textbf{R}}^{0}}{\partial \dot{\textbf{U}}}\frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i} + \frac{\partial {\textbf{R}}^{0}}{\partial d_{i}} \right) , $$

(88)

which we solve for $\partial \ddot {\textbf {U}}^{0} /\partial d_{i}$. Having $\partial \dot {\textbf {U}}^{0} /\partial d_{i}$ and ∂U⁰/∂d_i we update DF/Dd_i as per (37). Now we march in time evaluating ∂Uⁿ/∂d_i, $\partial \dot {\textbf {U}}^{n} /\partial d_{i}$ and $\partial \ddot {\textbf {U}}^{n} /\partial d_{i}$ as we did to compute Uⁿ, $\dot {\textbf {U}}^{n}$, and $\ddot {\textbf {U}}^{n}$. From (75a), (86), and (87) we formulate the linear equation

$$\begin{array}{@{}rcl@{}} \textbf{K}^{n} \frac{\partial \ddot{\textbf{U}}^{n}}{\partial d_{i}} &=&-\frac{\partial \textbf{R}^{n}}{\partial \textbf{U}} \left( \frac{\partial \textbf{U}^{n-1}}{\partial d_i} +c \frac{\partial \dot{\textbf{U}}^{n-1}}{\partial d_i} +d \frac{\partial \ddot{\textbf{U}}^{n-1}}{\partial d_{i}} \right)\\ &&-\frac{\partial {\textbf{R}}^{n}}{\partial \dot{\textbf{U}}} \left( \frac{\partial \dot{\textbf{U}}^{n-1}}{\partial d_i} +a \frac{\partial \ddot{\textbf{U}}^{n-1}}{\partial d_{i}}\right) - \frac{\partial \textbf{R}^{n}}{\partial d_{i}} . \end{array} $$

(89)

We solve the above (89) for $\partial \ddot {\textbf {U}}^{n} /\partial d_{i}$ and update $\partial \dot {\textbf {U}}^{n} /\partial d_{i}$ and ∂Uⁿ/∂d_i via (86) and (87) and DF/Dd_i via (39). We continue marching in this manner for all t_n. In so far as our sensitivity analysis algorithm is concerned, we insert nodes A and B from Fig. 14 into the primal analysis flowchart of Fig. 13.
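The direct differentiation march of (86) to (89) can be illustrated with a Python sketch for one unit mass with residual R = Ü + k_c U̇ + U + k_d U³ and response F = ∫(U² + U̇²)dt evaluated by the trapezoidal rule. The one-mass model, the function name, the step count, and the Newton tolerance are assumptions made for this sketch; the initial conditions are taken to be independent of k_d, so their derivatives vanish.

```python
def response_and_direct_sensitivity(k_d, k_c=0.1, u0=1.0, v0=0.0, t_f=10.0,
                                    n_steps=400, gamma=0.5, beta=0.25):
    """Return (F, DF/Dk_d) from the primal march and direct differentiation."""
    dt = t_f / n_steps
    a, b = (1.0 - gamma) * dt, gamma * dt
    c, d, e = dt, (0.5 - beta) * dt**2, beta * dt**2

    u, v = u0, v0
    acc = -(k_c * v + u + k_d * u**3)  # R^0 = 0 is linear in acc (unit mass)
    du, dv = 0.0, 0.0                  # assumed design-independent initial state
    # (88) with K^0 = 1 and dR/dk_d = u**3.
    dacc = -((1.0 + 3.0 * k_d * u**2) * du + k_c * dv + u**3)

    mu = 0.5 * dt                      # trapezoidal weight of the first sample
    F = mu * (u**2 + v**2)
    dF = mu * (2.0 * u * du + 2.0 * v * dv)

    for n in range(1, n_steps + 1):
        u_pred, v_pred = u + c * v + d * acc, v + a * acc
        du_pred, dv_pred = du + c * dv + d * dacc, dv + a * dacc
        for _it in range(50):          # Newton iterations on acc, cf. (85)
            u_new, v_new = u_pred + e * acc, v_pred + b * acc
            r = acc + k_c * v_new + u_new + k_d * u_new**3
            if abs(r) < 1e-13:
                break
            acc -= r / (1.0 + b * k_c + e * (1.0 + 3.0 * k_d * u_new**2))
        u, v = u_pred + e * acc, v_pred + b * acc

        # Pseudo problem (89): same tangent K^n, new right-hand side.
        r_u = 1.0 + 3.0 * k_d * u**2
        k_tan = 1.0 + b * k_c + e * r_u
        dacc = -(r_u * du_pred + k_c * dv_pred + u**3) / k_tan
        du, dv = du_pred + e * dacc, dv_pred + b * dacc  # (87) and (86)

        mu = dt if n < n_steps else 0.5 * dt
        F += mu * (u**2 + v**2)
        dF += mu * (2.0 * u * du + 2.0 * v * dv)
    return F, dF
```

Because the differentiation follows the discretized equations, the result should agree with a central finite difference of F up to the finite difference error.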

For the semi-analytical method, we have the approximations

$$\begin{array}{@{}rcl@{}} &&\frac{\partial {\textbf{R}}^{0}}{\partial \textbf{U}}\frac{\partial \textbf{U}^{0}}{\partial d_i} +\frac{\partial {\textbf{R}}^{0}}{\partial \dot{\textbf{U}}}\frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i} + \frac{\partial {\textbf{R}}^{0}}{\partial d_{i}} \approx \\ &&\frac{1}{\epsilon} \textbf{R}^{0} \left( \textbf{U}^{0} + \epsilon\frac{\partial \textbf{U}^{0}}{\partial d_i}, \dot{\textbf{U}}^{0} + \epsilon\frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i}, \ddot{\textbf{U}}^{0}, \textbf{d} + \epsilon \textbf{e}_i \right) , \end{array} $$

(90)

$$\begin{array}{@{}rcl@{}} &&\frac{\partial \textbf{R}^{n}}{\partial \textbf{U}} \left( \frac{\partial \textbf{U}^{n-1}}{\partial d_i} +c \frac{\partial \dot{\textbf{U}}^{n-1}}{\partial d_i} +d \frac{\partial \ddot{\textbf{U}}^{n-1}}{\partial d_{i}} \right) \\&&+\frac{\partial {\textbf{R}}^{n}}{\partial \dot{\mathbf{U}}} \left( \frac{\partial \dot{\textbf{U}}^{n-1}}{\partial d_i} +a \frac{\partial \ddot{\textbf{U}}^{n-1}}{\partial d_{i}}\right) + \frac{\partial \textbf{R}^{n}}{\partial d_{i}} \\ &\approx&\frac{1}{\epsilon} \textbf{R}^{n} \left( \textbf{U}^{n}+\epsilon\left( \frac{\partial \textbf{U}^{n-1}}{\partial d_i} +c \frac{\partial \dot{\textbf{U}}^{n-1}}{\partial d_i} +d \frac{\partial \ddot{\textbf{U}}^{n-1}}{\partial d_{i}} \right), \right. \\ &&\left. \vphantom{\frac{1}{\epsilon} \textbf{R}^{n} \left( \textbf{U}^{n}+\epsilon\left( \frac{\partial \textbf{U}^{n-1}}{\partial d_i} +c \frac{\partial \dot{\textbf{U}}^{n-1}}{\partial d_i} +d \frac{\partial \ddot{\textbf{U}}^{n-1}}{\partial d_{i}} \right), \right)} \dot{\textbf{U}}^{n}+\epsilon\left( \frac{\partial \dot{\textbf{U}}^{n-1}}{\partial d_i} +a \frac{\partial \ddot{\textbf{U}}^{n-1}}{\partial d_{i}}\right), \ddot{\textbf{U}}^{n}, \textbf{d}+\epsilon \textbf{e}_i \right) ,\\ \end{array} $$

(91)

which we use in (88) and (89). Again, we assume the user can code ∂Gⁿ/∂U, $\partial G^{n}/ \partial \dot {\textbf {U}}$ and ∂Gⁿ/∂d_i.
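The approximation (91) can be checked on a scalar residual with a Python sketch. The residual R = Ü + k_c U̇ + U + k_d U³, the function names, and the sample values used to exercise it are assumptions; `p` and `q` stand for the two bracketed combinations of previous-step sensitivities that multiply ∂Rⁿ/∂U and ∂Rⁿ/∂U̇ in (89).

```python
def residual(u, v, acc, k_d, k_c=0.1):
    """Assumed scalar residual of a unit mass with cubic spring and damper."""
    return acc + k_c * v + u + k_d * u**3


def pseudo_load_semi_analytical(u, v, acc, k_d, p, q, eps=1e-6):
    """Right-hand side of (91): one perturbed residual evaluation over eps.

    No unperturbed residual is subtracted, so the result is only accurate
    when residual(u, v, acc, k_d) is close to zero (converged primal state).
    """
    return residual(u + eps * p, v + eps * q, acc, k_d + eps) / eps


def pseudo_load_analytical(u, k_d, p, q, k_c=0.1):
    """dR/dU * p + dR/dUdot * q + dR/dk_d for the assumed residual."""
    return (1.0 + 3.0 * k_d * u**2) * p + k_c * q + u**3
```

The missing subtraction of the unperturbed residual is what ties the accuracy of this approximation to the convergence tolerance of the primal analysis.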

4.4 Adjoint method using differentiate-then-discretize

In the adjoint differentiate-then-discretize approach, we discretize the adjoint problem (79a) and the sensitivity (80). The latter is evaluated as

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}}&=& \sum\limits_{n = 0}^{N}\mu_{N-n} \left( \frac{\partial G^{N-n}}{\partial d_{i}} + {{\boldsymbol{\Lambda}^{n}}^{\top}} \frac{\partial \textbf{R}^{N-n}}{\partial d_{i}} \right) \\ &&- \frac{\partial \textbf{U}^{0}}{\partial d_i}^{\top} \left( \frac{\partial G^{0}}{\partial \dot{\textbf{U}}}^{\top} + \left( \frac{\partial {\textbf{R}}^{0}}{\partial \dot{\textbf{U}}}^{\top} -\frac{\text{d}}{\text{d}t}\left( \frac{\partial {\textbf{R}}^{0}}{\partial \ddot{\textbf{U}}}^{\top}\right)\right) {\boldsymbol{\Lambda}^{N}}\right. \\ &&+\left. \frac{\partial {\textbf{R}}^{0}}{\partial \ddot{\textbf{U}}}^{\top} {\dot{\boldsymbol{\Lambda}}^{N}}\right) - \frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i}^{\top} \left( \frac{\partial {\textbf{R}}^{0}}{\partial \ddot{\textbf{U}}}^{\top} {\boldsymbol{\Lambda}^{N}}\right) . \end{array} $$

(92)

To obtain Λⁿ, we solve (79a), (79b), and (79c) like we did for U, i.e., we introduce the Newmark time stepping scheme

$$\begin{array}{@{}rcl@{}} \dot{\boldsymbol{\Lambda}}^{n} &=&\dot{\boldsymbol{\Lambda}}^{n-1}+a \ddot{\boldsymbol{\Lambda}}^{n-1}+b \ddot{\boldsymbol{\Lambda}}^{n} , \end{array} $$

(93)

$$\begin{array}{@{}rcl@{}} \boldsymbol{\Lambda}^{n} &=& \boldsymbol{\Lambda}^{n-1}+c \dot{\boldsymbol{\Lambda}}^{n-1}+d \ddot{\boldsymbol{\Lambda}}^{n-1}+e \ddot{\boldsymbol{\Lambda}}^{n} . \end{array} $$

(94)

To reuse Kⁿ like the direct method, we restrict R such that

$$\begin{array}{@{}rcl@{}} \frac{\text{d}}{\text{d}t}\left( \frac{\partial \textbf{R}}{\partial\dot{\textbf{U}}}\right) &=\textbf{0} , \end{array} $$

(95)

$$\begin{array}{@{}rcl@{}} \frac{\text{d}}{\text{d}t}\left( \frac{\partial \textbf{R}}{\partial \ddot{\textbf{U}}}\right) &=\textbf{0} . \end{array} $$

(96)

This means that ${\partial \textbf {R}}/{\partial \dot {\textbf {U}}}$ and ${\partial \textbf {R}}/{\partial \ddot {\textbf {U}}}$, which are typically interpreted as the damping and mass matrices, respectively, are constant.

Noting that Λ⁰ = 0 from (79c), we start the algorithm by solving (79b), i.e.,

$$ \frac{\partial \textbf{R}^{N}}{\partial\ddot{\textbf{U}}}^{\top} \dot{\boldsymbol{\Lambda}}^{0}=-\frac{\partial G^{N}}{\partial \dot{\textbf{U}}}^{\top} , $$

(97)

for $\dot {\boldsymbol {\Lambda }}^{0}$. Next, we obtain $\ddot {\boldsymbol {\Lambda }}^{0}$ from (79a), i.e.,

$$\begin{array}{@{}rcl@{}} \frac{\partial \textbf{R}^{N}}{\partial\ddot{\textbf{U}}}^{\top} \ddot{\boldsymbol{\Lambda}}^{0} &=& -\frac{\partial \textbf{R}^{N}}{\partial \dot{\textbf{U}}}^{\top} \dot{\boldsymbol{\Lambda}}^{0} - \frac{\partial G^{N}}{\partial \textbf{U}}^{\top} \\&&+\left( \frac{\partial^{2} G^{N}}{\partial \dot{\textbf{U}}\partial \textbf{U}} \dot{\textbf{U}}^{N}\right)^{\top} + \left( \frac{\partial^{2} G^{N}}{\partial \dot{\textbf{U}}^{2}} \ddot{\textbf{U}}^{N}\right)^{\top} . \end{array} $$

(98)

Notice that (97) and (98) do not use the tangent stiffness matrix of the primal analysis. Next we initialize DF/Dd_i from (48).

The time marching now commences for the remaining time steps t_n, i.e., for n = 1,2,...,N − 1 we solve

$$\begin{array}{@{}rcl@{}} {\textbf{K}^{N-n}}^{\top} \ddot{\boldsymbol{\Lambda}}^{n} &=&- \frac{\partial G^{N-n}}{\partial \textbf{U}}^{\top} +\left( \frac{\partial^{2} G^{N-n}}{\partial \dot{\textbf{U}}\partial \textbf{U}} \dot{\textbf{U}}^{N-n}\right)^{\top} \\&&+ \left( \frac{\partial^{2} G^{N-n}}{\partial \dot{\textbf{U}}^{2}} \ddot{\textbf{U}}^{N-n} \right)^{\top} \\&&- \frac{\partial \textbf{R}^{N-n}}{\partial \textbf{U}}^{\top} \left( \boldsymbol{\Lambda}^{n-1} +c \dot{\boldsymbol{\Lambda}}^{n-1} +d \ddot{\boldsymbol{\Lambda}}^{n-1} \right) \\&&-\frac{\partial \textbf{R}^{N-n}}{\partial \dot{\textbf{U}}}^{\top} \left( \dot{\boldsymbol{\Lambda}}^{n-1} +a \ddot{\boldsymbol{\Lambda}}^{n-1}\right) , \end{array} $$

(99)

for $\ddot {\boldsymbol {\Lambda }}^{n}$. Then, we update $\dot {\boldsymbol {\Lambda }}^{n}$ and Λⁿ with (93) and (94) and DF/Dd_i with (50).

Finally, we solve

$$\begin{array}{@{}rcl@{}} &&\left( \frac{\partial {\textbf{R}}^{0}}{\partial {\ddot{\textbf{U}}}}+b \frac{\partial {\textbf{R}}^{0}}{\partial {\dot{\textbf{U}}}} +e \frac{\partial {\textbf{R}}^{0}}{\partial \textbf{U}}\right)^{\top} \ddot{\boldsymbol{\Lambda}}^{N} =- \frac{\partial G^{0}}{\partial \textbf{U}}^{\top} \\ &&+\left( \frac{\partial^{2} G^{0}}{\partial \dot{\textbf{U}}\partial \textbf{U}} \dot{\textbf{U}}^{0}\right)^{\top} + \left( \frac{\partial^{2} G^{0}}{\partial \dot{\textbf{U}}^{2}} \ddot{\textbf{U}}^{0}\right)^{\top} \\&&- \frac{\partial {\textbf{R}}^{0}}{\partial \textbf{U}}^{\top} \left( \boldsymbol{\Lambda}^{N-1} +c \dot{\boldsymbol{\Lambda}}^{N-1} +d \ddot{\boldsymbol{\Lambda}}^{N-1} \right) \\&&-\frac{\partial {\textbf{R}}^{0}}{\partial \dot{\textbf{U}}}^{\top} \left( \dot{\boldsymbol{\Lambda}}^{N-1} +a \ddot{\boldsymbol{\Lambda}}^{N-1}\right) , \\ \end{array} $$

(100)

for $\ddot {\boldsymbol {\Lambda }}^{N}$, then obtain $\dot {\boldsymbol {\Lambda }}^{N}$ and Λ^N from (93) and (94), and update

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}} &\leftarrow& \frac{\mathrm{D} F}{\mathrm{D} d_{i}} +\mu_{0} \frac{\partial G^{0}}{\partial d_{i}} + \mu_{0} {\boldsymbol{\Lambda}^{N}}^{\top} \frac{\partial {\textbf{R}}^{0}}{\partial d_{i}} -\frac{\partial G^{0}}{\partial \dot{\textbf{U}}}\frac{\partial \textbf{U}^{0}}{\partial d_i} \\&&- {\boldsymbol{\Lambda}^{N}}^{\top} \left( \frac{\partial {\textbf{R}}^{0}}{\partial \dot{\textbf{U}}}\frac{\partial \textbf{U}^{0}}{\partial d_i} + \frac{\partial {\textbf{R}}^{0}}{\partial \ddot{\textbf{U}}} \frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i}\right) - \dot{\boldsymbol{\Lambda}}^{N\top} \frac{\partial {\textbf{R}}^{0}}{\partial \ddot{\textbf{U}}}\frac{\partial \textbf{U}^{0}}{\partial d_i} . \end{array} $$

(101)

Again, we note that (100) does not use the tangent stiffness matrix from the primal problem. This algorithm is described by inserting node C from Fig. 15 into the flowchart of Fig. 13.

For the semi-analytical method, we consider the further restriction that ∂R/∂U and $\partial \textbf {R} / \partial \dot {\textbf {U}}$ are symmetric. In this way, the term in the adjoint load of (98) can be approximated as

$$ \frac{\partial \textbf{R}^{N}}{\partial \dot{\textbf{U}}}^{\top} \dot{\boldsymbol{\Lambda}}^{0} \approx \frac{1}{\epsilon} \textbf{R}\left( \textbf{U}^{N}, \dot{\textbf{U}}^{N} +\epsilon \dot{\boldsymbol{\Lambda}}^{0}, \ddot{\textbf{U}}^{N}, \textbf{d} \right) , $$

(102)

and the terms in the adjoint load of (99) and (100) can be approximated as

$$\begin{array}{@{}rcl@{}} &&\frac{\partial \textbf{R}^{N-n}}{\partial \textbf{U}}^{\top} \left( \boldsymbol{\Lambda}^{n-1} +c \dot{\boldsymbol{\Lambda}}^{n-1} +d \ddot{\boldsymbol{\Lambda}}^{n-1} \right) \\&&+\frac{\partial \textbf{R}^{N-n}}{\partial \dot{\textbf{U}}}^{\top} \left( \dot{\boldsymbol{\Lambda}}^{n-1} +a \ddot{\boldsymbol{\Lambda}}^{n-1}\right) \\ &\approx&\frac{1}{\epsilon} \textbf{R}\left( \textbf{U}^{N-n}+\epsilon\left( \boldsymbol{\Lambda}^{n-1} +c \dot{\boldsymbol{\Lambda}}^{n-1} +d \ddot{\boldsymbol{\Lambda}}^{n-1} \right), \right. \\ &&\left.\vphantom{\frac{1}{\epsilon} \textbf{R}\left( \textbf{U}^{N-n}+\epsilon\left( \boldsymbol{\Lambda}^{n-1} +c \dot{\boldsymbol{\Lambda}}^{n-1} +d \ddot{\boldsymbol{\Lambda}}^{n-1} \right), \right)} \dot{\textbf{U}}^{N-n} +\epsilon \left( \dot{\boldsymbol{\Lambda}}^{n-1} +a \ddot{\boldsymbol{\Lambda}}^{n-1}\right), \ddot{\textbf{U}}^{N-n}, \textbf{d} \right) . \end{array} $$

(103)

Regarding DF/Dd_i of (48), (50), and (101), we can use the approximations

$$\begin{array}{@{}rcl@{}} &&\frac{\partial {\textbf{R}}^{0}}{\partial \dot{\textbf{U}}} \frac{\partial \textbf{U}^{0}}{\partial d_i} + \frac{\partial {\textbf{R}}^{0}}{\partial \ddot{\textbf{U}}} \frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i} \approx \\ &&\frac{1}{\epsilon} \textbf{R}\left( \textbf{U}^{0}, \dot{\textbf{U}}^{0}+\epsilon\frac{\partial \textbf{U}^{0}}{\partial d_i}, \ddot{\textbf{U}}^{0}+\epsilon\frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i}, \textbf{d} \right) , \end{array} $$

(104)

$$\begin{array}{@{}rcl@{}} \frac{\partial {\textbf{R}}^{0}}{\partial {\ddot{\textbf{U}}}}\frac{\partial \textbf{U}^{0}}{\partial d_i} &\approx \frac{1}{\epsilon} \textbf{R}\left( \textbf{U}^{0}, \dot{\textbf{U}}^{0}, \ddot{\textbf{U}}^{0}+\epsilon\frac{\partial \textbf{U}^{0}}{\partial d_i}, \textbf{d}\right), \end{array} $$

(105)

$$\begin{array}{@{}rcl@{}} \frac{\partial \textbf{R}^{n}}{{\partial d_{i}}} &\approx \frac{1}{\epsilon} \textbf{R} \left( \textbf{U}^{n}, \dot{\textbf{U}}{}^{n}, \ddot{\textbf{U}}^{n} , \textbf{d}+\epsilon \textbf{e}_i\right) . \end{array} $$

(106)

4.5 Adjoint method using discretize-then-differentiate

In this adjoint discretize-then-differentiate method, we first discretize the primal analysis and response function in time and then we differentiate for the sensitivity analysis. Thus, we incorporate (75a), (86), and (87) into (32) to obtain the equivalent sensitivity

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}}&=&\sum\limits_{n = 0}^{N} \mu_{n} \left( \frac{\partial G^{n}}{\partial \textbf{U}} \frac{\partial \textbf{U}^{n}}{\partial d_i} + \frac{\partial G^{n}}{\partial \dot{\textbf{U}}}\frac{\partial \dot{\textbf{U}}^{n}}{\partial d_{i}} + \frac{\partial G^{n}}{\partial d_{i}}\right) \\ && +\sum\limits_{n = 0}^{N}\boldsymbol{\Lambda}^{n\top} \left( \frac{\partial \textbf{R}^{N-n}}{\partial \ddot{\textbf{U}}}\frac{\partial \ddot{\textbf{U}}^{N-n}}{\partial d_{i}}+\frac{\partial \textbf{R}^{N-n}}{\partial \dot{\textbf{U}}} \frac{\partial \dot{\textbf{U}}^{N-n}}{\partial d_i} \right. \\ && +\left. \frac{\partial \textbf{R}^{N-n}}{\partial \textbf{U}} \frac{\partial \textbf{U}^{N-n}}{\partial d_i} + \frac{\partial \textbf{R}^{N-n}}{\partial d_{i}}\right) \\ && +\sum\limits_{n = 0}^{N-1}{\boldsymbol{\Phi}^{n}}^{\top} \left( \frac{\partial \dot{\textbf{U}}^{N-n}}{\partial d_i} -\frac{\partial \dot{\textbf{U}}^{N-n-1}}{\partial d_i} \right. \\ && -\left.a \frac{\partial \ddot{\textbf{U}}^{N-n-1}}{\partial d_i} -b \frac{\partial \ddot{\textbf{U}}^{N-n}}{\partial d_i} \right) \\ && +\sum\limits_{n = 0}^{N-1}{\boldsymbol{\Psi}^{n}}^{\top}\left( \frac{\partial \textbf{U}^{N-n}}{\partial d_i} - \frac{\partial \textbf{U}^{N-n-1}}{\partial d_i}-c \frac{\partial \dot{\textbf{U}}^{N-n-1}}{\partial d_i} \right. \\ &&-\left.d \frac{\partial \ddot{\textbf{U}}^{N-n-1}}{\partial d_i} -e \frac{\partial \ddot{\textbf{U}}^{N-n}}{\partial d_i}\right) , \end{array} $$

(107)

where Λⁿ, Φⁿ, and Ψⁿ are arbitrary adjoint vectors. Rearranging the above yields

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}}&=& {\sum}_{n = 0}^{N} \left( \mu_{N-n} \frac{\partial G^{N-n}}{\partial d_{i}} + \boldsymbol{\Lambda}^{n\top} \frac{\partial \textbf{R}^{N-n}}{\partial d_{i}} \right) \\ &&+\left( \mu_{0} \frac{\partial G^{0}}{\partial \textbf{U}} +{\boldsymbol{\Lambda}^{N}}^{\top}\frac{\partial {\textbf{R}}^{0}}{\partial \textbf{U}} -{\boldsymbol{\Psi}^{N-1}}^{\top} \right) \frac{\partial \textbf{U}^{0}}{\partial d_i} \\ &&+ \left( \mu_{0} \frac{\partial G^{0}}{\partial \dot{\textbf{U}}} +{\boldsymbol{\Lambda}^{N}}^{\top} \frac{\partial {\textbf{R}}^{0}}{\partial \dot{\textbf{U}}} -{\boldsymbol{\Phi}^{N-1}}^{\top} -c {\boldsymbol{\Psi}^{N-1}}^{\top} \right) \frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i} \\ &&+\left( {\boldsymbol{\Lambda}^{N}}^{\top} \frac{\partial {\textbf{R}}^{0}}{\partial \ddot{\textbf{U}}} -a {\boldsymbol{\Phi}^{N-1}}^{\top} -d {\boldsymbol{\Psi}^{N-1}}^{\top}\right) \frac{\partial \ddot{\textbf{U}}^{0}}{\partial d_{i}} \\ &&+\sum\limits_{n = 1}^{N-1} \left( \mu_{N-n} \frac{\partial G^{N-n}}{\partial \textbf{U}} +{{\boldsymbol{\Lambda}}^{n}}^{\top}\frac{\partial \textbf{R}^{N-n}}{\partial \textbf{U}} \right. \\ &&+\left.{\boldsymbol{\Psi}^{n}}^{\top} -{\boldsymbol{\Psi}^{n-1}}^{\top} \right) \frac{\partial \textbf{U}^{N-n}}{\partial d_i}\\ &&+ \sum\limits_{n = 1}^{N-1}\left( \mu_{N-n} \frac{\partial G^{N-n}}{\partial \dot{\textbf{U}}} +{\boldsymbol{\Lambda}^{n}}^{\top} \frac{\partial \textbf{R}^{N-n}}{\partial \dot{\textbf{U}}} \right. \\ &&+\left.{\boldsymbol{\Phi}^{n}}^{\top} -{\boldsymbol{\Phi}^{n-1}}^{\top} -c {\boldsymbol{\Psi}^{n-1}}^{\top}\right) \frac{\partial \dot{\textbf{U}}^{N-n}}{\partial d_i}\\ &&+ \sum\limits_{n = 1}^{N-1}\left( {\boldsymbol{\Lambda}^{n}}^{\top} \frac{\partial \textbf{R}^{N-n}}{\partial \ddot{\textbf{U}}} -b {\boldsymbol{\Phi}^{n}}^{\top} -a {\boldsymbol{\Phi}^{n-1}}^{\top} \right.\\ &&-\left.e {\boldsymbol{\Psi}^{n}}^{\top} -d {\boldsymbol{\Psi}^{n-1}}^{\top} \right) \frac{\partial\ddot{\textbf{U}}^{N-n}}{\partial d_{i}}\\ &&+\left( \mu_{N} \frac{\partial G^{N}}{\partial \textbf{U}} +{\boldsymbol{\Lambda}^{0}}^{\top}\frac{\partial \textbf{R}^{N}}{\partial\textbf{U}} +{\boldsymbol{\Psi}^{0}}^{\top} \right) \frac{\partial \textbf{U}^{N}}{\partial d_i}\\ &&+\left( \mu_{N} \frac{\partial G^{N}}{\partial \dot{\textbf{U}}} +{\boldsymbol{\Lambda}^{0}}^{\top} \frac{\partial \textbf{R}^{N}}{\partial \dot{\textbf{U}}} + {\boldsymbol{\Phi}^{0}}^{\top} \right) \frac{\partial \dot{\textbf{U}}^{N}}{\partial d_i} \\ &&+\left( {\boldsymbol{\Lambda}^{0}}^{\top} \frac{\partial \textbf{R}^{N}}{\partial\ddot{\textbf{U}}} - b {\boldsymbol{\Phi}^{0}}^{\top} - e {\boldsymbol{\Psi}^{0}}^{\top}\right) \frac{\partial\ddot{\textbf{U}}^{N}}{\partial d_{i}} . \end{array} $$

(108)

To annihilate $\partial \ddot {\textbf {U}}^{N} / \partial d_{i}$, $\partial \dot {\textbf {U}}^{N} / \partial d_{i}$ and ∂U^N/∂d_i, we first solve the adjoint problem

$$ {\textbf{K}^{N}}^{\top} \boldsymbol{\Lambda}^{0}=-b \mu_{N} \frac{\partial G^{N}}{\partial \dot{\textbf{U}}}^{\top} -e \mu_{N} \frac{\partial G^{N}}{\partial \textbf{U}}^{\top} , $$

(109)

for Λ⁰, then we evaluate Φ⁰ from

$$ {\boldsymbol{\Phi}^{0}} = -\mu_{N} \frac{\partial G^{N}}{\partial \dot{\textbf{U}}}^{\top} -\frac{\partial \textbf{R}^{N}}{\partial \dot{\textbf{U}}}^{\top} {\boldsymbol{\Lambda}^{0}} , $$

(110)

and Ψ⁰ from either of the following options

$$\begin{array}{@{}rcl@{}} {\boldsymbol{\Psi}^{0}} &=&-\mu_{N} \frac{\partial G^{N}}{\partial \textbf{U}}^{\top} -\frac{\partial \textbf{R}^{N}}{\partial \textbf{U}}^{\top} \boldsymbol{\Lambda}^{0} , \end{array} $$

(111)

$$\begin{array}{@{}rcl@{}} &=& \frac{1}{e } \left( \frac{\partial \textbf{R}^{N}}{\partial\ddot{\textbf{U}}}^{\top} {\boldsymbol{\Lambda}^{0}} -b {\boldsymbol{\Phi}^{0}}\right) , \end{array} $$

(112)

where (112) holds for β≠ 0. We next initialize the sensitivity from (62).

To annihilate $\partial \ddot {\textbf {U}}^{N-n} / \partial d_{i}$, $\partial \dot {\textbf {U}}^{N-n} / \partial d_{i}$ and $\partial \textbf {U}^{N-n} / \partial d_{i}$, we march in time t_n for n = 1,2,...,N − 1 by solving

$$\begin{array}{@{}rcl@{}} {\textbf{K}^{N-n}}^{\top} \boldsymbol{\Lambda}^{n}&=&-b \mu_{N-n} \frac{\partial G^{N-n}}{\partial \dot{\textbf{U}}}^{\top} -e \mu_{N-n} \frac{\partial G^{N-n}}{\partial \textbf{U}}^{\top} \\&&+ {\Delta} t {\boldsymbol{\Phi}^{n-1}} +\left( \gamma+\frac{1}{2}\right){\Delta} t^{2} {\boldsymbol{\Psi}^{n-1}} , \end{array} $$

(113)

for Λⁿ, updating Φⁿ from

$$ {\boldsymbol{\Phi}^{n}} = {\boldsymbol{\Phi}^{n-1}} +c {\boldsymbol{\Psi}^{n-1}} -\mu_{N-n} \frac{\partial G^{N-n}}{\partial \dot{\textbf{U}}}^{\top} -\frac{\partial \textbf{R}^{N-n}}{\partial \dot{\textbf{U}}}^{\top} {\boldsymbol{\Lambda}^{n}} , $$

(114)

computing Ψⁿ by either option

$$\begin{array}{@{}rcl@{}} {\boldsymbol{\Psi}^{n}} &=& {\boldsymbol{\Psi}^{n-1}} -\mu_{N-n} \frac{\partial G^{N-n}}{\partial\textbf{U}}^{\top} -\frac{\partial \textbf{R}^{N-n}}{\partial \textbf{U}}^{\top}{{\boldsymbol{\Lambda}}^{n}} , \end{array} $$

(115)

$$\begin{array}{@{}rcl@{}} &=& \frac{1}{e } \left( -d {\boldsymbol{\Psi}^{n-1}} +\frac{\partial \textbf{R}^{N-n}}{\partial \ddot{\textbf{U}}}^{\top} \boldsymbol{\Lambda}^{n} -b {\boldsymbol{\Phi}^{n}} -a {\boldsymbol{\Phi}^{n-1}} \right) \\ \end{array} $$

(116)

and updating DF/Dd_i from (66).

Finally, to annihilate $\partial \ddot {\textbf {U}}^{0} / \partial d_{i}$, we solve

$$ {\textbf{K}^{0}}^{\top} {\boldsymbol{\Lambda}^{N}} =a {\boldsymbol{\Phi}^{N-1}} +d {\boldsymbol{\Psi}^{N-1}} , $$

(117)

for Λ^N and we update

$$\begin{array}{@{}rcl@{}} \frac{\mathrm{D} F}{\mathrm{D} d_{i}} &\leftarrow& \frac{\mathrm{D} F}{\mathrm{D} d_{i}} + \mu_{0} \frac{\partial G^{0}}{\partial d_{i}} + {{\boldsymbol{\Lambda}}^{N}}^{\top} \frac{\partial {\textbf{R}}^{0}}{\partial d_{i}} \\ &&+\left( \mu_{0} \frac{\partial G^{0}}{\partial \textbf{U}} -{\boldsymbol{\Psi}^{N-1}}^{\top} \right) \frac{\partial \textbf{U}^{0}}{\partial d_i} \\ &&+ \left( \mu_{0} \frac{\partial G^{0}}{\partial \dot{\textbf{U}}} -{\boldsymbol{\Phi}^{N-1}}^{\top} -c {\boldsymbol{\Psi}^{N-1}}^{\top} \right) \frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i} \\ &&+{\boldsymbol{\Lambda}^{N}}^{\top}\left( \frac{\partial {\textbf{R}}^{0}}{\partial \textbf{U}}\frac{\partial \textbf{U}^{0}}{\partial d_i} + \frac{\partial {\textbf{R}}^{0}}{\partial \dot{\textbf{U}}}\frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i} \right) . \end{array} $$

(118)

This algorithm is obtained by inserting node C from Fig. 16 into the primal analysis flowchart of Fig. 13.
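The backward march of (109) to (118) can be illustrated with a Python sketch for one unit mass with residual R = Ü + k_c U̇ + U + k_d U³ and G = U² + U̇². The one-mass model, the function names, the step count, and the Newton tolerance are assumptions made for this sketch; the initial conditions are taken to be independent of k_d, so only the Λ terms of (118) contribute at the last step. The G terms of the interior adjoint solves are taken at index N − n, which is what eliminating Φⁿ and Ψⁿ with (114) and (115) gives.

```python
def primal(k_d, k_c, dt, n_steps, gamma, beta, u0=1.0, v0=0.0):
    """Newmark/Newton march; returns the list of (u, v, acc) states."""
    a, b = (1.0 - gamma) * dt, gamma * dt
    c, d, e = dt, (0.5 - beta) * dt**2, beta * dt**2
    states = [(u0, v0, -(k_c * v0 + u0 + k_d * u0**3))]
    for _step in range(n_steps):
        u, v, acc = states[-1]
        u_pred, v_pred = u + c * v + d * acc, v + a * acc
        for _it in range(50):
            u_new, v_new = u_pred + e * acc, v_pred + b * acc
            r = acc + k_c * v_new + u_new + k_d * u_new**3
            if abs(r) < 1e-13:
                break
            acc -= r / (1.0 + b * k_c + e * (1.0 + 3.0 * k_d * u_new**2))
        states.append((u_pred + e * acc, v_pred + b * acc, acc))
    return states


def response(states, dt):
    """Trapezoidal rule for F = int (u^2 + v^2) dt."""
    g = [u * u + v * v for u, v, _ in states]
    return dt * (sum(g) - 0.5 * (g[0] + g[-1]))


def adjoint_sensitivity(k_d, k_c=0.1, t_f=10.0, n_steps=400,
                        gamma=0.5, beta=0.25):
    """Return (F, DF/Dk_d) using the discretize-then-differentiate adjoint."""
    dt = t_f / n_steps
    a, b = (1.0 - gamma) * dt, gamma * dt
    c, d, e = dt, (0.5 - beta) * dt**2, beta * dt**2
    states = primal(k_d, k_c, dt, n_steps, gamma, beta)
    phi = psi = 0.0  # zero history makes the n = 0 solve coincide with (109)
    dF = 0.0
    for n in range(n_steps):                  # primal index N - n
        u, v, _ = states[n_steps - n]
        mu = 0.5 * dt if n == 0 else dt       # trapezoidal weight mu_{N-n}
        r_u = 1.0 + 3.0 * k_d * u * u         # dR/dU
        k_tan = 1.0 + b * k_c + e * r_u       # scalar K^{N-n}
        lam = (-b * mu * 2.0 * v - e * mu * 2.0 * u
               + dt * phi + (gamma + 0.5) * dt**2 * psi) / k_tan
        # (114) and (115); the right-hand sides use the old phi and psi.
        phi, psi = (phi + c * psi - mu * 2.0 * v - k_c * lam,
                    psi - mu * 2.0 * u - r_u * lam)
        dF += lam * u**3                      # Lambda^n * dR^{N-n}/dk_d
    # (117) with K^0 = dR/dacc = 1, then the Lambda^N term of (118).
    lam = a * phi + d * psi
    dF += lam * states[0][0]**3
    return response(states, dt), dF
```

Since this adjoint is derived from the discretized equations, its result should match a central finite difference of the discrete response up to the finite difference error.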

For the semi-analytical implementation, we require $\partial \textbf {R}^{n} / \partial \dot {\textbf {U}}$ to be symmetric. The adjoint load terms of (110) and (114) are thus approximated as

$$ \frac{\partial \textbf{R}^{N-n}}{\partial \dot{\textbf{U}}}^{\top} {\boldsymbol{\Lambda}^{n}} \approx \frac{1}{\epsilon} \textbf{R} \left( \textbf{U}^{N-n}, \dot{\textbf{U}}^{N-n}+\epsilon {\boldsymbol{\Lambda}}^{n}, \ddot{\textbf{U}}^{N-n}, \textbf{d}\right) . $$

(119)

The first Ψⁿ option is restricted to symmetric ∂R/∂U, whereby the terms in (111) and (115) are approximated as

$$ \frac{\partial \textbf{R}^{N-n}}{\partial \textbf{U}}^{\top}{{\boldsymbol{\Lambda}}^{n}} \approx \frac{1}{\epsilon} \textbf{R} \left( \textbf{U}^{N-n}+\epsilon \boldsymbol{\Lambda}^{n}, \dot{\textbf{U}}^{N-n}, \ddot{\textbf{U}}^{N-n}, \textbf{d}\right) . $$

(120)

The second Ψⁿ option considers the more common restriction for which $\partial \textbf {R} / \partial \ddot {\textbf {U}}$ is symmetric and β ≠ 0, whence the terms in (112) and (116) are approximated as

$$ \frac{\partial \textbf{R}^{N-n}}{\partial \ddot{\textbf{U}}}^{\top} \boldsymbol{\Lambda}^{n} \approx \frac{1}{\epsilon} \textbf{R} \left( \textbf{U}^{N-n}, \dot{\textbf{U}}^{N-n}, \ddot{\textbf{U}}^{N-n}+\epsilon \boldsymbol{\Lambda}^{n}, \textbf{d}\right) . $$

(121)
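The role of symmetry in approximations such as (120) can be illustrated with a Python sketch for a two-mass chain. The layout (wall, spring and damper, mass 1, spring and damper, mass 2), the function names, and the sample values used to exercise it are assumptions made for this sketch. A perturbed residual yields (∂R/∂U)Λ, which equals the required (∂R/∂U)ᵀΛ only because ∂R/∂U is symmetric here.

```python
def spring(x, k_d):
    """Nonlinear spring force x + k_d*x**3."""
    return x + k_d * x**3


def residual(u, v, acc, k_d, k_c=0.1):
    """Residual of the assumed two-mass chain with unit masses."""
    rel_u, rel_v = u[1] - u[0], v[1] - v[0]
    r1 = acc[0] + spring(u[0], k_d) - spring(rel_u, k_d) + k_c * (v[0] - rel_v)
    r2 = acc[1] + spring(rel_u, k_d) + k_c * rel_v
    return [r1, r2]


def stiffness_transpose_times(u, lam, k_d):
    """Analytical (dR/dU)^T lam for the assumed chain."""
    s1 = 1.0 + 3.0 * k_d * u[0]**2            # tangent of spring 1
    s2 = 1.0 + 3.0 * k_d * (u[1] - u[0])**2   # tangent of spring 2
    jac = [[s1 + s2, -s2], [-s2, s2]]         # symmetric dR/dU
    return [jac[0][0] * lam[0] + jac[1][0] * lam[1],
            jac[0][1] * lam[0] + jac[1][1] * lam[1]]


def adjoint_load_semi_analytical(u, v, acc, lam, k_d, eps=1e-6):
    """Right-hand side of (120): R(U + eps*lam, ...) / eps.

    Accurate only if R(U, ...) is close to zero and dR/dU is symmetric.
    """
    u_pert = [u[i] + eps * lam[i] for i in range(2)]
    return [r / eps for r in residual(u_pert, v, acc, k_d)]
```

For an asymmetric ∂R/∂U the same evaluation would still return (∂R/∂U)Λ, which is why the symmetry restriction is needed.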

Finally, to compute DF/Dd_i in (118), we use the following approximation

$$\begin{array}{@{}rcl@{}} &&\frac{\partial {\textbf{R}}^{0}}{\partial \textbf{U}}\frac{\partial \textbf{U}^{0}}{\partial d_i} + \frac{\partial {\textbf{R}}^{0}}{\partial \dot{\textbf{U}}}\frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i} \\ &\approx&\frac{1}{\epsilon} \textbf{R} \left( \textbf{U}^{0} +\epsilon \frac{\partial \textbf{U}^{0}}{\partial d_i}, \dot{\textbf{U}}^{0} +\epsilon {\frac{\partial \dot{\textbf{U}}^{0}}{\partial d_i}}, \ddot{\textbf{U}}^{0}, \textbf{d}\right) . \end{array} $$

(122)

The derivative ∂Rⁿ/∂d_i of (62), (66), and (118) is approximated from (106).

4.6 Dynamic example

Consider two identical masses m₁ = m₂ = 1 that are free to slide over a frictionless horizontal surface. The masses are connected by identical nonlinear springs and identical linear dampers, as seen in Fig. 17. The internal force generated by each spring is f_e = x + k_d x³, where x is the relative displacement of the nodes connected by the spring and the parameter k_d = 1 is our design variable. The dampers generate the force $f_{c}=k_{c} \dot {x}$, where k_c = 0.1. No external force acts on the two mass-spring-damper system, but it is subjected to the initial conditions x₁(0) = 0, x₂(0) = 1, $\dot {x_{1}}(0)= 0$ and $\dot {x_{2}}(0)= 0$. The time domain is t ∈ [0,10], the Newton-Raphson tolerance is 𝜖_R < 10^−15 and the Newmark-beta parameters are γ = 1/2 and β = 1/4.

To illustrate the various sensitivity analyses, the response function is

$$ F={\int}_{0}^{10} \left( {x_{1}^{2}}+{x_{2}^{2}}+ \dot{x}_{1}^{2}+ \dot{x}_{2}^{2} \right) \mathrm{d}t , $$

(123)

where the numerical integration is done by the trapezoidal rule. Table 3 shows the computed sensitivity values for the different methods using the perturbation size 𝜖 = 10^−6. The response function converges as the number of time steps increases; thus, the sensitivities for N = 100 differ from those for N = 1000. For N = 100, the sensitivities obtained by the adjoint differentiate-then-discretize method do not coincide with the others due to the consistency error (Gunzburger 2003; Jensen et al. 2014). However, this consistency error practically vanishes for N = 1000.

Table 3 Sensitivities for the two mass-spring-damper problem with 𝜖 = 10^−6.


To examine the consistency of the methods, we show e_f for the N = 1000 case and different perturbation sizes, cf. Fig. 18. As expected, the finite differences show truncation and round-off errors for large and small perturbations, respectively, and the adjoint differentiate-then-discretize method shows a consistency error. Figure 19 illustrates the error e_f for 𝜖 = 10^−6 and different time steps, where it is seen that the consistency error of the adjoint differentiate-then-discretize method reduces as the number of time steps increases.

To examine the accuracy of the semi-analytical sensitivities, we compute the error e_s for the N = 1000 case, cf. Fig. 20. Again, as expected, the semi-analytical sensitivities exhibit truncation and round-off errors for large and small perturbation sizes, respectively.

Figure 21 shows that the error e_s for 𝜖 = 10^−6 is fairly independent of the time step size.
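The primal analysis and the finite difference sensitivity of this example can be sketched in Python. Fig. 17 is not reproduced here, so the layout (wall, spring and damper, mass 1, spring and damper, mass 2) is an assumption, as are the function names and the Newton tolerance of 1e-13, which is looser than the value quoted in the text. No sensitivity values are asserted, and the sketch is not expected to reproduce Table 3.

```python
def spring(x, k_d):
    """Spring force f_e = x + k_d*x**3 from the example."""
    return x + k_d * x**3


def residual_and_tangent(u, v, acc, k_d, k_c, b, e):
    """Residual of the assumed chain and its symmetric 2x2 tangent."""
    rel_u, rel_v = u[1] - u[0], v[1] - v[0]
    r = [acc[0] + spring(u[0], k_d) - spring(rel_u, k_d) + k_c * (v[0] - rel_v),
         acc[1] + spring(rel_u, k_d) + k_c * rel_v]
    s1 = 1.0 + 3.0 * k_d * u[0]**2    # tangent stiffness of spring 1
    s2 = 1.0 + 3.0 * k_d * rel_u**2   # tangent stiffness of spring 2
    # K = dR/dacc + b*dR/dv + e*dR/du, stored as (k11, k12, k22).
    k11 = 1.0 + 2.0 * b * k_c + e * (s1 + s2)
    k12 = -b * k_c - e * s2
    k22 = 1.0 + b * k_c + e * s2
    return r, (k11, k12, k22)


def response(k_d, n_steps, k_c=0.1, t_f=10.0, gamma=0.5, beta=0.25, tol=1e-13):
    """Newmark/Newton march and trapezoidal evaluation of F in (123)."""
    dt = t_f / n_steps
    a, b = (1.0 - gamma) * dt, gamma * dt
    c, d, e = dt, (0.5 - beta) * dt**2, beta * dt**2
    u, v = [0.0, 1.0], [0.0, 0.0]
    r, _ = residual_and_tangent(u, v, [0.0, 0.0], k_d, k_c, b, e)
    acc = [-r[0], -r[1]]              # R^0 is linear in acc with unit masses
    g_prev = u[0]**2 + u[1]**2 + v[0]**2 + v[1]**2
    total = 0.0
    for _step in range(n_steps):
        u_pred = [u[i] + c * v[i] + d * acc[i] for i in range(2)]
        v_pred = [v[i] + a * acc[i] for i in range(2)]
        for _it in range(50):
            u = [u_pred[i] + e * acc[i] for i in range(2)]
            v = [v_pred[i] + b * acc[i] for i in range(2)]
            r, (k11, k12, k22) = residual_and_tangent(u, v, acc, k_d, k_c, b, e)
            if max(abs(r[0]), abs(r[1])) < tol:
                break
            det = k11 * k22 - k12 * k12
            acc = [acc[0] - (k22 * r[0] - k12 * r[1]) / det,
                   acc[1] - (k11 * r[1] - k12 * r[0]) / det]
        g = u[0]**2 + u[1]**2 + v[0]**2 + v[1]**2
        total += 0.5 * dt * (g_prev + g)
        g_prev = g
    return total


def finite_difference_sensitivity(k_d, n_steps, h=1e-5):
    """Central difference of F with respect to k_d (two re-analyses)."""
    return (response(k_d + h, n_steps) - response(k_d - h, n_steps)) / (2.0 * h)
```

Calling `response(1.0, 100)` and `response(1.0, 1000)` gives the discrete response for the two time step counts discussed above, and `finite_difference_sensitivity` can serve as a reference when testing an analytical or semi-analytical implementation.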

5 Conclusions

Implementing analytical sensitivity analyses requires detailed knowledge of the analysis program and can be error-prone and time-consuming. Fortunately, these drawbacks may be reduced by adopting the semi-analytical method, where terms in the pseudo or adjoint loads and also in the sensitivities are approximated by finite differences. In this way, we are able to compute these complicated terms using the subroutines that are used for the solution of the primal problem, while maintaining the efficiency of the analytical methods. That said, the accuracy of the semi-analytical sensitivities is susceptible to truncation and round-off errors, and to additional errors if the convergence tolerance of the primal analysis is not sufficiently small.
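As a minimal sketch of this idea, consider a steady-state analogue of the spring system: the two nonlinear springs in an assumed wall-m₁-m₂ chain under a static load. The pseudo load ∂R/∂k_d is obtained by calling the residual routine twice, and the converged tangent matrix is reused for the sensitivity solve. The layout, the load, and the names `residual`, `tangent`, `solve_primal`, `semi_analytical_direct`, and `analytical_direct` are assumptions made for illustration.

```python
import numpy as np


def residual(u, k_d, load):
    """R(U, d): assembled spring forces minus the applied load (assumed chain layout)."""
    d1, d2 = u[0], u[1] - u[0]
    f1, f2 = d1 + k_d * d1**3, d2 + k_d * d2**3
    return np.array([f1 - f2, f2]) - load


def tangent(u, k_d):
    """Tangent stiffness dR/dU."""
    d1, d2 = u[0], u[1] - u[0]
    k1, k2 = 1.0 + 3.0 * k_d * d1**2, 1.0 + 3.0 * k_d * d2**2
    return np.array([[k1 + k2, -k2], [-k2, k2]])


def solve_primal(k_d, load, tol=1e-12, max_iter=50):
    """Newton-Raphson solution of R(U, d) = 0 starting from U = 0."""
    u = np.zeros(2)
    for _ in range(max_iter):
        R = residual(u, k_d, load)
        if np.linalg.norm(R) < tol:
            break
        u = u - np.linalg.solve(tangent(u, k_d), R)
    return u


def semi_analytical_direct(k_d, load, eps=1e-6):
    """dU/dk_d with the pseudo load approximated by a forward difference."""
    u = solve_primal(k_d, load)
    dR_dd = (residual(u, k_d + eps, load) - residual(u, k_d, load)) / eps
    return u, np.linalg.solve(tangent(u, k_d), -dR_dd)   # tangent is reused


def analytical_direct(k_d, load):
    """dU/dk_d with the hand-derived pseudo load, for comparison."""
    u = solve_primal(k_d, load)
    d1, d2 = u[0], u[1] - u[0]
    dR_dd = np.array([d1**3 - d2**3, d2**3])
    return np.linalg.solve(tangent(u, k_d), -dR_dd)
```

Because the residual is linear in k_d in this toy problem, the forward difference carries no truncation error and only round-off remains. A design variable that enters the residual nonlinearly would expose both error sources.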

In transient and dynamic problems, the choice of semi-analytical sensitivity analysis approach affects both the restrictive assumptions and the accuracy. In particular, expressions for the adjoint differentiate-then-discretize and discretize-then-differentiate approaches differ because the differentiation and discretization steps do not commute. The differentiate-then-discretize approach requires some terms to be constant, e.g., the mass matrix, in order to reuse the tangent stiffness matrix from the primal analysis; however, the first and last tangent stiffness matrices are not reused. This is not the case for the direct and the adjoint discretize-then-differentiate methods, where the tangent stiffness matrix is reused for all time steps. Furthermore, the adjoint differentiate-then-discretize approach yields consistency errors, although these decrease as the time step size decreases.
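The consistency error can be seen on a toy problem that is much simpler than the examples in this paper: the scalar linear equation du/dt = −d·u with u(0) = 1, integrated by backward Euler, with F = ∫u² dt evaluated by the trapezoidal rule. The sketch below is an illustration under these assumptions, and all function names are hypothetical. The discretize-then-differentiate adjoint reproduces the derivative of the discrete response, whereas the differentiate-then-discretize adjoint (here with the adjoint equation also integrated by backward Euler) differs from it by a term that shrinks with the time step.

```python
import numpy as np


def primal(d, n, t_end=1.0):
    """Backward Euler for du/dt = -d*u, u(0) = 1; returns u_0..u_n and dt."""
    dt = t_end / n
    u = np.empty(n + 1)
    u[0] = 1.0
    for k in range(1, n + 1):
        u[k] = u[k - 1] / (1.0 + d * dt)
    return u, dt


def trapezoid_weights(n, dt):
    w = np.full(n + 1, dt)
    w[0] = w[-1] = 0.5 * dt
    return w


def response(d, n):
    """Discrete response F_h = sum_k w_k u_k^2."""
    u, dt = primal(d, n)
    return float(trapezoid_weights(n, dt) @ u**2)


def adjoint_discretize_then_differentiate(d, n):
    u, dt = primal(d, n)
    w = trapezoid_weights(n, dt)
    lam = np.zeros(n + 2)                      # lam[n + 1] = 0 closes the recursion
    for k in range(n, 0, -1):
        # adjoint of the discrete residual R_k = (1 + d*dt) u_k - u_{k-1}
        lam[k] = (2.0 * w[k] * u[k] + lam[k + 1]) / (1.0 + d * dt)
    return float(-dt * np.sum(lam[1:n + 1] * u[1:]))


def adjoint_differentiate_then_discretize(d, n):
    u, dt = primal(d, n)
    mu = np.zeros(n + 1)                       # terminal condition mu(T) = 0
    for k in range(n, 0, -1):
        # backward Euler, marching backward in time, on d(mu)/dt = d*mu - 2u
        mu[k - 1] = (mu[k] + 2.0 * dt * u[k - 1]) / (1.0 + d * dt)
    return float(-(trapezoid_weights(n, dt) @ (mu * u)))   # dF/dd = -int(mu u) dt


def consistency_error(d, n):
    return abs(adjoint_differentiate_then_discretize(d, n)
               - adjoint_discretize_then_differentiate(d, n))
```

Comparing `consistency_error(1.0, 100)` with `consistency_error(1.0, 1000)` shows the gap shrinking roughly in proportion to the time step for this first-order scheme.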

In most cases, the semi-analytical adjoint approaches for the nonlinear transient and nonlinear dynamic systems require symmetry of ∂Rⁿ/∂U, ${\partial \textbf {R}^{n}}/{\partial \dot {\textbf {U}}}$, and/or $\partial \textbf {R}^{n}/ \partial \ddot {\textbf {U}}$. This may be problematic, as ∂Rⁿ/∂U is usually asymmetric in nonlinear problems. Fortunately, if we do not use an explicit method, the semi-analytical discretize-then-differentiate adjoint method can accommodate asymmetric ∂Rⁿ/∂U. A summary of these restrictions is presented in Tables 4 and 5. Example problems are provided to show the efficiency and errors associated with the various methods for nonlinear transient and nonlinear dynamic problems.

Table 4 Restrictions for semi-analytical adjoint methods for transient problems

Table 5 Restrictions for semi-analytical adjoint methods for dynamic problems

Notes

For α = 0, 1/2, or 1, we recover the forward Euler, Crank-Nicolson, and backward Euler strategies respectively.
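As a small illustration of this note, the sketch below applies one common form of the α-weighted update, u_new = u + Δt[(1 − α) f(u) + α f(u_new)], to the scalar test equation du/dt = −λu. The exact form used in the paper's discretization section is not reproduced here, so this form and the names `alpha_step` and `integrate` are assumptions.

```python
def alpha_step(u, lam, dt, alpha):
    """One alpha-weighted step for du/dt = -lam*u, solved in closed form."""
    return u * (1.0 - (1.0 - alpha) * lam * dt) / (1.0 + alpha * lam * dt)


def integrate(alpha, lam=1.0, t_end=1.0, n_steps=10):
    """March from u(0) = 1 to t_end with a fixed step size."""
    u, dt = 1.0, t_end / n_steps
    for _ in range(n_steps):
        u = alpha_step(u, lam, dt, alpha)
    return u
```

With α = 0 the step reduces to u(1 − λΔt) (forward Euler), with α = 1 to u/(1 + λΔt) (backward Euler), and with α = 1/2 to the Crank-Nicolson update.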

References

Adelman HM, Haftka RT (1986) Sensitivity analysis of discrete structural systems. AIAA J 24(5):823–832
Barthelemy B, Haftka R (1990) Accuracy analysis of the semi-analytical method for shape sensitivity calculation. Mech Struct Mach 18(3):407–432
Barthelemy B, Chon C, Haftka R (1988) Accuracy problems associated with semi-analytical derivatives of static response. Finite Elem Anal Des 4(3):249–265
Bernard J, Kwon S, Wilson J (1993) Differentiation of mass and stiffness matrices for high order sensitivity calculations in finite element-based equilibrium problems. J Mech Des 115(4):829–832
Bestle D, Seybold J (1992) Sensitivity analysis of constrained multibody systems. Arch Appl Mech 62(3):181–190
de Boer H, van Keulen F (2000) Refined semi-analytical design sensitivities. Int J Solids Struct 37(46-47):6961–6980
de Boer H, van Keulen F, Vervenne K (2002) Refined second order semi-analytical design sensitivities. Int J Numer Methods Eng 55(9):1033–1051
Botkin M (1982) Shape optimization of plate and shell structures. AIAA J 20(2):268–273
Brüls O, Eberhard P (2008) Sensitivity analysis for dynamic mechanical systems with finite rotations. Int J Numer Methods Eng 74(13):1897–1927
Camarda C, Adelman H (1984) Static and dynamic structural-sensitivity derivative calculations in the finite-element-based engineering analysis language (EAL) system. NASA TM-85743
Chen B, Gu Y, Zhang H, Zhao G (2003) Structural design optimization on thermally induced vibration. Int J Numer Methods Eng 58(8):1187–1212
Cheng G, Liu Y (1987) A new computation scheme for sensitivity analysis. Eng Optim 12(3):219–234
Cheng G, Olhoff N (1993) Rigid body motion test against error in semi-analytical sensitivity analysis. Comput Struct 46(3):515–527
Cheng G, Gu Y, Zhou Y (1989) Accuracy of semi-analytic sensitivity analysis. Finite Elem Anal Des 6(2):113–128
Deng Y, Liu Z, Zhang P, Liu Y, Wu Y (2011) Topology optimization of unsteady incompressible Navier–Stokes flows. J Comput Phys 230(17):6688–6708
Esping B (1984) Minimum weight design of membrane structures using eight node isoparametric elements and numerical derivatives. Comput Struct 19(4):591–604
Fenyes P, Lust R (1991) Error analysis of semianalytic displacement derivatives for shape and sizing variables. AIAA J 29(2):271–279
Gallagher R, Zienkiewicz O (1973) Optimum structural design: Theory and applications. Wiley, New York
Greene W, Haftka R (1991) Computational aspects of sensitivity calculations in linear transient structural analysis. Struct Optim 3(3):176–201
Gu Y, Grandhi R (1998) Sensitivity analysis and optimization of heat transfer and thermal-structural designs. In: 7th AIAA/USAF/NASA/ISSMO Symposium on Multidisciplinary Analysis and Optimization, pp 4746
Gu Y, Chen B, Zhang H, Grandhi R (2002) A sensitivity analysis method for linear and nonlinear transient heat conduction with precise time integration. Struct Multidiscip Optim 24(1):23–37
Gunzburger MD (2003) Perspectives in flow control and optimization. Advances in Design and Control, Society for Industrial and Applied Mathematics
Haftka R (1993) Semi-analytical static nonlinear structural sensitivity analysis. AIAA J 31(7):1307–1312
Haftka R, Adelman H (1989) Recent developments in structural sensitivity analysis. Struct Optim 1(3):137–151
Haftka RT, Gürdal Z (2012) Elements of structural optimization, vol 11. Springer Science & Business Media, Berlin
Haug EJ (1987) Design sensitivity analysis of dynamic systems. In: Computer aided optimal design: Structural and Mechanical Systems. Springer, Berlin, pp 705–755
Hooijkamp EC, van Keulen F (2018) Topology optimization for linear thermo-mechanical transient problems: Modal reduction and adjoint sensitivities. Int J Numer Methods Eng 113(8):1230–1257
Jensen J, Nakshatrala P, Tortorelli D (2014) On the consistency of adjoint sensitivity analysis for structural optimization of linear dynamic problems. Struct Multidiscip Optim 49(5):831–837
van Keulen F, Haftka R, Kim R (2005) Review of options for structural design sensitivity analysis. part 1: Linear systems. Comput Methods Appl Mech Eng 194(30-33):3213–3243. Structural and Design Optimization
Kiendl J, Schmidt R, Wüchner R, Bletzinger KU (2014) Isogeometric shape optimization of shells using semi-analytical sensitivity analysis and sensitivity weighting. Comput Methods Appl Mech Eng 274:148–167
Kramer JL, Stockman NO (1963) Effect of variable thermal properties on one-dimensional heat transfer in radiating fins. Technical report, NASA TN D-1878
Kreissl S, Pingen G, Maute K (2011) Topology optimization for unsteady flow. Int J Numer Methods Eng 87(13):1229–1253
Meric R (1988) Shape design sensitivity analysis of dynamic structures. AIAA J 26(2):206–212
Michaleris P, Tortorelli DA, Vidal CA (1994) Tangent operators and design sensitivity formulations for transient non-linear coupled problems with applications to elastoplasticity. Int J Numer Methods Eng 37(14):2471–2499
Mlejnek H (1992) Accuracy of semi-analytical sensitivities and its improvement by the natural method. Struct Optim 4(2):128–131
Mróz Z, Haftka R (1994) Design sensitivity analysis of non-linear structures in regular and critical states. Int J Solids Struct 31(15):2071–2098
Olhoff N, Rasmussen J (1991) Study of inaccuracy in semi-analytical sensitivity analysis: a model problem. Struct Optim 3(4):203–213
Olhoff N, Rasmussen J, Lund E (1993) A method of exact numerical differentiation for error elimination in finite-element-based semi-analytical shape sensitivity analyses. Mech Struct Mach 21(1):1–66
Oral S (1996) An improved semianalytical method for sensitivity analysis. Struct Optim 11(1-2):67–69
Pedersen P, Cheng G, Rasmussen J (1989) On accuracy problems for semi-analytical sensitivity analyses. Mech Struct Mach 17(3):373–384
Ray D, Pister KS, Polak E (1978) Sensitivity analysis for hysteretic dynamic systems: theory and applications. Comput Methods Appl Mech Eng 14(2):179–208
Tortorelli D, Michaleris P (1994) Design sensitivity analysis: Overview and review. Inverse Probl Eng 1(1):71–105
Tortorelli DA, Haber RB, Lu SCY (1991) Adjoint sensitivity analysis for nonlinear dynamic thermoelastic systems. AIAA J 29(2):253–263
Tromme E, Tortorelli D, Brüls O, Duysinx P (2015) Structural optimization of multibody system components described using level set techniques. Struct Multidiscip Optim 52(5):959–971
Van Keulen F, De Boer H (1998) Rigorous improvement of semi-analytical design sensitivities by exact differentiation of rigid body motions. Int J Numer Methods Eng 42(1):71–91
Wang W, Clausen PM, Bletzinger KU (2015) Improved semi-analytical sensitivity analysis using a secant stiffness matrix for geometric nonlinear shape optimization. Comput Struct 146:143–151
Zhong WX, Williams FW (1994) A precise time step integration method. Proc Inst Mech Eng Part C: J Mech Eng Sci 208(6):427–430

Author information

Authors and Affiliations

Department of Mechanical Sciences and Engineering, University of Illinois at Urbana-Champaign, Champaign, IL, USA
Felipe Fernandez & Daniel A. Tortorelli
Center for Design and Optimization, Lawrence Livermore National Laboratory, Livermore, CA, USA
Daniel A. Tortorelli

Authors

Felipe Fernandez
Daniel A. Tortorelli

Corresponding author

Correspondence to Felipe Fernandez.

Additional information

Responsible Editor: Hai Huang

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was partially performed under the auspices of the US Department of Energy by Lawrence Livermore Laboratory under contract DE-AC52-07NA27344, cf. ref number LLNLCONF-717640.

About this article

Cite this article

Fernandez, F., Tortorelli, D.A. Semi-analytical sensitivity analysis for nonlinear transient problems. Struct Multidisc Optim 58, 2387–2410 (2018). https://doi.org/10.1007/s00158-018-2096-y

Received: 30 March 2018
Revised: 05 September 2018
Accepted: 07 September 2018
Published: 28 September 2018
Issue Date: December 2018
DOI: https://doi.org/10.1007/s00158-018-2096-y
