1 Introduction

Designing components to sustain worst-case loads, as defined by the Federal Aviation Administration (FAA) through the Airworthiness Standards in 14 CFR Part 25 Subpart C, Structure (FAA 2019), is the primary role of structural sizing in the aircraft design process. The regulations mandate that the aircraft be structurally sound under dynamic loads. Current aircraft design procedures rely heavily on computational tools to size the structure and ensure compliance with the regulations. Broadly speaking, the following procedure is typically adopted to size the structure: (1) dynamic loads are first computed by coupling aerodynamics, structural dynamics, flight mechanics, and controls (Shearer and Cesnik 2007; Meirovitch and Tuzcu 2004; Patil 1999; Karpel et al. 2004); (2) the dynamic loads are then converted into a set of critical static load cases (Kim and Park 2010; Kang et al. 2005); and (3) the structure is approximately represented as either a shell model (Kennedy and Martins 2010, 2013, 2014a, b; Cavagna et al. 2009, 2011; Kenway et al. 2014; Kennedy et al. 2008; Kenway et al. 2012) or a beam model (Takahashi and Lemonds 2015; Chauhan and Martins 2018) and sized for the critical static load cases.

The process of converting the dynamic loads into equivalent static loads typically leads to a large number of load cases for which the structure must be sized. Conventional tube-and-wing aircraft design methods have matured to the extent that thousands of equivalent static load cases can be systematically reduced to a smaller set of critical load cases (Sinha et al. 2021; Dharmasaroja et al. 2017). However, for novel aircraft concepts and configurations, such heuristic, albeit systematic, reductions may no longer be valid. For highly flexible or multi-body systems, the sizing and optimization algorithm will likely favor different designs depending on whether the problem is posed as static or dynamic, because the two formulations drive the design in different directions. For instance, sizing and optimization based on equivalent static loads tend to favor designs with higher stiffness, whereas under dynamic loads, the algorithms tend to drive the structure’s eigenfrequencies away from the driving frequency of the system (Stolpe et al. 2018; Stolpe 2014). Therefore, methods applicable in early design must account for dynamics explicitly without relying on ad hoc conversions to equivalent static load cases.

Explicit consideration of dynamic loads is challenging, especially in early design. First, shell models (Cavagna et al. 2009, 2011; Kennedy and Martins 2010, 2013, 2014a; Kenway et al. 2014; Kennedy et al. 2008; Kenway et al. 2012) typically employed in early design have a large number of degrees of freedom (DoF), making them computationally expensive. Second, shell models offer control over the thicknesses, composite ply orientations, etc., leading to a large design space and, as a result, many expensive function and gradient calls when used in many-query exercises. Consequently, sizing for dynamic loads with shell models renders many-query exercises impractical. To avoid compute-intensive analyses, this work advocates the use of low DoF beam models (Nguyen 2008; Nguyen and Tuzcu 2009; Nguyen et al. 2014, 2016; Ting et al. 2014; Drela 1990, 1999; Patil 1999; Patil et al. 2001; Raghavan 2009; Shearer and Cesnik 2007; Palacios and Cesnik 2005) or modal representations (Hurty 1965; Waszak and Schmidt 1988; Schmidt and Raney 2001; Pedro and Bigg 2005; Karpel 1999; Karpel et al. 2004, 2005; Moulin and Karpel 2007; Raveh et al. 2000, 2001). Between beam models and modal analysis, the former is nonlinear, and hence preferred for highly flexible wings (Patil 1999). Beam models are a powerful tool for structural analysis and optimization. Variational asymptotic method-based approaches (Hodges 2006) have been successfully used in the analysis and design of rotorcraft blades (Ku et al. 2007; Li et al. 2008; Hodges and Yu 2007), wind turbines (Richards et al. 2014; Lee et al. 2002), and very flexible aircraft such as HALE (Cesnik et al. 2012).

Initial structural sizing and optimization of the wingbox by representing it as a beam has been demonstrated by Takahashi and Lemonds (2015). Recently, an open-source Python tool named OpenAeroStruct (Jasa et al. 2018; Chauhan and Martins 2018) has been shown to provide reasonable weight estimates at the early stages of aircraft design. Sarojini (2021) compared the beam theory framework in this paper to a shell model for the PEGASUS aircraft and showed a 6% error in the weight estimate, with a \(7.6\times\) computational speed-up. It should be noted that the references cited earlier used Euler–Bernoulli (EB) beam theory to compute the stresses and failure criteria during the structural sizing process. A current limitation of these approaches is the simplification to static critical loads and static structural analysis instead of dynamic loads and structural dynamic analysis. This work addresses this limitation.

Another issue concerning direct consideration of dynamic loads is related to the implementation and computational cost of the adjoint method for gradient computations. First, the adjoint method requires the entire state history of the forward simulation to be stored and used in the adjoint simulation (Boopathy and Kennedy 2017, 2019). For models with a large number of DoF, the stored history can amount to gigabytes for a single simulation. Second, analytical derivation of the adjoint is tedious and requires considerable effort to implement (Sarojini et al. 2020). Any additional changes, such as the inclusion of complex cross-sections or accurate techniques to compute stresses, require re-derivation and re-implementation of the adjoint-based gradient computation. Advances in automatic differentiation (AD) enable computation of derivatives by accumulating values during code execution to generate numerical derivative evaluations rather than derivative expressions. AD can be applied to regular code with minimal change, allowing branching, loops, and recursion (Baydin et al. 2018). Despite its name, AD does not fully automate differentiation and can yield inefficient code if naively implemented (Margossian 2019). Symbolic differentiation offers yet another solution by allowing practitioners to write regular code using symbols, leading to an easily modifiable and modular framework. However, it is memory and compute intensive (Margossian 2019). Hence, symbolic differentiation is impractical for large DoF models such as a shell model, where symbolics must be used for each node in the computational mesh. Since low DoF beam models are sufficiently accurate for early-stage aircraft structural weight estimation (Takahashi and Lemonds 2015; Chauhan and Martins 2018) under static loads, symbolics can be used in addition to AD to perform efficient gradient computations. This work uses a hybrid approach that leverages both symbolic differentiation and AD, as provided by the CasADi package.
The core of CasADi consists of a symbolic framework that requires users to construct expressions that automatically define differentiable functions. These general-purpose expressions have no notion of optimization and are best likened to expressions in, e.g., MATLAB’s Symbolic Toolbox or Python’s SymPy package. Once the expressions have been implemented, they are used to efficiently obtain new expressions for derivatives using AD (Andersson et al. 2012). The result can be interpreted as a hybrid of symbolic differentiation and AD that is capable of automatically generating extremely fast C code.
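To make the idea of AD "accumulating values during code execution" concrete, the following is a minimal forward-mode sketch based on dual numbers. It is not CasADi and does not reflect how CasADi is implemented; the class `Dual` and the function `horner_poly` are hypothetical names introduced only for illustration. The point is that the derivative is propagated alongside the value through ordinary Python code, including a loop.

```python
class Dual:
    """Value/derivative pair for forward-mode AD (illustrative only)."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def _lift(other):
        # Promote plain numbers to constants with zero derivative.
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule applied to the accumulated values.
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def horner_poly(x, degree=3):
    """Evaluate 1 + x + ... + x**degree with an ordinary loop."""
    y = 1.0
    for _ in range(degree):
        y = y * x + 1.0
    return y


# Seed the derivative with 1 to differentiate w.r.t. x at x = 2.
result = horner_poly(Dual(2.0, 1.0))
print(result.val, result.der)
```

For the cubic case, the polynomial is \(1 + x + x^{2} + x^{3}\), so the propagated derivative at \(x=2\) can be checked by hand against \(1 + 2x + 3x^{2}\).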

Finally, this work tackles another challenge encountered in structural sizing for dynamic loads: the handling of time-dependent constraints in structural optimization. Throughout a dynamic maneuver, the stress at every point in the discretized representation of the structure must remain below the yield stress of the material. As it is impossible to know a priori the time and location at which the stress will exceed the yield stress, constraints must be placed at every spatial and temporal location. In the literature, a large number of such constraints are typically handled using spatial constraint aggregation techniques (Poon and Martins 2007; Martins and Poon 2005; Kennedy and Hicken 2015). Aggregation in both space and time has received limited attention (Boopathy and Kennedy 2017, 2019), and only for computationally expensive high-fidelity simulations. This study presents one of the first empirical demonstrations of constraint aggregation in both space and time for beam models.

In summary, this work presents a ground-up implementation of an equation-agnostic adjoint-based framework to directly size and optimize a structure under dynamic loads, with a special focus on early aircraft design. Low DoF beam models are used in place of high DoF models to represent the structure, thereby significantly reducing the computational cost (runtime and memory). Via several sizing and optimization problems on a notional ‘wing-like’ wingbox, it will be demonstrated that (1) loads applied to a structure produce higher stresses when analyzed dynamically rather than statically, so that the resulting structure has larger thicknesses, which reinforces the necessity of considering dynamics in early-stage aircraft structural sizing, and (2) the structure can be sized directly for dynamic loads, thereby obviating the need for conversion to equivalent critical static loads.

The remainder of the paper is organized as follows: Sect. 2 provides background on the adjoint method (Sect. 2.1) and constraint aggregation (Sect. 2.2). Section 3 presents the framework to perform structural optimization using the adjoint method and constraint aggregation when applied to the equations of a nonlinear beam theory. Section 4 demonstrates the proposed approach on a case study of optimization of a wing-like structure subjected to dynamic loads.

2 Background

2.1 Adjoint method

The governing equations for many systems of interest are second-order differential equations in time. It is possible to convert any second-order system into a first-order system using standard approaches. Any set of governing equations can be written in a generic form as follows:

$$\mathbf {R}(\varvec{\mu }, \mathbf {x}, \dot{\mathbf {x}}, t) = \mathbf {0},$$
(1)

where \(\varvec{\mu }\) represents the design variables, \(\mathbf {x}\,{\text {and}}\,\dot{\mathbf {x}}\) are the state variables and their first derivatives with respect to time (denoted by t), respectively. Note that the chosen form of representing the system lends itself to a simple derivation of the adjoint system of equations.

Several methods exist for numerically integrating the differential algebraic equations (DAEs) arising from the governing equations. Each method treats the discrete system of algebraic equations in a different manner (for instance, requiring solutions from different time steps). Therefore, the adjoint system must be derived separately for each method. In this work, the adjoint system of equations is derived for the backward difference formula (BDF) scheme due to its stability properties. The state approximation of the BDF scheme (Curtiss and Hirschfelder 1952; Boopathy and Kennedy 2017) is given by

$$\mathbf {S}^{t} = \sum _{p=0}^{P} \alpha _{p} \mathbf {x}^{t-p} - \dot{\mathbf {x}}^{t},$$
(2)

where \(\alpha _{p}\) are the BDF coefficients, which absorb the factor \(1/\Delta t\) as given below Eq. 3, and \(\Delta t\) is the time step.

In this work, a second-order (\(P=2\)) BDF scheme is implemented. The time derivative at a time instant is defined as

$$\dot{\mathbf {x}}^{t} = \alpha _{0} \mathbf {x}^{t} + \alpha _{1} \mathbf {x}^{t-1} + \alpha _{2} \mathbf {x}^{t-2}$$
(3)

with \(\alpha _{0}=\frac{1.5}{\Delta t}\), \(\alpha _{1}=\frac{-2}{\Delta t}\), and \(\alpha _{2}=\frac{0.5}{\Delta t}\). At each time step, Newton’s method is used to solve the nonlinear system given by Eq. 1. Newton’s method uses the following linear solve at each iteration

$$\left[ \frac{\partial \mathbf {R}}{\partial \mathbf {x}^{t}} + \alpha _{0} \frac{\partial \mathbf {R}}{\partial \dot{\mathbf {x}}^{t}} \right] \delta _{\mathbf {x}} = -\mathbf {r}^{t},$$
(4)

where \(\frac{\partial \mathbf {R}}{\partial \mathbf {x}^{t}}\) and \(\frac{\partial \mathbf {R}}{\partial \dot{\mathbf {x}}^{t}}\) are the Jacobians of the residual system, and \(\mathbf {r}\) is the residual vector when Eq. 1 is evaluated for the current guess of \(\mathbf {x}\) and \(\dot{\mathbf {x}}\). If the residual vector is nonzero, Newton’s method creates a new guess for the state variable using

$$\mathbf {x}^{t} = \mathbf {x}^{t} + \delta _{\mathbf {x}}.$$
(5)

The algorithm for the dynamic simulation’s forward solve is given in “Appendix B”—Algorithm 1.
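As a rough illustration of Eqs. 3 to 5, the sketch below time-marches a scalar residual with the second-order BDF scheme and a Newton iteration at each step. The residual \(R = \dot{x} + \mu x^{3}\), the function name `bdf2_newton`, the tolerance, and the first-order BDF startup at the first step are all assumptions made for this example; they are not taken from Algorithm 1.

```python
import math


def bdf2_newton(mu, x0, dt, n_steps, tol=1e-10, max_iter=20):
    """March R(mu, x, xdot) = xdot + mu*x**3 = 0 and return the state history."""
    xs = [x0]
    for t in range(1, n_steps + 1):
        if t == 1:
            # Assumed startup: first-order BDF, since only one past state exists.
            alpha = (1.0 / dt, -1.0 / dt, 0.0)
            x_prev, x_prev2 = xs[0], 0.0
        else:
            # Second-order BDF coefficients, already scaled by 1/dt (Eq. 3).
            alpha = (1.5 / dt, -2.0 / dt, 0.5 / dt)
            x_prev, x_prev2 = xs[t - 1], xs[t - 2]

        x = x_prev  # initial guess: previous state
        for _ in range(max_iter):
            xdot = alpha[0] * x + alpha[1] * x_prev + alpha[2] * x_prev2
            r = xdot + mu * x**3          # residual for the current guess
            if abs(r) < tol:
                break
            dR_dx = 3.0 * mu * x**2       # Jacobian w.r.t. the state
            dR_dxdot = 1.0                # Jacobian w.r.t. the state rate
            delta = -r / (dR_dx + alpha[0] * dR_dxdot)  # scalar form of Eq. 4
            x = x + delta                 # Eq. 5
        xs.append(x)
    return xs


history = bdf2_newton(mu=1.0, x0=1.0, dt=0.01, n_steps=100)
exact = 1.0 / math.sqrt(1.0 + 2.0 * 1.0 * 1.0)  # closed form at t = 1
print(history[-1], exact)
```

For this particular residual the closed-form solution \(x(t) = 1/\sqrt{1 + 2\mu t}\) is available, which gives a simple check on the time-marching result.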

In the context of design through numerical optimization, an objective/constraint functional can be expressed as an integral over time, approximated by a sum over the time steps, as

$$\begin{aligned} f(\varvec{\mu })&= \int _{0}^{T} F(\varvec{\mu }, \mathbf {x}, \dot{\mathbf {x}}, t){\text {d}}t \\&\approx \sum _{t=0}^{T} \Delta t F^{t}(\varvec{\mu }, \mathbf {x}^{t}, \dot{\mathbf {x}}^{t}). \end{aligned}$$
(6)

Here, the integrand F can be any arbitrary function of the state and design variables. What follows is a summary of the equations that will be used for the discrete-time adjoint system of equations (Boopathy and Kennedy 2017). The Lagrangian can be written as follows:

$$\mathbf {L} = \sum _{t=0}^{T}\Delta t F^{t} + \sum _{t=0}^{T}\Delta t \left( \varvec{\lambda }^{t}\right) ^{T}\mathbf {R}^{t} + \sum _{t=0}^{T}\left( \varvec{\phi }^{t}\right) ^{T} \mathbf {S}^{t},$$
(7)

where \(\mathbf {R}^{t}\) is the residual of the governing equations, \(\mathbf {S}^{t}\) denotes the state approximation resulting from the BDF scheme, \(\varvec{\lambda }\) and \(\varvec{\phi }\) are the corresponding Lagrange multipliers, respectively, \(\Delta t\) denotes the step size in time, and superscript t denotes position in time.

Setting the partial derivative of the Lagrangian w.r.t. the state variables to zero results in a linear system of the following form at each time step t:

$$\begin{aligned}&\Big ( \frac{\partial \mathbf {R}^{t}}{\partial \mathbf {x}} + \alpha _{0} \frac{\partial \mathbf {R}^{t}}{\partial \dot{\mathbf {x}}} \Big )^{T} \varvec{\lambda }^{t} = -\Big ( \frac{\partial F^{t}}{\partial \mathbf {x}} + \alpha _{0}\frac{\partial F^{t}}{\partial \dot{\mathbf {x}}} \Big ) \\&\quad - \Big ( \sum _{p=1}^{P} \alpha _{p} \frac{\partial \mathbf {R}^{t+p}}{\partial \dot{\mathbf {x}}}^{T} \varvec{\lambda }^{t+p} + \sum _{p=1}^{P} \alpha _{p} \frac{\partial F^{t+p}}{\partial \dot{\mathbf {x}}}^{T}\Big ). \end{aligned}$$
(8)

Once the adjoint variables at each time step are computed, the derivative of the functional of interest w.r.t. design variables can be found as

$$\frac{{\text {d}} f}{{\text {d}} \varvec{\mu }} = \sum _{t=0}^{T} \Delta t \frac{\partial F^{t}}{\partial \varvec{\mu }} + \sum _{t=0}^{T} \Delta t \left( \varvec{\lambda }^{t}\right) ^{T} \frac{\partial \mathbf {R}^{t}}{\partial \varvec{\mu }}.$$
(9)

The formulation presented here results in a general method to obtain the gradient for a system of residual equations time-marched using a BDF scheme. The adjoint variable is solved at each time step using Eq. 8. The four quantities required are as follows:

  • \(\frac{\partial \mathbf {R}}{\partial \mathbf {x}}\): the Jacobian matrix of the residual equations w.r.t. the state variables,

  • \(\frac{\partial \mathbf {R}}{\partial \dot{\mathbf {x}}}\): the Jacobian matrix of the residual equations w.r.t. the first time derivative of the state variables,

  • \(\frac{\partial F}{\partial \mathbf {x}}\): the vector of derivatives of the function of interest w.r.t. the state variables,

  • \(\frac{\partial F}{\partial \dot{\mathbf {x}}}\): the vector of derivatives of the function of interest w.r.t. the first time derivative of the state variables.

The gradient is then computed using Eq. 9, for which the following two additional quantities are required:

  • \(\frac{\partial F}{\partial \varvec{\mu }}\): the vector of derivatives of the function of interest w.r.t. the design variables,

  • \(\frac{\partial \mathbf {R}}{\partial \varvec{\mu }}\): the matrix of derivatives of the residual equations w.r.t. the design variables.

The numerical algorithm to obtain the gradient using the adjoint method is presented in “Appendix B”—Algorithm 2. An illustrative example that shows the application of the formulation above to a spring–mass–damper system to obtain the gradient of the integral of the potential energy with respect to the spring stiffness can be found in prior work by Sarojini et al. (2020).
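The backward sweep of Eq. 8 followed by the gradient accumulation of Eq. 9 can be sketched on a scalar problem. To keep the example short, the sketch below assumes a first-order BDF scheme (\(P=1\), \(\alpha _{0}=1/\Delta t\), \(\alpha _{1}=-1/\Delta t\)) rather than the second-order scheme used in this work, a hypothetical residual \(R = \dot{x} + \mu x\), and the integrand \(F = x^{2}\), which does not depend on \(\dot{x}\) or explicitly on \(\mu\). The function names and the numerical settings are illustrative and are not those of Algorithm 2.

```python
DT = 0.01        # time step (assumed)
N_STEPS = 200    # number of time steps (assumed)
X0 = 1.0         # fixed initial state (assumed)


def forward(mu):
    """BDF1 march of R = (x_t - x_{t-1})/DT + mu*x_t = 0; stores full history."""
    xs = [X0]
    for _ in range(N_STEPS):
        xs.append(xs[-1] / (1.0 + mu * DT))  # linear residual solved directly
    return xs


def functional(xs):
    """f = sum over t >= 1 of DT * F_t with F = x**2 (x_0 is fixed)."""
    return sum(DT * x * x for x in xs[1:])


def adjoint_gradient(mu):
    """Backward sweep for lambda_t, then accumulate df/dmu."""
    xs = forward(mu)
    a0, a1 = 1.0 / DT, -1.0 / DT
    lam_next = 0.0           # no contribution from beyond the final time step
    grad = 0.0
    for t in range(N_STEPS, 0, -1):
        dR_dx, dR_dxdot = mu, 1.0
        dF_dx = 2.0 * xs[t]
        rhs = -dF_dx - a1 * dR_dxdot * lam_next
        lam = rhs / (dR_dx + a0 * dR_dxdot)
        dR_dmu = xs[t]
        grad += DT * lam * dR_dmu      # dF/dmu is zero for this integrand
        lam_next = lam
    return grad


def central_difference(mu, h=1e-6):
    """Reference gradient from two extra forward solves."""
    return (functional(forward(mu + h)) - functional(forward(mu - h))) / (2.0 * h)


print(adjoint_gradient(2.0), central_difference(2.0))
```

Because the adjoint is derived for the discrete equations, the two printed values should agree up to finite-difference error; the adjoint requires one backward sweep regardless of the number of design variables, whereas central differences require two forward solves per design variable.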

2.2 Constraint aggregation

Consider a constraint g at location i in the domain and at time step t of the dynamic simulation

$$g_{i}^{t} \le 0,\quad i = 1, \ldots , N,\quad t = 0, \ldots , T.$$
(10)

Note that the equation above results in a very large number of constraints \(\mathcal {O}(NT)\). A large number of constraints results in a challenging-to-solve optimization problem. Hence, constraint aggregation methods are used (Kennedy and Hicken 2015). One possible constraint aggregation method is the Kreisselmeier–Steinhauser (KS) function given by

$$c_{\text {ks}} = m + \frac{1}{\rho _{\text {ks}}} \ln \left[ \sum _{i=1}^{N}{\text {e}}^{\rho _{\text {ks}}(g_{i}-m)}\right] ,$$
(11)

where \(m=\max _{i} g_{i}\), and \(\rho _{\text {ks}}\) is the constraint aggregation parameter. Note that Eq. 11 is conventionally used when constraints are aggregated over space. However, the peak stress on the beam could occur at any time instant between 0 and T. As it is impossible to know a priori the time instant at which the peak load occurs, the constraints must also be aggregated over time. In this work, the KS functional in Eq. 11 is modified to aggregate over both space and time as

$$c_{\text {ks}} = m + \frac{1}{\rho _{\text {ks}}} \ln \left[ \sum _{i=1}^{N} \sum _{t=0}^{T}{\text {e}}^{\rho _{\text {ks}}(g_{i}^{t}-m)}\right] ,$$
(12)

where \(m = \max g_{i}^{t}\).
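A minimal sketch of Eq. 12 is given below, assuming the constraint values are stored as a nested list indexed by location and then time; the function name `ks_space_time` and the sample values are hypothetical. Subtracting the maximum \(m\) before exponentiating keeps the exponentials from overflowing, and the aggregate is a conservative estimate of the maximum: it lies between \(m\) and \(m + \ln (n)/\rho _{\text {ks}}\), where \(n\) is the total number of aggregated constraints.

```python
import math


def ks_space_time(g, rho_ks=50.0):
    """KS aggregate of g[i][t] over all locations i and time steps t (Eq. 12)."""
    flat = [value for row in g for value in row]
    m = max(flat)  # shift by the maximum for numerical stability
    total = sum(math.exp(rho_ks * (value - m)) for value in flat)
    return m + math.log(total) / rho_ks


# Two locations, three time steps; all constraints satisfied (g <= 0).
g_sample = [[-0.50, -0.20, -0.35],
            [-0.40, -0.10, -0.30]]
c_ks = ks_space_time(g_sample)
print(c_ks)
```

Larger values of `rho_ks` tighten the estimate toward the true maximum at the cost of a less smooth function for the optimizer.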

3 Adjoint method applied to a nonlinear beam theory

We first introduce the nomenclature of the symbolic framework. Symbolic variables are denoted as \({}_{\text {V}}^{\text {sym}}{()}\). Operations on the symbolic variables result in symbolic expressions denoted as \({}_{\text {E}}^{\text {sym}}{()}\). A symbolic function, \({}_{\text {F}}^{\text {sym}}{()}\), of a symbolic expression is a function that accepts numeric inputs for the symbolic variables and computes numeric values of the symbolic expression.

The nonlinear beam theory used in this study (Drela 1999), when discretized, leads to a system of DAEs. A summary of the equations was presented in prior work by Sarojini et al. (2018). The quantities tracked at each node and at each time step are shown in Table 1. They include nodal positions, Euler angles, translational and angular velocities, accelerations, and reaction forces and moments. It must be noted that all the dynamic simulations in this manuscript use a damping ratio of 0.02, which corresponds to a continuous metallic structure (Drela 1999), and that the simulations are initialized with an unloaded and undeflected structure. For the static studies, the tip loads are applied as boundary conditions.

Table 1 Quantities tracked when solving a 1-D beam problem

The process to create a symbolic beam model is shown as an extended design structure matrix (XDSM; Lambe and Martins 2012b) in Fig. 1. The number of cross-sections N and number of time steps T determine the size of the state vector. As seen in Table 1, 18 quantities are tracked per node, resulting in a state vector of size 18N per time step. Thus, the symbolic variable for the state vector can be written as \({}_{\text {V}}^{\text {sym}}{\mathbf {x}} \in \mathbb {S}^{18N\times T}\). Some of the computations require a slice, in time, of the state vector, denoted as \({}_{\text {V}}^{\text {sym}}{\mathbf {x}^{t}} \in \mathbb {S}^{18N}\).

Fig. 1 Generation of the symbolic beam represented as an XDSM (Lambe and Martins 2012a). \(i = 1, \ldots , N\) is the index for cross-sections along the span of the beam, \(t = 0, \ldots , T\) the index for time steps in the dynamic simulation, \(\mathbf {C}\) the material properties matrix, \({\varvec{\sigma }_{\mathbf {pts}}}_{, i}\) the cross-section locations of the stress recovery points, \(\mathbf {r_{0}}\) the undeformed beam axis node locations, \({\varvec{\kappa _{0}}}_{i}\) the undeformed curvature matrix, and BC the boundary conditions

For each cross-section, the number of design variables M is dictated by the practitioner’s design choices. For example, a rectangular beam cross-section has width and height as design variables. A box cross-section, in general, has six design variables, i.e., width w, height h, thickness of the top flange \(t_{\text {top}}\), thickness of the bottom flange \(t_{\text {bot}}\), thickness of the left web \(t_{\text {left}}\), and thickness of the right web \(t_{\text {right}}\). For aircraft wingbox structural optimization, the width and height are fixed by the airfoil and the locations of the spars, and the thicknesses are set as design variables. The appropriate number of symbolic cross-section design variables \({}_{\text {V}}^{\text {sym}}{\mathbf {cs_{dv}}} \in \mathbb {S}^{MN}\) is generated.
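As a small illustration of how the six box cross-section variables enter a sizing quantity, the sketch below computes the material area of a hollow rectangular box from elementary geometry (outer rectangle minus inner void) and the resulting mass per unit length. The function names and the sample dimensions are hypothetical, and this is not a reproduction of the formulae in “Appendix A”.

```python
def box_area(w, h, t_top, t_bot, t_left, t_right):
    """Material area of a hollow box: outer rectangle minus the inner void."""
    inner_width = w - t_left - t_right
    inner_height = h - t_top - t_bot
    if inner_width < 0.0 or inner_height < 0.0:
        raise ValueError("wall thicknesses exceed the outer dimensions")
    return w * h - inner_width * inner_height


def mass_per_length(rho, w, h, t_top, t_bot, t_left, t_right):
    """Mass per unit span (kg/m) for material density rho (kg/m^3)."""
    return rho * box_area(w, h, t_top, t_bot, t_left, t_right)


# Sample values (assumed): 1 m x 0.5 m box with 10 mm walls.
area = box_area(1.0, 0.5, 0.01, 0.01, 0.01, 0.01)
print(area, mass_per_length(2700.0, 1.0, 0.5, 0.01, 0.01, 0.01, 0.01))
```

With the width and height fixed by the airfoil and spar locations, only the four thicknesses would vary during optimization, which matches the four design variables per cross-section used in the test case.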

The function Symbolic beam properties implements appropriate equations to compute the beam stiffness matrix \(\mathbf {K_{i}}\) and mass matrix \(\mathbf {M_{i}}\) at each cross-section i. Analytical expressions exist for simple cross-sections such as rectangular and box beams. If the cross-section is represented as an airfoil, appropriate equations in the literature may be used to obtain the beam matrices (Chauhan and Martins 2018). It is also possible to use higher-order beam theories such as VABS (Yu 2011; Cesnik and Hodges 1997) or BECAS (Blasques 2012). If these higher-order theories are used, this block is an implicit function performing a linear solve. The final result is a set of symbolic expressions given as

$$\begin{aligned}&{}_{\text {E}}^{\text {sym}}{\mathbf {K}_{i}} \in \mathbb {S}^{6 \times 6} = f({}_{\text {V}}^{\text {sym}}{\mathbf {cs_{dv}}}), \end{aligned}$$
(13)
$$\begin{aligned}&{}_{\text {E}}^{\text {sym}}{\mathbf {M}_{i}} \in \mathbb {S}^{6 \times 6} = f({}_{\text {V}}^{\text {sym}}{\mathbf {cs_{dv}}}). \end{aligned}$$
(14)

The function Symbolic beam properties also computes the mass and aggregated stress of the beam. Information from the state vector \({}_{\text {V}}^{\text {sym}}{\mathbf {x}}\) is used to compute the stresses that enter the aggregation. As an example, the computation of axial stresses in a rectangular beam with EB assumptions is given by

$$\sigma _{i}^{t} = \frac{M}{I} y = \frac{M_{i}^{t}}{\frac{w_{i} h_{i}^{3}}{12}} \frac{h_{i}}{2} = \frac{6M_{i}^{t}}{w_{i} h_{i}^{2}},$$
(15)

where \(M_{i}^{t}\), an element of the state vector \({}_{\text {V}}^{\text {sym}}{\mathbf {x}}\), is the moment at cross-section i and time t. The corresponding stress constraint is given by

$$g_{i}^{t} = \frac{6M_{i}^{t}}{w_{i} h_{i}^{2} \sigma _{y}} - 1 \le 0,\quad i = 1, \ldots , N;\quad t = 0, \ldots , T.$$
(16)
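Equations 15 and 16 can be evaluated directly for a single cross-section and time step, as in the sketch below. The function names and the sample moment and dimensions are assumptions chosen for illustration; the yield stress is the aluminum value used later in the studies. The analytic derivative of the constraint with respect to the height follows from differentiating Eq. 16 by hand and is the kind of quantity that the framework obtains through AD.

```python
def bending_stress(moment, w, h):
    """Extreme-fiber axial stress of a solid rectangular section (Eq. 15)."""
    return 6.0 * moment / (w * h**2)


def stress_constraint(moment, w, h, sigma_y):
    """Normalized stress constraint g <= 0 (Eq. 16)."""
    return bending_stress(moment, w, h) / sigma_y - 1.0


def dg_dh(moment, w, h, sigma_y):
    """Hand-derived derivative of Eq. 16 w.r.t. the section height h."""
    return -12.0 * moment / (w * h**3 * sigma_y)


# Sample values (assumed): M = 1e4 N m, w = 0.1 m, h = 0.2 m.
M_it, w_i, h_i, sigma_y = 1.0e4, 0.1, 0.2, 276.0e6
print(bending_stress(M_it, w_i, h_i))
print(stress_constraint(M_it, w_i, h_i, sigma_y))
print(dg_dh(M_it, w_i, h_i, sigma_y))
```

A negative value of the constraint indicates that the stress is below yield, and the negative derivative confirms that increasing the height relaxes the constraint.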

Aggregating the above stress constraints using Eq. 12 yields \({}_{\text {F}}^{\text {sym}}{c_{{\text {ks}},\sigma _{\text {vm}}}}\). Thus, the following symbolic functions are generated:

$$\begin{aligned}&{}_{\text {F}}^{\text {sym}}{m} \rightarrow \mathbb {R} = f({}_{\text {V}}^{\text {sym}}{\mathbf {cs_{dv}}}), \end{aligned}$$
(17)
$$\begin{aligned}&{}_{\text {F}}^{\text {sym}}{c_{{\text {ks}},\sigma _{\text {vm}}}} \rightarrow \mathbb {R} = f({}_{\text {V}}^{\text {sym}}{\mathbf {cs_{dv}}}, {}_{\text {V}}^{\text {sym}}{\mathbf {x}}). \end{aligned}$$
(18)

The symbolic expression of the constraint can be differentiated using AD with respect to the state vector and the design variables to obtain

$$\begin{aligned}&{}_{\text {F}}^{\text {sym}}{\frac{\partial c_{{\text {ks}},\sigma _{\text {vm}}}}{\partial \mathbf {cs_{dv}}}} \rightarrow \mathbb {R}^{MN} = f({}_{\text {V}}^{\text {sym}}{\mathbf {cs_{dv}}}, {}_{\text {V}}^{\text {sym}}{\mathbf {x}^{t}}), \end{aligned}$$
(19)
$$\begin{aligned}&{}_{\text {F}}^{\text {sym}}{\frac{\partial c_{{\text {ks}},\sigma _{\text {vm}}}}{\partial \mathbf {x}}} \rightarrow \mathbb {R}^{18N} = f({}_{\text {V}}^{\text {sym}}{\mathbf {cs_{dv}}}, {}_{\text {V}}^{\text {sym}}{\mathbf {x}}), \end{aligned}$$
(20)
$$\begin{aligned}&{}_{\text {F}}^{\text {sym}}{\frac{\partial c_{{\text {ks}},\sigma _{\text {vm}}}}{\partial \dot{\mathbf {x}}}} \rightarrow \mathbb {R}^{18N} = \mathbf {0}. \end{aligned}$$
(21)

Similarly, the gradient of the objective function is found as

$${}_{\text {F}}^{\text {sym}}{\frac{\partial m}{\partial \mathbf {cs_{dv}}}} \rightarrow \mathbb {R}^{MN} = f({}_{\text {V}}^{\text {sym}}{\mathbf {cs_{dv}}}).$$
(22)

Finally, the residual equations of the beam are generated as symbolic expressions in the Symbolic beam model block. The resulting symbolic function

$${}_{\text {F}}^{\text {sym}}{\mathbf {R}} \rightarrow \mathbb {R}^{18N} = f({}_{\text {V}}^{\text {sym}}{\mathbf {cs_{dv}}}, {}_{\text {V}}^{\text {sym}}{\mathbf {x}}, {}_{\text {V}}^{\text {sym}}{\dot{\mathbf {x}}})$$
(23)

accepts numeric values of the cross-section design variables and guessed values of the state vector and of its time derivative, and returns the residual vector \(\in \mathbb {R}^{18N}\). As mentioned earlier, the symbolic expression of the residual equation can be differentiated using AD with respect to the state vector and the time derivative of the state vector to obtain the Jacobian matrices

$$\begin{aligned}&{}_{\text {F}}^{\text {sym}}{\frac{\partial \mathbf {R}}{\partial \mathbf {x}}} \rightarrow \mathbb {R}^{18N \times 18N} = f({}_{\text {V}}^{\text {sym}}{\mathbf {cs_{dv}}}, {}_{\text {V}}^{\text {sym}}{\mathbf {x}}, {}_{\text {V}}^{\text {sym}}{\dot{\mathbf {x}}}), \end{aligned}$$
(24)
$$\begin{aligned}&{}_{\text {F}}^{\text {sym}}{\frac{\partial \mathbf {R}}{\partial \dot{\mathbf {x}}}} \rightarrow \mathbb {R}^{18N \times 18N} = f({}_{\text {V}}^{\text {sym}}{\mathbf {cs_{dv}}}, {}_{\text {V}}^{\text {sym}}{\mathbf {x}}, {}_{\text {V}}^{\text {sym}}{\dot{\mathbf {x}}}) \end{aligned}$$
(25)

and with respect to the cross-section design variables to obtain

$${}_{\text {F}}^{\text {sym}}{\frac{\partial \mathbf {R}}{\partial \mathbf {cs_{dv}}}} \rightarrow \mathbb {R}^{18N \times MN} = f({}_{\text {V}}^{\text {sym}}{\mathbf {cs_{dv}}}, {}_{\text {V}}^{\text {sym}}{\mathbf {x}^{t}}, {}_{\text {V}}^{\text {sym}}{\dot{\mathbf {x}}^{t}}).$$
(26)

The process described thus far provides the following six partial derivative quantities needed for the forward and adjoint solvers mentioned earlier in Sect. 2.1: \(\frac{\partial \mathbf {R}}{\partial \mathbf {x}}\): Eq. 24, \(\frac{\partial \mathbf {R}}{\partial \dot{\mathbf {x}}}\): Eq. 25, \(\frac{\partial F}{\partial \mathbf {x}}\): Eq. 20, \(\frac{\partial F}{\partial \dot{\mathbf {x}}}\): Eq. 21, \(\frac{\partial F}{\partial \varvec{\mu }}\): Eq. 19, and \(\frac{\partial \mathbf {R}}{\partial \varvec{\mu }}\): Eq. 26.

It must be noted that the developed framework is agnostic to the type of beam theory used for stress recovery. The framework may be used if one of the following two criteria is met: (1) analytical equations are used for stress recovery, or (2) the gradient of the stress with respect to the design variables is provided. The test case in Sect. 4 uses a box cross-section beam shown in Fig. 14, with stresses recovered at 12 extremity points. For completeness, “Appendix A” contains the standard formulae to compute the beam properties and stresses for a box beam using EB assumptions. This assumption meets criterion 1 stated above. However, the beam theory used in this work is illustrative. Using Timoshenko theory would involve different analytical equations and is supported by the framework. If higher accuracy is desired, a higher-order beam theory such as VABS (Yu 2011) may be used. To use VABS, which does not have analytical equations for the stress recovery, criterion 2 must be satisfied, i.e., the burden of providing the gradient is on the analysis code.

4 Structural optimization

The test case considered is a ‘wing-like’ geometry, shown in Fig. 2, consisting of four cross-sections, each with four design variables. The structure is cantilevered at cross-section 1 at the left end. The optimization problem setup is given in Eq. 27.

Fig. 2 Wing-like box beam with sweep, taper, and dihedral

$$\begin{aligned} &\text {minimize}&\text {mass}&\\&\text {subject to}&\mathbf {R}(\mu , \mathbf {x}, \dot{\mathbf {x}}, t, P, E) = \mathbf {0} \\&&c_{{\text {ks}},\upsigma } \le 0 \\&&0.003 \le t_{{\text {left}},{\text {i}}} \le 0.25&\\&&0.003 \le t_{{\text {top}},{\text {i}}} \le 1.5&\\&&0.003 \le t_{{\text {right}},{\text {i}}} \le 0.25&\\&&0.003 \le t_{{\text {bot}},{\text {i}}} \le 1.5&\\&\text {data}&P(t)\,\hbox {(N)}, E\,\hbox {(Pa)}, \\&&\sigma _{\text {y}}\,(\hbox {{Pa}}), \rho \,({\hbox {kg/m}^{3}}). \end{aligned}$$
(27)

In this problem, P(t) is the applied external load as a function of time, E, \(\rho\) and \(\sigma _{\text {y}}\) are the Young’s modulus, density, and yield stress of the material of the beam, respectively, and \(c_{{\text {ks}},\upsigma }\) denotes the aggregated constraint of the stress failure criteria over space and time. For all optimization problems considered in this work, a relative change of \(10^{-8}\) in the value of the objective function and a relative change of \(10^{-10}\) in design variables between successive iterations were used as termination criteria.

4.1 Adjoint validation

The implementation of the adjoint method is validated by computing the derivative of the aggregated von Mises stress constraint \(c_{{\text {ks}},\upsigma }\) with respect to the 16 design variables using the adjoint method and comparing it with the derivative computed using a central difference stencil. A tip load of \(F_{x}=F_{z}={1 \times 10^{6}}\,{\text {N}}\) is applied for 3 s. The simulation is advanced with a time step of 0.01 s. The components of the resulting adjoint gradient vector are shown in Table 2. Note that the adjoint gradients are within 2% of their finite-difference equivalents, making them suitable for optimization.

Table 2 Gradient comparison between adjoint method and central differences (step size: \(10^{-8}\)) for 16 design variables (DV) for von Mises stress aggregated constraint \(c_{{\text {ks}},\upsigma }\)

To test the scalability of the method, additional cross-sections are introduced in-between the four cross-sections seen in Fig. 2. Each additional cross-section adds four design variables to the problem. Table 3 compares the wall-time to compute the gradient using finite differences and the adjoint method as the number of design variables is varied. It can be clearly observed that the wall-time to compute the gradient using finite differences becomes infeasible as the number of design variables increases.

Table 3 Comparison of time (in seconds) to compute the gradient between finite-difference and adjoint method

4.2 Study 1: static loading stress validation

Consider sizing the structure for the following conditions:

  • Tip load of \(F_{x}=F_{y}=0\, \hbox {N}, F_{z}={2 \times 10^{6}}\,\hbox {N},\)

  • Midpoint von Mises stress constraints,

  • Initial guess to optimizer set to the upper bounds,

  • Aluminum material with properties: \(E={7 \times 10^{9}}\,\hbox {Pa}, \rho =2700\,\hbox {kg/m}^{3}, \sigma _{\text {y}}={276 \times 10^{6}}\,\hbox {Pa}.\)

MATLAB’s fmincon function was used to perform the optimization with objective and constraint gradients supplied. The optimizer converged in 114 iterations satisfying constraints with a tolerance of \({1.074 \times 10^{-7}}\). The thickness distributions along the span are shown in Fig. 3a.

Fig. 3 Optimum solution for study 1. Panel a shows the optimized thickness distribution as a function of the span. Panels b, c, d, and e show the von Mises stresses (in Pa) in blue, the axial stresses (in Pa) in magenta, and the shear stresses in red. (Color figure online)

The von Mises stresses on cross-sections 1 through 4 are shown in Fig. 3b through e, respectively. It can be observed that the maximum stress on CS 1 (\({2.75 \times 10^{8}}\,\text {Pa}\)) is close to the yield stress value (\({2.76 \times 10^{8}}\,\text {Pa}\)). The maximum stresses on CS 2 (\({2.70 \times 10^{8}}\,\text {Pa}\)) and CS 3 (\({2.61 \times 10^{8}}\,\text {Pa}\)) are lower, but still close to the yield value. These results imply that the constraint aggregation in both space and time is working as expected, and the optimizer converges to a feasible optimum solution. Note that on the tip cross-section, i.e., CS 4, the stresses are essentially \({0}\,\text {Pa}\). The optimizer thins the cross-section to the lower bound. This behavior highlights a limitation of the proposed method in that the stresses at the tip cross-section cannot be captured accurately.
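A maximum stress that sits slightly below the yield value is what a conservative aggregate would be expected to produce. The sketch below assumes the standard Kreisselmeier-Steinhauser (KS) form with normalized constraints \(g_{i}=\sigma _{i}/\sigma _{\text {y}}-1\); the aggregation parameter `rho_ks` is a hypothetical value, and the four reported cross-section maxima are treated as the only constraint points purely for illustration. The exact aggregate used in this work is the one defined earlier in the paper.

```python
import math


def ks_aggregate(g, rho_ks):
    """Kreisselmeier-Steinhauser aggregate of constraint values g_i <= 0."""
    g_max = max(g)  # shift by the maximum to avoid overflow in exp
    total = sum(math.exp(rho_ks * (gi - g_max)) for gi in g)
    return g_max + math.log(total) / rho_ks


sigma_y = 276e6                             # yield stress from the study, Pa
sigma_max = [2.75e8, 2.70e8, 2.61e8, 0.0]   # reported maxima on CS 1 to CS 4, Pa
g = [s / sigma_y - 1.0 for s in sigma_max]  # assumed normalization
rho_ks = 200.0                              # hypothetical aggregation parameter

c_ks = ks_aggregate(g, rho_ks)
upper_bound = max(g) + math.log(len(g)) / rho_ks
print(f"max g = {max(g):.5f}, KS = {c_ks:.5f}, upper bound = {upper_bound:.5f}")
```

Because the aggregate always lies between the true maximum and the maximum plus \(\ln (m)/\rho _{\text {ks}}\) for \(m\) constraint points, driving the aggregate to zero leaves the true maximum at or slightly below the limit.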

4.3 Study 2: static loading weight optimization

Consider sizing the structure for the following conditions:

  • Tip load in ‘lift, drag, and moment’ directions of \(F_{x}=0\,\text {N}, F_{y}={1 \times 10^{5}}\,\text {N}, F_{z}={2 \times 10^{6}}\,\text {N}, M_{x}={1 \times 10^{6}}\,\text{Nm}, M_{y}=M_{z}=0\,\text{Nm},\)

  • Initial guess to optimizer set to the upper bounds,

  • Aluminum material with properties: \(E={7 \times 10^{9}}\,\text {Pa}, \rho =2700\,\hbox {kg/m}^{3}, \sigma _{\text {y}}={276 \times 10^{6}}\,\text {Pa}.\)

The following five constraint cases were considered:

  1. Corner points axial stress only,

  2. Midpoint von Mises stress only,

  3. Flange shear stress only,

  4. Midpoint von Mises stress and flange shear stress,

  5. Midpoint von Mises stress, corner point axial stress, and flange shear stress.

Figure 4 compares the optimal thicknesses computed by the optimizer. The behavior of the results for the different constraint cases is summarized as follows:

  1. Axial stresses depend only on the distance from the neutral axis. The optimizer favors increasing the top and bottom flange thicknesses, while the left and right webs stay at their lower bounds: for the loading profile considered, \(F_{z} > F_{y}\) and \(I_{zz} > I_{yy}\), so the web thicknesses do not affect the axial stress.

  2. The top and bottom flanges are still sized by the axial stress component in the von Mises resultant. The shear flow in the right and left webs contributes to their final sized thicknesses. The internal torsional moment has a large negative value at the root and a positive value at the tip. This difference in sign and magnitude sizes the right and left webs differently.

  3. If only the flange shear is used for sizing, the top and bottom flanges are sized to their minimums. This is because the axial stress is not constrained and the shear stress is relatively low owing to the cross-sectional inertia in the top–bottom direction. The webs, on the other hand, are thickened by shear stresses from two sources, the torsional shear and the force shear, which size the right web. Unlike the torsional shear, which depends on the torsional loads, the force shear depends on the shear forces experienced by the beam, which are constant along its span. Also, since the torsion at the tip is nonzero, the optimizer sizes one of the flanges to counter the stresses.

  4. When the von Mises constraint is supplemented with the shear at the center of the flanges, the left and right webs become thicker and even the tip section is sized, which reduces the need for top and bottom flange thickness.

  5. If all constraints are enabled, the beam ends up with the highest weight, and thickness develops along all the flanges and webs, with an outcome similar to that of Case 4.
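The interplay between the constraint cases above can be illustrated with the usual von Mises combination for a beam wall carrying one axial and one shear stress component, \(\sigma _{\text {vm}}=\sqrt{\sigma ^{2}+3\tau ^{2}}\). The sketch below is a minimal example under that assumption; the stress values are hypothetical and are not results from this study.

```python
import math


def von_mises(axial, shear):
    """von Mises stress for one axial and one shear stress component, in Pa."""
    return math.sqrt(axial**2 + 3.0 * shear**2)


sigma_y = 276e6   # yield stress used in the studies, Pa
axial = 2.5e8     # hypothetical axial stress at a point, Pa
shear = 0.9e8     # hypothetical shear stress at the same point, Pa

sigma_vm = von_mises(axial, shear)
print(f"axial-only check passes: {abs(axial) <= sigma_y}")
print(f"von Mises check passes:  {sigma_vm <= sigma_y}")
```

In this hypothetical state the axial stress alone is below yield while the combined von Mises stress is not, which is why a design sized for a single stress component can differ from one sized for the combined measure.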

Fig. 4 Optimum thickness distribution for different constraints in study 2. FS: front spar, TS: top skin, RS: rear spar, and BS: bottom skin

Table 4 compares the sized structure for the different constraint cases. As expected, the case that uses all constraints ends up with the highest thickness for all sections, as it takes into account every high point of stress within the beam. Because the width is greater than the height, the thicknesses of the top and bottom skins have a larger impact on the total weight than the thicknesses of the front and rear spars.

Table 4 Comparison of structural sizing for different constraints

4.4 Study 3: constant load dynamic analysis

The purpose of this study is to compare the optimized structure obtained from a static analysis to that obtained from a dynamic analysis under identical loads (i.e., a constant tip load). This study is conducted under conditions similar to Study 2. When simulated dynamically, the loading conditions of \(F_{x}=0\,\text {N}\), \(F_{y}={1 \times 10^{5}}\,\text {N}\), \(F_{z}={2 \times 10^{6}}\,\text {N}\), \(M_{x}={1 \times 10^{6}}\,{\text {Nm}}\), and \(M_{y}=M_{z}=0\,{\text {Nm}}\) at the tip are held constant throughout the dynamic analysis. The simulation is run for 5 s with a step size of 0.01 s. Stress constraints are imposed on midpoint von Mises stress, corner point axial stress, and flange shear stress.

First, consider the optimum solution when the simulation is run statically. The stresses on the optimized root cross-section are shown in Fig. 5a. As observed in earlier studies, it can be seen that the maximum stresses are close to, but lower than the yield stress of \({2.76 \times 10^{8}}\,\text {Pa}\).

Fig. 5 Illustration of the effects of dynamic analysis and optimization when initialized with a statically converged optimum

The optimized beam is next subjected to the same constant tip load, but is analyzed dynamically. Figure 5b shows the axial stress time history of the top left corner point. Note that the peak stress of \({-3.27 \times 10^{8}}\,\text {Pa}\) is much higher than the allowable yield stress. Also note that as the effects of dynamics damp out, the steady state value of \({-2.62 \times 10^{8}}\,\text {Pa}\) is indeed the value seen in Fig. 5a for the same top left corner point.

This simple example illustrates why larger thicknesses are achieved when a dynamic simulation is performed instead of a static simulation under identical loads. The thickness distribution of the optimum solutions is shown in Fig. 6. It can be seen that all four thicknesses for the dynamic analysis have been sized to larger values when compared to a static analysis. Table 5 compares the results of the optimization obtained from static analysis against that obtained from dynamic analysis. It can be seen that the weight increases by 10.8% due to the dynamics.
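The overshoot seen in Fig. 5b (a peak-to-steady-state ratio of about 1.25 from the values quoted above) is the expected behavior of any lightly damped structure under a suddenly applied constant load. The sketch below uses a single-degree-of-freedom damped oscillator as an analogy only; the damping ratio and natural frequency are hypothetical and are not fitted to the beam in this study. It evaluates the closed-form step response on the same 5 s window and 0.01 s step used in the simulation.

```python
import math


def step_response(t, zeta, omega_n):
    """Displacement divided by static displacement for a suddenly applied constant load."""
    omega_d = omega_n * math.sqrt(1.0 - zeta**2)   # damped natural frequency
    decay = math.exp(-zeta * omega_n * t)
    phase = math.cos(omega_d * t) + zeta / math.sqrt(1.0 - zeta**2) * math.sin(omega_d * t)
    return 1.0 - decay * phase


zeta = 0.2                        # hypothetical damping ratio
omega_n = 2.0 * math.pi * 2.0     # hypothetical 2 Hz natural frequency, rad/s

times = [0.01 * i for i in range(501)]   # 0 to 5 s in 0.01 s steps
response = [step_response(t, zeta, omega_n) for t in times]

peak = max(response)
final = response[-1]
peak_exact = 1.0 + math.exp(-zeta * math.pi / math.sqrt(1.0 - zeta**2))
print(f"peak/static = {peak:.3f} (closed form {peak_exact:.3f}), final/static = {final:.4f}")
```

The response settles to the static value, but the transient peak exceeds it, so a design that is only just feasible statically violates the same stress limit when the load history is simulated.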

Fig. 6 Optimum solution for study 3

Table 5 Comparison of structural sizing and optimization results from static and dynamic analysis

The breakdown of execution time is reported in Table 6. The generation of the symbolic beam model is a one-time cost. Thereafter, at each iteration of the optimizer, the execution times for the forward solve and for each gradient computation using the adjoint solver are of the same order of magnitude. The computational time is higher due to the five stress gradient computations. As shown in Table 4 earlier, the optimum found considering only midpoint von Mises stresses and that found considering all three (von Mises, axial, and shear) are comparable. Thus, larger optimization problems considering only the von Mises stress constraint would yield acceptable results at a far lower computational cost.

Table 6 Breakdown of computation time of dynamic solver

4.5 Study 4: gust load dynamic optimization

Several representative load profiles could be used to perform a sensible comparison between structures obtained from static versus dynamic sizing and optimization. To construct a meaningful dynamic simulation, a series of 13 gust profiles was simulated by means of a heavy elliptically distributed load, scaled by multipliers that mimic the gust velocity profiles specified in the 14 CFR Part 25 regulations. A 1-minus-cosine gust is selected as a test case, defined in Eq. 28 as

$$U = \frac{U_{\text {ds}}}{2}\left( 1 - \cos {\frac{\pi V \cdot t}{H}} \right)$$
(28)

based on a medium regional aircraft, where U is the multiplier for the loads at time t of the gust and \(U_{\text {ds}}\) is the multiplier for the peak design load being simulated. The value for \(U_{\text {ds}}\) is selected for each of the 13 gust profiles, with a maximum value of 1 for the gust corresponding to the highest design gust velocity, and less than 1 for the remaining gusts, sized in accordance with CFR 25.341 gust and turbulence loads. The gust penetration distance H ranges from 30 to 350 ft at an altitude of 30,000 ft and a velocity V at Mach 0.6. Figure 7 shows an illustration of the gust profiles vs. time.

Fig. 7 Simulated gust profiles based on 14 CFR Regulations. Thirteen profiles are simulated from 0.089 to 1.03 s. Only three are shown for clarity
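Equation 28 can be evaluated directly as a time-dependent load multiplier. The sketch below is a minimal example: the function name, the assumption that the gust acts only over \(0 \le t \le 2H/V\) (one full cosine period), and the velocity value are illustrative choices rather than values taken from this study, so the printed durations are not expected to match those in Fig. 7.

```python
import math


def gust_multiplier(t, u_ds, velocity, h_gust):
    """Load multiplier U(t) from Eq. 28; zero outside the assumed window 0 <= t <= 2H/V."""
    duration = 2.0 * h_gust / velocity
    if t < 0.0 or t > duration:
        return 0.0
    return 0.5 * u_ds * (1.0 - math.cos(math.pi * velocity * t / h_gust))


velocity = 600.0   # ft/s, hypothetical placeholder for the flight speed
for h_gust in (30.0, 350.0):   # ft, the two ends of the range of H
    duration = 2.0 * h_gust / velocity
    peak = gust_multiplier(h_gust / velocity, 1.0, velocity, h_gust)
    print(f"H = {h_gust:5.1f} ft: duration = {duration:.3f} s, peak multiplier = {peak:.3f}")
```

The multiplier rises smoothly from zero, reaches \(U_{\text {ds}}\) at \(t = H/V\), and returns to zero, so shorter values of H produce shorter, sharper load pulses at the same peak.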

These gusts are used to size and optimize an aluminum structure with the method presented in this work. The optimizer used the von Mises aggregated stress constraint to size a box beam with the same geometry as the beam shown in Fig. 2. To illustrate the stress before and after the optimization process, four reference points were tracked, as shown in Fig. 8.

Fig. 8 Reference points tracked during optimization

For the shortest period gust, Fig. 9 shows the stress at these points before and after optimization. Before optimization, note that points on the inner sections violate the constraints, whereas points on the outer sections are severely under-loaded. After optimization, smaller oscillations in the stress response to the gust are observed. Note that the maximum stress at points on all four cross-sections is below the yield stress (shown as a horizontal line). Some sections are slightly under-loaded as the thicknesses of the sections reached the lower bounds specified in the optimization problem statement. The optimization results in a mass of 393.6 kg.

Fig. 9 von Mises stress before and after optimization for gust period 0.16 s

Figure 10 illustrates the optimization iteration history. Note that some constraints are violated in the beginning because the optimizer first opts to lower the overall thickness of the beam before focusing on satisfying the constraints, and finally homing in on the optimum.

Fig. 10 Optimization history for 0.16-s dynamic gust sizing

4.6 Study 5: multiple gust loads sizing and optimization

To quantify the impact and consequence of sizing and optimizing a structure using the equivalent static load method rather than considering the dynamics directly, a final optimization study was performed with 13 elliptically distributed load profiles designed to mimic actual gusts in accordance with the 14 CFR Part 25 Regulations. As mentioned earlier, the time variation of gust load magnitudes is modeled using multipliers. Equivalent static loads for the gusts are extracted from the dynamic simulations using a method presented by Park (2011).
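For orientation, equivalent static loads in this family of methods are commonly defined as the static loads that reproduce the dynamic displacement field at a given time step. The relation below is a sketch under that assumption; the symbols are introduced here for illustration, and the exact variant used in this study is the one given in the cited reference.

$$\mathbf {f}_{\text {eq}}(t_{i}) = \mathbf {K}\,\mathbf {u}(t_{i})$$

Here \(\mathbf {K}\) denotes the linear stiffness matrix and \(\mathbf {u}(t_{i})\) the displacement vector from the dynamic simulation at time step \(t_{i}\). Applying \(\mathbf {f}_{\text {eq}}(t_{i})\) in a static analysis returns the same displacements, and hence the same recovered stresses, as the dynamic response at that instant.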

The structure was first sized and optimized for the 13 gusts taken one at a time. Figure 11 compares the optimized weight using the equivalent load to that of its dynamic counterpart as a function of the gust period. Note that the equivalent static method under-predicts the optimum mass compared to the dynamic optimization. The equivalent static sizing begins yielding heavier structures as the gust period gets longer, missing the peak that occurs at the gust period of 0.17 s.

Fig. 11 Dynamic vs. equivalent static sizing of different gust profiles

Finally, an optimization study was run with all 13 gust load cases as constraints, simultaneously. The dynamic sizing and optimization resulted in a beam with a mass of 400.2 kg, whereas the structure obtained by the aggregated equivalent static analysis resulted in a mass of 388.9 kg. The overall difference in total optimized weight between the two approaches is small (about \(2.8\%\)), mainly because the longer gust periods are included. When the optimized design produced by the equivalent load method was subjected to dynamic analysis with the gust loads, the structure did not satisfy the yield constraint. This was particularly pronounced when subjected to the 0.17-s gust period.
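The quoted percentage follows from the two masses if the difference is taken relative to the dynamically sized mass, which is an assumption about the normalization:

$$\frac{400.2 - 388.9}{400.2} = \frac{11.3}{400.2} \approx 0.028$$

Normalizing by the equivalent static mass instead gives \(11.3/388.9 \approx 0.029\), so the conclusion does not depend on the choice.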

5 Conclusion and future work

This work presented the development and implementation of an equation- and cross-section-agnostic, beam element-based dynamic structural analysis and optimization framework with a special emphasis on enabling early-stage structural design and optimization. A general adjoint solver was implemented for the backwards difference formula scheme. The framework utilizes a hybrid approach leveraging a mix of symbolic expressions and AD that provides a convenient way to consider arbitrary cross-section shapes, use physics-based stress recovery formulae at the desired level of fidelity, and add an arbitrary number of points on the cross-section to recover stresses. The framework efficiently computes gradients using the adjoint method, thus allowing for rapid static and dynamic sizing and gradient-based structural optimization using any publicly available optimization algorithm. Finally, this work demonstrated aggregation of stress constraints in both space and time.

The gradient computed by the adjoint-based implementation was compared and validated against finite differences. Study 1 demonstrated the ability of the framework to minimize the structural weight for a static tip load. Study 2 highlighted the effects of different types of stresses on the design variables by comparing optimal solutions obtained by considering different types of stresses in the optimization problem. Studies 3, 4, and 5 explicitly investigated the ramifications of neglecting dynamics by posing and solving a series of equivalent static and dynamic optimization problems. For a realistic gust load sizing scenario, study 5 revealed a difference in the optimal weight when the time history of loads was considered as opposed to the optimal weight obtained by considering a steady state static load or a sequence of equivalent static loads.

A particular consequence of utilizing the beam theory and associated stress recovery formulae is the inability to accurately compute the stresses at the tip cross-section. Future work will include further accuracy enhancements to the stiffness and stress recovery computations using the VAM. Other improvements will involve the implementation of constraints important to aerospace applications, such as deflection and buckling constraints.