1 Introduction

Inclusion of acceleration loads due to self-weight are essential when performing topology optimization for parts to be used in the aircraft industry. An aircraft is subjected to accelerations in all directions; ranging from quite moderate in a passenger aircraft to more extreme in a fighter due to advanced maneuvers. We may for example think of applications such as a wing pylon with an external store, e.g. an engine on a passenger aircraft or a weapon or external fuel tank on a fighter aircraft. If we neglect aerodynamic loads, the loads will be a function of the mass of the external store, the self-weight of the pylon, and the accelerations that it is subjected to. Other examples include structural parts inside the fuselage and attachment brackets for electronic devices. Common for these type of applications is that the direction of the loading may vary in space.

1.1 Robust optimization

This paper deals with the problem of how to account for such loadings, which may act in any direction, in a topology optimization problem and particularly the difficulty that optimized designs tend to be very sensitive to small perturbations of the applied loads. Traditionally, this issue is dealt with by optimizing under multiple load cases. The selection of appropriate load cases, however, relies on the intuition of engineers, and may result in large-scale problems. Most importantly, this approach cannot guarantee that there does not exist a load case causing a structural collapse. Thus, it is evident that there is a need for more systematic approaches to obtain optimized structures that are robust with respect to variations in the load. Development of such methods is the topic of the field robust optimization (Ben-Tal et al. 2009).

There are essentially two different approaches to robust optimization found in the literature: (i) the stochastic and (ii) the deterministic, or worst-case. In the first approach, data, such as applied loads, is assumed to be random but obeying some known or partially known probability distribution. The most common type of stochastic methods in structural optimization is reliability-based design optimization, where the probability of structural failure is quantified using so-called reliability indices (Kharmanda et al. 2004) which determines the formulation of a design optimization problem and is updated in an iterative process. In another class of methods the idea is to convert the stochastic problem into a ”standard” multiple load-case problem, the salient point being that the loads and their relative importance are determined by sampling from a continuous probability distribution, (Christiansen et al. 2001; Dunning and Kim 2013).

The deterministic approach to robust optimization deals with worst-case scenarios: structures are optimized to withstand the worst possible loads or manufacturing errors, regardless of the probability of this happening. Worst-case approaches are typically more conservative than stochastic methods, as more restrictions are put on the optimal designs. Further, deterministic methods are not dependent on good statistical data and, most importantly, provide non-probabilistic guarantees, ensuring with certainty that an optimized structure will function as intended, rather than simply that the probability of failure is low. This is clearly desirable in situations where structural failure might cause considerable economical damage or even loss of life. For this reason, deterministic methods are most appropriate for aircraft design.

Ben-Tal and Nemirovski (1997) considered deterministic robust topology optimization of trusses with loads varying in an uncertainty set and showed that the corresponding optimization problem could be formulated as a linear semi-definite program (SDP). This is very nice from a theoretical point of view, since linear SDPs are convex and can be solved with algorithms having proven polynomial-time complexity. However, SDPs with large matrix inequalities are nevertheless beyond reach of current solvers; consequently, numerical examples are generally of small size (Ben-Tal and Nemirovski 1997; Achtziger and Kocvara 2007; Ohsaki et al. 1999).

Other researchers (Brittain et al. 2012; Cherkaev and Cherkaev 2008; de Gournay et al. 2008; Takezawa et al. 2011) have studied formulations for robust topology optimization involving generalized eigenvalues as objectives or constraints. This way the complexity of SDPs with large matrices can be avoided, but the price to pay is the introduction of non-differentiability in the presence of multiple eigenvalues. Unfortunately, multiple eigenvalues are likely to occur in worst-case compliance problems, so simply ignoring non-differentiability, as is often done, is not satisfying. Therefore, de Gournay et al. (2008) used an SDP approach involving matrices of small size to calculate descent directions, rather than using gradients differentiated directly from the generalized eigenvalue problem.

Another approach is to formulate the robust optimization problem as a mathematical program with equilibrium constraints (MPEC) (Kanno 2011). MPECs are, however, difficult to solve numerically, and MPEC-formulations of structural optimization problems will have a large number of non-linear constraints and/or possibly require the evaluation of second derivatives.

This paper focuses on deterministic methods and a main novelty of the paper is that we show how a worst-case problem can be formulated and solved as an SDP with, in contrast to Ben-Tal and Nemirovski (1997), a small matrix inequality constraint so that it can be solved efficiently, without introducing further approximations.

1.2 Optimization including self-weight

Topology optimization including self-weight was studied by Bruyneel and Duysinx (2005), who noticed that for design variables approaching the lower bound, the ratio between the load and the (SIMP-based) stiffness becomes infinite, leading to unbounded displacements and thus also unbounded compliance. This causes the optimization to converge to a design with intermediate design variable values. Bruyneel and Duysinx (2005) suggested a modified SIMP-model, which was linear below a threshold value, and the same modification was also used in Lee et al. (2012). Whereas they report that no problems were experienced due to the introduced non-differentiability, we prefer a differentiable penalization function. Therefore, we suggest a linear scaling, see (5) in Section 2.2, of the mass matrix when the force vector is created, such that no loads are applied in voids, allowing the optimization to converge with values equal to the lower bound. To the authors’ knowledge, this is a new contribution to topology optimization of self-weight loaded structures and the result of this linear scaling is shown numerically in Fig. 1, where the problem from Bruyneel and Duysinx (2005) and Lee et al. (2012) is repeated using our (differentiable) scaling of the mass matrix. As in Lee et al. (2012) the design space, Fig. 1a, is meshed with 2048 elements (we use 4-node quadrilaterals in this example) and the final structure shall occupy 20 % of the design space and support its own weight, subjected to gravitational acceleration. The minimum compliance design, here obtained using MMA (Svanberg 1987), is shown in Fig. 1b; as can be seen we obtain a black-and-white design without the ”erratic intermediate density patterns” reported in Bruyneel and Duysinx (2005) and Lee et al. (2012).

Fig. 1
figure 1

Re-calculation of the arch structure example from Bruyneel and Duysinx (2005). a: Mesh and boundary conditions, b: Black-and-white optimized design obtained for standard gravitational load

Following this introduction, Section 2 presents the structural model, describes the loading and derives the semi-definite optimization problem. In Section 3 we derive the optimality conditions and the numerical treatment of the SDP-problem is discussed in Section 4, where we also show how the displacements can be obtained and present a method to obtain black-and-white designs. Gradients are derived in Section 5, examples are shown in Section 6 and conclusions are finally drawn in Section 7.

2 Problem statement

2.1 Model

We use the standard framework for structural optimization of linear elastic finite element discretized structures; i.e., there is a state problem of the type

$$ \boldsymbol{K}\left( \boldsymbol{x}\right)\boldsymbol{u}=\boldsymbol{F}\left( \boldsymbol{x},\boldsymbol{r}\right), $$
(1)

where \(\boldsymbol {K}\left (\boldsymbol {x}\right ) \in \mathbb {S}^{n}\) (the space of symmetric n×n matrices with real entries), n being the number of degrees of freedom, is the stiffness matrix; \(\boldsymbol {u}\in \mathbb {R}^{n}\) is the nodal displacement vector; and \(\boldsymbol {F}\left (\boldsymbol {x},\boldsymbol {r}\right ) \in \mathbb {R}^{n}\) is the nodal force vector. Both \(\boldsymbol {K}\left (\boldsymbol {x}\right )\) and \(\boldsymbol {F}\left (\boldsymbol {x}, \boldsymbol {r}\right )\) are taken to depend on a design variable vector x=(x 1,…,x m )T. For notational simplicity, we write the theory as if all elements belong to the design domain; thus, there is one design variable for each finite element and the structure consists of m such elements. The nodal force vector also depends on an uncertainty vector r, to be described in Section 2.2.

To avoid mesh dependency and checkerboard patterns (Sigmund and Petersson 1998) we use a filter to define a set of filtered variables ρ=(ρ 1,…,ρ m )T, which are used to calculate the stiffness and mass of the structure. The filtered variables are therefore denoted physical variables. There are different types of filters (Sigmund 2007) but a linear local averaging design variable filter, suggested by Bruns and Tortorelli (2001), can be written

$$ \rho_{i}\left( \boldsymbol{x}\right)=\sum\limits_{j=1}^{m}{\Psi}_{ij}x_{j},\quad i=1,{\ldots} ,m. $$
(2)

Here

$${\Psi}_{ij}=\frac{\psi_{ij}v_{j}}{{\sum}_{k=1}^{m}\psi_{ik}v_{k}}, \quad \psi_{ij}=\max\left( 0,1-\frac{||\boldsymbol{e}_{i}-\boldsymbol{e}_{j}||}{R}\right), $$

where v j is the volume of element j, e i denotes the position vector of the geometric center of element i, R is the filter radius and ||⋅|| is the Euclidean norm.

The design variables are restricted to the compact and convex set

$$\mathcal{H} = \left\{\boldsymbol{x}\in\mathbb{R}^{m} \;|\; \varepsilon \leq x_{i} \leq 1, \; \sum\limits_{i=1}^{m}d v_{i} \rho_{i}\left( \boldsymbol{x}\right) =\overline{M}\right\}, $$

in which d is the constant density and \(\overline {M}\) is the available mass. The lower bound ε>0, is introduced to maintain positive definiteness of the stiffness matrix (Appendix B provides a formulation allowing for a singular stiffness matrix). The stiffness matrix is in a SIMP formulation (Bendsøe 1989) written as follows:

$$ \boldsymbol{K}\left( \boldsymbol{x}\right)=\sum\limits_{i=1}^{m}\left( \rho_{i}\left( \boldsymbol{x}\right)\right)^{p} \boldsymbol{K}_{i}, $$
(3)

where K i is an expanded element stiffness matrix and p>1 is a penalty exponent.

2.2 Uncertain loading

Due to the intended application to aircraft design, we include mass acceleration forces in the force vector \(\boldsymbol {F}\left (\boldsymbol {x},\boldsymbol {r}\right )\). In the aircraft industry, the acceleration is usually expressed as multiples of the magnitude of the acceleration of gravity g, and the maximum acceleration factors, n x , n y (and n z ), in the x-, y- (and z)-directions, respectively, are generally of different magnitude. Thus, for example, in 2D we may have an elliptical loading region expressed by different accelerations g n x and g n y , as visualized in Fig. 2.

Fig. 2
figure 2

Visualization of admissible accelerations in 2D

The maximum acceleration factors are placed on the diagonal of a diagonal matrix \(\boldsymbol {N}\in \mathbb {S}^{s}\), where s denotes the number of spatial dimensions. The mass acceleration forces can then be written as

$$ \boldsymbol{F}\left( \boldsymbol{x},\boldsymbol{r}\right) = \boldsymbol{B}\left( \boldsymbol{x}\right)^{\textsf{T}}\boldsymbol{r}= \overset{m}{\underset{i=1}{\boldsymbol{\mathsf{A}}}}g\left[ q\left( \rho_{i}\left( \boldsymbol{x}\right)\right) \boldsymbol{M}_{i}+\boldsymbol{M}_{i}^{\textsf{ext}}\right]\boldsymbol{G}_{i}\boldsymbol{N}\boldsymbol{r}, $$
(4)

where A is an assembly operator; \(\boldsymbol {M}_{i}\in \mathbb {S}^{s h_{i}}\), where h i is the number of element nodes, is the mass matrix for element i; \(\boldsymbol {M}_{i}^{\textsf {ext}}\in \mathbb {S}^{s h_{i}}\) is a matrix representing external, non-structural, masses; and \(\boldsymbol {G}_{i}=(\boldsymbol {I}\; {\ldots } \; \boldsymbol {I})^{\textsf {T}}\in \mathbb {R}^{s h_{i} \times s}\), where \(\boldsymbol {I}\in \mathbb {S}^{s}\) is the identity matrix. The linear scaling

$$ q\left( \rho_{i}\left( \boldsymbol{x}\right)\right)=\frac{\rho_{i}\left( \boldsymbol{x}\right)-\varepsilon}{1-\varepsilon}, $$
(5)

of the mass matrix is, as discussed in the introduction, intended to bound the ratio between stiffness and force; without introducing non-differentiable functions, such as that in Bruyneel and Duysinx (2005). By subtracting the lower bound ε in the nominator, we make sure that there are no applied loads, due to self-weight, in voids, and q(ρ i ) becomes 1 when ρ i =1.

Finally, uncertainty is accounted for by letting r vary in the unit ball \(T=\{\boldsymbol {r}\in \mathbb {R}^{s} \;|\; ||\boldsymbol {r}||\leq 1\}\). This implies that the (nodal) loads vary synchronously, which is the case under acceleration loads.

The extension to multiple load cases, with one r for each load case, is straightforward and an extension to handle both fixed and uncertain loads simultaneously is possible, Thore et al. (2015) (de Gournay et al. (2008) also treats this type of loading). Further, the methods in this paper also apply to variations of external nodal loads if B, instead of being formulated as in (4), is a constant matrix with the maximum nodal loads in the x-, y- and z-directions.

2.3 Calculation of the worst-case compliance

The compliance can be written

$$C(\boldsymbol{x},\boldsymbol{r}) = \frac{1}{2}\boldsymbol{F}(\boldsymbol{x},\boldsymbol{r})^{\textsf{T}}\boldsymbol{u}, $$

where F(x,r)=B(x)T r and, from (1), \(\boldsymbol {u}\left (\boldsymbol {x},\boldsymbol {r}\right ) = \boldsymbol {K}\left (\boldsymbol {x} \right )^{-1}\boldsymbol {F}\left (\boldsymbol {x},\boldsymbol {r}\right )\), assuming \(\boldsymbol {K}\left (\boldsymbol {x} \right )\) is non-singular (see Appendix B for a derivation where \(\boldsymbol {K}\left (\boldsymbol {x} \right )\) may be singular). The worst-case compliance is then

$$\begin{array}{@{}rcl@{}} P\left( \boldsymbol{x}\right) = \max\limits_{\boldsymbol{r}\in T}\, C(\boldsymbol{x},\boldsymbol{r}) &=& \max\limits_{\boldsymbol{r}\in T}\frac{1}{2}\boldsymbol{F}(\boldsymbol{x},\boldsymbol{r})^{\textsf{T}} \boldsymbol{u}(\boldsymbol{x},\boldsymbol{r})\\ &=& \max\limits_{\boldsymbol{r}\in T} \frac{1}{2}\boldsymbol{r}^{\textsf{T}}\boldsymbol{H}(\boldsymbol{x})\boldsymbol{r}, \end{array} $$

where the notation

$$ \boldsymbol{H}(\boldsymbol{x})=\boldsymbol{B}(\boldsymbol{x})\boldsymbol{K}(\boldsymbol{x})^{-1} \boldsymbol{B}(\boldsymbol{x})^{\textsf{T}} $$
(6)

was used in the last equality. Since \(\boldsymbol {H}(\boldsymbol {x})\in \mathbb {S}^{s}\) is positive semi-definite, the last problem is one of maximizing an upper semi-continuous, convex function over a convex, compact set. This implies that the maximum value is taken at extreme points of the feasible set T (Rockafeller 1972, Corollary 32.3.1). Since the extreme points of T are \(\{\boldsymbol {r}\in \mathbb {R}^{s} \;|\; ||\boldsymbol {r}|| = 1\}\) (Hiriart-Urruty and Lemarchal 1993, p. 110) we have

$$ P\left( \boldsymbol{x}\right) = \max\limits_{||\boldsymbol{r}||=1} \frac{1}{2}\boldsymbol{r}^{\textsf{T}}\boldsymbol{H}(\boldsymbol{x})\boldsymbol{r}. $$
(7)

Using the Rayleigh-Ritz theorem (Horn and Johnson 1985, Theorem 4.2.2) we see that the optimal value equals the maximum eigenvalue of \(\frac {1}{2}\boldsymbol {H}(\boldsymbol {x})\). Thus, (7) may be written

$$ P\left( \boldsymbol{x}\right) = \frac{1}{2}\lambda_{1}\left( \boldsymbol{H}\left( \boldsymbol{x}\right)\right), $$
(8)

where \(\lambda _{1}\left (\boldsymbol {H}\left (\boldsymbol {x}\right )\right )\) is the largest eigenvalue of \(\boldsymbol {H}\left (\boldsymbol {x}\right )\).

2.4 Optimization problem

We are now interested in the problem

$$ \min\limits_{\boldsymbol{x}\in\mathcal{H}} P\left( \boldsymbol{x}\right). $$
(9)

A main drawback or this formulation is that if \(\lambda _{1}(\boldsymbol {H}\left (\boldsymbol {x}\right ))\) is a multiple eigenvalue, only directional derivatives exist, resulting in a non-smooth problem. Directional derivatives may be obtained following Seyranian et al. (1994), Pedersen and Nielsen (2003) or Overton (1992, Theorem 4). In Pedersen and Nielsen (2003), the directional derivative is used in place of partial derivatives and in Overton (1992) and Seyranian et al. (1994) specialized algorithms are used. A further elevation of this difficulty is that for many problems the optima is likely to occur at a point of multiple eigenvalues. The analysis of Pataki (1998) indicates that given a sufficiently large feasible set \(\mathcal {H}\) this is the case. This is also shown numerically in Fig. 6.

In order to find a computationally tractable problem that is equivalent to (9), we use a reformulation. First, the min-max problem (9) is rephrased in a so-called bound formulation. Recalling (8), this leads to the optimization problem

$$\begin{array}{@{}rcl@{}} &&\min\limits_{z\in\mathbb{R},\,\boldsymbol{x}\in\mathcal{H}} \quad z \\ && \text{subject to} \;\; \lambda_{1}\left( \boldsymbol{H}\left( \boldsymbol{x}\right)\right) \leq z. \end{array} $$

Then, it can be shown, Appendix A, that this problem is equivalent to the following semi-definite program:

$$ \begin{array}{l} \min\limits_{z\in\mathbb{R},\,\boldsymbol{x}\in\mathcal{H}} \quad z \\ \text{subject to} \;\; z\boldsymbol{I} - \boldsymbol{H}\left( \boldsymbol{x}\right) \succeq \boldsymbol{0}, \end{array} $$
(10)

where “ ≽0” means positive semi-definite. This problem has a small matrix inequality constraint of size s×s, and is the non-linear SDP-problem that we treat numerically.

3 Optimality conditions

Let y=(x,z) and introduce the Lagrangian

$$\begin{array}{@{}rcl@{}} L(\boldsymbol{y};\mu,\boldsymbol{\gamma},\boldsymbol{Z}) &=& z + \mu(\boldsymbol{c}^{\textsf{T}}\boldsymbol{x} - \overline{M}) \\ &&+ \sum\limits_{i=1}^{m}\left( \gamma_{i}(x_{i}-1) + \gamma_{i+m}(\varepsilon-x_{i})\right)\\ &&+ \text{tr}\left( [\boldsymbol{H}(\boldsymbol{x}) - z\boldsymbol{I}]\boldsymbol{Z}\right), \end{array} $$

where \(\boldsymbol {Z}\in \mathbb {S}^{s}\), μ and γ are multipliers, tr(A) denotes the trace of A and c defines the left hand side of the mass constraint in \(\mathcal {H}\). The first-order necessary optimality conditions for (10) are then obtained from Bonnans and Shapiro (2000, Theorem 3.9), and reads: If y is an optimal solution to (10), then there exists μ,γ,Z such that

$$\begin{array}{@{}rcl@{}} &&\nabla L(\boldsymbol{y};\mu,\boldsymbol{\gamma},\boldsymbol{Z}) = \boldsymbol{0}, \end{array} $$
(11a)
$$\begin{array}{@{}rcl@{}} &&\text{tr}([\boldsymbol{H}(\boldsymbol{x}) - z\boldsymbol{I}]\boldsymbol{Z}) = 0, \end{array} $$
(11b)
$$\begin{array}{@{}rcl@{}} &&\gamma_{i}(x_{i}-1) = 0,\qquad\qquad\qquad\quad i=1,\ldots,m, \end{array} $$
(11c)
$$\begin{array}{@{}rcl@{}} &&\gamma_{i+m}(\varepsilon-x_{i}) = 0,\qquad\qquad\;~~~~~~ i=1,\ldots,m, \end{array} $$
(11d)
$$\begin{array}{@{}rcl@{}} &&\gamma_{i} \geq 0, \qquad\qquad\qquad\qquad\quad~~~ i=1,\ldots,2m, \end{array} $$
(11e)
$$\begin{array}{@{}rcl@{}} &&\boldsymbol{Z} \succeq \boldsymbol{0}. \end{array} $$
(11f)

The necessity of the optimality conditions hinges on some constraint qualification being satisfied; here we use the Robinson’s constraint qualification (Bonnans and Shapiro 2000, p. 67). We now introduce the notations ∇h(y) and ∇g i (y) for derivatives with respect to y of, respectively, the mass constraint and the i:th box constraint in \(\mathcal {H}\). Then, utilizing Corollary 2.101 in Bonnans and Shapiro (2000) we find that Robinson’s constraint qualification holds for (10) if and only if there exists \(\boldsymbol {w} \in \mathbb {}R^{m+1}\) such that

$$\begin{array}{@{}rcl@{}} \nabla h(\boldsymbol{x})^{\textsf{T}}\boldsymbol{w} &=& 0 \\ \sum\limits_{k=1}^{m+1}w_{k} \frac{\partial [z\boldsymbol{I} - \boldsymbol{H}\left( \boldsymbol{x}\right)]}{\partial y_{k}} & \prec &\boldsymbol{0} \quad \text{if } z\boldsymbol{I}-\boldsymbol{H}\left( \boldsymbol{x}\right)=\boldsymbol{0} \\ \nabla g_{i}(\boldsymbol{x})^{\textsf{T}}\boldsymbol{w} & < & 0, \quad \forall i\in I(\boldsymbol{x}), \end{array} $$
(12)

where “\(\prec \boldsymbol {0}\)” means negative definite, \(I(\boldsymbol {x}) \subset \{1,\ldots ,2m\}\) denotes the index set of active inequalities at x and the condition \(z\boldsymbol {I}-\boldsymbol {H}\left (\boldsymbol {x} \right )=\boldsymbol {0}\) means that the matrix inequality constraint is active.

In order to verify that a solution y satisfies the first-order optimality conditions, we need to find a w that satisfies (12), and if such a w exist we proceed to find a Z that satisfies (11a), (11b) and (11f). We may for example solve for Z in the overdetermined linear system obtained from (11a) and (11b). Finally, we check that (11c) – (11f) are satisfied within some numerical tolerance.

4 Numerical treatment

4.1 Implementation

The algorithms presented in this paper are implemented and solved in a Matlab based FE- and optimization program. The models are first pre-processed in a commercial program (Altair Engineering 2014) where geometry, mesh and boundary conditions etc. are defined and written to an indata file. The Matlab program then reads the indata file and solves the FE- and optimization problem and post-processes the result.

The semi-definite programming problem is treated using the open-source code fminsdp (Thore 2013) which reformulates (10) into a non-linear optimization problem (NLP) that is solved using the interior-point-solver IPOPT (Wächter and Biegler 2006).

4.2 Evaluating the matrix constraint

To avoid explicitly forming \(\boldsymbol {K}\left (\boldsymbol {x}\right )^{-1}\), the matrix constraint function in (10) is evaluated by solving

$$ \boldsymbol{K}\left( \boldsymbol{x}\right)\boldsymbol{R}=\boldsymbol{B}\left( \boldsymbol{x}\right)^{\textsf{T}}, $$
(13)

where \(\boldsymbol {R}=\boldsymbol {R}\left (\boldsymbol {x}\right )\in \mathbb {R}^{n\times s}\) is solved for at the same cost as solving (1) for s load cases. Given \(\boldsymbol {R}\left (\boldsymbol {x}\right )\) then, \(\boldsymbol {H}\left (\boldsymbol {x}\right )=\boldsymbol {B}\left (\boldsymbol {x}\right )\boldsymbol {R}\left (\boldsymbol {x} \right )\).

4.3 Methods to solve the semi-definite program

The matrix inequality constraint in (10) is of small size (s×s) but non-linear in x. Therefore, specialized methods must be used. Examples include the following:

  • The Cholesky factorization method, see Ertel et al. (2008) and Thore (2013). (A “nested” Cholesky method is suggested in Burer et al. (2002))

  • The LDL-factorzation method due to Fletcher (1985); see also Bogani et al. (2009), Vanderbei and Benson (2000), and Benson and Vanderbei (2003).

  • PENNON or PENLAB, see Kočvara and Stingl (2003).

  • The ”feasible direction method” suggested by Aroztegui et al. (2014).

The Cholesky method has the advantage that it immediately leads to a standard, smooth, NLP formulation. The same is almost true for the LDL-factorzation method, but extra care is required to make sure that certain nonlinear inequalities are not violated in the solution process, making it more difficult to use with standard NLP-solvers. PENNON (and its open-source version PENLAB), which implements an augmented Lagrangian-type method, has the drawback that it requires computation of the Hessian of the Lagrangian, something which is prohibitively costly for large-scale nested problems. The benchmarks used to demonstrate the feasible direction method in Aroztegui et al. (2014) are too small to be able to tell whether the algorithm will be efficient in practice, but the implementation described used a dense quasi-Newton approximation for the Hessian, and this is not feasible for large-scale problems.

In fminsdp the Cholesky factorization method is used to handle the matrix inequality constraint. The Cholesky factorization method is based on the fact that a symmetric matrix A0 if and only if A=L L T, where L is a Cholesky factor of A; see Thore (2013) for further details. Therefore, to obtain solutions to (10) we solve the following “standard” NLP:

$$ \begin{array}{l} \min\limits_{z\in\mathbb{R},\,\boldsymbol{x}\in\mathcal{H},\boldsymbol{L}\in\mathcal{L}^{s}_{+}} \quad z \\[-3pt]\\[-3pt] \text{subject to} \;\; z\boldsymbol{I} - \boldsymbol{H}\left( \boldsymbol{x}\right) = \boldsymbol{L}\boldsymbol{L}^{\textsf{T}}, \end{array} $$
(14)

where \(\mathcal {L}^{s}_{+}\) denotes the set of lower triangular matrices with non-negative diagonal entries and, at a feasible point, L is a Cholesky factor of \(z\boldsymbol {I} - \boldsymbol {H}\left (\boldsymbol {x}\right )\).

4.4 Calculating the displacements

The nodal displacements are not obtained explicitly when solving problem (10), but are often required or convenient to have for plotting or calculating compliance etc. Obviously, since the load (4) depends on rT, there is no unique displacement associated to an optimal design x. However, it may be of interest to calculate the displacement \(\boldsymbol {u}\in \mathbb {R}^{n}\) that is associated to the worst-case compliance. To that end we note that (7) shows that the rT that achieves the worst-case compliance is the unit length eigenvector of \(\boldsymbol {H}\left (\boldsymbol {x}\right )\) that is associated with the maximum eigenvalue, if it is unique. Once this rT is established the corresponding u can be calculated by combining (1) and (4) into

$$\boldsymbol{K}\left( \boldsymbol{x}\right)\boldsymbol{u}=\boldsymbol{B}\left( \boldsymbol{x}\right)^{\textsf{T}} \boldsymbol{r} \quad\Leftrightarrow\quad \boldsymbol{u}=\boldsymbol{R}\left( \boldsymbol{x}\right)\boldsymbol{r}, $$

where the last expression follows from (13).

If the maximum eigenvalue of \(\boldsymbol {H}\left (\boldsymbol {x}\right )\) is not unique we may use any linear combination of the eigenvectors corresponding to the multiple eigenvalues and obtain one (of several possible) r, and corresponding u, giving the worst-case compliance.

4.5 Obtaining a black-and-white design

A drawback of using the design variable filter (2) is that a transition layer of elements with intermediate design variable values remain between elements representing the structure (black) and those representing holes (white). One way, suggested in the literature, to avoid this problem is to use a Heaviside projection filter (Guest et al. 2004) where a curvature parameter is increased successively until a black-and-white solution is obtained. The Heaviside filter has been proven to work well for the traditional minimum compliance problem, but we have experienced difficulties on other problem formulations where the discontinuity due to the parameter update has caused convergence problems. Therefore, we use instead a strategy where the problem is solved in two steps: first, the optimization problem is solved until convergence using the design variable filter (2); then all design variables that are at the upper or lower bound (or actually within the distance ε from these bounds) are fixed and only the design variables with intermediate values are allowed to vary in a second optimization, where no filter is used. The set of design variables which are allowed to change is thus smaller in the second step and the design is sufficiently constrained so that no new structural components will be created and no structural components will be removed.

The design obtained after solving (10) in the first step is denoted ρ opt, and the design variables in the second optimization step are denoted \(\hat {\boldsymbol {x}}\in \mathcal {\hat {H}}\), where

$$\begin{array}{@{}rcl@{}} \mathcal{\hat{H}} &=& \left\{ \hat{\boldsymbol{x}}\in\mathbb{R}^{m} \;|\; \varepsilon \leq \hat{x}_{i} \leq 1 \; \textsf{if} \; 2\varepsilon < \rho^{\textsf{opt}}_{i} < 1-\varepsilon,\right.\\ \textsf{else} \; \hat{x}_{i}&=&\left.\rho^{\textsf{opt}}_{i}, \; \sum\limits_{i=1}^{m}d v_{i} \hat{\rho}_{i}\left( \boldsymbol{x}\right) = \overline{M}\right\}. \end{array} $$

The initial design in the second optimization step is \(\hat {\boldsymbol {x}}=\boldsymbol {\rho }^{\textsf {opt}}\), and since no filter is used \(\hat {\boldsymbol {\rho }}=\hat {\boldsymbol {x}}\).

Obvious drawbacks of the proposed two-step strategy are that additional iterations are required and that it is not possible to use filters that are much larger than the average element size; if the transition layer consists of several elements we may obtain checkerboard patterns in the second step.

Interior-point methods such as IPOPT are known to be difficult to warm-start, i.e. to re-start from a given solution, which is the case in the second optimization step. However, by keeping the Lagrangian multipliers associated with the constraints and using these as starting values in the next optimization step together with a small initial value for the barrier parameter (see the documentation included in the IPOPT installation package (IPOPT 2014) for further details) we have not experienced any difficulties with convergence.

5 Gradient calculation

The solver IPOPT uses gradient data in order to iteratively change the design variables towards an optimum design. As we use fminsdp to solve (10), we do not need to treat the auxiliary variables related to the Cholesky factors that fminsdp uses to solve (14), but it is sufficient to provide gradients to the left hand side of the constraint in (10). The gradient with respect to z is straightforward and the gradient with respect to design variable x i reads

$$\begin{array}{@{}rcl@{}} \frac{\partial \boldsymbol{H}\left( \boldsymbol{x}\right)}{\partial x_{i}} & = & \sum\limits_{j=1}^{m} \frac{\partial \boldsymbol{H}\left( \boldsymbol{x}\right)}{\partial \rho_{j}} \frac{\partial \rho_{j}\left( \boldsymbol{x}\right)}{\partial x_{i}} \\ & = & \sum\limits_{j=1}^{m} \frac{\partial \boldsymbol{H}\left( \boldsymbol{x}\right)}{\partial \rho_{j}} {\Psi}_{ji}, \end{array} $$
(15)

where Ψ j i was defined in (2). The derivative with respect to the physical variable ρ j is calculated as

$$ \frac{\partial \boldsymbol{H}\left( \boldsymbol{x}\right)}{\partial \rho_{j}}= \frac{\partial \boldsymbol{B}\left( \boldsymbol{x}\right)}{\partial \rho_{j}} \boldsymbol{R}\left( \boldsymbol{x}\right) +\boldsymbol{B}\left( \boldsymbol{x}\right) \frac{\partial \boldsymbol{R}\left( \boldsymbol{x}\right)}{\partial \rho_{j}}, $$
(16)

where \(\partial \boldsymbol {R}\left (\boldsymbol {x}\right ) / \partial \rho _{j}\) is obtained from (13) as

$$ \frac{\partial \boldsymbol{R}\left( \boldsymbol{x}\right)}{\partial \rho_{j}}=\boldsymbol{K}^{-1} \left( \boldsymbol{x}\right) \left( \frac{\partial \boldsymbol{B}^{\textsf{T}}\left( \boldsymbol{x}\right)}{\partial \rho_{j}} - \frac{\partial \boldsymbol{K}\left( \boldsymbol{x}\right)}{\partial \rho_{j}} \boldsymbol{R}\left( \boldsymbol{x}\right) \right). $$
(17)

Thus, using (16) and (17), (15) can now be written

$$\begin{array}{@{}rcl@{}} \frac{\partial \boldsymbol{H}\left( \boldsymbol{x}\right)}{\partial x_{i}} &=&\sum\limits_{j=1}^{m} \left( \frac{\partial \boldsymbol{B}\left( \boldsymbol{x}\right)}{\partial \rho_{j}} \boldsymbol{R}\left( \boldsymbol{x}\right) + \boldsymbol{R}^{\textsf{T}}\left( \boldsymbol{x}\right) \frac{\partial \boldsymbol{B}^{\textsf{T}}\left( \boldsymbol{x}\right)}{\partial \rho_{j}}\right.\\ &&\qquad \left.- \boldsymbol{R}^{\textsf{T}}\left( \boldsymbol{x}\right) \frac{\partial \boldsymbol{K}\left( \boldsymbol{x}\right)}{\partial \rho_{j}} \boldsymbol{R}\left( \boldsymbol{x}\right) \right) {\Psi}_{ji}, \end{array} $$

where, recalling (4),

$$\frac{\partial \boldsymbol{B}^{\textsf{T}}\left( \boldsymbol{x}\right)}{\partial \rho_{j}}= \overset{m}{\underset{i=1}{\boldsymbol{\mathsf{A}}}} g\frac{1}{1-\varepsilon} \boldsymbol{M}_{i}\boldsymbol{G}_{i}\boldsymbol{N}, $$

and \(\partial \boldsymbol {K}\left (\boldsymbol {x}\right ) / \partial \rho _{j}=p \rho _{j}^{p-1}\boldsymbol {K}_{j}\) follows from (3).

6 Examples

6.1 Parameter settings and material

In this section we show four examples, all solved using the SDP-problem (10), and each showing different challenges. All designs shown satisfy the constraint qualification (12) and the optimality conditions (11), where we check numerically that the infinity norm of the equality conditions is below 10−8.

The same parameter settings are used in all examples: the SIMP-penalization exponent is p=3, the lower bound of the design variables, representing void, is chosen as ε=0.001 and the initial design is a uniform distribution satisfying the mass constraint, unless otherwise is specified in the text. The thickness of the 2D domains is 1 mm and the design material is an aluminum with Young’s modulus 71000 MPa, Poisson’s ratio 0.33 and density 2.8×10−9 tonne/mm3. The figures displaying optimized designs show the design variables in the final design, where black is material and white represents void. Recalling Section 4.5, no filter is used in the final iteration, and thus x=ρ. The maximum acceleration factors n x , n y and n z entering N in (4) are in this section presented as the vector a={n x ,n y ,n z }.

6.2 Example 1 - Self weight arch

As a first example we look at the self weight arch, previously used in self-weight loaded problems by Bruyneel and Duysinx (2005) and Lee et al. (2012). The same example was also used in the introduction, Fig. 1, where we optimized for a standard gravitational acceleration, a={0,−1,0}.

The design domain in Fig. 3a, now discretized with eight-node quadrilateral elements, is supported at the lower corners and has no other loading than its own weight. The dimensions are 2000×1000 mm and the available mass is chosen to be 33 % of the mass of the entire design domain. The physical relevance of this problem may be questioned – a structure which is not supposed to carry any external load appears to be of no use – but it serves as an interesting academic example.

Fig. 3
figure 3

Self weight arch. a: Mesh and boundary conditions, b: Optimized design for the maximum acceleration vector a={1,1,0}

We now seek a design where the compliance is minimized for the worst possible acceleration, and which has lower or equal compliance for all other possible directions. Thus, we allow the acceleration to act with the same magnitude in any direction in the xy-plane, so a={1,1,0}, and we obtain the design shown in Fig. 3b. Compared to the single load case design in Fig. 1b, the worst-case compliance design has a higher compliance for that specific load case, but it is much stiffer for an acceleration in, for example, the x-direction.

6.3 Example 2 - L-shaped beam

The second example is an L-shaped beam, meshed with 6400 eight-node quadrilateral elements as shown in Fig. 4a, where also boundary conditions and a point mass, illustrated as a black circle, are shown. The point mass has the same mass as the final structure, which is constrained to 33 % of the mass of the entire design domain, which has outer dimensions 200×200 mm. In this example we use an elliptical loading region with a higher acceleration in the x-direction, a={10,5,0}. The optimized design, Fig. 4b, does not use the entire height of the design domain as this is not beneficial for acceleration in the x-direction. Further, we find that, because self-weight is considered, much material is placed close to the fixed boundary, and obviously, the structure connects the external mass to the fixed boundary.

Fig. 4
figure 4

L-shaped beam. a: Mesh, boundary conditions and point mass, b: Robust optimized design for the maximum acceleration vector a={10,5,0}, c: Non-robust design obtained for two load cases with maximum acceleration vectors a={0,5,0} and a={10,0,0}

In order to compare the robust design with a conventional, non-robust, design we also solve a minimum weighted compliance problem, with the same mass constraint as for the robust design. The most probable choice of load cases from an engineering point-of-view would be to use two load cases, a={0,5,0} and a={10,0,0}, which is what we have used to obtain the design in Fig. 4c. The compliance for the robust and the non-robust multiple load case designs is compared in Section 6.3.1.

To highlight the fact that multiple eigenvalues are present, not only at the solution but also in most iteration steps towards the solution, we have calculatedFootnote 1 the two eigenvalues of H and created the plot in Fig. 5. The eigenvalues are distinct in the first iterations, but from iteration 20 and onwards we have multiple eigenvalues.

Fig. 5
figure 5

Convergence history of the eigenvalues of H for the design in Fig. 4b, the eigenvalues are multiple from iteration 20

6.3.1 Compliance for other accelerations

In order to see what the compliance for different acceleration loads is, we calculate and plot the compliance of the final designs due to the force in (4), with the vector r varying as

$$ \boldsymbol{r}=(\cos\theta ,\; \sin\theta)^{\textsf{T}}, $$
(18)

for angles 𝜃∈[0,2π], where 𝜃=0 gives a vector parallel to the x-axis. The result is seen in Fig. 6. What is interesting to note is that the optimized robust design has approximately the same compliance for all possible loads defined by (4). The difference between the maximum and the minimum compliance is on the order of 10−8 %. This suggests that, unless there are geometric restrictions, the robust optimization will result in designs which have the same compliance in all directions. This strengthens the arguments in Section 2.4 and motivates the use of SDP compared to solving the eigenvalue problem (9) directly.

Fig. 6
figure 6

Compliance as a function of 𝜃 in (18) for the optimized designs in Fig. 4b and c

For the non-robust, multiple load case design in Fig. 4c we find that the maximum compliance is not attained for any of the two angles (𝜃=0 and 𝜃=π/2) corresponding to the applied loads, implying that this is not a robust design.

6.4 Example 3 - Circular domain

A circular design domain with radius 10 mm is discretized with a rotationally symmetric mesh consisting of second order triangular and quadrilateral elements. The circle is fixed at the outer boundary and has a point mass in the center as shown in Fig. 7a. The filter radius is chosen as R=1 mm and the allowable mass is 40 % of the mass of the entire design domain. Two different sizes of the point mass are used: Fig. 7b shows the optimized design when the point mass has the same mass as the final structure and Fig. 7c shows the optimized design when the point mass is 1000 times heavier than the structure. When the non-structural mass is much heavier than the structure, the design has a quite even thickness of the four structural members in order to minimize the displacement of the non-structural mass, whereas we find that more material is placed close to the fixed boundary when the structural- and non-structural masses are equal.

Fig. 7
figure 7

Circular domain. a: Mesh, boundary conditions and point mass, b: Optimized design for a point mass with the same mass as the structure, c: Optimized design for a point mass much heavier than the structure, d: Same problem as (b) but with another starting point, e: Same problem as (c) but with another starting point

Because of the rotational symmetry, there exist several designs, differing only by a rotation, with the same performance. Thus, there exists several local optima, and if we start from the uniform design design x i =0.35, instead of x i =0.40, we obtain the same, but rotated, designs seen in Fig. 7d and e.

6.5 Example 4 - Hemisphere

In this example we have modelled a hemisphere with radius 200 mm and filled it with a linear tetrahedral mesh, resulting in 327493 elements. All nodes on the flat surface are fixed in all directions and a point mass is applied as shown in Fig. 8a. The point mass is 10 times heavier than the final structure which is constrained to 5 % of the mass of the entire design domain. The maximum possible acceleration is the same in all directions, hence a={1,1,1}. The optimized design, Figs. 8b, c and d, is a tripod structure; i.e., a structure with three equally sized legs with 120 degree angle between them.

Fig. 8
figure 8

Hemisphere. a: Mesh, boundary conditions and point mass, b, c, d: Optimized design, isometric view, top view and side view

7 Conclusions

The proposed SDP-formulation provides a computationally tractable form of worst-case compliance topology optimization problems, previously solved with eigenvalue formulations which are non-smooth when multiple eigenvalues occur. The formulation of the SDP-problem, with a small matrix inequality constraint, makes the problem computationally efficient and, even though an infinite number of load cases are taken into account, the computational cost is similar to that of solving a standard compliance minimization problem with the same number of load cases as there are spatial dimensions. The presented scaling of the mass matrix (Section 2.2) allows us to optimize self-weight loaded structures with a SIMP-formulation without introducing non-differentiable functions. The method for obtaining a black-and-white design (Section 4.5), while still avoiding mesh dependency by limiting the smallest size of structural members, has proven to work very well for the presented examples, in which the final designs are almost completely black-and-white. Finally, necessary first-order optimality conditions are derived and checked numerically, providing confidence in that we have found locally optimal designs.