1 Introduction

Topology optimization of continuum structures has become a popular design tool in industry due to the design freedom it provides. However, in most applications, topology optimization is used in the early design phase, and there is still a relatively large gap between the optimized design and the final design for manufacturing. The topology optimized design is generally followed by a number of post-processing steps to make the design suitable for manufacturing and meet relevant failure criteria, such as stress and buckling constraints. Directly including stress constraints in topology optimization has been an important field of study because this reduces the gap between the optimized and final design. However, several difficulties arise when including stress constraints in topology optimization.

One of the major difficulties is that the correct optima are often inaccessible to standard gradient-based optimization techniques. These inaccessible optima are known as ‘singular optima’, and have been first observed in truss optimization by Sved and Ginos (1968). They demonstrated on a three-bar truss example that the optimum is a solution in which one of the original members vanishes. However, the stress constraint on that member prevented eliminating this member by standard gradient-based optimization. Kirsch (1989, 1990) investigated the characteristics of singular optima, and demonstrated that these optima are located in a lower dimensional subdomain of the feasible domain. In general, singular optima arise in optimization problems that are of the type ‘mathematical programs with vanishing constraints’ (MPVC’s) (Achtziger and Kanzow 2008). Stress-constrained topology optimization belongs to this class of problems. For a detailed discussion on singular optima and its main characteristics, we refer to Rozvany (2001a) and the references therein.

Another fundamental difficulty is that the stress is a local state variable, which leads to a large number of constraints. For other topology optimization problems with few responses and many design variables, the sensitivities can be calculated efficiently using an adjoint formulation. However, since for stress-constrained problems the number of constraints design variables are of the same order, there is no benefit in using an adjoint formulation. Consequently, the potentially large number of local constraints leads to a computationally expensive sensitivity analysis.

Several solutions have been proposed to tackle these difficulties. The most common approach is to subsequently apply (i) constraint relaxation to make singular optima accessible, and (ii) constraint aggregation to deal with the large number of local constraints. Constraint relaxation techniques replace the original set of constraints by smooth approximations. This operation perturbs the feasible domain, and makes singular optima accessible. Constraint relaxation techniques that have been applied are ε-relaxation (Cheng and Guo 1997), the qp-approach (Bruggi 2008), and considering a ‘relaxed’ stress (Le et al. 2009). Constraint relaxation is then generally followed by constraint aggregation. Following this approach, the relaxed local constraints (or stresses) are lumped into a global constraint using an aggregation function that approximates the maximum local function value. This transformation drastically reduces the computational costs of the adjoint sensitivity analysis. Examples of aggregation functions that have been applied in literature are the Kreisselmeier-Steinhauser function (KS-function hereafter) (Kreisselmeier 1979; Yang and Chen 1996), and the P-norm (Duysinx and Sigmund 1998). Recently, the authors have proposed an alternative solution (Verbart et al. 2015).

The combined relaxation and aggregation approach introduces two additional parameters: the relaxation parameter, which controls the perturbation effect on the original feasible domain, and an aggregation parameter, which controls the quality of the approximation of the maximum local function value. A difficulty is that the optimal choice for the parameter values in computational practice is generally very problem dependent, and therefore, difficult to determine a priori. Furthermore, we demonstrate in this paper that the feasible domain of the optimization problem with constraint relaxation and aggregation depends in a non-trivial way on the problem parameters.

In order to overcome these difficulties, this paper unifies these two concepts of constraint relaxation and aggregation. The first step is to reformulate the original optimization problem with a design-dependent set of stress constraints into an equivalent optimization problem with a design-independent set of constraints. Next, we apply constraint aggregation using a lower bound aggregation function without separately relaxing the local constraints. We demonstrate that constraint aggregation using a lower bound aggregation function perturbs the original feasible domain, and makes singular optima accessible. Consequently, no separate relaxation techniques are necessary. The main advantage is that the optimization problem only depends on a single aggregation parameter, which reduces the parameter dependence of the problem. Furthermore, there is a clear relationship between the original feasible domain and the perturbed feasible domain in terms of this aggregation parameter.

The remainder of this paper is structured as follows. Section 2 presents the general framework of density-based topology optimization with stress constraints. Section 3 discusses relaxation and conventionally used aggregation strategies, which are generally applied separately. Both these solution strategies are unified in the novel approach presented in Section 4. Section 5 discusses the results obtained by testing the method on several design cases on which we investigated the parameter- and mesh dependency of the optimized designs. Finally, conclusions are drawn in Section 6.

2 Stress-constrained topology optimization

This section presents density-based topology optimization with stress constraints considering homogenous linear elastic isotropic material following a SIMP formulation (Bendsøe 1989).

2.1 SIMP model

We consider density-based topology optimization to find the optimal distribution of a material domain Ωmat inside a larger design domain Ω. Following this approach, the design domain is discretized into finite elements, and a density variable ρ is assigned to each element. The density design variables can then vary between zero and one, representing void and solid material, respectively. The governing equations for static equilibrium in terms of the density design variables are defined as

$$ \mathbf{E}(\mathbf{u}(\boldsymbol{\rho}),\boldsymbol{\rho})= \mathbf{K}(\boldsymbol{\rho})\mathbf{u}(\boldsymbol{\rho}) - \mathbf{f} = \mathbf{0}, $$
(1)

where \(\boldsymbol {\rho }=\left ({\rho _{1},\rho _{2},...,\rho _{N}}\right )^{\mathsf {T}}\) denotes the vector with N density design variables, K denotes the global stiffness matrix, u denotes the vector with nodal displacements, and f denotes the design-independent load vector.

The global stiffness matrix is composed out of the local element stiffness matrices as

$$ \mathbf{K} = \sum\limits_{e \in \Omega^{d}} \mathbf{K}_{e}(\left<E_{e}\right>). $$
(2)

Here, Ωd denotes the discretized design domain; i.e., set of indices of all elements within the design domain. In this paper, we use \(\left < . \right >\) to indicate homogenized quantities, therefore, \(\left <E_{e} \right >\) denotes the homogenized (i.e., effective) Young’s modulus, which we define following the SIMP model as

$$ \left<E_{e} \right>= {\rho_{e}^{p}} E_{0}, \quad \text{where } p>1. $$
(3)

Here, E 0 denotes the Young’s modulus associated with solid densities (ρ e =1). The exponent p is chosen larger than one, which makes intermediate density material unfavorable in terms of stiffness to promote a black and white design.

The original SIMP model in (3) requires a small non-zero lower bound on the design variables to prevent singularity of the global stiffness matrix (\(0<\rho _{{\min }}\ll 1\)). An alternative formulation, which allows the densities to vary between zero and one, is the modified SIMP model (Sigmund 2007):

$$ \left<E_{e}\right>= E_{{\min}} + {\rho_{e}^{p}} (E_{0}-E_{{\min}}). $$
(4)

Here, \(E_{{\min }}\) is a lower bound to the Young’s modulus (e.g., \(E_{{\min }}=10^{-9}E_{0}\)). In this paper, we adopt this modified SIMP formulation.

2.2 Problem formulation

First, we present the original topology optimization problem with stress constraints. Since the constraints are only defined on material elements, this topology optimization problem is known in literature as a topology optimization problem with ‘design-dependent constraints’Footnote 1 (Rozvany 2001a), also known as ‘vanishing constraints’ (Achtziger and Kanzow 2008). Next, we reformulate the original optimization problem as an optimization problem with a fixed design-independent set of constraints.

2.2.1 Original optimization problem

The stress-constrained topology optimization problem in its nested form is defined as

$$\begin{array}{@{}rcl@{}} (\mathbb{P}_{0}):\; {\underset{\boldsymbol\rho\in S}{\min}}\, & & V = \frac{1}{V_{0}}\sum\limits_{e\in\Omega^{d}}\rho_{e} v_{e}, \end{array} $$
$$\begin{array}{@{}rcl@{}} \text{s.t.}\,\,\,&& g_{j} = \frac{|\boldsymbol\sigma_{j}|}{\sigma_{{\lim}}}-1 \leq0, \quad\forall j \in \Omega_{\text{mat}}^{d}(\boldsymbol{\rho}). \end{array} $$
(5)

Here, V 0 denotes the total volume of the design domain, v e denotes the volume (area in 2D) of a finite element, |σ| represents a positive scalar-valued equivalent stress criterion such as the Von Mises stress that depends on the symmetric stress tensor σ. The equivalent stress is bounded by the allowable stress \(\sigma _{{\lim }}\). The stress constraints g j are only defined over the material domain:

$$\varOmega_{\text{mat}}^{d} := \left\{ \left.{j\in\varOmega^{d}} \right| {\rho_{j} > 0}\right\},$$
(6)

which in the discretized context is the set of indices of all elements with a strictly positive density. Finally, the design space in which we search for a solution is defined as

$$ S := \left\{\left.\boldsymbol{\rho} \in \mathbb{R}^{N}\right| \mathbf{0} \leq \boldsymbol\rho \leq \mathbf{1}, \, \mathbf{E}(\mathbf{u}(\boldsymbol{\rho}),\boldsymbol{\rho})=\mathbf{0}\right\}. $$
(7)

Here, E = 0 are the equations of static equilibrium defined in (1). In other words, we only consider solutions where static equilibrium is satisfied.

The reason that the constraints are only defined on the material domain, \(\Omega _{\text {mat}}^{d}\), is that physically the stress should be zero in void regions. However, in density-based topology optimization, one converts the topology optimization problem in a continuum setting, into a sizing optimization problem by modeling void as very compliant material. In this model, the stress typically attains a finite value at zero density (assuming finite strains), which corresponds with the stress in an element with infinitesimal density. A similar phenomenon is known from truss optimization where the stress in a member converges to a non-zero ‘limiting stress value’ (Cheng and Jiang 1992) when a member vanishes from the structure (again assuming finite strains). Consequently, the model fails to represent the correct physics when material vanishes.

2.2.2 Mathematical program with vanishing constraints

An alternative but equivalent formulation of the optimization problem \((\mathbb {P}_{0})\) in (5) was first proposed by Cheng and Jiang (1992). Later, Achtziger and Kanzow (2008) demonstrated that such a reformulation is generally applicable to the class of optimization problems known as mathematical programs with vanishing constraints (MPVC’s) assuming continuous differentiable functions. Topology optimization with stress constraints belongs to this class of problems.

Following this approach, the design-dependent set of constraints in \((\mathbb {P}_{0})\) is reformulated into a new design-independent set of constraints defined over the entire design domain. The reformulated optimization problem \((\overline {\mathbb {P}}_{0})\) is defined as

$$\begin{array}{@{}rcl@{}} (\overline{\mathbb{P}}_{0}):\;{\underset{\boldsymbol\rho\in S}{\min}} && \,V = \frac{1}{V_{0}}\sum\limits_{e\in\Omega^{d}}\rho_{e} v_{e} \end{array} $$
$$\begin{array}{@{}rcl@{}} \text{s.t.}\,\, &&\overline{g}_{j}= \rho_{j} g_{j} \leq 0, \quad\forall j \in \Omega^{d}. \end{array} $$
(8)

The new constraints \(\overline {g}_{j}\) are defined over the entire design domain Ωd instead of the design-dependent set \(\Omega _{\text {mat}}^{d}\). The reformulated constraints are always satisfied when a member vanishes; i.e., \(\overline {g}_{j}=0\) when ρ j =0. The optimization problems \((\mathbb {P}_{0})\) and \((\overline {\mathbb {P}}_{0})\) are equivalent in the sense that their feasible domain is identical, and a minimizer ρ to the reformulated optimization problem \((\overline {\mathbb {P}}_{0})\) is also a minimizer to \((\mathbb {P}_{0})\).

The advantage of formulation \((\overline {\mathbb {P}}_{0})\) over \((\mathbb {P}_{0})\) is that the set of constraints is design-independent, and therefore, suitable for standard gradient-based optimization techniques. We note that this reformulation does not solve the difficulty of singular optima, but relaxation techniques can be applied to this reformulated optimization problem \((\overline {\mathbb {P}}_{0})\).

2.3 Stress formulation

A difficulty in density-based topology optimization is that the stress is non-uniquely defined for intermediate densities. Assuming that the densities in SIMP represent a porous microstructure, one can distinguish the stress at a macroscopic- and microscopic level. Here, we briefly discuss the macroscopic stress, and the microscopic stress commonly used in density-based topology optimization (Duysinx and Bendsøe 1998).

2.3.1 Macroscopic stress

The macroscopic stress is based on the effective Young’s modulus following the SIMP model in (3). If we assume that intermediate density represents certain configurations of a microstructure, we can interpret the macroscopic stress as the stress based on the homogenized material properties of the microstructure. The macroscopic stress tensor for an element in Voigt notation is defined as

$$ \left<\boldsymbol{\sigma_{e}}\right> = \mathbf{C}_{e}(\left< E_{e} \right>) \left<\boldsymbol{\epsilon_{e}}\right>. $$
(9)

Here, \(\mathbf {C}_{e}(\left < E_{e} \right >) \) is the elasticity matrix based on the homogenized Young’s modulus in (3), and \(\left <\boldsymbol {\epsilon _{e}} \right >\) is the infinitesimal strain tensor.

Unfortunately, the macroscopic stress is not suitable for stress-constrained topology optimization, since it does not correctly predict failure at the microscopic level for intermediate densities (Duysinx and Bendsøe 1998). Furthermore, the macroscopic stress leads to an all-void design in topology optimization (Le et al. 2009). A solution is to consider the stress experienced at the microscopic level.

2.3.2 Microscopic stress

Duysinx and Bendsøe (1998) proposed a stress model that mimics the behavior of the ‘local stress’ in a rank-2 layered composite. Each density variable can then be expressed in terms of the thicknesses of the layers. The microscopic stress is the stress experienced in the layers. To mimic the behavior of the stress in such material, the microscopic stress in density-based topology optimization should be: (i) inversely proportional to the density variable, and (ii) converge to a finite stress value at zero density. The last conditions follow from studying the asymptotic behavior of the microscopic stress in the layers as the thickness of a layer goes to zero. A definition consistent with condition (i) is

$$ \boldsymbol{\sigma}_{e}=\frac{\left<\boldsymbol{\sigma}_{e}\right>}{{\rho_{e}^{q}}} = \rho_{e}^{p-q}\mathbf{C}_{e}(E_{0}) \left<\boldsymbol{\epsilon}_{e} \right>. $$
(10)

The value of the exponent q should be chosen such that the stress satisfies condition (ii). This condition is only satisfied for q = p. Thus, the microscopic stress is defined as

$$ \boldsymbol{\sigma}_{e} = \mathbf{C}_{e}(E_{0})\left<\boldsymbol{\epsilon}_{e}\right>. $$
(11)

This definition of the microscopic stress has been commonly used in stress-constrained topology optimization, and will also be used in this paper.

2.4 Summarizing remarks

Summarizing, our aim is to find an optimum to the optimization problem \((\mathbb {P}_{0})\) stated in (5), which is equivalent to finding an optimum to the reformulated optimization problem \((\overline {\mathbb {P}}_{0})\) in (8). We consider an equivalent stress criterion based on the microscopic stress defined in (11).

As mentioned before, \((\overline {\mathbb {P}}_{0})\) cannot be solved directly because of singular optima, and the potentially large number of local constraints. Solution techniques have to be applied to circumvent these difficulties. Before introducing our new approach, we briefly discuss the common solution techniques used to deal with these difficulties.

3 Constraint relaxation and aggregation

The presence of singular optima, and potentially large number of local constraints make it difficult to solve \((\overline {\mathbb {P}}_{0})\) directly. The most common approach is to subsequently (i) relax the constraints to make singular optima accessible, and (ii) apply constraint aggregation to deal with the large number of constraints. In this section, we discuss both solutions independently and investigate the parameter dependence of the combined approach in which constraint relaxation is followed by relaxation.

3.1 Constraint relaxation

We demonstrate the effect of constraint relaxation on the accessibility of singular optima using a two-bar truss problem.

3.1.1 Two-bar truss optimization problem

We consider the two-bar truss example shown in Fig. 1 (Stolpe 2003). The optimization problem is to minimize its mass subjected to an allowable stress \(\sigma _{{\lim }}\), which is equal in tension and compression and bounds the absolute stress value |σ e | in each member. The design variables are the cross-sectional areas A 1 and A 2. Both members have a Young’s modulus E, and ρ e and L e denote the density and the length of the e-th member, respectively. The stress in the members is given by

$$ \sigma_{1} = \frac{PL_{2}}{A_{1}L_{2}+A_{2}L_{1}}, \quad \sigma_{2} = -\frac{PL_{1}}{A_{1}L_{2}+A_{2}L_{1}}. $$
(12)

The original optimization problem with vanishing stress constraints is defined as

$$\begin{array}{@{}rcl@{}} (\mathbb{P}_{0}):\quad\min_{\mathbf{A}\in S} &&m=\sum\limits_{e\in\Omega^{d}}{\rho_{e} A_{e} L_{e}}, \end{array} $$
$$\begin{array}{@{}rcl@{}} \text{s.t.}\,\,&& g_{j}=\left( \frac{| \sigma_{j} | }{\sigma_{{\lim}}}-1\right) \leq 0, \quad\forall j \in \Omega^{d}_{\text{mat}}(\mathbf{A}), \end{array} $$
$$\begin{array}{@{}rcl@{}} &&\hspace{0cm} \mathbf{0} \leq \mathbf{A}\leq A_{{\max}}\mathbf{1}. \end{array} $$
(13)

Here, \(\mathbf {A}= \left ({A_{1},A_{2}}\right )^{\mathsf {T}}\) denotes the vector with the cross-sectional areas, S is the design space where all configurations of A satisfy the equilibrium equations, and \(A_{{\max }}\) is the maximum allowable cross-sectional area, which is assumed to be equal for all elements. In this example, we used \(A_{{\max }}=2\). Finally, \(\Omega ^{d}_{\text {mat}} \subseteq \Omega ^{d}\) is the set of indices of members with a strictly positive cross-sectional area.

Fig. 1
figure 1

Two-bar truss (Stolpe 2003). The optimization problem is to minimize mass by varying the cross-sectional areas A 1 and A 2 without exceeding the allowable stress

Because we use the absolute value of the stress, each constraint can be rewritten as a pair of constraints. However, for this load case, the left member is always in tension and the right member is always in compression. Consequently, two of the four constraints become redundant and are therefore not considered.

Figure 2a shows the design space of \((\mathbb {P}_{0})\). The gray lines are the isocontours of the objective function. The red line corresponds with the stress constraint in tension of the left member, and the blue line corresponds with the stress constraint in compression of the right member. The blue open circle in point F indicates that the constraint g 2 is not defined at A 2=0 since the constraint vanishes together with the structural member. The reason that stress constraints are removed from the problem at zero cross-section is that the stress may be non-zero in the limit. In this example, the stress in the right member exceeds the allowable stress along DF, and taking the constraint into account at zero cross-section would therefore wrongfully qualify the subdomain DF as infeasible.

Fig. 2
figure 2

Design space for the two-bar truss problem in Fig. 1 for both formulations and the associated feasible domain, which is identical

The set of constraints in (13) is design-dependent and prevents direct use of standard gradient-based optimization techniques. As discussed in Section 2.2.2, \((\mathbb {P}_{0})\) belongs to the class of MPVC’s (Achtziger and Kanzow 2008), and can be reformulated as

$$\begin{array}{@{}rcl@{}} (\overline{\mathbb{P}}_{0}):\quad&{\underset{\mathbf{A}\in S}{\min}}\quad &m=\sum\limits_{e\in\Omega^{d}}{\rho_{e} A_{e} L_{e}}, \end{array} $$
$$\begin{array}{@{}rcl@{}} &\text{s.t.}\,\,&\overline{g}_{j}=\left( \frac{A_{j}}{A_{{\max}}}\right) g_{j} \leq 0, \quad\forall j \in \Omega^{d}, \end{array} $$
$$\begin{array}{@{}rcl@{}} &&\hspace{0cm} \mathbf{0} \leq \mathbf{A}\leq A_{{\max}}\mathbf{1}. \end{array} $$
(14)

Here, the original constraints are premultiplied by the normalized cross-sectional area of the members they belong to. The new set of constraints is defined over the entire design domain Ωd and thus design-independent. Notice that normalization of the cross-sectional area is not strictly necessary but ensures that the new set of constraints is also dimensionless.

Figure 2b shows the design space for the reformulated problem \((\overline {\mathbb {P}}_{0})\). For reasons of clarity, we omit the isocontours of the objective function. In this case, the constraint represented by the blue line is also defined in point F. The feasible domain for both formulations is the same and is shown in Fig. 2c. Since the set of constraints is design-independent standard gradient-based optimization techniques can be applied to \((\overline {\mathbb {P}}_{0})\).

However, it has been demonstrated that for this type of problems, true optima cannot be reached since they reside in a lower-dimensional subdomain of the feasible domain (Kirsch 1989, 1990). In this problem any standard gradient-based optimizer will converge to point B located in A B =(0,1), where the mass is m B =4/5. However, this is not the true optimum. The true optimum is located in point D at the left end of the one-dimensional subdomain D-F. This subdomain is part of the feasible domain since the cross-sectional area of the second member is zero. In point A D =(1,0) the mass of the structure is m D =3/5. In computational practice, the subdomain D-F, and therefore the true optimum D, is inaccessible since it is of a lower dimension than the ‘main body’ of the feasible domain. Point D is known in literature as a singular optimum (Kirsch 1989).

3.1.2 Constraint relaxation

In general, relaxation techniques, such as ε-relaxation (Cheng and Guo 1997) and the qp-approach (Bruggi and Venini 2008), are applied to tackle the difficulty of singular optima. Instead of the original set of constraints, a set of relaxed constraints is considered. By relaxing the constraints, the original feasible domain is perturbed such that singular optima become accessible.

Here, we briefly discuss ε-relaxation since it has a clear relationship to the original problem \((\overline {\mathbb {P}}_{0})\). The idea is to relax the original set of constraints in (14) by introducing a small relaxation parameter 0<ε≪1. The relaxed optimization problem \((\mathbb {P}_{\varepsilon })\) is defined as

$$\begin{array}{@{}rcl@{}} (\mathbb{P}_{\varepsilon}):\quad&{\underset{\mathbf{A}\in S}{\min}}\quad &m=\sum\limits_{e\in\Omega^{d}}{\rho_{e} A_{e} L_{e}}, \end{array} $$
$$\begin{array}{@{}rcl@{}} &\text{s.t.}\,\,\,\,&\tilde{g}_{j}=\overline{g}_{j} -\varepsilon \leq 0, \quad\forall j \in \Omega^{d}, \end{array} $$
$$\begin{array}{@{}rcl@{}} &&\hspace{0cm} \mathbf{0} \leq \mathbf{A}\leq A_{{\max}}\mathbf{1}, \end{array} $$
(15)

where \(\overline {g}_{j}\) are the constraints as defined in (14).

Figure 3 shows the effect of relaxation on the feasible domain for ε=0.01. Relaxation makes the true optimum D accessible by widening the subspace D-F. Solving the relaxed problem will give an optimal solution close to D, where both constraints intersect. Cheng and Guo (1997) demonstrated that the optimum solution \(\mathbf {A}_{\varepsilon }^{*}\) of the relaxed problem \((\mathbb {P}_{\varepsilon })\) converges to the optimum solution \(\mathbf {A}_{0}^{*}\) of \((\overline {\mathbb {P}}_{0})\) as the relaxation parameter tends to zero: i.e., \(\Vert \mathbf {A}_{\varepsilon }^{*}-\mathbf {A}_{0}^{*} \Vert \rightarrow 0\) as \(\varepsilon \rightarrow 0\). Therefore, ε-relaxation has been applied sometimes in a continuation strategy beginning with a relatively large amount relaxation, and gradually decreasing the relaxation parameter during optimization (see, e.g., Duysinx and Bendsøe 1998; Duysinx 1999).

Fig. 3
figure 3

Design space of \((\mathbb {P}_{\varepsilon })\) for ε=0.01. The dashed lines correspond to the original constraints of \((\overline {\mathbb {P}}_{0})\)

However, Stolpe and Svanberg (2001) demonstrated that the ’global trajectory’ may be discontinuous with respect to the relaxation parameter. Here, global trajectory is defined as the path of the global solution in the design space with respect to the relaxation parameter; e.g., \(\mathbf {A}_{\varepsilon }^{*}(\varepsilon )\). The global trajectory \(\mathbf {A}_{\varepsilon }^{*}(\varepsilon )\) with respect to \((\mathbb {P}_{\varepsilon })\) may suddenly jump from location within the design space for arbitrary small ε>0. Consequently, following a sequence of solutions to the ε-relaxed problem in a continuation strategy does not guarantee finding the true optimum, even when the starting point is a global optimum of the relaxed problem.

3.2 Constraint aggregation

The most common approach to deal with the large number of constraints is constraint aggregation. Following this approach, the local constraints are lumped together into a global constraint using an aggregation function. Instead of many local constraints, only a single aggregated constraint is considered, which drastically decreases the computational costs of sensitivity analysis.

Several aggregation functions have been used in literature; e.g., the Kreisselmeier-Steinhauser (KS) function (Kreisselmeier 1979; Yang and Chen 1996) and the P-norm, and P-mean (Duysinx and Sigmund 1998; Le et al. 2009). These aggregation functions have in common that they transform a set of local function values into a scalar function. This scalar function depends on an aggregation parameter P>0, and converges in the limit to the maximum local function value:

$$ \lim_{P\rightarrow \infty} \Psi(\mathbf{f};P) = \max(f_{1},f_{2},...,f_{N}). $$
(16)

Here, \(\mathbf {f}=\left ({f_{1}, f_{2}, ..., f_{N}}\right )^{\mathsf {T}}\) denotes a vector in which the entries are the local function values, and Ψ is the scalar aggregation function.

Some aggregation functions approximate the maximum local function value from above, and others from below. Depending on this characteristic behavior the aggregation function forms an upper- or lower-bound to the maximum local function value. As will become clear later, this characteristic is important for the proposed approach in this paper. First, we briefly discuss aggregation functions that have been used in literature.

3.2.1 P-norm and P-mean

Under the assumption that the local function values in f are non-negative, two aggregation functions that satisfy the asymptotic behavior in (16) are the P-norm and P-mean, which are defined as

$$ \Psi_{\text{PN}}^{U} = \left( \sum\limits_{i=1}^{N} {f_{i}^{P}} \right)^{1/P}, $$
(17)

and

$$ \Psi_{\text{PM}}^{L} = \left( \frac{1}{N}\sum\limits_{i=1}^{N} {f_{i}^{P}} \right)^{1/P}, $$
(18)

respectively.

The difference between these two aggregation functions is that the P-norm is an upper bound, and the P-mean is a lower bound to the maximum local function value:

$$ \Psi_{\text{PM}}^{L}\leq \max(f_{1},f_{2},...,f_{N}) \leq \Psi_{\text{PN}}^{U}. $$
(19)

We use superscripts U and L, to denote an upper and lower bound aggregation function, respectively. The P-norm and P-mean have been mostly used to aggregate non-negative stress criteria, such as the Von Mises stress, into a global stress function (see, e.g., Le et al. 2009; Holmberg et al. 2013).

3.2.2 KS-function and lower bound KS-function

Another aggregation function often used is the KS-function (Kreisselmeier 1979; Yang and Chen 1996), which is defined as

$$ \Psi_{\text{KS}}^{U} = \frac{1}{P}\ln\left( \sum\limits_{i=1}^{N} e^{P f_{i}}\right). $$
(20)

Here, we used the superscript U to emphasize that the KS-function forms an upper bound to the maximum local function value. For any P>0, the KS-function overestimates the maximum local function value.

The maximum difference between KS-function and maximum local function value \(f_{{\max }}\) occurs when all local function values are equal, and is defined as

$$ \frac{1}{P}\ln\left( N e^{P f_{{\max}}}\right)-f_{{\max}} = \frac{1}{P}\ln\left( N\right). $$
(21)

Subtracting this maximum difference of the original KS-function gives a lower bound to the maximum local function value defined as

$$ \Psi_{\text{KS}}^{L} = \Psi_{\text{KS}}^{U}-\frac{1}{P}\ln{\left( N\right)}= \frac{1}{P}\ln\left( \frac{1}{N}\sum\limits_{i=1}^{N} e^{P f_{i}}\right). $$
(22)

We will refer to \(\Psi _{\text {KS}}^{L}\) as the lower bound KS-function, which also has been used by some researchers (París et al. 2009; Luo et al. 2012).

Similar to the P-norm and P-mean, the upper and lower bound KS-function satisfy the asymptotic behavior of (16). However, for the KS-function the local function values are not restricted to non-negative values. Consequently, in contrast to the P-norm and P-mean, the KS-function is often applied over the constraint functions (París et al. 2010; Luo et al. 2013) in contrast to the relaxed stresses (Le et al. 2009).

3.3 Subsequent relaxation and aggregation

Finally, we consider the conventional approach of subsequently applying constraint relaxation followed by constraint aggregation. On the two bar truss example we show that, in computational practice, the feasible domain of this approximate optimization problem depends in a non-trivial way on the problem parameters. First, we relax the constraints by ε-relaxation, followed by constraint aggregation using the upper bound KS-function in (20). The approximate optimization problem is then formulated as minimizing mass subject to a global constraint:

$$ \Psi_{\text{KS}}^{U}(\tilde{\mathbf{g}}(\mathbf{A};\varepsilon);P) = \frac{1}{P}\ln\left( \sum\limits_{i=1}^{N} e^{P \tilde{g}_{i}}\right)\leq0, $$
(23)

where \(\tilde {g}_{i}\) are the ε-relaxed constraints defined in (15).

The global constraint depends on the relaxation parameter ε and aggregation parameter P. Fig. 4 shows the constraint surface (\(\Psi _{\text {KS}}^{U}=0\)) represented by the green line. The magenta color represents the original unperturbed feasible domain, and point D denotes the true optimum. The constraint surface is plotted for parameter values close to their limits; i.e., a small relaxation parameter ε=10−6, and a large aggregation parameter P=106. We observe that the feasible domain of the approximate optimization problem (i.e., the region to the right of the green line) approximates the original feasible domain when approaching the limit of both parameters.

Fig. 4
figure 4

The green line represents the constraint surface (\(\Psi _{\text {KS}}^{U}(\mathbf {A};\varepsilon ,P)=0\)) for subsequent ε-relaxation followed by aggregation using the upper bound KS-function. The aggregation- and relaxation parameter were chosen as P=106 and ε=10−6, respectively. The magenta color filled region represents the original unperturbed feasible domain

Although the feasible domain of the approximate optimization problem converges to the original feasible domain, in computational practice, the problem parameters are chosen far from these limits (e.g., P=20 and ε=0.01, París et al. 2009). The reason is that a large value of the aggregation parameter may cause numerical instabilities, and a too small value of the relaxation parameter does not provide sufficient relaxation to make singular optima accessible. Next, we investigate the effect of both parameters on the feasible domain of the approximate optimization problem.

Figure 5a shows the constraint surface for increasing values of the aggregation parameter and a constant relaxation parameter ε=0.1. The arrow shows the effect of increasing the aggregation parameter. We observe that increasing the aggregation parameter for a fixed relaxation parameter does not necessarily give a better approximation of the true optimum. The global optimum of the approximate optimization problem may deviate more from the true optimum as the aggregation parameter is increased. Figure 5b shows a similar result when decreasing the relaxation parameter for a fixed value of the aggregation parameter P=10. We observe that as the relaxation parameter approaches its limit, the global optimum of the approximated optimization problem is not necessarily closer to the true optimum in D.

Fig. 5
figure 5

a) Isocontours of the KS-function for increasing values of the aggregation parameter, P=2.5,5,10,40, and a fixed value of the relaxation parameter ε=0.1, and b) isocontours of KS-function for decreasing values of the relaxation parameter ε=1/4,1/16,1/64,1/256 and a fixed value of the aggregation parameter P=10

In conclusion, increasing the aggregation parameter for a constant relaxation parameter may produce a feasible domain in which the global optimum deviates more from the true optimum. The same behavior occurs visa versa when decreasing the relaxation parameter while keeping the aggregation parameter constant. This non-trivial dependence makes it difficult to choose optimal parameter values. In addition, these findings indicate that continuation strategies applied to a single parameter while keeping the other parameter constant may not lead to improved designs. Next, we propose a novel unified approach, in which we demonstrate that constraint relaxation is not necessary when applying constraint aggregation. This reduces the previously shown parameter dependence of the problem.

4 A unified aggregation and relaxation approach

In this section, we propose a unified aggregation and relaxation approach. We demonstrate that aggregating the constraints using a lower bound aggregation function simultaneously relaxes the feasible domain. Consequently, there is no need for additional relaxation techniques and the problem only depends on a single aggregation parameter. Finally, we demonstrate that using a lower bound KS-function can be considered as a special case of ε-relaxation combined with constraint aggregation using the original upper bound KS-function.

4.1 Problem formulation

Here, we present the approach in the context of truss optimization, and apply it to the two-bar truss example of Section 3.1.1. The approach consists of two steps: (i) reformulate the original problem \((\mathbb {P}_{0})\) in (13) into an equivalent optimization problem \((\overline {\mathbb {P}}_{0})\) in (14), and (ii) aggregate these reformulated constraints using a lower bound aggregation function. The resulting optimization problem formulation with a single aggregated constraint is

$$\begin{array}{@{}rcl@{}} ({\mathbb{P}_{P}^{L}}):\quad&{\underset{\mathbf{A}\in S}{\min}}\,\, &m=\sum\limits_{e\in\Omega^{d}}{\rho_{e} A_{e} L_{e}}, \end{array} $$
$$\begin{array}{@{}rcl@{}} &\text{s.t.}&G^{L}(\Psi^{L}(\mathbf{\overline{g}};P)) \leq 0, \end{array} $$
$$\begin{array}{@{}rcl@{}} &&\hspace{0cm} \mathbf{0} \leq \mathbf{A}\leq A_{{\max}}\mathbf{1}, \end{array} $$
(24)

Here, G L denotes the global constraint function, which depends on a lower bound aggregation function ΨL, which aggregates the reformulated constraints defined as

$$ \overline{g}_{j}=\frac{A_{j}}{A_{{\max}}} \left( \frac{|\sigma_{j}|}{\sigma_{{\lim}}}-1\right) \quad \forall j \in \Omega^{d}. $$
(25)

Next, we use the P-mean \(\left (\Psi ^{L}_{\text {PM}}\right )\) and lower bound KS-function \(\left (\Psi ^{L}_{\text {KS}}\right )\), and demonstrate the effect of using this formulation on the original feasible domain. When using the lower bound KS-function, we aggregate directly over the reformulated constraints in (25); i.e., we substitute \(f_{i}=\overline {g}_{i}\) in (22). Therefore, the global constraint is simply defined as \(G^{L}_{\text {KS}}=\Psi _{\text {KS}}^{L}\).

For the P-mean we first rewrite the set of original constraints in (25) as

$$ \overline{g}_{j}-\overline{g}_{{\min}} \leq -\overline{g}_{{\min}},\quad \forall j\in\Omega^{d}. $$
(26)

Here, \(\overline {g}_{{\min }}=-1\), which is the minimum possible value that the constraints in (25) can take. By subtracting this constant we ensure that the left hand side of (26) is non-negative. The P-mean can then be applied over the left hand side; i.e., we substitute \(f_{i}=\overline {g}_{i}+1\) in (18). The global constraint function in (24) based on the P-mean is then defined as

$$ G^{L}_{\text{PM}}= \left( \frac{1}{N}\sum\limits_{i=1}^{N} (\overline{g}_{i}+1)^{P} \right)^{1/P} -1\leq 0. $$
(27)

Figure 6 shows the design spaces for the problem formulation \(({\mathbb {P}_{P}^{L}})\) based on the P-mean, and KS-function. The green lines represent the global constraint surface for different values of \(P\in \left ]0,\infty \right [\). The arrow in both figures indicates the effect of increasing the aggregation parameter. The magenta color represents the original unperturbed feasible domain.

Fig. 6
figure 6

Design space for the problem formulation in (24) with a single global constraint based on the (a) lower bound KS-function and (b) P-mean. The green lines represents the constraint surface (G L=0) for different values of the aggregation parameter: P=4,16,32,256. The arrow indicates the direction of the constraint surface for increasing values of P. The magenta color represents the original feasible domain

It is observed that the P-mean and KS-function have a similar perturbing effect on the unperturbed feasible domain as conventional relaxation techniques such as ε-relaxation (cf. Fig. 3). For both aggregation functions, the perturbed feasible domain converges to the original feasible domain as the aggregation parameter tends to infinity. We notice that the lower bound KS-function provides slightly more relaxation within the same range of the aggregation parameter.

The true optimal solution in D is accessible for all chosen values of the aggregation parameter. Notice that the constraint surface of both the P-mean and the KS-function intersects with the optimal solution D for the different values of the aggregation parameter P. This is generally true for stress-constrained problems under a single load case with the same stress limits in tension and compression. Since for this class of optimization problems, the optimum is a fully stressed design (Rozvany 2001b), and all constraints \(\mathbf {\overline {g}}\) in (25) will be active at a minimizer. Consequently, the global constraint value is equal to all local constraint values in that point. Next, we compare the result to the result obtained when using an upper bound aggregation function.

4.2 Lower bound vs. upper bound aggregation function

Here, we consider the same optimization problem in (24), but instead of lower bound aggregation functions, we consider upper bound aggregation functions: the original upper bound KS-function \(\Psi _{\text {KS}}^{U}(\overline {\mathbf {g}};P)\), and the P-norm \(\Psi _{\text {PN}}^{L}(\overline {\mathbf {g}}+1;P)\). For the P-norm, we aggregate similarly as for the P-mean over the left hand side of (26).

Figure 7 shows the constraint surfaces of both upper bound functions for different values of \(P\in \left ]0,\infty \right [\). We observe that in contrast to the lower bound aggregation functions, the upper bound functions cut off the lower dimensional subspace in which the true optimum D is located. In fact, this lower dimension subspace will never be a part of the feasible domain for any \(P\in \left ]0,\infty \right [\). Consequently, in numerical practice, the true optimum can never be reached following this approach and additional relaxation techniques are necessary to make singular optima accessible. As a result, in literature, constraint aggregation is typically applied to the relaxed local stress constraints (see e.g., Duysinx and Sigmund 1998; Le et al. 2009).

Fig. 7
figure 7

Design space for the problem formulation in (24) with a single global constraint based on the (a) upper bound KS-function and (b) P-norm. The green lines represents the constraint surface for different values of the aggregation parameter: P=4,8,16,32. The arrow indicates the direction of the constraint surface for increasing values of P. The magenta color represents the original feasible domain

In conclusion, we have demonstrated that aggregating the local constraint using a lower bound aggregation function, concurrently relaxes the feasible domain for any \(P\in \left ]0,\infty \right [\). Therefore, no additional relaxation procedures are necessary, and the approximated problem only depends on a single parameter P. As the aggregation parameter tends to infinity the relaxed feasible domain approximates that of the original unperturbed problems: \(({\mathbb {P}^{L}_{P}}) \rightarrow (\overline {\mathbb {P}}_{0})\) as \(P\rightarrow \infty \). Furthermore, for the class of problems where the optimal design is a fully stressed design, the lower bound KS-function gives an exact approximation in the true optimum of the maximum local function value for any value of the aggregation parameter. Note that this exact approximation in the true optimum does not imply that the global optimum in this formulation coincides with the true optimum for every value of the aggregation parameter.

4.3 A special case of aggregation and ε-relaxation

Next, we demonstrate that the proposed approach using a lower bound KS-function turns out to be a special case of subsequently applying ε-relaxation and constraint aggregation by the original KS-function. Consider the optimization problem in which aggregation and relaxation are implemented separately:

$$\begin{array}{@{}rcl@{}} &{\underset{\mathbf{A}\in S}{\min}}\quad &m=\sum\limits_{e\in\Omega^{d}}{\rho_{e} A_{e} L_{e}}, \end{array} $$
$$\begin{array}{@{}rcl@{}} &\text{s.t.}&\Psi^{U}_{\text{KS}}(\tilde{\mathbf{g}};P)\leq 0, \end{array} $$
$$\begin{array}{@{}rcl@{}} &&\hspace{0cm} \mathbf{0} \leq \mathbf{A}\leq A_{{\max}}\mathbf{1}, \end{array} $$
(28)

Here, \(\Psi ^{U}_{\text {KS}}(\tilde {\mathbf {g}};P)\) is the upper bound KS-function over the ε-relaxed set of constraints, which is defined as:

$$ \tilde{g}_{j}(\mathbf{A};\varepsilon)=\overline{g}_{j} - \varepsilon \leq 0, \quad \forall j \in \Omega^{d}. $$
(29)

The relaxation parameter ε is assumed to be equal for all local constraints. Aggregating the local relaxed constraints using the KS-function gives

$$\begin{array}{@{}rcl@{}} \Psi_{\text{KS}}^{U}(\tilde{\mathbf{g}};P) & = \frac{1}{P}\ln\left( \sum\limits_{i=1}^{N} e^{P \tilde{g}_{i}}\right)\\ & = \Psi_{\text{KS}}^{U} (\overline{\mathbf{g}};P)-\varepsilon \end{array} $$
(30)

We observe that the KS-function over the relaxed constraints can be written in terms of the KS-function over the original constraints minus a relaxation parameter ε.

Comparing (30) with (22), we conclude that using the lower bound KS-function is a special case of aggregating ε-relaxed constraints by the original upper bound KS-function, and using an adaptive relaxation parameter defined as \(\varepsilon (P)=\ln (N)/P\).

4.4 A unified relaxation and aggregation approach in density-based topology optimization

Here, we briefly summarize the unified approach for density-based topology optimization. First, we reformulate the original topology optimization problem with a design-dependent set of constraint, as the equivalent optimization problem:

$$\begin{array}{@{}rcl@{}} (\overline{\mathbb{P}}_{0}):\;&{\underset{\boldsymbol{\rho}\in S}{\min}} & V = \frac{1}{V_{0}}\sum\limits_{e\in\Omega^{d}}\rho_{e} v_{e}, \\ &\text{s.t.}&\overline{g}_{j}= \rho_{j} \left( \frac{\sigma_{j}}{\sigma_{{\lim}}}-1 \right) \leq 0, \quad\forall j \in \Omega^{d}. \end{array} $$
(31)

Here, σ j (σ j ) represents an equivalent stress criterion (e.g., Von Mises stress) based on the microscopic stress (Duysinx and Bendsøe 1998) of Section 2.3.2, defined as

$$ \boldsymbol{\sigma}_{e} = \mathbf{C}_{e}(E_{0})\left<\boldsymbol{\epsilon}_{e}\right>. $$
(32)

Instead of solving (31) directly, we solve an approximate optimization problem in which the local constraints in \((\overline {\mathbb {P}}_{0})\) are aggregated by a lower bound aggregation function. We consider the lower bound KS-function and the P-mean. In case of the KS-function, the constraints are replaced by the following global constraint:

$$ G^{L}_{\text{KS}} = \Psi_{\text{KS}}^{L} = \frac{1}{P}\ln\left( \frac{1}{N} \sum\limits_{i=1}^{N} e^{P \overline{g}_{i}}\right) \leq 0. $$
(33)

For the P-mean, we follow the procedure as described in Section 4.1, in which the minimum possible local constraint value \(\overline {g}_{{\min }}=-1\) is subtracted from both sides of the original set of constraints in (31). Following this approach, the P-mean can be applied over the non-negative left hand side and is defined as

$$ \Psi_{\text{PM}}^{L}=\left( \frac{1}{N}\sum\limits_{i=1}^{N} (\overline{g}_{i}+1)^{P}\right)^{(1/P)}, $$
(34)

and we consider the single constraint:

$$ G^{L}_{\text{PM}} = \Psi_{\text{PM}}^{L}-1\leq 0. $$
(35)

Next, we present the results obtained in density-based topology optimization in which we parameterized the design following the modified SIMP model as described in Section 2.1.

5 Results and discussion

This section discusses the results that were obtained by applying the proposed approach described in Section 4.4 on the design cases shown in Fig. 8. In order to focus primarily on the effect of the proposed formulation and study its parameter and mesh-dependency, optimizer settings have not been tuned to achieve the fastest convergence but were set to conservative values; i.e., tight move-limits and a strict convergence criterion. Unless stated otherwise, we use the settings listed in Table 1. All values are in SI units.

Fig. 8
figure 8

Design cases

Table 1 General settings

Section 5.1 discusses the design-dependency on the aggregation parameter value. Subsequently, Section 5.3 discusses the effect of mesh-refinement on the optimized designs. Both studies are performed for the lower bound KS-function and the P-mean aggregation function.

5.1 Effect of the aggregation parameter

Here, we discuss the effect of the aggregation parameter value on the optimized designs for both aggregation functions. This effect is studied considering the cantilever and L-bracket design cases shown in Fig. 8. The design domains are discretized using square elements of dimension 1×1, which results in 5000 and 6400 elements for the cantilever and L-bracket, respectively. The Von Mises stress used in the analysis is based on the microscopic stress tensor in (32) evaluated at the centroid of each element. For interpretation of the optimized designs, we consider the Von Mises stress only in ‘material elements’, which we define as all elements with a density value ρ≥1/2. The reason to neglect lower density elements when plotting is that the microscopic stress is non-zero at zero densities, and therefore, distracting large stress values arise in zero densities making design interpretation difficult. This phenomenon is well-known from truss optimization where the stress converges to a non-zero ’limiting stress’ value (Cheng and Jiang 1992) for members with zero cross-sectional area (assuming finite strains).

5.1.1 Cantilever design case

First, the cantilever design problem was solved using the lower bound KS-function for different values of the aggregation parameter. Figure 9a shows the different optimized designs and the corresponding stress plots. It is observed that increasing values of the aggregation parameter result in designs with more uniform stress distributions. This effect is especially noticeable in the lower range of values for P. For example, consider the optimized designs for P=4, and P=12 in Fig. 9a. The optimized for P=4 has two peak stresses at the corners of the design domain of the fixed boundary condition. Although the optimized design for P=12 has the same topology the two diagonal members closest to the fixed boundary moved slightly into the direction of the corners. Consequently, the peak stress of this design was reduced by approximately 23%, while the volume fraction only increased by approximately 1%.

Fig. 9
figure 9

Optimized cantilever designs using the (a) lower bound KS-function and (b) P-mean aggregation function for different values of aggregation parameter P. On top the density distribution and below the Von Mises stress plotted for material elements (i.e., ρ≥1/2)

Next, the cantilever design was solved using the P-mean aggregation function. Figure 9b shows the optimized design and associated stress plots versus the aggregation parameter value. A similar behavior is observed as for the lower bound KS-function. Increasing values of the aggregation parameter lead to designs with a more uniform stress distribution, but eventually also to an increased number of iterations.

Figure 10 shows the data of the optimized designs for both aggregation functions versus P∈{4,8,...,60}. Figure 10a shows that the maximum stress becomes closer to the allowable stress \((\sigma _{{\lim }}=1)\) as the aggregation parameter increases. As mentioned before, in the case of a single load case, in theory all constraints are active in the true optimum, and therefore, the maximum stress should exactly match the allowable stress at an optimum. However, in computational practice, a significant amount of local constraints are inactive, which introduces an error between the global constraint value G L (defined by (33) or (35)) and the maximum local constraint value \(\overline {g}_{{\max }}\). Figure 10b shows the error for both aggregation functions and shows that it decreases for increasing P.

Fig. 10
figure 10

Data of the optimized cantilever designs for both the lower bound KS-function and P-mean for different values of the aggregation parameter

Figure 10c shows the volume fractions of the optimized designs versus the aggregation parameter. In contrast to the maximum stress, which smoothly decreases as P increases for both aggregation functions, the volume fraction shows less predictable behavior. Compare for example the optimized designs obtained using the lower bound KS-function for P=44 and P=52 in Fig. 9a. These designs have approximately the same maximum stress value, however, the volume fraction for P=52 is approximately ≈16% larger. The same effect, but less pronounced, is observed for the P-mean comparing the optimized design for P=52 and P=60 in Fig. 9b. The maximum stress value is approximately equal for both designs, but the volume fraction increased with ≈6% from P=52 to P=60. From this result, we conclude that increasing the aggregation parameter further does not necessarily lead to more optimal designs.

Figure 10d shows the number of iterations versus the aggregation parameter. For both aggregation functions, we observe a trend of an increasing number of iterations as P increases, which is especially noticeable in the range of larger values P>28. The increased number of iterations may be explained by the increased nonlinearity of the constraint function as the aggregation parameter value increases. Figure 11 shows some convergence histories of the cantilever designs in Fig. 9. For both aggregation functions, it is observed that the convergence histories show more fluctuation as P increases, which coincides with slower convergence.

Fig. 11
figure 11

A selection of convergence histories of the cantilever designs in Fig. 9 for increasing values of the aggregation parameter for both the lower bound KS-function (LBKS) in (a-c) and P-mean in (d-f)

For larger values of P>60 for both aggregation functions, the designs did often not converge, or converged to designs containing large areas of intermediate densities. These large regions of intermediate densities can be attributed to the fact that as P increases, the feasible domain approximates the feasible domain of the original unperturbed optimization problem. It is well-known that the original optimization problem contains singular optima, which prevent convergence to a black and white design (Duysinx and Bendsøe 1998).

5.2 L-bracket design case

The same study was performed on the L-bracket design case. Figure 12 shows a selection of optimized designs for the L-bracket using the P-mean. It is observed that the optimized design for P=16 contains a peak stress in the reentrant corner. Increasing the aggregation parameter value leads to designs with a more uniform stress distribution. For example, in contrast to the optimized design for P=16, the optimized designs for P≥24 have a rounded shape in the reentrant corner, which is desired to effectively prevent a peak stress. However, increasing the aggregation parameter value further does not necessarily lead to improved designs. Compare for example the optimized designs for P=40 and P=32. Although the optimized design for P=40 has a maximum stress value of approximately 1% lower, the volume fraction increased with approximately 6%. This result confirms what was found for the cantilever design case, that further increasing the aggregation parameter does not necessarily gives improved designs. In general, the same dependence of the optimized designs on the aggregation parameter was found as for the cantilever design case.

Fig. 12
figure 12

Optimized designs using the P-mean, and different values of the aggregation parameter P. On top the density distribution, and below the Von Mises stress plotted for material elements (i.e., ρ≥1/2)

5.2.1 Concluding remarks

In general, we have found that both the Lower-bound KS-function and the P-mean produce similar designs and have a similar dependence of the aggregation parameter. Two trends were observed. First, increasing the aggregation parameter value initially leads to improved designs, which have a more uniform stress distribution. However, for increasingly large values of the aggregation parameter, the number of iterations increases and the optimizer is prone to convergence to inferior local minima. Eventually, too large values of the aggregation parameter lead to numerically unstable behavior and no convergence at all.

For the used optimizer settings in Table 1, well-performing designs, both in terms of structural performance and number of iterations, were found in the range P∈[20,40]. Consequently, the value of P should be chosen as a trade-off between a large enough value to prevent peak stresses, but not too large value in order to prevent numerical instabilities and large number of iterations. This may offer opportunities for continuation strategies, but this aspect has not been explored in this paper.

5.3 Effect of mesh refinement

Next, we study the effect of mesh refinement where the L-bracket design for the P-mean with P=32 of Fig. 12c is used as a reference design. The mesh of the reference design contains N=6400 equally sized quadrilaterals: 100×100 elements along the longest edges. We solved this optimization problem under 4 different levels of mesh refinement.

Figure 13 shows the optimized designs and associated data obtained under mesh refinement. We observe that the gap between the maximum stress and the allowable stress \((\sigma _{{\lim }}=1)\) increases with mesh refinement. However, the aggregation function does produce fully stressed designs and successfully prevents peak stresses by forming a rounded shape in the reentrant corner for all mesh sizes. The gap between the maximum stress and allowable stress can be dealt with using adaptive normalization techniques to scale the allowable stress during optimization (Le et al. 2009).

Fig. 13
figure 13

Mesh refinement applied to the L-bracket using the P-mean function for P=32 in Fig. 12c

Although the resulting optimized designs show a clear black and white design, we observed that density fluctuations occur in void regions under mesh refinement. In order to make this effect more visible, the optimized design in Fig. 13d is plotted again but with the greyscale colormap rescaled from a density range of [0,1] to a range of [0,0.05]; i.e., every density value ρ≥0.05 is depicted as black. The result is shown in Fig. 14a. Cross-section AA shows fluctuating intermediate densities inside the void region.

Fig. 14
figure 14

(a) Cross-section of Fig. 13d shows fluctuation densities in the void region, (b) shows the optimized design and cross-section after aggregating only local constraints with density ρ>0.04

A possible explanation for this behavior is that in the proposed approach a local constraint becomes active as the density approach zero, since \(\overline {g}_{j}=\rho _{j} g_{j} \rightarrow 0\) as \(\rho _{j} \rightarrow 0\). Consequently, low-density elements can potentially have an important contribution in the aggregation function, and therefore, new search direction. The aforementioned hypothesis is confirmed by only aggregating the local constraints of elements with a density above a small threshold value: ρ>0.04. Figure 14b shows that except for the void regions, this result is equivalent to the previous result in Fig. 14a indicating that these density fluctuations are indeed numerical artifacts associated with lower density elements.

We notice that the densities in the void regions in Fig. 14b converge to a lower bound of approximately ρ=0.015. The reason for this is currently unknown and is a topic of future research. This phenomenon was not observed for simple compliance minimization under mesh refinement for which the densities in void regions converged to a value closer to zero (≈3⋅10−5). However, it was also observed using other approaches for stress-constrained topology optimization; e.g., the damage approach (Verbart et al. 2015) and the conventional approach of constraint relaxation followed by aggregation. For example, Fig. 15 shows a result obtained by considering qp-relaxed stresses aggregated into a single P-norm constraint (Le et al. 2009).

Fig. 15
figure 15

Cross-section for optimized design using qp-relaxed Von Mises stress (\(\tilde {\sigma }_{e} = \rho _{e}^{1/2}\sigma _{e}\)), and P-norm aggregation with an aggregation parameter of P=32

6 Conclusions

In this paper, we proposed a new approach that unifies constraint aggregation and relaxation in stress-constrained topology optimization. We demonstrated on an elementary two-bar truss example, that aggregating the local constraints using a lower bound aggregation function simultaneously relaxes the feasible domain. In contrast to the conventional approach of subsequently relaxing and aggregating the local stress constraints, no additional constraint relaxation techniques are necessary. It was also found that using an upper bound aggregation function makes singular optima inaccessible (at least for the two-bar truss). This explains the need of constraint relaxation before aggregation in the conventional approach.

The main advantage of the proposed approach is that the problem only depends on a single aggregation parameter which reduces the parameter dependency of the problem, which is non-trivial in the conventional approach as also is demonstrated on the two-bar truss. Furthermore, in contrast to the conventional approach, there is a clear relationship between the original feasible domain, and the relaxed feasible domain in terms of this aggregation parameter.

We tested the proposed approach on a cantilever and L-bracket design case and studied the effect of the aggregation parameter. Both the lower bound KS-function and the P-mean are suitable for this approach and produced similar results. Both aggregation functions show the same dependency on the aggregation function. Increasing the aggregation parameter initially gives better results, however, for large values of the aggregation parameter the constraint function becomes increasing nonlinear and the optimizer may converge to inferior local minima. Furthermore, large values of the aggregation parameter lead to an increased number of iterations. In general, best results were obtained with moderate values of the aggregation parameter P∈[20,40].

Finally, the effect of mesh refinement was studied. It was observed that the gap between the maximum stress and the allowable stress increases under mesh refinement. However, the optimized designs remain fully stressed under mesh refinement and contain a rounded shape along the reentrant corner thereby preventing a peak stress. The increasing gap between the maximum stress and the allowable stress can potentially be dealt with using adaptive normalization strategies as was shown in (Le et al. 2009). Numerical artifacts were observed in low-density regions. It was found that only aggregating stress values of elements above a certain threshold effectively circumvent these numerical artifacts. Future work focuses on finding the exact cause of these numerical artifacts.