1 Introduction

One of the most common problems in topology optimization is the minimization of the mean compliance of structures, under the constraints of static equilibrium and a prescribed volume of material (Bendsoe and Sigmund 2004). In several circumstances, the requirements upon the structure may impose an elastic but nonlinear relation between strains and displacements, meaning that the structure is under geometric nonlinearity. In such a case, the computation of the objective function of the topology optimization problem demands the solution of a system of nonlinear equations that models the nodal displacements of the structure due to the external forces. When it comes to compliant mechanisms, under the geometric nonlinearity assumption, addressing nonlinear systems of equations is mandatory as well. The approximate solutions of these systems are usually computed by Newton’s method, constituting, by far, the most expensive step in the process of achieving optimal configurations.

Several attempts have been made to reduce the effort spent to compute the nodal displacements of the structures and mechanisms (Kirsch 2010). Challenges such as parallel and numerical scalability have been pursued, specially when it comes to 3-D problems (see e.g. Amir 2015 and references therein). The use of preconditioned conjugate gradients has been exploited in the solution of the linear systems associated with the equilibrium equations (Amir et al. 2012, 2010; Liu et al. 2014; Long et al. 2019). Meshless-based local reanalysis (Cheng and Wang 2017) and a meshfree reduced basis approach for large deformations (Nguyen et al. 2021) have been analyzed as well. In the last years, educational codes handling geometric nonlinearity, based on the SIMP (Solid Isotropic Material with Penalization) method (Chen et al. 2019; Zhu et al. 2021), and on the BESO (Bi-directional Evolutionary Structural Optimization) method (Han et al. 2021), also became available. The recent survey (Mukherjee et al. 2021) discusses additional aspects concerning high-performance computing, approximate reanalysis, reduced-order modeling, and multigrid methods.

Inspired on the combined approximations approach (Amir et al. 2008; Bogomolny 2010; Kirsch 2008), we have analyzed some strategies that merge the inexact Newton scheme with iterative combined approximations for minimizing the mean compliance of structures and for maximizing the displacement of compliant mechanisms. Our approach rests upon a sequential piecewise linear programming (SPLP) algorithm (Gomes and Senne 2014) that, aside from having a stopping criterion with theoretical support, has proven to be efficient and robust. In a previous work (Senne et al. 2019), the approximate reanalysis technique has been applied in combination with the SPLP algorithm to solve three benchmark structure problems under small displacements. In the current work, besides addressing two structures, we have also applied the devised strategies to the design of two mechanisms. A detailed analysis of the results has been reported, assessing the scope and the impact of the methodology.

This text is organized as follows. The problem formulation is presented in Sect. 2. The elements of the reanalysis technique that encompasses the inexact Newton method with iterative combined approximations are detailed in Sect. 3. The experimental setup is provided in Sect. 4, namely the test problems, the proposed strategies and the implementation details. The analysis of the performance of the strategies with a fixed budget is done in Sect. 5, whereas Sect. 6 presents the results concerning the effect of the mesh refinement. The convergence performance of the SPLP with the nonlinear reanalysis is reported in Sect. 7. The behavior of the norm of the operator in charge of the effectiveness of the iterative process is assessed in Sects. 8, and 9 contains our final remarks.

2 Problem formulation

Suppose that \(\varOmega\) is the design domain of a structure. In the context of topology optimization, we want the distribution of material on the domain \(\varOmega\) to be optimal in some sense, satisfying certain constraints. In this work, we consider that \(\varOmega\) is two-dimensional and, with the aim of making the problem computationally affordable, we discretize it into rectangular finite elements, using bilinear interpolating functions to approximately evaluate the displacements.

Let \(\varOmega _{d}\) be the discretized domain, partitioned in disjointed subdomains \(\varOmega _i\), i.e. \(\varOmega _{d} = \cup _{i=1}^{n_{el}} \varOmega _i\). We denote by \(n_{el}\) the number of elements of \(\varOmega _{d}\) and by \(n_{nd}\) the number of its nodes. We associate with each \(\varOmega _i\) a continuous variable \(\rho _{i}\) representing the density of material, in such a way that \(\rho _{i} = 1\) if the element is solid or \(\rho _{i} = 0\) if the element is void.


We address two topology optimization problems: the compliance minimization of rigid structures and the displacement maximization of compliant mechanisms. In both cases, the optimization problem can be stated as

$$\begin{aligned} \begin{array}{ll} \displaystyle {\min _{\varvec{\rho }}} &{} {{\varvec{l}}}^{T}{\varvec{u}}\\ \text{ s.t. } &{} {\varvec{r}}({\varvec{u}},\,\varvec{\rho }) = \, {\varvec{0}}\\ &{} \sum ^{n_{el}}_{i=1} v_{i}\rho _{i} = V^{*} \\ &{} 0 < \rho _{\min } \le \rho _{i} \le 1, \qquad i = 1,\dots ,n_{el}, \end{array} \end{aligned}$$
(1)

where \(\varvec{\rho }\in {\mathbb {R}}^{n_{el}}\) is the vector of densities of each element of \(\varOmega _{d}\) (that are our design variables), \({\varvec{u}}\in {\mathbb {R}}^{2n_{nd}}\) is the vector of nodal displacements, \(v_{i}\) is the volume of the i-th element, \(V^{*}\) is the prescribed volume of the structure, and \(\rho _{min}\) is a sufficiently small positive number used to avoid numerical instabilities. For the compliance minimization of rigid structures, \({\varvec{l}}\in {\mathbb {R}}^{2n_{nd}}\) is the vector of nodal external forces, \({\varvec{f}}\), and in the case of the maximization of the displacements for compliant mechanisms it is a vector filled with zeros, except at the positions corresponding to the output points (the points in which the displacement is to be maximized), where it is equal to \(-1\). We remark that, since we adopt the SIMP interpolation model (Bendsoe 1989), \({\varvec{u}}\) depends on the vector of densities \(\varvec{\rho }\), so we can express \({\varvec{u}}\equiv {\varvec{u}}(\varvec{\rho })\).


The system

$$\begin{aligned} {\varvec{r}}({\varvec{u}},\,\varvec{\rho }) = {\varvec{0}}\, \end{aligned}$$
(2)

represents the static equilibrium conditions of the structure (or compliant mechanism). Under the assumption of geometric nonlinearity, we have

$$\begin{aligned} {\varvec{r}}({\varvec{u}},\,\varvec{\rho }) = {\varvec{f}}_{int}({\varvec{u}},\,\varvec{\rho }) - {\varvec{f}}, \end{aligned}$$
(3)

where \({\varvec{f}}_{int}({\varvec{u}},\,\varvec{\rho }) \in {\mathbb {R}}^{2n_{nd}}\) is the vector of internal nodal forces. We say that \({\varvec{r}}({\varvec{u}},\,\varvec{\rho })\) in (3) is the residual associated with the nonlinear system (2).


Usually, the Newton-Raphson method (that, from now on, will be referred as Newton’s method) is used to find an approximate solution for the nonlinear system (2). Given an initial point \({\varvec{u}}^{(0)}\) and keeping \(\varvec{\rho }\) fixed, at each iteration of Newton’s method, we solve the linear system

$$\begin{aligned} {\varvec{K}}_{T}({\varvec{u}}^{(\ell )},\,\varvec{\rho }){\varvec{s}}\, = \, -{\varvec{r}}({\varvec{u}}^{(\ell )},\,\varvec{\rho }), \end{aligned}$$
(4)

where \({\varvec{K}}_{T}\equiv {\varvec{K}}_{T}({\varvec{u}},\,\varvec{\rho }) \in {\mathbb {R}}^{2n_{nd} \times 2n_{nd}}\) is the Jacobian matrix of the nonlinear system (2), also known as the global tangent stiffness matrix, that is supposed to be nonsingular. The solution of the linear system (4), denoted by \({\varvec{s}}^{(\ell )}\), is used to obtain the new approximate solution, \({\varvec{u}}^{(\ell +1)} = {\varvec{u}}^{(\ell )} + {\varvec{s}}^{(\ell )}\). Usually, we consider that \({\varvec{u}}^{(\ell )}\) is a good approximate solution to the original nonlinear system (2) whenever \(\Vert {\varvec{r}}({\varvec{u}}^{(\ell )},\,\varvec{\rho }) \Vert < \varepsilon\), where \(\varepsilon\) is a prescribed small positive number.

It is important to mention that the definitions of the global tangent stiffness matrix \({\varvec{K}}_{T}\) and of the residual \({\varvec{r}}\) depend on the hyperelastic model for the material of the structure. In this work, we adopt the neo-Hookean material model of Simo-Ciarlet (Lahuerta et al. 2013), combined with the SIMP model, so we have

$$\begin{aligned} {\varvec{K}}_{T}({\varvec{u}}, \varvec{\rho }) = \sum _{i=1}^{n_{el}} \rho ^{p}_{i} \int _{\varOmega _{i}}\!\! {\varvec{G}}^{T}{\varvec{D}}_i{\varvec{G}}\,\,d\varOmega _{i} \end{aligned}$$
(5)

and

$$\begin{aligned} {\varvec{f}}_{int}({\varvec{u}},\varvec{\rho }) = \sum ^{n_{el}}_{i=1} \rho ^{p}_{i}\int _{\varOmega _{i}}\!\! {\varvec{G}}^{T}\varvec{\sigma }_i\,\,d\varOmega _{i}, \end{aligned}$$
(6)

in which p is the penalty parameter of the SIMP model (\(p > 1\)), \({\varvec{G}}\in {\mathbb {R}}^{4 \times 8}\) is the matrix of the derivatives of the shape function with respect to the displacements and, for the domain \(\varOmega _i\) of the i-th element of the discretized domain \(\varOmega _d\), \({\varvec{D}}_i \in {\mathbb {R}}^{4 \times 4}\) is the tangent stiffness modulus matrix and \(\varvec{\sigma }_i \in {\mathbb {R}}^{4}\) is the first Piola-Kirchhoff stress tensor.

3 Inexact Newton method with iterative combined approximations

The factorization of \({\varvec{K}}_{T}\) is the most expensive step in the process of solving the linear system (4). Although the Cholesky factorization is usually employed for symmetric matrices, the \(LDL^T\) factorization (Golub and Loan 2013) is required here, since \(K_T\) is not necessarily positive definite. With the aim of saving computational effort, several reanalysis techniques were proposed in the literature. One of the most common reanalysis strategy, named Combined Approximations (CA), will be summarized here to ground our presentation, following the lines of (Amir et al. 2008; Bogomolny 2010; Senne et al. 2019), among others.


First of all, let \({\varvec{K}}_{T0}\) be the Jacobian at the i-th iteration of Newton’s method. Supposing that its factorization is available, it will be reused at the subsequent m iterations \(i+1,\,i+2,\,\dots ,\,i+m\), and updated at the iteration \(i+m+1\). Denoting by \({\varvec{K}}_{T}\) the Jacobian at the \((i+j)\)-th iteration (with \(1 \le j \le m\)), we can rewrite the linear system (4) as

$$\begin{aligned} ({\varvec{K}}_{T0}+ \varDelta {\varvec{K}}){\varvec{s}}\, = \, -{\varvec{r}}, \end{aligned}$$
(7)

where \(\varDelta {\varvec{K}}:= {\varvec{K}}_{T}- {\varvec{K}}_{T0}\).

It is worthwhile to give some attention to our definition of the matrix \(\varDelta {\varvec{K}}\). According to Kirsch (Kirsch 2000), the global tangent stiffness matrix is usually assumed to obey the additive model \({\varvec{K}}_{T}= {\varvec{K}}_{0}+ {\varvec{K}}_{G}\), in which \({\varvec{K}}_{0}\) is the elastic stiffness matrix, and \({\varvec{K}}_{G}\) is the geometric stiffness matrix (see also Kirsch 2008; Kirsch and Bogomolni 2004). In this model, it is supposed that \(\varDelta {\varvec{K}}= \varDelta {\varvec{K}}_{D} + {\varvec{K}}_{G}\), where \(\varDelta {\varvec{K}}_{D}\) is the variation in \({\varvec{K}}_{0}\) due to changes in the design. The aforementioned additive assumption differs from the neo-Hookean Simo-Ciarlet constitutive model adopted in our work, in which \(\varDelta {\varvec{K}}\) in (7) represents the changes in the global tangent stiffness matrix \({\varvec{K}}_{T}\) given by (5) only due the variations in the nodal displacements between the iterations i and \(i+m\) of Newton’s method, as mentioned above.


After rearranging the terms in (7), we obtain

$$\begin{aligned} {\varvec{s}}\, = \, \widetilde{{\varvec{s}}}- {\varvec{B}}{\varvec{s}}, \end{aligned}$$
(8)

where

$$\begin{aligned} \widetilde{{\varvec{s}}}:= -{\varvec{K}}_{T0}^{-1}{\varvec{r}}\qquad \text{ and } \qquad {\varvec{B}}:= {\varvec{K}}_{T0}^{-1}\varDelta {\varvec{K}}. \end{aligned}$$
(9)

So, using (8), we state the following sequence of steps:

$$\begin{aligned} {\varvec{s}}^{(k+1)}\, = \, \widetilde{{\varvec{s}}}- {\varvec{B}}{\varvec{s}}^{(k)}, \qquad k = 0,\,1,\,2,\,\dots . \end{aligned}$$
(10)

Choosing \({\varvec{s}}^{(0)} = \widetilde{{\varvec{s}}}\), the CA strategy consists in generating the first q terms of the sequence (10) to build an approximate solution \(\widehat{{\varvec{s}}}\), given by

$$\begin{aligned} \widehat{{\varvec{s}}}\, = \, y_{1}{\varvec{s}}^{(1)} + y_{2}{\varvec{s}}^{(2)} + \cdots + y_{q}{\varvec{s}}^{(q)} \, = \, {\varvec{S}}_{B}{\varvec{y}}, \end{aligned}$$
(11)

where

$$\begin{aligned} {\varvec{S}}_{B}\, = \, [{\varvec{s}}^{(1)} \,\,\, {\varvec{s}}^{(2)} \,\,\, \cdots \,\,\, {\varvec{s}}^{(q)}] \qquad \text{ and } \qquad {\varvec{y}}= [y_{1} \,\,\, y_{2} \,\,\, \cdots \,\,\, y_{q}]^{T}. \end{aligned}$$

We remark that q should be a small integer. Usually, \(q \le 10\).

The vector of coefficients \({\varvec{y}}\) is found by solving the small (\(q \times q\)) linear system

$$\begin{aligned} {\varvec{S}}_{B}^{T}{\varvec{K}}_{T}{\varvec{S}}_{B}{\varvec{y}}\, = \, -{\varvec{S}}_{B}^{T}{\varvec{r}}, \end{aligned}$$

obtained by substituting \(\widehat{{\varvec{s}}}\) in the original linear system (4) and then multiplying both sides of the resulting equation on the left by \({\varvec{S}}_{B}^{T}\). Generally, the relative norm of the residual of the linear system (4) is the measure adopted to decide if the approximate solution \(\widehat{{\varvec{s}}}\) will be accepted.

Kirsch et al. (2002) proved that the preconditioned conjugate method (PCG) and the CA method provide theoretically the same results, in the sense that to perform q iterations of the PCG method is completely equivalent to generate q vector basis in the CA method to construct the approximate solution \(\widehat{{\varvec{s}}}\) given by (11), when the preconditioner matrix in PCG is chosen as the inverse of the Cholesky factor of \({\varvec{K}}_{T}= {\varvec{K}}_0 + \varDelta {\varvec{K}}\). They have additionally employed the Gram-Schmidt orthonormalization strategy to ensure a well-conditioned basis, as previously adopted in Leu and Huang (1998). Moreover, it is shown in Kirsch et al. (2002) that the error bound for q basis vector in the CA method is the same as the one associated with q iterations of the PCG method.

In this work, we propose a new iterative method to get an approximate solution to the linear system (4), named Iterative Combined Approximations (ICA), based on the sequence \(\left\{ {\varvec{s}}^{(k)}\right\} ^{+\infty }_{k=0}\) defined in (10), and described below. It is worth mentioning that a distinct iterative CA approach to address topological modifications for the static reanalysis of structures has been presented in Chen and Yang (2004). In addition, since our matrix \({\varvec{K}}_{T}\) is computed using (5), we are not under the same hypotheses of Kirsch et al. (2002), as the incremental matrix \(\varDelta {\varvec{K}}\) and, consequently, matrix \({\varvec{B}}\) are distinct in each case. For completeness, we show next that, under the assumption that \(\Vert {\varvec{B}}\Vert <1\), the generated sequence (10) has linear rate of convergence.

Given \(\widetilde{{\varvec{s}}}\) as in (9), and any initial point \({\varvec{s}}^{(0)}\), the general term of the sequence (10), for any k, can be written as

$$\begin{aligned} {\varvec{s}}^{(k+1)}\, = \, [{\varvec{I}}- {\varvec{B}}+ {\varvec{B}}^{2} - {\varvec{B}}^{3} + \cdots + (-1)^{k}{\varvec{B}}^{k}]\,\widetilde{{\varvec{s}}}+ (-1)^{k+1}{\varvec{B}}^{k+1}{\varvec{s}}^{(0)}. \end{aligned}$$
(12)

By considering any vector norm and the consistent norm of operators, if \(\Vert {\varvec{B}}\Vert < 1\), the matrix \({\varvec{I}}+ {\varvec{B}}\) is nonsingular (cf. Golub and Loan 2013, §2.3.4), and its inverse can be expressed by means of the Newmann’s series (Juanjuan and Hu 2018; Zuo et al. 2016, 2012)

$$\begin{aligned} ({\varvec{I}}+ {\varvec{B}})^{-1} \, = \, \textstyle \sum ^{+\infty }_{j=0} (-1)^{j}{\varvec{B}}^{j}. \end{aligned}$$

Notice that, from (9), since \({\varvec{B}}= {\varvec{K}}_{T0}^{-1} {\varvec{K}}_{T}- {\varvec{I}}\), as long as \({\varvec{K}}_{T}\) does not differ too much from \({\varvec{K}}_{T0}\), the assumption \(\Vert {\varvec{B}}\Vert < 1\) is reasonable.

Defining \({\varvec{s}}^{*}:= ({\varvec{I}}+ {\varvec{B}})^{-1}\widetilde{{\varvec{s}}}\), from (12) we have

$$\begin{aligned} {\varvec{s}}^{(k+1)}- {\varvec{s}}^{*}\, = \, -\left[ \textstyle \sum ^{+\infty }_{j=k+1} (-1)^{j}{\varvec{B}}^{j}\right] \widetilde{{\varvec{s}}}+ (-1)^{k+1}{\varvec{B}}^{k+1}{\varvec{s}}^{(0)}. \end{aligned}$$
(13)

Since

$$\begin{aligned} \textstyle \sum ^{+\infty }_{j=k+1} (-1)^{j}{\varvec{B}}^{j}= & {} (-1)^{k+1}{\varvec{B}}^{k+1} + (-1)^{k+2}{\varvec{B}}^{k+2} + (-1)^{k+3}{\varvec{B}}^{k+3} + \cdots \nonumber \\= & {} (-1)^{k+1}{\varvec{B}}^{k+1}({\varvec{I}}- {\varvec{B}}+ {\varvec{B}}^{2} - {\varvec{B}}^{3} + \cdots ) \nonumber \\= & {} (-1)^{k+1}{\varvec{B}}^{k+1}({\varvec{I}}+ {\varvec{B}})^{-1}, \end{aligned}$$
(14)

substituting (14) into (13), we obtain

$$\begin{aligned} {\varvec{s}}^{(k+1)}- {\varvec{s}}^{*}= & {} (-1)^{k+1}{\varvec{B}}^{k+1}[{\varvec{s}}^{(0)}- ({\varvec{I}}+ {\varvec{B}})^{-1}\widetilde{{\varvec{s}}}] \nonumber \\= & {} (-1)^{k+1}{\varvec{B}}^{k+1}[{\varvec{s}}^{(0)}- {\varvec{s}}^{*}] \end{aligned}$$
(15)

and hence

$$\begin{aligned} 0 \, \le \, \Vert {\varvec{s}}^{(k+1)}- {\varvec{s}}^{*}\Vert \, \le \, \Vert {\varvec{B}}\Vert ^{k+1}\,\Vert {\varvec{s}}^{(0)}-{\varvec{s}}^{*}\Vert . \end{aligned}$$

Once we have \(\lim _{k \rightarrow +\infty } \Vert {\varvec{B}}\Vert ^{k+1} = 0\) (provided that \(\Vert {\varvec{B}}\Vert < 1\)), and assuming that \(\Vert {\varvec{s}}^{(0)}-{\varvec{s}}^{*}\Vert\) remains bounded, by the Squeeze Theorem we conclude that

$$\begin{aligned} \lim _{k \rightarrow +\infty } \Vert {\varvec{s}}^{(k+1)}- {\varvec{s}}^{*}\Vert \, = \, 0, \end{aligned}$$

and, consequently, \(\lim _{k \rightarrow +\infty } {\varvec{s}}^{(k+1)}\, = \, {\varvec{s}}^{*}\).

Now, we will show that \({\varvec{s}}^{*}\) is the (unique) solution of the linear system \({\varvec{K}}_{T}{\varvec{s}}\, = \, -{\varvec{r}}\) if, and only if, \({\varvec{s}}^{*}\) is the (unique) solution of \(({\varvec{I}}+ {\varvec{B}}){\varvec{s}}= \widetilde{{\varvec{s}}}\). In fact,

$$\begin{aligned} \begin{array}{lll} {\varvec{K}}_{T}{\varvec{s}}^{*}\, = \, -{\varvec{r}}&{} \Longleftrightarrow &{} ({\varvec{K}}_{T0}+ \varDelta {\varvec{K}}){\varvec{s}}^{*}\, = \, -{\varvec{r}}\\ &{} \Longleftrightarrow &{} {\varvec{s}}^{*}\, = \, -{\varvec{K}}_{T0}^{-1}{\varvec{r}}- {\varvec{K}}_{T0}^{-1}(\varDelta {\varvec{K}}){\varvec{s}}^{*}\\ &{}\Longleftrightarrow &{} {\varvec{s}}^{*}\, = \, \widetilde{{\varvec{s}}}- {\varvec{B}}{\varvec{s}}^{*}\\ &{}\Longleftrightarrow &{} ({\varvec{I}}+ {\varvec{B}}){\varvec{s}}^{*}= \widetilde{{\varvec{s}}}. \end{array} \end{aligned}$$

Using (15), we observe that

$$\begin{aligned} \begin{array}{lll} {\varvec{s}}^{(k+1)}- {\varvec{s}}^{*}&{} = &{} (-1)^{k+1}{\varvec{B}}^{k+1}[{\varvec{s}}^{(0)}- ({\varvec{I}}+ {\varvec{B}})^{-1}\widetilde{{\varvec{s}}}] \\ &{} = &{} (-1){\varvec{B}}(-1)^{k}{\varvec{B}}^{k}[{\varvec{s}}^{(0)}- ({\varvec{I}}+ {\varvec{B}})^{-1}\widetilde{{\varvec{s}}}] = -{\varvec{B}}({\varvec{s}}^{(k)}- {\varvec{s}}^{*}) \end{array} \end{aligned}$$

and, consequently,

$$\begin{aligned} \Vert {\varvec{s}}^{(k+1)}- {\varvec{s}}^{*}\Vert \, \le \, \Vert {\varvec{B}}\Vert \,\Vert {\varvec{s}}^{(k)}- {\varvec{s}}^{*}\Vert . \end{aligned}$$
(16)

Thus, by (16) and under the assumption that \(\Vert {\varvec{B}}\Vert < 1\), we conclude that the sequence \(\left\{ {\varvec{s}}^{(k)}\right\} ^{+\infty }_{k=0}\) generated by the ICA method has a linear rate of convergence.

In the solution of the linear system (4) using the ICA method, and considering a fixed iteration \(\ell\) of Newton’s method, we adopt a stopping criterion based on the value of the relative magnitude of its residual, given by

$$\begin{aligned} {\widehat{R}}_{k}^{(\ell )} = \frac{\Vert {\varvec{K}}_{T}{\varvec{s}}^{(k)}+ {\varvec{r}}\Vert _{\infty }}{\Vert {\varvec{r}}\Vert _{\infty }}, \qquad k = 0,\,1,\,2,\,\dots , \end{aligned}$$
(17)

where \({\varvec{K}}_{T}\equiv {\varvec{K}}_{T}({\varvec{u}}^{(\ell )},\,\varvec{\rho })\), as stated in (5), and \({\varvec{r}}= {\varvec{r}}({\varvec{u}}^{(\ell )},\,\varvec{\rho })\). Let \(\varepsilon _{R}\) be a small positive constant and \(k_{\max }\) a positive integer. If we get \({\widehat{R}}_{k^{*}}^{(\ell )} < \varepsilon _{R}\) for some \(k^{*} \le k_{\max }\), we consider that \({\varvec{s}}^{(\ell )} = {\varvec{s}}^{(k^{*})}\) is a good approximation for the solution of (4). Otherwise, we update the factorization of \({\varvec{K}}_{T}\), and solve the linear system (4) exactly.

Let \(\widehat{{\varvec{u}}}\) be an approximate solution of the nonlinear system (2) found by means of Newton’s method, such that \(\Vert {\varvec{r}}(\widehat{{\varvec{u}}},\,\varvec{\rho })\Vert < \varepsilon\). It can be shown that the partial derivatives of the objective function \(F(\varvec{\rho }) = {\varvec{l}}^{T}{\varvec{u}}(\varvec{\rho })\) are given by \(\partial F/\partial \rho _{e} \, = \, \varvec{\lambda }_{e}^{T}\,\partial {\varvec{r}}/\partial \rho _{e}\), where \(\varvec{\lambda }\) is the solution of the adjoint linear system

$$\begin{aligned} \widehat{{\varvec{K}}_{T}}\varvec{\lambda }= -{\varvec{l}}, \end{aligned}$$
(18)

\(\widehat{{\varvec{K}}_{T}}\) is the global tangent stiffness matrix evaluated at \(\widehat{{\varvec{u}}}\), and \(\varvec{\lambda }_{e}\) encompasses the components of the vector \(\varvec{\lambda }\) associated with the e-th element.

In order to avoid the factorization of \(\widehat{{\varvec{K}}_{T}}\) (and, thus, to accelerate the solution of our topology optimization problems), the ICA method can also be applied to find an approximate solution of the linear system (18). As done in (10), we define the sequence

$$\begin{aligned} \varvec{\lambda }^{(k+1)}= \widetilde{\varvec{\lambda }}- \widehat{{\varvec{B}}}\varvec{\lambda }^{(k)}, \qquad k = 0,\,1,\,2,\,\dots \end{aligned}$$

where \(\widetilde{\varvec{\lambda }}= -{\varvec{K}}_{T0}^{-1}{\varvec{l}}\), \(\widehat{{\varvec{B}}}= {\varvec{K}}_{T0}^{-1}\widehat{\varDelta {\varvec{K}}}\), \(\widehat{\varDelta {\varvec{K}}}= \widehat{{\varvec{K}}_{T}}-{\varvec{K}}_{T0}\), and we take \(\varvec{\lambda }^{(0)}= \widetilde{\varvec{\lambda }}\). Following the same steps shown above, we can prove that, if \(\Vert \widehat{{\varvec{B}}}\Vert < 1\), the sequence \(\{\varvec{\lambda }^{(k)}\}^{+\infty }_{k=0}\) converges to the unique solution \(\varvec{\lambda }^{*} = -\widehat{{\varvec{K}}_{T}}^{-1}{\varvec{l}}\) of the linear system (18) with a linear rate of convergence.

We can also apply the ICA method for inexactly solving the linear system (18). In this case, the relative magnitude of its residual is defined as

$$\begin{aligned} T_{k} = \frac{\Vert \widehat{{\varvec{K}}_{T}}\varvec{\lambda }^{(k)}+ {\varvec{l}}\Vert _{\infty }}{\Vert {\varvec{l}}\Vert _{\infty }}, \qquad k = 0,\,1,\,2,\,\dots , \end{aligned}$$
(19)

and we require that \(T_{{\overline{k}}} < \varepsilon _{T}\) (where \(\varepsilon _{T}\) is a small positive constant) for some \({\overline{k}} \le k_{\max }\). If this criterion is not satisfied, we update the factorization of \(\widehat{{\varvec{K}}_{T}}\), and solve the linear system (18) exactly.

To sum up, the contributions of our work are as follows. First, we have addressed geometrically nonlinear two-dimensional structures and mechanisms by means of the inexact Newton method with iterative combined approximations, which was not thoroughly studied in the literature. Second, the neo-Hookean Simo-Ciarlet model was adopted to describe the inherent large displacements, thus implying that the increment \(\varDelta {\varvec{K}}\) that represents the changes in the global stiffness matrix is distinct from the additive assumption made in previous works. Last, but not least, the ICA strategy was applied to the adjoint system (18) as well. Despite having been used in the robust topology optimization context (Amir et al. 2012), as far as we know, such an idea was not employed to handle geometrical nonlinearities before.

4 Test problems, proposed strategies and implementation details

In this section we present the computational settings and choices considered in this work.

4.1 Test problems

In our numerical investigation, we have addressed the two structures and the two compliant mechanisms described below.

Cantilever beam Fig. 1 (left) shows the design domain for this problem. Following the suggestions of Chen et al. (2019), we define \(L = 120\,mm\)  and suppose that the material used to construct the structure has a Young’s modulus of \(3000\,N/mm^{2}\) and a Poisson’s ratio of 0.4. The thickness of the structure is equal to \(1\,mm\). An external load of \(120\,N\) is applied downwards at the midpoint of the right side. The structure is supported at the left edge, where all of the nodes are fixed. The domain is discretized into \(40000 \ (400 \times 100)\) finite elements, and the optimal structure must contain \(50\,\%\) of the domain’s volume.

Slender beam The design domain of this structure is presented in Fig. 1 (right), in which \(L = 400\,mm\). An external load of \(40\,N\) is applied downwards at the center of the domain’s basis, and the structure is supported at the left and at the right edges, where all of the nodes are fixed. The Poisson’s ratio and the Young’s modulus of the material are, respectively, 0.3 and \(3000\,N/mm^{2}\). The structure has a thickness of \(1\,mm\). The domain is discretized into \(45000 \ (600 \times 75)\) finite elements, and the optimal structure must contain \(20\,\%\) of the domain’s volume.

Fig. 1
figure 1

Domains of the structures: cantilever beam (left) and slender beam (right)

Inverter Fig. 2 (left) exhibits the square domain for the inverter, with \(L = 300\,\mu m\) and thickness equal to \(7\,\mu m\). At the center of the left edge (point A), there is an external horizontal force of \(50\,mN\) pointing to the right. Supports at the lower and the upper left corners prevent horizontal and vertical displacements. The Poisson’s ratio and the Young’s modulus of the material are, respectively, 0.3 and \(180\,mN/\mu m^{2}\). The stiffness of the springs of the model at points A (input port) and B (output port) are, respectively, \(k_{in} = 4.0\,mN/\mu m\) and \(k_{out} = 1.0\,mN/\mu m\). The volume of the optimal compliant mechanism is limited to \(20\,\%\) of the domain’s volume. Due to symmetry, only the upper half of the domain is discretized into \(45000 \ (300 \times 150)\) finite elements.

Gripper The design domain for the gripper is shown in Fig. 2 (right), where \(L = 320\,\mu m\). The domain has a thickness of \(7\,\mu m\). In this compliant mechanism, an external force of \(4\,mN\) is applied horizontally at point A. Two sets of supports (each one with \(L/20\,\mu m\) of length) are located at the upper and lower left corners, preventing displacements. The Poisson’s ratio and the Young’s modulus of the material are, respectively, 0.3 and \(180\,mN/\mu m^{2}\). The stiffness of the spring at point A (input port) is \(k_{in} = 0.2\,mN/\mu m\). At points B and C (output ports), the stiffness is equal to \(k_{out} = 1.0\,mN/\mu m\). Due to symmetry, only the upper half of the domain is discretized into \(51200 \ (320 \times 160)\) finite elements. The optimal mechanism must contain \(20\,\%\) of the domain’s volume.

Fig. 2
figure 2

Domains of the mechanisms: inverter (left) and gripper (right)

4.2 Proposed strategies

Whenever the objective function of problem (1) needs to be computed, it is necessary to apply Newton’s method to solve the nonlinear system (2), which in turn requires the solution of several linear systems in the form (4). Therefore, the frequency of factorizations of \({\varvec{K}}_{T}\) is the key for speeding up the overall algorithm while keeping the solutions of the system (2) sufficiently accurate.

In the context of the nonlinear analysis of frame structures, Amir et al. (2008) presented some strategies based in the pure CA method reviewed in Sect. 3, in contrast to Newton’s and the Modified Newton methods. They considered the application of the CA method using 2 to 6 vectors of the sequence (10) to obtain the approximate solution (11) of the linear system (4).

In this work, we propose the five Inexact Combined Approximations (ICA) strategies described below. All of them rely on the use of the iterative scheme (10) to approximately solve the linear system (4) at each iteration of Newton’s method. The strategies differ on the frequency of \({\varvec{K}}_{T}\) factorizations, on the updating scheme for matrix \(\varDelta {\varvec{K}}\) and on the use of (10) for the solution of the adjoint linear system (18) related to the objective function gradient computation. To help future reference, each strategy is identified by its abbreviation.

upK1::

\({\varvec{K}}_{T}\) is factored only at the first iteration of Newton’s method; \(\varDelta {\varvec{K}}\) is updated at each iteration of Newton’s method.

upK1g::

same as upK1, with the addition that the linear system (18) is also solved inexactly by means of the iterative scheme (10).

upK100::

\({\varvec{K}}_{T}\) is factored only at the first iteration of Newton’s method; \(\varDelta {\varvec{K}}\) is updated every 100 iterations of Newton’s method.

upK100g::

same as upK100, with the addition that the linear system (18) is also solved inexactly by means of the iterative scheme (10).

upK03K100g::

\({\varvec{K}}_{T}\) is factored only at every 3 iterations of the SPLP method; \(\varDelta {\varvec{K}}\) is updated every 100 iterations of Newton’s method, and the linear system (18) is solved inexactly by means of the iterative scheme (10).

In order to evaluate the performance of the proposed strategies, they are compared with two factorization schemes commonly found in literature:

N:

(Newton): \({\varvec{K}}_{T}\) is factored at every iteration of Newton’s method.

MN:

(Modified Newton): \({\varvec{K}}_{T}\) is factored only at the first iteration of Newton’s method.

It is important to mention that, independently of the adopted strategy, in the first five iterations of the SPLP method, Newton’s method is always applied to obtain the approximate solutions of the nonlinear system (2) to reduce the impact of the starting solution on the overall optimization process. Besides, the approximate solution of the nonlinear system (2) evaluated at the current design cycle is used as the initial guess to the next cycle.

We should also point out that the design of the proposed strategies came after running preliminary intensive sets of experiments. Indeed, we started our numerical investigation with adaptive strategies, but the reached results were not encouraging. Additionally, along the tests, we investigated the effect of decreasing the frequency of the factorizations, and the strategy in which \({\varvec{K}}_{T}\) is factored only at every 5 iterations of the SPLP method performed worse than upK03K100g.

4.3 Implementation Details

The performance of the proposed strategies will be investigated for the four benchmark structures and compliant mechanisms presented in Sect. 4.1.

When we apply an optimization algorithm for solving problem (1), at every iteration we need to solve the nonlinear system (2) in order to compute the objective function. No matter the strategy adopted here, our stopping criterion for Newton’s method is \(\Vert {\varvec{r}}({\varvec{u}}^{(\ell )},\,\varvec{\rho }) \Vert _{\infty } \le 10^{-5}\).

For strategies upK1, upK1g, upK100, upK100g and upK03K100g, in a fixed iteration of Newton’s method, the linear system (4) is solved inexactly by means of the ICA method described in Sect. 3. Moreover, whenever strategies upK1g, upK100g and upK03K100g are adopted, the ICA method is also applied for inexactly solving the adjoint linear system (18).

Based on intensive preliminary experiments, we noticed that for computing the derivatives of the objective function, the impact of the precision upon the adjoint system (18) is much more significant than the precision upon the current linearization of the state problem, given by the linear system (4). The tolerances were chosen based on this preliminary insight, to maintain acceptable accuracy. So, in our numerical tests, we used \(\varepsilon _{R} = 10^{-2}\) and \(\varepsilon _{T} = 10^{-8}\) in (17) and (19), respectively. Besides, in both cases, we set \(k_{max} = 10\) as the maximum number of iterations for the ICA method.

With the aim of ensuring the global convergence of Newton’s method and avoiding small decreases of \(\Vert {\varvec{r}}({\varvec{u}},\,\varvec{\rho }) \Vert ^{2}_{2}\) even with large steps, for all of the strategies we adopt the Armijo’s line search (Nocedal and Wright 2006).

When dealing with topology optimization problems under geometric nonlinearity, the value of the penalty parameter p of the SIMP method plays a crucial role in the stabilization of the overall optimization process. Based on our previous experience (Gomes and Senne 2014), the initial value of p is set to 1.0 and it is increased by \(\varDelta p = 0.1\) after every 10 iterations of the SPLP method until it reaches 3.0. The minimum value \(\rho _{\min }\) allowed for the densities in problem (1) was set to 0.001, to avoid the singularity of the matrix \({\varvec{K}}_{T}\). The density filter proposed by Bruns and Tortorelli (2003) was applied to prevent the checkerboard-like pattern in the optimal distribution of material (Díaz and Sigmund 1995). In our tests, the number of elements used to define the filter radius were 10, 5, 7.5 and 5, respectively, for the cantilever beam, the slender beam, the inverter and the gripper.

The algorithm presented was coded in C\(++\). The CPLEX 12.1 software library was used for solving the subproblems of the SPLP algorithm, and the linear systems were solved using the CHOLMOD 1.7 library. All of the tests were performed on a personal computer with an Intel Core i7-6500U processor, under the Ubuntu Linux 16.4 operating system.

5 Performance of the strategies considering a fixed budget

With the aim of providing a fair comparison of the performance of the several strategies proposed here, we have first solved all of the problems with a fixed budget of 300 iterations of the SPLP method. Table 1 and the remaining tables presented in this and in the following sections contain the value reached for the objective function \(F(\varvec{\rho })\), the accumulated number of iterations of Newton’s method and the total time spent to achieve the optimal solution. More complete tables that contain the time spent evaluating \(F(\varvec{\rho })\), solving the SPLP subproblems, applying the filter to the densities, and computing the gradient vector \(\nabla F(\varvec{\rho })\), are given as supplementary information. We remark that the CPU times are always expressed in seconds.

Table 1 Results obtained for 300 iterations of SPLP (cantilever beam and slender beam)

From Table 1, one may notice that the optimal values of the objective function among the strategies are very similar for the cantilever beam and for the slender beam. On the other hand, all of the new strategies showed a noticeable reduction of the total time spent by the solver, and this reduction is more pronounced for strategies upK1g, upK100g and upK03K100g. Figure 3 indicates that, while the remaining steps of the SPLP algorithm took almost the same time for all of the strategies, the time spent to evaluate the objective function was substantially reduced if compared to Newton’s method, especially when the upK03K100g strategy is adopted.

Fig. 3
figure 3

Total CPU time and time spent to compute the objective function for the cantilever beam \(400 \times 100\) (left) and the slender beam \(600 \times 75\) (right)

The time spent to evaluate \(F(\varvec{\rho })\) is the sum of the time consumed to assemble \({\varvec{K}}_{T}\), to factorize this matrix, to assemble the right-hand side (RHS) vectors, and to solve the linear systems (4) and (18). Figure 4 shows the time spent by the strategies on each of these steps. As we see, the percentage reduction of the time spent to compute \(F(\varvec{\rho })\) (showed above the bars) is similar for the Modified Newton method and strategies upK1 and upK100, although the reduction of the time required to assemble and to factorize \({\varvec{K}}_{T}\) showed a subtle variation among them.

As expected, the time consumed to assemble \({\varvec{K}}_{T}\) is reduced when \(\varDelta {\varvec{K}}\) is updated only at every 100 iterations of Newton’s method (i.e., when strategies upK100, upK100g and upK03K100g are used). Besides, a noticeable reduction of the time spent with the factorization of \({\varvec{K}}_{T}\) is attained when the linear system (18) is solved inexactly by means of the iterative scheme (10) (strategies upK1g, upK100g and upK03K100g). This reduction easily compensates the slight increase in the time consumed to assemble the RHS vectors and to solve the linear systems. Finally, factoring \({\varvec{K}}_{T}\) only at every 3 iterations of the SPLP method, as done by the upK03K100g strategy, has a dramatic effect on the time spent to assemble and factorize \({\varvec{K}}_{T}\), which leads to an overall reduction that exceeds 60% for both structures.

Fig. 4
figure 4

Time spent to compute \(F(\varvec{\rho })\) divided into its main components (assembling \({\varvec{K}}_{T}\), assembling the RHS vector, factorizing \({\varvec{K}}_{T}\), and solving the linear systems) for the cantilever beam \(400 \times 100\) (left) and the slender beam \(600 \times 75\) (right). Data labels show the percentage reduction of the time spent to compute \(F(\varvec{\rho })\) (above the bars), to assemble \({\varvec{K}}_{T}\), and to factorize \({\varvec{K}}_{T}\), with respect to the time of Newton’s method

To put the effect of the geometric nonlinearity in perspective, we have also considered the two structures as described in Sect. 4.1, but using the linear elastic assumption to define the stiffness matrix, so that it varies just with the densities, and do not depend on the displacements any longer. Adopting the policy of factoring such a matrix at every call of the objective function, i.e. the original strategy analyzed in Senne et al. (2019), we have solved the corresponding topology optimization problem, reaching the so-called small displacement configuration. Figure 5 depicts the optimal topologies, with small and large displacements, for the cantilever beam and the slender beam, respectively. For the cantilever beam, it is evident the lack of symmetry about the medium horizontal line presented by the structure under large displacements, to cope with the nonlinear requirement, as compared with the structure under small displacements. When it comes to the slender beam, it is also significant the effect of the geometric nonlinearity upon the optimal configuration. It must be stressed that the structures obtained for the seven strategies presented on Table 1 are quite similar, so only the results for the upK100g strategy are shown.

Fig. 5
figure 5

Optimal topologies for the cantilever beam \(400 \times 100\) (top) and the slender beam \(600 \times 75\) (bottom)

Figure 6 shows the distribution of the time consumed by each strategy to evaluate the objective function, and the number of iterations performed by Newton’s method at each iteration of the SPLP method, for the cantilever beam.

Fig. 6
figure 6

Distributions of the time spent to compute the objective function (left) and the number of iterations of Newton’s Method (right) at each iteration of the SPLP algorithm, for the cantilever beam \(400 \times 100\)

We can observe from Fig. 6 that, when Newton’s method is employed, it takes at most 3 iterations to reach the approximate solution of the nonlinear system (2), but, due to its expensive iterations, the average time spent to compute the objective function is 3.9 seconds (the average values are indicated by “\(\varvec{\times }\)” in the box plots). With the adoption of the Modified Newton method, the average time is reduced to 2.8, but a slower convergence is experienced in some occasions. For example, at one iteration of the SPLP method, 10 iterations were necessary to obtain the approximate solution of (2). This behavior can be explained by the fact that, for the Modified Newton method, the matrix \({\varvec{K}}_{T}\) is factored only at the initial point, without carrying any information about the changes of \({\varvec{K}}_{T}\) along the iterative process.

The upK1 strategy represents an enhancement of the Modified Newton method, since the iterative scheme (10) takes into account the variations on \({\varvec{K}}_{T}\) by means of the matrix \(\varDelta {\varvec{K}}\). This is reflected in the acceleration in the convergence of such inexact Newton method, as we can see in the right box plot of Fig. 6. Moreover, in almost all of the callings of this inexact Newton method, the approximate solution of the linear system (4) with the desired precision was obtained already with the first term of the sequence (10). Despite of these facts, the time consumed to evaluate \(F(\varvec{\rho })\) by the Modified Newton method and by strategy upK1 are quite similar. On the other hand, strategy upK1g showed a noticeable decrease in the time spent to compute \(F(\varvec{\rho })\) in comparison with upK1 (the average time required to compute \(F(\varvec{\rho })\) was reduced from 2.7 to 2.2), since it does not require the factorization of \({\varvec{K}}_{T}\) for the solution of the adjoint linear system (18). Strategies upK100 and upK100g showed a slightly better performance than upK1 and upK1g, respectively, suggesting that solving inexactly the linear system (18) is more effective than reducing the frequency of \(\varDelta {\varvec{K}}\) updates.

The adoption of the upK03K100g strategy produced the most remarkable improvement in the time spent to evaluate the objective function. As we can see in Fig. 6, although upK03K100g has a wider box plot than the other strategies, the evaluation of \(F(\varvec{\rho })\) demanded less than 1.5 seconds in most of the iterations, requiring less than 1 second in many cases. It should be mentioned that the maximum number of iterations of the iterative scheme (10) can be reached when we are far from the solution, due to the lack of information on the changes of the factorization of \({\varvec{K}}_{T}\). Whenever this happens, the factorization of \({\varvec{K}}_{T}\) is updated, and we solve the full linear system (4). Fortunately, this rarely occurs, so the speed of convergence of the inexact Newton method is almost not affected.

Now, we will discuss the results obtained for the inverter and for the gripper mechanisms, considering the fixed budget of 300 iterations of the SPLP method. These results are summarized on Table 2. One may notice that the maximum variation between the optimal values of the objective function was of \(0.09\,\%\) for the inverter and of \(0.26\,\%\) for the gripper, which is very reasonable from a practical point of view.

Table 2 Results obtained for 300 iterations of SPLP (Inverter and Gripper)

The remarkable time saving provided by the proposed strategies is directly related to the reduction in the time spent to compute \(F(\varvec{\rho })\), as we see in Fig. 7. Besides, also for the mechanisms, upK03K100g is much cheaper than the other strategies.

Fig. 7
figure 7

Total CPU time and time spent to compute the objective function for the inverter \(300 \times 150\) (left) and the gripper \(320 \times 160\) (right)

Figure 8 shows the time spent to compute \(F(\varvec{\rho })\) divided into its main components. As expected, when Newton’s method is employed, the factorization of \({\varvec{K}}_{T}\) takes most of the total time for solving the problems (\(90\,\%\) of the total time for the inverter and \(86.5\,\%\) for the gripper).

Fig. 8
figure 8

Time spent to compute \(F(\varvec{\rho })\) divided into its main components (assembling \({\varvec{K}}_{T}\), assembling the RHS vector, factorizing \({\varvec{K}}_{T}\), and solving the linear systems) for the inverter \(300 \times 150\) (left) and the gripper \(320 \times 160\) (right). Data labels show the percentage reduction of the time spent to compute \(F(\varvec{\rho })\) (above the bars), to assemble \({\varvec{K}}_{T}\), and to factorize \({\varvec{K}}_{T}\), with respect to the time of Newton’s method

As it was observed for the rigid structures, the time spent assembling \({\varvec{K}}_{T}\) is reduced when we move from strategy upK1 to upK100 and from upK1g to upK100g. However, this reduction is less relevant than the one that results from solving inexactly the linear system (18) (compare strategies upK1 and upK100 to their “g” counterparts). Yet none of these strategies achieves the outstanding percentage reduction provided by upK03K100g.

Aiming to assess the effect of the geometric nonlinearity for the mechanisms, following the same ideas applied to the structures, we have also generated the optimal configurations under small displacements for the inverter and the gripper, so they can be compared with the optimal configurations obtained when large displacements are considered. The varied topologies obtained are presented in Fig. 9. Since there was no significant difference between the structures obtained with the strategies presented on Table 2, only the results for the upK100g strategy are shown.

Fig. 9
figure 9

Optimal topologies for the inverter \(300 \times 150\) (left) and the gripper \(320 \times 160\) (right)

Figure 10 shows the distribution of the time spent by each strategy to compute \(F(\varvec{\rho })\), and the number of iterations performed by Newton’s method at each iteration of the SPLP method, for the gripper mechanism. As we can see, although Newton’s method required few iterations to find the approximate solution of the nonlinear system (2), it took 6.9 seconds, on average, to compute \(F(\varvec{\rho })\). This contrasts with the behavior of the remaining strategies, that managed to reduce the time spent to compute the objective function without significantly increasing the number of Newton iterations.

Fig. 10
figure 10

Distributions of the time spent to compute the objective function (left) and the number of iterations for Newton’s Method (right) at each iteration of the SPLP algorithm, for the gripper \(320 \times 160\)

Strategies upK1 and upK100 required fewer iterations for solving the nonlinear system (2) than the Modified Newton method, due to the frequent updating of the \(\varDelta {\varvec{K}}\) matrix. The upK100 strategy was slightly better than upK1 just because it took less time assembling \({\varvec{K}}_{T}\). In turn, strategies upK1g and upK100g required much less time to factorize \({\varvec{K}}_{T}\) than their counterparts upK1 and upK100, causing a significant reduction of the average time required to compute \(F(\varvec{\rho })\).

At the iterations of the SPLP method in which the factorization of \({\varvec{K}}_{T}\) was reused, the upK03K100g strategy took no more than 2 seconds to compute \(F(\varvec{\rho })\). In some occasions, the time spent was less than 1 second. Because of the smaller frequency of factorizations of \({\varvec{K}}_{T}\), the average number of iterations to obtain the solution of the linear system (4) was larger than those required by the other inexact Newton strategies. Despite this fact, the accuracy of the approximate solutions of the linear system (4) and of the nonlinear system (2) were not spoiled.

6 Effect of mesh refinement

The time consumed to evaluate the objective function (and, consequently, the total time spent for solving the topology optimization problem) is directly influenced by the mesh refinement of the design domain. To investigate the impact of the number of finite elements on the time spent to compute \(F(\varvec{\rho })\), we chose four different meshes for the slender beam and for the inverter, considering a fixed budget of 300 iterations of the SPLP method. For the slender beam, the design domain was discretized into \(200 \times 25 = 5000\), \(400 \times 50 = 20000\), \(600 \times 75 = 45000\) and \(800 \times 100 = 80000\) elements. In each case, the filter radius was defined accordingly, corresponding to the length of 2.5, 5.0, 7.5 and 10.0 elements, respectively. The upper half of the design domain of the inverter was discretized into \(200 \times 100 = 20000\), \(300 \times 150 = 45000\), \(400 \times 200 = 80000\) and \(500 \times 250 = 125000\) elements, and the filter radius was set to the length of 5.0, 7.5, 10.0 and 12.5 elements, respectively. The tables that show how the performance of each strategy was affected by the mesh refinement are available as supplementary information.

6.1 Slender beam

First of all, it is worth mentioning that, for all of the meshes considered, the value of \(F(\varvec{\rho })\) attained after 300 iterations of the SPLP method differ very little regardless of the strategy adopted, corroborating the robustness of the reanalysis strategies proposed here. Since the optimal topologies are also very similar among strategies, only the results related to the upK100g strategy are shown in Fig. 11.

Fig. 11
figure 11

Optimal topologies for the slender beam

Figure 12, however, shows that the time required to evaluate the objective function may vary significantly among strategies. This figure suggests that, for all of the methods, there is an almost linear relation between the time spent to compute \(F(\varvec{\rho })\) and the product of the size of matrix \({\varvec{K}}_{T}\) and its bandwidth. Moreover, all of our strategies not only provide a considerable decrease on the overall time spent to compute \(F(\varvec{\rho })\)  but also this reduction become more evident as we refine the design domain. In fact, the slopes of the regression lines of the Modified Newton method and strategies upK1, upK1g, upK100, upK100g and upK03K100g are, respectively, \(30.2\,\%\), \(33.2\,\%\), \(44.4\,\%\), \(35.0\,\%\), \(48.7\,\%\) and \(66.3\,\%\) smaller than the slope of the line associated with Newton’s method.

Fig. 12
figure 12

\(F(\varvec{\rho })\)-time for each strategy according to the dimension of \({\varvec{K}}_{T}\) (slender beam)

Figure 13 shows the percentage of decrease in the time spent to evaluate \(F(\varvec{\rho })\) with respect to Newton’s method, for the four discretizations considered for the slender beam. For the coarsest mesh, the Modified Newton method took almost twice the number of iterations of Newton’s method, resulting in an increase of \(5.7\,\%\) in the time spent to compute the objective function. For the remaining discretizations, the performance of the Modified Newton method and strategy upK1 were very similar, although the latter took fewer iterations due to its frequent updating of \(\varDelta {\varvec{K}}\). It is clear from the figure that the inexact solution of the system related to the objective function gradient becomes more relevant as the mesh gets finer, so the decrease in time is more pronounced for the upK1g, upK100g and upK03K100g strategies.

Fig. 13
figure 13

Percentage of decrease in the time demanded to evaluate the objective function (F) and in the total time (T), as compared with Newton’s method, for the slender beam

Although strategies upK1g and upK100g attained an average reduction of \(34.2\,\%\)  and \(38.5\,\%\), respectively, for the last two discretizations, upK03K100g showed the best figures for all of the meshes, as expected. For this strategy, the reduction of the time consumed in evaluating \(F(\varvec{\rho })\) varied from \(46.0\,\%\) to \(65.0\,\%\). When it comes to the time for assembling \({\varvec{K}}_{T}\), the reduction ranged between \(38.4\,\%\) and \(45.0\,\%\), while the factorization of \({\varvec{K}}_{T}\) showed an even more significant result, with a reduction in the range \(71.6\,\%\) to \(77.4\,\%\).

Figure 13 also shows the percentage of decrease in the total time spent by the inexact Newton strategies, with respect to Newton’s method. The difference between the two percentages presented for each strategy reveals the relevance of the remained steps performed at each iteration of the SPLP method. In short, for the Modified Newton method and strategies upK1 and upK100, the decrease in the time spent to obtain the optimal topologies was, in average, \(18.1\,\%\), \(19.3\,\%\) and \(21.4\,\%\), respectively. When strategies upK1g and upK100g were adopted, an average decrease of \(25.6\,\%\) and \(29.5\,\%\) was obtained. Finally, the upK03K100g strategy showed an average decrease of \(40.0\,\%\).

6.2 Inverter

As we observed for the slender beam, for all of the mesh sizes, the value of the objective function attained by the various strategies after 300 iterations of the SPLP method was almost the same. In this case, the difference did not exceed \(0.07\,\%\), which is very reasonable from the practical point of view. Moreover, the number of iterations necessary to obtain the approximate solution of the nonlinear system (2) also showed little variation among strategies. As a consequence, the optimal topologies obtained were quite similar, so only the results for the upK100g strategy are shown in Fig. 14.

Fig. 14
figure 14

Optimal topologies for the inverter

Figure 15 shows how the time spent to compute \(F(\varvec{\rho })\) depends on the problem size. Once again, we observe that our strategies are very efficient in reducing the time consumed in computing \(F(\varvec{\rho })\)  as the mesh is refined. The regression lines suggest that the growth rates of the Modified Newton method and strategies upK1 and upK100 are very similar. The same occurs with upK1g and upK100g. As expected, the upK03K100g strategy achieved the best performance. Compared to the regression line of Newton’s method, the lines of the Modified Newton method and strategies upK1 and upK100 showed a 37 %  smaller slope. For strategies upK1g and upK100g, the slope is \(54.1\,\%\)  smaller, and for upK03K100g the reduction is about \(72.0\,\%\).

Fig. 15
figure 15

\(F(\varvec{\rho })\)-time for each strategy according to the dimension of \({\varvec{K}}_{T}\) (inverter)

Figure 16 presents the percentage of decrease in the time spent to evaluate \(F(\varvec{\rho })\) and in the overall time of the algorithm, with respect to Newton’s method. From the figure, one can notice that the percentages provided by the Modified Newton method and strategies upK1 and upK100 are very similar, ranging from \(31.0\,\%\) to \(37.0\,\%\) when we take into account the four mesh sizes.

Fig. 16
figure 16

Percentage of decrease in the time demanded to evaluate the objective function (F) and in the total time (T), as compared with Newton’s method, for the inverter mechanism

Analogously to the results shown in Fig. 13, the adoption of the iterative scheme (10) to obtain an approximate solution for the adjoint system (18) resulted in a substantial reduction in the time spent with the reanalysis process. In fact, strategies upK1g and upK100g provided an average decrease that ranged from \(44.5\,\%\) (for the \(200 \times 100\) mesh) to \(53.3\,\%\) (for the \(500 \times 250\) one).

As before, strategy upK03K100g showed the best results for all of the meshes considered, providing a reduction that varies from \(62.2\,\%\) (for the coarser mesh) to \(71.2\,\%\) (for the finer one). On average, this strategy spent \(46.0\,\%\) less time to assemble \({\varvec{K}}_{T}\) and \(78.7\,\%\) less time to factorize this matrix, in comparison with Newton’s method.

For the Modified Newton method and strategies upK1 and upK100, the overall time spent by the SPLP algorithm was \(28.6\,\%\), \(28.1\,\%\) and \(30.4\,\%\) smaller, respectively, than the time consumed when Newton’s method was adopted. For strategies upK1g and upK100g, we have observed a reduction of \(41.0\,\%\) and \(42.8\,\%\), respectively. Finally, the upK03K100g strategy showed an average decrease of \(57.4\,\%\).

7 Performance of the SPLP method with the nonlinear reanalysis

In this section, we investigate the performance of the reanalysis strategies when the SPLP method is run until convergence. We have seen above that, for a fixed budget of 300 iterations, these strategies can substantially reduce the overall time spent by the algorithm, as compared to Newton’s method. Our purpose now is to check if this also happens when we adopt a mathematically well founded stopping criterion, as described below.

The Lagrangian function associated with the topology optimization problem is defined by

$$\begin{aligned} {\mathcal {L}}(\varvec{\rho },\,\theta ) \, := \, F(\varvec{\rho }) + \theta V(\varvec{\rho }), \end{aligned}$$

where \(F(\varvec{\rho })\) is the objective function, \(V(\varvec{\rho })\) is the volume constraint, and \(\theta \in {\mathbb {R}}\) is the Lagrange multiplier associated with this constraint. Denoting the gradient vector of the Lagrangian function by \(\nabla _{\varvec{\rho }}{\mathcal {L}}(\varvec{\rho },\,\theta )\) and defining \(v(\varvec{\rho },\,\theta ) := \varvec{\rho }-\nabla _{\varvec{\rho }}{\mathcal {L}}(\varvec{\rho },\,\theta )\), the continuous projected gradient onto the set

$$\begin{aligned} X := \left\{ \varvec{\rho }\in {\mathbb {R}}^{n_{el}} \,\,\, | \,\,\, \rho _{\min } \le \rho _{i} \le 1, \,\,\, i = 1,\,\dots ,\, n_{el} \right\} \end{aligned}$$

is given by \(g_{P}(\varvec{\rho },\,\theta ) = P_{X}(v(\varvec{\rho },\,\theta ))-\varvec{\rho }\), where \(P_{X}(v(\varvec{\rho },\,\theta ))\) is the orthogonal projection of the vector \(v(\varvec{\rho },\,\theta )\) onto X. We consider that the SPLP algorithm has found a good approximation for a stationary point for the topology optimization problem whenever

$$\begin{aligned} \Vert g_{P}(\varvec{\rho }^{(k_{s})},\,\theta ^{(k_{s})}) \Vert _{\infty } < 10^{-3}, \end{aligned}$$
(20)

where \(\Vert \, \cdot \, \Vert _{\infty }\) denotes the max-norm, and \(\theta ^{(k_{s})}\) is the approximate Lagrange multiplier of the volume constraint at the \(k_{s}\)-th iteration of the SPLP algorithm.

The results obtained for the rigid structures and compliant mechanisms using the criterion (20) are shown in Tables 3 and 4, respectively. As we observe, no matter the problem, the optimal values of the objective function at the solution were very similar among strategies. On the other hand, with the exception of the gripper, the number of iterations of the SPLP algorithm showed some variation, affecting the overall performance of the algorithm.

The performance of the various strategies was also affected by the number of iterations required for solving the nonlinear system (2). For the slender beam, for example, most of the strategies took more iterations than Newton’s method, so the decrease in the total CPU time was less noticeable. Even though, all of the new strategies performed better than the Modified Newton method for the mechanisms, behavior that is also seen, in most cases, for the rigid structures. Moreover, for all of the four problems, the improvement in the total time showed by the upK03K100g strategy over the Modified Newton method is about 48%.

Table 3 Results obtained using the SPLP stopping criterion (cantilever and slender beams)
Table 4 Results obtained using the SPLP stopping criterion (Inverter and Gripper)

Figure 17 summarizes the differences observed on the time the various strategies spent to compute the objective function, both for the fixed budget of 300 iterations and in the case we require full convergence. The bars of the figure show the reduction provided by each strategy, with respect to Newton’s method. From the figure, it is easy to notice that, for the cantilever beam and the inverter, the results obtained when we use the stopping criterion (20) are better than those attained with a fixed budget. For the gripper, there was almost no difference between the results, and for the slender beam the performance of the first five strategies was degraded when full convergence was required, mainly due to an increase on the number of iterations of Newton’s method, as mentioned above. The figure also reveals the relevance of applying the ICA scheme to the computation of the objective function gradient. In general, the strategies whose name end with g (i.e., those that use ICA for solving (18)) attained the best results, notably upK03K100g, that showed a consistent improvement when the stopping criterion was changed.

Fig. 17
figure 17

Percentage of decrease in the time required to evaluate the objective function, as compared to Newton’s method

8 Investigating the behavior of the norm of matrix B

Let \({\varvec{B}}\) be the matrix defined in (9). In Sect. 3, we demonstrated that if \(\Vert {\varvec{B}}\Vert < 1\), then the sequence \(\left\{ {\varvec{s}}^{(k)}\right\} ^{+\infty }_{k=0}\) (defined in (10) and associated with the ICA method) converges to the solution \({\varvec{s}}^{*}\) of the linear system (4) of Newton’s method. Moreover, we showed that this sequence has linear rate of convergence.

With the aim of investigating the impact of the variation on the maximum value of \(\Vert {\varvec{B}}\Vert _{2}\) attained whenever the nonlinear system (2) is solved, we have performed an additional test considering the slender beam and the gripper mechanism, using strategies upK100g and upK03K100g. Since the explicit evaluation of \({\varvec{B}}\) is very expensive, these numerical test was performed in MATLAB (version R2016a) and took into account a coarser mesh, with \(160 \times 20 = 3200\) finite elements for the slender beam and \(80 \times 40 = 3200\) finite elements for the gripper mechanism, applying the stopping criterion for the SPLP method stated in (20).

As in the previous experiments, the optimal values of the objective function found in this new test were very similar for the two structures considered here, regardless of the applied strategy. For each combination of problem and strategy, we have recorded \(\Vert {\varvec{B}}\Vert _{2}\) and the number of iterations of Newton’s method, to see how they vary along time. The results are shown in Figs. 18 and 19 .

Fig. 18
figure 18

Maximum value of \(\Vert {\varvec{B}}\Vert _{2}\) (left) and number of iterations of Newton’s method (right) for the slender beam \(160 \times 20\)

Fig. 19
figure 19

Maximum value of \(\Vert {\varvec{B}}\Vert _{2}\) (left) and number of iterations of Newton’s method (right) for the gripper \(80 \times 40\)

We can observe that there is a direct relation between the number of iterations of Newton’s method and the value of \(\Vert {\varvec{B}}\Vert _{2}\), i.e., the chance of performing more than 3 iterations increases with the norm of B. Even though, when the upK100g strategy is applied, the number of iterations necessary to find the approximate solution of the linear system (4) is at most equals to 3 in \(93.5\,\%\) of the iterations for the slender beam and \(98.2\,\%\) for the gripper. For the upK03K100g strategy, no more than 3 linear system solutions are required in \(48.6\,\%\) of the iterations for the slender beam and \(73.5\,\%\) for the gripper.

As shown in Figs. 18 and 19 , when the upK100g strategy is used, the maximum value of \(\Vert {\varvec{B}}\Vert _{2}\) is below 1 in \(41.7\,\%\) of the SPLP iterations for the slender beam, and in \(56.3\,\%\) for the gripper. For the upK03K100g strategy, this percentage is reduced to \(20.3\,\%\) for the slender beam and \(26.5\,\%\) for the gripper. However, for both strategies, the approximate solution of the nonlinear system (2) was always obtained after at most 5 iterations of Newton’s method, no matter the problem under analysis. Therefore, we conclude that our reanalysis strategies are robust. 

9 Final remarks

Under the geometric nonlinearity assumption, the nonlinear equations that model the equilibrium of topology optimization problems must be solved repeatedly along the solution process, constituting the bottleneck of any iterative scheme.

In this work, to accommodate the trade-off between speed and accuracy, five strategies have been proposed to steer the frequencies of the factorizations and of the updatings along the solution process. These strategies have provided significant time savings as compared with Newton’s method, without spoiling the convergence behavior of the algorithm. Indeed, strategy upK03K100g has saved more than 60% with the fixed budget of 300 SPLP iterations, and around 70% when the solver is run up to convergence. The deformed structures and mechanisms obtained by means of this strategy are given as supplementary information.

By choosing four different finite element meshes, we have observed that the CPU time spent for evaluating the objective function scales well with the dimensions of the global stiffness matrix. Moreover, strategy upK03K100g produced the smallest rate of increase in time as the meshes were refined.

Despite the theoretical condition for convergence of the reanalysis, namely \(\Vert {\varvec{B}}\Vert _2 < 1\), not being verified all along the iterations, we have observed that the incidental increasing of such a norm did not have a negative impact upon the results, corroborating the robustness of the proposed strategies.

We have also observed that the use of the ICA scheme for solving the adjoint linear system associated with the sensitivity analysis had a crucial role in accelerating the SPLP algorithm. The combination of this scheme with the factorization of \({\varvec{K}}_{T}\) at every 3 iterations and the update of \(\varDelta {\varvec{K}}\) at every 100 iterations of Newton’s method turned upK03K100g into a fast, yet robust, approach, outperforming the remaining strategies.

Related topics for future research include the mixing of the proposed strategies with further acceleration schemes, such as reduced-order modeling and multigrid methods, as well as handling additional stress constraints into the model (see e.g. Amir 2021; Deng et al. 2021; Holmberg et al. 2013). Another room for investigation would be to admit geometric nonlinearities within the minimum weight problem formulation (cf. Amir 2015; Long et al. 2019), to develop the inherent approximate reanalysis framework by means of iterative combined approximations, and to assess its impact.