Abstract
Many problems in structural optimization can be formulated as a minimization of the maximum eigenvalue of a symmetric matrix. In practice it is often observed that the maximum eigenvalue has multiplicity greater than one close to or at optimal solutions. In this note we give a sufficient condition for this to happen at extreme points in the optimal solution set. If, as in topology optimization, each design variable determines the amount of material in a finite element in the design domain, then this condition essentially amounts to saying that the number of elements containing material at a solution must be greater than the order of the matrix.
1 Introduction
Many problems in structural optimization can be formulated as a minimization of the maximum eigenvalue of a symmetric matrix. Examples include maximization of the fundamental eigenfrequency (Seyranian et al. 1994; Du and Olhoff 2005), robust design optimization (Kobelev 1993; Takezawa et al. 2011; Brittain et al. 2012; Holmberg et al. 2015), and maximization of the fundamental buckling load (Cox and Overton 1992; Rahmatalla and Swan 2003; Stanford and Beran 2013).^1
Let λ 1(A) denote the largest eigenvalue of a symmetric matrix \(\boldsymbol {A} \in \mathbb {R}^{n \times n}\). In the present context A depends on the stiffness matrix arising from a finite element discretization of some structure. The stiffness matrix in turn depends on a set of design variables x, usually determining the amount of material present in each finite element. It is frequently observed that the multiplicity of λ 1(A) is greater than one close to or at an optimal solution x ∗, rendering the function x→λ 1(A(x)) non-smooth (Overton and Womersley 1993). In this note we provide sufficient conditions for this to happen. The most important part of these is that the number of non-zero elements in x ∗ is greater than the order n of A. Since the conditions depend on the solution they cannot in general be checked a priori, but this is quite natural since one expects applied loads and boundary conditions to have a significant impact on the properties of the optimal solutions.
The work presented here builds on the analysis of Pataki (1998) and follows it closely in parts. Pataki (1998) considered unconstrained minimization of the maximum eigenvalue (so did Dedieu (2000) using different techniques), but for structural optimization it is necessary to (at least) introduce lower bounds on the design variables and a linear equality constraint limiting the total amount of material used. The main result here is Theorem 1, which gives a lower bound (greater than one) on the multiplicity of λ 1(A(x ∗)) when A depends linearly on the vector of design variables x, and x ∗ is an extreme point in the optimal solution set. In many problems of interest, however, A is a non-linear function of x. Fortunately, Lemma 1 shows that such problems can be reduced to problems of the type treated by Theorem 1, and Theorem 2 then provides a lower bound on the multiplicity of λ 1(A(x ∗)) in the non-linear case.
2 Notation and basic facts
Let \(\mathbb {S}^{n}\) denote the space of n × n symmetric matrices with real entries. If a matrix in \(\mathbb {S}^{n}\) is positive semi-definite we say that it belongs to \(\mathbb {S}^{n}_{+}\). The eigenvalues of \(\boldsymbol {A} \in \mathbb {S}^{n}\) are denoted λ 1 ≥ … ≥ λ n , and we can write A = QDiag{λ i }Q T, where \(\boldsymbol {Q} \in \mathbb {R}^{n\times n}\) is an orthonormal matrix and Diag{λ i } is a diagonal matrix with λ 1, … , λ n on the diagonal. The functions \(\lambda _{i}:\mathbb {S}^{n} \rightarrow \mathbb {R}\) are Lipschitz continuous, and λ 1 is convex. The multiplicity of λ 1 is the largest number p, 1 ≤ p ≤ n, such that λ 1 = … = λ p > λ p+1; we denote this number mult(λ 1). Finally, tr(A) denotes the trace of A.
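These basic facts are easy to make concrete. The following sketch (our own illustration, not part of the original note, assuming NumPy) computes the spectral decomposition of a small symmetric matrix and the multiplicity of its largest eigenvalue; the helper `mult_lambda1` and the tolerance are our own choices:

```python
import numpy as np

# Toy symmetric matrix whose largest eigenvalue has multiplicity 2.
A = np.diag([3.0, 3.0, 1.0])

# np.linalg.eigh returns eigenvalues in ascending order; reverse them to
# match the convention lambda_1 >= ... >= lambda_n used in the text.
lam, Q = np.linalg.eigh(A)
lam, Q = lam[::-1], Q[:, ::-1]

# Spectral decomposition A = Q Diag{lambda_i} Q^T with orthonormal Q.
assert np.allclose(Q @ np.diag(lam) @ Q.T, A)
assert np.allclose(Q.T @ Q, np.eye(3))

def mult_lambda1(eigs, tol=1e-9):
    """Largest p with lambda_1 = ... = lambda_p > lambda_{p+1} (numerically)."""
    return int(np.sum(eigs >= eigs[0] - tol))

print(mult_lambda1(lam))  # -> 2
```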
A face of a convex set C is a non-empty, convex subset F of C such that if \(F \ni x = \frac {1}{2}(x_{1}+x_{2})\) for some x 1, x 2 ∈ C, then x 1, x 2 ∈ F. A zero-dimensional face of a convex set, i.e. a point, is called an extreme point of that set.
We use the notation t(n) = n(n + 1)/2 for the n-th triangular number, and let ℕ0 denote the set of non-negative integers.
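Since the triangular numbers recur throughout the counting arguments below, a two-line helper (our own illustration, assuming Python) makes the convention explicit:

```python
def t(n: int) -> int:
    """n-th triangular number t(n) = n(n+1)/2."""
    return n * (n + 1) // 2

# t(n) counts the free entries of an n x n symmetric matrix, which is why it
# appears when a symmetric matrix equation is counted as scalar constraints.
assert t(1) == 1 and t(4) == 10

# A small identity that is handy when comparing such counts: t(n-1) + n = t(n).
assert all(t(n - 1) + n == t(n) for n in range(1, 100))
```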
3 The affine case
Let \(\boldsymbol {A} : \mathbb {R}^{m} \rightarrow \mathbb {S}^{n}\) be defined by

$$\boldsymbol{A}(\boldsymbol{x}) = \boldsymbol{A}_{0} + \sum\limits_{i=1}^{m} x_{i}\boldsymbol{A}_{i},$$
where \(\boldsymbol {A}_{i}\in \mathbb {S}^{n}\) for i = 0,…, m. Consider the optimization problem

$$\min\limits_{\boldsymbol{x} \in \mathcal{X}} \lambda_{1}(\boldsymbol{A}(\boldsymbol{x})), \qquad (1)$$
where \(\mathcal {X} = \{\boldsymbol {x} \in \mathbb {R}^{m} \;|\; \boldsymbol {x} \geq \boldsymbol {0},\; \boldsymbol {a}^{\mathsf {T}}\boldsymbol {x} = b\}\). The vector \(\boldsymbol {a} \in \mathbb {R}^{m}\) is assumed to be such that this convex set is non-empty, compact and contains a point x > 0. The objective function in (1) is a composition of a convex and an affine function, hence convex. Since \(\mathcal {X}\) is compact and λ 1(⋅) is lower semi-continuous, the set of optimal solutions to (1) is non-empty and compact (Andréasson et al. 2005, Theorem 4.7).
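A minimal instance of problem (1) already exhibits the phenomenon studied in this note. In the toy example below (our own illustration, assuming NumPy; the matrices A 1, A 2 and the data a = (1, 1), b = 2 are hypothetical), λ 1(A(x)) = max(x 1, x 2) on the feasible segment, and the minimizer is precisely the point where the two eigenvalues coalesce:

```python
import numpy as np

# Toy instance of problem (1): A(x) = x1*A1 + x2*A2 with A1 = diag(1, 0),
# A2 = diag(0, 1), and X = {x >= 0, x1 + x2 = 2}. Then
# lambda_1(A(x)) = max(x1, x2), a convex but non-smooth function of x.
def lam1(x1: float) -> float:
    A = np.diag([x1, 2.0 - x1])       # A(x) on the feasible segment
    return np.linalg.eigvalsh(A)[-1]  # largest eigenvalue

# Grid search over the feasible segment.
grid = np.linspace(0.0, 2.0, 2001)
xstar = grid[int(np.argmin([lam1(x1) for x1 in grid]))]

# The minimizer is x = (1, 1), exactly where the two eigenvalues coalesce.
eigs = np.linalg.eigvalsh(np.diag([xstar, 2.0 - xstar]))
assert abs(xstar - 1.0) < 1e-6 and np.isclose(eigs[0], eigs[1])
```

Here m = n = 2, so the sufficient condition of Theorem 1 below is not strict, yet the coalescence it predicts for larger problems is already visible.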
The results proved in this paper concern extreme points of the set of optimal solutions to (1), so it is of interest to know that such points exist. Let \(\lambda _{1}^{\ast }\) denote the optimal value of (1). The set of optimal solutions is then

$$\mathcal{O} = \{\boldsymbol{x} \in \mathbb{R}^{m} \;|\; \lambda_{1}(\boldsymbol{A}(\boldsymbol{x})) \leq \lambda_{1}^{\ast}\} \cap \mathcal{X}.$$
The first of the sets to the right is closed and convex (Pataki 1998, p. 348), so by Theorem 2.1 in Rockafellar (1972) \(\mathcal {O}\) is convex. Proposition 2.3.3 in Hiriart-Urruty and Lemaréchal (1993) then ensures the existence of an extreme point in \(\mathcal {O}\).
3.1 SDP-reformulation and the main theorem
There exist several different characterizations of the maximum eigenvalue of a symmetric matrix (Pataki 1998). Here we employ a version where the eigenvalue is the optimal value of a certain semi-definite program (SDP).
Let x be given. A symmetric matrix is positive semi-definite if and only if all its eigenvalues are non-negative (Horn and Johnson 1985, Theorem 7.2.1). It follows that z I − A(x) ≽ 0 ⇔ z − λ 1(A(x)) ≥ 0, where I is an identity matrix. Therefore λ 1(A(x)) is the optimal value of the SDP

$$\begin{array}{ll} \underset{z \in \mathbb{R},\, \boldsymbol{W} \in \mathbb{S}^{n}}{\min} & z \\[4pt] \text{subject to} & z\boldsymbol{I} - \boldsymbol{A}(\boldsymbol{x}) = \boldsymbol{W}, \quad \boldsymbol{W} \succeq \boldsymbol{0}, \end{array} \qquad (2)$$
where W is a slack variable. Since A(x) is symmetric, A(x) = QDiag{λ i }Q T, with Q an orthonormal matrix of order n. Then (z ∗, W ∗) is an optimal solution to (2) if and only if z ∗ = λ 1(A(x)) and

$$\boldsymbol{W}^{\ast} = z^{\ast}\boldsymbol{I} - \boldsymbol{A}(\boldsymbol{x}) = \boldsymbol{Q}\,\mathrm{Diag}\{\lambda_{1} - \lambda_{i}\}\boldsymbol{Q}^{\mathsf{T}}.$$
It follows from the last equality that

$$\mathrm{rank}(\boldsymbol{W}^{\ast}) = n - \mathrm{mult}(\lambda_{1}(\boldsymbol{A}(\boldsymbol{x}))). \qquad (3)$$
This implies that the lower the rank of W ∗, the greater the multiplicity of λ 1(A(x)). Consequently, the proof of Theorem 1 below amounts to bounding the rank of W ∗.
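The rank–multiplicity relation for the optimal slack is easy to confirm numerically. The sketch below (our own check, not part of the original note, assuming NumPy; the eigenvalue pattern is a hypothetical choice) builds a matrix with known multiplicity and verifies that W ∗ = λ 1(A) I − A is positive semi-definite with rank n − mult(λ 1):

```python
import numpy as np

# Build A = Q Diag{lambda_i} Q^T with a known eigenvalue pattern: the largest
# eigenvalue 4 has multiplicity p = 3 in a matrix of order n = 5.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((5, 5)))  # random orthonormal Q
lam = np.array([4.0, 4.0, 4.0, 2.0, 1.0])
A = Q @ np.diag(lam) @ Q.T

# The optimal slack in (2) is W* = lambda_1(A) I - A, which is positive
# semi-definite and has rank n - mult(lambda_1), as stated in (3).
n, p = 5, 3
W = lam[0] * np.eye(n) - A
assert np.all(np.linalg.eigvalsh(W) >= -1e-9)
assert np.linalg.matrix_rank(W, tol=1e-8) == n - p
```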
Now since x in (2) is arbitrary, problem (1) can be reformulated as

$$\begin{array}{ll} \underset{\boldsymbol{x},\, z,\, \boldsymbol{W}}{\min} & z \\[4pt] \text{subject to} & z\boldsymbol{I} - \boldsymbol{A}(\boldsymbol{x}) - \boldsymbol{W} = \boldsymbol{0}, \quad \boldsymbol{W} \succeq \boldsymbol{0}, \quad \boldsymbol{x} \in \mathcal{X}. \end{array} \qquad (4)$$
The key step in the proof of Theorem 1 below is the application of Theorem 2.2 in Pataki (1998), which provides an upper bound on the sum of the ranks of the matrix variables at faces of the feasible set of an SDP. Therefore we need to reformulate (4) into a problem to which the cited theorem applies. To this end, note that a scalar x j ≥ 0 can be seen as a symmetric, positive semi-definite matrix of order 1. This means that problem (4) can be written as

$$\begin{array}{ll} \underset{\boldsymbol{x},\, z,\, \boldsymbol{W}}{\min} & z \\[4pt] \text{subject to} & z\boldsymbol{I} - {\sum}_{i=1}^{m} x_{i}\boldsymbol{A}_{i} - \boldsymbol{W} = \boldsymbol{A}_{0}, \\[4pt] & \boldsymbol{a}^{\mathsf{T}}\boldsymbol{x} = b, \quad \boldsymbol{W} \in \mathbb{S}^{n}_{+}, \quad x_{i} \in \mathbb{S}^{1}_{+}, \; i = 1, \ldots, m. \end{array} \qquad (5)$$
With this reformulation we can now prove the main theorem.
Theorem 1
Let x ∗ be an extreme point of \(\mathcal {O}\) such that \({\sum }_{i=1}^{m} r(x_{i}^{\ast }) > n\), where r(x i ) = rank(x i ) ∈ {0, 1}. Then

$$\mathrm{mult}(\lambda_{1}(\boldsymbol{A}(\boldsymbol{x}^{\ast}))) \geq 2.$$
Proof
Since x ∗ is an extreme point of \(\mathcal {O}\), it follows from Lemma 4.2 in Pataki (1998) (omit V and take k = 1) that (x ∗, z ∗, W ∗), where (z ∗, W ∗) solves (2) with x = x ∗, is an extreme point of the feasible set of (5). Let w = rank(W ∗). Problem (5) has q = 1 unconstrained scalar variable and t(n) + 1 equality constraints (counting a symmetric matrix-constraint as t(n) scalar constraints), and an extreme point is a face of dimension zero, so Theorem 2.2 from Pataki (1998) yields

$$t(w) + \sum\limits_{i=1}^{m} t(r(x_{i}^{\ast})) \leq t(n) + 1 - q = t(n). \qquad (6)$$
To get a contradiction, assume that w = n − 1 [by (3), then mult(λ 1(A(x ∗))) = 1]. Straightforward calculations show that

$$t(n-1) + \sum\limits_{i=1}^{m} t(r(x_{i}^{\ast})) = t(n-1) + \sum\limits_{i=1}^{m} r(x_{i}^{\ast}) > t(n-1) + n = t(n), \qquad (7)$$
since by assumption, \({\sum }_{i=1}^{m} r(x_{i}^{\ast })>n\). But now (7) contradicts (6) and it follows that w < n − 1; hence w ≤ n − 2. This result together with (6) substituted in (3) then yields

$$\mathrm{mult}(\lambda_{1}(\boldsymbol{A}(\boldsymbol{x}^{\ast}))) = n - w \geq 2.$$
□
Remark 1
If x ≥ 0, y ≥ 0 and x + y = u, then 0 ≤ x ≤ u and 0 ≤ y ≤ u. Based on this observation we could account for an upper bound u i on each of the x i ’s in (5) by introducing additional variables \(y_{i}\in \mathbb {S}_{+}^{1}\), i = 1,…, m, and m linear equality constraints x i + y i = u i . Inequality (6) in the proof above then becomes

$$t(w) + \sum\limits_{i=1}^{m} \left[ t(r(x_{i}^{\ast})) + t(r(y_{i}^{\ast})) \right] \leq t(n) + m.$$
It holds that \(r(x_{i}^{\ast }) + r(y_{i}^{\ast }) = 2\) if \(0<x_{i}^{\ast }<u_{i}\), and 1 otherwise. Following the same arguments as in the proof one is thus led to a version of Theorem 1 where \({\sum }_{i=1}^{m} r(x_{i}^{*}) > n\) is replaced by

$$|\{i \;|\; 0 < x_{i}^{\ast} < u_{i}\}| > n,$$
where |⋅| denotes the cardinality of a set. In terms of topology optimization this means that the number of “grey” elements in the solution should exceed the order of the matrix. This seems overly restrictive, since researchers report coalescing eigenvalues despite using penalization methods that lead to solutions where almost all variables attain their upper or lower bound (Holmberg et al. 2015).
Remark 2
It is straightforward to modify the theorem to account for additional linear equality constraints – adding s such constraints leads to an additional term “+s” to the right in (6). Linear inequality constraints may be treated by introducing (non-negative) slack variables. Adding one linear inequality and a slack variable s then implies that a term “r(s)” should be added to the left-hand side of (6) and “+1” to the right-hand side.
4 The non-linear case
Consider the optimization problem

$$\min\limits_{\boldsymbol{x} \in \mathcal{X}} \lambda_{1}(\boldsymbol{A}(\boldsymbol{x})), \qquad (8)$$
where x → A(x) is a continuously differentiable, non-linear function. Lemma 1 below shows that any solution x ∗ to this problem is also a solution to a problem of type (1). Therefore it makes sense to assume that x ∗ is an extreme point in the optimal solution set of the latter problem. This is done in Theorem 2, the proof of which is then a direct application of Theorem 1.
Lemma 1
Let x ∗ be an optimal solution to (8). Then x ∗ solves

$$\min\limits_{\boldsymbol{y} \in \mathcal{X}} \lambda_{1}\!\left(\boldsymbol{A}^{\ast} + \sum\limits_{i=1}^{m} (y_{i} - x_{i}^{\ast})\boldsymbol{A}_{i}^{\ast}\right), \qquad (9)$$
where A ∗ = A(x ∗), and \(\boldsymbol {A}_{i}^{\ast } = \frac {\partial \boldsymbol {A}}{\partial x_{i}}(\boldsymbol {x}^{\ast })\).
Proof
We begin by noting that \(\mathcal {X}\) satisfies Slater’s constraint qualification (Andréasson et al. 2005, Def. 5.38), ensuring the necessity of the optimality conditions that follow.
Introduce the Lagrangian

$$L(\boldsymbol{x}; \mu, \boldsymbol{\gamma}) = \lambda_{1}(\boldsymbol{A}(\boldsymbol{x})) + \mu(\boldsymbol{a}^{\mathsf{T}}\boldsymbol{x} - b) - \boldsymbol{\gamma}^{\mathsf{T}}\boldsymbol{x}. \qquad (10)$$
The necessary first-order optimality conditions for a point \(\boldsymbol {x}^{\ast }\in \mathcal {X}\) in (8) are (Clarke 1983, Theorem 6.1.1) the existence of (μ, γ) ≠ 0, with γ ≥ 0, such that

$$\boldsymbol{0} \in \partial L(\boldsymbol{x}^{\ast}; \mu, \boldsymbol{\gamma}), \qquad \gamma_{i} x_{i}^{\ast} = 0, \quad i = 1, \ldots, m,$$
where ∂ L(x ∗;μ, γ) = ∂ λ 1(A ∗) + μ a − γ is Clarke’s generalized gradient of L. Let p = mult(λ 1(A ∗)). Theorem 3.9 in Overton and Womersley (1993) provides an expression for the generalized gradient ∂ λ 1(A ∗), using which the optimality conditions become the existence of (μ, γ) ≠ 0, with γ ≥ 0, and \(\boldsymbol {U}\in \mathbb {S}^{p}_{+}\) such that

$$\mathrm{tr}(\boldsymbol{U}) = 1, \qquad \mathrm{tr}\!\left(\boldsymbol{U}\boldsymbol{Q}_{1}^{\mathsf{T}}\boldsymbol{A}_{i}^{\ast}\boldsymbol{Q}_{1}\right) + \mu a_{i} - \gamma_{i} = 0, \qquad \gamma_{i} x_{i}^{\ast} = 0, \quad i = 1, \ldots, m, \qquad (11)$$
where the columns of \(\boldsymbol {Q}_{1} \in \mathbb {R}^{n \times p}\) are eigenvectors associated with λ 1(A ∗).
Problem (9) is convex, so the generalized gradient of the associated Lagrangian coincides with the subdifferential (Clarke 1983, Proposition 2.2.7). The necessary and sufficient first-order optimality conditions for (9) then follow from Theorem 28.3 in Rockafellar (1972), and are the existence of (μ, γ) ≠ 0, with γ ≥ 0, and \(\boldsymbol {U}\in \mathbb {S}^{p}_{+}\) such that

$$\mathrm{tr}(\boldsymbol{U}) = 1, \qquad \mathrm{tr}\!\left(\boldsymbol{U}\boldsymbol{Q}_{1}^{\mathsf{T}}\boldsymbol{A}_{i}^{\ast}\boldsymbol{Q}_{1}\right) + \mu a_{i} - \gamma_{i} = 0, \qquad \gamma_{i} y_{i} = 0, \quad i = 1, \ldots, m, \qquad (12)$$
But these conditions coincide with conditions (11). This shows that y = x ∗ satisfies the optimality conditions (12) of problem (9). Since these are sufficient for optimality it follows that x ∗ solves (9). □
Theorem 2
Assume that \({\sum }_{i=1}^{m} r(x_{i}^{\ast }) > n\) and that x ∗ is an extreme point in the set of optimal solutions to problem (9). Then

$$\mathrm{mult}(\lambda_{1}(\boldsymbol{A}(\boldsymbol{x}^{\ast}))) \geq 2.$$
Proof
Since problem (9) is of the form (1) with optimal value λ 1(A ∗) the assertion follows from Theorem 1. □
5 Discussion
This paper has presented sufficient conditions for the maximum eigenvalue at an optimal solution to a type of problems occurring in structural optimization to have multiplicity greater than one. The conditions are given in Theorems 1 and 2, whose proofs are based on results obtained by Pataki (1998).
It is common in structural optimization to require that the design variables satisfy x j ≥ε for some small number ε>0 to avoid singularity of the stiffness matrix (Christensen and Klarbring 2009). A simple change of variables x i → y i + ε results in a problem with zero as the lower bound on the design variables to which the presented results apply.
In Holmberg et al. (2015) the author and co-workers considered minimization of the worst-case compliance for a structure subject to self-weight loadings. The problem was formulated as a non-linear SDP, which is likely the best way to treat it numerically, but it is not difficult to show that it is equivalent to
where \(\mathcal {H}\) is similar to \(\mathcal {X}\), but includes upper bounds on x, and \(\boldsymbol {H}(\boldsymbol {x}) = \boldsymbol {B}(\boldsymbol {x})\boldsymbol {K}(\boldsymbol {x})^{-1}\boldsymbol {B}(\boldsymbol {x})^{\mathsf {T}}\in \mathbb {S}^{s}\), in which K(x) is the global stiffness matrix, B(x) is a scaling matrix and s ∈ {2, 3} is the number of spatial dimensions. For this problem, n in Theorem 2 equals s – a very small number – so the inequality \({\sum }_{i=1}^{m}r(x_{i}^{\ast }) > n\) will, in practice, always hold. Numerical experience (Holmberg et al. 2015) strongly corroborates the suggestion by Theorem 2 that the maximum eigenvalue at an optimal solution will have multiplicity greater than one.
6 Future work
There are at least three items that should be addressed in future work:
(i) Are numerical solutions to structural optimization problems typically extreme points in the optimal solution set? Is it possible to modify the results in this paper so that they apply also to non-extremal solutions?
(ii) Problems involving generalized eigenvalues fit into the present framework by assuming that they solve ordinary eigenvalue problems (see footnote in the Introduction). This hinges on the non-singularity of at least one of the involved matrices. This assumption is satisfied in the bulk of the relevant literature, but important exceptions exist (Achtziger and Kočvara 2007). Therefore direct treatment of generalized eigenvalues would be desirable.
(iii) The results given in this paper should be extended to account for upper bounds on the variables x i . This is needed to analyze, for instance, three-dimensional topology optimization problems for continuum bodies. The issue was discussed in the remark following Theorem 1 above, but the obtained condition seems overly restrictive.
Notes
1. To see that all of these problems can be formulated as a minimization of the maximum eigenvalue, note that λ 1(A) = −λ n (−A), where λ 1 and λ n are the largest and smallest eigenvalues, respectively, of the symmetric matrix A, and that, for a positive definite A, if λ 1 is the maximum eigenvalue for the generalized eigenvalue problem (B−λ A)u = 0, then it plays the same role in the ordinary eigenvalue problem (A −1/2 B A −1/2−λ I)u = 0.
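Both identities in this footnote are easy to verify numerically. The following sketch (our own check, not part of the original note, assuming NumPy and a randomly generated positive definite A) does so for a 4 × 4 pair:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
A = M @ M.T + 4.0 * np.eye(4)   # symmetric positive definite
B = M + M.T                     # symmetric

# First identity: lambda_1(A) = -lambda_n(-A).
assert np.isclose(np.linalg.eigvalsh(A)[-1], -np.linalg.eigvalsh(-A)[0])

# Second identity: reduce the generalized problem (B - lambda A)u = 0 to an
# ordinary one via the symmetric matrix A^{-1/2} B A^{-1/2}.
w, V = np.linalg.eigh(A)
A_inv_sqrt = V @ np.diag(w ** -0.5) @ V.T
ordinary = np.sort(np.linalg.eigvalsh(A_inv_sqrt @ B @ A_inv_sqrt))

# The generalized eigenvalues are those of A^{-1} B (real here since A > 0).
generalized = np.sort(np.linalg.eigvals(np.linalg.solve(A, B)).real)
assert np.allclose(ordinary, generalized)
```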
References
Achtziger W, Kočvara M (2007) On the maximization of the fundamental eigenvalue in topology optimization. Struct Multidiscip Optim 34(3):181–195
Andréasson N, Evgrafov A, Patriksson M (2005) An introduction to continuous optimization. Studentlitteratur
Brittain K, Silva M, Tortorelli DA (2012) Minmax topology optimization. Struct Multidiscip Optim 45(5):657–668
Christensen PW, Klarbring A (2009) An introduction to structural optimization. Springer, Berlin Heidelberg New York
Clarke FH (1983) Optimization and nonsmooth analysis. Wiley, New York
Cox SJ, Overton ML (1992) On the optimal design of columns against buckling. SIAM J Math Anal 23(2):287–325
Dedieu JP (2000) Does optimality imply ill-posedness? Some remarks about certain min-max optimization problems. Optimization: A Journal of Mathematical Programming and Operations Research 48(2):211–217
Du J, Olhoff N (2005) Topology optimization of continuum structures with respect to simple and multiple eigenfrequencies. In: 6th world congresses of structural and multidisciplinary optimization, Rio de Janeiro, Brazil
Hiriart-Urruty JB, Lemaréchal C (1993) Convex analysis and minimization algorithms I. Springer, Berlin Heidelberg New York
Holmberg E, Thore C-J, Klarbring A (2015) Worst-case topology optimization with self-weight using semi-definite programming. Struct Multidiscip Optim 52(5):915–928
Horn RA, Johnson CR (1985) Matrix analysis. Cambridge University Press, Cambridge, UK
Kobelev V (1993) On a game approach to optimal structural design. Struct Optimization 6(3):194–199
Overton ML, Womersley RS (1993) Optimality conditions and duality theory for minimizing sums of the largest eigenvalues of symmetric matrices. Math Program 62(1–3):321–357
Pataki G (1998) On the rank of extreme matrices in semidefinite programs and the multiplicity of optimal eigenvalues. Math Oper Res 23(2):339–358
Rahmatalla S, Swan CC (2003) Continuum topology optimization of buckling-sensitive structures. AIAA J 41(6):1180–1189
Rockafellar RT (1972) Convex analysis. Princeton University Press, Princeton
Seyranian AP, Lund E, Olhoff N (1994) Multiple eigenvalues in structural optimization problems. Struct Optimization 8(4):207–227
Stanford B, Beran P (2013) Aerothermoelastic topology optimization with flutter and buckling metrics. Struct Multidiscip Optim 48:149–171
Takezawa A, Nii S, Kitamura M, Kogiso N (2011) Topology optimization for worst load conditions based on the eigenvalue analysis of an aggregated linear system. Comput Methods Appl Mech Eng 200(25):2268–2281
Acknowledgments
Thanks to Prof. Anders Klarbring for valuable discussions and the anonymous reviewers for their comments. This research was supported in part by the Swedish Foundation for Strategic Research Grant No. AM13-0029.
Thore, CJ. Multiplicity of the maximum eigenvalue in structural optimization problems. Struct Multidisc Optim 53, 961–965 (2016). https://doi.org/10.1007/s00158-015-1380-3