Abstract
The differential-geometric structure of the manifold of smooth shapes is applied to the theory of shape optimization problems. In particular, Riemannian shape gradients with respect to the first Sobolev metric and the Steklov–Poincaré metric are defined. Moreover, the covariant derivative associated with the first Sobolev metric is deduced in this paper. The explicit expression of the covariant derivative leads to a definition of the Riemannian shape Hessian with respect to the first Sobolev metric. We also give a brief overview of various optimization techniques based on these gradients and the Hessian. Since the space of smooth shapes limits the applicability of these optimization techniques, this paper extends the definition of smooth shapes to \(H^{1/2}\)-shapes, which arise naturally in shape optimization problems. We define a diffeological structure on the new space of \(H^{1/2}\)-shapes. This can be seen as a first step towards the formulation of optimization techniques on diffeological spaces.
1 Introduction
Shape optimization is of great importance in a wide range of applications. Many real-world problems can be reformulated as shape optimization problems constrained by partial differential equations (PDEs). Aerodynamic shape optimization [49], acoustic shape optimization [50], optimization of interfaces in transmission problems [19, 48], image restoration and segmentation [24], electrochemical machining [23] and inverse modelling of skin structures [47] can be mentioned as examples. The subject of shape optimization is covered by several fundamental monographs, see, for instance, [14, 56].
Questions like How can shapes be defined? or What does the set of all shapes look like? have been extensively studied in recent decades. Already in 1984, David G. Kendall introduced the notion of a shape space in [30]. Often, a shape space is modeled as a linear (vector) space, which in the simplest case is made up of vectors of landmark positions (cf. [13, 30]). However, there is a large number of different shape concepts, e.g., plane curves [42, 43], surfaces in higher dimensions [4, 31, 40], boundary contours of objects [18, 36, 64], multiphase objects [63], characteristic functions of measurable sets [65] and morphologies of images [15]. In many processes in engineering, medical imaging and science, there is great interest in equipping the space of all shapes with a significant metric to distinguish between different shape geometries. In the simplest shape space case (landmark vectors), the distances between shapes can be measured by the Euclidean distance, but in general, the study of shapes and their similarities is a central problem. In order to tackle natural questions like How different are shapes?, Can we determine the measure of their difference? or Can we infer some information? mathematically, we have to put a metric on the shape space. There are various types of metrics on shape spaces, e.g., inner metrics [4, 42] like the Sobolev metrics, outer metrics [6, 30, 42], metamorphosis metrics [26, 60], the Wasserstein or Monge–Kantorovich metric on the shape space of probability measures [2, 7], the Weil–Petersson metric [34], current metrics [16] and metrics based on elastic deformations [18, 64]. However, it is a challenging task to model both the shape space and the associated metric. There does not exist a common shape space or shape metric suitable for all applications. Different approaches lead to diverse models, and the suitability of an approach depends on the requirements of the given situation.
In contrast to a finite-dimensional optimization problem, which can be obtained, e.g., by representing shapes as splines, the connection of shape calculus with infinite-dimensional spaces [14, 28, 56] leads to a more flexible approach. In recent work, it has been shown that PDE constrained shape optimization problems can be embedded in the framework of optimization on shape spaces. For example, in [53], shape optimization is considered as optimization on a Riemannian shape manifold, the manifold of smooth shapes. Moreover, an inner product for the application of finite element (FE) methods, called the Steklov–Poincaré metric, is proposed in [54].
First, we concentrate on the particular manifold of smooth shapes and consider the first Sobolev and the Steklov–Poincaré metric in this paper. The definition of the Riemannian shape gradient with respect to these two metrics results in the formulation of gradient-based optimization algorithms. One aim of this paper is to give an overview of the optimization techniques in the space of smooth shapes together with the first Sobolev and Steklov–Poincaré metric. This paper extends the gradient-based results in [62], where the theory of PDE constrained shape optimization problems is connected with the differential-geometric structure of the space of smooth shapes. More precisely, this paper aims at the definition of a Riemannian shape Hessian with respect to the first Sobolev metric. In order to formulate such a definition, the covariant derivative needs to be specified. This paper formulates a theorem about the covariant derivative associated with the first Sobolev metric, which opens the door to formulating higher order methods in the space of smooth shapes.
The manifold of smooth shapes contains shapes with infinitely differentiable boundaries, which limits its practical applicability. For example, in the setting of PDE constrained shape optimization, one has to deal with polygonal shape representations from a computational point of view, because FE methods are usually used to discretize the models. In [54], not only an inner product, the Steklov–Poincaré metric, is given, but also a suitable shape space for the application of FE methods is proposed. The combination of this particular shape space and its associated inner product is an essential step towards applying efficient FE solvers, as outlined in [55]. However, so far, this shape space and its properties have not been investigated. From a theoretical point of view, it is necessary to clarify its structure; without knowing the structure, we have no control over the space. Thus, this paper aims at a generalization of smooth shapes to shapes which arise naturally in shape optimization problems. We define the space of so-called \(H^{1/2}\)-shapes. Moreover, we clarify its structure as a diffeological one and, thus, move towards the formulation of optimization techniques on diffeological spaces. Since a diffeological space is one of the generalizations of a manifold, this paper formulates a theorem which clarifies the difference between manifolds and diffeological spaces.
This paper is organized as follows. In Sect. 2, besides a short overview of basic concepts in shape optimization (Sect. 2.1), the connection of shape calculus with the differential-geometric structure of shape spaces is stated (Sect. 2.2). In particular, the Riemannian shape gradients with respect to the first Sobolev and Steklov–Poincaré metric are defined and the Riemannian shape Hessian with respect to the first Sobolev metric is given. The first of the main theorems of this paper is Theorem 2, which specifies the covariant derivative associated with the first Sobolev metric, necessary for the definition of the Riemannian shape Hessian with respect to this metric. Thanks to the definition of the Riemannian shape Hessian, we are able to formulate the Newton method in the space of smooth shapes together with the first Sobolev metric. Additionally, we give a brief overview of first order optimization techniques based on gradients with respect to the first Sobolev metric as well as the Steklov–Poincaré metric. Section 2.2 ends with a comparison of the gradient-based algorithms for a specific example. Section 3 is concerned with the space of \(H^{1/2}\)-shapes. First, we give a brief introduction to diffeological spaces (Sect. 3.1); here, Theorem 3, the second main theorem of this paper, specifies the difference between diffeological spaces and manifolds. In Sect. 3.2, the space of \(H^{1/2}\)-shapes is defined. Here, Theorem 4, the third and last of the main theorems, endows the space of \(H^{1/2}\)-shapes with its diffeological structure.
2 Optimization in Shape Spaces
First, we set up notation and terminology of basic shape optimization concepts (Sect. 2.1). Afterwards, shape calculus is combined with geometric concepts of shape spaces (Sect. 2.2). In [62], the theory of shape optimization problems constrained by partial differential equations is already connected with the differential-geometric structure of the space of smooth shapes, and gradient-based methods are outlined. Section 2.2 extends these results to a Riemannian shape Hessian, for which the covariant derivative needs to be specified. This opens the door to formulating higher order methods in the space of smooth shapes. In particular, we formulate a Newton method on the space of smooth shapes based on the definition of the Riemannian shape Hessian.
2.1 Basic Concepts in Shape Optimization
This section sets up notation and terminology of basic shape optimization concepts used in this paper. For a detailed introduction into shape calculus, we refer to the monographs [14, 56].
One of the main focuses of shape optimization is to investigate shape functionals and solve shape optimization problems. First, we give the definition of a shape functional.
Definition 1
(Shape functional) Let D denote a non-empty subset of \({\mathbb {R}}^d\), where \(d\in {\mathbb {N}}\). Moreover, \({\mathcal {A}}\subset \{\varOmega :\varOmega \subset D\}\) denotes a set of subsets. A function
$$J:{\mathcal {A}}\rightarrow {\mathbb {R}},\quad \varOmega \mapsto J(\varOmega )$$
is called a shape functional.
Let \(J:{\mathcal {A}}\rightarrow {\mathbb {R}}\) be a shape functional, where \({\mathcal {A}}\) is a set of subsets \(\varOmega \) as in Definition 1. An unconstrained shape optimization problem is given by
$$\min _{\varOmega \in {\mathcal {A}}}\, J(\varOmega ). \qquad (1)$$
Often, shape optimization problems are constrained by equations, e.g., equations involving an unknown function of two or more variables and at least one partial derivative of this function. In this case, the objective functional J has two arguments, the shape \(\varOmega \) as well as the so-called state variable y, where the state variable is the solution of the underlying constraint. A constrained shape optimization problem reads as
$$\min _{\varOmega \in {\mathcal {A}}}\, J(\varOmega ,y(\varOmega ))\quad \text {s.t.}\quad {\mathcal {F}}(\varOmega ,y(\varOmega ))=0,\ y(\varOmega )\in {\mathcal {X}}(\varOmega ), \qquad (2)$$
where \({\mathcal {X}}(\varOmega )\) is usually a function space and the constraint \({\mathcal {F}} (\varOmega ,y(\varOmega ))=0\) is given for example by a PDE or a system of PDEs. When J in (2) depends on a solution of a PDE, we call the shape optimization problem PDE constrained.
Let D be as in Definition 1. Moreover, let \(\{F_t\}_{t\in [0,T]}\) be a family of mappings \(F_t:{\overline{D}}\rightarrow {\mathbb {R}}^d\) such that \(F_0={\text {id}}\), where \({\overline{D}}\) denotes the closure of D and \(T>0\). This family transforms the domain \(\varOmega \) into new perturbed domains
$$\varOmega _t:=F_t(\varOmega )=\{F_t(x):x\in \varOmega \}$$
and the boundary \(\varGamma \) of \(\varOmega \) into new perturbed boundaries
$$\varGamma _t:=F_t(\varGamma )=\{F_t(x):x\in \varGamma \}.$$
Such a transformation can be described by the velocity method or by the perturbation of identity. We concentrate on the perturbation of identity, which is defined by \(F_t(x):= x+tV(x)\), where V denotes a sufficiently smooth vector field.
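To make the perturbation of identity concrete, the following sketch transports a polygonal approximation of the unit circle; the vector field V, the step t and the discretization are arbitrary illustrative choices, not taken from the paper:

```python
import numpy as np

# Discretize the unit circle as a closed polygonal boundary.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
x = np.stack([np.cos(theta), np.sin(theta)], axis=1)

def V(x):
    """An arbitrary smooth perturbation field V: R^2 -> R^2 (illustrative choice)."""
    return np.stack([x[:, 0] ** 2, 0.5 * x[:, 1]], axis=1)

def F(x, t):
    """Perturbation of identity: F_t(x) = x + t * V(x)."""
    return x + t * V(x)

x_perturbed = F(x, 0.1)   # points of the perturbed boundary Gamma_t
```

For `t = 0` the mapping reduces to the identity, in accordance with \(F_0={\text {id}}\).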
To solve shape optimization problems, we need their shape derivatives.
Definition 2
(Shape derivative) Let \(D\subset {\mathbb {R}}^d\) be open, \(\varOmega \subset D\) and \(k\in {\mathbb {N}}\cup \{\infty \}\). Moreover, let \({\mathcal {C}}^k_0(D,{\mathbb {R}}^d)\) denote the set of \({\mathcal {C}}^k(D,{\mathbb {R}}^d)\)-functions which vanish on \(\partial \varOmega \). The Eulerian derivative of a shape functional J at \(\varOmega \) in direction \(V\in {\mathcal {C}}^k_0(D,{\mathbb {R}}^d)\) is defined by
$$DJ(\varOmega )[V]:=\lim _{t\rightarrow 0^+}\frac{J(\varOmega _t)-J(\varOmega )}{t}. \qquad (4)$$
If the Eulerian derivative (4) exists for all directions \(V\in {\mathcal {C}}^k_0(D,{\mathbb {R}}^d)\) and the mapping
$${\mathcal {C}}^k_0(D,{\mathbb {R}}^d)\rightarrow {\mathbb {R}},\quad V\mapsto DJ(\varOmega )[V]$$
is linear and continuous, the expression \(DJ(\varOmega )[V]\) is called the shape derivative of J at \(\varOmega \) in direction \(V\in {\mathcal {C}}^k_0(D,{\mathbb {R}}^d)\). In this case, J is called shape differentiable of class \({\mathcal {C}}^k\) at \(\varOmega \).
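As a sanity check of this definition, consider the area functional \(J(\varOmega )=\int _\varOmega 1\,dx\) on the unit disk and the direction \(V(x)=x\); then \(DJ(\varOmega )[V]=\int _\varOmega \mathrm {div}\,V\,dx=2|\varOmega |=2\pi \). The difference quotient below approximates the Eulerian derivative on a polygonal discretization; all discretization parameters are illustrative choices:

```python
import numpy as np

def polygon_area(p):
    """Shoelace formula for the area enclosed by a closed polygon."""
    x, y = p[:, 0], p[:, 1]
    return 0.5 * np.abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))

theta = np.linspace(0.0, 2.0 * np.pi, 4000, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)

def J(t):
    """Area of the perturbed domain Omega_t under F_t(x) = x + t*V(x) with V(x) = x."""
    return polygon_area(circle + t * circle)

t = 1e-5
eulerian_derivative = (J(t) - J(0.0)) / t
print(eulerian_derivative)  # close to 2*pi
```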
Remark 1
There are many options to prove shape differentiability of shape functionals which depend on a solution of a PDE and to derive the shape derivative of a shape optimization problem. The min–max approach [14], the chain rule approach [56], the Lagrange method of Céa [10] and the rearrangement method [29] should be mentioned in this context. A nice overview of these approaches is given in [59].
The Hadamard Structure Theorem (cf. [56, Theorem 2.27]) states that under certain assumptions the shape derivative is a distribution acting on the normal part of the perturbation field on the boundary.
Theorem 1
(Hadamard Structure Theorem) Let D and \(\varOmega \) be as in Definition 2. Moreover, let the shape functional J be shape differentiable of class \({\mathcal {C}}^k\) at every domain \(\varOmega \subset D\) with \({\mathcal {C}}^{k-1}\)-boundary \(\varGamma =\partial \varOmega \). Then there exists a scalar distribution \(r\in {\mathcal {C}}^k_0(\varGamma )'\) such that the gradient \(G(\varOmega )\in {\mathcal {C}}^k_0(\varOmega ,{\mathbb {R}}^d)'\) of J at \(\varOmega \) is given by
$$G(\varOmega )=\gamma _\varGamma '(r\,n). \qquad (5)$$
Here \( {\mathcal {C}}^k_0(\varGamma )'\) and \({\mathcal {C}}^k_0(\varOmega ,{\mathbb {R}}^d)'\) denote the dual spaces of \( {\mathcal {C}}^k_0(\varGamma )\) and \({\mathcal {C}}^k_0(\varOmega ,{\mathbb {R}}^d)\). Moreover,
denotes the trace operator and \(\gamma _\varGamma '\) its adjoint operator.
Note that the Hadamard Structure Theorem 1 actually states the existence of a scalar distribution \(r=r(\varOmega )\) on the boundary \(\varGamma \) of a domain \(\varOmega \). However, in this paper, we always assume that r is an integrable function. In general, if \(r\in L^1(\varGamma )\), then r is obtained in the form of the trace on \(\varGamma \) of an element \(G\in W^{1,1}(\varOmega )\). Thus, it follows from (5) that the shape derivative can be expressed more conveniently as
If the objective functional is given by an integral over the whole domain, the shape derivative can be expressed as an integral over the domain, the so-called volume or weak formulation, and an integral over the boundary, the so-called surface or strong formulation:
Here \(r\in L^1(\varGamma )\) and R is a differential operator acting linearly on the vector field V with \(DJ_\varOmega [V]=DJ(\varOmega )[V]=DJ_\varGamma [V]\). Recent advances in PDE constrained optimization on shape manifolds are based on the surface formulation, also called Hadamard form, as well as on intrinsic shape metrics. Major effort in shape calculus has been devoted to such surface expressions (cf. [14, 56]), which are often very tedious to derive. When one derives the shape derivative of an objective functional which is given by an integral over the domain, one first gets the volume formulation. This volume form can be converted into its surface form by integration by parts. In order to apply this formula, one needs higher regularity of the state and adjoint of the underlying PDE. Recently, it has been shown that the weak formulation has numerical advantages, see, for instance, [8, 19, 25, 48]. In [35], practical advantages of volume shape formulations have also been demonstrated. However, volume integral forms of shape derivatives require an outer metric on the domain surrounding the shape boundary. In contrast to inner metrics, which can be seen as describing a deformable material that the shape itself is made of, the differential operator governing an outer metric is defined even outside of the shape (cf., e.g., [6, 9, 30, 42]). In [54], both points of view are harmonized by deriving a metric from an outer metric. Based on this metric, efficient shape optimization algorithms, which also reduce the analytical effort so far involved in the derivation of shape derivatives, are proposed in [54, 55, 61]. The next subsection concentrates on the question of how shape calculus, and in particular shape derivatives, can be combined with geometric concepts of shape spaces. This combination results in efficient optimization techniques in shape spaces.
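The equivalence of volume and surface expressions can be observed numerically in the simplest case: for \(V(x)=x\) the divergence theorem gives \(\int _\varOmega \mathrm {div}\,V\,dx=\int _\varGamma \left< V,n\right> ds\). The sketch below evaluates both sides on a polygonal unit circle; the discretization is an arbitrary illustrative choice:

```python
import numpy as np

n = 2000
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
c = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # boundary Gamma (ccw)

# Volume form: int_Omega div V dx with V(x) = x equals 2 * |Omega| (shoelace area).
x, y = c[:, 0], c[:, 1]
area = 0.5 * np.abs(np.dot(x, np.roll(y, -1)) - np.dot(y, np.roll(x, -1)))
volume_form = 2.0 * area

# Surface form: int_Gamma <V, n> ds via edge midpoints and outward edge normals.
edges = np.roll(c, -1, axis=0) - c
midpoints = 0.5 * (c + np.roll(c, -1, axis=0))
# Rotating each edge vector by -90 degrees gives the outward normal of a ccw polygon,
# scaled by the edge length ds.
normals = np.stack([edges[:, 1], -edges[:, 0]], axis=1)
surface_form = np.sum(midpoints[:, 0] * normals[:, 0] + midpoints[:, 1] * normals[:, 1])

print(volume_form, surface_form)  # both close to 2*pi
```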
2.2 Shape Calculus Combined with Geometric Concepts of Shape Spaces
As pointed out in [51], shape optimization can be viewed as optimization on Riemannian shape manifolds, and the resulting optimization methods can be constructed and analyzed within this framework. This combines algorithmic ideas from [1] with the Riemannian geometrical point of view established in [4]. In this subsection, we analyze the connection of Riemannian geometry on the space of smooth shapes to shape optimization and extend the results of [62], which is also concerned with this connection, to a Riemannian shape Hessian and a second order optimization method. In particular, we specify the covariant derivative associated with the first Sobolev metric on the space of smooth shapes (cf. Theorem 2 in Sect. 2.2.1), which results in the definition of the Riemannian shape Hessian with respect to this metric (cf. Definition 5 in Sect. 2.2.2). The formulation of the covariant derivative and the shape Hessian opens the door to formulating higher order methods in the space of smooth shapes (cf. Algorithm 2 in Sect. 2.2.2).
2.2.1 The Space of Smooth Shapes
We first introduce the space of smooth shapes and summarize some of its properties which are relevant for this paper from the literature [3,4,5, 32, 41, 42]. First, we concentrate on one-dimensional shapes, which are defined as the images of simple closed smooth curves in the plane. Such simple closed smooth curves can be represented by embeddings from the unit circle \(S^1\) into the plane \({\mathbb {R}}^2\), see, for instance, [33]. Therefore, the set of all embeddings from \(S^1\) into \({\mathbb {R}}^2\), denoted by \(\mathrm {Emb}(S^1,{\mathbb {R}}^2)\), represents all simple closed smooth curves in \({\mathbb {R}}^2\). However, note that we are only interested in the shape itself and that images are not changed by re-parametrizations. Thus, all simple closed smooth curves which differ only by re-parametrizations can be considered equal to each other because they lead to the same image. Let \(\mathrm {Diff}(S^1)\) denote the set of all diffeomorphisms from \(S^1\) into itself. This set is a regular Lie group (cf. [32, Chap. VIII, 38.4]) and consists of all the smooth re-parametrizations mentioned above. In [41], the set of all one-dimensional shapes is characterized by
$$B_e(S^1,{\mathbb {R}}^2):=\mathrm {Emb}(S^1,{\mathbb {R}}^2)/\mathrm {Diff}(S^1),$$
i.e., the orbit space of \(\mathrm {Emb}(S^1,{\mathbb {R}}^2)\) under the action by composition from the right by the Lie group \(\mathrm {Diff}(S^1)\). A particular point on \(B_e(S^1,{\mathbb {R}}^2)\) is represented by a curve \( c:S^1\rightarrow {\mathbb {R}}^2 , \ \theta \mapsto c(\theta ) \) and illustrated in the left picture of Fig. 1. The tangent space is isomorphic to the set of all smooth normal vector fields along c, i.e.,
$$T_cB_e(S^1,{\mathbb {R}}^2)\cong \left\{ h:h=\alpha n,\ \alpha \in {\mathcal {C}}^\infty (S^1)\right\} , \qquad (9)$$
where n denotes the exterior unit normal field to the shape boundary c such that \(n (\theta )\perp c_\theta (\theta )\) for all \(\theta \in S^1\), where \(c_\theta =\frac{\partial c}{\partial \theta }\) denotes the circumferential derivative as in [41]. Since we are dealing with parametrized curves, we have to work with the arc length and its derivative. Therefore, we use the following notation for the arc length derivative and the arc length measure:
$$D_s:=\frac{1}{|c_\theta |}\frac{\partial }{\partial \theta },\qquad ds=|c_\theta |\,d\theta . \qquad (11)$$
Remark 2
Some properties of the operator \(D_s\) can be found in, e.g., [42]. In [4], this operator is considered for higher dimensions and its connection with the Bochner-Laplacian is given.
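The operator \(D_s\) is straightforward to discretize with periodic central differences. The sketch below checks it on a circle of radius 2, where \(D_s\) applied to \(\sin \theta \) yields \((\cos \theta )/2\); grid size and test curve are arbitrary illustrative choices:

```python
import numpy as np

n = 512
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
dtheta = 2.0 * np.pi / n

# Curve c: circle of radius 2, so |c_theta| = 2 everywhere.
c = 2.0 * np.stack([np.cos(theta), np.sin(theta)], axis=1)

def d_theta(f):
    """Periodic central difference in the curve parameter theta."""
    return (np.roll(f, -1, axis=0) - np.roll(f, 1, axis=0)) / (2.0 * dtheta)

speed = np.linalg.norm(d_theta(c), axis=1)   # |c_theta|, here approximately 2

def D_s(f):
    """Arc length derivative of a scalar field: D_s f = f_theta / |c_theta|."""
    return d_theta(f) / speed

err = np.max(np.abs(D_s(np.sin(theta)) - np.cos(theta) / 2.0))
print(err)  # very small
```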
In [32], it is proven that the shape space \(B_e(S^1,{\mathbb {R}}^2)\) is a smooth manifold. Is it perhaps even a Riemannian shape manifold? This question was investigated by Peter W. Michor and David Mumford. They show in [41] that the standard \(L^2\)-metric on the tangent space is too weak because the geodesic distance it induces vanishes identically. This phenomenon is called the vanishing geodesic distance phenomenon. The authors employ a curvature weighted \(L^2\)-metric as a remedy and prove that the vanishing phenomenon does not occur for this metric. Several Riemannian metrics on this shape space are examined in further publications, e.g., [4, 40, 42]. All these metrics arise from the \(L^2\)-metric by putting weights, derivatives or both into it. In this manner, we get three groups of metrics: the almost local metrics, which arise by putting weights into the \(L^2\)-metric (cf. [5, 42]), the Sobolev metrics, which arise by putting derivatives into the \(L^2\)-metric (cf. [4, 42]), and the weighted Sobolev metrics, which arise by putting both weights and derivatives into the \(L^2\)-metric (cf. [5]). Under suitable assumptions, none of these metrics induces the phenomenon of vanishing geodesic distance. Listing all of these assumptions goes beyond the scope of this paper, but they can be found in the above-mentioned publications. All Riemannian metrics mentioned above are inner metrics. As already mentioned, this means that the deformation is prescribed on the shape itself and the ambient space stays fixed.
In the following, we clarify briefly how the above-mentioned inner Riemannian metrics can be defined on the shape space \(B_e(S^1,{\mathbb {R}}^2)\). For details we refer to [42]. Moreover, we refer to [3] for a comparison of an inner metric on \(B_e(S^1,{\mathbb {R}}^2)\) with the diffeomorphic matching framework which works with outer metrics.
First, we define a Riemannian metric on the space \(\mathrm {Emb}(S^1,{\mathbb {R}}^2)\), i.e., a family \(g=\left( g_c(h,k)\right) _{c\in \mathrm {Emb}(S^1,{\mathbb {R}}^2)}\) of inner products \(g_c(h,k)\), where h and k denote vector fields along \(c\in \mathrm {Emb}(S^1,{\mathbb {R}}^2)\). The simplest inner product on the tangent bundle to \(\text {Emb}(S^1,{\mathbb {R}}^2)\) is the standard \(L^2\)-inner product \(g_c(h,k) := \int _{S^1}\left< h,k\right> ds\). Note that
and that a tangent vector \(h\in T_c\text {Emb}(S^1,{\mathbb {R}}^2)\) has an orthogonal decomposition into smooth tangential components \(h^\top \) and normal components \(h^\perp \) (cf. [41, Sect. 3, 3.2]). In particular, \(h^\perp \) is an element of the bundle of tangent vectors which are normal to the \(\text {Diff}(S^1)\)-orbits denoted by \({\mathcal {N}}_c\). This normal bundle is well defined and is a smooth vector subbundle of the tangent bundle. In [41], it is outlined how the restriction of the metric \(g_c\) to the subbundle \({\mathcal {N}}_c\) gives the quotient metric. The quotient metric induced by the \(L^2\)-metric is given by
$$g^0_c(h,k)=\int _{S^1}\alpha \beta \, ds,$$
where \(h=\alpha n\) and \(k=\beta n\) denote two elements of the tangent space \(T_cB_e(S^1,{\mathbb {R}}^2)\) given in (9). Unfortunately, in [41], it is shown that this \(L^2\)-metric induces vanishing geodesic distance, as already mentioned above.
For the following discussion, among all the above-mentioned Riemannian metrics, we pick the first Sobolev metric which does not induce the phenomenon of vanishing geodesic distance (cf. [42]). On \(B_e(S^1,{\mathbb {R}}^2)\) it is defined as follows:
Definition 3
(First Sobolev metric on \(B_e(S^1,{\mathbb {R}}^2)\)) The first Sobolev metric on \(B_e(S^1,{\mathbb {R}}^2)\) is given by
where \(A>0\) and \(D_s\) denotes the arc length derivative with respect to c defined in (11).
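A common coordinate expression of this metric for normal fields \(h=\alpha n\), \(k=\beta n\) (cf. [42]) is \(g^1_c(h,k)=\int _{S^1}(\alpha \beta +A\,D_s\alpha \,D_s\beta )\,ds\); integration by parts turns this into \(\int _{S^1}((I-AD_s^2)\alpha )\beta \,ds\), which is how the operator \(L_1=I-AD_s^2\) of Theorem 2 below enters. The following numerical sketch verifies this identity on the unit circle, where arc length coincides with \(\theta \); the grid size, the value of A and the test functions are arbitrary illustrative choices:

```python
import numpy as np

n = 1024
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
dtheta = 2.0 * np.pi / n
A = 0.1

# On the unit circle, arc length s coincides with theta, so D_s = d/dtheta.
def D_s(f):
    return (np.roll(f, -1) - np.roll(f, 1)) / (2.0 * dtheta)

alpha = np.sin(theta) + 0.3 * np.cos(2.0 * theta)   # arbitrary test functions
beta = np.cos(3.0 * theta)

# g^1(h, k) = int (alpha*beta + A * D_s(alpha) * D_s(beta)) ds
g1 = np.sum(alpha * beta + A * D_s(alpha) * D_s(beta)) * dtheta

# Equivalent form via L_1 = I - A * D_s^2 (integration by parts, periodic boundary).
L1_alpha = alpha - A * D_s(D_s(alpha))
g1_via_L1 = np.sum(L1_alpha * beta) * dtheta

print(g1, g1_via_L1)  # the two values agree up to rounding error
```

The agreement is exact in the discrete setting because the periodic central difference matrix is skew-symmetric, mirroring the anti-self-adjointness of \(D_s\).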
An essential operation in Riemannian geometry is the covariant derivative. In differential geometry, it is often written in terms of the Christoffel symbols. In [4], Christoffel symbols associated with the Sobolev metrics are provided. However, in order to provide a relation with shape calculus, another representation of the covariant derivative in terms of the Sobolev metric \(g^1\) is needed. Now, we get to the first main theorem of this paper. The Riemannian connection provided by this theorem makes it possible to specify the Riemannian shape Hessian.
Theorem 2
Let \(A>0\) and let \(h,m\in T_c\mathrm{Emb}(S^1,{\mathbb {R}}^2)\) denote vector fields along \(c\in \mathrm{Emb}(S^1,{\mathbb {R}}^2)\). The arc length derivative with respect to c is denoted by \(D_s\) as in (11). Moreover, \(L_1:= I-AD_s^2\) is a differential operator on \({\mathcal {C}}^\infty (S^1,{\mathbb {R}}^2)\) and \(L_1^{-1}\) denotes its inverse operator. The covariant derivative associated with the Sobolev metric \(g^1\) can be expressed as
where \(v=\frac{c_\theta }{|c_\theta |}\) denotes the unit tangent vector.
Remark 3
The inverse operator \(L_1^{-1}\) in Theorem 2 is an integral operator whose kernel has an expression in terms of the arc length distance between two points on a curve and their unit normal vectors (cf. [42]). For the existence and more details about \(L_1^{-1}\) we refer to [42].
Proof of Theorem 2
Let h, k, m be vector fields on \({\mathbb {R}}^2\) along \(c\in \text {Emb}(S^1,{\mathbb {R}}^2)\). Moreover, \(d(\cdot )[m]\) denotes the directional derivative in direction m. From [42, Sect. 4.2, formula (3)], we have
Applying (16), we obtain in analogy to the computations in [42, Sect. 4.2]
Since the differential operator \(D_s\) is anti-self-adjoint with respect to the \(L^2\)-metric \(g^0\), i.e.,
$$g^0_c(D_sh,k)=-g^0_c(h,D_sk),$$
we get from (17)
Now, we proceed analogously to the proof of Theorem 2.1 in [51], which exploits the product rule for Riemannian connections. Thus, we conclude from
that the covariant derivative associated with \(g^1\) is given by (15). \(\square \)
Remark 4
For the sake of completeness it should be mentioned that the shape space \(B_e(S^1,{\mathbb {R}}^2)\) and its theoretical results can be generalized to higher dimensions. Let M be a compact manifold and let N denote a Riemannian manifold with \(\text {dim}(M)<\text {dim}(N)\). In [40], the space of all submanifolds of type M in N is defined by
In Fig. 1, the left picture illustrates a two-dimensional shape which is an element of the shape space \(B_e(S^2,{\mathbb {R}}^3)\). In contrast, the second shape from left in this figure is a two-dimensional shape which is not an element of this shape space. Note that the vanishing geodesic distance phenomenon occurs also for the \(L^2\)-metric in higher dimensions as verified in [40]. For the definition of the Sobolev metric \(g^1\) in higher dimensions we refer to [4].
2.2.2 Optimization in the Space of Smooth Shapes
In the following, we focus on two Riemannian metrics on the space of smooth shapes \(B_e\), the first Sobolev metric \(g^1\) introduced in Sect. 2.2.1 and the Steklov–Poincaré metric \(g^S\) defined below. The aim of this subsection is to provide some optimization techniques in \((B_e,g^1)\) and \((B_e,g^S)\).
The subsection is structured in three paragraphs. The first paragraph considers the first Sobolev metric \(g^1\), where we first recall some relevant results from [62]. Afterwards, we build on the findings of the previous subsection and extend the results in [62]. More precisely, thanks to the specification of the covariant derivative associated with \(g^1\) in the previous subsection, we are able to define the Riemannian shape Hessian with respect to \(g^1\) and formulate the Newton method in \((B_e,g^1)\), which is based on this definition. As we will see below, if we consider Sobolev metrics, we have to deal with surface formulations of shape derivatives. An intermediate and equivalent result in the process of deriving these expressions is the volume expression, as already mentioned above. Volume expressions are often preferable to surface forms: they save analytical and programming effort, and they avoid the additional regularity assumptions usually required to transform volume into surface forms. However, in the case of the more attractive volume formulation, the shape manifold \(B_e\) and the corresponding inner product \(g^1\) are not appropriate. One possible approach to using volume forms is addressed in the second paragraph of this subsection, which considers Steklov–Poincaré metrics. We summarize some of the main results related to this metric from [54] with a view towards optimization methods. Finally, the third paragraph considers a specific example and concludes this subsection with a brief discussion of the two approaches resulting from the first Sobolev and the Steklov–Poincaré metric. Since this paper does not focus on numerical investigations, we pick an example which is already implemented in [54, 61] to illustrate the main differences between the two approaches.
Optimization based on first Sobolev metrics
We consider the Sobolev metric \(g^1\) on the shape space \(B_e\). In particular, this means that we consider elements of \(B_e\), i.e., smooth boundaries \(\varGamma \) of the domain \(\varOmega \) under consideration in the following. The Riemannian connection with respect to this metric, which is given in Theorem 2, makes it possible to specify the Riemannian shape Hessian of an optimization problem.
First, we detail the Riemannian shape gradient from [62]. Due to the Hadamard Structure Theorem, there exists a scalar distribution r on the boundary \(\varGamma \) of the domain \(\varOmega \) under consideration. If we assume \(r\in L^1(\varGamma )\), the shape derivative can be expressed more concisely as an integral over the boundary \(\varGamma \) of \(\varOmega \) (cf. (7)). The distribution r is often called the shape gradient in the literature. However, note that gradients always depend on the chosen scalar products or metrics defined on the space under consideration. If we want to optimize on a shape manifold, we have to find a representation of the shape gradient with respect to a Riemannian metric defined on the shape manifold under consideration. This representation is called the Riemannian shape gradient. In order to get an expression of the Riemannian shape gradient with respect to the Sobolev metric \(g^1\), we look at the isomorphism (9). Due to this isomorphism, a tangent vector \(h\in T_\varGamma B_e\) is given by \(h=\alpha n\) with \(\alpha \in {\mathcal {C}}^\infty (\varGamma )\). This leads to the following definition.
Definition 4
(Riemannian shape gradient with respect to the first Sobolev metric) A Riemannian representation of the shape derivative, i.e., the Riemannian shape gradient of a shape differentiable objective function J in terms of the first Sobolev metric \(g^1\), is given by
$$\text {grad}\,J=q\,n\quad \text {with}\quad (I-A D_s^2)\,q=r, \qquad (22)$$
where \(D_s\) is the arc length derivative with respect to \(\varGamma \in B_e\), \(A>0\), \(q\in {\mathcal {C}}^\infty (\varGamma )\) and r denotes the function in the shape derivative representation (7) for which we assume \(r\in {\mathcal {C}}^\infty (\varGamma )\).
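Assuming, consistently with the operator \(L_1=I-AD_s^2\) of Theorem 2, that the representation q is obtained by solving \((I-AD_s^2)q=r\), q can be computed spectrally on the unit circle, where the Fourier mode \(e^{ik\theta }\) is an eigenfunction of \(L_1\) with eigenvalue \(1+Ak^2\). In the sketch below, the density r is an arbitrary stand-in for an actual shape derivative; n and A are illustrative choices:

```python
import numpy as np

n = 256
theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
A = 0.05

# Arbitrary stand-in for the shape derivative density r on Gamma.
r = np.exp(np.cos(theta))

# Solve (I - A * D_s^2) q = r on the unit circle via the FFT:
# the mode exp(i*k*theta) is an eigenfunction with eigenvalue 1 + A*k^2.
k = np.fft.fftfreq(n, d=1.0 / n)          # integer wave numbers
q = np.real(np.fft.ifft(np.fft.fft(r) / (1.0 + A * k ** 2)))

# Residual check with a spectral second derivative.
q_ss = np.real(np.fft.ifft((1j * k) ** 2 * np.fft.fft(q)))
residual = np.max(np.abs(q - A * q_ss - r))
print(residual)  # close to machine precision
```

The smoothing character of \(L_1^{-1}\) mentioned in Remark 3 is visible here: high-frequency components of r are damped by the factor \(1/(1+Ak^2)\).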
Next, we specify the Riemannian shape Hessian with respect to the first Sobolev metric. It is based on the Riemannian connection \(\nabla \) related to the Sobolev metric \(g^1\) given in one of the main theorems of this paper, Theorem 2, as well as on the Riemannian shape gradient definition, Definition 4. In analogy to [1], we can define the Riemannian shape Hessian as follows:
Definition 5
(Riemannian shape Hessian with respect to the first Sobolev metric) Let \(\nabla \) be the covariant derivative associated with the Sobolev metric \(g^1\). The Riemannian shape Hessian with respect to the Sobolev metric \(g^1\) of a two times shape differentiable objective function J is defined as the linear mapping
$$T_\varGamma B_e\rightarrow T_\varGamma B_e,\quad h\mapsto \text {Hess}J(\varGamma )[h]:=\nabla _{h}\, \text {grad}\,J.$$
The Riemannian shape gradient and the Riemannian shape Hessian with respect to the Sobolev metric \(g^1\) are required to apply first and second order optimization methods in the shape space \((B_e,g^1)\). The gradient method is an example of a first order optimization method. If we apply the gradient method to (1) and consider \(g^1\) on \(B_e\), we need to compute the Riemannian shape gradient with respect to \(g^1\) from (22). The negative gradient is then used as descent direction for the objective functional J. An example of a second order method is the Newton method. If we apply the Newton method to (1), we need to solve, similarly to standard non-linear programming, the problem of finding \(\varGamma \in B_e\) with
$$\text {grad}\,J(\varGamma )=0,$$
where \(\varGamma \) denotes the boundary of \(\varOmega \) and \(\text {grad}J\) is the gradient with respect to \(g^1\) (cf. (22)).
In general, the calculations of optimization methods on manifolds have to be performed in tangent spaces. This means that points from a tangent space have to be mapped to the manifold in order to obtain a new iterate. More precisely, we take a given tangent vector, follow the geodesic starting at the corresponding point in the given direction, and travel a length determined by the optimization process. The computation of the Riemannian exponential map, which is the theoretically superior choice of such a mapping, is prohibitively expensive in most applications. However, in [1], it is shown that a so-called retraction, which is a first-order approximation of the exponential map, is sufficient.
Definition 6
(Retraction) A retraction on a manifold M is a smooth mapping \({\mathcal {R}}:TM\rightarrow M\) with the following properties:
-
(i)
\({\mathcal {R}}_p(0_p)=p\), where \({\mathcal {R}}_p\) denotes the restriction of \({\mathcal {R}}\) to \(T_pM\) and \(0_p\) denotes the zero element of \(T_pM\).
-
(ii)
\(d{\mathcal {R}}_p(0_p)=\text {id}_{T_pM}\), where \(\text {id}_{T_pM}\) denotes the identity mapping on \(T_pM\) and \(d{\mathcal {R}}_p(0_p)\) denotes the pushforward of \(0_p\in T_pM\) by \({\mathcal {R}}\).
For example, in \(B_e(S^1,{\mathbb {R}}^2)\), for sufficiently small perturbations \(\alpha \in {\mathcal {C}}^\infty (S^1)\), a retraction \({\mathcal {R}}\) is defined by
$$\begin{aligned} {\mathcal {R}}_c:T_cB_e(S^1,{\mathbb {R}}^2)\rightarrow B_e(S^1,{\mathbb {R}}^2),\quad \eta _c\mapsto c+\eta _c, \end{aligned}$$
where \(\eta _c\in T_cB_e(S^1,{\mathbb {R}}^2)\) and \( c+\eta _c:S^1 \rightarrow {\mathbb {R}}^2,\, \theta \mapsto c(\theta )+\alpha (\theta )n(c(\theta )) \).
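For a polygonal discretization of a curve \(c\in B_e(S^1,{\mathbb {R}}^2)\), this retraction is a node-wise move along the outward unit normal. A small sketch (illustrative; the normal is computed from periodic central differences and the function name is ours):

```python
import numpy as np

def retract(c, alpha):
    """Retraction on a discretized closed plane curve:
    move each node c(theta) by alpha(theta) along the outward unit normal.
    c: (n, 2) array of nodes (counter-clockwise), alpha: (n,) array."""
    t = np.roll(c, -1, axis=0) - np.roll(c, 1, axis=0)   # central difference
    t /= np.linalg.norm(t, axis=1, keepdims=True)        # unit tangent
    nrm = np.stack([t[:, 1], -t[:, 0]], axis=1)          # outward normal (ccw)
    return c + alpha[:, None] * nrm

# usage: retracting the unit circle along alpha = 0.1 yields a circle of radius 1.1
theta = np.linspace(0.0, 2.0 * np.pi, 100, endpoint=False)
circle = np.stack([np.cos(theta), np.sin(theta)], axis=1)
new_curve = retract(circle, alpha=0.1 * np.ones(100))
```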
Now, we are able to formulate the gradient method in \((B_e,g^1)\) (cf. Algorithm 1). Thanks to the definition of the Riemannian shape Hessian \(\text {Hess}J\) in \((B_e,g^1)\), which is based on the Riemannian connection \(\nabla \) given in (15), we can also formulate the Newton method in \((B_e,g^1)\) (cf. Algorithm 2). Both algorithms require the function r of the surface shape derivative (cf. (7)); it is needed to compute the shape gradient with respect to \(g^1\) in each iteration. In Algorithm 2, both the Riemannian shape gradient and the Riemannian shape Hessian with respect to \(g^1\) are required. For a PDE constrained shape optimization problem, the Newton method can be applied to find stationary points of the Lagrangian of the optimization problem, which leads to the Lagrange–Newton method.
Optimization based on Steklov–Poincaré metrics
Gradients with respect to \(g^1\) are based on surface expressions of shape derivatives, as can be seen in (22), where r is the function in the surface shape derivative representation (7). As outlined at the beginning of this section, volume expressions are preferable to surface forms. One possible approach to using volume forms is to consider Steklov–Poincaré metrics \(g^S\) (cf. [54]). In the following, we summarize some of the main results related to this metric from [54] with a view towards first order optimization approaches in the space of smooth shapes. In order to formulate higher order methods in \((B_e,g^S)\), an explicit expression of the covariant derivative with respect to this metric is necessary. The derivation of such an expression, as well as the formulation and investigation of higher order methods in \((B_e,g^S)\), is beyond the scope of this paper and left for future work.
In the following, we need to deal with Lipschitz boundaries. Since there are several competing conditions which are used to define a Lipschitz boundary, we first specify its definition:
Definition 7
(\({\mathcal {C}}^{k,r}\)-boundary, Lipschitz boundary) Let \(\varOmega \subset {\mathbb {R}}^d\) be open with boundary \(\varGamma =\partial \varOmega \). Moreover, let \(k\in \overline{{\mathbb {N}}}\) and \( {\mathcal {C}}^{k,r}({\overline{\varOmega }})\) denote the set of \({\mathcal {C}}^k\)-functions which are Hölder-continuous with exponent \(r\in [0,1]\). Further, \(B_d(x,R)\) denotes the ball in \({\mathbb {R}}^d\) centered at \(x\in {\mathbb {R}}^d\) with radius \(R>0\). We say \(\varOmega \) has a \({\mathcal {C}}^{k,r}\)-boundary or \(\varOmega \) is \({\mathcal {C}}^{k,r}\) if for any \(x\in \varGamma \) there exist local coordinates \(y_1,\ldots ,y_d\) centered at x, i.e., such that x is the unique solution of \(y_1=\dots =y_d=0\), and constants \(a,b>0\) as well as a mapping \(\psi \in {\mathcal {C}}^{k,r}( B_{d-1}(x,a))\), where \(B_{d-1}(x,a)\) is considered in the linear subspace defined by \((y_1,\ldots ,y_{d-1})\), subject to the following conditions:
-
(i)
\(y_d=\psi ({\widetilde{y}}) \Rightarrow ({\widetilde{y}},y_d)\in \varGamma \),
-
(ii)
\(\psi ({\widetilde{y}})<y_d<\psi ({\widetilde{y}})+b \Rightarrow ({\widetilde{y}},y_d)\in \varOmega \),
-
(iii)
\(\psi ({\widetilde{y}})-b<y_d<\psi ({\widetilde{y}}) \Rightarrow ({\widetilde{y}},y_d)\not \in {\overline{\varOmega }}\).
Definition 8
(Steklov–Poincaré metric) Let \(\varOmega \subset X\subset {\mathbb {R}}^d\) be a compact domain with \(\varOmega \ne \emptyset \) and Lipschitz boundary \(\varGamma :=\partial \varOmega \), where X denotes a bounded domain with Lipschitz boundary \(\varGamma _\text {out}:=\partial X\). The Steklov–Poincaré metric is given by
$$\begin{aligned} g^S:H^{1/2}(\varGamma )\times H^{1/2}(\varGamma )\rightarrow {\mathbb {R}},\quad (\alpha ,\beta )\mapsto \int _\varGamma \alpha (s)\left[ (S^{pr})^{-1}\beta \right] (s)\, ds. \end{aligned}$$
Here \(S^{pr}\) denotes the projected Poincaré–Steklov operator, which is given by
$$\begin{aligned} S^{pr}:H^{-1/2}(\varGamma )\rightarrow H^{1/2}(\varGamma ),\quad \alpha \mapsto (\gamma _0 U)^\top n, \end{aligned}$$
where \(\gamma _0:H^1_0(X,{\mathbb {R}}^d) \rightarrow H^{1/2}(\varGamma ,{\mathbb {R}}^d)\) denotes the trace operator, and \(U\in H^1_0(X,{\mathbb {R}}^d)\) solves the Neumann problem
$$\begin{aligned} a(U,V)=\int _\varGamma \alpha \, (\gamma _0 V)^\top n \, ds \quad \forall \, V\in H^1_0(X,{\mathbb {R}}^d) \end{aligned}$$
with \(a(\cdot ,\cdot )\) being a symmetric and coercive bilinear form.
Remark 5
Note that a Steklov–Poincaré metric depends on the choice of the bilinear form. Thus, different bilinear forms lead to various Steklov–Poincaré metrics.
Next, we state the connection of \(B_e\) equipped with the Steklov–Poincaré metric \(g^S\) to shape calculus. As already mentioned, the shape derivative can be expressed as the surface integral (7) due to the Hadamard Structure Theorem, and it can be written more concisely (cf. (21)). Due to isomorphism (9) and expression (21), this connection can be made precise as follows.
Definition 9
(Shape gradient with respect to Steklov–Poincaré metric) Let \(r\in {\mathcal {C}}^\infty (\varGamma )\) denote the function in the shape derivative expression (7). Moreover, let \(S^{pr}\) be the projected Poincaré–Steklov operator and let \(\gamma _0\) be as in Definition 8. A representation \(h\in T_{\varGamma } B_e\cong {\mathcal {C}}^\infty (\varGamma )\) of the shape gradient in terms of \(g^S\) is determined by
$$\begin{aligned} g^S(\phi ,h)=(r,\phi )_{L^2(\varGamma )} \quad \forall \, \phi \in {\mathcal {C}}^\infty (\varGamma ), \end{aligned}$$
which is equivalent to
$$\begin{aligned} \int _\varGamma \phi (s)\left[ (S^{pr})^{-1}h\right] (s)\, ds=\int _\varGamma \phi (s)\, r(s)\, ds \quad \forall \, \phi \in {\mathcal {C}}^\infty (\varGamma ). \end{aligned}$$
Remark 6
In Definition 9, the isomorphism \(T_{\varGamma } B_e\cong {\mathcal {C}}^\infty (\varGamma )\) is used. It is worth mentioning that, for example, identifying \(\varGamma \) with the corresponding embedding of the circle leads to this isomorphism. In particular, attention should be paid to (9) and (12).
Now that the shape gradient with respect to the Steklov–Poincaré metric is defined, optimization methods in \(B_e\) which involve volume formulations of shape derivatives can be formulated. From (32) we get \(h=S^{pr}r=(\gamma _0 U)^\top n\), where \(U\in H^1_0(X,{\mathbb {R}}^d)\) solves
$$\begin{aligned} a(U,V)=\int _\varGamma r\, (\gamma _0 V)^\top n \, ds \quad \forall \, V\in H^1_0(X,{\mathbb {R}}^d) \end{aligned}$$
with \(a(\cdot ,\cdot )\) being the symmetric and coercive bilinear form on \(H_0^1(X,{\mathbb {R}}^d) \times H_0^1(X,{\mathbb {R}}^d)\) of the Steklov–Poincaré metric definition (cf. (30)). The identity (33) opens the door to considering volume expressions of shape derivatives to compute the shape gradient with respect to \(g^S\). In order to compute the shape gradient, we have to solve
$$\begin{aligned} a(U,V)=b(V) \quad \forall \, V\in H^1_0(X,{\mathbb {R}}^d) \end{aligned}$$
with \(b(\cdot )\) being a linear form given by
$$\begin{aligned} b(V):=DJ_\text {vol}(\varOmega )[V]+DJ_\text {surf}(\varOmega )[V]. \end{aligned}$$
Here \(J_\text {surf}(\varOmega )\) denotes the parts of the objective function leading to surface shape derivative expressions, e.g., perimeter regularizations, which are incorporated as Neumann boundary conditions in equation (34). Parts of the objective function leading to volume shape derivative expressions are denoted by \(J_\text {vol}(\varOmega )\). The bilinear form \(a(\cdot ,\cdot )\) can be chosen, e.g., as the weak form of the linear elasticity equation. More details can be found below, in the paragraph about the comparison of Algorithms 1 and 3.
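The computational pattern behind (34) is: assemble one coercive bilinear form on X, assemble b from the shape derivative, and perform a single linear solve whose solution is the deformation field. The sketch below is a strongly simplified stand-in: X is the unit square, the elasticity form is replaced by the \(H^1_0\) inner product (a finite difference Laplacian with zero Dirichlet data on the outer boundary), and the right-hand side is a hypothetical smooth volume density rather than an actual shape derivative.

```python
import numpy as np

m = 20                                   # interior grid points per direction
h = 1.0 / (m + 1)
T = 2.0 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)
# 2D Laplacian on the unit square via Kronecker products (stand-in for a(.,.))
A = (np.kron(np.eye(m), T) + np.kron(T, np.eye(m))) / h**2

# stand-in linear form b: smooth density concentrated near an interior
# "interface" (hypothetical data, not a real shape derivative)
x = np.linspace(h, 1.0 - h, m)
xx, yy = np.meshgrid(x, x)
b = np.exp(-50.0 * ((xx - 0.5) ** 2 + (yy - 0.5) ** 2)).ravel()

# one linear solve yields gradient representation and deformation field at once
U = np.linalg.solve(A, b)
```

A real implementation would replace the Laplacian by an assembled linear elasticity stiffness matrix and b by the discretized shape derivative.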
Remark 7
Note that it is not ensured that \(U\in H^1_0(X,{\mathbb {R}}^d)\) is \({\mathcal {C}}^\infty \). Thus, \(h=S^{pr}r=(\gamma _0 U)^\top n\) is not necessarily an element of \(T_{\varGamma }B_e\). However, under suitable assumptions on the coefficients of the second-order partial differential operator and on the right-hand side of the PDE, a weak solution \(U\in H^1_0\) is in fact \({\mathcal {C}}^\infty \) (cf. [17, Sect. 6.3, Theorem 6]).
Thanks to the definition of the gradient with respect to \(g^S\), we are able to formulate the gradient method on \((B_e,g^S)\) (cf. Algorithm 3). We compute the Riemannian shape gradient with respect to \(g^S\) from (34). The negative solution \(-U\) is then used as descent direction for the objective functional J. In order to be in line with the above theory, it is assumed in Algorithm 3 that in each iteration k, the shape \(\varGamma ^k\) is a subset of a general surrounding space X, which is assumed to be a bounded domain with Lipschitz boundary, as illustrated in Fig. 2.
Comparison of Algorithm 1 and Algorithm 3
We conclude this section with a brief discussion of Algorithms 1 and 3. Since this paper does not focus on numerical investigations, we pick an example which is already implemented in [54, 61]. We briefly summarize numerical results observed in [54, 61] in order to illustrate how the algorithms work. Additionally, we discuss the main differences between the two approaches.
Let \(\varOmega \subset X\subset {\mathbb {R}}^2\) be a domain with \( \partial \varOmega = \varGamma \), where X denotes a bounded domain with Lipschitz-boundary \(\varGamma _\text {out}:=\partial X\). In contrast to the outer boundary \(\varGamma _\text {out}\), which is assumed to be fixed and partitioned in \(\varGamma _\text {out}:=\varGamma _{\text {bottom}} \sqcup \varGamma _{ \text {left}} \sqcup \varGamma _{\text {right}} \sqcup \varGamma _{\text {top}}\) (here, \(\sqcup \) denotes the disjoint union), the inner boundary \(\varGamma \), which is also called the interface, is variable. Let the interface \(\varGamma \) be an element of \(B_e(S^1,{\mathbb {R}}^2)\). Note that X depends on \(\varGamma \). Thus, we denote it by \(X(\varGamma )\). Figure 2 illustrates this situation. We consider the following parabolic PDE constrained interface problem (cf. [54, 61]):
with
denoting a jumping coefficient, n being the unit outward normal vector to \(\varOmega \) and \({\bar{y}}\in H^1(X(\varGamma ))\) representing data measurements. The second term in the objective function (36) is a perimeter regularization with \(\mu >0\). Note that formulation (37) of the PDE has to be understood only formally because of the jumping coefficient k.
In order to solve the shape optimization problem (36)–(40), we first need to solve the underlying PDE, the so-called state equation. The solution of the parabolic boundary value problem (37)–(40) is obtained by discretizing its weak formulation with standard linear finite elements in space and an implicit Euler scheme in time. The diffusion parameter k is discretized as a piecewise constant function. Figure 3 illustrates an example initial shape geometry, where the domain is discretized with a fine and a coarse finite element mesh. Besides the underlying PDE, we also need to solve the corresponding adjoint problem to the shape optimization problem (36)–(40), which is given in our example by
and which can be discretized in the same way as the state equation.
Remark 8
In general, the solutions of the state and the adjoint equation are needed in Algorithms 1 and 3 because they are part of the shape derivative of the objective functional.
We use the retraction given in (25) in order to update the shapes according to Algorithm 1 (cf. (26)) and Algorithm 3 (cf. (35)), respectively. This retraction is closely related to the perturbation of identity defined on the domain X. Given a starting shape \(\varGamma ^k\) in the k-th iteration of Algorithm 3, the perturbation of identity acting on the domain X in the direction \(U^k\), where \(U^k\) solves (34), gives
i.e., the vector field \(U^k\) weighted by a step size \(t^k\) is added as a deformation to all nodes in the finite element mesh. The field \(U^k\) is also called a mesh deformation (field). Here, the volume form allows us to optimize directly over the domain X containing \(\varGamma ^k\in B_e\). It is worth mentioning that, in practice, we are only interested in the deformation on X because we need to update the finite element mesh after each iteration. The update of the shape \(\varGamma ^k\) itself is contained in this deformation field. In contrast to Algorithm 3, Algorithm 1 can only work with surface shape derivative expressions. These surface formulations would give us descent directions (in normal direction) for \(\varGamma ^k\) only, which would not help us to move mesh elements around the shape. Additionally, when working with a surface shape derivative, we need to solve another PDE in order to obtain a mesh deformation in the ambient space X. This issue is addressed in more detail below.
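The shape and mesh update (35) is then a plain node-wise operation; a minimal sketch with a hypothetical three-node mesh and deformation field (illustrative values only):

```python
import numpy as np

def perturb_identity(nodes, U, t):
    """Perturbation of identity: move every finite element node by the
    step-size-weighted deformation field, nodes + t * U."""
    return nodes + t * U

# usage with a toy three-node mesh
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])
U = np.array([[0.0, 0.0], [0.0, 0.0], [0.0, 0.1]])   # deformation field
new_nodes = perturb_identity(nodes, U, t=0.5)        # only the top node moves
```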
Both approaches follow roughly the same steps, with a major difference in the way the mesh deformation is computed. For convenience, we summarize one optimization iteration and the main aspects of the two approaches:
-
1.
Solve the state and adjoint equation.
-
2.
Compute the mesh deformation:
-
Algorithm 3 The computation of a representation of the shape gradient with respect to the chosen inner product of the tangent space is moved into the mesh deformation itself. In particular, we get the gradient representation and the mesh deformation all at once from (33), which is very attractive from a computational point of view. The bilinear form \(a(\cdot ,\cdot )\) in (34) serves both as an inner product and as a means of computing the mesh deformation, so that only one linear system has to be solved. In practice, the bilinear form \(a(\cdot ,\cdot )\) in (34) is chosen as the weak form corresponding to the linear elasticity equation, i.e.,
$$\begin{aligned} a(U,V)= \int _{X(\varGamma )} \sigma (U):\varepsilon (V) \, dx, \end{aligned}$$where : denotes the sum of the component-wise products and \(\sigma \), \(\varepsilon \) are the so-called stress and strain tensors, respectively. In strong form, (34) is given by
$$\begin{aligned} \text {div}( \sigma )&= f^\text {elas} \quad \text {in} \quad X(\varGamma ) \end{aligned}$$ (46)

$$\begin{aligned} U&= 0 \quad \text {on} \quad \varGamma _\text {out} \end{aligned}$$ (47)

Here, the source term \(f^\text {elas} \) in its weak form is given by the shape derivative parts in volume form, where parts of the objective function leading to surface expressions only, such as, for instance, the perimeter regularization, are incorporated in Neumann boundary conditions. In our example, the shape derivative is given by
$$\begin{aligned} \begin{aligned} DJ(\varGamma )[V] =&\int _{0}^{T}\int _{X(\varGamma )}-k\nabla y^T\left( \nabla V+\nabla V^T\right) \nabla p-p\nabla f^T V\\&+\mathrm {div}(V)\left( \frac{1}{2}(y-{\overline{y}})^2+\frac{\partial y}{\partial t}p+k\nabla y^T\nabla p-fp\right) dxdt\\&+\int _{\varGamma }\kappa \left<V,n\right>ds \end{aligned} \end{aligned}$$where \(\kappa \) denotes the mean curvature of \(\varGamma \) and \(y, \,p\) denote the solution of the state and adjoint equation, respectively (cf. [53]).
-
Algorithm 1 First, a representation of the shape gradient on \(\varGamma \) with respect to the Sobolev metric \(g^1\) as given in (22) needs to be computed by solving
$$\begin{aligned} (I-AD^2_s)qn=rn. \end{aligned}$$ (48)

In our example, r is given by , where \(\kappa \) denotes the mean curvature of \(\varGamma \), the jump symbol is defined on the interface \(\varGamma \) by , \(y_{1} := \text {tr}_{\text {out}}(y\vert _{X\setminus {\overline{\varOmega }}})\) with y denoting the solution of the state equation, \(p_2 := \text {tr}_{\text {in}}(p \vert _{\varOmega })\) with p denoting the solution of the adjoint equation, and \(\text {tr}_{\text {in}}:\varOmega \rightarrow \varGamma \) and \(\text {tr}_{\text {out}}:X\setminus {\overline{\varOmega }} \rightarrow \varGamma \) are trace operators (cf. [53]). In order to compute a mesh deformation field, we need to solve a further PDE. In practice, this further PDE is again equation (46)–(47) but modified as follows: the Dirichlet boundary condition
$$\begin{aligned} U = U^\text {surf} \quad \text {on} \quad \varGamma \end{aligned}$$

is added to (46)–(47), where \(U^\text {surf}\) is the representation of the shape gradient with respect to the Sobolev metric \(g^1\); the source term \(f^\text {elas}\) is set to zero.
-
-
3.
Apply the resulting deformation to the current finite element mesh, and go to the next iteration.
Remark 9
The stress and strain tensors in (46) are defined by \(\sigma :=\lambda \text {tr}(\varepsilon ) I + 2 \mu \varepsilon \) and \(\varepsilon := \frac{1}{2}\left( \nabla U + \nabla U^\top \right) \), where \(\lambda \) and \(\mu \) denote the so-called Lamé parameters. The Lamé parameters do not need to have a physical meaning here; it is rather essential to understand their effect on the mesh deformation. They can be expressed in terms of Young's modulus E and Poisson's ratio \(\nu \) as \(\lambda = \frac{\nu E}{(1+\nu )(1-2\nu )} ,\, \mu = \frac{E}{2(1+\nu )}\). Young's modulus E specifies the stiffness of the material, which makes it possible to control the step size for the shape update, and Poisson's ratio \(\nu \) controls how much the mesh expands in the remaining coordinate directions when compressed in one particular direction.
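The parameter conversion in Remark 9 can be wrapped in a small helper; the values E = 1 and ν = 0.3 below are illustrative only:

```python
def lame_parameters(E, nu):
    """Lame parameters from Young's modulus E and Poisson's ratio nu:
    lambda = nu*E / ((1+nu)(1-2nu)),  mu = E / (2(1+nu))."""
    lam = nu * E / ((1.0 + nu) * (1.0 - 2.0 * nu))
    mu = E / (2.0 * (1.0 + nu))
    return lam, mu

lam, mu = lame_parameters(E=1.0, nu=0.3)
```

Inverting the two formulas recovers \(E = \mu (3\lambda +2\mu )/(\lambda +\mu )\), a quick consistency check when tuning the mesh stiffness.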
Besides saving analytical effort in the calculation of the shape derivative, Algorithm 3 is computationally more efficient than Algorithm 1. The optimization algorithm based on domain shape derivative expressions (Algorithm 3) can be applied to very coarse meshes with approximately 100,000 cells (cf. right picture of Fig. 3). This is due to the fact that there is no dependence on normal vectors as in the case of surface shape gradients, which are needed in Algorithm 1. In [54, 61], the convergence of the gradient method for the surface and volume shape derivative formulations is investigated for the parabolic shape interface problem. It can be observed that the representation of the shape gradient with respect to \(g^1\) seems to require fewer iterations to converge than the domain-based formulation. Yet, the domain-based form is computationally more attractive since it also works for much coarser discretizations. This can be seen in a comparison of the two meshes in Fig. 3. In particular, the mesh in the left picture of Fig. 3 shows how fine the mesh must be for the surface gradient (Algorithm 1) to lead to reasonable convergence, whereas the coarse grid in Fig. 3 works only for the domain-based formulation (Algorithm 3).
We can conclude that Algorithm 3 is very attractive from a computational point of view. However, the shape space \(B_e\), containing only smooth shapes, unnecessarily limits the application of this algorithm. More precisely, numerical investigations have shown that the optimization techniques also work on shapes with kinks in the boundary (cf. [52, 54, 55]). This means that Algorithm 3 is not limited to elements of \(B_e\), and another shape space definition is required. Thus, in [54], the definition of smooth shapes is extended to so-called \(H^{1/2}\)-shapes. In the next section, it is clarified what we mean by \(H^{1/2}\)-shapes. However, only a first attempt at a definition is given in [54]. From a theoretical point of view, there are several open questions about this shape space. The most important one concerns its structure: without knowing the structure, we cannot get control over the space. Moreover, the definition of this shape space has to be adapted and refined. The next section is concerned with the novel space of \(H^{1/2}\)-shapes and in particular with its structure.
3 The Shape Space \({\mathcal {B}}^{\mathbf {1/2}}\)
The Steklov–Poincaré metric correlates shape gradients with \(H^1\)-deformations. Under special assumptions, these deformations give shapes of class \(H^{1/2}\), which are defined below. As already mentioned above, the shape space \(B_e\) unnecessarily limits the application of the methods discussed in the previous section. Thus, this section aims at a generalization of smooth shapes to shapes which arise naturally in shape optimization problems. In the setting of \(B_e\), shapes can be considered as images of embeddings. From now on, we have to think of shapes as boundary contours of deforming objects. Therefore, we need another shape space. In this section, we define the space of \(H^{1/2}\)-shapes and clarify that it carries the structure of a diffeological space.
First, we not only define diffeologies and related objects but also explain the difference between diffeological spaces and manifolds (Sect. 3.1). In particular, we formulate the second main theorem of this paper, Theorem 3. Afterwards, the space of \(H^{1/2}\)-shapes is defined (Sect. 3.2). In the third main theorem, Theorem 4, we see that it is a diffeological space.
3.1 A Brief Introduction to Diffeological Spaces
In this subsection, we define diffeologies and related objects. Moreover, we clarify the difference between manifolds and diffeological spaces. For a detailed introduction to diffeological spaces we refer to [27].
3.1.1 Definitions
We start with the definition of a diffeological space and related objects like a diffeology, with which a diffeological space is equipped, and plots, which are the elements of a diffeology. Afterwards, we consider subset and quotient diffeologies. These two objects are required in the main theorem of Sect. 3.2. The definitions and theorems in this Sect. 3.1.1 are summarized from [27].
Definition 10
(Parametrization, diffeology, diffeological space, plots) Let Y be a non-empty set. A parametrization in Y is a map \(U\rightarrow Y\), where U is an open subset of \({\mathbb {R}}^n\). A diffeology on Y is any set \(D_Y\) of parametrizations in Y such that the following three axioms are satisfied:
-
(i)
Covering Any constant parametrization \({\mathbb {R}}^n\rightarrow Y\) is in \(D_Y\).
-
(ii)
Locality Let P be a parametrization in Y, where \(\text {dom}(P)\) denotes the domain of P. If, for all \(r\in \text {dom}(P)\), there is an open neighborhood V of r such that the restriction \(P|_V\in D_Y\), then \(P \in D_Y\).
-
(iii)
Smooth compatibility Let \(p:O\rightarrow Y\) be an element of \(D_Y\), where O denotes an open subset of \({\mathbb {R}}^n\). Moreover, let \(q:O'\rightarrow O\) be a smooth map in the usual sense, where \(O'\) denotes an open subset of \({\mathbb {R}}^m\). Then \(p\circ q\in D_Y\) holds.
A non-empty set Y together with a diffeology \(D_Y\) on Y is called a diffeological space and denoted by \((Y,D_Y)\). The parametrizations \(p\in D_Y\) are called plots of the diffeology \(D_Y\). If a plot \(p\in D_Y\) is defined on \(O\subset {\mathbb {R}}^n\), then n is called the dimension of the plot and p is called n-plot.
The literature contains many examples of diffeologies, e.g., the diffeology of the circle, the square, the set of smooth maps, etc. For these we refer to [27].
Remark 10
A diffeology as a structure and a diffeological space as a set equipped with a diffeology are distinguished only formally. Every diffeology on a set contains the underlying set as the set of non-empty 0-plots (cf. [27]).
Next, we want to connect diffeological spaces. This is possible through smooth maps between two diffeological spaces.
Definition 11
(Smooth map between diffeological spaces, diffeomorphism) Let \((X,D_X),(Y,D_Y)\) be two diffeological spaces. A map \(f:X\rightarrow Y\) is smooth if for each plot \(p\in D_X\), \(f\circ p\) is a plot of \( D_Y \), i.e., \(f\circ D_X\subset D_Y\). If f is bijective and if both, f and its inverse \(f^{-1} \), are smooth, f is called a diffeomorphism. In this case, \((X,D_X)\) is called diffeomorphic to \((Y,D_Y)\).
One of the most striking properties of the class of diffeological spaces is its stability under almost all set constructions, e.g., the subset, quotient, functional or powerset diffeology. In the following, we concentrate on the subset and the quotient diffeology. These concepts are required in the proof of the main theorem in the next subsection.
Subset diffeology Every subset of a diffeological space carries a natural subset diffeology, which is defined by the pullback of the ambient diffeology by the natural inclusion.
Before we can construct the subset diffeology, we have to clarify the natural inclusion and the pullback. For two sets A, B with \(A\subset B\), the (natural) inclusion is given by \(\iota _A:A\rightarrow B\), \(x\mapsto x\). The pullback is defined as follows:
Theorem and Definition 12
(Pullback) Let X be a set and \((Y,D_Y)\) be a diffeological space. Moreover, \(f:X\rightarrow Y\) denotes some map.
-
(i)
There exists a coarsest diffeology of X such that f is smooth. This diffeology is called the pullback of the diffeology \(D_Y\) by f and is denoted by \(f^*(D_Y)\).
-
(ii)
Let p be a parametrization in X. Then \(p\in f^*(D_Y)\) if and only if \(f\circ p\in D_Y\).
Proof
See [27, Chap. 1, 1.26]. \(\square \)
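Criterion (ii) is purely set-theoretic, which makes it easy to mimic in code. The sketch below uses a decidable toy stand-in for a diffeology on \(Y={\mathbb {R}}\): a sampled parametrization counts as a "plot" when it agrees with some cubic polynomial on a fixed grid. This is of course not a genuine diffeology (it violates the axioms); it only serves to make the pullback criterion executable.

```python
import numpy as np

GRID = np.linspace(-1.0, 1.0, 50)

def D_Y(p):
    """Toy 'plot' predicate on Y = R: p is accepted iff it coincides with
    a polynomial of degree <= 3 on the sample grid (illustration only)."""
    vals = np.array([p(t) for t in GRID])
    coeffs = np.polyfit(GRID, vals, deg=3)
    return bool(np.allclose(np.polyval(coeffs, GRID), vals, atol=1e-8))

def pullback(D, f):
    """Criterion (ii): a parametrization p lies in f*(D) iff f∘p lies in D."""
    return lambda p: D(lambda t: f(p(t)))

f = lambda y: 2.0 * y          # some map X -> Y
fD = pullback(D_Y, f)          # the pulled-back predicate on X
```

With this encoding, the subset diffeology defined below is simply the pullback by the inclusion map.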
The construction of subset diffeologies is related to so-called inductions.
Definition 13
(Induction) Let \((X,D_X),(Y,D_Y)\) be diffeological spaces. A map \(f:X\rightarrow Y\) is called induction if f is injective and \(f^*(D_Y)=D_X\), where \(f^*(D_Y)\) denotes the pullback of the diffeology \(D_Y\) by f.
An illustration of an induction as well as criteria for being an induction can be found in [27, Chap. 1, 1.31].
Now, we are able to define the subset diffeology (cf. [27]).
Theorem and Definition 14
(Subset diffeology) Let \((X,D_X)\) be a diffeological space and let \(A\subset X\) be a subset. Then A carries a unique diffeology \(D_A\), called the subset or induced diffeology, such that the inclusion map \(\iota _A:A\rightarrow X\) becomes an induction, namely, \(D_A=\iota _A^*(D_X)\). We call \((A,D_A)\) the diffeological subspace of \((X,D_X)\).
Quotient diffeology
Just as every subset of a diffeological space inherits the subset diffeology, every quotient of a diffeological space carries a natural quotient diffeology, defined by the pushforward of the diffeology of the source space to the quotient by the canonical projection.
First, we have to clarify the canonical projection. For a set X and an equivalence relation \(\sim \) on X, the canonical projection is defined as \(\pi :X\rightarrow X/\sim \), \(x\mapsto [x]\), where \([x]:= \{x'\in X:x\sim x'\}\) denotes the equivalence class of x with respect to \(\sim \). Moreover, the pushforward has to be defined:
Theorem and Definition 15
(Pushforward) Let \((X,D_X)\) be a diffeological space and Y be a set. Moreover, \(f:X\rightarrow Y\) denotes a map.
-
(i)
There exists a finest diffeology of Y such that f is smooth. This diffeology is called the pushforward of the diffeology \(D_X\) by f and is denoted by \(f_*(D_X)\).
-
(ii)
A parametrization \(p:U\rightarrow Y\) lies in \(f_*(D_X)\) if and only if every point \(x\in U\) has an open neighbourhood \(V\subset U\) such that \(p|_V\) is constant or of the form \(p|_V=f\circ q\) for some plot \(q:V\rightarrow X\) with \(q\in D_X\).
Proof
See [27, Chap. 1, 1.43]. \(\square \)
Remark 11
If a map f from a diffeological space \((X,D_X)\) into a set Y is surjective, then \(f_*(D_X)\) consists precisely of the plots \(p:U\rightarrow Y\) which locally are of the form \(f\circ q\) for plots \(q\in D_X\) since those already contain the constant parametrizations.
The construction of quotient diffeologies is related to so-called subductions.
Definition 16
(Subduction) Let \((X,D_X),(Y,D_Y)\) be diffeological spaces. A map \(f:X\rightarrow Y\) is called subduction if f is surjective and \(f_*(D_X)=D_Y\), where \(f_*(D_X)\) denotes the pushforward of the diffeology \(D_X\) by f.
An illustration of a subduction as well as criteria for being a subduction can be found in [27, Chap. 1, 1.48].
Now, we can define the quotient diffeology (cf. [27]).
Theorem and Definition 17
(Quotient diffeology) Let \((X,D_X)\) be a diffeological space and \(\sim \) be an equivalence relation on X. Then the quotient set \(X/\sim \) carries a unique diffeological structure \(D_{X/\sim } \), called the quotient diffeology, such that the canonical projection \(\pi :X\rightarrow X/\sim \) becomes a subduction, namely, \(D_{X/\sim }=\pi _*(D_X)\). We call \(\left( X/\sim ,D_{X/\sim }\right) \) the diffeological quotient of \((X,D_X)\) by the relation \(\sim \).
One aim of this paper is to move towards optimization algorithms in diffeological spaces. Thus, we end this subsection with a brief discussion of the topology of a diffeological space, which is necessary to discuss properties of optimization methods in diffeological spaces such as convergence. Every diffeological space induces a unique topology, the so-called D-topology, a natural topology introduced by Patrick Iglesias-Zemmour for each diffeological space (cf. [27]). In particular, openness, compactness and convergence depend on the D-topology. Given a diffeological space \((X,D_X)\), the D-topology is the finest topology such that all plots are continuous. That is, a subset U of X is open (in the D-topology) if for any plot \(p:O\rightarrow X\) the pre-image \(p^{-1}(U)\subset O\) is open. For more information about the D-topology we refer to the literature, e.g., [27, Chap. 2, 2.8] or [11]. Note, however, that if \((X,D_X)\) is a diffeological space which additionally carries a given topology, convergence of a sequence \(\{x_n\}\) with respect to that topology does not guarantee convergence with respect to the D-topology, since the D-topology may be finer than the given one on X. Thus, all discussions of compactness, convergence, etc. in the diffeological sense reduce to the D-topology.
3.1.2 Differences Between Diffeological Spaces and Manifolds
Manifolds can be generalized in many ways. In [58], a summary and comparison of possibilities to generalize smooth manifolds are given. One such generalization is the diffeological space, on which we concentrate in this section. In the following, the main differences between manifolds and diffeological spaces are worked out and formulated in Theorem 3. For simplicity, we concentrate on finite-dimensional manifolds. However, it has to be mentioned that infinite-dimensional manifolds can also be understood as diffeological spaces. This follows, e.g., from [32, Corollary 3.14] or [37].
Given a smooth manifold, there is a natural diffeology on this manifold consisting of all parametrizations which are smooth in the classical sense. This yields the following definition.
Definition 18
(Diffeological space associated with a manifold) Let M be a finite-dimensional (not necessarily Hausdorff or paracompact) smooth manifold. The diffeological space associated with M is defined as \((M,D_M)\), where the diffeology \(D_M\) consists precisely of the parametrizations of M which are smooth in the classical sense.
Remark 12
If M, N denote finite-dimensional manifolds, then \(f:M\rightarrow N\) is smooth in the classical sense if and only if it is a smooth map between the associated diffeological spaces \((M,D_M)\rightarrow (N,D_N)\).
In order to characterize the diffeological spaces which arise from manifolds, we need the concept of smooth points.
Definition 19
Let \((X,D_X)\) be a diffeological space. A point \(x\in X\) is called smooth if there exists a subset \(U\subset X\) which is open with respect to the topology of X and contains x such that \((U,D_U)\) is diffeomorphic to an open subset of \({\mathbb {R}}^n\), where \(D_U\) denotes the subset diffeology.
The concept of smooth points is quite simple. Consider, e.g., the coordinate axes in \({\mathbb {R}}^2\). All points of the two axes, with the exception of the origin, are smooth points.
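The following hypothetical Python sketch makes this example computational: on the coordinate cross \(\{(x,y):xy=0\}\), a point can only be smooth (locally diffeomorphic to an open subset of \({\mathbb {R}}\)) if a small punctured neighbourhood in the cross has exactly two connected components, like a punctured interval. The sampling and clustering thresholds are ad hoc choices of ours.

```python
# Hypothetical numerical illustration of Definition 19 on the coordinate
# cross X = {(x, y) : x * y = 0} in R^2.  A point of X can only be smooth
# (locally diffeomorphic to an open subset of R) if a small punctured
# neighbourhood of it within X has exactly two connected components, like a
# punctured interval.  All thresholds below are ad hoc sampling choices.

def branch_count(p, eps=0.5, delta=0.01, link=0.05):
    ts = [delta * k for k in range(-300, 301)]
    pts = [(t, 0.0) for t in ts] + [(0.0, t) for t in ts]   # sample the cross
    def d(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
    # keep a punctured neighbourhood of p: link < |q - p| <= eps
    pts = [q for q in pts if link < d(q, p) <= eps]
    # single-linkage clustering via union-find
    parent = list(range(len(pts)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(len(pts)):
        for j in range(i + 1, len(pts)):
            if d(pts[i], pts[j]) <= 1.2 * delta:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(len(pts))})

print(branch_count((1.0, 0.0)))   # 2: (1, 0) is a smooth point
print(branch_count((0.0, 0.0)))   # 4: four branches meet, the origin is not
```

At a point on an axis away from the origin, the punctured neighbourhood splits into two arcs; at the origin, four branches meet, which rules out a local diffeomorphism to an open subset of \({\mathbb {R}}\).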
Now, we are able to formulate the following main theorem:
Theorem 3
A diffeological space \((X,D_X)\) is associated with a (not necessarily paracompact or Hausdorff) smooth manifold if and only if each of its points is smooth.
Proof
We have to show the following statements:
(i) Given a smooth manifold M, each point of the associated diffeological space \((M,D_M)\) is smooth.

(ii) Given a diffeological space \((X,D_X)\) for which all points are smooth, it is associated with a smooth manifold M.
To (i) Let M be a smooth manifold and \(x\in M\) an arbitrary point. Then there exists an open neighbourhood \(U\subset M\) of x which is diffeomorphic to an open subset \(O\subset {\mathbb {R}}^n\). Let \(f:U\rightarrow O\) be a diffeomorphism. This diffeomorphism is also a diffeomorphism of the associated diffeological spaces \((U,D_U)\) and \((O,D_O)\). Thus, \(x\in M\) is a smooth point. Since \(x\in M\) is an arbitrary point, each point of \((M,D_M)\) is smooth.
To (ii) Let \((X,D_X)\) be a diffeological space for which all points are smooth. Then there exist an open cover \(X=\bigcup _{i\in I} U_i\) and diffeomorphisms \(f_i:U_i\rightarrow O_i \) onto open subsets \(O_i\subset {\mathbb {R}}^{n}\). The transition map \(f_j\circ f_i^{-1}:f_i(U_i\cap U_j)\rightarrow f_j(U_i\cap U_j)\) is smooth (in the diffeological sense) for all \(i,j\in I\). Due to Remark 12, it is smooth in the classical sense for all \(i,j\in I\). Thus, \(\{(U_i,f_i)\}_{i\in I}\) defines a smooth atlas, which induces a manifold structure on X. Let D be the associated diffeology. A similar argument as above shows that the diffeology D agrees with the original one, \(D_X\). \(\square \)
This theorem clarifies the difference between manifolds and diffeological spaces. Roughly speaking, a manifold of dimension n is obtained by gluing together open subsets of \({\mathbb {R}}^n\) via diffeomorphisms. A diffeological space is likewise formed by gluing together open subsets of \({\mathbb {R}}^n\), with the difference that the gluing maps are not necessarily diffeomorphisms and that n can vary. Note, however, that manifolds are described by charts, whereas diffeological spaces are described by plots. A system of local coordinates, i.e., a diffeomorphism \(p:U\rightarrow U'\) with \(U\subset {\mathbb {R}}^n\) open and \(U'\subset X\) open, can be viewed as a very special kind of plot \(U\rightarrow X\) which induces an induction on the corresponding diffeological spaces.
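As a toy instance of part (ii) of the proof, assuming the circle \(S^1\) with two hypothetical angle charts (one omitting the point \((-1,0)\), one omitting \((1,0)\)), the following sketch checks numerically that the transition map between the charts has derivative 1 on both components of the overlap, i.e., that the gluing maps are classical diffeomorphisms there. The chart choices and tolerances are our own.

```python
import math

# Toy instance of part (ii) of the proof: X = S^1 with two hypothetical
# angle charts, f1 with values in (-pi, pi) (omitting the point (-1, 0)) and
# f2 with values in (0, 2*pi) (omitting (1, 0)).  On the two components of
# the chart overlap, the transition map f2 o f1^{-1} should be a classical
# diffeomorphism; here its derivative is constantly 1, which we confirm by
# central finite differences.

def f1_inv(theta):                    # inverse of the first angle chart
    return (math.cos(theta), math.sin(theta))

def f2(p):                            # second angle chart, values in (0, 2*pi)
    a = math.atan2(p[1], p[0])
    return a if a > 0 else a + 2 * math.pi

def transition(theta):                # f2 o f1^{-1} on (-pi, 0) U (0, pi)
    return f2(f1_inv(theta))

h = 1e-6
thetas = [-2.0, -1.0, 1.0, 2.0]       # samples in both overlap components
derivs = [(transition(t + h) - transition(t - h)) / (2 * h) for t in thetas]
print([abs(d - 1.0) < 1e-5 for d in derivs])   # [True, True, True, True]
```

On the component \((0,\pi )\) the transition map is the identity, and on \((-\pi ,0)\) it is a shift by \(2\pi \); both are diffeomorphisms, matching the atlas construction in the proof.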
Remark 13
Note that we consider smooth manifolds which do not necessarily have to be Hausdorff or paracompact. If we understand a manifold as Hausdorff and paracompact, then the diffeological space \((X,D_X)\) in Theorem 3 has to be Hausdorff and paracompact. In this case, we need the concept of open sets in diffeological spaces. Whether a set is open depends on the topology under consideration; in the case of diffeological spaces, openness is determined by the D-topology.
3.2 The Diffeological Shape Space
We extend the definition of smooth shapes, which are elements of the shape space \(B_e\), to shapes of class \(H^{1/2}\). In the following, it is clarified what we mean by \(H^{1/2}\)-shapes. We would like to recall that a shape in the sense of the shape space \(B_e\) is given by the image of an embedding from the unit sphere \(S^{d-1}\) into the Euclidean space \({\mathbb {R}}^d\). In view of our generalization, it has technical advantages to consider so-called Lipschitz shapes which are defined as follows.
Definition 20
(Lipschitz shape) A \((d-1)\)-dimensional Lipschitz shape \(\varGamma _0\) is defined as the boundary \(\varGamma _0=\partial {\mathcal {X}}_0\) of a compact Lipschitz domain \({\mathcal {X}}_0\subset {\mathbb {R}}^d\) with \({\mathcal {X}}_0\ne \emptyset \). The set \({\mathcal {X}}_0\) is called a Lipschitz set.
Examples of Lipschitz shapes are illustrated in Fig. 4. In contrast, Fig. 5 shows examples of non-Lipschitz shapes.
General shapes, in our novel terminology, arise from \(H^1\)-deformations of a Lipschitz set \({\mathcal {X}}_0\). These \(H^1\)-deformations, evaluated at a Lipschitz shape \(\varGamma _0\), give deformed shapes \(\varGamma \) if the deformations are injective and continuous. Such shapes are called shapes of class \(H^{1/2}\) and were first proposed in [54]. The following definitions differ from [54] because we aim to define the space of \(H^{1/2}\)-shapes as a diffeological space, which is suitable for the formulation of optimization techniques and their applications.
Definition 21
(Shape space \({\mathcal {B}}^{1/2}\)) Let \(\varGamma _0\subset {\mathbb {R}}^d\) be a \((d-1)\)-dimensional Lipschitz shape. The space of all \((d-1)\)-dimensional \(H^{1/2}\)-shapes is given by
$$ {{{\mathcal {B}}}}^{1/2}(\varGamma _0,{\mathbb {R}}^d) := {{{\mathcal {H}}}}^{1/2}(\varGamma _0,{\mathbb {R}}^d)\big /\sim , $$
where
$$ {{{\mathcal {H}}}}^{1/2}(\varGamma _0,{\mathbb {R}}^d) := \left\{ w\in H^{1/2}(\varGamma _0,{\mathbb {R}}^d):\ w \text { injective and continuous},\ w(\varGamma _0) \text { is a Lipschitz shape}\right\} $$
and the equivalence relation \(\sim \) is given by
$$ w_1\sim w_2 \;\Leftrightarrow \; w_1(\varGamma _0)=w_2(\varGamma _0), \qquad w_1,w_2\in {{{\mathcal {H}}}}^{1/2}(\varGamma _0,{\mathbb {R}}^d). $$
The set \({{{\mathcal {H}}}}^{1/2}(\varGamma _0,{\mathbb {R}}^d)\) is obviously a subset of the Sobolev–Slobodeckij space \(H^{1/2}(\varGamma _0,{\mathbb {R}}^d)\), which is well known to be a Banach space (cf. [39, Chap. 3]). Banach spaces are manifolds and, thus, we can equip \(H^{1/2}(\varGamma _0,{\mathbb {R}}^d)\) with the corresponding diffeology. This motivates the following theorem, which provides the space of \(H^{1/2}\)-shapes with a diffeological structure. It is the third (and, thus, last) main theorem of this paper.
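Assuming, as in Definition 21, that two deformations are identified precisely when their images \(w(\varGamma _0)\) coincide, the following hypothetical sketch tests this on sampled curves: a rotated reparametrization of a circle is equivalent to the circle, while an ellipse is not. The discrete Hausdorff distance, sample size and tolerance are illustrative choices of ours.

```python
import math

# Hypothetical finite test of the equivalence relation: two sampled
# parametrizations over S^1 are compared through a discrete (symmetric)
# Hausdorff distance between their images.  Curves, sample size and the
# tolerance (coarser than the sampling resolution) are illustrative choices.

def sample(curve, n=400):
    return [curve(2 * math.pi * k / n) for k in range(n)]

def hausdorff(A, B):
    def one_sided(P, Q):
        return max(min(math.dist(p, q) for q in Q) for p in P)
    return max(one_sided(A, B), one_sided(B, A))

circle  = lambda t: (2 * math.cos(t), 2 * math.sin(t))
rotated = lambda t: (2 * math.cos(t + 0.7), 2 * math.sin(t + 0.7))  # same image
ellipse = lambda t: (2 * math.cos(t), math.sin(t))                  # other image

tol = 0.05  # safely above the sampling resolution of the circle
print(hausdorff(sample(circle), sample(rotated)) < tol)   # True:  equivalent
print(hausdorff(sample(circle), sample(ellipse)) < tol)   # False: not equivalent
```

The rotated curve traverses the same set of points, so its sampled image lies within the sampling resolution of the circle's, whereas the ellipse differs from the circle by a Hausdorff distance of about 1.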
Theorem 4
The set \(\mathcal{H}^{1/2}(\varGamma _0,{\mathbb {R}}^d)\) and the space \(\mathcal{B}^{1/2}(\varGamma _0,{\mathbb {R}}^d)\) carry unique diffeologies such that the inclusion map \(\iota _{\mathcal{H}^{1/2}(\varGamma _0,{\mathbb {R}}^d)}:\mathcal{H}^{1/2}(\varGamma _0,{\mathbb {R}}^d) \rightarrow H^{1/2}(\varGamma _0,{\mathbb {R}}^d)\) is an induction and such that the canonical projection \(\pi :{{{\mathcal {H}}}}^{1/2}(\varGamma _0,{\mathbb {R}}^d) \rightarrow \mathcal{B}^{1/2}(\varGamma _0,{\mathbb {R}}^d)\) is a subduction.
Proof
Let \(D_{H^{1/2}(\varGamma _0,{\mathbb {R}}^d)}\) be the diffeology on \(H^{1/2}(\varGamma _0,{\mathbb {R}}^d)\). Due to Theorem and Definition 14, \(\mathcal{H}^{1/2}(\varGamma _0,{\mathbb {R}}^d)\) carries the subset diffeology \(\iota _{\mathcal{H}^{1/2}(\varGamma _0,{\mathbb {R}}^d)}^*\left( D_{H^{1/2}(\varGamma _0,{\mathbb {R}}^d)}\right) \). Then the space \({{{\mathcal {B}}}}^{1/2}(\varGamma _0,{\mathbb {R}}^d)\) carries the quotient diffeology
$$ \pi _*\left( \iota _{\mathcal{H}^{1/2}(\varGamma _0,{\mathbb {R}}^d)}^*\left( D_{H^{1/2}(\varGamma _0,{\mathbb {R}}^d)}\right) \right) $$
due to Theorem and Definition 17. \(\square \)
So far, we have defined the space of \(H^{1/2}\)-shapes and shown that it is a diffeological space. The appearance of a diffeological space in the context of shape optimization can be seen as a first step or motivation towards the formulation of optimization techniques on diffeological spaces. Note that, so far, there is no theory for shape optimization on diffeological spaces. Of course, properties of the shape space \(\mathcal{B}^{1/2}\left( \varGamma _0,{\mathbb {R}}^d\right) \) have to be investigated. For example, an important question is what the tangent space looks like. Tangent spaces and tangent bundles are important in order to state the connection of \(\mathcal{B}^{1/2}\left( \varGamma _0,{\mathbb {R}}^d\right) \) to shape calculus and, in this way, to be able to formulate optimization algorithms in \(\mathcal{B}^{1/2}\left( \varGamma _0,{\mathbb {R}}^d\right) \). There are many equivalent ways to define tangent spaces of manifolds, e.g., geometrically via velocities of curves, algebraically via derivations or physically via cotangent spaces (cf. [33]). Many authors have generalized these concepts to diffeological spaces, e.g., [12, 22, 27, 57]. In [57], tangent spaces are defined for diffeological groups by identifying smooth curves using certain states. Tangent spaces and tangent bundles for many diffeological spaces are given in [22], where smooth curves and a more intrinsic identification are used. However, in [12], it is pointed out that there are some errors in [22]. In [27], the tangent space to a diffeological space at a point is defined as a subspace of the dual of the space of 1-forms at that point; these spaces are used to define tangent bundles. In [12], two approaches to the tangent space of a general diffeological space at a point are studied: the first is the approach introduced in [22], and the second uses smooth derivations on germs of smooth real-valued functions.
Basic facts about these tangent spaces are proven, e.g., locality and that the internal tangent space respects finite products. Note that the tangent space to \({{{\mathcal {B}}}}^{1/2}\left( \varGamma _0,{\mathbb {R}}^d\right) \) as diffeological space and related objects which are needed in optimization methods, e.g., retractions and vector transports, cannot be deduced or defined so easily. The study of these objects and the formulation of optimization methods on a diffeological space go beyond the scope of this paper and are topics of subsequent work. Moreover, note that the Riemannian structure \(g^S\) on \(\mathcal{B}^{1/2}\left( \varGamma _0,{\mathbb {R}}^d\right) \) has to be investigated in order to define \(\mathcal{B}^{1/2}\left( \varGamma _0,{\mathbb {R}}^d\right) \) as a Riemannian diffeological space. In general, a diffeological space can be equipped with a Riemannian structure as outlined, e.g., in [38].
Besides the tangent spaces, another open question is which assumptions guarantee that the image of a Lipschitz shape under \(w\in H^{1/2}\left( \varGamma _0,{\mathbb {R}}^d\right) \) is again a Lipschitz shape. Of course, the image of a Lipschitz shape under a continuously differentiable function is again a Lipschitz shape, but the requirement that w is a \({\mathcal {C}}^1\)-function is too strong. One idea is to require that w is a bi-Lipschitz function. Unfortunately, the image of a Lipschitz shape under a bi-Lipschitz function is not necessarily a Lipschitz shape, as the example given in [44, Sect. 4.1] shows. Another option is to generalize the concept of Lipschitz domains to non-tangentially accessible (NTA) domains. In order to formulate the definition of these domains, we need the concept of so-called Harnack chains.
Definition 22
(\(\alpha \)-Harnack chain) Let \(\alpha \ge 1\). For a metric space \(({\mathbb {R}}^d,\text {d})\) and an open set \(\varOmega \subset {\mathbb {R}}^d\), a sequence of balls \(B_0, \dots ,B_k\subset \varOmega \) is called an \(\alpha \)-Harnack chain in \(\varOmega \) if \(B_i\cap B_{i-1}\not = \emptyset \) for all \(i=1,\dots ,k\) and
$$ \alpha ^{-1}\, r(B_i)\le \text {dist}(B_i,\partial \varOmega )\le \alpha \, r(B_i) \quad \text {for all } i=0,\dots ,k, $$
where \(\text {dist}(B_i,\partial \varOmega ):=\inf \limits _{x\in B_i, y\in \partial \varOmega } \text {d}(x,y)\) and \(r(B_i)\) denotes the radius of \(B_i\).
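A hypothetical checker for this definition can be sketched as follows, assuming \(\varOmega \) is the strip \({\mathbb {R}}\times (0,1)\subset {\mathbb {R}}^2\) with the Euclidean metric (so that \(\text {dist}(B,\partial \varOmega )=\min (c_y,1-c_y)-r\) for a ball with centre \((c_x,c_y)\) and radius r) and the comparability condition \(\alpha ^{-1} r(B_i)\le \text {dist}(B_i,\partial \varOmega )\le \alpha \, r(B_i)\); the example chains are our own.

```python
import math

# Hypothetical checker for the alpha-Harnack chain condition, assuming
# Omega = R x (0, 1) (a horizontal strip) with the Euclidean metric, so that
# dist(B, dOmega) = min(c_y, 1 - c_y) - r for a ball B with centre (c_x, c_y)
# and radius r.  The comparability condition used below is an assumption:
# alpha^{-1} r(B_i) <= dist(B_i, dOmega) <= alpha r(B_i).

def dist_ball_to_boundary(center, r):
    cy = center[1]
    return min(cy, 1.0 - cy) - r

def is_harnack_chain(balls, alpha):
    """balls: list of ((c_x, c_y), r).  Checks that consecutive balls
    intersect and that every ball's radius is comparable to its distance
    to the boundary of the strip."""
    for i, (c, r) in enumerate(balls):
        d = dist_ball_to_boundary(c, r)
        if d <= 0 or not (r / alpha <= d <= alpha * r):
            return False
        if i > 0:
            c_prev, r_prev = balls[i - 1]
            if math.dist(c, c_prev) >= r + r_prev:  # B_i, B_{i-1} disjoint
                return False
    return True

# A chain of equal balls along the mid-line of the strip: each has radius
# 0.25 and distance 0.25 to the boundary, so alpha = 2 suffices.
chain = [((0.3 * k, 0.5), 0.25) for k in range(5)]
print(is_harnack_chain(chain, alpha=2.0))   # True

# Appending a much smaller ball deep inside the strip breaks comparability:
# its radius 0.05 is far smaller than its boundary distance 0.45.
thin = chain + [((1.4, 0.5), 0.05)]
print(is_harnack_chain(thin, alpha=2.0))    # False
```

The failing example illustrates the geometric content of the condition: along a Harnack chain, balls may not become arbitrarily small relative to their distance to the boundary.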
Definition 23
(Non-tangentially accessible domain) Let \(({\mathbb {R}}^d,\text {d})\) be a metric space. A bounded open set \(\varOmega \) is called a non-tangentially accessible domain (NTA domain) if the following conditions hold:
(i) There exists \(\alpha \ge 1\) such that for all \(\eta >0\) and for all \(x,y\in \varOmega \) with \(\text {dist}(x,\partial \varOmega )\ge \eta \), \(\text {dist}(y,\partial \varOmega )\ge \eta \) and \(\text {d}(x,y)\le C\eta \) for some \(C>0\), there exists an \(\alpha \)-Harnack chain \(B_0,\dots ,B_k\subset \varOmega \) such that \(x\in B_0\), \(y\in B_k\) and k depends on C but not on \(\eta \).

(ii) \(\varOmega \) satisfies the corkscrew condition, i.e., there exist \(r_0>0\) and \(\varepsilon >0\) such that for all \(r\in (0,r_0)\) and \(x\in \partial \varOmega \) the sets \(B(x,r)\cap \varOmega \) and \(B(x,r)\cap ({\mathbb {R}}^d\setminus {\overline{\varOmega }})\) each contain a ball of radius \(\varepsilon r\).
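For the unit disk, condition (ii) can be illustrated with the classical corkscrew points \((1\mp r/2)x\) and the choice \(\varepsilon =1/4\); the following hypothetical sketch checks numerically that the corresponding balls of radius \(r/4\) fit inside \(B(x,r)\cap \varOmega \) and \(B(x,r)\cap ({\mathbb {R}}^2\setminus {\overline{\varOmega }})\), respectively. This is an illustration of the definition, not part of the paper's argument.

```python
import math

# Hypothetical numerical check of the corkscrew condition for the unit disk
# Omega = B(0, 1): for x on the boundary and 0 < r < r_0 = 1, the classical
# corkscrew points (1 - r/2) * x (interior) and (1 + r/2) * x (exterior) give
# balls of radius (1/4) * r inside B(x, r) ∩ Omega and
# B(x, r) ∩ (R^2 \ closure(Omega)), respectively.  The constant eps = 1/4 is
# our own illustrative choice.

def corkscrew_ok(x, r, eps=0.25):
    nx = math.hypot(*x)
    x = (x[0] / nx, x[1] / nx)                  # normalize onto the circle
    candidates = [((1 - r / 2) * x[0], (1 - r / 2) * x[1]),   # interior point
                  ((1 + r / 2) * x[0], (1 + r / 2) * x[1])]   # exterior point
    for c in candidates:
        d_bdry = abs(1.0 - math.hypot(*c))      # distance to the unit circle
        in_cap = math.dist(c, x) + eps * r <= r          # ball stays in B(x,r)
        if d_bdry < eps * r or not in_cap:      # ball must avoid the circle
            return False
    return True

ok = all(corkscrew_ok((math.cos(t), math.sin(t)), r)
         for t in (0.0, 1.0, 2.5) for r in (0.1, 0.5, 0.9))
print(ok)   # True
```

Both corkscrew points lie at distance \(r/2\) from the boundary and from x, so a ball of radius \(r/4\) around each stays on its side of the circle and inside \(B(x,r)\).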
In fact, the image of a non-tangentially accessible (NTA) domain under a global quasiconformal mapping is again an NTA domain (cf. [21]). If we consider boundaries \(\varGamma _0\) of NTA domains \({\mathcal {X}}_0\) instead of Lipschitz domains, the space \({\mathcal {H}}^{1/2}\) defined in (50) changes to
$$ {{{\mathcal {H}}}}^{1/2}(\varGamma _0,{\mathbb {R}}^d) := \left\{ w\in H^{1/2}(\varGamma _0,{\mathbb {R}}^d):\ w \text { injective and continuous},\ w(\varGamma _0) \text { is the boundary of an NTA domain}\right\} . $$
The resulting space of \(H^{1/2}\)-shapes also carries a diffeological structure if \(\varGamma _0\) is the boundary of an NTA domain \({\mathcal {X}}_0\), due to Theorem 4.
Remark 14
A quasiconformal mapping of the open d-ball induces a homeomorphism on the boundary for \(d=2,3\) (cf. [20, 45]). In [46], this result is generalized to higher dimensions. More precisely, it is proven that a quasiconformal mapping of an open ball in \({\mathbb {R}}^d\) onto itself extends to a homeomorphism of the closed d-ball. If we apply these results to our shape space, we get injectivity and continuity in (53) for free for \(\varGamma _0:=S^{d-1}\).
4 Conclusion
The differential-geometric structure of the shape space \(B_e\) is applied to the theory of shape optimization problems. In particular, a Riemannian shape gradient and a Riemannian shape Hessian with respect to the Sobolev metric \(g^1\) are defined. The specification of the Riemannian shape Hessian requires the Riemannian connection, which is given and proven for the first Sobolev metric. It is outlined that we have to deal with surface formulations of shape derivatives if we consider the first Sobolev metric. In order to use the more attractive volume formulations, we consider the Steklov–Poincaré metrics \(g^S\) and state their connection to shape calculus by defining the shape gradient with respect to \(g^S\). The gradients with respect to both \(g^1\) and \(g^S\), together with the Riemannian shape Hessian, open the door to formulating optimization algorithms in \(B_e\). We formulate the gradient method in \((B_e,g^1)\) and \((B_e,g^S)\) as well as Newton's method in \((B_e,g^1)\). The implementation and investigation of Newton's method in \((B_e,g^1)\) for an explicit example will be addressed in future work. In particular, the comparison of Newton's method in \((B_e,g^1)\) and \((B_e,g^S)\) will be investigated in the future. A challenging question that arises here is what an explicit formulation of the Riemannian shape Hessian with respect to the Steklov–Poincaré metric looks like. For this, we would generally need to work in fractional-order Sobolev spaces and deal with the projected Poincaré–Steklov operator.
Since the shape space \(B_e\) limits the application of optimization techniques, we extend the definition of smooth shapes to \(H^{1/2}\)-shapes and define a novel shape space. It is shown that this space has a diffeological structure. In this context, we clarify the differences between manifolds and diffeological spaces. From a theoretical point of view, a diffeological space is very attractive in shape optimization. It can be supposed that a diffeological structure suffices for many of the differential-geometric tools used in shape optimization techniques. In particular, objects which are needed in optimization methods, e.g., retractions and vector transports, have to be deduced. Note that these objects cannot be defined easily; additional work is required to formulate optimization methods on a diffeological space. These questions remain open for further research and will be addressed in subsequent papers.
Notes
The ambient space of a mathematical object is the space surrounding that mathematical object along with the object itself.
Please see Remark 9 for the definition of the strain and stress tensor.
References
Absil, P.A., Mahony, R., Sepulchre, R.: Optimization Algorithms on Matrix Manifolds. Princeton University Press, Princeton (2008)
Ambrosio, L., Gigli, N., Savaré, G.: Gradient flows with metric and differentiable structures, and applications to the Wasserstein space. Rendiconti Lincei-Matematica e Applicazioni 15(3–4), 327–343 (2004)
Bauer, M., Bruveris, M.: A new Riemannian setting for surface registration. In: Pennec, X., Joshi, S., Nielsen, M. (eds.) Proceedings of the Third International Workshop on Mathematical Foundations of Computational Anatomy—Geometrical and Statistical Methods for Modelling Biological Shape Variability, pp. 182–193 (2011)
Bauer, M., Harms, P., Michor, P.M.: Sobolev metrics on shape space of surfaces. J. Geom. Mech. 3(4), 389–438 (2011)
Bauer, M., Harms, P., Michor, P.W.: Sobolev metrics on shape space II. Weighted Sobolev metrics and almost local metrics. J. Geom. Mech. 4(4), 365–383 (2012)
Beg, M.F., Miller, M.I., Trouvé, A., Younes, L.: Computing large deformation diffeomorphic metric mappings via geodesic flows of diffeomorphisms. Int. J. Comput. Vis. 61(2), 139–157 (2005)
Benamou, J.-D., Brenier, Y.: A computational fluid mechanics solution to the Monge-Kantorovich mass transfer problem. Numer. Math. 84(3), 375–393 (2000)
Berggren, M.: A unified discrete-continuous sensitivity analysis method for shape optimization. In: Fitzgibbon, W., et al., (eds.) Applied and Numerical Partial Differential Equations, Computational Methods in Applied Sciences, vol. 15, pp. 25–39. Springer (2010)
Bookstein, F.L.: Morphometric Tools for Landmark Data: Geometry and Biology. Cambridge University Press, Cambridge (1997)
Céa, J.: Conception optimale ou identification de formes calcul rapide de la dérivée directionelle de la fonction coût. RAIRO Modelisation mathématique et analyse numérique 20(3), 371–402 (1986)
Christensen, J.D., Sinnamon, G., Wu, E.: The \(D\)-topology for diffeological spaces. Pac. J. Math. 272(1), 87–110 (2014)
Christensen, J.D., Wu, E.: Tangent spaces and tangent bundles for diffeological spaces. Cahiers de Topologie et Geométrie Différentielle Catégoriques 57(1), 3–50 (2016)
Cootes, T.F., Taylor, C.J., Cooper, D.H., Graham, J.: Active shape models—their training and application. Comput. Vis. Image Underst. 61(1), 38–59 (1995)
Delfour, M.C., Zolésio, J.-P.: Shapes and Geometries: Metrics, Analysis, Differential Calculus, and Optimization. Advances in Design and Control, vol. 22, 2nd edn. SIAM (2001)
Droske, M., Rumpf, M.: Multi scale joint segmentation and registration of image morphology. IEEE Trans. Pattern Anal. Mach. Intell. 29(12), 2181–2194 (2007)
Durrleman, S., Pennec, X., Trouvé, A., Ayache, N.: Statistical models of sets of curves and surfaces based on currents. Med. Image Anal. 13(5), 793–808 (2009)
Evans, L.C.: Partial Differential Equations. American Mathematical Society, Providence (1993)
Fuchs, M., Jüttler, B., Scherzer, O., Yang, H.: Shape metrics based on elastic deformations. J. Math. Imaging Vis. 35(1), 86–102 (2009)
Gangl, P., Laurain, A., Meftahi, H., Sturm, K.: Shape optimization of an electric motor subject to nonlinear magnetostatics. SIAM J. Sci. Comput. 37(6), B1002–B1025 (2015)
Gehring, F.W.: Rings and quasiconformal mappings in space. Trans. Am. Math. Soc. 103(3), 353–393 (1962)
Gehring, F.W., Osgood, B.G.: Uniform domains and the quasi-hyperbolic metric. J. Anal. Math. 36(1), 50–74 (1979)
Hector, G.: Géométrie et topologie des espaces difféologiques. In: Virgos, E.M., Lopez, J.A.A. (eds.) Analysis and Geometry in Foliated Manifolds, pp. 55–80. World Scientific Publishing (1995)
Hintermüller, M., Laurain, L.: Optimal shape design subject to elliptic variational inequalities. SIAM J. Control. Optim. 49(3), 1015–1047 (2011)
Hintermüller, M., Ring, W.: A second order shape optimization approach for image segmentation. SIAM J. Appl. Math. 64(2), 442–467 (2004)
Hiptmair, R., Paganini, A.: Shape optimization by pursuing diffeomorphisms. Comput. Methods Appl. Math. 15(3), 291–305 (2015)
Holm, D., Trouvé, A., Younes, L.: The Euler-Poincaré theory of metamorphosis. Q. Appl. Math. 67(4), 661–685 (2009)
Iglesias-Zemmour, P.: Diffeology. Mathematical Surveys and Monographs, vol. 185. American Mathematical Society, Providence (2013)
Ito, K., Kunisch, K.: Lagrange Multiplier Approach to Variational Problems and Applications, Advances in Design and Control, vol. 15. SIAM (2008)
Ito, K., Kunisch, K., Peichl, G.H.: Variational approach to shape derivatives. ESAIM 14(3), 517–539 (2008)
Kendall, D.G.: Shape manifolds, procrustean metrics, and complex projective spaces. Bull. Lond. Math. Soc. 16(2), 81–121 (1984)
Kilian, M., Mitra, N.J., Pottmann, H.: Geometric modelling in shape space. ACM Trans. Gr. 26(64), 1–8 (2007)
Kriegl, A., Michor, P.W.: The Convenient Setting of Global Analysis. Mathematical Surveys and Monographs, vol. 53. American Mathematical Society, Providence (1997)
Kühnel, W.: Differentialgeometrie: Kurven, Flächen und Mannigfaltigkeiten. Vieweg, 4th edn (2008)
Kushnarev, S.: Teichons: Solitonlike geodesics on universal Teichmüller space. Exp. Math. 18(3), 325–336 (2009)
Laurain, A., Sturm, K.: Domain expression of the shape derivative and application to electrical impedance tomography. Technical Report No. 1863, Weierstraß-Institut für angewandte Analysis und Stochastik, Berlin (2013)
Ling, H., Jacobs, D.W.: Shape classification using the inner-distance. IEEE Trans. Pattern Anal. Mach. Intell. 29(2), 286–299 (2007)
Losik, M.V.: Categorical differential geometry. Cahiers de topologie et geométrie différentielle catégoriques 35(4), 274–290 (1994)
Magnot, J.-P.: Remarks on the geometry and the topology of the loop spaces \(H^s(S^1,{\mathbb{N}})\), for \(s\le 1/2\). Int. J. Maps Math. 2(1), 4–37 (2019)
McLean, W.: Strongly Elliptic Systems and Boundary Integral Equations. Cambridge University Press, Cambridge (2000)
Michor, P.M., Mumford, D.: Vanishing geodesic distance on spaces of submanifolds and diffeomorphisms. Doc. Math. 10, 217–245 (2005)
Michor, P.M., Mumford, D.: Riemannian geometries on spaces of plane curves. J. Eur. Math. Soc. 8(1), 1–48 (2006)
Michor, P.M., Mumford, D.: An overview of the Riemannian metrics on spaces of curves using the Hamiltonian approach. Appl. Comput. Harmon. Anal. 23(1), 74–113 (2007)
Mio, W., Srivastava, A., Joshi, S.: On shape of plane elastic curves. Int. J. Comput. Vis. 73(3), 307–324 (2007)
Mitrea, M., Hofmann, S., Taylor, M.: Geometric and transformational properties of Lipschitz domains, Semmes-Kenig-Toro domains, and other classes of finite perimeter domains. J. Geom. Anal. 17(4), 593–647 (2007)
Mori, A.: On quasi-conformality and pseudo-analyticity. Trans. Am. Math. Soc. 84(1), 56–77 (1957)
Mostow, G.D.: Quasi-conformal mappings in \( n \)-space and the rigidity of hyperbolic space forms. Publ. Math. l’IHÉS 34, 53–104 (1968)
Nägel, A., Schulz, V.H., Siebenborn, M., Wittum, G.: Scalable shape optimization methods for structured inverse modeling in 3D diffusive processes. Comput. Vis. Sci. 17(2), 79–88 (2015)
Paganini, A.: Approximative shape gradients for interface problems. In: Pratelli, A., Leugering, G. (eds.) New Trends in Shape Optimization, International Series of Numerical Mathematics, vol. 166, pp. 217–227. Springer, New York (2015)
Schmidt, S., Ilic, C., Schulz, V.H., Gauger, N.R.: Three-dimensional large-scale aerodynamic shape optimization based on shape calculus. AIAA J. 51(11), 2615–2627 (2013)
Schmidt, S., Wadbro, E., Berggren, M.: Large-scale three-dimensional acoustic horn optimization. SIAM J. Sci. Comput. 38(6), B917–B940 (2016)
Schulz, V.H.: A Riemannian view on shape optimization. Found. Comput. Math. 14(3), 483–501 (2014)
Schulz, V.H., Siebenborn, M.: Computational comparison of surface metrics for PDE constrained shape optimization. Comput. Methods Appl. Math. 16(3), 485–496 (2016)
Schulz, V.H., Siebenborn, M., Welker, K.: Structured inverse modeling in parabolic diffusion problems. SIAM J. Control. Optim. 53(6), 3319–3338 (2015)
Schulz, V.H., Siebenborn, M., Welker, K.: Efficient PDE constrained shape optimization based on Steklov-Poincaré type metrics. SIAM J. Optim. 26(4), 2800–2819 (2016)
Siebenborn, M., Welker, K.: Computational aspects of multigrid methods for optimization in shape spaces. SIAM J. Sci. Comput. 39(6), B1156–B1177 (2017)
Sokolowski, J., Zolésio, J.-P.: Introduction to Shape Optimization, Computational Mathematics, vol. 16. Springer, New York (1992)
Souriau, J.-M.: Groupes différentiels. In: Garcia, P.L., Perez-Rendon, A., Souriau, J.-M. (eds.) Differential Geometrical Methods in Mathematical Physics. Lecture Notes in Mathematics, vol. 836, pp. 91–128. Springer, New York (1980)
Stacey, A.: Comparative smootheology. Theory Appl. Categ. 25(4), 64–117 (2011)
Sturm, K.: Shape differentiability under non-linear PDE constraints. In: Pratelli, A., Leugering, G. (eds.) New Trends in Shape Optimization, International Series of Numerical Mathematics, vol. 266, pp. 271–300. Springer, New York (2015)
Trouvé, A., Younes, L.: Metamorphoses through Lie group action. Found. Comput. Math. 5(2), 173–198 (2005)
Welker, K.: Efficient PDE Constrained Shape Optimization in Shape Spaces. PhD thesis, Universität Trier (2016)
Welker, K.: Optimization in the space of smooth shapes. In: Nielsen, F., Barbaresco, F. (eds.) Geometric Science of Information. Lecture Notes in Computer Science, vol. 10589, pp. 65–72. Springer, New York (2017)
Wirth, B., Bar, L., Rumpf, M., Sapiro, G.: A continuum mechanical approach to geodesics in shape space. Int. J. Comput. Vis. 93(3), 293–318 (2011)
Wirth, B., Rumpf, M.: A nonlinear elastic shape averaging approach. SIAM J. Imaging Sci. 2(3), 800–833 (2009)
Zolésio, J.-P.: Control of moving domains, shape stabilization and variational tube formulations. Int. Ser. Numer. Math. 155, 329–382 (2007)
Acknowledgements
The author is indebted to Ben Anthes for many helpful comments and discussions about diffeological spaces. Moreover, the author thanks Leonhard Frerick (Trier University) for discussions about Lipschitz domains.
Funding
Open Access funding enabled and organized by Projekt DEAL.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This work has been partly supported by the German Research Foundation (DFG) within the priority program SPP1962/2 under Contract number WE 6629/1-1.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Welker, K. Suitable Spaces for Shape Optimization. Appl Math Optim 84 (Suppl 1), 869–902 (2021). https://doi.org/10.1007/s00245-021-09788-2