1 Introduction

1.1 Level Set Methods for Inverse Problems and Image Reconstruction

The level set technique was introduced for the solution of inverse problems in the seminal paper of Santosa [82]. Since then, it has developed significantly and by now appears to have become a standard technique for solving inverse problems with interfaces. However, there are still a large number of unresolved problems and open questions related to this method, which keeps fuelling active research on it worldwide. This chapter can only give a rough overview of some techniques which have been discussed so far in the literature. For more details which go beyond the material covered here, the reader is referred to the recent review articles [20, 31–33, 92], each of them providing a slightly different view on the topic and making available a rich set of additional references which the interested reader can follow for further consultation.

1.2 Images and Inverse Problems

An image, as referred to in this chapter, is a (possibly vector-valued) function which assigns to each point of a given domain in 2D or 3D one or more physical parameter values which are characteristic for that point. An image often contains interfaces, across which one or more of these physical parameters change value in a discontinuous manner. In many applications, these interfaces coincide with physical interfaces between different materials or regions. These interfaces divide the domain Ω into subdomains \(\Omega _{k}\), k = 1, …, K, of different region-specific internal parameter profiles. Often, due to the different physical structures of each of these regions, quite different mathematical models might be most appropriate for describing them in the given context.

Since the image represents physical parameters, it can be tested by physical inspection. Here, the physical parameters typically appear in partial differential equations (PDEs) or in integral equations (IEs) as space-dependent coefficients, and various probing fields are created for measuring the response of the image to these inputs. Due to physical restrictions, these measurements are typically only possible at few discrete locations, often situated at the boundary of the domain Ω, but sometimes also at a small number of points inside Ω. If the underlying PDE is time dependent, then these measurements can be time-dependent functions. The corresponding measured data give information on the spatial distribution of the subdomains and on the corresponding internal model parameters.

Sometimes the physical interpretation of the image is that of a source distribution rather than a parameter distribution. Then, the image itself creates the probing field and needs to be determined from just one set of measured data. Also combinations are possible where some components of the (vector-valued) image describe source distributions and other components describe parameter distributions. Initial conditions or boundary conditions can also often be interpreted as images in this spirit, which need to be determined from indirect data. It is clear that this concept of an image can be generalized even further, which leads to interesting mathematical problems and concepts.

There is often a large variety of additional prior information available for determining the image, whose character depends on the given application. For example, it might be known or assumed that all parameter profiles inside the individual subregions of a domain Ω are constant with known or unknown region-specific values. In this particular case, only the interfaces between the different regions, and possibly the unknown parameter values, need to be reconstructed from the gathered data, which, as a mathematical problem, is much better posed [37, 71] than the task of estimating independent values at each individual pixel or voxel from the same data set without additional prior information on the image. However, in many realistic applications, the image to be found is more complicated, and even the combined available information is not sufficient or adequate for completely and uniquely determining the underlying image. This becomes even worse due to typically noisy or incomplete data, or due to model inaccuracies. Then, it needs to be determined which information on the image is desired and which information can reasonably be expected from the data, taking into account the available additional prior information. Depending on the specific application, different viewpoints are typically taken which yield different strategies for obtaining images which agree (in an application-specific sense) with the given information. We will give some instructive examples further below.

Determining an image (or a set of possible images) from the measured data, in the above-described sense and by taking into account the available additional prior information, is called here imaging or image reconstruction. In practice, images are often represented in a computer and thereby need to be discretized somehow. The most popular discretization model uses 2D pixels or 3D voxels for representing an image, even though alternative models are possible. Often the underlying PDE also needs to be discretized on some grid, which could be done by finite differences, finite volumes, finite elements, and other techniques. The discretization for the image does not necessarily need to be identical to the discretization used for solving the PDE, and sometimes different models are used for discretizing the image and the PDE. However, in these cases, some method needs to be provided to map from one representation to the other. In a level set representation of an image, the level set functions also need to be discretized in order to be represented in a computer. The statements above then also hold true for the discretizations of the level set functions, which could either follow the same model as the PDE and/or the pixel model for the image or follow a different pattern.

1.3 The Forward and the Inverse Problem

In this chapter, it is supposed that data \(\tilde{g}\) are given in the form

$$\displaystyle{ \tilde{g} = \mathcal{M}\tilde{\mathbf{u}},\quad }$$
(1)

where \(\mathcal{M}\) denotes a linear measurement operator, and \(\tilde{\mathbf{u}}\) are the physical states created by the sources q for probing the image. It is assumed that a physical model \(\Lambda (b)\) is given, which incorporates the (possibly vector-valued) model parameter b and which is able to (roughly) predict the probing physical states when being plugged into an appropriate numerical simulator, provided the correct sources and physical parameters during the measurement process were known. The forward operator \(\mathcal{A}\) is defined as

$$\displaystyle{ \mathcal{A}(b,\mathbf{q}) = \mathcal{M}\Lambda (b)^{-1}\mathbf{q}. }$$
(2)

As mentioned, \(\Lambda (b)\) is often described in the form of some partial differential equation (PDE) or, alternatively, an integral equation (IE), and the individual coefficients of the model parameter b appear at one or several places in this model as space-dependent coefficients. In most applications, measurements are taken only at a few locations of the domain, for example, at the boundary of the area of interest, from which the physical parameters b or the source q (or both) need to be inferred in the whole domain. It is said that, with respect to these unknowns, the measurements are indirect: They are taken not at the locations where the unknowns need to be determined but indirectly through their overall impact on the states (modeled by the underlying PDE or IE) probing the image, which are measured only at a few locations. The behavior of the states is modeled by the operator \(\mathcal{A}\) in (2). If in \(\mathcal{A}\) only b (but not q) is unknown, then the problem is an inverse parameter or inverse scattering problem. If in \(\mathcal{A}\) only q (but not b) is unknown, then the problem is an inverse source problem. Given measured data \(\tilde{g}\), the “residual operator” \(\mathcal{R}\) is correspondingly given by

$$\displaystyle{ \mathcal{R}(b,\mathbf{q}) = \mathcal{A}(b,\mathbf{q}) -\tilde{ g}. }$$
(3)
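
To make the abstract operators more concrete, the following minimal sketch assembles discrete counterparts of (2) and (3) for a toy one-dimensional diffusion model \(-(b\,u')' = q\) with homogeneous Dirichlet conditions; the grid size, the measurement locations, and the piecewise constant coefficient are illustrative assumptions and are not taken from any of the applications discussed below.

```python
# Sketch of the forward operator (2) and the residual operator (3) for a toy
# 1D model -(b u')' = q on (0,1), u(0) = u(1) = 0 (finite differences).
import numpy as np

def forward(b_iface, q, meas_idx, h):
    """A(b, q) = M Lambda(b)^{-1} q.
    b_iface : coefficient b on the n+1 cell interfaces
    q       : source on the n interior nodes
    meas_idx: indices of the few nodes where data are 'measured' (operator M)
    """
    n = len(q)
    diag = (b_iface[:n] + b_iface[1:]) / h**2
    off = -b_iface[1:n] / h**2
    A = np.diag(diag) + np.diag(off, -1) + np.diag(off, 1)
    u = np.linalg.solve(A, q)          # discrete counterpart of Lambda(b)^{-1} q
    return u[meas_idx]                 # apply the measurement operator M

def residual(b_iface, q, meas_idx, h, g_data):
    """R(b, q) = A(b, q) - g as in (3)."""
    return forward(b_iface, q, meas_idx, h) - g_data

# toy usage: 99 interior nodes, 3 'receivers', a piecewise constant coefficient
n, h = 99, 1.0 / 100
b_true = np.where(np.abs(np.linspace(0.0, 1.0, n + 1) - 0.5) < 0.15, 2.0, 1.0)
q = np.ones(n)
g_tilde = forward(b_true, q, [24, 49, 74], h)          # synthetic data
print(residual(b_true, q, [24, 49, 74], h, g_tilde))   # zero residual for the true b
```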

Given the above definitions, an image is defined here as a mapping

$$\displaystyle{a: \Omega \rightarrow \mathbb{R}^{n},}$$

where Ω is a bounded or unbounded region in \(\mathbb{R}^{2}\) or in \(\mathbb{R}^{3}\) and n is the number of components of the (vector-valued) image. Each component function \(a_{k}\), k = 1, …, n, represents a space-dependent physical characteristic of the domain Ω which can be probed by physical inspection. If it appears as a coefficient of a PDE (or IE), it is denoted \(a_{k} = b_{k}\), and if it appears as a source, it is denoted \(a_{k} = q_{k}\). The exposition given in this chapter mainly focuses on the recovery of parameter distributions \(a_{k} = b_{k}\) and addresses several peculiarities related to those cases. However, the main concepts carry over without major changes to inverse source problems and also to some related formulations as, for example, the reconstruction of boundary or initial conditions of PDEs.

2 Examples and Case Studies

Some illustrative examples and case studies are presented in the following, which will be used further on in this chapter for demonstrating basic ideas and concepts on realistic and practical situations.

2.1 Example 1: Microwave Breast Screening

Figure 1 shows two-dimensional images from the application of microwave breast screening. The images of size 160 × 160 pixels have been constructed synthetically based on MRI images of the female breast. Three representative breast structures are displayed in the three images of the left column, where the value at each pixel of the images represents the physical parameter “static relative permittivity.”

Fig. 1

Three images from microwave breast screening. The three images are synthetically generated from MRI breast models. Left column: two-dimensional maps of the distribution of the static permittivity \(\varepsilon _{{\mathrm{st}}}\) inside the three breast models. Right column: the corresponding histograms of values of \(\varepsilon _{{\mathrm{st}}}\) in each map

A commonly accepted model for breast tissue is to roughly distinguish between skin, fatty tissue, and fibroglandular tissue. The images also show a matching liquid in which the breast is immersed. Inside the breast, regions can easily be identified which correspond to fibroglandular tissue (high static relative permittivity values) and fatty tissue (low static relative permittivity values), separated by a more or less complicated interface. In the right column, histograms are shown for the distributions of static relative permittivity values inside the breast. In these histograms, it becomes apparent that values for fatty and fibroglandular tissue are clustered around two typical values, but with a broad spread. However, a clear identification of fatty and fibroglandular tissue cannot easily be made for each pixel of the image based on just these values.

Nevertheless, during a reconstruction, and from anatomical reasoning, it does make sense to assume a model where fatty and fibroglandular tissue occupy some subregions of the breast where a sharp interface exists between these subregions. Finding these subregions provides valuable information for the physician. Furthermore, it might be sufficient for an overall evaluation of the situation to have a smoothly varying profile of tissue parameters reconstructed inside each of these subregions, allowing for the choice of a smoothly varying profile of static relative permittivity values inside each region. In the same spirit, from anatomical reasoning, it makes sense to assume a sharp interface (now of less complicated behavior) separating the skin region from the fatty/fibroglandular tissue on the one side and from the matching liquid on the other side. It might also be reasonable to assume that the skin and the matching liquid have constant static permittivity values, which might be known or not. If a tumor in its early stage of development is sought in this breast model, it will occupy an additional region of small size (and either simple or complicated shape and topology) and might have constant but unknown static relative permittivity value inside this region.

During a reconstruction for breast screening, this set of plausible assumptions provides us with a complex mathematical breast model which incorporates this prior information and might yield an improved and more realistic image for the reconstructed breast (including a better estimate of the tumor characteristics) than a regular pixel-based inversion would be able to provide. This is so because it is assumed that the real breast follows roughly the complicated model constructed above and that this additional information is taken into account in the inversion.

In this application, the underlying PDE is the system of time-harmonic Maxwell’s equations, or its 2D representative (describing so-called TM-waves), a Helmholtz equation. The “static relative permittivity,” as mapped in Fig. 1, represents one parameter entering in the wavenumber of the Debye dispersion model. The electromagnetic fields are created by specifically developed microwave antennas surrounding the breast, and the data are gathered at different microwave antennas also located around the breast. For more details, see [52].

2.2 Example 2: History Matching in Petroleum Engineering

Figure 2 shows a synthetically created 2D image of a hydrocarbon reservoir during the production process. Circles indicate injection wells, and crosses indicate production wells. The physical parameter displayed in the image is the permeability, which affects fluid flow in the reservoir. Physically, two lithofacies can be distinguished in this image, namely, sandstone and shaly sandstone (further on simply called “shale”). The sandstone region has permeability values roughly in the range 150–500 mDarcy, whereas shale has permeability values more in the range 900–1,300 mDarcy. In petroleum engineering applications, the parameters inside a given lithofacies sometimes follow an overall linear trend, which is the case here inside the sandstone region. This information is often available from geological evaluation of the terrain. As a rough approximation, inside this region, the permeability distribution can be modeled mathematically as a smooth perturbation of a bilinear model. Inside the shale region, no trend is observed or expected, and therefore the permeability distribution is described as a smooth perturbation of a constant distribution (i.e., an overall smoothly varying profile).

Fig. 2

An image from reservoir engineering. Shown is the permeability distribution of a fluid flow model in a reservoir which consists of a sandstone lithofacies (values in the range of 150–500 mDarcy) and a shaly sandstone lithofacies (values in the range of 900–1,300 mDarcy), separated by a sharp interface. The sandstone region shows an overall linear trend in the permeability distribution, whereas the shaly sandstone region does not show any clear trend

During a reconstruction, a possible model would be to reconstruct a reservoir image from production data which consists of three different quantities: (1) the interface between the sandstone and shale lithofacies, (2) the smooth perturbation of the constant profile inside the shale region, and (3) the overall trend (i.e., the bilinear profile) inside the sandstone region, assuming that inside this sandstone region the smooth perturbation is small compared to this dominant overall trend. In this application, the PDE is a system of equations modeling two-phase or three-phase fluid flow in a porous medium, of which the relative permeability is one model parameter.

The “fields” (in a slightly generalized sense) are represented in this application by pressure values and water/oil saturation values at each point inside the reservoir during production and are generated by injecting (under high pressure) water in the injection wells and extracting (imposing lower pressure) water and oil from the production wells. The data are the injection and production rates of water and oil, respectively, and sometimes pressure values measured at injection and production wells over production time. For more details, see [35].

2.3 Example 3: Crack Detection

Figure 3 shows an image of a disconnected crack embedded in a homogeneous material. The cracks are represented in this simplified model as very thin regions of fixed thickness. The physical parameter represented by the image is the conductivity distribution in the domain. Only two values can be assumed by this conductivity, one inside the thin region (crack) and another one in the background. The background value is typically known, and the value inside the crack might either be approximately known or it might be an unknown of the inverse problem. The same holds true for the thickness of the crack, which is assumed constant along the cracks, even though the correct thickness (the constant) might become an unknown of the inverse problem as well. Here insulating cracks are considered, where the conductivity is significantly lower than in the background. The probing fields inside the domain are the electrostatic potentials which are produced by applying voltages at various locations along the boundary of the domain, and the data are the corresponding currents across the boundary at discrete positions.

Fig. 3

An image from the application of crack detection. Three disconnected crack components are embedded in a homogeneous background medium and need to be reconstructed from electrostatic measurements at the region boundary. In the considered case of insulating cracks, these components are modeled as thin shapes of fixed thickness with a conductivity value much lower than the background conductivity

This model can be considered as a special case of a binary medium where volumetric inclusions are embedded in a homogeneous background. However, the fact that these structures are very thin with fixed thickness requires some special treatment during the shape evolution, which will be commented on further below. In this application, the underlying PDE is a second-order elliptic equation modeling the distribution of electric potentials in the domain for a set of given applied voltage patterns. For more details, see [4].

3 Level Set Representation of Images with Interfaces

A complex image in the above sense needs a convenient mathematical representation in order to be dealt with in a computational and mathematical framework. In this section, several different approaches are listed which have been proposed in the literature for describing images with interfaces by a level set technique. First, the most basic representation is given, which only considers binary media. Afterwards, various representations are described which handle more complicated situations.

3.1 The Basic Level Set Formulation for Binary Media

In the shape inverse problem in its simplest form, the parameter distribution is described by

$$\displaystyle{ b({\mathbf{x}}) = \left \{\begin{array}{ll} b^{(i)}({\mathbf{x}})\quad &\mbox{ in }D \\ b^{(e)}({\mathbf{x}})\quad &\mbox{ in }\Omega \setminus D, \end{array} \right. }$$
(4)

where \(D \subset \Omega \) is a subregion of Ω and where usually discontinuities in the parameters b occur at the interface ∂ D. In the basic level set representation for the shape D, a (sufficiently smooth, i.e., Lipschitz continuous) level set function \(\phi: \Omega \rightarrow \mathbb{R}\) is introduced and the shape D is described by

$$\displaystyle{ \left \{\begin{array}{ll} \phi ({\mathbf{x}}) \leq 0\quad &\mbox{ for all }{\mathbf{x}} \in D, \\ \phi ({\mathbf{x}}) > 0\quad &\mbox{ for all }{\mathbf{x}} \in \Omega \setminus D. \end{array} \right. }$$
(5)

In other words, the parameter function b has the form

$$\displaystyle{ b({\mathbf{x}}) = \left \{\begin{array}{ll} b^{(i)}({\mathbf{x}})\quad &\mbox{ where }\phi ({\mathbf{x}}) \leq 0, \\ b^{(e)}({\mathbf{x}})\quad &\mbox{ where }\phi ({\mathbf{x}}) > 0. \end{array} \right. }$$
(6)

Certainly, a unique representation of the image is possible by just knowing those points where ϕ(x) has a change of sign (the so-called zero level set) and additionally knowing the two interior profiles b (i)(x) and b (e)(x) inside those areas of Ω where they are active (which are D and \(\Omega \setminus D\), respectively). Often, however, it is more convenient to assume that these functions are defined on larger sets which include the minimal sets mentioned above. In this chapter, it is assumed that all functions are defined on the entire domain Ω, by employing any convenient extensions from the abovementioned sets to the rest of Ω. Again, it is clear that the above extensions are not unique and that many possible representations can then be found for a given image. Which one to choose depends on details of the algorithm for constructing the image, on the available prior information, and possibly on other criteria (Fig. 4).

Fig. 4

The basic level set representation of a shape D. Those points of the domain where the describing level set function assumes negative values are “inside” the shape D described by the level set function ϕ, while those with positive values are “outside” it. The zero level set where ϕ = 0 represents the shape boundary

For a sufficiently smooth level set function, the boundary of the shape D permits the characterization

$$\displaystyle{ \partial D =\{ {\mathbf{x}} \in \Omega,\quad \phi ({\mathbf{x}}) = 0\}. }$$
(7)

This representation motivates the name zero level set for the boundary of the shape. In some representations listed further below, however, level set functions are preferred which are discontinuous across those sets where they change sign. Then, the boundary of the different regions can be defined alternatively as

$$\displaystyle\begin{array}{rcl} \partial D& =& \{{\mathbf{x}} \in \Omega \;:\; \mbox{ for all}\;\rho > 0\;\mbox{ we can find}\;{\mathbf{x}}_{1},{\mathbf{x}}_{2} \in B_{\rho }({\mathbf{x}}) \\ & & \qquad \qquad \mbox{ with}\;\phi ({\mathbf{x}}_{1}) > 0\;\mbox{ and}\;\phi ({\mathbf{x}}_{2}) \leq 0\} {}\end{array}$$
(8)

where \(B_{\rho }({\mathbf{x}}_{0}) =\{ {\mathbf{x}} \in \Omega: \vert {\mathbf{x}} -{\mathbf{x}}_{0}\vert <\rho \}\).
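
As a simple illustration of the basic representation (4)–(6) on a pixel grid, the following sketch builds the image b from an assumed circular level set function and two constant profiles and marks the pixels across which φ changes sign (a pixelated version of the zero level set (7)); all values are illustrative.

```python
# Sketch of the basic binary level set representation (6) on a pixel grid.
import numpy as np

nx = ny = 160                       # pixel grid, as in the breast models of Fig. 1
Y, X = np.meshgrid(np.linspace(0, 1, ny), np.linspace(0, 1, nx), indexing="ij")

phi = np.sqrt((X - 0.5)**2 + (Y - 0.5)**2) - 0.25   # signed distance to a circle
b_int = 2.0 * np.ones_like(phi)     # interior profile b^(i), extended to all of Omega
b_ext = 1.0 * np.ones_like(phi)     # exterior profile b^(e), extended to all of Omega

b = np.where(phi <= 0.0, b_int, b_ext)              # equation (6)

# pixels crossed by the zero level set: sign changes between neighboring pixels
sign_change = (np.abs(np.diff(np.sign(phi), axis=0)) > 0)[:, :-1] | \
              (np.abs(np.diff(np.sign(phi), axis=1)) > 0)[:-1, :]
print("interface pixels:", np.count_nonzero(sign_change))
```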

3.2 Level Set Formulations for Multivalued and Structured Media

As mentioned already above, in many applications the binary model described in section “The Basic Level Set Formulation for Binary Media” is not sufficient and more complex image models need to be employed. Several means have been discussed in the literature for generalizing the basic model to more complex situations, some of them being listed in the following.

3.2.1 Different Levels of a Single Smooth Level Set Function

A straightforward generalization of the technique described in section “The Basic Level Set Formulation for Binary Media” consists in using, in addition to the zero level set, additional level sets of a given smooth (e.g., Lipschitz continuous) level set function in order to describe different regions of a given domain [99]. For example, define

$$\displaystyle\begin{array}{rcl} \Gamma _{i}& =& \{{\mathbf{x}} \in \Omega,\quad \phi ({\mathbf{x}}) = c_{i}\}{}\end{array}$$
(9)
$$\displaystyle\begin{array}{rcl} D_{i}& =& \{{\mathbf{x}} \in \Omega,\quad c_{i+1} <\phi ({\mathbf{x}}) < c_{i}\},{}\end{array}$$
(10)

where the \(c_{i}\) are prespecified values with \(c_{i+1} < c_{i}\) for \(i = 0,\ldots,\underline{i} - 1\) and with \(c_{0} = +\infty \), \(c_{\underline{i}} = -\infty \). Then,

$$\displaystyle{ \Omega =\bigcup _{ i=0}^{\underline{i}}D_{ i},\quad \mbox{ with}\quad D_{i} \cap D_{i^{{\prime}}} =\emptyset \quad \mbox{ for}\;i\neq i^{{\prime}}. }$$
(11)

A level set representation for the image b is then given as a tuple \((b_{0},\ldots,b_{\underline{i}},\phi )\) which satisfies

$$\displaystyle{ b({\mathbf{x}}) = b_{i}({\mathbf{x}})\mbox{ for }c_{i+1} <\phi ({\mathbf{x}}) < c_{i}. }$$
(12)

It is clear that certain topological restrictions are imposed on the distribution of the regions D i by this formulation. In particular, it favors certain nested structures. For more details, see [63].
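
The following short sketch illustrates (9)–(12) for an assumed one-dimensional level set function with arbitrarily chosen levels and region profiles; points lying exactly on a level c_i are assigned to the region above it here, which is merely one possible convention.

```python
# Sketch of (9)-(12): regions D_i carved out of a single smooth level set
# function phi by prescribed (decreasing) levels c_1 > c_2 > ...
import numpy as np

def image_from_levels(phi, levels, profiles):
    """levels: the finite levels c_1 > c_2 > ... in decreasing order; +inf and
    -inf are appended at the ends internally.  profiles[i] is the value b_i on
    region D_i, so len(profiles) == len(levels) + 1."""
    c = np.concatenate(([np.inf], np.asarray(levels, dtype=float), [-np.inf]))
    b = np.zeros_like(phi)
    for i in range(len(c) - 1):
        mask = (c[i + 1] < phi) & (phi <= c[i])   # region D_i, cf. (10)
        b[mask] = profiles[i]
    return b

x = np.linspace(0, 1, 200)
phi = np.cos(3 * np.pi * x)                       # some smooth level set function
b = image_from_levels(phi, levels=[0.5, -0.5], profiles=[1.0, 2.0, 3.0])
```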

3.2.2 Piecewise Constant Level Set Function

This model describes piecewise constant multiple phases of a domain by only one level set function and has its origins in the application of image segmentation. A single level set function is used which is only allowed to take a small number of different values, e.g.,

$$\displaystyle{ \phi ({\mathbf{x}}) = i\quad \mbox{ in}\;D_{i},\quad \mbox{ for}\;i = 0,\ldots,\underline{i}, }$$
(13)
$$\displaystyle{\Omega =\bigcup _{ i=0}^{\underline{i}}D_{ i},\quad \mbox{ with}\quad D_{i} \cap D_{i^{{\prime}}} =\emptyset \quad \mbox{ for}\quad i\neq i^{{\prime}}.}$$

Introducing the set of basis functions γ i

$$\displaystyle{ \gamma _{i} = \frac{1} {\alpha _{i}} \prod _{{ j=1 \atop j\neq i} }^{\underline{i}}(\phi -j)\quad \mbox{ with}\quad \alpha _{ i} =\prod _{ { j=1 \atop j\neq i} }^{\underline{i}}(i - j), }$$
(14)

the parameter distribution b(x) is defined as

$$\displaystyle{ b =\sum _{ i=1}^{\underline{i}}b_{ i}\gamma _{i}. }$$
(15)

A level set representation for the image b is then given as a tuple \((b_{1},\ldots,b_{\underline{i}},\phi )\) with

$$\displaystyle{ b({\mathbf{x}}) = b_{i}\quad \mbox{ where}\;\phi ({\mathbf{x}}) = i. }$$
(16)

Numerical results using this model can be found, among others, in [61, 65, 67, 97].
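
The following sketch evaluates the basis functions (14) and the sum (15) for an assumed integer-valued level set function and arbitrary phase values; it merely verifies that the Lagrange-type construction reproduces b_i wherever φ = i.

```python
# Sketch of the piecewise constant level set model (13)-(16).
import numpy as np

def gamma(phi, i, imax):
    """Basis function gamma_i of (14): 1 where phi == i, 0 at the other
    admissible integer values of phi."""
    js = [j for j in range(1, imax + 1) if j != i]
    alpha_i = np.prod([float(i - j) for j in js])
    out = np.ones_like(phi, dtype=float)
    for j in js:
        out *= (phi - j)
    return out / alpha_i

imax = 3
phi = np.array([1, 2, 3, 3, 2, 1], dtype=float)    # piecewise constant level set function
b_vals = [10.0, 20.0, 30.0]                        # phase values b_1, b_2, b_3
b = sum(b_vals[i - 1] * gamma(phi, i, imax) for i in range(1, imax + 1))   # eq. (15)
print(b)    # -> [10. 20. 30. 30. 20. 10.]
```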

3.2.3 Vector Level Set

In [99] multiple phases are described by using one individual level set function for each of these phases, i.e.,

$$\displaystyle\begin{array}{rcl} \Gamma _{i}& =& \{{\mathbf{x}} \in \Omega,\quad \phi _{i}({\mathbf{x}}) = 0\}{}\end{array}$$
(17)
$$\displaystyle\begin{array}{rcl} D_{i}& =& \{{\mathbf{x}} \in \Omega,\quad \phi _{i}({\mathbf{x}}) \leq 0\},{}\end{array}$$
(18)

for sufficiently smooth level set functions \(\phi _{i}\), \(i = 1,\ldots,\underline{i}\). In this model, the level set representation for the image b is given by a tuple \((b_{1},\ldots,b_{\underline{i}},\phi _{1},\ldots,\phi _{\underline{i}})\) which satisfies

$$\displaystyle{ b(x) = b_{k}(x)\mbox{ where }\phi _{k}({\mathbf{x}}) \leq 0. }$$
(19)

Care needs to be taken here that different phases do not overlap, which is not automatically incorporated in the model. For more details on how to address this and other related issues, see [99].

3.2.4 Color Level Set

An alternative way of describing different phases by more than one level set function has been introduced in [95] in the framework of image segmentation and further investigated in [14, 25, 38, 63, 92] in the framework of inverse problems. In this model (which is also known as the Chan–Vese model), up to \(2^{n}\) different phases can be represented by n different level set functions by distinguishing all possible sign combinations of these functions. For example, a level set representation for an image b containing up to four different phases is given by the tuple (b 1, b 2, b 3, b 4, ϕ 1, ϕ 2) which satisfies

$$\displaystyle\begin{array}{rcl} b({\mathbf{x}})& =& b_{1}(1 - H(\phi _{1}))(1 - H(\phi _{2})) + b_{2}(1 - H(\phi _{1}))H(\phi _{2}) \\ & & +b_{3}H(\phi _{1})(1 - H(\phi _{2})) + b_{4}H(\phi _{1})H(\phi _{2}). {}\end{array}$$
(20)

Also here, the contrast values \(b_{\nu }\), ν = 1, …, 4, are allowed to be smoothly varying functions inside each region. The four different regions are then given by

$$\displaystyle\begin{array}{rcl} D_{1}& =& \{{\mathbf{x}},\quad \phi _{1} \leq 0\quad \mbox{ and}\quad \phi _{2} \leq 0\} \\ D_{2}& =& \{{\mathbf{x}},\quad \phi _{1} \leq 0\quad \mbox{ and}\quad \phi _{2} > 0\} \\ D_{3}& =& \{{\mathbf{x}},\quad \phi _{1} > 0\quad \mbox{ and}\quad \phi _{2} \leq 0\} \\ D_{4}& =& \{{\mathbf{x}},\quad \phi _{1} > 0\quad \mbox{ and}\quad \phi _{2} > 0\}.{}\end{array}$$
(21)

This yields a complete covering of the domain Ω by the four regions, each point \({\mathbf{x}} \in \Omega \) being part of exactly one of the four shapes D ν ; see Fig. 5.

Fig. 5

Color level set representation of multiple shapes. Each region is characterized by a different sign combination of the two describing level set functions
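
The following sketch evaluates (20) and checks the region encoding (21) for all four sign combinations; the sharp Heaviside function and the test values are illustrative assumptions (in practice, smoothed approximations of H are frequently employed).

```python
# Sketch of the color level set representation (20)-(21) with two level set functions.
import numpy as np

H = lambda phi: (phi > 0).astype(float)       # sharp Heaviside function

def color_level_set(phi1, phi2, b1, b2, b3, b4):
    """Parameter map (20) determined by the sign pattern of (phi1, phi2)."""
    return (b1 * (1 - H(phi1)) * (1 - H(phi2)) + b2 * (1 - H(phi1)) * H(phi2)
            + b3 * H(phi1) * (1 - H(phi2)) + b4 * H(phi1) * H(phi2))

# quick check of the four regions D_1,...,D_4 of (21)
phi1 = np.array([-1.0, -1.0, 1.0, 1.0])
phi2 = np.array([-1.0, 1.0, -1.0, 1.0])
print(color_level_set(phi1, phi2, 1.0, 2.0, 3.0, 4.0))   # -> [1. 2. 3. 4.]
```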

3.2.5 Binary Color Level Set

An alternative technique for using more than one level set function for describing multiple phases, which is, in a certain sense, a combination of the piecewise constant level set model described in section “Piecewise Constant Level Set Function” and the color level set technique described in section “Color Level Set,” has been proposed in [62] for the application of Mumford–Shah image segmentation. For the description of up to four phases by two (now piecewise constant) level set functions ϕ 1 and ϕ 2, in this binary level set model, the two level set functions are required to satisfy

$$\displaystyle{ \phi _{i} \in \{-1,\,1\},\quad \mbox{ or}\quad \phi _{i}^{2} = 1,\quad i \in \{ 1,2\}. }$$
(22)

The parameter function b(x) is given by

$$\displaystyle\begin{array}{rcl} b({\mathbf{x}})& =& \frac{1} {4}\Big(b_{1}(\phi _{1} - 1)(\phi _{2} - 1) - b_{2}(\phi _{1} - 1)(\phi _{2} + 1)\Big. \\ & & \Big. - b_{3}(\phi _{1} + 1)(\phi _{2} - 1) + b_{4}(\phi _{1} + 1)(\phi _{2} + 1)\Big),{}\end{array}$$
(23)

and the four different regions are encoded as

$$\displaystyle\begin{array}{rcl} D_{1}& =& \{{\mathbf{x}},\quad \phi _{1} = -1\quad \mbox{ and}\quad \phi _{2} = -1\} \\ D_{2}& =& \{{\mathbf{x}},\quad \phi _{1} = -1\quad \mbox{ and}\quad \phi _{2} = +1\} \\ D_{3}& =& \{{\mathbf{x}},\quad \phi _{1} = +1\quad \mbox{ and}\quad \phi _{2} = -1\} \\ D_{4}& =& \{{\mathbf{x}},\quad \phi _{1} = +1\quad \mbox{ and}\quad \phi _{2} = +1\}.{}\end{array}$$
(24)

A level set representation for an image b containing up to four different phases is given by the tuple (b 1, b 2, b 3, b 4, ϕ 1, ϕ 2) which satisfies (23). For more details, we refer to [62].
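
As a quick consistency check of the binary encoding (23)–(24), the following sketch evaluates the polynomial for all four admissible sign combinations of ϕ 1, ϕ 2 ∈ {−1, +1}, using arbitrary test values.

```python
# Check that the binary color level set polynomial (23) reproduces the
# phase values b_1,...,b_4 on the regions (24).
import itertools

def b_binary(phi1, phi2, b1, b2, b3, b4):
    return 0.25 * (b1 * (phi1 - 1) * (phi2 - 1) - b2 * (phi1 - 1) * (phi2 + 1)
                   - b3 * (phi1 + 1) * (phi2 - 1) + b4 * (phi1 + 1) * (phi2 + 1))

for phi1, phi2 in itertools.product([-1, 1], repeat=2):
    print((phi1, phi2), b_binary(phi1, phi2, 10.0, 20.0, 30.0, 40.0))
# (-1,-1) -> 10, (-1,+1) -> 20, (+1,-1) -> 30, (+1,+1) -> 40
```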

3.3 Level Set Formulations for Specific Applications

Often, for specific applications, it is convenient to develop particular modifications or generalizations of the above-described general approaches for describing multiple regions by taking into account assumptions and prior information which are very specific to the particular application. A few examples are given below.

3.3.1 A Modification of Color Level Set for Tumor Detection

In the application of tumor detection from microwave data for breast screening (see section “Example 1: Microwave Breast Screening”), the following situation needs to be modeled. The breast consists of four possible tissue types, namely, the skin, fibroglandular tissue, fatty tissue, and a possible tumor. Each of these tissue types might have an internal structure, which is (together with the mutual interfaces) one unknown of the inverse problem. In principle, the color level set description using two level set functions for describing four different phases would be sufficient for modeling this situation. However, the reconstruction algorithm as presented in [52] requires some flexibility in handling these four regions separately, which is difficult in this minimal representation of four regions. Therefore, in [52], the following modified version of the general color level set representation is proposed for modeling this situation. In this modified version, m different phases (here m = 4) are described by n = m − 1 level set functions in the following form:

$$\displaystyle\begin{array}{rcl} b({\mathbf{x}})& =& b_{1}(1 - H(\phi _{1})) + H(\phi _{1})\Big[b_{2}(1 - H(\phi _{2}))\Big. \\ & & +\Big.H(\phi _{2})\left \{b_{3}(1 - H(\phi _{3})) + b_{4}H(\phi _{3})\right \}\Big] {}\end{array}$$
(25)

or

$$\displaystyle\begin{array}{rcl} D_{1}& =& \{{\mathbf{x}},\quad \phi _{1} \leq 0\} \\ D_{2}& =& \{{\mathbf{x}},\quad \phi _{1} > 0\quad \mbox{ and}\quad \phi _{2} \leq 0\} \\ D_{3}& =& \{{\mathbf{x}},\quad \phi _{1} > 0\quad \mbox{ and}\quad \phi _{2} > 0\;\quad \mbox{ and}\quad \phi _{3} \leq 0\} \\ D_{4}& =& \{{\mathbf{x}},\quad \phi _{1} > 0\quad \mbox{ and}\quad \phi _{2} > 0\;\quad \mbox{ and}\quad \phi _{3} > 0\},{}\end{array}$$
(26)

where b 1, b 2, b 3, and b 4 denote the dielectric parameters of the skin, tumorous, fibroglandular, and fatty tissue, respectively. In (25), ϕ 1, ϕ 2, and ϕ 3 are the three different level set functions indicating the regions filled with the skin, tumorous, and fibroglandular tissue, respectively, and the contrast values \(b_{\nu }\), ν = 1, …, 4, are generally allowed to be smoothly varying functions inside each region. This combination of m − 1 level set functions for describing m different phases has certain advantages over the standard color level set formulation during the reconstruction process, as is pointed out in [52]. On the other hand, it is obvious that (26) can be considered a special case of the color level set technique (section “Color Level Set”) where the theoretically possible \(2^{3} = 8\) different values of the color level set description are enforced to fall into m = 4 different groups of characteristic values; see the central image of Fig. 6.

Fig. 6

Multiple level set representation for modeling multiphase inverse problems. Left: original color level set technique for describing eight different phases by the different sign combinations of three level set functions. Center: modified color level set technique used in the model for early detection of breast cancer from microwave data. The possible eight regions of the color level set presentation are filled with four different materials in a tailor-made fashion for this application. Right: modified color level set technique for modeling the history matching problem of a water-flooding process in a petroleum reservoir. Also here the eight different regions are filled by a specific combination of materials characteristic for the reconstruction scheme used in this application. Regions with more than one subindex correspond to “characteristic regions” with averaged parameter values

3.3.2 A Modification of Color Level Set for Reservoir Characterization

Another modification of the color level set technique has been used in [35] for the application of history matching in reservoir engineering; see section “Example 2: History Matching in Petroleum Engineering.” Given, as an example, n = 4 lithofacies and the three level set functions ϕ 1, ϕ 2, ϕ 3, we define the parameter (permeability) distribution inside the reservoir by

$$\displaystyle\begin{array}{rcl} b& =& b_{1}(1 - H(\phi _{1}))H(\phi _{2})H(\phi _{3}) + b_{2}H(\phi _{1})(1 - H(\phi _{2}))H(\phi _{3}) \\ & & +b_{3}H(\phi _{1})H(\phi _{2})(1 - H(\phi _{3})) + b_{4}H(\phi _{1})H(\phi _{2})H(\phi _{3}) \\ & & +\frac{b_{2} + b_{3}} {2} \ H(\phi _{1})(1 - H(\phi _{2}))(1 - H(\phi _{3})) \\ & & +\frac{b_{1} + b_{3}} {2} \ (1 - H(\phi _{1}))H(\phi _{2})(1 - H(\phi _{3})) \\ & & +\frac{b_{1} + b_{2}} {2} \ (1 - H(\phi _{1}))(1 - H(\phi _{2}))H(\phi _{3}) \\ & & +\frac{b_{1} + b_{2} + b_{3}} {3} \ (1 - H(\phi _{1}))(1 - H(\phi _{2}))(1 - H(\phi _{3})),{}\end{array}$$
(27)

where the permeability values \(b_{\nu }\), ν = 1, …, 4, are assumed constant inside each region. The four lithofacies are represented as

$$\displaystyle\begin{array}{rcl} D_{1}& =& \{{\mathbf{x}},\quad \phi _{1} \leq 0\quad \mbox{ and }\quad \phi _{2}\ > 0\quad \mbox{ and }\quad \phi _{3}\ > 0\} \\ D_{2}& =& \{{\mathbf{x}},\quad \phi _{2} \leq 0\quad \mbox{ and }\quad \phi _{3}\ > 0\quad \mbox{ and }\quad \phi _{1}\ > 0\} \\ D_{3}& =& \{{\mathbf{x}},\quad \phi _{3} \leq 0\quad \mbox{ and }\quad \phi _{1}\ > 0\quad \mbox{ and }\quad \phi _{2}\ > 0\} \\ D_{4}& =& \{{\mathbf{x}},\quad \phi _{1} > 0\quad \mbox{ and }\quad \phi _{2}\ > 0\quad \mbox{ and }\quad \phi _{3}\ > 0\}.{}\end{array}$$
(28)

In the following, let n = 4 be the number of lithofacies. In this model, a point in the reservoir corresponds to the lithofacies \(D_{l}\) (l = 1, …, n − 1) if \(\phi _{l}\) has negative sign and all the other level set functions have positive sign. In addition, one lithofacies (which here is referred to as the “background” lithofacies with index l = n) corresponds to those points where none of the level set functions has a negative sign. Notice that typically this definition does not yield a complete covering of the whole domain Ω by the four (n) lithofacies; see the right image of Fig. 6. Those regions inside the domain where more than one level set function is negative are recognized as so-called critical regions and are introduced for providing a smooth evolution from the initial guess to the final reconstruction. Inside these critical regions, the permeability assumes values which are calculated as certain averages over values of the neighboring noncritical regions. They are indicated in the right image of Fig. 6 by using multiple subindices indicating which noncritical regions contribute to this averaging procedure. For more details regarding this model, and numerical experiments for the application of reservoir characterization, see [35].

3.3.3 A Modification of the Classical Level Set Technique for Describing Cracks or Thin Shapes

Cracks of finite thickness can be modeled by using two level set functions in a setup which amounts to a modification of the classical level set technique for binary media. For simplicity, assume that a crack or thin region of finite thickness is embedded in a homogeneous background. The classical level set technique described in section “The Basic Level Set Formulation for Binary Media” captures this situation in principle, since the crack can be interpreted as a simple shape (with possibly complicated topology) embedded in a homogeneous background. However, when it comes to shape evolution for such a crack-like structure, it is difficult to maintain a fixed thickness of the thin shape following the classical shape evolution scheme. This is so because the classical shape evolution applies an individually calculated velocity field value in the normal direction at each point of the shape boundary, so that the thickness of the thin region is in general not maintained. For crack evolution, the deformations of adjacent boundary points need to be coordinated in order to maintain the thickness of the crack during the entire shape evolution; see Fig. 9.

A modified version of the classical level set technique has been proposed in [4, 77] which uses two level set functions for modeling crack propagation and crack reconstruction in this sense. Here, a small neighborhood (narrowband) of the zero level set of the first level set function defines the general outline of the crack, whereas the second level set function selects those parts of this band structure which in fact contribute to the possibly disconnected crack topology.

In more detail, let a continuously differentiable level set function ϕ 1 be given with zero level set

$$\displaystyle\begin{array}{rcl} \Gamma _{\phi _{1}}& =& \{{\mathbf{x}} \in \Omega \;:\;\phi _{1}({\mathbf{x}}) = 0\}.{}\end{array}$$
(29)

The normal n to \(\Gamma _{\phi _{1}}\) is given by (61) and points in the direction where \(\phi _{1}({\mathbf{x}}) \geq 0\). A (connected or disconnected) “ideal” crack (i.e., of thickness zero) with finite length, completely contained inside Ω, is constructed by introducing a second level set function ϕ 2 which selects one or more parts of the line \(\Gamma _{\phi _{1}}\); see Fig. 7. This second level set function defines the region

$$\displaystyle{ B =\{ {\mathbf{x}} \in \Omega \;:\;\phi _{2}({\mathbf{x}}) \leq 0\}. }$$
(30)
Fig. 7

Multiple level set representation for modeling a disconnected crack. The zero level set of the first level set function defines the potential outline of the crack, of which the second level set function selects those parts that are actually occupied by the crack. The “ideal” crack has vanishing thickness, whereas the “real” crack modeled by the level set technique has a small finite thickness and is practically obtained by a narrow band technique

The “ideal” crack is then defined as a collection of finite subintervals of \(\Gamma _{\phi _{1}}\)

$$\displaystyle{ S[\phi _{1},\phi _{2}] = \Gamma _{\phi _{1}} \cap B. }$$
(31)

An ideal “insulating” crack S (of thickness zero) is then supplemented with a vanishing electrical current condition across this set S[ϕ 1, ϕ 2]. However, in the simplified level set model, it is not the ideal crack that is considered, but cracks of finite thickness 2δ > 0 with known conductivity \(b_{i}\) inside the crack and \(b_{e}\) outside it. Moreover, in the insulating case, it is assumed that \(b_{i} \ll b_{e}\). In this model, a small neighborhood of \(\Gamma _{\phi _{1}}\) is introduced as

$$\displaystyle{ \Gamma _{\phi _{1}}^{\delta } =\{ \mathbf{y} \in \Omega: \mathbf{y} = {\mathbf{x}} -\tau \mathbf{n}({\mathbf{x}}),\vert \tau \vert <\delta,{\mathbf{x}} \in \Gamma _{\phi _{1}}\}, }$$
(32)

and the above-defined “ideal crack” S is associated now with a “real crack” counterpart

$$\displaystyle{ S_{\delta } = \Gamma _{\phi _{1}}^{\delta } \cap B. }$$
(33)

The conductivity distribution is

$$\displaystyle{ b({\mathbf{x}}) = \left \{\begin{array}{l} b_{i}\quad \mbox{ for}\quad {\mathbf{x}} \in S_{\delta } \\ b_{e}\quad \mbox{ otherwise} \end{array} \right. }$$
(34)

in the domain Ω. Certainly, the real crack can also alternatively be defined by

$$\displaystyle{ \tilde{S}_{\delta } =\{ \mathbf{y} \in \Omega: \mathbf{y} = {\mathbf{x}} -\tau \mathbf{n}({\mathbf{x}}),\,\vert \tau \vert <\delta,\,{\mathbf{x}} \in S\}, }$$
(35)

which would slightly change the shape of the crack at the crack tips. Here, the form (33) is preferred. For the numerical treatment, see [4, 77].
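
The following sketch illustrates the two-level-set crack model (30)–(34) on a pixel grid, under the additional assumption that ϕ 1 is (approximately) a signed distance function, so that the band (32) can be approximated by {|ϕ 1| < δ}; the crack geometry, the half thickness δ, and the conductivity values are illustrative choices.

```python
# Sketch of the crack representation (30)-(34) with two level set functions,
# assuming phi1 is a signed distance function to its zero level set.
import numpy as np

nx = ny = 200
Y, X = np.meshgrid(np.linspace(0, 1, ny), np.linspace(0, 1, nx), indexing="ij")

phi1 = Y - 0.5                                  # zero level set: the line y = 0.5
phi2 = np.minimum(np.abs(X - 0.25) - 0.10,      # selects x in [0.15, 0.35] ...
                  np.abs(X - 0.70) - 0.10)      # ... and x in [0.60, 0.80]

delta = 0.01                                    # half thickness of the 'real' crack
b_i, b_e = 1e-3, 1.0                            # insulating crack: b_i << b_e

crack = (np.abs(phi1) < delta) & (phi2 <= 0.0)  # approximation of S_delta in (33)
b = np.where(crack, b_i, b_e)                   # conductivity map (34)
print("crack pixels:", np.count_nonzero(crack))
```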

4 Cost Functionals and Shape Evolution

One important technique for creating images with interfaces satisfying certain criteria is shape evolution, more specifically, interface and profile evolution. The general goal is to start with a set of shapes and profiles as initial guess and then let both shapes and profiles evolve according to appropriate evolution laws in order to improve the initial guess with increasing artificial evolution time. The focus in the following will be on shape evolution, since evolution laws for interior profiles largely follow classical and well-known concepts. Evolution of a shape or an interface can be achieved either by defining a velocity field on the domain Ω which deforms the boundaries of this shape or by defining evolution laws directly for the level set functions representing the shape. Some of these techniques will be presented next.

4.1 General Considerations

In many applications, images need to be evaluated for verifying their usefulness or merit in a given situation. This evaluation is usually based on a number of criteria, among them the ability of the image (in its correct interpretation) to reproduce the physically measured data (its data fitness). Other criteria include consistency with any additionally available prior knowledge on the given situation or the closeness of the image to a set of reference images. In many cases, some form of merit function (often in terms of a properly defined cost functional) is defined whose value is intended to indicate the usefulness of the image in a given application. However, sometimes this decision is made based on visual inspection only.

In general, during this evaluation process, a family of images is created and the merit of each of these images is assessed. Then, one or more of these images are selected. Let \(\left (b^{(1)},\ldots,b^{(\underline{i})},\phi ^{(1)},\ldots,\phi ^{(\underline{j})}\right )\) be a level set representation for the class of images to be considered. Then, creating this family of images can be described either in a continuous way by an artificial time evolution

$$\displaystyle{\left (b^{(1)}(t),\ldots,b^{(\underline{i})}(t),\phi ^{(1)}(t),\ldots,\phi ^{(\underline{j})}(t)\right ),\quad t \in [0,t_{\mbox{ max}}],}$$

with an artificial evolution time t or in a discrete way

$$\displaystyle{\left (b_{k}^{(1)},\ldots,b_{ k}^{(\underline{i})},\phi _{ k}^{(1)},\ldots,\phi _{ k}^{(\underline{j})}\right ),\quad k = 1,\ldots,\underline{k},}$$

with a counting index k. Usually these images are created in a sequential manner, using evolution laws

$$\displaystyle{ \frac{d} {\mathit{dt}}\left (b^{(1)}(t),\ldots,b^{(\underline{i})}(t),\phi ^{(1)}(t),\ldots,\phi ^{(\underline{j})}(t)\right ) = f(t),}$$

with a multicomponent forcing term f(t) or update formulas

$$\displaystyle{\left (b_{k+1}^{(1)},\ldots,b_{ k+1}^{(\underline{i})},\phi _{ k+1}^{(1)},\ldots,\phi _{ k+1}^{(\underline{j})}\right ) = F_{ k}\left (b_{k}^{(1)},\ldots,b_{ k}^{(\underline{i})},\phi _{ k}^{(1)},\ldots,\phi _{ k}^{(\underline{j})}\right )}$$

with update operators F k . These evolution laws and update operators can also be defined on ensembles of images, which allows for statistical evaluation of each ensemble during the evaluation process. Any arbitrarily defined evolution law and set of update operators yield a family of images which can be evaluated, but typically those are preferred which point into a descent direction of some predefined cost functional. Some choices of such cost functionals will be discussed in the following.

4.2 Cost Functionals

In general, a cost functional can consist of various components, typically combined in an additive or multiplicative manner. Popular components for an image model \(\underline{b} = \left (b^{(1)},\ldots,b^{(\underline{i})}\right )\) and \(\underline{\phi }= \left (\phi ^{(1)},\ldots,\phi ^{(\underline{j})}\right )\) are:

  1.

    Data misfit terms \(\mathcal{J}_{{\mbox{ data}}}(\underline{b},\underline{\phi })\)

  2.

    Terms measuring closeness to a prior model inside each subdomain \(\mathcal{J}_{{\mbox{ prior}}}(\underline{b},\underline{\phi })\)

  3.

    Terms enforcing geometric constraints on the interfaces \(\mathcal{J}_{{\mbox{ geom}}}(\underline{b},\underline{\phi })\)

In (1), by far the most popular data misfit term is the least squares misfit cost functional, which, in general, is given as an expression of the form

$$\displaystyle{ \mathcal{J}_{{\mbox{ data}}}(\underline{b},\underline{\phi }) = \frac{1} {2}\left \|\mathcal{A}(\underline{b},\underline{\phi }) -\tilde{ g}\right \|^{2} = \frac{1} {2}\left \|u_{{M}}[\underline{b},\underline{\phi }] -\tilde{ g}\right \|^{2}, }$$
(36)

where \(\mathcal{A}(\underline{b},\underline{\phi })\) is the forward operator defined in (2) and \(u_{{M}}[\underline{b},\underline{\phi }]\) indicates the simulated data at the set of probing locations M for this guess. Other choices can be considered as well; see, for example, [37].

(2) corresponds to classical regularization techniques, applied to each subdomain, and is treated in many textbooks, such as [37, 71]. Therefore, it is not discussed in this chapter.

(3) has a long history in the shape optimization literature and in image processing applications. See, for example, [30, 87]. A few concepts are presented in Sect. 5.

4.3 Transformations and Velocity Flows

The first technique discussed here is shape evolution by transformations and velocity flows. This concept has been inspired by applications in continuum mechanics. Consider a (bounded or unbounded) domain \(\Omega \subset \mathbb{R}^{n}\) and a shape \(D \subset \Omega \) with boundary \(\partial D\), which, as usual, is denoted by \(\Gamma \). Let a smooth vector field \(\mathbf{v}: \mathbb{R}^{n} \rightarrow \mathbb{R}^{n}\) be given with \(\langle \mathbf{v},\mathbf{n}\rangle = 0\) on \(\partial \Omega \). A family of transformations S t is then defined by

$$\displaystyle{ S_{t}({\mathbf{x}}) = {\mathbf{x}} + t\mathbf{v}({\mathbf{x}}) }$$
(37)

for all \({\mathbf{x}} \in \Omega \). In short, \(S_{t} = I + t\mathbf{v}\), where I stands for the identity map. This defines, for each point (“particle”) X in the domain, a propagation law prescribed by the ordinary differential equation

$$\displaystyle\begin{array}{rcl} \dot{{\mathbf{x}}}(t,{\mathbf{X}})& =& \mathbf{V}(t,{\mathbf{x}}(t,{\mathbf{X}})),{}\end{array}$$
(38)
$$\displaystyle\begin{array}{rcl} {\mathbf{x}}(0,{\mathbf{X}})& =& {\mathbf{X}}{}\end{array}$$
(39)

with the specific velocity choice

$$\displaystyle{ \mathbf{V}(t,{\mathbf{x}}(t,{\mathbf{X}})) = \mathbf{v}({\mathbf{X}}). }$$
(40)

Physically, it corresponds to the situation where each point X of the domain travels with constant speed along a straight line which is defined by its initial velocity vector v(X). Notice that, using (37), the definition (40) can also be written in a slightly more abstract fashion as

$$\displaystyle{ \mathbf{V}(t,{\mathbf{x}}) = \frac{\partial } {\partial t}S_{t}({\mathbf{x}}) = \left ( \frac{\partial } {\partial t}S_{t}\right ) \circ S_{t}^{-1}({\mathbf{x}}). }$$
(41)

In fact, it turns out that the ideas in the above example can be considerably generalized from the specific case (40) to quite arbitrary smooth vector fields V(t, x) describing smooth families of transformations T t (X). The generating vector field V(t, x) is often called the “velocity field.” This can be done as follows.

Let an arbitrary smooth family of transformations T t (X) be given which maps every point X of the domain to the point x(t, X) = T t (X) at time t. The propagation of the point X over time t is again described by the ordinary differential equations (38) and (39) where the velocity V is defined by

$$\displaystyle{ \mathbf{V}(t,{\mathbf{x}}) = \left ( \frac{\partial } {\partial t}T_{t}\right ) \circ T_{t}^{-1}({\mathbf{x}}). }$$
(42)

Now the propagation of points is not restricted anymore to straight lines but can be quite arbitrary. Conversely, a given smooth vector field V(t, x) gives rise to a family of transformations T t (X) via the differential equations (38) and (39), where every point \({\mathbf{X}} \in \Omega \) is mapped by T t to the solution x(t, X) of (38), (39) at time t, i.e., T t (X) = x(t, X). For more details on this duality of transformations and velocity flows, see the well-known monographs [30, 87].
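
As a small numerical illustration of this duality, the following sketch propagates sampled boundary points of an initial shape by integrating (38) and (39) with explicit Euler steps for an assumed rotational velocity field; the field, the step size, and the initial shape are arbitrary choices.

```python
# Sketch of a velocity flow (38)-(42): points are propagated by integrating
# dx/dt = V(t, x), which induces the family of transformations T_t.
import numpy as np

def V(t, x):
    """A smooth (here time-independent) velocity field: rigid rotation about (0.5, 0.5)."""
    return np.stack([-(x[..., 1] - 0.5), x[..., 0] - 0.5], axis=-1)

def transform(points, t_max, n_steps=100):
    """Approximate x(t_max, X) = T_{t_max}(X) by explicit Euler integration of (38)-(39)."""
    x, t, dt = points.copy(), 0.0, t_max / n_steps
    for _ in range(n_steps):
        x = x + dt * V(t, x)     # x(t + dt) ~ x(t) + dt * V(t, x(t))
        t += dt
    return x

# propagate a sampled circle (the boundary of an initial shape D_0)
theta = np.linspace(0.0, 2.0 * np.pi, 64, endpoint=False)
boundary0 = np.stack([0.5 + 0.2 * np.cos(theta), 0.5 + 0.2 * np.sin(theta)], axis=-1)
boundary_t = transform(boundary0, t_max=np.pi / 4)   # boundary of D_t = T_t(D_0)
```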

Notice that the numerical treatment of such a velocity flow in the level set framework leads to a Hamilton–Jacobi-type equation. Some remarks regarding this link are given in section “The Level Set Framework for Shape Evolution.”

4.4 Eulerian Derivatives of Shape Functionals

Given the framework defined in section “Transformations and Velocity Flows,” the goal is now to define transformations and velocity flows which point into a descent direction for a given cost functional. Some useful concepts on how to obtain such descent directions are discussed here.

Let D = D 0 be a shape embedded in the domain at time t = 0. When the points of this domain start moving under the propagation laws discussed above, the interior points of the shape, the boundary points, and the exterior points all move, and therefore the shape deforms. Denote the shape at time t by D t = T t (D 0), where as before T t is the family of transformations which corresponds to a given velocity field V(t, x). Assume furthermore that a cost functional \(\mathcal{J} ({\mathbf{x}},t,D_{t},\ldots )\) is given which depends (among others) upon the current shape D t . Deformation of the shape entails a change of this cost. The so-called shape sensitivity analysis of structural optimization aims at quantifying these changes in the cost due to a given velocity flow (or family of transformations) in order to determine suitable descent flows.

Given a vector field V(t, x), the Eulerian derivative of the cost functional \(\mathcal{J} (D_{t})\) at time t = 0 in the direction V is defined as the limit

$$\displaystyle{ d\mathcal{J} (D,\mathbf{V}) =\lim _{t\downarrow 0}\frac{\mathcal{J} (D_{t}) -\mathcal{J} (D)} {t}, }$$
(43)

if this limit exists. The functional \(\mathcal{J} (D_{t})\) is shape differentiable (or simply differentiable) if the Eulerian derivative \(d\mathcal{J} (D,\mathbf{V})\) exists for all directions V and furthermore the mapping \(\mathbf{V} \rightarrow d\mathcal{J} (D,\mathbf{V})\) is linear and continuous (in appropriate function spaces). It is shown in [30, 87] that if \(\mathcal{J} (D)\) is shape differentiable, there exists a distribution G(D) which is concentrated (supported) on \(\Gamma = \partial D\) such that

$$\displaystyle{ d\mathcal{J} (D,\mathbf{V}) = \left \langle G(D),\mathbf{V}(0)\right \rangle. }$$
(44)

This distribution G is the shape gradient of \(\mathcal{J}\) in D, which is a vector distribution. More specifically, let \(\varpi _{\Gamma }\) denote the trace (or restriction) operator on the boundary \(\Gamma \). Then, the Hadamard–Zolésio structure theorem states that (under certain conditions) there exists a scalar distribution g such that the shape gradient G can be written in the form \(G =\varpi _{ \Gamma }^{{\ast}}(g\mathbf{n})\), where \(\varpi ^{{\ast}}\) is the transpose of the trace operator at \(\Gamma \) and where n is the normal to \(\Gamma \). For more details, see again [30, 87].

4.5 The Material Derivative Method

A useful concept for calculating Eulerian derivatives of cost functionals is given by the so-called material and shape derivatives of the states u. In inverse problems applications, these states u are typically the solutions of the PDEs (IEs) which model the probing fields and which depend in one way or another on the shape D.

Let as before V be a smooth vector field with \(\langle \mathbf{V},\mathbf{n}\rangle = 0\) on \(\partial \Omega \), and let T t (V) denote the corresponding family of transformations. Moreover, let u = u[D t ] be a state function (of some Sobolev space) which depends on the shape \(D_{t} \subset \Omega \) (denote as before D 0 = D). The material derivative \(\dot{\mathbf{u}}[D,\mathbf{V}]\) of u in the direction V is defined as

$$\displaystyle{ \dot{\mathbf{u}}[D,\mathbf{V}] =\lim _{t\downarrow 0}\frac{\mathbf{u}[D_{t}] \circ T_{t}(\mathbf{V}) -\mathbf{u}[D]} {t}, }$$
(45)

or

$$\displaystyle{ \dot{\mathbf{u}}[D,\mathbf{V}]({\mathbf{x}}) =\lim _{t\downarrow 0}\frac{\mathbf{u}[D_{t}](T_{t}({\mathbf{x}})) -\mathbf{u}[D]({\mathbf{x}})} {t} \quad \mbox{ for}\;{\mathbf{x}} \in \Omega, }$$
(46)

where the square brackets in the notation indicate the dependence of the states and derivatives on the shape D t and/or on the vector field V. The material derivative corresponds to a Lagrangian point of view describing the evolution of the points in a moving coordinate system, e.g., located in the point x(t, X) = T t (X).

The shape derivative \(\mathbf{u}^{{\prime}}[D,\mathbf{V}]\) of u in the direction V corresponds, in contrast, to an Eulerian point of view, observing the evolution from a fixed coordinate system, e.g., located in the point X. It is defined as

$$\displaystyle{ \mathbf{u}^{{\prime}}[D,\mathbf{V}] =\lim _{ t\downarrow 0}\frac{\mathbf{u}[D_{t}] -\mathbf{u}[D]} {t}, }$$
(47)

or

$$\displaystyle{ \mathbf{u}^{{\prime}}[D,\mathbf{V}]({\mathbf{x}}) =\lim _{ t\downarrow 0}\frac{\mathbf{u}[D_{t}]({\mathbf{x}}) -\mathbf{u}[D]({\mathbf{x}})} {t} \quad \mbox{ for}\;{\mathbf{x}} \in \Omega. }$$
(48)

The shape derivative and the material derivative are closely related to each other. It can be shown that

$$\displaystyle{ \mathbf{u}^{{\prime}}[D,\mathbf{V}] =\dot{ \mathbf{u}}[D,\mathbf{V}] -\nabla (\mathbf{u}[D]) \cdot \mathbf{V}(0) }$$
(49)

provided that these quantities exist and are well defined. Subtracting \(\nabla (\mathbf{u}[D]) \cdot \mathbf{V}(0)\) in (49) from the material derivative makes sure that the shape derivative actually becomes zero in the special case that the states u do not depend on the shape D. The material derivative usually does not vanish in these situations.

4.6 Some Useful Shape Functionals

To become more specific, some useful examples of shape functionals which have been applied to shape inverse problems are provided in the following.

  1.

    Define for a given function ζ the shape integral

    $$\displaystyle{ \mathcal{J}_{1}(D) =\int _{\Omega }\chi _{D}({\mathbf{x}})\zeta ({\mathbf{x}})d{\mathbf{x}} =\int _{D}\zeta ({\mathbf{x}})d{\mathbf{x}} }$$
    (50)

    where χ D is the characteristic function for the domain D. Then the Eulerian derivative is given by

    $$\displaystyle{ d\mathcal{J}_{1}(D,\mathbf{V}) =\int _{D}{\mathrm{div}}\left (\zeta \mathbf{V}(0)\right )d{\mathbf{x}} =\int _{\Gamma }\zeta \langle \mathbf{V}(0),\mathbf{n}\rangle _{\mathbb{R}^{n}}d\Gamma. }$$
    (51)
  2.

    Consider the shape functional

    $$\displaystyle{ \mathcal{J}_{2}(D) =\int _{\Gamma }\zeta ({\mathbf{x}})d\Gamma }$$
    (52)

    for a sufficiently smooth function ζ defined on Ω such that the traces on \(\Gamma \) exist and are integrable. The tangential divergence \({\mathrm{div}}_{\Gamma }\mathbf{V}\) of the vector field V at the boundary \(\Gamma \) is defined as

    $$\displaystyle{ {\mathrm{div}}_{\Gamma }\mathbf{V} = \left.\left ({\mathrm{div}}\mathbf{V} -\langle \mathbf{D}\mathbf{V} \cdot \mathbf{n},\,\mathbf{n}\rangle \right )\right \vert _{\Gamma } }$$
    (53)

    where D V denotes the Jacobian of V. Then,

    $$\displaystyle{ d\mathcal{J}_{2}(D,\mathbf{V}) =\int _{\Gamma }\left (\langle \nabla \zeta,\,\mathbf{V}(0)\rangle +\zeta {\mathrm{div}}_{\Gamma }\mathbf{V}(0)\right )d\Gamma }$$
    (54)

    Let \(\mathcal{N}\) be an extension of the normal vector field n on \(\Gamma \) to a local neighborhood of \(\Gamma \). Then, the mean curvature κ of \(\Gamma \) is defined as \(\kappa = {\mathrm{div}}_{\Gamma }\mathcal{N}\vert _{\Gamma }\). With this, \(d\mathcal{J}_{2}(D,\mathbf{V})\) admits the alternative representation

    $$\displaystyle{ d\mathcal{J}_{2}(D,\mathbf{V}) =\int _{\Gamma }\left ( \frac{\partial \zeta } {\partial n}+\zeta \kappa \right )\langle \mathbf{V}(0),\mathbf{n}\rangle \,d\Gamma }$$
    (55)
  3.

    A useful example linking the shape derivative of the states to the Eulerian derivative of a cost functional is

    $$\displaystyle{ \mathcal{J}_{3}(D) =\int _{D}\mathbf{u}[D]d{\mathbf{x}} }$$
    (56)

    which depends on the shape D via the states u[D]. Its Eulerian derivative is [30, 87]

    $$\displaystyle{ d\mathcal{J}_{3}(D,\mathbf{V}) =\int _{D}\mathbf{u}^{{\prime}}[D,\mathbf{V}]d{\mathbf{x}} +\int _{ \Gamma }\mathbf{u}[D]\langle \mathbf{V}(0),\,\mathbf{n}\rangle _{\mathbb{R}^{n}}\,d\Gamma. }$$
    (57)
  4.

    Consider a cost functional

    $$\displaystyle{ \mathcal{J}_{4}(D) =\int _{\Gamma }\zeta (\Gamma )d\Gamma }$$
    (58)

    where ζ is defined only on the shape boundary \(\Gamma \). Then the characterization (49) cannot be used directly, since \(\nabla \zeta \cdot \mathbf{V}(0)\) is not well defined. In this case, the shape derivative is defined as

    $$\displaystyle{ \zeta ^{{\prime}}[\Gamma,\mathbf{V}] =\dot{\zeta } [\Gamma,\mathbf{V}] -\nabla _{ \Gamma }(\zeta [\Gamma ]) \cdot \mathbf{V}(0), }$$
    (59)

    \(\nabla _{\Gamma }\) being the gradient along the boundary \(\Gamma \) of the shape (chosen such that \(\nabla \zeta = \nabla _{\Gamma }\zeta + \frac{\partial \zeta } {\partial \mathbf{n}}\mathbf{n}\) whenever all these quantities are well defined). Then, the Eulerian derivative of the cost functional \(\mathcal{J}_{4}(D)\) can be characterized as

    $$\displaystyle{ d\mathcal{J}_{4}(D,\mathbf{V}) =\int _{\Gamma }\zeta ^{{\prime}}[\Gamma,\mathbf{V}]d\Gamma +\int _{ \Gamma }\kappa \zeta \langle \mathbf{V}(0),\mathbf{n}\rangle _{\mathbb{R}^{n}}\,d\Gamma }$$
    (60)

    where again κ denotes the mean curvature on \(\Gamma \).

4.7 The Level Set Framework for Shape Evolution

So far, shape evolution has been discussed independently of its representation by a level set technique. Any of the abovementioned shape evolutions can practically be described by employing a level set representation of the shapes.

First, some convenient representations of geometric quantities in the level set framework are listed:

  1.

    The outward normal direction [74, 85] is given by

    $$\displaystyle{ \mathbf{n}({\mathbf{x}}) = \frac{\nabla \phi } {\vert \nabla \phi \vert }. }$$
    (61)
  2.

    The local curvature \(\kappa ({\mathbf{x}})\) of ∂ D, being the divergence of the normal field n(x), is

    $$\displaystyle{ \kappa ({\mathbf{x}}) = \nabla \cdot \mathbf{n}({\mathbf{x}}) = \nabla \cdot \left ( \frac{\nabla \phi } {\vert \nabla \phi \vert }\right ). }$$
    (62)
  3.

    The following relation is often useful

    $$\displaystyle{ \delta (\phi ) = \frac{\delta _{\partial D}({\mathbf{x}})} {\vert \nabla \phi ({\mathbf{x}})\vert } }$$
    (63)

    where δ ∂ D is the n-dimensional Dirac delta distribution concentrated on ∂ D.

Notice that the right-hand sides of (61) and (62) make sense at every point of the domain Ω where the level set function ϕ is sufficiently smooth, giving rise to a natural extension of these quantities from the boundary ∂ D to a local neighborhood.

Assume now that a sufficiently smooth flow field V(x, t) is given and that a shape D is represented by the continuously differentiable level set function ϕ with \(\vert \nabla \phi \vert \neq 0\) at the boundary of the shape. Then, the deformation of the shape due to the flow field V(x, t) in the level set framework can be obtained as follows.

Since the velocity fields are assumed to be sufficiently smooth, a boundary point x remains on the boundary ∂D(t) during the evolution of the shape. Let ϕ(x, t) denote the family of level set functions describing the shape at each time of the evolution. Differentiating ϕ(x(t), t) = 0 with respect to t yields

$$\displaystyle{ \frac{\partial \phi } {\partial t} + \nabla \phi \cdot \frac{d{\mathbf{x}}} {dt} = 0. }$$
(64)

Identifying V(x, t) with \(\frac{d{\mathbf{x}}} {dt}\) and using (61), one arrives at

$$\displaystyle{ \frac{\partial \phi } {\partial t} + \vert \nabla \phi \vert \,\mathbf{V}({\mathbf{x}},t) \cdot \mathbf{n}({\mathbf{x}},t) = 0. }$$
(65)

Defining the normal velocity as

$$\displaystyle{ F({\mathbf{x}},t) = \mathbf{V}({\mathbf{x}},t) \cdot \mathbf{n}({\mathbf{x}},t) }$$
(66)

the Hamilton–Jacobi-type equation for describing the evolution of the level set function follows as

$$\displaystyle{ \frac{\partial \phi } {\partial t} + F({\mathbf{x}},t) \cdot \vert \nabla \phi \vert = 0. }$$
(67)
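
As an illustration, the following minimal sketch advances a level set function by one explicit time step of (67), using a first-order Godunov-type upwind discretization of F|∇ϕ|; the grid, the initial level set function, and the constant normal speed are hypothetical placeholders.

```python
import numpy as np

def upwind_step(phi, F, dt, h):
    """One explicit upwind step of  phi_t + F |grad(phi)| = 0,  cf. (67)."""
    # one-sided differences; np.roll wraps periodically at the domain boundary,
    # which is adequate as long as the shape stays away from the boundary of Omega
    dxm = (phi - np.roll(phi, 1, axis=0)) / h   # backward difference, x-direction
    dxp = (np.roll(phi, -1, axis=0) - phi) / h  # forward difference, x-direction
    dym = (phi - np.roll(phi, 1, axis=1)) / h
    dyp = (np.roll(phi, -1, axis=1) - phi) / h
    # Godunov upwinding: the combination of differences depends on the sign of F
    grad_plus = np.sqrt(np.maximum(dxm, 0)**2 + np.minimum(dxp, 0)**2 +
                        np.maximum(dym, 0)**2 + np.minimum(dyp, 0)**2)
    grad_minus = np.sqrt(np.minimum(dxm, 0)**2 + np.maximum(dxp, 0)**2 +
                         np.minimum(dym, 0)**2 + np.maximum(dyp, 0)**2)
    return phi - dt * (np.maximum(F, 0) * grad_plus + np.minimum(F, 0) * grad_minus)

# hypothetical example: a circular shape expanding with unit normal speed F = 1
n, h = 128, 1.0 / 128
x, y = np.meshgrid(np.linspace(0, 1, n), np.linspace(0, 1, n), indexing="ij")
phi = np.sqrt((x - 0.5)**2 + (y - 0.5)**2) - 0.25   # signed distance, negative inside D
for _ in range(40):
    phi = upwind_step(phi, F=np.ones_like(phi), dt=0.25 * h, h=h)
```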

5 Shape Evolution Driven by Geometric Constraints

It is possible to define a shape evolution without any data misfit functional being involved. This type of shape evolution often occurs in applications of image processing or computational physics. For example, starting from an initial shape, the goal might be to define a shape evolution which reduces a cost assigned to the image with respect to one or more geometric quantities, typically encoded in a geometric cost functional. Based on the theory developed in Sect. 4, some useful expressions will be derived here for calculating such descent directions. The geometrically driven shape evolutions obtained in this way can also be used to add constraints or regularization to a shape evolution driven by data misfit, if desired. This is achieved in practice by adding appropriate geometric cost terms to the data misfit term and calculating descent directions for the combined cost.

5.1 Penalizing Total Length of Boundaries

Assume that \(\Gamma = \partial D\) is a smooth submanifold in Ω. The total length (or surface area) of \(\Gamma \) is defined as

$$\displaystyle{ \mathcal{J}_{{\mathrm{len}}\Gamma }(D) =\int _{\Gamma }d\Gamma =\int _{\Omega }\delta _{\partial D}({\mathbf{x}})\,d{\mathbf{x}}. }$$
(68)

Applying a flow generated by a smooth vector field V(x, t), Eq. (55) with ζ = 1 yields the corresponding change in the cost (68):

$$\displaystyle{ d\mathcal{J}_{{\mathrm{len}}\Gamma }(D,\mathbf{V}) =\int _{\Gamma }\kappa \,\langle \mathbf{V}(0),\mathbf{n}\rangle \,d\Gamma. }$$
(69)

If the shape D is represented by a continuously differentiable level set function ϕ, an alternative derivation can be given. First, using (63), write (68) in the form

$$\displaystyle{ \mathcal{J}_{{\mathrm{len}}\Gamma }(D(\phi )) =\int _{\Omega }\delta (\phi )\vert \nabla \phi ({\mathbf{x}})\vert \,d{\mathbf{x}}. }$$
(70)

Perturbing now \(\phi \rightarrow \phi +\psi\), a formal calculation (see, e.g., [92]) shows that the cost functional is perturbed by

$$\displaystyle{ \left \langle \frac{\partial \mathcal{J}_{{\mathrm{len}}\Gamma }} {\partial \phi },\,\psi \right \rangle =\int _{\Omega }\delta (\phi )\psi ({\mathbf{x}})\nabla \cdot \frac{\nabla \phi } {\vert \nabla \phi \vert }\,d{\mathbf{x}}. }$$
(71)

Therefore, using (62), one identifies

$$\displaystyle{ \frac{\partial \mathcal{J}_{{\mathrm{len}}\Gamma }} {\partial \phi } =\delta (\phi )\nabla \cdot \frac{\nabla \phi } {\vert \nabla \phi \vert } =\delta (\phi )\kappa }$$
(72)

where κ is now an extension (defined, e.g., by (62)) of the local curvature to a small neighborhood of \(\Gamma \). For both representations (69) and (72), minimizing the cost by a gradient method leads to curvature-driven flow equations, i.e., V(0) = −κ n. This curvature-dependent velocity has been widely used to regularize the computation of moving fronts via the level set method [48], as well as in the field of image processing [70], and has more recently also been introduced for regularizing inverse problems; see, e.g., [41, 79, 82].
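
To make the curvature-driven flow concrete, the sketch below (an illustrative addition with a hypothetical grid spacing) evaluates the extended curvature (62) by central differences and performs one explicit step of the flow with normal velocity F = −κ, i.e., ϕ_t = κ|∇ϕ| according to (67).

```python
import numpy as np

def curvature(phi, h, eps=1e-12):
    """Extended curvature kappa = div( grad(phi)/|grad(phi)| ), cf. (61)-(62)."""
    px, py = np.gradient(phi, h)           # central differences for grad(phi)
    norm = np.sqrt(px**2 + py**2) + eps    # regularized to avoid division by zero
    return np.gradient(px / norm, h, axis=0) + np.gradient(py / norm, h, axis=1)

def curvature_flow_step(phi, dt, h):
    """One explicit step of the curvature-driven flow (F = -kappa in (67))."""
    px, py = np.gradient(phi, h)
    return phi + dt * curvature(phi, h) * np.sqrt(px**2 + py**2)
```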

Two popular concepts related to the above shape evolution are the Mumford–Shah and the total variation functionals, which are frequently employed in image segmentation applications. This relationship is briefly described in the following.

The popular Mumford–Shah functional for image segmentation [70] contains, in addition to a fidelity term inside each region of the segmented image, a term which encourages shortening the total curve length of the interfaces. For piecewise constant binary media (see section “The Basic Level Set Formulation for Binary Media” with constant profiles in each region), this latter term can be written as

$$\displaystyle{ \mathcal{J}_{{{\mathrm{MS}}}} =\int _{\Omega }\vert \nabla H(\phi )\vert \,d{\mathbf{x}}. }$$
(73)

Taking into account that \(\nabla H(\phi ) = H^{{\prime}}(\phi )\nabla \phi =\delta (\phi )\vert \nabla \phi \vert \mathbf{n},\) it is seen that \(\mathcal{J}_{{{\mathrm{MS}}}} = \mathcal{J}_{{\mathrm{len}}\Gamma }(D(\phi ))\) as given in (70), which again yields the curvature-driven flow Eq. (72). For more details, see [26, 38, 95].

The total variation (TV) functional, on the other hand, can be written, again for the situation of piecewise constant binary media, as

$$\displaystyle{ \mathcal{J}_{{{\mathrm{TV}}}} =\int _{\Omega }\vert \nabla b(\phi )\vert \,d{\mathbf{x}} = \vert b_{e} - b_{i}\vert \,\int _{\Omega }\vert \nabla H(\phi )\vert \,d{\mathbf{x}}. }$$
(74)

Therefore, it coincides with the Mumford–Shah functional \(\mathcal{J}_{{{\mathrm{MS}}}}\) up to the factor \(\vert b_{e} - b_{i}\vert \). Roughly speaking, the TV functional (74) penalizes the product of the jump between different regions and the arc length of their interfaces, whereas the Mumford–Shah functional (73) penalizes only this arc length. For more information, refer to [25, 38].

5.2 Penalizing Volume or Area of Shape

It is again assumed that \(\Gamma = \partial D\) is a smooth submanifold in Ω. Define the total area (volume) of D as

$$\displaystyle{ \mathcal{J}_{{\mathrm{vol}}D}(D) =\int _{D}d{\mathbf{x}} =\int _{\Omega }\chi _{D}({\mathbf{x}})\,d{\mathbf{x}}, }$$
(75)

where the characteristic function \(\chi _{D}: \Omega \, \rightarrow \,\{ 0,1\}\) for a given shape D is defined as

$$\displaystyle{ \chi _{D}({\mathbf{x}}) = \left \{\begin{array}{c@{,\quad }l} 1,\quad &{\mathbf{x}} \in D\\ 0, \quad &{\mathbf{x}} \in \Omega \setminus D. \end{array} \right. }$$
(76)

Applying a flow generated by a smooth vector field V(x, t), Eqs. (50) and (51) with ζ = 1 yield

$$\displaystyle{ d\mathcal{J}_{{\mathrm{vol}}D}(D,\mathbf{V}) =\int _{D}{\mathrm{div}}\mathbf{V}(0)d{\mathbf{x}} =\int _{\Gamma }\langle \mathbf{V}(0),\mathbf{n}\rangle \,d\Gamma. }$$
(77)

Again, if the shape D is represented by a continuously differentiable level set function ϕ, an alternative derivation can be given. First, using the Heaviside function H, let us write (75) in the form

$$\displaystyle{ \mathcal{J}_{{\mathrm{vol}}D}(D) =\int _{\Omega }\,H(\phi )\,d{\mathbf{x}}. }$$
(78)

Perturbing as before \(\phi \rightarrow \phi +\psi\), it follows

$$\displaystyle{ \left \langle \frac{\partial \mathcal{J}_{{\mathrm{vol}}D}} {\partial \phi },\,\psi \right \rangle =\int _{\Omega }\delta (\phi )\psi ({\mathbf{x}})\,d{\mathbf{x}} }$$
(79)

such that

$$\displaystyle{ \frac{\partial \mathcal{J}_{{\mathrm{vol}}D}} {\partial \phi } =\delta (\phi ). }$$
(80)

In both formulations, a descent flow is given by a motion with constant speed in the negative direction of the normal n to the boundary \(\Gamma \), i.e., \(\mathbf{V}(0) = -\mathbf{n}\).
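
In a level set implementation, the two geometric gradients (72) and (80) are often combined into a single regularizing forcing term that is added to the data-driven update; the following sketch indicates one such combination, with hypothetical weights and a smoothed delta function chosen purely for illustration.

```python
import numpy as np

def delta_eps(phi, eps=1.5e-2):
    """A common smooth approximation of the Dirac delta delta(phi)."""
    return (eps / np.pi) / (phi**2 + eps**2)

def geometric_forcing(phi, h, lam_len=1.0, lam_vol=0.1):
    """Descent forcing -(lam_len*delta(phi)*kappa + lam_vol*delta(phi))
    for the weighted penalty lam_len*J_len + lam_vol*J_vol, cf. (72) and (80)."""
    px, py = np.gradient(phi, h)
    norm = np.sqrt(px**2 + py**2) + 1e-12
    kappa = np.gradient(px / norm, h, axis=0) + np.gradient(py / norm, h, axis=1)  # cf. (62)
    return -delta_eps(phi) * (lam_len * kappa + lam_vol)

# a regularizing gradient step then reads, for a hypothetical step size dt:
# phi = phi + dt * geometric_forcing(phi, h)
```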

6 Shape Evolution Driven by Data Misfit

An essential goal in the solution of inverse problems is to find an image which is able to reproduce the measured data in a certain sense. As far as interfaces are concerned, this gives rise to the need to find descent directions for shape evolution with respect to the data misfit functional. In the following, some concepts are presented which aim at providing these descent directions during the shape evolution. These concepts can be freely combined with the above-discussed concepts for shape evolution driven by geometric terms.

6.1 Shape Deformation by Calculus of Variations

Historically, the first approach applying a level set technique to the solution of an inverse problem [82] used concepts from the calculus of variations to calculate descent directions for the data misfit functional. In many applications, this approach is still a very convenient way of deriving evolution laws for shapes. In the following, the main ideas of this approach are briefly reviewed, following [82]. The goal is to obtain expressions for the deformation of already existing shapes according to a normal velocity field defined at the boundary of these shapes. Topological changes are not formally included in the considerations at this stage (even though they occur automatically when implementing the discussed schemes in a level set-based numerical framework). The formal treatment of topological changes is a topic of active current research and will be discussed briefly in section “Topological Derivatives.”

6.1.1 Least Squares Cost Functionals and Gradient Directions

Typically, suitable function spaces are needed for defining and calculating descent directions with respect to the data misfit cost functional. Without being very specific, in the following, the general notation P is used to denote the space of parameters b and, if not otherwise specified, Z to denote the space of measurements \(\tilde{g}\). For simplicity, these function spaces are considered to be appropriately chosen Hilbert or vector spaces. Certainly, other types of spaces can be used as well, which might lead to interesting variants of the described concepts.

Consider now the least squares cost functional

$$\displaystyle{ \mathcal{J} (b) = \frac{1} {2}\left \|\mathcal{R}(b)\right \|_{Z}^{2} = \frac{1} {2}\left \langle \,\mathcal{R}(b),\,\mathcal{R}(b)\,\right \rangle _{Z}, }$$
(81)

where \(\langle,\,\rangle _{Z}\) denotes the canonical inner product in data space Z. Assume that \(\mathcal{R}(b)\) admits the expansion

$$\displaystyle{ \mathcal{R}(b +\delta b) = \mathcal{R}(b) + \mathcal{R}^{{\prime}}(b)\delta b + O\left (\|\delta b\|_{ P}^{2}\right ), }$$
(82)

letting \(\|\;\|_{P}\) be the canonical norm in parameter space P, for a sufficiently small perturbation (variation) \(\delta b \in P\). The linear operator \(\mathcal{R}^{{\prime}}(b)\) (if it exists) is often called the Fréchet derivative of \(\mathcal{R}\). Plugging (82) into (81) yields the relationship

$$\displaystyle{ \mathcal{J} (b +\delta b) = \mathcal{J} (b) + \mathtt{Re}\left \langle \mathcal{R}^{{\prime}}(b)^{{\ast}}\mathcal{R}(b),\,\delta b\right \rangle _{ P} + O\left (\|\delta b\|_{P}^{2}\right ) }$$
(83)

where the symbol Re indicates the real part of the corresponding quantity. The operator \(\mathcal{R}^{{\prime}}(b)^{{\ast}}\) is the formal adjoint operator of \(\mathcal{R}^{{\prime}}(b)\) with respect to spaces Z and P:

$$\displaystyle{ \left \langle \mathcal{R}^{{\prime}}(b)^{{\ast}}g,\,\hat{b}\right \rangle _{ P} = \left \langle g,\,\mathcal{R}^{{\prime}}(b)\hat{b}\right \rangle _{ Z}\quad \mbox{ for all}\;\hat{b} \in P,\;g \in Z. }$$
(84)

The quantity

$$\displaystyle{ \mathbf{grad}_{b}\mathcal{J} = \mathcal{R}^{{\prime}}(b)^{{\ast}}\mathcal{R}(b) }$$
(85)

is called the gradient direction of \(\mathcal{J}\) at b.
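
A minimal finite-dimensional illustration of (81)–(85), assuming a linear forward model in which \(\mathcal{R}(b) = Ab -\tilde{ g}\) for some matrix A (so that the Fréchet derivative is A itself and its adjoint is the transpose); the matrix, data, and step size below are random placeholders.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 50))        # hypothetical linear forward operator R'(b)
g_tilde = rng.standard_normal(20)        # hypothetical measured data

def residual(b):
    return A @ b - g_tilde               # R(b)

def grad_J(b):
    return A.T @ residual(b)             # grad_b J = R'(b)* R(b), cf. (85)

# plain gradient descent on J(b) = 0.5 * ||R(b)||^2
b = np.zeros(50)
for _ in range(500):
    b = b - 5e-3 * grad_J(b)
```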

It is assumed that the operators \(\mathcal{R}^{{\prime}}(b)\) and \(\mathcal{R}^{{\prime}}(b)^{{\ast}}\) take into account the correct interface conditions at ∂ D, which is important when actually evaluating these derivatives in a “direct” or in an “adjoint” fashion. In many practical applications, the situation can occur that (formally) the fields need to be evaluated at interfaces where jumps occur. In these situations, appropriate limits can be considered. Alternatively, the tools developed in section “Shape Sensitivity Analysis and the Speed Method” can be applied there. The existence and special form of Fréchet derivatives \(\mathcal{R}^{{\prime}}(b)\) (and the corresponding shape derivatives) for parameter distributions b with discontinuities along interfaces are problem specific and beyond the scope of this chapter. Refer to the cited literature, for example, [9, 15, 49, 53, 54, 58, 78]. In many practical implementations, the interface \(\partial D\) is de facto replaced by a narrow transition zone with smoothly varying parameters, in which case the interface conditions disappear.

6.1.2 Change of b Due to Shape Deformations

Assume that every point x in the domain Ω moves by a small displacement y(x) and that the mapping \({\mathbf{x}} \rightarrow \mathbf{y}({\mathbf{x}})\) is sufficiently smooth, such that the basic structure of the shape D remains preserved. Then, the points located on the boundary \(\Gamma = \partial D\) will move to the new locations \({\mathbf{x}}^{{\prime}} = {\mathbf{x}} + \mathbf{y}({\mathbf{x}})\), and the boundary \(\Gamma \) will be deformed into the new boundary \(\Gamma ^{{\prime}} = \partial D^{{\prime}}\). Assume furthermore that the parameter distribution in Ω has the special form (4), such that it will change as well. In the following, the first goal is to quantify this change in the parameter distribution b(x) due to an infinitesimal deformation of this kind.

Consider the inner product of δ b with a test function f

$$\displaystyle{ \langle \delta b,\,f\rangle _{\Omega } =\int _{\Omega }\delta b({\mathbf{x}})\overline{f(x)}\,d{\mathbf{x}} =\int _{\mbox{ symdiff}(D,D^{{\prime}})}\delta b({\mathbf{x}})\overline{f({\mathbf{x}})}\,d{\mathbf{x}}, }$$
(86)

where the overline means “complex conjugate” and \(\mbox{ symdiff}(D,D^{{\prime}}) = (D \cup D^{{\prime}})\setminus (D \cap D^{{\prime}})\) is the symmetric difference of the sets D and D′ (see Fig. 8). Since the difference between D and D′ is infinitesimal, the area integral reduces to a line integral. Let n(x) denote the outward normal to ∂D at x. Then, the integral in (86) becomes

$$\displaystyle{ \langle \delta b,\,f\rangle _{\partial D} =\int _{\partial D}\Big(b_{i}({\mathbf{x}}) - b_{e}({\mathbf{x}})\Big)\mathbf{y}({\mathbf{x}}) \cdot \mathbf{n}({\mathbf{x}})\overline{f({\mathbf{x}})}\,\mathit{ds}({\mathbf{x}}), }$$
(87)

where ds(x) is the incremental arc length. Here it has been used that, in the limit, \(\delta b({\mathbf{x}}) = b_{i}({\mathbf{x}}) - b_{e}({\mathbf{x}})\) at the boundary point \({\mathbf{x}} \in \partial D\) due to (4). This yields the result

$$\displaystyle{ \delta b({\mathbf{x}}) =\varpi _{\partial D}\Big((b_{i}({\mathbf{x}}) - b_{e}({\mathbf{x}}))\,\mathbf{y}({\mathbf{x}}) \cdot \mathbf{n}({\mathbf{x}})\Big) }$$
(88)

where \(\varpi _{\partial D}\) is the n-dimensional restriction operator which restricts functions defined in Ω to the boundary ∂ D of the shape D (n = 2 or 3, usually). Therefore, δ b(x) is interpreted now as a surface measure on ∂ D. Using the n-dimensional Dirac delta distribution δ ∂ D concentrated on the boundary ∂ D of the shape D, (88) can be written in the form

$$\displaystyle{ \delta b({\mathbf{x}}) = (b_{i} - b_{e})\,\mathbf{y}({\mathbf{x}}) \cdot \mathbf{n}({\mathbf{x}})\,\delta _{\partial D}({\mathbf{x}}) }$$
(89)

which is a distribution defined on the entire domain Ω but concentrated on ∂ D where it has the same strength as the corresponding surface measure. Although, strictly speaking, they are different mathematical objects, they are identified in the following for simplicity. Compare (87) also to the classical shape or domain derivative as, for example, calculated in [49], focusing there on the effect of the infinitesimal change in the boundary of a scatterer on the far-field pattern of a scattering experiment.

Fig. 8 Deformation of shapes using calculation of small variations

6.1.3 Variation of Cost Due to Velocity Field v(x)

A popular approach for generating the small displacements y(x) (as discussed in section “Change of b Due to Shape Deformations”) which move the boundary \(\partial D\) is to assign to each point of the domain a velocity field v(x) and to let the points \({\mathbf{x}} \in \Omega \) move for a small artificial evolution time [0, τ] with constant velocity v(x). Then

$$\displaystyle{ \mathbf{y}({\mathbf{x}}) = \mathbf{v}({\mathbf{x}})\tau. }$$
(90)

Plugging this into (89) for t ∈ [0, τ], the corresponding change in the parameters follows as

$$\displaystyle{ \delta b({\mathbf{x}};t) = (b_{i} - b_{e})\,\mathbf{v}({\mathbf{x}}) \cdot \mathbf{n}({\mathbf{x}})t\,\delta _{\partial D}({\mathbf{x}}). }$$
(91)

Plugging expression (91) into (83) and neglecting terms of higher than linear order yields

$$\displaystyle\begin{array}{rcl} \mathcal{J} (b(t)) -\mathcal{J} (b(0))& =& \mathtt{Re}\left \langle \mathbf{grad}_{b}\mathcal{J},\,\delta b({\mathbf{x}};t)\right \rangle _{P} \\ & =& \mathtt{Re}\left \langle \mathbf{grad}_{b}\mathcal{J},\,(b_{i} - b_{e})\,\mathbf{v}({\mathbf{x}}) \cdot \mathbf{n}({\mathbf{x}})t\,\delta _{\partial D}({\mathbf{x}})\right \rangle _{P}{}\end{array}$$
(92)

or, in the limit \(t \rightarrow 0\), evaluating the Dirac delta distribution,

$$\displaystyle{ {\left. \frac{\partial {\mathcal{J}} (b)} {\partial t} \right\vert} _{t=0} = {\mathtt{Re}} {\int _{\partial D}}{{\mathbf{grad}}_{b}} {\mathcal{J}} \, \overline{({b_{i}} - {b_{e}})}{\mathbf{v}}({\mathbf{x}}) \cdot {\mathbf{n}}({\mathbf{x}}){ds}({\mathbf{x}}), }$$
(93)

where the overline means “complex conjugate” and \(\mathbf{grad}_{b}\mathcal{J}\) is defined in (85). Similar expressions will be derived further below using formal shape sensitivity analysis. (Compare, e.g., for the situation of TM-waves, the expression (97) calculated by using (93) with the analogous expressions (100) and (105) calculated by using formal shape sensitivity analysis.)

If a velocity field v(x) can be found such that \(\left.\frac{\partial \mathcal{J}(b)} {\partial t} \right \vert _{t=0} < 0\), then it is expected (for continuity reasons) that this inequality holds on a sufficiently small time interval [0, τ] and that therefore the total cost will be reduced during the artificial flow. This is the general strategy in most optimization-type approaches for solving the underlying inverse problem. See the brief discussion in section “Shape Evolution and Shape Optimization.”

Notice that only the normal component of the velocity field

$$\displaystyle{ F({\mathbf{x}}) = \mathbf{v}({\mathbf{x}}) \cdot \mathbf{n}({\mathbf{x}}) }$$
(94)

at the boundary ∂ D of the shape D is of relevance for the change in the cost (compare the remarks made already in section “The Level Set Framework for Shape Evolution”). This is because tangential components of v do not contribute to shape deformations. In a parameterized way of thinking, they only “re-parameterize” the existing boundary.

6.1.4 Example: Shape Variation for TM-Waves

An instructive example is given here in order to demonstrate the above concepts for a practical application. Consider TM-waves in a typical imaging situation, such as subsurface imaging or microwave breast screening as in the case study of section “Example 1: Microwave Breast Screening.” Assume for simplicity that the basic level set model of section “The Basic Level Set Formulation for Binary Media” is applied here. The cost functional measuring the mismatch between calculated data \(u_{{M}}[D]\) corresponding to the shape D and physically measured data \(\tilde{g}\) is defined as

$$\displaystyle{ \mathcal{J} (D) = \frac{1} {2}\left \|u_{{M}}[D] -\tilde{ g}\right \|_{L^{2}(M)}^{2}, }$$
(95)

where the calculated measurements \(u_{{M}}\) are given as the electric field values u(x) at the set of receiver locations M. Using a properly defined adjoint state z(x) (see, e.g., [30, 34, 71] for details on adjoint states), it can be shown by straightforward calculation that \(\mathcal{R}^{{\prime}}(b)^{{\ast}}\mathcal{R}(b)\) takes the form

$$\displaystyle{ \left (\mathbf{grad}_{b}\mathcal{J}\right )({\mathbf{x}}) = \overline{u({\mathbf{x}})z({\mathbf{x}})}, }$$
(96)

where u(x) denotes the solution of the forward problem and \(\mathbf{grad}_{b}\mathcal{J}\) is defined as in (85). Therefore, it follows that

$$\displaystyle{ {\left.\frac{\partial {\mathcal{J}} (b)} {\partial t} \right\vert} _{t=0} = {\mathtt{Re}} {\int _{\partial D}}u({\mathbf{x}})z({\mathbf{x}})({b_{i}} - {b_{e}}){\mathbf{v}}({\mathbf{x}}) \cdot {\mathbf{n}}({\mathbf{x}}){ds}({\mathbf{x}}), }$$
(97)

where it is used that the real parts of a complex number and of its complex conjugate are identical. Similar expressions based on adjoint field calculations can be derived for a large variety of applications; see, for example, [34, 39, 45, 71, 84, 89, 90].
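
Assuming that the forward field u and the adjoint field z have already been computed on the reconstruction grid (the forward and adjoint solvers are application specific and not shown), the descent velocity suggested by (97) can be assembled pointwise as in the following hypothetical sketch; it simply sets F = −Re[(b_i − b_e) u z] and restricts it to a narrow band around the current interface.

```python
import numpy as np

def descent_normal_velocity(u, z, b_i, b_e, phi, band_width):
    """Normal velocity F = -Re[(b_i - b_e) * u * z] suggested by (97),
    evaluated only in a narrow band |phi| < band_width around the zero level set."""
    F = -np.real((b_i - b_e) * u * z)              # pointwise descent direction
    return np.where(np.abs(phi) < band_width, F, 0.0)

# hypothetical usage, with u and z complex arrays from forward/adjoint solves:
# F = descent_normal_velocity(u, z, b_i=2.0 + 0.1j, b_e=1.0, phi=phi, band_width=3 * h)
# phi is then advanced with F as in (67), e.g., by an upwind step as sketched in
# section "The Level Set Framework for Shape Evolution"
```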

6.1.5 Example: Evolution of Thin Shapes (Cracks)

Another application of this technique has been presented in [4, 77] for finding cracks in a homogeneous material from boundary data; see section “Example 3: Crack Detection.” The evolution of cracks as defined in section “A Modification of the Classical Level Set Technique for Describing Cracks or Thin Shapes” requires the simultaneous consideration of two level set functions. Evolution of the first level set function amounts to displacement of the thin region (crack) in the transversal direction, whereas evolution of the second level set function describes the process of crack growth or shrinkage in the longitudinal direction, which comes with the option of crack splitting and merging. Descent directions for both level set functions can be calculated following the arguments presented above. It needs to be taken into account, however, that, due to the specific construction of a crack with finite thickness, deformation of the zero level set of the first level set function is associated with a displacement of the crack boundary at two adjacent locations, which both contribute to a small variation in the cost. See Fig. 9.

Fig. 9 Deformation of a thin shape (crack) using calculus of small variations

Assume that a small displacement is applied to the zero level set of the first level set function defining S in the notation of section “A Modification of the Classical Level Set Technique for Describing Cracks or Thin Shapes.” This is reflected by two contributions to the least squares data misfit cost, one from the deformation of \(S^{-}\) in Fig. 9 and the other from the deformation of \(S^{+}\) in Fig. 9. It follows that a descent velocity is now given as \(\mathbf{v}({\mathbf{x}}) = F_{\varphi _{1}}({\mathbf{x}})\mathbf{n}({\mathbf{x}})\) with

$$\displaystyle{ F_{\varphi _{1}}({\mathbf{x}}) = -(b_{i} - b_{e})\left [\mathbf{grad}_{b}\mathcal{J}\vert _{S^{+}} -\mathbf{grad}_{b}\mathcal{J}\vert _{S-}\right ]\quad \mbox{ on}\;S, }$$
(98)

with \(\mathbf{grad}_{b}\mathcal{J}\) being defined in (85). In (98), for each \({\mathbf{x}} \in \Gamma _{\phi _{1}}\), two adjacent points of \(\mathbf{grad}_{b}\mathcal{J}\vert _{S^{+}}\) and \(\mathbf{grad}_{b}\mathcal{J}\vert _{S^{-}}\), found in the normal direction to \(\Gamma _{\phi _{1}}\) at x, contribute to the value of \(F_{\varphi _{1}}({\mathbf{x}})\).

In a similar way, a descent direction with respect to the second level set function ϕ_2 can be obtained. Its detailed derivation depends slightly on how the crack tips are constructed in section “A Modification of the Classical Level Set Technique for Describing Cracks or Thin Shapes,” where two alternative choices are given. Overall, a descent velocity can be calculated following classical rules for those points x of \(\partial B\) (i.e., for those points of the zero level set of ϕ_2) which satisfy \({\mathbf{x}} \in \partial B \cap \Gamma _{\phi _{1}}^{\delta }\) or, alternatively, \({\mathbf{x}} \in \partial B \cap \Gamma _{\phi _{1}}\). Then, the obtained velocity field needs to be extended first to the remaining parts of \(\partial B\) and then to the rest of Ω. Notice that the specific form of \(\mathbf{grad}_{b}\mathcal{J}\) might differ slightly from the one given in (96) due to the different PDE which might be involved (depending on the application). For more details, refer to [4, 77].

6.2 Shape Sensitivity Analysis and the Speed Method

In this section, an alternative technique is presented for formally defining shape derivatives and modeling shape deformations driven by cost functionals. This theory, called shape sensitivity analysis, is quite general and powerful and is used heavily in various applications. Only a few of its concepts can be mentioned here, namely those employed when calculating descent directions with respect to data misfit cost functionals.

The tools presented here have been used and advanced significantly during the last twenty years in the framework of optimal shape design [30, 87]. With this powerful theory readily available, it is quite natural that these methods were applied early on to shape-based inverse problems with level sets. The theory of this section again concentrates mainly on modeling, in a formally accurate way, the deformation of already existing shapes. It does not incorporate topological changes. These will be discussed briefly in section “Topological Derivatives.”

6.2.1 Example: Shape Sensitivity Analysis for TM-Waves

Again the situation of inverse scattering by TM-waves is considered here. The discussion below closely follows the results presented in [64]. The main tools used here are the material and shape derivatives defined in section “The Material Derivative Method.”

The cost functional measuring the mismatch between calculated data \(u_{{M}}[D]\) corresponding to the shape D and physically measured data \(\tilde{g}\) is defined by (95). When perturbing the shape by a velocity field V(t, x), the electric field at the (fixed) probing line changes according to \(u \rightarrow u + u^{{\prime}}\), where u′ is the shape derivative defined in section “The Material Derivative Method.” Plugging this into (95) and neglecting terms of higher than linear order, it is verified that

$$\displaystyle{ d\mathcal{J} (D,\mathbf{V}) = \mathtt{Re}\int _{M}u^{{\prime}}({\mathbf{x}})\,\overline{(u_{{ M}} -\tilde{ g})}({\mathbf{x}})d{\mathbf{x}}. }$$
(99)

Now, the shape derivative u′ can be calculated by first computing the material derivative (also defined in section “The Material Derivative Method”) and then using one of the relationships between the material derivative and the shape derivative (see sections “The Material Derivative Method” and “Some Useful Shape Functionals”). Using also here an adjoint state z, the Eulerian derivative can be characterized and calculated as

$$\displaystyle{ d\mathcal{J} (D,\mathbf{V}) = \mathtt{Re}\int _{\Gamma }(b_{{\mathrm{int}}} - b_{{\mathrm{ext}}})u({\mathbf{x}})z({\mathbf{x}})\mathbf{V}(0,{\mathbf{x}}) \cdot \mathbf{n}\,d\Gamma. }$$
(100)

Notice that this is exactly the same result as we arrived at in (97). For more details, refer to [64].

6.2.2 Shape Derivatives by a Min–Max Principle

In order to avoid the explicit calculation of material and shape derivatives of the states with respect to the flow fields, an alternative technique can be used as reported in [29, 30, 78]. It is based on a reformulation of the derivative of a shape functional \(\mathcal{J} (D)\) with respect to time as the partial derivative of a saddle point (or a “min–max”) of a suitably defined Lagrangian. In the following, the basic features of this approach will be outlined, focusing in particular on the shape derivative for TM-waves.

Let again the cost functional \(\mathcal{J} (D(t))\) be defined as in (95) by

$$\displaystyle{ \mathcal{J} (D(t)) = \frac{1} {2}\left \|u_{{M}}[D(t)] -\tilde{ g}\right \|_{L^{2}(M)}^{2}. }$$
(101)

The goal is to write \(\mathcal{J} (D(t))\) in the form

$$\displaystyle{ \mathcal{J} (D(t)) =\min _{u}\max _{z}\mathcal{L}(t,u,z) }$$
(102)

for some suitably defined Lagrangian \(\mathcal{L}(t,u,z)\). Here and in the following, the complex nature of the forward fields u and the adjoint fields z is (partly) neglected in order to simplify notation (more rigorous expressions can be found in [78]). The Lagrangian \(\mathcal{L}(t,u,z)\) takes the form

$$\displaystyle\begin{array}{rcl} \mathcal{L}(t,u,z)& =& \frac{1} {2}\int _{M}\left \vert \int _{\Omega }\eta \left ({\mathbf{x}}^{{\prime}}\right )G_{ 12}\left ({\mathbf{x}},{\mathbf{x}}^{{\prime}}\right )u\left ({\mathbf{x}}^{{\prime}}\right )\,d{\mathbf{x}}^{{\prime}}-\tilde{ g}(x)\right \vert ^{2}dx \\ & & +\mathtt{Re}\int _{\Omega }\left (u\left ({\mathbf{x}}\right ) - u^{{\mathrm{inc}}}\left ({\mathbf{x}}\right ) -\int _{ \Omega }\eta \left ({\mathbf{x}}^{{\prime}}\right )G_{ 22}\left ({\mathbf{x}},{\mathbf{x}}^{{\prime}}\right )u\left ({\mathbf{x}}^{{\prime}}\right )d{\mathbf{x}}^{{\prime}}\right )\,z({\mathbf{x}})\,dx.{}\end{array}$$
(103)

Next, it can be shown that this Lagrangian has a unique saddle point, denoted by \((u^{{\ast}},z^{{\ast}})\), which is characterized by an optimality condition with respect to u and z. In fact, the uniqueness follows from the well-posedness and uniqueness of the solutions of the direct and adjoint state equations; see [31, 78]. The key observation is now that

$$\displaystyle{ \frac{d\mathcal{J}} {dt} = \frac{\partial } {\partial t}\left (\min _{u}\max _{z}\mathcal{L}(t,u,z)\right ) = \frac{\partial } {\partial t}\mathcal{L}(t,u^{{\ast}},z^{{\ast}}) }$$
(104)

which says that the time derivative of the original cost functional can be replaced by the partial time derivative of the Lagrangian evaluated at the saddle point. Following these ideas, one derives the result

$$\displaystyle{ \frac{d {\mathcal{J}}} {dt} = {\mathtt{Re}} {\int _{\Gamma }}({b_{\mathrm{int}}} - {b_{\mathrm{ext}}})u({\mathbf{x}})z({\mathbf{x}}){\mathbf{V}}(0) \cdot {\mathbf{n}}\,d\Gamma }$$
(105)

which holds for TM-waves and which is identical to the previously derived expressions (97) and (100).

Similar expressions (now involving terms of the form \(\nabla u({\mathbf{x}}) \cdot \nabla z({\mathbf{x}})\) rather than u(x)z(x) at the interfaces) can be derived also for so-called TE-waves; see [78]. The min–max approach outlined above is in fact wide-ranging and can be extended to 3D vector scattering, geometrical regularizations, simultaneous searches for shape and contrast, etc. It applies de facto as soon as one has well-posedness of the direct and adjoint problems. For more details, refer to [29, 30, 78, 79].

6.3 Formal Shape Evolution Using the Heaviside Function

A third possibility for describing and modeling shape deformations driven by data misfit (in addition to using the calculus of variations as in section “Shape Deformation by Calculus of Variations” or shape sensitivity analysis as in section “Shape Sensitivity Analysis and the Speed Method”) is the use of the characteristic function and formal (basic) distribution theory. The previous two techniques first calculate velocity fields in the normal direction to the interfaces and then move the interfaces accordingly using a level set technique (or any other computational front propagation technique), typically leading to a Hamilton–Jacobi-type formalism (compare the remarks in section “The Level Set Framework for Shape Evolution”). In contrast, the method presented in the following does not explicitly use the concept of velocity vector fields, but instead designs evolution laws directly for the describing level set functions (thereby not necessarily leading to Hamilton–Jacobi-type evolution laws).

Notice that many of the level set formulations presented in Sect. 3 give rise to similar concepts as discussed in the following. On the other hand, typically also the concepts discussed in sections “Shape Deformation by Calculus of Variations” and “Shape Sensitivity Analysis and the Speed Method” can be translated, once suitable velocity fields have been determined, into level set evolutions using the various representations of Sect. 3. Details on how these evolution laws can be established can be found in the literature cited in Sect. 3.

The formalism discussed in the following is in fact very flexible and quite easy to handle if standard rules for calculations with distributions are taken into account. Moreover, it often leads to very robust and powerful reconstruction algorithms. Certainly, it also has some limitations: in the form presented here, it is mainly applicable for “penetrable objects” with finite jumps in the coefficients between different regions. This means that it does not generally handle inverse scattering from impenetrable obstacles if very specific and maybe quite complicated boundary conditions need to be taken into account at the scatterer surfaces. For those applications, the theory based on shape sensitivity analysis is more appropriate. Nevertheless, since many inverse scattering problems can be described in the form presented in the following, possibly incorporating some finite jump conditions of the forward and adjoint fields or their normal components across the interfaces (which can be handled by slightly “smearing out” these interfaces over a small transition zone), this theory based on formal distribution theory provides an interesting alternative when deriving level set-based shape evolution equations for solving inverse scattering problems.

The main idea of this technique is demonstrated by giving two examples related to the test cases from sections “Example 1: Microwave Breast Screening” and “Example 2: History Matching in Petroleum Engineering.”

6.3.1 Example: Breast Screening–Smoothly Varying Internal Profiles

In the example of breast screening as discussed in section “Example 1: Microwave Breast Screening,” three level set functions and four different interior parameter profiles need to be reconstructed from the given data simultaneously. Some of the interior profiles are assumed to be constant, whereas others are smoothly varying. In the following, the theory is developed under the assumption that all interior parameter profiles are smoothly varying. The case of constant parameter profiles in some region is captured in the next section where a similar case is discussed for the application of reservoir engineering.

Let \(\mathcal{R}\left (b(\phi _{1},\phi _{2},\phi _{3},b_{1},b_{2},b_{3},b_{4})\right )\) denote the difference between measured data and data corresponding to the latest best guess (ϕ 1, ϕ 2, ϕ 3, b 1, b 2, b 3, b 4) of level set functions and interior profiles in the image model discussed in section “A Modification of Color Level Set for Tumor Detection.” Then, the least squares data misfit for tumor detection is given by

$$\displaystyle{ \mathcal{J}\left (b(\phi _{1},\phi _{2},\phi _{3},b_{1},b_{2},b_{3},b_{4})\right ) = \frac{1} {2}\left \|\mathcal{R}\left (b(\phi _{1},\phi _{2},\phi _{3},b_{1},b_{2},b_{3},b_{4})\right )\right \|^{2}. }$$
(106)

Introducing an artificial evolution time t for the above specified unknowns of the inverse problem, the goal is to find evolution laws

$$\displaystyle\begin{array}{rcl} \frac{d\phi _{\nu }} {\mathit{dt}}& =& f_{\nu }({\mathbf{x}},t),\,\nu = 1,\ldots,3,{}\end{array}$$
(107)
$$\displaystyle\begin{array}{rcl} \frac{\mathit{db}_{\nu }} {\mathit{dt}} & =& g_{\nu }({\mathbf{x}},t),\,\nu = 1,\ldots,4,{}\end{array}$$
(108)

such that the cost \(\mathcal{J}\) decreases with increasing evolution time. As the level set functions and interior profiles evolve, the cost will also change, \(\mathcal{J} = \mathcal{J} (t)\), such that formally its time derivative can be calculated using the chain rule

$$\displaystyle\begin{array}{ll} \frac{d{\mathcal{J}}} {dt} & = \frac{d{\mathcal{J}}} {db} \left[{\sum \limits_{\nu =1}^{3}}\frac{\partial b} {\partial {\phi _{\nu }}} \frac{d{\phi _{\nu }}} {dt} + {\sum \limits_{\nu =1}^{4}} \frac{\partial b} {\partial b_{\nu}} \frac{db_{\nu}}{dt} \right] \\ & = \mathtt{Re} \, {\left\langle {{\mathbf{grad}}_{b}}{\mathcal{J}},\,{\sum \limits_{\nu =1}^{3}}\frac{\partial b} {{\partial \phi} _{\nu}} {f_{\nu}} + {\sum \limits_{\nu =1}^{4}} \frac{\partial b} {\partial b_{\nu}}g_{\nu}\right\rangle} _{P}.\end{array}$$
(109)

Here, Re indicates the real part of the corresponding complex quantity, \(\langle \;,\;\rangle _{P}\) denotes a suitable inner product in the parameter space P, and \(\mathbf{grad}_{b}\mathcal{J}\) is defined in (85). It is easily verified that, in the situation of section “A Modification of Color Level Set for Tumor Detection,”

$$\displaystyle\begin{array}{rcl} \frac{\partial b} {\partial \phi _{1}}& =& \delta (\phi _{1})\Big(-b_{1} + b_{2}(1 - H(\phi _{2}))\Big. \\ & & \Big. + H(\phi _{2})\Big\{b_{3}(1 - H(\phi _{3})) + b_{4}H(\phi _{3})\Big\}\Big),{}\end{array}$$
(110)
$$\displaystyle\begin{array}{rcl} \frac{\partial b} {\partial \phi _{2}}& =& H(\phi _{1})\delta (\phi _{2})\left [-b_{2} + b_{3}(1 - H(\phi _{3})) + b_{4}H(\phi _{3})\right ],{}\end{array}$$
(111)
$$\displaystyle\begin{array}{rcl} \frac{\partial b} {\partial \phi _{3}}& =& H(\phi _{1})H(\phi _{2})\delta (\phi _{3})\left \{-b_{3} + b_{4}\right \},{}\end{array}$$
(112)

and

$$\displaystyle\begin{array}{rcl} \frac{\partial b} {\partial b_{1}}& =& 1 - H(\phi _{1}),{}\end{array}$$
(113)
$$\displaystyle\begin{array}{rcl} \frac{\partial b} {\partial b_{2}}& =& H(\phi _{1})(1 - H(\phi _{2})),{}\end{array}$$
(114)
$$\displaystyle\begin{array}{rcl} \frac{\partial b} {\partial b_{3}}& =& H(\phi _{1})H(\phi _{2})(1 - H(\phi _{3})),{}\end{array}$$
(115)
$$\displaystyle\begin{array}{rcl} \frac{\partial b} {\partial b_{4}}& =& H(\phi _{1})H(\phi _{2})H(\phi _{3}).{}\end{array}$$
(116)

Descent directions are therefore given by

$$\displaystyle\begin{array}{rcl} f_{\nu }(t)& =& -C_{\nu }(t)\mathtt{Re}\left [\mathbf{grad}_{b}\mathcal{J}\frac{\partial b} {\partial \phi _{\nu }} \right ],\,\nu = 1,\ldots,3,{}\end{array}$$
(117)
$$\displaystyle\begin{array}{rcl} g_{\nu }(t)& =& -\hat{C}_{\nu }(t)\mathtt{Re}\left [\mathbf{grad}_{b}\mathcal{J}\frac{\partial b} {\partial b_{\nu }}\right ],\,\nu = 1,\ldots,4,{}\end{array}$$
(118)

with some appropriately chosen positive-valued speed factors C ν (t) and \(\hat{C}_{\nu }(t)\). An efficient way to compute \(\mathbf{grad}_{b}\mathcal{J}\) is again to use the adjoint formulation; see (96) and for more details [34, 52, 71].

Notice that it might be convenient to approximate the Dirac delta on the right-hand side of (110)–(112) in the formulation of the level set evolution either by a narrowband scheme or by a positive constant which allows for topological changes in the entire computational domain driven by the least squares data misfit. For more details, see the brief discussion in section “Shape Evolution and Shape Optimization” and the slightly more detailed discussions in [31, 52]. Following the latter scheme, one possible numerical discretization of the expressions (107) and (108) in time t = t^{(n)}, n = 0, 1, 2, …, then yields the update rules

$$\displaystyle\begin{array}{rcl} \phi _{\nu }^{(n+1)}& =& \phi _{\nu }^{(n)} -\delta t^{(n)}C_{\nu }\left (t^{(n)}\right )\mathtt{Re}\left [\mathbf{grad}_{ b}\mathcal{J}\frac{\partial b} {\partial \phi _{\nu }} \right ]^{(n)},\,\nu = 1,\ldots,3,{}\end{array}$$
(119)
$$\displaystyle\begin{array}{rcl} b_{\nu }^{(n+1)}& =& b_{\nu }^{(n)} -\delta t^{(n)}\hat{C}_{\nu }\left (t^{(n)}\right )\mathtt{Re}\left [\mathbf{grad}_{ b}\mathcal{J}\frac{\partial b} {\partial b_{\nu }}\right ]^{(n)},\,\nu = 1,\ldots,4.{}\end{array}$$
(120)
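
The following sketch spells out one iteration of the type (119)–(120) for the three-level-set model (110)–(116); the smoothed Heaviside function, the constant replacing the Dirac delta, the step sizes, and the availability of \(\mathbf{grad}_{b}\mathcal{J}\) on the grid (e.g., from an adjoint computation) are all assumptions of this illustration.

```python
import numpy as np

def H(phi, eps=1.0):
    """A smoothed Heaviside function (one common choice)."""
    return 0.5 * (1.0 + (2.0 / np.pi) * np.arctan(phi / eps))

def color_model_step(phis, bs, grad_J, dt, C=1.0, C_hat=1.0, delta_const=1.0):
    """One update of type (119)-(120) for the model (110)-(116).
    phis: three level set function arrays; bs: four (smoothly varying) profile arrays;
    grad_J: grad_b J on the same grid; delta(phi) is replaced by delta_const
    to allow topological changes, as discussed in the text."""
    H1, H2, H3 = (H(p) for p in phis)
    b1, b2, b3, b4 = bs
    inner = b3 * (1 - H3) + b4 * H3
    # derivatives (110)-(112) with delta(phi_nu) replaced by delta_const
    dphi = [delta_const * (-b1 + b2 * (1 - H2) + H2 * inner),
            H1 * delta_const * (-b2 + inner),
            H1 * H2 * delta_const * (-b3 + b4)]
    # derivatives (113)-(116)
    db = [1 - H1, H1 * (1 - H2), H1 * H2 * (1 - H3), H1 * H2 * H3]
    new_phis = [p - dt * C * np.real(grad_J * d) for p, d in zip(phis, dphi)]
    new_bs = [b - dt * C_hat * np.real(grad_J * d) for b, d in zip(bs, db)]
    return new_phis, new_bs
```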

6.3.2 Example: Reservoir Characterization–Parameterized Internal Profiles

In the example of history matching in reservoir engineering as discussed in section “Example 2: History Matching in Petroleum Engineering,” one level set function and two interior parameter profiles need to be reconstructed from the given data, where one interior parameter profile is assumed to be smoothly varying and the other one is assumed to follow an overall bilinear pattern. The case of smoothly varying interior parameter profiles is completely analogous to the situation discussed in the previous section for microwave breast screening, so it is not considered here. In the following, the situation is treated where both interior profiles follow a parameterized model with a certain set of given basis functions. In the history matching application as well as in the microwave breast screening application, the mixed cases of partly parameterized (e.g., constant or bilinear) and partly smooth profiles are straightforward to implement as combinations of these two general approaches.

In this approach, it is assumed that the two internal profiles can be written in the parameterized form

$$\displaystyle{ b_{i}({\mathbf{x}}) =\sum _{ j=1}^{N_{i} }\alpha _{j}a_{j}({\mathbf{x}}),\quad b_{e}({\mathbf{x}}) =\sum _{ k=1}^{N_{e} }\beta _{k}b_{k}({\mathbf{x}}), }$$
(121)

where a_j and b_k are the selected basis functions for the two domains D and \(\Omega - D\), respectively. See the model discussed in section “A Modification of Color Level Set for Reservoir Characterization.” In the inverse problem, the level set function ϕ and the weights α_j and β_k need to be estimated with the goal of reproducing the measured data in some sense. In order to obtain an (artificial) evolution of the unknown quantities ϕ, α_j, and β_k, the following three general evolution equations for the level set function and for the weight parameters are formulated

$$\displaystyle{ \frac{d\phi } {\mathit{dt}} = f({\mathbf{x}},t,\phi,\mathcal{R}), }$$
(122)
$$\displaystyle{ \frac{d\alpha _{j}} {\mathit{\mathit{dt}}} = g_{j}(t,\phi,\mathcal{R}),\quad \frac{d\beta _{k}} {\mathit{dt}} = h_{k}(t,\phi,\mathcal{R}). }$$
(123)

In the same way as before, the goal is to define the unknown terms f, g j , and h k such that the mismatch in the production data decreases during the evolution. For this purpose, we reformulate the cost functional now as

$$\displaystyle{ \mathcal{J} (b(\phi,\alpha _{j},\beta _{k})) = \frac{1} {2}\|\mathcal{R}(b(\phi,\alpha _{j},\beta _{k}))\|^{2}, }$$
(124)

where α j denotes the weight parameters for region D and β k denotes the weight parameters for region \(\Omega - D\). Formal differentiation of this cost functional with respect to the artificial time variable t yields, in a similar way as before, the descent directions [35]

$$\displaystyle{ f_{\mbox{ SD}}({\mathbf{x}}) = -C_{1}\chi _{NB}(\phi )(b_{e} - b_{i})\mathbf{grad}_{b}\mathcal{J}, }$$
(125)
$$\displaystyle{ g_{\mathit{jSD}}(t) = -C(\alpha _{j})\int _{\Omega }a_{j}(1 - H(\phi ))\mathbf{grad}_{b}\mathcal{J}d{\mathbf{x}}, }$$
(126)
$$\displaystyle{ h_{\mathit{kSD}}(t) = -C(\beta _{k})\int _{\Omega }b_{k}H(\phi )\mathbf{grad}_{b}\mathcal{J}d{\mathbf{x}}, }$$
(127)

where C 1, C(α j ), and C(β k ) are again positive-valued speed factors which are used for steering the speed of evolution for each of the unknowns ϕ, α j , and β k individually. The narrowband function χ NB (ϕ) is introduced for computational convenience and can be omitted if desired. For details on this narrowband formulation, see the brief discussion held in section “Shape Evolution and Shape Optimization.”
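A sketch of how the descent directions (125)–(127) can be assembled on a grid once \(\mathbf{grad}_{b}\mathcal{J}\) is available; the basis functions, current weights, narrowband width, and speed factors are hypothetical inputs, and the integrals in (126)–(127) are simply approximated by sums over grid cells.

```python
import numpy as np

def reservoir_descent(phi, alphas, betas, a_basis, b_basis, grad_J, h,
                      C1=1.0, Ca=1.0, Cb=1.0, band=3.0):
    """Descent directions (125)-(127) for phi and for the weights alpha_j, beta_k
    of the parameterized profiles (121); grad_J is assumed real-valued here."""
    H_phi = (phi > 0).astype(float)                    # H(phi): 1 outside D, 0 inside
    chi_nb = (np.abs(phi) < band * h).astype(float)    # narrowband indicator chi_NB(phi)
    b_int = sum(a * aj for a, aj in zip(alphas, a_basis))   # interior profile b_i, cf. (121)
    b_ext = sum(c * bk for c, bk in zip(betas, b_basis))    # exterior profile b_e, cf. (121)
    cell = h * h                                       # area of one grid cell
    f = -C1 * chi_nb * (b_ext - b_int) * grad_J                              # (125)
    g = [-Ca * np.sum(aj * (1 - H_phi) * grad_J) * cell for aj in a_basis]   # (126)
    k = [-Cb * np.sum(bk * H_phi * grad_J) * cell for bk in b_basis]         # (127)
    return f, g, k
```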

7 Regularization Techniques for Shape Evolution Driven by Data Misfit

Regularization of shape evolution can be achieved by additional additive or multiplicative terms in the cost functional which control geometric terms, as discussed in Sect. 5. Alternatively, some form of regularization can be obtained by restricting the velocity fields, level set updates, or level set functions to certain classes, often without the need to introduce additional terms into the cost functional. Some of these techniques are presented in the following.

7.1 Regularization by Smoothed Level Set Updates

In the binary case (see section “The Basic Level Set Formulation for Binary Media”), a properly chosen level set function ϕ uniquely specifies a shape D[ϕ]. This can be described by a nonlinear operator \(\Pi \) mapping level set functions to parameter distributions

$$\displaystyle{ \Pi (\phi )({\mathbf{x}}) = \left \{\begin{array}{c@{\,,\quad }l} b_{i}({\mathbf{x}})\,,\quad &\phi ({\mathbf{x}}) \leq 0, \\ b_{e}({\mathbf{x}})\,,\quad &\phi ({\mathbf{x}}) > 0. \end{array} \right. }$$
(128)

We obviously have the equivalent characterization

$$\displaystyle{ \Pi (\phi )({\mathbf{x}}) = b_{i}({\mathbf{x}})\chi _{D}({\mathbf{x}}) + b_{e}({\mathbf{x}})(1 -\chi _{D}({\mathbf{x}})) }$$
(129)

where χ D is the characteristic function of the shape D. The “level set-based residual operator” \(\mathcal{T} (\phi )\) follows as

$$\displaystyle{ \mathcal{T} (\phi ) = \mathcal{R}(\Pi (\phi )). }$$
(130)

Formal differentiation by the chain rule yields

$$\displaystyle{ \mathcal{T}^{{\prime}}(\phi ) = \mathcal{R}^{{\prime}}(\Pi (\phi ))\Pi ^{{\prime}}(\phi ). }$$
(131)

The (formal) gradient direction of the least squares cost functional

$$\displaystyle{ \hat{\mathcal{J}}(\phi ) = \frac{1} {2}\left \|\mathcal{R}(b(\phi ))\right \|_{Z}^{2} }$$
(132)

is then given by

$$\displaystyle{ \mathbf{grad}_{\hat{\mathcal{J}}}(\phi ) = \mathcal{T}^{{\prime}}(\phi )^{{\ast}}\mathcal{T} (\phi ), }$$
(133)

where \(\mathcal{T}^{{\prime}}(\phi )^{{\ast}}\) is the L 2-adjoint of \(\mathcal{T}^{{\prime}}(\phi )\). Moreover, formally it is calculated by standard differentiation rules that

$$\displaystyle{ \Pi ^{{\prime}}(\phi ) = (b_{ i} - b_{e})\delta (\phi ). }$$
(134)

Notice that, strictly speaking, the right-hand side of (131) is not an L 2-function due to the Dirac delta distribution appearing in (134). Nevertheless, in order to obtain practically useful expressions in a straightforward way, it is convenient to proceed with the formal considerations and, whenever necessary, to approximate the Dirac delta distribution δ(ϕ) by a suitable L 2-function; see the brief discussion of this topic in section “Shape Evolution and Shape Optimization.” For example, the narrowband function χ ϕ, d (x) as defined in (151) can be used for that purpose. Then,

$$\displaystyle{ \mathcal{T}^{{\prime}}(\phi )^{{\ast}} = \Pi ^{{\prime}}(\phi )^{{\ast}}\mathcal{R}^{{\prime}}(\Pi (\phi ))^{{\ast}}. }$$
(135)

Assuming now that \(\phi \in W_{1}(\Omega )\) with

$$\displaystyle{ W_{1}(\Omega ) = \left \{\phi:\phi \in L_{2}(\Omega ),\,\nabla \phi \in L_{2}(\Omega ),\, \frac{\partial \phi } {\partial \nu } = 0\,\mbox{ at }\,\partial \Omega \right \}, }$$
(136)

the adjoint operator \(\mathcal{T}^{{\prime}}(\phi )^{{\ast}}\) needs to be replaced by a new adjoint operator \(\mathcal{T}^{{\prime}}(\phi )^{\circ }\) which maps back from the data space into this Sobolev space \(W_{1}(\Omega )\). Using the weighted inner product

$$\displaystyle{ \langle v,w\rangle _{W_{1}(\Omega )} =\alpha \langle v,w\rangle _{L_{2}(\Omega )} +\beta \langle \nabla v,\nabla w\rangle _{L_{2}(\Omega )} }$$
(137)

with α ≥ 1 and β > 0 being carefully chosen regularization parameters, it follows

$$\displaystyle{ \mathcal{T}^{{\prime}}(\phi )^{\circ } =\, (\alpha I -\beta \Delta )^{-1}\,\mathcal{T}^{{\prime}}(\phi )^{{\ast}}. }$$
(138)

The positive definite operator \((\alpha I -\beta \Delta )^{-1}\) has the effect of mapping the L 2 gradient \(\mathcal{T}^{{\prime}}(\phi )^{{\ast}}\mathcal{T} (\phi )\) from \(L_{2}(\Omega )\) towards the smoother Sobolev space \(W_{1}(\Omega )\). In fact, different choices of the weighting parameters α and β have the visual effect of “smearing out” the unregularized updates to different degrees. In particular, high-frequency oscillations or discontinuities of the updates for the level set function are removed, which yields shapes with more regular boundaries. Notice that for \(\phi \in W_{1}(\Omega ),\) the trace \(\phi \big\vert _{\Gamma }\) on the zero level set is only within the intermediate Sobolev space \(W_{1/2}(\Gamma )\) due to the trace theorem. Therefore, the “degree of smoothness” of the reconstructed shape boundaries \(\Gamma \) lies somewhere between \(L_{2}(\Gamma )\) and \(W_{1}(\Gamma )\).
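
The smoothing operator \((\alpha I -\beta \Delta )^{-1}\) in (138) can be applied efficiently on a rectangular grid; the sketch below does so with a fast Fourier transform, which tacitly replaces the Neumann condition in (136) by periodic boundary conditions, an approximation that is usually harmless when the updates vanish near ∂Ω. The parameter values are placeholders.

```python
import numpy as np

def sobolev_smooth(f_d, h, alpha=1.0, beta=1e-2):
    """Apply (alpha*I - beta*Laplacian)^(-1) to an update f_d, cf. (138).
    FFT-based, hence periodic boundary conditions are assumed for simplicity."""
    n0, n1 = f_d.shape
    k0 = 2.0 * np.pi * np.fft.fftfreq(n0, d=h)
    k1 = 2.0 * np.pi * np.fft.fftfreq(n1, d=h)
    K0, K1 = np.meshgrid(k0, k1, indexing="ij")
    symbol = alpha + beta * (K0**2 + K1**2)     # Fourier symbol of alpha*I - beta*Laplacian
    return np.real(np.fft.ifft2(np.fft.fft2(f_d) / symbol))

# hypothetical usage: smooth the raw L2 update before applying it to the level set function
# f_r = sobolev_smooth(f_d, h)
# phi = phi - step * f_r
```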

Sometimes it is difficult or inconvenient to apply the mapping (138) to the calculated updates \(\mathcal{T}^{{\prime}}(\phi )^{{\ast}}\mathcal{T} (\phi )\). Then, an approximate version, derived next, can be applied instead. Denote \(f_{r} = \mathcal{T}^{{\prime}}(\phi )^{\circ }\mathcal{T} (\phi )\) and \(f_{{d}} = \mathcal{T}^{{\prime}}(\phi )^{{\ast}}\mathcal{T} (\phi )\). The smoothed update f_r can formally be interpreted as the minimizer of the cost functional

$$\displaystyle{ \hat{\mathcal{J}}(f) = \frac{\alpha -1} {2} \|f\|_{L_{2}}^{2} + \frac{\beta } {2}\|\nabla f\|_{L_{2}}^{2} + \frac{1} {2}\|f - f_{{d}}\|_{L_{2}}^{2}. }$$
(139)

In particular, the minimization process of (139) can be attempted by applying a gradient method instead of explicitly applying \((\alpha I -\beta \Delta )^{-1}\) to \(f_{{d}}\). The gradient flow of (139) yields a modified heat (or diffusion) equation of the form

$$\displaystyle\begin{array}{rcl} v_{t} -\beta \Delta v& =& f_{{d}} -\alpha v\quad \mbox{ for}\quad t \in [0,\tau ] \\ v(0)& =& f_{{d}}, {}\end{array}$$
(140)

with time-dependent heating term \(f_{{d}} -\alpha v\), where \(\hat{v} = v(\tau )\) evolves towards the minimizer f r of (139) for \(\tau \rightarrow \infty \). Practically, it turns out that a satisfactory regularization effect is achieved if instead of (140) the simplified heat equation is solved for a few time steps only:

$$\displaystyle\begin{array}{rcl} v_{t} -\beta \Delta v& =& 0\quad \mbox{ for}\quad t \in [0,\tau ] \\ v(0)& =& f_{{d}}, {}\end{array}$$
(141)

for τ small, using \(\hat{v} = v(\tau )\) instead of f r as update. For more details, see [44].
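
A minimal sketch of the simplified smoothing (141): a few explicit time steps of the heat equation applied to the raw update f_d, using a five-point Laplacian with periodic wrap-around; the diffusion coefficient and the number of steps are hypothetical choices.

```python
import numpy as np

def smooth_by_heat_flow(f_d, h, beta=1e-2, n_steps=5):
    """A few explicit steps of  v_t = beta * Laplacian(v),  v(0) = f_d,  cf. (141)."""
    dt = 0.2 * h**2 / beta      # below the explicit stability limit h^2 / (4 * beta)
    v = f_d.copy()
    for _ in range(n_steps):
        lap = (np.roll(v, 1, 0) + np.roll(v, -1, 0) +
               np.roll(v, 1, 1) + np.roll(v, -1, 1) - 4.0 * v) / h**2
        v = v + dt * beta * lap
    return v
```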

The above-described regularization schemes only operate on the updates (or forcing terms f in a time-dependent setting) but not on the level set function itself. In particular, if a satisfactory solution of the shape reconstruction problem has already been achieved such that the data residuals become zero, the evolution will stop (which sometimes is desirable). The following subsections mention some alternative regularization methods for which the evolution in this situation would continue until an extended cost functional, combining the data misfit with additional geometric terms or with additional constraints on the final level set functions, is minimized.

7.2 Regularization by Explicitly Penalizing Rough Level Set Functions

Instead of smoothing the updates to the level set functions, additional terms can be added to the data misfit cost functional which have the effect of penalizing certain characteristics of the level set function. For example, a Tikhonov–Phillips term for the level set function can be added to (81), which yields the minimization problem

$$\displaystyle{ \min _{\phi }\mathcal{J} (\phi ) = \frac{1} {2}\left \|\mathcal{R}(b(\phi ))\right \|_{Z}^{2} +\rho (\phi ), }$$
(142)

where \(\|\;\|_{Z}\) denotes the canonical norm in the data space Z and where ρ denotes some additional regularization term, typically involving the norm or semi-norm in the space of level set functions, for example, \(\rho (\phi ) =\| \nabla \phi \|_{L_{2}}^{2}\). A discussion of different choices for ρ(ϕ) is provided in [93]. Alternative functionals could be applied to the level set function ϕ, as, for example, Mumford–Shah, total variation, etc., which would allow for jumps in the representing level set functions.

7.3 Regularization by Smooth Velocity Fields

In the previous two subsections, regularization tools have been discussed which are directly linked to the level set formulation of shape evolution. In section “Regularization by Smoothed Level Set Updates,” smoothing operators have been applied to the updates of the level set functions (or forcing terms), which are considered to be defined on the whole domain Ω. The additional terms discussed in section “Regularization by Explicitly Penalizing Rough Level Set Functions,” on the other hand, yield additional evolution terms which typically have to be applied directly to the describing level set functions during the shape evolution.

An alternative concept of regularizing shape evolution, which does not directly refer to an underlying level set representation of the shapes, consists in choosing function spaces for the normal velocity fields which drive the shape evolution. These velocity fields are, as such, only defined on the zero level set, i.e., on the boundaries of the given shapes (unless extension velocities are defined for a certain reason). For example, the velocity field could be taken as an element of a Sobolev space \(W_{1}(\Gamma )\) equipped with the inner product

$$\displaystyle{ \langle v,w\rangle _{W_{1}(\Gamma )} =\int _{\Gamma }\left (\frac{\partial v} {\partial s} \frac{\partial w} {\partial s} + vw\right )\,\mathit{ds}, }$$
(143)

where ds is the surface increment at the boundary. This leads to a postprocessing operator applied to the gradient directions which are restricted to the boundary \(\Gamma \). The action of this postprocessing operator can be interpreted as mapping the given velocity field from \(L_{2}(\Gamma )\) towards the smoother subspace \(W_{1}(\Gamma )\), much as it was described in section “Regularization by Smoothed Level Set Updates” for the spaces \(L_{2}(\Omega )\) and \(W_{1}(\Omega )\). For the above given norm (143), this is modeled by a Laplace–Beltrami operator

$$\displaystyle{ -\frac{\partial ^{2}v} {\partial s^{2}} + v = f_{d}\big\vert _{\Gamma }. }$$
(144)

Weighted versions of (143) and (144), with parameters α and β as in (137), can be defined as well. These operators have the effect of smoothing the velocity fields along the boundary \(\Gamma \) and therefore lead to regularized level set evolutions if suitable extension velocities are chosen. Alternatively, diffusion processes along the boundary can be employed for achieving a similar effect of smoothing velocity fields. For a more detailed description of various norms and the corresponding surface flows, see [17, 50, 87].
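A minimal sketch of this Sobolev smoothing of boundary velocities is given below for a closed two-dimensional contour discretized by equidistant points; it solves a weighted version of (144), \(v -\beta \,\partial ^{2}v/\partial s^{2} = f\), with periodic finite differences. The parameter `beta` and the dense linear solve are illustrative choices, not a prescription from [17, 50, 87].

```python
import numpy as np

def sobolev_smooth_on_curve(f_on_boundary, ds=1.0, beta=1.0):
    """Map an L2 boundary velocity f to a smoother Sobolev (W_1) velocity v by
    solving (I - beta * d^2/ds^2) v = f along a closed curve, cf. (143)-(144),
    using second-order periodic finite differences in the arclength s."""
    f = np.asarray(f_on_boundary, dtype=float)
    n = len(f)
    A = np.eye(n) * (1.0 + 2.0 * beta / ds**2)
    for i in range(n):
        A[i, (i - 1) % n] -= beta / ds**2
        A[i, (i + 1) % n] -= beta / ds**2
    return np.linalg.solve(A, f)
```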

7.4 Simple Shapes and Parameterized Velocities

An even stronger way of regularizing shape evolution is to restrict the describing level set functions or the driving velocities to be members of finite-dimensional function spaces spanned by certain sets of basis functions. For example, polynomials, sinusoidal or exponential functions, or any other set of linearly independent functions tailored to the specific inverse problem can be used as basis functions. Closely related to this approach is the strategy of restricting the shapes (and thereby the shape evolution) to a small set of geometric objects, as, for example, ellipsoids. See the discussion in [31], where evolution laws for a small sample of basic shapes are derived. In a related manner, [11] considers a multiscale multiregion level set technique which adaptively adjusts the support and number of basis functions for the level set representation during the shape evolution. Also related to this approach is the projection mapping strategy for shape velocities proposed in [12].
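The following minimal sketch illustrates such a finite-dimensional restriction: a gridded velocity (or level set) field is projected, in the least squares sense, onto the span of a small set of user-chosen basis functions. The basis itself (e.g., low-order polynomials evaluated on the grid) is a hypothetical choice for illustration only.

```python
import numpy as np

def project_onto_basis(field, basis_functions):
    """Least-squares projection of a gridded field onto span{basis_functions}.
    Each basis function is given as an array on the same grid as `field`.
    Returns the expansion coefficients and the projected field."""
    B = np.stack([np.asarray(b, dtype=float).ravel() for b in basis_functions], axis=1)
    coeffs, *_ = np.linalg.lstsq(B, np.asarray(field, dtype=float).ravel(), rcond=None)
    return coeffs, (B @ coeffs).reshape(np.shape(field))

# example basis on a meshgrid (x, y): the four functions 1, x, y, x*y
```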

8 Miscellaneous on Shape Evolution

8.1 Shape Evolution and Shape Optimization

Shape evolution and shape optimization are closely related. Assume that a velocity function F(x) = v(x) ⋅ n(x) is given which points into a descent direction of the cost \(\mathcal{J}\), such that \(\left.\frac{\partial \mathcal{J}(b)} {\partial t} \right \vert _{t=0}<0\). Then, the cost decreases in the artificial time evolution during a sufficiently small time interval [0, τ]. On the practical level, the corresponding Hamilton–Jacobi-type evolution equation for the representing level set function, to be solved during the time interval [0, τ], reads

$$\displaystyle{ \frac{\partial \phi } {\partial t} + F\,\vert \nabla \phi \vert = 0, }$$
(145)

where the variables (x, t) have been dropped in the notation. Using, for example, a straightforward time-discretization scheme with finite differences yields

$$\displaystyle{ \frac{\phi (\tau ) -\phi (0)} {\tau } + F\vert \nabla \phi \vert = 0. }$$
(146)

Interpreting \(\phi ^{(n+1)} =\phi (\tau )\) and \(\phi ^{(n)} =\phi (0)\) yields the iteration

$$\displaystyle{ \phi ^{(n+1)} =\phi ^{(n)} +\tau \delta \phi ^{(n)},\quad \phi ^{(0)} =\phi _{ 0}, }$$
(147)

where τ plays the role of the step size (which might be determined by a line-search strategy) and where

$$\displaystyle{ \delta \phi ^{(n)} = F\vert \nabla \phi ^{(n)}\vert }$$
(148)

for \({\mathbf{x}} \in \partial D\).
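For illustration, one explicit time step of the Hamilton–Jacobi evolution (145), i.e., the discretization (146) with a normal speed F given on the grid, can be sketched as follows; the central-difference gradient used here is a simplification, since in practice upwind (e.g., ENO/WENO) discretizations are preferred, cf. section “Some Remarks on Numerical Shape Evolution with Level Sets.”

```python
import numpy as np

def hamilton_jacobi_step(phi, F, tau, h=1.0):
    """One explicit time step phi(tau) = phi(0) - tau * F * |grad phi|, cf. (146),
    of the level set evolution (145) with normal speed F given on the grid.
    Central differences are used for |grad phi|; upwind schemes are preferred in practice."""
    grads = np.gradient(phi, h)
    grad_norm = np.sqrt(sum(g**2 for g in grads))
    return phi - tau * F * grad_norm
```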

In the level set optimization approach, on the other hand, updates v for a level set function \(\phi \rightarrow \phi +v\) are sought which reduce a given cost. Take, for example, the situation of the basic level set formulation described in section “The Basic Level Set Formulation for Binary Media.” Analogously to section “Least Squares Cost Functionals and Gradient Directions,” a small perturbation v then has the following effect on the cost:

$$\displaystyle\begin{array}{ll} \frac{d{\mathcal{J}}} {d\phi } v& = \frac{d{\mathcal{J}}} {db} \,\frac{db} {d\phi } v = {\mathtt{Re}} {\left\langle {{\mathbf{grad}}_{b}}{\mathcal{J}},\, \frac{db}{d\phi} v\,\right\rangle} _{P} \\ & = {\mathtt{Re}}{\left\langle {{\mathbf{grad}}_{b}}{\mathcal{J}},\,({b_{e}}({\mathbf{x}}) - {b_{i}}({\mathbf{x}}))\delta (\phi )v\,\right\rangle} _{P},\end{array}$$
(149)

with \(\mathbf{grad}_{b}\mathcal{J}\) defined in (85). Apart from the term δ(ϕ), this yields similar expressions for the discrete updates as (147) if choosing \(F\vert \nabla \phi ^{(n)}\vert = -\mathtt{Re}(b_{e}({\mathbf{x}}) - b_{i}({\mathbf{x}}))\mathbf{grad}_{b}\mathcal{J}\).

In fact, the term δ(ϕ) is the one which causes the biggest conceptual problem when interpreting the above scheme in an optimization framework. Notice that, strictly speaking, the mapping from the level set function to the data (or to the corresponding least squares cost) is not differentiable in standard (e.g., \(L_{2}\)) function spaces. This is indicated by the appearance of the Dirac delta distribution δ(ϕ) in (149), which is not an \(L_{2}\) function.

There are several ways to circumvent these difficulties, mainly aiming at replacing this troublesome delta distribution by a better-behaved approximation of it. First, in the narrowband approach, the Dirac delta is replaced by a narrowband function \(\chi _{\phi,d}({\mathbf{x}})\), which yields

$$\displaystyle{ F_{{d}}\vert \nabla \phi ^{(n)}\vert ({\mathbf{x}}) = -\mathtt{Re}\,\left ((b_{ e} - b_{i})\,\chi _{\phi,d}({\mathbf{x}})\,\mathbf{grad}_{b}\mathcal{J}\right )\,\quad \mbox{ for all}\quad {\mathbf{x}} \in \Omega. }$$
(150)

Here, \(\chi _{\phi,d}({\mathbf{x}})\) is an arbitrary positive-valued approximation to δ(ϕ), where the subscript d indicates the degree of approximation. For example, it can be chosen as

$$\displaystyle{ \chi _{\phi,d}({\mathbf{x}}) = \left \{\begin{array}{ll} 1, &\mbox{ if there exists }\;{\mathbf{x}}_{0} \in \Omega \;\mbox{ with }\;\vert {\mathbf{x}} -{\mathbf{x}}_{0}\vert < d\;\mbox{ and }\;\phi ({\mathbf{x}}_{0}) = 0, \\ 0, &\mbox{ otherwise,} \end{array} \right. }$$
(151)

which has the form of a “narrowband” function. Other approximations with certain additional properties (e.g., on smoothness) are possible as well. This search direction obviously also provides a descent flow for \(\mathcal{J}\). In fact, the term \(\vert \nabla \phi ^{(n)}\vert \) can also be neglected in (150), without losing the descent property of the resulting flow, since formally it can be assumed that \(\vert \nabla \phi ^{(n)}\vert > 0\) (repeated recalculation of a signed distance function would even enforce \(\vert \nabla \phi ^{(n)}\vert = 1\)).
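A minimal sketch of this narrowband replacement of the Dirac delta is given below; it assumes that ϕ is (approximately) a signed distance function, so that the condition in (151) reduces to |ϕ(x)| < d, and the names `b_e`, `b_i`, and `grad_b_J` in the usage comment are placeholders for quantities computed elsewhere.

```python
import numpy as np

def narrowband_indicator(phi, d):
    """Narrowband approximation chi_{phi,d} of delta(phi), cf. (151): equal to 1
    within distance d of the zero level set and 0 elsewhere.  Assumes phi is
    (approximately) a signed distance function, so |phi| < d identifies the band."""
    return (np.abs(phi) < d).astype(float)

# narrowband search direction, cf. (150):
#   update = -np.real((b_e - b_i) * narrowband_indicator(phi, d) * grad_b_J)
```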

The Dirac delta could just as well be replaced by a positive constant, say 1, which yields another choice of descent direction

$$\displaystyle{ F_{{{\mathrm{top}}}}({\mathbf{x}}) = -\mathtt{Re}\,(b_{e} - b_{i})\,\mathbf{grad}_{b}\mathcal{J}\quad \mbox{ for all}\quad {\mathbf{x}} \in \Omega. }$$
(152)

This new direction \(F_{{{\mathrm{top}}}}({\mathbf{x}})\) has the property that it applies updates driven by data sensitivities on the entire domain and thereby enables the creation of objects far away from the actual zero level set by lowering a positive level set function until its values arrive at zero. Certainly, at the moment when the level set function changes sign from positive to negative at some points far away from the zero level set, a new object is created, and the descent property with respect to the cost needs to be verified by introducing some concept which evaluates the effect of object creation on the data misfit. A formal way of doing so is briefly discussed in section “Topological Derivatives” further below. The opposite effect can also occur, namely that the level set function switches from negative to positive values inside a given shape, at some distance from the zero level set, in which case a hole is created inside the shape. Here, too, this hole creation needs to be justified with respect to its effect on the data misfit cost, which can likewise be done with the tools discussed in section “Topological Derivatives.”

Notice that, in the more classical level set framework, these replacements of the Dirac delta by functions with extended support can be interpreted as different ways of defining extension velocities for the numerical level set evolution scheme. Refer to [7, 19, 45, 47, 84] and the further references given there for numerical approaches which are focusing on incorporating topology changes during the shape reconstruction.

Once optimization schemes are considered for level set-based shape reconstruction, a rich set of classical optimization schemes can be adapted and applied to this novel application. For example, Newton-type optimization techniques and second-order shape derivatives can be defined and calculated. Strategies for doing so are available in the literature; see, for example, [30]. Also quasi-Newton-, Gauss–Newton-, or Levenberg–Marquardt-type schemes look promising in this framework. Some related approaches can be found, for example, in [18, 50, 82, 90, 93]. There exists a large amount of literature concerned with shape optimization problems in various applications. One important application is, for example, the structural optimal shape design problem, where the shape of a given object (a tool, bridge, telegraph pole, airplane wing, etc.) needs to be optimized subject to certain application-specific constraints [3, 87, 96]. Another example is the optimization of a bandgap structure or of maximal eigenvalues [46, 56, 75]. Some techniques from nonlinear optimization which have been successful in those applications consequently have also found their way into the treatment of shape inverse problems. For brevity, we simply refer here to the discussions presented in [2, 18, 20, 43, 50, 82, 90, 93] and the many further references therein. Alternative nonlinear algebraic reconstruction techniques are employed in [34] and fixed point techniques in [22, 23].

8.2 Some Remarks on Numerical Shape Evolution with Level Sets

Not much is said here regarding numerical schemes for solving Hamilton–Jacobi equations or the related optimality systems of shape inverse problems. In the framework of imaging science, various schemes have been developed and discussed extensively in the vast literature on numerical level set evolution; see, for example, the books and reviews [74, 85, 91], to mention just a few. These schemes include CFL conditions, re-initialization of level set functions, signed distance functions, the fast marching method, higher-order upwind schemes like ENO (essentially non-oscillating) and WENO (weighted essentially non-oscillating), artificial viscosity solutions, numerical discretizations of mean curvature terms in the level set framework, etc. All these techniques can be applied when treating inverse problems by a level set formulation.

It is emphasized here, however, that the application of image reconstruction from indirect data comes with a number of additional problems and complications which are due to the ill-posedness of the inverse problem and to the often high complexity of the PDE (or IE) involved in the simulation of the data. Therefore, each particular image reconstruction problem from indirect data requires a careful study of numerical schemes which typically are tailor-made for the specific application. Overall, a careful choice of numerical discretization schemes and regularization parameters is essential for a stable and efficient solution of the shape reconstruction problem. Moreover, design parameters of the experimental setup (e.g., source and receiver locations) used during data collection have a significant impact on the shape evolution later on in the reconstruction process. Judicious choices here pay off in the form of faster and more reliable reconstructions.

8.3 Speed of Convergence and Local Minima

Level set methods for shape reconstruction in inverse problems have initially been claimed to suffer from slow convergence due to inherent time-discretization constraints (the CFL condition) for the Hamilton–Jacobi equation and due to the (so far) exclusive use of first-order shape derivatives. Also, it had been observed that the shape evolution sometimes gets trapped in local minima, such that, for example, some topological components are missed by the shape evolution when starting with an inappropriate initial guess.

However, these initial problems seem to have been resolved by now, and it appears that level set methods have in fact become quite efficient and stable when following certain straightforward guidelines and often even clearly outperform many classical pixel-based reconstruction schemes when additional prior information is available.

Firstly, the search for a good starting guess for the shape evolution can usually be done either by specific preprocessing steps (as, e.g., in [98]) or by employing more traditional search routines for only a few iteration steps. This helps to avoid “long-distance evolutions” during the succeeding shape reconstruction process.

A similar effect is achieved by the incorporation of some form of “topological derivative” in the shape evolution algorithm; see the brief discussion of this topic in the following section “Topological Derivatives.” With this topological derivative technique, “seed” objects appear during the evolution at just the correct locations, to be deformed in only a few more iterations to their final shapes.

The topological derivative (or an appropriately designed extension velocity which has a similar effect) can also help to prevent the shape evolution from becoming trapped in local minima caused by barriers of low sensitivity where velocity fields become very small. Again, through the creation of “seed” objects in areas of higher sensitivity, the shape evolution can jump over these barriers and quickly arrive at the final reconstruction. When an object extends over an area of low sensitivity, then, certainly, any reconstruction scheme has difficulties with its reconstruction inside this zone, such that additional prior information might be needed for arriving at a satisfactory result inside this zone of low sensitivity (regardless of which reconstruction technique is used).

In addition, also higher-order shape derivatives have been developed in the literature (see, e.g., [30]) which can be used for deriving higher-order shape-based reconstruction schemes. So far, however, their usefulness as part of a level set-based shape inversion technique has been investigated only to a very limited extent.

Finally, in an optimization framework, line-search techniques can replace the CFL condition for marching toward the sought minimum of a cost functional. This can speed up convergence significantly.
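A minimal sketch of such a line search is given below; the callable `cost`, standing for the data misfit evaluation (one forward solve per trial step), and the backtracking parameters are illustrative assumptions, not a specific scheme from the cited literature.

```python
def backtracking_step(cost, phi, delta_phi, tau0=1.0, shrink=0.5, max_tries=10):
    """Backtracking line search on the step size tau for the update
    phi <- phi + tau * delta_phi: accept the first tau that decreases the cost.
    `cost` is any callable returning the scalar data misfit for a level set function."""
    J0 = cost(phi)
    tau = tau0
    for _ in range(max_tries):
        phi_trial = phi + tau * delta_phi
        if cost(phi_trial) < J0:
            return phi_trial, tau
        tau *= shrink
    return phi, 0.0   # no decrease found; keep the previous level set function
```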

Keeping these simple strategies in mind, level set-based reconstruction techniques can in fact be much faster than more traditional schemes, in particular when the contrast value of the parameters is assumed to be known and does not need to be recovered simultaneously with the shape. For very ill-posed inverse problems, traditional techniques need a large number of iterations to converge to the right balance between correct volume and contrast value of the sought objects.

8.4 Topological Derivatives

Even though the level set formulation allows for automatic topology changes during the shape evolution, the concepts for calculating descent directions derived so far do not really apply at the moment when a topological change occurs. This is typically not a problem for the case of splitting and merging of shapes, since descent directions are only calculated at discrete time steps, such that in practice the need to calculate a descent direction at exactly the moment when such a topological change occurs hardly ever arises. Still, from a theoretical perspective, it would be interesting to calculate expressions also for topological derivatives which capture the splitting and merging of shapes.

Another situation where topological changes occur in shape evolution is the creation and annihilation of shape components. These situations also occur automatically in the level set framework when a suitable extension velocity is chosen. However, for these two situations, explicit expressions have been derived in the literature which describe the impact of an infinitesimal topological change on the least squares data misfit cost. These are generally known as topological derivatives.

The technique of topological derivatives has received much attention lately as a direct way of image reconstruction. The idea in these approaches is usually to calculate a value of the topological derivative (or topological sensitivity) at each location of the imaging domain and then to add small geometric objects at those places where this topological sensitivity is most negative.

Certain issues arise here, as, for example, the questions of how large these new objects should be, how close to each other or to the domain boundary they may be, and which contrast value should be assigned to the newly created small objects. So far, in most cases, just one update of this kind is performed, and thereafter the image reconstruction is either stopped or continued by a shape evolution of the so-constructed set of objects. Nevertheless, the possibility of iterative topological reconstruction techniques remains an interesting challenge. Furthermore, the combination of simultaneous topological and shape evolution seems to be a very promising approach which combines the flexibility of level set evolution with the sensitivity-driven creation and annihilation of shapes. In practice, this effect occurs automatically if appropriate extension velocities are chosen in the regular level set shape evolution technique.
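The following minimal sketch illustrates the one-shot seeding step described above: the topological sensitivity is evaluated on the grid, its most negative location is selected, and a small seed object is created there by locally lowering the level set function. The seed radius and the sign convention (shape = {ϕ < 0}) are illustrative assumptions.

```python
import numpy as np

def seed_at_most_negative_sensitivity(phi, topo_sensitivity, x_grid, y_grid, seed_radius):
    """Create a small seed object where the topological sensitivity is most negative,
    by taking the union of the current shape {phi < 0} with a small ball placed there."""
    idx = np.unravel_index(np.argmin(topo_sensitivity), np.shape(topo_sensitivity))
    x0, y0 = x_grid[idx], y_grid[idx]
    ball = np.sqrt((x_grid - x0)**2 + (y_grid - y0)**2) - seed_radius
    return np.minimum(phi, ball)   # min of level set functions = union of shapes
```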

In the following, a more formal approach to topological changes is presented which has the advantage of providing a stronger mathematical justification of topological changes with regard to the goal of data misfit reduction. The discussion is based on the general ideas described in the references [15, 22, 23, 39, 40, 69, 73, 83, 86]. The topological derivative as described here aims at introducing either a small hole (let us call it \(B_{\rho }\)) into an existing shape D or at adding a new object (let us call it \(D_{\rho }\)) into the background material at some distance away from an already existing shape D (see Fig. 10). We concentrate in the following on the first process, namely, adding a small hole to an existing shape. The complementary situation of creating a new shape component follows the same guidelines.

Fig. 10

Creating a hole \(B_{\rho }\) inside the shape D

Denote \(\tilde{D}_{\rho } = D\setminus B_{\rho }\), where the index ρ indicates the “size” of the hole \(B_{\rho }\) and where it is assumed that the family of new holes defined by this index is “centered” at a given point \(\hat{{\mathbf{x}}}\). (In other words, one has \(\hat{{\mathbf{x}}} \in B_{\rho } \subset B_{\rho ^{{\prime}}}\) for any \(0 <\rho <\rho ^{{\prime}} < 1\).) It is assumed that all boundaries are sufficiently smooth. Consider then a cost functional \(\mathcal{J} (D)\) which depends on the shape D. The topological derivative \(\mathcal{D}_{T}\) is defined as

$$\displaystyle{ \mathcal{D}_{T}(\hat{{\mathbf{x}}}) =\lim _{\rho \downarrow 0}\frac{\mathcal{J} (\tilde{D}_{\rho }) -\mathcal{J} (D)} {f(\rho )}, }$$
(153)

where f(ρ) is a function which approaches zero monotonically, i.e., \(f(\rho ) \rightarrow 0\) for \(\rho \rightarrow 0\). With this definition, the asymptotic expansion follows

$$\displaystyle{ \mathcal{J} (\tilde{D}_{\rho }) = \mathcal{J} (D) + f(\rho )\mathcal{D}_{T}(\hat{{\mathbf{x}}}) + o(f(\rho )). }$$
(154)

Early applications of this technique (going back to [22, 23, 83]) were focused on introducing ball-shaped holes into a given domain in connection with Dirichlet or Neumann problems for a Laplace equation. Here, the function f(ρ) is mainly determined by geometrical factors of the created shape, and the topological derivative \(\mathcal{D}_{T}\) can be determined by solving one forward and one adjoint problem for the underlying Laplace equation. In fact, for the Neumann problem for the Laplace equation using ball-shaped holes, the relationship (153) takes the original form introduced in [22, 23, 83], where f(ρ) is just the negative of the volume measure of the ball, i.e., \(f(\rho ) = -\pi \rho ^{2}\) in 2D and \(f(\rho ) = -4\pi \rho ^{3}/3\) in 3D. For more details and examples, see [22]. In general, the details of the behavior of the limit in (153), as well as of the function f(ρ) if the limit exists, depend strongly on the shape of the hole, on the boundary condition at the hole interface, and on the underlying PDE.
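For illustration, the definition (153) can be mimicked numerically by inserting a hole of small but finite radius ρ and forming the corresponding difference quotient; the sketch below does this for a ball-shaped hole in 2D with \(f(\rho ) = -\pi \rho ^{2}\), assumes the shape is D = {ϕ < 0} on a grid given by `x_grid`, `y_grid`, and treats the data misfit evaluation `cost_of_shape` as a black box (one forward solve per call).

```python
import numpy as np

def topological_derivative_estimate(cost_of_shape, phi, x_grid, y_grid, x_hat, rho):
    """Finite-size estimate of the topological derivative (153) at a point x_hat
    inside D = {phi < 0}: carve a ball-shaped hole B_rho around x_hat, re-evaluate
    the cost, and divide by f(rho) = -pi*rho**2 (2D ball-shaped hole, cf. the text)."""
    J_D = cost_of_shape(phi)
    phi_ball = np.sqrt((x_grid - x_hat[0])**2 + (y_grid - x_hat[1])**2) - rho
    phi_hole = np.maximum(phi, -phi_ball)   # level set function of D \ B_rho
    J_hole = cost_of_shape(phi_hole)
    return (J_hole - J_D) / (-np.pi * rho**2)
```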

An attempt has been made recently to find alternative definitions for the topological derivative. One such approach has been presented in [39, 40, 73]. Instead of taking the point of view that a hole is “created,” the topological derivative is modeled via a limiting process where an already existing hole gradually shrinks until it disappears. For example, perturb the parameter ρ of an existing hole by a small amount δ ρ. Then, the cost \(\mathcal{J} (\tilde{D}_{\rho })\) is perturbed to \(\mathcal{J} (\tilde{D}_{\rho +\delta \rho })\), and the following limit appears:

$$\displaystyle{ \mathcal{D}_{T}^{{\ast}}(\hat{{\mathbf{x}}}) =\lim _{\rho \rightarrow 0}\left \{\lim _{\delta \rho \rightarrow 0}\frac{\mathcal{J} (\tilde{D}_{\rho +\delta \rho }) -\mathcal{J} (\tilde{D}_{\rho })} {f(\rho +\delta \rho ) - f(\rho )} \right \}. }$$
(155)

In [40, 73] the authors show a relationship between (153) and (155), which reads as

$$\displaystyle{ \mathcal{D}_{T}(\hat{{\mathbf{x}}}) = \mathcal{D}_{T}^{{\ast}}(\hat{{\mathbf{x}}}) =\lim _{\rho \rightarrow 0} \frac{1} {f^{{\prime}}(\rho )\vert \mathbf{V}_{n}\vert }\mathcal{D}_{\mathbf{V}_{n}}(\rho ), }$$
(156)

where \(\mathcal{D}_{\mathbf{V}_{n}}(\rho )\) is a specific form of a shape derivative related to a velocity flow V n in the inward normal direction of the boundary \(\partial B_{\rho }\) with speed | V n | . For more details, refer to [40, 73]. A related link between shape derivatives and topological derivatives has been demonstrated in [22]. Recently published related work on this topic is briefly reviewed in [33].

9 Case Studies

9.1 Case Study: Microwave Breast Screening

In section “Example 1: Microwave Breast Screening,” a complex breast model is presented for tackling the problem of early breast cancer detection from microwave data. Due to the high complexity of the model, the reconstruction algorithm is itself relatively complex. In [52] a reconstruction technique is proposed which uses five consecutive stages for the reconstruction. In the first stage, a pixel-by-pixel reconstruction is performed for the interior fatty and fibroglandular region, with the skin region being estimated and fixed (at this stage of the algorithm typically still incorrectly). Once a pixel-based reconstruction has been achieved, an initial shape for the fibroglandular region (the background then being fatty tissue) is extracted from it, which is then, in the succeeding stages, evolved jointly with the interior profiles, the skin region, and a possible tumor region until the final reconstruction is achieved. An important feature of the algorithm is that in different stages, different combinations of the unknowns (level set functions and interior parameter profiles) are evolved. For more details regarding the reconstruction algorithm, refer to [52].

Here, the pixel-by-pixel reconstructions of stage I of the algorithm and the final reconstructions using the complex breast model and a level set evolution are presented for the three breast models introduced in Fig. 1 and compared with each other in the cases where a small tumor is present. See Figs. 11–13. The upper left image of each figure shows the real breast, the central upper image shows the pixel-by-pixel reconstruction with our basic reconstruction scheme, and the upper right image shows the level set-based reconstruction using the complex breast model explained in section “Example 1: Microwave Breast Screening.” The bottom images show cross sections through a horizontal line indicated in the upper row images and passing through the tumor locations for the three images.

Fig. 11

First breast model of Fig. 1 with a disk-shaped tumor of diameter 8 mm situated deep inside the breast. Top left: reference permittivity profile (true tumor permittivity value \(\varepsilon _{s}^{{\mathrm{tum}}} = 53\)). Top center: the result at the end of stage I (pixel-by-pixel reconstruction). Top right: final reconstruction of the level set-based structural inversion scheme (reconstructed permittivity value \(\varepsilon _{{\mathrm{st}}}^{{\mathrm{reconst}}} = 50\)). Bottom: cross section through the true tumor location for constant y coordinate (the dashed line represents the true permittivity profile, the solid line the pixel-by-pixel result, and the dash-dotted line the structural inversion result). For more details, see [52]

Fig. 12

Second breast model of Fig. 1 with a large fibroglandular tissue and a disk-shaped tumor of 6 mm diameter. The images are arranged as in Fig. 11. The real static permittivity of the tumor is \(\varepsilon _{{\mathrm{st}}}^{{\mathrm{tumor}}} = 52\) and the reconstructed one is \(\varepsilon _{{\mathrm{st}}}^{{{\mathrm{reconst}}} } = 49\). See also the animated movie provided in [52] which shows the shape evolution for this example

Fig. 13

Third breast model of Fig. 1 with a region of fibroglandular tissue intermixed with adipose tissue. The hidden tumor is an ellipsoid of 5 × 6 mm (lengths of the principal axes). The images are displayed as in Fig. 11. The real static permittivity value of the tumor is \(\varepsilon _{{\mathrm{st}}}^{{\mathrm{tumor}}} = 50\) and the reconstructed one is \(\varepsilon _{{\mathrm{st}}}^{{{\mathrm{reconst}}} } = 42\). For more details, see [52]

The data are created on a different grid than the one used for the reconstruction. The corresponding signal-to-noise ratio is 26 dB. Forty antennas are used as sources and as receivers, which are situated equidistantly around the breast. Microwave frequencies of 1, 2, 3, 4, and 5 GHz are used for the illumination of the breast.

Even though the pixel-by-pixel reconstruction scheme is not optimized here, a general problem of pixel-based reconstruction can be identified immediately from the presented examples. The reconstructions tend to be oversmoothed, and the small tumor can hardly be identified from the pixel-based reconstruction. By no means is it possible to give any reliable estimate from these pixel-based reconstructions for the contrast of the interior tumor values relative to the fibroglandular or fatty tissue values of static relative permittivity. Admittedly, the level set reconstruction scheme takes advantage of the fact that it closely follows the correct model for breast tissue. On the other hand, this information is typically available (at least approximately) in breast screening applications, such that better estimates of the tumor characteristics can be expected when using such a level set-based complex breast model. This is confirmed by the three reconstructions shown in the upper right images of Figs. 11–13. For more details on this reconstruction scheme in microwave breast screening, and for an animated movie showing the image evolution, see [52].

9.2 Case Study: History Matching in Petroleum Engineering

Figure 14 shows the situation described in section “Example 2: History Matching in Petroleum Engineering” of history matching from production data. The image is composed of one zone of overall (approximately) bilinear behavior (a trend) and another zone where the permeability is smoothly varying without any clearly identifiable trend. The reconstruction follows this model and simultaneously evolves the region boundaries (i.e., the describing level set function), the three expansion parameters of the bilinear profile in the sandstone lithofacies, and the smoothly varying interior permeability profile in the shale region. The initial guess (upper right image of the figure) is obtained from well-log measurements. The true image is displayed in the upper left image of the figure and the final reconstruction in the center left image. The center right image shows the evolution of the data misfit cost during the joint evolution of all model unknowns, the lower left image shows the evolution of the three model parameters of the bilinear model in one of the regions, and the lower right image shows the initial, true, and final production rate profile over production time, averaged over all boreholes (the data). A classical pixel-based reconstruction scheme typically is not able to use different models for the parameter profiles in the different regions and simultaneously reconstruct the sharp region interfaces. For more details on this reconstruction scheme for history matching in reservoir engineering, including animated movies for additional numerical examples, see [35].

Fig. 14

Case study: history matching in reservoir engineering from production data. Left column from top to bottom: reference model, final reconstruction, and evolution of the parameter values \(\beta _{1}\), \(\beta _{2}\), \(\beta _{3}\) of the bilinear trend model; right column from top to bottom: initial guess, evolution of the least squares data misfit, and the initial (red solid), final (black dashed), and reference (black solid) total water production rate in m\(^{3}\)/s (i.e., the true and estimated measurements). The complete evolution as an animated file and more details on the reconstruction scheme can be found in [35]

9.3 Case Study: Reconstruction of Thin Shapes (Cracks)

Figure 15 shows a situation of shape evolution for the reconstruction of thin shapes (cracks) as described in section “Example 3: Crack Detection.” The numerical model presented in section “A Modification of the Classical Level Set Technique for Describing Cracks or Thin Shapes” is used here, where both level set functions describing the crack are evolved simultaneously, driven by the least squares data misfit term. It is clearly seen that also in this specific model topological changes occur automatically when marching from a single initial (and somewhat arbitrary) crack candidate (upper left image of the figure) towards the final reconstruction showing three different crack components (bottom left image of the figure). The true situation is displayed in the bottom middle image of the figure, which likewise shows three crack components that are roughly at the same locations and of similar shapes as the reconstructed ones. The evolution of the data misfit cost over artificial evolution time is displayed in the bottom right image of the figure. For more details on this reconstruction scheme and additional numerical experiments, see [4].

Fig. 15

Case study: crack reconstruction by an evolution of a thin shape. Top row from left to right: initial guess, reconstruction after 1 and 2 iterations. Middle row: after 5, 10, and 20 iterations. Bottom row: final reconstruction after 25 iterations, real crack distribution, and evolution of least squares data misfit with iteration index. The noise level is indicated in the bottom right image by a horizontal dotted line. One iteration amounts to successive application of the updates corresponding to the data of each source position in a single-step fashion

10 Cross-References

Readers interested in the material presented in this chapter will also find interesting and relevant additional material in many other chapters of this handbook. Some additional numerical results using level set techniques can be found, for example, in the chapter on EIT. Many concepts relevant to specific implementations of level set techniques can be found, amongst others, in the following chapters.