Deformable templates represent shapes as transformations of a given prototype, or template. One of the advantages of this approach is that the template needs to be specified only once, for a whole family of shapes. If the template is well chosen, describing the transformation leading to a shape results in a simpler representation, typically involving a small number of parameters. The conciseness of the description is important for detection or tracking algorithms in which the shape is a variable, since it reduces the number of degrees of freedom. Small-dimensional representations are also more easily amenable to probabilistic modeling, leading, as we will see, to interesting statistical shape models.

The methods that we describe provide a parametrized family of shapes, \((m(\theta ), \theta \in \varTheta )\), where \(\varTheta \) is a parameter set. Most of the time, \(\varTheta \) will be some subset of \({\mathbb {R}}^d\) but it can also be infinite-dimensional. We will always assume, as a convention, that \(0\in \varTheta \) and that m(0) represents the template.

To simplify the presentation, we will restrict to curves, therefore assuming that \(m(\theta )\) is a parametrized curve \(u \mapsto m(u, \theta )\) defined over a fixed interval \([a, b]\). Other situations can easily be transposed from this one. For example, one commonly uses configurations of labeled points, or landmarks, with \(m(\theta ) = (m_1(\theta ), \ldots , m_N(\theta ))\) as a finite-dimensional descriptor of a shape. Transposition from curves to surfaces is also easy.

6.1 Linear Representations

We start with a description of linear methods, in which

$$ m(\theta ) = m(0) + \sum _{k=1}^n \theta _k u_k, $$

where \(u_k\) is a displacement applied to m(0): for example, if m(0) is a closed curve, \(u_k\) is defined on \([a, b]\), taking values in \({\mathbb {R}}^d\) with \(u_k(a) = u_k(b)\). If m(0) is a configuration of points, \(u_k\) is a list of two-dimensional vectors.

The issue in this context is obviously how to choose the \(u_k\)’s. We will provide two examples, the first one based on a deterministic approach, and the second relying on statistical learning.

6.1.1 Energetic Representation

The framework developed in this section characterizes an object using a “small-deformation” model partially inspired by elasticity or mechanics. The object is described not by its appearance, but by how it deforms. Our presentation is inspired by that developed in [230] for face recognition. It includes the principal warps described in [41] as a particular case, and provides an interesting way of decomposing shape variations in a basis that depends on the geometry of the considered shape.

For such a shape m, we will consider small variations, represented by transformations \(h \mapsto F(m, h)\). For example, one can take \(F(m, h) = m+h\) when this makes sense. We assume that the small variations, h, belong to a Hilbert space H (see Appendix A), with dot product \(\langle {\cdot \,,}{\,\cdot }\rangle _m\), possibly depending on m.

Associate to h some deformation energy, denoted E(h). Attribute to a time-dependent variation, \(t\mapsto h(t)\), the total energy:

$$ J(h) = \frac{1}{2} \int \Vert \partial _t h(t)\Vert _m^2 dt + \int E(h(t)) dt. $$

Inspired by the Hamilton principle, we consider shape trajectories that are extremals of the Lagrangian \(\Vert \partial _t h\Vert ^2_m/2 - E(h)\), therefore characterized by

$$ \partial _t^2 h + \nabla E(h(t)) = 0, $$

where \(\nabla E\) is the Hilbert gradient, defined by

$$ \partial _\varepsilon E(h+\varepsilon w)_{{|_{\varepsilon =0}}} = {\big \langle {\nabla E(h)}\, , \, {w}\big \rangle }_m. $$

We make the assumption that this gradient exists. In fact, because we only analyze small variations, we will assume that a second derivative exists at \(h=0\), i.e., we assume that, for some symmetric operator \(\varSigma _m\),

$$ \nabla E(h) = \varSigma _m h + o(\Vert h\Vert _m). $$

Typically, we will have \(E\ge 0\) with \(E(0) = 0\), which ensures that \(\varSigma _m\) is a non-negative operator. The linearized equation for h now becomes

$$\begin{aligned} \partial _t^2 h + \varSigma _m h = 0. \end{aligned}$$
(6.1)

This equation has a simple solution when \(\varSigma _m\) is diagonalizable. Making this assumption (which is always true in finite dimensions), letting \((f_1, f_2, \ldots )\) be the eigenvectors and \((\lambda _1, \lambda _2, \ldots )\) the corresponding eigenvalues (in decreasing order), solutions of (6.1) take the form

$$ h(t) = \sum _{k\ge 1} \alpha ^{(k)}(t) f_k $$

with \(\partial _t^2 \alpha ^{(k)} + \lambda _k \alpha ^{(k)} = 0\), so that \(\alpha ^{(k)}\) oscillates with frequency \(\omega _k = \sqrt{\lambda _k}\).
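As a concrete, finite-dimensional illustration of this computation (not taken from the text), the following sketch diagonalizes a symmetric matrix \(\varSigma _m\) and integrates (6.1) mode by mode; the function names and numerical safeguards are ours.

```python
import numpy as np

def vibration_modes(sigma_m):
    """Eigendecomposition of a symmetric (non-negative) matrix Sigma_m.
    Returns eigenvalues in decreasing order and the matching eigenvectors."""
    lam, f = np.linalg.eigh(sigma_m)          # eigh returns ascending eigenvalues
    order = np.argsort(lam)[::-1]
    return lam[order], f[:, order]

def evolve(sigma_m, h0, h0_dot, t):
    """Solve d^2 h/dt^2 + Sigma_m h = 0 in the eigenbasis:
    alpha_k(t) = a_k cos(w_k t) + b_k sin(w_k t)/w_k, with w_k = sqrt(lambda_k)."""
    lam, f = vibration_modes(sigma_m)
    a, b = f.T @ h0, f.T @ h0_dot             # initial amplitudes and velocities
    w = np.sqrt(np.maximum(lam, 1e-12))       # guard against zero eigenvalues
    return f @ (a * np.cos(w * t) + b * np.sin(w * t) / w)
```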

The \(\omega _k\)’s form what was called a modal representation in [253]. These vibration modes can be used to describe and compare shapes (so that similar shapes should have similar vibration modes). It is also possible to use this model for a template-based representation: let m be a template, with a modal decomposition as before, and represent small variations as

$$ (\alpha _1, \ldots , \alpha _N) \rightarrow \tilde{m} = F\left( m, \sum _{k=1}^N \alpha _k f_k\right) , $$

which has a linearized deformation energy given by \(\sum _k \lambda _k \alpha _k^2\).

Consider a first example of such a construction using plane curves. Let \(m(\cdot )=m(0, \cdot )\) be the prototype and \(\varOmega _m\) its interior. Assume that m is parametrized by arc length. A deformation of m can be represented as a vector field \(s \mapsto h(s)N(s)\), where h is a scalar function and N the unit normal to m. The deformed template is \(s \mapsto m(s) + h(s)N(s)\). A simple choice for E is

$$ E(h) = \frac{1}{2}\int _0^{L} (\partial _s h)^2 ds, $$

for which \(\varSigma _m h = - \partial _s^2 h\) and Eq. (6.1) is

$$ \partial ^2_t h = \partial ^2_s h, $$

which is the classical wave equation in one dimension. Since this equation does not depend on the prototype, m, it is not really interesting for our purposes, and we need to consider energies that depend on geometric properties of m. The next simplest choice is probably

$$ E(h) = \frac{1}{2}\int _0^{L} \rho _m(s) (\partial _s h)^2 ds, $$

where \(\rho _m\) is some function defined along m, for example \(\rho _m = 1+\kappa ^2_m\) (where \(\kappa _m\) is the curvature along m). In this case, we get \(\varSigma _m h = - \partial _s (\rho _m \partial _s h)\). The vibration modes are the eigenvectors of this inhomogeneous diffusion operator along the curve.
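A possible discretization of this operator on a closed curve, and of its vibration modes, is sketched below; the sampling, the curvature estimate and all names are our own choices, not the author's.

```python
import numpy as np

def diffusion_modes(curve, n_modes=6):
    """Modes of Sigma_m h = -d/ds(rho_m dh/ds), rho_m = 1 + kappa^2, on a closed
    curve given as an (N, 2) array of points, assumed roughly arc-length sampled."""
    N = curve.shape[0]
    e = np.roll(curve, -1, axis=0) - curve                 # edge vectors
    ds = np.mean(np.linalg.norm(e, axis=1))                # average step length
    ang = np.arctan2(e[:, 1], e[:, 0])                     # edge directions
    turn = np.angle(np.exp(1j * (ang - np.roll(ang, 1))))  # wrapped turning angles
    rho = 1.0 + (turn / ds) ** 2                           # rho_m = 1 + kappa^2
    rho_half = 0.5 * (rho + np.roll(rho, -1))              # rho at edge midpoints
    Sigma = np.zeros((N, N))
    for i in range(N):
        ip, im = (i + 1) % N, (i - 1) % N                  # periodic neighbors
        Sigma[i, i] = (rho_half[i] + rho_half[im]) / ds**2
        Sigma[i, ip] = -rho_half[i] / ds**2
        Sigma[i, im] = -rho_half[im] / ds**2
    lam, f = np.linalg.eigh(Sigma)                         # symmetric matrix
    return lam[:n_modes], f[:, :n_modes]

# example: modes of an ellipse (sampled by angle, hence only approximately by arc length)
t = np.linspace(0, 2 * np.pi, 200, endpoint=False)
lam, modes = diffusion_modes(np.stack([2 * np.cos(t), np.sin(t)], axis=1))
```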

One can obviously consider many variations of this framework. Consider, for example, discrete shapes, represented by a finite collection of landmarks, so that a shape is now a finite collection \(m = (x_1, \ldots , x_N)\) with each \(x_i\in \mathbb R^2\). Given displacements \(h = (h_1, \ldots , h_N)\), define \(h^{(m)}(x)\), for \(x\in {\mathbb {R}}^2\) by

$$ h^{(m)}(x) = \sum _{i=1}^N g(|x-x_i|^2) \alpha _i $$

with \(g(t) = e^{-t/2\sigma ^2}\), where \(\alpha _1, \ldots , \alpha _N\in {\mathbb {R}}^2\) are chosen so that \(h^{(m)}(x_i) = h_i\), \(i=1, \ldots , N\). Then, we can define

$$\begin{aligned} E_m(h)= & {} \int _{{\mathbb {R}}^2} |h^{(m)}(x)|^2 dx \\= & {} \sum _{i, j=1}^N {\alpha _i^T}{\alpha _j} \int _{{\mathbb {R}}^2} g(|x_i-x|^2) g(|x - x_j|^2) dx\\= & {} \sum _{i, j=1}^N c_{ij}(m) {\alpha _i^T}{\alpha _j} \end{aligned}$$

with

$$ c_{ij} = \int _{{\mathbb {R}}^2} e^{- \frac{|x_i - x|^2}{2\sigma ^2} - \frac{|x_j - x|^2}{2\sigma ^2}} dx = \pi \sigma ^2 e^{-\frac{|x_i-x_j|^2}{4\sigma ^2}}. $$

Finally, notice that the interpolation constraints give \(\alpha = S(m)^{-1} h\), where \(s_{ij}(m) = g(|x_i-x_j|^2)\), so that

$$ E_m(h) = {\mathrm {trace}}\big (h^T S(m)^{-1} C(m) S(m)^{-1} h\big ), $$

where, in this expression, h (and similarly \(\alpha \)) is organized as an N by d matrix. The modal decomposition will, in this case, be provided by the eigenvalues and eigenvectors of \(S(m)^{-1} C(m) S(m)^{-1}\).
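For small landmark configurations, the matrices S(m) and C(m) and the resulting modes can be computed directly; the sketch below is ours (with the energy written as a trace over the coordinates) and only illustrates the construction.

```python
import numpy as np

def landmark_modes(x, sigma=1.0):
    """Modal decomposition of M = S^{-1} C S^{-1} for landmarks x of shape (N, 2),
    with s_ij = exp(-|x_i-x_j|^2/(2 sigma^2)) and the closed-form c_ij above."""
    d2 = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)   # |x_i - x_j|^2
    S = np.exp(-d2 / (2 * sigma**2))
    C = np.pi * sigma**2 * np.exp(-d2 / (4 * sigma**2))
    Sinv = np.linalg.inv(S)
    M = Sinv @ C @ Sinv
    lam, f = np.linalg.eigh(M)                                   # M is symmetric
    return lam, f, M

# usage: deformation energy of a displacement field h, organized as an (N, 2) array
# lam, f, M = landmark_modes(x); E = np.trace(h.T @ M @ h)
```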

The principal warp representation [41] is very similar to this one, and corresponds to

$$\begin{aligned} E_m(h) = {\mathrm {trace}}\big (h^T S(m)^{-1} h\big ). \end{aligned}$$
(6.2)

It is also associated with an energy computed as a function of \(h^{(m)}\), as will be clear to the reader after the description of reproducing kernel Hilbert spaces in Chap. 8.

One can also define

$$ E_m(h) = \int _{{\mathbb {R}}^2} {\mathrm {trace}}((dh^{(m)})^Tdh^{(m)}) dx $$

or some other function of \((dh^{(m)})^Tdh^{(m)}\), which corresponds to elastic energies. Closed-form computation can still be done as a function of \(h_1, \ldots , h_N\), and provides a representation similar to the one introduced in [230, 253].

6.2 Probabilistic Decompositions

6.2.1 Deformation Axes

One can build another kind of modal decomposition, based on a training set of shapes, using principal component analysis (PCA).

We will work with parametrized curves. The following discussion can be applied, however, with any of the shape representations described in Chap. 1, or, as considered in [73], with finite collections of points (landmarks) placed along (or within) the shape.

Assume that a training set is given containing N shapes that we will consider as versions of the same object or class of objects. We shall denote its elements by \(m^{(k)}(\cdot )\), \(k=1,\ldots , N\), and assume they are all defined on the same interval, I. The average is given by

$$ {\bar{m}}(u) = \frac{1}{N}\sum _{k=1}^N m^{(k)}(u). $$

A PCA (cf. Appendix E) applied to \(m^{(k)}\), \(k= 1, \ldots , N\), with the \(L^2\) inner product provides a finite-dimensional approximation called the active shape representation

$$\begin{aligned} m^{(k)}(u) = {\bar{m}}(u) + \sum _{i=1}^p \alpha _{ki} e^{(i)}(u), \end{aligned}$$
(6.3)

where the principal directions \(e^{(1)}, \ldots , e^{(p)}\) provide deformation modes along which the shape has the most variations.

This provides a new, small-dimensional curve representation, in terms of variations of the template \({\bar{m}}\). One can use it, for example, to detect shapes in an image, which requires the estimation of p parameters, plus three parameters (in two dimensions) describing the shape position in the image (rotation and translation).
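A minimal sketch of this construction, assuming the training curves have already been consistently parametrized and aligned (see the discussion below), could look as follows; the names and the PCA-via-SVD implementation are ours.

```python
import numpy as np

def active_shape_model(curves, p):
    """PCA of a training set of discretized curves.

    curves: (N, n, d) array -- N curves sampled at the same n parameter values
    (d = 2 for plane curves); p <= N is the number of retained modes.
    Returns the mean curve, the modes e^(i) and their standard deviations lambda_i."""
    N, n, d = curves.shape
    X = curves.reshape(N, n * d)
    mean = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean, full_matrices=False)  # PCA via SVD
    modes = Vt[:p].reshape(p, n, d)                           # principal directions e^(i)
    lambdas = s[:p] / np.sqrt(N)                              # std. deviation along each mode
    return mean.reshape(n, d), modes, lambdas

def reconstruct(mean, modes, alpha):
    """Active shape representation m = mean + sum_i alpha_i e^(i)  (Eq. (6.3))."""
    return mean + np.tensordot(alpha, modes, axes=1)
```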

One must be aware, when using this method, of the limits of the validity of the PCA approach, which is a linear method. It is not always “meaningful” to compute linear combinations of deformation vectors, even though, once the data is represented by an array of numbers, such a computation is always possible and easy. The important issue, however, is whether one can safely go back, that is, whether one can associate a valid shape (which can be interpreted as an object of the same category as the initial dataset) to any such linear combination. The answer, in general, is yes, provided the coefficients in the decomposition are not too large. Large coefficients, however, lead to large distortions or singularities and do not model interesting shapes. Because of this, PCA-based decompositions should be considered as first-order linear approximations of more complex, nonlinear, variations. Plane curves, for example, can certainly be considered as elements of some functional space, on which linear combinations are valid, but the resulting combinations do not always correspond to satisfactory shapes. To take an example, assume that the training set only contains triangles. The PCA decomposition includes no mechanism ensuring that the shapes remain triangular after decomposition on a few principal components. Most often, the representation will be very poor, as far as shape interpretation is concerned.

In fact, shape decomposition must always be, in one way or another, coupled with some feature alignment on the dataset. In [73], this is implicit, since the approach is based on landmarks that have been properly selected by hand. To deal with general curves, it is important to preprocess the parametrizations to ensure that they are consistent, in the sense that points with the same parameter have similar geometric properties. The curves cannot, in particular, all be assumed to be arc-length parametrized. One way to proceed is to assume that the parametrization is arc length for only one curve, say \(m^{(0)}\). For the other curves, say \(m^{(k)}, k=1,\ldots , N\), we want to make a change of parametrization, \(\varphi ^{(k)}\), such that \(m^{(k)}(\varphi ^{(k)}(s)) = m^{(0)}(s) + \delta ^{(k)}(s)\) with \(\delta ^{(k)}\) as small as possible. Methods to achieve such simultaneous parametrizations implement curve registration algorithms. They will be presented later in this book.

In addition to aligning the parametrization, it is important to also ensure that the geometries are aligned, with respect to linear transformations (such as rotations, translations, scaling). All these operations have the effect of representing all the shapes in the same “coordinate system”, within which linear methods will be more likely to perform well.

Finally, we notice that this framework can be used to generate stochastic models of shapes. We can use the expression

$$ m(u) = {\bar{m}}(u) + \sum _{i=1}^p \alpha _i e^{(i)}(u) $$

and generate random curves m by using randomly generated \(\alpha _i\)’s. Based on the statistical interpretation of PCA, the \(\alpha _i\)’s are uncorrelated, and their respective variances are the eigenvalues \(\lambda _i^2\) associated with the eigenvectors \(e^{(i)}\). Simple models generate the \(\alpha _i\)’s as independent Gaussian variables with variance \(\lambda _i^2\), or uniformly distributed on \([-\sqrt{3}\lambda _i, \sqrt{3}\lambda _i]\).
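Given the mean, modes and standard deviations of such a PCA model, random shapes can be generated as sketched below (a toy illustration of the two simple laws just mentioned; names are ours).

```python
import numpy as np

def sample_shapes(mean, modes, lambdas, n_samples=5, law="gaussian", rng=None):
    """Random curves m = mean + sum_i alpha_i e^(i), with independent alpha_i of
    variance lambda_i^2: Gaussian, or uniform on [-sqrt(3) lambda_i, sqrt(3) lambda_i]."""
    rng = np.random.default_rng() if rng is None else rng
    p = len(lambdas)
    if law == "gaussian":
        alpha = rng.standard_normal((n_samples, p)) * lambdas
    else:
        alpha = rng.uniform(-np.sqrt(3.0), np.sqrt(3.0), (n_samples, p)) * lambdas
    return mean + np.tensordot(alpha, modes, axes=1)   # shape (n_samples, n, d)
```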

6.3 Stochastic Deformation Models

6.3.1 Generalities

The previous approaches analyzed variations directly in the shape representation. We now discuss a point of view which first models deformations as a generic process, before applying them to the template.

We consider here the (numerically important) situation in which the deformed curves are polygons. Restricting ourselves to this finitely generated family will simplify the mathematical formulation of the theory. The template will therefore be represented as a list of contiguous line segments, and we will model a deformation as a process that can act on each line segment separately. The whole approach is a special case of Grenander’s theory of deformable templates, and we refer to [14, 134–136, 138] for more references and information. The general principles of deformable templates assume that an “object” can be built by assembling elementary components (called generators), with specified composition rules. In the case we consider here, generators are line segments and composition rules imply that exactly two segments are joined at their extremities. One then introduces a set of transformations (via a suitable group action) that modify the generators, under the constraints of maintaining the composition rules. In our example, the transformation group will consist of collections of planar similitudes.

6.3.2 Representation and Deformations of Planar Polygonal Shapes

The formulas being much simpler when expressed using complex notation, we identify a point \(p=(x, y)\) in the plane with the complex number \(x+iy\), that we also denote by p. A polygonal line can either be defined by the ordered list of its vertices, say \(s_0, \ldots , s_N \in {\mathbb {C}}\) or, equivalently, by one vertex \(s_0\) and the sequence of vectors \(v_k = s_{k+1} - s_{k}\), \(k=0,\ldots , N-1\). The latter representation has the advantage that the sequence \((v_0, \ldots , v_{N-1})\) is a translation-invariant representation of the polygon. A polygonal line modulo translations will therefore be denoted \(\pi = (v_0,\ldots , v_{N-1})\). The polygonal line is a polygon if it is closed, i.e., if and only if \(v_0+\cdots +v_{N-1} = 0\). A polygonal line with origin \(s_0\) will be denoted \((s_0,\pi )\).

A polygonal line can be deformed by a sequence of rotations and scalings applied separately to each edge \(v_k\). In \({\mathbb {C}}\), such a transformation is just a complex multiplication. Therefore, a deformation is associated with an N-tuple of non-vanishing complex numbers \({\varvec{z}}= (z_0, \ldots , z_{N-1})\), the action of \({\varvec{z}}\) on \(\pi \) being

$$\begin{aligned} {\varvec{z}}\cdot \pi = (z_0 v_0, \ldots , z_{N-1}v_{N-1})\,. \end{aligned}$$
(6.4)

This defines a group action (cf. Sect. B.5) of \(G=({\mathbb {C}}\setminus \{0\})^N\) on the set of polygonal lines with N vertices.
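In code, this representation and the action (6.4) are straightforward; the small sketch below (with hypothetical names) stores the edges as a complex array.

```python
import numpy as np

def edges(vertices):
    """Translation-invariant representation pi = (v_0, ..., v_{N-1}) of a closed
    polygon with complex vertices s_0, ..., s_{N-1} (s_N identified with s_0)."""
    return np.roll(vertices, -1) - vertices

def act(z, pi, s0=0.0):
    """Action (6.4): scale/rotate each edge by z_k, then rebuild the vertices from
    the origin s0.  The result is closed only if sum_k z_k v_k = 0, i.e. z in F(pi)."""
    v = z * pi
    return s0 + np.concatenate(([0.0], np.cumsum(v)[:-1]))
```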

In this group, some transformations play a particular role. Introduce the set

$$ \varDelta = \{{\varvec{z}}\in G, {\varvec{z}}= z(1, \ldots , 1), z\in {\mathbb {C}}\setminus \{0\}\}\, $$

(the diagonal in G). An element in \(\varDelta \) provides a single similitude applied simultaneously to all edges, i.e., \(\varDelta \) represents the actions of similitudes on polygons. Similarly, the set

$$ \varDelta _0 = \{{\varvec{z}}\in G, {\varvec{z}}= z(1, \ldots , 1), z\in {\mathbb {C}}, |z|=1\}\, $$

represents the action of rotations.

A polygonal line modulo similitudes (resp. rotations) can be represented as an orbit \(\varDelta \cdot \pi \) (resp. \(\varDelta _0\cdot \pi \)). We can define the quotient groups \(G/\varDelta \) and \(G/\varDelta _0\), namely the sets of orbits \(\varDelta \cdot {\varvec{z}}\) (resp. \(\varDelta _0\cdot {\varvec{z}}\)) for \({\varvec{z}}\in G\) (they have a group structure because G is commutative). One obtains a well-defined action of, say, \(G/\varDelta \) on polygonal lines modulo similitudes, by defining

$$ (\varDelta \cdot \mathbf z) \cdot (\varDelta \cdot \pi ) = \varDelta \cdot ({\varvec{z}}\cdot \pi ). $$

Given a polygon, \(\pi \), we define \(F(\pi )\) as the set of group elements \({\varvec{z}}\) in G that transform \(\pi \) into another polygon, namely

$$\begin{aligned} F(\pi ) = \{{\varvec{z}}\in G, z_0 v_0 + \cdots + z_{N-1}v_{N-1} = 0\}\,. \end{aligned}$$
(6.5)

Note that \(F(\pi )\) is not a subgroup of G.

We can use this representation to provide a stochastic model for polygonal lines. It suffices for this to choose a template \(\pi \) and a random variable \(\zeta \) on G and to take \(\zeta \cdot \pi \) to obtain a random polygonal line. Because we are interested in shapes, however, we will restrict ourselves to closed lines. Therefore, given \(\pi = (v_0,\ldots , v_{N-1})\), we will assume that \(\zeta \) takes values in \(F(\pi )\).

We now build simple probability distributions on G and \(F(\pi )\) for a fixed \(\pi \). Consider the function:

$$ E({\varvec{z}}) = (\alpha /2) \sum _{k=0}^{N-1} |z_k - 1|^2 + (\beta /2) \sum _{k=0}^{N-1} |z_k-z_{k-1}|^2. $$

The first term is large when \({\varvec{z}}\) is far from the identity, and the second one penalizes strong variations between consecutive \(z_i\)’s. Here and in the following, we let \(z_{-1} = z_{N-1}\).

We want to choose a probability distribution on G which is small when E is large. A natural choice would be to take the measure with density proportional to \(\exp (-E({\varvec{z}}))/\prod _{k=0}^{N-1} |z_k|^2\) with respect to the Lebesgue measure on \({\mathbb {C}}^{N}\). This is the “Gibbs measure”, with energy E, relative to the Haar measure, \(\prod _{k=0}^{N-1} dz_k/|z_k|^2\), which is the uniform measure on G. Such a choice is of interest in that it gives a very small probability to small values of \(|z_k|\), which is consistent with the fact that the \(z_k\)’s are non-vanishing on G. Unfortunately, this model leads to intractable computations, and we will rely on the simpler, but less accurate, model with density, f, proportional to \(\exp (-E({\varvec{z}}))\). This choice will greatly simplify the simulation algorithms, and in particular, the handling of the closedness constraint.

With \(\pi = (v_0, \ldots , v_{N-1})\), this constraint is expressed by \(\sum _k v_k z_k = 0\), and we will use the conditional density for f given this identity. This conditional distribution can be computed by using a discrete Fourier transform. Define

$$ u_l = {\hat{z}}_l = \frac{1}{\sqrt{N}}\sum _{k=0}^{N-1}z_ke^{-2i\pi \frac{kl}{N}}\,. $$

One can easily prove that E can be written

$$ E({\varvec{z}}) = \frac{1}{2}\left( \alpha |u_0 - \sqrt{N}|^2 + \sum _{l=1}^{N-1}\left( \alpha + 2\beta \left( 1-\cos \frac{2\pi l}{ N}\right) \right) |u_l|^2\right) \,, $$

and that the constraint becomes

$$ \sum _{l=0}^{N-1} {\hat{v}}_l u_l = 0 $$

with \({\hat{v}}_l = \frac{1}{\sqrt{N}}\sum _{k=0}^{N-1}v_ke^{2i\pi \frac{kl}{N}}\) (note the sign of the exponent, which ensures that \(\sum _k z_k v_k = \sum _l {\hat{v}}_l u_l\)). Notice that, because \(\pi \) is closed, we have \({\hat{v}}_0=0\).

Let \(w_0 = \sqrt{\alpha }(u_0-\sqrt{N})\), and, for \(l\ge 1\), \(w_l = \sqrt{\alpha + 2\beta (1- \cos \frac{2\pi l}{N})} u_l\), so that

$$ E({\varvec{z}}) = \frac{1}{2}\sum _{l=0}^{N-1} |w_l|^2. $$

Without the constraint, the previous computation implies that the real and imaginary parts of \(w_0, \ldots , w_{N-1}\) are mutually independent standard Gaussian variables: they therefore can be easily simulated, and the value of \(z_0, \ldots z_{N-1}\) directly computed after an inverse Fourier transform. Conditioning on closedness only slightly complicates the procedure. Replacing \(u_l\) by its expression as a function of \(w_1, \ldots , w_{N-1}\), and using \({\hat{v}}_0=0\), the constraint can be written in the form

$$ \sum _{l=0}^{N-1} c_l w_l = 0 $$

with \(c_0=0\) and, for \(l\ge 1\), \(c_l = {\hat{v}}_l\big /\sqrt{\alpha + 2\beta \big (1- \cos \frac{2\pi l}{N}\big )}\). The following standard lemma from the theory of Gaussian variables solves our problem.

Lemma 6.1

Let \({\varvec{w}}\) be a standard Gaussian vector in \({\mathbb {R}}^{2N}\), and let V be a vector subspace of \({\mathbb {R}}^{2N}\). Let \(\varPi _V\) be the orthogonal projection on V. Then, the random variable \(\varPi _V({\varvec{w}})\) follows the conditional distribution of \({\varvec{w}}\) given that \({\varvec{w}}\in V\).

Assume that \({\varvec{c}}= (c_0,\ldots , c_{N-1})\) has been normalized so that \(\sum |c_i|^2 = 1\). To sample closed random polygonal lines, it suffices to sample a standard Gaussian \({\varvec{w}}^*\) in \({\mathbb {C}}^N\), and set

$$ {\varvec{w}}= {\varvec{w}}^* - \left( \sum _{l=0}^{N-1} c_l w^*_l\right) \, \bar{{\varvec{c}}}. $$

(The complex conjugate ensures both that \(\sum _l c_l w_l = 0\) and that \({\varvec{w}}- {\varvec{w}}^*\) lies in the orthogonal complement of the constraint subspace.)
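Putting the previous steps together, closed random deformations can be simulated as sketched below; this is our own transcription (using numpy's FFT, with the \(1/\sqrt{N}\) normalization of the text written out explicitly), not code from the book.

```python
import numpy as np

def sample_closed_deformation(v, alpha, beta, rng=None):
    """Sample z = (z_0, ..., z_{N-1}) from the density proportional to exp(-E(z)),
    conditioned on the closedness constraint sum_k z_k v_k = 0.
    v: complex edge vectors of the (closed) template polygon."""
    rng = np.random.default_rng() if rng is None else rng
    N = len(v)
    l = np.arange(N)
    # w_l = scale_l * u_l for l >= 1, and w_0 = scale_0 * (u_0 - sqrt(N)) with scale_0 = sqrt(alpha)
    scale = np.sqrt(alpha + 2 * beta * (1 - np.cos(2 * np.pi * l / N)))

    # hat{v}_l = (1/sqrt(N)) sum_k v_k exp(+2 i pi k l / N); hat{v}_0 = 0 for a closed template
    v_hat = np.fft.ifft(v) * np.sqrt(N)
    c = v_hat / scale                               # constraint sum_l c_l w_l = 0
    c[0] = 0.0
    c = c / np.linalg.norm(c)                       # normalize so that sum |c_l|^2 = 1

    # unconstrained w: independent standard Gaussian real and imaginary parts
    w_star = rng.standard_normal(N) + 1j * rng.standard_normal(N)
    # orthogonal projection onto the constraint subspace (Lemma 6.1)
    w = w_star - np.sum(c * w_star) * np.conj(c)

    u = w / scale                                   # back to the Fourier coefficients u_l
    u[0] += np.sqrt(N)                              # reinstate the mean of u_0
    return np.fft.ifft(u) * np.sqrt(N)              # inverse DFT gives z

# usage: deform a regular polygon (a discretized circle), as in Fig. 6.1
# N = 100; s = np.exp(2j * np.pi * np.arange(N) / N)
# v = np.roll(s, -1) - s
# z = sample_closed_deformation(v, alpha=1.0, beta=20.0)
# vertices = np.cumsum(z * v)                       # closed up to numerical error
```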

Some examples of random shapes simulated with this process are provided in Fig. 6.1.

Fig. 6.1 Random deformations of a circle (with different values for \(\alpha \) and \(\beta \))

6.4 Segmentation with Deformable Templates

Using deformable templates in shape segmentation algorithms incorporates much stronger constraints than active contours, which only encode the assumption that shapes are smooth. If one knows the kind of shapes that are to be detected, one obviously gains in robustness and accuracy by using a segmentation method that looks for small variations of an average shape in this category.

Detection algorithms can be associated with the models provided in Sects. 6.2 and 6.3. Let us start with Sect. 6.2, with a representation that takes the form, denoting \(\alpha = (\alpha _1, \ldots , \alpha _{p_0})\):

$$ m^\alpha = {\bar{m}}+ \sum _{i=1}^{p_0} \alpha _i K^{(i)}\,, $$

for some template \({\bar{m}}\) and vector fields \(K^{(i)}\). This information also comes with the variance of \(\alpha _i\), denoted \(\lambda _i^2\).

The “pose” of the shape within the image is also unknown. It is associated with a Euclidean or affine transformation g applied to \(m^\alpha \). The problem is then to find g and \(\alpha \) such that \(gm^\alpha \) is close to the contours in the image while the deformation remains small.

One can use a variational approach for this purpose. As described in Sect. 5.4.9, one starts with the definition of a potential V which is small at points close to contours. One can then define

$$ E(g, \alpha ) = \sum _{i=1}^{p_0} \frac{\alpha ^2_i}{\lambda ^2_i} + \beta \int _I V(gm^\alpha (u)) du. $$

The derivatives of E are

$$ \partial _{\alpha _i}{E} = 2\frac{\alpha _i}{\lambda ^2_i} + \beta \int _I \nabla V(gm^\alpha (u))^T(gK^{(i)}(u))\, du $$

and

$$ \partial _g{E} = \beta \int _I m^\alpha (u) \nabla V(gm^\alpha (u))^T du $$

(it is a matrix). A similar computation can be made for variants of the definition of the cost function. One can, for example, add a penalty (such as \(|\log \det (g)|\)) to penalize shapes that are too small or too large. One can also replace the quadratic term in \(\alpha _i\) by boundedness constraints, such as \(|\alpha _i| < \sqrt{3}\lambda _i\).

If scaling to very small curves is penalized, it is plausible that, in contrast to the case of active contours, the global minimum of E provides an acceptable solution. However, from a practical point of view, minimizing E is a difficult problem, with many local minima. It is therefore still necessary to start the algorithm with a good guess of the initial curve.
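The sketch below implements a plain gradient descent on \(E(g, \alpha )\) using the derivatives above, assuming that the gradient of the potential V is available as a callable on arrays of points; the pose is reduced to its linear part, and all names, step sizes and stopping rules are hypothetical.

```python
import numpy as np

def fit_template(mbar, K, lam, grad_V, beta=1.0, step=1e-2, n_iter=500):
    """Gradient descent on E(g, alpha) = sum_i alpha_i^2/lambda_i^2
                                         + beta * mean_u V(g m^alpha(u)).
    mbar: (n, 2) template samples; K: (p, n, 2) modes; lam: (p,) std. deviations;
    grad_V: callable mapping an (n, 2) array of points to the (n, 2) gradients
    of the edge potential V."""
    p, n, _ = K.shape
    g = np.eye(2)                 # linear part of the pose (translation omitted here)
    alpha = np.zeros(p)
    for _ in range(n_iter):
        m_alpha = mbar + np.tensordot(alpha, K, axes=1)       # m^alpha, shape (n, 2)
        gV = grad_V(m_alpha @ g.T)                            # grad V at g m^alpha(u)
        # dE/dalpha_i = 2 alpha_i / lam_i^2 + beta * mean_u <grad V, g K^(i)(u)>
        grad_alpha = 2 * alpha / lam**2 \
            + beta * np.einsum('nk,pnj,kj->p', gV, K, g) / n
        # dE/dg_{kj} = beta * mean_u (grad V)_k(g m^alpha(u)) * m^alpha_j(u)
        grad_g = beta * gV.T @ m_alpha / n
        alpha -= step * grad_alpha
        g -= step * grad_g
    return g, alpha
```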

Consider now the representation of Sect. 6.3. We will use the same notation as in this section, a shape being modeled by a polygon \(\pi \) with N edges denoted \((v_0, \ldots , v_{N-1})\). A deformation is represented by N complex numbers \({\varvec{z}}= (z_0, \ldots , z_{N-1})\), with the action

$$ {\varvec{z}}\cdot \pi = (z_0v_0, \ldots , z_{N-1}v_{N-1}). $$

We have denoted by \(\varDelta \) (resp. \(\varDelta _0\)) the set of \({\varvec{z}}\)’s for which all \(z_i\)’s coincide (resp. coincide and have modulus 1); these subgroups of \(G = ({\mathbb {C}}\setminus \{0\})^N\) correspond to plane similitudes (resp. plane rotations).

We denote by \([{\varvec{z}}]\) and \([{\varvec{z}}]_0\) the classes of \({\varvec{z}}\) modulo \(\varDelta \) and \(\varDelta _0\). Similarly, when \(\pi \) is a polygon, we denote by \([\pi ]\) and \([\pi ]_0\) the classes of \(\pi \) modulo \(\varDelta \) and \(\varDelta _0\). For example,

$$ [{\varvec{z}}]=\{c\cdot {\varvec{z}}, c\in \varDelta \} $$

and

$$ [\pi ] = \{{\varvec{z}}\cdot \pi , {\varvec{z}}\in \varDelta \}\,. $$

A template should be considered as a polygon modulo \(\varDelta \) or \(\varDelta _0\) (depending on whether scale invariance is required), whereas a shape embedded in an image should be a polygon with an origin. Let \({\overline{\pi }}\) denote the template, although we should use the notation \([{\overline{\pi }}]\) or \([{\overline{\pi }}]_0\). Introduce a function \(V(\cdot )\), defined over the image, that is large for points that are far from image contours. The quantity that can be minimized is

$$ Q({\varvec{z}}, s_0) = E([{\varvec{z}}]) + \int _{m=s_0+{\varvec{z}}\cdot {\overline{\pi }}}V d\sigma _m, $$

with \(s_0\in {\mathbb {C}}\) and \({\varvec{z}}\in G\). The deformation energy E is a function defined on \(G/\varDelta \) (or equivalently a function defined on G, invariant under similitude transformations), that measures the difference between \({\varvec{z}}\) and a similitude. For example, with \({\varvec{z}}= (z_0, \ldots , z_{N-1})\), and \(z_k = r_ke^{i\theta _k}\), one can take

$$E([{\varvec{z}}]) = \sum _{k=1}^{N} (\log r_k - \log r_{k-1})^2 + \sum _{k=1}^{N}\arg (e^{i\theta _k - i\theta _{k-1}})^2\,.$$

Here, we have defined \(\arg z\), for \(z\ne 0\), as the unique \(\theta \in ]-\pi ,\pi ]\) such that \(z = re^{i\theta }\) with \(r>0\). We also use the convention \(r_N = r_0, \theta _N = \theta _0\) for the last term of the sum (assuming we are dealing with closed curves).

If scale invariance is relaxed, a simpler choice is

$$ E([{\varvec{z}}]_0) = \sum _{k=1}^{N} |z_k-z_{k-1}|^2. $$

Notice that for closed curves, it must be ensured that \({\varvec{z}}\cdot {\overline{\pi }}\) remains closed, which induces the additional constraint, taking \({\overline{\pi }} = (v_0, \ldots , v_{N-1})\):

$$\sum _{k=0}^{N-1} z_kv_k = 0\,.$$
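For completeness, the two deformation energies and the closedness constraint can be evaluated as follows (a small sketch with cyclic indexing; the names are ours).

```python
import numpy as np

def E_similitude_invariant(z):
    """Scale- and rotation-invariant energy (cyclic convention z_{-1} = z_{N-1})."""
    zp = np.roll(z, 1)                        # z_{k-1}
    dlog_r = np.log(np.abs(z)) - np.log(np.abs(zp))
    dtheta = np.angle(z / zp)                 # arg(e^{i(theta_k - theta_{k-1})}) in (-pi, pi]
    return np.sum(dlog_r**2) + np.sum(dtheta**2)

def E_rotation_invariant(z):
    """Rotation-invariant energy sum_k |z_k - z_{k-1}|^2."""
    return np.sum(np.abs(z - np.roll(z, 1)) ** 2)

def is_closed(z, v, tol=1e-8):
    """Closedness constraint sum_k z_k v_k = 0 for the deformed template."""
    return abs(np.sum(np.asarray(z) * np.asarray(v))) < tol
```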

It is interesting to compute the continuum limit of this energy. Still using complex numbers, we consider a \(C^1\) template curve \(m: I\rightarrow {\mathbb {C}}\), where \(I=[0,L]\) is an interval, with arc-length parametrization. For a given N, we consider the polygon \(\pi = (v_0, \ldots , v_{N- 1})\), with

$$ v_k = m\left( \frac{(k+1)L}{N}\right) - m\left( \frac{kL}{N}\right) \simeq \frac{L}{ N} \partial _s m\left( \frac{kL}{N}\right) \,. $$

A deformation, represented by \({\varvec{z}}= (z_0, \ldots , z_{N-1})\), will also be assumed to come from a smooth curve \(\zeta \) defined on [0, 1], with \(z_k = \zeta (k/N)\). The continuum equivalent of \(\pi \mapsto {\varvec{z}}\cdot \pi \) can then be written as a transformation of derivatives:

$$ \partial _s m(s) \mapsto \zeta (s/L) \partial _s m(s), $$

which leads us to define an action of non-vanishing complex-valued curves \(\zeta \) on closed curves by

$$ (\zeta \cdot m)(s) = \int _0^s \zeta (u/L) \dot{m}(u) du\,. $$

In the rotation-invariant case, the energy of the action should be obtained as the limit of a suitably rescaled version of

$$ \sum _{k=1}^{N-1} |z_k-z_{k-1}|^2. $$

Using the fact that \(z_k - z_{k-1}\simeq \dot{\zeta }({(k-1)/N})/N\), we have the continuum equivalent

$$ N \sum _{k=1}^{N-1} |z_k-z_{k-1}|^2 \rightarrow \int _0^1 |\dot{\zeta }(s)|^2 ds\,. $$

This limit is the squared \(H^1\) semi-norm of the deformation generator \(\zeta \) along the curve.