9.1 General Principles

In the previous two chapters, we introduced and studied basic tools related to deformations and their mathematical representation using diffeomorphisms. In this chapter, we start investigating relations between deformations and the objects they affect, which we will call deformable objects, and discuss the variations of matching functionals, which are cost functions that measure the quality of the registration between two deformable objects.

Let \(\varOmega \) be an open subset of \({\mathbb {R}}^d\) and G a group of diffeomorphisms on \(\varOmega \). Consider a set \({\mathcal I}\) of structures of interest, on which G has an action: for every I in \({\mathcal I}\) and every \(\varphi \in G\), the result of the action of \(\varphi \) on I is denoted \(\varphi \cdot I\) and is a new element of \({\mathcal I}\). This requires (see Sect. B.5) that \({\mathrm {id}}\cdot I = I\) and \(\varphi \cdot (\psi \cdot I) = (\varphi \circ \psi )\cdot I\). Elements of \({\mathcal I}\) will be referred to as deformable objects.

A matching functional is based on a function \(D: {\mathcal I}\times {\mathcal I}\rightarrow [0, +\infty )\) such that \(D(I, I')\) measures the discrepancy between the two objects I and \(I'\), and is defined over G by

$$\begin{aligned} E_{I, I'}(\varphi ) = D(\varphi \cdot I, I'). \end{aligned}$$
(9.1)

So \(E_{I, I'}(\varphi )\) measures the difference between the target object \(I'\) and the deformed one \(\varphi \cdot I\). Because it is mapped onto the target by the deformation, the object I will often be referred to as the template (and \(\varphi \cdot I\) as the deformed template).

Even if our discussion of matching principles and algorithms is rather extensive, and occupies a large portion of this book, the size of the literature and our choice to privilege methods that implement diffeomorphic matching prevent us from providing an exhaustive account of the registration methods that have been proposed over the last few decades. The interested reader can refer to a few starting points to complement the presentation made here, including [12, 13, 22, 27, 28, 41, 42, 111, 125, 240, 244, 275], and textbooks such as [132, 139, 208, 214].

9.2 Differentiation with Respect to Diffeomorphisms

We will review, starting with the next section, a series of matching functionals that are adapted to different types of deformable objects (landmarks, images, curves, etc.). We will also compute the derivative of each of them with respect to the diffeomorphism \(\varphi \).

We also introduce a special form of differential which is adapted to variational problems over diffeomorphisms. This shape, or Eulerian, differential, as we will call it, is a standard tool in shape optimization [80], and we will interpret it later on as a gradient for a specific Riemannian metric over diffeomorphisms.

Recall that we have defined \(\mathrm {Diff}^{p, \infty }=\mathrm {Diff}^{p, \infty }(\varOmega )\) to be the set of diffeomorphisms \(\psi \) such that

$$ \max (\Vert \psi -{\mathrm {id}}\Vert _{p, \infty }, \Vert \psi ^{-1} -{\mathrm {id}}\Vert _{p, \infty }) < \infty . $$

We have also defined \(\mathrm {Diff}^{p, \infty }_0\) as the subgroup of \(\mathrm {Diff}^{p, \infty }\) whose elements converge to the identity at infinity.

Definition 9.1

A function \(\varphi \mapsto U(\varphi )\) is \((p, \infty )\)-compliant if it is defined for all \(\varphi \) in \(\mathrm {Diff}^{p, \infty }_0\).

A \((p, \infty )\)-compliant U is locally \((p, \infty )\)-Lipschitz if, for all \(\varphi \in \mathrm {Diff}^{p, \infty }_0\), there exist positive numbers \(\varepsilon (\varphi )\) and \(C(\varphi )\) such that

$$ |U(\psi ) - U(\tilde{\psi })| \le C(\varphi ) \Vert \psi - \tilde{\psi }\Vert _{p, \infty } $$

whenever \(\psi \) and \(\tilde{\psi }\) are diffeomorphisms such that

$$\max (\Vert \psi -\varphi \Vert _{p, \infty },\Vert \tilde{\psi }-\varphi \Vert _{p, \infty })<\varepsilon (\varphi ).$$

Note that a \((p, \infty )\)-compliant (resp. locally Lipschitz) U is \((q, \infty )\)-compliant (resp. locally Lipschitz) for any q larger than p.

Because \(\mathrm {Diff}^{p,\infty }_0\) is an open subset of \({\mathrm {id}}+C^p_0(\varOmega , {\mathbb {R}}^d)\), both Gâteaux and Fréchet derivatives are well defined for functions defined on this set (see Sect. C.1). In the following, whenever we speak of a derivative (without a qualifier), this will always mean in the strong (Fréchet) sense. A function U is \(C^1\) on \(\mathrm {Diff}_0^{p, \infty }\) if and only if U is Fréchet differentiable and \(d U(\psi )\) is continuous in \(\psi \), which is equivalent (by Proposition C.5) to U being Gâteaux differentiable with \(d U(\psi )\) continuous in \(\psi \). Note also that U being \(C^1\) implies that U is locally \((p, \infty )\)-Lipschitz.

Using the group structure of \(\mathrm {Diff}_0^{p, \infty }\), we can define another type of differential using the infinitesimal action of vector fields. If V is an admissible vector space and \(v\in V\), we will denote by \(\varphi _{0t}^v\) the flow associated to the equation

$$ \partial _ty = v(y). $$

Note that this is the same notation as for the flow associated to a differential equation \(\partial _t y = v(t, y)\), where v is a time-dependent vector field. There is no conflict of notation provided one agrees to identify a vector field \(v\in V\) with the associated constant time-dependent vector field defined by \(\tilde{v}(t,\cdot ) = v\) for all t.
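Numerically, the flow \(\varphi _{0t}^v\) can be approximated by integrating the ODE with any standard scheme. The Python sketch below is an illustration only (a Runge–Kutta integrator and the arbitrary one-dimensional field \(v(y) = \sin y\) are my choices, not part of the text); it also checks the semigroup property \(\varphi _{0,s+t}^v = \varphi _{0t}^v\circ \varphi _{0s}^v\) satisfied by flows of time-independent vector fields.

```python
import numpy as np

def flow(v, y0, t, n_steps=1000):
    """Approximate the flow map y0 -> phi^v_{0t}(y0) of dy/dt = v(y) with RK4."""
    y = np.asarray(y0, dtype=float)
    h = t / n_steps
    for _ in range(n_steps):
        k1 = v(y)
        k2 = v(y + 0.5 * h * k1)
        k3 = v(y + 0.5 * h * k2)
        k4 = v(y + h * k3)
        y = y + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
    return y

v = lambda y: np.sin(y)          # an arbitrary smooth, bounded field (d = 1)
x = np.array([0.5, 1.0, 2.0])    # sample starting points

# Semigroup property of autonomous flows: phi_{0,s+t} = phi_{0,t} o phi_{0,s}
lhs = flow(v, x, 0.7)
rhs = flow(v, flow(v, x, 0.3), 0.4)
print(np.allclose(lhs, rhs, atol=1e-8))  # True
```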

Definition 9.2

Let V be an admissible Hilbert space continuously embedded in \(C^p_0(\varOmega , {\mathbb {R}}^d)\) (so that \(\mathrm {Diff}_V \subset \mathrm {Diff}^{p, \infty }_0\)). We say that a \((p, \infty )\)-compliant function U over diffeomorphisms has an Eulerian differential in V at \(\psi \) if there exists a linear form \(\bar{\partial }U(\psi )\in V^*\) such that, for all \(v\in V\),

$$\begin{aligned} {\left( {\bar{\partial }U(\psi )}\, \left| \, {v}\right. \right) } = \partial _\varepsilon U(\varphi _{0\varepsilon }^v \circ \psi )_{|_{\varepsilon =0}}. \end{aligned}$$
(9.2)

If the Eulerian differential exists, the V-Eulerian gradient of U at \(\psi \), denoted \({\overline{\nabla }}^VU(\psi ) \in V\), is defined by

$$\begin{aligned} {\big \langle {{\overline{\nabla }}^V U(\psi )}\, , \, {v}\big \rangle }_V = {\left( {\bar{\partial }U(\psi )}\, \left| \, {v}\right. \right) }. \end{aligned}$$
(9.3)

In this case, \({\overline{\nabla }}^V U(\psi ) = \mathbb {K}\, \bar{\partial }U(\psi ) \), where \(\mathbb {K}\) is the kernel operator of V.
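When the Eulerian differential is a finite sum of point masses, \(\bar{\partial }U(\psi ) = \sum _i a_i\delta _{x_i}\), applying the kernel operator gives the vector field \(\sum _i K(\cdot , x_i)a_i\). A minimal Python sketch of this computation, assuming a scalar Gaussian kernel (the kernel choice and the point configuration are illustrative):

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """Scalar Gaussian kernel K(x, y) = exp(-|x-y|^2 / (2 sigma^2)) (times the identity)."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def eulerian_gradient(points, momenta, sigma=1.0):
    """Apply the kernel operator to the vector measure sum_i a_i delta_{x_i}:
    returns the vector field x -> sum_i K(x, x_i) a_i."""
    def v(x):
        return sum(gaussian_kernel(x, xi, sigma) * ai
                   for xi, ai in zip(points, momenta))
    return v

pts = [np.array([0.0, 0.0]), np.array([1.0, 0.0])]   # locations x_i
mom = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]   # vectors a_i
v = eulerian_gradient(pts, mom)
print(v(np.array([0.0, 0.0])))   # a_1 + e^{-1/2} a_2
```

Evaluating v at \(x_1\) returns \(a_1 + e^{-1/2} a_2\): the momentum carried at \(x_1\) plus the kernel-weighted contribution of the second point.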

The following proposition indicates when Eq. (9.2) remains valid with time-dependent vector fields v.

Proposition 9.3

Let V be an admissible Hilbert space continuously embedded in \(C^{p+1}_0(\varOmega , {\mathbb {R}}^d)\). Let V and U satisfy the hypotheses of Definition 9.2. If U is \((p, \infty )\)-locally Lipschitz and has a V-Eulerian differential at \(\psi \) and if \(v(t,\cdot )\) is a time-dependent vector field such that

$$\begin{aligned} \lim _{\varepsilon \rightarrow 0} \frac{1}{\varepsilon }\int _0^\varepsilon \Vert v(t,\cdot ) - v(0, \cdot )\Vert _Vdt = 0, \end{aligned}$$
(9.4)

then

$$\begin{aligned} {\left( {\bar{\partial }U(\psi )}\, \left| \, {v(0, \cdot )}\right. \right) } = \partial _\varepsilon U(\varphi _{0\varepsilon }^v \circ \psi )_{|_{\varepsilon =0}}. \end{aligned}$$
(9.5)

Proof

Letting \(v_0 = v(0, \cdot )\), we need to prove that

$$ \frac{1}{\varepsilon } (U(\varphi _{0\varepsilon }^v \circ \psi ) - U(\varphi _{0\varepsilon }^{v_0} \circ \psi )) \rightarrow 0 $$

as \(\varepsilon \rightarrow 0\). From Proposition 7.4, we know that if \(\psi , \varphi , \tilde{\varphi }\) are in \(\mathrm {Diff}_0^{p, \infty }\), there exists a constant \(C_p(\psi )\) such that

$$ \Vert \varphi \circ \psi - \tilde{\varphi }\circ \psi \Vert _{p, \infty } \le C_p(\psi ) \Vert \varphi - \tilde{\varphi }\Vert _{p, \infty }. $$

Now, since U is Lipschitz, we have, for small enough \(\varepsilon \),

$$\begin{aligned} |U(\varphi _{0\varepsilon }^v \circ \psi ) - U(\varphi _{0\varepsilon }^{v_0} \circ \psi )|\le & {} C(\psi ) \Vert \varphi _{0\varepsilon }^v \circ \psi - \varphi _{0\varepsilon }^{v_0} \circ \psi \Vert _{p, \infty }\\\le & {} C(\psi ) C_p(\psi ) \Vert \varphi _{0\varepsilon }^v-\varphi _{0\varepsilon }^{v_0}\Vert _{p, \infty }\\\le & {} C(\psi ) C_p(\psi ) \tilde{C}(v_0) \int _0^\varepsilon \Vert v(t, \cdot ) - v_0\Vert _V dt, \end{aligned}$$

where \(\tilde{C}(v_0)\) depends on \(\Vert v_0\Vert _{p+1, \infty }\) and can be derived from Eq. (7.16). Noting that \(\Vert v\Vert _{{\mathcal X}^{p+1,1}, \varepsilon } \le \varepsilon (C'' + \Vert v_0\Vert _{p+1, \infty })\) for small enough \(\varepsilon \), this proves the proposition.    \(\square \)

Note also that, if U is \(C^1\), then the chain rule implies that

$$\begin{aligned} {\left( {\bar{\partial }U(\varphi )}\, \left| \, {v}\right. \right) } = {\left( {d U(\varphi )}\, \left| \, {v\circ \varphi }\right. \right) }. \end{aligned}$$
(9.6)

To the Eulerian gradient of U, we associate a “gradient descent” process (that we will formally interpret as a Riemannian gradient descent for a suitable metric in Sect. 11.4.3) which generates a time-dependent element of G by setting

$$\begin{aligned} \partial _t \varphi (t, x) = - {\overline{\nabla }}^V U (\varphi (t)) (\varphi (t, x)). \end{aligned}$$
(9.7)

As long as \(\int _0^t \left\| {\overline{\nabla }}^VU (\varphi (s)) \right\| _V ds\) is finite, this generates a time-dependent element of \(\mathrm {Diff}_V\). This therefore provides an evolution within the group of diffeomorphisms, an important property. Assuming that Proposition 9.3 applies at time t (e.g., if U is \(C^1\)), we can write

$$ \partial _t {U(\varphi (t))} = {\big \langle {{\overline{\nabla }}^V U(\varphi (t))}\, , \, {\partial _t \varphi }\big \rangle }_V = - \left\| {\overline{\nabla }}^V U(\varphi (t)) \right\| _V^2, $$

so that \(U(\varphi (t))\) decreases with time.

9.3 Relation with Matching Functionals

As pointed out in the introduction, matching functionals take the form

$$\begin{aligned} U(\varphi ) = U_I(\varphi ) =Z( \varphi \cdot I), \end{aligned}$$
(9.8)

where I is a fixed deformable object for some function Z (e.g., \(Z(I) = D(I, I_1)\) for a fixed \(I_1\)). Using the group action property, we have

$$ U_I(\psi ) = U_{\varphi \cdot I}(\psi \circ \varphi ^{-1}). $$

Using this property and the fact that the mapping \(\psi \mapsto \psi \circ \varphi ^{-1}\) is smooth (infinitely differentiable) from \(\mathrm {Diff}^{p, \infty }_0\) onto itself, we find that if \(U_I\) is Gâteaux or Fréchet differentiable at \( \psi ={\mathrm {id}}\) for every \(I\in {\mathcal I}\), then it is differentiable at all \(\psi \in \mathrm {Diff}^{p, \infty }_0\), and

$$ {\left( {d U_I(\varphi )}\, \left| \, {h}\right. \right) } = {\left( {d U_{\varphi \cdot I}({\mathrm {id}})}\, \left| \, {h\circ \varphi ^{-1}}\right. \right) }. $$

A similar statement holds for the Eulerian differential, with

$$ {\left( {\bar{\partial }U_I(\psi )}\, \left| \, {v}\right. \right) } = {\left( {\bar{\partial }U_{\psi \cdot I}({\mathrm {id}})}\, \left| \, {v}\right. \right) }. $$

Notice that when U is differentiable at \(\psi ={\mathrm {id}}\), then \(\bar{\partial }U({\mathrm {id}}) = dU({\mathrm {id}})\). Finally, if we assume that \({\mathcal I}\) is itself a Banach space (or an open subset of a Banach space), that Z is differentiable, and that the action \(R_I: \varphi \mapsto \varphi \cdot I\) is also differentiable, then, using the chain rule, we have

$$ {\left( {dU_I(\varphi )}\, \left| \, {v}\right. \right) } = {\left( {dZ(\varphi \cdot I)}\, \left| \, {dR_I(\varphi ) v}\right. \right) } $$

or \(dU_I(\varphi ) = dR_I(\varphi )^* dZ(\varphi \cdot I)\). At \(\varphi = {\mathrm {id}}\), \(dR_I({\mathrm {id}})v\) is the infinitesimal action of v on I, which we denote by \(v\cdot I\), so that

$$ {\left( {dU_I({\mathrm {id}})}\, \left| \, {v}\right. \right) } = {\left( {dZ(I)}\, \left| \, {v\cdot I}\right. \right) } $$

and

$$ {\left( {\bar{\partial }U_I(\psi )}\, \left| \, {v}\right. \right) } = {\left( {dZ(\psi \cdot I)}\, \left| \, {v\cdot (\psi \cdot I)}\right. \right) }. $$

We now present a series of matching problems, involving different types of deformable objects. In each case, we will introduce adapted matching functionals and compute their differentials. As just remarked, derivatives with respect to the diffeomorphisms can all be derived from that of the function Z, on which we will, whenever possible, focus the computations.

9.4 Labeled Point Matching

The simplest way to represent a visual structure is with configurations of labeled points, or landmarks attached to the structure. Anatomical shapes or images are typical examples of structures on which landmarks can be easily defined; this includes specific locations in faces (corners of the eyes, tip of the nose, etc.), fingertips for hands, apex of the heart, etc. Many man-made objects, like cars or other vehicles, can be landmarked too. Finally, landmarks can represent the centers of simple objects, like cells in biological images.

In the labeled point-matching problem, objects are ordered collections of N points \(x_1, \ldots , x_N \in \varOmega \), where N is fixed. Diffeomorphisms act on such objects by:

$$\begin{aligned} \varphi \cdot (x_1, \ldots , x_N) = (\varphi (x_1), \ldots , \varphi (x_N)). \end{aligned}$$
(9.9)

The landmark-matching problem is not to find correspondences between two objects, say \(I=(x_1, \ldots , x_N)\) and \(I' = (x'_1, \ldots , x'_N)\), since we know that \(x_i\) and \(x'_i\) are homologous, but to extrapolate these correspondences to the rest of the space.

Here we can take \({\mathcal I}= ({\mathbb {R}}^d)^N\), or, if one restricts to distinct landmarks, the open subset

$$ {\mathcal I}= \left\{ (x_1, \ldots , x_N)\in ({\mathbb {R}}^d)^N, x_i\ne x_j \text { if }i\ne j \right\} . $$

For \(I=(x_1, \ldots , x_N)\), the action \(\varphi \mapsto \varphi \cdot I\) is \(C^1\) on \(\mathrm {Diff}^{p, \infty }_0\) for any \(p\ge 0\) (it is the restriction of a linear map, with \(|\varphi \cdot I| \le \sqrt{N}\Vert \varphi \Vert _\infty \)). The simplest matching functional that we can consider for this purpose is associated with

$$ Z(I) = |I-I'|^2 = \sum _{k=1}^N |x_k-x'_k|^2 $$

with \({\left( {dZ(I)}\, \left| \, {h}\right. \right) } = 2 (I-I')^T h\) (considering I, \(I'\) and h as dN-dimensional column vectors). We have

$$\begin{aligned} U_I(\varphi ) = E_{I, I'}(\varphi ) = |I'-\varphi \cdot I|^2 = \sum _{i=1}^N \left| x'_i - \varphi (x_i) \right| ^2, \end{aligned}$$
(9.10)
$$\begin{aligned} {\left( {d U_I(\varphi )}\, \left| \, {v}\right. \right) } = 2 \sum _{i=1}^N (\varphi (x_i) - x'_i)^Tv(x_i). \end{aligned}$$
(9.11)

This can be written as

$$ d U_I(\varphi ) = 2 \sum _{i=1}^N (\varphi (x_i) - x'_i) \delta _{x_i}. $$

From (9.6), we have

$$\begin{aligned} {\Big ( {\bar{\partial }U_I(\varphi )}\,\Big |\, {h}\Big )} = 2 \sum _{i=1}^N (\varphi (x_i) - x'_i)^Th\circ \varphi (x_i) \end{aligned}$$
(9.12)

or

$$ \bar{\partial }U_I(\varphi ) = 2 \sum _{i=1}^N (\varphi (x_i) - x'_i) \delta _{\varphi (x_i)}, $$

and (9.3) gives

$$\begin{aligned} {\overline{\nabla }}^V U_I(\varphi ) = 2 \sum _{i=1}^N K(\cdot , \varphi (x_i))(\varphi (x_i) - x'_i). \end{aligned}$$
(9.13)

The gradient descent algorithm (9.7) takes a very simple form:

$$\begin{aligned} \partial _t \varphi (t, x) = - 2 \sum _{i=1}^N K(\varphi (t,x), \varphi (t,x_i)) (\varphi (t, x_i) - x'_i). \end{aligned}$$
(9.14)

This system can be solved in two steps: let \(y_i(t) = \varphi (t, x_i)\). Applying (9.14) at \(x = x_j\) yields

$$ \partial _t y_j = - 2 \sum _{i=1}^N K(y_j, y_i) (y_i - x'_i). $$

This is a differential system in \(y_1, \ldots , y_N\). The first step is to solve it with initial conditions \(y_j(0) = x_j\). Once this is done, the extrapolated value of \(\varphi (t, x)\) for a general x is the solution of the differential equation

$$ \partial _t y = - 2 \sum _{i=1}^N K(y, y_i) (y_i - x'_i) $$

initialized at \(y(0) = x\). Figure 9.1 gives an example obtained by running this procedure, providing an illustration of the impact of the choice of the kernel for the solution. The last panel in Fig. 9.1 also shows the limitations of this algorithm, in the sense that it is trying to move the points in the direction of their targets at each step, while a more indirect path can sometimes be found generating less distortion (these results should be compared to Fig. 10.1 in Chapter 10).
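A direct discretization of this two-step procedure can be sketched as follows (Python; the Gaussian kernel, the forward-Euler integrator, the three-landmark configuration, and the values of \(\sigma \), dt and the number of steps are all illustrative choices):

```python
import numpy as np

def K(x, y, sigma=1.0):
    """Gaussian kernel, a scalar multiple of the identity matrix."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def greedy_landmark_match(x0, targets, sigma=1.0, dt=0.01, n_steps=2000):
    """Forward-Euler integration of the landmark system
    dy_j/dt = -2 sum_i K(y_j, y_i) (y_i - x'_i), started at y_j(0) = x_j."""
    y = np.array(x0, dtype=float)
    for _ in range(n_steps):
        dy = np.zeros_like(y)
        for j in range(len(y)):
            for i in range(len(y)):
                dy[j] -= 2.0 * K(y[j], y[i], sigma) * (y[i] - targets[i])
        y = y + dt * dy
    return y

# A hypothetical three-landmark configuration ("diamonds" moving to "circles")
x0 = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
xp = np.array([[0.2, 0.1], [1.1, 0.3], [-0.1, 1.2]])
y = greedy_landmark_match(x0, xp)
print(np.max(np.abs(y - xp)))  # close to 0: the landmarks have reached their targets
```

A full implementation would also store the trajectories \(y_i(t)\), so that the second (extrapolation) step can integrate the same right-hand side starting from an arbitrary point x.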

Fig. 9.1
figure 1

Greedy landmark matching. Implementation of the gradient descent algorithm in (9.14), starting with \(\varphi ={\mathrm {id}}\), for the correspondences depicted in the upper-left image (diamonds moving to circles). The following three images provide the result after numerical convergence for Gaussian kernels \(K(x, y) = \exp (-|x-y|^2/2\sigma ^2){\mathrm {Id}}\) with \(\sigma = 1, 2, 4\) in grid units. Larger values of \(\sigma \) induce increasing smoothness in the final solution, and deformations affecting a larger part of the space. As seen in the figure for \(\sigma = 4\), the evolution can result in huge deformations

9.5 Image Matching

Images, or more generally multivariate functions, are also important and widely used instances of deformable objects. They correspond to functions I defined on \(\varOmega \) with values in \({\mathbb {R}}\). Diffeomorphisms act on them by:

$$ (\varphi \cdot I)(x) = I(\varphi ^{-1}(x)) $$

for \(x\in \varOmega \). Fixing two such functions I and \(I'\), the simplest matching functional which can be considered is the squared \(L^2\) norm of the difference \(Z(I) = \Vert I-I'\Vert _2^2\), yielding

$$\begin{aligned} U_I(\varphi ) = E_{I, I'}(\varphi ) = \int _\varOmega \left| I\circ \varphi ^{-1}(x) - I'(x) \right| ^2 dx. \end{aligned}$$
(9.15)

We will need the derivative of the mapping \(\mathrm {Inv}: \varphi \mapsto \varphi ^{-1}\). Considering \(\mathrm {Inv}\) as a mapping from \(\mathrm {Diff} _0^{p+1, \infty }\) to \(\mathrm {Diff}_0^{p,\infty }\), it is given by (see Proposition 7.8)

$$\begin{aligned} d\mathrm {Inv}(\varphi ) h = - \left( d\varphi \circ \varphi ^{-1}\right) ^{-1} h\circ \varphi ^{-1} = - d(\varphi ^{-1}) h\circ \varphi ^{-1}. \end{aligned}$$
(9.16)

Similarly, \(\mathrm {Inv}\) is \(C^k\) from \(\mathrm {Diff}^{p+k, \infty }_0\) to \(\mathrm {Diff}^{p,\infty }_0\).

We now compute the derivative of \(U_{I}\) under the assumption that \(I'\) is square integrable, I is compactly supported (on some set \(Q_I\)) and continuously differentiable. One can relax the differentiability assumption on I (considering, for example, piecewise smooth images), but the analysis is much more difficult, and we refer the reader to results in [293–295] for more details. Define

$$ \tilde{U}_{I}(\varphi ) = \int _\varOmega \left| I\circ \varphi (x) - I'(x) \right| ^2 dx $$

so that \(U_{I} = \tilde{U}_{I}\circ \mathrm {Inv}\). Fixing \(\varphi \in \mathrm {Diff}^{p+1,\infty }_0\), \(h\in C^{p+1}_0(\varOmega , {\mathbb {R}}^d)\) and letting \(\varphi _\varepsilon = \varphi +\varepsilon h\) for \(|\varepsilon |\le 1\), we have

$$ \partial _\varepsilon \tilde{U}_{I}(\varphi _\varepsilon ) = 2\int _\varOmega (I\circ \varphi _\varepsilon (x) - I'(x)) \nabla I\circ \varphi _\varepsilon (x)^Th(x)\, dx. $$

The integrand on the right-hand side is dominated by the integrable upper bound \( (\Vert I\Vert _\infty + |I'(x)|) \Vert \nabla I\Vert _{\infty }\, \Vert h\Vert _\infty {\mathbf 1}_{\tilde{Q}}(x)\), where \(\tilde{Q}\) is any compact set that contains \(\varphi _\varepsilon ^{-1}(Q_I)\) for \(|\varepsilon |\le 1\), which justifies differentiating under the integral sign. Taking \(\varepsilon =0\), we obtain the directional derivative of \(\tilde{U}_{I}\), which is

$$ {\left( {d \tilde{U}_{I}(\varphi )}\, \left| \, {h}\right. \right) } = 2\int _\varOmega (I\circ \varphi (x) - I'(x)) \nabla I\circ \varphi (x)^Th(x)\, dx. $$

Our hypotheses on I imply that \(d \tilde{U}_{I}(\varphi )\) is continuous in \(\varphi \in \mathrm {Diff}^{p,\infty }_0\) for any \(p\ge 0\). Fixing \(\varphi \) and assuming \(\Vert \varphi ' - \varphi \Vert _{\infty } \le 1\), we have

$$\begin{aligned}&\left| {\left( {d \tilde{U}_{I}(\varphi ) - d\tilde{U}_{I}(\varphi ')}\, \left| \, {h}\right. \right) } \right| \\&\qquad \le 2\int _\varOmega (I\circ \varphi (x) - I\circ \varphi '(x)) \nabla I\circ \varphi '(x)^Th(x)\, dx\\&\qquad \quad + 2\int _\varOmega (I\circ \varphi (x) - I'(x)) (\nabla I\circ \varphi (x) - \nabla I\circ \varphi '(x))^Th(x)\, dx\\&\qquad \le 2|\tilde{Q}|\, \Vert I\circ \varphi - I\circ \varphi '\Vert _\infty \Vert \nabla I\Vert _\infty \Vert h\Vert _\infty \\&\qquad \quad + 2\sqrt{|\tilde{Q}|} \Vert I\circ \varphi - I'\Vert _2 \Vert \nabla I\circ \varphi - \nabla I\circ \varphi '\Vert _\infty \Vert h\Vert _\infty , \end{aligned}$$

where \(\tilde{Q}\) is a compact set containing all \((\varphi ')^{-1}(Q_I)\) for \(\Vert \varphi ' - \varphi \Vert _{\infty } \le 1\). The facts that I and \(\nabla I\) are uniformly continuous on \(Q_I\) imply that \(\Vert I\circ \varphi - I\circ \varphi '\Vert _\infty \) and \(\Vert \nabla I\circ \varphi - \nabla I\circ \varphi '\Vert _\infty \) tend to 0 as \(\Vert \varphi -\varphi '\Vert _\infty \) tends to 0. (We have denoted by \(|\tilde{Q}|\) the Lebesgue measure of the set \(\tilde{Q}\).)

As a composition of \(C^1\) functions, we find that \(U_{I}\) is \(C^1\) on \(\mathrm {Diff}^{p+1,\infty }_0\) for any \(p\ge 0\), with (applying the chain rule)

$$\begin{aligned}&{\left( {d U_{I}(\varphi )}\, \left| \, {h}\right. \right) } = \nonumber \\&\quad \quad \quad - 2\int _\varOmega (I\circ \varphi ^{-1}(x) - I'(x)) \nabla I\circ \varphi ^{-1}(x)^Td(\varphi ^{-1}) h\circ \varphi ^{-1}(x)\, dx. \end{aligned}$$
(9.17)

Notice that \(\nabla (I\circ \varphi ^{-1})^T = (\nabla I^T\circ \varphi ^{-1}) d(\varphi ^{-1})\), so that

$$ {\left( {d U_{I}(\varphi )}\, \left| \, {h}\right. \right) } = - 2\int _\varOmega (I\circ \varphi ^{-1}(x) - I'(x)) \nabla (I\circ \varphi ^{-1})(x)^T h\circ \varphi ^{-1}(x)\, dx $$

and we retrieve the formula

$$ {\left( {d U_{I}(\varphi )}\, \left| \, {h}\right. \right) } = {\left( {d U_{\varphi \cdot I}({\mathrm {id}})}\, \left| \, {h\circ \varphi ^{-1}}\right. \right) }. $$

The Eulerian derivative is given by

$$\begin{aligned} {\left( {\bar{\partial }U_{I}(\varphi )}\, \left| \, {v}\right. \right) }= & {} {\left( {d U_{\varphi \cdot I}({\mathrm {id}})}\, \left| \, {v}\right. \right) }\\= & {} - 2\int _\varOmega (I\circ \varphi ^{-1}(x) - I'(x)) \nabla (I\circ \varphi ^{-1})(x)^T v(x)\, dx. \end{aligned}$$

We introduce a notation that will be used throughout this chapter and that generalizes the one given for point measures in Eq. (8.4). If \(\mu \) is a measure on \(\varOmega \) and \(z:\varOmega \rightarrow {\mathbb {R}}^d\) a \(\mu \)-measurable function, the vector measure \((z \mu )\) is the linear form over vector fields on \(\varOmega \) defined by

$$\begin{aligned} {\left( {z \mu }\, \left| \, {h}\right. \right) } = \int _\varOmega {z}^T{h} d\mu . \end{aligned}$$
(9.18)

With this notation we can write

$$\begin{aligned} \bar{\partial }U_I(\varphi ) = -2\big ((I\circ \varphi ^{-1} - I') \nabla (I\circ \varphi ^{-1})\big ) dx. \end{aligned}$$
(9.19)

Notice also that, making a change of variable in (9.17), we have

$$\begin{aligned} dU_I(\varphi ) = - 2 \big (\det (d\varphi ) \,(I - I'\circ \varphi )\, d\varphi ^{-T}\nabla I \big ) dx. \end{aligned}$$
(9.20)

To compute the Eulerian gradient of \(U_I\), we need to apply the kernel operator, \(\mathbb K\), to \(\bar{\partial }U_I(\varphi )\), which requires the following lemma.

Lemma 9.4

If V is a reproducing kernel Hilbert space (RKHS) of vector fields on \(\varOmega \) with kernel operator \(\mathbb K\) and kernel K, \(\mu \) is a measure on \(\varOmega \) and z a \(\mu \)-measurable function from \(\varOmega \) to \({\mathbb {R}}^d\), then, for all \(x\in \varOmega \),

$$ \mathbb K(z \mu )(x) = \int _\varOmega K(x, y) z(y) d\mu (y). $$

Proof

From the definition of the kernel, we have, for any \(a\in {\mathbb {R}}^d\):

$$\begin{aligned} a^T\mathbb K(z\mu )(x)= & {} {\left( {a \delta _x}\, \left| \, {\mathbb K(z \mu )}\right. \right) } \\= & {} {\left( {z \mu }\, \left| \, {\mathbb K(a \delta _x)}\right. \right) } \\= & {} {\left( {z \mu }\, \left| \, {K(.,x) a}\right. \right) } \\= & {} \int _\varOmega z^T(y) K(y,x) a d\mu (y) \\= & {} a^T \int _\varOmega K(x, y) z(y) d\mu (y), \end{aligned}$$

which proves Lemma 9.4.    \(\square \)

The expression of the Eulerian gradient of \(U_I\) is now given by Lemma 9.4:

$$\begin{aligned} {\overline{\nabla }}^V U_I(\varphi ) = -2\int _\varOmega (I\circ \varphi ^{-1}(y) - I'(y)) K(., y)\nabla (I\circ \varphi ^{-1})(y) dy. \end{aligned}$$
(9.21)

This provides the following “greedy” image-matching algorithm [67, 278].

Algorithm 9.5

(Greedy image matching) Start with \(\varphi (0)= {\mathrm {id}}\) and solve the evolution equation

$$\begin{aligned} \partial _t \varphi (t, y) = 2\int _\varOmega (J(t, x) - I'(x)) K(\varphi (t, y), x)\nabla J(t, x) dx \end{aligned}$$
(9.22)

with \(J(t, \cdot ) = I\circ (\varphi (t))^{-1}\).

This algorithm can also be written entirely in terms of the evolving image, J, using \(\partial _t J\circ \varphi + (\nabla J\circ \varphi )^T\partial _t\varphi = 0\). This yields

$$ \partial _t J(t, y) = -2 \int _\varOmega K(y, x) (J(t,x) -I'(x) ) \nabla J(t, x)^T\nabla J(t, y) dx. $$

In contrast to what we did in the landmark case, this algorithm should not be run indefinitely (or until numerical convergence). The fundamental difference is that, in the landmark case, there are infinitely many solutions to the diffeomorphic interpolation problem, and the greedy algorithm will generally run until it finds one of them and then stabilize. In the case of images, it is perfectly possible (and even typical) that the matching problem has no solution, i.e., that there is no diffeomorphism \(\varphi \) such that \(I\circ \varphi ^{-1} = I'\). In that case, Algorithm 9.5 will run indefinitely, creating huge deformations while trying to solve an impossible problem.

To decide when the evolution should be stopped, an interesting suggestion has been made in [278]. Define

$$ v(t, y) = 2\int _\varOmega (J(t, x) - I'(x)) K(y, x)\nabla J(t, x)\, dx $$

so that (9.22) reduces to \(\partial _t\varphi = v(t)\circ \varphi \). As we know from Chap. 7, the smoothness of \(\varphi \) at time t can be controlled by

$$ \int _0^t \Vert v(s)\Vert ^2_V ds, $$

the norm being explicitly given by

$$\begin{aligned}&\Vert v(s)\Vert ^2_V \\&= 4\int _{\varOmega \times \varOmega } K(y, x) (J(s, x) - I'(x))(J(s, y) - I'(y)) \nabla J(s, x)^T\nabla J(s, y) dx dy. \end{aligned}$$

Define, for some parameter \(\lambda \),

$$E(t) = \frac{1}{t} \int _0^t \Vert v(s)\Vert ^2_V ds + \lambda \int _\varOmega (J(t, y) - I'(y))^2 dy.$$

Then, the stopping time proposed in [278] for Algorithm 9.5 is the first t at which E(t) stops decreasing. Some experimental results using this algorithm and stopping rule are provided in Fig. 9.2.
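A one-dimensional toy discretization of Algorithm 9.5 with this stopping rule might look as follows (Python; the synthetic "images", the kernel width, the step size and the value of \(\lambda \) are all illustrative choices, and the transport equation \(\partial _t J = -\nabla J^Tv\) is advanced with a semi-Lagrangian interpolation step rather than through \(\varphi \) itself):

```python
import numpy as np

# One-dimensional toy version of Algorithm 9.5 (all parameters illustrative)
n, sigma, lam, dt = 128, 0.05, 1.0, 0.002
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
I  = np.exp(-(x - 0.40) ** 2 / 0.005)    # template image
Ip = np.exp(-(x - 0.55) ** 2 / 0.005)    # target image I'
Kmat = np.exp(-(x[:, None] - x[None, :]) ** 2 / (2 * sigma ** 2))  # Gaussian kernel

J = I.copy()                             # J(t) = I o phi(t)^{-1}, with J(0) = I
running, energies = 0.0, []
for step in range(1, 401):
    gradJ = np.gradient(J, dx)
    z = 2.0 * (J - Ip) * gradJ           # density z(x) = 2 (J - I') grad J
    v = (Kmat @ z) * dx                  # v(y) = int K(y, x) z(x) dx
    running += np.sum(z * v) * dx * dt   # accumulates int_0^t ||v(s)||_V^2 ds
    J = np.interp(x - dt * v, x, J)      # semi-Lagrangian step for dJ/dt = -v dJ/dx
    E = running / (step * dt) + lam * np.sum((J - Ip) ** 2) * dx
    energies.append(E)
    if step > 1 and energies[-1] > energies[-2]:
        break                            # stopping rule: first time E(t) increases

print(step, np.sum((J - Ip) ** 2) * dx)  # steps taken and final discrepancy
```

Because each update is a descent step for \(\int _\varOmega (J - I')^2 dx\), the discrepancy decreases as long as the scheme remains stable, and the loop exits at the first increase of E(t).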

Fig. 9.2
figure 2

Greedy image matching. Output of Algorithm 9.5 when estimating a deformation of the first image to match the second one. The third image is the obtained deformation of the first one and the last provides the deformation applied to a grid

There are many other possible choices for a matching criterion, least squares being, as we wrote, the simplest one. Among other possibilities, comparison criteria involving histograms provide an interesting option, because they allow for contrast-invariant comparisons.

Given a pair of images, I, \(I'\), associate to each \(x\in \varOmega \) and image values \(\lambda \) and \(\lambda '\) the local histogram \(H_x(\lambda , \lambda ')\), which counts the frequency of simultaneous occurrence of values \(\lambda \) in I and \(\lambda '\) in \(I'\) at the same location in a small window around x. One computationally feasible way to define it is to use the kernel estimator

$$ H_{I,I'}(x, \lambda , \lambda ') = \int _{\varOmega } f(\left| I(y) - \lambda \right| ) f(\left| I'(y) - \lambda ' \right| ) g(x, y) dy $$

in which f is a positive function such that \(\int _{\mathbb {R}}f(t) dt = 1\) and f vanishes when t is far from 0, and \(g\ge 0 \) is such that, for all x, \(\int _\varOmega g(x, y) dy = 1\) and \(g(x, y)\) vanishes when y is far from x.

For each x, \(H_{I, I'}(x, \cdot , \cdot )\) is a bivariate probability density, and there exist several ways of measuring the degree of dependence between its components. The simplest one, which is probably sufficient for most applications, is the correlation ratio, given by

$$ C_{I, I'}(x) = 1 - \frac{\int _{{\mathbb {R}}^2} \lambda \lambda ' H_{I, I'}(x, \lambda , \lambda ') d\lambda d\lambda '}{\sqrt{\int _{{\mathbb {R}}^2} \lambda ^2 H_{I, I'}(x, \lambda , \lambda ') d\lambda d\lambda ' \int _{{\mathbb {R}}^2} (\lambda ')^2 H_{I, I'}(x, \lambda , \lambda ') d\lambda d\lambda '}}. $$

It is then possible to define the matching functional by

$$ U_{I}(\varphi ) = \int _\varOmega C_{I\circ \varphi ^{-1}, I'}(x) dx. $$

The differential of \(U_I\) with respect to \(\varphi \) can be obtained after a lengthy (but elementary) computation. Some details can be found in [145]. A slightly simpler option is to use criteria based on the global histogram, which is defined by

$$ H_{I, I'}(\lambda , \lambda ') = \int _{\varOmega } f(\left| I(y) - \lambda \right| ) f(\left| I'(y) - \lambda ' \right| ) dy, $$

and the matching criterion is simply \(U_I(\varphi ) = C_{I\circ \varphi ^{-1}, I'}\) or, as introduced in [185, 298], the mutual information computed from the joint histogram.
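As an illustration, the global joint histogram and the associated correlation criterion can be estimated as follows (Python; the Gaussian window f, its bandwidth, the bin grid and the synthetic image values are all illustrative choices, and the \(\lambda \)-moments of the histogram are approximated by sums over the bin grid):

```python
import numpy as np

def global_joint_histogram(I, Ip, bins, h=0.05):
    """Kernel estimate of H(lam, lam'), with a Gaussian window f of bandwidth h
    (an illustrative choice of f)."""
    f = lambda t: np.exp(-t ** 2 / (2 * h ** 2)) / (np.sqrt(2 * np.pi) * h)
    FI  = f(I[None, :] - bins[:, None])    # f(|I(y) - lam|),  shape (n_bins, n_pixels)
    FIp = f(Ip[None, :] - bins[:, None])
    return (FI @ FIp.T) / I.size           # the sum over y plays the role of dy

def correlation_criterion(I, Ip, bins, h=0.05):
    """The criterion C of the text, computed from the global histogram."""
    H = global_joint_histogram(I, Ip, bins, h)
    H = H / H.sum()                        # normalize to a probability distribution
    m11 = np.sum(bins[:, None] * bins[None, :] * H)
    m20 = np.sum(bins[:, None] ** 2 * H)
    m02 = np.sum(bins[None, :] ** 2 * H)
    return 1.0 - m11 / np.sqrt(m20 * m02)

rng = np.random.default_rng(0)
I = rng.uniform(0.1, 0.9, 500)             # synthetic "image" values
bins = np.linspace(-0.5, 1.5, 101)
c_same = correlation_criterion(I, I, bins)                   # small: identical images
c_perm = correlation_criterion(I, rng.permutation(I), bins)  # larger: unrelated values
print(c_same, c_perm)
```

The criterion is near zero when the two images coincide and markedly larger when their values are unrelated, which is the behavior a dependence measure should have.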

9.6 Measure Matching

The running assumption in Sect. 9.4 was that the point sets \((x_1, \ldots , x_N)\) were labeled, so that, when comparing two of them, the correspondences were known and the problem was to extrapolate them to the whole space.

In some applications, correspondences are not given and need to be inferred as part of the matching problem. One way to handle this is to include them as new unknowns (in addition to the unknown diffeomorphism), add extra terms to the energy that measures the quality of correspondences, and minimize the whole thing. Such an approach is taken, for example, in [240, 241].

Another point of view is to start with a representation of the point set that does not depend on how the points are ordered. A natural mathematical representation of a subset of \({\mathbb {R}}^d\) is by the uniform measure on this set, at least when this is well-defined. For a very general class of sets, this corresponds to the Hausdorff measure for the appropriate dimension [107], which, for finite sets, simply provides the sum of Dirac measures at each point, i.e., \(x = (x_1, \ldots , x_N)\) is represented by

$$ \mu _x = \sum _{i=1}^N \delta _{x_i}. $$

For us, this raises the issue of comparing measures using diffeomorphisms, which will be referred to as the measure-matching problem.

In line with all other matching problems we are considering in this chapter, specifying the measure-matching problem requires, first, defining the action of diffeomorphisms on the considered objects, and second, using a good comparison criterion between two objects.

Let us start with the action of diffeomorphisms. The only fact we need here concerning measures is that they are linear forms acting on functions on \({\mathbb {R}}^d\) via

$$ {\left( {\mu }\, \left| \, {f}\right. \right) } = \int _{{\mathbb {R}}^d} fd\mu . $$

In particular, if \(\mu _x\) is as above, then

$$ {\left( {\mu _x}\, \left| \, {f}\right. \right) } = \sum _{i=1}^N f(x_i). $$

If \(\varphi \) is a diffeomorphism of \(\varOmega \) and \(\mu \) a measure, we define a new measure \(\varphi \cdot \mu \) by

$$ {\left( {\varphi \cdot \mu }\, \left| \, {f}\right. \right) } = {\left( {\mu }\, \left| \, {f\circ \varphi }\right. \right) }. $$

It is straightforward to check that this provides a group action. If \(\mu = \mu _x\), we have

$$ {\left( {\varphi \cdot \mu _x}\, \left| \, {f}\right. \right) } = \sum _{i=1}^N f\circ \varphi (x_i) = {\left( {\mu _{\varphi (x)}}\, \left| \, {f}\right. \right) }, $$

so that the transformation of the measure associated to a point set x is the measure associated to the transformed point set, which is reasonable.

When \(\mu \) has a density with respect to Lebesgue measure, say \(\mu = zdx\), this action can be translated to a resulting transformation over densities as follows.

Proposition 9.6

If \(\mu = z\, dx\), where z is a positive, Lebesgue integrable function on \(\varOmega \subset {\mathbb {R}}^d\), and \(\varphi \) is a diffeomorphism of \(\varOmega \), then

$$\begin{aligned} \varphi \cdot \mu = \det (d(\varphi ^{-1}))\, z\circ \varphi ^{-1}\, dx. \end{aligned}$$
(9.23)

The proposition is an immediate consequence of the definition of \(\varphi \cdot \mu \) and of the change of variable formula (details are left to the reader). Note that the action of diffeomorphisms does not change the total mass of a positive measure, that is \((\varphi \cdot \mu )(\varOmega ) = \mu (\varOmega )\) if \(\varphi \) is a diffeomorphism of \(\varOmega \).
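Proposition 9.6 and the mass-preservation remark are easy to check numerically. The sketch below works in one dimension with an illustrative density \(z\) and the diffeomorphism \(\varphi (x) = x^2\) of (0, 1) (neither taken from the text), and verifies that the pushforward density \(\det (d(\varphi ^{-1}))\, z\circ \varphi ^{-1}\) carries the same total mass as \(z\):

```python
import math

# Hedged numerical check of Proposition 9.6 in one dimension.  The
# density z and the diffeomorphism phi(x) = x**2 of (0, 1) are
# illustrative choices.  The pushforward density
# det(d(phi^{-1})) * z(phi^{-1}(y)) should have the same total mass as z.

def z(x):                        # a probability density on (0, 1)
    return 2.0 * x

def phi_inv(y):                  # inverse of phi(x) = x**2
    return math.sqrt(y)

def dphi_inv(y):                 # derivative of phi^{-1}
    return 0.5 / math.sqrt(y)

def pushforward(y):              # density of phi . mu, as in (9.23)
    return dphi_inv(y) * z(phi_inv(y))

def mass(density, n=100000):     # midpoint Riemann sum over (0, 1)
    h = 1.0 / n
    return sum(density((i + 0.5) * h) * h for i in range(n))

print(mass(z), mass(pushforward))   # both close to 1.0
```

Here the pushforward density is in fact constant, since \(\varphi \) maps the linear density to the uniform one.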

Now that we have defined the action, we need to choose a function \(D(\mu , \mu ')\) to compare two measures \(\mu \) and \(\mu '\). Many such functions exist, especially when measures are normalized to have a unit mass, since this allows for the use of many comparison criteria defined in probability or information theory (such as the Kullback–Leibler divergence [75]). A very general example is the Wasserstein distance [238, 301], which is associated to a positive, symmetric, cost function \(\rho :\varOmega \times \varOmega \rightarrow [0, +\infty )\) and defined by

$$\begin{aligned} d_\rho (\mu , \mu ') = \inf _\nu \int _{\varOmega ^2} \rho (x, y) \nu (dx, dy), \end{aligned}$$
(9.24)

where the minimization is over all \(\nu \) with the first marginal given by \(\mu \), and the second one by \(\mu '\). If \(\mu \) and \(\mu '\) are uniform measures on discrete point sets, i.e.,

$$ \mu = \frac{1}{N} \sum _{k=1}^N \delta _{x_k}, \ \mu ' = \frac{1}{M} \sum _{k=1}^M \delta _{x'_k}, $$

then computing the Wasserstein distance reduces to minimizing

$$ \sum _{k=1}^N\sum _{l=1}^M \rho (x_k,x'_l) \nu (x_k, x'_l) $$

subject to the constraints

$$ \sum _{l=1}^M \nu (x_k, x'_l) = 1/N \text { and } \sum _{k=1}^N \nu (x_k, x'_l) = 1/M. $$

This transportation problem (a relaxation of linear assignment) can be solved by finite-dimensional linear programming. If this is combined with diffeomorphic interpolation, i.e., if one tries to compute a diffeomorphism \(\varphi \) minimizing \(d_\rho (\varphi \cdot x, x')\), this results in a formulation that mixes discrete and continuous optimization problems, similar to the methods introduced in [240]. The Wasserstein distance is also closely related to the mass transport problem, which can itself be used to estimate diffeomorphisms, and will be discussed in the next chapter. For the moment, we focus on matching functionals associated with measures, and start with the case in which the compared measures are absolutely continuous with respect to Lebesgue measure, i.e., with the problem of matching densities.
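For intuition, the discrete problem above can be solved exactly in tiny cases without any linear-programming machinery. The sketch below (an illustration, not an algorithm from the text) uses the fact that, for \(N = M\) and uniform weights, an optimal coupling can be taken to be a permutation (a vertex of the transportation polytope), and searches permutations by brute force; the squared-Euclidean cost \(\rho \) and the point sets are arbitrary choices:

```python
import itertools

# Brute-force computation of the discrete Wasserstein cost (9.24) for
# two uniform point sets of equal size.  With N = M and uniform weights
# an optimal coupling is a permutation, so enumerating permutations
# suffices for tiny examples; real implementations use linear
# programming instead.

def rho(x, y):                    # squared Euclidean cost (arbitrary choice)
    return sum((a - b) ** 2 for a, b in zip(x, y))

def wasserstein(xs, ys):
    n = len(xs)
    assert n == len(ys)
    return min(
        sum(rho(xs[i], ys[p[i]]) for i in range(n)) / n
        for p in itertools.permutations(range(n)))

x  = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
xp = [(0.1, 0.0), (0.0, 1.1), (1.0, 0.1)]
print(wasserstein(x, xp))   # 0.01: each point pairs with its nearest neighbor
```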

9.6.1 Matching Densities

Since densities are scalar-valued functions, we can use standard norms to design matching functionals for them. As an example, we take the simplest case, the \(L^2\) norm, as we did with images. What differs from the image case is the action itself, which has the interesting feature of involving the derivative of the diffeomorphism via the Jacobian determinant.

So, let us consider the action \(\varphi \star \zeta \) given by

$$ \varphi \star \zeta = \det (d(\varphi ^{-1}))\, \zeta \circ \varphi ^{-1} $$

and use the matching functional

$$ U_\zeta (\varphi ) = E_{\zeta ,\zeta '}(\varphi ) = \int _\varOmega (\varphi \star \zeta - \zeta ')^2dx. $$

Since we will need it for the differentiation of the Jacobian, we recall the following standard result on the derivative of the determinant.

Proposition 9.7

Let \(F(A) = \det (A)\) be defined over \({\mathcal M}_{n}({\mathbb {R}})\), the space of all n by n matrices. Then, for any \(A, H\in {\mathcal M}_n({\mathbb {R}})\),

$$\begin{aligned} dF(A)H = {\mathrm {trace}}(\mathrm {Adj}(A)H), \end{aligned}$$
(9.25)

where \(\mathrm {Adj}(A)\) is the adjugate matrix of A, i.e., the matrix with (ij) entry given by the determinant of A with the jth row and ith column removed, multiplied by \((-1)^{i+j}\). (When A is invertible \(\mathrm {Adj}(A) = \det (A)\, A^{-1}\).)

For \(A = {\mathrm {Id}}\), we have

$$\begin{aligned} dF({\mathrm {Id}})H = {\mathrm {trace}}(H). \end{aligned}$$
(9.26)

Proof

To prove this proposition, start with \(A = {\mathrm {Id}}\) and use the fact that, if \(\delta _{ij}\) is the matrix with 1 as the (ij) entry and 0 everywhere else, then \(\det ({\mathrm {Id}}+ \varepsilon \delta _{ij}) = 1+\varepsilon \) if \(i=j\) and 1 otherwise, which directly gives (9.26). Then, prove the result for an invertible A using

$$ \det (A +\varepsilon H) = \det (A) \det ({\mathrm {Id}}+ \varepsilon A^{-1} H) $$

and the fact that, when A is invertible, \(\det (A) A^{-1} = \mathrm {Adj}(A) \). This also implies the result for a general (not necessarily invertible) A because the determinant is a polynomial in the entries of a matrix, and so are its partial derivatives, and the coefficients of these polynomials are fully determined by the values taken on the dense set of invertible matrices.    \(\square \)
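Proposition 9.7 is also easy to check numerically. The following self-contained sketch compares a central finite difference of \(\det (A + \varepsilon H)\) with \({\mathrm {trace}}(\mathrm {Adj}(A)H)\) for random \(3\times 3\) matrices (the matrices are arbitrary; this is only a sanity check of (9.25)):

```python
import random

# Numerical sanity check of Proposition 9.7: compare a central finite
# difference of det(A + eps*H) with trace(Adj(A) H).  Pure-Python 3x3
# determinant and adjugate; A and H are arbitrary random matrices.

def det3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def adj3(m):
    # Adj(A)[i][j] = (-1)^(i+j) * minor with row j and column i removed
    out = [[0.0] * 3 for _ in range(3)]
    for i in range(3):
        for j in range(3):
            r = [k for k in range(3) if k != j]
            c = [k for k in range(3) if k != i]
            minor = (m[r[0]][c[0]] * m[r[1]][c[1]]
                     - m[r[0]][c[1]] * m[r[1]][c[0]])
            out[i][j] = (-1) ** (i + j) * minor
    return out

random.seed(0)
A = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]
H = [[random.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(3)]

eps = 1e-5
Ap = [[A[i][j] + eps * H[i][j] for j in range(3)] for i in range(3)]
Am = [[A[i][j] - eps * H[i][j] for j in range(3)] for i in range(3)]
fd = (det3(Ap) - det3(Am)) / (2 * eps)           # finite difference
exact = sum(adj3(A)[i][j] * H[j][i] for i in range(3) for j in range(3))
print(fd, exact)   # agree to high precision
```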

We have

$$ U_{\zeta }(\varphi ) = \int _\varOmega \left( \det (d(\varphi ^{-1}))\, \zeta \circ \varphi ^{-1} - \zeta '\right) ^2dx. $$

Under the assumptions that \(\zeta \) is \(C^1\) and compactly supported and that \(\zeta '\) is square integrable, one can prove that \(E_{\zeta ,\zeta '}\) is \(C^1\) when defined over \(\mathrm {Diff}^{p+2, \infty }_0\) with \(p\ge 0\) (the details are left to the reader). To compute the derivative at any given \(\varphi \), it will be convenient to use the trick described at the end of Sect. 9.2, starting with the computation of the differential at the identity and deducing from it the differential at any \(\varphi \) by replacing \(\zeta \) by \(\varphi \cdot \zeta \).

If \(\varphi (\varepsilon , \cdot )\) is a diffeomorphism that depends on a parameter \(\varepsilon \), such that \(\varphi (0,\cdot ) = {\mathrm {id}}\) and \(\partial _\varepsilon \varphi (0, \cdot ) = h\in C^{p+2}_0(\varOmega , {\mathbb {R}}^d)\), then, at \(\varepsilon =0\), \(\partial _\varepsilon \zeta \circ \varphi (\varepsilon , \cdot )^{-1} = -\nabla \zeta ^T{h}\) and \(\partial _\varepsilon \det (d(\varphi (\varepsilon , \cdot )^{-1})) = - {\mathrm {trace}}(dh) = - \mathrm {div}\, h\). This implies that

$$ \partial _\varepsilon \left( \zeta \circ \varphi (\varepsilon , \cdot )^{-1} \det (d(\varphi (\varepsilon , \cdot ))^{-1})\right) = -\nabla \zeta ^T{h} - \zeta \,\mathrm {div}\, h = -\mathrm {div}(\zeta h) $$

at \(\varepsilon =0\) and

$$ \partial _\varepsilon U_{\zeta }(\varphi _\varepsilon )_{|_{\varepsilon =0}} = - 2 \int _\varOmega (\zeta - \zeta ') \mathrm {div}(\zeta h)\, dx. $$

So this gives

$$ {\left( {d U_{\zeta }({\mathrm {id}})}\, \left| \, {h}\right. \right) } = - 2 \int _\varOmega (\zeta - \zeta ') \mathrm {div}(\zeta h)\, dx $$

and

$$\begin{aligned} {\left( {d U_{\zeta }(\varphi )}\, \left| \, {h}\right. \right) } = - 2 \int _\varOmega (\varphi \star \zeta - \zeta ')\, \mathrm {div}((\varphi \star \zeta ) h)\, dx. \end{aligned}$$
(9.27)

We can use the divergence theorem to obtain an alternative expression (using the fact that h vanishes on \(\partial \varOmega \) or at infinity), yielding

$$\begin{aligned} {\left( {d U_{\zeta }(\varphi )}\, \left| \, {h}\right. \right) } = 2 \int _\varOmega (\varphi \star \zeta )\, \nabla (\varphi \star \zeta - \zeta ')^T h\, dx \end{aligned}$$
(9.28)

or

$$\begin{aligned} d U_{\zeta }(\varphi ) = 2 (\varphi \star \zeta ) \nabla (\varphi \star \zeta - \zeta ') dx. \end{aligned}$$
(9.29)

One can appreciate the symmetry of this expression compared with the one obtained with images in (9.19).

9.6.2 Dual RKHS Norms on Measures

One of the limitations of functional norms, such as the \(L^2\) norm, is that they do not apply to singular objects such as the Dirac measures that motivated our study of the measure-matching problem. It is certainly possible to smooth out singular objects and transform them into densities that can be compared using the previous matching functional. For example, given a density function \(\rho \) (a Gaussian, for example) and a point set \((x_1, \ldots , x_N)\), one can compute a density

$$\begin{aligned} \zeta _x(y) = \sum _{k=1}^N \rho \Big (\frac{y-x_k}{\sigma }\Big ), \end{aligned}$$
(9.30)

where \(\sigma \) is a positive scale parameter (this is a standard kernel density estimator). One can then compare two point sets, say x and \(x'\), by comparing the associated \(\zeta _x\) and \(\zeta _{x'}\) using the previous method.

The representation in (9.30) is somewhat imperfect, in the sense that, for the natural actions we have defined, we have in general \(\varphi \star \zeta _x \ne \zeta _{\varphi \cdot x}\): the density associated to a deformed point set is not the deformed density. If the goal is to compare two point sets, it makes more sense to use \(\zeta _{\varphi \cdot x}\) instead of \(\varphi \star \zeta _x\) as the density resulting from the deformation, and to use the cost function

$$\begin{aligned} U_x(\varphi ) = E_{x, x'}(\varphi ) = \int _{{\mathbb {R}}^d} (\zeta _{\varphi \cdot x} - \zeta _{x'})^2dy, \end{aligned}$$
(9.31)

which can be written, if \(x = (x_1, \ldots , x_N)\) and \(x' = (x'_1, \ldots , x'_M)\), and introducing the function

$$\begin{aligned} \xi (z, z') = \int _{{\mathbb {R}}^d} \rho \Big (\frac{y-z}{\sigma }\Big )\rho \Big (\frac{y-z'}{\sigma }\Big ) dy, \end{aligned}$$
(9.32)

as

$$\begin{aligned} U_{x}(\varphi ) = \sum _{k, l=1}^N\xi (\varphi (x_k),&\varphi (x_l)) \nonumber \\&- 2 \sum _{k=1}^N\sum _{l=1}^M\xi (\varphi (x_k) , x'_l) + \sum _{k, l=1}^M\xi (x'_k , x'_l). \end{aligned}$$
(9.33)
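The key feature of (9.33) is that it never refers to an ordering of the points. The sketch below evaluates it with an assumed Gaussian kernel \(\xi \) (any kernel obtained from (9.32) would behave the same way) and checks that the energy is unchanged under relabeling, vanishing when the deformed set coincides with the target as a set:

```python
import math

# Sketch of the unlabeled point-matching energy (9.33) with an assumed
# Gaussian kernel xi.  The energy depends on the point sets only, not
# on how their points are ordered.

def xi(z, zp, sigma=1.0):
    d2 = sum((a - b) ** 2 for a, b in zip(z, zp))
    return math.exp(-d2 / (2 * sigma ** 2))

def energy(y, xp):   # y = deformed template points, xp = target points
    return (sum(xi(a, b) for a in y for b in y)
            - 2 * sum(xi(a, b) for a in y for b in xp)
            + sum(xi(a, b) for a in xp for b in xp))

y   = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
t   = [(0.0, 2.0), (0.0, 0.0), (1.0, 0.0)]    # the same set, relabeled
far = [(5.0, 5.0), (6.0, 5.0), (5.0, 7.0)]    # a translated copy
print(energy(y, t), energy(y, far))   # ~0 for the relabeled set, > 0 otherwise
```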

Before computing the variations of this energy, we make the preliminary remark that (9.33) is a particular case of the expressions obtained by representing measures as linear forms over an RKHS of scalar functions. Indeed, since measures are linear forms on functions, we can evaluate their dual norm, given by

$$\begin{aligned} \Vert \mu \Vert = \sup \left\{ {\left( {\mu }\, \left| \, {f}\right. \right) }: \Vert f\Vert = 1 \right\} . \end{aligned}$$
(9.34)

Following [128], assume that the function norm in (9.34) is that of an RKHS. More precisely, let W be an RKHS of real-valued functions, so that we have an operator \({\mathbb {K}}_W:W^*\rightarrow W\) with \({\mathbb {K}}_W\delta _x := \xi (\cdot , x)\) and with the identity \({\left( {\mu }\, \left| \, {f}\right. \right) } = {\big \langle {{\mathbb {K}}_W\mu }\, , \, {f}\big \rangle }_W\) for \(\mu \in W^*\), \(f\in W\). With this choice, (9.34) becomes

$$\begin{aligned} \Vert \mu \Vert _{W^*}= & {} \sup \left\{ {\left( {\mu }\, \left| \, {f}\right. \right) }: \Vert f\Vert _W = 1 \right\} \\= & {} \sup \left\{ {\big \langle {{\mathbb {K}}_W\mu }\, , \, {f}\big \rangle }_W: \Vert f\Vert _W = 1 \right\} \\= & {} \Vert {\mathbb {K}}_W\mu \Vert _W. \end{aligned}$$

This implies that

$$ \Vert \mu \Vert _{W^*}^2 = {\big \langle {{\mathbb {K}}_W\mu }\, , \, {{\mathbb {K}}_W\mu }\big \rangle }_W = {\left( {\mu }\, \left| \, {{\mathbb {K}}_W\mu }\right. \right) }. $$

If \(\mu \) is a measure, this expression is very simple and is given by

$$ \Vert \mu \Vert _{W^*}^2 = \int \xi (x, y) d\mu (x)d\mu (y). $$

This is because \({\mathbb {K}}_W\mu (x) = {\left( {\delta _x}\, \left| \, {{\mathbb {K}}_W\mu }\right. \right) } = {\left( {\mu }\, \left| \, {{\mathbb {K}}_W\delta _x}\right. \right) } = \int \xi (y, x) d\mu (y)\). So we can take

$$\begin{aligned} U_\mu (\varphi ) = E_{\mu , \mu '}(\varphi ) = \Vert \varphi \cdot \mu - \mu '\Vert _{W^*}^2. \end{aligned}$$
(9.35)

Expanding the norm, we get

$$\begin{aligned} U_{\mu }(\varphi )= & {} {\big \langle {\varphi \cdot \mu }\, , \, {\varphi \cdot \mu }\big \rangle }_{W^*} - 2{\big \langle {\varphi \cdot \mu }\, , \, {\mu '}\big \rangle }_{W^*} + {\big \langle {\mu '}\, , \, {\mu '}\big \rangle }_{W^*} \\= & {} {\left( {\varphi \cdot \mu }\, \left| \, {\xi (\varphi \cdot \mu )}\right. \right) } - 2{\left( {\varphi \cdot \mu }\, \left| \, {\xi \mu '}\right. \right) } + {\left( {\mu '}\, \left| \, {\xi \mu '}\right. \right) } \\= & {} \int \xi (\varphi (x), \varphi (y)) d\mu (x) d\mu (y) - 2\int \xi (\varphi (x), y) d\mu (x) d\mu '(y) \\&+ \int \xi (x, y) d\mu '(x)d\mu '(y). \end{aligned}$$

We retrieve (9.33) when \(\mu \) and \(\mu '\) are sums of Dirac measures and \(\xi \) is chosen as in (9.32), but the RKHS formulation is more general.

Assume that \(\mu \) is bounded and that \(\xi \) is continuously differentiable and bounded, with bounded derivatives. Then (leaving the proof to the reader) \(U_\mu \) is \(C^1\) on \(\mathrm {Diff}^{p, \infty }_0\) for any \(p\ge 0\), with derivative

$$\begin{aligned} {\left( {\partial U_{\mu }(\varphi )}\, \left| \, {h}\right. \right) }= & {} 2 \int {{\nabla _1 \xi (\varphi (x), \varphi (y))}^T} h(x) d\mu (x)d\mu (y) \\&- 2 \int {{\nabla _1 \xi (\varphi (x), z)}^T} h(x) d\mu (x)d\mu '(z). \end{aligned}$$

In particular,

$$\begin{aligned} d U_{\mu } ({\mathrm {id}}) = \bar{\partial }U_{\mu } ({\mathrm {id}}) = 2 \left( \int \nabla _1 \xi (\cdot , y) d\mu (y) - \int \nabla _1 \xi (\cdot , z) d\mu '(z)\right) \mu . \end{aligned}$$
(9.36)

To obtain the Eulerian differential at a generic \(\varphi \), it suffices to replace \(\mu \) by \(\varphi \cdot \mu \), which yields:

Proposition 9.8

The Eulerian derivative and gradient of (9.35) are

$$\begin{aligned} \bar{\partial }U_{\mu }(\varphi ) = 2 \left( \int \nabla _1 \xi (\cdot , \varphi (y)) d\mu (y) - \int \nabla _1 \xi (\cdot , z) d\mu '(z)\right) (\varphi \cdot \mu ) \end{aligned}$$
(9.37)

and

$$\begin{aligned} \overline{\nabla }^V U_{\mu }(\varphi )(\cdot )&= 2 \int K(\cdot , \varphi (x)) \nonumber \\&\left( \int \nabla _1 \xi (\varphi (x), \varphi (y)) d\mu (y) - \int \nabla _1 \xi (\varphi (x), z) d\mu '(z)\right) d\mu (x). \end{aligned}$$
(9.38)

The derivative of the expression in (9.33) can be deduced directly from this result. This leads to the following unlabeled point-matching evolution for point sets \(x = (x_1, \ldots , x_N)\) and \(x' = (x'_1, \ldots , x'_M)\):

$$\begin{aligned} \partial _t \varphi (z) = -2 \sum _{i=1}^N&K(\varphi (z), \varphi (x_i)) \nonumber \\&\left( \sum _{j=1}^N \nabla _1 \xi (\varphi (x_i), \varphi (x_j)) - \sum _{h=1}^M \nabla _1 \xi (\varphi (x_i), x'_h)\right) . \end{aligned}$$
(9.39)

As discussed in the case of labeled point sets, this equation may be solved in two stages: letting \(z_i(t) = \varphi (t, x_i)\), first solve the system

$$ \partial _t z_q = -2 \sum _{i=1}^N K(z_q, z_i) \left( \sum _{j=1}^N \nabla _1 \xi (z_i, z_j) - \sum _{h=1}^M \nabla _1 \xi (z_i, x'_h)\right) . $$

Once this is done, the trajectory \(z(t) = \varphi _t(z_0)\) of an arbitrary point \(z_0\) is obtained by solving

$$ \partial _t z = -2 \sum _{i=1}^N K(z, z_i) \left( \sum _{j=1}^N \nabla _1 \xi (z_i, z_j) - \sum _{h=1}^M \nabla _1 \xi (z_i, x'_h)\right) . $$
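A minimal numerical sketch of the first stage of this scheme, with assumed scalar Gaussian choices for both the deformation kernel K and the measure kernel \(\xi \) (neither is prescribed by the text): since the particle system is a descent flow for (9.33), a single small explicit Euler step should decrease the energy.

```python
import math

# One explicit Euler step of the particle system above, with assumed
# Gaussian kernels for K and xi and arbitrary point sets.  The step
# should decrease the unlabeled matching energy (9.33).

def gauss(z, zp, s):
    return math.exp(-sum((a - b) ** 2 for a, b in zip(z, zp)) / (2 * s * s))

def grad1_xi(z, zp, s=0.7):          # gradient of xi in its first slot
    g = gauss(z, zp, s)
    return [-(a - b) / (s * s) * g for a, b in zip(z, zp)]

def energy(zs, xp, s=0.7):           # functional (9.33)
    return (sum(gauss(a, b, s) for a in zs for b in zs)
            - 2 * sum(gauss(a, b, s) for a in zs for b in xp)
            + sum(gauss(a, b, s) for a in xp for b in xp))

def euler_step(zs, xp, dt=0.01, sK=1.0):
    # force_i = sum_j grad1_xi(z_i, z_j) - sum_h grad1_xi(z_i, x'_h)
    forces = []
    for zi in zs:
        f = [0.0] * len(zi)
        for zj in zs:
            f = [a + b for a, b in zip(f, grad1_xi(zi, zj))]
        for xh in xp:
            f = [a - b for a, b in zip(f, grad1_xi(zi, xh))]
        forces.append(f)
    # dz_q/dt = -2 sum_i K(z_q, z_i) force_i
    new = []
    for zq in zs:
        v = [0.0] * len(zq)
        for zi, fi in zip(zs, forces):
            k = gauss(zq, zi, sK)
            v = [a - 2 * k * b for a, b in zip(v, fi)]
        new.append([a + dt * b for a, b in zip(zq, v)])
    return new

z0 = [[0.0, 0.0], [1.0, 0.5]]        # deformed template points z_i
xt = [[0.3, 0.1], [1.2, 0.4]]        # target points x'_h
z1 = euler_step(z0, xt)
print(energy(z0, xt), energy(z1, xt))   # the energy decreases
```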

9.7 Matching Curves and Surfaces

Curves in two dimensions and surfaces in three dimensions are probably the most natural representations of shapes, and their comparison using matching functionals is a fundamental issue. In this section, we discuss a series of representations that can be seen as extensions of measure-matching methods. (This is not the unique way to compare such objects, and we will see a few more methods in the following chapters, especially for curves.)

Note that we are looking here for correspondences between points in the curves and surfaces that derive from global diffeomorphisms of the ambient space. The curve- (or surface-) matching problem is often studied in the literature as a search for diffeomorphic correspondences between points along the curve (or surface) only. Even if such restricted diffeomorphisms can generally be extended to diffeomorphisms of the whole space, the two approaches generally lead to very different algorithms. The search for correspondences within the structures is often implemented as a search for correspondences between parametrizations. This is easier for curves (looking, for example, for correspondences of the arc-length parametrizations) than for surfaces, which may not even be topologically equivalent in the first place (a sphere cannot be matched to a torus); when matching topologically equivalent surfaces, special parametrizations, such as conformal maps [72, 269], can be used. In this framework, once parametrizations are fixed, one can look for diffeomorphisms in parameter space that optimally align some well-chosen, preferably intrinsic, representation. In the case of curves, one can choose the representation \(s \mapsto \kappa _\gamma (s)\), where \(\kappa _\gamma \) is the curvature of a curve \(\gamma \), with the curve rescaled to have length 1 in order to fix the interval over which this representation is defined. One can then use image-matching functionals to compare these representations, i.e., find \(\varphi \) (a diffeomorphism of the unit interval) such that \(\varphi \cdot \kappa _\gamma \simeq \kappa _{\gamma '}\).

But, as we wrote, the main focus in this chapter is the definition of matching functionals for deformable objects in \({\mathbb {R}}^d\), and we now address this problem for curves and surfaces.

9.7.1 Curve Matching with Measures

We can arguably make a parallel between point sets and curves in that labeled point sets correspond to parametrized curves and unlabeled point sets to curves modulo parametrization. In this regard we have a direct generalization of the labeled point-matching functional to parametrized curves (assumed to be defined over the same interval, say [0, 1]), simply given by

$$ E_{\gamma , \gamma '}(\varphi ) = \int _0^1 |\varphi (\gamma (u)) - \gamma '(u)|^2 du. $$

But two consistent parametrizations of the curves (which would allow for the direct comparison above) are almost never given in practice. Interesting formulations of the curve-matching problem should therefore consider curves modulo parametrization, so that the natural analogy is with unlabeled point sets. The counterpart of the uniform measure on a finite point set is the uniform measure on the curve: if \(\gamma \), parametrized over an interval \([a, b]\), is \(C^1\) and regular, this measure is defined by

$$ {\left( {\mu _{\gamma }}\, \left| \, {f}\right. \right) } = \int _\gamma f \,d\sigma _\gamma = \int _a^b f(\gamma (u))\, |\dot{\gamma }(u)| \, du. $$

This is clearly a parametrization-independent representation. Now, if \(\varphi \) is a diffeomorphism, we have, by definition of the action of diffeomorphisms on measures

$$ {\left( {\varphi \cdot \mu _{\gamma }}\, \left| \, {f}\right. \right) } = \int _\gamma f\circ \varphi \, d\sigma _\gamma = \int _a^b f(\varphi (\gamma (u)))\, |\dot{\gamma }(u)|\, du. $$

However, we have

$$ {\left( {\mu _{\varphi \cdot \gamma }}\, \left| \, {f}\right. \right) } = \int _{\varphi (\gamma )} f \, d\sigma _{\varphi (\gamma )} = \int _a^b f(\varphi (\gamma (u)))\, |d\varphi (\gamma (u))\dot{\gamma }(u)| \, du. $$

So, in contrast to point sets, for which we had \(\varphi \cdot \mu _x = \mu _{\varphi (x)}\), the image of the measure associated to a curve is not the measure associated to the image of the curve. When the initial goal is to compare curves, and not measures, it is more natural to use the second definition, \(\mu _{\varphi \cdot \gamma }\), rather than the first one. Using the notation of the previous section, and introducing a target curve \(\gamma '\) defined on \([a', b']\), we can set

$$\begin{aligned}&E_{\gamma ,\gamma '}(\varphi ) = \Vert \mu _{\varphi \cdot \gamma } - \mu _{\gamma '}\Vert _{W^*}^2 \\ \nonumber&= {\big \langle {\mu _{\varphi \cdot \gamma }}\, , \, {\mu _{\varphi \cdot \gamma }}\big \rangle }_{W^*} - 2{\big \langle {\mu _{\varphi \cdot \gamma }}\, , \, {\mu _{\gamma '}}\big \rangle }_{W^*} + {\big \langle {\mu _{\gamma '}}\, , \, {\mu _{\gamma '}}\big \rangle }_{W^*} \\ \nonumber&= {\left( {\mu _{\varphi \cdot \gamma }}\, \left| \, {\xi (\mu _{\varphi \cdot \gamma })}\right. \right) } - 2{\left( {\mu _{\varphi \cdot \gamma }}\, \left| \, {\xi \mu _{\gamma '}}\right. \right) } + {\left( {\mu _{\gamma '}}\, \left| \, {\xi \mu _{\gamma '}}\right. \right) } \\ \nonumber&= \int _a^b\int _a^b \xi (\varphi (\gamma (u)), \varphi (\gamma (v)))\, |d\varphi (\gamma (u)) \dot{\gamma }(u)|\, |d\varphi (\gamma (v)) \dot{\gamma }(v)|\, du dv \\ \nonumber&- 2\int _a^b \int _{a'}^{b'} \xi (\varphi (\gamma (u)), \gamma '(v))\, |d\varphi (\gamma (u)) \dot{\gamma }(u)|\, |\dot{\gamma '}(v)|\, du dv \\ \nonumber&+ \int _{a'}^{b'}\int _{a'}^{b'} \xi (\gamma '(u),\gamma '(v))\, |\dot{\gamma }'(u)|\, |\dot{\gamma }'(v)|\, dudv. \end{aligned}$$
(9.40)

If \(\xi \) is \(C^1\), then \(E_{\gamma ,\gamma '}\) is \(C^1\) on \(\mathrm {Diff}^{p+1,\infty }_0\) for any \(p\ge 0\). To explicitly compute the derivative, take \(\varphi (\varepsilon , \cdot )\) such that \(\varphi (0, \cdot ) = {\mathrm {id}}\) and \(\partial _\varepsilon \varphi (0, \cdot ) = h\), so that

$$ \partial _\varepsilon E(\varphi (\varepsilon , \cdot )) = 2 \partial _{\varepsilon } {\big \langle {\mu _{\varphi (\varepsilon , \cdot )\cdot \gamma }- \mu _{\gamma '}}\, , \, {\mu _\gamma - \mu _{\gamma '}}\big \rangle }_{W^*} = 2\partial _\varepsilon {\big \langle { \mu _{\varphi (\varepsilon , \cdot )\cdot \gamma }}\, , \, {\mu _\gamma - \mu _{\gamma '}}\big \rangle }_{W^*}, $$

the derivatives being computed at \(\varepsilon =0\). Introduce

$$ \tilde{E}(\varphi ) = {\big \langle {\mu _{\varphi \cdot \gamma }}\, , \, {\mu _\gamma - \mu _{\gamma '}}\big \rangle }_{W^*} $$

and let, for a given curve \(\tilde{\gamma }\),

$$\begin{aligned} Z^{\tilde{\gamma }}(\cdot ) = \int _{\tilde{\gamma }} \xi (\cdot , p) d\sigma _{\tilde{\gamma }}(p). \end{aligned}$$
(9.41)

Let also \(\zeta = Z^\gamma - Z^{\gamma '}\) and, for further use, \(\zeta ^\varphi = Z^{\varphi \cdot \gamma } - Z^{\gamma '}\). With this notation, we have

$$ \tilde{E}(\varphi ) = \int _{\varphi (\gamma )} \zeta (p) d\sigma _{\varphi (\gamma )}(p) $$

and we can use Theorem 5.2 to derive, letting \(p_0\) and \(p_1\) be the extremities of \(\gamma \),

$$\begin{aligned} {\partial }_{\varepsilon }{\tilde{E}}(\varphi (\varepsilon , \cdot ))_{|\varepsilon =0}\,&= \,\zeta (p_1) h(p_{1})^T T^\gamma (p_1) - \zeta (p_{0}) h(p_{0})^{T} T^\gamma (p_{0})\nonumber \\&+ \int _{\gamma } \big (\nabla \zeta ^T N^\gamma - \zeta \kappa ^\gamma \big ) h^T N^\gamma dl. \nonumber \end{aligned}$$

Replacing \(\gamma \) by \(\varphi \cdot \gamma \), this provides the expression of the Eulerian derivative of E at \(\varphi \), namely

$$\begin{aligned} \frac{1}{2} \bar{\partial }E_{\gamma , \gamma '} (\varphi ) = \zeta ^\varphi T^\gamma (\delta _{p_1} -&\delta _{p_0}) \nonumber \\&+ \big ({(\nabla \zeta ^\varphi )^T}{N^{\varphi \cdot \gamma }} - \zeta ^\varphi \kappa ^{\varphi \cdot \gamma }\big ) N^{\varphi \cdot \gamma } \mu _{\varphi \cdot \gamma }. \end{aligned}$$
(9.42)

The Eulerian gradient on V therefore is

$$\begin{aligned}&\frac{1}{2} \bar{\nabla }E_{\gamma , \gamma '} (\varphi ) = K(\cdot , p_1) \zeta ^\varphi (p_1)T^\gamma (p_1) - K(\cdot , p_0) \zeta ^\varphi (p_0)T^\gamma (p_0) \nonumber \\&+ \int _{\varphi \cdot \gamma } \Big ({\nabla \zeta ^\varphi (p)^T}{N^{\varphi \cdot \gamma }(p)} - \zeta ^\varphi (p)\kappa ^{\varphi \cdot \gamma }(p)\Big ) K(\cdot , p)N^{\varphi \cdot \gamma }(p) d\sigma _{\varphi \cdot \gamma }(p). \end{aligned}$$
(9.43)

To write this expression, we have implicitly assumed that \(\gamma \) is \(C^2\). In fact, we can give an alternative expression for the Eulerian gradient that does not require this assumption, by directly computing the variation of \(\tilde{E}(\varphi (\varepsilon , \cdot ))\) without applying Theorem 5.2. This yields, using the fact that, if z is a function of a parameter \(\varepsilon \), then \(\partial _\varepsilon |z| = (\dot{z}^Tz)/|z|\),

$$ \partial _\varepsilon |d\varphi (\varepsilon , \cdot )(\gamma ) \dot{\gamma }|_{|{\varepsilon =0}} = (T^\gamma )^Tdh(\gamma ) \dot{\gamma }= (T^\gamma )^Tdh(\gamma ) T^\gamma |\dot{\gamma }| $$
$$ \text { and } \partial _\varepsilon \tilde{E}(\varphi (\varepsilon , \cdot )) = \int _\gamma \big (\nabla \zeta ^Th + \zeta (T^\gamma )^TdhT^\gamma \big ) d\sigma _\gamma . $$

The term involving dh can be written in terms of V-dot products of h with derivatives of the kernel, K, since (we use the notation introduced in Sect. 8.1.3, Eq. (8.9))

$$\begin{aligned} {{a}^T}dh(x) b = {\big \langle {h}\, , \, {\partial _2 K(\cdot , x) (a, b)}\big \rangle }_V. \end{aligned}$$
(9.44)

This gives

$$\begin{aligned} \partial _\varepsilon \tilde{E}(\varphi (\varepsilon , \cdot )) =&\int _\gamma \Big ({\big \langle { K(\cdot , p)\nabla \zeta (p)}\, , \, {h}\big \rangle }_V \nonumber \\&\qquad \qquad \qquad + {\big \langle {\zeta (p) \partial _2 K(\cdot ,p)(T^\gamma (p), T^\gamma (p))}\, , \, {h}\big \rangle }_V \Big )\, d\sigma _\gamma (p) \end{aligned}$$

and a new expression of the Eulerian gradient

$$\begin{aligned} \frac{1}{2}{\overline{\nabla }}^V E_{\gamma , \gamma '} (\varphi ) =&\int _{\varphi \cdot \gamma }\Big ( K(\cdot , p)\nabla \zeta ^\varphi (p) \nonumber \\&\quad \quad \quad + \zeta ^\varphi (p) \partial _2K(\cdot , p)(T^{\varphi \cdot \gamma }(p), T^{\varphi \cdot \gamma }(p))\Big )\, d\sigma _{\varphi \cdot \gamma }(p). \end{aligned}$$
(9.45)

To be complete, let us consider the variation of a discrete form of \(E_{\gamma , \gamma '}(\varphi )\). If a curve \(\gamma \) is discretized with points \(x_0, \ldots , x_N\) (with \(x_N=x_0\) if the curve is closed), one can define the discrete measure, still denoted \(\mu _\gamma \)

$$ {\left( {\mu _\gamma }\, \left| \, {f}\right. \right) } = \sum _{i=1}^N f(c_i) |\tau _i| $$

with \(c_i = (x_i + x_{i-1})/2\) and \(\tau _i = x_i - x_{i-1}\). Use a similar expression for the measure associated to a discretization of \(\varphi \cdot \gamma \), with \(c^\varphi _i = (\varphi (x_i) + \varphi (x_{i-1}))/2\) and \(\tau ^\varphi _i = \varphi (x_i) - \varphi (x_{i-1})\). Finally, let \(\gamma '\) be discretized in \( x'_1, \ldots , x'_M\), with \(c'_j\) and \(\tau '_j\) defined analogously, and define

$$\begin{aligned} E_{\gamma , \gamma '}(\varphi )= & {} \sum _{i, j= 1}^N \xi (c^\varphi _i, c^\varphi _j) |\tau ^\varphi _i|\,|\tau ^\varphi _j| \\ \nonumber- & {} 2 \sum _{i= 1}^N\sum _{j=1}^M \xi (c^\varphi _i, c'_j) |\tau ^\varphi _i|\,|\tau '_j| + \sum _{i, j= 1}^M \xi (c'_i, c'_j) |\tau '_i|\,|\tau '_j| \end{aligned}$$
(9.46)

in which we identify indices 1 and \(N+1\) or \(M+1\) (assuming closed curves). Note that this functional depends on \(\varphi \cdot x\) and \(x'\). The computation of the differential proceeds as above. Define, for a point set \(\tilde{x} = (\tilde{x}_1, \ldots , \tilde{x}_Q)\)

$$ Z^{\tilde{x}}(\cdot ) = \sum _{j=1}^Q \xi (\cdot , \tilde{c}_j) |\tilde{\tau }_j|, $$

and \(\zeta = Z^x - Z^{x'}\), \(\zeta ^\varphi = Z^{\varphi \cdot x} - Z^{x'}\). We then obtain

$$\begin{aligned} d E_{\gamma , \gamma '} ({\mathrm {id}})= & {} \sum _{i=1}^N (\nabla \zeta (c_i)|\tau _i| + \nabla \zeta (c_{i+1}) |\tau _{i+1}|) \delta _{x_i}\\- & {} 2\sum _{i=1}^N \Big (\zeta (c_{i+1}) \frac{\tau _{i+1}}{|\tau _{i+1}|} - \zeta (c_{i}) \frac{\tau _{i}}{|\tau _{i}|}\Big ) \delta _{x_i}. \end{aligned}$$

The Eulerian differential at \(\varphi \ne {\mathrm {id}}\) is obtained by replacing \(\zeta , c_i, \tau _i\) by \(\zeta ^\varphi , c_i^\varphi , \tau _i^\varphi \) and the Eulerian gradient by applying the V-kernel to it.
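The discrete functional (9.46) and the quantities \(c_i, \tau _i\) are straightforward to implement. The sketch below (with an assumed Gaussian \(\xi \) and arbitrary test polygons) checks the expected properties: the energy vanishes when the two closed polygons coincide, is insensitive to the choice of starting vertex, and is positive for distinct polygons.

```python
import math

# Sketch of the discrete curve-matching energy (9.46) for closed
# polygons, with an assumed Gaussian scalar kernel xi.  Midpoints c_i
# and edge lengths |tau_i| are as in the text (index 0 is identified
# with index N, so poly[i - 1] wraps around).

def xi(z, zp, s=0.5):
    return math.exp(-((z[0] - zp[0]) ** 2 + (z[1] - zp[1]) ** 2)
                    / (2 * s * s))

def edges(poly):                 # midpoints and edge lengths
    cs, ts = [], []
    for i in range(len(poly)):
        p, q = poly[i - 1], poly[i]
        cs.append(((p[0] + q[0]) / 2, (p[1] + q[1]) / 2))
        ts.append(math.hypot(q[0] - p[0], q[1] - p[1]))
    return cs, ts

def energy(pa, pb):
    ca, ta = edges(pa)
    cb, tb = edges(pb)
    gram = lambda c1, t1, c2, t2: sum(
        xi(c1[i], c2[j]) * t1[i] * t2[j]
        for i in range(len(c1)) for j in range(len(c2)))
    return gram(ca, ta, ca, ta) - 2 * gram(ca, ta, cb, tb) + gram(cb, tb, cb, tb)

square  = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
rolled  = square[1:] + square[:1]             # same polygon, new start vertex
shifted = [(px + 0.2, py) for (px, py) in square]
print(energy(square, square), energy(square, rolled), energy(square, shifted))
```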

9.7.2 Curve Matching with Vector Measures

Instead of describing a curve with a measure, which is a linear form on functions, it is possible to represent it by a vector measure, which is a linear form on vector fields. Given a parametrized curve \(\gamma : [a, b] \rightarrow {\mathbb {R}}^d\), we define a vector measure \(\nu _\gamma \), which associates to each vector field f on \({\mathbb {R}}^d\) a number \({\left( {\nu _\gamma }\, \left| \, {f}\right. \right) }\) given by

$$ {\left( {\nu _\gamma }\, \left| \, {f}\right. \right) } = \int _a^b {{\dot{\gamma }(u)}^T} f\circ \gamma (u) du, $$

i.e., \(\nu _\gamma = T^\gamma \mu _\gamma \) where \(T^\gamma \) is the unit tangent to \(\gamma \) and \(\mu _\gamma \) is the line measure along \(\gamma \), as defined in the previous section. This definition is invariant under a change of parametrization, but depends on the orientation of \(\gamma \). If \(\varphi \) is a diffeomorphism, we then have

$$ {\left( {\nu _{\varphi \cdot \gamma }}\, \left| \, {f}\right. \right) } = \int _a^b {{(d\varphi (\gamma (u)) \dot{\gamma }(u))}^T} f \circ \varphi (\gamma (u)) du. $$

As done with scalar measures, we can use a dual norm for the comparison of two vector measures. Such a norm is defined by

$$ \Vert \nu \Vert _{W^*} = \sup \left\{ {\left( {\nu }\, \left| \, {f}\right. \right) }: \Vert f\Vert _W = 1 \right\} , $$

where W is now an RKHS of vector fields, and we still have \(\Vert \nu \Vert _{W^*}^2 = {\left( {\nu }\, \left| \, {{\mathbb {K}}_W\nu }\right. \right) }\). Still letting \(\xi \) denote the kernel of W (which is now matrix-valued), we have

$$ \Vert \nu _\gamma \Vert ^2_{W^*} = \int _a^b\int _a^b {{\dot{\gamma }(u)}^T} \xi (\gamma (u), \gamma (v)) \dot{\gamma }(v) du dv $$

and

$$ \Vert \nu _{\varphi \cdot \gamma }\Vert ^2_{W^*} = \int _a^b\int _a^b {{\dot{\gamma }(u)}^T} d\varphi (\gamma (u))^T \xi (\varphi (\gamma (u)), \varphi (\gamma (v))) {d\varphi (\gamma (v))} \dot{\gamma }(v) du dv. $$

Define \(E_{\gamma , \gamma '}(\varphi ) = \Vert \nu _{\varphi \cdot \gamma } - \nu _{\gamma '}\Vert _{W^*}^2\). We follow the same pattern as in the previous section and define

$$ \tilde{E}(\varphi ) = {\big \langle {\nu _{\varphi \cdot \gamma }}\, , \, {\nu _\gamma - \nu _{\gamma '}}\big \rangle }_{W^*}, $$

which (introducing \(\varphi (\varepsilon , \cdot )\) with \(\varphi (0, \cdot ) = {\mathrm {id}}\) and \(\partial _\varepsilon \varphi (0, \cdot ) = h\)) is such that \(\partial _\varepsilon E(\varphi (\varepsilon , \cdot )) = 2\partial _\varepsilon \tilde{E}(\varphi (\varepsilon , \cdot ))\) at \(\varepsilon =0\). Define

$$ Z^{\tilde{\gamma }}(\cdot ) = \int _{\tilde{\gamma }} \xi (\cdot , p) N^{\tilde{\gamma }}(p)\, d\sigma _{\tilde{\gamma }}(p), $$

and \(\zeta = Z^\gamma - Z^{\gamma '}, \zeta ^\varphi = Z^{\varphi \cdot \gamma } - Z^{\gamma '}\), so that (using \((T^{\varphi \cdot \gamma })^T T^{\gamma '} = (N^{\varphi \cdot \gamma })^TN^{\gamma '}\))

$$ \tilde{E}(\varphi ) = \int _{\varphi \cdot \gamma } \zeta ^T N^{\varphi \cdot \gamma } d\sigma _{\varphi \cdot \gamma }. $$

We can use Theorem 5.2, Eq. (5.4), to find

$$ \partial _\varepsilon \tilde{E}(\varphi (\varepsilon , \cdot )) = - [\det (\zeta , h)]_0^\varDelta + \int _{\gamma } \mathrm {div}(\zeta )(N^\gamma )^Th\, dl. $$

This yields in turn (replacing \(\gamma \) by \(\varphi \cdot \gamma \), and letting \(p_0\) and \(p_1\) be the extremities of \(\gamma \))

$$\begin{aligned} \frac{1}{2} \bar{\partial }E_{\gamma , \gamma '}(\varphi ) = - (R_{\pi /2}\zeta ^\varphi ) (\delta _{\varphi (p_1)} - \delta _{\varphi (p_0)}) + \mathrm {div}(\zeta ^\varphi ) \nu _{\varphi \cdot \gamma }, \end{aligned}$$
(9.47)

where \(R_{\pi /2}\) denotes the rotation by \(90^\circ \). This final expression is remarkably simple, especially for closed curves, for which the first term vanishes. A discrete version of the matching functional can also be defined, namely, using the notation of the previous section:

$$\begin{aligned} E_{\gamma , \gamma '}(\varphi )= & {} \sum _{i, j= 1}^N (\tau ^\varphi _i)^T\xi (c^\varphi _i, c^\varphi _j) \tau ^\varphi _j \\&- 2 \sum _{i= 1}^N\sum _{j=1}^M (\tau ^\varphi _i)^T\xi (c^\varphi _i,c'_j) \tau '_j + \sum _{i, j= 1}^M (\tau '_i)^T\xi (c'_i, c'_j) \tau '_j. \end{aligned}$$

We leave the computation of the associated Eulerian differential (which is a slight variation of the one we made with measures) to the reader.
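The discrete functional above is straightforward to implement. Here is a minimal numpy sketch (ours, not the text's; the Gaussian scalar kernel \(\xi \) and all function names are our choices), in which segment centers play the role of the \(c_i\) and edge vectors that of the \(\tau _i\):

```python
import numpy as np

def curve_measure_data(pts, closed=True):
    """Segment centers c_i and edge vectors tau_i of a polygonal curve."""
    nxt = np.roll(pts, -1, axis=0) if closed else pts[1:]
    cur = pts if closed else pts[:-1]
    return (cur + nxt) / 2.0, nxt - cur

def gauss(x, y, sigma):
    """Gaussian scalar kernel xi evaluated on all pairs of points."""
    d2 = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def vector_measure_energy(pts1, pts2, sigma=0.5):
    """Discrete ||nu_gamma - nu_gamma'||_{W*}^2 for two polygonal curves."""
    c1, t1 = curve_measure_data(pts1)
    c2, t2 = curve_measure_data(pts2)
    e = (gauss(c1, c1, sigma) * (t1 @ t1.T)).sum()
    e -= 2 * (gauss(c1, c2, sigma) * (t1 @ t2.T)).sum()
    e += (gauss(c2, c2, sigma) * (t2 @ t2.T)).sum()
    return e

# Example: a circle compared with itself (zero) and with a shifted copy.
th = np.linspace(0, 2 * np.pi, 64, endpoint=False)
circle = np.stack([np.cos(th), np.sin(th)], axis=1)
print(vector_measure_energy(circle, circle))        # 0 (exact cancellation)
print(vector_measure_energy(circle, circle + 0.3))  # > 0
```

For identical curves the three sums cancel exactly, while for a positive definite kernel the energy is strictly positive as soon as the two discrete vector measures differ.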

9.7.3 Surface Matching

We now extend to surfaces the matching functionals that we just studied for curves. The construction is formally very similar. If S is a surface in \({\mathbb {R}}^3\), one can compute a measure \(\mu _S\) and a vector measure \(\nu _S\) defined by

$$\begin{aligned} {\left( {\mu _S}\, \left| \, {f}\right. \right) } = \int _S f(x) d\sigma _S(x) \text { for a scalar } f \end{aligned}$$
(9.48)

and

$$\begin{aligned} {\left( {\nu _S}\, \left| \, {f}\right. \right) } = \int _S f(x)^TN(x) d\sigma _S(x) \text { for a vector field } f, \end{aligned}$$
(9.49)

where \(d\sigma _S\) is the volume measure on S and N is the unit normal (S being assumed to be oriented in the definition of \(\nu _S\)).

We state without proof the following result:

Proposition 9.9

If S is a surface and \(\varphi \) a diffeomorphism of \({\mathbb {R}}^3\) that preserves the orientation (i.e., with positive Jacobian), we have

$$ {\left( {\mu _{\varphi (S)}}\, \left| \, {f}\right. \right) } = \int _S f\circ \varphi (x) |d\varphi (x)^{-T} N|\, \det (d\varphi (x)) d\sigma _S(x) $$

for a scalar f and for a vector-valued f,

$$ {\left( {\nu _{\varphi (S)}}\, \left| \, {f}\right. \right) } = \int _S {{f\circ \varphi (x)}^T} d\varphi (x)^{-T} N\, \det (d\varphi (x)) d\sigma _S(x). $$

If \(e_1(x), e_2(x)\) is a basis of the tangent plane to S at x, we have

$$\begin{aligned} d\varphi (x)^{-T} N\, \det (d\varphi (x)) = (d\varphi (x) e_1 \times d\varphi (x) e_2)/|e_1 \times e_2|. \end{aligned}$$
(9.50)

The last formula implies in particular that if S is parametrized by \((u,v)\mapsto m(u, v)\), then (since \(N= (\partial _1 m \times \partial _2 m) /|\partial _1 m\times \partial _2 m|\) and \(d\sigma _S = |\partial _1 m\times \partial _2 m|\, du dv\))

$$\begin{aligned} {\left( {\nu _S}\, \left| \, {f}\right. \right) }= & {} \int f(x)^T (\partial _1 m \times \partial _2 m) du dv \\= & {} \int \det (\partial _1 m, \partial _2 m, f) du dv \end{aligned}$$

and

$$ {\left( {\nu _{\varphi (S)}}\, \left| \, {f}\right. \right) } = \int \det (d\varphi \, \partial _1 m, d\varphi \, \partial _2 m, f\circ \varphi )du dv. $$

If W is an RKHS of scalar functions or vector fields, we can compare two surfaces by using the norm of the difference of their associated measures on \(W^*\). So define (in the scalar measure case)

$$\begin{aligned} E_{S, S'}(\varphi ) = \Vert \mu _{\varphi \cdot S} - \mu _{S'}\Vert ^2_{W^*} \end{aligned}$$
(9.51)

and the associated

$$ \tilde{E}(\varphi ) = {\big \langle {\mu _{\varphi \cdot S}}\, , \, {\mu _{S} - \mu _{S'}}\big \rangle }_{W^*} $$

so that, for \(\varphi (\varepsilon , \cdot )\) such that \(\varphi (0, \cdot ) = {\mathrm {id}}\) and \(\partial _\varepsilon \varphi (0, \cdot ) = h\)

$$ \partial _\varepsilon E_{S, S'}(\varphi (\varepsilon , \cdot )) = 2\partial _\varepsilon \tilde{E}(\varphi (\varepsilon , \cdot )) . $$

To a given surface \(\tilde{S}\), associate the function

$$ Z^{\tilde{S}}(\cdot ) = \int _{\tilde{S}} \xi (\cdot , p) d\sigma _{\tilde{S}}(p) $$

and \(\zeta = Z^S - Z^{S'}\), \(\zeta ^\varphi = Z^{\varphi \cdot S} - Z^{S'}\). Since

$$ \tilde{E}(\varphi ) = \int _{\varphi \cdot S} \zeta (p) d\sigma _{\varphi \cdot S}(p), $$

Theorem 5.4 yields

$$\begin{aligned} \partial _\varepsilon \tilde{E}(\varphi (\varepsilon , \cdot )) = - \int _{\partial S} \zeta \, (n^S)^Th\, d\sigma _{\partial S} +\int _{S} \big (-2\zeta H^S + (\nabla \zeta )^TN^S\big ) \, (N^S)^Th\, d\sigma _S \end{aligned}$$

where \(H^S\) is the mean curvature on S. This implies

$$\begin{aligned} \frac{1}{2} \bar{\partial }E_{S, S'}(\varphi ) = - \zeta ^\varphi n^{\varphi \cdot S} \mu _{\varphi \cdot \partial S} + \big (-2\zeta ^\varphi H^{\varphi \cdot S} + (\nabla \zeta ^\varphi )^TN^{\varphi \cdot S}\big )\nu _{\varphi \cdot S}. \end{aligned}$$
(9.52)

If we now use vector measures, so that

$$\begin{aligned} E_{S, S'}(\varphi ) = \Vert \nu _{\varphi \cdot S} - \nu _{S'}\Vert ^2_{W^*} \end{aligned}$$
(9.53)

and

$$ \tilde{E}(\varphi ) = {\big \langle {\nu _{\varphi \cdot S}}\, , \, {\nu _{S} - \nu _{S'}}\big \rangle }_{W^*}, $$

we need to define

$$ Z^{\tilde{S}}(\cdot ) = \int _{\tilde{S}} \xi (\cdot , p)N^{\tilde{S}} d\sigma _{\tilde{S}}(p) $$

and \(\zeta = Z^S - Z^{S'}\), \(\zeta ^\varphi = Z^{\varphi \cdot S} - Z^{S'}\), so that

$$ \tilde{E}(\varphi ) = \int _{\varphi \cdot S} \zeta ^TN^{\varphi \cdot S}\, d\sigma _{\varphi \cdot S}. $$

Variations derive again from Theorem 5.4, yielding

$$\begin{aligned} \partial _\varepsilon \tilde{E} = - \int _{\partial S} \big ((\zeta ^TN^S)(h^Tn^S) - (\zeta ^Tn^S)(h^TN^S) \big )\, d\sigma _{\partial S} + \int _{S} \mathrm {div}(\zeta )(N^S)^Th\, d\sigma _S. \end{aligned}$$

We therefore have

$$\begin{aligned} \frac{1}{2} \bar{\partial }E_{S, S'}(\varphi ) = - \big ((\zeta ^\varphi )^TN^{\varphi \cdot S}\, n^{\varphi \cdot S} - (\zeta ^\varphi )^Tn^{\varphi \cdot S}\, N^{\varphi \cdot S} \big )\, \mu _{\varphi \cdot \partial S} +\mathrm {div}(\zeta ^\varphi ) \nu _{\varphi \cdot S}. \end{aligned}$$
(9.54)

Again the expression is remarkably simple for surfaces without boundary.

Consider now the discrete case and let S be a triangulated surface [289]. Let \(x_1, \ldots , x_N\) be the vertices of S and \(f_1, \ldots , f_Q\) the faces (triangles), which are ordered triples of vertices \(f_i = (x_{i1}, x_{i2}, x_{i3})\). Let \(c_i\) be the center of \(f_i\), \(N_i\) its oriented unit normal and \(a_i\) its area. Define the discrete versions of the previous measures by

$$\begin{aligned} {\left( {\mu _S}\, \left| \, {h}\right. \right) } = \sum _{i=1}^Q h(c_i) a_i, \text { for a scalar } h \end{aligned}$$
(9.55)

and

$$\begin{aligned} {\left( {\nu _S}\, \left| \, {h}\right. \right) } = \sum _{i=1}^Q (h(c_i)^T N_i) a_i, \text { for a vector field } h. \end{aligned}$$
(9.56)

The previous formulae can be written as

$$ {\left( {\mu _S}\, \left| \, {h}\right. \right) } = \frac{1}{2}\sum _{i=1}^Q h\Big (\frac{x_{i1} + x_{i2} + x_{i3}}{3} \Big ) | (x_{i2} - x_{i1}) \times (x_{i3} - x_{i1})| $$

and

$$ {\left( {\nu _S}\, \left| \, {h}\right. \right) } = \frac{1}{2}\sum _{i=1}^Q {{h\Big (\frac{x_{i1} + x_{i2} + x_{i3}}{3} \Big )}^T} (x_{i2} - x_{i1})\times (x_{i3} - x_{i1}), $$

where the last formula requires that the vertices of the triangles are ordered consistently with the orientation (see Sect. 4.2). The transformed surfaces are now represented by the same expressions with \(x_{ik}\) replaced by \(\varphi (x_{ik})\). If, given two triangulated surfaces, one defines \(E_{S, S'}(\varphi ) = \Vert \mu _{\varphi \cdot S} - \mu _{S'}\Vert _{W^*}^2\), then (leaving the computation to the reader)
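As an illustration of (9.55) and (9.56), here is a small numpy sketch (ours, not the text's) computing the face centers, normals and areas of a triangulated surface and the two discrete pairings; taking \(h(x) = x\) in the vector pairing recovers, by the divergence theorem and exactness of the midpoint rule for linear integrands, three times the enclosed volume:

```python
import numpy as np

def face_data(verts, faces):
    """Centers, unit normals and areas of the faces of a triangulated surface."""
    x1, x2, x3 = verts[faces[:, 0]], verts[faces[:, 1]], verts[faces[:, 2]]
    cross = np.cross(x2 - x1, x3 - x1)        # = 2 * area * unit normal
    norm = np.linalg.norm(cross, axis=1)
    return (x1 + x2 + x3) / 3.0, cross / norm[:, None], norm / 2.0

def mu_pairing(verts, faces, h):
    """(mu_S | h) = sum_i h(c_i) a_i for a scalar function h."""
    c, _, a = face_data(verts, faces)
    return (h(c) * a).sum()

def nu_pairing(verts, faces, h):
    """(nu_S | h) = sum_i h(c_i)^T N_i a_i for a vector field h."""
    c, n, a = face_data(verts, faces)
    return ((h(c) * n).sum(1) * a).sum()

# Example: unit tetrahedron with outward-oriented faces.
v = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
f = np.array([[0, 2, 1], [0, 1, 3], [0, 3, 2], [1, 2, 3]])
total_area = mu_pairing(v, f, lambda c: np.ones(len(c)))   # 1.5 + sqrt(3)/2
flux = nu_pairing(v, f, lambda c: c)   # div(x) = 3, so flux = 3 * (1/6) = 0.5
```

The flux check is a convenient test of the orientation convention: a wrongly ordered face flips the sign of its contribution.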

$$ \frac{1}{2} \bar{\partial }E_{S, S'}({\mathrm {id}}) = \sum _{k=1}^N \Big (\sum _{i: x_k\in f_i} (\nabla \zeta (c_i) \frac{a_i}{3} - \zeta (c_i) e_{ik}\times N_i)\Big ) \delta _{x_k}, $$

where \(e_{ik}\) is the edge opposite \(x_k\) in \(f_i\) (oriented so that \((x_k, e_{ik})\) is positively ordered), and \(\zeta = Z^S - Z^{S'}\), with

$$ Z^{\tilde{S}}(\cdot ) = \sum _{i=1}^{\tilde{Q}} \xi (\cdot , \tilde{c}_i) \tilde{a}_i $$

for a triangulated surface \(\tilde{S}\). The Eulerian differential at \(\varphi \) is obtained by replacing all \(x_k\)’s by \(\varphi (x_k)\).

For the vector-measure form, \(E_{S, S'}(\varphi ) = \Vert \nu _{\varphi \cdot S} - \nu _{S'}\Vert _{W^*}^2\), we get

$$ \frac{1}{2} \bar{\partial }E_{S, S'}({\mathrm {id}}) = \sum _{k=1}^N \Big (\sum _{i: x_k\in f_i} (d \zeta (c_i)N_i) \frac{a_i}{3} - e_{ik}\times \zeta (c_i)\Big ) \delta _{x_k} $$

still with \(\zeta = Z^S - Z^{S'}\), but with

$$ Z^{\tilde{S}}(\cdot ) = \sum _{i=1}^{\tilde{Q}} \xi (\cdot , \tilde{c}_i) \tilde{N}_i \tilde{a}_i. $$

9.7.4 Induced Actions and Currents

We have defined the action of diffeomorphisms on measures by \({\left( {\varphi \cdot \mu }\, \left| \, {h}\right. \right) } = {\left( {\mu }\, \left| \, {h\circ \varphi }\right. \right) }\). Recall that we have the usual action of diffeomorphisms on functions defined by \(\varphi \cdot h = h\circ \varphi ^{-1}\), so that we can write \({\left( {\varphi \cdot \mu }\, \left| \, {h}\right. \right) } = {\left( {\mu }\, \left| \, {\varphi ^{-1} \cdot h}\right. \right) }\). In the case of curves, we have seen that this action on the induced measure did not correspond to the image of the curve by a diffeomorphism, in the sense that \(\mu _{\varphi \cdot \gamma } \ne \varphi \cdot \mu _\gamma \). Here, we discuss whether the transformations \(\mu _\gamma \mapsto \mu _{\varphi \cdot \gamma }\) or \(\nu _\gamma \mapsto \nu _{\varphi \cdot \gamma }\) (and the equivalent transformations for surfaces) can be described by a similar operation, e.g., whether one can write \({\left( {\varphi \cdot \mu }\, \left| \, {h}\right. \right) } = {\left( {\mu }\, \left| \, {\varphi ^{-1} \star h}\right. \right) }\), where \(\star \) would represent another action of diffeomorphisms on functions (or on vector fields for vector measures).

For \(\mu _\gamma \), the answer is negative. We have, letting \(T(\gamma (u))\) be the unit tangent to \(\gamma \),

$$\begin{aligned} {\left( {\mu _{\varphi \cdot \gamma }}\, \left| \, {h}\right. \right) }= & {} \int _a^b h(\varphi (\gamma (u))) |d\varphi (\gamma (u)) \dot{\gamma }(u)| du \\= & {} \int _a^b h(\varphi (\gamma (u))) |d\varphi (\gamma (u)) T(\gamma (u))| |\dot{\gamma }(u)| du, \end{aligned}$$

so that \({\left( {\mu _{\varphi \cdot \gamma }}\, \left| \, {h}\right. \right) } = {\left( {\mu _{\gamma }}\, \left| \, {h\circ \varphi |d\varphi \, T |}\right. \right) }\), with some abuse of notation in the last formula, since T is only defined along \(\gamma \). The important fact here is that the function h is transformed according to a rule which depends not only on the diffeomorphism \(\varphi \), but also on the curve \(\gamma \), and therefore the result cannot be put in the form \(\varphi ^{-1} \star h\).

The situation is different for vector measures. Indeed, we have

$$\begin{aligned} \nu _{\varphi \cdot \gamma }(h)= & {} \int _a^b {{(d\varphi (\gamma (u)) \dot{\gamma }(u))}^T} h \circ \varphi (\gamma (u)) du \\= & {} {\left( {\nu _\gamma }\, \left| \, {d\varphi ^T h\circ \varphi }\right. \right) }. \end{aligned}$$

So, if we define \(\varphi \star h = d(\varphi ^{-1})^T h\circ \varphi ^{-1}\), we have \({\left( {\nu _{\varphi \cdot \gamma }}\, \left| \, {h}\right. \right) } = {\left( {\nu _\gamma }\, \left| \, {\varphi ^{-1} \star h}\right. \right) }\). The transformation \((\varphi , h) \mapsto \varphi \star h\) is a valid action of diffeomorphisms on vector fields, since \({\mathrm {id}}\star h = h\) and \(\varphi \star (\psi \star h) = (\varphi \circ \psi ) \star h\), as can easily be checked.
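The composition rule for \(\star \) is easy to verify numerically for linear diffeomorphisms \(\varphi (x) = Ax\), for which \((\varphi \star h)(x) = A^{-T} h(A^{-1}x)\). A minimal sketch (ours; the matrices and test field are arbitrary choices):

```python
import numpy as np

def star(A, h):
    """phi ⋆ h for the linear diffeomorphism phi(x) = A x:
    (phi ⋆ h)(x) = A^{-T} h(A^{-1} x)."""
    Ainv = np.linalg.inv(A)
    return lambda x: Ainv.T @ h(Ainv @ x)

# A nonlinear test vector field and two invertible matrices.
h = lambda x: np.array([np.sin(x[0]) + x[1] ** 2, np.cos(x[0] * x[1])])
A = np.array([[1.1, 0.4], [-0.2, 0.9]])
B = np.array([[0.8, -0.3], [0.5, 1.2]])

x = np.array([0.3, -0.7])
lhs = star(A, star(B, h))(x)     # phi ⋆ (psi ⋆ h)
rhs = star(A @ B, h)(x)          # (phi ∘ psi) ⋆ h
print(np.allclose(lhs, rhs))     # True
```

The agreement reflects \((AB)^{-T} = A^{-T}B^{-T}\) and \((AB)^{-1} = B^{-1}A^{-1}\), which is exactly the composition property \(\varphi \star (\psi \star h) = (\varphi \circ \psi )\star h\) in the linear case.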

The same analysis can be made for surfaces; scalar measures do not transform in accordance with an action, but vector measures do. Let us check this last point by considering the formula in a local chart, where

$$\begin{aligned} {\left( {\nu _{\varphi (S)}}\, \left| \, {h}\right. \right) }= & {} \int \det (d\varphi \, \partial _1 m, d\varphi \, \partial _2m, h\circ \varphi )du dv\\= & {} \int \det (d\varphi ) \det (\partial _1 m, \partial _2 m, (d\varphi )^{-1} h \circ \varphi ) du dv\\= & {} {\left( {\nu _S}\, \left| \, {\det (d\varphi ) (d\varphi )^{-1} h\circ \varphi }\right. \right) }. \end{aligned}$$

So, we need here to define

$$\varphi \star h = \det (d(\varphi ^{-1})) (d\varphi ^{-1})^{-1} h\circ \varphi ^{-1} = (d\varphi \, h/ \det (d\varphi )) \circ \varphi ^{-1}.$$

Here again, a direct computation shows that this is an action.

We have just proved that vector measures are transformed by a diffeomorphism \(\varphi \) according to a rule \({\left( {\varphi \cdot \mu }\, \left| \, {h}\right. \right) } = {\left( {\mu }\, \left| \, {\varphi ^{-1}\star h}\right. \right) }\), the action \(\star \) being apparently different for curves and surfaces. In fact, all these actions (including the scalar one) can be placed within a single framework if one replaces vector fields by differential forms and measures by currents [126, 127, 289].

The reader may refer to Sects. B.7.1 and B.7.2 for basic definitions of linear and differential forms, in which the space of differential k-forms on \({\mathbb {R}}^d\) is denoted \(\varOmega _k\), or \(\varOmega _k^d\). We can consider spaces of smooth differential k-forms, and in particular, reproducing kernel Hilbert spaces of such forms: a space \(W\subset \varOmega _k\) is an RKHS if, for every \(x\in {\mathbb {R}}^d\) and \(e_1, \ldots , e_k\in {\mathbb {R}}^d\), the evaluation function

$$ (e_1, \ldots , e_k)\delta _x: q \mapsto {\left( {q(x)}\, \left| \, {e_1, \ldots , e_k}\right. \right) } $$

belongs to \(W^*\). Introduce the duality operator \({\mathbb {K}}_W: W^*\rightarrow W\), so that \({\mathbb {K}}_W((e_1, \ldots , e_k)\delta _x)\in W\). Introduce, for \(x, y\in {\mathbb {R}}^d\), the 2k-linear form \(\xi (x, y)\) defined by

$$ {\left( {\xi (x, y)}\, \left| \, {e_1, \ldots , e_k; f_1, \ldots , f_k}\right. \right) } = {\left( {{\mathbb {K}}_W((e_1, \ldots , e_k)\delta _x)(y)}\, \left| \, {f_1, \ldots , f_k}\right. \right) }. $$

Notice that this form is skew-symmetric with respect to its first k and its last k variables and that

$$ {\big \langle {\xi _x(e_1, \ldots , e_k)}\, , \, {\xi _y(f_1, \ldots , f_k)}\big \rangle }_W = {\left( {\xi (x, y)}\, \left| \, {e_1, \ldots , e_k; f_1, \ldots , f_k}\right. \right) }, $$

so that \(\xi \) may be called the reproducing kernel of W. Similar to vector fields, kernels for differential k-forms can be derived from scalar kernels by letting

$$\begin{aligned} {\left( {\xi (x, y)}\, \left| \, {e_1, \ldots , e_k; f_1, \ldots , f_k}\right. \right) }&= \nonumber \\&\xi (x, y) {\big \langle {e_1\times \cdots \times e_k}\, , \, {f_1\times \cdots \times f_k}\big \rangle }_{\varLambda _{d-k}}, \end{aligned}$$
(9.57)

where the dot product on the space of k-linear forms, \(\varLambda _k\), is the product of coefficients of the forms over a basis formed by all cross products of subsets of k elements of an orthonormal basis of \({\mathbb {R}}^d\), as described in Sect. B.7.1.

Elements of the dual space, \(W^*\), to W are therefore linear forms over differential k-forms, and are special instances of k-currents [107, 210] (k-currents are bounded linear forms over \(C^\infty \) differential k-forms with compact support, which is less restrictive than being bounded on W). Important examples of currents are those associated to submanifolds of \({\mathbb {R}}^d\), and are defined as follows. Let M be an oriented k-dimensional submanifold of \({\mathbb {R}}^d\). To a differential k-form q, associate the quantity

$$ {\left( {\eta _M}\, \left| \, {q}\right. \right) } = \int _M {\left( {q(x)}\, \left| \, {e_1(x), \ldots , e_k(x)}\right. \right) } \, d\sigma _M(x) $$

where \(e_1, \ldots , e_k\) is, for all x, a positively oriented orthonormal basis of the tangent space to M at x (by Eq. (B.16), the result does not depend on the chosen basis).

If W is an RKHS of differential k-forms, \(\eta _M\) belongs to \(W^*\) and we can compute the dual norm of \(\eta _M\), which is

$$ \Vert \eta _M\Vert _{W^*}^2 = \int _M\int _M {\left( {\xi (x, y)}\, \left| \, {e_1(x), \ldots , e_k(x); e_1(y), \ldots , e_k(y)}\right. \right) } d\sigma _M(x) d\sigma _M(y) $$

or, for a scalar kernel defined by (9.57),

$$\begin{aligned} \Vert \eta _M\Vert _{W^*}^2 = \int _M\int _M \xi (x, y){\big \langle {e_1(x)\times \cdots \times e_k(x)}\, , \, {e_1(y) \times \cdots \times e_k(y)}\big \rangle }_{\varLambda _{d-k}}\, d\sigma _M(x)\, d\sigma _M(y). \end{aligned}$$

The expressions of \(\eta _M\) and its norm in a local chart of M are quite simple. Indeed, if \((u_1, \ldots , u_k)\mapsto m(u_1, \ldots , u_k)\) is the parametrization in the chart and \((\partial _{1} m, \ldots , \partial _{k} m)\) are the associated tangent vectors (assumed to be positively oriented), we have, for a k-form q (using (B.16))

$$ {\left( {q}\, \left| \, {\partial _{1}m, \ldots , \partial _{k}m}\right. \right) } = {\left( {q}\, \left| \, {e_1, \ldots , e_k}\right. \right) } \det (\partial _{1}m, \ldots , \partial _{k}m) $$

which immediately yields

$$ {\left( {q}\, \left| \, {\partial _{1}m, \ldots , \partial _{k}m}\right. \right) } du_1 \ldots du_k= {\left( {q}\, \left| \, {e_1, \ldots , e_k}\right. \right) } d\sigma _M. $$

We therefore have, in the chart,

$$ {\left( {\eta _M}\, \left| \, {q}\right. \right) } = \int {\left( {q}\, \left| \, {\partial _{1}m, \ldots , \partial _{k}m}\right. \right) } du_1\ldots du_k $$

and similar formulas for the norm.

Now consider the action of diffeomorphisms. If M becomes \(\varphi (M)\), the formula in the chart yields

$$ {\left( {\eta _{\varphi (M)}}\, \left| \, {q}\right. \right) } = \int {\left( {q\circ \varphi }\, \left| \, {d\varphi \partial _{1}m, \ldots , d\varphi \partial _{k}m}\right. \right) } du_1\ldots du_k $$

so that \({\left( {\eta _{\varphi (M)}}\, \left| \, {q}\right. \right) } = {\left( {\eta _{M}}\, \left| \, {\tilde{q}}\right. \right) }\) with

$${\left( {\tilde{q}(x)}\, \left| \, {f_1, \ldots , f_k}\right. \right) } = {\left( {q(\varphi (x))}\, \left| \, {d\varphi f_1, \ldots , d\varphi f_k}\right. \right) }.$$

As we did with vector measures, we can introduce the left action on k-forms (also called the push-forward of the k-form):

$$ {\left( {\varphi \star q}\, \left| \, {f_1, \ldots , f_k}\right. \right) } = {\left( {q\circ \varphi ^{-1}}\, \left| \, {d(\varphi ^{-1}) f_1, \ldots , d(\varphi ^{-1}) f_k}\right. \right) } $$

and the resulting action on k-currents

$$\begin{aligned} {\left( {\varphi \cdot \eta }\, \left| \, {q}\right. \right) } = {\left( {\eta }\, \left| \, {\varphi ^{-1}\star q}\right. \right) }, \end{aligned}$$
(9.58)

so that we can write \(\eta _{\varphi (M)} = \varphi \cdot \eta _M\).

This is reminiscent of what we have obtained for measures, and for vector measures with curves and surfaces. We now check that these examples are particular cases of the previous discussion.

Measures are linear forms on functions, which are also differential 0-forms. The definition \({\left( {\varphi \cdot \mu }\, \left| \, {f}\right. \right) } = {\left( {\mu }\, \left| \, {f\circ \varphi }\right. \right) }\) is exactly the same as in (9.58).

Consider now the case of curves, which are 1D submanifolds, so that \(k=1\). If \(\gamma \) is a curve, and T is its unit tangent, we have

$$ {\left( {\eta _\gamma }\, \left| \, {q}\right. \right) } = \int _\gamma {\left( {q(\gamma )}\, \left| \, {T}\right. \right) } d\sigma _\gamma = \int _a^b {\left( {q(\gamma (u))}\, \left| \, {\dot{\gamma }(u)}\right. \right) } du. $$

To a vector field h on \({\mathbb {R}}^d\), we can associate the differential 1-form \(q_h\) defined by \({\left( {q_h(x)}\, \left| \, {v}\right. \right) } = h(x)^Tv\). In fact all differential 1-forms can be expressed as \(q_h\) for some vector field h. Using this identification and noting that \({\left( {\nu _\gamma }\, \left| \, {h}\right. \right) } = {\left( {\eta _\gamma }\, \left| \, {q_h}\right. \right) }\), we can see that the vector measure for curve matching is a special case of the currents that we have considered here.

For surfaces in three dimensions, we need to take \(k=2\), and if S is a surface, we have

$$ {\left( {\eta _S}\, \left| \, {q}\right. \right) } = \int _S {\left( {q(x)}\, \left| \, {e_1(x), e_2(x)}\right. \right) } d\sigma _S(x). $$

Again, a vector field f on \({\mathbb {R}}^3\) induces a 2-form \(q_f\), defined by \({\left( {q_f}\, \left| \, {v_1, v_2}\right. \right) } = \det (f, v_1, v_2) = f^T (v_1\times v_2)\), and every 2-form can be obtained this way. Using the fact that, if \((e_1, e_2)\) is a positively oriented basis of the tangent space to the surface, then \(e_1\times e_2 = N\), we retrieve \({\left( {\nu _S}\, \left| \, {f}\right. \right) } = {\left( {\eta _S}\, \left| \, {q_f}\right. \right) }\).

9.7.5 Varifolds

A differential k-form q on \({\mathbb {R}}^d\) uniquely defines a function on \({\mathbb {R}}^d\times \widetilde{\mathrm {Gr}}(d, k)\), the product of \({\mathbb {R}}^d\) with the set of all oriented k-dimensional subspaces of \({\mathbb {R}}^d\) (called the oriented Grassmannian, on which a manifold structure similar to the one discussed in Sect. B.6.7 for the Grassmann manifold can be defined). One can indeed assign to any pair \((x, \alpha )\) in that set the scalar \(F_q(x,\alpha ) = {\left( {q(x)}\, \left| \, { e_1, \ldots , e_k}\right. \right) }\), where \(e_1, \ldots , e_k\) is any positively oriented orthonormal basis of \(\alpha \); the value does not depend on the chosen basis. Given an oriented k-dimensional submanifold M of \({\mathbb {R}}^d\), one can define the linear form \(\tilde{\rho }_M\) on continuous functions F defined on \({\mathbb {R}}^d\times \widetilde{\mathrm {Gr}}(d, k)\), given by

$$ {\left( {\tilde{\rho }_M}\, \left| \, {F}\right. \right) } = \int _M F(p, T_p M) d\sigma _M, $$

where \(T_pM\) is considered with its orientation. The current \(\eta _M\) defined in the previous section is such that \({\left( {\eta _M}\, \left| \, {q}\right. \right) } = {\left( {\tilde{\rho }_M}\, \left| \, {F_q}\right. \right) }\).

When one wants to disregard orientation, which may be convenient, and sometimes necessary in practice, it is natural to replace \(\widetilde{\mathrm {Gr}}(d, k)\) by \(\mathrm {Gr}(d, k)\) (the Grassmannian) and define the same linear form (that we now call \(\rho _M\)) on functions defined on \({\mathbb {R}}^d \times \mathrm {Gr}(d, k)\). The linear form \(\rho _M\) is a special case of a varifold, where varifolds are defined as (Radon) measures on \({\mathbb {R}}^d \times \mathrm {Gr}(d, k)\).

From this point, and following [61], one can make a construction analogous to the one just described for measures on \({\mathbb {R}}^d\). Given a reproducing kernel Hilbert space W of functions defined on \({\mathbb {R}}^d \times \mathrm {Gr}(d, k)\), define the square distance between two k-dimensional submanifolds of \({\mathbb {R}}^d\) by

$$ D(M, M') = \Vert \rho _M - \rho _{M'}\Vert _{W^*}^2. $$

For the approach to be practical, one needs to have explicit kernels on \({\mathbb {R}}^d \times \mathrm {Gr}(d, k)\). Referring to [61] for a complete discussion, we note here that a class of such kernels can be designed based on the following observations.

  (i)

    The function defined for \(\alpha , \beta \in \widetilde{\mathrm {Gr}}(d, k)\) by

    $$ \tilde{\xi }(\alpha , \beta ) = {\big \langle {e_1\times \cdots \times e_k}\, , \, {f_1 \times \cdots \times f_k}\big \rangle }_{\varLambda _{d-k}}, $$

    where \((e_1, \ldots , e_k)\) and \((f_1, \ldots , f_k)\) are positively oriented orthonormal bases of \(\alpha \) and \(\beta \), is a positive definite kernel. Hence, the function defined for \( \alpha , \beta \in {\mathrm {Gr}}(d, k)\) by

    $$ \xi ( \alpha , \beta ) = {\big \langle {e_1\times \cdots \times e_k}\, , \, {f_1\times \cdots \times f_k}\big \rangle }^2_{\varLambda _{d-k}}, $$

    where \((e_1, \ldots , e_k)\) and \((f_1, \ldots , f_k)\) are orthonormal bases of \(\alpha \) and \(\beta \), is also positive definite. More generally, if \(\tilde{\xi }\) is positive definite on \(\widetilde{\mathrm {Gr}}(d, k)\), then \(\xi = f(\tilde{\xi })\) is positive definite on \(\mathrm {Gr}(d, k)\) for any even analytic function f whose derivatives at 0 are all non-negative, with at least one of them positive. These statements simply use the fact that products of positive kernels remain positive.

  (ii)

    If \(\eta \) is a positive kernel on differential k-forms, then \(\tilde{\xi }\) defined by

    $$ \tilde{\xi }(x,\alpha ; y, \beta ) = {\left( {\eta (x, y)}\, \left| \, {e_1, \ldots , e_k; f_1, \ldots , f_k}\right. \right) } $$

    is a positive kernel on \({\mathbb {R}}^d \times \widetilde{\mathrm {Gr}}(d, k)\).

  (iii)

    If \(\xi ^{(1)}\) is a reproducing kernel on \({\mathbb {R}}^d\) and \(\xi ^{(2)}\) a reproducing kernel on \(\mathrm {Gr}(d, k)\), then \(\xi \) defined by

    $$ \xi (x, \alpha ; y, \beta ) = \xi ^{(1)}(x, y) \xi ^{(2)} (\alpha , \beta ) $$

    is a reproducing kernel on \({\mathbb {R}}^d \times \mathrm {Gr}(d, k)\).
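Observation (i) can be tested numerically in the simplest case \(\mathrm {Gr}(3, 1)\) (lines through the origin in \({\mathbb {R}}^3\)), where the kernel reduces to the squared inner product of unit representatives. The sketch below (ours) checks positive semi-definiteness of a Gram matrix and invariance under sign flips of the representatives:

```python
import numpy as np

rng = np.random.default_rng(1)

# Lines through the origin in R^3, i.e. points of Gr(3,1), represented by
# unit vectors; the kernel must not depend on the sign of the representative.
u = rng.standard_normal((50, 3))
u /= np.linalg.norm(u, axis=1, keepdims=True)

gram = (u @ u.T) ** 2          # xi(alpha, beta) = (u^T v)^2
eigs = np.linalg.eigvalsh(gram)
print(eigs.min() >= -1e-10)    # True: the Gram matrix is positive semi-definite

# Sign invariance: flipping representatives leaves the kernel unchanged.
print(np.allclose(((-u) @ u.T) ** 2, gram))   # True
```

Positivity here is an instance of the Schur product theorem: the entrywise square of the positive semi-definite matrix \((u_i^Tu_j)\) is again positive semi-definite.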

Applying this to surfaces, for example, and using the discussion at the end of the previous section, we find that taking

$$ {\big \langle {\rho _S}\, , \, {\rho _{\tilde{S}}}\big \rangle }_{W^*} = \int _S\int _{\tilde{S}} \xi (x,\tilde{x}) \left( 1+ a \left( N(x)^T\tilde{N}(\tilde{x})\right) ^2\right) d\sigma _{\tilde{S}}(\tilde{x}) d\sigma _S(x), $$

where \(\xi \) is a reproducing kernel on \({\mathbb {R}}^d\), provides an RKHS dual inner-product on varifolds. The discretization of such a norm is similar to those detailed for scalar and vector measures and is left to the reader.
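As a sketch of such a discretization (ours; it assumes a Gaussian spatial kernel \(\xi \) and reuses face centers, normals and areas as in the triangulated-surface formulas of Sect. 9.7.3), the varifold inner product and squared distance can be written as follows; the squared normal term makes the distance independent of the choice of orientation:

```python
import numpy as np

def face_data(verts, faces):
    """Centers, unit normals and areas of the faces of a triangulated surface."""
    x1, x2, x3 = verts[faces[:, 0]], verts[faces[:, 1]], verts[faces[:, 2]]
    cross = np.cross(x2 - x1, x3 - x1)
    norm = np.linalg.norm(cross, axis=1)
    return (x1 + x2 + x3) / 3.0, cross / norm[:, None], norm / 2.0

def varifold_inner(s1, s2, sigma=0.5, a=1.0):
    """<rho_S, rho_S'>_{W*} with Gaussian xi and the (1 + a (N^T N')^2) factor."""
    c1, n1, a1 = face_data(*s1)
    c2, n2, a2 = face_data(*s2)
    d2 = ((c1[:, None, :] - c2[None, :, :]) ** 2).sum(-1)
    xi = np.exp(-d2 / (2 * sigma ** 2))
    return (xi * (1 + a * (n1 @ n2.T) ** 2) * np.outer(a1, a2)).sum()

def varifold_dist2(s1, s2, **kw):
    """Unoriented squared distance ||rho_S - rho_S'||_{W*}^2."""
    return (varifold_inner(s1, s1, **kw) - 2 * varifold_inner(s1, s2, **kw)
            + varifold_inner(s2, s2, **kw))

# Orientation invariance: reversing every face leaves the distance unchanged.
v = np.array([[0., 0, 0], [1, 0, 0], [0, 1, 0], [0, 0, 1]])
f = np.array([[0, 2, 1], [0, 1, 3], [0, 3, 2], [1, 2, 3]])
f_flip = f[:, ::-1]     # flips all face normals
print(np.isclose(varifold_dist2((v, f), (v + 0.2, f)),
                 varifold_dist2((v, f_flip), (v + 0.2, f))))   # True
```

The same computation with the current (vector-measure) distance would change under the flip, which is precisely what the varifold construction is designed to avoid.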

9.8 Matching Vector Fields

We now study vector fields as deformable objects. They correspond, for example, to velocity fields (as observed in weather data), or to gradient fields computed from images. Orientation fields (which can be represented by unit vector fields) are also interesting; they can correspond, for example, to fiber orientations in tissues observed in medical images.

We want to compare two vector fields f and \(f'\), i.e., two functions from \({\mathbb {R}}^d\) to \({\mathbb {R}}^d\). To simplify, we restrict ourselves to \(E_{f, f'}(\varphi )\) being the \(L^2\) norm between \(\varphi \cdot f\) and \( f'\), and focus our discussion on the definition of the action of diffeomorphisms on vector fields.

The simplest choice is to use the same action as in image matching and take \(\varphi \cdot f = f\circ \varphi ^{-1}\), where f is a vector field on \({\mathbb {R}}^d\). It is, however, natural (and more consistent with applications) to combine the displacement of the points at which f is evaluated with a reorientation of f, also induced by the transformation. Several choices can be made for such an action and all may be of interest depending on the context.

For example, we can interpret a vector field as a velocity field, assuming that each point in \(\varOmega \) moves along a trajectory x(t) and that \(f(x) = \dot{x}(t)\), say at time \(t=0\). If we make the transformation \(x\mapsto x' = \varphi (x)\), and let \(f'\) be the transformed vector field, such that \(\dot{x}'(0) = f' (x')\), we get \(\dot{x}'(0) = d\varphi (x) \dot{x}(0) = f'\circ \varphi (x)\), so that \(f' = (d\varphi \, f )\circ \varphi ^{-1}\). The transformation \(f \mapsto (d\varphi \, f)\circ \varphi ^{-1}\) is an important Lie group operation, called the adjoint representation (\(\mathrm {Ad}_\varphi f\)). This is anecdotal here, but we will use it again later as a fundamental tool. So, our first action is

$$ \varphi *f = (d\varphi \, f)\circ \varphi ^{-1}. $$

To define a second action, we now consider vector fields that are obtained as gradients of a function I: \(f = \nabla I\). If I becomes \(\varphi \cdot I = I\circ \varphi ^{-1}\), then f becomes \(d(\varphi ^{-1})^T \nabla I\circ \varphi ^{-1}\). This defines a new action

$$ \varphi \star f = d(\varphi ^{-1})^T f\circ \varphi ^{-1} = (d\varphi ^{-T}\, f)\circ \varphi ^{-1}. $$

This action can be applied to any vector field, not only gradients, but one can check that the set of vector fields f such that \(\mathrm {curl} \, f = 0\) is left invariant by this action.
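The defining property of \(\star \) (that it maps \(\nabla I\) to \(\nabla (I\circ \varphi ^{-1})\)) can be checked numerically for a linear diffeomorphism \(\varphi (x) = Ax\); the scalar function I below is an arbitrary choice of ours:

```python
import numpy as np

A = np.array([[1.2, 0.3], [-0.1, 0.9]])     # linear diffeomorphism phi(x) = A x
Ainv = np.linalg.inv(A)

I = lambda x: np.sin(x[0]) * x[1]           # a scalar "image"
gradI = lambda x: np.array([np.cos(x[0]) * x[1], np.sin(x[0])])

def star(x):
    """(phi ⋆ grad I)(x) = A^{-T} (grad I)(A^{-1} x)."""
    return Ainv.T @ gradI(Ainv @ x)

def grad_deformed(x, eps=1e-6):
    """Central-difference gradient of the deformed image I ∘ phi^{-1}."""
    J = lambda y: I(Ainv @ y)
    return np.array([(J(x + eps * e) - J(x - eps * e)) / (2 * eps)
                     for e in np.eye(2)])

x = np.array([0.7, -0.4])
print(np.allclose(star(x), grad_deformed(x)))   # True
```

The agreement is just the chain rule, \(\nabla (I\circ \varphi ^{-1}) = d(\varphi ^{-1})^T (\nabla I)\circ \varphi ^{-1}\), evaluated numerically.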

Sometimes, it is important that the norms of the vector fields at each point remain invariant under the transformation, when dealing, for example, with orientation fields. This can be achieved in both cases by normalizing the result, and we define the following normalized actions:

$$\begin{aligned} \varphi \,\bar{*}\, f= & {} \left( |f| \frac{d\varphi f}{|d\varphi f|}\right) \circ \varphi ^{-1}\\ \varphi \,\bar{\star }\, f= & {} \left( |f| \frac{d\varphi ^{-T} f}{|d\varphi ^{-T} f|}\right) \circ \varphi ^{-1} \end{aligned}$$

(taking, in both cases, the right-hand side equal to 0 if \(|f|=0\)).

We now evaluate the differential of \(E_{f, f'}(\varphi ) = \Vert \varphi \cdot f- f'\Vert _2^2\), where \(\varphi \cdot f\) is one of the actions above. We will make the computation below under the assumption that f is \(C^1\) and compactly supported. For the \(*\) action, we can observe that, for \(\varphi = {\mathrm {id}}+ h\),

$$ \varphi *f - f = dh\, f \circ ({\mathrm {id}}+h)^{-1} + f\circ ({\mathrm {id}}+h)^{-1} - f, $$

so that

$$\begin{aligned} \varphi *f - f - dh\,f + df\, h = {}&dh\, (f \circ ({\mathrm {id}}+h)^{-1} - f)+ f\circ ({\mathrm {id}}+h)^{-1} - f\circ ({\mathrm {id}}-h) \\&+ f\circ ({\mathrm {id}}-h) - f + df\, h. \end{aligned}$$

Using the fact that \(\Vert ({\mathrm {id}}+h)^{-1} - {\mathrm {id}}\Vert _\infty = \Vert h\Vert _\infty \),

$$ \Vert ({\mathrm {id}}+h)^{-1} - ({\mathrm {id}}- h)\Vert _\infty = \Vert h\circ ({\mathrm {id}}+h) - h\Vert _\infty \le \Vert h\Vert _{1, \infty }^2 $$

and letting

$$ \omega ^{(1)}_f(\varepsilon ) = \sup _{x\in {\mathbb {R}}^d}\sup _{|\delta |<\varepsilon } |f(x+\delta ) - f(x) -df(x)\delta | $$

we find that

$$ \Vert \varphi *f - f - dh\,f + df\, h\Vert _\infty \le \Vert f\Vert _\infty \Vert h\Vert _{1,\infty }^2 + \Vert f\Vert _{1, \infty } \Vert h\Vert _{1, \infty }^2 + \omega ^{(1)}_f(\Vert h\Vert _\infty ) . $$

Noting that \(\omega ^{(1)}_f(\varepsilon ) = o(\varepsilon )\), we find that

$$\begin{aligned} \Vert \varphi *f - f - dh\,f + df\, h\Vert _\infty = o(\Vert h\Vert _{1, \infty }). \end{aligned}$$
(9.59)

Using this estimate, it is now easy to show that \(E_{f, f'}: \mathrm {Diff}^{1, \infty }_0\rightarrow {\mathbb {R}}\) is differentiable at \(\varphi = {\mathrm {id}}\) with derivative

$$\begin{aligned} {\left( {d E_{f, f'}({\mathrm {id}})}\, \left| \, {h}\right. \right) }= & {} 2{\big \langle {dh \, f - df\, h}\, , \, {f-f'}\big \rangle }_2 \\= & {} 2\int _\varOmega (dh\, f - df\, h)^T(f-f') dx. \end{aligned}$$
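This derivative can be validated by a finite-difference test. The sketch below (ours) takes Gaussian fields \(f = e^{-|x|^2}a\), \(f' = e^{-|x|^2}b\) and a linear perturbation \(h(x) = Bx\), for which \(\varphi _\varepsilon = {\mathrm {id}} + \varepsilon h\) is linear with an exact inverse, and compares a centered difference of \(E_{f, f'}(\varphi _\varepsilon )\) with the formula above, both discretized on the same grid:

```python
import numpy as np

a, b = np.array([1.0, 0.5]), np.array([-0.3, 0.8])
B = np.array([[0.2, -0.5], [0.4, 0.1]])            # h(x) = B x

g = lambda X: np.exp(-(X ** 2).sum(-1))            # X: (..., 2) array of points
f = lambda X: g(X)[..., None] * a                  # f(x)  = exp(-|x|^2) a
fp = lambda X: g(X)[..., None] * b                 # f'(x) = exp(-|x|^2) b

t = np.linspace(-4, 4, 161)
X = np.stack(np.meshgrid(t, t, indexing='ij'), axis=-1)
dA = (t[1] - t[0]) ** 2

def energy(eps):
    """E(phi_eps) = ||phi_eps * f - f'||_2^2 for phi_eps(x) = (I + eps B) x."""
    M = np.eye(2) + eps * B
    pull = X @ np.linalg.inv(M).T                  # phi_eps^{-1}, gridwise
    push = f(pull) @ M.T                           # (d phi  f) o phi^{-1}
    return ((push - fp(X)) ** 2).sum() * dA

# Analytic directional derivative 2 <dh f - df h, f - f'>_2 at the identity,
# using dh f = B f(x) and df(x) h(x) = -2 exp(-|x|^2) (x^T B x) a.
dh_f = f(X) @ B.T
df_h = -2 * g(X)[..., None] * np.einsum('...i,ij,...j->...', X, B, X)[..., None] * a
analytic = 2 * ((dh_f - df_h) * (f(X) - fp(X))).sum() * dA

eps = 1e-4
numeric = (energy(eps) - energy(-eps)) / (2 * eps)
print(np.isclose(numeric, analytic, rtol=1e-4))    # True
```

Since differentiation in \(\varepsilon \) commutes with the finite sum over grid points, the only discrepancy between the two numbers is the \(O(\varepsilon ^2)\) truncation error of the centered difference.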

For \(f \mapsto \varphi *f\) to map compactly supported \(C^1\) vector fields into vector fields with the same property, we need to take twice-differentiable diffeomorphisms, i.e., \(\varphi \in \mathrm {Diff}^{2, \infty }_0\). Over this group, we find that \(E_{f, f'}\) is differentiable everywhere, with \({\left( {d E_{f,f'}(\psi )}\, \left| \, {h}\right. \right) } = {\left( {d E_{\psi *f, f'}({\mathrm {id}})}\, \left| \, {h\circ \psi ^{-1}}\right. \right) } \).

The Eulerian derivative is then given by

$$\begin{aligned} {\left( {\bar{\partial }E_{f,f'}(\psi )}\, \left| \, {v}\right. \right) }= & {} {\left( {d E_{\psi *f, f'}({\mathrm {id}})}\, \left| \, {v}\right. \right) } \\= & {} 2{\big \langle {dv \,(\psi *f) - d(\psi *f)\, v}\, , \, {\psi *f-f'}\big \rangle }_2. \end{aligned}$$

This expression can be combined with (9.44) to obtain the Eulerian gradient of \(E_{f, f'}\), namely

$$\begin{aligned} {\overline{\nabla }}^V E_{f, f'}(\psi ) = 2 \int _\varOmega \left( \partial _2 K(\cdot ,x)(\psi *f - f', \psi *f) - K(\cdot , x)\, d(\psi *f)^T(\psi *f - f')\right) dx. \end{aligned}$$

The Eulerian differential can be rewritten in a form that no longer involves the differential of h. The following lemma is a consequence of the divergence theorem.

Lemma 9.10

If \(\varOmega \) is a bounded open domain of \({\mathbb {R}}^d\) and v, w, h are smooth vector fields on \({\mathbb {R}}^d\), then

$$\begin{aligned} \int _\varOmega v^T dh\, w\, dx = \int _{\partial \varOmega } (v^Th) (w^T N)\, d\sigma _{\partial \varOmega } - \int _\varOmega \big (w^T dv^T h + (\mathrm {div}\, w) (v^Th)\big )\, dx. \end{aligned}$$
(9.60)

Equation (9.60) can be rewritten as

$$\begin{aligned} {\big \langle {dh\, w}\, , \, {v}\big \rangle }_2 = {\left( {(w^TN) v \sigma _{\partial \varOmega }}\, \left| \, {h}\right. \right) } - {\big \langle {dv\, w + (\mathrm {div}\, w)\, v}\, , \, {h}\big \rangle }_2. \end{aligned}$$
(9.61)

Proof

To prove this, introduce the coordinates \(h^1, \ldots , h^d\) for h and \(v^1, \ldots , v^d\) for v so that

$$ v^T dh w = \sum _{i=1}^d v^i (\nabla h^i)^T{w}. $$

Now, use the fact that

$$\begin{aligned} \mathrm {div}(v^Th\, w)= & {} \mathrm {div}\Big (\sum _{i=1}^d v^i h^i w\Big ) \\= & {} \sum _{i=1}^d \big (v^i (\nabla h^i)^T{w} + h^i ({\nabla v^i})^T{w} + h^iv^i \mathrm {div}\, w\big )\\= & {} v^T dh w + h^T dv w + (h^Tv) {\mathrm {div}}\, w \end{aligned}$$

and the divergence theorem to obtain the result.    \(\square \)
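As a quick numerical sanity check of (9.61), one can discretize all terms on a grid: with fields that (numerically) vanish near the boundary, the surface term drops out and the two sides must agree up to discretization error. The following sketch uses centered finite differences; the specific fields, Gaussian envelope, and grid are arbitrary choices made for illustration, not taken from the text.

```python
import numpy as np

# Numerical check of the integration-by-parts identity (9.61):
#   <dh w, v>_2 = -<dv w + div(w) v, h>_2
# when v, w, h (effectively) vanish near the boundary, so the surface
# term drops out.  All fields are damped by a Gaussian envelope.
N = 201
xs = np.linspace(-1.0, 1.0, N)
dx = xs[1] - xs[0]
X, Y = np.meshgrid(xs, xs, indexing="ij")
g = np.exp(-8.0 * (X**2 + Y**2))          # envelope, ~ compact support

h = np.stack([g * np.sin(3 * X), g * np.cos(2 * Y)])
v = np.stack([g * X * Y,         g * np.cos(X)])
w = np.stack([g * np.cos(Y),     g * np.sin(X + Y)])

def grad(f):
    """Gradient of a scalar grid function, shape (2, N, N)."""
    return np.stack(np.gradient(f, dx))

def integrate(f):
    return f.sum() * dx * dx

# <dh w, v>_2 = int sum_i v^i (grad h^i . w) dx
lhs = integrate(sum(v[i] * (grad(h[i]) * w).sum(axis=0) for i in range(2)))

div_w = grad(w[0])[0] + grad(w[1])[1]
# -<dv w + div(w) v, h>_2
rhs = -integrate(sum(h[i] * (grad(v[i]) * w).sum(axis=0) for i in range(2))
                 + div_w * (v * h).sum(axis=0))
# lhs and rhs differ only by finite-difference error
```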

Using this lemma with \(\varOmega \) large enough so that f vanishes on \(\partial \varOmega \), we find

$$ {\left( {d E_{f, f'}({\mathrm {id}})}\, \left| \, {h}\right. \right) } = - 2{\big \langle {(df-df') \, f + \mathrm {div}\, f\, (f-f') + df^T(f-f')}\, , \, {h}\big \rangle }_2, $$

which directly provides a new version of the Eulerian derivative at an arbitrary \(\varphi \), with the corresponding new expression of the Eulerian gradient:

$$\begin{aligned} {\overline{\nabla }}^V E_{f, f'}(\varphi ) = - 2\int _\varOmega&K(\cdot , x) \Big (d(\varphi *f - f') (\varphi *f) \\&+ \text {div}(\varphi *f)(\varphi *f - f') + d(\varphi *f)^T(\varphi *f - f')\Big ) dx. \end{aligned}$$

Let us now consider the normalized version of this action. We will make the computation under a few additional assumptions on f, namely, that f is compactly supported and f / |f| can be replaced by a smooth unit vector field that can be extended to an open set that contains the support of f. More precisely, we will assume that there exists a scalar function \(\rho \), continuously differentiable and supported by a compact set Q, and a vector field u such that \(|u(x)| = 1\) for all x in an open set \(\varOmega \) containing Q, u is continuously differentiable on \(\varOmega \) and \(f(x) = \rho (x) u(x)\). With this notation, we have

$$ \varphi \,\bar{*}\, f = \rho \circ \varphi ^{-1} \, \frac{\varphi *u}{|\varphi *u|}. $$

Note that, for \(z\ne 0\), the derivative of z / |z| is

$$ h \mapsto h/|z| - zz^Th/|z|^3 = \frac{1}{|z|} \pi _{z^\perp }(h), $$

where \(\pi _{z^\perp }\) is the orthogonal projection on the space of vectors perpendicular to z.
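This derivative formula is easy to verify numerically; the sketch below compares a central finite difference of \(z\mapsto z/|z|\) with the projection formula (the specific vectors are arbitrary).

```python
import numpy as np

# Finite-difference check that the derivative of z -> z/|z| at z != 0
# in direction h is pi_{z-perp}(h)/|z|, with pi_{z-perp} the orthogonal
# projection onto the hyperplane orthogonal to z.
rng = np.random.default_rng(0)
z = rng.standard_normal(3)
h = rng.standard_normal(3)
eps = 1e-6

def n(z):
    return z / np.linalg.norm(z)

fd = (n(z + eps * h) - n(z - eps * h)) / (2 * eps)      # central difference
proj = (h - z * (z @ h) / (z @ z)) / np.linalg.norm(z)  # pi_{z-perp}(h)/|z|
err = np.linalg.norm(fd - proj)
```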

Consider \(\varphi = {\mathrm {id}}+ h\) for a vector field h such that \(\varphi \in \mathrm {Diff}^{1, \infty }_0\). Let \(\delta = \mathrm {dist}(Q, \varOmega ^c)\) and assume that

$$ \Vert h\Vert _\infty = \Vert \varphi ^{-1} - {\mathrm {id}}\Vert _\infty < \delta /2. $$

From (9.59), we have, letting \(\varOmega '\) be the set of \(x\in {\mathbb {R}}^d\) such that \(\mathrm {dist}(x, Q) <\delta \),

$$ \sup _{x\in \varOmega '}| (\varphi *u)(x) - u(x) - dh(x) u(x) + du(x) h(x)| = o(\Vert h\Vert _{1, \infty }), $$

from which we deduce

$$ \sup _{x\in \varOmega '}\left| \frac{(\varphi *u)(x)}{|(\varphi *u)(x)|} - u(x) - \pi _{u^\perp }\left( dh(x) u(x) - du(x) h(x)\right) \right| = o(\Vert h\Vert _{1, \infty }). $$

Since \(\Vert \rho \circ \varphi ^{-1} - \rho - \nabla \rho ^Th\Vert _\infty = o(\Vert h\Vert _{\infty })\), we obtain the fact that \(E_{f, f'}\) is differentiable at \(\varphi = {\mathrm {id}}\) with

$$\begin{aligned} {\left( {dE_{f, f'}({\mathrm {id}})}\, \left| \, {h}\right. \right) }= & {} 2{\left( {-\nabla \rho ^T h\, u + \rho \,\pi _{u^\perp }(dh\, u - du\, h)}\, \left| \, {f-f'}\right. \right) }\\= & {} -2{\left( {\nabla \rho ^T h\, u}\, \left| \, {f-f'}\right. \right) } - 2{\left( {\rho \,(dh\, u - du\, h)}\, \left| \, {\pi _{u^\perp }(f')}\right. \right) }. \end{aligned}$$

Assuming now that \(\psi \in \mathrm {Diff}^{2,\infty }_0\), we obtain the fact that \(E_{f, f'}\) is differentiable at \(\psi \), with

$$ {\left( {dE_{f,f'}(\psi )}\, \left| \, {h}\right. \right) } = {\left( {dE_{\psi \bar{*}f, f'}({\mathrm {id}})}\, \left| \, {h\circ \psi ^{-1}}\right. \right) }. $$

The Eulerian derivative is

$$ {\left( {\bar{\partial }E_{f,f'}(\psi )}\, \left| \, {v}\right. \right) } = {\left( {dE_{\psi \bar{*}f, f'}({\mathrm {id}})}\, \left| \, {v}\right. \right) }. $$

Finally, we note that after integration by parts, we can write

$$\begin{aligned} dE_{f, f'}({\mathrm {id}}) = 2\big (-u^T(f-f')\nabla \rho + \rho \, du^T(\pi _{u^\perp }(f')) \\+ d(\pi _{u^\perp } (f'))f + {\mathrm {div}}(f) \pi _{u^\perp }(f')\big ) dx. \end{aligned}$$

The computations for \(\varphi \star f = (d\varphi ^{-T} f)\circ \varphi ^{-1}\) and its normalized version are very similar. One only needs to note that (9.59) is now replaced by

$$\begin{aligned} \Vert \varphi \star f - f +dh^T\,f + df\, h\Vert _\infty = o(\Vert h\Vert _{1, \infty }). \end{aligned}$$
(9.62)

As a consequence, the formulas for the differentials of the \(\star \) and \(\bar{\star }\) actions can be deduced from those for the \(*\) and \(\bar{*}\) actions by replacing \((dh\,f -df\, h)\) by \((-dh^Tf - df\, h)\).

For the unnormalized action, this yields

$$\begin{aligned} {\left( {d E_{f, f'}({\mathrm {id}})}\, \left| \, {h}\right. \right) }= & {} - 2{\big \langle {dh(f-f')}\, , \, {f}\big \rangle }_2 - 2{\big \langle {df^T(f-f')}\, , \, {h}\big \rangle }_2\\= & {} 2{\big \langle {(df-df^T)(f-f') + \mathrm {div}(f-f') f}\, , \, {h}\big \rangle }_2 \end{aligned}$$

and \(\bar{\partial }E_{f, f'}(\varphi )\) is obtained by replacing f by \(\varphi \star f\). For the normalized \(\star \) action, we get

$$ {\left( {dE_{f, f'}({\mathrm {id}})}\, \left| \, {h}\right. \right) } = -2{\left( {\nabla \rho ^T h\, u}\, \left| \, {f-f'}\right. \right) } + 2{\left( {\rho \,(dh^T\, u + du\, h)}\, \left| \, {\pi _{u^\perp }(f')}\right. \right) }, $$

where \(f = \rho \, u\) as above, which can also be written as

$$ dE_{f, f'}({\mathrm {id}}) = 2\big (-u^T(f-f') \nabla \rho + (\rho du^T -df)(\pi _{u^\perp }(f')) -{\mathrm {div}}(\pi _{u^\perp }(f'))f\big ) dx. $$

As an example of application of vector field matching, let us consider contrast-invariant image registration [90]. If \(I:\varOmega \rightarrow {\mathbb {R}}\) is an image, a change of contrast is a transformation \(I \mapsto q\circ I\), where q is a scalar diffeomorphism of the image intensity range. The level sets \(I_\lambda = \left\{ x, I(x) \le \lambda \right\} \) are simply relabeled by a change of contrast, and one obtains a contrast-invariant representation of the image by considering the normals to these level sets, i.e., the vector field

$$ f = \nabla I / |\nabla I| $$

with the convention that \(f=0\) when \(\nabla I=0\). Two images represented in this way can now be compared using vector field matching. Since we are using normalized gradients, the natural action is \((\varphi , f) \mapsto \varphi \, \bar{\star }\, f\). For our results to hold, some regularization needs to be applied: one replaces f by \(\rho \,\tilde{f}\), where \(\tilde{f}\) is a unit vector field that coincides with f wherever \(|f|=1\) and smoothly extends it over a neighborhood of that domain, and \(\rho \) is smooth, equal to 1 wherever \(|f|=1\), and vanishes outside this neighborhood.
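The invariance can be illustrated numerically: for a synthetic smooth image and an increasing contrast change q (both arbitrary choices made here for illustration), the discrete normalized gradient fields agree away from flat regions, up to finite-difference error.

```python
import numpy as np

# Discrete illustration of contrast invariance: the normalized gradient
# field f = grad(I)/|grad(I)| (with f = 0 where grad(I) = 0) is
# unchanged when I is recomposed with an increasing contrast change q,
# up to discretization error.
def normalized_gradient(I, dx, tol=1e-8):
    gx, gy = np.gradient(I, dx)
    norm = np.sqrt(gx**2 + gy**2)
    f = np.zeros((2,) + I.shape)
    mask = norm > tol                     # convention: f = 0 where grad I = 0
    f[0][mask] = gx[mask] / norm[mask]
    f[1][mask] = gy[mask] / norm[mask]
    return f, norm

N = 201
xs = np.linspace(-1, 1, N)
dx = xs[1] - xs[0]
X, Y = np.meshgrid(xs, xs, indexing="ij")
I = np.exp(-2 * ((X - 0.2)**2 + Y**2))    # a smooth synthetic "image"
q = lambda t: t**3 + 2.0 * t              # an increasing contrast change

f1, n1 = normalized_gradient(I, dx)
f2, _ = normalized_gradient(q(I), dx)

mask = n1 > 0.1 * n1.max()                # compare away from flat regions
err = np.max(np.abs(f1[:, mask] - f2[:, mask]))
```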

9.9 Matching Fields of Frames

We now extend vector field deformation models to define an action of diffeomorphisms on fields of positively oriented orthogonal matrices, or frames. We will restrict ourselves to dimension 3, so that the deformable objects considered in this section are mappings \(x\mapsto R(x)\), with, for all \(x\in \varOmega \), \(R(x)\in \mathrm {SO}_3({\mathbb {R}})\) (the group of rotation matrices).

The \(*\) and \(\star \) actions we have just defined on vector fields have the nice property of conserving the Euclidean dot product when combined, that is

$$ {(\varphi *f)}^T{(\varphi \star g)} = ({f}^T{g})\circ \varphi ^{-1}. $$

Since \(\bar{*}\) and \(\bar{\star }\) also conserve the norm, we find that \((\varphi \,\bar{*}\, f, \varphi \,\bar{\star }\, g)\) is orthonormal as soon as (f, g) is.

We now define an action of diffeomorphisms on fields of frames. Writing \(R(x) = (f_1(x), f_2(x), f_3(x))\), we let

$$\begin{aligned} \varphi \cdot R = (\varphi \,\bar{*}\, f_1, (\varphi \,\bar{\star }\, f_3)\times (\varphi \, \bar{*}\, f_1), \varphi \, \bar{\star }\, f_3). \end{aligned}$$
(9.63)

That this defines an action is a straightforward consequence of \(\bar{*}\) and \(\bar{\star }\) being actions.
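Pointwise, the action (9.63) only involves the Jacobian of \(\varphi \) at the point under consideration: the \(\bar{*}\) action sends \(f_1\) to \(d\varphi \, f_1/|d\varphi \, f_1|\) and the \(\bar{\star }\) action sends \(f_3\) to \(d\varphi ^{-T} f_3/|d\varphi ^{-T} f_3|\), both evaluated at the transported point. The following sketch, with an arbitrary invertible matrix J standing in for \(d\varphi \), checks that the result is again a positively oriented orthonormal frame.

```python
import numpy as np

# Pointwise check of the frame action (9.63): u1 and u3 stay
# orthogonal because (J f1)^T (J^{-T} f3) = f1^T f3 = 0, and
# u2 = u3 x u1 completes a right-handed orthonormal frame.
rng = np.random.default_rng(1)
J = rng.standard_normal((3, 3))           # stand-in for d(phi) at a point
if np.linalg.det(J) < 0:
    J[:, 0] *= -1.0                       # enforce det J > 0

Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1.0                       # positively oriented input frame
f1, f2, f3 = Q[:, 0], Q[:, 1], Q[:, 2]

u1 = J @ f1
u1 /= np.linalg.norm(u1)                  # phi bar-* f1
u3 = np.linalg.solve(J.T, f3)
u3 /= np.linalg.norm(u3)                  # phi bar-star f3
u2 = np.cross(u3, u1)                     # middle column of (9.63)
R_new = np.column_stack([u1, u2, u3])

orth_err = np.linalg.norm(R_new.T @ R_new - np.eye(3))
det_new = np.linalg.det(R_new)
```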

The action can be interpreted as follows. Given a local chart in \({\mathbb {R}}^3\), which is a diffeomorphic change of coordinates \(x = m(s,t, u)\), one uniquely specifies a positively oriented frame \(R_m = (f_1, f_2, f_3)\) by \(f_1 = \partial _1m/|\partial _1 m|\) and \(f_3 = (\partial _1 m \times \partial _2 m)/|\partial _1 m \times \partial _2 m|\). Then, the action we have just defined is such that \(\varphi \cdot R\) is the frame associated to the change of coordinates \(\varphi \circ m\), i.e.,

$$ R_{\varphi \circ m} \circ \varphi = \varphi \cdot R_m. $$

The transformation \(m \mapsto R_m\) has in turn the following interpretation, which is relevant for some medical imaging modalities. Let the change of coordinates be adapted to the following stratified description of a tissue. Curves \(s \mapsto m(s,t, u)\) correspond to tissue fibers, and surfaces \((s, t) \mapsto m(s, t, u)\) describe a layered organization. The cardiac muscle, for example, exhibits this kind of structure. Then \(f_1\) in \(R_m\) represents the fiber orientation, and \(f_3\) the normal to the layers; \(\varphi \cdot R_m\) then corresponds to the tissue to which the deformation \(\varphi \) has been applied.

Frame fields are typically observed over some object-dependent subregion of the observation domain. To account for this, we assume that we are dealing with weighted fields of frames, taking the form \(A = \rho \, R\), where the weight \(\rho \) vanishes outside a compact set and R is smooth over a neighborhood of this compact set. We will then consider the action

$$ \varphi \cdot A = (\rho \circ \varphi ^{-1})\, \varphi \cdot R. $$

The computations of the previous sections can now be applied to each column of A. In particular, letting \(\varphi = {\mathrm {id}}+ h\), we have \( \varphi \cdot A = A + (w_1, w_2, w_3) + o(\Vert h\Vert _{1, \infty })\) with, writing \(R = (f_1, f_2, f_3)\),

$$ \left\{ \begin{aligned} w_1 =&-(\nabla \rho ^T h) f_1 + \rho \pi _{f_1^\perp }(dh f_1 - df_1 h), \\ w_3 =&-(\nabla \rho ^T h) f_3 - \rho \pi _{f_3^\perp }(dh^T f_3 + df_3 h), \\ w_2 =&-(\nabla \rho ^T h) f_2 - \rho \left( \pi _{f_3^\perp }(dh^T f_3 + df_3 h)\right) \times f_1 \\&+ \rho f_3\times \left( \pi _{f_1^\perp }(dh f_1 - df_1 h)\right) . \end{aligned} \right. $$

Noticing that, for any vector \(u\in {\mathbb {R}}^3\),

$$ \left( \pi _{f_3^\perp }u\right) \times f_1 = \left( (u^Tf_1)f_1 + (u^Tf_2)f_2\right) \times f_1 = -(u^Tf_2)f_3 $$

and similarly \(f_3\times \left( \pi _{f_1^\perp }u\right) = - (u^T f_2) f_1\), we can simplify the expression of \(w_2\), yielding

$$\begin{aligned} \left\{ \begin{aligned} w_1 =&-(\nabla \rho ^T h) f_1 + \rho \pi _{f_1^\perp }(dh f_1 - df_1 h), \\ w_2 =&-(\nabla \rho ^T h) f_2 + \rho ((dh^T f_3 + df_3 h)^Tf_2)f_3 \\&- \rho ((dh f_1 - df_1 h)^Tf_2) f_1,\\ w_3 =&-(\nabla \rho ^T h) f_3 - \rho \pi _{f_3^\perp }(dh^T f_3 + df_3 h). \end{aligned} \right. \end{aligned}$$
(9.64)

Consider the matching functional

$$ E_{A, A'}(\varphi ) = \int _{{\mathbb {R}}^3} |\varphi \cdot A - A'|^2 \, dx $$

with \(|A|^2 = {\mathrm {trace}}(A^TA)\). If \(A = \rho \, R\) and \(A' = \rho '\, R'\), then

$$\begin{aligned} |A-A'|^2= & {} 3\rho ^2 - 2\rho \rho ' {\mathrm {trace}}(R^TR') +3{\rho '}^2 \\= & {} 3(\rho -\rho ')^2 +2\rho \rho ' {\mathrm {trace}}({\mathrm {Id}}- R^TR'). \end{aligned}$$

Introducing the rotation angle, \(\theta \), from R to \(R'\), defined by

$$\begin{aligned} {\mathrm {trace}}(R^{T}R') = 1 + 2\cos \theta , \end{aligned}$$
(9.65)

we get

$$ |A-A'|^2 = 3(\rho -\rho ')^2 +4\rho \rho ' (1-\cos \theta ). $$
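This identity can be checked numerically by building \(R'\) from R through a rotation of prescribed angle \(\theta \), using the Rodrigues formula; the numerical values below are arbitrary.

```python
import numpy as np

# Check of |A - A'|^2 = 3(rho - rho')^2 + 4 rho rho' (1 - cos theta)
# for A = rho R, A' = rho' R', where theta is the rotation angle from
# R to R', defined by trace(R^T R') = 1 + 2 cos(theta).
def rodrigues(axis, theta):
    """Rotation matrix of angle theta around a given axis."""
    axis = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -axis[2], axis[1]],
                  [axis[2], 0.0, -axis[0]],
                  [-axis[1], axis[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * K @ K

rng = np.random.default_rng(2)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
if np.linalg.det(Q) < 0:
    Q[:, 0] *= -1.0
R = Q                                     # a random rotation
theta = 0.7
Rp = R @ rodrigues(rng.standard_normal(3), theta)   # R' at angle theta from R

rho, rhop = 1.3, 0.8
A, Ap = rho * R, rhop * Rp
lhs = np.sum((A - Ap)**2)                 # |A - A'|^2 (Frobenius)
rhs = 3 * (rho - rhop)**2 + 4 * rho * rhop * (1 - np.cos(theta))
```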

Obviously, if \(A = (u_1,u_2, u_3)\) and \(A' = (u'_1,u'_2,u'_3)\), we also have

$$ |A-A'|^2 = |u_1-u_1'|^2 + |u_2-u_2'|^2 + |u_3-u_3'|^2. $$

Using this, one gets the expression of the differential of \(E_{A, A'}\) at \(\varphi ={\mathrm {id}}\),

$$ {\left( {dE_{A, A'}({\mathrm {id}})}\, \left| \, {h}\right. \right) } = 2 \int _{{\mathbb {R}}^3} {\mathrm {trace}}(W^T(A-A'))\, dx, $$

where \(W = (w_1, w_2, w_3)\) is given by (9.64). In particular,

$$\begin{aligned} {\mathrm {trace}}(W^T(A-A'))= & {} - \nabla \rho ^Th \left( 3\rho - \rho '{\mathrm {trace}}(R^TR')\right) \\&+\, \rho \rho ' (dh f_1 - df_1 h)^T \left( - \pi _{f_1^\perp }(f'_1)+(f_1^Tf'_2)f_2\right) \\&-\, \rho \rho '(dh^T f_3 + df_3 h)^T\left( -\pi _{f_3^\perp }(f'_3) +(f_3^Tf'_2)f_2\right) . \end{aligned}$$

Letting

$$\begin{aligned} u_{A, A'}^1= & {} \rho \rho '\left( - \pi _{f_1^\perp }(f'_1)+(f_1^Tf'_2)f_2\right) \\ u_{A, A'}^3= & {} \rho \rho '\left( -\pi _{f_3^\perp }(f'_3) +(f_3^Tf'_2)f_2\right) \end{aligned}$$

and using Lemma 9.10 to eliminate dh, we find

$$\begin{aligned} d E_{A, A'}({\mathrm {id}}) = 2\Big (-\left( 3\rho - \rho '{\mathrm {trace}}(R^TR')\right) \nabla \rho&- du^1_{A, A'} f_1 - \mathrm {div}(f_1) u^1_{A, A'} - df_1^T u^1_{A,A'} \nonumber \\&+ df_3\, u^3_{A, A'} - df_3^T u^3_{A, A'} + \mathrm {div}(u_{A, A'}^3) f_3\Big ) dx. \end{aligned}$$
(9.66)

9.10 Matching Tensors

The last class of deformable objects we will consider in this chapter are fields of matrices (or tensor fields). For general matrices, we can use the actions we have defined on vector fields, and apply them to each column of M, where M is a field of matrices. The differential of matching functionals is then computed as done in the previous two sections.

One sometimes needs to consider subclasses of tensors, and therefore to define an action that leaves the subclass invariant. Here we consider symmetric matrices, which have been studied especially in diffusion tensor imaging (DTI) [6]. The previous actions applied to each column do not work, because they would break the symmetry. A simple way to address this is to make the diffeomorphism also act on the right, in transpose form, defining, for a field \(x\mapsto S(x)\) of symmetric matrices

$$\begin{aligned} \varphi *S= & {} (d\varphi S d\varphi ^T) \circ \varphi ^{-1}\\ \varphi \star S= & {} (d\varphi ^{-T} S d\varphi ^{-1})\circ \varphi ^{-1}. \end{aligned}$$

We leave to the reader the computation of the differentials of objective functions derived from these actions.

These actions are not necessarily well adapted to DTI data, though, for which alternative options may be considered. DTI produces, at each point x in space, a symmetric positive definite matrix S(x) that measures the diffusion of water molecules in the imaged tissue. Roughly speaking, the tensor S(x) is such that if a water molecule is at x at time t, the probability of finding it at \(x + dx\) at time \(t+dt\) is a centered Gaussian with variance \(dt^2dx^TS(x) dx\).

If we return to the structured tissue model discussed in the last section (represented by the parametrization \( x= m(s,t, u)\)), we can assume that molecules travel more easily along fibers, and with most difficulty across layers. So the direction of \(\partial _1 m\) is the direction of largest variance, and \(\partial _1 m \times \partial _2 m\) of smallest variance, so that the frame \(R_m = (f_1, f_2, f_3)\) associated to the parametrization is such that \(f_1\) is an eigenvector of S for the largest eigenvalue, and \(f_3\) for the smallest eigenvalue, which implies that \(f_2\) is an eigenvector for the intermediate eigenvalue. According to our discussion in the last section, a diffeomorphism \(\varphi \) should transform S so that the frame \(R_S\) formed by the eigenbasis of S transforms according to the action of diffeomorphisms on frames, namely, \(R_{\varphi \cdot S} = \varphi \cdot R_S\) defined in (9.63).

So, if we express the decomposition of S in the form

$$ S = \lambda _1 f_1 f_1^T + \lambda _2 f_2 f_2^T + \lambda _3 f_3 f_3^T $$

with \(\lambda _1\ge \lambda _2 \ge \lambda _3\), we should take

$$\begin{aligned} \varphi \cdot S = \tilde{\lambda }_1\tilde{f}_1 \tilde{f}_1^T + \tilde{\lambda }_2 \tilde{f}_2 \tilde{f}_2^T + \tilde{\lambda }_3 \tilde{f}_3 \tilde{f}_3^T \end{aligned}$$
(9.67)

with \((\tilde{f}_1, \tilde{f}_2, \tilde{f}_3)= \varphi \cdot (f_1, f_2, f_3)\) and \(\tilde{\lambda }_i = \lambda _i \circ \varphi ^{-1}\), \(i=1,2,3\). The action on eigenvalues expresses that intrinsic tissue properties have not been affected by the deformation. If there are reasons to believe that variations in volume should affect the intensity of water diffusion, using the action of diffeomorphisms on densities may be a better option, namely \(\tilde{\lambda }_i = \det d(\varphi ^{-1})\lambda _i\circ \varphi ^{-1}.\)
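Pointwise, the action (9.67) can be sketched as follows, with an arbitrary invertible matrix J again standing in for \(d\varphi \) at the point and an arbitrary SPD tensor. Since \(\tilde{\lambda }_i = \lambda _i \circ \varphi ^{-1}\) only displaces the eigenvalues, the transported tensor has the same eigenvalues at corresponding points, which the check below confirms.

```python
import numpy as np

# Pointwise sketch of the tensor action (9.67): diagonalize S with
# eigenvalues in decreasing order, transport the eigenframe with the
# frame action of Sect. 9.9, and rebuild the tensor with the same
# eigenvalues (lambda_i . phi^{-1} is invisible pointwise).
rng = np.random.default_rng(3)
B = rng.standard_normal((3, 3))
S = B @ B.T + 0.1 * np.eye(3)             # a random SPD tensor
J = rng.standard_normal((3, 3))           # stand-in for d(phi) at the point
if np.linalg.det(J) < 0:
    J[:, 0] *= -1.0

lam, F = np.linalg.eigh(S)                # eigh returns ascending order
lam, F = lam[::-1], F[:, ::-1]            # reorder: lambda_1 >= ... >= lambda_3
if np.linalg.det(F) < 0:
    F[:, 0] *= -1.0                       # positively oriented eigenframe
f1, f3 = F[:, 0], F[:, 2]

u1 = J @ f1; u1 /= np.linalg.norm(u1)     # phi bar-* f1
u3 = np.linalg.solve(J.T, f3); u3 /= np.linalg.norm(u3)   # phi bar-star f3
u2 = np.cross(u3, u1)
F_new = np.column_stack([u1, u2, u3])

S_new = F_new @ np.diag(lam) @ F_new.T    # phi . S at the point
lam_new = np.linalg.eigh(S_new)[0][::-1]  # eigenvalues are preserved
```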

The action with \(\tilde{\lambda }_i = \lambda _i\circ \varphi ^{-1}\) is identical to the eigenvector-based tensor reorientation discussed in [6]. One of the important (and required) features of the construction is that, although the eigen-decomposition of S is not unique (when two or three eigenvalues coincide), the transformation \(S\mapsto \varphi \cdot S\) is defined without ambiguity. This will be justified below.

The following computations require that \(\lambda _1, \lambda _2, \lambda _3\) are \(C^1\) and vanish outside a compact set, and that \(R_S\) is smooth in a small neighborhood of this compact set. They are sketchily justified here, as they strongly resemble the computations that have been done before. It will be convenient to introduce the three-dimensional rotation \(U_S(\varphi )=((\varphi \cdot R_S)\circ \varphi ) \, R_S^T\), so that

$$ \varphi \cdot S = (U_S(\varphi ) S U_S(\varphi )^T) \circ \varphi ^{-1}. $$

Taking \(\varphi = {\mathrm {id}}+ h\), we have \(U_S(\varphi ) - {\mathrm {Id}}= \omega _S(h) + o( \Vert h\Vert _{1, \infty })\), where

$$\begin{aligned} \omega _S (h) = \pi _{f_1^\perp } dhf_1f_1^T - ((\pi _{f^\perp _3}dh^Tf_3)&\times f_1)f_2^T \nonumber \\&+ (f_3\times (\pi _{f_1^\perp }dhf_1)) f_2^T - (\pi _{f^\perp _3}dh^Tf_3)f_3^T \end{aligned}$$

is a skew-symmetric matrix. With this notation, we can write

$$ \varphi \cdot S - S = \omega _S(h)S - S\omega _S(h) - dS\, h + o(\Vert h\Vert _{1, \infty }). $$

(Here \(dS\, h\) is the matrix with coefficients \((\nabla S^{ij})^T{ h}\).) Letting

$$ E_{S, S'}(\varphi ) = \int _\varOmega {\mathrm {trace}}((\varphi \cdot S - S')^2) dx, $$

we then get

$$ {\left( {d E_{S, S'}({\mathrm {id}})}\, \left| \, {h}\right. \right) } = 2\int _\varOmega {\mathrm {trace}}\big ((S - S')(\omega _{S}(h)S - S\omega _{S}(h) - dS\, h)\big )\, dx, $$

with, as usual, for \(\psi \in \mathrm {Diff}^{2,\infty }_0\), \({\left( {d E_{S,S'}(\psi )}\, \left| \, {h}\right. \right) } = {\left( {d E_{\psi \cdot S, S'}({\mathrm {id}})}\, \left| \, {h\circ \psi ^{-1}}\right. \right) }\) and \({\left( {\bar{\partial }E_{S,S'}(\psi )}\, \left| \, {v}\right. \right) } = {\left( {d E_{\psi \cdot S, S'}({\mathrm {id}})}\, \left| \, {v}\right. \right) }\).

Here again, the derivatives of h that are involved in \(\omega _{S}(h)\) can be integrated by parts using the divergence theorem. Let us sketch this computation at \(\psi ={\mathrm {id}}\), which leads to a vector measure form for the differential. We focus on the term

$$ {\left( {\eta }\, \left| \, {h}\right. \right) }:=\int _\varOmega {\mathrm {trace}}((S - S')(\omega _{S}(h)S - S\omega _{S}(h))) dx = \int _\varOmega {\mathrm {trace}}(A\omega _S(h)) dx, $$

where \(A = S(S-S') - (S-S')S= S'S-SS'\), and want to express \(\eta \) as a vector measure. We have (using the fact that A is skew-symmetric and that \((f_1, f_2, f_3)\) is orthonormal)

$$\begin{aligned} -{\mathrm {trace}}(A\omega _S(h))= & {} {(\omega _S(h) f_1)^T}{Af_1} + {(\omega _S(h) f_2)^T}{Af_2} + {(\omega _S(h) f_3)^T}{Af_3}\\= & {} {(\pi _{f_1^\perp } dhf_1)^T}{Af_1} - {((\pi _{f^\perp _3}dh^Tf_3)\times f_1)^T}{Af_2} \\+ & {} {(f_3\times (\pi _{f_1^\perp }dhf_1))^T}{A f_2} - {(\pi _{f^\perp _3}dh^Tf_3)^T}{Af_3}\\= & {} {(dhf_1)^T}{u_{S, S'}^1} - {(dh^Tf_3)^T}{u_{S, S'}^3}, \end{aligned}$$

with

$$\begin{aligned} u_{S, S'}^1= & {} \pi _{f_1^\perp } (Af_1 + (A f_2\times f_3))\\ \text {and } u_{S, S'}^3= & {} \pi _{f_3^\perp } (Af_3 + ( f_1\times A f_2)). \end{aligned}$$

It now remains to use Lemma 9.10 to identify \(\eta \) as

$$ \eta = \big (du_{S, S'}^1 f_1 + {\mathrm {div}}(f_1) u_{S, S'}^1 - df_3 u_{S, S'}^3 - {\mathrm {div}}(u_{S, S'}^3) f_3\big ) dx. $$

To write the final expression of \(d E_{S, S'}({\mathrm {id}})\), define \((S-S')\odot dS\) to be the vector

$$ (S-S') \odot dS = \sum _{i, j=1}^3 (S^{ij}-(S')^{ij}) \nabla S^{ij}, $$

so that we have

$$\begin{aligned} d E_{S, S'}({\mathrm {id}}) = 2 \big (du_{S, S'}^1 f_1&+ {\mathrm {div}}(f_1) u_{S, S'}^1 - df_3 u_{S, S'}^3 \nonumber \\&\quad \quad - {\mathrm {div}}(u_{S, S'}^3) f_3 - (S-S')\odot dS\big ) dx. \end{aligned}$$
(9.68)

We now generalize this action to arbitrary dimensions, in a way that will provide a new interpretation of the three-dimensional case. Decompose a field S of d-by-d symmetric matrices on \({\mathbb {R}}^d\) in the form

$$ S(x) = \sum _{k=1}^d \lambda _k(x) f_k(x) f_k(x)^T $$

with \(\lambda _1 \ge \cdots \ge \lambda _d\) and \((f_1, \ldots , f_d)\) orthonormal. The matrices \(f_kf_k^T\) represent the orthogonal projections on the one-dimensional space \({\mathbb {R}}\, f_k\) and, letting

$$ W_k = \mathrm {span}(f_1, \ldots , f_k), $$

and noting that the projection on \(W_k\), \(\pi _{W_k}\) is equal to \(f_1f_1^T +\cdots + f_kf_k^T\), we can obviously write

$$ S(x) = \sum _{k=1}^d \lambda _k(x) (\pi _{W_k(x)}- \pi _{W_{k-1}(x)}), $$

where we have set \(W_0 = \{0\}\).
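The telescoping identity above can be verified numerically in any dimension; the sketch below does so for an arbitrary symmetric matrix with \(d=4\).

```python
import numpy as np

# Check, in dimension d = 4, that the telescoping sum
#   S = sum_k lambda_k (pi_{W_k} - pi_{W_{k-1}}),  W_k = span(f_1..f_k),
# reproduces the eigendecomposition S = sum_k lambda_k f_k f_k^T.
rng = np.random.default_rng(4)
d = 4
B = rng.standard_normal((d, d))
S = B @ B.T                                # a symmetric (positive) matrix
lam, F = np.linalg.eigh(S)
lam, F = lam[::-1], F[:, ::-1]             # descending eigenvalues

prev = np.zeros((d, d))                    # pi_{W_0} = 0
S_rebuilt = np.zeros((d, d))
for k in range(d):
    Fk = F[:, :k + 1]                      # orthonormal basis of W_k
    pik = Fk @ Fk.T                        # orthogonal projection on W_k
    S_rebuilt += lam[k] * (pik - prev)
    prev = pik
err = np.abs(S_rebuilt - S).max()
```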

Define the action \(S \mapsto \varphi \cdot S\) by

$$ \varphi \cdot S = \left( \sum _{k=1}^d \lambda _k \left( \pi _{d\varphi (W_k)}- \pi _{d\varphi (W_{k-1})}\right) \right) \circ \varphi ^{-1}. $$

In three dimensions, because

$$ {(d\varphi f_2)^T}{\tilde{f}_3\circ \varphi } = ({f_2^T}{f_3})/|d\varphi ^{-T} f_3| = 0, $$

we see that \(d\varphi f_2\in \text {span}(\tilde{f}_1\circ \varphi , \tilde{f}_2\circ \varphi )\). Since \(\tilde{f}_1 \circ \varphi \) is proportional to \(d\varphi \, f_1\), we can conclude that

$$ d\varphi \ \text {span}(f_1, f_2) = \text {span}(\tilde{f}_1\circ \varphi , \tilde{f}_2\circ \varphi ). $$

This proves that the action we have just defined coincides with the one we have considered for the case \(d=3\).

Returning to the general d-dimensional case, the definition we just gave does not depend on the choice made for the basis \(f_1, \ldots , f_d\). Indeed, if we let \(\mu _1> \cdots > \mu _q\) denote the distinct eigenvalues of S, and \(\varLambda _1, \ldots , \varLambda _q\) the corresponding eigenspaces, then, regrouping together the terms with identical eigenvalues in the decomposition of S and \(\varphi \cdot S\), and letting

$$ \varGamma _k = \varLambda _1 + \cdots + \varLambda _k, \quad \varGamma _0 = \{0\}, $$

we clearly have

$$ S(x) = \sum _{k=1}^q \mu _k(x) (\pi _{\varGamma _k(x)}- \pi _{\varGamma _{k-1}(x)}) $$

and

$$ \varphi \cdot S = \left( \sum _{k=1}^q \mu _k \left( \pi _{d\varphi (\varGamma _k)}- \pi _{d\varphi (\varGamma _{k-1})}\right) \right) \circ \varphi ^{-1}. $$

Since the decomposition of S in terms of its eigenspaces is uniquely defined, we obtain the fact that the definition of \(\varphi \cdot S\) is non-ambiguous.

9.11 Pros and Cons of Greedy Algorithms

We have studied in this chapter a series of deformable objects, by defining the relevant action(s) that diffeomorphisms have on them and computing the variations of associated matching functionals.

This computation can be used, as we did with landmarks and images, to design “greedy” registration algorithms, which implement gradient descent to progressively minimize the functionals within the group of diffeomorphisms. These algorithms have the advantage of being relatively simple to implement and of requiring limited computation time.

Most of the time, however, this minimization is an ill-posed problem. Minimizers may fail to exist, for example. This has required, for image matching, the implementation of a suitable stopping rule that prevents the algorithm from running indefinitely. Even when a minimizer exists, it is generally not unique (see the example we gave with landmarks). Greedy algorithms provide the minimizer corresponding to the path of steepest descent from where they have been initialized (usually the identity). This solution does not have to be the “best one”, and we will see that other methods can find much smoother solutions when large deformations are involved.

To design potentially well-posed problems, the matching functionals need to be combined with regularization terms that measure the smoothness of the registration. This will be discussed in detail in the next chapter.