1 Introduction

Shape analysis of surfaces in \({\mathbb {R}}^3\) has been motivated by many applications in bioinformatics, computer graphics and medical imaging, see, e.g., [2, 14, 16, 19, 22, 32]. In most applications, the actual parametrization of the surfaces under consideration is unknown and one is only able to observe the “shape” of the object, i.e., a priori the point correspondences between the surfaces are unknown and should be an output of the performed analysis. Furthermore, we will often identify surfaces that only differ by a rigid motion. Thus, we define the shape space of surfaces as the quotient space of all parametrized surfaces modulo the group of reparametrizations and/or the group of rigid motions. One goal in shape analysis is to quantify the differences and find the optimal deformations between the given objects; see Fig. 1 for two examples of optimal deformations between distinct surfaces.

Fig. 1

Geodesics between shapes in the space of unparametrized surfaces \({\text {Imm}}(S^2, {\mathbb {R}}^3)/{\text {Diff}}_+(S^2)\) with respect to the split metric (4) for a choice of coefficients (1, 1, 0.1, 0)

The main challenge in shape analysis of surfaces is the registration problem, i.e., finding the (optimal) point correspondences between distinct surfaces, which can then be used as the basis for the subsequent statistical analysis. In previous work, the correspondence problem has often been solved in a preprocessing step, which is then followed by an independent statistical analysis of the resulting parametrized surfaces. This approach can have several undesirable consequences for the statistical analysis, see, e.g., [29].

The goal of elastic shape analysis is to formulate this problem in a unified framework: Using a reparametrization invariant metric on the space of all parametrized surfaces, one studies the induced Riemannian metric on the quotient space. With this approach, the computation of the point correspondences and the subsequent statistical analysis can be carried out in a consistent way.

In recent years, several metrics and frameworks have been proposed as potential approaches to this goal, see, e.g., [4, 18, 24, 25, 29, 33]. In particular, a class of elastic metrics has been proposed in [17], which is defined as a weighted sum of three components that measure the differences in shearing, stretching and bending of the surface. This family of metrics is actually a subfamily of the general class of reparametrization invariant Sobolev metrics, as studied in [4,5,6]. It is also a natural generalization of the family of elastic \(G^{a,b}\)-metrics on the space of curves [26], which has proven efficient and successful in numerous shape analysis applications [10, 28,29,30,31, 34,35,36].

To obtain a numerically efficient representation, Srivastava et al. [28] introduced the so-called Square Root Velocity Function (SRVF) for comparing curves. In this framework, the space of curves endowed with the elastic metric for a particular choice of coefficients is isometric to an \(L^2\)-space, which makes the computation of geodesics extremely easy and efficient. Motivated by this progress, Jermyn et al. [18] introduced the Square Root Normal Field (SRNF) representation for elastic shape analysis of surfaces and showed that the \(L^2\)-metric on the space of SRNFs corresponds to one member of a more general class of elastic metrics on the space of surfaces. While it is computationally efficient, there are several drawbacks to this approach: The SRNF metric only consists of the last two terms of the general elastic metric for surfaces and is thus highly degenerate; i.e., there exists a high-dimensional space of deformations that has no cost in this framework (see Footnote 1). Furthermore, the SRNF map is neither injective nor surjective, and its image is not fully understood. In consequence, there exists no analytic formula for geodesics in the image space and geodesics are usually approximated by numerically inverting the straight line between the given SRNFs, where each inversion is calculated as the solution to an optimization problem [25].

Contributions of this article. The purpose of the present article is to introduce a numerical framework for the computation of the geodesic initial and boundary value problem with respect to a family of metrics that contains the general elastic metric as a special case. The framework complements [7], which defined, using vector-valued one-forms, a metric on the space of surfaces that is invariant under rigid motions and reparametrizations. It does not require a numerical inversion of the SRNF map and thus overcomes some of the aforementioned difficulties. Furthermore, this framework will allow us in the future to choose the constants of the metric in a data-driven way, which has potential importance in many applications. See [3, 23] for related considerations regarding the choice of constants for the elastic metric on the space of curves.

2 Mathematical Framework and Background

In this section, we will give the formal definition of the space of shapes and describe the general elastic metric. Then, we will introduce a new representation for the elastic metric using vector-valued one-forms, which will still allow us to obtain an efficient discretization of the geodesic boundary value problem.

From here on, we will model a surface as an immersion f from a model space M into \({\mathbb {R}}^3\), i.e., a smooth map from M to \({\mathbb {R}}^3\) that has an injective tangent mapping. Here, M is a two-dimensional compact manifold encoding the topology of the objects under consideration. Typical choices of M include the two-sphere \(M=S^2\) or the sheet \(M=[0,1]^2\).

Denote by \({\text {Imm}}(M, {\mathbb {R}}^3)\) the space of all immersions. To define the space of shapes, we now consider the actions of the group of rigid motions and the group of diffeomorphisms on \({\text {Imm}}(M, {\mathbb {R}}^3)\). The group of rigid motions is given by the semidirect product of the group of rotations and the group of translations, i.e., \({\text {SO}}(3)\ltimes {\mathbb {R}}^3\), where \({\text {SO}}(3)\) is the set of all rotation matrices. It acts on \({\text {Imm}}(M, {\mathbb {R}}^3)\) as follows:

$$\begin{aligned} \left( {\text {SO}}(3)\ltimes {\mathbb {R}}^3\right) \times {\text {Imm}}(M, {\mathbb {R}}^3)&\rightarrow {\text {Imm}}(M, {\mathbb {R}}^3) \\ \left( (R,v), f\right)&\mapsto Rf+v. \end{aligned}$$

Denote by \({\text {Diff}}_+(M)\) the group of diffeomorphisms that preserve the orientation of M. The action of \({\text {Diff}}_+(M)\) on \({\text {Imm}}(M, {\mathbb {R}}^3)\) is given by composition from the right:

$$\begin{aligned} {\text {Imm}}(M, {\mathbb {R}}^3)\times {\text {Diff}}_+(M)&\rightarrow {\text {Imm}}(M, {\mathbb {R}}^3) \\ (f, \gamma )&\mapsto f\circ \gamma . \end{aligned}$$

We say that two immersions \(f_1\) and \(f_2\) have the same shape if they lie in the same orbit under the action of \({\text {Diff}}_+(M)\), or under both actions if we also want to mod out rigid motions. The space of shapes can then be defined as the quotient space:

$$\begin{aligned} {\mathcal {S}}(M,{\mathbb {R}}^3) = {\text {Imm}}(M, {\mathbb {R}}^3)/{\mathcal {G}}, \end{aligned}$$

where \({\mathcal {G}} = {\text {Diff}}_+(M)\) or \({\mathcal {G}} = {\text {Diff}}_+(M)\times ({\text {SO}}(3)\ltimes {\mathbb {R}}^3) \).

This quotient space has some mild singularities and does not carry the structure of a smooth manifold but only of an infinite-dimensional orbifold [11]. However, for the purpose of this article, we can ignore these subtleties and assume that we are always working away from the singularities, which allows us to treat \({\mathcal {S}}(M,{\mathbb {R}}^3)\) as an infinite-dimensional manifold.

By endowing the space of immersions \({\text {Imm}}(M,{\mathbb {R}}^3)\) with a Riemannian metric that is invariant under the actions of \({\text {SO}}(3)\ltimes {\mathbb {R}}^3\) and \({\text {Diff}}_+(M)\), the space of shapes \({\mathcal {S}}(M, {\mathbb {R}}^3)\) becomes a Riemannian manifold (orbifold), where the metric is induced by the Riemannian metric on \({\text {Imm}}(M, {\mathbb {R}}^3)\).

In the following, we will denote by \({\text {dist}}_{{\text {Imm}}}\) the geodesic distance function of a Riemannian metric on the space of immersions \({\text {Imm}}(M, {\mathbb {R}}^3)\) and by [f] the equivalence class of f under the action of \({\mathcal {G}}\). Given two surfaces \(f_1\) and \(f_2\), we can define the distance between \([f_1]\) and \([f_2]\) as the infimum of the distance between the orbits of \(f_1\) and \(f_2\) under the action of \({\mathcal {G}}\). For example, the distance function on the space of unparametrized surfaces \({\mathcal {S}} = {\text {Imm}}(M, {\mathbb {R}}^3)/{\text {Diff}}_+(M)\) can be defined as follows:

$$\begin{aligned} {\text {dist}}_{{\mathcal {S}}}([f_1], [f_2]) = \inf _{\gamma \in {\text {Diff}}_+(M)}{\text {dist}}_{{\text {Imm}}}(f_1\circ \gamma , f_2). \end{aligned}$$

We will use this induced distance as our measure for comparing unparametrized surfaces. Given two parametrized surfaces, to measure the similarity between them, we will need to find the optimal reparametrization in \({\text {Diff}}_+(M)\) that realizes the infimum. If we also want to mod out rigid motions and find the distance between two elements in the space of unparametrized surfaces modulo rigid motions \({\text {Imm}}(M, {\mathbb {R}}^3)/\left( {\text {Diff}}_+(M)\times {\text {SO}}(3)\ltimes {\mathbb {R}}^3\right) \), we will need to solve a joint optimization problem of finding the best reparametrization, rotation and translation.

2.1 The General Elastic Metric and the SRNF Framework

Jermyn et al. introduced in [18] the general elastic metric which has the desired invariance properties under shape-preserving deformations. To define this metric, we first introduce a transformation that maps an immersion onto its induced surface metric and normal vector field:

$$\begin{aligned} {\text {Imm}}(M,{\mathbb {R}}^3)&\rightarrow {\text {Met}}(M)\times C^{\infty }(M,{\mathbb {R}}^3) \\ f&\mapsto \left( g:=g^f,n:=n^f\right) \;, \end{aligned}$$

where \(n^f\) is the unit normal vector field to the surface f, which is given in local coordinates by

$$\begin{aligned} n=\frac{f_x\times f_y}{|f_x\times f_y|} \end{aligned}$$

and where the surface metric is given by

$$\begin{aligned} g=f^*\langle .,.\rangle _{{\mathbb {R}}^3}=\langle Tf.,Tf. \rangle _{{\mathbb {R}}^3}. \end{aligned}$$

It is a classical result in Riemannian geometry that any surface can be reconstructed uniquely from these two quantities [1]. Thus, this representation allows one to define a Riemannian metric on the space of immersions by describing it on the image \({\text {Met}}(M)\times C^{\infty }(M,\mathbb R^3)\). The general elastic metric as introduced in [18] is defined by:

$$\begin{aligned}&G_{g,n}((\delta g,\delta n),(\delta g, \delta n)) \nonumber \\&\quad = A\int _M {\text {tr}} (g^{-1} \delta g g^{-1}\delta g) \mu _g + B\int _M {\text {tr}} (g^{-1} \delta g)^2 \mu _g\nonumber \\&\qquad + C\int _M \langle \delta n, \delta n \rangle _{{\mathbb {R}}^3} \mu _g\; \end{aligned}$$
(1)

where \(A,B,C\ge 0\) are constants and where \(\mu _g\) denotes the induced volume density of the surface f.

Each of the three terms appearing in the metric (1) has a natural geometric interpretation: The first term penalizes local change in the metric (shearing), the second term measures the change in the volume density (scaling) and the third term quantifies the change in the normal vector (bending).

Instead of using the \((g,n)\) representation for comparing surfaces, in the same paper [18], Jermyn et al. introduced the SRNF framework, where a surface is represented only as a rescaled normal vector field:

$$\begin{aligned} {\mathcal {Q}}: {\text {Imm}}(M, {\mathbb {R}}^3)&\rightarrow C^{\infty }(M, {\mathbb {R}}^3)\\ f(s)&\mapsto \sqrt{A(s)}n(s), \end{aligned}$$

where A(s) denotes the local area-multiplication factor (not to be confused with the constant A in (1)), which is given in local coordinates by \(A(s)=|f_x(s)\times f_y(s)|\). After equipping the target space \(C^{\infty }(M, {\mathbb {R}}^3)\) with the flat \(L^2\) metric, the map \({\mathcal {Q}}\) becomes an infinitesimal isometry, where the space \({\text {Imm}}(M, {\mathbb {R}}^3)\) is equipped with the elastic metric \(G^{A,B,C}\) with \(A = 0, B = \frac{1}{16}\) and \(C = 1\), i.e., the pullback of the \(L^2\) metric on \(C^{\infty }(M, {\mathbb {R}}^3)\) along the map \({\mathcal {Q}}\) is equal to the metric \(G^{0,\frac{1}{16},1}\). Note, however, that the resulting metric is degenerate for this choice of constants, i.e., there exist deformation fields that have no cost with respect to the metric. Furthermore, given \(q\in C^{\infty }(M, {\mathbb {R}}^3)\), there may be either no preimage of q in \({\text {Imm}}(M, {\mathbb {R}}^3)\) under \({\mathcal {Q}}\) or many preimages. Most importantly, the image of the space of immersions under the SRNF map cannot be easily characterized and, so far, it is not well understood.
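As a small illustration of the SRNF map, the following NumPy sketch evaluates \(\sqrt{A}\,n\) pointwise from the two partial derivatives of an immersion. The function name `srnf` and the array layout are assumptions made for this example only and are not taken from [18].

```python
import numpy as np

def srnf(f_x, f_y):
    """Return (q, A, n) with q = sqrt(A) * n, computed pointwise.

    f_x, f_y: arrays of shape (..., 3) holding the two partial derivatives
    of the immersion at each sample point (illustrative layout).
    """
    cross = np.cross(f_x, f_y)                             # f_x x f_y
    area = np.linalg.norm(cross, axis=-1, keepdims=True)   # A = |f_x x f_y|
    normal = cross / area                                  # unit normal n
    return np.sqrt(area) * normal, area[..., 0], normal

# One sample point of a stretched and sheared patch.
q, area, normal = srnf(np.array([[2.0, 0.0, 0.0]]), np.array([[1.0, 3.0, 0.0]]))

# The same patch without the shear has the same cross product f_x x f_y.
q_other, _, _ = srnf(np.array([[2.0, 0.0, 0.0]]), np.array([[0.0, 3.0, 0.0]]))
```

In this example both sets of derivatives yield the same value of q, since q depends on the first-order data only through \(f_x\times f_y\). This is a pointwise observation and is only meant to give some intuition for why deformations can go unnoticed by the SRNF.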

Although the distance between two surfaces, which is given by the \(L^2\) difference between their SRNFs, can be easily calculated, inverting the linear path between their SRNFs that realizes this distance is in general not possible, as the linear path will usually leave the image of the SRNF map. In [25], Laga et al. introduced a way to approximate the inversion of arbitrary paths between SRNFs by formulating inversion as an optimization problem. In practice, this has been used to approximate geodesics by numerically inverting straight lines between the SRNFs. However, since the image of the SRNF map is not convex in \(L^2\), this method will not yield geodesics with respect to the SRNF metric, see Table 3.

2.2 Immersions and Vector Valued One-Forms

In the following, we will introduce our framework for comparing surfaces. To this end, let \(\varOmega ^1(M,{\mathbb {R}}^3)\) denote the space of all smooth \({\mathbb {R}}^3\)-valued one-forms on M and \(\varOmega _+^1(M,{\mathbb {R}}^3)\) denote the subset of \(\varOmega ^1(M,{\mathbb {R}}^3)\) that contains all full-ranked one-forms on M. Given a metric g on M, in a local chart with a field of orthonormal bases, an element of \(\varOmega _+^1(M,{\mathbb {R}}^3)\) can be represented as a field of full-ranked \(3\times 2\) matrices. We consider the differential as a mapping

$$\begin{aligned} d:{\text {Imm}}(M,{\mathbb {R}}^3)/{{\text {trans}}}&\rightarrow \varOmega _+^1(M,{\mathbb {R}}^3) \nonumber \\ f&\mapsto df\;. \end{aligned}$$
(2)

The differential d as defined above is injective, but not surjective. Furthermore, in contrast to the SRNF mapping \({\mathcal {Q}}\) mentioned in Sect. 2.1, it is easy to characterize the image of the differential d. The following theorem contains this characterization and a result concerning the manifold structure of the space of full-ranked one-forms \(\varOmega _+^1(M, {\mathbb {R}}^3)\):

Theorem 1

The space of smooth full-ranked one-forms \(\varOmega _+^1(M, {\mathbb {R}}^3)\) is an open subset of an infinite-dimensional vector space of one-forms \(\varOmega ^1(M, {\mathbb {R}}^3)\), and thus it is an infinite-dimensional Fréchet manifold, where the tangent space at each point is simply \(\varOmega ^1(M, {\mathbb {R}}^3)\).

Furthermore, the image of the differential d is the space of all exact full-ranked one-forms, which is the intersection of \(\varOmega ^1_ +(M, {\mathbb {R}}^3)\) with a linear subspace of \(\varOmega ^1 (M, {\mathbb {R}}^3)\).

Proof

The proof of this result follows directly from the definition of these spaces. \(\square \)

This theorem allows us to define a Riemannian metric on these spaces as follows. Let \(\alpha \in \varOmega _+^1(M, {\mathbb {R}}^3)\) and \(\xi \in T_{\alpha }\varOmega _+^1(M, {\mathbb {R}}^3)\). For the volume form \(\mu \) on M induced by the metric g, we let

$$\begin{aligned} G_{\alpha }(\xi ,\xi ) = \int _M{\text {tr}}\left( \xi _x(\alpha _x^T\alpha _x)^{-1}\xi _x^T\right) \sqrt{\det (\alpha _x^T\alpha _x)}\mu . \end{aligned}$$
(3)

It is easy to see that the integrand is positive definite, and thus the formula defines a nondegenerate Riemannian metric. This metric does not depend on the choice of orthonormal bases and is in fact independent of the metric g on M, see [7] for more details. Thus, we can choose any convenient metric g on M and use it to calculate this metric on \(\varOmega _+^1(M, {\mathbb {R}}^3)\).
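The pointwise integrand of metric (3) is straightforward to evaluate for \(3\times 2\) matrices. The following NumPy sketch does this for a single point and checks numerically how the value behaves under a change of frame on M; the function name `integrand` and the sample matrices are illustrative assumptions.

```python
import numpy as np

def integrand(alpha, xi):
    """Pointwise integrand of metric (3) for 3x2 matrices alpha (rank 2) and xi."""
    gram = alpha.T @ alpha                      # alpha^T alpha, 2x2 positive definite
    quad = np.trace(xi @ np.linalg.inv(gram) @ xi.T)
    return quad * np.sqrt(np.linalg.det(gram))

alpha = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
xi = np.array([[0.5, -1.0], [2.0, 0.0], [1.0, 3.0]])

# Change of orthonormal basis: rotate the two-dimensional frame by an angle.
t = 0.7
P = np.array([[np.cos(t), -np.sin(t)], [np.sin(t), np.cos(t)]])
same = integrand(alpha @ P, xi @ P)

# General change of frame B: the integrand is multiplied by |det B|.
B = np.array([[2.0, 1.0], [0.0, 3.0]])
scaled = integrand(alpha @ B, xi @ B)
```

Here `same` agrees with `integrand(alpha, xi)`, while `scaled` equals `abs(det B)` times that value. A frame that is orthonormal for a different metric on M comes with a volume form rescaled by the inverse factor, which is consistent with the independence of the background metric stated above.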

Fig. 2

Geodesics between two cylinders in the space of immersions \({\text {Imm}}(M, {\mathbb {R}}^3)\) with respect to different choices of coefficients (from top to bottom): (1, 1, 0, 1), (1, 0, 1, 1), (1, 1, 1, 0), \((0,\frac{1}{2},1,0)\)

Using the injection (2), we obtain a pullback metric on the space \({\text {Imm}}(M, {\mathbb {R}}^3)\) modulo translations and it turns out that this metric is related to the full elastic metric. The space of immersions equipped with this inner product is an infinite-dimensional Riemannian manifold. It should be noted that, with respect to this metric, \({\text {Imm}}(M,{\mathbb {R}}^3)/{{\text {trans}}}\) is neither geodesically complete nor geodesically convex. In addition, there exists no explicit formula to calculate minimizing geodesics between two given immersions \(f_1\) and \(f_2\). Instead, we will rely on numerical methods to minimize the path length over all paths of immersions connecting the given immersions \(f_1\) and \(f_2\). Alternatively, these minimizing deformations can be found by solving the Lagrangian optimality condition for the energy functional, called the geodesic equation. Although we will not follow this strategy, we will present this equation in “Appendix A.”

First, however, we will orthogonally decompose the tangent space at \(\alpha \) in a similar manner as in the definition of the elastic metric earlier. In the following, we will denote the Moore–Penrose inverse of \(\alpha \) by \(\alpha ^+\), which is defined by \(\alpha ^+=(\alpha ^T\alpha )^{-1}\alpha ^T\) where \(\alpha \) is a \(3\times 2\) matrix of rank 2. Using this notation, we let

$$\begin{aligned} \xi = \xi _m + \frac{1}{2}{{\,\mathrm{tr}\,}}(\alpha ^+\xi )\alpha +\xi ^{\perp } +\xi _0, \end{aligned}$$

where

$$\begin{aligned} \xi _m&= \frac{1}{2}\alpha (\alpha ^T\alpha )^{-1}(\alpha ^T\xi +\xi ^T\alpha ) - \frac{1}{2}{{\,\mathrm{tr}\,}}(\alpha ^+\xi )\alpha \\ \xi ^{\perp }&= \xi - \alpha (\alpha ^T\alpha )^{-1}\alpha ^T\xi \\ \xi _0&= \frac{1}{2}\alpha (\alpha ^T\alpha )^{-1}(\alpha ^T\xi -\xi ^T\alpha ) \end{aligned}$$

It is easy to check that these terms are orthogonal with respect to metric (3). We can now obtain a family of metrics on \(\varOmega _+^1(M, {\mathbb {R}}^3)\):

$$\begin{aligned}&G^{{\mathfrak {a}},{\mathfrak {b}},{\mathfrak {c}},{\mathfrak {d}}}_{\alpha }(\xi , \xi ) \nonumber \\&\quad = {\mathfrak {a}} G_{\alpha }(\xi _m, \xi _m) + {\mathfrak {b}} G_{\alpha }\left( \frac{1}{2}{{\,\mathrm{tr}\,}}(\alpha ^+\xi )\alpha , \frac{1}{2}{{\,\mathrm{tr}\,}}(\alpha ^+\xi ){\alpha }\right) \nonumber \\&\qquad + {\mathfrak {c}}G_{\alpha }( \xi ^{\perp }, \xi ^{\perp }) + {\mathfrak {d}}G_{\alpha }(\xi _0, \xi _0), \end{aligned}$$
(4)

where the first summand measures the deformation of the metric (within the class of metrics with the same volume form), the second summand measures the deformation of the volume density, and the third summand measures the deformation of the normal vector. The interpretation of the last summand is less intuitive: it measures changes in the one-form that locally come from rotations about the normal vector.
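The decomposition and the split metric (4) can be checked numerically at a single point. The sketch below is a minimal NumPy illustration, not the authors' implementation; the names `G`, `split` and `split_metric` as well as the sample matrices are assumptions made for this example.

```python
import numpy as np

def G(alpha, xi, eta):
    """Pointwise integrand of metric (3) as a bilinear form in the 3x2 matrices xi, eta."""
    gram = alpha.T @ alpha
    return np.trace(xi @ np.linalg.inv(gram) @ eta.T) * np.sqrt(np.linalg.det(gram))

def split(alpha, xi):
    """Return (xi_m, trace part, xi_perp, xi_0) of the decomposition of xi."""
    gram_inv = np.linalg.inv(alpha.T @ alpha)
    alpha_pinv = gram_inv @ alpha.T                      # Moore-Penrose inverse alpha^+
    xi_tr = 0.5 * np.trace(alpha_pinv @ xi) * alpha
    xi_m = 0.5 * alpha @ gram_inv @ (alpha.T @ xi + xi.T @ alpha) - xi_tr
    xi_perp = xi - alpha @ alpha_pinv @ xi
    xi_0 = 0.5 * alpha @ gram_inv @ (alpha.T @ xi - xi.T @ alpha)
    return xi_m, xi_tr, xi_perp, xi_0

def split_metric(alpha, xi, coeffs):
    """Pointwise integrand of the split metric (4) with coeffs = (a, b, c, d)."""
    return sum(w * G(alpha, p, p) for w, p in zip(coeffs, split(alpha, xi)))

alpha = np.array([[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]])
xi = np.array([[0.5, -1.0], [2.0, 0.0], [1.0, 3.0]])
parts = split(alpha, xi)

# Largest absolute cross term between two different components.
max_cross = max(abs(G(alpha, p, q))
                for i, p in enumerate(parts) for q in parts[i + 1:])
```

For these sample matrices the four components add up to `xi`, the cross terms vanish up to rounding, and the choice of coefficients (1, 1, 1, 1) reproduces the integrand of metric (3).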

The following theorem shows the connection of our split metric (4) with the elastic metric (1) on surfaces.

Theorem 2

If \({\mathfrak {d}} = 0\), then the pullback of the split metric (4) gives rise to the elastic metric (1) on the space of immersions.

Proof

See “Appendix B” for a proof of this result. \(\square \)

In Fig. 2, we show geodesics between two parametrized cylinders with respect to the split metric (4) for different choices of coefficients \({\mathfrak {a}},{\mathfrak {b}},{\mathfrak {c}}\) and \({\mathfrak {d}}\). One can see how the choice of coefficients affects the resulting geodesic. Thus, in each specific application, we are now able to adjust the coefficients of the metric in a data-driven way to obtain desired deformations between the shapes under consideration.

Remark 1

In [7], we have presented a detailed study of metric (3) on the space \(\varOmega _+^1(M, {\mathbb {R}}^3)\). In particular, we have obtained an explicit formula for the corresponding geodesic initial value problem; in that situation, geodesics can be computed pointwise, so the problem reduces to a finite-dimensional ODE which can be solved explicitly, and gives the solution in the infinite-dimensional context we are dealing with here.

Fig. 3

Geodesics between two cylinders in the space of unparametrized surfaces \({\text {Imm}}(M,{\mathbb {R}}^3)/{\text {Diff}}_+(M)\) with respect to different choices of coefficients (from top to bottom): (1, 1, 0, 1), (1, 0, 1, 1), (1, 1, 1, 0), \((0,\frac{1}{2},1,0)\)

The space of full-ranked exact one-forms \(\varOmega ^1_{+,ex}(M,{\mathbb {R}}^3)\) is, however, a proper subspace of the space of full-ranked one-forms \(\varOmega ^1_+(M,{\mathbb {R}}^3)\) and is not a totally geodesic submanifold of \(\varOmega ^1_+(M,{\mathbb {R}}^3)\) with respect to metric (3). As the space of immersions corresponds to the space of full-ranked exact one-forms, the obtained explicit formula for geodesics does not directly help to calculate geodesics on the space of immersions, which is the main goal of this article. In order to solve the geodesic problem, we will thus introduce a discretization of the metric and solve the geodesic matching problem using path-straightening algorithms.

Note that the split metric (4) is defined on differentials and thus is, by definition, independent of translations. To show the invariance of the split metric under rigid motions and diffeomorphisms, we now consider the action of the group of rotations \({\text {SO}}(3)\) on \(\varOmega _+^1(M,{\mathbb {R}}^3)\), which is defined by pointwise left multiplication:

$$\begin{aligned} {\text {SO}}(3)\times \varOmega _+^1(M,{\mathbb {R}}^3)&\rightarrow \varOmega _+^1(M,{\mathbb {R}}^3) \\ (R, \alpha )&\mapsto R\alpha , \end{aligned}$$

where \((R\alpha )_x = R\alpha _x\), and the action of the group of diffeomorphisms \({\text {Diff}}_+(M)\) on \(\varOmega _+^1(M,{\mathbb {R}}^3)\), which is defined via pullback:

$$\begin{aligned} \varOmega _+^1(M, {\mathbb {R}}^3)\times {\text {Diff}}_+(M)&\rightarrow \varOmega _+^1(M, {\mathbb {R}}^3) \\ (\alpha , \varphi )&\mapsto \varphi ^*\alpha , \end{aligned}$$

where \( (\varphi ^*\alpha )_x = \alpha _{\varphi (x)}\circ d\varphi _x. \) The following proposition summarizes the most important invariances of the metric on \(\varOmega _+^1(M, {\mathbb {R}}^3)\):

Proposition 1

Let \(\alpha \in \varOmega ^1_+(M,{\mathbb {R}}^3)\) and \(\zeta , \eta \in T_\alpha \varOmega ^1_+(M,{\mathbb {R}}^3)\).

  1. Metric (4) is invariant under pointwise left multiplication with \({\text {SO}}(3)\), i.e., if \(R \in {\text {SO}}(3)\), then

    $$\begin{aligned} G_{\alpha }(\zeta , \eta )=G_{R\alpha }(R\zeta ,R\eta ). \end{aligned}$$

  2. Metric (4) is invariant under the right action of the diffeomorphism group, i.e., for any \(\varphi \in {\text {Diff}}_+(M)\), we have

    $$\begin{aligned} G_{\alpha }(\zeta , \eta )=G_{\varphi ^*\alpha }(\varphi ^*\zeta ,\varphi ^*\eta ). \end{aligned}$$

Proof

The proof of the proposition follows exactly as for the metric (3), which can be found in [7]. \(\square \)

The group of rotations \({\text {SO}}(3)\) acts on the space of immersions by left multiplication, in the same way as it acts on the space of one-forms. Thus, by the first statement of Proposition 1, the pullback metric on \({\text {Imm}}(M, {\mathbb {R}}^3)\) is also invariant under the group of rigid motions \({\text {SO}}(3)\ltimes {\mathbb {R}}^3\). For the standard action of \({\text {Diff}}_+(M)\) by composition from the right on \({\text {Imm}}(M, {\mathbb {R}}^3)\), the chain rule gives \(d(f\circ \varphi ) = \varphi ^*(df)\) for all \(f\in {\text {Imm}}(M, {\mathbb {R}}^3)\) and \(\varphi \in {\text {Diff}}_+(M)\), i.e., the pullback action of \({\text {Diff}}_+(M)\) on \(\varOmega _+^1(M, {\mathbb {R}}^3)\) is compatible with the action of \({\text {Diff}}_+(M)\) on \({\text {Imm}}(M, {\mathbb {R}}^3)\) and the corresponding diagram commutes.

Therefore, the second statement of Proposition 1 gives the reparametrization invariance of the pullback metric on the space \({\text {Imm}}(M,{\mathbb {R}}^3)\).

Thus, the metric on the space of immersions \({\text {Imm}}(M, {\mathbb {R}}^3)\) induces a metric on the space of unparametrized surfaces \({\text {Imm}}(M, {\mathbb {R}}^3)/{\text {Diff}}_+(M)\) and a metric on the space of unparametrized surfaces modulo rigid motions \({\text {Imm}}(M, {\mathbb {R}}^3)/\left( {\text {Diff}}_+(M)\times {\text {SO}}(3)\ltimes {\mathbb {R}}^3\right) \). In Fig. 3, we show geodesics between two cylinders in the space \({\text {Imm}}(M, {\mathbb {R}}^3)/{\text {Diff}}_+(M)\) with respect to the split metric (4) for different choices of coefficients \({\mathfrak {a}},{\mathfrak {b}},{\mathfrak {c}}\) and \({\mathfrak {d}}\). The corresponding geodesics in the space \({\text {Imm}}(M, {\mathbb {R}}^3)/\left( {\text {Diff}}_+(M)\times {\text {SO}}(3)\ltimes {\mathbb {R}}^3\right) \) are shown in Fig. 4.

Fig. 4

Geodesics between two cylinders in the space of unparametrized surfaces modulo rigid motions \({\text {Imm}}(M,{\mathbb {R}}^3)/\left( {\text {Diff}}_+(M)\times {\text {SO}}(3)\ltimes {\mathbb {R}}^3\right) \) with respect to different choices of coefficients (from top to bottom): (1, 1, 0, 1), (1, 0, 1, 1), (1, 1, 1, 0), \((0,\frac{1}{2},1,0)\). Note that the cylinders on the right are rotated in order to minimize energy beyond what is possible in Fig. 3, which leads to more extensive deformation as well. As compared to Fig. 3, the shapes on the right are rotated by \(\theta =1.54\) and \(\phi =0.21\) (row 1), \(\theta =1.56\) and \(\phi =0.28\) (row 2), \(\theta =1.58\) and \(\phi =0.56\) (row 3) and \(\theta =1.58\) and \(\phi =0.47\) (row 4), where \(\theta \) and \(\phi \) are the rotation angles around the z-axis and y-axis. All angles are in radians

3 A Numerical Framework for the General Elastic Metric

In this section, we will describe the discretization and optimization procedure that we implemented to solve the geodesic boundary value problem. From here on, we assume that \(M = S^2\) and use a spherical coordinate system to represent an immersion \(f: S^2\rightarrow {\mathbb {R}}^3\) as a function \(f:[0,2\pi ]\times [0,\pi ]\rightarrow {\mathbb {R}}^3\) such that \(f(0, \phi ) = f(2\pi , \phi ), f(\theta , 0) = f(0,0)\) and \(f(\theta , \pi ) = f(0, \pi )\), see Remark 2 below on how we obtain such (discrete) parametrizations in practice from a triangulated surface.

Remark 2

We represent the surface of a given 3D shape with its embedding on a sphere \(f: S^2 \rightarrow {\mathbb {R}}^3\), which is always possible for genus-0 surfaces. In practice, methods such as conformal mapping introduce significant distortions when dealing with complex shapes that contain many elongated parts. Since the proposed approach does not require the mapping to be conformal, we adopt the approach of Praun and Hoppe [27], which has been implemented by Kurtek et al. [24]. The idea is to progressively embed a surface on a sphere while minimizing area distortion. The approach starts by reducing the mesh, using progressive mesh simplification, to a basic polyhedron that can be easily embedded on \(S^2\). Then, it iteratively inserts vertices and embeds each new vertex inside the spherical kernel of its one-ring neighborhood while optimizing for the area distortion. Using the implementation provided in [24], we reconstruct the mesh up to 1500 vertices, which is sufficient for computing geodesics. This procedure produces spherical maps that preserve important shape features, as shown in all of the examples in this paper. We want to remark here that finding parametrizations of higher genus surfaces is still an open problem. Since we are not aiming at solving the parametrization problem, we focus in this paper on genus-0 manifold surfaces only.

The identity immersion \(i: S^2\rightarrow {\mathbb {R}}^3\) induces the spherical metric on \(S^2\), which will serve as a background metric for the discretization; the vector fields

$$\begin{aligned} \left\{ \dfrac{1}{\sin \phi }\dfrac{\partial }{\partial \theta }, \dfrac{\partial }{\partial \phi }\right\} \end{aligned}$$

form an orthonormal basis of the tangent space for any \((\theta , \phi )\in [0, 2\pi ]\times (0, \pi )\). With respect to this basis and the standard basis on \({\mathbb {R}}^3\), the differential df of an immersion \(f = (x, y, z)^T\) can be represented by a field of \(3\times 2\) matrices:

$$\begin{aligned} df\left( \dfrac{1}{\sin \phi }\dfrac{\partial }{\partial \theta }, \dfrac{\partial }{\partial \phi }\right) = \begin{pmatrix} \dfrac{1}{\sin \phi }\dfrac{\partial x}{\partial \theta } & \dfrac{\partial x}{\partial \phi } \\ \dfrac{1}{\sin \phi }\dfrac{\partial y}{\partial \theta } & \dfrac{\partial y}{\partial \phi } \\ \dfrac{1}{\sin \phi }\dfrac{\partial z}{\partial \theta } & \dfrac{\partial z}{\partial \phi } \end{pmatrix}. \end{aligned}$$
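One possible way to obtain this field of \(3\times 2\) matrices from samples of f on a spherical grid is by finite differences. The sketch below uses `numpy.gradient` for this purpose; the function name `differential`, the grid sizes and the choice to stay away from the poles are assumptions of this example and not necessarily the discretization used in our implementation.

```python
import numpy as np

def differential(f, theta, phi):
    """Approximate df with respect to the orthonormal frame on the sphere.

    f:     array of shape (n_theta, n_phi, 3) with samples f(theta_i, phi_j)
    theta: 1D array of azimuthal angles
    phi:   1D array of polar angles in the open interval (0, pi)
    Returns an array of shape (n_theta, n_phi, 3, 2) of 3x2 matrices.
    """
    f_theta = np.gradient(f, theta, axis=0, edge_order=2)
    f_phi = np.gradient(f, phi, axis=1, edge_order=2)
    col1 = f_theta / np.sin(phi)[None, :, None]    # df(1/sin(phi) d/dtheta)
    return np.stack([col1, f_phi], axis=-1)

# Example: the identity immersion of the unit sphere, sampled away from the poles.
theta = np.linspace(0.0, 2.0 * np.pi, 201)
phi = np.linspace(0.2, np.pi - 0.2, 101)
TH, PH = np.meshgrid(theta, phi, indexing="ij")
f = np.stack([np.sin(PH) * np.cos(TH), np.sin(PH) * np.sin(TH), np.cos(PH)], axis=-1)

df = differential(f, theta, phi)
gram = np.swapaxes(df, -1, -2) @ df    # alpha^T alpha at every grid point
```

For the identity immersion the two columns of df are orthonormal, so `gram` should be close to the \(2\times 2\) identity matrix at every grid point, up to the finite-difference error.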

In the following, we denote by \(\left\| \cdot \right\| _f\) the norm induced by the pullback of the split metric (4) and let \(u\in T_f{\text {Imm}}(S^2, {\mathbb {R}}^3)\) be a tangent vector. Since u can be seen as a function from \(S^2\) to \({\mathbb {R}}^3\), using this representation, the norm of u with respect to the split metric will be given as follows:

$$\begin{aligned} \left\| u\right\| _f = \big [G^{{\mathfrak {a}},{\mathfrak {b}},{\mathfrak {c}},{\mathfrak {d}}}_{df}(du, du)\big ]^{1/2}. \end{aligned}$$

3.1 Geodesics in the Space of Surfaces

We will now describe the solution of the boundary value problem in the preshape space of all parametrized surfaces.

Remark 3

We should emphasize here that the space of immersions with respect to the proposed family of Riemannian metrics is not geodesically convex. Thus, the solution of the minimization problem might not exist in the space of immersions but only in a larger space of functions (including those with possibly degenerate differential). In fact, for \({\text {dim}} M\ge 2\), there exists no Riemannian metric on the space of immersions for which geodesic completeness or convexity results have been obtained; in the case of immersed curves, such results have been achieved for metrics of order two or higher [9], but it remains unknown for our higher dimensional situation.

Given two parametrized surfaces \(f_1\) and \(f_2\), we can discretize the linear path connecting \(f_1\) and \(f_2\) in T time steps:

$$\begin{aligned} f_{\text {lin}}(t_i) = (1 - t_i)f_1 + t_if_2, \end{aligned}$$

where \(t_i = i/T, i = 0,\ldots ,T\). The differential \(df_{\text {lin}}\) is then the linear path between \(df_1\) and \(df_2\), which by definition stays in the space of exact one-forms for all \(i= 0,\ldots , T\). Note that this path does not necessarily stay in the space of full-ranked one-forms, e.g., if \((df_1)_x=-(df_2)_x\) at some point \(x\in S^2\). However, we did not encounter any problems with this possible degeneracy. To solve the geodesic boundary value problem, we will perturb f(t) in all possible directions that fix the end points and that remain in the space of immersions. Since the map d, as defined in Equation (2), is injective, this is equivalent to perturbing the differential df(t) in all possible directions in the space of exact one-forms that keep the two boundary one-forms fixed.

To obtain a basis of perturbations in the space of immersions, we use the fact that the set of spherical harmonics in each component forms a Hilbert basis of \(L^2(S^2,{\mathbb {R}}^3)\). We truncate this basis at a chosen maximal degree \({\text {deg}}\) and denote the obtained set by \(\{S_i\}\). The number of elements in this basis is \(L = 3(({\text {deg}}+1)^2-1)\) (here, we remove the spherical harmonic of degree 0 and order 0 since it is a constant function, which corresponds to a pure translation). To calculate the optimal deformation between two given surfaces, we aim to minimize the (discrete) path energy over all curves of the form

$$\begin{aligned} f(t_0)&= f_1,\quad f(t_T) = f_2 \nonumber \\ f(t_i)&= (1 - t_i)f_1 + t_if_2 + \sum _{j =1}^L{\text {Coeff}}(j, i)S_j, \end{aligned}$$
(5)

where \(i = 1,\ldots ,T-1\) and \({\text {Coeff}}\) is an \(L\times (T-1)\) coefficient matrix.

The discrete energy functional \(F: {\mathbb {R}}^{L\times (T-1)}\rightarrow {\mathbb {R}}\) is then given by

$$\begin{aligned} F({\text {Coeff}}) = \sum _{i =1}^T\left\| f_t(t_{i-1})\right\| ^2_{f(t_{i-1})}\varDelta T \end{aligned}$$
(6)

where the norm \(\left\| \cdot \right\| \) is induced by the pullback of the split metric (4),

$$\begin{aligned} f_t(t_{i-1}) = \frac{f(t_{i}) - f(t_{i-1})}{\varDelta T} \end{aligned}$$
(7)

is the (discrete) derivative of f at \(t_{i-1}\) and \(\varDelta T = \frac{1}{T}\) is the width of a subinterval. Alternatively, one can also discretize the derivative of f using the central difference for interior data points, which makes the energy functional symmetric, but leads to a slightly higher computational cost. To find the optimal coefficient matrix \({\text {Coeff}}\), we employ a BFGS method, which is a quasi-Newton method for solving unconstrained minimization problems [13], as provided in the optimize package of SciPy. We calculate the gradient using automatic differentiation in PyTorch, which leads to the algorithm described in Algorithm 1. See [21] for more examples of applying tools of deep learning and in particular automatic differentiation in shape and image analysis.
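The forward-difference energy (6)-(7) can be illustrated with a small sketch. It is not the authors' implementation: surfaces are reduced to short lists of 3D sample points, and a flat squared \(L^2\) norm is assumed as a stand-in for the pullback of the split metric (4), so the norm does not depend on the base point \(f(t_{i-1})\). The helper names `linear_path` and `discrete_energy` are hypothetical.

```python
# Toy sketch of the discrete path energy (6) with forward differences (7).
# Assumption: a flat squared L2 norm over sample points replaces the
# pullback of the split metric (4), which is not reproduced here.

def linear_path(f1, f2, T):
    """Return the T+1 surfaces f_lin(t_i), t_i = i/T, as lists of 3D points."""
    path = []
    for i in range(T + 1):
        t = i / T
        path.append([[(1 - t) * a + t * b for a, b in zip(p, q)]
                     for p, q in zip(f1, f2)])
    return path

def discrete_energy(path):
    """Sum over i = 1..T of ||f_t(t_{i-1})||^2 * dT, f_t a forward difference."""
    T = len(path) - 1
    dT = 1.0 / T
    energy = 0.0
    for i in range(1, T + 1):
        sq_norm = 0.0
        for p, q in zip(path[i - 1], path[i]):
            sq_norm += sum(((b - a) / dT) ** 2 for a, b in zip(p, q))
        energy += sq_norm * dT
    return energy

# Two "surfaces", each sampled at two points.
f1 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
f2 = [[2.0, 0.0, 0.0], [0.0, 3.0, 0.0]]

path = linear_path(f1, f2, T=4)
e_lin = discrete_energy(path)

# Perturb one interior time point; the two end points stay fixed.
path[2][0][2] += 0.5
e_pert = discrete_energy(path)
```

In this flat toy setting the linear path is already the minimizer, so any interior perturbation raises the energy. For the split metric this is no longer true, which is why the coefficients in (5) are optimized.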

figure f

3.2 Geodesics in the Space of Unparametrized Surfaces

Now we present our algorithm for calculating geodesics in the space of unparametrized surfaces \({\text {Imm}}(S^2, {\mathbb {R}}^3)/{\text {Diff}}_+(S^2)\). The main difficulty for this task is to find the optimal \(\gamma \in {\text {Diff}}_+(S^2)\) that realizes the distance

$$\begin{aligned} {\text {dist}}_{{\mathcal {S}}}([f_1], [f_2]) = \inf _{\gamma \in {\text {Diff}}_+(S^2)}{\text {dist}}_{{\text {Imm}}}(f_1\circ \gamma , f_2), \end{aligned}$$

where [f] is the equivalence class of f under the action of the group of orientation-preserving diffeomorphisms \({\text {Diff}}_+(S^2)\) and \({\text {dist}}_{{\mathcal {S}}}\) denotes the distance function on the space \({\text {Imm}}(S^2, {\mathbb {R}}^3)/{\text {Diff}}_+(S^2)\) with respect to the metric that is induced from the split metric (4).

In order to practically perform the minimization over the infinite-dimensional space \({\text {Diff}}_+(S^2)\), we have to choose a suitable discretization of this group: Let \({\text {Id}}\) be the identity map from \(S^2\) to itself. The tangent space \(T_{{\text {Id}}}{\text {Diff}}_+(S^2)\) is the set of all (smooth) vector fields on \(S^2\). The set of gradient and skew gradient vector fields of the set of spherical harmonics provides an orthogonal basis for this tangent space (here, orthogonal means with respect to the standard \(L^2\) metric), see, e.g., [22]. Normalizing this basis, we obtain an orthonormal basis for the tangent space \(T_{{\text {Id}}}{\text {Diff}}_+(S^2)\). To choose a finite-dimensional discretization of the tangent space, we truncate this basis at a maximal degree \(\overline{{\text {deg}}}\); then, the number of elements in this basis is \({\bar{L}} = 2(\overline{{\text {deg}}}+1)^2-2\). From here on, we will denote this truncated basis by \(\{v_i, i = 1,\ldots ,{\bar{L}}\}\). Let \(X^v = (X^v_1, X^v_2,\ldots ,X^v_{{\bar{L}}})\) be the coefficients of a vector field with respect to this basis, and consider the induced mapping

$$\begin{aligned} \gamma = {\text {Proj}}\left( {\text {Id}} + \sum _{k=1}^{{\bar{L}}}X^v_kv_k\right) , \end{aligned}$$
(8)

where \({\text {Proj}}(x)=\frac{x}{|x|}\) denotes the map that projects nonzero vectors in \({\mathbb {R}}^3\) onto the unit sphere \(S^2\subset {\mathbb {R}}^3\). The following result gives an explicit bound on the size of \(X^v\) that ensures that the corresponding \(\gamma \), defined by (8), is a diffeomorphism of \(S^2\).

Theorem 3

Let \( U= \sum _{k=1}^{{\bar{L}}}X^v_kv_k\) be a vector field on the sphere \(S^2\) and let \(\gamma = {\text {Proj}}\left( {\text {Id}} + t U \right) \) be the corresponding map as defined in (8), for some real t. Then, \(\gamma \) is a diffeomorphism if

$$\begin{aligned} |t|< -\frac{1}{\inf _{p\in M} \lambda _-(\nabla U)}, \end{aligned}$$
(9)

where \(\nabla U\) is the (1, 1) tensor field \(v\mapsto \nabla _vU\) and \(\lambda _-(\nabla U)\) is the smaller of the two real eigenvalues of the symmetrized matrix \(\overline{\nabla U} = \tfrac{1}{2} \big (\nabla U + (\nabla U)^T\big )\).

Note that \(\nabla U\) is a tensor that for each point \(x\in S^2\) gives a linear transformation \(T_xS^2\rightarrow T_xS^2\), which is defined by \(\nabla U (v)\) being the covariant derivative of U in the direction v.

Proof

The proof of this result is postponed to “Appendix B.” Note that since \(\mathrm {Tr}{(\nabla U)} = \mathrm {div}{U}\), which integrates to zero over the compact manifold M, we know that \(\lambda _-(\nabla U)\) is always negative somewhere; hence, the bound on \(|t|\) is some positive number. \(\square \)
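To make formula (8) concrete, the following sketch applies \({\text {Proj}}({\text {Id}} + \sum _k X^v_kv_k)\) to a single point of \(S^2\). The two vector fields are illustrative assumptions rather than the normalized basis \(\{v_k\}\) of the text: up to sign and normalization, they are the gradient of the coordinate function \(x_0\) and the skew gradient of \(x_2\), both degree-one spherical harmonics. All function names are hypothetical.

```python
import math

def proj(x):
    """Proj(x) = x / |x| for a nonzero vector in R^3."""
    n = math.sqrt(sum(c * c for c in x))
    return [c / n for c in x]

def gradient_field(x):
    # Tangential part of e_x at the point x of the unit sphere:
    # e_x - <e_x, x> x (gradient of the coordinate function x_0).
    return [1.0 - x[0] * x[0], -x[0] * x[1], -x[0] * x[2]]

def rotation_field(x):
    # e_z cross x, an infinitesimal rotation about the z-axis.
    return [-x[1], x[0], 0.0]

def gamma(x, coeffs, fields):
    """Evaluate Proj(x + sum_k coeffs[k] * fields[k](x)) as in formula (8)."""
    y = list(x)
    for c, v in zip(coeffs, fields):
        vx = v(x)
        y = [yi + c * vi for yi, vi in zip(y, vx)]
    return proj(y)

fields = [gradient_field, rotation_field]
x = proj([1.0, 2.0, 2.0])
y = gamma(x, [0.2, -0.1], fields)    # a nearby point on the sphere
y0 = gamma(x, [0.0, 0.0], fields)    # zero coefficients give the identity
```

Because the fields are tangent to the sphere, \(|x + U(x)|^2 = 1 + |U(x)|^2 \ge 1\), so the projection is always well defined. Whether the resulting map is a diffeomorphism is a separate question, which Theorem 3 addresses.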

We are now able to describe the discrete optimization problem on the space of unparametrized surfaces, i.e., we aim to minimize the discrete functional \({{\bar{F}}}: {\mathbb {R}}^{{{\bar{L}}}+ L\times (T-2)}\rightarrow \mathbb R\) given by

$$\begin{aligned} {{\bar{F}}}(X^v, {\text {Coeff}}) = \sum _{i =1}^T\left\| f_t(t_{i-1})\right\| ^2_{f(t_{i-1})}\varDelta T, \end{aligned}$$
(10)

where the norm \(\left\| \cdot \right\| \) is induced by the pullback of the split metric (4), \({\text {Coeff}}\), \(S_i\), \(\varDelta T = \frac{1}{T}\) are as in Sect. 3.1 and where the discrete curve f is now of the form

$$\begin{aligned} f(t_0)&= f_1\circ \gamma ,\quad f(t_T) = f_2 \nonumber \\ f(t_i)&= (1 - t_i)f_1 + t_if_2 + \sum _{j =1}^L{\text {Coeff}}(j, i)S_j, \end{aligned}$$
(11)

and where the reparametrization \(\gamma \) is given by formula (8) with coefficient vector \(X^v = (X^v_1, X^v_2,\ldots ,X^v_{{{\bar{L}}}})\).
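As a quick sanity check on the size of this optimization problem, the dimension \({{\bar{L}}}+ L\times (T-2)\) of the domain of \({{\bar{F}}}\) can be evaluated directly from the basis sizes given above. The sketch below does so for \({\text {deg}}= \overline{{\text {deg}}}= 7\) and \(T = 13\), the setting reported for Fig. 8; the function names are hypothetical.

```python
def num_surface_basis(deg):
    """L = 3((deg + 1)^2 - 1): spherical harmonics per component, constants removed."""
    return 3 * ((deg + 1) ** 2 - 1)

def num_vector_field_basis(deg_bar):
    """L_bar = 2(deg_bar + 1)^2 - 2: gradient and skew gradient fields."""
    return 2 * (deg_bar + 1) ** 2 - 2

def search_dimension(deg, deg_bar, T):
    """Dimension L_bar + L * (T - 2) of the domain of the functional (10)."""
    return num_vector_field_basis(deg_bar) + num_surface_basis(deg) * (T - 2)

L = num_surface_basis(7)            # 189
L_bar = num_vector_field_basis(7)   # 126
dim = search_dimension(7, 7, 13)    # 126 + 189 * 11 = 2205
```

The result agrees with the approximately 2205-dimensional search space quoted in the caption of Fig. 8.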

Fig. 5
figure 5

Examples of boundary surfaces before and after the optimization over the reparametrization group with respect to the split (1, 1, 1, 0) metric. Here, the second shape shows the parametrization of the first boundary surface after composing by the initial guess in the icosahedral group and the third shape shows the final point correspondences after the full optimization, where \(\bar{h}\) denotes the optimal reparametrization. One can observe how the parametrization of the initial surface successively better matches the parametrization of the target surface (the color map represents the parametrization of the surfaces)

Remark 4

(Initialization over \({\text {Diff}}_+(S^2)\)) When using a gradient-based optimization method, it is always an important issue to find a good initialization, as the optimization procedure can get stuck in local minima and is usually sensitive to this initialization. In order to find a good initial guess for the optimal reparametrization of the surface \(f_1\), we first align the corresponding SRNFs of the two boundary surfaces \(f_1\) and \(f_2\). This seems a natural initialization for the \((0,\frac{1}{2},1,0)\) metric as the \(L^2\)-distance on the space of SRNFs is a first-order approximation of the geodesic distance of this metric. However, in all our experiments, it turned out that this initialization works well for other choices of constants as well, as the optimal point correspondences for different choices of constants, albeit different, are still similar on a global scale. Furthermore, we note that any three-dimensional rotation can be seen as a diffeomorphism of \(S^2\). We use this fact to first minimize only over this finite-dimensional subgroup of the infinite-dimensional reparametrization group. Finally, to initialize the optimization over this finite-dimensional group, we first consider the icosahedral group, which contains 60 orientation-preserving rotations denoted by \(h_i, i = 1,\ldots , 60\), as a finite subset of \({\text {SO}}(3)\). We then choose the best diffeomorphism among these 60 elements as our initial guess. See Fig. 5 for examples of registration before and after this initialization and the whole optimization process over the group of orientation-preserving diffeomorphisms \({\text {Diff}}_+(S^2)\).
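The search over the icosahedral group can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the 60 rotations are generated by closing one 5-fold and one 3-fold rotation under products, the surface `toy_surface` is a hypothetical star-shaped example, and a plain \(L^2\) cost on a few sample points stands in for whichever criterion is actually used to rank the candidates \(f_1\circ h_i\).

```python
import math

def rodrigues(axis, angle):
    """Rotation matrix for a rotation by `angle` about `axis` (Rodrigues' formula)."""
    n = math.sqrt(sum(a * a for a in axis))
    x, y, z = (a / n for a in axis)
    c, s, C = math.cos(angle), math.sin(angle), 1.0 - math.cos(angle)
    return [[c + x * x * C, x * y * C - z * s, x * z * C + y * s],
            [y * x * C + z * s, c + y * y * C, y * z * C - x * s],
            [z * x * C - y * s, z * y * C + x * s, c + z * z * C]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(3)) for i in range(3)]

def key(A):
    # Rounded entries identify a rotation up to floating-point noise.
    return tuple(round(a, 6) for row in A for a in row)

def icosahedral_group():
    """Close a 5-fold and a 3-fold rotation of the icosahedron under products."""
    phi = (1.0 + math.sqrt(5.0)) / 2.0
    gens = [rodrigues([0.0, 1.0, phi], 2.0 * math.pi / 5.0),   # through a vertex
            rodrigues([1.0, 1.0, 1.0], 2.0 * math.pi / 3.0)]   # through a face center
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    group = {key(identity): identity}
    frontier = [identity]
    while frontier:
        new = []
        for A in frontier:
            for g in gens:
                B = matmul(A, g)
                if key(B) not in group:
                    group[key(B)] = B
                    new.append(B)
        frontier = new
    return list(group.values())

def toy_surface(x):
    # Hypothetical star-shaped surface f(x) = r(x) x with r > 0 on the sphere.
    r = 1.0 + 0.3 * x[0] + 0.2 * x[1] * x[2]
    return [r * c for c in x]

def l2_cost(h, points, target):
    """Squared L2 mismatch between f o h and the target at the sample points."""
    total = 0.0
    for x, y in zip(points, target):
        fx = toy_surface(matvec(h, x))
        total += sum((a - b) ** 2 for a, b in zip(fx, y))
    return total

def unit(v):
    n = math.sqrt(sum(c * c for c in v))
    return [c / n for c in v]

group = icosahedral_group()
points = [unit(v) for v in ([1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 2, 3], [-2, 1, 1])]
h_true = group[17]                                  # hidden reparametrization
target = [toy_surface(matvec(h_true, x)) for x in points]
best = min(group, key=lambda h: l2_cost(h, points, target))
```

Here the target is built from one of the group elements, so the exhaustive search recovers it exactly. In practice the best of the 60 candidates only serves as a starting point for the subsequent optimization over \({\text {SO}}(3)\) and \({\text {Diff}}_+(S^2)\).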

In the following, we will describe two algorithms for calculating geodesics in the space of unparametrized surfaces \({\text {Imm}}(S^2, {\mathbb {R}}^3)/{\text {Diff}}_+(S^2)\): a joint optimization procedure and a coordinate descent approach, in which we alternate between minimizing over the space of parametrized surfaces and minimizing over the reparametrization group.

We will start by describing the joint optimization procedure, which is analogous to the optimization for parametrized surfaces with one caveat: Since Formula (8) only leads to diffeomorphisms near the identity, i.e., reparametrizations that map points on \(S^2\) to nearby points, we will describe large deformations of \(S^2\) as a composition of N such (small) deformations. This will lead us to iteratively solve the joint optimization problem. The corresponding algorithm is described in Algorithm 2.

figure g

As an alternative to the joint optimization, we will present in the following a coordinate descent method, where we separate the variables in the space of surfaces from the variables that govern the reparametrization of the initial surface, i.e., we alternate between calculating a discrete geodesic, denoted by \(f_{{\text {opt}}}\), between the parametrized surfaces \(f_1\) and \(f_2\) in the space of immersions \({\text {Imm}}(S^2, {\mathbb {R}}^3)\) and reparametrizing the initial surface \({{\bar{f}}}=f_1\). To update the reparametrization, we consider only the first two time points of \(f_{{\text {opt}}}\), i.e., \({{\bar{f}}}\) and \(f_{{\text {opt}}}(t_1)\) and define the following functional

$$\begin{aligned} F_r(X^v) = \left\| f_{{\text {opt}}}(t_1) - \bar{f}\circ \gamma \right\| _{{\bar{f}}\circ \gamma }^2, \end{aligned}$$
(12)

where \(\gamma \) is given by Formula (8) and \(X^v=(X^v_1, X^v_2,\ldots ,X^v_{{{\bar{L}}}})\). We can now employ a BFGS method to find the optimal coefficient vector \(X^v_{{\text {opt}}}\), compute \(\gamma \) using Formula (8) and then update \({{\bar{f}}} = \bar{f}\circ \gamma \). Then, we repeat this process by recalculating the geodesic in the space of parametrized surfaces (with the changed initial surface \({{\bar{f}}}\)). The whole optimization process is summarized in Algorithm 3.

figure h

3.3 Geodesics in the Space of Unparametrized Surfaces Modulo Rigid Motions

Note that the split metric (4) associates no cost with translation and thus the obtained geodesic is automatically in the space of surfaces modulo translations. To calculate the geodesic between two surfaces \([f_1]\) and \([f_2]\) in the space of unparametrized surfaces modulo rigid motions \({\text {Imm}}(S^2, {\mathbb {R}}^3)/\left( {\text {Diff}}_+(S^2)\times {\text {SO}}(3)\ltimes {\mathbb {R}}^3\right) \), we will need to minimize in addition over the rotation group, i.e., solve the optimization problem on \({\text {SO}}(3)\times {\text {Diff}}_+(S^2)\):

$$\begin{aligned} {\text {dist}}_{{\mathcal {S}}}([f_1], [f_2])&= \inf _{\begin{array}{c} R\in {\text {SO}}(3) \\ \gamma \in {\text {Diff}}_+(S^2) \end{array}}{\text {dist}}_{{\text {Imm}}}(f_1\circ \gamma , Rf_2), \end{aligned}$$

where [f] is the equivalence class of f under the actions of \({\text {Diff}}_+(S^2)\) and \({\text {SO}}(3)\) and \({\text {dist}}_{{\mathcal {S}}}\) denotes the distance function on the space \({\text {Imm}}(S^2, {\mathbb {R}}^3)/\left( {\text {Diff}}_+(S^2)\times {\text {SO}}(3)\ltimes {\mathbb {R}}^3\right) \).

Let \(\left\| \cdot \right\| , {\text {Coeff}}, S_i, \varDelta T\) be as in Sect. 3.1, and let \({{\bar{f}}} \) be the current parametrization of the first boundary surface. It is known that the group of rotations \({\text {SO}}(3)\) is a three-dimensional Lie group and the matrix exponential \(\exp \) from its Lie algebra \(\mathfrak {so}(3)\) is surjective. Since there is an isomorphism between \({\mathbb {R}}^3\) and \(\mathfrak {so}(3)\), the discrete optimization problem on the space of unparametrized surfaces modulo rigid motions will be minimizing the discrete functional \({\tilde{F}}:{\mathbb {R}}^{3+ {{\bar{L}}}+ L\times (T-2)}\rightarrow {\mathbb {R}}\) given by

$$\begin{aligned} {\tilde{F}}(X^R, X^v, {\text {Coeff}}) = \sum _{i =1}^{T}\left\| f_t(t_{i-1})\right\| ^2_{f(t_{i-1})}\varDelta T, \end{aligned}$$

where the discrete curve in this case is of the form

$$\begin{aligned} f(t_0)&= {{\bar{f}}}\circ \gamma ,\quad f(t_T) = \exp (X^R)f_2 \\ f(t_i)&= (1 - t_i)f_1 + t_if(t_T) + \sum _{j =1}^L{\text {Coeff}}(j, i)S_j, \end{aligned}$$

\(i = 1,\cdots , T-1\) and where the reparametrization \(\gamma \) is given by Formula (8) with coefficient vector \(X^v = (X^v_1, X^v_2,\ldots ,X^v_{{{\bar{L}}}})\). We will tackle this simpler (finite-dimensional) optimization problem using an approach analogous to the one in the previous section and will thus omit further details (Fig. 6).
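The parametrization of rotations by a vector \(X^R\in {\mathbb {R}}^3\) can be sketched with the closed form of the matrix exponential on \(\mathfrak {so}(3)\) (Rodrigues' formula). This is a minimal illustration; the identification of \({\mathbb {R}}^3\) with skew-symmetric matrices via the "hat" map is the standard one and is assumed here, and the function names are hypothetical.

```python
import math

def hat(w):
    """Map w in R^3 to the skew-symmetric matrix K with K v = w x v."""
    return [[0.0, -w[2], w[1]],
            [w[2], 0.0, -w[0]],
            [-w[1], w[0], 0.0]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def exp_so3(w):
    """exp(hat(w)) = I + sin(t)/t K + (1 - cos(t))/t^2 K^2, with t = |w|."""
    theta = math.sqrt(sum(c * c for c in w))
    K = hat(w)
    K2 = matmul(K, K)
    if theta < 1e-8:
        a, b = 1.0, 0.5   # limits of sin(t)/t and (1 - cos(t))/t^2 as t -> 0
    else:
        a = math.sin(theta) / theta
        b = (1.0 - math.cos(theta)) / theta ** 2
    return [[(1.0 if i == j else 0.0) + a * K[i][j] + b * K2[i][j]
             for j in range(3)] for i in range(3)]

# A quarter turn about the z-axis maps e_x to e_y.
R = exp_so3([0.0, 0.0, math.pi / 2.0])
image_of_ex = [R[i][0] for i in range(3)]

# R^T R should be the identity for every rotation.
Rt = [[R[j][i] for j in range(3)] for i in range(3)]
RtR = matmul(Rt, R)
```

With this parametrization the three entries of \(X^R\) are unconstrained real numbers, so they can be appended to the other optimization variables and handled by the same unconstrained solver.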

4 Experiments

Fig. 6
figure 6

Example of a geodesic in several resolutions: \(12\times 25\) (top), \(25\times 49\) (middle) and \(50\times 99\) (bottom) with respect to the split (1, 1, 0.1, 0) metric in the space of unparametrized surfaces \({\text {Imm}}(S^2,{\mathbb {R}}^3)/{\text {Diff}}_+(S^2)\), where \({\text {deg}}= 7, \overline{{\text {deg}}}= 7\) and \(T = 13\)

In this section, we will present examples of geodesics as calculated using our optimization procedures. The human body shapes have been kindly provided by Nil Hasler [15], and the hand shape is taken from SHREC07 watertight models. All other shapes are courtesy of the TOSCA shape database [8].

4.1 Geodesics and Karcher Mean

In Fig. 8, we present examples of geodesics between given surfaces in the space \({\text {Imm}}(S^2, {\mathbb {R}}^3)/{\text {Diff}}_+(S^2)\) with respect to the split (1, 1, 0.1, 0) metric and the corresponding evolutions of energies. In all our examples, we observed a good and relatively fast convergence of the optimization procedure, and we present some selected results of the resulting deformation and the corresponding computation times in Table 1. In Fig. 7, we present the Karcher mean of a family of cat surfaces with respect to the split (1, 1, 0.1, 0) metric in the space of unparametrized surfaces modulo rigid motions \({\text {Imm}}(S^2, {\mathbb {R}}^3)/({\text {Diff}}_+(S^2)\times {\text {SO(3)}}\ltimes {\mathbb {R}}^{3})\). One can observe that the mean captures the overall characteristics of the family of surfaces under consideration, but simplifies some of the features that undergo high variability. To calculate the Karcher mean, we followed an iterative algorithm as described, e.g., in [12] in the context of diffeomorphism-based shape analysis. In this method, one arbitrarily orders the data points as \(f_1,\ldots , f_n\) and lets \(q_1\) be the midpoint of the geodesic between \(f_1\) and \(f_2\). Then, for \(i = 2,\ldots ,n-1\), one defines \(q_{i}\) by travelling until time \(\frac{1}{i+1}\) on the geodesic that connects \(q_{i-1}\) to \(f_{i+1}\) in time 1. The approximation of the Karcher mean is then given by \(q_{n-1}\). We want to remark here that this method depends on the ordering of the data points (initialization), and in future work, we plan to further investigate this and compare it to other Karcher mean algorithms, such as directly minimizing the sum of squares of geodesic distances. All results were obtained on a standard laptop without any parallelization or GPU implementation, which could certainly be used to obtain a significant increase in speed (Fig. 8).
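The iterative mean can be sketched in a setting where geodesics are known in closed form. The example below assumes flat Euclidean space, where the geodesic between two points is the straight line; in the shape space each call to `geodesic_point` would instead require solving a geodesic boundary value problem. The recursion starts at the midpoint of the first two data points and then moves the running estimate a fraction \(1/k\) of the way toward the k-th data point. Function names are hypothetical.

```python
def geodesic_point(a, b, t):
    """Point at time t on the geodesic from a (t = 0) to b (t = 1).
    Assumption: flat space, so the geodesic is the straight line."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

def iterative_karcher_mean(data):
    """Running-mean recursion over an (arbitrarily) ordered list of data points."""
    q = geodesic_point(data[0], data[1], 0.5)          # midpoint of the first two
    for i in range(2, len(data)):
        # data[i] is the (i + 1)-th data point; travel until time 1 / (i + 1).
        q = geodesic_point(q, data[i], 1.0 / (i + 1))
    return q

data = [[0.0, 0.0, 0.0], [4.0, 0.0, 0.0], [0.0, 8.0, 0.0], [0.0, 0.0, 12.0]]
mean = iterative_karcher_mean(data)

# In flat space the result does not depend on the ordering of the data.
mean_reordered = iterative_karcher_mean(list(reversed(data)))
```

In flat space the recursion reproduces the arithmetic mean regardless of ordering. On a curved space the outcome generally depends on the ordering, which is the dependence noted in the text.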

Table 1 Numerical results of matching surfaces with different resolutions in time and space: low: \(12 \times 25\), \({\text {deg}}= \overline{{\text {deg}}}= 5\), \(T = 5\); middle: \(25\times 49\), \({\text {deg}}= 7\), \(\overline{{\text {deg}}}= 8\), \(T = 10\); and high: \(50 \times 99\), \({\text {deg}}= 9\), \(\overline{{\text {deg}}}= 11\), \(T = 15\). Here, Iter denotes the number of iterations until convergence in the optimization process
Fig. 7
figure 7

Karcher mean (middle) of a set of shapes of cats in the space \({\text {Imm}}(S^2, {\mathbb {R}}^3)/({\text {Diff}}_+(S^2)\times {\text {SO(3)}}\ltimes {\mathbb {R}}^{3})\) with respect to the split (1, 1, 0.1, 0) metric

Fig. 8
figure 8

Examples of geodesics with respect to the (1, 1, 0.1, 0) metric in the space of shapes \({\text {Imm}}(S^2, {\mathbb {R}}^3)/{\text {Diff}}_+(S^2)\), where we choose a resolution of \(50 \times 99\), a maximal degree of spherical harmonics \({\text {deg}}= \overline{{\text {deg}}}= 7\) and 13 timesteps, i.e., we search in an approximately 2205-dimensional space. The corresponding energy evolution for each example is shown on the bottom from left to right

Remark 5

The results in Table 1 suggest that our methods are well suited for multiresolution methods, i.e., to solve the geodesic matching problem first on a coarser resolution (in time, space and degree of spherical harmonics) and then use an upsampled version of the previously obtained solution as initial guess for solving a high-resolution version of the matching problem. Our numerical framework allows for these approaches in all available parameters, and in all our experiments, this procedure seems to allow for moderate improvements in the speed of the optimization. See Fig. 6 for an example of a multiresolution geodesic in spatial resolution.

4.2 Comparison to the SRNF Framework

Finally, we aim to compare the results obtained with our method to the results using the inversion of linear paths in the SRNF space. The SRNF metric corresponds to the split metric (4) with constants (0, 1/2, 1, 0), see “Appendix B.” To demonstrate this correspondence, we consider four pairs of boundary surfaces. We calculated the length of the linear path between each pair of surfaces under the split (0, 1/2, 1, 0) metric and the length of the image of the linear path under the SRNF framework. The relative errors between the lengths for different time step sizes are shown in Table 2 and demonstrate that these two metrics indeed coincide.

Table 2 Comparisons between the lengths of linear paths with respect to the split \((0,\frac{1}{2}, 1,0)\) metric and the lengths of the SRNF representations of the linear paths with respect to the \(L^2\) metric. \(L_l\): the length of linear path; \(L_{L_2}\): the length of the SRNF representation of the linear path with respect to the \(L^2\) metric

Since the image of the SRNF map is not convex in \(L^2\), the linear interpolation between two SRNFs may not have a preimage under the SRNF map. Also, even for functions that are in the image of the SRNF map, the inverse does not have an analytic expression; in fact, such an expression does not exist in general, since the SRNF map is not injective. As a way to overcome this difficulty, Laga et al. [25] introduced a numerical method to calculate an approximated inversion of any path between two given SRNFs. In practice, this has been used to approximate the geodesic by inverting the linear path between the given SRNFs. We want to remark here that the algorithm of [25] could also be used to invert a geodesic in the image of the SRNF map. However, calculating geodesics in the image of the SRNF map is a nontrivial process, which to the best of our knowledge has not yet been attempted. We would expect that this procedure would lead to minimizers that recover the minimizers obtained in the present framework. In Fig. 9, we consider two pairs of surfaces and calculate the geodesic between each pair of the boundary surfaces under the split (0, 1/2, 1, 0) metric with \({\text {deg}}= \overline{{\text {deg}}}= 7, T = 13\) in the space of unparametrized surfaces \({\text {Imm}}(S^2,{\mathbb {R}}^3)/{\text {Diff}}_+(S^2)\). The comparisons of these geodesics with the approximated inversions of the linear paths between the boundary surfaces are shown in Fig. 9. One can see that in the last row for the geodesic between the human body surfaces, the arms are shrinking at the beginning and then stretching, which may not be a desired deformation for some applications. However, by adjusting the coefficients of our metric, we could obtain geodesics with more natural behavior, see Fig. 10 for geodesics with respect to different choices of coefficients.

Fig. 9
figure 9

Comparisons of geodesics with respect to the split \((0,\frac{1}{2},1,0)\) metric and the approximated inversions of straight lines under the SRNF framework. Row 1, 3: the approximated inversions under the SRNF framework; Row 2, 4: geodesics under the split \((0,\frac{1}{2},1,0)\) metric in the space of parametrized surfaces

Fig. 10
figure 10

Geodesics between two human body surfaces in the space of unparametrized surfaces \({\text {Imm}}(S^2,{\mathbb {R}}^3)/{\text {Diff}}_+(S^2)\) with respect to two different choices of coefficients (0, 1, 1, 0) (top) and (1, 1, 0.1, 0) (bottom). In particular, in the deformation of the arms, one can observe the influence of the constants

In Table 3, we compare the lengths of geodesics for four pairs of surfaces in the space of parametrized surfaces \({\text {Imm}}(S^2,{\mathbb {R}}^3)\), the lengths of the approximated inversions (with seven time steps) under the split (0, 1/2, 1, 0) metric and the \(L^2\) differences between the SRNFs of the boundary surfaces. One can see from the table that for each pair of surfaces, the length of the geodesic is much closer to the \(L^2\) difference than the length of the approximated inversion of the straight line between the SRNFs of the boundary surfaces. Note that the \(L^2\) difference is a lower bound for the geodesic distance that will, in general, be strictly smaller than the true geodesic distance, as the image of the SRNF representation is not a totally geodesic (open) subspace of the space of all \(L^2\)-functions.

Table 3 Lengths of deformations with respect to the \((0,\frac{1}{2},1,0)\) metric between boundary surfaces with the maximal spherical harmonic degree of 7 and time step size of 25. \(L_l\): the length of the linear path between boundary surfaces; \(L_g\): the length of geodesic as calculated in our numerical framework; \(L_i\): the length of approximated inversion from SRNF straight line; and \(L^2\)-Diff: the \(L^2\) difference between the SRNFs of these boundary surfaces

5 Conclusion

In this article, we have introduced a family of elastic metrics on the space of parametrized surfaces in 3D space using a corresponding family of metrics on the space of vector-valued one-forms. For this class of metrics, we have provided a numerical framework for the computation of geodesics on the space of both parametrized and unparametrized surfaces. This new class of metrics generalizes a previously studied family of elastic metrics and includes, in particular, the Square Root Normal Field (SRNF) metric, which has proven successful in various applications. In the numerical experiments presented in Sect. 4, we have demonstrated our framework by showing several examples of geodesics and compared our results with earlier results obtained from the SRNF framework. Our framework does not require a numerical inversion of the SRNF map and thus overcomes some of the difficulties of previous work. Furthermore, it allows one to choose the constants of the metric in a data-driven way, which has potential importance in many applications. In future work, we plan to further demonstrate the viability of the proposed method in applications to real data. In addition, we are currently working toward developing a generalization of the SRNF map that will allow us to approximate the geodesic distance for our general class of metrics and will thus speed up the computation by choosing a better initial guess for the parametrization of the boundary surface.