1 Introduction

Shape analysis studies the objects in a scene that is a pivotal way to understand objects. The shapes are the boundaries of objects that express the external form or appearance of the objects that are invariant by the transformation of translation, rotation and scaling [15]. The goal of shape analysis is to find a metric distance, which is used to measure the dissimilarity between shapes. The related methods are widely used in computer vision and biomedical engineering [9, 12, 14, 16, 24].

Fig. 1
figure 1

Our framework for shape analysis

A common framework for shape analysis, firstly proposed in [17], is to construct a shape manifold, where each element corresponds to one shape. Usually, shape manifold is a kind of Riemannian manifold, where a metric distance between shapes is computed by a particular Riemannian metric along geodesic. The metric distance is a dissimilarity score in shape comparison. Furthermore, it is a minimization process to find a geodesic on shape manifold. This technique is used in shape deformation and shape synthesis restricted by the deformation energy minimization.

In [17], the shapes are considered as planar and closed curves and represented as their direction functions or curvature functions, which leads an infinite-dimensional Hilbert manifold for the shape space. A set of Fourier basis represent the tangent of shape manifold, and a shooting method finds geodesic on an infinite-dimensional manifold. The main drawback of this work is that it fails to find the geodesic for a kind of planar closed curves, i.e., the self-crossed curves. The work [22] mentioned this drawback and solved it using an elastic curves representation. Therefore, many works use elastic curve model to find a convenient shape representation that enables complete physical interpretations of shape deformations, i.e., a complete geodesics estimation on shape manifold. Work [25] uses a square-root velocity function (SRVF) to simplify the elastic metric that allows efficient computation on the shape manifold. Furthermore, the work [5] discusses the mathematical properties of the SRVF. As an extensions, the work [26] proposes a joint landmark-constrained elastic shape analysis for SRVF, which is often used in biological applications. Recently, efforts have focused on a discretization for the elastic metric about SRVF representation [2, 3]. Furthermore, work [11] uses Lie group to represent the elastic curve model instead of SRVF, which reduces the shape manifold as a finite-dimensional manifold. In general, these elastic curve models focus on the process of registration to solve the issue of complete geodesics estimation [8].

One work [6] finds a stretching channel in shape manifold for direction function representation, which illustrates a complete method to find geodesics. This work points out that some potential techniques about manifold can fix the drawback in [17].

In this paper, we simplify the shape manifold into a linear manifold where we find the geodesic as a linear problem, rather than directly solve the nonlinear problems on the shape manifold. Firstly, we formulate shapes as discrete, planar and closed curves and represent them by direction functions with discrete form. Secondly, the shape space is constructed as a finite dimensional manifold. Then, we equip the shape manifold with a frame bundle, whose frame in local space is a set of Fourier basis. Based on the property of action on frame bundle, the structure group for manifold acts on frames instead of coordinate so that we can construct a linear manifold only using the coordinates without structure group. There is a one-to-one correspondence between the elements on shape manifold and the linear manifold so that the projection between shape manifold and the linear manifold is named bijection. Finally, we find geodesic and measure its length in the linear manifold. Our framework for shape analysis is illustrated in Fig. 1 The main contributions of this paper are summarized as follows:

  1. 1.

    we use the frame bundle property to analyze the shape manifold.

  2. 2.

    we propose a complete geodesics estimation method for the shape manifold.

  3. 3.

    we simplify the nonlinear problems, the geodesic estimation and the geodesic measurement, as linear problems.

2 Methodology

2.1 Shape representation

The shape of an object’s boundary is expressed by a continuous and closed planar curve \(\alpha (s)\), \(s \in [0,2\pi ]\). In practice, the continuous curve \(\alpha (s)\) is formulated as a discrete curve-based representation with N sampled points after the observing by a digital camera [1]. The shapes are formulated as the discrete curves with parameter \(m = 0,1,2,\ldots , N\):

$$\begin{aligned} \alpha \left( \frac{2\pi m}{N}\right) = \left( \alpha _x\left( \frac{2\pi m}{N}\right) , \alpha _y\left( \frac{2\pi m}{N}\right) \right) , \end{aligned}$$
(1)

where \(\alpha _x\) and \(\alpha _y\) record the pixel coordinates. The discrete curves are simply denoted as \(\alpha [m] = (\alpha _x[m], \alpha _y[m])\). The shapes are a set of curves, which are invariant to translation, scaling and rigid rotation. The translation and scaling are removed as:

$$\begin{aligned} \alpha [m] = \frac{1}{h}(\alpha [m]-{\overline{\alpha }}), \end{aligned}$$
(2)

where \({\overline{\alpha }} = \frac{1}{N}\sum _{i=1}^{N}\alpha [i]\), \(h = \sqrt{\sum _{i=1}^{N}||\alpha [i]-{\overline{\alpha }}||}\) and \(||\cdot ||\) denotes the \(L^2\)-norm. A rotation R for planar curves is an element of SO(2). The rotation of curve \(\alpha [m]\) is denoted as \((R, \alpha [m]) = R\alpha [m]\). For two given curves \(\alpha _1[m]\) and \(\alpha _2[m]\), the optimal rotation \(R_o\) is directly written as:

$$\begin{aligned} R_o = \mathop {\arg \min }\limits _{R \in SO(2) } ||\alpha _1[m] - R\alpha _2[m]|| = UV^{T}, \end{aligned}$$
(3)

where \(U\varSigma V^{T}=svd(B)\), and \(B = \alpha _1[m] \alpha _2[m]^{T}\). If the \(\det (B)<0\), then the last column of \(V^{T}\) changes sign before multiplication. The \(R_{o}\alpha _2[m]\) is removed rotation transformation referred as \(\alpha _1[m]\).

We represent the curves \(\alpha [m]\) using their direction functions \(\theta [n]\):

$$\begin{aligned} \theta [i]=\arctan \frac{\alpha _{y}[i]-\alpha _{y}[i-1]}{\alpha _{x}[i]-\alpha _{x}[i-1]}, \, i = 1, \ldots ,N. \end{aligned}$$
(4)

Furthermore, the curve can be recovered from the direction function \(\theta [n]\) by

$$\begin{aligned} \beta [m] = \sum \limits _{i = 1}^{m} {{\mathrm{e}^{j\theta [i]}}}, \end{aligned}$$
(5)

where \(j=\sqrt{-1}\), the point \(\beta [0]\) is fixed at (0, 0). The closure condition \(\int _{0}^{2\pi }\mathrm{exp}(j\theta (s))\hbox {d}s = 0\) is rewritten as:

$$\begin{aligned} \frac{1}{N}\sum _{i=1}^{N}\mathrm{e}^{j\theta [i]} = 0, \end{aligned}$$
(6)

which means the closure condition is restricted by the mean value of \(\mathrm{e}^{j\theta [n]}\).

2.2 Manifold projection

2.2.1 Constructing shape manifold on frame bundle

Shape manifold is a kind of set \(\mathbf{S} \), whose elements are represented as \(\theta [n] \in \mathbb {R}^N\) and satisfied by the closure condition in Eq. (6). From the extrinsic perspective, the shape manifold is a subset of the \(\mathbb {R}^N\), i.e., \(\mathbf{S} \subset \mathbb {R}^N\). This relationship is shown in Fig. 2. As shown in Fig. 2, the shapes reside on a nonlinear surface in the \(\mathbb {R}^N\). The space of \(\mathbb {R}^N\) is spanned by Cartesian coordinate system with a set of orthonormal basis \(\{E_i\}\). In this space, the elements x are located by \(x = x^i E_i\), which is the Einstein summation convention [20]. The \(x^i\) is a real number, which is equal to the value \(\theta [i]\).

Fig. 2
figure 2

Illustration of the examples of the shape set in \(\mathbb {R}^N\)

From intrinsic perspective, we formulate the shape set \(\mathbf{S} \) as a shape manifold \({\mathcal {S}}\), which can be split into a set of linear spaces for local. Any two distinct points x, y on \({\mathcal {S}}\) are separated by their respective neighborhoods U and V. Each neighborhood is homeomorphic to an open set of a Euclidean space, which is denoted as \(\varphi _{_U}:~U \rightarrow \varphi _{_U}(U)\). The \((U; \varphi _{_U})\) is the coordinate chart of manifold.

On manifold, the element x in the neighborhood U is located by local coordinate defined as \(x^i = \varphi _{_U}(x)^i\), \(i=1, \dots , N\), where N is the dimension of the shape manifold. The pair of \((U; \theta ^i)\) is named local coordinate system of shape manifold.

The tangent space at the element x is denoted as \(T_{x} {\mathcal {S}}\). Canonically, the tangent space is expressed by a set of partial differential operators (tangent vectors) \(\{\frac{\partial }{\partial x^i}\}\) depending on the local coordinate system. It is an N-dimensional space, which is equal to the dimension of shape manifold.

From globally perspective, the shape manifold is one section of frame bundle \({\mathcal {P}}\), where the element x is identified as \((x^i; e_1, \dots , e_N)\) with its coordinate \(x^i\) and associated frame \((e_1, \ldots ,e_N)\). These frames are related with the tangent vectors, i.e.,

$$\begin{aligned} e_i = A_i^n \frac{\partial }{\partial x^n}, \quad n = 1, \dots , N, \end{aligned}$$
(7)

where \((A_i^n)\) is a non-degenerate matrix. The element on frame bundle \((x^i; A_i^n \frac{\partial }{\partial x^n})\) is simply denoted as \((x^i; A_i)\). For example, the tangent bundle is a section of the frame bundle. The elements on tangent bundle are denoted as \((x^i; I_i)\), where I is a identity matrix.

From extrinsic perspective, the elements of shape space belong to \(\mathbb {R}^N\), which are located by \((\theta ^i;E_i)\). Accordingly, the elements on tangent bundle are denoted as \((\theta ^i;I_i)\). In general, \((\theta ^i; E_i)\) is equal to \((\theta ^i; I_i)\).

2.2.2 Designing bijection between shape manifold and linear manifold

Let element \(x \in U_1 \bigcap U_2\), the coordinate are \(x_A\) and \(x_B\) for the chart \(U_1\) and \(U_2\), respectively. We have

$$\begin{aligned} g\cdot x_A = x_B, \end{aligned}$$
(8)

where \(g \in GL(N)\) is a structure group of manifold. On the frame bundle, we consider the coordinate and frames together, and Eq. (8) is rewritten as:

$$\begin{aligned} g\cdot (x_A^i;E_i) = (x_B^i;E_i). \end{aligned}$$
(9)

In [7], the author gives a description for left translation in frame bundle that left translation describes the structure of manifold. The left translation \(L_A = (A_i^n) \in GL(N)\) in \({\mathcal {P}}\) is defined as:

$$\begin{aligned} L_A(x^i_A; E_i)=(x^i_A; E^{\prime }_i), \end{aligned}$$
(10)

where \(E^{\prime }_i = L^i_A E_i\). Equation (10) indicates that the nonlinear structure \(L_A\) can act for the coordinate \(x^i\) or acts for the reference frames. It provides a way to simplify the nonlinear structure of shape manifold, when we focus on the structures between relative coordinates.

Based on the property of left translation in frame bundle, we rewrite Eq. (9) as

$$\begin{aligned} (x^i_A; g^i E_i) = (x^i_B;E_i). \end{aligned}$$
(11)

As we can see, the structure group g acts on the frame \(E_i\) instead of the relative coordinate \(x_A\).

2.2.3 Linear manifold construction

Finally, we build a linear manifold only using the relative coordinate \({\hat{\theta }}[k]\). Table 1 lists the coordinate systems of each spaces with differential perspectives. Consequently, we have defined a bijection between the \((\theta ^i; E_i)\) and \(({\hat{\theta }}^i; E_i)\).

Table 1 The associated coordinate systems during the designed bijection

2.3 Finding and measuring the geodesics

2.3.1 Finding the geodesic

On shape manifold, the shooting method is a classical method to find geodesic. We use the geodesic flow \(\varPhi (\theta ,\tau ,f)\) to formulate a shooting result, where the flow makes the curves satisfy the closure condition, the \(\theta \) is initial point, \(\tau \in [0,1]\) means the beginning point to the end point, and the f is the direction function for the geodesic. Finding geodesic is equal to minimize this cost function H(ab)

$$\begin{aligned} H(f)= \inf ||\varPhi (\theta _1,1,f)-\theta _2||^2. \end{aligned}$$
(12)

where \(f=\sum _{i=0}^{\infty }(a_i\cos (is)+b_i\sin (is))\) is the optimal direction function, \(s \in [0,2\pi ]\) means the unit circle, the \(\infty \) is approximated by a enough large positive integer m. A gradient-based optimization algorithm solves this optimal problem.

We rewrite Eq. (12) using the related coordinate and reference frame as:

$$\begin{aligned} H(a,b)= \inf ||\varPhi (\theta _1^i E_i,1,C^i F_i)-\theta _2^i E_i||^2, \end{aligned}$$
(13)

where \(C^iF_i\) denotes the \(\sum _{n=0}^{\infty }(a_n\cos (ns)+b_n\sin (ns))\), C is the related coordinate and F is the reference frames, i.e., a set of Fourier basis. Based on the discrete curve-based representation, the reference frames are denoted as:

$$\begin{aligned} F = (F_i^n) = \mathrm{e}^{j2\pi (i-1)\frac{n-1}{N}}, \end{aligned}$$
(14)

where F is a N dimensional matrix. The relative coordinate is computed by FFT algorithm:

$$\begin{aligned} {\hat{\theta }}[k] = \sum _{i=1}^{N}\theta [i]\cdot \mathrm{e}^{-j2\pi (k-1){\frac{i-1}{N}}}, \qquad k=1,\dots ,N. \end{aligned}$$
(15)

In addition, IFFT algorithm transforms \({\hat{\theta }}[k]\) to \(\theta [n]\):

$$\begin{aligned} \theta [i]={\frac{1}{N}}\sum _{k=1}^{N} {\hat{\theta }}[k]\cdot \mathrm{e}^{j2\pi (k-1){\frac{i-1}{N}}}, \qquad i=1, \dots , N. \end{aligned}$$
(16)

In Eq. (13), the optimization algorithm works in different frames that inhibits the algorithm performance. We use Eq. (9) to unify the frames in Eq. (13) so that the optimal issue Eq. (13) interprets as

$$\begin{aligned} {\hat{H}}(C^i)= \inf ||\varPhi ^{'}({\hat{\theta }}_1,1,C^i)-{\hat{\theta }}_2||^2. \end{aligned}$$
(17)

where \(C^i\) is a vector \([a_0,b_0,\ldots ,a_m,b_m]\). The nonlinear projection acts on the reference frame so that the \(\varPhi ^{'}\) is a linear projection. Obviously, \(a_n\) is the real part of \(({\hat{\theta }}_2 - {\hat{\theta }}_1)\) and \(b_n\) is the image part of \(({\hat{\theta }}_2 - {\hat{\theta }}_1)\). Furthermore, the proposed method gives an accurate number for Fourier basis during the optimal process, i.e., \(m = N/2\), N is the number of sample data.

Finally, we recover the geodesic by

$$\begin{aligned} \varPhi (\theta _1,\tau ,C^i F_i) = \varPhi (\mathbf{IFFT} (\varPhi ^{'}({\hat{\theta }}_1,\tau ,C^i))), \end{aligned}$$
(18)

where \(\mathbf{IFFT} (\cdot )\) is the IFFT algorithm in Eq. (16). In addition, Algorithm 1 summarizes the process of finding geodesic.

figure a

2.3.2 The geodesic measurement

Based on the bijection, we not only reduce the problem of finding the geodesic on shape manifold, but also simplify the computing of the length of geodesic. We denote the length of geodesic on shape manifold as \(d(\theta _1, \theta _2)\), and the length of geodesic on linear manifold as \(d({\hat{\theta }}_1, {\hat{\theta }}_2)\). The \(d(\theta _1, \theta _2)\) is equivalent to the \(d({\hat{\theta }}_1, {\hat{\theta }}_2)\) under the bijection, which is denoted as \(d(\theta , \theta ) \leftrightarrow d( {\hat{\theta }}, {\hat{\theta }} )\).

For any closed curve, the direction function \(\theta [n]\) takes reference to unit circle \(\theta _0[n]\),

$$\begin{aligned} \theta [n] = f[n]+\theta _0[n]. \end{aligned}$$
(19)

As a result, we remove the unit circle before computing the distance:

$$\begin{aligned} F(\theta [i]) \leftrightarrow F({\hat{\theta }}[i]) = \left| {\hat{\theta }}[i] - {\hat{\theta }}_0[i] \right| , \end{aligned}$$
(20)

where the \(F(\cdot )\) is the FFT algorithm, and the \(F(\theta [i])\) is named the feature of curve. Based on the feature, two curves \(\theta _1[n]\) and \(\theta _2[n]\) are distinguished by the distance:

$$\begin{aligned} \begin{aligned} d(\theta _1[n], \theta _2[n])&\leftrightarrow d( {\hat{\theta }}_1[n], {\hat{\theta }}_2[n] ) \\&= \sqrt{\sum \limits _{i = 1}^N { \left( F({\hat{\theta }}_1[i]) - F({\hat{\theta }}_2[i])\right) ^2} }. \end{aligned} \end{aligned}$$
(21)

3 Experiments

3.1 Geodesics illustration

The application of shape interpolation is a useful technique to show the found geodesic on shape manifold. We compare our work with the [17] to illustrate that our method finds the same geodesics between simple curves and our method can stably find the geodesics between simple curve and self-crossed curve. We evaluate these two works on Kimia database [17].

3.1.1 The geodesic for simple curves

The simple shapes are the closed curves that do not cross themselves. The goal is to illustrate the found geodesic on shape manifold using the shape interpolation. We compare our work with the work of [17], because these two works all use differential geometric representations for curves.

Fig. 3
figure 3

The illustration of shape interpolations between simple closed curves

To compare these two works, we illustrate the two kinds result of shape interpolation at the same position. The pairs of shapes are randomly chosen from Kimia database. Figure 3 illustrates some geodesics on shape manifold between several pairs of shapes, in which the shapes with solid lines come from our method and the results from Klassen et al. [17] are shown in the dotted lines. We observe that the two results mainly express the same process of shape interpolation. Consequently, our method is able to find the geodesic on shape manifold.

Fig. 4
figure 4

The illustration of shape interpolations from simple closed shapes to self-crossed shapes

Fig. 5
figure 5

Illustration of the H(ab) for different kind of geodesics

3.1.2 The geodesic for self-crossed curves

The self-crossed curves are another kind of curves on shape manifold. Our goal is to illustrate the geodesic between simple curves and self-crossed curves. We select a shape as the first element \(\theta _1[n]\), and the second element \(\theta _2[n]\) is the elephant and the plant.

Figure 4 shows two geodesics generated by Klassen et al.’s work and our method, in which the first element \(\theta _1[n]\) is the leftmost shape, and the rightmost shape is the second element \(\theta _2[n]\) found by the methods. As we can see, the shooting method utilized in [17] is not applicable for optimization in the case of self-crossed curves. And, our method is able to find the geodesic between the first and the second elements.

Figure 5 illustrates the cost function for the proposed method (blue line) and the method [17] (red line) changes with the m increasing. As we can see, the optimal algorithm used in the proposed method is more higher accuracy, lower complexity and better robustness (not east to fall into local optimum), especially for the geodesic between the simple curve and self-crossed curve.

Fig. 6
figure 6

Illustration of the transportation of shape deformation with parameter \(k=1\)

3.1.3 Geodesic transportation

Transporting deformation is one kind of shape predictions, which is framed as follows: When we have known one shape deformation under external force, we aim to apply the “same” deformation for the other shapes and predict the deformation process. The related works [10, 25] restricted the external force as the different viewing angle during their researches. In our work, we allow the external force caused by different viewing angle or other physical factors.

The mathematical statement of the transporting deformation is as follows: Let the known deformation be the geodesic between the unit circle \(\theta _0[n]\) and a required shape \(\theta _1[n]\) on shape manifold. The transporting deformation is that transporting the known geodesic to the other required shape \(\theta _2[n]\). The fundamental technical issue is to transport a known geodesic from one point to another required point on the shape manifold. It cannot simply apply the transporting on the shape manifold because the shape manifold is a nonlinear space, while this issue can be simplified using the designed bijection.

Depending on the designed bijection, the issue of transporting deformation on shape manifold is simplified as the vector transporting on the linear manifold. We formulate the transporting deformation on linear manifold as:

$$\begin{aligned} \varPhi ^{'}({\hat{\theta }}_2,i \varepsilon ,C^i) = {\hat{\theta }}_2[n] + i \varepsilon ({\hat{\theta }}_1[n]-{\hat{\theta }}_0[n]), \end{aligned}$$
(22)

where \(i = 0,1,\ldots ,k\), the part of \(i \varepsilon (\theta _1[n]-\theta _0[n])\) means the known deformation and the parameter k describes the degree of deformation. The result of transporting deformation is computed as \(\varPhi (\theta _2,i \varepsilon ,C^i)\).

Figures 6 and 7 show two kinds of transporting deformation, in which the first panel is the known deformation under some unknown external force and the second panel illustrates three results of transporting deformation. As we can see, the predict results in Figs. 6 and 7 are expected.

Fig. 7
figure 7

Illustration of the transportation of shape deformation with parameter \(k=6\)

3.1.4 Complexity analysis

Our method uses a linear interpolation to find geodesic in the linear manifold; thus, the computational complexity of obtaining geodesic with K steps on the linear manifold is O(k). Then, we use fast Fourier transform (FFT) algorithm to project the data onto the linear manifold. The computational complexity of projection is \(O(N*\log (N))\), where N is the number of samples on the curve. Thus, the total computational complexity for geodesic finding is \(O(kN*\log (N))\).

3.2 Shapes retrieval

Shape retrieval aims to rank shapes in a database according to their similarity, which is a widely used test for the performance of shape metric. In these experiments, every shape is represented by \(N = 50\) points for the direction function \(\theta [n]\). The step k is set as 4.

3.2.1 Flavia database

The Flavia database [27] includes 32 types of leaf species with 1907 leaves in total. Figure 8 shows the examples of the leaf shapes from the Flavia database. In [18], a leave-one-out test is proposed for the shape retrieval evaluation, in which every sample is removed from the database and used as a query against the rest of the database.

Fig. 8
figure 8

Examples of each types from Flavia database

Fig. 9
figure 9

The PR curves on the Flavia database

To evaluate the performance, we use the precision vs. recall curves (PR curves) and the mean average precision (MAP) as a performance measure. They are illustrated in Fig. 9 and listed in Table 2, respectively. The result of our Fourier transformation method is named as FT. In Table 2, we highlight our results and summarize the other results. The comparison results suggest that our method works the best among the mentioned methods in the Flavia database.

Table 2 The mean average precision (MAP) on the Flavia database

3.2.2 MPEG-7 database

MPEG-7 [19] contains 70 shape classes each containing 20 elements. It is a generic shapes dataset, where there are shape categories with variations of different view point and/or large deformations. The retrieval performance on the MPEG-7 dataset is measured by a “Bull’s eye score” method, which takes the overall percentage of retrieval results, among the first 40, that belong to the same class as the query. We compare with some classical feature-based method (IDSC [21], and SC [4]) and the manifold-based method (LeoGroup [11], and DF [17]) in the Table 3. Consequently, the feature-based methods get a higher score than the manifold-based methods. The proposed method has a best performance among the manifold-based methods. Furthermore, the Bull’s eye score of proposed method is close to the SC method that means the proposed method is a potential approach of manifold-based method.

Table 3 The Bull’s eye score on the MPEG-7 database
Table 4 MAP with a Gaussian noise of different SNR
Fig. 10
figure 10

PR curves under different noise magnitudes

3.2.3 Robustness

We evaluate the robustness of the proposed method to local perturbations on Flavia database that generally degrades the retrieval system performance. We add white Gaussian noise for the direction function \(\theta [n]\) with different signal-to-noise ratio (SNR), 50 db, 40 db, 30 db and 20 db, respectively. We also do a leave-one-out test scenario to test the retrieval system performance. Table 4 summarizes the MAP values, and Fig. 10 illustrates the respective PR curves. Consequently, the proposed method is robustness to local perturbations.

4 Conclusions

In this paper, we present a new geometry framework to solve the limitations on the shape manifold mentioned in [17]. The designed bijection reduces the nonlinear problems to the linear problems on a linear manifold. Based on the bijection, we project the geodesic on the shape manifold as a straight line on the linear manifold. In the shape interpolation, the comparison experiments illustrate that the proposed method is more robust. In addition, we illustrate our performance in the transportation of shape deformation and the shape retrieval. Consequently, the designed bijection projects the shape manifold as a linear manifold that simplifies the nonlinear issues on shape manifold.