1 Introduction

Shape analysis has been extensively studied in Computer Graphics and Computer Vision. One particular task is to measure similarity among shapes, which can be used effectively for shape understanding, matching, retrieval, editing, reconstruction, etc.

Most methods have focused on finding geometric criteria for extracting shape descriptors. Such approaches typically rely on a single generic rule (e.g., concavity, skeleton topology, approximate shape primitives) or a single feature (area, angular distance, shape diameter, curvature tensor, geodesic distance) to partition the input mesh. As a result, they usually lack efficiency and consistency for matching complex deformable shapes.

Our method combines a sparse sampling method with spectral correspondence for efficient and intrinsic shape analysis. Given a set of 3D shapes from different categories as input, we first extract a set of sample points with an efficient row sampling method to build a sparse representation of each shape, and then create an affinity matrix based on a minimum spanning tree (MST). We embed the set of weighted matrices into the spectral domain to find the intrinsic structural correspondence and build a pairwise matching matrix between every two meshes. Finally, we combine all the weighted matrices and pairwise matching matrices to reveal the structural consistency of the input set.

We have tested the proposed approach on various categories of shapes. The results demonstrate that our approach classifies shapes efficiently despite considerable changes in topology, stretching, pose variance, and incompleteness.

The paper is structured as follows: related work is presented in Section 2. Section 3 details the extraction of sample points, the construction of the sparse graph, and the transformation from the spatial domain into the spectral domain. Non-rigid shape matching based on sparse spectral correspondence is presented in Section 4. Experimental results are analyzed and discussed in Section 5. We conclude our work in Section 6.

2 Related work

2.1 Shape matching

An important line of work on shape matching is multidimensional scaling (MDS), proposed by Zigelman et al. [50] and Elad and Kimmel [16], who matched isometric shapes by embedding them into a Euclidean space and performing rigid shape matching there. Since it is generally impossible to embed a non-flat 2D manifold into a flat Euclidean domain without introducing some error, Jain et al. [20], Mateus et al. [26], and Sharma and Horaud [37] suggested alternative isometry-invariant shape representations using the eigendecomposition of discrete Laplace operators.

The spectrum of the Laplace operator provides rich descriptive feature vectors that are isometry-invariant, robust to local noise and sampling, shape-intrinsic, and multi-scale. Rustamov [33] proposed the Global Point Signature (GPS) for shape comparison by employing the discrete Laplace-Beltrami operator, which captures the shape’s geometry faithfully. Ovsjanikov et al. [29] constructed the Heat Kernel Signature (HKS) and heat kernel maps. Zaharescu et al. [48] suggested an extension of 2D descriptors to surfaces and used them for shape matching. Hu and Hua [19] used the Laplace-Beltrami operator for matching prominent features, and Dubrovina and Kimmel [14] suggested surface descriptors based on its eigendecomposition, combined with geodesic distances, in a quadratic optimization formulation of the matching problem. These methods, which incorporate pairwise constraints, tend to be slow due to their high computational complexity.

Mémoli and Sapiro [27, 28] and Bronstein et al. [5] compared shapes using different approximations of the Gromov-Hausdorff distance. Bronstein et al. [6] used diffusion geometry to match shapes with topological noise, and Thorstensen and Keriven [40] extended it to handle textured surfaces. The GMDS algorithm [5] leads to a non-convex optimization problem; it therefore requires good initialization to obtain meaningful solutions and is best used as a refinement step for other shape matching algorithms. Anguelov et al. [1] optimized a joint probabilistic model over the set of all possible correspondences to obtain a sparse set of corresponding points, and Tevs et al. [39] proposed a randomized algorithm for matching feature points based on the geodesic distances between them. Zhang et al. [49] performed matching using extremal curvature feature points and a combinatorial tree traversal algorithm, but its high complexity allowed them to match only a small number of points. Moreover, there is no guarantee that the result minimizes the difference between the pairwise geodesic distances of matched points. Instead of detecting a non-rigid mapping between two shapes, [21, 24, 30] search for a mapping from a shape to itself and are thus able to detect intrinsic symmetries.

2.2 Shape correspondence

A recent survey of shape correspondence methods can be found in [41]. The most well-known approach is the Iterative Closest Point (ICP) algorithm [2], which is used in [18] to align 3D shapes. However, this approach only computes a rigid transformation for shape correspondence.

Recent works, most notably the TPS-RPM method of Chui and Rangarajan [7], incorporate non-rigid deformations into the ICP framework, using thin-plate splines (TPS) to model the deformation. To perform non-rigid alignment, some approaches embed the shapes in a common representation space. The most popular methods are multidimensional scaling [50] and the spectral transform [16, 20, 26, 31, 32, 42, 44, 45]. Transforming a shape into the spectral domain yields a more intrinsic representation of the shape.

Computing a correspondence between shapes is one of the key problems that can benefit from semantically driven techniques: some works segment a mesh into parts, find analogies between these parts, and transfer information or part styles [35, 36, 38], while others use prior knowledge to learn how to label a shape or establish a correspondence [11, 22, 43]. Recently, symmetry-aware algorithms have been proposed to improve the accuracy of intrinsic correspondence [34, 46, 47].

Our work is inspired by the spectral matching methods of [7, 42]. However, these methods are computationally costly because of the large number of dense points in each shape. We therefore propose a sparse graph representation based on Lewis weight theory and the MST; using multi-scale eigenfunctions, it efficiently reveals the intrinsic structure shared by deformable shapes. Finally, we apply TPS to perform shape similarity estimation and shape matching.

As shown in Fig. 1, our method takes 3D mesh models as input. First, we combine row sampling by Lewis weights with a polynomial approximation method to obtain a set of sparse and stable key points that are robust to noise and pose variance. Second, a sparse graph is constructed over the sample points using an MST, from which the intrinsic affinity matrix is computed efficiently; the 3D mesh is then embedded in a sparse spectral space. Finally, the similarity between each pair of shapes is evaluated with the help of the TPS method. A series of experiments demonstrates the efficiency and accuracy of our method in similarity measurement and shape classification.

Fig. 1

Our method applies a two-stage sample-based approximation method to obtain stable key points and constructs a sparse graph; the efficient similarity is evaluated by a sparse spectral correspondence with the help of TPS

3 Sparse graph representation

3.1 Sparse sampling

Randomized sampling is an important tool in the design of efficient algorithms. A random subset often preserves key properties of the entire data set while allowing one to run algorithms on a much smaller sample. Recently, row sampling of matrices has received much attention in numerical linear algebra [8, 10, 12, 13, 23].

  • Theorem 1. (Row sampling) For an n × d matrix A where n ≫ d and any error parameter ε > 0, we can find a matrix A′ consisting of a few (rescaled) rows of A such that

$$ {\left\Vert \mathrm{Ax}\right\Vert}_p{\approx}_{1+\varepsilon }{\left\Vert {\mathrm{A}}^{\prime}\mathrm{x}\right\Vert}_p $$
(1)

for all vectors \( x\in {R}^d \). Here \( {\approx}_{1+\varepsilon} \) denotes a multiplicative error between \( {\left(1+\varepsilon \right)}^{-1} \) and \( \left(1+\varepsilon \right) \). Let n and d denote the number of rows and columns respectively, and let \( {a}_i \) denote the vector corresponding to the i-th row of A. A crucial definition in ℓ2 row sampling and matrix concentration bounds is the statistical leverage score \( {\tau}_i \):

$$ {\tau}_i(A)\overset{def}{=}{a}_i^T{\left({A}^TA\right)}^{-1}{a}_i={\left\Vert {\left({A}^TA\right)}^{-1/2}{a}_i\right\Vert}_2^2 $$
(2)
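
As a minimal illustrative sketch (not taken from the paper), the leverage scores of Eq. (2) can be computed with NumPy via a thin QR factorization, which avoids forming \( {\left({A}^TA\right)}^{-1} \) explicitly:

```python
import numpy as np

def leverage_scores(A):
    """Statistical leverage scores of Eq. (2): tau_i = ||(A^T A)^{-1/2} a_i||_2^2.

    If A = QR with Q orthonormal, then tau_i(A) equals the squared norm of
    the i-th row of Q, so no explicit inverse is needed.
    """
    Q, _ = np.linalg.qr(A)          # thin QR: Q is n x d with orthonormal columns
    return np.sum(Q ** 2, axis=1)   # squared row norms = leverage scores
```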

The well-known sampling results for general ℓp norms [8, 9, 10, 13] are based on a “change of density” construction originally due to Lewis [23]. This construction assigns a weight wi, analogous to a leverage score, to each row ai; these weights can be used directly as sampling probabilities.

  • Theorem 2. (Lewis weights) For a matrix A and norm p, the ℓp Lewis weights \( \overline{w} \) are the unique weights such that for each row i we have \( {\overline{w}}_i={\tau}_i\left({\overline{W}}^{1/2-1/p}A\right) \), or equivalently \( {a}_i^T{\left({A}^T{\overline{W}}^{1-2/p}A\right)}^{-1}{a}_i={\overline{w}}_i^{2/p} \) [9].

Note that for p = 2, \( {\overline{W}}^{1/2-1/p} \) is the identity matrix, so the Lewis weights are just the leverage scores.
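
One standard way to compute Lewis weights is the fixed-point iteration of [9], sketched below using the equivalent form given in Theorem 2; the function name and iteration count are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def lewis_weights(A, p=2.0, iters=20):
    """Approximate lp Lewis weights via the fixed-point iteration
        w_i <- (a_i^T (A^T W^{1-2/p} A)^{-1} a_i)^{p/2},
    which converges for 0 < p < 4. For p = 2 a single step reproduces
    the leverage scores."""
    w = np.ones(A.shape[0])
    for _ in range(iters):
        G = A.T @ ((w[:, None] ** (1.0 - 2.0 / p)) * A)   # A^T W^{1-2/p} A
        Ginv = np.linalg.inv(G)
        q = np.einsum('ij,jk,ik->i', A, Ginv, A)          # a_i^T G^{-1} a_i for all i
        w = np.maximum(q, 1e-12) ** (p / 2.0)
    return w
```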

Drineas et al. [12] proposed a sampling algorithm and coresets for ℓp regression based on “Lewis weight” theory, providing a well-conditioned basis that yields an exponentially better condition number.

  • Theorem 3. (ℓp regression problem) Let \( {\left\Vert \cdot \right\Vert}_p \) denote the p-norm of a vector. Given as input a matrix \( A\in {R}^{n\times m} \), a target vector \( b\in {R}^n \), and a real number p ∈ [1, ∞), find a vector \( {x}_{OPT} \) and a number Z such that

$$ Z=\underset{x\in {R}^m}{\min }{\left\Vert Ax-b\right\Vert}_p={\left\Vert A{x}_{OPT}-b\right\Vert}_p $$
(3)
  • Theorem 4. (well-conditioned basis) Let A be an n × d matrix of rank d, let p ∈ [1, ∞), and let q be its dual norm, where 1/p + 1/q = 1. An n × d matrix U is an (α, β, p)-well-conditioned basis for the column space of A if the columns of U span the column space of A and (1) \( {\left\vert \left\Vert U\right\Vert \right\vert}_p\le \alpha \), and (2) for all \( z\in {R}^d \), \( {\left\Vert z\right\Vert}_q\le \beta {\left\Vert Uz\right\Vert}_p \). We say that U is a p-well-conditioned basis for the column space of A if α and β are \( {d}^{O(1)} \), independent of m and n. If p < 2, then \( \alpha ={d}^{\frac{1}{p}+\frac{1}{2}} \) and β = 1; if p = 2, then \( \alpha ={d}^{\frac{1}{2}} \) and β = 1; and if p > 2, then \( \alpha ={d}^{\frac{1}{p}+\frac{1}{2}} \) and \( \beta ={d}^{\frac{1}{q}-\frac{1}{2}} \) [12].

The significance of a p-well-conditioned basis is that it minimizes the variance of the sampling process: rows of the matrix A and elements of the vector b are sampled randomly according to a probability distribution that depends on the norms of the rows of U. This preserves the subspace structure of span(A) and thus achieves a (1 + ε)-approximation to the ℓp regression problem more efficiently.

In this paper we adopt the ℓp regression algorithm of [12] and Lewis weight theory [9] to efficiently extract key sample points from a 3D mesh model.

We define a polynomial surface and compute the importance of each point of the mesh for surface approximation by solving an ℓ2 regression problem. The importance of each point is used as its sampling probability when extracting key sample points. We describe the two-stage approximation algorithm and present sampling results that demonstrate the efficiency and robustness of our sampling method.

3.2 Sparse sampling by Lewis weights

An important question in algorithmic theory is whether there exists a small subset of the input such that if computations are performed only on this subset, the solution to the given problem can be approximated well.

In this paper, we adopt a two-stage sampling-based approximation algorithm to obtain the key sample points from a mesh model [12]. The first stage of the algorithm suffices to obtain a constant-factor approximation to a polynomial surface. The second stage carefully uses the output of the first stage to construct a coreset and achieve a (1 + ε)-approximation; the resulting weights can be treated as sampling probabilities.

Assume that we have a set of points A = {(x1, y1, z1), …, (xn, yn, zn)} and an approximating polynomial surface f(x, y) of degree d:

$$ \mathrm{z}=f\left(x,y\right)={\sum}_{i+j\le d}{c}_{i,j}{x}^i{y}^j $$
(4)

All such polynomials constitute a linear space Sd:

$$ {\mathrm{S}}_{\mathrm{d}}:= \left\{f\left(x,y\right)|\deg f\left(x,y\right)\le d\right\} $$
(5)

The distance between the point set A and the surface f is defined as:

$$ distance\left(\mathrm{A},f\right):= \sqrt{\sum \limits_{i=1}^n{\left({z}_i-f\left({x}_i,{y}_i\right)\right)}^2} $$
(6)

Then there exists a unique f0(x, y) that satisfies:

$$ distance\left(\mathrm{A},{f}_0\right):= \underset{f\in {S}_d}{\min}\; distance\left(\mathrm{A},f\right) $$
(7)

The greater the value of d, the better the surface f0(x, y) approximates the set A. We can now rewrite distance(A, f) in matrix form:

$$ distance\left(\mathrm{A},f\right)=\sqrt{\sum_{i=1}^n{\left({z}_i-f\left({x}_i,{y}_i\right)\right)}^2}=\sqrt{\sum_{i=1}^n{\left({z}_i-{\sum}_{j+k\le d}{c}_{j,k}{x}_i^j{y}_i^k\right)}^2} $$

Define

$$ \mathrm{B}=\left(\begin{array}{ccccccccccc}1& {x}_1& {y}_1& {x}_1^2& {x}_1{y}_1& {y}_1^2& \dots & {x}_1^d& {x}_1^{d-1}{y}_1& \dots & {y}_1^d\\ {}1& {x}_2& {y}_2& {x}_2^2& {x}_2{y}_2& {y}_2^2& \dots & {x}_2^d& {x}_2^{d-1}{y}_2& \dots & {y}_2^d\\ {}\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\ {}1& {x}_n& {y}_n& {x}_n^2& {x}_n{y}_n& {y}_n^2& \dots & {x}_n^d& {x}_n^{d-1}{y}_n& \dots & {y}_n^d\end{array}\right)\kern0.5em \mathrm{C}=\left(\begin{array}{c}{c}_{0,0}\\ {}{c}_{1,0}\\ {}\vdots \\ {}{c}_{0,d}\end{array}\right)\kern0.5em \mathrm{Z}=\left(\begin{array}{c}{z}_1\\ {}{z}_2\\ {}\vdots \\ {}{z}_n\end{array}\right) $$
(8)

Now the distance(A, f) can be written as:

$$ distance\left(\mathrm{A},f\right)={\left\Vert \mathrm{BC}-\mathrm{Z}\right\Vert}_2 $$
(9)

The matrix C collects the coefficients of the polynomial surface f in order, and we have:

$$ \underset{f\in {S}_d}{\min } distance\left(\mathrm{A},f\right)\kern0.5em =\underset{\mathrm{C}\in {R}_{\frac{\left(d+1\right)\left(d+2\right)}{2}}}{\min }{\left\Vert \mathrm{BC}-Z\right\Vert}_2 $$
(10)

Here \( {R}^{\frac{\left(d+1\right)\left(d+2\right)}{2}} \) is the \( \frac{\left(d+1\right)\left(d+2\right)}{2} \)-dimensional Euclidean space.
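
For reference, a minimal NumPy sketch of Eqs. (8)–(10): B is assembled column by column in the monomial order of Eq. (8), and the ℓ2 regression is solved directly (the paper instead approximates this solve by row sampling, as described next; function names are illustrative):

```python
import numpy as np

def design_matrix(points, d):
    """Matrix B of Eq. (8): one column per monomial x^i y^(s-i),
    ordered by total degree s, giving (d+1)(d+2)/2 columns."""
    x, y = points[:, 0], points[:, 1]
    cols = [x ** i * y ** (s - i) for s in range(d + 1) for i in range(s, -1, -1)]
    return np.column_stack(cols)

def fit_surface(points, d=6):
    """Degree-d least-squares surface z = f(x, y): min_C ||BC - Z||_2 (Eq. 10)."""
    B = design_matrix(points, d)
    Z = points[:, 2]
    C, *_ = np.linalg.lstsq(B, Z, rcond=None)
    return B, C
```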

Ideally, there would exist a C0 satisfying BC0 = Z, making all points of A lie on the polynomial surface f0(x, y). For a fixed d, however, this is an over-determined problem, since the number of rows of the matrix B is usually far larger than the number of columns, i.e., n ≫ d. This means that a few rows of the matrix B play an important role in the ℓp regression problem described in Theorem 3. Hence we can use row sampling to solve this approximation problem.

In this paper we use a two-stage row sampling algorithm to sample the matrix; the corresponding optimal solution lies within a factor between (1 + ε)−1 and (1 + ε) of the optimum [9, 12].

Given a mesh M with n points and an approximating polynomial surface f, we build the matrix B (Eq. 8). The row sampling algorithm takes as input the n × m matrix B of rank d, a target vector \( z\in {R}^n \), and a number p (p = 2 corresponds to the ℓ2 regression problem). It returns a Lewis weight \( {q}_i \) for each row, which is used as its sampling probability.

Algorithm 1: Row sampling by Lewis Weights


The algorithm first computes a p-well-conditioned basis U for span(B), as described in Theorem 4. Then it uses the norms of the rows of U to sample constraints from the input ℓp regression problem. In particular, roughly \( O\left({d}^{p+1}\right) \) rows of B, and the corresponding elements of z, are randomly sampled according to the probability distribution given by

$$ {p}_i=\min \left\{1,\frac{{\left\Vert {U}_{i\ast}\right\Vert}_p^p}{{\left\vert \left\Vert U\right\Vert \right\vert}_p^p}{r}_1\right\},\mathrm{where}\;{r}_1=16\left({2}^p+2\right){d}^k\left(d\ln \left(8\cdot 12\right)+\ln (200)\right). $$
(11)

In our method, we choose p = 2 and solve a ℓ2 regression problem. We use S to denote the sampling matrix for the first-stage sampling probability.

In the second stage, we use the output from the first stage to refine the sampling probability.

$$ {q}_i=\max \left\{{p}_i,\frac{{\left|{\widehat{\rho}}_i\right|}^p}{{\left\Vert \widehat{\rho}\right\Vert}_p^p}{r}_2\right\},\mathrm{where}\;{r}_2=\frac{150\cdot {24}^p{d}^k}{\varepsilon^2}\left(d\ln \left(\frac{280}{\varepsilon}\right)+\ln (200)\right). $$
(12)

The factor \( {q}_i \), the Lewis weight of each row, is thus obtained; it represents the importance of that row for the approximation.
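
A sketch of how the two stages could be realized for p = 2, with r1 and r2 taken as the constants of Eqs. (11) and (12); the orthonormal QR factor plays the role of the well-conditioned basis U of Theorem 4. This is an illustrative reading of [12], not the authors' code:

```python
import numpy as np

def two_stage_weights(B, Z, r1, r2, p=2):
    """Two-stage sampling probabilities of Eqs. (11)-(12) for p = 2."""
    # Stage 1: for p = 2 the orthonormal factor of a QR decomposition is a
    # well-conditioned basis for span(B) (Theorem 4).
    U, _ = np.linalg.qr(B)
    norms = np.sum(np.abs(U) ** p, axis=1)
    p_i = np.minimum(1.0, norms / norms.sum() * r1)                  # Eq. (11)

    # Constant-factor solution from the stage-1 sample (rows rescaled by p_i^{-1/p})
    keep = np.random.rand(B.shape[0]) < p_i
    scale = p_i[keep] ** (-1.0 / p)
    C1, *_ = np.linalg.lstsq(scale[:, None] * B[keep], scale * Z[keep], rcond=None)

    # Stage 2: refine using the residuals of the stage-1 solution
    rho = Z - B @ C1
    q_i = np.maximum(p_i, np.abs(rho) ** p / np.sum(np.abs(rho) ** p) * r2)  # Eq. (12)
    return np.minimum(q_i, 1.0)
```

The points are then ranked by \( {q}_i \) and the top fraction τ kept as key samples, as described below.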

The ultimate goal of row sampling is to evaluate the importance of each point to the approximated surface. As expected, the approximation improves as the degree d grows (see Fig. 2). We choose d = 6 as a trade-off between accuracy and cost in our experiments.

Fig. 2

The polynomial surface of degree d (d = 4, 5, 6) and the approximation errors

All the points are reordered by Lewis weight \( {q}_i \) in descending order, and the key sample points can then be selected according to the user’s choice. In our experiments we set ε = 0.01; as shown in Fig. 3a, key sample points (black balls) are extracted with different sparse ratios τ = 0.1, τ = 0.2, and τ = 0.5.

Fig. 3

The comparison of sampling results. (a) Key point extraction with sample rates τ = 0.1, τ = 0.2, τ = 0.5 by our method. (b) Sampling results based on SGS [17], MLS [3], PD [4], and ours. (c) Sampling results on different poses of human models and with Gaussian noise (δ = 0.0015)

Evaluating sampling results usually requires two kinds of metrics: the control metric r and the distance metric dm. The control metric r reflects user preferences for the sampling pattern, such as density and adaptivity. The distance metric dm reflects properties of the underlying sample space Ω, essentially defining the distance between any pair of samples within Ω [15]; it is usually used to evaluate the stability of a sampling method.

Here we define the distance d(s1, s2) between a pair of sample points via the surface geodesic, and let r be the histogram distribution of geodesic distances, which indicates the sampling density.

We normalize the geodesic distances between all pairs of sample points and divide them into five ranges; the distribution of sample points in each range is calculated as shown in the histogram in Fig. 3a. Our method obtains a similar density distribution r of geodesic distances even with different sparse ratios τ, which effectively demonstrates the stability of the sample space.

Figure 3b compares sampling results: our sample points exhibit a better structural distribution than those based on the salient geometric sampling (SGS) method [17] and the MLS method [3], and they achieve similar simplification with fewer sample points than the Poisson disk (PD) method [4, 15]. In Fig. 3c we can see that our sampling method is robust to Gaussian noise (δ = 0.0015) and that the distribution density of sample points on the human model with sparse ratio τ = 0.3 is very stable.

3.3 Sparse graph construction with MST

We build a sparse weighted graph representation of each mesh model from its key sample points using the minimum spanning tree (MST) algorithm.

Given a mesh model G = 〈V, E〉, we construct the shortest path \( {P}_{ij}^{(k)}=\underset{k}{\underbrace{{v}_i,\dots, {v}_j}}\in \mathrm{V} \) for each pair of key sample points (vi, vj) (see Fig. 4), where k is the number of points on this shortest path; the weight of this path is then evaluated as \( {w}_{ij}^{(k)} \):

$$ {w}_{ij}^{(k)}=\left\{\begin{array}{ll}1& i=j\\ {}d{t}_{ij}& k=0;\kern0.5em i\ne j\\ {}\min \left(d{t}_{ij}^{\left(k-1\right)},d{t}_{ik}^{\left(k-1\right)}+d{t}_{kj}^{\left(k-1\right)}\right)& k\ge 1;\kern0.5em i\ne j\end{array}\right. $$
(13)
Fig. 4

The sample points and the sparse graphs with MST (n = 800; n = 200) based on different sparse ratios: τ = 0.7; τ = 0.3

Here dtij is the Euclidean distance between adjacent points (vi, vj), and \( \min \left(d{t}_{ij}^{\left(k-1\right)},d{t}_{ik}^{\left(k-1\right)}+d{t}_{kj}^{\left(k-1\right)}\right) \) denotes iteratively accumulating the distances between adjacent points along the shortest path between the two sample points (vi, vj). Based on the MST, a sparse weighted graph G′ = 〈V′, E′, W〉 and its affinity matrix W are finally constructed.

Algorithm 2: Sparse graph construction with MST.


Fig. 4 shows the key sample points obtained by Lewis weights and the sparse graph constructed using the MST.
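
A minimal SciPy sketch of this construction, assuming the mesh is given as an edge list with Euclidean edge lengths (array names are illustrative):

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import dijkstra, minimum_spanning_tree

def sparse_graph(edges, lengths, n_vertices, sample_idx):
    """Shortest paths between key sample points over the mesh (Eq. 13),
    followed by an MST of the resulting weighted graph on the samples."""
    # Mesh connectivity as a symmetric sparse adjacency matrix; `edges` is an
    # (m, 2) array of vertex index pairs, `lengths` their Euclidean lengths dt_ij.
    G = csr_matrix((lengths, (edges[:, 0], edges[:, 1])),
                   shape=(n_vertices, n_vertices))
    G = G.maximum(G.T)

    # Shortest-path distances from every sample point to all vertices
    D = dijkstra(G, directed=False, indices=sample_idx)
    W = D[:, sample_idx]                        # pairwise sample-to-sample weights

    mst = minimum_spanning_tree(csr_matrix(W))  # sparse graph G' = <V', E', W>
    return W, mst
```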

Since the Laplace-Beltrami spectrum reveals the intrinsic structure and morphology of a deformable shape, it provides a robust tool for shape analysis [19, 29, 33, 48]. We therefore embed our MST into the spectral space to build a sparse spectral representation and find intrinsic similarities among shapes.

The affinity matrix W of the MST is eigendecomposed as W = QΛQT, where Λ is a diagonal matrix whose diagonal elements (λ1 ≥ ... ≥ λn) are the eigenvalues sorted in descending order, and Q = [v1 ··· vn] is an n × n matrix whose columns v1, ..., vn are the eigenvectors corresponding to those eigenvalues.

We choose the eigenvectors Qk corresponding to the k largest eigenvalues. The row vectors of Qk can be regarded as the embedding coordinates of the corresponding points in the k-dimensional spectral domain.
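
A minimal sketch of this embedding step, assuming a symmetric affinity matrix W as produced above:

```python
import numpy as np

def spectral_embedding(W, k):
    """Eigendecompose W = Q Lambda Q^T and keep the eigenvectors of the k
    largest eigenvalues; each row of the result is the spectral coordinate
    of one sample point."""
    vals, vecs = np.linalg.eigh(W)        # symmetric W; eigenvalues ascending
    order = np.argsort(vals)[::-1]        # reorder descending, as in the paper
    return vecs[:, order[:k]]             # n x k embedding coordinates
```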

Figure 5 shows the 2-dimensional eigenmaps of a human model (using the eigenvectors of the 2nd and 3rd largest eigenvalues) with 100% and 30% of the sample points. Our sparse spectrum preserves the structural description well.

Fig. 5

Human model and its 2-D spectral eigenmap with different sample points (τ = 1.0, τ = 0.3)

4 Shape correspondence based on sparse spectral embedding

The purpose of transforming the 3D mesh from the spatial domain to the k-D spectral domain is to attain invariance to isometric transformation and uniform scaling, as well as robustness to differences in mesh sizes.

In this work, we match each pair of sparse graphs constructed above in the spectral domain using a TPS transformation [7].

Given two sparse spectral embeddings \( \widehat{A} \) and \( \widehat{B} \), the transformation function \( g:{\widehat{A}}_i\in {R}^k\to {\widehat{B}}_j\in {R}^k \) maps one point set in the k-dimensional spectral space to the other by minimizing the following energy function:

$$ E(g)=\sum \limits_{j=1}^n\left\Vert {\widehat{A}}_i-g\left({\widehat{B}}_j\right)\right\Vert +\beta \iint \left[{\left(\frac{\partial^2g}{\partial {x}^2}\right)}^2+2{\left(\frac{\partial^2g}{\partial x\partial y}\right)}^2+{\left(\frac{\partial^2g}{\partial {y}^2}\right)}^2\right] dxdy $$
(14)

where β is the regularization (smoothing) parameter. Assuming a correspondence between \( \widehat{A} \) and \( \widehat{B} \) in which the point \( {\widehat{B}}_j \) is the matching point for \( {\widehat{A}}_i \), the unique g that minimizes the above energy function has the form:

$$ g\left({\widehat{B}}_j,t,w\right)={\widehat{B}}_j\cdot t+\sum \limits_{i=1}^m{w}_i\cdot \varphi \left(\left\Vert \left({x}_i,{y}_i\right)-\left({x}_j,{y}_j\right)\right\Vert \right) $$
(15)

where t is a (k + 1) × (k + 1) affine coefficient matrix and w is an n × (k + 1) non-rigid warping coefficient matrix. φ is the TPS kernel, \( \varphi (r)={r}^2\log {r}^2 \), with \( r=\left\Vert \left({x}_i,{y}_i\right)-\left({x}_j,{y}_j\right)\right\Vert \). The coefficients t and w can be calculated from

$$ \left[\begin{array}{l}K\kern0.5em {\widehat{B}}^T\\ {}\begin{array}{cc}\widehat{B}& 0\end{array}\end{array}\right]\left[\begin{array}{l}w\\ {}t\end{array}\right]=\left[\begin{array}{l}\widehat{A}\\ {}0\end{array}\right] $$
(16)

where Kij = φ(‖(xi, yi) − (xj, yj)‖).
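
A minimal NumPy sketch of solving this linear system in the standard TPS block arrangement; the homogeneous coordinate column and the small regularization added to K are conventional choices, not prescriptions from the paper:

```python
import numpy as np

def tps_fit(A_hat, B_hat, beta=1.0):
    """Solve a TPS system of the form of Eq. (16) for the warp (w, t)
    mapping the embedding B_hat onto A_hat."""
    n, k = B_hat.shape
    Bh = np.hstack([np.ones((n, 1)), B_hat])            # homogeneous coords, n x (k+1)

    # TPS kernel K_ij = phi(||b_i - b_j||) with phi(r) = r^2 log r^2
    r = np.linalg.norm(B_hat[:, None, :] - B_hat[None, :, :], axis=-1)
    K = np.where(r > 0, r ** 2 * np.log(r ** 2 + 1e-12), 0.0)
    K += beta * np.eye(n)                               # smoothing regularization

    # Block system [[K, Bh], [Bh^T, 0]] [w; t] = [A_hat; 0]
    top = np.hstack([K, Bh])
    bot = np.hstack([Bh.T, np.zeros((k + 1, k + 1))])
    rhs = np.vstack([A_hat, np.zeros((k + 1, A_hat.shape[1]))])
    sol = np.linalg.solve(np.vstack([top, bot]), rhs)
    return sol[:n], sol[n:]                             # w (n x k), t ((k+1) x k)
```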

Using this transformation, we map the point set \( \widehat{A} \) onto the point set \( \widehat{B} \), and the process is iterated until convergence. We have found experimentally that 5 to 10 iterations of the alignment are sufficient to align the embeddings. The regularization parameter β is set to the mean distance between all embedded point pairs. Finally, we obtain the distance of each matching point pair \( \left({a_i}^{\left({M}_1\right)},{{b_i}^{\left({M}_2\right)}}_{C(i)}\right) \), where \( {a_i}^{\left({M}_1\right)} \) is the i-th vertex of spectral embedding \( \widehat{A} \), \( {{b_i}^{\left({M}_2\right)}}_{C(i)} \) is the C(i)-th vertex of spectral embedding \( \widehat{B} \), and C is the correspondence found by our algorithm.

We use the sum of squared differences (SSD) and the correlation coefficient (CC) as evaluation measures for shape matching (Eqs. 18, 19). A larger SSD indicates a worse correspondence, whereas a larger CC indicates better shape similarity. The matching accuracy sim(Mi, Mj) between two mesh models can then be computed (Eq. 20).

$$ {C}_i=\arg {\min}_j\left\Vert {\widehat{A}}_i-{\widehat{B}}_j\right\Vert; $$
(17)
$$ SSD={\left(\sum \limits_{i=1}^N{\left({a_i}^{\left({M}_1\right)}-{{b_i}^{\left({M}_2\right)}}_{C(i)}\right)}^2\right)}^{1/2}/N; $$
(18)
$$ CC=\frac{\sum \limits_{i=1}^N\left({a}_i^{\left({M}_1\right)}-{\overline{\alpha}}^{\left({M}_1\right)}\right)\left({b_i^{\left({M}_2\right)}}_{C(i)}-{{\overline{\beta}}^{\left({M}_2\right)}}_{C(i)}\right)}{{\left(\sum \limits_{i=1}^N{\left({a}_i^{\left({M}_1\right)}-{b_i^{\left({M}_2\right)}}_{C(i)}\right)}^2\right)}^{1/2}} $$
(19)

where \( \overline{\alpha} \) and \( \overline{\beta} \) are the mean distances of \( \widehat{A} \) and the transformed \( \widehat{B} \) respectively. We set a threshold \( \gamma =0.05\cdot \left(\overline{\alpha}+\overline{\beta}\right)/2 \); a matching point pair \( \left({a_i}^{\left({M}_1\right)},{{b_i}^{\left({M}_2\right)}}_{C(i)}\right) \) is an effective matching pair if \( \left\Vert {a}_i^{\left({M}_1\right)}-{b_i^{\left({M}_2\right)}}_{C(i)}\right\Vert \le \gamma \), and not effective otherwise. We can then obtain the matching accuracy sim(Mi, Mj) of two 3D models Mi and Mj, where Neff is the number of effective matching pairs and m, n are the numbers of points in Mi and Mj respectively.

$$ \mathrm{s} im\left({M}_i,{M}_j\right)=\frac{N_{eff}}{\min \left(m,n\right)} $$
(20)
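
A minimal sketch of Eqs. (17) and (20), with the threshold γ passed in as computed from \( \overline{\alpha} \) and \( \overline{\beta} \) above:

```python
import numpy as np

def matching_accuracy(A_hat, B_warp, gamma):
    """Nearest-neighbour correspondence (Eq. 17) and the fraction of
    effective matching pairs (Eq. 20)."""
    # Eq. (17): closest point of the warped embedding B_warp for each point of A_hat
    d = np.linalg.norm(A_hat[:, None, :] - B_warp[None, :, :], axis=-1)
    C = np.argmin(d, axis=1)
    dist = d[np.arange(len(A_hat)), C]

    n_eff = np.sum(dist <= gamma)                 # effective pairs: distance <= gamma
    return n_eff / min(len(A_hat), len(B_warp))   # Eq. (20)
```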

Figure 6 shows the results of sparse spectral correspondence: we first create an affinity matrix for each sparse graph and build a sparse spectral eigenmap (see Fig. 6b). The shape correspondences are then obtained in the spectral domain (see Fig. 6c). Figure 6e shows the shape alignment in the spatial domain, where it is difficult to reveal the consistency of non-rigid shapes.

Fig. 6

The shape matching results based on sparse spectral correspondence. (a) The spectral eigenmap of two human models. (b) Sparse spectral eigenmap with sparse ratio τ = 0.3. (c) Sparse spectral correspondence based on TPS. (d) Non-rigid shape matching based on sparse spectral correspondence. (e) Shape alignment in the spatial domain

Clearly, our algorithm can effectively recognize meshes of the same category, and it behaves robustly under moderate stretching and pose variance of the shapes.

To summarize, the procedure of our spectral correspondence is described in Algorithm 3.

Algorithm 3: Procedure of spectral correspondence.


5 Experimental results

In this work, we have developed a row sampling and sparse graph construction method that generates a sparse descriptor for most shapes, useful for shape matching and analysis.

In this section, we demonstrate experimental results and analyze the performance of our algorithm. We use the Princeton Segmentation Benchmark (PSB) as our test set. All experiments are performed on an Intel Core™ CPU at 3.5 GHz with 8 GB of memory.

The time complexity of our algorithm depends mostly on the shape resolution. Our method first applies row sampling to reduce the resolution of complex shapes to fewer than 2 K points, while the subsampled points preserve the geometric and topological properties of most shapes well.

Assume the input shape has n points and let the average number of sample points be Nk (Nk < n). The sampling procedure then takes \( O\left(n{d}^2\log (n)\right) \), where d = rank(B), and computing the sparse graph and spectral embedding takes \( O\left({N}_k\log \left({N}_k\right)\right) \).

The spectral correspondence algorithm is dominated by the initialization of the affinity matrix. For a pair of shapes, the computational complexity of shape matching is \( O\left({N}_k^2\log \left({N}_k\right)\right) \).

Table 1 shows the efficiency of our method. The longest processing time is almost 2 min, for the horse category, whose models have the largest number of points; the shortest is 5 s, for the fish category. Figure 7 demonstrates sparse spectral correspondence results with different sparse ratios (third column: τ = 0.2, fourth column: τ = 0.5) for deformable shapes.

Table 1 The performance of our method (s: seconds)
Fig. 7

Sparse spectral correspondence for mesh models

Table 2 provides a general comparison of the average classification accuracy with other methods. Although spectral embedding normally incurs a large computational cost due to the eigendecomposition of large dense matrices, our method effectively reduces the size of the input shapes by extracting key sample points, and the sparse spectral embedding generates a better shape descriptor. Experimental results show that our method achieves better classification accuracy than the others (see Fig. 8).

Table 2 The average classification accuracy of our method on different shape categories (%)
Fig. 8

The comparison of classification accuracy with other methods

Because the algorithm uses a sparse ratio τ to build the sparse representation of a model, the choice of τ may affect classification accuracy; we therefore tested its influence in our experiments. We chose eight equidistant values in τ = 0.1–0.8 and calculated the average recognition accuracy over all models for each value, as shown in Fig. 9b.

Fig. 9

The experimental results of our method. (a) The sparse spectrum of 3D models with τ = 0.8. (b) The average recognition accuracy on all models with different sample rates. (c) The effect of Gaussian noise. (d) Shape matching with Gaussian noise (δ = 0.0015). (e) Shape matching with incomplete models

To test the robustness of our algorithm, we add Gaussian noise to the 3D models (Fig. 9c); the shape matching results keep roughly the same accuracy. The horizontal axis indicates the intensity of the added Gaussian noise, increasing from 0.1 to 1%; the vertical axis is the average recognition rate of the models under each noise intensity over all tests. When the noise intensity is below 0.2%, the algorithm maintains high average recognition accuracy even at different sampling rates; above 0.2%, the recognition rate increases with τ. Overall, as the noise intensity increases, the average recognition rate gradually declines.

Figure 9d demonstrates stable shape matching between 3D models with added Gaussian noise. We add Gaussian noise with δ = 0.0015 to the human models and build the sparse representation by our row sampling (τ = 0.3); the shape similarity is revealed through the sparse spectral correspondence shown in the fourth and fifth columns of Fig. 9d, from which the similarity estimation is obtained. The matching rates between these 3D models are 96.75% and 94.13% respectively, verifying that our method is effective and robust to noise and deformation. We further tested our method on incomplete models, as shown in Fig. 9e; the similarity between two incomplete models is recognized well by the sparse spectral correspondence.

Table 3 shows representative matching results produced by our approach on 16 categories from the PSB. We take a mesh model as the query shape, apply our sparse sampling and spectral matching method to evaluate its similarity against the database, and retrieve the most similar candidates (the top 3 models ranked by CC value). The results show that our algorithm accurately identifies similar shapes across a large variety of categories, even when the input shapes undergo large deformations of their geometric structure. In the 16 categories of 3D models, the majority of models are matched with more than 95% accuracy, and the average recognition accuracy over all models is 88.5%.

Table 3 Non-rigid Shapes Similarity Analysis

6 Conclusions and future work

In this paper, we have presented an efficient approach for measuring the similarity between 3D shapes. We first build a sparse representation of each input shape by extracting a subset of key sample points under the Lewis weight criterion, and then generate a sparse graph with an MST. After the transformation from the spatial domain to the spectral domain, the spectral embeddings are matched with the TPS method, which ensures a consistent ordering and sign assignment of the eigenvectors. Our algorithm is robust to differences in mesh sizes and to the choice of embedding dimensionality, and it is invariant under isometric transformation. Experimentally, we find it to be robust against moderate stretching of the shapes as well, and it outperforms well-known existing shape correspondence schemes. The complexity of computing the spectral embeddings and the correspondence is \( O\left({N}_k^2\log {N}_k\right) \), where Nk is the average number of sample points of the larger mesh.

However, like other spectral analysis methods, our method relies on the computation of geodesics, so we require the shape to have no significant missing parts; although our sparse model can successfully deal with partial similarity, this partiality is not easily controllable. An example is shown in Fig. 10, where two octopus models are left unmatched by our method, and the human models with large holes are not matched either.

Fig. 10

An example of shape classification of the dataset, obtained by running our similarity measurement algorithm. Classes are encoded by color; note that the two incomplete octopus models and the two human models with holes (in gray) have been left unmatched

In the future we would like to extend the scope of structure-based and semantic-based shape descriptors for more efficient shape matching, instead of time-consuming point matching. Furthermore, we will investigate shape co-segmentation for shape retrieval applications.