Keywords

1 Introduction

Shape correspondence is a fundamental problem in many research topics such as 3D mesh retrieval, shape registration and mesh deformation. 3D shape correspondence is a mapping from one point set on the source mesh to another on the target mesh. There exist three kinds of mapping: one-to-one, one-to-many and many-to-one. In this paper, we aim to address the problem of establishing the accurate one-to-one correspondence between intrinsic-symmetrically isometric human models.

The target of shape correspondence is to find the point pairs that are similar or semantically equivalent. Isometric shapes appear in various contexts such as different poses of an articulated human model or two shapes presenting different but semantically similar objects [16]. It is highly demanded to find isometric shape correspondence since most real world deformations are isometric. Moreover, shape correspondences between isometric shapes have practical values. For instance, the deformation based on isometric template will be much more efficient benefiting from their similar shapes. If two shapes are totally isometric, the geodesic distance between two points on one shape is the same as the geodesic distance between their correspondences on the other shape [16].

Embedding-based methods are popular techniques for 3D shape correspondences problem. In these methods, original mesh is embedded into a new domain where isometric deviation can be measured and optimized. Euclidean embedding can be achieved by using various techniques such as classic MDS(Multi-Dimensional Scaling) [14], least-square MDS [6], heat kernel embedding [10] and spectral embedding [7]. Besides embedding methods, other approaches [15, 16] minimize the isometric distortion directly in the 3D Euclidean space. However, most existing algorithms tend to be confused by the intrinsically symmetric features and suffer from symmetric flip problems. They can hardly discriminate symmetric points on the surface even if the mesh to be matched is not perfectly symmetric. Therefore, it is common that the correspondence of the point on the right hand of the source mesh is established on the left hand of the target mesh, as shown in Fig. 1.

We proposed a method to find correspondences for human isometric shape model which is able to solve the symmetric flipping problem. Our idea is to combine skeleton information to distinguish intrinsic symmetry. Given two meshes with their skeletons, we first utilize local features to find one-to-many correspondences between two meshes. The candidate set for each feature point presents symmetric property on the mesh. A skeleton segment associated with surface points is capable of discriminating symmetry. The final correspondence is located and refined by minimize the isometric distortion with respect to based vertex set.

In summary, our contributions are: (a) we integrate skeleton information to robustly address symmetric flip problem which still exist in state-of-the-art techniques; (b) we take advantage of the base vertex set to refine the final one-to-one correspondence and achieve better accuracy.

Fig. 1.
figure 1

Flipping correspondences. Each correspondence pair is labelled with the same colour (Color figure online).

2 Related Work

Shape correspondence is a long- and well-studied problem. In areas such as shape matching, 3D shape retrieval, and mesh registration, 3D reconstruction, many recent efforts are made on finding shape corresponding points on two meshes.

SCAPE model [1] use markers to manually locate correspondences between two meshes. Besides manual assignment, plenty of works develop local surface descriptors to automatically establish correspondences. Some works extend local descriptors in 2D images for triangulated meshes, such as MeshHOG [18] or 3D shape context [9]. The embedding-based method is a more reliable approach when it comes to isometric deformation. Multidimensional Scaling(MDS) [14] approximate geodesic distance with Euclidean distance in embedding space. Dey et al. [5] uses the Global Point Signature(GPS) [13] for spectral embedding of meshes and thereby find the mesh extremities. Sahillioglu et al. [15] also transfers vertices into spectral domain and optimize the result using expectation-maximization algorithm. However, these above methods sometimes provide false correspondence due to the presence of model symmetries. Ovsjanikov et al. [11] firstly identify the intrinsic symmetry of object in a quotient space, and then factor it out. Zhang et al. [19] differentiate the intrinsic symmetric points by calculating a signed angle field from the gradient fields of the harmonic field which is derived from four points on the hands and feet. For scan data produced by RGB-D cameras, e.g. Kinect, many imperfections make it harder to find the correspondences. Holes and non-smooth mesh result in difficulties for calculating geodesic distance. Noises and missing data have negative influence on the matrix structure which is the basis of embedding-based methods. Jiang et al. [8] and Zheng [20] detect the intrinsic symmetry of point clouds using skeleton but the skeleton they use is produced according to the surface or point clouds.

Sahillioglu et al. [16] propose a coarse-to-fine scheme to track symmetric flips. Although this method is more accurate than the previous ones based on embedding approach, it still has the symmetric flipping false pairs even in the final level. We will compare our method with them in both accuracy and addressing problem of symmetric flips. The strength of our approach is that it takes into account the skeleton information.

3 Skeleton-Based Symmetry-Aware Approach

The intrinsic symmetry leads to the symmetric flipped correspondence between two meshes. Neither embedding-based methods like MDS, GMDS nor local descriptors can differentiate them effectively. Previous works which are solely replying on surface-related information, i.e. geodesic distance, face normal, are unable to solve symmetric problems completely. However, with the help of a set of skeleton information where different skeleton segments have different labels and surface point and skeleton segment are associated, it is possible to address the symmetric flipping problem with skeleton. Moreover, along with the appearance of Kinect camera, we are able to obtain skeleton of mesh with ease. To perform the skeleton attachment process, we use the algorithm based on the work by [2], in which the input is the joints positions tracked from Kinect. The output is the skeleton attached the human model. In the following, we firstly discuss how to obtain candidate set for source point, followed by our method to address symmetric flip problem as well as to refine the final correspondence which is more accurate in terms of both visual effect and semantics.

3.1 Correspondence Candidate Set

As mentioned before, in order to make sure that the candidate set includes the correct correspondence as much as possible, we first compute the one-to-many correspondences using Heat Kernel Signature(HKS) [17], we select the top N similar points to construct the candidate set which is shown in Fig. 2(a). To compute the heat of point i at time \(t_i\), we firstly perform the Laplace-Beltrami operator L on the mesh. Let \(\Lambda \) be the diagonal matrix of the eigenvalues of L, and \(\varPhi \) be the matrix with the corresponding eigenvectors, the heat kernel of the mesh is computed as Eq. 1:

$$\begin{aligned} K_t=\varPhi exp(-t\Lambda )\varPhi ^T \end{aligned}$$
(1)

Each entry in \(k_t(i,j)\) represents the heat diffusion between point i and j. The diagonal elements of this matrix is composed of HKS. Thus, HKS feature is a vector whose entry \(k_{t_j}(p_i,p_i)\) is the heat at point i at time of \(t_j\):

$$\begin{aligned} \left\{ k_{t1}(p_i,p_i),k_{t2}(p_i,p_i),\ldots ,k_{tn}(p_i,p_i) \right\} \end{aligned}$$
(2)

When the dissimilarity of HKS between the template point and target point in Eq. 3 is less than a threshold t, the target point is selected as candidate for the template point.

$$\begin{aligned} \varDelta s = ||HKS(p_t) - HKS(p_s)||, \end{aligned}$$
(3)

where HKS(p) is the heat kernel signature at point p, \(p_t\) and \(p_s\) are the points on the template and target respectively. Here, we apply the scale-invariant HKS(si-HKS) [3] to get feature for meshes.

After the initial correspondence is achieved by si-HKS, an expanded set of candidate points are obtained as shown in Fig. 2(a). As it can be observed, the expanded candidates for the point on the right foot of the source model distribute on both feet of the target model, presenting symmetric property.

3.2 Skeleton-Based Symmetry-Aware Shape Correspondence

To locate the single correspondence for template point, the next step is to remove those symmetric flipping points. Skeleton is an important clue for filtering flipped correspondences. As shown in Fig 3, skeleton divides mesh into 17 parts and each mesh part attached a segment has a unique label and the right extremity and its left counterpart have different labels. Therefore, our method is able to discriminate the right points and their counterparts on the left, addressing symmetric flip problems. When the template point and candidate points are on the same skeleton segment, they are kept; otherwise, the candidates are removed. The filtered candidate set for template point is shown in Fig. 2(b).

Fig. 2.
figure 2

Overflow of proposed method: (a) the expanded candidate set for one point on the template; (b) the after-filtered candidate set using skeleton filtering; (c) the final one candidate point on the source based on base vertex set

Fig. 3.
figure 3

Mesh division by skeleton; each colour represents a skeleton segment (Color figure online)

3.3 One-to-One Correspondence

After the symmetric flip problem is solved, the remaining candidates need to be further filtered to find the one-to-one correspondence pair. Therefore, the next step in our method uses the sum of relative distances from candidates to the base vertex set to filter invalid candidates.

The base vertex set [15] is selected based on Gaussian curvatures. This process is illustrated in Fig. 4. Initially, at each vertex of the original mesh, we compute the Gaussian curvatures using a simple way proposed in [12] with Eq. 4.

$$\begin{aligned} gc(p)=3(2 \pi -\sum \alpha _i)/\sum A(f_i), \end{aligned}$$
(4)

where \(A(f_i)\) is the area of the face \(f_i\) that adjacent to the vertex and the angle \(\alpha _i\) is the angle of \(f_i\) at the vertex. Then we sort the vertices into a list in descending order with respect to their curvature values like in Fig. 4(a) and choose the top vertex as the first base vertex, e.g. marked point \((x_1,y_1,z_1)\) in Fig. 4(b). Then, as shown in Fig. 4(c), we compute the geodesic distance from this vertex and mark all its neighboring points lying within a radius r. In our experiment, we adopt the Dijkstra’s shortest path algorithm to compute the geodesic distance between two vertices as Eq. 5. The weight of each edge of Dijkstra’s path is the Euclidean distance between neighboring vertices by Eq. 6.

$$\begin{aligned} g(i,j)=\sum _{i\in P}\omega _i \end{aligned}$$
(5)
$$\begin{aligned} \omega _i=\min _{v_k\in N_i}||v_i-v_k||, \end{aligned}$$
(6)

where \(N_i\) is the neighbors of point i. The next base vertex is the first unmarked vertex in the list like \((x_3,y_3,z_3)\) in Fig. 4(d). This process is repeated until all points are marked and based vertex set is built. The final base vertex set is illustrated in Fig. 4(f). Given base vertex set \(\phi \), we compute the relative surface distance from each candidate to \(\phi \) with Eq. 7. The candidate C with the minimum relative distance to \(\phi \) is regarded as the final correspondence as shown in Fig. 2(c).

$$\begin{aligned} D_{iso}\left( c_i,\phi \right) =\sum _{\left( v_j\in \phi , c_i\in \varTheta \right) }g(c_i, v_j) \end{aligned}$$
(7)
$$\begin{aligned} C=arg\min _{c_i\in \varTheta } \left( D_{iso}\left( c_i,\phi \right) -D_{iso}\left( p,\phi \right) \right) \end{aligned}$$
(8)

Here, g(., .) is the geodesic distance between two vertices. p is the point on the template. After the distances from the candidates to base vertex set are acquired, we select the candidate with minimum distance to base vertex set as the final correspondence as illustrated in Eq. 8.

Fig. 4.
figure 4

The process of base vertex set

4 Experiments

4.1 Dataset

SCAPE human dataset [1] is built by Dragomir et al. in 2005. It is composed of pose dataset and shape dataset. In pose dataset, it contains scans of 70 different poses of a particular person. The shape model consists of 45 different people in a similar but un-identical pose. In the pose dataset, one mesh is chosen as the template mesh and others are denoted as instance meshes. Each mesh has 25000 triangle faces and 12500 vertices. Although the original work made use of both shape and pose data, only the pose data is distributed together with its skeleton information. Meshes in SCAPE model are hole-filled using the algorithm by Davis et al. [4]. SCAPE model also constructs a skeleton for the template mesh based on the fact that vertices on the same skeleton joint are spatially contiguous and exhibit similar motion across different scans. Thus, after scanning the pose instance for a particular person, authors decompose the mesh into several approximately rigid parts and get the location of the parts in different pose instances as well as the articulated object skeleton linking the parts. Based on the pose dataset, a tree-structured articulated skeleton is automatically constructed with 16 parts. Since SCAPE model contain both symmetric and various deformed shapes, we evaluate the performance of our proposed method with respect to symmetric flipped correspondence as well as the accuracy of final correspondences with SCAPE dataset.

Fig. 5.
figure 5

Compare our method with C2FCM. Left: C2FCM Right: Our method

Fig. 6.
figure 6

Comparison between two methods: Top: results obtained using method in [16]; Bottom: our results. Matched point pairs are connected by lines. Symmetry flips are connected by black dash lines.

4.2 Performance Evaluation

We compare our method with latest shape correspondence algorithm [16]. Firstly, we intuitively compare our result with those in coarse-to-fine combinatorial matching (C2FCM) algorithm [16] in Fig. 5. Our method outperforms C2FCM [16] with respect to semantical equivalence. For the same point on the third toe of template, C2FCM method finds its correspondence on the foot bottom. However, our method locates its correspondence almost on the third toe of foot. Secondly, the average geodesic error is compared with C2FCM algorithm in Table 1. We can see that for different proportions of correspondences, the geodesic errors of our method are less than those of C2FCM. The average of geodesic error of all correspondences, shown in the column of 100 % correspondences, our method outperforms C2FCM, which means our method is able to find the correspondences more accurately. More results are shown in Fig. 6, we can see that in coarse-to-fine algorithm correspondences which are shown in the top line present symmetry (both on the left and right foot) for template point. Our method is able to find the unique correspondences which is more accurate than [16] in terms of semantics. Moreover, it successfully removes that symmetric flipped invalid correspondences and achieved accurate one to one mapping from template to target meshes.

Table 1. Comparison of our method with C2FCM method

5 Conclusion

In summary, we present a robust method to address the symmetric flip problem in shape correspondence research area. This approach can effectively remove the flipped correspondences by introducing skeleton information and through minimising distortion error. It can locate the one-to-one semantically similar correspondence more accurately. Experimental results indicate that the proposed approach outperformed traditional approaches that rely only on surface information.

In the future, we hope to investigate other mesh data which is obtained from cheap scanners such as Kinect to find correspondence between template mesh and scanned mesh. The focus will be on tracking the intrinsic challenges posed by the incomplete and noisy data that is used to build such scanned mesh.