1 Introduction

The accuracy and speed of object recognition are essential criteria for evaluating the environment perception ability of unmanned ground vehicles (UGVs) [1]. During autonomous driving, excellent object recognition results can supply sufficient semantic environment information for UGVs to execute subsequent processing tasks [2]. In addition, object recognition technology also benefits many other industrial fields, such as automatic semantic map generation, local path planning for autonomous robotics, and digital terrain recognition and analysis.

Researchers employ light detection and ranging (LiDAR) sensors to perceive precise and adequate surrounding information during continuous driving. LiDAR sensors can collect large 3D point clouds quickly with few errors [3]. Whereas two-dimensional (2D) images captured by digital cameras are sensitive to illumination changes, robustness to changing light is another advantage of the three-dimensional (3D) point clouds sensed by LiDAR. However, several special distribution characteristics of LiDAR point clouds, particularly their disordered arrangement and inhomogeneous densities, pose considerable challenges for object feature selection and computation. The efficiency of extracting and optimizing object features from sparse 3D point clouds directly determines object recognition accuracy.

Traditionally, key-point-based and local-surface-based methods are the primary feature extraction methods in the 3D object recognition domain [4]. Fixed- or adaptive-scale key-point detection algorithms compute curvatures, variances, normal vectors, and other geometric spatial attributes as their feature extraction descriptors. Points whose attributes remain stable after a series of rigid transformations are considered key points, and object types are recognized by comparing the similarity between target objects and given models. However, in most key-point detection methods, feature extraction relies heavily on determining the neighboring region and finding neighbor points, so key-point selection incurs significant computational and time costs [5].

In consideration of the limitations of key-point-based feature extraction methods, researchers prefer to choose local surface features as automatic object recognition criteria [6]. Several types of local surface feature descriptors are commonly used to extract features in 3D object recognition. For example, some studies have generated and used local-surface geometry feature descriptors in the spatial domain to extract basic contour and spatial distribution information about different types of 3D objects. Some complex object descriptors analyze the topological structure of point clouds to extract distribution features by rasterizing the 3D local space into ordered and aligned voxels [7]. Rather than computing intuitive distribution attributes, several studies have transformed object features from the spatial domain into other domains (e.g., the spectral domain) to search for a descriptive representation. After extracting object features, the classifiers are initialized and fed with multidimensional features as a series of recognition criteria.

Recently, an increasing number of researchers have concentrated on utilizing machine learning algorithms as object classifiers to solve 3D object recognition problems [8]. Thus, this paper proposes a multilayer neural network-based 3D object recognition system with multiple feature extraction from LiDAR point clouds. To compute geometry and spatial distribution features, we initially rasterize an object point cloud in a global voxel model. In each voxel, 23 features, e.g., point divergence degree, variance, and covariance, are computed in parallel. Using multiple object features and their manually annotated labels, the initialized neural network (NN) is trained through a massive number of iterations. The feature extraction processes in the proposed system are accelerated by a graphics processing unit (GPU) to help realize real-time object recognition.

The remainder of this paper is organized as follows. In Sect. 2, we discuss work related to object recognition algorithms in 3D point clouds. In Sect. 3, the proposed GPU-based recognition system is described. We analyze the results of an object recognition experiment in Sect. 4, and Sect. 5 concludes the paper.

2 Related work

Object recognition is an essential function for UGVs to realize semantic environment perception and automatic driving decision making in urban areas [9, 10]. This section surveys several widely used feature extraction methods based on 3D point clouds to realize fast and accurate object recognition.

In the 3D object recognition domain, key-point-based detection is commonly used to describe stable local features using a series of descriptive points [11]. For example, Sun et al. [12] used curvature characteristics to establish a robust key-point selection algorithm and a reliable local feature descriptor. In the defined key-point-based feature descriptor, curvature maps are structured based on interest points and their neighboring points within a predefined constraint. To define a local reference frame (LRF), the surface normal and maximum principal curvature orientation of the interest points are used as the axes of the local space. Although Sun et al. proposed a precise key-point description and matching algorithm, matching the interest points and their neighborhood points occupies the majority of the processing time. In consideration of speed performance in key-point detection, Persad et al. [13] developed a transform-based key-point detection method to search for interest points that remain invariant under rigid rotation and translation. Compared to a traditional feature extraction algorithm, i.e., random sample consensus, the point-matching process in their method is more efficient and does not require a predefined threshold. In addition, Ge et al. [14] realized a random point method to select key points at each scaling layer to compute local features. A point-cloud-matching process is executed by comparing local features described by the key points. Here, matching performance relies on the size of the feature dictionary generated by the method, which limits the stability of subsequent processing.

To enhance the descriptiveness, robustness, and time efficiency of the key-point process, some studies have employed a key-point-based histogram to gather the spatial features of point clouds in multiple dimensions [15]. For example, Weber et al. [16] computed orientation angles between points and their neighbors to generate a classic histogram-based feature descriptor in local space, referred to as fast point feature histograms (FPFH). Although the FPFH descriptor demonstrates excellent feature extraction speed, its neighboring area is still large, which incurs relatively large time costs [17]. Yang et al. [18] developed a more time-efficient local feature descriptor referred to as the local feature statistics histogram (LFSH), which is insensitive to noise and varying density. To reduce time costs in these types of histogram-based feature extraction methods, Garcia et al. [19] employed GPU technology to convert serial procedures into parallel procedures, which remains a novel use of such technology among point cloud recognition methods.

Differing from calculating spatial features by analyzing individual points in sequence, neighboring points can be grouped into a series of voxels that serve as processing units for coarse feature extraction [20]. Zhu et al. [21] proposed a semantic classification method that divides 3D space into regularly arranged fixed-size voxels, which overcomes the processing difficulty associated with non-homogeneous density distributions in point clouds. To maintain the topological structure of point clouds, adjacent voxels are clustered based on pairwise connections between different components. By computing the energy values of pairwise connections, raw point clouds collected from urban environments can be divided into individual semantic components. Xu et al. [22] also developed a voxel-based model to down-sample large original point clouds to reduce the time costs of subsequent processing. In this model, after down-sampling, principal directions are computed to reconstruct the local coordinate axes (referred to as the semi-LRF). Here, global point coordinates are simplified as local coordinates to extract features from the LRF. Extracting features from suitable local coordinates is much easier than extracting them from global coordinates because local transformation is an advisable preprocessing step for feature computation.

Fixed-size voxels are sometimes not sufficiently effective for object recognition due to the variable densities and asymmetrical structure of point clouds [23]. Lei et al. [24] exploited a 3D convolution kernel at different scales to extract object features at different resolutions. Thus, the topological structures of the point clouds can be described in a detailed manner such that the accuracy of object classification, semantic recognition, and other subsequent applications is increased. However, the computational cost of updating new points in such tree-based storage structures is high, especially in incremental point cloud collection systems.

Dimensionality reduction methods are commonly used to increase the speed and efficiency of feature extraction by projecting a 3D point cloud into a 2D space [25]. Ligon et al. [26] utilized a traditional spin image descriptor to convert 3D point clouds into a 2D feature plane. By limiting the normal angle between a selected point of interest and its neighboring points, point counts under the predefined constraint are obtained to form a statistics plane. However, the features extracted using the spin image algorithm rely heavily on the normal vector, making them sensitive to noise on the local surface. To increase the reliability and stability of point cloud processing, Yang et al. [27] developed a rotational contour signature descriptor to extract features comprising an array of contour signatures under different rotation angles. These signatures are then gathered together as local feature descriptors to realize shape matching and object recognition. Dong et al. [28] projected 3D point clouds onto six planes to record an interest point and its neighboring point counts as weighted distance features within a limited area. Here, the distance features are encoded into a string of binary numbers that serves as a local feature descriptor to categorize outdoor objects in an urban environment. With such feature descriptors, a large number of thresholds and parameters must be adjusted and modified manually; thus, machine learning algorithms are effective for such feature classification tasks [29].

Recently, inspired by outstanding object recognition results in the 2D image domain, machine learning algorithms have been investigated for 3D point cloud processing [30]. To reduce the time cost of searching for neighbor points, Dubé et al. [31] used a k-d tree model to store LiDAR point clouds and resolve several spatial features, especially for inhomogeneous and unstructured distributions. Soilán et al. [32] applied a neural network (NN) to a road marking recognition task in which local features are extracted as the input data of the classifier model. Automatic parameter adjustment in these classification models can realize better recognition results than traditional threshold-predefined classifiers [33]. However, these methods use only the geometric features of the spatial point distribution and ignore the topological relationships among neighboring points.

Motivated by significant achievements of convolutional NNs (CNNs), some studies in the 3D point cloud domain have focused on finding self-adaptive feature extraction filters. For example, Bobkov et al. [34] input a sequence of 3D points into a CNN model with five filtering and five pooling layers to extract point cloud features. Here, prior to the input process, a preprocessing step is executed to normalize the point order and down-sample the point cloud. Chen et al. [35] defined the starting points of different models to avoid rigid transformations, such as random rotations. The filters in their system are set to 3 × 1 at the first filtering layer and 1 × 1 at the subsequent filtering layers; however, this is insufficient for extracting spatial features. Therefore, the proposed system extracts multiple dimensional features from voxels and summarizes them to form a series of object feature datasets. The detailed connected component labeling algorithm for object clustering is explained in our previous work [36]. By feeding the feature datasets into an initialized multilayer NN model, point-based object identification can be implemented with parameters adjusted automatically from a large number of examples.

3 Multifeature-based object recognition system

The proposed system includes preprocessing, feature extraction, and classifier modules. Several spatial features of the point cloud are computed from each voxel. An object feature normalization process then converts the feature vectors of the voxels into a normalized descriptor. The normalized descriptor forms a normalized object feature matrix, which is then fed into the classifier to realize the object recognition function.

3.1 System overview

Accuracy and processing speed are two primary bottlenecks in object recognition using large-scale point clouds. To address these bottlenecks, the proposed method combines parallel computing technology and multifeature descriptors to realize an object recognition system (Fig. 1). First, all raw point clouds collected by the LiDAR sensor are input to the proposed system. Using a height threshold, the ground points are filtered out to avoid their adverse effect on the subsequent feature-computing and object-recognizing procedures. For example, ground points can occupy a sizable portion of the sensed point cloud in an entire scene, which results in wasteful computation in non-ground object recognition tasks. Another reason the ground point filtering step is performed is that such points form a horizontal plane that can connect objects near the earth's surface.
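As a minimal illustration of this preprocessing step, the following Python sketch removes ground points with a simple height threshold; the threshold value and the assumed N × 3 point layout are illustrative choices, not the authors' exact implementation.

```python
import numpy as np

def filter_ground_points(points: np.ndarray, ground_height: float = 0.2) -> np.ndarray:
    """Remove points at or below a height threshold (assumed z-up sensor frame).

    points: (N, 3) array of x, y, z coordinates.
    ground_height: assumed threshold in meters; the paper's actual value is not given.
    """
    mask = points[:, 2] > ground_height  # keep only non-ground points
    return points[mask]

# Usage: non_ground = filter_ground_points(raw_points, ground_height=0.2)
```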

Fig. 1 Flowchart of the proposed object recognition system with multiple extracted features

After executing the ground point filtering step, the voxel model is initialized with a predefined boundary size and voxel resolution. Multiple geometry features of the point clouds projected into the corresponding voxels are computed and formed into feature vectors. The voxel counts of different objects likely differ; thus, it is difficult to input the raw voxel feature vectors of these objects into a common NN classifier. Therefore, these voxel feature vectors with different counts are converted from standard space to a normalized descriptor to generate fixed-size object feature matrices. By feeding these normalized object feature matrices into our initialized multilayer NN model, the model improves its classification ability over 10,000 iterations in the training process. As a result, the proposed system can achieve automatic and accurate object recognition using the trained classifier.

3.2 Voxel features generation

In the proposed system, a series of feature vectors in the voxel model are computed as the judgment basis to identify different types of outdoor objects in subsequent steps. After defining a suitable resolution and boundary size, the voxel model is established and comprises cube-shaped voxel units. Here, non-ground points are projected into corresponding voxel units based on their positions at the x, y, and z axes in a standard Cartesian coordinate system. Using these 3D points in voxels, multiple geometry features are computed (Table 1). Point count N in a given voxel is a significant parameter to measure the voxel’s importance. Similarly, point density ρ in each voxel is utilized to evaluate voxel importance, as given in formula (1), which is also affected by the relative distance between the current voxel and the sensor location. In Eq. (1), l, w, and h are the voxel sizes of the x, y, and z directions in standard space, respectively.

Table 1 Feature list computed in each valid voxel
$$ \rho = N/(l \times w \times h) \tag{1} $$
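The following sketch illustrates one way to bin non-ground points into voxels and compute the point count N and density ρ of Eq. (1); the voxel size and the dictionary-based grouping are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np
from collections import defaultdict

def voxelize(points: np.ndarray, voxel_size=(0.2, 0.2, 0.2)):
    """Group non-ground points into voxels and compute N and rho per Eq. (1).

    voxel_size: (l, w, h) edge lengths along x, y, z; the values are illustrative.
    Returns a dict mapping integer voxel indices to (points, count, density).
    """
    l, w, h = voxel_size
    indices = np.floor(points / np.array([l, w, h])).astype(int)
    voxels = defaultdict(list)
    for idx, p in zip(map(tuple, indices), points):
        voxels[idx].append(p)
    features = {}
    for idx, pts in voxels.items():
        pts = np.asarray(pts)
        n = len(pts)                 # point count N
        rho = n / (l * w * h)        # point density, Eq. (1)
        features[idx] = (pts, n, rho)
    return features
```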

Voxel centroid \( \mu \) is a geometric center computed by traversing all point positions in the voxel. Vector \( \mu \) is expressed as \( \{\bar{x}, \bar{y}, \bar{z}\} \), where \( \bar{x} \), \( \bar{y} \), and \( \bar{z} \) are the mean values of the point cloud in the three directions. Point variance \( \sigma^{2} \) is a 3D variable to measure the point cloud distribution differences in the x, y, and z directions (\( \sigma^{2} = \{\sigma_{x}^{2}, \sigma_{y}^{2}, \sigma_{z}^{2}\} \)). Point covariance \( \bar{\sigma}^{2} \) is a 3D covariance variable that measures the point distribution differences in the xy, xz, and yz directions, given as \( \bar{\sigma}^{2} = \{\bar{\sigma}_{xy}^{2}, \bar{\sigma}_{xz}^{2}, \bar{\sigma}_{yz}^{2}\} \). By combining the point variance and covariance, the covariance matrix is expressed in Eq. (2).

$$ \text{cov}(X,Y,Z) = \begin{bmatrix} \sigma^{2}(XX) & \bar{\sigma}^{2}(XY) & \bar{\sigma}^{2}(XZ) \\ \bar{\sigma}^{2}(YX) & \sigma^{2}(YY) & \bar{\sigma}^{2}(YZ) \\ \bar{\sigma}^{2}(ZX) & \bar{\sigma}^{2}(ZY) & \sigma^{2}(ZZ) \end{bmatrix} \tag{2} $$

The eigenvectors \( \upsilon \) and eigenvalues \( \lambda \) are obtained by decomposing this covariance matrix using a singular value decomposition algorithm. Eigenvalue \( \lambda \) is a 3D variable represented as {λ1, λ2, λ3}, whose components are sorted from largest to smallest as λ1, λ2, and λ3. The surface curvature parameter κ describes the curvature of the point cloud (Eq. 3).

$$ \kappa = \lambda_{3}/(\lambda_{1} + \lambda_{2} + \lambda_{3}) \tag{3} $$
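The sketch below shows how the covariance matrix of Eq. (2), its eigenvectors and eigenvalues, and the surface curvature of Eq. (3) might be computed for the points of one voxel; it uses a symmetric eigendecomposition rather than the authors' SVD routine, which yields the same eigenvalues for a covariance matrix.

```python
import numpy as np

def curvature_features(pts: np.ndarray):
    """Compute centroid, covariance matrix (Eq. 2), sorted eigenvalues/eigenvectors,
    and surface curvature (Eq. 3) for the points in one voxel."""
    mu = pts.mean(axis=0)                      # voxel centroid
    cov = np.cov(pts, rowvar=False)            # 3x3 covariance matrix, Eq. (2)
    eigvals, eigvecs = np.linalg.eigh(cov)     # symmetric eigendecomposition
    order = np.argsort(eigvals)[::-1]          # sort lambda_1 >= lambda_2 >= lambda_3
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    kappa = eigvals[2] / eigvals.sum()         # surface curvature, Eq. (3)
    return mu, cov, eigvals, eigvecs, kappa
```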

Divergence degree F is a 3D variable used to evaluate the dispersion degree of the point clouds in the x, y, and z directions. Divergence degree F is computed according to Eq. (4), where N and μ are the point count and voxel’s centroid. Here, vector P is a set of points expressed as P = {p1, p2,…, pN}, where variable pi contains the x, y, and z coordinates as pi = {xi, yi, zi}.

$$ F = \frac{1}{N}\sum\limits_{i = 1}^{N} \left( p_{i} - \mu \right) \tag{4} $$

For each voxel, the feature list shown in Table 1 forms a description vector \( \vec{v} \) defined as \( \vec{v} = \{N, \rho, \mu, \sigma^{2}, \bar{\sigma}^{2}, \upsilon, \lambda, \kappa, F\} \) to represent the spatial features of the point cloud. Thus, feature vectors are input to the proposed system rather than the relative coordinates of the point cloud.
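A per-voxel descriptor could then be assembled as sketched below. The exact layout is an assumption: the paper reports 25 features per voxel, and how the three eigenvectors are encoded is not specified, so they are flattened to nine components here for illustration; `curvature_features` refers to the helper sketched above.

```python
import numpy as np

def voxel_descriptor(pts: np.ndarray, voxel_size=(0.2, 0.2, 0.2)) -> np.ndarray:
    """Assemble v = {N, rho, mu, sigma^2, cov, eigvecs, eigvals, kappa, F} for one voxel.

    The resulting length (27 here) is an assumption; it depends on how the
    eigenvectors are encoded, which the paper does not specify.
    """
    l, w, h = voxel_size
    n = len(pts)
    rho = n / (l * w * h)                                # point density, Eq. (1)
    mu, cov, eigvals, eigvecs, kappa = curvature_features(pts)
    var = pts.var(axis=0)                                # sigma^2 in x, y, z
    covar = np.array([cov[0, 1], cov[0, 2], cov[1, 2]])  # covariances in xy, xz, yz
    F = (pts - mu).mean(axis=0)                          # divergence degree, Eq. (4)
    return np.concatenate([[n, rho], mu, var, covar,
                           eigvecs.ravel(), eigvals, [kappa], F])
```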

3.3 Object features normalization

Different objects are of different sizes; thus, the voxel counts of different objects also differ. Under different rotation angles, the same object has different point cloud and voxel arrangement distributions. As a result, an effective normalization method is required to down-sample or up-sample object voxels equitably, regardless of whether the voxel count is large or small. Using a predefined scale filter (Fig. 2), object features are normalized from standard Cartesian space to a normalized descriptor after computing all voxel features listed in Table 1.

Fig. 2 Flowchart of object feature normalization

In object feature normalization, the voxel feature vectors of a given object are transformed from an arbitrary count to a fixed count of 3 × 3 × 3 feature vectors using the object feature's normalized descriptor. The generated feature matrix M of each object is expressed by Eq. (5), where vector \( \vec{v} \) is the feature vector of a block in the normalized descriptor.

$$ M = \begin{bmatrix} \vec{v}_{1} \\ \vec{v}_{2} \\ \vdots \\ \vec{v}_{27} \end{bmatrix} = \begin{bmatrix} \mu_{1} & \sigma_{1}^{2} & \cdots & F_{1} \\ \mu_{2} & \sigma_{2}^{2} & \cdots & F_{2} \\ \vdots & & \ddots & \vdots \\ \mu_{27} & \sigma_{27}^{2} & \cdots & F_{27} \end{bmatrix} \tag{5} $$

In this matrix, each row contains 25 variables, including the point count, point density, voxel centroid, point variance, point covariance, eigenvectors, eigenvalues, surface curvature, and divergence degree. Each column contains 27 feature vector elements computed from the 3 × 3 × 3 normalized blocks. To input object features into the proposed multilayer NN, matrix M is converted to a vector that serves as input data in the format required by the initialized neural network classifier, as defined by Eq. (6).

$$ M' = \{\vec{v}_{1}, \vec{v}_{2}, \ldots, \vec{v}_{27}\} = \{u \in (\mu_{1}, \sigma_{1}^{2}, \ldots, F_{1}, \mu_{2}, \sigma_{2}^{2}, \ldots, F_{2}, \ldots, \mu_{27}, \sigma_{27}^{2}, \ldots, F_{27})\} \tag{6} $$
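The normalization step might be approximated as in the following sketch, which divides an object's axis-aligned bounding box into 3 × 3 × 3 blocks, computes a descriptor per block with the `voxel_descriptor` helper sketched above, and flattens the result into M′. The bounding-box scale filter, the zero-padding of empty blocks, and the per-block feature length are assumptions, since the paper does not fully specify the normalized descriptor.

```python
import numpy as np

FEATURE_DIM = 27  # per-block descriptor length in this sketch (see note above)

def normalized_object_feature(points: np.ndarray, blocks: int = 3) -> np.ndarray:
    """Map an object's points into blocks^3 normalized blocks and flatten the
    per-block descriptors into a single fixed-length vector M' (Eq. 6)."""
    lo, hi = points.min(axis=0), points.max(axis=0)
    span = np.maximum(hi - lo, 1e-6)                 # avoid division by zero
    idx = np.minimum((blocks * (points - lo) / span).astype(int), blocks - 1)
    block_size = tuple(span / blocks)                # per-block edge lengths
    rows = []
    for bx in range(blocks):
        for by in range(blocks):
            for bz in range(blocks):
                pts = points[np.all(idx == (bx, by, bz), axis=1)]
                if len(pts) >= 3:
                    rows.append(voxel_descriptor(pts, voxel_size=block_size))
                else:
                    rows.append(np.zeros(FEATURE_DIM))   # pad empty blocks
    return np.concatenate(rows)
```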

Equation (7) expresses how the input data ui from the input layer are combined with the classifier parameters, including the weights wi and threshold θ, in the first hidden layer. In vector Mʹ, u represents an element. Variable m is both the neuron count in the input layer and the dimension of the converted object feature vector Mʹ. The function f is the softmax function, which offers higher learning efficiency in the training process than the sigmoid function. The result c is the input data for a neuron in the second hidden layer. Note that the remaining computation of the employed multilayer NN classifier is omitted.

$$ c = f\left( \sum\limits_{i = 0}^{m} w_{i} u_{i} - \theta \right) \tag{7} $$
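As a worked illustration of Eq. (7), the snippet below evaluates a full hidden layer of such neurons on a flattened feature vector; the weights and thresholds are random placeholders, and the softmax is applied across the layer's outputs because it normalizes over a vector rather than a single scalar.

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    e = np.exp(x - x.max())          # numerically stable softmax
    return e / e.sum()

def hidden_layer(u: np.ndarray, W: np.ndarray, theta: np.ndarray) -> np.ndarray:
    """Eq. (7) applied across a whole layer: c_j = f(sum_i w_ji * u_i - theta_j)."""
    return softmax(W @ u - theta)

# Illustrative shapes only: 675 inputs, 350 neurons in the first hidden layer.
rng = np.random.default_rng(0)
u = rng.normal(size=675)             # flattened object feature vector M'
W = rng.normal(size=(350, 675))      # weights (randomly initialized placeholders)
theta = rng.normal(size=350)         # thresholds
c = hidden_layer(u, W, theta)        # inputs to the second hidden layer
```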

4 Experiments and analysis

In the experiment, a LiDAR sensor was used to collect point cloud data from the environment, including distance and rotation information. We used the CUDA programming model to implement parallel computing in the proposed system to improve feature extraction efficiency. The system was executed on a 2.8 GHz Intel® Core™ i7-7700HQ CPU (with 8 GB RAM) with a GeForce GTX 1050 Ti graphics processing unit. The system utilized the DirectX 9.0 software development kit to render the raw point clouds, voxels, and point clouds in different colors for the individual object recognition results.

Ground points occupied a large portion of the point clouds in outdoor scenes, as illustrated in Fig. 3. Thus, ground point filtering was implemented to reduce the computational complexity of non-ground object clustering and recognition. To speed up the object segmentation and recognition process, a parallel computing method was applied to voxel feature extraction and object feature matrix normalization. The time consumption of the proposed GPU-accelerated method was shorter than that of CPU programming. In the feature-computing process, the point count and point density in each voxel depend on the distance between the voxel and the sensor location; therefore, these two features were excluded, and the feature datasets contained the remaining 23 features. Using these 23 features, 1000 obstacles were labeled manually to train and test the proposed multilayer NN model. As shown in Fig. 4, the labeled datasets comprised 286 walls, 109 poles, 43 pedestrians, 416 trees, and 146 bushes.

Fig. 3 Ground and non-ground point counts in sequential frames

Fig. 4 Numbers of objects in the training and testing datasets

The numbers of input and output neurons in the initialized multilayer NN model were 675 and 4, respectively. We used three hidden layers with 350, 100, and 25 neurons, respectively. Using 800 object feature datasets as training examples, the randomly initialized parameters in the model were modified over 10,000 iterations. Finally, using the trained classifier, the recognition accuracy on the test object feature datasets reached 92.84%. As shown in Table 2, our experiment also compared the accuracy of other classifiers tested on the same datasets.
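For reference, a rough analogue of this classifier (675 inputs and hidden layers of 350, 100, and 25 neurons) could be configured with scikit-learn as sketched below; the feature matrix and labels are placeholders, and this is not the authors' implementation, which was trained on 800 manually labeled objects.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

# Placeholder data: 800 training objects with 675-dimensional normalized feature vectors.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(800, 675))
y_train = rng.integers(0, 5, size=800)      # object class labels (e.g., wall, pole, ...)

clf = MLPClassifier(hidden_layer_sizes=(350, 100, 25),
                    max_iter=10000,          # upper bound on training iterations
                    random_state=0)
clf.fit(X_train, y_train)
accuracy = clf.score(X_train, y_train)       # evaluate on held-out data in practice
```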

Table 2 Accuracy comparison of different classifiers

Figure 5 shows the object recognition results, where different objects are rendered in different colors. Wall objects have prominent features because most of their points lie on a common planar surface; as a result, recognition accuracy is much higher for walls than for other objects. Owing to the limited size of pedestrians, the feature matrix cannot always clearly express the object's characteristics. The accuracy rates for pole and tree objects lie between those of walls and pedestrians. The object recognition accuracy on narrow roads is higher than that in large open spaces because the objects were closer to the LiDAR sensor on the narrow roads than in the open square environment. The point clouds of objects in a narrow environment are denser; thus, the feature matrix can describe the objects' characteristics more precisely. For the same reason, the object recognition rates are related to the distance between the objects and the sensor. In the future, it is expected that object recognition accuracy can be improved significantly if an interpolation algorithm is employed to increase the point cloud density.

Fig. 5 Object recognition results for different scenes: a narrow road with trees and buildings on both sides; b narrow road scene with a pedestrian, trees, and a building on one side

5 Conclusion

To sense urban environment information as precisely as possible, most studies employ LiDAR sensors on UGVs to collect 3D point cloud datasets. Fast and accurate object recognition remains a major challenge in the autonomous driving domain due to the large number of obstacle types and their various features. In this paper, we have proposed a 3D object recognition system with multiple feature extraction from LiDAR point clouds. In a preprocessing step, non-ground points are extracted, and a voxel model is initialized based on the valid range of the remaining point clouds. After computing a feature vector for each voxel, a scale filter is used to generate a normalized feature matrix for each individual object. These object feature matrices are then fed into a multilayer NN model to obtain their object types. Using manually labeled testing datasets, the accuracy rate of the proposed 3D object recognition system was 92.84%. In the future, more object feature datasets with corresponding manual labels will be collected to train the model. In addition, more features will be computed, and their effectiveness will be analyzed to form the feature space and increase accuracy.