Abstract
3D object classification is an important component of semantic scene understanding for mobile robots. However, many current systems do not address practical issues such as how an object appears from the different viewing positions a mobile robot may take. A novel 3D object representation is introduced that combines a cylindrical occupancy grid with a 3D convolutional neural network containing a row-wise max pooling layer. Because this representation is rotationally invariant, robots can classify 3D objects correctly regardless of the position from which object modelling begins. Experimental results on a publicly available benchmark dataset show significantly improved performance compared with conventional algorithms.
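The core idea can be illustrated with a short NumPy sketch: a point cloud is binned into a cylindrical (radius, azimuth, height) occupancy grid, and taking the maximum of the network's feature maps along the azimuth axis removes the dependence on the robot's starting viewing angle. The bin counts, value ranges, and function names below are illustrative assumptions, not the exact parameters or architecture used in the paper.

```python
import numpy as np

def cylindrical_occupancy_grid(points, num_r=16, num_theta=36, num_z=16,
                               r_max=1.0, z_min=-1.0, z_max=1.0):
    """Build a binary occupancy grid over (radius, azimuth, height) bins.

    `points` is an (N, 3) array of x, y, z coordinates centred on the object;
    the bin counts and ranges here are assumptions for illustration only.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.sqrt(x**2 + y**2)
    theta = np.arctan2(y, x)  # azimuth in (-pi, pi]

    # Map each coordinate to a bin index, clipping points outside the range.
    r_idx = np.clip((r / r_max * num_r).astype(int), 0, num_r - 1)
    t_idx = np.clip(((theta + np.pi) / (2 * np.pi) * num_theta).astype(int),
                    0, num_theta - 1)
    z_idx = np.clip(((z - z_min) / (z_max - z_min) * num_z).astype(int),
                    0, num_z - 1)

    grid = np.zeros((num_r, num_theta, num_z), dtype=np.float32)
    grid[r_idx, t_idx, z_idx] = 1.0
    return grid

def rowwise_max_pool(feature_maps):
    """Pool feature maps over the azimuth axis.

    A rotation of the robot's starting viewpoint shows up as a cyclic shift
    along the theta dimension, so the maximum over that axis is unchanged by
    the shift, giving a viewpoint-invariant feature.
    """
    # feature_maps: (channels, num_r, num_theta, num_z) -> (channels, num_r, num_z)
    return feature_maps.max(axis=2)
```

In this sketch the pooled output would be fed to fully connected layers for classification; the invariance comes purely from pooling over the azimuth bins rather than from data augmentation.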
Additional information
Recommended by Associate Editor Dong-Joong Kang under the direction of Editor Euntai Kim. This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIP) (No. 2017R1A2B2002608).
Jiyoun Moon received her Bachelor of Science in Robotics from Kwangwoon University in August 2014. Her major research interests include natural language processing, semantic scene understanding, and mission planning.
Hanjun Kim received his Bachelor of Science in Electrical and Computer Engineering from Seoul National University in February 2015. His major research interests include SLAM, reinforcement learning, and semantic scene understanding.
Beomhee Lee received the B.S. and M.S. degrees in Electronics Engineering from Seoul National University in 1978 and 1980, respectively, and the Ph.D. degree in Computer, Information, and Control Engineering from the University of Michigan, Ann Arbor, MI, USA, in 1985. He was then an Assistant Professor in the School of Electrical Engineering at Purdue University until 1987.
Cite this article
Moon, J., Kim, H. & Lee, B. View-point Invariant 3D Classification for Mobile Robots Using a Convolutional Neural Network. Int. J. Control Autom. Syst. 16, 2888–2895 (2018). https://doi.org/10.1007/s12555-018-0182-y