Abstract
This paper addresses multi-target tracking with a monocular vision sensor. To overcome the fundamental observability limitation of monocular vision, a convolutional neural network (CNN)-based method is proposed that integrates CNN-based multi-target detection into a model-based multi-target tracking framework. While previous CNN applications to image-based object recognition and tracking focused on predicting regions of interest (RoIs), the proposed method predicts the three-dimensional positions of the moving objects of interest. This is achieved by appropriately constructing a network tailored to moving-object tracking problems with potentially occluded objects. In addition, a cubature Kalman filter integrated with a data association scheme is adopted for effective tracking of the objects' nonlinear motion using the measurement information from the learned network. A virtual simulator that generates target-motion trajectories and a sequence of images of the scene has been developed and used to test and verify the proposed CNN scheme. Simulation case studies demonstrate that the proposed CNN substantially improves position accuracy in the depth direction.
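For reference, the measurement-update step of a cubature Kalman filter — the filter family the paper adopts — can be sketched as follows. This is the standard third-degree cubature rule, not the authors' specific implementation, and the function names are illustrative:

```python
import numpy as np

def cubature_points(x, P):
    """2n cubature points for mean x and covariance P (third-degree rule)."""
    n = x.size
    S = np.linalg.cholesky(P)                             # square root of P
    xi = np.sqrt(n) * np.hstack([np.eye(n), -np.eye(n)])  # unit cubature set
    return x[:, None] + S @ xi                            # shape (n, 2n)

def ckf_measurement_update(x, P, z, h, R):
    """One CKF measurement update with a nonlinear measurement function h."""
    n = x.size
    pts = cubature_points(x, P)
    Z = np.column_stack([h(p) for p in pts.T])  # propagate points through h
    z_pred = Z.mean(axis=1)                     # predicted measurement
    dZ = Z - z_pred[:, None]
    dX = pts - x[:, None]
    Pzz = dZ @ dZ.T / (2 * n) + R               # innovation covariance
    Pxz = dX @ dZ.T / (2 * n)                   # cross covariance
    K = Pxz @ np.linalg.inv(Pzz)                # cubature Kalman gain
    return x + K @ (z - z_pred), P - K @ Pzz @ K.T
```

With a linear measurement function the update reduces to the ordinary Kalman filter correction; the cubature points matter only when h is nonlinear, as in camera projection models.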
Additional information
Recommended by Associate Editor Vu Nguyen under the direction of Editor Won-jong Kim. This work was supported in part by ICT R&D program of the Ministry of Science & ICT via Institute for ICT Planning and Evaluation (#R-20150223-000167), and in part by Defense Acquisition Program Administration via High-speed Vehicle Research Center and Agency for Defense Development (#UD170018CD).
Sang-Hyeon Kim is a senior researcher at Samsung Electronics. He received the Ph.D. degree in Aerospace Engineering from KAIST (Korea Advanced Institute of Science and Technology) in 2018. Prior to this, he received the B.S. degree in Aerospace Engineering from Inha University, Incheon, Korea, in 2011 and the M.S. degree in Aerospace Engineering from KAIST, Daejeon, Korea, in 2013. His research interests include vision-based estimation and control for autonomous systems and deep learning techniques.
Han-Lim Choi is an Associate Professor of Aerospace Engineering at KAIST (Korea Advanced Institute of Science and Technology). He received the B.S. and M.S. degrees in aerospace engineering from KAIST, Daejeon, Korea, in 2000 and 2002, respectively, and the Ph.D. degree in aeronautics and astronautics from MIT (Massachusetts Institute of Technology), Cambridge, in 2009. His research interests include decision making for multi-agent systems and machine learning methods for dynamic systems.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Cite this article
Kim, SH., Choi, HL. Convolutional Neural Network for Monocular Vision-based Multi-target Tracking. Int. J. Control Autom. Syst. 17, 2284–2296 (2019). https://doi.org/10.1007/s12555-018-0134-6