Abstract
Siamese tracking methods have recently drawn extensive attention due to their balance of accuracy and efficiency. However, most Siamese-based trackers use a shallow backbone network, which makes it difficult to extract high-level semantic features. When distractors closely resemble the target in appearance, these methods are prone to tracking drift or even failure. To address this deficiency, we propose a Siamese network with enriched semantics, named ESDT. First, a semantic enrichment module (SEM) comprising dilated convolution layers is designed to improve the classification capability of the Siamese tracker. In addition, the target template is updated adaptively to cope with changes in target texture caused by illumination and blur, which further improves tracking performance. Finally, extensive experimental analysis on public datasets shows that the proposed algorithm outperforms several state-of-the-art algorithms and tracks the target stably despite disturbances.
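The two ingredients named in the abstract, dilated convolution and an adaptively updated template, can be illustrated with a minimal sketch. The abstract does not give the SEM architecture or the update rule, so everything below is an assumption for illustration only: a 1-D dilated convolution stands in for the SEM layers, a linear-interpolation update stands in for the adaptive template update, and the names `dilated_conv1d`, `receptive_field`, `update_template` and the parameter `rate` are hypothetical.

```python
def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D dilated (atrous) correlation with no padding.

    Kernel taps are spaced `dilation` samples apart, so the receptive
    field grows without adding any weights.
    """
    span = receptive_field(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        out.append(
            sum(k * signal[start + i * dilation] for i, k in enumerate(kernel))
        )
    return out


def update_template(template, new_features, rate=0.1):
    """Blend the stored template with features from the current frame.

    Assumption: a plain linear interpolation; `rate` is a hypothetical
    hyperparameter, not a value taken from the paper.
    """
    return [(1.0 - rate) * t + rate * n for t, n in zip(template, new_features)]


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    print(dilated_conv1d(x, [1, 1, 1], dilation=1))  # three outputs, span 3
    print(dilated_conv1d(x, [1, 1, 1], dilation=2))  # one output, span 5
    print(update_template([1.0, 0.0], [0.0, 1.0], rate=0.5))
```

With the same three-tap kernel, raising the dilation from 1 to 2 widens the span of each output from 3 to 5 input samples, which is the general reason dilated layers are used to gather wider context at no extra parameter cost.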
Additional information
This work has been supported in part by the National Key Research and Development Project of China (No.2018YFB1601200), and the Fundamental Research Funds for the Central Universities (No.3122018C004).
Cite this article
Wang, Hs., Zhang, Hy. Siamese visual tracking with enriched semantics and dynamic template. Optoelectron. Lett. 17, 241–246 (2021). https://doi.org/10.1007/s11801-021-0073-y
DOI: https://doi.org/10.1007/s11801-021-0073-y