Human Action Recognition by Random Features and Hand-Crafted Features: A Comparative Study

Shen, Haocheng; Zhang, Jianguo; Zhang, Hui

doi:10.1007/978-3-319-16181-5_2

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8926)

Included in the following conference series:

European Conference on Computer Vision

Abstract

One popular approach to human action recognition is to extract features from videos as representations and then classify those representations. In this paper, we investigate and compare hand-crafted and random feature representations for human action recognition on the YouTube dataset. The former is built on 3D HoG/HoF and SIFT descriptors, while the latter is based on random projection. Three encoding methods are adopted: Bag of Features (BoF), Sparse Coding (SC) and VLAD. A spatial-temporal pyramid and a two-layer SVM classifier are employed for classification. Our experiments demonstrate that: 1) Sparse Coding outperforms Bag of Features; 2) a hybrid feature model that incorporates frame-static features can significantly improve overall recognition accuracy; 3) frame-static features work surprisingly better than motion features alone; 4) compared with the success of the hand-crafted feature representation, the random feature representation does not perform well on this dataset.
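As a rough illustration of the pipeline the abstract outlines, the sketch below projects flattened video patches with a Gaussian random matrix and encodes the result with BoF and VLAD. It is not the authors' implementation: the patch size, the projected dimension, the codebook size, the synthetic data, and the function names are all assumptions chosen for the example, and the codebook is sampled from the features as a stand-in for a learned one (for example, k-means centres).

```python
import numpy as np

def random_projection(patches, out_dim, rng):
    """Project flattened patches of shape (n, d) to (n, out_dim)."""
    d = patches.shape[1]
    # Gaussian entries with variance 1/out_dim keep squared norms
    # unchanged in expectation (Johnson-Lindenstrauss style projection).
    proj = rng.normal(0.0, 1.0 / np.sqrt(out_dim), size=(d, out_dim))
    return patches @ proj

def nearest_codeword(features, codebook):
    """Index of the closest codeword (squared Euclidean) for each feature."""
    diffs = features[:, None, :] - codebook[None, :, :]
    return (diffs ** 2).sum(axis=2).argmin(axis=1)

def bof_encode(features, codebook):
    """Bag of Features: L1-normalised histogram of codeword assignments."""
    nearest = nearest_codeword(features, codebook)
    hist = np.bincount(nearest, minlength=codebook.shape[0]).astype(float)
    return hist / hist.sum()

def vlad_encode(features, codebook):
    """VLAD: per-codeword sum of residuals, flattened and L2-normalised."""
    nearest = nearest_codeword(features, codebook)
    vlad = np.zeros_like(codebook, dtype=float)
    for i in range(codebook.shape[0]):
        members = features[nearest == i]
        if len(members):
            vlad[i] = (members - codebook[i]).sum(axis=0)
    vlad = vlad.ravel()
    norm = np.linalg.norm(vlad)
    return vlad / norm if norm > 0 else vlad

rng = np.random.default_rng(0)
# Hypothetical input: 200 synthetic spatio-temporal patches of 7x7x5 pixels.
patches = rng.normal(size=(200, 7 * 7 * 5))
feats = random_projection(patches, out_dim=32, rng=rng)
# Stand-in codebook: 8 features drawn at random (assumption, not k-means).
codebook = feats[rng.choice(len(feats), size=8, replace=False)]
bof = bof_encode(feats, codebook)
vlad = vlad_encode(feats, codebook)
```

Here `bof` has one entry per codeword, while `vlad` has one entry per codeword and feature dimension, which is why VLAD vectors are longer for the same codebook size. Either vector could then be passed to a classifier such as an SVM.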

Author information

Authors and Affiliations

School of Computing, University of Dundee, Dundee, UK
Haocheng Shen & Jianguo Zhang
Department of Computer Science & Technology, United International College, Zhuhai, China
Hui Zhang

Corresponding author

Correspondence to Haocheng Shen.

Editor information

Editors and Affiliations

University College London, London, United Kingdom
Lourdes Agapito
University of Lugano, Lugano, Switzerland
Michael M. Bronstein
Technische Universität Dresden, Dresden, Germany
Carsten Rother

Cite this paper

Shen, H., Zhang, J., Zhang, H. (2015). Human Action Recognition by Random Features and Hand-Crafted Features: A Comparative Study. In: Agapito, L., Bronstein, M., Rother, C. (eds) Computer Vision - ECCV 2014 Workshops. ECCV 2014. Lecture Notes in Computer Science, vol 8926. Springer, Cham. https://doi.org/10.1007/978-3-319-16181-5_2

DOI: https://doi.org/10.1007/978-3-319-16181-5_2
Published: 20 March 2015
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16180-8
Online ISBN: 978-3-319-16181-5
eBook Packages: Computer Science, Computer Science (R0)
