Discriminative Feature Learning for Action Recognition Using a Stacked Denoising Autoencoder

Sang, Ruoxin; Jin, Peiquan; Wan, Shouhong

doi:10.1007/978-3-319-07776-5_54


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 297)


Abstract

In this paper, we propose a novel method for recognizing human actions from the depth information acquired by depth cameras. Representations of depth maps are learned and reconstructed using a stacked denoising autoencoder. Adding a category constraint makes the learned features more discriminative and able to capture the small but significant differences between actions. A greedy layer-wise training strategy is used to train the deep neural network. We then apply temporal pyramid matching to the learned features to generate a temporal representation of each sequence. Finally, a linear SVM is trained to classify each sequence into an action class. Our method is evaluated on the MSR Action3D dataset and shows superiority over other popular methods. Experimental results also indicate that our model has a strong ability to restore highly noisy input data.
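The pipeline summarized in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the tied weights, masking noise, squared-error loss, learning rate, and max pooling over temporal segments are all assumptions chosen for brevity, and the category constraint is omitted because the abstract does not give its form. All function names are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def reconstruction_loss(x, W, b, c):
    """Mean squared reconstruction error on clean inputs (tied weights)."""
    h = sigmoid(x @ W + b)
    r = sigmoid(h @ W.T + c)
    return 0.5 * np.mean(np.sum((r - x) ** 2, axis=1))

def train_dae_layer(x, n_hidden, noise=0.3, lr=0.5, epochs=300, seed=0):
    """Train one denoising autoencoder layer by full-batch gradient descent."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    W = rng.normal(0.0, 0.1, size=(d, n_hidden))
    b = np.zeros(n_hidden)
    c = np.zeros(d)
    for _ in range(epochs):
        # Masking noise (assumed): zero out a random fraction of the inputs.
        x_tilde = x * (rng.random(x.shape) >= noise)
        h = sigmoid(x_tilde @ W + b)
        r = sigmoid(h @ W.T + c)
        # The error is measured against the CLEAN input, not the corrupted one.
        d_r = (r - x) * r * (1.0 - r) / n
        d_h = (d_r @ W) * h * (1.0 - h)
        # Tied weights: W receives gradient from both encoder and decoder.
        W -= lr * (x_tilde.T @ d_h + d_r.T @ h)
        b -= lr * d_h.sum(axis=0)
        c -= lr * d_r.sum(axis=0)
    return W, b, c

def train_stack(x, hidden_sizes, **kwargs):
    """Greedy layer-wise training: each layer is fit on the previous codes."""
    layers, codes = [], x
    for n_hidden in hidden_sizes:
        W, b, c = train_dae_layer(codes, n_hidden, **kwargs)
        layers.append((W, b, c))
        codes = sigmoid(codes @ W + b)
    return layers

def encode(x, layers):
    """Map inputs through every trained encoder in order."""
    for W, b, _ in layers:
        x = sigmoid(x @ W + b)
    return x

def temporal_pyramid(frame_features, levels=3):
    """Concatenate max-pooled features over 1, 2, 4, ... temporal segments.

    Needs at least 2 ** (levels - 1) frames so that no segment is empty.
    """
    pooled = []
    for level in range(levels):
        for segment in np.array_split(frame_features, 2 ** level, axis=0):
            pooled.append(segment.max(axis=0))
    return np.concatenate(pooled)
```

In this sketch, the per-frame codes returned by `encode` for one sequence would be passed to `temporal_pyramid`, and the resulting fixed-length vectors would then be used to train a linear SVM, which is not shown here.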



Author information

Authors and Affiliations

School of Computer Science and Technology, Key Laboratory of Electromagnetic Space Information, Chinese Academy of Sciences, University of Science and Technology of China, Hefei, China
Ruoxin Sang, Peiquan Jin & Shouhong Wan


Corresponding author

Correspondence to Ruoxin Sang.

Editor information

Editors and Affiliations

National Kaohsiung University of Applied Sciences, Kaohsiung, Taiwan
Jeng-Shyang Pan
Department of Computer Science, Faculty of Elec. Eng. & Comp. Sci., VSB-Technical University of Ostrava, Ostrava-Poruba, Czech Republic
Vaclav Snasel
Departamento de Informática y Automática, Facultad de Biología, University of Salamanca, Salamanca, Spain
Emilio S. Corchado
Scientific Network for Innovation and Research Excellence, Machine Intelligence Research Labs (MIR Labs), Auburn, Washington, USA
Ajith Abraham
Department of Information Management, National University of Kaohsiung, Kaohsiung, Taiwan
Shyue-Liang Wang


Cite this paper

Sang, R., Jin, P., Wan, S. (2014). Discriminative Feature Learning for Action Recognition Using a Stacked Denoising Autoencoder. In: Pan, JS., Snasel, V., Corchado, E., Abraham, A., Wang, SL. (eds) Intelligent Data analysis and its Applications, Volume I. Advances in Intelligent Systems and Computing, vol 297. Springer, Cham. https://doi.org/10.1007/978-3-319-07776-5_54


DOI: https://doi.org/10.1007/978-3-319-07776-5_54
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07775-8
Online ISBN: 978-3-319-07776-5
eBook Packages: Engineering, Engineering (R0)
