Abstract
We present a deep learning framework for video classification applicable to face recognition and dynamic texture recognition. We design a Deep Autoencoder Network Template (DANT) whose weights are initialized by layer-wise unsupervised pre-training using Gaussian Restricted Boltzmann Machines. To obtain a class-specific network and fine-tune its weights, the pre-initialized DANT is trained separately on each class of video sequences. Classification is performed by majority voting over per-frame reconstruction errors. Extensive evaluation and comparisons with state-of-the-art approaches on the Honda/UCSD, DynTex, and YUPENN databases demonstrate that the proposed method significantly improves dynamic texture classification performance.
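The classification rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the trained per-class DANTs are stood in for by arbitrary callables mapping a frame to its reconstruction, and the GRBM pre-training and fine-tuning stages are omitted. Each frame votes for the class whose autoencoder reconstructs it with the lowest error, and the video is assigned the class with the most votes.

```python
import numpy as np

def reconstruction_error(autoencoder, frame):
    """Squared reconstruction error of one frame under one class's autoencoder.

    `autoencoder` is any callable mapping a frame (ndarray) to its
    reconstruction; in the paper's setting it would be the DANT
    fine-tuned on that class.
    """
    recon = autoencoder(frame)
    return float(np.sum((frame - recon) ** 2))

def classify_video(frames, class_autoencoders):
    """Majority voting over per-frame reconstruction errors.

    Each frame votes for the class whose autoencoder reconstructs it
    best (lowest error); the video's label is the most-voted class.
    """
    votes = []
    for frame in frames:
        errors = [reconstruction_error(ae, frame) for ae in class_autoencoders]
        votes.append(int(np.argmin(errors)))
    counts = np.bincount(votes, minlength=len(class_autoencoders))
    return int(counts.argmax())
```

For example, with a toy two-class setup where class 0's "autoencoder" reconstructs frames perfectly and class 1's does not, every frame of a video votes for class 0 and the video is labeled 0.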
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Hajati, F., Tavakolian, M. (2020). Video Classification Using Deep Autoencoder Network. In: Barolli, L., Hussain, F., Ikeda, M. (eds) Complex, Intelligent, and Software Intensive Systems. CISIS 2019. Advances in Intelligent Systems and Computing, vol 993. Springer, Cham. https://doi.org/10.1007/978-3-030-22354-0_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22353-3
Online ISBN: 978-3-030-22354-0
eBook Packages: Intelligent Technologies and Robotics (R0)