Video Classification Using Deep Autoencoder Network

Conference paper: Complex, Intelligent, and Software Intensive Systems (CISIS 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 993)

Abstract

We present a deep learning framework for video classification applicable to face recognition and dynamic texture recognition. A Deep Autoencoder Network Template (DANT) is designed whose weights are initialized by unsupervised, layer-wise pre-training using Gaussian Restricted Boltzmann Machines. To obtain a class-specific network and fine-tune its weights, the pre-initialized DANT is then trained separately on each class of video sequences. A majority voting scheme based on the reconstruction error is employed for classification. Extensive evaluation and comparison with state-of-the-art approaches on the Honda/UCSD, DynTex, and YUPENN databases demonstrate that the proposed method significantly improves dynamic texture classification performance.
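The classification stage described in the abstract — score each frame by its reconstruction error under every class-specific network, then take a majority vote over frames — can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: a tied-weight *linear* autoencoder stands in for the deep, RBM-pretrained DANT, and the function names (`reconstruction_error`, `classify_video`) are ours.

```python
import numpy as np

def reconstruction_error(x, W):
    """Squared error of reconstructing frame x with a tied-weight
    linear autoencoder whose encoder/decoder matrix is W."""
    x_hat = W @ (W.T @ x)          # encode, then decode with tied weights
    return float(np.sum((x - x_hat) ** 2))

def classify_video(frames, class_models):
    """Assign each frame to the class whose autoencoder reconstructs it
    best (lowest error), then return the majority vote over frames."""
    votes = []
    for x in frames:
        errors = [reconstruction_error(x, W) for W in class_models]
        votes.append(int(np.argmin(errors)))
    return max(set(votes), key=votes.count)

# Toy example: two "classes" whose autoencoders span disjoint subspaces.
I = np.eye(4)
W0, W1 = I[:, :2], I[:, 2:]        # class 0 spans e0,e1; class 1 spans e2,e3
frames = [np.array([1.0, 2.0, 0.0, 0.0]),
          np.array([0.5, -1.0, 0.0, 0.0]),
          np.array([0.0, 0.0, 1.0, 0.0])]
label = classify_video(frames, [W0, W1])
```

In this toy setup the first two frames lie in class 0's subspace (zero error under `W0`), so class 0 wins the vote even though the third frame votes for class 1 — mirroring how per-frame errors are aggregated into a single video label.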


Author information

Correspondence to Farshid Hajati.

Copyright information

© 2020 Springer Nature Switzerland AG

Cite this paper

Hajati, F., Tavakolian, M. (2020). Video Classification Using Deep Autoencoder Network. In: Barolli, L., Hussain, F., Ikeda, M. (eds) Complex, Intelligent, and Software Intensive Systems. CISIS 2019. Advances in Intelligent Systems and Computing, vol 993. Springer, Cham. https://doi.org/10.1007/978-3-030-22354-0_45
