Abstract
We present a deep learning framework for video classification applicable to face recognition and dynamic texture recognition. We design a Deep Autoencoder Network Template (DANT) whose weights are initialized by layer-wise unsupervised pre-training using Gaussian Restricted Boltzmann Machines. To obtain a class-specific network and fine-tune its weights, the pre-initialized DANT is trained separately on each class of video sequences. Classification is performed by majority voting over per-frame reconstruction errors. Extensive evaluation and comparisons with state-of-the-art approaches on the Honda/UCSD, DynTex, and YUPENN databases demonstrate that the proposed method significantly improves dynamic texture classification performance.
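The classification rule described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the trained per-class DANTs are stood in for by arbitrary callables mapping a frame to its reconstruction, and the GRBM pre-training and fine-tuning stages are omitted. Each frame votes for the class whose autoencoder reconstructs it with the lowest error, and the video is assigned the class with the most votes.

```python
import numpy as np

def reconstruction_error(autoencoder, frame):
    """Squared reconstruction error of one frame under one class's autoencoder.

    `autoencoder` is any callable mapping a frame (ndarray) to its
    reconstruction; in the paper's setting it would be the DANT
    fine-tuned on that class.
    """
    recon = autoencoder(frame)
    return float(np.sum((frame - recon) ** 2))

def classify_video(frames, class_autoencoders):
    """Majority voting over per-frame reconstruction errors.

    Each frame votes for the class whose autoencoder reconstructs it
    best (lowest error); the video's label is the most-voted class.
    """
    votes = []
    for frame in frames:
        errors = [reconstruction_error(ae, frame) for ae in class_autoencoders]
        votes.append(int(np.argmin(errors)))
    counts = np.bincount(votes, minlength=len(class_autoencoders))
    return int(counts.argmax())
```

For example, with a toy two-class setup where class 0's "autoencoder" reconstructs frames perfectly and class 1's does not, every frame of a video votes for class 0 and the video is labeled 0.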
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Hajati, F., Tavakolian, M. (2020). Video Classification Using Deep Autoencoder Network. In: Barolli, L., Hussain, F., Ikeda, M. (eds) Complex, Intelligent, and Software Intensive Systems. CISIS 2019. Advances in Intelligent Systems and Computing, vol 993. Springer, Cham. https://doi.org/10.1007/978-3-030-22354-0_45
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-22353-3
Online ISBN: 978-3-030-22354-0
eBook Packages: Intelligent Technologies and Robotics (R0)