Abstract
Convolutional neural networks (CNNs) have been established as a powerful class of models for image recognition problems. Encouraged by these results, we provide an extensive empirical evaluation of CNNs on large-scale video classification using a new dataset of 8M YouTube videos. To obtain the videos and their labels, we used a YouTube video annotation system, which labels videos with their main topics. While the labels are machine-generated, they are high-precision and are derived from a variety of human-based signals, including metadata and query click signals. We filtered the video labels (Knowledge Graph entities) using both automated and manual curation strategies, including checking whether each label is visually recognizable. We then decode each video at one frame per second and use a deep CNN pre-trained on ImageNet to extract the hidden representation immediately before the classification layer. Finally, we compressed the frame features and made both the features and the video-level labels available for download. We train various (modest) classification models on the dataset, evaluate them using popular evaluation metrics, and report them as baselines. Despite the size of the dataset, some of our models train to convergence in less than a day on a single machine using VGG. We plan to release code for training CNN models and generating predictions.
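The frame-sampling and feature-extraction steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `one_fps_frame_indices` and `extract_frame_features` are hypothetical, and `embed` is a stand-in for whatever pre-trained CNN produces the representation just before the classification layer.

```python
from typing import Callable, List, Sequence


def one_fps_frame_indices(num_frames: int, native_fps: float) -> List[int]:
    """Pick the frame nearest to each whole second of the video.

    Hypothetical helper: the abstract only states that videos are decoded
    at one frame per second, not how the frames are chosen.
    """
    if num_frames <= 0 or native_fps <= 0:
        return []
    duration_s = num_frames / native_fps
    indices: List[int] = []
    second = 0
    while second < duration_s:
        # Clamp so rounding never points past the last frame.
        idx = min(int(round(second * native_fps)), num_frames - 1)
        indices.append(idx)
        second += 1
    return indices


def extract_frame_features(
    frames: Sequence[Sequence[float]],
    indices: Sequence[int],
    embed: Callable[[Sequence[float]], List[float]],
) -> List[List[float]]:
    """Apply an embedding function to each sampled frame.

    `embed` is an assumption standing in for a CNN truncated before its
    classification layer; any callable returning a feature vector works.
    """
    return [embed(frames[i]) for i in indices]


if __name__ == "__main__":
    # Toy "video": 90 one-pixel frames at 30 fps (3 seconds of footage).
    toy_frames = [[float(i)] for i in range(90)]
    sampled = one_fps_frame_indices(len(toy_frames), 30.0)
    features = extract_frame_features(toy_frames, sampled, lambda f: [sum(f)])
    print(sampled, features)
```

In a real pipeline the toy frames would be decoded images and `embed` would wrap a pre-trained network; the sampling logic stays the same.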
© 2021 The Editor(s) (if applicable) and The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
SravyaPranati, B., Suma, D., ManjuLatha, C., Putheti, S. (2021). Large-Scale Video Classification with Convolutional Neural Networks. In: Senjyu, T., Mahalle, P.N., Perumal, T., Joshi, A. (eds) Information and Communication Technology for Intelligent Systems. ICTIS 2020. Smart Innovation, Systems and Technologies, vol 196. Springer, Singapore. https://doi.org/10.1007/978-981-15-7062-9_69
DOI: https://doi.org/10.1007/978-981-15-7062-9_69
Publisher Name: Springer, Singapore
Print ISBN: 978-981-15-7061-2
Online ISBN: 978-981-15-7062-9
eBook Packages: Intelligent Technologies and Robotics, Intelligent Technologies and Robotics (R0)