Abstract
Automatic semantic classification of video databases is very useful for users who search and browse them, but it remains a challenging research problem. Combining the visual and text modalities is one of the key issues in bridging the semantic gap between signal and semantics. In this paper, we propose to enhance the classification of high-level concepts using intermediate topic concepts, and we study various fusion strategies for combining topic concepts with visual features in order to outperform unimodal classifiers. We conducted several experiments on the TRECVID’05 collection and show that several intermediate topic classifiers can bridge part of the semantic gap and help detect high-level concepts.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ayache, S., Quénot, G., Gensel, J., Satoh, S. (2006). Using Topic Concepts for Semantic Video Shots Classification. In: Sundaram, H., Naphade, M., Smith, J.R., Rui, Y. (eds) Image and Video Retrieval. CIVR 2006. Lecture Notes in Computer Science, vol 4071. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11788034_31
DOI: https://doi.org/10.1007/11788034_31
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-36018-6
Online ISBN: 978-3-540-36019-3
eBook Packages: Computer Science (R0)