Abstract
Exploiting concept correlations is a promising way to boost the performance of concept detection systems used for concept-based video indexing or annotation. Stacking approaches, which can model this correlation information, appear to be the most commonly used techniques to this end. This paper performs a comparative study and proposes an improved way of employing stacked models, by using multi-label classification methods in the last level of the stack. Experimental results on the TRECVID 2011 and 2012 semantic indexing task datasets show the effectiveness of the proposed framework compared to existing works. In addition, as part of our comparative study, we investigate whether evaluating concept detection results at the level of individual concepts, as is typically done in the literature, is appropriate for assessing their usefulness in both video indexing applications and the somewhat different problem of video annotation.
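The distinction between the two evaluation settings can be made concrete with a small sketch. It assumes a matrix of detector scores with one row per video shot and one column per concept. The indexing view ranks shots for each concept, while the annotation view ranks concepts for each shot. Average precision is used here only as an illustrative measure, and the function names and toy values are hypothetical; none of this reproduces the paper's actual evaluation protocol.

```python
from typing import List


def average_precision(scores: List[float], labels: List[int]) -> float:
    """Average precision of the ranking induced by `scores` against binary `labels`.

    Returns 0.0 when there are no positive labels (an assumption made here
    to keep the sketch simple).
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def mean_ap_per_concept(scores: List[List[float]], labels: List[List[int]]) -> float:
    """Indexing view: for each concept (column), rank all shots."""
    n_concepts = len(scores[0])
    aps = [
        average_precision([row[c] for row in scores], [row[c] for row in labels])
        for c in range(n_concepts)
    ]
    return sum(aps) / len(aps)


def mean_ap_per_shot(scores: List[List[float]], labels: List[List[int]]) -> float:
    """Annotation view: for each shot (row), rank all concepts."""
    aps = [average_precision(s, l) for s, l in zip(scores, labels)]
    return sum(aps) / len(aps)


# Hypothetical toy data: 2 shots x 2 concepts. The first concept's detector
# outputs systematically higher scores than the second one's.
SCORES = [[0.9, 0.2],
          [0.6, 0.3]]
LABELS = [[1, 0],
          [0, 1]]

print(mean_ap_per_concept(SCORES, LABELS))  # 1.0
print(mean_ap_per_shot(SCORES, LABELS))     # 0.75
```

In this toy case every per-concept ranking is perfect, yet the second shot would be annotated with the wrong concept first, because scores are not comparable across detectors. This is the kind of gap that a purely concept-level evaluation can hide.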
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Markatopoulou, F., Mezaris, V., Kompatsiaris, I. (2014). A Comparative Study on the Use of Multi-label Classification Techniques for Concept-Based Video Indexing and Annotation. In: Gurrin, C., Hopfgartner, F., Hurst, W., Johansen, H., Lee, H., O’Connor, N. (eds) MultiMedia Modeling. MMM 2014. Lecture Notes in Computer Science, vol 8325. Springer, Cham. https://doi.org/10.1007/978-3-319-04114-8_1
DOI: https://doi.org/10.1007/978-3-319-04114-8_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-04113-1
Online ISBN: 978-3-319-04114-8
eBook Packages: Computer Science (R0)