Abstract
This paper presents a method for the hierarchical classification of image galleries into a taxonomy. The proposed method links textual gallery metadata to Wikipedia pages and categories. Entity extraction from metadata, entity ranking, and category selection are all based on Wikipedia and do not require labeled training data. The resulting system performs well above a random baseline and achieves a micro-averaged F-score of 0.59 on the 9 top categories of the taxonomy and 0.40 when using all 57 categories.
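To make the reported metric concrete, the following is a minimal sketch of a micro-averaged F-score, in which true positives, false positives, and false negatives are pooled over all galleries before precision and recall are computed. It assumes each gallery may carry a set of gold categories and a set of predicted categories; the function name `micro_f_score` and the category labels in the example are hypothetical and are not taken from the paper, whose exact evaluation protocol is not given in the abstract.

```python
from typing import List, Set


def micro_f_score(gold: List[Set[str]], predicted: List[Set[str]]) -> float:
    """Micro-averaged F-score: pool counts over all items, then compute P, R, F."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")

    tp = fp = fn = 0
    for gold_labels, pred_labels in zip(gold, predicted):
        tp += len(gold_labels & pred_labels)   # correctly assigned categories
        fp += len(pred_labels - gold_labels)   # assigned but not in the gold set
        fn += len(gold_labels - pred_labels)   # in the gold set but not assigned

    if tp == 0:
        return 0.0  # avoids division by zero when nothing is correct

    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example data (illustrative labels only).
gold = [{"Sports"}, {"Music", "Film"}, {"Travel"}]
predicted = [{"Sports"}, {"Music"}, {"Food"}]
score = micro_f_score(gold, predicted)
print(f"micro-averaged F-score: {score:.3f}")
```

Because counts are pooled before averaging, frequent categories influence the score more than rare ones. This is one reason a micro-averaged score over 57 fine-grained categories can differ noticeably from the score over 9 coarse ones.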
© 2012 Springer-Verlag Berlin Heidelberg
Cite this paper
Kramer, G., Bouma, G., Hendriksen, D., Homminga, M. (2012). Classifying Image Galleries into a Taxonomy Using Metadata and Wikipedia. In: Bouma, G., Ittoo, A., Métais, E., Wortmann, H. (eds) Natural Language Processing and Information Systems. NLDB 2012. Lecture Notes in Computer Science, vol 7337. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31178-9_20
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31177-2
Online ISBN: 978-3-642-31178-9
eBook Packages: Computer Science, Computer Science (R0)