Abstract
Text annotation is the process of identifying the semantically dominant words of a text segment and linking them to the concepts they denote in context. In this paper, we propose novel methods for automatically annotating text fragments with entities from Wikipedia, the largest online knowledge base. This process, commonly known as Wikification, aims to accurately resolve the semantics of synonymous and polysemous terms. The cornerstone of our contribution is a novel iterative Wikification approach that converges to optimal annotations while balancing accuracy against performance. Our first two methods can be fine-tuned through a machine-learning technique over large homogeneous data sets. In our experimental evaluation, the proposed methods showed a remarkable improvement over state-of-the-art Wikification approaches.
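To make the idea of iterative annotation concrete, the following is a minimal sketch of a generic iterative disambiguation loop. It is not the method proposed in the paper, whose details are not given in this abstract. The candidate entities, the prior scores, the relatedness values, the weight `alpha`, and the function names `relatedness`, `score` and `annotate` are all hypothetical and chosen only for illustration. Each mention starts with its most common sense and is then re-assigned to the candidate that best agrees with the senses currently chosen for the other mentions, until no assignment changes.

```python
from typing import Dict, List

# Hypothetical toy data: candidate Wikipedia entities for each mention,
# each with an assumed prior probability ("commonness") of that sense.
CANDIDATES: Dict[str, Dict[str, float]] = {
    "jaguar": {"Jaguar_(animal)": 0.6, "Jaguar_Cars": 0.4},
    "engine": {"Engine": 0.9, "Game_engine": 0.1},
    "sedan": {"Sedan_(automobile)": 0.8, "Sedan,_Ardennes": 0.2},
}

# Hypothetical symmetric relatedness scores in [0, 1]; missing pairs count as 0.
RELATEDNESS: Dict[frozenset, float] = {
    frozenset({"Jaguar_Cars", "Engine"}): 0.8,
    frozenset({"Jaguar_Cars", "Sedan_(automobile)"}): 0.9,
    frozenset({"Engine", "Sedan_(automobile)"}): 0.7,
    frozenset({"Jaguar_(animal)", "Engine"}): 0.05,
}


def relatedness(a: str, b: str) -> float:
    """Look up the assumed relatedness of two entities."""
    return RELATEDNESS.get(frozenset({a, b}), 0.0)


def score(mention: str, entity: str, others: List[str], alpha: float) -> float:
    """Blend the sense prior with average coherence to the other chosen entities."""
    coherence = sum(relatedness(entity, o) for o in others) / max(len(others), 1)
    return alpha * CANDIDATES[mention][entity] + (1 - alpha) * coherence


def annotate(mentions: List[str], max_iters: int = 10, alpha: float = 0.3) -> Dict[str, str]:
    """Iteratively refine the mention-to-entity assignment until it stops changing."""
    # Start from the most common sense of every mention.
    assignment = {m: max(CANDIDATES[m], key=CANDIDATES[m].get) for m in mentions}
    for _ in range(max_iters):
        changed = False
        for m in mentions:
            others = [assignment[o] for o in mentions if o != m]
            best = max(CANDIDATES[m], key=lambda e: score(m, e, others, alpha))
            if best != assignment[m]:
                assignment[m] = best
                changed = True
        if not changed:  # fixed point reached
            break
    return assignment


if __name__ == "__main__":
    print(annotate(["jaguar", "engine", "sedan"]))
```

With this toy data, the polysemous mention "jaguar" starts as the animal sense because of its higher prior, and switches to the car maker once the surrounding mentions are taken into account. A mention processed alone keeps its most common sense, since there is no context to override the prior.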
Copyright information
© 2014 IFIP International Federation for Information Processing
About this paper
Cite this paper
Makris, C., Simos, M.A. (2014). Novel Techniques for Text Annotation with Wikipedia Entities. In: Iliadis, L., Maglogiannis, I., Papadopoulos, H. (eds) Artificial Intelligence Applications and Innovations. AIAI 2014. IFIP Advances in Information and Communication Technology, vol 436. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-44654-6_50
DOI: https://doi.org/10.1007/978-3-662-44654-6_50
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-44653-9
Online ISBN: 978-3-662-44654-6
eBook Packages: Computer Science, Computer Science (R0)