Abstract
The importance of research on knowledge management is growing due to recent issues with big data. The most fundamental steps in knowledge management are the extraction and construction of terminologies. Terms are often expressed in various forms, and these term variations are an obstacle that causes knowledge systems to extract unnecessary knowledge. To solve this problem, we propose a term normalization method that finds the normalized form (the original, standard form defined in dictionaries) of variant terms. The method employs two characteristics of terms: appearance similarity, which measures how similar the terms are in appearance, and context similarity, which measures how many clue words they share. Through experiments, we show that both similarities have a positive influence on term normalization.
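As a rough illustration of how the two similarities might be combined, the sketch below scores a variant term against dictionary entries. It is not the measure defined in the paper: the use of `difflib.SequenceMatcher` for appearance similarity, Jaccard overlap of clue words for context similarity, the weight `alpha`, and all function names and sample data are assumptions made for this example.

```python
from difflib import SequenceMatcher


def appearance_similarity(a, b):
    """Surface-string similarity in [0, 1] (assumed: difflib ratio)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def context_similarity(clues_a, clues_b):
    """Share of clue words two terms have in common (assumed: Jaccard)."""
    if not clues_a or not clues_b:
        return 0.0
    return len(clues_a & clues_b) / len(clues_a | clues_b)


def normalize(variant, variant_clues, dictionary, alpha=0.5):
    """Return the dictionary term with the highest combined score.

    `dictionary` maps each standard term to its set of clue words.
    `alpha` is a hypothetical weight between the two similarities.
    """
    best_term, best_score = None, -1.0
    for term, clues in dictionary.items():
        score = (alpha * appearance_similarity(variant, term)
                 + (1 - alpha) * context_similarity(variant_clues, clues))
        if score > best_score:
            best_term, best_score = term, score
    return best_term, best_score


# Hypothetical dictionary entries and clue words, for illustration only.
dictionary = {
    "knowledge management": {"data", "system", "organization"},
    "knowledge engineering": {"expert", "rule", "system"},
}
best_term, best_score = normalize(
    "knowledge-management", {"data", "system"}, dictionary
)
```

In this toy setup the variant resembles both entries in appearance, so the shared clue words help separate them, which is the intuition behind using both similarities together.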
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hwang, M., Jeong, DH., Jung, H., Sung, WK., Shin, J., Kim, P. (2012). A Term Normalization Method for Better Performance of Terminology Construction. In: Rutkowski, L., Korytkowski, M., Scherer, R., Tadeusiewicz, R., Zadeh, L.A., Zurada, J.M. (eds) Artificial Intelligence and Soft Computing. ICAISC 2012. Lecture Notes in Computer Science, vol 7267. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29347-4_79
DOI: https://doi.org/10.1007/978-3-642-29347-4_79
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29346-7
Online ISBN: 978-3-642-29347-4
eBook Packages: Computer Science, Computer Science (R0)