Abstract
This paper describes an automatic approach for identifying lexical patterns that represent semantic relationships between concepts in an on-line encyclopedia. These patterns can then be applied to extend existing ontologies or semantic networks with new relations. The experiments were performed with the Simple English Wikipedia and WordNet 1.7. A new algorithm has been devised for automatically generalising the lexical patterns found in the encyclopedia entries. We have found general patterns for the hyperonymy, hyponymy, holonymy and meronymy relations and, using them, we have extracted more than 1200 new relationships that did not originally appear in WordNet. The precision of these relationships ranges between 0.61 and 0.69, depending on the relation.
This work has been sponsored by CICYT, project numbers TIC2002-01948 and TIN2004-03140.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Ruiz-Casado, M., Alfonseca, E., Castells, P. (2005). Automatic Extraction of Semantic Relationships for WordNet by Means of Pattern Learning from Wikipedia. In: Montoyo, A., Muñoz, R., Métais, E. (eds) Natural Language Processing and Information Systems. NLDB 2005. Lecture Notes in Computer Science, vol 3513. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11428817_7
DOI: https://doi.org/10.1007/11428817_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-26031-8
Online ISBN: 978-3-540-32110-1
eBook Packages: Computer Science (R0)