Abstract
This article describes a combined method for acquiring valuable terms and relations from raw text, based on an iterative algorithm for automated terminology extraction from Ukrainian-language scientific texts. Special attention is paid to analyzing the lexicographical features of characteristic text fragments in the documents, and the specific features of Ukrainian-language documents are taken into account. The emphasis is on the applied problem of acquiring terminology from input texts in the widely used PDF format and producing output term relations in RDF format.
Additional information
Translated from Kibernetika i Sistemnyi Analiz, No. 6, pp. 53–62, November–December, 2014.
About this article
Cite this article
Glybovets, A.M., Reshetnov, I.V. An Iterative Approach to the Terminology Extraction from Ukrainian-Language Scientific Text Corpora. Cybern Syst Anal 50, 866–873 (2014). https://doi.org/10.1007/s10559-014-9677-6
DOI: https://doi.org/10.1007/s10559-014-9677-6