Abstract
This paper describes POWLA, a generic formalism for representing linguistic annotations in an interoperable way by means of OWL/DL. Unlike other approaches in this direction, POWLA is not tied to a specific selection of annotation layers but is designed to support any kind of text-oriented annotation.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Chiarcos, C. (2012). POWLA: Modeling Linguistic Corpora in OWL/DL. In: Simperl, E., Cimiano, P., Polleres, A., Corcho, O., Presutti, V. (eds) The Semantic Web: Research and Applications. ESWC 2012. Lecture Notes in Computer Science, vol 7295. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30284-8_22
DOI: https://doi.org/10.1007/978-3-642-30284-8_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30283-1
Online ISBN: 978-3-642-30284-8
eBook Packages: Computer Science (R0)