Abstract
This paper presents an annotation scheme for events in Spanish texts, based on the TimeML scheme for English. The scheme is contrasted with other TimeML-based proposals for Romance languages: Italian, French and Spanish. Two corpora of Spanish text, manually annotated under the proposed scheme, are now available. Although manual annotation is far from trivial, we obtained very good agreement on event identification: 93% of events were identified identically by both annotators. Part of the annotated text was used as a training corpus for the automatic recognition of events. In the experiments conducted so far, using SVM and CRF models, our best result (an F-measure of 80.3%) is in line with the state of the art for this task.
Keywords
- Support Vector Machine
- Natural Language Processing
- Conditional Random Field
- Factivity Attribute
- Training Corpus
These keywords were added by machine and not by the authors.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wonsever, D., Rosá, A., Malcuori, M., Moncecchi, G., Descoins, A. (2012). Event Annotation Schemes and Event Recognition in Spanish Texts. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2012. Lecture Notes in Computer Science, vol 7182. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28601-8_18
DOI: https://doi.org/10.1007/978-3-642-28601-8_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28600-1
Online ISBN: 978-3-642-28601-8
eBook Packages: Computer Science, Computer Science (R0)