Abstract
Effectively managing the collaboration of many annotators is a crucial ingredient for the success of larger annotation projects. For collaboration, web-based tools offer a low-barrier way of gathering annotations from distributed contributors. While the management structure of annotation tools is more or less stable across projects, the kinds of annotations vary widely between projects. The challenge for web-based tools for multi-layer text annotation is to combine ease of use and availability through the web with maximal flexibility regarding the types and layers of annotations. In this chapter, we outline requirements for web-based annotation tools in detail and review a variety of tools with respect to these requirements. Further, we discuss two web-based multi-layer annotation tools in detail: GATE Teamware and WebAnno. While differing in some aspects, both tools largely fulfill the requirements for today’s web-based annotation tools. Finally, we point out further directions, such as increased schema flexibility and tighter integration of automation for annotation suggestions.
Notes
- 1.
The corresponding code is still present in ELAN 4.6.1, but it is disabled and appears not to have been touched for several years.
- 2.
- 3.
- 4.
Source code and documentation are available from http://gate.ac.uk/teamware/.
- 5.
Available to use and trial at http://gatecloud.net.
- 6.
- 7.
Available for download at: http://webanno.googlecode.com/.
- 8.
- 9.
- 10.
- 11.
Copyright information
© 2017 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Biemann, C., Bontcheva, K., Eckart de Castilho, R., Gurevych, I., Yimam, S.M. (2017). Collaborative Web-Based Tools for Multi-layer Text Annotation. In: Ide, N., Pustejovsky, J. (eds) Handbook of Linguistic Annotation. Springer, Dordrecht. https://doi.org/10.1007/978-94-024-0881-2_8
DOI: https://doi.org/10.1007/978-94-024-0881-2_8
Published:
Publisher Name: Springer, Dordrecht
Print ISBN: 978-94-024-0879-9
Online ISBN: 978-94-024-0881-2
eBook Packages: Social Sciences (R0)