Abstract
We describe a recently developed corpus annotation scheme for evaluating parsers that avoids some of the shortcomings of current methods. The scheme encodes grammatical relations between heads and dependents, and has been used to mark up a new public-domain corpus of naturally occurring English text. We show how the corpus can be used to evaluate the accuracy of a robust parser, and relate the corpus to extant resources.
This work was carried out while the second author was at the University of Sussex.
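As an illustration of how a corpus annotated with grammatical relations between heads and dependents might be used to score a parser, the sketch below compares a set of gold-standard relations with a parser's output and reports precision, recall and F1. The abstract does not state which measures are used, so these measures are an assumption. The relation labels (`subj`, `obj`, `det`), the `GR` tuple layout and the `score` function are also hypothetical, and are not the annotation scheme itself.

```python
from typing import NamedTuple


class GR(NamedTuple):
    """A grammatical relation: a typed link from a head to a dependent."""
    relation: str   # hypothetical label, e.g. "subj" or "obj"
    head: str
    dependent: str


def score(gold: set[GR], test: set[GR]) -> tuple[float, float, float]:
    """Return (precision, recall, F1) of parser relations against the gold set."""
    matched = len(gold & test)  # relations that agree on label, head and dependent
    precision = matched / len(test) if test else 0.0
    recall = matched / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy annotation for "the dog chased the cat" (invented for illustration only).
gold = {
    GR("subj", "chased", "dog"),
    GR("obj", "chased", "cat"),
    GR("det", "dog", "the"),
}
# Hypothetical parser output: one correct relation, one wrong dependent.
parsed = {
    GR("subj", "chased", "dog"),
    GR("obj", "chased", "dog"),
}

precision, recall, f1 = score(gold, parsed)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Because each relation is scored independently, a single attachment error costs one relation and leaves the rest of the analysis unaffected.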
© 2003 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Carroll, J., Minnen, G., Briscoe, T. (2003). Parser Evaluation. In: Abeillé, A. (ed.) Treebanks. Text, Speech and Language Technology, vol 20. Springer, Dordrecht. https://doi.org/10.1007/978-94-010-0201-1_17
DOI: https://doi.org/10.1007/978-94-010-0201-1_17
Publisher Name: Springer, Dordrecht
Print ISBN: 978-1-4020-1335-5
Online ISBN: 978-94-010-0201-1