Abstract
This paper argues that the robustness of an automatic natural language parser is something different from the mere ability to construct a syntactic structure for every sequence of word forms (sentence) of a given language.
Robustness in our terminology should be more accurate, in the sense that a robust parser should be able to distinguish between “good” and “bad” ill-formed sentences. We propose two measures for this purpose: node-gap complexity, which describes the complexity of a sentence with regard to nonprojective constructions, and the degree of robustness, which takes into account the number of syntactic inconsistencies encountered in the process of robust parsing. These measures make it possible to develop a scale of global constraints that allows a kind of gradual parsing of both syntactically well-formed and ill-formed sentences of a natural language.
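To make the idea of a gap-based measure concrete, the following is a minimal sketch under stated assumptions, not the definition used in the paper. It assumes that a dependency tree is given as a list of head indices (with -1 for the root), that the coverage of a node is the set of word positions in its subtree, and that the measure is the largest number of gaps found in any node's coverage. The function names `subtree_positions`, `count_gaps`, and `node_gap_complexity` are hypothetical and chosen only for illustration.

```python
from typing import List


def subtree_positions(heads: List[int]) -> List[List[int]]:
    """For each node, return the sorted word positions covered by its subtree.

    heads[i] is the index of the head of word i, or -1 for the root
    (an encoding assumed here for illustration).
    """
    n = len(heads)
    cover = [[i] for i in range(n)]
    for i in range(n):
        h = heads[i]
        steps = 0
        # Walk up to the root, adding word i to every ancestor's coverage.
        while h != -1:
            cover[h].append(i)
            h = heads[h]
            steps += 1
            if steps > n:
                raise ValueError("head list contains a cycle")
    return [sorted(c) for c in cover]


def count_gaps(positions: List[int]) -> int:
    """Count maximal runs of missing positions inside a sorted position list."""
    return sum(1 for a, b in zip(positions, positions[1:]) if b - a > 1)


def node_gap_complexity(heads: List[int]) -> int:
    """Largest number of gaps in the coverage of any node (assumed definition)."""
    return max((count_gaps(c) for c in subtree_positions(heads)), default=0)


# A projective tree: every subtree covers a contiguous span.
projective = [1, -1, 1]
# A nonprojective tree: word 0 depends on word 2, skipping over the root (word 1).
nonprojective = [2, -1, 1, 1]

print(node_gap_complexity(projective))
print(node_gap_complexity(nonprojective))
```

Under these assumptions a projective tree scores zero, and each discontinuity in a subtree raises the score of the node that heads it, so a global constraint could be expressed as an upper bound on this value.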
© 2001 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kuboň, V., Plátek, M. (2001). A Method of Accurate Robust Parsing of Czech. In: Matoušek, V., Mautner, P., Mouček, R., Taušer, K. (eds) Text, Speech and Dialogue. TSD 2001. Lecture Notes in Computer Science, vol 2166. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-44805-5_12
DOI: https://doi.org/10.1007/3-540-44805-5_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-42557-1
Online ISBN: 978-3-540-44805-1
eBook Packages: Springer Book Archive