Abstract
To obtain a high-precision, high-coverage grammar, we proposed a model to measure grammar coverage and designed a PCFG parser to measure the efficiency of the grammar. To generalize grammars, we proposed a grammar binarization method that increases the coverage of a probabilistic context-free grammar. At the same time, linguistically motivated feature constraints were added to the grammar rules to maintain the precision of the grammar. The generalized grammar increases grammar coverage from 93% to 99% and the bracketing F-score from 87% to 91% in parsing Chinese sentences. To cope with error propagation caused by word segmentation and part-of-speech tagging errors, we also proposed a grammar blending method that adapts to such errors. The blended grammar reduces by about 20% to 30% the parsing errors caused by erroneous part-of-speech assignments made by a word segmentation system.
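The abstract does not spell out the binarization scheme or the coverage model, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a plain right-branching binarization in which the intermediate symbol (written here as the parent label plus `*`) remembers nothing but its parent, and it assumes coverage is measured as the fraction of held-out rules that the grammar can derive. The rule sets `TRAIN` and `TEST` and the function names are hypothetical.

```python
def binarize(lhs, rhs):
    """Right-binarize one rule. The intermediate symbol keeps only the
    parent label, so pieces from different rules can recombine."""
    rhs = tuple(rhs)
    if len(rhs) <= 2:
        return [(lhs, rhs)]
    mid = lhs + "*"  # hypothetical naming for the intermediate symbol
    pieces = [(lhs, (rhs[0], mid))]
    for sym in rhs[1:-2]:
        pieces.append((mid, (sym, mid)))
    pieces.append((mid, rhs[-2:]))
    return pieces


def coverage(train_rules, test_rules, generalize):
    """Fraction of test rules derivable from the training grammar,
    either as flat rules or after binarization."""
    if generalize:
        grammar = {p for lhs, rhs in train_rules for p in binarize(lhs, rhs)}
        covered = [all(p in grammar for p in binarize(lhs, rhs))
                   for lhs, rhs in test_rules]
    else:
        grammar = {(lhs, tuple(rhs)) for lhs, rhs in train_rules}
        covered = [(lhs, tuple(rhs)) in grammar for lhs, rhs in test_rules]
    return sum(covered) / len(covered)


# Toy rule sets, invented for illustration only.
TRAIN = [
    ("NP", ("Det", "Adj", "N")),
    ("NP", ("Det", "Adj", "Adj", "N")),
    ("VP", ("V", "NP")),
]
TEST = [
    ("NP", ("Det", "Adj", "Adj", "Adj", "N")),  # unseen as a flat rule
    ("VP", ("V", "NP")),
    ("NP", ("N", "Det")),                       # not derivable either way
]

flat = coverage(TRAIN, TEST, generalize=False)
generalized = coverage(TRAIN, TEST, generalize=True)
print(f"flat: {flat:.2f}, binarized: {generalized:.2f}")
```

In this toy setting the unseen rule with three adjectives becomes derivable because its binary pieces were all observed in shorter rules. The same recombination also licenses sequences that may be ungrammatical, which is consistent with the abstract's point that feature constraints are needed to keep precision from dropping.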
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hsieh, YM., Yang, DC., Chen, KJ. (2005). Linguistically-Motivated Grammar Extraction, Generalization and Adaptation. In: Dale, R., Wong, KF., Su, J., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2005. IJCNLP 2005. Lecture Notes in Computer Science, vol 3651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562214_16
DOI: https://doi.org/10.1007/11562214_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29172-5
Online ISBN: 978-3-540-31724-1
eBook Packages: Computer Science, Computer Science (R0)