Abstract
This paper focuses on automatic methods for extracting predicate-argument structure in Polish. Two approaches to extracting selected aspects of predicate-argument structure are evaluated. In the first experiment, a multi-output version of the Random Forest classifier is used to extract a valency frame for each predicate in a sentence. In the second experiment, a Conditional Random Fields classifier is used to find the syntactic heads of all arguments realised in a sentence. In addition, the importance of various feature sources is assessed, including shallow syntactic parsing, dependency parsing and verb valency information. Owing to the lack of a high-quality syntactic parser for Polish, the presented approach does not rely on deep syntactic information.
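To make the two task formulations concrete, the following minimal sketch shows one way their targets could be encoded: a valency frame as one binary output per argument slot (the shape a multi-output classifier expects), and argument heads as per-token labels (the shape a sequence labeller such as a CRF expects). The slot inventory, the label names, the function names and the example sentence are all illustrative assumptions; the abstract does not specify the representation actually used.

```python
# Hypothetical slot inventory; the paper's actual frame representation
# is not given in the abstract.
SLOTS = ["subj", "obj", "np(dat)", "np(inst)", "prepnp"]


def frame_to_targets(frame, slots=SLOTS):
    """Encode a valency frame (a set of slot names) as one 0/1 output per slot."""
    unknown = set(frame) - set(slots)
    if unknown:
        raise ValueError(f"unknown slots: {sorted(unknown)}")
    return [1 if slot in frame else 0 for slot in slots]


def targets_to_frame(targets, slots=SLOTS):
    """Decode a vector of 0/1 outputs back into a set of slot names."""
    if len(targets) != len(slots):
        raise ValueError("one target per slot is required")
    return {slot for slot, value in zip(slots, targets) if value}


def head_labels(tokens, head_indices):
    """Label each token 'HEAD' if it is the head of an argument, else 'O'."""
    heads = set(head_indices)
    if any(i < 0 or i >= len(tokens) for i in heads):
        raise ValueError("head index out of range")
    return ["HEAD" if i in heads else "O" for i in range(len(tokens))]


# Assumed example: "Jan dał Marii książkę" ("Jan gave Maria a book"),
# with the predicate "dał" and argument heads at positions 0, 2 and 3.
tokens = ["Jan", "dał", "Marii", "książkę"]
frame_vector = frame_to_targets({"subj", "obj", "np(dat)"})
token_labels = head_labels(tokens, [0, 2, 3])
```

Under this encoding, each predicate contributes one row of slot outputs to the first task, while each sentence contributes one label sequence to the second.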
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Gołuchowski, K. (2014). Experiments on the Identification of Predicate-Argument Structure in Polish. In: Przepiórkowski, A., Ogrodniczuk, M. (eds) Advances in Natural Language Processing. NLP 2014. Lecture Notes in Computer Science, vol 8686. Springer, Cham. https://doi.org/10.1007/978-3-319-10888-9_19
DOI: https://doi.org/10.1007/978-3-319-10888-9_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10887-2
Online ISBN: 978-3-319-10888-9
eBook Packages: Computer Science (R0)