Abstract
Recently, an algorithm - DEES - was proposed for learning rational stochastic tree languages. Given a sample of trees independently and identically drawn according to a distribution defined by a rational stochastic language, DEES outputs a linear representation of a rational series which converges to the target. DEES can then be used to identify in the limit with probability one rational stochastic tree languages. However, when DEES deals with finite samples, it often outputs a rational tree series which does not define a stochastic language. Moreover, the linear representation can not be directly used as a generative model. In this paper, we show that any representation of a rational stochastic tree language can be transformed in a reduced normalised representation that can be used to generate trees from the underlying distribution. We also study some properties of consistency for rational stochastic tree languages and discuss their implication for the inference. We finally consider the applicability of DEES to trees built over an unranked alphabet.
This work was partially supported by the Atash project ANR-05-RNTL00102 and the Marmota project ANR-05-MMSA-0016.
Access provided by Autonomous University of Puebla. Download to read the full chapter text
Chapter PDF
Similar content being viewed by others
References
Denis, F., Habrard, A.: Learning rational stochastic tree languages. In: Hutter, M., Servedio, R.A., Takimoto, E. (eds.) ALT 2007. LNCS (LNAI), vol. 4754, pp. 242–256. Springer, Heidelberg (2007)
Booth, T., Thompson, R.: Applying probabilistic measures to abstract languages. IEEE Transactions on Computers 22(5), 442–450 (1973)
Wetherell, C.S.: Probabilistic languages: A review and some open questions. ACM Comput. Surv. 12(4), 361–379 (1980)
Comon, H., Dauchet, M., Gilleron, R., Jacquemard, F., Lugiez, D., Löding, C., Tison, S., Tommasi, M.: Tree automata techniques and applications (2007) (release October 12, 2007), http://tata.gforge.inria.fr/
Berstel, J., Reutenauer, C.: Recognizable formal power series on trees. Theoretical Computer Science 18, 115–148 (1982)
Ésik, Z., Kuich, W.: Formal tree series. Journal of Automata, Languages and Combinatorics 8(2), 219–285 (2003)
Denis, F., Esposito, Y.: Rational stochastic languages. Technical report, LIF - Université de Provence (2006), http://hal.ccsd.cnrs.fr/ccsd-00019728
Denis, F., Gilbert, E., Habrard, A., Ouardi, F., Tommasi, M.: Relevant representations for the inference of rational stochastic tree languages. Technical report, LIF, LIFL, and INRIA (2008), http://hal.archives-ouvertes.fr/hal-00293511/en/
Denis, F., Esposito, Y., Habrard, A.: Learning rational stochastic languages. In: Lugosi, G., Simon, H.U. (eds.) Learning theory. LNCS, pp. 274–288. Springer, Heidelberg (2006)
Stolcke, A.: An efficient probabilistic context-free parsing algorithm that computes prefix probabilities. Computional Linguistics 21(2), 165–201 (1995)
Brüggemann-Klein, A., Murata, M., Wood, D.: Regular tree and regular hedge languages over unranked alphabets. Technical report, Hong Kong University Theoretical Computer Science Center, Version 1 (2001)
Carme, J., Niehren, J., Tommasi, M.: Querying unranked trees with stepwise tree automata. In: van Oostrom, V. (ed.) RTA 2004. LNCS, vol. 3091, pp. 105–118. Springer, Heidelberg (2004)
Droste, M., Vogler, H.: Weighted logics for XML (manuscript, 2007), http://www.orchid.inf.tu-dresden.de/gdp/monographs/r20.ps
Author information
Authors and Affiliations
Editor information
Rights and permissions
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Denis, F., Gilbert, É., Habrard, A., Ouardi, F., Tommasi, M. (2008). Relevant Representations for the Inference of Rational Stochastic Tree Languages. In: Clark, A., Coste, F., Miclet, L. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2008. Lecture Notes in Computer Science(), vol 5278. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88009-7_5
Download citation
DOI: https://doi.org/10.1007/978-3-540-88009-7_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88008-0
Online ISBN: 978-3-540-88009-7
eBook Packages: Computer ScienceComputer Science (R0)