Abstract
Comparing the standard language learning paradigms (identification in the limit, query learning, PAC learning) has always been a complex question. Moreover, when computational constraints are added to the question of converging to a target, the picture becomes even less clear: how much do queries or negative examples help? Can we find good algorithms that change their minds very little or that make very few errors? To approach these problems, we concentrate here on two classes of languages, the topological balls of strings (for the edit distance) and the deterministic finite automata (DFA), and (re-)visit the different learning paradigms to sustain our claims.
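To make the first class of languages concrete, the following is a minimal sketch, not taken from the paper, of membership in a ball of strings. It assumes the usual definition: the ball with centre o and radius r is the set of strings whose edit (Levenshtein) distance to o is at most r. The function names `edit_distance` and `in_ball` are illustrative choices, not the authors' notation.

```python
def edit_distance(u: str, v: str) -> int:
    """Levenshtein distance: the minimum number of single-symbol
    insertions, deletions, and substitutions that turn u into v."""
    # prev[j] holds the distance between the processed prefix of u and v[:j].
    prev = list(range(len(v) + 1))
    for i, a in enumerate(u, start=1):
        curr = [i]  # distance from u[:i] to the empty prefix of v
        for j, b in enumerate(v, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a from u
                curr[j - 1] + 1,         # insert b into u
                prev[j - 1] + (a != b),  # substitute a by b, or match
            ))
        prev = curr
    return prev[-1]


def in_ball(w: str, centre: str, radius: int) -> bool:
    """Assumed definition: w belongs to the ball iff d(centre, w) <= radius."""
    return edit_distance(centre, w) <= radius
```

Under this definition, `in_ball("ab", "abb", 1)` holds because a single deletion turns "abb" into "ab", whereas the empty string lies outside that ball because three deletions are needed.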
This work was supported in part by the IST Programme of the European Community, under the Pascal Network of Excellence, IST-2006-216886. This publication only reflects the authors’ views.
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de la Higuera, C., Janodet, JC., Tantini, F. (2008). Learning Languages from Bounded Resources: The Case of the DFA and the Balls of Strings. In: Clark, A., Coste, F., Miclet, L. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2008. Lecture Notes in Computer Science, vol 5278. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88009-7_4
DOI: https://doi.org/10.1007/978-3-540-88009-7_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88008-0
Online ISBN: 978-3-540-88009-7
eBook Packages: Computer Science, Computer Science (R0)