Learning Languages from Bounded Resources: The Case of the DFA and the Balls of Strings

de la Higuera, Colin; Janodet, Jean-Christophe; Tantini, Frédéric

doi:10.1007/978-3-540-88009-7_4

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5278)

Included in the conference series: International Colloquium on Grammatical Inference

Abstract

Comparing the standard language learning paradigms (identification in the limit, query learning, PAC learning) has always been a complex question. Moreover, when computational constraints are added to the question of converging to a target, the picture becomes even less clear: how much do queries or negative examples help? Can we find good algorithms that change their minds very little or that make very few errors? To approach these problems we concentrate here on two classes of languages, the topological balls of strings (for the edit distance) and the deterministic finite automata (DFA), and (re-)visit the different learning paradigms to sustain our claims.
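As an illustration of the first class of languages named in the abstract, the sketch below tests membership in a ball of strings. It assumes the usual reading of a ball as the set of all strings within a given edit distance of a centre string, with unit-cost insertions, deletions and substitutions. The function names `edit_distance` and `in_ball` are hypothetical and are not taken from the paper.

```python
def edit_distance(u: str, v: str) -> int:
    """Levenshtein distance with unit costs, via the standard dynamic program."""
    # prev[j] holds the distance between the current prefix of u and v[:j].
    prev = list(range(len(v) + 1))
    for i, a in enumerate(u, start=1):
        curr = [i]  # distance from u[:i] to the empty string
        for j, b in enumerate(v, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a from u
                curr[j - 1] + 1,         # insert b into u
                prev[j - 1] + (a != b),  # substitute a by b, or match
            ))
        prev = curr
    return prev[-1]


def in_ball(word: str, centre: str, radius: int) -> bool:
    """True if word lies in the ball of the given centre and radius."""
    return edit_distance(centre, word) <= radius


if __name__ == "__main__":
    # Strings at distance at most 1 from "ab" belong to the ball of radius 1.
    for w in ["ab", "abb", "b", "bbb"]:
        print(w, in_ball(w, "ab", 1))
```

A ball is a finite language described only by its centre and radius, which is why its learnability can be compared with that of DFA under the same paradigms.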

This work was supported in part by the IST Programme of the European Community, under the Pascal Network of Excellence, IST-2006-216886. This publication only reflects the authors’ views.

Author information

Authors and Affiliations

Universities of Lyon, 18 r. Pr. Lauras, F-42000, St-Etienne,
Colin de la Higuera, Jean-Christophe Janodet & Frédéric Tantini

Editor information

Alexander Clark, François Coste, Laurent Miclet

Cite this paper

de la Higuera, C., Janodet, JC., Tantini, F. (2008). Learning Languages from Bounded Resources: The Case of the DFA and the Balls of Strings. In: Clark, A., Coste, F., Miclet, L. (eds) Grammatical Inference: Algorithms and Applications. ICGI 2008. Lecture Notes in Computer Science, vol 5278. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88009-7_4

DOI: https://doi.org/10.1007/978-3-540-88009-7_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88008-0
Online ISBN: 978-3-540-88009-7
eBook Packages: Computer Science, Computer Science (R0)
