Abstract
In the last few decades a computational approach to machine learning has emerged, based on paradigms from recursion theory and the theory of computation. These include learning in the limit, learning by enumeration, and probably approximately correct (pac) learning. Such models are usually not suitable in practical situations. In contrast, statistics-based inference methods have enjoyed a long and distinguished career. Currently, Bayesian reasoning in various forms, minimum message length (MML), and minimum description length (MDL) are widely applied approaches. They are the tools of choice in machine learning praxis such as simulated annealing, genetic algorithms, genetic programming, artificial neural networks, and the like. These statistical inference methods select the hypothesis that minimizes the sum of the length of the description of the hypothesis (also called the 'model') and the length of the description of the data relative to the hypothesis. It appears to us that the future of computational machine learning will combine the approaches above with guarantees on the time and memory resources used. Computational learning theory will move closer to practice, and the application of principles such as MDL requires further justification. Here we survey some of the actors in this dichotomy between theory and praxis, justify MDL via the Bayesian approach, and compare pac learning with MDL learning of decision trees.
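The selection rule stated above, minimize L(H) + L(D|H), can be made concrete on a toy problem. The Python sketch below is not from the chapter: it assumes a hypothetical hypothesis class of just two constant-bit predictors and a simple (slightly redundant) exception code, purely to instantiate the two-part principle.

```python
import math

def l_exceptions(n, m):
    """Bits to describe which m of the n predicted bits are exceptions:
    log2(n + 1) bits for the count m, then log2(n) bits per position.
    (A simple, slightly redundant prefix-free scheme, chosen for clarity.)"""
    return math.log2(n + 1) + m * math.log2(n)

def two_part_length(data, b):
    """Two-part code length L(H) + L(D|H) for the hypothesis
    'every bit equals b': 1 bit names the hypothesis (L(H)), and the
    data is then coded as its list of exceptions (L(D|H))."""
    n = len(data)
    m = sum(1 for bit in data if bit != b)
    return 1 + l_exceptions(n, m)

def mdl_select(data):
    """Pick the constant-bit hypothesis with minimum total code length."""
    return min((0, 1), key=lambda b: two_part_length(data, b))
```

On data that is mostly zeros, `mdl_select` prefers the all-zeros hypothesis because the few exceptions are cheap to list; a richer model class would trade a longer L(H) against a shorter L(D|H) in exactly the same way.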
The first author was supported in part by NSERC operating grant OGP-046506, ITRC, and a CGAT grant. The second author was supported by NSERC through International Scientific Exchange Award ISE0125663, and by the European Union through NeuroCOLT ESPRIT Working Group Nr. 8556, and by NWO through NFI Project ALADDIN under Contract number NF 62-376.
Copyright information
© 1995 Springer-Verlag Berlin Heidelberg
Cite this chapter
Li, M., Vitányi, P. (1995). Computational machine learning in theory and praxis. In: van Leeuwen, J. (eds) Computer Science Today. Lecture Notes in Computer Science, vol 1000. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0015264
Print ISBN: 978-3-540-60105-0
Online ISBN: 978-3-540-49435-5