Abstract
In this paper we present a probabilistic formalization of the instance-based learning approach. In our Bayesian framework, moving from the construction of an explicit hypothesis to a data-driven instance-based learning approach is equivalent to averaging over all the (possibly infinitely many) individual models. The general Bayesian instance-based learning framework described in this paper can be applied with any set of assumptions defining a parametric model family, and to any discrete prediction task where the number of simultaneously predicted attributes is small, which includes, for example, all the classification tasks prevalent in the machine learning literature. To illustrate the use of the suggested general framework in practice, we show how the approach can be implemented in the special case with the strong independence assumptions underlying the so-called Naive Bayes classifier. The resulting Bayesian instance-based classifier is validated empirically with public domain data sets, and the results are compared to the performance of the traditional Naive Bayes classifier. The results suggest that the Bayesian instance-based learning approach yields better results than the traditional Naive Bayes classifier, especially in cases where the amount of training data is small.
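The averaging idea sketched in the abstract can be illustrated on the Naive Bayes special case: under a uniform Dirichlet prior, integrating out the model parameters (rather than plugging in their maximum-likelihood estimates) reduces the predictive distribution to "counts plus one" smoothing. The sketch below is an illustration of that general principle under these assumptions, not the paper's actual implementation; all function names are hypothetical.

```python
from collections import Counter, defaultdict

def fit_counts(X, y):
    """Collect class counts and per-attribute value counts from discrete data."""
    class_counts = Counter(y)
    attr_counts = defaultdict(Counter)  # (class, attribute index) -> value counts
    for xs, c in zip(X, y):
        for j, v in enumerate(xs):
            attr_counts[(c, j)][v] += 1
    return class_counts, attr_counts

def predict(x, class_counts, attr_counts, classes, values_per_attr):
    """Posterior predictive class probabilities under uniform Dirichlet priors.

    The +1 terms come from integrating the multinomial parameters out
    against a uniform prior, instead of using maximum-likelihood estimates;
    values_per_attr[j] is the number of possible values of attribute j.
    """
    n = sum(class_counts.values())
    scores = {}
    for c in classes:
        p = (class_counts[c] + 1) / (n + len(classes))
        for j, v in enumerate(x):
            p *= (attr_counts[(c, j)][v] + 1) / (class_counts[c] + values_per_attr[j])
        scores[c] = p
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}
```

With only a handful of training vectors, the smoothed predictive probabilities stay away from the extreme 0/1 values a maximum-likelihood Naive Bayes would produce, which is consistent with the abstract's observation that the Bayesian treatment helps most when training data is scarce.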
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
Kontkanen, P., Myllymäki, P., Silander, T., Tirri, H. (1998). Bayes optimal instance-based learning. In: Nédellec, C., Rouveirol, C. (eds) Machine Learning: ECML-98. ECML 1998. Lecture Notes in Computer Science, vol 1398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0026675
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64417-0
Online ISBN: 978-3-540-69781-7
eBook Packages: Springer Book Archive