Abstract
In this paper we present a probabilistic formalization of the instance-based learning approach. In our Bayesian framework, moving from the construction of an explicit hypothesis to a data-driven instance-based learning approach is equivalent to averaging over all the (possibly infinitely many) individual models. The general Bayesian instance-based learning framework described in this paper can be applied with any set of assumptions defining a parametric model family, and to any discrete prediction task where the number of simultaneously predicted attributes is small, which includes, for example, all the classification tasks prevalent in the machine learning literature. To illustrate the use of the suggested general framework in practice, we show how the approach can be implemented in the special case with the strong independence assumptions underlying the so-called Naive Bayes classifier. The resulting Bayesian instance-based classifier is validated empirically with public domain data sets, and the results are compared to the performance of the traditional Naive Bayes classifier. The results suggest that the Bayesian instance-based learning approach yields better results than the traditional Naive Bayes classifier, especially in cases where the amount of training data is small.
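The averaging idea sketched in the abstract can be illustrated on the Naive Bayes special case: under a uniform Dirichlet prior, integrating out the model parameters (rather than plugging in their maximum-likelihood estimates) reduces the predictive distribution to "counts plus one" smoothing. The sketch below is an illustration of that general principle under these assumptions, not the paper's actual implementation; all function names are hypothetical.

```python
from collections import Counter, defaultdict

def fit_counts(X, y):
    """Collect class counts and per-attribute value counts from discrete data."""
    class_counts = Counter(y)
    attr_counts = defaultdict(Counter)  # (class, attribute index) -> value counts
    for xs, c in zip(X, y):
        for j, v in enumerate(xs):
            attr_counts[(c, j)][v] += 1
    return class_counts, attr_counts

def predict(x, class_counts, attr_counts, classes, values_per_attr):
    """Posterior predictive class probabilities under uniform Dirichlet priors.

    The +1 terms come from integrating the multinomial parameters out
    against a uniform prior, instead of using maximum-likelihood estimates;
    values_per_attr[j] is the number of possible values of attribute j.
    """
    n = sum(class_counts.values())
    scores = {}
    for c in classes:
        p = (class_counts[c] + 1) / (n + len(classes))
        for j, v in enumerate(x):
            p *= (attr_counts[(c, j)][v] + 1) / (class_counts[c] + values_per_attr[j])
        scores[c] = p
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}
```

With only a handful of training vectors, the smoothed predictive probabilities stay away from the extreme 0/1 values a maximum-likelihood Naive Bayes would produce, which is consistent with the abstract's observation that the Bayesian treatment helps most when training data is scarce.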
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
Kontkanen, P., Myllymäki, P., Silander, T., Tirri, H. (1998). Bayes optimal instance-based learning. In: Nédellec, C., Rouveirol, C. (eds) Machine Learning: ECML-98. ECML 1998. Lecture Notes in Computer Science, vol 1398. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0026675
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64417-0
Online ISBN: 978-3-540-69781-7
eBook Packages: Springer Book Archive