Abstract
The Bayesian approach to machine learning amounts to inferring posterior distributions of random variables from a probabilistic model of how the variables are related (that is, a prior distribution) and a set of observations of variables. There is a trend in machine learning towards expressing Bayesian models as probabilistic programs. As a foundation for this kind of programming, we propose a core functional calculus with primitives for sampling prior distributions and observing variables. We define combinators for measure transformers, based on theorems in measure theory, and use these to give a rigorous semantics to our core calculus. The original features of our semantics include its support for discrete, continuous, and hybrid measures, and, in particular, for observations of zero-probability events. We compile our core language to a small imperative language that has a straightforward semantics via factor graphs, data structures that enable many efficient inference algorithms. We use an existing inference engine for efficient approximate inference of posterior marginal distributions, processing thousands of observations per second for large instances of realistic models.
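To make the sample/observe idea concrete, here is a minimal sketch (not the paper's calculus, and only for discrete measures, so it cannot condition on zero-probability events): a measure is a list of weighted outcomes, and a hypothetical `observe` combinator conditions it by discarding outcomes that violate the observation and renormalizing.

```python
# A toy discrete-measure model: all names here (observe, prior, joint)
# are illustrative, not from the paper's core calculus.
from fractions import Fraction

def observe(dist, pred):
    """Condition a discrete measure (list of (value, weight) pairs)
    on a predicate by filtering and renormalizing the weights."""
    kept = [(v, w) for v, w in dist if pred(v)]
    total = sum(w for _, w in kept)
    return [(v, w / total) for v, w in kept]

# Uniform prior over a coin's bias.
prior = [(Fraction(1, 10), Fraction(1, 3)),
         (Fraction(1, 2),  Fraction(1, 3)),
         (Fraction(9, 10), Fraction(1, 3))]

# Joint measure over (bias, flip): the flip is heads with probability = bias.
joint = [((b, h), w * (b if h else 1 - b))
         for b, w in prior for h in (True, False)]

# Observing one heads flip shifts posterior mass toward the larger biases.
posterior = observe(joint, lambda bh: bh[1])
marginal = {b: w for (b, _), w in posterior}   # posterior over bias
# marginal[Fraction(9, 10)] == Fraction(3, 5)
```

Exact enumeration like this only scales to tiny discrete models; the paper's compilation to factor graphs is what makes inference efficient, and its measure-transformer semantics is what extends conditioning to continuous and hybrid measures.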
© 2011 Springer-Verlag Berlin Heidelberg
Borgström, J., Gordon, A.D., Greenberg, M., Margetson, J., Van Gael, J. (2011). Measure Transformer Semantics for Bayesian Machine Learning. In: Barthe, G. (eds) Programming Languages and Systems. ESOP 2011. Lecture Notes in Computer Science, vol 6602. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19718-5_5
Print ISBN: 978-3-642-19717-8
Online ISBN: 978-3-642-19718-5