Probabilistic Programming

2019
Tetsuya Sato, Alejandro Aguirre, Gilles Barthe, Marco Gaboardi, Deepak Garg, and Justin Hsu. 2019. “Formal Verification of Higher-Order Probabilistic Programs.” Principles of Programming Languages (POPL).
Probabilistic programming provides a convenient lingua franca for writing succinct and rigorous descriptions of probabilistic models and inference tasks. Several probabilistic programming languages, including Anglican, Church or Hakaru, derive their expressiveness from a powerful combination of continuous distributions, conditioning, and higher-order functions. Although very important for practical applications, these features raise fundamental challenges for program semantics and verification. Several recent works offer promising answers to these challenges, but their primary focus is on foundational semantics issues. In this paper, we take a step further by developing a suite of logics, collectively named PPV, for proving properties of programs written in an expressive probabilistic higher-order language with continuous sampling operations and primitives for conditioning distributions. Our logics mimic the comfortable reasoning style of informal proofs using carefully selected axiomatizations of key results from probability theory. The versatility of our logics is illustrated through the formal verification of several intricate examples from statistics, probabilistic inference, and machine learning. We further show expressiveness by giving sound embeddings of existing logics. In particular, we do this in a parametric way by showing how the semantics idea of (unary and relational) ⊤⊤-lifting can be internalized in our logics. The soundness of PPV follows by interpreting programs and assertions in quasi-Borel spaces (QBS), a recently proposed variant of Borel spaces with a good structure for interpreting higher order probabilistic programs.
Luc De Raedt, Robin Manhaeve, Sebastijan Dumancic, Thomas Demeester, and Angelika Kimmig. 2019. “Neuro-Symbolic = Neural + Logical + Probabilistic.” In NeSy @ IJCAI.
The overall goal of neuro-symbolic computation is to integrate high-level reasoning with low-level perception. We argue 1) that neuro-symbolic computation should integrate neural networks with the two most prominent methods for reasoning, that is, logic and probability, and 2) that neuro-symbolic integrated methods should have the pure neural, logical and probabilistic methods as special cases. We examine the state-of-the-art with regard to these claims and briefly position our own contribution DeepProbLog in this perspective.
Zenna Tavares, Javier Burroni, Edgar Minasyan, Armando Solar Lezama, and Rajesh Ranganath. 2019. “Predicate Exchange: Inference with Declarative Knowledge.” International Conference on Machine Learning (ICML).
Programming languages allow us to express complex predicates, but existing inference methods are unable to condition probabilistic models on most of them. To support a broader class of predicates, we develop an inference procedure called predicate exchange, which softens predicates. A soft predicate quantifies the extent to which values of model variables are consistent with its hard counterpart. We substitute the likelihood term in the Bayesian posterior with a soft predicate, and develop a variant of replica exchange MCMC to draw posterior samples. We implement predicate exchange as a language agnostic tool which performs a nonstandard execution of a probabilistic program. We demonstrate the approach on sequence models of health and inverse rendering.
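As an illustration of the idea (a minimal sketch, not the paper's implementation): a hard predicate such as `x * x == 4` can be replaced by a soft consistency score with a temperature parameter, and that score then stands in for the likelihood in a Metropolis-Hastings sampler. The kernel and toy model below are assumptions made for this example; the paper additionally runs chains at several temperatures and exchanges their states (replica exchange).

```python
import math
import random

def soft_eq(a, b, temperature):
    # Soft counterpart of the hard predicate `a == b`: equals 1.0 when the
    # predicate holds exactly and decays smoothly as the values move apart.
    # (Hypothetical kernel chosen for illustration.)
    return math.exp(-((a - b) ** 2) / temperature)

def log_target(x, observed, temperature):
    # Standard-normal prior on x; the likelihood term of the posterior is
    # replaced by the soft version of the predicate "x * x == observed".
    log_prior = -0.5 * x * x
    return log_prior + math.log(soft_eq(x * x, observed, temperature) + 1e-300)

def tempered_mh(observed, temperature, steps=5000):
    # A single tempered chain; predicate exchange runs several of these at
    # different temperatures and periodically swaps their states.
    x = random.gauss(0.0, 1.0)
    samples = []
    for _ in range(steps):
        proposal = x + random.gauss(0.0, 0.5)
        if math.log(random.random()) < log_target(proposal, observed, temperature) - log_target(x, observed, temperature):
            x = proposal
        samples.append(x)
    return samples

samples = tempered_mh(observed=4.0, temperature=0.1)
print(sum(abs(s) for s in samples[-1000:]) / 1000.0)   # concentrates near |x| = 2
```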
2018
Robin Manhaeve, Sebastijan Dumancic, Angelika Kimmig, Thomas Demeester, and Luc De Raedt. 12/2018. “DeepProbLog: Neural Probabilistic Logic Programming.” NeurIPS.
We introduce DeepProbLog, a probabilistic logic programming language that incorporates deep learning by means of neural predicates. We show how existing inference and learning techniques can be adapted for the new language. Our experiments demonstrate that DeepProbLog supports both symbolic and subsymbolic representations and inference, 1) program induction, 2) probabilistic (logic) programming, and 3) (deep) learning from examples. To the best of our knowledge, this work is the first to propose a framework where general-purpose neural networks and expressive probabilistic-logical modeling and reasoning are integrated in a way that exploits the full expressiveness and strengths of both worlds and can be trained end-to-end based on examples.      
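As a rough illustration of what a neural predicate buys you (this is not DeepProbLog syntax or its API, and the classifier below is a hand-coded stand-in): a network's output distribution over classes is read as a set of mutually exclusive probabilistic facts, and an ordinary probabilistic-logic computation marginalises over them.

```python
def classify(image):
    # Hypothetical stand-in for a neural network returning P(digit = d | image);
    # DeepProbLog attaches real networks here and trains them end-to-end.
    return {d: (0.82 if d == image["true_digit"] else 0.02) for d in range(10)}

def prob_sum_equals(img1, img2, target):
    # Probability of the query digit(img1) + digit(img2) == target,
    # marginalising over both neural predicates.
    p1, p2 = classify(img1), classify(img2)
    return sum(p1[a] * p2[b] for a in range(10) for b in range(10) if a + b == target)

print(prob_sum_equals({"true_digit": 3}, {"true_digit": 4}, 7))   # about 0.68
```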
Noah D. Goodman and Joshua B. Tenenbaum. 11/13/2018. Probabilistic Models of Cognition. 2nd ed.
This book explores the probabilistic approach to cognitive science, which models learning and reasoning as inference in complex probabilistic models. We examine how a broad range of empirical phenomena, including intuitive physics, concept learning, causal reasoning, social cognition, and language understanding, can be modeled using probabilistic programs (using the WebPPL language).
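The book's examples are written in WebPPL (a JavaScript-based language); as a rough Python analogue of the pattern it teaches, here is a generative model conditioned by rejection sampling. The coin-weight model is an assumption of this sketch, not an example taken from the book.

```python
import random

def model():
    # Generative model: an unknown coin weight and three flips of that coin.
    weight = random.random()                              # uniform prior on the bias
    flips = [random.random() < weight for _ in range(3)]
    return weight, flips

def rejection_query(num_samples=100_000):
    # Condition on observing three heads by keeping only consistent runs.
    return [w for w, flips in (model() for _ in range(num_samples)) if all(flips)]

posterior = rejection_query()
print(sum(posterior) / len(posterior))   # about 0.8, the mean of Beta(4, 1)
```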
Jan-Willem van de Meent, Brooks Paige, Hongseok Yang, and Frank Wood. 9/2018. “An Introduction to Probabilistic Programming.”
This document is designed to be a first-year graduate-level introduction to probabilistic programming. It not only provides a thorough background for anyone wishing to use a probabilistic programming system, but also introduces the techniques needed to design and build these systems. It is aimed at people who have an undergraduate-level understanding of either or, ideally, both probabilistic machine learning and programming languages.
We start with a discussion of model-based reasoning and explain why conditioning as a foundational computation is central to the fields of probabilistic machine learning and artificial intelligence. We then introduce a simple first-order probabilistic programming language (PPL) whose programs define static-computation-graph, finite-variable-cardinality models. In the context of this restricted PPL we introduce fundamental inference algorithms and describe how they can be implemented in the context of models denoted by probabilistic programs.
In the second part of this document, we introduce a higher-order probabilistic programming language, with a functionality analogous to that of established programming languages. This affords the opportunity to define models with dynamic computation graphs, at the cost of requiring inference methods that generate samples by repeatedly executing the program. Foundational inference algorithms for this kind of probabilistic programming language are explained in the context of an interface between program executions and an inference controller.
This document closes with a chapter on advanced topics which we believe to be, at the time of writing, interesting directions for probabilistic programming research; directions that point towards a tight integration with deep neural network research and the development of systems for next-generation artificial intelligence applications.
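As a concrete companion to the first-order part of the tutorial (a minimal sketch of my own, not code from the document): a model with a fixed, finite set of random variables, conditioned by likelihood weighting, one of the foundational inference algorithms it covers.

```python
import math
import random

def normal_logpdf(x, mean, std):
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))

def likelihood_weighting(observed_y, num_samples=50_000):
    # Model: x ~ Normal(0, 1); y ~ Normal(x, 1); condition on y == observed_y.
    # The computation graph is static and the number of variables is fixed.
    samples, weights = [], []
    for _ in range(num_samples):
        x = random.gauss(0.0, 1.0)                        # sample the latent from its prior
        w = math.exp(normal_logpdf(observed_y, x, 1.0))   # weight by the likelihood of the data
        samples.append(x)
        weights.append(w)
    return sum(x * w for x, w in zip(samples, weights)) / sum(weights)

print(likelihood_weighting(2.0))   # self-normalised posterior-mean estimate; analytically E[x | y = 2] = 1
```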
2017
Chung-chieh Shan and Norman Ramsey. 1/2017. “Exact Bayesian Inference by Symbolic Disintegration.” Principles of Programming Languages (POPL).
Bayesian inference, of posterior knowledge from prior knowledge and observed evidence, is typically defined by Bayes’s rule, which says the posterior multiplied by the probability of an observation equals a joint probability. But the observation of a continuous quantity usually has probability zero, in which case Bayes’s rule says only that the unknown times zero is zero. To infer a posterior distribution from a zero-probability observation, the statistical notion of disintegration tells us to specify the observation as an expression rather than a predicate, but does not tell us how to compute the posterior. We present the first method of computing a disintegration from a probabilistic program and an expression of a quantity to be observed, even when the observation has probability zero. Because the method produces an exact posterior term and preserves a semantics in which monadic terms denote measures, it composes with other inference methods in a modular way—without sacrificing accuracy or performance.
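To make the measure-zero issue concrete (a textbook restatement, not the paper's symbolic method): when the observed quantity has a density, the disintegration the paper computes agrees with the familiar density form of Bayes's rule, which remains meaningful even though the event y = y0 itself has probability zero.

```latex
% Latent x with prior density p(x); observation y with conditional density p(y | x).
% The event {y = y_0} has probability zero, so the event form of Bayes's rule degenerates,
% but the disintegration (density) form is still well defined:
\[
  p(x \mid y = y_0) \;=\; \frac{p(x)\, p(y_0 \mid x)}{\int p(x')\, p(y_0 \mid x')\, \mathrm{d}x'}
  \qquad \text{whenever } \int p(x')\, p(y_0 \mid x')\, \mathrm{d}x' > 0.
\]
```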
Johannes Borgström, Ugo Dal Lago, Andrew D. Gordon, and Marcin Szymczak. 1/2017. “A Lambda-Calculus Foundation for Universal Probabilistic Programming.”
We develop the operational semantics of an untyped probabilistic lambda-calculus with continuous distributions, as a foundation for universal probabilistic programming languages such as Church, Anglican, and Venture. Our first contribution is to adapt the classic operational semantics of lambda-calculus to a continuous setting via creating a measure space on terms and defining step-indexed approximations. We prove equivalence of big-step and small-step formulations of this distribution-based semantics. To move closer to inference techniques, we also define the sampling-based semantics of a term as a function from a trace of random samples to a value. We show that the distribution induced by integrating over all traces equals the distribution-based semantics. Our second contribution is to formalize the implementation technique of trace Markov chain Monte Carlo (MCMC) for our calculus and to show its correctness. A key step is defining sufficient conditions for the distribution induced by trace MCMC to converge to the distribution-based semantics. To the best of our knowledge, this is the first rigorous correctness proof for trace MCMC for a higher-order functional language.
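A minimal sketch of the sampling-based reading described in the abstract, under strong simplifying assumptions (a toy program whose trace has a fixed length, and a naive symmetric proposal): the program is a deterministic function from a trace of random numbers to a value and a weight, and trace MCMC proposes changes to the trace and accepts them by the ratio of weights.

```python
import math
import random

def program(trace):
    # A probabilistic program as a deterministic function of its trace: each
    # `sample` reads the next uniform random number from the trace.
    x = trace[0]                                   # x ~ Uniform(0, 1)
    y = trace[1]                                   # y ~ Uniform(0, 1)
    log_weight = -0.5 * ((y - x) / 0.05) ** 2      # a soft observation: y should be near x
    return x, log_weight

def trace_mcmc(steps=20_000):
    # Naive trace MCMC: resample one position of the trace per step and accept
    # with the Metropolis ratio of the two runs' weights (the proposal is symmetric).
    trace = [random.random(), random.random()]
    _, log_w = program(trace)
    xs = []
    for _ in range(steps):
        proposal = list(trace)
        proposal[random.randrange(len(proposal))] = random.random()
        _, log_w_new = program(proposal)
        if math.log(random.random()) < log_w_new - log_w:
            trace, log_w = proposal, log_w_new
        xs.append(program(trace)[0])
    return xs

xs = trace_mcmc()
print(sum(xs) / len(xs))   # about 0.5 by symmetry; y is pulled toward x
```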
Owain Evans, Andreas Stuhlmüller, John Salvatier, and Daniel Filan. 2017. “Modeling Agents with Probabilistic Programs.”

This book describes and implements models of rational agents for (PO)MDPs and Reinforcement Learning. One motivation is to create richer models of human planning, which capture human biases and bounded rationality.

Agents are implemented as differentiable functional programs in a probabilistic programming language based on Javascript. Agents plan by recursively simulating their future selves or by simulating their opponents in multi-agent games. Our agents and environments run directly in the browser and are easy to modify and extend.

The book assumes basic programming experience but is otherwise self-contained. It includes short introductions to “planning as inference”, MDPs, POMDPs, inverse reinforcement learning, hyperbolic discounting, myopic planning, and multi-agent planning.
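A rough Python sketch of the planning recursion described above (the book itself uses WebPPL, and its agents are richer; the environment and rewards below are invented for illustration): an agent scores actions by simulating the stochastic environment, and a deeper agent plans by simulating what its own future self would choose.

```python
import random

ACTIONS = ["left", "right"]

def environment(state, action):
    # Toy stochastic environment (invented for this sketch): from state 0,
    # "right" usually reaches the rewarding state 1; "left" pays a small sure reward.
    if state == 0:
        if action == "right":
            return (1, 1.0) if random.random() < 0.9 else (0, 0.0)
        return (0, 0.1)
    return (1, 0.5)   # state 1 keeps paying a moderate reward

def expected_utility(state, action, depth, rollouts=200):
    total = 0.0
    for _ in range(rollouts):
        next_state, reward = environment(state, action)
        if depth > 1:
            # Recursively simulate the agent's own future choice.
            future_action = choose_action(next_state, depth - 1)
            reward += expected_utility(next_state, future_action, depth - 1, rollouts=20)
        total += reward
    return total / rollouts

def choose_action(state, depth):
    return max(ACTIONS, key=lambda a: expected_utility(state, a, depth))

print(choose_action(state=0, depth=2))   # "right": worth a risky step toward state 1
```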

2016
Sam Staton, Frank Wood, Hongseok Yang, Chris Heunen, and Ohad Kammar. 2016. “Semantics for probabilistic programming: higher-order functions, continuous distributions, and soft constraints.” Logic in Computer Science (LICS).
We study the semantic foundation of expressive probabilistic programming languages that support higher-order functions, continuous distributions, and soft constraints (such as Anglican, Church, and Venture). We define a metalanguage (an idealised version of Anglican) for probabilistic computation with the above features, develop both operational and denotational semantics, and prove soundness, adequacy, and termination. This involves measure theory, stochastic labelled transition systems, and functor categories, but admits intuitive computational readings, one of which views sampled random variables as dynamically allocated read-only variables. We apply our semantics to validate nontrivial equations underlying the correctness of certain compiler optimisations and inference algorithms such as sequential Monte Carlo simulation. The language enables defining probability distributions on higher-order functions, and we study their properties.
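A small numerical illustration of the weighted-sampler reading (my own sketch, not the paper's metalanguage): `sample` draws a value, a soft constraint multiplies the run's weight, and a program equation of the kind the semantics validates, here the reordering of two independent samples, can be checked by comparing the normalised results.

```python
import math
import random

def run_a():
    # sample x; sample y; soft-constrain x + y to be near 1.
    x = random.gauss(0.0, 1.0)
    y = random.gauss(0.0, 1.0)
    weight = math.exp(-0.5 * (x + y - 1.0) ** 2)
    return x, weight

def run_b():
    # The same program with the two independent sample statements swapped.
    y = random.gauss(0.0, 1.0)
    x = random.gauss(0.0, 1.0)
    weight = math.exp(-0.5 * (x + y - 1.0) ** 2)
    return x, weight

def posterior_mean(run, n=200_000):
    pairs = [run() for _ in range(n)]
    return sum(x * w for x, w in pairs) / sum(w for _, w in pairs)

print(posterior_mean(run_a), posterior_mean(run_b))   # both approach 1/3
```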
2013
Andreas Stuhlmüller and Noah D. Goodman. 2013. “Reasoning about Reasoning by Nested Conditioning: Modeling Theory of Mind with Probabilistic Programs.” Journal of Cognitive Systems Research.
A wide range of human reasoning patterns can be explained as conditioning in probabilistic models; however, conditioning has traditionally been viewed as an operation applied to such models, not represented in such models. We describe how probabilistic programs can explicitly represent conditioning as part of a model. This enables us to describe reasoning about others’ reasoning using nested conditioning. Much of human reasoning is about the beliefs, desires, and intentions of other people; we use probabilistic programs to formalize these inferences in a way that captures the flexibility and inherent uncertainty of reasoning about other agents. We express examples from game theory, artificial intelligence, and linguistics as recursive probabilistic programs and illustrate how this representation language makes it easy to explore new directions in each of these fields. We discuss the algorithmic challenges posed by these kinds of models and describe how Dynamic Programming techniques can help address these challenges.
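A minimal sketch of nested conditioning in plain Python (the paper's examples are written as Church programs and are considerably richer; the coordination scenario and prior below are assumptions of this sketch): each agent conditions its own choice on a simulation of the other agent choosing the same place, with the recursion cut off at a fixed depth.

```python
import random

def sample_location():
    # Shared prior over meeting places.
    return "popular bar" if random.random() < 0.6 else "quiet bar"

def alice(depth):
    # Alice conditions her choice on a simulation of Bob choosing the same
    # place (conditioning by rejection); Bob reasons about Alice in the same way.
    while True:
        mine = sample_location()
        if depth == 0 or bob(depth - 1) == mine:
            return mine

def bob(depth):
    while True:
        mine = sample_location()
        if depth == 0 or alice(depth - 1) == mine:
            return mine

counts = {"popular bar": 0, "quiet bar": 0}
for _ in range(2000):
    counts[alice(depth=3)] += 1
print(counts)   # nested reasoning amplifies the prior: mostly "popular bar"
```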
2011
Judea Pearl. 2011. “The algorithmization of counterfactuals.” Annals of Mathematics and Artificial Intelligence.
Recent advances in causal reasoning have given rise to a computation model that emulates the process by which humans generate, evaluate and distinguish counterfactual sentences. Though compatible with the “possible worlds” account, this model enjoys the advantages of representational economy, algorithmic simplicity and conceptual clarity. Using this model, the paper demonstrates the processing of counterfactual sentences on a classical example due to Ernest Adams. It then gives a panoramic view of several applications where counterfactual reasoning has benefited problem areas in the empirical sciences.
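A toy sketch of the three-step procedure that such a computation model algorithmizes (abduction, action, prediction), on a minimal structural causal model; the variables and probabilities below are invented for illustration.

```python
import random

def simulate(u_rain, u_sprinkler, sprinkler_override=None):
    # Structural equations: rain := u_rain; sprinkler := u_sprinkler unless
    # intervened on; wet := rain or sprinkler.
    rain = u_rain
    sprinkler = u_sprinkler if sprinkler_override is None else sprinkler_override
    return rain, sprinkler, (rain or sprinkler)

def would_be_wet_without_sprinkler(num_samples=200_000):
    # "The grass is wet. Would it still be wet had the sprinkler been off?"
    outcomes = []
    for _ in range(num_samples):
        u_rain = random.random() < 0.3
        u_sprinkler = random.random() < 0.5
        _, _, wet = simulate(u_rain, u_sprinkler)
        if not wet:
            continue                       # step 1 (abduction): keep worlds consistent with the evidence
        _, _, wet_cf = simulate(u_rain, u_sprinkler, sprinkler_override=False)
        outcomes.append(wet_cf)            # steps 2-3 (action + prediction) in that same world
    return sum(outcomes) / len(outcomes)

print(would_be_wet_without_sprinkler())    # about 0.46: wet only in the worlds where it actually rained
```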
2008
Yarden Katz, Noah D. Goodman, Kristian Kersting, Charles Kemp, and Joshua B. Tenenbaum. 2008. “Modeling Semantic Cognition as Logical Dimensionality Reduction.” CogSci.
Semantic knowledge is often expressed in the form of intuitive theories, which organize, predict and explain our observations of the world. How are these powerful knowledge structures represented and acquired? We present a framework, logical dimensionality reduction, that treats theories as compressive probabilistic models, attempting to express observed data as a sample from the logical consequences of the theory’s underlying laws and a small number of core facts. By performing Bayesian learning and inference on these models we combine important features of more familiar connectionist and symbolic approaches to semantic cognition: an ability to handle graded, uncertain inferences, together with systematicity and compositionality that support appropriate inferences from sparse observations in novel contexts.
1994
Marco Ramoni, Alberto Riva, and Vimla L. Patel. 1/2/1994. “Probabilistic Reasoning under Ignorance.” Cognitive Science Society.
The representation of ignorance is a long-standing challenge for researchers in probability and decision theory. During the past decade, Artificial Intelligence researchers have developed a class of reasoning systems, called Truth Maintenance Systems, which are able to reason on the basis of incomplete information. In this paper we will describe a new method for dealing with partially specified probabilistic models, by extending a logic-based truth maintenance method from Boolean truth-values to probability intervals. Then we will illustrate how this method can be used to represent Bayesian Belief Networks --- one of the best known formalisms to reason under uncertainty --- thus producing a new class of Bayesian Belief Networks, called Ignorant Belief Networks, able to reason on the basis of partially specified prior and conditional probabilities. Finally, we will discuss how this new method relates to some theoretical intuitions and empirical findings in decision theory and cognitive science.
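A back-of-the-envelope sketch of reasoning with partially specified probabilities (my own illustration, not the paper's network algorithm): beliefs are intervals rather than point values, and bounds on a conclusion follow from the law of total probability even when the prior and the conditionals are only known up to intervals.

```python
def interval_total_probability(p_a, p_b_given_a, p_b_given_not_a):
    # Each argument is an interval (low, high). Bound
    #   P(B) = P(B|A) P(A) + P(B|not A) (1 - P(A)).
    # The expression is multilinear, so its extremes over the box of admissible
    # values are attained at corner points.
    corners = [pba * pa + pbna * (1 - pa)
               for pa in p_a for pba in p_b_given_a for pbna in p_b_given_not_a]
    return min(corners), max(corners)

# The prior P(A) is only known to lie in [0.2, 0.5]; the conditionals are
# partially specified as well. The conclusion is an interval, not a number.
print(interval_total_probability((0.2, 0.5), (0.7, 0.9), (0.1, 0.2)))   # (0.22, 0.55)
```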
1993
Marco Ramoni and Alberto Riva. 1/1/1993. “Belief Maintenance with Probabilistic Logic.” AAAI.
Belief maintenance systems are natural extensions of truth maintenance systems that use probabilities rather than Boolean truth-values. This paper introduces a general method for belief maintenance, based on (the propositional fragment of) probabilistic logic, that extends the Boolean Constraint Propagation method used by the logic-based truth maintenance systems. From the concept of probabilistic entailment, we derive a set of constraints on the (probabilistic) truth-values of propositions and we prove their soundness. These constraints are complete with respect to a well-defined set of clauses, and their partial incompleteness is compensated by a gain in computational efficiency.
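To make "constraints on (probabilistic) truth-values" concrete, here is a hedged sketch of the kind of bound such propagation can enforce for a single clause; these are the classical Boole-Fréchet inequalities, not the paper's specific constraint set.

```python
def disjunction_bounds(p_a, p_b):
    # For any joint distribution with marginals P(A) = p_a and P(B) = p_b:
    #   max(p_a, p_b) <= P(A or B) <= min(1, p_a + p_b)
    return max(p_a, p_b), min(1.0, p_a + p_b)

def conjunction_bounds(p_a, p_b):
    #   max(0, p_a + p_b - 1) <= P(A and B) <= min(p_a, p_b)
    return max(0.0, p_a + p_b - 1.0), min(p_a, p_b)

# Asserting P(A) = 0.7 and P(B) = 0.6 pins the clause A or B down only to an interval.
print(disjunction_bounds(0.7, 0.6))   # (0.7, 1.0)
print(conjunction_bounds(0.7, 0.6))   # (0.3, 0.6)
```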
1986
Brian Falkenhainer. 1986. “Towards a General-Purpose Belief Maintenance System.” UAI.
There currently exists a gap between the theories proposed by the probability and uncertainty community and the needs of Artificial Intelligence research. These theories primarily address the needs of expert systems, using knowledge structures which must be pre-compiled and remain static in structure during runtime. Many AI systems require the ability to dynamically add and remove parts of the current knowledge structure (e.g., in order to examine what the world would be like for different causal theories). This requires more flexibility than existing uncertainty systems display. In addition, many AI researchers are only interested in using "probabilities" as a means of obtaining an ordering, rather than attempting to derive an accurate probabilistic account of a situation. This indicates the need for systems which stress ease of use and don't require extensive probability information when one cannot (or doesn't wish to) provide such information. This paper attempts to help reconcile the gap between approaches to uncertainty and the needs of many AI systems by examining the control issues which arise, independent of a particular uncertainty calculus, when one tries to satisfy these needs. Truth Maintenance Systems have been used extensively in problem solving tasks to help organize a set of facts and detect inconsistencies in the believed state of the world. These systems maintain a set of true/false propositions and their associated dependencies. However, situations often arise in which we are unsure of certain facts or in which the conclusions we can draw from available information are somewhat uncertain. The non-monotonic TMS [12] was an attempt at reasoning when all the facts are not known, but it fails to take into account degrees of belief and how available evidence can combine to strengthen a particular belief. This paper addresses the problem of probabilistic reasoning as it applies to Truth Maintenance Systems. It describes a Belief Maintenance System that manages a current set of beliefs in much the same way that a TMS manages a set of true/false propositions. If the system knows that belief in fact1 is dependent in some way upon belief in fact2, then it automatically modifies its belief in fact1 when new information causes a change in belief of fact2. It models the behavior of a TMS, replacing its 3-valued logic (true, false, unknown) with an infinite-valued logic, in such a way as to reduce to a standard TMS if all statements are given in absolute true/false terms. Belief Maintenance Systems can, therefore, be thought of as a generalization of Truth Maintenance Systems, whose possible reasoning tasks are a superset of those for a TMS.