Advanced

2020

Kevin Ellis, Catherine Wong, Maxwell Nye, Mathias Sable-Meyer, Luc Cary, Lucas Morales, Luke Hewitt, Armando Solar-Lezama, and Joshua B. Tenenbaum. 6/2020. “DreamCoder: Growing generalizable, interpretable knowledge with wake-sleep Bayesian program learning”. PDF Abstract

Expert problem-solving is driven by powerful languages for thinking about problems and their solutions. Acquiring expertise means learning these languages -- systems of concepts, alongside the skills to use them. We present DreamCoder, a system that learns to solve problems by writing programs. It builds expertise by creating programming languages for expressing domain concepts, together with neural networks to guide the search for programs within these languages. A ``wake-sleep'' learning algorithm alternately extends the language with new symbolic abstractions and trains the neural network on imagined and replayed problems. DreamCoder solves both classic inductive programming tasks and creative tasks such as drawing pictures and building scenes. It rediscovers the basics of modern functional programming, vector algebra and classical physics, including Newton's and Coulomb's laws. Concepts are built compositionally from those learned earlier, yielding multi-layered symbolic representations that are interpretable and transferrable to new tasks, while still growing scalably and flexibly with experience.

2019

Osbert Bastani, Xin Zhang, and Armando Solar-Lezama. 10/2019. “Probabilistic verification of fairness properties via concentration.” PACML, OOPSLA. PDF Abstract

As machine learning systems are increasingly used to make real world legal and financial decisions, it is of paramount importance that we develop algorithms to verify that these systems do not discriminate against minorities. We design a scalable algorithm for verifying fairness specifications. Our algorithm obtains strong correctness guarantees based on adaptive concentration inequalities; such inequalities enable our algorithm to adaptively take samples until it has enough data to make a decision. We implement our algorithm in a tool called VeriFair, and show that it scales to large machine learning models, including a deep recurrent neural network that is more than five orders of magnitude larger than the largest previously-verified neural network. While our technique only gives probabilistic guarantees due to the use of random samples, we show that we can choose the probability of error to be extremely small.

Gagandeep Singh, Timon Gehr, Markus Püschel, and Martin Vechev. 1/2019. “An Abstract Domain for Certifying Neural Networks.” PACML, POPL. PDF Abstract

We present a novel method for scalable and precise certification of deep neural networks. The key technical insight behind our approach is a new abstract domain which combines floating point polyhedra with intervals and is equipped with abstract transformers specifically tailored to the setting of neural networks. Concretely, we introduce new transformers for affine transforms, the rectified linear unit (ReLU), sigmoid, tanh, and maxpool functions.

We implemented our method in a system called DeepPoly and evaluated it extensively on a range of datasets, neural architectures (including defended networks), and specifications. Our experimental results indicate that DeepPoly is more precise than prior work while scaling to large networks.

We also show how to combine DeepPoly with a form of abstraction refinement based on trace partitioning. This enables us to prove, for the first time, the robustness of the network when the input image is subjected to complex perturbations such as rotations that employ linear interpolation.

Tetsuya Sato, Alejandro Aguirre, Gilles Barthe, Marco Gaboardi, Deepak Garg, and Justin Hsu. 2019. “Formal Verification of Higher-Order Probabilistic Programs.” Principles of Programming Languages (POPL). PDF Abstract

Probabilistic programming provides a convenient lingua franca for writing succinct and rigorous descriptions of probabilistic models and inference tasks. Several probabilistic programming languages, including Anglican, Church or Hakaru, derive their expressiveness from a powerful combination of continuous distributions, conditioning, and higher-order functions. Although very important for practical applications, these features raise fundamental challenges for program semantics and verification. Several recent works offer promising answers to these challenges, but their primary focus is on foundational semantics issues. In this paper, we take a step further by developing a suite of logics, collectively named PPV, for proving properties of programs written in an expressive probabilistic higher-order language with continuous sampling operations and primitives for conditioning distributions. Our logics mimic the comfortable reasoning style of informal proofs using carefully selected axiomatizations of key results from probability theory. The versatility of our logics is illustrated through the formal verification of several intricate examples from statistics, probabilistic inference, and machine learning. We further show expressiveness by giving sound embeddings of existing logics. In particular, we do this in a parametric way by showing how the semantics idea of (unary and relational) ⊤⊤-lifting can be internalized in our logics. The soundness of PPV follows by interpreting programs and assertions in quasi-Borel spaces (QBS), a recently proposed variant of Borel spaces with a good structure for interpreting higher order probabilistic programs.

Zenna Tavares, Javier Burroni, Edgar Minasyan, Armando Solar Lezama, and Rajesh Ranganath. 2019. “Predicate Exchange: Inference with Declarative Knowledge.” International Conference on Machine Learning (ICML). PDF Abstract

Programming languages allow us to express complex predicates, but existing inference methods are unable to condition probabilistic models on most of them. To support a broader class of predicates, we develop an inference procedure called predicate exchange, which softens predicates. A soft predicate quantifies the extent to which values of model variables are consistent with its hard counterpart. We substitute the likelihood term in the Bayesian posterior with a soft predicate, and develop a variant of replica exchange MCMC to draw posterior samples. We implement predicate exchange as a language agnostic tool which performs a nonstandard execution of a probabilistic program. We demonstrate the approach on sequence models of health and inverse rendering.

2018

Mislav Balunovic, Pavol Bielik, and Martin Vechev. 12/2018. “Learning to Solve SMT Formulas.” NeurIPS. PDF Abstract

We present a new approach for learning to solve SMT formulas. We phrase the challenge of solving SMT formulas as a tree search problem where at each step a transformation is applied to the input formula until the formula is solved. Our approach works in two phases: first, given a dataset of unsolved formulas we learn a policy that for each formula selects a suitable transformation to apply at each step in order to solve the formula, and second, we synthesize a strategy in the form of a loop-free program with branches. This strategy is an interpretable representation of the policy decisions and is used to guide the SMT solver to decide formulas more efficiently, without requiring any modification to the solver itself and without needing to evaluate the learned policy at inference time. We show that our approach is effective in practice – it solves 17% more formulas over a range of benchmarks and achieves up to 100 × runtime improvement over a state-of-the-art SMT solver.

2017

Mona Alshahrani, Mohammad Asif Khan, Omar Maddouri, Akira R Kinjo, Núria Queralt-Rosinach, and Robert Hoehndorf. 9/2017. “Neuro-symbolic representation learning on biological knowledge graphs.” Bioinformatics. PDF Abstract

Motivation: Biological data and knowledge bases increasingly rely on Semantic Web technologies and the use of knowledge graphs for data integration, retrieval and federated queries. In the past years, feature learning methods that are applicable to graph-structured data are becoming available, but have not yet widely been applied and evaluated on structured biological knowledge. Results: We develop a novel method for feature learning on biological knowledge graphs. Our method combines symbolic methods, in particular knowledge representation using symbolic logic and automated reasoning, with neural networks to generate embeddings of nodes that encode for related information within knowledge graphs. Through the use of symbolic logic, these embeddings contain both explicit and implicit information. We apply these embeddings to the prediction of edges in the knowledge graph representing problems of function prediction, finding candidate genes of diseases, protein-protein interactions, or drug target relations, and demonstrate performance that matches and sometimes outperforms traditional approaches based on manually crafted features. Our method can be applied to any biological knowledge graph, and will thereby open up the increasing amount of Semantic Web based knowledge bases in biology to use in machine learning and data analytics. Availability and Implementation: https://github.com/bio-ontology-research-group/walking-rdf-and-owl

Chung-chieh Shan and Norman Ramsey. 1/2017. “Exact Bayesian Inference by Symbolic Disintegration.” Principles of Programming Languages (POPL). PDF Abstract

Bayesian inference, of posterior knowledge from prior knowledge and observed evidence, is typically defined by Bayes’s rule, which says the posterior multiplied by the probability of an observation equals a joint probability. But the observation of a continuous quantity usually has probability zero, in which case Bayes’s rule says only that the unknown times zero is zero. To infer a posterior distribution from a zero-probability observation, the statistical notion of disintegration tells us to specify the observation as an expression rather than a predicate, but does not tell us how to compute the posterior. We present the first method of computing a disintegration from a probabilistic program and an expression of a quantity to be observed, even when the observation has probability zero. Because the method produces an exact posterior term and preserves a semantics in which monadic terms denote measures, it composes with other inference methods in a modular way—without sacrificing accuracy or performance.

Johannes Borgström, Ugo Dal Lago, Andrew D. Gordon, and Marcin Szymczak. 1/2017. “A Lambda-Calculus Foundation for Universal Probabilistic Programming”. PDF Abstract

We develop the operational semantics of an untyped probabilistic lambda-calculus with continuous distributions, as a foundation for universal probabilistic programming languages such as Church, Anglican, and Venture. Our first contribution is to adapt the classic operational semantics of lambda-calculus to a continuous setting via creating a measure space on terms and defining step-indexed approximations. We prove equivalence of big-step and small-step formulations of this distribution-based semantics. To move closer to inference techniques, we also define the sampling-based semantics of a term as a function from a trace of random samples to a value. We show that the distribution induced by integrating over all traces equals the distribution-based semantics. Our second contribution is to formalize the implementation technique of trace Markov chain Monte Carlo (MCMC) for our calculus and to show its correctness. A key step is defining sufficient conditions for the distribution induced by trace MCMC to converge to the distribution-based semantics. To the best of our knowledge, this is the first rigorous correctness proof for trace MCMC for a higher-order functional language.

2015

Cindy Mason. 1/2015. “Engineering Kindness: Building a Machine with Compassionate Intelligence.” International Journal of Synthetic Emotions. PDF Abstract

The author provides first steps toward building a software agent/robot with compassionate intelligence. She approaches this goal with an example software agent, EM-2. She also gives a generalized software requirements guide for anyone wishing to pursue other means of building compassionate intelligence into an AI system. The purpose of EM-2 is not to build an agent with a state of mind that mimics empathy or consciousness, but rather to create practical applications of AI systems with knowledge and reasoning methods that positively take into account the feelings and state of self and others during decision making, action, or problem solving. To program EM-2 the author re-purposes code and architectural ideas from collaborative multi-agent systems and affective common sense reasoning with new concepts and philosophies from the human arts and sciences relating to compassion. EM-2 has predicates and an agent architecture based on a meta-cognition mental process that was used on India's worst prisoners to cultivate compassion for others, Vipassana or mindfulness. She describes and presents code snippets for common sense based affective inference and the I-TMS, an Irrational Truth Maintenance System, that maintains consistency in agent memory as feelings change over time, and provides a machine theoretic description of the consistency issues of combining affect and logic. The author summarizes the growing body of new biological, cognitive and immune discoveries about compassion and the consequences of these discoveries for programmers working with human-level AI and hybrid human-robot systems.

2011

Judea Pearl. 2011. “The algorithmization of counterfactuals.” Annals of Mathematics and Artificial Intelligence . Publisher's Version Abstract

Recent advances in causal reasoning have given rise to a computation model that emulates the process by which humans generate, evaluate and distinguish counterfactual sentences. Though compatible with the “possible worlds” account, this model enjoys the advantages of representational economy, algorithmic simplicity and conceptual clarity. Using this model, the paper demonstrates the processing of counterfactual sentences on a classical example due to Ernest Adam. It then gives a panoramic view of several applications where counterfactual reasoning has benefited problem areas in the empirical sciences.

1994

Marco Ramoni, Alberto Riva, and Vimla L. Patel. 1/2/1994. “Probabilistic Reasoning under Ignorance.” Cognitive Science Society. PDF Abstract

The representation of ignorance is a long standing challenge for researchers in probability and decision theory. During the past decade, Artificial Intelligence researchers have developed a class of reasoning systems, called Truth Maintenance Systems, which are able to reason on the basis of incomplete information. In this paper we will describe a new method for dealing with partially specified probabilistic models, by extending a logic-based truth maintenance method from Boolean truth-values to probability intervals. Then we will illustrate how this method can be used to represent Bayesian Belief Networks --- one of the best known formalisms to reason under uncertainty --- thus producing a new class of Bayesian Belief Networks, called Ignorant Belief Networks, able to reason on the basis of partially specified prior and conditional probabilities. Finally, we will discuss how this new method relates to some theoretical intuitions and empirical findings in decision theory and cognitive science.

1993

Marco Ramoni and Alberto Riva. 1/1/1993. “Belief Maintenance with Probabilistic Logic.” AAAI. PDF Abstract

Belief maintenance systems are natural extensions of truth maintenance systems that use probabilities rather than boolean truth-values. This paper introduces a general method for belief maintenance, based on (the propositional fragment of) probabilistic logic, that extends the Boolean Constraint Propagation method used by the logic-based truth maintenance systems. From the concept of probabilistic entailment, we derive a set of constraints on the (probabilistic) truth-values of propositions and we prove their soundness. These constraints are complete with respect to a well-defined set of clauses, and their partial incompleteness is compensated by a gain in computational efficiency

1991

Stephen Muggleton. 1991. “Inductive logic programming.” New Generation Computing. PDF Abstract

A new research area, Inductive Logic Programming, is presently emerging. While inheriting various positive characteristics of the parent subjects of Logic Programming and Machine Learning, it is hoped that the new area will overcome many of the limitations of its forebears. The background to present developments within this area is discussed and various goals and aspirations for the increasing body of researchers are identified. Inductive Logic Programming needs to be based on sound principles from both Logic and Statistics. On the side of statistical justification of hypotheses we discuss the possible relationship between Algorithmic Complexity theory and Probably-Approximately-Correct (PAC) Learning. In terms of logic we provide a unifying framework for Muggleton and Buntine's Inverse Resolution (IR) and Plotkin's Relative Least General Generalisation (RLGG) by rederiving RLGG in terms of IR. This leads to a discussion of the feasibility of extending the RLGG framework to allow for the invention of new predicates, previously discussed only within the cor~text of IR.

1966

John von Neumann. 1/1/1966. Theory of Self-Reproducing Automata. University of Illinois Press. PDF

CS252R: Programming Languages + Artificial Intelligence (Fall 2020)

Advanced

Topics

Level

Note