Who knew you could do all this with bash?
Interesting question and answer.
David Christiansen does write great tutorials, but Scheme…
Despite their massive size, successful deep artificial neural networks can exhibit a remarkably small difference between training and test performance. Conventional wisdom attributes small generalization error either to properties of the model family, or to the regularization techniques used during training. Through extensive systematic experiments, we show how these traditional approaches fail to explain why large neural networks generalize well in practice. Specifically, our experiments establish that state-of-the-art convolutional networks for image classification trained with stochastic gradient methods easily fit a random labeling of the training data. This phenomenon is qualitatively unaffected by explicit regularization, and occurs even if we replace the true images by completely unstructured random noise. We corroborate these experimental findings with a theoretical construction showing that simple depth two neural networks already have perfect finite sample expressivity as soon as the number of parameters exceeds the number of data points as it usually does in practice. We interpret our experimental findings by comparison with traditional models.
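To make the random-label experiment concrete, here is a minimal sketch (not the paper's setup; all sizes and hyperparameters are made up): a depth-two ReLU network with far more parameters than data points, trained with a gradient method on pure-noise inputs and labels drawn independently of those inputs, until it fits them exactly.

```python
# A minimal sketch, not the paper's experiments: a depth-two ReLU network with
# far more parameters than data points is trained to fit random labels attached
# to pure-noise inputs. Sizes and hyperparameters below are illustrative.
import torch
import torch.nn as nn

torch.manual_seed(0)
n, d, k, hidden = 200, 50, 10, 2000            # 200 points; ~122k parameters
X = torch.randn(n, d)                          # completely unstructured noise "images"
y = torch.randint(0, k, (n,))                  # labels drawn independently of X

model = nn.Sequential(nn.Linear(d, hidden), nn.ReLU(), nn.Linear(hidden, k))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for step in range(2000):                       # full-batch gradient steps
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()

train_acc = (model(X).argmax(dim=1) == y).float().mean().item()
print(f"training accuracy on random labels: {train_acc:.2f}")  # typically reaches 1.00
```

Test performance is of course at chance here; the point is only that the training error can be driven to zero even when the labels carry no information about the inputs.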
It is worth noting that Google’s OSS-Fuzz generated 10 trillion test inputs a day [using hundreds of servers] for a few months to fuzz these applications, but QSYM ran them for three hours using a single workstation.
The community believes that symbolic and concolic execution are slow due to path explosion and constraint solving, but in fact emulation itself is 3–5 orders of magnitude slower.
Synthesizing programs using example input/outputs is a classic problem in artificial intelligence. We present a method for solving Programming By Example (PBE) problems by using a neural model to guide the search of a constraint logic programming system called miniKanren. Crucially, the neural model uses miniKanren’s internal representation as input; miniKanren represents a PBE problem as recursive constraints imposed by the provided examples. We explore Recurrent Neural Network and Graph Neural Network models. We contribute a modified miniKanren, drivable by an external agent, available at this URL. We show that our neural-guided approach using constraints can synthesize programs faster in many cases, and importantly, can generalize to larger problems.
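For context on what a PBE problem looks like, here is a toy sketch that has nothing to do with the paper's neural-guided miniKanren: blind enumerative synthesis over a hypothetical four-operation DSL, returning the shortest program consistent with the given input/output examples.

```python
# A hedged illustration of the PBE setup (not the paper's method): brute-force
# enumeration over a tiny expression DSL until some program matches every
# (input, output) example. The DSL and examples are made up.
import itertools

# Tiny DSL: each op maps an integer to another integer.
OPS = {
    "inc":    lambda x: x + 1,
    "dec":    lambda x: x - 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}

def run(prog, x):
    """Apply a sequence of op names, left to right, to the input x."""
    for op in prog:
        x = OPS[op](x)
    return x

def synthesize(examples, max_len=4):
    """Return the shortest op sequence consistent with all examples, or None."""
    for length in range(1, max_len + 1):
        for prog in itertools.product(OPS, repeat=length):
            if all(run(prog, x) == y for x, y in examples):
                return prog
    return None

# The examples implicitly specify "double, then increment".
print(synthesize([(1, 3), (2, 5), (10, 21)]))   # ('double', 'inc')
```

The paper's contribution is, roughly, replacing this kind of blind enumeration with a neural model that scores which branch of miniKanren's constraint search to expand next.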
Discusses AI in terms of six related fields
Science aims to learn how the world works, by experiment when possible, or observation otherwise. In AI, we have the luxury of experiment. Still better: we have the luxury of perfectly repeatable experiments, under perfectly controlled conditions! Almost no other domain is as ideally suited to scientific investigation.
Yet it is uncommon for AI research to include either a hypothesis or an experiment.
We don’t know how to assign and update probabilities on predicate statements using Bayes nets, in a coherent and general manner. So we don’t know how to do that with neural nets either, except to the degree that neural nets are simpler or easier to work with than general Bayes nets.
For instance, as Thomas Colthurst points out in the comments, message passing algorithms don’t provably work in general Bayes nets, but do work in feedforward neural nets, which don’t have cycles. It may be that neural nets provide a restricted domain in which modeling predicate statements probabilistically is more tractable. I would have to learn more about this.
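As a concrete illustration of the cycle-free case alluded to above (my own toy example, with made-up probabilities): exact sum-product message passing on a three-variable chain, checked against brute-force summation over the joint distribution.

```python
# Exact message passing (sum-product) on a chain A -> B -> C of binary
# variables, compared with brute-force enumeration. The CPT values are made up.
import numpy as np

p_a = np.array([0.6, 0.4])                     # P(A)
p_b_given_a = np.array([[0.7, 0.3],            # P(B | A=0)
                        [0.2, 0.8]])           # P(B | A=1)
p_c_given_b = np.array([[0.9, 0.1],            # P(C | B=0)
                        [0.5, 0.5]])           # P(C | B=1)

# Message passing: marginalize out A, then B, one local sum at a time.
msg_a_to_b = p_a @ p_b_given_a                 # P(B)
p_c_mp = msg_a_to_b @ p_c_given_b              # P(C)

# Brute force: build the full joint P(A, B, C) and sum over A and B.
joint = p_a[:, None, None] * p_b_given_a[:, :, None] * p_c_given_b[None, :, :]
p_c_bf = joint.sum(axis=(0, 1))

print(p_c_mp, p_c_bf)                          # identical marginals
assert np.allclose(p_c_mp, p_c_bf)
```

On a cycle-free structure like this chain, the local sums provably recover the exact marginals; on graphs with cycles the same message-passing scheme loses that guarantee.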
Abram Demski:
Logical induction is more or less an extension of probability theory to allow us to meaningfully work with first-order logic while at the same time remaining computable (though not very feasibly computable!).
What logical induction does, which that kind of approach could never do, is give some guarantees about the way in which a coherent distribution gets approximated; it guarantees that good heuristic guesses will be made. So logical induction formalizes the way mathematicians can make conjectures, and illustrates that it necessarily goes beyond probability theory in a certain way (while also keeping as close to probability theory as possible).
If anything, I would say that a better characterization of the limits of logical induction is that it doesn’t do second-order logic. It doesn’t have notions of “finite” vs “infinite” like those which second-order logic can supply. It doesn’t come to believe in a standard model of the natural numbers.
An optimizing compiler is faced with a collection of intractable search problems and given very little time in which to solve them. The solutions to these problems enable and block each other in ways that can be difficult to predict ahead of time: this is the “phase ordering” problem where a compiler must find a sequence of optimization passes that seems to give good results most of the time, without taking too long to execute. Optimizations that work well in one problem domain (loop vectorization or aggressive partial evaluation, for example) are often useless or counterproductive in other domains, and yet a given compiler instance has no obvious principled way to know what will work well this time.
This paper is organized around a handful of thesis statements; the main one is:
Thesis 1. Major components of future compilers will be generated partially automatically, with the help of SMT solvers, directly addressing compiler speed (automatically discovered optimizations will be structured uniformly and amenable to fast rewrite strategies), compiler correctness (automated theorem provers are generally less fallible than humans), and code quality (solvers can be used to conduct a thorough search of the optimization space).
Thesis 6. Durable interfaces, such as ISAs, IRs, and programming languages, should be accompanied by first-class formal semantics.
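As a small illustration of the SMT-solver role in Thesis 1, here is a sketch using the Z3 Python bindings (the rewrites are my own toy examples, not ones from the paper): checking whether a candidate peephole rewrite preserves behavior for every 32-bit input.

```python
# A minimal sketch of SMT-based rewrite checking: ask Z3 whether any 32-bit
# input distinguishes the original expression from its proposed replacement.
from z3 import BitVec, Solver, unsat

def check_rewrite(name, lhs, rhs):
    """Report whether lhs == rhs holds for every 32-bit input."""
    s = Solver()
    s.add(lhs != rhs)                      # search for a distinguishing input
    if s.check() == unsat:
        print(f"{name}: valid for all inputs")
    else:
        print(f"{name}: INVALID, counterexample {s.model()}")

x = BitVec("x", 32)

# A correct strength reduction: multiply by two becomes a left shift.
check_rewrite("x * 2  ->  x << 1", x * 2, x << 1)

# A tempting but wrong rewrite: signed division by two is not an arithmetic
# right shift, because the two round differently for negative x.
check_rewrite("x / 2  ->  x >> 1", x / 2, x >> 1)
```

This is roughly the flavor of query that solver-assisted tools issue many times over, both to search for candidate optimizations and to certify that the ones they emit are correct.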