RNNs can turn their hidden states into bounded-capacity stacks so efficiently, they can generate a bounded hierarchical language using asymptotically optimally few hidden units of memory.
Recommended, even if this is old news. Fun post.
I think that one of the key observations of focusing/CBPV is that programs are dealing with two different things - data and computation - and that we tend to get the most tripped up when we confuse the two.
Data is classified by data types (a.k.a. positive types). Data is defined by how it is constructed, and the way you use data is by pattern-matching against it.
Computation is classified by computation types (a.k.a. negative types). Computations are defined their eliminations - that is, by how they respond to signals/messages/pokes/arguments.
The constantly updated catalogue of entrances to Hell in and around the UK
Autonomous AI systems (Agent AIs) trained using reinforcement learning can do harm when they take wrong actions, especially superintelligent Agent AIs. One solution would be to eliminate their agency by not giving AIs the ability to take actions, confining them to purely informational or inferential tasks such as classification or prediction (Tool AIs), and have all actions be approved & executed by humans, giving equivalently superintelligent results without the risk.
I argue that this is not an effective solution for two major reasons. First, because Agent AIs will by definition be better at actions than Tool AIs, giving an economic advantage. Secondly, because Agent AIs will be better at inference & learning than Tool AIs, and this is inherently due to their greater agency: the same algorithms which learn how to perform actions can be used to select important datapoints to learn inference over, how long to learn, how to more efficiently execute inference, how to design themselves, how to optimize hyperparameters, how to make use of external resources such as long-term memories or external software or large databases or the Internet, and how best to acquire new data. All of these actions will result in Agent AIs more intelligent than Tool AIs, in addition to their greater economic competitiveness. Thus, Tool AIs will be inferior to Agent AIs in both actions and intelligence, implying use of Tool AIs is a even more highly unstable equilibrium than previously argued, as users of Agent AIs will be able to outcompete them on two dimensions (and not just one).
This seems like basically an expression of Arrow’s impossibility theorem for ranked choice voting, since we’re less familiar with it than first-past-the-post. Of course the real question is which system works best in practice.
the night sky looks darker than Eigengrau because of the contrast provided by the stars
Prior to Steele and White’s ”How to print floating-point numbers accurately”, implementations of printf and similar rendering functions did their best to render floating point numbers, but there was wide variation in how well they behaved. A number such as 1.3 might be rendered as 1.29999999, for instance, or if a number was put through a feedback loop of being written out and its written representation read back, each successive result could drift further and further away from the original.
In 2010, Florian Loitsch published a wonderful paper in PLDI, ”Printing floating-point numbers quickly and accurately with integers”, which represents the biggest step in this field in 20 years: he mostly figured out how to use machine integers to perform accurate rendering! Why do I say “mostly”? Because although Loitsch’s “Grisu3” algorithm is very fast, it gives up on about 0.5% of numbers, in which case you have to fall back to Dragon4 or a derivative.
If you’re a language runtime author, the Grisu algorithms are a big deal: Grisu3 is about 5 times faster than the algorithm used by printf in GNU libc
… Ryū algorithm, which I believe is the fastest algorithm at this point
The decimal expansion of 1/89 is just the Fibonacci series, added together in an appropriate fashion.
are there really only 313?
Foundations is a series of articles, first published in the magazine Eidolon, on some of the theories of twentieth-century physics that have most influenced modern science fiction. However, these are not essays on the history or philosophy of science; their aim is to show how the central idea of each theory leads to detailed, quantitative predictions of real physical effects. For example, the article on special relativity derives formulas for time dilation, Doppler shift, and aberration.
These articles are for the interested lay reader. No prior knowledge of mathematics beyond high school algebra and geometry is needed.
We give a definition of 𝖰-net, a generalization of Petri nets based on a Lawvere theory 𝖰, for which many existing variants of Petri nets are a special case. This definition is functorial with respect to change in Lawvere theory, and we exploit this to explore the relationships between different kinds of 𝖰-nets. To justify our definition of 𝖰-net, we construct a family of adjunctions for each Lawvere theory explicating the way in which 𝖰-nets present free models of 𝖰 in 𝖢𝖺𝗍. This gives a functorial description of the operational semantics for an arbitrary category of 𝖰-nets. We show how this can be used to construct the semantics for Petri nets, pre-nets, integer nets, and elementary net systems.
This is an extremely interesting comic on some potential enhancements to chat. Kind of a Bret Victor treatment of chat.
Many problems can be concisely expressed as finding the least fixed point of a monotone map, including graph reachability, static analysis by abstract interpretation, regular expression matching, and parsing. There is a straightforward way to compute these fixpoints: iterate the function. Unfortunately, this is inefficient, for at least two reasons: first, it performs redundant work during each iteration; second, it is inherently single-threaded.
In this talk, I will show how to solve the first of these problems using seminaive evaluation, a technique that iterates to a fixed point faster by computing only the differences between iterations. I will also discuss work in progress on the second problem using a simple graph based model of concurrent monotone computation. Unfortunately, I do not know how to reconcile these two approaches and achieve an implementation strategy for monotone fixed points that is both incremental and concurrent.