The smartest person I’ve ever known had a habit that, as a teenager, I found striking. After he’d prove a theorem, or solve a problem, he’d go back and continue thinking about the problem and try to figure out different proofs of the same thing. Sometimes he’d spend hours on a problem he’d already solved.
I had the opposite tendency: as soon as I’d reached the end of the proof, I’d stop since I’d “gotten the answer”.
Afterwards, he’d come out with three or four proofs of the same thing, plus some explanation of why each proof is connected somehow. In this way, he got a much deeper understanding of things than I did.
I concluded that what we call 'intelligence' is as much about virtues such as honesty, integrity, and bravery, as it is about 'raw intellect'.
Units are really hard. See also names, dates, unicode, etc.
First of all, there are multiple different representations of units. There’s the SI system, which all sane, right-thinking nations use, and then there’s the American system. Feet and meters are both distance. Is it okay to mix feet and meters? This is the bug that destroyed the Mars Climate Orbiter. But there are also valid use cases where you’d want to mix them! When I cook, I use a mix of metric weights and American volumes, like 1 cup of water and 128 grams of flour. In the UK, beer is measured using pints but its alcohol by volume is measured in SI.
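One way to allow deliberate mixing while blocking the accidental kind is to make conversion an explicit step. The sketch below is only an illustration under that assumption: the `Length` class, its `to` method, and the `METERS_PER_UNIT` table are hypothetical names, not taken from any particular units library.

```python
from dataclasses import dataclass

# Conversion factors to meters. 1 international foot is exactly 0.3048 m.
METERS_PER_UNIT = {"m": 1.0, "ft": 0.3048}


@dataclass(frozen=True)
class Length:
    """Hypothetical length type that remembers which unit it was given in."""
    value: float
    unit: str  # "m" or "ft" in this sketch

    def to(self, unit: str) -> "Length":
        """Explicitly convert to another length unit."""
        meters = self.value * METERS_PER_UNIT[self.unit]
        return Length(meters / METERS_PER_UNIT[unit], unit)

    def __add__(self, other: "Length") -> "Length":
        if not isinstance(other, Length):
            return NotImplemented
        # Refuse silent mixing: the caller has to convert on purpose.
        if self.unit != other.unit:
            raise ValueError(
                f"cannot add {self.unit} to {other.unit} without converting"
            )
        return Length(self.value + other.value, self.unit)


# Mixing is allowed, but only when the conversion is spelled out.
altitude = Length(100.0, "m") + Length(10.0, "ft").to("m")
```

Adding `Length(1.0, "m") + Length(1.0, "ft")` directly raises `ValueError`, so the cooking-style mix stays possible while the Orbiter-style mix has to be written down.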
Different dimensions don’t always mean the units are incompatible. There’s a nonstandard set of units called “Gaussian units” used in some niches of physics. In SI, capacitance is measured in farads, which have dimensions A² s⁴ kg⁻¹ m⁻². In Gaussian units, capacitance is measured in cm. If you’re explicit about what you’re doing you can add these seemingly-incompatible dimensions.
Dimensions aren’t unique, and two incompatible physical quantities can have the same dimension. The canonical example is that energy and angular force are both measured in newton-meters. There are also plenty of domain-specific examples. As you enter a gravity well, the rate at which gravitational force changes has dimensions N/m. Surface tension is also measured in N/m.
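A dimension check alone cannot catch this, so a sketch of one possible workaround is to tag each value with the kind of quantity it represents as well as its dimension. Everything here is an assumption for illustration: the `Quantity` class, the `kind` tag, and the `(kg, m, s)` exponent tuple are hypothetical, and "torque" stands in for the angular force mentioned above.

```python
from dataclasses import dataclass

# Dimension exponents as (kg, m, s). A newton-meter is kg * m^2 * s^-2.
NEWTON_METER = (1, 2, -2)


@dataclass(frozen=True)
class Quantity:
    """Hypothetical value carrying both a dimension and a quantity kind."""
    value: float
    dimension: tuple
    kind: str  # illustrative tag such as "energy" or "torque"

    def __add__(self, other: "Quantity") -> "Quantity":
        if not isinstance(other, Quantity):
            return NotImplemented
        if self.dimension != other.dimension:
            raise TypeError("dimensions differ")
        # Same dimension is necessary but not sufficient.
        if self.kind != other.kind:
            raise TypeError(
                f"same dimension, but {self.kind} is not {other.kind}"
            )
        return Quantity(self.value + other.value, self.dimension, self.kind)


work = Quantity(5.0, NEWTON_METER, "energy")
torque = Quantity(5.0, NEWTON_METER, "torque")

# Energy plus energy is fine; work + torque would raise TypeError.
total = work + Quantity(2.0, NEWTON_METER, "energy")
```

The trade-off is that the set of kinds is open-ended and domain specific, which is part of why this is hard to solve once and for all in a general-purpose library.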
ZBLAN fiber can have 10 to 100 times lower signal loss than silica fiber, but gravity produces defects, so companies are looking to produce it in space.
Self-locating belief and the Sleeping Beauty problem, Adam Elga
Sleeping Beauty: reply to Elga
"And while the story of the single self tends to be closely correlated with the system’s actions, the narrative self does not actually decide the person’s actions, it’s just a story of someone who does. In a sense, the part of your mind that may feel like the “you” that takes actions, is actually produced by a module that just claims credit for those actions.
The self-narrative agent disguises itself as the causally acting agent.
"If one develops sufficient introspective awareness, they may come to see that the intentions are arising on their own, and that there is actually no way to control them: if one intends to control them somehow, then the intention to do that is also arising on its own.
Paper that introduced mesa-optimization / inner alignment.
"We analyze the type of learned optimization that occurs when a learned model (such as a neural network) is itself an optimizer—a situation we refer to as mesa-optimization, a neologism we introduce in this paper. We believe that the possibility of mesa-optimization raises two important questions for the safety and transparency of advanced machine learning systems. First, under what circumstances will learned models be optimizers, including when they should not be? Second, when a learned model is an optimizer, what will its objective be—how will it differ from the loss function it was trained under—and how can it be aligned? In this paper, we provide an in-depth analysis of these two primary questions and provide an overview of topics for future research.
"What is the difference between a boulder - for which it’s impossible to go to the red button (because of its momentum, which determines its position, by the laws of physics) - and a subagent - for which it’s impossible to go to the red button (because of its programming, which determines its position, by the laws of physics)?
Convergent instrumental subgoals that are pretty universal:
Tessa and I started doing these recently, recommended:
See also worse is better.
^ See response with collection of ZUIs
I'm still tempted to start a Goodhart's law twitter account:
Thread: