The smartest person I’ve ever known had a habit that, as a teenager, I found striking. After he’d prove a theorem, or solve a problem, he’d go back and continue thinking about the problem and try to figure out different proofs of the same thing. Sometimes he’d spend hours on a problem he’d already solved.
I had the opposite tendency: as soon as I’d reached the end of the proof, I’d stop since I’d “gotten the answer”.
Afterwards, he’d come out with three or four proofs of the same thing, plus some explanation of how the proofs were connected. In this way, he got a much deeper understanding of things than I did.
I concluded that what we call ‘intelligence’ is as much about virtues such as honesty, integrity, and bravery, as it is about ‘raw intellect’.
Units are really hard. See also names, dates, Unicode, etc.
First of all, there are multiple different representations of units. There’s the SI system, which all sane, right-thinking nations use, and then there’s the American system. Feet and meters are both distance. Is it okay to mix feet and meters? Mixing US customary and SI units is the class of bug that destroyed the Mars Climate Orbiter. But there are also valid use cases where you’d want to mix them! When I cook, I use a mix of metric weights and American volumes, like 1 cup of water and 128 grams of flour. In the UK, beer is measured in pints but its alcohol by volume is measured in SI.
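One way to allow mixing feet and meters while still catching accidental mixing is to tag every value with its unit and require an explicit conversion. The sketch below is only an illustration: the `Length` class and the `METERS_PER_UNIT` table are hypothetical names, not part of any library mentioned in these notes.

```python
from dataclasses import dataclass

# Conversion factors to meters. The international foot is exactly 0.3048 m.
METERS_PER_UNIT = {"m": 1.0, "ft": 0.3048}


@dataclass(frozen=True)
class Length:
    """A length tagged with its unit (hypothetical helper for illustration)."""
    value: float
    unit: str  # "m" or "ft"

    def to(self, unit: str) -> "Length":
        """Explicitly convert to another length unit."""
        meters = self.value * METERS_PER_UNIT[self.unit]
        return Length(meters / METERS_PER_UNIT[unit], unit)

    def __add__(self, other: "Length") -> "Length":
        # Refuse silent mixing: the caller has to convert first.
        if self.unit != other.unit:
            raise TypeError(
                f"cannot add {other.unit} to {self.unit} without converting"
            )
        return Length(self.value + other.value, self.unit)


# Mixing is fine when the conversion is spelled out.
total = Length(1.0, "m") + Length(10.0, "ft").to("m")
```

The design choice here is that mixing is never forbidden, only made visible: `Length(1.0, "m") + Length(10.0, "ft")` raises a `TypeError`, while the version with `.to("m")` goes through.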
Different dimensions don’t always mean the units are incompatible. There’s a nonstandard set of units called “Gaussian units” used in some niches of physics. In SI, capacitance is measured in farads, which have dimensions A² s⁴ kg⁻¹ m⁻². In Gaussian units, capacitance is measured in cm. If you’re explicit about what you’re doing you can add these seemingly-incompatible dimensions.
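As a rough illustration of "being explicit", a Gaussian capacitance in centimeters can be converted to farads by multiplying the length (in meters) by 4πε₀. The function name below is made up for this sketch, and the value of ε₀ is the CODATA 2018 figure.

```python
import math

# Vacuum permittivity in F/m (CODATA 2018 value).
EPSILON_0 = 8.8541878128e-12


def gaussian_cm_to_farads(capacitance_cm: float) -> float:
    """Convert a Gaussian capacitance (in cm) to SI farads.

    Hypothetical helper: C_SI = 4 * pi * epsilon_0 * (length in meters).
    """
    return 4 * math.pi * EPSILON_0 * (capacitance_cm / 100.0)


# Adding a Gaussian value to an SI value only after an explicit conversion.
total_farads = 2.0e-12 + gaussian_cm_to_farads(1.0)
```

Under this conversion, 1 cm of Gaussian capacitance comes out at roughly 1.11 pF.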
Dimensions aren’t unique, and two incompatible physical quantities can have the same dimension. The canonical example is that energy and torque are both measured in Newton-meters. There are also plenty of domain-specific examples. As you enter a gravity well, the rate at which gravitational force changes has dimensions N/m. Surface tension is also measured in N/m.
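A small sketch of why a dimension-only checker cannot tell these quantities apart: if a dimension is just a tuple of base-unit exponents, then energy and torque (angular force) collapse to the same tuple, and so do surface tension and the gravitational force gradient. The `Dim` type and helper names are assumptions made for this example.

```python
from typing import NamedTuple


class Dim(NamedTuple):
    """Exponents of the SI base dimensions mass, length, and time."""
    kg: int = 0
    m: int = 0
    s: int = 0


def mul(a: Dim, b: Dim) -> Dim:
    """Multiplying quantities adds their exponents."""
    return Dim(*(x + y for x, y in zip(a, b)))


def div(a: Dim, b: Dim) -> Dim:
    """Dividing quantities subtracts their exponents."""
    return Dim(*(x - y for x, y in zip(a, b)))


FORCE = Dim(kg=1, m=1, s=-2)   # newton
LENGTH = Dim(m=1)              # meter

ENERGY = mul(FORCE, LENGTH)            # N·m (joule)
TORQUE = mul(FORCE, LENGTH)            # N·m, but not an energy
SURFACE_TENSION = div(FORCE, LENGTH)   # N/m
FORCE_GRADIENT = div(FORCE, LENGTH)    # N/m, change in force per meter

# Both comparisons are True, so the dimension alone cannot reject
# adding an energy to a torque.
same_nm = ENERGY == TORQUE
same_n_per_m = SURFACE_TENSION == FORCE_GRADIENT
```

Telling them apart needs extra information beyond the exponents, for example a separate "kind of quantity" tag carried alongside the dimension.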
ZBLAN fiber can have 10 to 100 times lower signal loss than silica fiber, but gravity produces defects, so companies are looking to produce it in space.
- "The Sleeping Beauty problem: Some researchers are going to put you to sleep. During the two days that your sleep will last, they will briefly wake you up either once or twice, depending on the toss of a fair coin (Heads: once; Tails: twice). After each waking, they will put you back to sleep with a drug that makes you forget that waking. When you are first awakened, to what degree ought you believe that the outcome of the coin toss is Heads?"
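A quick simulation does not settle the question, but it does make the two usual answers concrete: counting per experiment gives Heads about half the time, while counting per awakening gives Heads about a third of the time. The function name and the fixed seed below are choices made for this sketch.

```python
import random


def simulate_sleeping_beauty(trials: int, seed: int = 0) -> tuple[float, float]:
    """Return (fraction of experiments with Heads,
               fraction of awakenings that happen under Heads)."""
    rng = random.Random(seed)
    heads_experiments = 0
    heads_awakenings = 0
    total_awakenings = 0
    for _ in range(trials):
        heads = rng.random() < 0.5
        # Heads: woken once. Tails: woken twice.
        awakenings = 1 if heads else 2
        total_awakenings += awakenings
        if heads:
            heads_experiments += 1
            heads_awakenings += 1
    return heads_experiments / trials, heads_awakenings / total_awakenings


per_experiment, per_awakening = simulate_sleeping_beauty(100_000)
```

Which of the two frequencies is the right "degree of belief" on waking is exactly what the problem asks, so the code only shows where the 1/2 and 1/3 answers come from.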
“And while the story of the single self tends to be closely correlated with the system’s actions, the narrative self does not actually decide the person’s actions, it’s just a story of someone who does. In a sense, the part of your mind that may feel like the ‘you’ that takes actions, is actually produced by a module that just claims credit for those actions.”
The self-narrative agent disguises itself as the causally acting agent.
“If one develops sufficient introspective awareness, they may come to see that the intentions are arising on their own, and that there is actually no way to control them: if one intends to control them somehow, then the intention to do that is also arising on its own.”
- Prosaic, based on AI safety via debate
- Setup:
  - Agents:
    - Q, question
    - H, human.
    - M, Adv: models.
  - 1. M tries to predict what, at the end of the procedure, H will think about Q.
  - 2. Adv tries to output a string which will cause H to think something maximally different than what M predicted.
  - 3. Return to step 1 and repeat until M's predictions stop changing.
  - 4. Deploy M, which in the limit should act as an oracle for what H will think about Q after seeing all relevant information.
- I find this "find a fixed point" (with a human in the loop) setup satisfying
- "AI safety via market making still inherits many of the potential outer alignment issues of debate, including the possibility of deceptive equilibria wherein the human is more convinced by false arguments than true arguments. Hopefully, however, the use of techniques such as cross-examination should help alleviate such issues."
- Re inner alignment verification, myopia seems to be the biggest consideration
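The fixed-point loop in the setup above can be sketched with a toy model. Everything here is an assumption made for illustration: H is a function that adds up fixed belief shifts from the arguments seen so far, M simply tracks H's latest belief (a stand-in for a trained predictor), and Adv searches a finite dictionary of candidate arguments. The real proposal trains M and Adv as models; this only shows the control flow.

```python
def human_belief(prior: float, seen: list[str], effects: dict[str, float]) -> float:
    """Toy H: belief in Q is the prior plus the shift from each argument
    seen, clamped to [0, 1]."""
    return min(1.0, max(0.0, prior + sum(effects[a] for a in seen)))


def market_making(prior: float, effects: dict[str, float], tol: float = 1e-9):
    """Iterate steps 1-3 until M's prediction stops changing."""
    seen: list[str] = []
    # Step 1: M's prediction of what H will think about Q.
    prediction = human_belief(prior, seen, effects)
    while True:
        unseen = [a for a in effects if a not in seen]
        if not unseen:
            break
        # Step 2: Adv picks the argument that moves H furthest from
        # M's current prediction.
        best = max(
            unseen,
            key=lambda a: abs(human_belief(prior, seen + [a], effects) - prediction),
        )
        new_belief = human_belief(prior, seen + [best], effects)
        # Step 3: stop at the fixed point, where Adv can no longer move H.
        if abs(new_belief - prediction) <= tol:
            break
        seen.append(best)
        prediction = new_belief
    # Step 4: the converged prediction is what a deployed M would report.
    return prediction, seen


# Hypothetical arguments and the belief shift each one causes in H.
final_prediction, arguments_used = market_making(
    0.5, {"a": 0.3, "b": -0.2, "c": 0.0}
)
```

In this toy run Adv plays the argument with the largest effect first, then the next largest, and the loop ends once the only remaining argument no longer changes what H believes.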
Paper that introduced mesa-optimization / inner alignment.
“We analyze the type of learned optimization that occurs when a learned model (such as a neural network) is itself an optimizer—a situation we refer to as mesa-optimization, a neologism we introduce in this paper. We believe that the possibility of mesa-optimization raises two important questions for the safety and transparency of advanced machine learning systems. First, under what circumstances will learned models be optimizers, including when they should not be? Second, when a learned model is an optimizer, what will its objective be—how will it differ from the loss function it was trained under—and how can it be aligned? In this paper, we provide an in-depth analysis of these two primary questions and provide an overview of topics for future research.”
“What is the difference between a boulder - for which it’s impossible to go to the red button (because of its momentum, which determines its position, by the laws of physics) - and a subagent - for which it’s impossible to go to the red button (because of its programming, which determines its position, by the laws of physics)?”
Convergent instrumental subgoals that are pretty universal:
- Tai-Danae Bradley: [What is Applied Category Theory](https://arxiv.org/pdf/1809.05923.pdf) (50 pages)
- Bartosz Milewski: [Programming Cafe](https://bartoszmilewski.com/2014/10/28/category-theory-for-programmers-the-preface/) (blog)
- My book with Brendan Fong: [Invitation to applied category theory](https://arxiv.org/abs/1803.05316) (textbook)
- A short paper of mine: [Categories as Mathematical Models](https://arxiv.org/pdf/1409.6067.pdf) (21 pages about modeling with category theory)
- John Baez: [Some definitions everyone should know](http://math.ucr.edu/home/baez/qg-winter2001/definitions.pdf) (6 pages of math definitions)
- Emily Riehl's book: [Category Theory in Context](http://www.math.jhu.edu/~eriehl/context.pdf) (more advanced textbook)
- Paolo Perrone's notes: [Notes on Category Theory with examples from basic mathematics](https://arxiv.org/pdf/1912.10642.pdf)
Tessa and I started doing these recently, recommended:
See also worse is better.
^ See response with collection of ZUIs
I’m still tempted to start a Goodhart’s law twitter account:
- reminder that the people you know are atypical
- See also [Different Worlds](https://web.archive.org/web/20190427211306/https://slatestarcodex.com/2017/10/02/different-worlds/)