There Is a Fundamental Flaw in How We Do Statistics in Science

Suppose I tell you that only 1% of people with COVID have a body temperature less than 97°. If you take someone’s temperature and measure less than 97°, what is the probability that they have COVID? If your answer is 1% you have committed the conditional probability fallacy and you have essentially done what researchers do whenever they use p-values. In reality, these inverse probabilities (i.e., probability of having COVID if you have low temperature and probability of low temperature if you have COVID) are not the same.
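To make the gap between the two inverse probabilities concrete, here is a minimal sketch of Bayes' rule applied to the temperature example. Only the 1% figure comes from the text; the prevalence and the rate of low temperatures among people without COVID are made-up values chosen purely for illustration.

```python
def posterior(prior, p_evidence_given_h, p_evidence_given_not_h):
    """Bayes' rule: Pr(H | E) from Pr(H), Pr(E | H) and Pr(E | not H)."""
    joint_h = prior * p_evidence_given_h
    joint_not_h = (1.0 - prior) * p_evidence_given_not_h
    return joint_h / (joint_h + joint_not_h)


# From the text: 1% of people with COVID measure below 97°.
p_low_given_covid = 0.01

# Assumptions for illustration only (not from the text):
prevalence = 0.02           # Pr(COVID) in the tested population
p_low_given_healthy = 0.05  # Pr(temperature < 97° | no COVID)

p_covid_given_low = posterior(prevalence, p_low_given_covid, p_low_given_healthy)

print(f"Pr(low temp | COVID) = {p_low_given_covid:.4f}")
print(f"Pr(COVID | low temp) = {p_covid_given_low:.4f}")
```

Under these assumed numbers the two probabilities differ by more than a factor of two, and changing the assumed prevalence moves Pr(COVID | low temp) while Pr(low temp | COVID) stays fixed at 1%.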

In practically every situation that people use statistical significance, they commit the conditional probability fallacy.

Now if we gather some new data (D), what needs to be examined is the probability of the null hypothesis given that we observed this data, not the inverse! That is, Pr(H0|D) should be compared with a 1% threshold, not Pr(D|H0). In our current methods of statistical testing, we use the latter as a proxy for the former.

By using p-values we effectively act as though we commit the conditional probability fallacy. The two values that are conflated are Pr(H0|p<α) and Pr(p<α|H0). We conflate the chances of observing a particular outcome under a hypothesis with the chances of that hypothesis being true given that we observed that particular outcome.
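As a rough sketch of how far apart Pr(p<α|H0) and Pr(H0|p<α) can be, the snippet below applies Bayes' rule to a significance test. It assumes a well-calibrated test, so that Pr(p<α|H0) = α, and it needs two quantities the excerpt does not supply: a prior probability that the null is true and the test's power, Pr(p<α|H1). The function name and all numeric values are hypothetical.

```python
def prob_null_given_significant(prior_null, alpha, power):
    """Pr(H0 | p < alpha), assuming Pr(p < alpha | H0) = alpha
    and Pr(p < alpha | H1) = power."""
    false_positive = prior_null * alpha
    true_positive = (1.0 - prior_null) * power
    return false_positive / (false_positive + true_positive)


alpha = 0.05  # significance threshold, i.e. Pr(p < alpha | H0)
power = 0.8   # assumed for illustration

for prior_null in (0.5, 0.9):  # assumed priors for illustration
    post = prob_null_given_significant(prior_null, alpha, power)
    print(f"Pr(H0) = {prior_null:.1f}: "
          f"Pr(p<α | H0) = {alpha:.2f}, Pr(H0 | p<α) = {post:.3f}")
```

With these assumed inputs, the probability that the null is true after a significant result comes out near 0.06 when the prior is 0.5 and 0.36 when the prior is 0.9, even though Pr(p<α|H0) is 0.05 in both cases.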

Researchers often wish to turn a p-value into a statement about the truth of a null hypothesis or about the probability that random chance produced the observed data. The p-value is neither.

What alternatives do we have to p-values? Some suggest using confidence intervals to estimate effect sizes. Confidence intervals may have some advantages, but they still suffer from the same fallacies (as nicely explained in Morey et al. 2016). Another alternative is to use Bayes factors as a measure of evidence. Bayesian model comparison has been around for nearly two decades but has not gained much traction, for a number of practical reasons.

The bottom line is that there is practically no correct way to use p-values. It does not matter if you understand what a p-value means or if you frame it as a decision procedure rather than a method for inference. If you use p-values you are effectively behaving like someone who confuses conditional probabilities. Science needs a mathematically sound framework for doing statistics. In future posts I will suggest a new simple framework for quantifying evidence. This framework is based on Bayes factors but makes a basic assumption: that every experiment has a probability of error that cannot be objectively determined. From this basic assumption a method of evidence quantification emerges that is highly reminiscent of p-value testing but is 1) mathematically sound and 2) practical. (In contrast to Bayes factors, it produces numbers that are not extremely large or small.)

Gwern links

How UNIX Linkers Work

Think of an archive library as a bookshelf, with some books on it (the separate .o files).

Some books may refer you to other books (via unresolved symbols), which may be on the same, or on a different bookshelf.
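The bookshelf picture can be sketched as a toy model. This is not the algorithm of any particular linker, only an illustration of the idea that a member of an archive is pulled in when it defines a symbol that is still unresolved, and that a pulled member may itself introduce new unresolved symbols. The `link` function, the file names and the symbol names are all hypothetical, and member names are assumed to be unique across archives.

```python
def link(objects, archives):
    """Toy symbol resolution.

    objects:  list of (defined_symbols, needed_symbols) pairs, always linked.
    archives: list of dicts mapping member name -> (defined, needed);
              each dict is one "bookshelf", scanned left to right.
    Returns (members pulled from archives, symbols still unresolved).
    """
    defined, undefined = set(), set()

    def add(obj_defined, obj_needed):
        # Record what this object provides and what it still asks for.
        defined.update(obj_defined)
        undefined.update(obj_needed)
        undefined.difference_update(defined)

    for obj_defined, obj_needed in objects:
        add(obj_defined, obj_needed)

    pulled = []
    for archive in archives:
        progress = True
        while progress:  # rescan this shelf until nothing new is pulled
            progress = False
            for name, (obj_defined, obj_needed) in archive.items():
                if name not in pulled and obj_defined & undefined:
                    add(obj_defined, obj_needed)
                    pulled.append(name)
                    progress = True
    return pulled, undefined


main_o = ({"main"}, {"foo"})
libshelf = {
    "bar.o": ({"bar"}, set()),
    "foo.o": ({"foo"}, {"bar"}),   # refers to another book on the same shelf
    "unused.o": ({"baz"}, set()),  # never referenced, so never pulled
}

pulled, unresolved = link([main_o], [libshelf])
print(pulled, unresolved)
```

In this toy run `foo.o` is pulled because `main` needs `foo`, which in turn makes `bar.o` necessary, while `unused.o` stays on the shelf.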

technicalities: “not rocket science” (the story of monotone and bors)

The Not Rocket Science Rule Of Software Engineering: automatically maintain a repository of code that always passes all the tests

Time passed, that system aged and (as far as I know) went out of service. I became interested in revision control, especially systems that enforced this Not Rocket Science Rule. Surprisingly, only one seemed to do so automatically (Aegis, written by Peter Miller, another charming no-nonsense Australian who is now, sadly, approaching death).


Fantastic post by Jason Crawford (The Roots of Progress)

A major theme of the 19th century was the transition from plant and animal materials to synthetic versions or substitutes, mostly from non-organic sources.

(Ivory, fertilizer, lighting, smelting, shellac)

There are many other biomaterials we once relied on—rubber, silk, leather and furs, straw, beeswax, wood tar, natural inks and dyes—that have been partially or fully replaced by synthetic or artificial substitutes, especially plastics, that can be derived from mineral sources. They had to be replaced, because the natural sources couldn’t keep up with rapidly increasing demand. The only way to ramp up production—the only way to escape the Malthusian trap and sustain an exponentially increasing population while actually improving everyone’s standard of living—was to find new, more abundant sources of raw materials and new, more efficient processes to create the end products we needed. As you can see from some of these examples, this drive to find substitutes was often conscious and deliberate, motivated by an explicit understanding of the looming resource crisis.

In short, plant and animal materials had become unsustainable.

To my mind, any solution to sustainability that involves reducing consumption or lowering our standard of living is no solution at all. It is giving up and admitting defeat. If running out of a resource means that we have to regress back to earlier technologies, that is a failure—a failure to do what we did in the 19th century and replace unsustainable technologies with new, improved ones that can take humanity to the next level and support orders of magnitude more growth.

Planet Ebook

free classic literature ebooks

Gravity is not a force

Under general relativity, gravity is not a force. Instead it is a distortion of spacetime. Objects in free-fall move along geodesics (straight lines) in spacetime, as seen in the inertial frame of reference on the right. When standing on Earth we experience a frame of reference that is accelerating upwards, causing objects in free-fall to move along parabolas, as seen in the accelerating frame of reference on the left.

Bayesian Investor review of Where Is My Flying Car?

Atlanta police arrest murder suspect by drone

Securing posterity

It is not safe stagnation and risky growth that we must choose between; rather, it is stagnation that is risky and it is growth that leads to safety.

We might be advanced enough to have developed the means for our destruction, but not advanced enough to care sufficiently about safety. But stagnation does not solve the problem: we would simply stagnate at this high level of risk.

The risk of an existential catastrophe then looks like an inverted U-shape over time.

There is an analog to this in environmental economics, called the “environmental Kuznets curve.” It was theorized that pollution initially rises as countries develop, but, as people grow richer and begin to value a clean environment more, they will work to reduce pollution again. That theory has arguably been vindicated by the path that Western countries have taken with regard to water and air pollution, for example, over the past century.

Carl Sagan was the one who coined the term “time of perils.” Derek Parfit called it the “hinge of history.”

On the other extreme, humanity is extremely fragile. No matter how high a fraction of our resources we dedicate to safety, we cannot prevent an unrecoverable catastrophe. Perhaps weapons of mass destruction are simply too easy to build, and no amount of even totalitarian safety efforts can prevent some lunatic from eventually causing nuclear annihilation. We might indeed be living in this world; this would be the model’s version of Bostrom’s “vulnerable world hypothesis,” Hanson’s “Great Filter,” or the “Doomsday Argument.”

Perhaps, if we followed this argument to the end, we might reach the counterintuitive conclusion that the most effective thing we can do to reduce the risk of an existential catastrophe is not to invest in safety directly or to try to persuade people to be more long-term oriented, but rather to spend money on alleviating poverty, so that more people are well-off enough to care about safety.

Visual information theory

Where are All the Successful Rationalists?

It’s been 13 years since Yudkowsky published the sequences, and 11 years since he wrote “Rationality is Systematized Winning.”

So where are all the winners?

Immediately after “Systematized Winning,” Scott Alexander wrote “Extreme Rationality: It’s Not That Great,” claiming that there is “approximately zero empirical evidence that x-rationality has a large effect on your practical success.”

The primary impacts of reading rationalist blogs are that 1) I have been frequently distracted at work, and 2) my conversations have gotten much worse.

Qiaochu Yuan (math) reading recommendations

Spin Networks

Spin networks are states of quantum geometry in a theory of quantum gravity, discovered by Lee Smolin and Carlo Rovelli, which is the conceptual ancestor of the imaginary physics of Schild’s Ladder.


Cool, but also damning?

“Proposed by Michael Spivak in 1965, as an exercise in Calculus”