Page last updated: 2022-05-29 Sun 19:20

Notes on Yudkowsky's "Rationality: From AI to Zombies"

Finally going to read the Sequences.

This is a work-in-progress summary of all (or at least most) blogposts in "Rationality: From AI to Zombies". I don't expect it to be useful to anyone other than me because of how short the summaries are, since they are basically only supposed to be a reminder of what I've already read; I often also don't include 'proofs' of claims summarized.

1 Preface

Cognitive bias - a systematic, non-random error in how we think; a pattern that skews our beliefs so that they represent the facts less accurately.

  • Conjunction fallacy - overestimating the probabilities of complicated-but-sensible-sounding narratives compared to those of strictly simpler scenarios.
  • Base rate neglect - grounding judgment in how intuitively "normal" a combination of attributes is while neglecting how common each attribute is in the population ("shy librarian" seems more likely than "shy salesperson", but actually "librarian" is much less likely than "salesperson")
  • Duration neglect - evaluating experiences without regard to how long they lasted.
  • Sunk cost fallacy - feeling committed to something you've spent resources on in the past, even when it is failing.
  • Confirmation bias - giving more weight to evidence that confirms what we already believe.

The bias blind spot (failing to notice one's own biases) is especially severe among people who are intelligent, thoughtful, and open-minded.

  • System 1 - fast, implicit, associative, automatic cognition.
  • System 2 - slow, explicit, intellectual, controlled cognition.

Both are valuable. System 1 is harder to train but has its merits; it's rational to use it in some contexts, and it's a mistake (and irrational) to believe that only System 2 is ever rational to use.

2 12 Virtues of Rationality

  1. Curiosity.
  2. Relinquishment (being ready to abandon beliefs that are shown to be wrong). "That which can be destroyed by the truth should be"
  3. Lightness. Don't fight against the evidence, surrender to the truth as fast as you can. Be faithless to your cause, betray it to a stronger enemy (in terms of evidence).
  4. Evenness. Don't fall into confirmation bias, evaluate all evidence. Do not seek to argue for one side or another.
  5. Argument. Argue.
  6. Empiricism. Do not ask which beliefs to profess, but which experiences to anticipate.
  7. Simplicity. "Perfection is achieved not when there's nothing left to add, but when there is nothing left to take away."
  8. Humility. Be humble and take specific actions in anticipation of your own errors, prepare for the deepest, most catastrophic errors in your beliefs and plans.
  9. Perfectionism. When you notice an error in yourself, this signals your readiness to seek advancement to the next level. Correct it, and advance to the next level. Do not be content with the answer that is almost right; seek one that is exactly right.
  10. Precision. Each piece of evidence should shift your beliefs by exactly the right amount, neither more nor less.
  11. Scholarship. Study many sciences and absorb their power. Swallow enough sciences and the gaps between them will diminish, and your knowledge will become a unified whole.
  12. The void (the nameless virtue). Every step of your reasoning must cut through to the correct answer ("you must be thinking of carrying your movement through to cutting your enemy, not hitting, springing, striking, touching"). "The Way" is not the terminal goal; the terminal goal is seeing how things really are. If some principle or science fails you, do not stick to it.

3 Rationality?

Epistemic rationality - systematically improving the accuracy of your beliefs.

  • Systematically producing truth.

Instrumental rationality - systematically achieving your values (anything you care about, including other people).

  • Systematically producing value.

4 Feeling rational

Emotions arise from our models of reality.

To become more rational is to arrive at a better estimate of how-the-world-is. It can diminish feelings or intensify them (if you were running from the facts).

5 Why truth?

Emotion is "not rational" if it rests on mistaken beliefs.

Motives for seeking truth: curiosity, pragmatism, morality.

6 What's a bias?

Biases arise from our brains operating by an identifiable algorithm that does some useful work but produces systematic errors (and those errors are the biases).

Biases are not errors that arise from cognitive content (adopted beliefs or adopted moral duties); those are mistakes.

Nor are they errors that arise from damage to an individual human brain or from absorbed cultural mores.

Biases arise from machinery that is humanly universal.

7 Availability

Memory is not always a good guide to probabilities in the past (some events are more likely to be talked about, and therefore more likely to be remembered), let alone in the future (humans are bad at assigning probabilities to events that have not recently occurred).

8 Conjunction fallacy

Each added detail of a prediction decreases the probability of that prediction.

The conjunction fallacy is when people hold a conjunction to be more probable than one of its conjuncts alone; as in, "X happens and Y happens" is judged more probable than just "X happens".
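
The arithmetic behind this is simple; here's a toy sketch (the probabilities are made up for illustration) showing that a conjunction can never be more probable than either conjunct:

```python
# Made-up probabilities for the classic "Linda" setup.
p_bank_teller = 0.05   # P(Linda is a bank teller)
p_feminist = 0.60      # P(Linda is active in the feminist movement)

# Even if the two traits were perfectly correlated,
# P(A and B) can never exceed min(P(A), P(B)).
p_conjunction_upper_bound = min(p_bank_teller, p_feminist)

print(p_conjunction_upper_bound)  # 0.05
```

So judging "bank teller and feminist" as more likely than "bank teller" alone assigns the conjunction a probability above this bound, which no consistent probability assignment allows.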

9 Planning fallacy

"Planning fallacy" is when people think they can plan, i.e. they systematically underestimate how much time and how many resources their projects will take.

"Inside view" - figuring out how much time and how many resources are required by visualizing the steps from beginning to successful conclusion. The inside view does not take into account unexpected delays and unforeseen catastrophes; even some conscious pessimism is usually not enough. "Outside view" - asking how much time similar projects took in the past, without considering any of the special properties of this project.

Use the outside view. It'll give the right answer.

10 Illusion of transparency

We always know what we mean by our own words and expect others to know it too; it turns out that people overestimate how well others understand their words.

11 Expecting short inferential distances

Inference - reasoning based on circumstantial evidence and prior conclusions rather than on direct empirical observation. (Something like a corollary?)

In the ancestral environment, all background knowledge was universal knowledge; there weren't "abstract disciplines with vast bodies of carefully gathered evidence generalized into elegant theories transmitted by written books" whose conclusions are a hundred inferential steps removed from universally shared background knowledge.

Self-anchoring - our ability to visualise other minds is imperfect.

Failures of explanation - the explainer takes 1 step back when they need to take 2, 3, or more steps back (i.e., explain more background knowledge).

Self-anchoring + systematically underestimated inferential distances + illusion of transparency = HELL

12 The Lens That Sees Its Own Flaws

All brains are flawed lenses through which to see reality, but a human brain (compared to, say, a mouse brain) is a flawed lens that can understand its own flaws, its systematic errors, its biases, and apply second-order corrections to them.

In practice, this makes the lens far more powerful.

13 Making beliefs pay rent (in anticipated experiences)

It is a great strength of humans that we can, better than any other species now known, learn to model the unseen. It's also one of our weakest points, since humans often believe in things that are not only unseen but unreal.

Beliefs should 'pay rent' in anticipated experiences - as in, they should be useful for predicting experiences. They shouldn't become 'floating beliefs' - as in, networks of beliefs that are connected only to each other, and aren't connected to sensory experience.

Don't ask what to believe - ask what to anticipate. Every guess of belief should begin by flowing to a specific guess of anticipation, and should continue to pay rent in future anticipations.

14 A fable of science and politics


15 Belief in belief

P, a proof of P, and a proof that P is provable are different things. In the same way, P, wanting P, believing P, wanting to believe P, and believing that you believe P are all different.

Belief in belief is when you believe you ought to believe something, that it is good and virtuous to believe it, and it's hard to admit you don't believe in something if it's good and virtuous to believe in it.

16 Pretending to be Wise

There's a difference between

  • passing neutral judgment; …which is still a judgment! it's not "above" a positive or negative judgment, it is a particular position.
  • suspending judgment/declining to invest marginal resources; because "it is rational to suspend judgment about X" is not the same as "any judgment about X is as plausible as any other"
  • pretending that either of the above is a mark of wisdom/maturity/etc. If the goal is to improve your general ability to form accurate beliefs, it might be useful to avoid focusing on politics; but it's not that a rationalist is too mature to talk about politics.

17 Religion's Claim to be Non-Disprovable

In the old days, people actually believed their religions instead of just believing in them (believing that they believe them). Religion used to make claims not only about ethical matters; it made claims about everything - law, history, sexual morals, forms of government, scientific questions - and much of that territory has since been ceded.

After that, Eliezer basically says (well, that's how I interpreted it, at least) that since the Bible was wrong about science, history, and some ethical questions, endorsing the Bible's other ethical teachings is pretty bad and requires a manageable effort of self-deception to overlook the Bible's moral problems, and so on.

I'm not sure Women's Lib (talked about earlier in the text and cited as the 'replacement' for the Bible's sexual morals) is better than the Bible's sexual morals. It actually seems very much worse and very much more dangerous.

If I had a binary choice between the two, I would rather conform to the "culture dump from 2500 years ago" on sexual/romantic morals than to the modern insanity. Or something like that, anyway.

I suppose I might be a bit biased against all this since I'm very much against many modern "norms", and religion sometimes feels like something that is able to, despite its numerous shortcomings in specifics, be an "ally" for me in "fighting" the degeneracy, hedonism and continuing moral decay, and also not let me succumb to it. Still a "non-theist" as of 21.02.2022, though.

18 Professing and Cheering

A certain new 'type' of belief is introduced - a belief that is a cheer: an outrageous, unconvincing one, like saying completely ridiculous things just to cheer for your group, holding up a "Go Blues" banner, and such.

19 Belief as Attire

Some beliefs are "attire", group identification.

"People can bind themselves as a group by believing "crazy" things together. Then among outsiders they could show the same pride in their crazy belief as they would show wearing "crazy" group clothes among outsiders."

Say, the Muslims who did 9/11 saw themselves as heroes defending truth, justice, and so on, but "the American thing" to say is that terrorists "hate our freedom" and that flying a plane into a building is "cowardly". But this is an inaccurate description of how the 'Enemy' sees the world.

The very concept of the courage and altruism of a suicide bomber is "Enemy attire". The cowardice and sociopathy of a suicide bomber is "American attire".

And you can't talk about how the Enemy sees the world: it'd be like dressing up as a Nazi for Halloween.

20 Applause lights

Some phrases are just "applause lights", detectable by a simple reversal test: if the reversed phrase sounds abnormal and ridiculous (no one would ever argue for it), the original statement probably conveys no new information.

Not every such normal-sounding sentence is an applause light, though; it can instead introduce a discussion topic or emphasize the importance of a specific proposal.

21 Focus your uncertainty


22 What is evidence?

For an event to be evidence about a target, it has to happen differently depending on the different possible states of the target, i.e. be entangled with them (correlated through links of cause and effect).

If your eyes and brain work correctly, your beliefs will end up entangled with the facts. Rational thought produces beliefs which are themselves evidence.

If your tongue speaks truly, your rational beliefs, which are themselves evidence, can act as evidence for someone else. Entanglement can be transmitted through chains of cause and effect.

Therefore rational beliefs are contagious among honest folk who believe each other to be honest. And that's why a claim that your beliefs are not contagious (that you believe for private reasons which are not transmissible) is suspicious. If your beliefs are entangled with reality, they should be contagious among honest folk.

If your model of reality suggests that the outputs of your thought processes should not be contagious to others, then your model says that your beliefs are not themselves evidence, meaning they are not entangled with reality; at that point, you should stop believing them.

As soon as you stop believing "'snow is white' is true," you should (automatically!) stop believing "snow is white," or something is very wrong.

23 Scientific evidence, legal evidence, rational evidence

Rational evidence - broadest possible sense of evidence; rational evidence about a hypothesis H is any observation that has a different likelihood depending on whether H holds in reality or not.

Legal evidence - introduces constraints on rational evidence: includes only particular kinds of evidence, such as personal observations; excludes hearsay.

Science is the publicly reproducible knowledge of humankind; it's made up of generalizations that apply to many particular instances, so that you can run new real-world experiments which test the generalization and verify it, without having to trust any authority.

Scientific evidence - publicly reproducible evidence, i.e. requires an experiment a person can perform themselves to verify it.

24 How much evidence does it take?

Mathematician's bits - logarithms, base 1/2, of probabilities.

For example, if there are four possible outcomes A, B, C, and D, whose probabilities are 50%, 25%, 12.5%, and 12.5%, and I tell you the outcome was “D,” then I have transmitted three bits of information to you, because I informed you of an outcome whose probability was 1/8.
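
The bit arithmetic from the example above can be sketched directly:

```python
import math

def bits(p: float) -> float:
    """Mathematician's bits: log base 1/2 of p, i.e. -log2(p)."""
    return -math.log2(p)

# The four outcomes from the example above.
for outcome, p in {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}.items():
    print(f"{outcome}: {bits(p)} bits")
# A: 1.0 bits, B: 2.0 bits, C: 3.0 bits, D: 3.0 bits
```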

Beliefs based on inadequate evidence are possible, but they will not be accurate. It's like trying to drive a car without any fuel because you don't believe in the concept that you need fuel to move.

"To really arrive at accurate beliefs requires evidence-fuel, and the further you want to go, the more fuel you need"

25 Einstein's Arrogance

Traditional Rationality emphasizes justification ("to justify believing in X requires Y bits of evidence"):

hunch/private line of reasoning -> hypothesis -> gather evidence to confirm it

But from a Bayesian perspective, you need an amount of evidence roughly equivalent to the complexity of the hypothesis to even just locate a hypothesis in theory-space. It's not really a question of justification; if you don't have enough bits of evidence, you can't focus your attention on one specific hypothesis.

It holds for "hunches" and "intuitions" (of an [at least] impressive-but-still-human Bayesian agent) as well, since no subconscious or conscious process can single out one hypothesis in a million targets using too few bits of information.
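
A rough sketch of that counting argument (the numbers are invented for illustration): each bit of evidence can at best halve the candidate set, so singling out one hypothesis among N roughly-equiprobable candidates takes about log2(N) bits.

```python
import math

def bits_to_locate(n_hypotheses: int) -> float:
    # Each bit of evidence halves the candidate set at best,
    # so locating one hypothesis among n takes ~log2(n) bits.
    return math.log2(n_hypotheses)

print(bits_to_locate(1_000_000))  # ~19.93 bits for one in a million
```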

I disagreed with the original blogpost and the example put forward in it.

26 Occam's Razor

The more complex an explanation is, the more evidence you need just to find it in belief-space.

How can we measure the complexity of an explanation? How can we determine how much evidence is required?

Occam's Razor is often phrased as "the simplest explanation that fits the facts".

A good way to measure simplicity/complexity is Solomonoff induction: the length of the shortest computer program which produces that description as its output.

This way, various supernatural explanations (for example, Thor causing thunder) can be differentiated from non-supernatural explanations for the same event (Maxwell's equations): it's much easier to write a program that simulates Maxwell's equations than a program that simulates an intelligent, emotional mind like Thor.

To predict sequences (such as HTHTHTTTHH when flipping a coin), Solomonoff induction works by summing over all allowed computer programs, with each program having a prior probability of (1/2) to the power of its code length in bits, and each program being weighted by its fit to all the data observed so far.
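
A toy numerical sketch of that weighting (the "programs" and code lengths below are invented for illustration; real Solomonoff induction is uncomputable):

```python
# Each candidate "program" gets prior 2^-L (L = code length in bits),
# then programs that don't fit the observed data are discarded.
programs = [
    # (name, code length in bits, fits the observed sequence?)
    ("always-H", 8, False),
    ("alternate-HT", 12, True),
    ("memorize-whole-sequence", 40, True),
]

weights = {name: (2.0 ** -length if fits else 0.0)
           for name, length, fits in programs}
total = sum(weights.values())
posterior = {name: w / total for name, w in weights.items()}

# The shortest program consistent with the data dominates the posterior.
print(posterior["alternate-HT"])  # > 0.999
```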

Minimum Message Length formalism: send a string describing a code, then send a string describing the data in that code. Whichever explanation leads to the shortest total message is the best.
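
A toy comparison in that spirit (character counts stand in for real bit costs, and the "code description" is invented):

```python
data = "HTHTHTHTHTHTHTHT"  # 16 observed coin flips

# Option 1: no code description; transmit the data literally.
literal_total = 0 + len(data)                 # 16

# Option 2: describe a repeating pattern; then the data costs nothing.
pattern_code = "repeat HT x8"                 # stand-in model description
pattern_total = len(pattern_code) + 0         # 12

# MML: whichever explanation gives the shortest total message wins.
best = min([("literal", literal_total), ("pattern", pattern_total)],
           key=lambda t: t[1])
print(best)  # ('pattern', 12)
```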

27 Your strength as a rationalist

Your strength as a rationalist lies in the ability to be more confused by fiction than by reality.

The usefulness of a model lies not in what it can explain, but in what it can't; this is how it discerns between truth and fiction.

When an explanation of something feels a little forced, here's what the sensation of "a little forced" is actually trying to say: "Either your model is false or this story is wrong."

Work in progress...

28 Semantic stopsigns

A semantic stopsign is a word that stops further questions about something; a cognitive traffic signal: do not think past this point. It doesn't resolve the underlying question, it merely halts the obvious continuation of the question-and-answer chain.

Examples of semantic stopsigns are: "God!", "Corporations!", "Capitalism!"/"Communism!"/etc, "Liberal democracy!" for some (that is, "What do we need to do about X?" or "Why does bad thing X exist? How could it be resolved?" is answered by "Liberal democracy!" or other ideological stopsign like that).

No word is a stopsign in itself; what matters is what effect it has on a particular person.