### Incorrect hypotheses point to correct observations

1. The Consciousness Researcher and Out-Of-Body Experiences

In his book Consciousness and the Brain, cognitive neuroscientist Stanislas Dehaene writes about scientifically investigating people’s reports of their out-of-body experiences:

the Swiss neurologist Olaf Blanke [did a] beautiful series of experiments on out-of-body experiences. Surgery patients occasionally report leaving their bodies during anesthesia. They describe an irrepressible feeling of hovering at the ceiling and even looking down at their inert body from up there. […]

What kind of brain representation, Blanke asked, underlies our adoption of a specific point of view on the external world? How does the brain assess the body’s location? After investigating many neurological and surgery patients, Blanke discovered that a cortical region in the right temporoparietal junction, when impaired or electrically perturbed, repeatedly caused a sensation of out-of-body transportation. This region is situated in a high-level zone where multiple signals converge: those arising from vision; from the somatosensory and kinesthetic systems (our brain’s map of bodily touch, muscular, and action signals); and from the vestibular system (the biological inertial platform, located in our inner ear, which monitors our head movements). By piecing together these various clues, the brain generates an integrated representation of the body’s location relative to its environment. However, this process can go awry if the signals disagree or become ambiguous as a result of brain damage. Out-of-body flight “really” happens, then—it is a real physical event, but only in the patient’s brain and, as a result, in his subjective experience. The out-of-body state is, by and large, an exacerbated form of the dizziness that we all experience when our vision disagrees with our vestibular system, as on a rocking boat.

Blanke went on to show that any human can leave her body: he created just the right amount of stimulation, via synchronized but delocalized visual and touch signals, to elicit an out-of-body experience in the normal brain. Using a clever robot, he even managed to re-create the illusion in a magnetic resonance imager. And while the scanned person experienced the illusion, her brain lit up in the temporoparietal junction—very close to where the patient’s lesions were located.

We still do not know exactly how this region works to generate a feeling of self-location. Still, the amazing story of how the out-of-body state moved from parapsychological curiosity to mainstream neuroscience gives a message of hope. Even outlandish subjective phenomena can be traced back to their neural origins. The key is to treat such introspections with just the right amount of seriousness. They do not give direct insights into our brain’s inner mechanisms; rather, they constitute the raw material on which a solid science of consciousness can be properly founded.

The naive hypothesis that out-of-body experiences represented the spirit genuinely leaving the body was incorrect. But it was still pointing to a real observation, namely that there are conditions which create a subjective experience of leaving the body. That observation could then be investigated through scientific means.

2. The Artist and the Criticism

In art circles, there’s a common piece of advice that goes along the lines of:

When people say that they don’t like something about your work, you should treat that as valid information.

When people say why they don’t like it or what you could do to fix it, you should treat that with some skepticism.

Outside the art context, if someone tells you that they’re pissed off with you as a person (or that you make them feel good), then that’s likely to be true; but the reason that they give may not be the true reason.

People have poor introspective access to the reasons why they like or dislike something; when they are asked for an explanation, they often literally fabricate their reasons. Their explanation is likely false, even though it’s still pointing to something in the work having made them dislike it.

3. The Traditionalist and the Anthropologist

The Scholar’s Stage blog post “Tradition is Smarter Than You Are” quotes Joseph Henrich’s The Secret of Our Success, which reports that many folk traditions, such as not eating particular fish during pregnancy, are adaptive: not eating that fish during pregnancy is good for the child, mother, or both. But the people in question often do not know why they follow that tradition:

We looked for a shared underlying mental model of why one would not eat these marine species during pregnancy or breastfeeding—a causal model or set of reasoned principles. Unlike the highly consistent answers on what not to eat and when, women’s responses to our why questions were all over the map. Many women simply said they did not know and clearly thought it was an odd question. Others said it was “custom.” Some did suggest that the consumption of at least some of the species might result in harmful effects to the fetus, but what precisely would happen to the fetus varied greatly, though a nontrivial segment of the women explained that babies would be born with rough skin if sharks were eaten and smelly joints if morays were eaten. Unlike most of our interview questions on this topic, the answers here had the flavor of post-hoc rationalization: “Since I’m being asked for a reason, there must be a reason, so I’ll think one up now.” This is extremely common in ethnographic fieldwork, and I’ve personally experienced it in the Peruvian Amazon with the Matsigenka and with the Mapuche in southern Chile.

The people’s hypotheses for why they do something are wrong. But their behavior is still pointing to the fish in question being bad to eat during pregnancy.

4. The Martial Artist and the Ki

In Types of Knowing, Valentine writes:

Another example is the “unbendable arm” in martial arts. I learned this as a matter of “extending ki”: if you let magical life-energy blast out your fingertips, then your arm becomes hard to bend much like it’s hard to bend a hose with water blasting out of it. This is obviously not what’s really happening, but thinking this way often gets people to be able to do it after a few cumulative hours of practice.

But you know what helps better?

Knowing the physics.

Turns out that the unbendable arm is a leverage trick: if you treat the upward pressure on the wrist as a fulcrum and you push your hand down (or rather, raise your elbow a bit), you can redirect that force and the force that’s downward on your elbow into each other. Then you don’t need to be strong relative to how hard your partner is pushing on your elbow; you just need to be strong enough to redirect the forces into each other.

Knowing this, I can teach someone to pretty reliably do the unbendable arm in under ten minutes. No mystical philosophy needed.

The explanation about magical life energy was false, but it was still pointing to a useful trick that could be learned and put to good use.

Observations and the hypotheses developed to explain them often get wrapped up together, causing us to evaluate both as a whole. In some cases, we only hear the hypothesis rather than the observation which prompted it. But people usually don’t pull their hypotheses out of entirely thin air; even an incorrect hypothesis is usually entangled with some correct observations. If we can isolate the observation that prompted the hypothesis, then we can treat the hypothesis as a burdensome detail to be evaluated on its own merits, separate from the original observation. At the very least, the existence of an incorrect but common hypothesis suggests to us that there’s something going on that needs to be explained.

Originally published at Kaj Sotala. You can comment here or there.

### Mark Eichenlaub: How to develop scientific intuition

Recently on the CFAR alumni mailing list, someone asked a question about how to develop scientific intuition. In response, Mark Eichenlaub posted an excellent and extensive answer, which was so good that I asked for permission to repost it in public. He graciously gave permission, so I’ve reproduced his message below. (He otherwise retains the rights to this, meaning that the standard CC license on my blog doesn’t apply to this post.)

From: Mark Eichenlaub
Date: Tue, Oct 23, 2018 at 9:34 AM
Subject: Re: [CFAR Alumni] Suggestions for developing scientific intuition

Sorry for the length, I recently finished a PhD on this topic. (After I wrote the answer kerspoon linked, I went to grad school to study the topic.) This is specifically about solving physics problems but hopefully speaks to intuition a bit more broadly in places.

I mostly think of intuition as the ability to quickly coordinate a large number of small heuristics. We know lots of small facts and patterns, and intuition is about matching the relevant ones onto the current situation. The little heuristics are often pretty local and small in scope.

For example, the other day I heard this physics problem:

You set up a trough with water in it. You hang just barely less than half of the trough off the edge of a table, so that it balances, but even a small force at the far end would make it tip over.

You put a boat in the trough at the end over the table. The trough remains balanced.

Then you slowly push the boat down to the other end of the trough, so that it’s in the part of the trough that hangs out from the table. What happens? (I.e., does the trough tip over?)

The answer is (rot13) Gur gebhtu qbrf abg gvc; vg erznvaf onynaprq (nf ybat nf gur zbirzrag bs gur obng vf fhssvpvragyl fybj fb gung rirelguvat erznvaf va rdhvyvoevhz).

I knew this “intuitively”, by which I mean I got it within a second or so of understanding the question, and without putting in conscious effort to thinking about it. (I wasn’t certain I was right until I had consciously thought it out, but I was reasonably confident within a second, and my intuition bore out.) I don’t think this was due to some sort of general intuition about problem solving, science, physics, mechanics, or even floating. It felt like I could solve the problem intuitively specifically because I had seen sufficiently-similar things that led me to the specific heuristic “a floating object spreads its weight out evenly over the bottom of the container it’s floating in.” Then I think of “having intuition” in physics as having maybe a thousand little rules like that and knowing when to call on which one.

For this particular heuristic, there is a classic problem asking what happens to the water level in a lake if you are in a boat with a rock, and you throw the rock into the water and it sinks to the bottom. One solution to that problem is that when the rock is on the bottom of the lake, it exerts more force on that part of the bottom of the lake than is exerted at other places. By contrast, when the rock is still in the boat, the only thing touching the bottom of the lake is water, and the water pressure is the same everywhere, so the weight of the rock is distributed evenly across the entire lake. The total force on the bottom of the lake doesn’t change between the two scenarios (because gravity pulls on everything just as hard either way), so when the rock is sitting on the bottom of the lake and the force on the bottom of the lake is higher under the rock, it must be lower everywhere else to compensate. The pressure everywhere else is $\rho g h$, so if that goes down, the level of the lake goes down. Conclusion: when you throw the rock overboard, the level of the lake goes down a bit. When I thought about that problem, I presumably built the “weight distributed evenly” heuristic. All I had to do was quickly apply it to the trough problem to solve that one as well.
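The rock-and-boat argument can be checked numerically. The following is a minimal sketch, assuming an idealized lake with a flat bottom and vertical sides; the function names and the example numbers are made up for illustration and are not part of the original message. It computes the level change from displaced volumes, then confirms that the force lost to lower water pressure matches the extra force the sunk rock puts on the bottom.

```python
import math

RHO_WATER = 1000.0  # kg/m^3, fresh water (assumed value)
G = 9.81            # m/s^2


def level_change_after_sinking(rock_mass, rock_density, lake_area,
                               rho_water=RHO_WATER):
    """Change in water level (m) when the rock goes from the boat to the bottom.

    Assumes a lake with uniform horizontal area `lake_area` (m^2).
    """
    if rock_density <= rho_water:
        raise ValueError("the rock must be denser than water to sink")
    # In the boat, the rock displaces its own weight in water.
    displaced_while_floating = rock_mass / rho_water
    # On the bottom, the rock displaces only its own volume.
    displaced_when_sunk = rock_mass / rock_density
    return (displaced_when_sunk - displaced_while_floating) / lake_area


def force_lost_to_pressure_drop(rock_mass, rock_density, lake_area,
                                g=G, rho_water=RHO_WATER):
    """Force (N) removed from the bottom because the pressure rho*g*h fell."""
    dh = level_change_after_sinking(rock_mass, rock_density, lake_area,
                                    rho_water)
    return -rho_water * g * dh * lake_area


def extra_force_under_rock(rock_mass, rock_density, g=G, rho_water=RHO_WATER):
    """Net force (N) the sunk rock adds to the bottom: weight minus buoyancy."""
    return rock_mass * g * (1.0 - rho_water / rock_density)


if __name__ == "__main__":
    # Hypothetical numbers: a 10 kg rock of density 2500 kg/m^3, 100 m^2 lake.
    dh = level_change_after_sinking(10.0, 2500.0, 100.0)
    print(f"level change: {dh:.2e} m")  # negative: the level drops
    print(math.isclose(force_lost_to_pressure_drop(10.0, 2500.0, 100.0),
                       extra_force_under_rock(10.0, 2500.0)))
```

The two force functions agree because the total force on the bottom is the same in both scenarios, which is the step the verbal argument relies on.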

And if someone else also had a background in physics but didn’t find the trough problem easy, it’s probably because they simply hadn’t happened to think about the boat problem, or some other similar problems, in the right way, and hadn’t come away with the heuristic about the weight of floating things being spread out evenly.

To me, this picture of intuition as small heuristics doesn’t look good for the idea of developing powerful intuition. The “weight gets spread out by floating” heuristic is not likely to transfer to much else. I’ve used it for two physics problems about floating things and, as far as I know, nothing else.

You can probably think of lots of similar heuristics. For example, “conservation of expected evidence”. You might catch a mistake in someone’s reasoning, or an error in a long probability calculation you made, if you happen to notice that the argument or calculation violates conservation of expected evidence. The nice thing about this is that it can happen almost automatically. You don’t have to stop after every calculation or argument and think, “does this break conservation of expected evidence?”. Instead, you wind up learning some sorts of triggers that you associate with the principle that prime it in your mind, and then, if it becomes relevant to the argument, you notice that and cite the principle.
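For readers who haven't met the principle: conservation of expected evidence says that the probability-weighted average of your possible posteriors must equal your prior. Here is a minimal sketch of that check for a single hypothesis and a yes/no observation; the function names and the example probabilities are assumptions chosen for illustration.

```python
def posteriors(prior, p_e_given_h, p_e_given_not_h):
    """Return (P(E), P(H|E), P(H|not E)) using Bayes' rule."""
    p_e = prior * p_e_given_h + (1.0 - prior) * p_e_given_not_h
    p_h_given_e = prior * p_e_given_h / p_e
    p_h_given_not_e = prior * (1.0 - p_e_given_h) / (1.0 - p_e)
    return p_e, p_h_given_e, p_h_given_not_e


def expected_posterior(prior, p_e_given_h, p_e_given_not_h):
    """Average the two possible posteriors, weighted by how likely each is."""
    p_e, up, down = posteriors(prior, p_e_given_h, p_e_given_not_h)
    return p_e * up + (1.0 - p_e) * down


if __name__ == "__main__":
    prior = 0.3
    # Seeing E would raise the probability and not seeing it would lower it,
    # but the weighted average lands back on the prior.
    print(posteriors(prior, 0.8, 0.1))
    print(abs(expected_posterior(prior, 0.8, 0.1) - prior) < 1e-12)
```

A calculation in which every possible observation moves the estimate in the same direction fails this check, which is the kind of mistake the heuristic catches.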

In this picture, building intuition is about learning a large number of these heuristics, along with their triggers.

However, while the individual small heuristics are often the easiest things to point to in an intuitive solution to a problem, I do think there are more general, and therefore more transferable, parts of intuition as well. I imagine that the paragraph I wrote explaining the solution to the boat problem will be largely incomprehensible to someone who hasn’t studied physics. That’s partially because it uses concepts they won’t have a rigorous understanding of (e.g. pressure), partially because it tacitly uses small heuristics it didn’t explain (e.g. that the reason the pressure is the same along the bottom of the lake is that if it weren’t, there would be horizontal forces that push the water around until the pressure did equalize in this way), and partially because it made simplifications that it didn’t state and that might not be obviously justified (e.g. that the bottom of the lake is flat). More importantly, it relies on a general framework of Newtonian mechanics. There are a number of tacit applications of Newton’s laws in the argument. For example, I stated that the total force on the bottom of the lake is the same whether the rock is resting on the bottom or floating in the boat “because gravity pulls on everything just as hard either way”, but these aren’t directly connected concepts. Gravity pulls the system (boat + water + rock) down just as hard no matter where the rock is. That system is not accelerating, so by Newton’s second law, the bottom of the lake pushes up on that system just as hard in each scenario. And by Newton’s third law, the system pushes down on the bottom of the lake just as hard in each scenario.
So understanding the argument involves some fairly general heuristics such as “apply Newton’s second law to an object in equilibrium to show that two forces on it have equal magnitude” – a heuristic I’ve used hundreds of times, and “decide what objects to define as part of a system fluidly as you go through a problem” (in this case, switching from thinking about the rock as a system to thinking about rock+boat+water as a single system) – a skill I’ve used hundreds to thousands of times across all of physics. (My job is to teach high schoolers to be really good at solving problems like this, so I spend way more time on it than most people, so applying a heuristic specific to solving introductory physics problems in a thousand independent instances is realistic for me.)

Then there may be more meta-level skills and heuristics that you develop in solving problems. These could be things like valuing non-calculation solutions, or believing that persevering on a tough problem is worthwhile. It’s also important that intuition isn’t just about having lots of little heuristics. It’s about organizing them and calling the right one up at the right time. You’ll have to ask yourself the right sorts of questions to prompt yourself to find the right heuristics, and that’s probably a pretty general skill.

There is a fair amount of research on trying to understand what all these little heuristics are and how to develop them, but I’m mostly familiar with the research in physics.

In the Quora answer kerspoon linked, I cited George Lakoff, and I still think that he’s a good source for understanding how we go about taking primitive sorts of concepts (e.g. “up” and “down”) and using and adapting them, via partial metaphor, to understanding more abstract things. For a specific example that’s well-argued, see:

Wittmann, Michael C., and Katrina E. Black. “Mathematical actions as procedural resources: An example from the separation of variables.” Physical Review Special Topics-Physics Education Research 11.2 (2015): 020114.

They argue that students understand the arithmetic action “separation of variables” via analogy to their physical understanding of taking things and physically moving them around. However, I think Wittmann and Black’s work is incomplete. For example, it doesn’t explain why students using the motion analogy for separation of variables do it correctly – they could just as well use motion to encode algebraically-invalid rules. Also, they don’t explain how the analogy develops. They just catalog that it exists.

A foundational work in trying to understand the components of physical intuition is:

DiSessa, Andrea A. “Toward an epistemology of physics.” Cognition and instruction 10.2-3 (1993): 105-225.

This work establishes “phenomenological primitives”: little core heuristics such as “near is more”, which are templates for physical reasoning. Drawing from these templates, we might conclude that the nearer you are to a speaker, the louder the sound, or that the nearer you are to the sun, the hotter it will be (and therefore that summer is hot because the Earth is nearer the sun – a false but common and reasonable belief).

That’s a long and somewhat-obscure paper. I really like his student’s work:

Sherin, Bruce L. “How students understand physics equations.” Cognition and instruction 19.4 (2001): 479-541.

Like Disessa, Sherin builds his own framework for what intuition is. His scope is more limited though, focusing solely on building and interpreting certain types of equations in a manner that combines “intuitive” physical ideas and mathematical templates. He spells this out in more detail in the paper, and it’s incredibly clear and well-argued. Probably my favorite paper in the field.

A more general reference, one that’s much more accessible than Disessa and a broader overview of cognition in physics than Sherin, is
“How Should We Think About How Our Students Think” by my advisor, Joe Redish http://media.physics.harvard.edu/video/?id=COLLOQ_REDISH_093013 (video) https://arxiv.org/abs/1308.3911 (paper).

The actual process of building new heuristics is also studied, but overall I don’t think we know all that much. See my friend Ben’s paper

Dreyfus, Benjamin W., Ayush Gupta, and Edward F. Redish. “Applying conceptual blending to model coordinated use of multiple ontological metaphors.” International Journal of Science Education 37.5-6 (2015): 812-838.

for an example of theory-building around how we create new intuitions. He calls on a framework from cognitive science called “conceptual blending” that is rather formal, but I think pretty entertaining to read.

A relevant search term in the education literature:

“conceptual change”

but I find a lot of this literature to be hard-to-follow and not always a productive use of time to read.

On the applied side, I think the state of the art in evidence-backed approaches to building intuition, at least in physics, is modeling instruction. I’m not sure what the best introduction to modeling instruction is. They have a website that seems okay. Eric Brewe writes on it and he’s usually very good. The basic idea is to have students collaboratively participate in the building of the theories of physics they’re using (in a specific way, with guidance and direction from a trained instructor), which gets them to think about the “whys” involved with a particular theory or model in a way they usually wouldn’t.

I have written some about why I think things like checking the extreme cases of a formula are powerful intuition-building tools. A preprint is available here: https://arxiv.org/pdf/1804.01639.pdf
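As a small illustration of what checking extreme cases looks like in practice (this example is not taken from the preprint; the Atwood machine and the function name are assumptions picked for the sketch), consider the textbook formula for two masses hanging over an ideal pulley, $a = \frac{m_1 - m_2}{m_1 + m_2} g$:

```python
G = 9.81  # m/s^2


def atwood_acceleration(m1, m2, g=G):
    """Downward acceleration of m1 for two masses over an ideal pulley."""
    return (m1 - m2) / (m1 + m2) * g


if __name__ == "__main__":
    # Equal masses: nothing should move.
    print(atwood_acceleration(2.0, 2.0))
    # No counterweight at all: m1 should simply be in free fall.
    print(atwood_acceleration(1.0, 0.0))
    # Swapping the masses should only flip the direction of motion.
    print(atwood_acceleration(3.0, 1.0), atwood_acceleration(1.0, 3.0))
```

Each check ties a limit of the formula to a situation where the answer is already known, which is where the intuition-building value comes from.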

However, I think it’s dangerous to have rules like “always check the dimensions of your answer”, “always check the extreme cases of a formula”, or even “always check that the numbers come out reasonable.” The reason is that having these things as procedures tends to encourage students to follow them by rote. A large part of the cognitive work involved isn’t in checking the extreme cases or the dimensions, but in realizing that in this particular situation, that would be a good thing to do. If you’re doing it only because an external prompt is telling you to, you aren’t building the appropriate meta-level habits. See https://www.tandfonline.com/doi/abs/10.1080/09500693.2017.1308037 for an example of this effect.

See papers on “metarepresentation” by Disessa and/or Sherin for another example of generalizable skills related to intuition and problem solving.

Unfortunately, I don’t think writing books well or writing courses of individual study is something we know much about. I don’t know anyone who has a significant grant for that; the most I’ve ever seen on it is a poster here or there at a conference. Generally, grants are awarded for improving high school and college courses, or for professional development programs, supporting department or institution level changes at schools, etc. So adults who just want to learn on their own are not really served much by the research on the area. If you’re an adult who wants to self-study theoretical physics with an eye towards intuition, I recommend Leonard Susskind’s series of courses “The Theoretical Minimum” (the first three courses exist as books, the rest only as video lectures). He approaches mathematical topics with what I find an intuitive approach in most cases. Of course the Feynman lectures on physics are also very good.

I’ll be building an introduction to physics course at Art of Problem Solving, starting work sometime this winter. It might be available in the spring, although students will mostly be middle and high school students (but anyone is welcome to take our courses). I currently teach an advanced physics problem-solving course at AoPS called “PhysicsWOOT”. I try to support intuition-building practices there, but the main aim is in training these many small heuristics which students need to solve contest problems.

There should be something like modeling instruction for adult independent learners, but I don’t know of it.


### On insecurity as a friend

There’s a common narrative about confidence that says that confidence is good, insecurity is bad. It’s better to develop your confidence than to be insecure. There’s an obvious truth to this.

But what that narrative does not acknowledge, and what both a person struggling with insecurity and their well-meaning friends might miss, is that that insecurity may be in place for a reason.

You might not notice it online, but I’ve usually been pretty timid and insecure in real life. But this wasn’t always the case. There were occasions earlier in my life when I was less insecure, more confident in myself.

I was also pretty horrible at things like reading social nuance and figuring out when and why someone might be offended. So I was given, repeatedly, the feedback that my behavior was bad and inappropriate.

Eventually a part of me internalized that as “I’m very likely to accidentally offend the people around me, so I should be very cautious about what I say, ideally saying nothing at all”.

This was, I think, the correct lesson to internalize at that point! It shifted me more into an observer mode, allowing me to just watch social situations and learn more about their dynamics that way. I still don’t think that I’m great at reading social nuance, but I’m at least better at it than I used to be.

And there have been times since then when I’ve decided that I should act with more confidence, and just get rid of the part that generates the insecurity. I’ve been about to do something, felt a sense of insecurity, and walked over the feeling and done the thing anyway.

Sometimes this has had good results. But often it has also led to things blowing up in my face, with me inadvertently hurting someone and leaving me feeling guilty for months afterwards.

Turns out, that feeling of insecurity wasn’t a purely bad thing. It was throwing up important alarms which I chose to ignore, alarms which were sounding because it recognized my behavior as matching previous behavior which had had poor consequences.

Yes, on many occasions that part of me makes me way too cautious. And it would be good to moderate that caution a little. But the part which generates the feelings of insecurity is the same part which is constantly working to model other people and their experience, their reactions to me. The part that is trying its hardest to make other people feel safe and comfortable around me, to avoid doing things that would make them feel needlessly hurt or upset or unsafe, and to actively let them know that I’m doing this.

Just carving out that part would be a mistake. A moral wrong, even.

The answer is not to get rid of it. The answer is to integrate its cautions better, to keep it with me as a trusted friend and ally – one which feels safe enough about getting its warnings listened to, that it will not scream all the time just to be heard.


### New paper: Long-Term Trajectories of Human Civilization

Long-Term Trajectories of Human Civilization (free PDF). Foresight, forthcoming, DOI 10.1108/FS-04-2018-0037.

Authors: Seth D. Baum, Stuart Armstrong, Timoteus Ekenstedt, Olle Häggström, Robin Hanson, Karin Kuhlemann, Matthijs M. Maas, James D. Miller, Markus Salmela, Anders Sandberg, Kaj Sotala, Phil Torres, Alexey Turchin, and Roman V. Yampolskiy.

Abstract
Purpose: This paper formalizes long-term trajectories of human civilization as a scientific and ethical field of study. The long-term trajectory of human civilization can be defined as the path that human civilization takes during the entire future time period in which human civilization could continue to exist.
Approach: We focus on four types of trajectories: status quo trajectories, in which human civilization persists in a state broadly similar to its current state into the distant future; catastrophe trajectories, in which one or more events cause significant harm to human civilization; technological transformation trajectories, in which radical technological breakthroughs put human civilization on a fundamentally different course; and astronomical trajectories, in which human civilization expands beyond its home planet and into the accessible portions of the cosmos.
Findings: Status quo trajectories appear unlikely to persist into the distant future, especially in light of long-term astronomical processes. Several catastrophe, technological transformation, and astronomical trajectories appear possible.
Value: Some current actions may be able to affect the long-term trajectory. Whether these actions should be pursued depends on a mix of empirical and ethical factors. For some ethical frameworks, these actions may be especially important to pursue.

An excerpt from the press release over at the Global Catastrophic Risk Institute:

Society today needs greater attention to the long-term fate of human civilization. Important present-day decisions can affect what happens millions, billions, or trillions of years into the future. The long-term effects may be the most important factor for present-day decisions and must be taken into account. An international group of 14 scholars calls for the dedicated study of “long-term trajectories of human civilization” in order to understand long-term outcomes and inform decision-making. This new approach is presented in the academic journal Foresight, where the scholars have made an initial evaluation of potential long-term trajectories and their present-day societal importance.

“Human civilization could end up going in radically different directions, for better or for worse. What we do today could affect the outcome. It is vital that we understand possible long-term trajectories and set policy accordingly. The stakes are quite literally astronomical,” says lead author Dr. Seth Baum, Executive Director of the Global Catastrophic Risk Institute, a non-profit think tank in the US.

The group of scholars including Olle Häggström, Robin Hanson, Karin Kuhlemann, Anders Sandberg, and Roman Yampolskiy have identified four types of long-term trajectories: status quo trajectories, in which civilization stays about the same, catastrophe trajectories, in which civilization collapses, technological transformation trajectories, in which radical technology fundamentally changes civilization, and astronomical trajectories, in which civilization expands beyond our home planet.

Available here: https://kajsotala.fi/assets/2018/08/trajectories.pdf


### Finland Museum Tour 1/??: Tampere Art Museum

I haven’t really been to museums as an adult; not because I’d have been particularly Anti-Museum, but just because museums never happened to become a Thing That I Do. I vaguely recall having been to a few museums with my parents when I was little, an occasional Japan exhibition as a teen when Japan was a Thing, and a few visits to various museums with school. I think my overall recollection of those visits afterwards could be summarized as being around 5.5 on the BoardGameGeek rating scale, between the grades of “5/10: Slightly boring, take it or leave it” and “6/10: Ok – will play if in the mood”. (The BGG rating scale is my favorite of the ones that I’ve seen, but I digress.)

So I’m not sure, but it’s at least possible that between becoming an adult and yesterday, I didn’t visit a single museum.

For the last year or so however, I’ve had a definite feeling of being stuck in a rut, life-wise. Up until summer last year, I used to have a lot of anxiety; I’m still not totally free of it, but I’ve reduced the amount of it enough that escaping from it is no longer my main driving motivation, the way that it used to be. Meaning that I’m more free to focus on things that I actually enjoy.

But once you have spent most of your adult life feeling a desperate need to escape from a constant level of background anxiety, anxiety which was preventing you from doing anything slow-paced as that would have been insufficient to drown out the suffering… then it’s hard to know *what* you really enjoy anymore. Because you haven’t really been looking for enjoyable things, you have been looking for things that would make the pain go away.

What I was left with, even after getting rid of most of the anxiety, was some level of anhedonia – a difficulty deriving any pleasure from something. And most of my old routines were built around doing things that were mainly palliatives for anxiety, rather than being particularly enjoyable.

Then one day, I happened to see a news article saying something about how the national Museum Card – a single card that you can buy for a year, that gives you free access to various museums around the country – had brought more visitors to museums. And it crossed my mind that visiting museums is a Thing That People Do, which I wasn’t doing, and that I probably hadn’t been able to properly appreciate museums as a kid.

Also, that it would be a fun adventure to go around the country and visit all the various museums that the card gives you access to (currently 278 of them). That kind of an adventure was also Something That People Did, but which hadn’t really been the Kind Of Thing That I Would Do.

So I happened to mention this idea in a conversation with my good friend Tiina, who then mentioned that she also had a museum card, and that company would be welcome. That wasn’t quite an agreement to visit *all* the museums in the country, but still good enough to get started!

Our first visit, today, was to Tampere Art Museum. I had no particular reason for picking this place in particular: me and Tiina would be traveling elsewhere later in the day, so I let her as a Tampere local choose a place from which we could easily get to the bus station afterwards. I also found the notion of starting out with an art museum intriguing. Some of the other museums in Tampere, such as the Espionage Museum, Game Museum or Lenin Museum had topics that were intrinsically interesting. But just a generic “art museum” was probably closest to my inner stereotype of the kind of a dull museum that I wasn’t really able to appreciate as a kid. This time, I was determined to have an open mind and enjoy my experience, whatever it might be.

And I did.

The museum had two exhibitions. The first, Tuomo Rosenlund and Mika Hannu’s “Paikan muisti” (“The Memory of Place”) consisted of black-and-white drawings of different places in the city of Tampere, together with poems and writing about the history of those places.

I feel like I have recently done a number of things to help me connect with more intuitive parts of my mind, ranging from various psychotherapy-style practices like Focusing, to shaman drumming and mild vision trances. Looking at the various drawings, the thought came to me to let my subconscious complete them. I asked it to do so and it obliged, and in my mind’s eye, the black-and-white drawings were flooded with color and texture and depth, and I could experience myself standing in the depicted places, imagining myself into locations that I had never seen before. Not as vividly as actually being there, but as strongly as in a particularly vivid memory.

That felt enjoyable, like a pleasant meditation exercise.

And then there was the exhibition by J.A. Juvani, a visual artist who – as I learned – is the national Young Artist of the Year.

The essence of his pieces, as I experienced them, was that of pure, raw, unashamed sexuality; lust and desire shaded in colors of queer and kink. Furthermore, it was an obviously personal essence; an expression of the artist’s own sexuality, an expression of intimacy that drew you in, erotic even to someone like me who doesn’t ordinarily think of himself as being physically attracted to men.

One of his displays was a looping video, on a large screen right in front of the stairs leading to the second floor, placed so that you couldn’t avoid seeing it when you came up. At the end of the very suggestive video, he would be staring you right in the eyes; and as I stared back, I felt my breathing growing deeper and faster, as if he was propositioning me in the flesh right there, me finding him too appealing to say no to.

What probably got me was the rawness – this was not the nice, unreal kind of erotica where everything is pink and nobody’s position ever feels uncomfortable, but the kind of primal lust that felt real and totally uninterested in Photoshopping any aspect of the experience. It was much more interested in just having a good, visceral, *physical* fuck right then and there.

And it brought up partially forgotten memories. Memories of the same kind of pure, two-sided lust between people, with a force that I realized I probably hadn’t experienced since my teen years, in that roller-coaster relationship with my first girlfriend that was all shades of dysfunctional but never lacking in intensity.

There was a book on Juvani’s work on sale; I skimmed it, and his descriptions of his life, the casual sex, the friend who had visited “for a tea and a blowjob” further brought back memories of those first teenage sex experiences, when everything was novel and exciting and innocent and powerful; before I had accumulated the sexual traumas and problems that sometimes make me almost averse to the thought of having sex, even with a willing and desirable partner.

And there was a sense of catharsis, feeling somehow more *pure* as a result; the same kind of feeling that I may get from writing something that I’ve managed to really immerse myself in, or from really intensive fiction. A feeling of having connected with deeper, more emotional parts of my mind, and becoming more whole as a result.

It was a good experience. I’m glad that I went.


### Is the Star Trek Federation really incapable of building AI?

In the Star Trek universe, we are told that it’s really hard to make genuine artificial intelligence, and that Data is so special because he’s a rare example of someone having managed to create one.

But this doesn’t seem to be the best hypothesis for explaining the evidence that we’ve actually seen. Consider:

• In the TOS episode “The Ultimate Computer”, the Federation has managed to build a computer intelligent enough to run the Enterprise on its own, but it goes crazy and Kirk has to talk it into self-destructing.
• In TNG, we find out that before Data, Doctor Noonian Soong had built Lore, an android with sophisticated emotional processing. However, Lore became essentially evil and had no problems killing people for his own benefit. Data worked better, but in order to get his behavior right, Soong had to initially build him with no emotions at all. (TNG: “Datalore”, “Brothers”)
• In the TNG episode “Evolution”, Wesley is doing a science project with nanotechnology, accidentally enabling the nanites to become a collective intelligence which almost takes over the ship before the crew manages to negotiate a peaceful solution with them.
• The holodeck seems entirely capable of running generally intelligent characters, though their behavior is usually restricted to specific roles. However, on occasion they have started straying outside their normal parameters, to the point of attempting to take over the ship. (TNG: “Elementary, Dear Data”) It is also suggested that the computer is capable of running an indefinitely long simulation which is good enough to make an intelligent being believe in it being the real universe. (TNG: “Ship in a Bottle”)
• The ship’s computer in most of the series seems like it’s potentially quite intelligent, but most of the intelligence isn’t used for anything other than running holographic characters.
• In the TNG episode “Booby Trap”, a potential way of saving the Enterprise from the Disaster Of The Week would involve turning over control of the ship to the computer: however, the characters are inexplicably super-reluctant to do this.
• In Voyager, the Emergency Medical Hologram clearly has general intelligence: however, it is only supposed to be used in emergency situations rather than running long-term, its memory starting to degrade after a sufficiently long time of continuous use. The recommended solution is to reset it, removing all of the accumulated memories since its first activation. (VOY: “The Swarm”)

There seems to be a pattern here: if an AI is built to carry out a relatively restricted role, then things work fine. However, once it is given broad autonomy and it gets to do open-ended learning, there’s a very high chance that it gets out of control. The Federation witnessed this for the first time with the Ultimate Computer. Since then, they have been ensuring that all of their AI systems are restricted to narrow tasks or that they’ll only run for a short time in an emergency, to avoid things getting out of hand. Of course, this doesn’t change the fact that your AI having more intelligence is generally useful, so e.g. starship computers are equipped with powerful general intelligence capabilities, which sometimes do get out of hand.

Dr. Soong’s achievement with Data was not in building a general intelligence, but in building a general intelligence which didn’t go crazy. (And before Data, he failed at that task once, with Lore.)

The Federation’s issue with AI is not that they haven’t solved artificial general intelligence. The Federation’s issue is that they haven’t reliably solved the AI alignment problem.

Originally published at Kaj Sotala. You can comment here or there.

### Some conceptual highlights from “Disjunctive Scenarios of Catastrophic AI Risk”

My forthcoming paper, “Disjunctive Scenarios of Catastrophic AI Risk”, attempts to introduce a number of considerations to the analysis of potential risks from Artificial General Intelligence (AGI). As the paper is long and occasionally makes for somewhat dry reading, I thought that I would briefly highlight a few of the key points raised in the paper.

The main idea here is that most of the discussion about risks of AGI has been framed in terms of a scenario that goes something along the lines of “a research group develops AGI, that AGI develops to become superintelligent, escapes from its creators, and takes over the world”. While that is one scenario that could happen, focusing too much on any single scenario makes us more likely to overlook alternative scenarios. It also makes the scenarios susceptible to criticism from people who (correctly!) point out that we are postulating very specific scenarios that have lots of burdensome details.

To address that, I discuss here a number of considerations that suggest disjunctive paths to catastrophic outcomes: paths that are of the form “A or B or C could happen, and any one of them happening could have bad consequences”.

Superintelligence versus Crucial Capabilities

Bostrom’s Superintelligence, as well as a number of other sources, basically make the following argument:

1. An AGI could become superintelligent
2. Superintelligence would enable the AGI to take over the world

This is an important argument to make and analyze, since superintelligence basically represents an extreme case: if an individual AGI may become as powerful as it gets, how do we prepare for that eventuality? As long as there is a plausible chance for such an extreme case to be realized, it must be taken into account.

However, it is probably a mistake to focus only on the case of superintelligence. Basically, the reason why we are interested in a superintelligence is that, by assumption, it has the cognitive capabilities necessary for a world takeover. But what about an AGI which also had the cognitive capabilities necessary for taking over the world, and only those?

Such an AGI might not count as a superintelligence in the traditional sense, since it would not be superhumanly capable in every domain. Yet, it would still be one that we should be concerned about. If we focus too much on just the superintelligence case, we might miss the emergence of a “dumb” AGI which nevertheless had the crucial capabilities necessary for a world takeover.

That raises the question of what might be such crucial capabilities. I don’t have a comprehensive answer; in my paper, I focus mostly on the kinds of capabilities that could be used to inflict major damage: social manipulation, cyberwarfare, biological warfare. Others no doubt exist.

A possibly useful framing for future investigations might be, “what level of capability would an AGI need to achieve in a crucial capability in order to be dangerous”, where the definition of “dangerous” is free to vary based on how serious a risk we are concerned about. One complication here is that this is a highly contextual question – with a superintelligence we can assume that the AGI may get basically omnipotent, but such a simplifying assumption won’t help us here. For example, the level of offensive biowarfare capability that would pose a major risk depends on the level of the world’s defensive biowarfare capabilities. Also, we know that it’s possible to inflict enormous damage on humanity even with just human-level intelligence: whoever is authorized to control the arsenal of a nuclear power could trigger World War III, no superhuman smarts needed.

Crucial capabilities are a disjunctive consideration because they show that superintelligence isn’t the only level of capability that would pose a major risk: there are many different combinations of various capabilities – including ones that we don’t even know about yet – that could pose the same level of danger as superintelligence.

Incidentally, this shows one reason why the common criticism of “superintelligence isn’t something that we need to worry about because intelligence isn’t unidimensional” is ill-founded – the AGI doesn’t need to be superintelligent in every dimension of intelligence, just the ones we care about.

How would the AGI get free and powerful?

In the prototypical AGI risk scenario, we are assuming that the developers of the AGI want to keep it strictly under control, whereas the AGI itself has a motive to break free. This has led to various discussions about the feasibility of “oracle AI” or “AI confinement” – ways to restrict the AGI’s ability to act freely in the world, while still making use of it. This also means that the AGI might have a hard time acquiring the resources that it needs for a world takeover, since it either has to do so while it is under constant supervision by its creators, or while on the run from them.

However, there are also alternative scenarios where the AGI’s creators voluntarily set it free – or even place it in control of e.g. a major corporation, free to use that corporation’s resources as it desires! My chapter discusses several ways by which this could happen: i) economic benefit or competitive pressure, ii) criminal or terrorist reasons, iii) ethical or philosophical reasons, iv) confidence in the AI’s safety, as well as v) desperate circumstances such as being otherwise close to death. See the chapter for more details on each of these. Furthermore, the AGI could remain theoretically confined but be practically in control anyway – such as in a situation where it was officially only giving a corporation advice, but its advice had never been wrong before and nobody wanted to risk their jobs by going against the advice.

Would the Treacherous Turn involve a Decisive Strategic Advantage?

Looking at crucial capabilities in a more fine-grained manner also raises the question of when an AGI would start acting against humanity’s interests. In the typical superintelligence scenario, we assume that it will do so once it is in a position to achieve what Bostrom calls a Decisive Strategic Advantage (DSA): “a level of technological and other advantages sufficient to enable [an AI] to achieve complete world domination”. After all, if you are capable of achieving superintelligence and a DSA, why act any earlier than that?

Even when dealing with superintelligences, however, the case isn’t quite as clear-cut. Suppose that there are two AGI systems, each potentially capable of achieving a DSA if it prepares for long enough. But the longer each one prepares, the more likely it becomes that the other AGI sets its plans in motion first and achieves an advantage over it. Thus, if several AGI projects exist, each AGI is incentivized to act at the point that maximizes its overall probability of success. It might act even with rather slim chances of succeeding in the takeover, if it thought that waiting longer would make its chances even worse.
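
To make the incentive concrete, here is a purely illustrative toy model; it is not taken from the paper. It assumes that an AGI’s chance of a successful takeover grows with preparation time as `1 - exp(-growth * t)`, and that a rival preempts it at a constant hazard rate, so the chance of not yet having been preempted is `exp(-rival_hazard * t)`. The function names, the functional forms, and the parameter values are all made-up assumptions.

```python
import math


def success_probability(t, growth, rival_hazard):
    """Toy chance that acting at time t succeeds (assumed functional forms)."""
    capability = 1.0 - math.exp(-growth * t)      # improves with preparation
    not_preempted = math.exp(-rival_hazard * t)   # shrinks while a rival prepares
    return capability * not_preempted


def best_action_time(growth, rival_hazard, horizon=100.0, steps=10_000):
    """Grid-search the action time that maximizes the toy success probability."""
    times = [horizon * i / steps for i in range(1, steps + 1)]
    return max(times, key=lambda t: success_probability(t, growth, rival_hazard))


# Hypothetical parameters: slow capability growth, with and without a rival.
alone = best_action_time(growth=0.05, rival_hazard=0.0)
contested = best_action_time(growth=0.05, rival_hazard=0.2)

print("no rival: act at t =", alone)
print("with rival: act at t =", contested)
print("success chance when contested:",
      round(success_probability(contested, 0.05, 0.2), 3))
```

With these invented numbers, the uncontested AGI waits until the end of the horizon, whereas the contested one acts very early (around t = 4.5) with only about an 8% chance of success, because waiting longer would make its chances even worse.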

Indeed, an AGI which defects on its creators may not be going for a world takeover in the first place: it might, for instance, simply be trying to maneuver itself into a position where it can act more autonomously and defeat takeover attempts by other, more powerful AGIs. The threshold for the first treacherous turn could vary quite a bit, depending on the goals and assets of the different AGIs; various considerations are discussed in the paper.

A large reason for analyzing these kinds of scenarios is that, besides caring about existential risks, we also care about catastrophic risks – such as an AGI acting too early and launching a plan which resulted in “merely” hundreds of millions of deaths. My paper introduces the term Major Strategic Advantage, defined as “a level of technological and other advantages sufficient to pose a catastrophic risk to human society”. A catastrophic risk is one that might inflict serious damage to human well-being on a global scale and cause ten million or more fatalities.

“Mere” catastrophic risks could also turn into existential ones, if they contribute to global turbulence (Bostrom et al. 2017), a situation in which existing institutions are challenged, and coordination and long-term planning become more difficult. Global turbulence could then contribute to another out-of-control AI project failing even more catastrophically and causing even more damage.

Summary table and example scenarios

The table below summarizes the various alternatives explored in the paper.

• AI’s level of strategic advantage
    • Decisive
    • Major
• AI’s capability threshold for non-cooperation
    • Very low to very high, depending on various factors
• Sources of AI capability
    • Individual takeoff
        • Hardware overhang
        • Speed explosion
        • Intelligence explosion
    • Collective takeoff
    • Crucial capabilities
        • Biowarfare
        • Cyberwarfare
        • Social manipulation
        • Something else
    • Gradual shift in power
• Ways for the AI to achieve autonomy
    • Escape
        • Social manipulation
        • Technical weakness
    • Voluntarily released
        • Economic or competitive reasons
        • Criminal or terrorist reasons
        • Ethical or philosophical reasons
        • Desperation
        • Confidence
            • In lack of capability
            • In values
    • Confined but effectively in control
• Number of AIs
    • Single
    • Multiple

And here are some example scenarios formed by different combinations of them:

The classic takeover

(Decisive strategic advantage, high capability threshold, intelligence explosion, escaped AI, single AI)

The “classic” AI takeover scenario: an AI is developed, which eventually becomes better at AI design than its programmers. The AI uses this ability to undergo an intelligence explosion, and eventually escapes to the Internet from its confinement. After acquiring sufficient influence and resources in secret, it carries out a strike against humanity, eliminating humanity as a dominant player on Earth so that it can proceed with its own plans unhindered.

The gradual takeover

(Major strategic advantage, high capability threshold, gradual shift in power, released for economic reasons, multiple AIs)

Many corporations, governments, and individuals voluntarily turn over functions to AIs, until we are dependent on AI systems. These are initially narrow-AI systems, but continued upgrades push some of them to the level of having general intelligence. Gradually, they start making all the decisions. We know that letting them run things is risky, but by now a lot of stuff is built around them, it brings a profit, and they’re really good at giving us nice stuff – for the time being.

The wars of the desperate AIs

(Major strategic advantage, low capability threshold, crucial capabilities, escaped AIs, multiple AIs)

Many different actors develop AI systems. Most of these prototypes are unaligned with human values and not yet enormously capable, but many of these AIs reason that some other prototype might be more capable. As a result, they attempt to defect on humanity despite knowing their chances of success to be low, reasoning that they would have an even lower chance of achieving their goals if they did not defect. Society is hit by various out-of-control systems with crucial capabilities that manage to do catastrophic damage before being contained.

Is humanity feeling lucky?

(Decisive strategic advantage, high capability threshold, crucial capabilities, confined but effectively in control, single AI)

Google begins to make decisions about product launches and strategies as guided by their strategic advisor AI. This allows them to become even more powerful and influential than they already are. Nudged by the strategy AI, they start taking increasingly questionable actions that increase their power; they are too powerful for society to put a stop to them. Hard-to-understand code written by the strategy AI detects and subtly sabotages other people’s AI projects, until Google establishes itself as the dominant world power.
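
The four scenarios above are just a handful of points in a much larger space of combinations. As a rough sketch of that disjunctive structure, the snippet below encodes the top-level alternatives as dimensions and checks where the examples fall. The short labels (`classic`, `gradual`, `desperate`, `lucky`), the collapsing of the capability threshold into just "low" and "high", and the mapping of each scenario onto top-level categories are my simplifying assumptions, not something from the paper.

```python
from itertools import product

# Top-level alternatives only; sub-options such as "hardware overhang" are omitted.
DIMENSIONS = {
    "advantage": ["decisive", "major"],
    "threshold": ["low", "high"],  # simplification of "very low to very high"
    "capability_source": [
        "individual takeoff",
        "collective takeoff",
        "crucial capabilities",
        "gradual shift in power",
    ],
    "autonomy_route": [
        "escape",
        "voluntarily released",
        "confined but effectively in control",
    ],
    "number_of_ais": ["single", "multiple"],
}

# The four example scenarios, keyed by hypothetical shorthand labels.
EXAMPLES = {
    "classic": ("decisive", "high", "individual takeoff", "escape", "single"),
    "gradual": ("major", "high", "gradual shift in power",
                "voluntarily released", "multiple"),
    "desperate": ("major", "low", "crucial capabilities", "escape", "multiple"),
    "lucky": ("decisive", "high", "crucial capabilities",
              "confined but effectively in control", "single"),
}

# Every way of picking one option per dimension.
all_combinations = set(product(*DIMENSIONS.values()))


def shared_choices(a, b):
    """Return the dimensions on which two example scenarios agree."""
    return [dim for dim, x, y in zip(DIMENSIONS, EXAMPLES[a], EXAMPLES[b])
            if x == y]


print(len(all_combinations), "combinations;", len(EXAMPLES), "worked examples")
print("classic vs desperate agree on:", shared_choices("classic", "desperate"))
```

Even this coarse encoding yields 96 combinations, of which the post walks through four. The classic takeover and the wars of the desperate AIs share only their route to autonomy, which illustrates how differently a catastrophic outcome could come about.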

This blog post was written as part of work for the Foundational Research Institute.

Originally published at Kaj Sotala. You can comment here or there.