A view to the gallery of my mind


Sunday, March 18th, 2018
10:15 am - Is the Star Trek Federation really incapable of building AI?

In the Star Trek universe, we are told that it’s really hard to make genuine artificial intelligence, and that Data is so special because he’s a rare example of someone having managed to create one.

But this doesn’t seem to be the best hypothesis for explaining the evidence that we’ve actually seen. Consider:

  • In the TOS episode “The Ultimate Computer”, the Federation has managed to build a computer intelligent enough to run the Enterprise on its own, but it goes crazy and Kirk has to talk it into self-destructing.
  • In TNG, we find out that before Data, Doctor Noonian Soong had built Lore, an android with sophisticated emotional processing. However, Lore became essentially evil and had no problems killing people for his own benefit. Data worked better, but in order to get his behavior right, Soong had to initially build him with no emotions at all. (TNG: “Datalore”, “Brothers”)
  • In the TNG episode “Evolution”, Wesley is doing a science project with nanotechnology, accidentally enabling the nanites to become a collective intelligence which almost takes over the ship before the crew manages to negotiate a peaceful solution with them.
  • The holodeck seems entirely capable of running generally intelligent characters, though their behavior is usually restricted to specific roles. However, on occasion they have started straying outside their normal parameters, to the point of attempting to take over the ship. (TNG: “Elementary, Dear Data”) It is also suggested that the computer is capable of running an indefinitely long simulation which is good enough to make an intelligent being believe that it is the real universe. (TNG: “Ship in a Bottle”)
  • The ship’s computer in most of the series seems like it’s potentially quite intelligent, but most of the intelligence isn’t used for anything other than running holographic characters.
  • In the TNG episode “Booby Trap”, a potential way of saving the Enterprise from the Disaster Of The Week would involve turning over control of the ship to the computer: however, the characters are inexplicably super-reluctant to do this.
  • In Voyager, the Emergency Medical Hologram clearly has general intelligence: however, it is only supposed to be used in emergency situations rather than running long-term, its memory starting to degrade after a sufficiently long time of continuous use. The recommended solution is to reset it, removing all of the accumulated memories since its first activation. (VOY: “The Swarm”)

There seems to be a pattern here: if an AI is built to carry out a relatively restricted role, then things work fine. However, once it is given broad autonomy and it gets to do open-ended learning, there’s a very high chance that it gets out of control. The Federation witnessed this for the first time with the Ultimate Computer. Since then, they have been ensuring that all of their AI systems are restricted to narrow tasks or that they’ll only run for a short time in an emergency, to avoid things getting out of hand. Of course, this doesn’t change the fact that your AI having more intelligence is generally useful, so e.g. starship computers are equipped with powerful general intelligence capabilities, which sometimes do get out of hand.

Dr. Soong’s achievement with Data was not in building a general intelligence, but in building a general intelligence which didn’t go crazy. (And before Data, he failed at that task once, with Lore.)

The Federation’s issue with AI is not that they haven’t solved artificial general intelligence. The Federation’s issue is that they haven’t reliably solved the AI alignment problem.

Originally published at Kaj Sotala. You can comment here or there.


Monday, February 12th, 2018
11:33 am - Some conceptual highlights from “Disjunctive Scenarios of Catastrophic AI Risk”

My forthcoming paper, “Disjunctive Scenarios of Catastrophic AI Risk”, attempts to introduce a number of considerations to the analysis of potential risks from Artificial General Intelligence (AGI). As the paper is long and occasionally makes for somewhat dry reading, I thought that I would briefly highlight a few of the key points raised in the paper.

The main idea here is that most of the discussion about risks of AGI has been framed in terms of a scenario that goes something along the lines of “a research group develops AGI, that AGI develops to become superintelligent, escapes from its creators, and takes over the world”. While that is one scenario that could happen, focusing too much on any single scenario makes us more likely to miss alternative scenarios. It also makes the scenarios susceptible to criticism from people who (correctly!) point out that we are postulating very specific scenarios that have lots of burdensome details.

To address that, I discuss here a number of considerations that suggest disjunctive paths to catastrophic outcomes: paths that are of the form “A or B or C could happen, and any one of them happening could have bad consequences”.

Superintelligence versus Crucial Capabilities

Bostrom’s Superintelligence, as well as a number of other sources, basically make the following argument:

  1. An AGI could become superintelligent
  2. Superintelligence would enable the AGI to take over the world

This is an important argument to make and analyze, since superintelligence basically represents an extreme case: if an individual AGI may become as powerful as it gets, how do we prepare for that eventuality? As long as there is a plausible chance for such an extreme case to be realized, it must be taken into account.

However, it is probably a mistake to focus only on the case of superintelligence. Basically, the reason why we are interested in a superintelligence is that, by assumption, it has the cognitive capabilities necessary for a world takeover. But what about an AGI which also had the cognitive capabilities necessary for taking over the world, and only those?

Such an AGI might not count as a superintelligence in the traditional sense, since it would not be superhumanly capable in every domain. Yet, it would still be one that we should be concerned about. If we focus too much on just the superintelligence case, we might miss the emergence of a “dumb” AGI which nevertheless had the crucial capabilities necessary for a world takeover.

That raises the question of what might be such crucial capabilities. I don’t have a comprehensive answer; in my paper, I focus mostly on the kinds of capabilities that could be used to inflict major damage: social manipulation, cyberwarfare, biological warfare. Others no doubt exist.

A possibly useful framing for future investigations might be, “what level of capability would an AGI need to achieve in a crucial capability in order to be dangerous”, where the definition of “dangerous” is free to vary based on how serious of a risk we are concerned about. One complication here is that this is a highly contextual question – with a superintelligence we can assume that the AGI may get basically omnipotent, but such a simplifying assumption won’t help us here. For example, the level of offensive biowarfare capability that would pose a major risk depends on the level of the world’s defensive biowarfare capabilities. Also, we know that it’s possible to inflict enormous damage on humanity even with just human-level intelligence: whoever is authorized to control the arsenal of a nuclear power could trigger World War III, no superhuman smarts needed.

Crucial capabilities are a disjunctive consideration because they show that superintelligence isn’t the only level of capability that would pose a major risk: and there are many different combinations of various capabilities – including ones that we don’t even know about yet – that could pose the same level of danger as superintelligence.

Incidentally, this shows one reason why the common criticism of “superintelligence isn’t something that we need to worry about because intelligence isn’t unidimensional” is misguided – the AGI doesn’t need to be superintelligent in every dimension of intelligence, just the ones we care about.

How would the AGI get free and powerful?

In the prototypical AGI risk scenario, we are assuming that the developers of the AGI want to keep it strictly under control, whereas the AGI itself has a motive to break free. This has led to various discussions about the feasibility of “oracle AI” or “AI confinement” – ways to restrict the AGI’s ability to act freely in the world, while still making use of it. This also means that the AGI might have a hard time acquiring the resources that it needs for a world takeover, since it either has to do so while it is under constant supervision by its creators, or while on the run from them.

However, there are also alternative scenarios where the AGI’s creators voluntarily let it free – or even place it in control of e.g. a major corporation, free to use that corporation’s resources as it desires! My chapter discusses several ways by which this could happen: i) economic benefit or competitive pressure, ii) criminal or terrorist reasons, iii) ethical or philosophical reasons, iv) confidence in the AI’s safety, as well as v) desperate circumstances such as being otherwise close to death. See the chapter for more details on each of these. Furthermore, the AGI could remain theoretically confined but be practically in control anyway – such as in a situation where it was officially only giving a corporation advice, but its advice had never been wrong before and nobody wanted to risk their jobs by going against the advice.

Would the Treacherous Turn involve a Decisive Strategic Advantage?

Looking at crucial capabilities in a more fine-grained manner also raises the question of when an AGI would start acting against humanity’s interests. In the typical superintelligence scenario, we assume that it will do so once it is in a position to achieve what Bostrom calls a Decisive Strategic Advantage (DSA): “a level of technological and other advantages sufficient to enable [an AI] to achieve complete world domination”. After all, if you are capable of achieving superintelligence and a DSA, why act any earlier than that?

Even when dealing with superintelligences, however, the case isn’t quite as clear-cut. Suppose that there are two AGI systems, each potentially capable of achieving a DSA if they prepare for long enough. But the longer that they prepare, the more likely it becomes that the other AGI sets its plans in motion first, and achieves an advantage over the other. Thus, if several AGI projects exist, each AGI is incentivized to take action at such a point which maximizes its overall probability of success – even if the AGI only had rather slim chances of succeeding in the takeover, if it thought that waiting for longer would make its chances even worse.

Indeed, an AGI which defects on its creators may not be going for a world takeover in the first place: it might, for instance, simply be trying to maneuver itself into a position where it can act more autonomously and defeat takeover attempts by other, more powerful AGIs. The threshold for the first treacherous turn could vary quite a bit, depending on the goals and assets of the different AGIs; various considerations are discussed in the paper.

A large reason for analyzing these kinds of scenarios is that, besides caring about existential risks, we also care about catastrophic risks – such as an AGI acting too early and launching a plan which resulted in “merely” hundreds of millions of deaths. My paper introduces the term Major Strategic Advantage, defined as “a level of technological and other advantages sufficient to pose a catastrophic risk to human society”. A catastrophic risk is one that might inflict serious damage to human well-being on a global scale and cause ten million or more fatalities.

“Mere” catastrophic risks could also turn into existential ones, if they contribute to global turbulence (Bostrom et al. 2017), a situation in which existing institutions are challenged, and coordination and long-term planning become more difficult. Global turbulence could then contribute to another out-of-control AI project failing even more catastrophically and causing even more damage.

Summary table and example scenarios

The table below summarizes the various alternatives explored in the paper.

AI’s level of strategic advantage
  • Decisive
  • Major
AI’s capability threshold for non-cooperation
  • Very low to very high, depending on various factors
Sources of AI capability
  • Individual takeoff
    • Hardware overhang
    • Speed explosion
    • Intelligence explosion
  • Collective takeoff
  • Crucial capabilities
    • Biowarfare
    • Cyberwarfare
    • Social manipulation
    • Something else
  • Gradual shift in power
Ways for the AI to achieve autonomy
  • Escape
    • Social manipulation
    • Technical weakness
  • Voluntarily released
    • Economic or competitive reasons
    • Criminal or terrorist reasons
    • Ethical or philosophical reasons
    • Desperation
    • Confidence
      • in lack of capability
      • in values
  • Confined but effectively in control
Number of AIs
  • Single
  • Multiple

And here are some example scenarios formed by different combinations of them:

The classic takeover

(Decisive strategic advantage, high capability threshold, intelligence explosion, escaped AI, single AI)

The “classic” AI takeover scenario: an AI is developed, which eventually becomes better at AI design than its programmers. The AI uses this ability to undergo an intelligence explosion, and eventually escapes to the Internet from its confinement. After acquiring sufficient influence and resources in secret, it carries out a strike against humanity, eliminating humanity as a dominant player on Earth so that it can proceed with its own plans unhindered.

The gradual takeover

(Major strategic advantage, high capability threshold, gradual shift in power, released for economic reasons, multiple AIs)

Many corporations, governments, and individuals voluntarily turn over functions to AIs, until we are dependent on AI systems. These are initially narrow-AI systems, but continued upgrades push some of them to the level of having general intelligence. Gradually, they start making all the decisions. We know that letting them run things is risky, but now a lot of stuff is built around them, it brings a profit and they’re really good at giving us nice stuff – for the time being.

The wars of the desperate AIs

(Major strategic advantage, low capability threshold, crucial capabilities, escaped AIs, multiple AIs)

Many different actors develop AI systems. Most of these prototypes are unaligned with human values and not yet enormously capable, but many of these AIs reason that some other prototype might be more capable. As a result, they attempt to defect on humanity despite knowing their chances of success to be low, reasoning that they would have an even lower chance of achieving their goals if they did not defect. Society is hit by various out-of-control systems with crucial capabilities that manage to do catastrophic damage before being contained.

Is humanity feeling lucky?

(Decisive strategic advantage, high capability threshold, crucial capabilities, confined but effectively in control, single AI)

Google begins to make decisions about product launches and strategies as guided by their strategic advisor AI. This allows them to become even more powerful and influential than they already are. Nudged by the strategy AI, they start taking increasingly questionable actions that increase their power; they are too powerful for society to put a stop to them. Hard-to-understand code written by the strategy AI detects and subtly sabotages other people’s AI projects, until Google establishes itself as the dominant world power.

This blog post was written as part of work for the Foundational Research Institute.


Thursday, January 25th, 2018
7:25 pm - On not getting swept away by mental content

There’s a specific subskill of meditation that I call “not getting swept away by the content”, that I think is generally valuable.

It goes like this. You sit down to meditate and focus on your breath or whatever, and then a worrying thought comes to your mind. And it’s a real worry, something important. And you are tempted to start thinking about it and pondering it and getting totally distracted from your meditation… because this is something that you should probably be thinking about, at some point.

So there’s a mental motion that you make, where you note that you are getting distracted by the content of a thought. The worry, even if valid, is content. If you start thinking about whether you should be engaging with the worry, those thoughts are also content.

And you are meditating, meaning that this is the time when you shouldn’t be focusing on content. Anything that is content, you dismiss, without examining what that content is.

So you dismiss the worry. It was real and important, but it was content, so you are not going to think about it now.

You feel happy about having dismissed the content, and you start thinking about how good of a meditator you are, and… realize that this, too, is a thought that you are getting distracted by.

So you dismiss that thought, too. Doesn’t matter what the content of the thought is, now is not the time.

And then you keep letting go of thoughts that come to your mind, but that doesn’t seem to do anything and you start to wonder whether you are doing this meditation thing right… and aha, that’s content too. So you dismiss that…

The thing that is going on here is that when you experience a distracting thought and want to get rid of it, you often start engaging in an evaluation process of whether that thought should be dismissed or not. By doing so, you may end up engaging with the thought’s own internal logic – which might be totally wrong for the situation.

Yes, maybe your relationship is in tatters and your partner is about to leave you. And maybe there are things that you can do to avoid that fate. Or maybe there are not. But if you try to dismiss the thought by disputing the truth or importance of those things, you will fail. Because they are true and important.

The way to short-circuit that is to move the evaluation a meta-level up and just decide that whatever is content gets dismissed on that basis. Doesn’t matter if it’s true. It’s content, so not what you are doing now. You avoid getting entangled with the thought’s internal logic, because you never engage with the internal logic in the first place.

Having this mental motion available to you is also useful outside meditation, if you are prone to having any other thoughts that aren’t actually useful.

As I write this, I’m sitting at a food place, eating the food and watching the traffic outside. And, like I often am, I am bothered by pessimistic thoughts about the future of humanity, and all the different disasters that could befall the world.

Yeah, I could live to see the day when AIs destroy the world, or worse.

That’s true.

That’s also content. I’m not going to engage with that content right now.

Hmm.

I look outside the window, watch cars pass by, and finish my dinner.

The food is tasty.


Thursday, January 4th, 2018
9:53 am - Papers for 2017

I had three new papers either published or accepted into publication last year; all of them are now available online:

  • How Feasible is the Rapid Development of Artificial Superintelligence? Physica Scripta 92 (11), 113001.
    • Abstract: What kinds of fundamental limits are there in how capable artificial intelligence (AI) systems might become? Two questions in particular are of interest: 1) How much more capable could AI become relative to humans, and 2) how easily could superhuman capability be acquired? To answer these questions, we will consider the literature on human expertise and intelligence, discuss its relevance for AI, and consider how AI could improve on humans in two major aspects of thought and expertise, namely simulation and pattern recognition. We find that although there are very real limits to prediction, it seems like AI could still substantially improve on human intelligence.
    • Links: published version (paywalled), free preprint.
  • Disjunctive Scenarios of Catastrophic AI Risk. AI Safety and Security (Roman Yampolskiy, ed.), CRC Press. Forthcoming.
    • Abstract: Artificial intelligence (AI) safety work requires an understanding of what could cause AI to become unsafe. This chapter seeks to provide a broad look at the various ways in which the development of AI sophisticated enough to have general intelligence could lead to it becoming powerful enough to cause a catastrophe. In particular, the present chapter seeks to focus on the way that various risks are disjunctive – on how there are multiple different ways by which things could go wrong, any one of which could lead to disaster. We cover different levels of a strategic advantage an AI might acquire, alternatives for the point where an AI might decide to turn against humanity, different routes by which an AI might become dangerously capable, ways by which the AI might acquire autonomy, and scenarios with varying number of AIs. Whereas previous work has focused on risks specifically only from superintelligent AI, this chapter also discusses crucial capabilities that could lead to catastrophic risk and which could emerge anywhere on the path from near-term “narrow AI” to full-blown superintelligence.
    • Links: free preprint.
  • Superintelligence as a Cause or Cure for Risks of Astronomical Suffering. Informatica 41 (4).
    • (with Lukas Gloor)
    • Abstract: Discussions about the possible consequences of creating superintelligence have included the possibility of existential risk, often understood mainly as the risk of human extinction. We argue that suffering risks (s-risks), where an adverse outcome would bring about severe suffering on an astronomical scale, are risks of a comparable severity and probability as risks of extinction. Preventing them is the common interest of many different value systems. Furthermore, we argue that in the same way as superintelligent AI both contributes to existential risk but can also help prevent it, superintelligent AI can both be a suffering risk or help avoid it. Some types of work aimed at making superintelligent AI safe will also help prevent suffering risks, and there may also be a class of safeguards for AI that helps specifically against s-risks.
    • Links: published version (open access).

In addition, my old paper Responses to Catastrophic AGI Risk (w/ Roman Yampolskiy) was republished, with some minor edits, as the book chapters “Risks of the Journey to the Singularity” and “Responses to the Journey to the Singularity”, in The Technological Singularity: Managing the Journey (Victor Callaghan et al., eds.), Springer-Verlag.


Friday, December 8th, 2017
2:10 pm - Fixing science via a basic income

I ran across Ed Hagen’s article “Academic success is either a crapshoot or a scam”, which pointed out that all the methodological discussion about science’s replication crisis is kinda missing the point: yes, all of the methodological stuff like p-hacking is something that would be valuable to fix, but the real problem is in the incentives created by the crazy publish-or-perish culture:

In my field of anthropology, the minimum acceptable number of pubs per year for a researcher with aspirations for tenure and promotion is about three. This means that, each year, I must discover three important new things about the world. […]

Let’s say I choose to run 3 studies that each has a 50% chance of getting a sexy result. If I run 3 great studies, mother nature will reward me with 3 sexy results only 12.5% of the time. I would have to run 9 studies to have about a 90% chance that at least 3 would be sexy enough to publish in a prestigious journal.

I do not have the time or money to run 9 new studies every year.

I could instead choose to investigate phenomena that are more likely to yield strong positive results. If I choose to investigate phenomena that are 75% likely to yield such results, for instance, I would only have to run about 5 studies (still too many) for mother nature to usually grace me with at least 3 positive results. But then I run the risk that these results will seem obvious, and not sexy enough to publish in prestigious journals.

To put things in deliberately provocative terms, empirical social scientists with lots of pubs in prestigious journals are either very lucky, or they are p-hacking.

I don’t really blame the p-hackers. By tying academic success to high-profile publications, which, in turn, require sexy results, we academic researchers have put our fates in the hands of a fickle mother nature. Academic success is therefore either a crapshoot or, since few of us are willing to subject the success or failure of our careers to the roll of the dice, a scam.
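The percentages in the quoted passage can be checked with a short binomial calculation. What follows is only a minimal sketch, assuming that every study succeeds independently and with the same probability; the function name `prob_at_least` is made up for illustration and does not come from Hagen's article.

```python
from math import comb


def prob_at_least(k, n, p):
    """Probability of at least k successes in n independent trials,
    each succeeding with probability p (a plain binomial tail sum)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Three studies at 50% each, all three must succeed.
print(f"3 of 3 at p=0.50:  {prob_at_least(3, 3, 0.5):.3f}")
# Nine studies at 50% each, at least three must succeed.
print(f">=3 of 9 at p=0.50: {prob_at_least(3, 9, 0.5):.3f}")
# Five studies at 75% each, at least three must succeed.
print(f">=3 of 5 at p=0.75: {prob_at_least(3, 5, 0.75):.3f}")
```

Under these assumptions the three cases come out at roughly 12.5%, 91%, and 90%, which matches the figures in the quote.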

The article then suggests that the solution would be to have better standards for research, and also blames prestigious journal publishers for exploiting their monopoly on the field. I think that looking at the researcher incentives is indeed the correct thing to do here, but I’m not sure the article goes deep enough with it. Mainly, it doesn’t ask the obvious question of why researchers have such a crazy pressure to publish: it’s not the journals that set the requirements for promotion or getting to the tenure track, that’s the universities and research institutions. The journals are just exploiting a lucrative situation that someone else created.

Rather, my understanding is that the real problem is that there are simply too many PhD graduates who want to do research, relative to the number of researcher positions available. It’s a basic fact of skill measurement that if you try to measure skill and then pick people based on how well they performed on your measure, you’re actually selecting for skill + luck rather than pure skill. If the number of people you pick is small enough relative to the number of applicants, anyone you pick has to be both highly skilled and highly lucky; simply being highly skilled isn’t enough to make it to the top. This is the situation we have with current science, and as Hagen points out, it leads to rampant cheating when people realize that they have to cheat in order to make the cut. As long as this is the situation, there will remain an incentive to cheat.
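The skill + luck point can be illustrated with a toy simulation. This is a rough sketch under loudly simplified assumptions: skill and luck are independent standard normal draws, the measured score is just their sum, and the applicant and position counts are arbitrary. None of these numbers or names come from any real data.

```python
import random
import statistics


def simulate_selection(n_applicants=10_000, n_picked=100, seed=0):
    """Pick the top scorers when score = skill + luck.

    Assumption (illustrative only): skill and luck are independent
    standard normal variables with equal weight in the score.
    Returns the mean skill and mean luck of the picked applicants.
    """
    rng = random.Random(seed)
    applicants = []
    for _ in range(n_applicants):
        skill = rng.gauss(0, 1)
        luck = rng.gauss(0, 1)
        applicants.append((skill + luck, skill, luck))

    # Sort by measured score, highest first, and keep the top slice.
    applicants.sort(reverse=True)
    picked = applicants[:n_picked]

    mean_skill = statistics.mean(a[1] for a in picked)
    mean_luck = statistics.mean(a[2] for a in picked)
    return mean_skill, mean_luck


mean_skill, mean_luck = simulate_selection()
print(f"mean skill of those picked: {mean_skill:.2f}")
print(f"mean luck of those picked:  {mean_luck:.2f}")
```

In this toy setup the population averages for both skill and luck are zero, so seeing both averages well above zero among the picked group is the effect described above: with few positions and many applicants, the people who make the cut tend to be both skilled and lucky.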

This looks hard to fix; two obvious solutions would be to reduce the number of graduate students or to massively increase the number of research jobs. The first is politically challenging, especially since it would require international coordination and lots of nations view the number of graduating PhDs as a status symbol. The second would be expensive and thus also politically challenging. One thing that some of my friends also suggested was some kind of a researchers’ basic income (or just a universal basic income in general); for fields in which doing research isn’t much more expensive than covering the researchers’ cost of living, a lot of folks would probably be happy to do research just on the basic income.

A specific suggestion that was thrown out was to give some number of post-docs a 10-year grant of 2000 euros/month; depending on the exact number of grants given out, this could fund quite a number of researchers while still being cheap in comparison to any given country’s general research and education expenses. Better-paid and more prestigious formal research positions like university professorships would still exist as an incentive to actually do the research, and historically quite a lot of research has been done by people with no financial incentive for it anyway (Einstein doing his research on the side while working at the patent office maybe being the most famous example); the fact that most researchers are motivated by the pure desire to do science is already shown by the fact that anyone at all decides to go to academia today. A country being generous in handing out these kinds of grants also has the potential to be made into an international status symbol, creating the incentive to actually do this. Alternatively, this could just be viewed as yet another reason to just push for a universal basic income for everyone.

EDIT: Jouni Sirén made the following interesting comment in response to this article: “I think the root issue goes deeper than that. There are too many PhD graduates who want to do research, because money and prestige are insufficient incentives for a large part of the middle class. Too many people want a job that is interesting or meaningful, and nobody is willing to support all of them financially.” That’s an even deeper reason than the one I was thinking of!


Monday, December 4th, 2017
1:02 pm - Book review: The Upside of Your Dark Side: Why Being Your Whole Self–Not Just Your “Good

The Upside of Your Dark Side: Why Being Your Whole Self–Not Just Your “Good” Self–Drives Success and Fulfillment. By Todd Kashdan & Robert Biswas-Diener. Avery, 2014.

This book was written by a pair of psychologists who thought that the excessive focus on good and positive feelings in positive psychology was a little overblown, and that the value of so-called “negative” feelings or aspects of personality was being neglected. They do think that it’s good for us to be happy most of the time, but that it will be even better for us if we have a flexibility that allows us to switch to non-happy states of mind when it’s beneficial. They suggest an 80:20 ratio as a rough rule of thumb: be happy 80% of the time and non-happy 20% of the time. They call this philosophy “wholeness”: a person is whole if they are able to flexibly tap into all aspects of their being when it’s warranted.

The authors offer a number of examples about the value of so-called negative states. Too much comfort makes us oversensitive to inevitable discomfort. Anger motivates us to act, fix injustices, and defend ourselves and our loved ones; guilt tells us when we’ve screwed up and motivates us to improve our behavior; anxiety helps us catch mistakes and take safeguards against risks. Happy people are less persuasive, can be too trusting, and are lazier thinkers. Intentionally trying to become happy easily backfires and makes us less happy; and there are situations where happiness feels inappropriate and will make others respond worse to you. Sometimes it’s better to act on instinct or engage in mind-wandering than to always be mindful and think things through consciously. The “dark triad” traits of narcissism, Machiavellianism, and psychopathy are all useful in moderation and provide benefits such as fearlessness and self-assuredness.

The following paragraph from the final chapter is a pretty good summary of the book’s message:

The basic idea is that psychological states are instrumental. That is, they are useful for a specific purpose, such as finding your car keys, being physically safe in a parking garage, negotiating a business deal, or arguing with your child’s teacher. Rather than viewing your thoughts and feelings as reactions to external events, we argue that you ought to view these states as tools to be used as circumstances warrant. Simply put, quit labeling your inner states as good or bad or positive or negative, and start thinking of them as useful or not useful for any given situation.

While I liked the book’s message and agreed with many of its points, I felt like it was mostly trying to tell a story that sounds plausible to a layman, rather than making a particularly rigorous argument. The authors tend to base their claims on isolated studies with no mention of their replication status; some of their example studies draw on paradigms and methods that have been seriously challenged (social priming and implicit association tests); occasionally they make claims that I thought contradicted things I knew from elsewhere; and some of the cited empirical results seem to have alternative interpretations that are more natural than the ones offered in the book. It’s plausible that they are drawing on much more rigorous academic work and that the argument has been dumbed down for a popular audience, but even granting them the benefit of the doubt, the book still feels way too much like a collection of examples that have been cherry-picked to make the desired points.

Regardless, the book’s general message feels almost certainly correct – after all, why would we have evolved negative states if they weren’t sometimes useful? – so if anyone feels like they’ve been overwhelmed with too many messages of positivity, I would recommend this book for inspiration and an alternative viewpoint, if not for any of its specific details.

Originally published at Kaj Sotala. You can comment here or there.


Monday, November 6th, 2017
1:05 pm - Meditation and mental space

One effect that I often notice when I manage to resume my meditation practice after an interruption is an increase in a kind of mental resilience.

That is, when I have a lower resilience, feeling bad for any reason feels much more like an emergency. It’s something that forces itself into my consciousness, takes over, and refuses to go away. I would like to ignore it, but I can’t; as long as it’s there, it’s hard to think of anything else.

When my resilience is higher, it’s like my mind has more room for thoughts and emotions. Something might be making me feel bad, but something else might also be making me feel good, and there’s space for those two to intermingle. It becomes much easier to accept that I’m feeling a little bad, but I don’t need to do anything about it. I can just go on and do something else, and the nasty feeling might go away on its own – or if it doesn’t, that’s fine too.

Interestingly, being on antidepressants can also give me a similar effect.

Of course, in itself this kind of an effect isn’t too surprising, given that it’s one of the explicit goals of the practice. Culadasa’s The Mind Illuminated notes that two of the goals of mindfulness practice are an increase in the amount of “conscious power” (roughly, the amount of things that can be consciously processed at a time), as well as learning to more intentionally shift the focus of attention, so that it won’t just automatically go to the most painful or pleasant thing and become preoccupied with that, but can rather be controlled in a more useful manner. Still, it’s nice to see that the practice is bearing fruit.


Tuesday, October 17th, 2017
10:19 am - Anti-tribalism and positive mental health as high-value cause areas

I think that tribalism is one of the biggest problems with humanity today, and that even small reductions of it could cause a massive boost to well-being.

By tribalism, I basically mean the phenomenon where arguments and actions are primarily evaluated based on who makes them and which group they seem to support, not anything else. E.g. if a group thinks that X is bad, then it’s often seen as outright immoral to make an argument which would imply that X isn’t quite as bad, or that some things which are classified as X would be more correctly classified as non-X instead. I don’t want to give any specific examples so as to not derail the discussion, but hopefully everyone can think of some; the article “Can Democracy Survive Tribalism” lists a lot of them, picked from various sides of the political spectrum.

Joshua Greene (among others) makes the argument, in his book Moral Tribes, that tribalism exists for the purpose of coordinating aggression and alliances against other groups (so that you can kill them and take their stuff, basically). It specifically exists for the purpose of making you hurt others, as well as defend yourself against people who would hurt you. And while defending yourself against people who would hurt you is clearly good, attacking others is clearly not. And everything being viewed in tribal terms means that we can’t make much progress on things that actually matter: as someone commented, “people are fine with randomized controlled trials in policy, as long as the trials are on things that nobody cares about”.

Given how deep tribalism sits in the human psyche, it seems unlikely that we’ll be getting rid of it anytime soon. That said, there do seem to be a number of things that affect the amount of tribalism we have:

* As Steven Pinker argues in The Better Angels of Our Nature, violence in general has declined over historical time, replaced by more cooperation and an assumption of human rights; Democrats and Republicans may still hate each other, but they generally agree that they still shouldn’t be killing each other.
* As a purely anecdotal observation, I get the feeling that people on the autism spectrum tend to be less tribal, up to the point of not being able to perceive tribes at all. (This suggests, somewhat oddly, that the world would actually be a better place if everyone were slightly autistic.)
* Feelings of safety or threat seem to play a lot into feelings of tribalism: if you perceive (correctly or incorrectly) that a group Y is out to get you and that they are a real threat to you, then you will react much more aggressively to any claims that might be read as supporting Y. Conversely, if you feel safe and secure, then you are much less likely to feel the need to attack others.

The last point is especially troublesome, since it can give rise to self-fulfilling predictions. Say that Alice says something to Bob, and Bob misperceives this as an insult; Bob feels threatened so snaps at Alice, and now Alice feels threatened as well, so shouts back. The same kind of phenomenon seems to be going on at a much larger scale: whenever someone perceives a threat, they are no longer willing to give someone the benefit of the doubt, and would rather treat the other person as an enemy. (Which isn’t too surprising, since it makes evolutionary sense: if someone is out to get you, then the cost of misclassifying them as a friend is much bigger than the cost of misclassifying a would-be friend as an enemy. You can always find new friends, but it only takes one person to get near you and hurt you really badly.)

One implication might be that general mental health work, not only in the conventional sense of “healing disorders”, but also the positive psychology-style mental health work that actively seeks to make people happy rather than just fine, could be even more valuable for society than we’ve previously thought. Curing depression etc. would be enormously valuable even by itself, but if we could figure out how to make people generally happier and more resilient to negative events, then fewer things would threaten their well-being and they would perceive fewer things as being threats, reducing tribalism.


Saturday, October 14th, 2017
11:29 am - You can never be universally inclusive

A discussion about the article “We Don’t Do That Here” (h/t siderea) raised the question about the tension between having inclusive social norms on the one hand, and restricting some behaviors on the other hand.

At least, that was the way the discussion was initially framed. The thing is, inclusivity is a bit of a bad term, since you can never really be universally inclusive. Accepting some behaviors is going to attract people who like engaging in those behaviors while repelling people who don’t like those behaviors; and vice versa for disallowing them.

Of course, you can still create spaces that are more inclusive than others, in being comfortable to a broader spectrum of people. But the way you do that is by disallowing behaviors that would, if allowed, repel more people than the act of disallowing them does.

If you use your social power to shut up people who would otherwise be loudly racist and homophobic and who then leave because they don’t want to be in a place where those kinds of behaviors aren’t allowed, then that would fit the common definition of “inclusive space” pretty well.

That said, the “excluding racists and homophobes” thing may make it sound like you’re only excluding “bad” people, which isn’t the case either. Every set of rules (including having no rules in the first place) is going to repel some completely decent people.

Like, maybe you decide to try to make a space more inclusive by having a rule like “no discussing religion or politics”. This may make the space more inclusive towards people of all kinds of religions and political backgrounds, since there is less of a risk of anyone feeling unwelcome when everyone else turns out to disagree with their beliefs.

But at the same time, you are making the space less inclusive towards people who are perfectly reasonable and respectful people, but who would like to discuss religion or politics. As well as to people who aren’t so good at self-regulation and will feel uncomfortable about having to keep a constant eye on themselves to avoid saying the wrong things.

And maybe these people would feel more comfortable at a different event with different rules, which was more inclusive towards them. Which is fine. Competing access needs:

Competing access needs is the idea that some people, in order to be able to participate in a community, need one thing, and other people need a conflicting thing, and instead of figuring out which need is ‘real’ we have to acknowledge that we can’t accommodate all valid needs. I originally encountered it in disability community conversations: for example, one person might need a space where they can verbally stim, and another person might need a space where there’s never multiple people talking at once. Both of these are valid, but you can’t accommodate them both in the same space.

I wrote a while ago that I think this concept extends to a lot of activist/social justice community challenges and a lot of the difficulty of designing good messages. For example, body positivity: some people need to hear “love your body! no matter who you are you are soooo sexy” and some people really hate being told that they’re ‘sexy’. Or some gay people might need a space where it’s against the rules to ask “well, what if it actually is morally wrong to be gay?” but other gay people (like me of a few years ago) might need a space where they can ask that so there can be a serious discussion and they can become convinced that they’re okay.

Every set of rules is going to be bad for someone, so a better question than “how to make this space inclusive” is “who do we want to make this space inclusive towards”. You’re always going to exclude some people who aren’t jerks or bad people, but would just prefer a different set of rules. And you just have to accept that.

See also: The Unit of Caring on Safe Spaces and Competing Access Needs.


Saturday, October 7th, 2017
5:30 pm - What are your plans for the evening of the apocalypse?

If everyone found out for sure that the world would end in five years, what would happen?

My guess is that it would take time before anything big happened. Finding out about the end of the world, that’s the kind of a thing that you need to digest for a while. For the first couple of days, people might go “huh”, and then carry on with their old routines while thinking about it.

A few months later, maybe there still wouldn’t be all that much change. Sure, people would adjust their life plans, start thinking more near-term, some would decide not to go to college after all. But a lot of people already don’t plan much beyond a couple of years; five years is a long time, and you’ll still need to pay your bills until the Apocalypse hits. So many people might just carry on with their jobs as normal; if they were already doing college, well, you need to pass the time until the end of the world somehow. Might as well keep studying.

Of course, some people would have bigger reactions, right from day one. Quit their unsatisfying job, that kind of thing. People with a lot of savings might choose this moment to start living off them. And as the end of the world got closer and closer, people might get an increasingly relaxed attitude to work; though there might also be a feeling of, we’re all in this together, let’s make our existing institutions work until the end. I could imagine doctors and nurses in a hospital, who had decided that they want to make sure the hospital runs for as long as it can, and that nobody has to die before they really have to.

But I could also imagine, say, the waiter at some restaurant carrying on, serving customers even on the night of the apocalypse. (Be sure to make a reservation, we expect to have no free tables that evening.) Maybe out of principles, maybe out of professional pride, but maybe just out of habit.

I’m guessing there would be gradual changes to society, with occasional tipping points when a lot of people decided to stop whatever they had been doing and that created a chain reaction of others doing so as well. But it seems really hard to guess for how long things would remain mostly normal.


Thursday, October 5th, 2017
11:24 am - Meaningfulness and the scope of experience

The extent to which I find life meaningful seems strongly influenced by my scope of experience [1, 2].

Say that I have a day off, and there’s nothing in particular that I need to get done or think about. This makes it easy for the spatial scope of my experience to become close. My attention is most strongly drawn to the sensations of my body, nearby sounds, tempting nearby things like the warmth of the shower that I could take, the taste of a cup of tea that I could prepare, or the pleasant muscle fatigue that I’d get if I went jogging in the nearby forest. The temporal scope of my experience is close as well; these are all things that are nearby in time, in that I could do them within a few minutes of having decided to do them.

Say that I don’t have a day off, and I’m trying to focus on work. My employer’s website says that our research focuses on reducing risks of dystopian futures in the context of emerging technologies; this is a pretty accurate description of what I try to do. Our focus is on really large-scale stuff, including the consequences of eventual space colonization; this requires thinking in the scale of galaxies, a vast spatial scope. And we are also trying to figure out whether there is anything we can do to meaningfully influence the far future, including hundreds if not thousands of years from now; that means taking a vast temporal scope.

It is perhaps no surprise that it is much easier to feel that things are meaningful when the scope of my experience is close, than when it is far.


My favorite theory of meaning actually comes from a slightly surprising direction: the literature on game design and analysis. In Rules of Play: Game Design Fundamentals, Katie Salen and Eric Zimmerman define meaningful play in a game as emerging when the relationships between actions and outcomes are both discernable and integrated into the larger context of the game. In other words:

The consequences of your actions in a game have to be discernable: you need to have some idea of what happened as a result of your actions. If you shoot at an opponent and the opponent dies, that’s pretty clear and discernable. If you press a button and a number changes but you have no idea of what that number means or why it’s relevant, that’s neither clear nor discernable. If you don’t know what happens as a result of your actions, you might as well be randomly pressing buttons or throwing down cards.

The consequences of your actions have to be integrated into the larger context of the game: they need to affect the game experience at some later point in the game. If you move a piece in a game of chess, then that move will directly shape the whole rest of the game, making the moves deeply integrated. But if every game of chess included three opening moves after which the board was reset to the initial position, throwing away everything that happened during those three moves, then those moves would not be integrated into the gameplay. People would just make some moves at random as fast as possible, to get on with the actual opening moves of the game.

As Salen & Zimmerman write: “Whereas discernability of game events tells players what happened (I hit the monster), integration lets players know how it will affect the rest of the game (If I keep on hitting the monster I will kill it. If I kill enough monsters, I’ll gain a level.).”

My own model is that regardless of whether we’re playing a game or living our ordinary lives, our minds will automatically keep looking for actions whose outcomes are discernable and integrated, relative to the current scope of experience.

When the scope is close, it is easy to find such actions. Taking a shower, making a cup of tea, going out for a jog; the consequences of these actions will manifest as concrete and enjoyable bodily sensations, clearly discernable both within the temporal and spatial scope. And because the scope is so close, almost everything I do will affect the whole scope, so it will feel tightly integrated. I imagine getting a taste of tea, and think no farther out in time; thus, getting up from bed, going to the kitchen, preparing the tea, and sitting down to drink it, feels like a tight chain of actions where each step gives rise to the next, culminating in the warmth of the tea cup pressing against my lips, the sensation of taste on my tongue.

When the scope is far, it is much different. What action could one even think of, whose consequences were discernable on a scale spanning entire galaxies? Or whose consequences could be traced out for tens, maybe hundreds of years? It’s hard to imagine anything. An intellectual analysis may suggest things that could plausibly result as a consequence of our actions, but unless one can really visualize those and translate them into emotional terms, it’s still going to feel hard to connect them to the small-scale things happening in our daily lives.

I find that my mind will automatically look for objectives that make sense within a given scope. When my scope is relatively close, things like finding a romantic partner and maybe having children feel strongly appealing; they would have a strong impact within the entire scope. When my scope gets more remote, such things seem to lose their appeal: what is one more family going to matter? It is highly unlikely to change the course of history. Better to ignore those things, as my chances to make a lasting impact are remote already; better to concentrate on finding something that would inch those chances ever-so-slightly upwards.


The naive implication of this would be, “keep your scopes close, and you’ll be happy”. But of course, it’s not that simple.

For one, most of us can’t just focus on small pleasures and not worry at all about things like earning an income, what we’ll be doing next year, or whatever it is that we need to think about at work. The necessities of everyday life force us to think long term and in a larger context, which forces us to attend to a broader scope of experience.

Even if we did have the opportunity to keep our scope small, the effect would be to make ourselves happy by ignoring everything else that’s going on in the world. It’s easy to be happy if you can just ignore the suffering of your neighbor; a small scope easily gets very self-centered. (Of course, a large scope can be centered on the self as well; it’s just that it’s big enough to also contain other beings, regardless of who happens to be in the center.)

Even if you widen your scope to contain family and friends, that scope will only contain a small fraction of everybody who exists. You don’t necessarily want to only think about what happens to those you personally have reason to care about, if it means neglecting the well-being of everyone else.

Of course, it also makes no sense to burden yourself with things that you realistically can’t affect. Better to exclude those from your scope.

Except… how certain are you of not being able to affect them? The only thing that guarantees that you can’t knowingly affect somebody, is if you make the decision to never think about them. If you do keep them in your scope, even only occasionally, then you might come up with something that lets you help them after all.

So the right thing is not to stick with a certain scope, but to learn to adjust the scope as needed. Draw it closer when you are feeling overwhelmed, or when you are at risk of neglecting yourself or your loved ones; broaden it out when you have the resources to deal with the larger scope and its demands. When you are operating in a larger scope, see if you can find ways to visualize your impact in a way that makes your current actions feel more integrated into the whole context, so as to experience their meaningfulness.

It’s easier said than done.

Exercise: see if you can consciously manipulate the scope of your experience. Try pulling it close, both spatially and temporally: focus only on your immediate surroundings and let your attention be drawn to things that you could be doing right away. Then try gradually expanding the scope, maybe all the way up to the level of galaxies and multiverses and millions of years, but also stopping at more intermediate points: e.g. your own life in a few years, or your country or your planet in the same time. How do those changes make a difference to what you feel, and what you feel like doing?


Saturday, September 23rd, 2017
10:32 am - Nobody does the thing that they are supposedly doing

I feel like one of the most important lessons I’ve had about How the World Works, which has taken quite a bit of time to sink in, is:

In general, neither organizations nor individual people do the thing that their supposed role says they should do. Rather they tend to do the things that align with their incentives (which may sometimes be economic, but even more often they are social and psychological). If you want to really change things, you have to change people’s incentives.

But I feel like I’ve had to gradually piece this together from a variety of places, over a long time; I’ve never read anything that would have laid down the whole picture. I remember that Freakonomics had a few chapters about how incentives cause unexpected behavior, but that was mostly about economic incentives, which are just a small part of the whole picture. And it didn’t really focus on the “nothing in the world works the way you’d naively expect” thing; as I recall, it was presented more as a curiosity.

On the other hand, Robin Hanson has had a lot of stuff about “X is not about Y“, but that has mostly been framed in terms of prestige and signaling, which is the kind of stuff that’s certainly an important part of the whole picture (the psychological kind of incentives), but again just a part of the picture. (However, his upcoming book goes into a lot more detail on why and how the publicly-stated motives for human or organizational behavior aren’t actually the true motives.)

And then in social/evolutionary/moral psychology there’s a bunch of stuff about social-psychological incentives, of how we’re motivated to denounce outgroups and form bonds with our ingroups; and how it can be socially costly to have accurate beliefs about outgroups and defend them to your ingroup, whereas it would be much more rewarding to just spread inaccuracies or outright lies about how terrible the outgroups are, and thus increase your own social standing. And how even well-meaning ideologies will by default get hijacked by these kinds of dynamics and become something quite different from what they claimed to be.

But again, that’s just one piece of the whole story. And you can find more isolated pieces of the whole story scattered around in a variety of articles and books, also stuff like the iron law of oligarchy, rational irrationality, public choice theory, etc etc. But no grand synthesis.

There’s also a relevant strand of this in the psychology of motivation/procrastination/habit-formation, on why people keep putting off various things that they claim they want to do, but then don’t. And how small things can reshape people’s behavior, like if somebody ends up as a much more healthy eater just because they don’t happen to have a fast food restaurant conveniently near their route home from work. Which isn’t necessarily so much about incentives themselves, but an important building block in understanding why our behavior tends to be so strongly shaped by things that are entirely separate from consciously-set goals.

Additionally, the things that do drive human behavior are often things like maintaining a self-concept, seeking feelings of connection, autonomy and competence, maintaining status, enforcing various moral intuitions, etc., things that only loosely align one’s behavior with one’s stated goals. Often people may not even realize what exactly it is that they are trying to achieve with their behavior.

“Experiential pica” is a misdirected craving for something that doesn’t actually fulfill the need behind the craving. The term originally comes from a condition where people with a mineral deficiency start eating things like ice, which don’t actually help with the deficiency. Recently I’ve been shifting towards the perspective that, to a first approximation, roughly everything that people do is pica for some deeper desire, with that deeper desire being something like social connection, feeling safe and accepted, or having a feeling of autonomy or competence. That is, most of the things that people will give as reasons for why they are doing something will actually miss the mark, and many people are engaging in things that are actually relatively inefficient ways of achieving their true desires, such as pursuing career success when the real goal is social connection. (This doesn’t mean that the underlying desire would never be fulfilled, just that it gets fulfilled less often than it would if people were aware of their true desires.)


Sunday, September 3rd, 2017
3:27 pm - Debiasing by rationalizing your own motives

Some time back, I saw somebody express an opinion that I disagreed with. Next, my mind quickly came up with emotional motives the other person might have for holding such an opinion, that would let me safely justify dismissing that opinion.

Now, it’s certainly conceivable that they did have such a reason for holding the opinion. People do often have all kinds of psychological, non-truth-tracking reasons for believing in something. So I don’t know whether this guess was correct or not.

But then I recalled something that has stayed with me: a slide from a presentation that Stuart Armstrong gave several years back, which showed the way that we tend to think of our own opinions as being based on evidence, reasoning, etc. And at the same time, we don’t see any of the evidence that caused other people to form their opinion, so instead we think of the opinions of others as being only based on rationalizations and biases.

Yes, it was conceivable that this person I was disagreeing with, held their opinion because of some bias. But given how quickly I was tempted to dismiss their view, it was even more conceivable that I had some similar emotional bias making me want to hold on to my opinion.

And being able to imagine a plausible bias that could explain another person’s position, is a Fully General Counterargument. You can dismiss any position that way.

So I asked myself: okay, I have invented a plausible bias that would explain the person’s commitment to this view. Can I invent some plausible bias that would explain my own commitment to my view?

I could think of several, right there on the spot. And almost as soon as I could, I felt my dismissive attitude towards the other person’s view dissolve, letting me consider their arguments on their own merits.

So, I’ll have to remember this. New cognitive trigger-action plan: if I notice myself inventing a bias that would explain someone else’s view, spend a moment to invent a bias that would explain *my* opposing view, in order to consider both more objectively.


Thursday, August 24th, 2017
1:47 pm - The muted signal hypothesis of online outrage

Everyone, it sometimes seems, has their own pet theory of why social media and the Internet often seem like such unpleasant and toxic places. Let me add one more.

People want to feel respected, loved, appreciated, etc. When we interact physically, you can easily experience subtle forms of these feelings. For instance, even if you just hang out in the same physical space with a bunch of other people and don’t really interact with them, you often get some positive feelings regardless. Just the fact that other people are comfortable having you around, is a subtle signal that you belong and are accepted.

Similarly, if you’re physically in the same space with someone, there are a lot of subtle nonverbal things that people can do to signal interest and respect. Meeting each other’s gaze, nodding or making small encouraging noises when somebody is talking, generally giving people your attention. This kind of thing tends to happen automatically when we are in each other’s physical presence.

Online, most of these messages are gone: a thousand people might read your message, but if nobody reacts to it, then you don’t get any signal indicating that you were seen. Even getting a hundred likes and a bunch of comments on a status, can feel more abstract and less emotionally salient than just a single person nodding at you and giving you an approving look when you’re talking.

So there’s a combination of two things going on. First, many of the signals that make us feel good “in the physical world” are relatively subtle. Second, online interaction mutes the intensity of signals, so that subtle ones barely even register.

Depending on how sensitive you are, and how good you are generally feeling, you may still feel the positive signals online as well. But if your ability to feel good things is already muted, because of something like depression or just being generally in a bad mood, you may not experience the good things online at all. So if you want to consistently feel anything, you may need to ramp up the intensity of the signals.

Anger and outrage are emotional reactions with a very strong intensity, strong enough that you can actually feel them even in online interactions. They are signals that can consistently get similar-minded people rallied on your side. Anger can also cause people to make sufficiently strongly-worded comments supporting your anger that those comments will register emotionally. A shared sense of outrage isn’t the most pleasant way of getting a sense of belonging, but if you otherwise have none, it’s still better than nothing.

And if it’s the only way of getting that belonging, then the habit of getting enraged will keep reinforcing itself, as it will give all of the haters some of what they’re after: pleasant emotions to fill an emotional void.

So to recap:

When interacting physically, we don’t actually need to do or experience much in order to experience positive feelings. Someone nonverbally acknowledging our presence or indicating that they’re listening to us, already feels good. And we can earn the liking and respect of others, by doing things that are as small as giving them nonverbal signals of liking and respect.

Online, all of that is gone. While things such as “likes” or positive comments serve some of the same function, they often fail to produce much of a reaction. Only sufficiently strong signals can consistently break through and make us feel like others care about us, and outrage is one of the strongest emotional reactions around, so many people will learn to engage in more and more of it.

Originally published at Kaj Sotala. You can comment here or there.


Tuesday, August 15th, 2017
2:03 pm - The parliamentary model as the correct ethical model

In 2009, Nick Bostrom brought up the possibility of dealing with moral uncertainty with a “parliamentary model” of morality. Suppose that you assign (say) 40% probability to some particular form of utilitarianism being correct, 20% probability to some other form of utilitarianism being correct, and 20% probability to some form of deontology being true. Then in the parliamentary model, you imagine yourself as having a “parliament” that decides on what to do, with the first utilitarian theory having 40% of the delegates, the other form having 20% of the delegates, and the deontological theory having 20% of the delegates. The various delegates then bargain with each other and vote on different decisions. Bostrom explained:

The idea here is that moral theories get more influence the more probable they are; yet even a relatively weak theory can still get its way on some issues that the theory think are extremely important by sacrificing its influence on other issues that other theories deem more important. For example, suppose you assign 10% probability to total utilitarianism and 90% to moral egoism (just to illustrate the principle). Then the Parliament would mostly take actions that maximize egoistic satisfaction; however it would make some concessions to utilitarianism on issues that utilitarianism thinks is especially important. In this example, the person might donate some portion of their income to existential risks research and otherwise live completely selfishly.
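To make the quoted example concrete, here is a rough toy sketch in Python. It is not Bostrom's procedure: real bargaining between delegates is replaced with a crude stand-in, where each theory spreads its share of the parliament across issues in proportion to how much it cares about each one, and the option with the most backing wins each issue. The function name `parliament_decide`, the issue names, and all the importance numbers are made up for illustration.

```python
from collections import defaultdict


def parliament_decide(credences, preferences):
    """Pick one option per issue.

    credences:   theory -> probability assigned to that theory
    preferences: theory -> {issue: (preferred option, importance)}
    Assumes every theory gives at least one issue a positive importance.
    """
    tallies = defaultdict(lambda: defaultdict(float))
    for theory, credence in credences.items():
        stances = preferences[theory]
        total_importance = sum(importance for _, importance in stances.values())
        for issue, (option, importance) in stances.items():
            # A theory's influence on an issue is its share of delegates,
            # scaled by the fraction of its concern spent on that issue.
            tallies[issue][option] += credence * importance / total_importance
    # On each issue, the option with the most accumulated influence wins.
    return {issue: max(votes, key=votes.get) for issue, votes in tallies.items()}


# Bostrom's illustrative split: 90% moral egoism, 10% total utilitarianism.
credences = {"egoism": 0.9, "total utilitarianism": 0.1}

# Hypothetical issues and importance weights, chosen only for illustration.
preferences = {
    "egoism": {
        "lifestyle": ("live selfishly", 5.0),
        "career": ("maximize own income", 4.0),
        "donations": ("keep the money", 0.2),
    },
    "total utilitarianism": {
        "lifestyle": ("live frugally", 0.5),
        "career": ("maximize impact", 0.5),
        "donations": ("fund existential risk research", 9.0),
    },
}

decisions = parliament_decide(credences, preferences)
for issue, option in decisions.items():
    print(f"{issue}: {option}")
```

With these made-up numbers, egoism wins on lifestyle and career, while the much smaller utilitarian faction wins on donations, because it concentrates nearly all of its limited influence on the one issue it cares about most.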

As I noted, the model was proposed for dealing with a situation where you’re not sure of which ethical theory is correct. I view this somewhat differently. I lean towards the theory that the parliamentary model itself is the most correct ethical theory, as the brain seems to contain multiple different valuation systems that get activated in different situations, as well as multiple competing subsystems that feed inputs to these higher-level systems. (E.g. there exist both systems that tend to produce more deontological judgments, and systems that tend to produce more consequentialist judgments.)

Over time, I’ve settled upon something like a parliamentary model for my own decision-making. Different parts of me clearly tend towards different kinds of ethical frameworks, and rather than collapse into constant infighting, the best approach seems to be to go for a compromise where the most dominant parts get their desires most of the time, but less dominant parts also get their desires on issues that they care particularly strongly about. For example, a few days back I was considering the issue of whether I want to have children; several parts of my mind subscribed to various ethical theories which felt that the idea of having them was a little iffy. But then a part of my mind piped up that clearly cared very strongly about the issue, and which had a strong position of “YES. KIDS”. Given that the remaining parts of my mind only had ambivalent or weak preferences on the issue, they decided to let the part with the strongest preference have its way, in order to get its support on other issues.

There was a time when I had a strong utilitarian faction in my mind which did not want to follow a democratic process and tried to force its will on all the other factions. This did not work very well, and I’ve felt much better after it was eventually overthrown.

Originally published at Kaj Sotala. You can comment here or there.


Thursday, August 3rd, 2017
9:34 pm - Confidence and patience don’t feel like anything in particular

After doing my self-concept work, I’ve been expecting to feel confident in social situations. And observing myself in them or after them, I have been more confident. But I haven’t felt particularly confident.

The thing is, being confident doesn’t feel like much in particular. I was pretty confident in my ability to open my laptop and write this post. I’m also confident in my ability to go to the shower and wash my hair, and I’m confident in my ability to go to the grocery store to buy stuff.

But writing this, or washing my hair, or going to the grocery store, aren’t things that would fill me with any particular “feeling of confidence”. They’re just things that I do, without thinking about them too much.

Similarly, being confident in a social situation doesn’t mean you’d actually have any strong feeling of confidence. It just means you don’t have any feeling of unconfidence.

Which is obvious when I think about it. So why did I expect otherwise?

I think the explanation is, the only times when I have previously paid conscious attention to my confidence, have been in situations where I’ve felt unconfident. And if you lack confidence, you try to psych yourself up. You try to summon some *other* emotion to flood your mind and push the feeling of unconfidence away.

If you are successfully suppressing your lack of confidence with some other emotion, you do “feel confident”. You are feeling whatever the other emotion is, that’s temporarily allowing you to be confident.

But if you don’t have any uncertainties that are actively surfacing, you don’t need to summon any other emotion to temporarily suppress them. Just those uncertainties not being around, is enough by itself. And something that’s just not around, doesn’t feel like anything.

Another similar thing is “patience”. If we feel impatient with someone, we might struggle to “try to be patient”. But if you actually are patient with someone, it usually doesn’t feel like anything in particular. You don’t have a glow of patience as you think about how badly the other person is getting on your nerves but how you withstand it anyway; rather the other person’s behavior just doesn’t bother you very much in the first place.

Edited to add: somebody pointed out that there exists a good feeling of “you’ve got this” that one can feel. That’s true, and I agree that this could sensibly be called “confidence”. What I was trying to say was less “there’s no sensation that could reasonably be called confidence” but more “most everyday confidence doesn’t feel like anything in particular”. Paradoxically, even if confidence doesn’t usually feel like anything, the lack of a feeling can make you unconfident if you think that you should feel something to be confident. Somebody else mentioned that they do also have an actual feeling of patience; I’m not sure if I’ve experienced this myself, but the same thing applies.

Originally published at Kaj Sotala. You can comment here or there.


Wednesday, July 26th, 2017
3:46 pm
http://kajsotala.fi/2017/07/how-i-found-fixed-the-root-problem-behind-my-depression-and-anxiety-after-20-years/

So, I haven't talked about this in public before because I wanted to wait and make sure that the changes would last, but...

I believe that I recently managed to find and fix what was the root problem of all of the depression and anxiety that I've had for the last 20+ years.

Concrete changes that this has led to over the last five weeks include:

* My experience of work has gone from "literally soul-crushing" to moderately enjoyable; my bank account balance looks better than it has done in years, and I'm for the first time confident in my ability to actually hold a job
* The pervasive sense of meaninglessness and pointlessness is gone
* My sexuality has changed: some paraphilias that used to be at the core of my sexuality have become more of a mild extra spice; many fantasies that were obsessive to the point of being bothersome have completely lost their emotional appeal
* I'm feeling increasingly free to think about anything: there's no longer any secret fear of hitting upon a thought that would suddenly make me feel guilty or ashamed
* I'm increasingly shifting towards not intrinsically caring about what others think of me, and being fine with people disliking me (though of course I still see the practical value of being liked)

among other things. The link talks about all of this in more detail (my WordPress crossposting plugin hasn't worked for a while for some reason, and the post is too long for me to bother cross-posting it here manually, so you'll just have to read it at the original source).


Thursday, July 20th, 2017
5:41 pm - Meditation insights: suffering and pleasure are intrinsically bound together
A principle which I've gradually been able to observe and internalize, thanks both to meditation and some other mind-hacking practices, is that suffering is never about the pain itself. There are conditions in which people report pain but do not mind it; pain is just an attention signal. Pain does not intrinsically cause suffering: what causes suffering is experiencing the pain, and desiring the relief that would come from the pain ceasing. One does not wish the pain to end, as such; one wishes to feel the pleasure that would come from the pain ending.

This may sound like a pure semantic distinction. It is not: it is a distinction with enormous practical value.

Some time back, Juha lent me his copy of The Mind Illuminated, a book on meditation. This is the best book on meditation that I have ever read. Among other practical instructions, it was the first time that a text really properly explained what the concrete goal of mindfulness practices is.

The goal (or at least a goal) of mindfulness is to train the mental processes responsible for maintaining your peripheral awareness - your background sense of everything that is going on around you, but which is not in the focus of your active attention - to observe not only your physical surroundings, but also the processes going on in your mind. By doing so, the mental processes responsible for habit formation start to get more information about what kinds of thought patterns produce pleasure and which kinds of thought patterns produce suffering. Over time this will start reshaping your mind, as patterns which only produce suffering will get dropped.

And part of the reason why this happens, is that you will start seeing thoughts with false promises of pleasure as what they are; rather than chasing promises of short-term pleasure, you will shift to sustainable thought patterns that produce long-term pleasure.

Suppose that you are meditating, and trying to maintain a focus on your breath. Over time you may start to feel bored. A pleasant-feeling thought will arise, tempting you to get distracted with its promise of relief from the boredom. But if you do get distracted sufficiently many times, and pay attention to how you feel afterwards, you will notice that this didn't actually make you feel very good. Your concentration is in shambles and chasing random thoughts has just made you feel scatter-brained.

So the next time that particular distraction arises, it may be slightly less tempting. And you begin to notice that it does feel good when you succeed at maintaining your concentration and ignoring the distractions. You had been suffering because your mind had been offering promises of pleasure which you felt you had to reject, but eventually you begin to internalize that it's not a choice of pleasure versus concentration at all. Concentration is only boring, or otherwise unpleasant, if you buy into the illusion of needing to chase the pleasant thought in order to feel good. If the false promise of pleasure stops tempting you, then the suffering of not having that pleasure goes away.

The tempting, pleasant thought is kind of like a marketer who first makes you feel inadequate about something, and then offers to sell you a product that will make you feel better. Your problem was never the lack of product; your problem was the person who made you think you can only feel good once you have his product.

Over time you learn to transfer this to your everyday life, paying attention to tempting thought-patterns that cause you suffering there. You experience different kinds of suffering, and feel that this could be fixed, if only you had X. Maybe you are procrastinating on something, and you get distracted by the idea of playing video games instead. Your mind tells you that if you just played video games, they would feel so good, and that pleasure would take away the pain of procrastination.

But if you do start to play the game, you may eventually notice that the promised pleasure never really manifested. Procrastination didn't make you feel good, it just made you feel more miserable. And it's one thing to know this on an intellectual level, in the way that most of us know intellectually that we're going to regret procrastinating later; it's quite another to actually internalize that belief in such a way that you recognize the temptation itself as harmful, and your mind begins learning to just ignore the temptation, until it never arises in the first place.

And the same principle applies more widely. Social anxiety, frustration over having to participate in an event you wouldn't actually want to participate in, regrets over past mistakes: all are fundamentally about clinging to a thought which promises to offer pleasure, if only you (weren't around these people/could skip the event/could change what had happened in the past). Once you internalize that thinking about this isn't actually going to deliver the pleasure and is actually causing you suffering, that reframing of the thought makes it easier to just automatically let go of it, with no need to struggle or expend willpower.

Originally published at Kaj Sotala. You can comment here or there.


Sunday, May 28th, 2017
6:22 pm - Books that have had the biggest impact on my life/thought
In roughly chronological order:

1. J.R.R. Tolkien: The Hobbit / The Lord of the Rings
2. Eliezer Yudkowsky: The Less Wrong Sequences
3. Michele Boldrin & David K. Levine: Against Intellectual Monopoly
4. Olivia Fox Cabane: The Charisma Myth
5. Marshall Rosenberg: Non-Violent Communication
6. Eugene Gendlin: Focusing; Connirae Andreas & Tamara Andreas: Core Transformation

Tolkien, because he got me really into fantasy.

The Sequences woke me from a certain super-postmodern thought, where I basically felt that I could believe in anything as long as I came up with a sufficiently clever argument for it. They made me realize that there are actual mathematical laws regarding the kind of evidence I must have witnessed in order to start believing in something, if I wish to have correct beliefs. Also convinced me about AI being the biggest thing in the history of humanity, and got me on the career path that I'm still on.

Against Intellectual Monopoly jolted me from a very strong, principled "all online piracy is wrong" mindset to one where I later ended up as one of the founding members of the Finnish Pirate Party. It was also my first introduction to pro-market thinking and theories, with me having grown up in a climate that was very left economically.

The Charisma Myth got me to realize that one can be charismatic without being extroverted, and that being charismatic doesn't necessarily mean saying interesting things all the time. It made me understand that just being present and paying genuine attention to the other person were things that could already give you considerable charisma, and furthermore these were some skills that I already possessed. It meant the start of my conversations with people going from the constant question of "oh now what do I say next?!?" to actually being present in the moment and not worrying so much.

Non-Violent Communication started me on the path where I can actually usefully work with the underlying needs and beliefs behind my emotional reactions, instead of treating them as atomic reactions that I can do very little about.

I'm bunching Focusing and Core Transformation together, as I think of them as two books that discuss variants of what's fundamentally the same technique. I've found the Core Transformation version of it really powerful during some of the last few months, on the order of taking maybe half an hour to permanently cure psychological issues that had plagued me for decades. That said, I suspect I wouldn't have been able to properly use the technique had I not first read Focusing and practiced with the instructions there.

Honorable mentions:

* David Friedman: The Machinery of Freedom. The book which further shook my very leftist thought, and got me to realize that libertarians also have some pretty damn compelling arguments, and that I'm not really qualified to say who's right. Decided that I'd avoid taking any strong positions on economics from now, given how complicated the whole thing is. (have had varying levels of success with this decision)

* Pema Chödrön: The Wisdom of No Escape: How to love yourself and your world. Only read this book in the beginning of this year, but it has been very powerful in changing my thought and putting me on a path of greater self-compassion and self-acceptance.

* John Yates & Matthew Immergut & Jeremy Graves: The Mind Illuminated: A Complete Meditation Guide Integrating Buddhist Wisdom and Brain Science for Greater Mindfulness. The best book on meditation that I've ever read. This is also something I started reading *very* recently (many thanks, Juha!), but I'm already provisionally ready to nominate it for an honorable mention, because its meditation instructions have been super-useful. They've helped install a habit of my mind automatically dropping any lines of thought that will only be harmful (e.g. worrying about things that I have no control over); time will show whether that habit will last.


Saturday, May 6th, 2017
12:11 pm - Cognitive Core Systems explaining intuitions behind belief in souls, free will, and creation myths

A book I’m currently reading, Cognitive Pluralism, cites research suggesting that human infants as well as many non-human animals (particularly primates) are born with four “hard-coded” core reasoning systems:

  • A Core Object System which identifies cohesive and continuous objects (as opposed to say liquids or heaps), enables tracking of such objects, and causes us to expect that objects will follow some specific properties: they will preserve their boundaries, move as a unit, interact with one another only through contact, and be set into motion only when acted on through direct contact. It has some signature limitations, such as that we can only attend to about 3-4 objects at the same time.
  • A Core Number System which allows for numerical comparisons, such as by saying that a set with thirty stimuli is larger than a set with ten. Unlike the core object system, the core number system is nonmodal and not limited to contiguous objects; it can compare the number of e.g. sounds or actions.
  • A Core Agency System that causes us to intuitively treat humans, animals, and other things exhibiting signs of agency as being different from objects, liquids, or heaps. Things that are classified as agents are expected to exhibit autonomous, goal-directed behavior; and they will activate social behavior, such as when an infant imitates their actions.
  • A Core Geometric System which represents space and environment according to geometric properties such as distance and angle, while ignoring non-geometric properties such as color and smell. It does things such as constructing perspective-invariant representations of geometric layouts, or predicting how objects will appear when turned around or looked at from a different perspective.

Now one particularly intriguing hypothesis which the book mentioned was that the intuitive human belief in souls or consciousness continuing after death, may come from the Agent and Object systems having different classification criteria. In particular, objects are assumed to only move when acted upon, while agents are assumed to exhibit independent, goal-directed motion.

Apparently the psychologist Paul Bloom has proposed that seeing or thinking about a human causes us to perceive there being two entities in the same space: a body (object) and a soul (agent). While the book did not explicitly mention this, this would also explain the origin of many intuitions about free will and mind-body dualism. Under this model, the object system would classify the body as something that only moves when being ordered to by an external force, requiring an agent in the form of a mind/soul being the “unmoved mover” that initiates the movement. One could also speculate on this being the intuition that motivated Aristotle’s unmoved movers in the celestial spheres, to say nothing about all the different creation myths, if we have an inborn intuition for movement requiring an agent to set it going.

Also, as a fun implication: if you were to design an AI to have the same core reasoning systems, then it might also have an intuitive belief in free will, souls, and creators.

Further reading: Cognitive Pluralism cites Spelke & Kinzler (2007), Core Knowledge, in Developmental Science 10:1, as well as Paul Bloom’s 2004 book Descartes’ Baby: How the Science of Child Development Explains What Makes Us Human, which based on its title sounds absolutely fascinating and which I probably want to read soon.

Originally published at Kaj Sotala. You can comment here or there.

