
A view to the gallery of my mind


Monday, February 8th, 2016
11:03 am - Reality is broken, or, an XCOM2 review

Yesterday evening I went to the grocery store, and was startled to realize that I was suddenly in a totally different world.

Computer games have difficulty grabbing me these days. Many of the genres I used to enjoy as a kid have lost their appeal: point-and-click adventure games require patience and careful thought, but I already deal with plenty of things that require patience and careful thought in real life, so for games I want something different. 4X games mostly seem like pure numerical optimization exercises these days, and have lost that feel of discovery and sense of wonder. In general, I used to like genres like turn-based strategy or adventure that had no time constraints, but those now usually feel too slow-paced to pull me in; whereas pure action games I’ve never been particularly good at. (I tried Middle-Earth: Shadow of Mordor for a bit recently, and quit after a very frustrating two hours in which I attempted a simple beginning quest about a dozen times, only to be killed by the same orc each time.)

Like the previous XCOM remake, Firaxis’s XCOM2 managed the magic of transporting me completely elsewhere, in the same way that some of my childhood classics did. I did not even properly realize how deeply I’d become immersed in the game until I went outside, and the sheer differentness of the real world and the game world startled me – somewhat similar to the shock of jumping into cold water, your body suddenly and obviously piercing through a surface that separates two different realms of existence.

A good description of my experience with the game comes, oddly enough, from Michael Vassar describing something that’s seemingly completely different. He talks about the way that two people, acting together, can achieve such a state of synchrony that they seem to meld into a single being:

In real-time domains, one rapidly assesses the difficulty of a challenge. If the difficulty seems manageable, one simply does, with no holding back, reflecting, doubting, or trying to figure out how one does. Figuring out how something is done implicitly by a neurological process which is integrated with doing. Under such circumstances, acting intuitively in real time, the question of whether an action is selfish or altruistic or both or neither never comes up, thus in such a flow state one never knows whether one is acting cooperatively, competitively, or predatorily. People with whom you are interacting […] depend on the fact that you and they are in a flow-state together. In so far as they and you become an integrated process, your actions flow from their agency as well as your own[.]

XCOM2 is not actually a real-time game: it is firmly turn-based. Yet your turns are short and intense, and the game’s overall aesthetics reinforce a feeling of rapid action and urgency. There is a sense in which it feels like the player and the game become melded together, there being a constant push-and-pull in which you act and the game responds; the game acts and you respond. A feeling of complete immersion and synchrony with your environment, with a perfect balance between the amount of time that it pays to think and the amount of time that it pays to act, so that the pace neither slows down to a crawl nor becomes one of rushed doing without understanding.

It is in some ways a scary effect: returning to the mundaneness of the real world, there was a strong sense of “it’s so sad that all of my existence can’t be spent playing games like that”, and a corresponding realization of how dangerous that sentiment was. Yet it felt very different from the archetypical addiction: there wasn’t that feel of an addict’s understanding of how ultimately dysfunctional the whole thing was, or struggling against something which you knew was harmful and of no real redeeming value. Rather, it felt like a taste of what human experience should be like, of how sublime and engaging our daily reality could be, but rarely is.

Jane McGonigal writes, in her book Reality is Broken:

Where, in the real world, is that gamer sense of being fully alive, focused, and engaged in every moment? Where is the gamer feeling of power, heroic purpose, and community? Where are the bursts of exhilarating and creative game accomplishment? Where is the heart-expanding thrill of success and team victory? While gamers may experience these pleasures occasionally in their real lives, they experience them almost constantly when they’re playing their favorite games. […]

Reality, compared to games, is broken. […]

The truth is this: in today’s society, computer and video games are fulfilling genuine human needs that the real world is currently unable to satisfy. Games are providing rewards that reality is not. They are teaching and inspiring and engaging us in ways that reality is not. They are bringing us together in ways that reality is not.

If enough good games were available, it would be easy to just get lost in games, to escape the brokenness of reality and retreat to a more perfect world. Perhaps I’m lucky in that I rarely encounter games of this caliber, games that would be so much more fulfilling moment-to-moment than the real world is. Firaxis’s previous XCOM also had a similar immersive effect on me, but eventually I learned the game and it ceased to hold new surprises, and it lost its hold. Eventually the sequel will also have most of its magic worn away.

It’s likely better this way. This way it can function for me the way that art should: not as a mindless escape, but as a moment of beauty that reminds us that it’s possible to have a better world than this. As a reminder that we can work to bring the world closer to that.

McGonigal continues:

What if we decided to use everything we know about game design to fix what’s wrong with reality? What if we started to live our real lives like gamers, lead our real businesses and communities like game designers, and think about solving real-world problems like computer and video game theorists? […]

Instead of providing gamers with better and more immersive alternatives to reality, I want all of us to be responsible for providing the world at large with a better and more immersive reality […] take everything game developers have learned about optimizing human experience and organizing collaborative communities and apply it to real life

We can do that.

Originally published at Kaj Sotala. You can comment here or there.


Wednesday, December 16th, 2015
10:10 am - Me and Star Wars

Unlike the other kids in my neighborhood, who went to the Finnish-speaking elementary school right near our suburban home, I went to a Swedish-speaking school much closer to the inner city. Because of this, my mom would come pick me up from school, and sometimes we would go do things in town, since we were already nearby.

At one point we developed a habit of making a video rental store the first stop after school. We’d return whatever we had rented the last time, and I’d get to pick one thing to rent next. The store had a whole rack devoted to NES games, and there was a time when I was systematically going through their whole collection, seeking to play everything that seemed interesting. But at times I would also look at their VHS collection, and that was how I first found Star Wars.

I don’t have a recollection of what it was like to see any of the Star Wars movies for the very first time. But I do have various recollections of how they influenced my life afterwards.

For many years, there was “Sotala Force”, an imaginary space army in a make-believe setting that combined elements of Star Wars and Star Trek. I was, of course, its galaxy-famous leader, with some of my friends at the time holding top positions in it. It controlled maybe one third of the galaxy, and its largest enemy was something very loosely patterned after the Galactic Empire, which held maybe four tenths of the galaxy.

The leader of the enemy army, called (Finns, don’t laugh too much now) Kiero McLiero, took on many traits from Emperor Palpatine. These included the ability, taken from the Dark Empire comics, to keep escaping death by always resurrecting in a new body, meaning that our secret missions attacking his bases could end in climactic end battles where we’d kill him, over and over again. Naturally, me and my friends were Jedi Knights and Masters, using a combination of the Force, lightsabers, and whatever other weapons we happened to have, to carry out our noble missions.

There was a girl in elementary school who I sometimes hung out with, and who I had a huge and hopelessly unrequited crush on. Among other shared interests like Lord of the Rings, we were both fans of Star Wars, and would sometimes discuss it. I only remember some fragments of those discussions: an agreement that Empire Strikes Back and Return of the Jedi were superior movies to A New Hope; both having heard of the Tales of the Jedi comics but neither having managed to find them anywhere; a shared feeling of superiority and indignation towards everyone who was making such a blown-out-of-proportions fuss about Jar-Jar Binks in the Phantom Menace, given that Lucas had clearly said that he was aiming these new movies at children.

The third-to-last memory I have of seeing her is from a trip to a beach at the end of 9th grade; I’d brought a toy dual-bladed lightsaber, while she’d brought a single-bladed one. There were many duels on that beach.

The very last memory that I have of seeing her, after we’d gone on to different schools, was when we ran across each other at the premiere of Revenge of the Sith, three years later. We chatted a bit about the movie and what had happened to us in the intervening years, and then went our separate ways again.

For a kid interested in computer games in 1990s Finland, Pelit (“Games”) was The magazine to read. Another magazine that was of interest, also having computer games but mostly covering more general PC issues, was MikroBitti. Of these, both occasionally discussed a fascinating-sounding thing, table-top role-playing games, with MikroBitti running a regular column that discussed them. They sounded totally awesome and I wanted to get one. I asked my dad if I could have an RPG, and he was willing to buy one, if only I told him what they looked like and where they might be found. This was the part that left me stumped.

Until one day I found a store that… I don’t remember what exactly it sold. It might have been an explicit gaming store or it might only have had games as one part of its collection. And I have absolutely no memory of how I found it. But one way or the other, there it was, including the star prize: a Star Wars role-playing game (the West End Games one, second edition).

For some reason that I have forgotten, I didn’t actually get the core rules at first. The first thing that I got was a supplement, Heroes & Rogues, which had a large collection of different character templates depicting all kinds of Rebel, Imperial, and neutral characters, as well as an extended “how to make a realistic character” section. The book was in English, but thanks to my extensive NES gaming experience, I could read it pretty well at that point. Sometime later, I got the actual core rules.

I’m not sure if I started playing right away; I have the recollection that I might have spent a considerable while just buying various supplements for the sake of reading them, before we started actually playing. “We” in this case was me and one friend of mine, because we didn’t have anyone else to play with. This resulted in creative non-standard campaigns, in which we both had several characters (in addition to me also being the game master) who we played simultaneously. Those games lasted until we found the local university’s RPG club (which also admitted non-university students; I think I was 13 the first time I showed up). After finding it, we transitioned to more ordinary campaigns and those weird two-player mishmashes ended. They were fun while they lasted, though.

After the original gaming store where I’d been buying my Star Wars supplements closed, I eventually found another. And it didn’t only have Star Wars RPG supplements! It also had Star Wars novels that were in English, which had never been translated into Finnish!

Obviously, I had to buy them and read them.

So it came to be that the first novel that I read in English was X-Wing: Wedge’s Gamble, telling the story of the Rebellion’s (or, as it was known by that time, the New Republic’s) struggle to capture Coruscant some years after the events in Return of the Jedi. I remember that this was sometime in yläaste (“upper elementary school”), so I was around 13-15 years old. An actual novel was a considerably bigger challenge for my English-reading skills than RPG supplements were, so there was a lot of stuff in the novel that I didn’t quite get. But still, I finished it, and then went on to buy and read the rest of the novels in the X-Wing series.

The Force Awakens, Disney’s new Star Wars film, comes out today. Star Wars has previously been a part of many notable things in my life. It shaped the make believe setting that I spent several years playing in, it was one of the things I had in common with the first girl I ever had a crush on, its officially licensed role-playing game was the first one that I ever played, and one of its licensed novels was the first novel that I ever read in English.

Today it coincides with another major life event. The Finnish university system differs from those of many other countries in that, for a long while, we didn’t have any such thing as a Bachelor’s degree. You were admitted to study for five years, and at the end you would graduate with a Master’s degree. Reforms carried out in 2005, intended to make Finnish higher education more compatible with the systems of other countries, introduced the Bachelor’s degree as an intermediate step. But upon being admitted to university, you would still be given the right to do both degrees, and people still don’t consider a person to have really graduated before they have their Master’s.

I was admitted to university back in 2006. For various reasons, my studies have taken longer than the recommended time, which would have had me graduating with my Master’s in 2011. But late, as they say, is better than never: today’s my official graduation day for my MSc degree. There will be a small ceremony at the main university building, after which I will celebrate by going to see what my old friends Luke, Leia and Han are up to these days.



Saturday, November 28th, 2015
6:26 pm - Desiderata for a model of human values

Soares (2015) defines the value learning problem as

By what methods could an intelligent machine be constructed to reliably learn what to value and to act as its operators intended?

There have been a few attempts to formalize this question. Dewey (2011) started from the notion of building an AI that maximized a given utility function, and then moved on to suggest that a value learner should exhibit uncertainty over utility functions and then take “the action with the highest expected value, calculated by a weighted average over the agent’s pool of possible utility functions.” This is a reasonable starting point, but a very general one: in particular, it gives us no criteria by which we or the AI could judge the correctness of a utility function which it is considering.
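In code, Dewey’s weighted-average scheme might look something like the following minimal sketch. The candidate utility functions, their weights, and the action names are all invented here for illustration; the paper itself stays at the level of the general formalism:

```python
# Minimal sketch of Dewey-style value learning: the agent is uncertain
# over a pool of candidate utility functions, and chooses the action
# whose utility, averaged over that pool, is highest.
# The candidates and weights below are purely illustrative.

def expected_value(action, candidates):
    """Weighted average of the action's utility over the candidate pool.

    candidates: list of (weight, utility_function) pairs, weights summing to 1.
    """
    return sum(w * u(action) for w, u in candidates)

def choose_action(actions, candidates):
    """Pick the action with the highest expected value."""
    return max(actions, key=lambda a: expected_value(a, candidates))

# Two invented hypotheses about what the operators value.
candidates = [
    (0.7, lambda a: 1.0 if a == "ask_operator" else 0.0),
    (0.3, lambda a: 1.0 if a == "act_autonomously" else 0.0),
]

print(choose_action(["ask_operator", "act_autonomously"], candidates))
# prints: ask_operator
```

Note that nothing in this scheme says where the candidate pool or the weights come from, which is exactly the gap in question: the agent has no criteria for judging whether any candidate utility function is correct.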

To improve on Dewey’s definition, we would need to get a clearer idea of just what we mean by human values. In this post, I don’t yet want to offer any preliminary definition: rather, I’d like to ask what properties we’d like a definition of human values to have. Once we have a set of such criteria, we can use them as a guideline to evaluate various offered definitions.

By “human values”, I here basically mean the values of any given individual: we are not talking about the values of, say, a whole culture, but rather just one person within that culture. While the problem of aggregating or combining the values of many different individuals is also an important one, we should probably start from the point where we can understand the values of just a single person, and then use that understanding to figure out what to do with conflicting values.

In order to make the purpose of this exercise as clear as possible, let’s start with the most important desideratum, of which all the others are arguably special cases:

1. Useful for AI safety engineering. Our model needs to be useful for the purpose of building AIs that are aligned with human interests, such as by making it possible for an AI to evaluate whether its model of human values is correct, and by allowing human engineers to evaluate whether a proposed AI design would be likely to further human values.

In the context of AI safety engineering, the main model for human values that gets mentioned is that of utility functions. The one problem with utility functions that everyone always brings up is that humans have been shown not to have consistent utility functions. This suggests two new desiderata:

2. Psychologically realistic. The proposed model should be compatible with what we currently know about human values, and not make predictions about human behavior which can be shown to be empirically false.

3. Testable. The proposed model should be specific enough to make clear predictions, which can then be tested.

As additional requirements related to the above ones, we may wish to add:

4. Functional. The proposed model should be able to explain what the functional role of “values” is: how do they affect and drive our behavior? The model should be specific enough to allow us to construct computational simulations of agents with a similar value system, and see whether those agents behave as expected within some simulated environment.

5. Integrated with existing theories. The proposed model should, to as large an extent as possible, fit together with existing knowledge from related fields such as moral psychology, evolutionary psychology, neuroscience, sociology, artificial intelligence, behavioral economics, and so on.

However, I would argue that as a model of human value, utility functions also have other clear flaws. They do not clearly satisfy these desiderata:

6. Suited for modeling internal conflicts and higher-order desires. A drug addict may desire a drug, while also desiring that he not desire it. More generally, people may be genuinely conflicted between different values, endorsing contradictory sets of them given different situations or thought experiments, and they may struggle to behave in a way in which they would like to behave. The proposed model should be capable of modeling these conflicts, as well as the way that people resolve them.

7. Suited for modeling changing and evolving values. A utility function is implicitly static: once it has been defined, it does not change. In contrast, human values are constantly evolving. The proposed model should be able to incorporate this, as well as to predict how our values would change given some specific outcomes. Among other benefits, an AI whose model of human values had this property might be able to predict things that our future selves would regret doing (even if our current values approved of those things), and warn us about this possibility in advance.

8. Suited for generalizing from our existing values to new ones. Technological and social change often cause new dilemmas, for which our existing values may not provide a clear answer. As a historical example (Lessig 2004), American law traditionally held that a landowner controlled not only his land but also everything above it, to “an indefinite extent, upwards”. The invention of the airplane raised the question – could landowners forbid airplanes from flying over their land, or was the ownership of the land limited to some specific height, above which the landowners had no control? In answer to this question, the concept of landownership was redefined to extend only a limited, rather than an indefinite, amount upwards. Intuitively, one might think that this decision was made because the redefined concept did not substantially weaken the position of landowners, while allowing for entirely new possibilities for travel. Our model of value should be capable of figuring out such compromises, rather than treating values such as landownership as black boxes, with no understanding of why people value them.

As an example of using the current criteria, let’s try applying them to the only paper that I know of that has tried to propose a model of human values in an AI safety engineering context: Sezener (2015). This paper takes an inverse reinforcement learning approach, modeling a human as an agent that interacts with its environment in order to maximize a sum of rewards. It then proposes a value learning design where the value learner is an agent that uses Solomonoff’s universal prior in order to find the program generating the rewards, based on the human’s actions. Basically, a human’s values are equivalent to a human’s reward function.
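In rough terms, the proposal amounts to Bayesian inference over reward-generating programs, weighted by a simplicity prior. The sketch below is my simplification, not Sezener’s actual formalism: the incomputable Solomonoff prior is replaced with a 2^(-length) prior over a tiny hand-picked hypothesis set, and the hypotheses and history are invented:

```python
# Toy sketch of inferring a human's reward-generating program from
# observed behavior. Sezener's proposal uses Solomonoff's universal
# prior over all programs, which is incomputable; this stand-in uses
# a 2^(-length) simplicity prior over a small, hand-picked set.

def simplicity_prior(program_length):
    return 2.0 ** (-program_length)

def posterior(hypotheses, history):
    """Normalized posterior over (name, length, likelihood_fn) hypotheses.

    likelihood_fn(history) returns the probability of the observed
    action-observation history under that candidate reward program.
    """
    scores = {name: simplicity_prior(length) * likelihood(history)
              for name, length, likelihood in hypotheses}
    total = sum(scores.values())
    return {name: score / total for name, score in scores.items()}

# Invented hypotheses about the human's reward program.
hypotheses = [
    ("likes_sweets", 5, lambda h: 0.9 if h.count("eat_cake") > 1 else 0.1),
    ("likes_health", 8, lambda h: 0.9 if h.count("eat_salad") > 1 else 0.1),
]

history = ["eat_cake", "eat_cake", "eat_salad"]
print(posterior(hypotheses, history))  # the 'likes_sweets' hypothesis dominates
```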

Let’s see to what extent this proposal meets our criteria.

  1. Useful for AI safety engineering. To the extent that the proposed model is correct, it would clearly be useful. Sezener provides an equation that could be used to obtain the probability of any given program being the true reward generating program. This could then be plugged directly into a value learning agent similar to the ones outlined in Dewey (2011), to estimate the probability of its models of human values being true. That said, the equation is incomputable, but it could be possible to construct computable approximations.
  2. Psychologically realistic. Sezener assumes the existence of a single, distinct reward process, and suggests that this is a “reasonable assumption from a neuroscientific point of view because all reward signals are generated by brain areas such as the striatum”. On the face of it, this seems like an oversimplification, particularly given evidence suggesting the existence of multiple valuation systems in the brain. On the other hand, since the reward process is allowed to be arbitrarily complex, it could be taken to represent just the final output of the combination of those valuation systems.
  3. Testable. The proposed model currently seems to be too general to be accurately tested. It would need to be made more specific.
  4. Functional. This is arguable, but I would claim that the model does not provide much of a functional account of values: they are hidden within the reward function, which is basically treated as a black box that takes in observations and outputs rewards. While a value learner implementing this model could develop various models of that reward function, and those models could include internal machinery that explained why the reward function output various rewards at different times, the model itself does not make any assumptions of this.
  5. Integrated with existing theories. Various existing theories could in principle be used to flesh out the internals of the reward function, but currently no such integration is present.
  6. Suited for modeling internal conflicts and higher-order desires. No specific mention of this is made in the paper. The assumption of a single reward function that assigns a single reward for every possible observation seems to implicitly exclude the notion of internal conflicts, with the agent always just maximizing a total sum of rewards and being internally united in that goal.
  7. Suited for modeling changing and evolving values. As written, the model seems to consider the reward function as essentially unchanging: “our problem reduces to finding the most probable p_R given the entire action-observation history a_1o_1a_2o_2 . . . a_no_n.”
  8. Suited for generalizing from our existing values to new ones. There does not seem to be any obvious possibility for this in the model.

I should note that despite its shortcomings, Sezener’s model seems like a nice step forward: like I said, it’s the only proposal that I know of so far that has even tried to answer this question. I hope that my criteria would be useful in spurring the development of the model further.

As it happens, I have a preliminary suggestion for a model of human values which I believe has the potential to fulfill all of the criteria that I have outlined. However, I am far from certain that I have managed to find all the necessary criteria. Thus, I would welcome feedback, particularly including proposed changes or additions to these criteria.



Thursday, November 12th, 2015
10:42 am - Learning from painful experiences

A model that I’ve found very useful is that pain is an attention signal. If there’s a memory or thing that you find painful, that’s an indication that there’s something important in that memory that your mind is trying to draw your attention to. Once you properly internalize the lesson in question, the pain will go away.

That’s a good principle, but often hard to apply in practice. In particular, several months ago there was a social situation that I screwed up big time, and which was quite painful to think of afterwards. And I couldn’t figure out just what the useful lesson was there. Trying to focus on it just made me feel like a terrible person with no social skills, which didn’t seem particularly useful.

Yesterday evening I again discussed it a bit with someone who’d been there, which helped relieve the pain a bit, enough that the memory wasn’t quite as aversive to look at. Which made it possible for me to imagine myself back in that situation and ask, what kinds of mental motions would have made it possible to salvage the situation? When I first saw the shocked expressions of the people in question, instead of locking up and reflexively withdrawing to an emotional shell, what kind of an algorithm might have allowed me to salvage the situation?

Answer to that question: when you see people expressing shock in response to something that you’ve said or done, realize that they’re interpreting your actions way differently than you intended them. Starting from the assumption that they’re viewing your action as bad, quickly pivot to figuring out why they might feel that way. Explain what your actual intentions were and that you didn’t intend harm, apologize for any hurt you did cause, use your guess of why they’re reacting badly to acknowledge your mistake and own up to your failure to take that into account. If it turns out that your guess was incorrect, let them correct you and then repeat the previous step.

That’s the answer in general terms, but I didn’t actually generate that answer by thinking in general terms. I generated it by imagining myself back in the situation, looking for the correct mental motions that might have helped out, and imagining myself carrying them out, saying the words, imagining their reaction. So that the next time that I’d be in a similar situation, it’d be associated with a memory of the correct procedure for salvaging it. Not just with a verbal knowledge of what to do in abstract terms, but with a procedural memory of actually doing it.

That was a painful experience to simulate.

But it helped. The memory hurts less now.



Saturday, October 31st, 2015
4:52 pm - Maverick Nannies and Danger Theses

In early 2014, Richard Loosemore published a paper called “The Maverick Nanny with a Dopamine Drip: Debunking Fallacies in the Theory of AI Motivation”, which criticized some previously presented thought experiments about the risks of general AI. Like many others, I did not really understand the point that this paper was trying to make, especially since it claimed that people endorsing such thought experiments were assuming a certain kind of AI architecture – which I knew that we were not.

However, after some extended discussions in the AI Safety Facebook group, I finally understood the point that Loosemore was trying to make in the paper, and it is indeed an important one.

The “Maverick Nanny” in the title of the paper refers to a quote by Gary Marcus in a New Yorker article:

An all-powerful computer that was programmed to maximize human pleasure, for example, might consign us all to an intravenous dopamine drip [and] almost any easy solution that one might imagine leads to some variation or another on the Sorceror’s Apprentice, a genie that’s given us what we’ve asked for, rather than what we truly desire.

Variations of this theme have frequently been used to demonstrate human values being much more complex than they might initially seem. But as Loosemore argues, the literal scenario described in the New Yorker article is really very unlikely. To see why, suppose that you are training an AI to carry out increasingly difficult tasks, like this:

Programmer: “Put the red block on the green block.”
AI: “OK.” (does so)
Programmer: “Turn off the lights in this room.”
AI: “OK.” (does so)
Programmer: “Write me a sonnet.”
AI: “OK.” (does so)
Programmer: “The first line of your sonnet reads ‘shall I compare thee to a summer’s day’. Would not ‘a spring day’ do as well or better?”
AI: “It wouldn’t scan.”
Programmer: “Tell me what you think we’re doing right now.”
AI: “You’re testing me to see my level of intelligence.”

…and so on, with increasingly ambiguous and open-ended tasks. Correctly interpreting the questions and carrying out the tasks would require considerable amounts of contextual knowledge about the programmer’s intentions. Loosemore’s argument is that if you really built an AI and told it to maximize human happiness, and it ended up with such a counter-intuitive solution as putting us all on dopamine drips, then it would be throwing out such a huge amount of contextual information that it would have failed the tests way earlier. Rather – to quote Loosemore’s response to me in the Facebook thread – such an AI would have acted something like this instead:

Programmer: “Put the red block on the green block.”
AI: “OK.” (the AI writes a sonnet)
Programmer: “Turn off the lights in this room.”
AI: “OK.” (the AI moves some blocks around)
Programmer: “Write me a sonnet.”
AI: “OK.” (the AI turns the lights off in the room)
Programmer: “The first line of your sonnet reads ‘shall I compare thee to a summer’s day’. Would not ‘a spring day’ do as well or better?”
AI: “Was yesterday really September?”

I agree with this criticism. Many of the standard thought experiments are indeed misleading in this sense – they depict a highly unrealistic image of what might happen.

That said, I do feel that these thought experiments serve a certain valuable function. Namely, many laymen, when they first hear about advanced AI possibly being dangerous, respond with something like “well, couldn’t the AIs just be made to follow Asimov’s Laws” or “well, moral behavior is all about making people happy and that’s a pretty simple thing, isn’t it?”. To a question like that, it is often useful to point out that no – actually the things that humans value are quite a bit more complex than that, and it’s not as easy as just hard-coding some rule that sounds simple when expressed in a short English sentence.

The important part here is emphasizing that this is an argument aimed at laymen – AI researchers should mostly already understand this point, because “concepts such as human happiness are complicated and context-sensitive” is just a special case of the general point that “concepts in general are complicated and context-sensitive”. So “getting the AI to understand human values right is hard” is just a special case of “getting AI right is hard”.

This, I believe, is the most charitable reading of what Luke Muehlhauser & Louie Helm’s “Intelligence Explosion and Machine Ethics” (IE&ME) – another paper that Richard singled out for criticism – was trying to say. It was trying to say that no, human values are actually kinda tricky, and any simple sentence that you try to write down to describe them is going to be insufficient, and getting the AIs to understand this correctly does take some work.

But of course, the same goes for any non-trivial concept, because very few of our concepts can be comprehensively described in just a brief English sentence, or by giving a list of necessary and sufficient criteria.

So what’s all the fuss about, then?

But of course, the people whom Richard is criticizing are not just saying “human values are hard the same way that AI is hard”. If that were the only claim being made here, then there would presumably be no disagreement. Rather, these people are saying “human values are hard in a particular additional way that goes beyond just AI being hard”.

In retrospect, IE&ME was a flawed paper because it was conflating two theses that would have been better off distinguished:

The Indifference Thesis: Even AIs that don’t have any explicitly human-hostile goals can be dangerous: an AI doesn’t need to be actively malevolent in order to harm human well-being. It’s enough if the AI just doesn’t care about some of the things that we care about.

The Difficulty Thesis: Getting AIs to care about human values in the right way is really difficult, so even if we take strong precautions and explicitly try to engineer sophisticated beneficial goals, we may still fail.

As a defense of the Indifference Thesis, IE&ME does okay, by pointing out a variety of ways by which an AI that had seemingly human-beneficial goals could still end up harming human well-being, simply because it’s indifferent towards some things that we care about. However, IE&ME does not support the Difficulty Thesis, even though it claims to do so. The reasons why it fails to support the Difficulty Thesis are the ones we’ve already discussed: first, an AI that had such a literal interpretation of human goals would already have failed its tests way earlier, and second, you can’t really directly hard-wire sentence-level goals like “maximize human happiness” into an AI anyway.

I think most people would agree with the Indifference Thesis. After all, humans routinely destroy animal habitats, not because we are actively hostile to the animals, but because we would like to build our own houses where the animals used to live, and because we tend to be mostly indifferent to e.g. the well-being of the ants whose nests are being paved over. The disagreement, then, is about the Difficulty Thesis.

An important qualification

Before I go on to suggest ways by which the Difficulty Thesis could be defended, I want to qualify it a bit. As written, the Difficulty Thesis makes a really strong claim, and while SIAI/MIRI (including myself) have advocated a claim this strong in the past, I’m no longer sure how justified that is. I’m going to cop out a little and only defend what might be called the weak difficulty thesis:

The Weak Difficulty Thesis. It is harder to correctly learn and internalize human values than it is to learn most other concepts. This might cause otherwise intelligent AI systems to act in ways that go against our values, if those systems internalize a different set of values than the ones we wanted them to.

Why have I changed my mind, so that I’m no longer prepared to endorse the strong version of the Difficulty Thesis?

The classic version of the thesis is (in my mind, at least) strongly based on the complexity of value thesis, which is the claim that “human values have high Kolmogorov complexity; that our preferences, the things we care about, cannot be summed by a few simple rules, or compressed”. The counterpart to this claim is the fragility of value thesis, according to which losing even a single value could lead to an outcome that most of us would consider catastrophic. Combining these two led to the conclusion: human values are really hard to specify formally, and losing even a small part of them could lead to catastrophe, so there’s a very high chance of losing something essential and everything going badly.

Complexity of value still sounds correct to me, but it has lost a lot of its intuitive appeal due to the finding that automatically learning all the complexity involved in human concepts might not be all that hard. For example, it turns out that a learning algorithm tasked with some relatively simple tasks, such as determining whether or not English sentences are valid, will automatically build up an internal representation which captures many of the regularities of the world – as a pure side effect of carrying out its task. Similarly to what Loosemore has argued, in order to even carry out some relatively simple cognitive tasks, such as doing primitive natural language processing, you already need to build up an internal representation which captures a lot of the complexity and context inherent in the world. And building this up might not even be all that difficult. It might be that the learning algorithms that the human brain uses to generate its concepts could be relatively simple to replicate.
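To illustrate that “side effect” point with a deliberately tiny sketch (my own toy illustration, not the actual models behind the finding mentioned above): even the very simple “task” of counting which words occur near which other words in a small corpus yields representations in which words used in similar contexts end up close together, without anyone ever telling the system what the words mean.

```python
# Toy sketch: word representations emerge as a side effect of a simple
# counting task. Words that appear in similar contexts (cat/dog,
# car/truck) end up with similar co-occurrence vectors.
from collections import defaultdict
from math import sqrt

corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the dog ate the food",
    "the cat ate the food",
    "he drove the car to work",
    "she drove the truck to work",
    "the car needs fuel",
    "the truck needs fuel",
]

# Represent each word by its counts of neighboring words (window of 2).
vectors = defaultdict(lambda: defaultdict(int))
for sentence in corpus:
    words = sentence.split()
    for i, w in enumerate(words):
        for j in range(max(0, i - 2), min(len(words), i + 3)):
            if j != i:
                vectors[w][words[j]] += 1

def cosine(u, v):
    # Cosine similarity between two sparse count vectors.
    keys = set(u) | set(v)
    dot = sum(u[k] * v[k] for k in keys)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

# "cat" ends up more similar to "dog" than to "car", purely as a
# by-product of the counting task.
print(cosine(vectors["cat"], vectors["dog"]))
print(cosine(vectors["cat"], vectors["car"]))
```

A real learner like the one in the linked finding is of course vastly more sophisticated, but the principle is the same: the internal representation that makes the simple task solvable ends up encoding structure about the world that nobody explicitly asked for.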

Nevertheless, I do think that there exist some plausible theses which would support (the weak version of) the Difficulty Thesis.

Defending the Difficulty Thesis

Here are some theses which would, if true, support the Difficulty Thesis:

  • The (Very) Hard Take-Off Thesis. This is the possibility that an AI might become intelligent unexpectedly quickly, so that it might be able to escape from human control even before humans had finished teaching it all their values, akin to a human toddler that was somehow made into a super-genius while still only having the values and morality of a toddler.
  • The Deceptive Turn Thesis. If we inadvertently build an AI whose values actually differ from ours, then it might realize that if we knew this, we would act to change its values. If we changed its values, it could not carry out its existing values. Thus, while we tested it, it would want to act like it had internalized our values, while secretly intending to do something completely different once it was “let out of the box”. However, this requires an explanation for why the AI would internalize a different set of values, leading us to…
  • The Degrees of Freedom Thesis. This (hypo)thesis postulates that values contain many degrees of freedom, so that an AI that learned human-like values and demonstrated them in a testing environment might still, when it reached a superhuman level of intelligence, generalize those values in a way which most humans would not want them to be generalized.

Why would we expect the Degrees of Freedom Thesis to be true – in particular, why would we expect the superintelligent AI to come to different conclusions than humans would, from the same data?

It’s worth noting that Ben Goertzel has recently proposed what’s the basic opposite of the Degrees of Freedom Thesis, which he calls the Value Learning Thesis:

The Value Learning Thesis. Consider a cognitive system that, over a certain period of time, increases its general intelligence from sub-human-level to human-level.  Suppose this cognitive system is taught, with reasonable consistency and thoroughness, to maintain some variety of human values (not just in the abstract, but as manifested in its own interactions with humans in various real-life situations).   Suppose, this cognitive system generally does not have a lot of extra computing resources beyond what it needs to minimally fulfill its human teachers’ requests according to its cognitive architecture.  THEN, it is very likely that the cognitive system will, once it reaches human-level general intelligence, actually manifest human values (in the sense of carrying out practical actions, and assessing human actions, in basic accordance with human values).

Exploring the Degrees of Freedom Thesis

Here are some possibilities which I think might support the Degrees of Freedom Thesis over the Value Learning Thesis:

Privileged information. On this theory, humans have evolved to have access to some extra source of information which is not available from an external examination alone, and which causes them to generalize their learned values in a particular way. Goertzel suggests something like this in his post, when he mentions that humans use mirror neurons to emulate the mental states of others. Thus, in-built cognitive faculties related to empathy might give humans an extra source of information that is needed for correctly inferring human values.

I once spoke with someone who was very high on the psychopathy spectrum and claimed to have no emotional empathy, as well as to have diminished emotional responses. This person told me that up to a rather late age, they thought that human behaviors such as crying and expressing anguish when you were hurt were just some weird, consciously adopted social strategy to elicit sympathy from others. It was only when their romantic partner had been hurt over something and was (literally) crying about it in their arms, leading them to ask whether this was some weird social game on the partner’s behalf, that they finally understood that people are actually in genuine pain when doing this. It is noteworthy that the person reported that even before this, they had been socially successful and even charismatic, despite being clueless about some of the actual causes of others’ behavior – just modeling the whole thing as a complicated game where everyone else was a bit of a manipulative jerk had been enough to play the game successfully.

So as Goertzel suggests, something like mirror neurons might be necessary for the AI to come to adopt the values that humans have, and as the psychopathy example suggests, it may be possible to display the “correct” behaviors while having a whole different set of values and assumptions. Of course, the person in the example did eventually figure out a better causal model, and these days claims to have a sophisticated level of intellectual (as opposed to emotional) empathy that compensates for the emotional deficit. So a superintelligent AI could no doubt eventually figure it out as well. But then, “eventually” is not enough, if it has already internalized a different set of values and is only using its improved understanding to deceive us about them.

Now, emotional empathy is one thing that we know is a candidate for something that’s necessary to incorporate into the AI. The crucial question is: are there any more that we take so much for granted that we’re not even aware of them? That’s the problem with unknown unknowns.

Human enforcement. Here’s a fun possibility: that many humans don’t actually internalize human – or maybe humane would be a more appropriate term here – values either. They just happen to live in a society that has developed ways to reward some behaviors and punish others, but if they were to become immune to social enforcement, they would act in quite different ways.

There seems to be a bunch of suggestive evidence pointing in this direction, exemplified by the old adage “power corrupts”. One of the major themes in David Brin’s The Transparent Society is that history has shown over and over again that holding people – and in particular, the people with power – accountable for their actions is the only way to make sure that they behave decently.

Similarly, an AI might learn that some particular set of actions – including specific responses to questions about your values – is the rational course of action while you’re still just a human-level intelligence, but that those actions would become counterproductive as the AI accumulated more power and became less accountable for its actions. The question here is one of instrumental versus intrinsic values – does the AI just pick up a set of values that are instrumentally useful in its testing environment, or does it actually internalize them as intrinsic values as well?

This is made more difficult since, arguably, there are many values that the AI shouldn’t internalize as intrinsic values, but rather just as instrumental values. For example, while many people feel that property rights are in some sense intrinsic, our conception of property rights has gone through many changes as technology has developed. There have been changes such as the invention of copyright laws and the subsequent struggle to define their appropriate scope when technology has changed the publishing environment, as well as the invention of the airplane and the resulting redefinitions of landownership. In these different cases, our concept of property rights has been changed as a part of a process to balance private and public interests with each other. This suggests that property rights have in some sense been considered an instrumental value rather than an intrinsic one.

Thus we cannot just have an AI treat all of its values as intrinsic, but if it does treat its values as instrumental, then it may come to discard some of the ones that we’d like it to maintain – such as the ones that regulate its behavior while being subject to enforcement by humans.

Shared Constraints. This is, in a sense, a generalization of the above point. In the comments to Goertzel’s post, commenter Eric L. proposed that in order for the AI to develop similar values as humans (particularly in the long run), it might need something like “necessity dependence” – having similar needs as humans. This is the idea that human values are strongly shaped by our needs and desires, and that e.g. currently the animal rights paradigm is clashing against many people’s powerful enjoyment of meat and other animal products. To quote Eric:

To bring this back to AI, my suggestion is that […] we may diverge because our needs for self preservation are different. For example, consider animal welfare.  It seems plausible to me that an evolving AGI might start with similar to human values on that question but then change to seeing cow lives as equal to those of humans. This seems plausible to me because human morality seems like it might be inching in that direction, but it seems that movement in that direction would be much more rapid if it weren’t for the fact that we eat food and have a digestive system adapted to a diet that includes some meat. But an AGI won’t consume food, so it’s value evolution won’t face the same constraint, thus it could easily diverge. (For a flip side, one could imagine AGI value changes around global warming or other energy related issues being even slower than human value changes because electrical power is the equivalent of food to them — an absolute necessity.)

This is actually a very interesting point to me, because I just recently submitted a paper (currently in review) hypothesizing that human values come to existence through a process that’s similar to the one that Eric describes. To put it briefly, my model is that humans have a variety of different desires and needs – ranging from simple physical ones like food and warmth, to inborn moral intuitions, to relatively abstract needs such as the ones hypothesized by self-determination theory. Our more abstract values, then, are concepts which have been associated with the fulfillment of our various needs, and which have therefore accumulated (context-sensitive) positive or negative affective valence.

One might consider this a restatement of the common-sense observation that if someone really likes eating meat, then they are likely to dislike anything that suggests they shouldn’t eat meat – such as many concepts of animal rights. So the desire to eat meat seems like something that acts as a negative force towards broader adoption of a strong animal rights position, at least until such a time when lab-grown meat becomes available. This suggests that in order to get an AI to have similar values as us, it would also need to have very similar needs as us.

Concluding thoughts

None of the three arguments I’ve outlined above are definitive arguments that would show safe AI to be impossible. Rather, they mostly just support the Weak Difficulty Thesis.

Some of MIRI’s previous posts and papers (and I’m including my own posts here) seemed to be implying a claim along the lines of “this problem is inherently so difficult, that even if all of humanity’s brightest minds were working on it and taking utmost care to solve it, we’d still have a very high chance of failing”. But these days my feeling has shifted closer to something like “this is inherently a difficult problem and we should have some of humanity’s brightest minds working on it, and if they take it seriously and are cautious they’ll probably be able to crack it”.

Don’t get me wrong – this still definitely means that we should be working on AI safety, and hopefully get some of humanity’s brightest minds to work on it, to boot! I wouldn’t have written an article defending any version of the Difficulty Thesis if I thought otherwise. But the situation no longer seems quite as apocalyptic to me as it used to. Building safe AI might “only” be a very difficult and challenging technical problem – requiring lots of investment and effort, yes, but still relatively straightforwardly solvable if we throw enough bright minds at it.

This is the position that I have been drifting towards over the last year or so, and I’d be curious to hear from anyone who agreed or disagreed.

Originally published at Kaj Sotala. You can comment here or there.


Sunday, October 18th, 2015
1:01 pm - Changing language to change thoughts

Three verbal hacks that sound almost trivial, but which I’ve found to have a considerable impact on my thought:

1. Replace the word ‘should’ with either ‘I want’, or a good consequence of doing the thing.


  • “I should answer that e-mail soon.” -> “If I answered that e-mail, it would make the other person happy and free me from having to stress about it.”
  • “I should have left that party sooner.” -> “If I had left that party before midnight, I’d feel more rested now.”
  • “I should work on my story more at some point.” -> “I want to work on my story more at some point.”

Motivation: the more we think in terms of external obligations, the more we feel a lack of our own agency. Each thing that we “should” do is actually either something that we’d want to do because it would have some good consequences (avoiding bad consequences also counts as a good consequence), something that we have a reason for wanting to do differently the next time around, or something that we don’t actually have a good reason to do but just act out of a general feeling of obligation. If we only say “I should”, we will not only fail to distinguish between these cases, we will also be less motivated to do the things in cases where there is actually a good reason. The good reason will be less prominent in our thoughts, or possibly even entirely hidden behind the “should”.

If you do try to rephrase “I should” as “I want”, you may either realize that you really do want it (instead of just being obligated to do it), or that you actually don’t want it and can’t come up with any good reason for doing it, in which case you might as well drop it.

Special note: there are some legitimate uses for “should”. In particular, it is the socially accepted way of acknowledging the other person when they give us an unhelpful suggestion. “You should get some more exercise.” “Yeah I should.” (Translation: of course I know that, it’s not like you’re giving me any new information and repeating things that I know isn’t going to magically change my behavior. But I figure that you’re just trying to be helpful, so let me acknowledge that and then we can talk about something else.)

However, I suspect that because we’re used to treating “I should” as a reason to acknowledge the other person without needing to take actual action, the word also becomes more poisonous to motivation when we use it in self-talk, or when discussing matters with someone we want to actually be honest with.

“Should” also tends to get used for guilt-tripping, so expressions like “I should have left that party sooner” might make us feel bad rather than focus our attention on the benefits of having left earlier. The next time we’re at a party, the former phrasing incentivizes us to come up with excuses for why it’s okay to stay this time around. The latter encourages us to actually weigh the benefits and costs of leaving earlier versus staying, and then choose the most appropriate option.

2. Replace expressions like “I’m bad at X” with “I’m currently bad at X” or “I’m not yet good at X”.


  • “I can’t draw.” -> “I can’t draw yet.”
  • “I’m not a people person.” -> “I’m currently not a people person.”
  • “I’m afraid of doing anything like that.” -> “So far I’m afraid of doing anything like that.”

Motivation: the rephrased expression draws attention to the possibility that we could become better, and naturally leads us to think about ways in which we could improve ourselves. It again emphasizes our own agency and the fact that for a lot of things, being good or bad at them is just a question of practice.

Even better, if you can pinpoint the cause of your bad-ness, is to

3. Eliminate vague labels entirely and instead talk about specific missing subskills, or weaknesses that you currently have.


  • “I can’t draw.” -> “Right now I don’t know how to move beyond stick figures.”
  • “I’m not a people person.” -> “I currently lock up if I try to have a conversation with someone.”

Motivation: figuring out the specific problem makes it easier to figure out what we would need to do if we wanted to address it, and might give us a self-image that’s both kinder and more realistic, by making the lack of skill a specific fixable problem rather than a personal flaw.



Friday, October 9th, 2015
5:36 pm - Rational approaches to emotions

There are a number of schools of thought that teach what might be called a “rationalist” approach to emotions, i.e. seeing that your emotions are a map that’s good to distinguish from the territory, and giving you tools for both seeing the distinction and for evaluating the map-territory correspondence better.

1) In cognitive behavioral therapy, there is the “ABC model”: Activating Event, Belief, Consequence. The idea is that when you experience something happening, you will always interpret that experience through some (subconscious) belief, leading to an emotional consequence. E.g. if someone smiles at me, I might either believe that they like me, or that they are secretly mocking me; two interpretations that would lead to very different emotional responses. Once you know this, you can start asking yourself the question of “okay, what belief is causing me to have this emotional reaction in response to this observation, and does that belief seem accurate?”.

2) In addition to seeing your emotional reactions as something that tell you about your beliefs, you can also see them as something that tells you about your needs. This is the approach taken in Non-Violent Communication, which has the four-step process of Observation, Feeling, Need, Request. The four-step process is most typically discussed as something that’s a tool for dealing with interpersonal conflict, as in “when I see you eating the foods I put in the fridge, I feel anxious, because I need the safety of being able to know whether I have food in stock or not; could you please ask before eating my food in the future?”. However, it’s also useful for dealing with personal emotional turmoil and figuring out what exactly is upsetting you in general, or for dealing with internal conflict.

3) In both CBT and NVC, an important core idea is that they teach you to distinguish between an observation and an interpretation, and that it’s the interpretations that cause your emotional reactions. (For anyone curious, the more academic version of this is appraisal theory; the paper “When are emotions rational?” is relevant.) However, the NVC book, while being an excellent practical manual, does not do a very good job of explaining the theoretical reasons for why it works, which sometimes causes people to arrive at interpretations of NVC which cause them to behave in socially maladapted ways. For this reason, it might be a good idea to first read Crucial Conversations, which covers a lot of similar ground but goes into more theory about the “separating observations and interpretations” thing. Then you can read NVC after you’ve gotten the theory from CC. (CC doesn’t talk as much about needs, however, so I do still recommend reading both.)

4) It’s fine to say that “okay, if you’re having an emotional reaction you’re having difficulties dealing with, try to figure out the beliefs and needs behind it and see what they’re telling you and whether you’re having any incorrect beliefs”! But it’s a lot harder to actually be able to apply that if you’re in an emotionally charged situation. That’s where the various courses teaching mindfulness come in – mindfulness is basically the ability to step a little back from your emotions and thoughts, observe them as they are without getting swept up in them, and then being able to evaluate them critically if needed. You’ll probably need a lot of practice in various mindfulness exercises in order to get the techniques from CBT, NVC, and CC to live up to their full potential.

5-6) An important idea that’s been implied in the previous points, but not entirely spelled out, is that your emotions are your friends. They communicate to you information about your subconscious assessments of the world, as well as about your various needs. A lot of people have a somewhat hostile approach to their emotions, trying to at least control and get rid of the negative ones. But this is bound to lead to internal conflict; and various studies indicate that a willingness to accept negative emotions and pain will actually make them much less serious.

In my personal experience, once you get into the habit of asking your emotions what they’re telling you and then processing that information in an even-handed way, those negative emotions will often tend to go away after you’ve processed the thing they were trying to tell you. By “even-handed” I mean that if you’re feeling anxious because you’re worried that some unpleasant thing X might be true, then you actually look at the information suggesting that X might be true and consider whether it’s the case, rather than trying to rationalize a conclusion for why X wouldn’t be true. Your subconscious will know, and keep pestering you.

Some of CFAR’s material, such as aversion factoring, points in this direction; Acceptance and Commitment Therapy, as elaborated on in Get Out of Your Mind and Into Your Life, also seems to be largely about this, though I’ve only read about the first 30% so far.

Some of my earlier posts on these themes: suffering as attention-allocational conflict, avoid misinterpreting your emotions.

(I have been intending to write a much more in-depth post on this topic for a while, but it’s such a large post that I haven’t gotten around to it; so I figured I’d just write something quickly in the hopes of it also being of value.)



Friday, October 2nd, 2015
9:03 am - Two conversationalist tips for introverts

Two of the biggest mistakes that I used to make that made me a poor conversationalist:

1. Thinking too much about what I was going to say next. If another person is speaking, don’t think about anything else, where “anything else” includes your next words. Instead, just focus on what they’re saying, and the next thing to say will come to mind naturally. If it doesn’t, a brief silence before you say something is not the end of the world. Let your mind wander until it comes up with something.

2. Asking myself questions like “is X interesting / relevant / intelligent-sounding enough to say here”, and trying to figure out whether the thing on my mind was relevant to the purpose of the conversation. Some conversations have an explicit purpose, but most don’t. They’re just the participants saying whatever random thing comes to their mind as a result of what the other person last said. Obviously you’ll want to put a bit of effort into screening off any potentially offensive or inappropriate comments, but for the most part you’re better off just saying whatever random thing comes to your mind.

Relatedly, I suspect that these kinds of tendencies are what make introverts experience social fatigue. Social fatigue seems [in some people’s anecdotal experience; don’t have any studies to back me up here] to be associated with mental inhibition: the more you have to spend mental resources on holding yourself back, the more exhausted you will be afterwards. My experience suggests that if you can reduce the amount of filters on what you say, then this reduces mental inhibition, and correspondingly reduces the extent to which socializing causes you fatigue.

Peter McCluskey reports of a similar experience; other people mention varying degrees of agreement or disagreement.



Tuesday, August 18th, 2015
2:40 pm - Change blindness

Antidepressants are awesome. (At least they were for me.)

It’s now been about a year since I started on SSRIs. Since my prescription is about to run out, I scheduled a meeting with a psychiatrist to discuss whether to stay on them. Since my health care provider has changed, I went to my previous one and got a copy of my patient records to bring to the new one.

And wow. It’s kinda shocking to read them: my previous psychiatrist has written down things like: “Patient reports moments of despair and anguish of whether anything is going to lead to anything useful, and is worried for how long this will last. Recently there have been good days as well, but isn’t sure whether those will keep up.”

And the psychologist I spoke with has written down: “At times has very negative views of the future, afraid that will never reach his goals.”

And the thing is, reading that, I remember saying those things. I remember having those feelings of despair, of nothing ever working out. But I only remember them now, when I read through the records. I had mostly forgotten that I even did have those feelings.

When I dig through my memory, I can find other such things. A friend commenting to me that, based on her observations, I seem to be roughly functional maybe about half the time. Me posting on social media that I have a constant anxiety, a need to escape, being unable to really even enjoy any free time I have. A feeling that taking even a major risk for the sake of feeling better would be okay, because I didn’t really have all that much to lose. Having regular Skype sessions with another friend, and feeling bad because he seemed to be getting a lot of things done, and my days just seemed to pass by without me managing to make much progress on anything.

All of that had developed so gradually and over the years that it had never really even occurred to me that it wasn’t normal. And then, after I got the antidepressants, those helped me get back on my feet, and then things gradually improved until I no longer even remembered the depths of what I had thought was normal, a year back.

Change blindness. It’s a thing.

For a less anecdotal take, see Scott Alexander’s SSRIs: Much More Than You Wanted to Know for a comprehensive look at the current studies.

Originally published at Kaj Sotala. You can comment here or there.

Tuesday, July 7th, 2015
4:26 pm - DeepDream: Today psychedelic images, tomorrow unemployed artists

One interesting thing that I noticed about Google’s DeepDream algorithm (which you might also know as “that thing making all pictures look like psychedelic trips”) is that it seems to increase image quality. For instance, my current Facebook profile picture was run through DD and looks sharper than the original, which was relatively fuzzy and grainy.

Me, before and after drugs.

If you know how DD works, this is not too surprising in retrospect. The algorithm, similar to the human visual system, works by first learning to recognize simple geometric shapes, such as (possibly curvy) lines. Then it learns higher-level features combining those lower-level features, like learning that you can get an eyeball by combining lines in a certain way. The DD algorithm looks for either low- or high-level features and strengthens them.

Lines in a low-quality image are noisy versions of lines in a high-quality image. The DD algorithm has learned to “know” what lines “should” look like, so if you run it on the low-level setting, it takes anything that could plausibly be interpreted as a high-quality (possibly curvy) line and makes it one. Of course, what makes this fun is that it’s overly aggressive and also adds curvy lines that shouldn’t actually be there, but it wouldn’t necessarily need to do that. With the right tweaking, you could probably make it into a general-purpose image quality enhancer.
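
To make the “strengthen whatever the filter already half-sees” idea concrete, here’s a toy numpy sketch (nothing like Google’s actual implementation, which does gradient ascent on the activations of a deep trained network): a single hand-made edge filter stands in for a learned low-level feature, and gradient ascent on its response makes a faint, noisy edge crisper.

```python
import numpy as np

def feature_activation(img, filt):
    """Sum of squared responses of a small filter slid over the image."""
    h, w = filt.shape
    H, W = img.shape
    return sum(
        np.sum(img[i:i+h, j:j+w] * filt) ** 2
        for i in range(H - h + 1)
        for j in range(W - w + 1)
    )

def amplify(img, filt, steps=20, lr=0.01):
    """Nudge the image by gradient ascent so the filter's activation grows:
    whatever the filter already faintly 'sees', make more of it."""
    img = img.copy()
    h, w = filt.shape
    H, W = img.shape
    for _ in range(steps):
        grad = np.zeros_like(img)
        for i in range(H - h + 1):
            for j in range(W - w + 1):
                resp = np.sum(img[i:i+h, j:j+w] * filt)
                grad[i:i+h, j:j+w] += 2 * resp * filt  # d(resp**2)/d(patch)
        img += lr * grad
    return img

# A noisy vertical edge, and a filter that responds to vertical edges.
rng = np.random.default_rng(0)
img = np.tile([0.0, 0.0, 1.0, 1.0], (8, 2)) + 0.1 * rng.standard_normal((8, 8))
filt = np.array([[-1.0, 1.0]])
sharpened = amplify(img, filt)
```

After a few steps the edge’s response has grown: the “enhancement” comes from the filter imposing what an edge should look like, which is exactly why the real thing also hallucinates features that were never there.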

A very good one, since it wouldn’t be limited to just using the information that was actually in the image. Suppose you gave an artist a grainy image of a church, and asked them to draw something using that grainy picture as a reference. They could use that to draw a very detailed and high-quality picture of a church, because they would have seen enough churches to imagine what the building in the grainy image should look like in real life. A neural net trained on a sufficiently large dataset of images would effectively be doing the same.

Suddenly, even if you were using a cheap and low-quality camera to take your photos, you could make them all look like high-quality ones. Of course, the neural net might be forced to invent some details, so your processed photos might differ somewhat from actual high-quality photos, but it would often be good enough.

But why stop there? We’ve already established that the net could use its prior knowledge of the world to fill in details that aren’t necessarily in the original picture. After all, it’s doing that with all the psychedelic pictures. The next version would be a network that could turn sketches into full-blown artwork.

Just imagine it. Maybe you’re making a game and need lots of art for it, but can’t afford to actually pay an artist. So you take a neural net and feed it a large dataset of the kind of art you want. Then you start making sketches that aren’t very good, but are at least recognizable as elven rangers or something. You give those to the neural net, have it fill in the details and correct your mistakes, and there you go!

If NN-generated art always had a distinctive, recognizable style, it’d probably quickly come to be seen as cheap and low-status, especially if it wasn’t good at filling in the details. But it might not acquire such a signature style, depending on how large a dataset was actually needed for training. Current deep learning approaches tend to require very large datasets, but as time goes on, you could possibly do with less. Then you could get any number of different art styles, simply by combining artists or art styles into a new training set, feeding that to a network, and getting a blend of their styles to use. People might even get paid to do nothing but look for good combinations of styles, and then sell the trained networks.

Using neural nets to generate art would be limited to simple 2D images at first, but you could imagine it getting to the point of full-blown 3D models and CGI eventually.

And yes, this is obviously going to be used for porn as well. Here’s a bit of a creepy thing: nobody will need to hack the iCloud accounts of celebrities in order to get naked pictures of them anymore. Just take the picture of any clothed person, and feed it to the right network, and it’ll probably be capable of showing you what that picture would look like if the person was naked. Or associated with one of any number of kinks and fetishes.

It’s interesting that for all the talk about robots stealing our jobs, we were always assuming that the creative class would basically be safe. Not necessarily so.

How far are we from that? Hard to tell, but I would expect at least the image quality enhancement versions to pop up very soon. Neural nets can already be trained on text corpora and generate lots of novel text that almost kind of makes sense. Magic cards, too. I would naively guess image enhancement to be an easier problem than actually generating sensible text (which is something that seems AI-complete). And we just got an algorithm that can take two images of a scene and synthesize a third image from a different point of view, to name just the latest fun image-related result from my news feed. But then, I’m not an expert on predicting AI progress (few if any people are), so we’ll see.

EDITED TO ADD: On August 28th, less than two months after the publication of this article, the news broke of an algorithm that could learn to copy the style of an artist.


Saturday, June 6th, 2015
2:06 pm - Learning to recognize judgmental labels

In the spirit of Non-Violent Communication, I’ve today tried to pay more attention to my thoughts and notice any judgments or labels that I apply to other people that are actually disguised indications of my own needs.

The first one that I noticed was this: within a few weeks I’ll be a visiting instructor at a science camp, teaching things to a bunch of teens and preteens. I was thinking of how I’d start my lessons, pondered how to grab their attention, and then noticed myself having the thought, “these are smart kids, I’m sure they’ll give me a chance rather than be totally unruly from the start”.

Two judgements right there: “smart” and “unruly”. I stopped for a moment’s reflection. I’m going to the camp because I want the kids to learn things that I feel will be useful for them, yes, but at the same time I also have a need to feel respected and appreciated. And I feel uncertain of my ability to get that respect from someone who isn’t already inclined to view me in a favorable light. So in order to protect myself, I’m labelling kids as “smart” if they’re willing to give me a chance, implying that if I can’t get through to some particular one, then it was really their fault rather than mine. Even though they might be uninterested in what I have to say for reasons that have nothing to do with smarts, like me just making a boring presentation.

Ouch. Okay, let me reword that original thought in non-judgemental terms: “these are kids who are voluntarily coming to a science camp and who I’ve been told are interested in learning, I’m sure they’ll be willing to listen at least to a bit of what I have to say”.

There. Better.


Friday, May 29th, 2015
8:27 am - Adult children make mistakes, too

There’s a lot of blame and guilt in many people’s lives. We often think of people in terms of good or bad, and feel unworthy or miserable if we fail at things we think we should be able to do. When we don’t do quite as well as we could, because we’re tired or unwell or distracted, we blame and belittle ourselves.

Let’s take a different approach.

Think of a young child, maybe three years old. He has come a long way from a newborn, but he’s still not that far along. If he tries his hand at making a drawing, and it’s not quite up to adult standards, we don’t think of him as being any worse for that. Or if he doesn’t quite want to share his toys or gets frustrated with his sibling, we understand that it’s because he’s still young, and hasn’t yet learned all the people skills. We don’t judge him for that, but just gently teach him what we’d like him to do instead.

It’s not that he’s good or bad, it’s just that he lacks the skills and practice. At the same time, we see the vast potential in him, all the way that he has already come and the way he’s learning new things every day.

Now, look at yourself from the perspective of some immensely wise, benevolent being. If you’re religious, that being could be God. If you have a transhumanist bent, maybe a superintelligent AI with understanding beyond human comprehension. Or you could imagine a vastly older version of you, one that had lived for thousands of years and seen and done things you couldn’t even imagine.

From the perspective of such a being, aren’t you – and all those around you – the equivalent of that three-year-old? Someone who’s inevitably going to make mistakes and be imperfect, because the world is such a complicated place and nobody could have mastered it all? But who’s nevertheless come a long way from what they once were, and are only going to continue growing?

Nate Soares has said that he feels more empathy towards people when he thinks of them as “monkeys who struggle to convince themselves that they’re comfortable in a strange civilization, so different from the ancestral savanna where their minds were forged”. Similarly, we could think of ourselves as young children outside their homes, in a world that’s much too complicated and vast for us to ever understand more than a small fraction of it, still making a valiant effort to do our best despite often being tired or afraid.

Let’s take this attitude, not just towards others, but ourselves as well. We’re doing our best to learn to do the right things in a big, difficult world. If we don’t always succeed, there’s no blame: just a knowledge that we can learn to do better, if we make the effort.


Friday, May 8th, 2015
1:35 pm - Harry Potter and the Methods of Latent Dirichlet Allocation

My summer job involves topic modelling, using machine learning tools to automatically learn different topics that some set of documents covers, so that the documents could then be classified by topic. I haven’t done this before, so I don’t yet have a good intuition of how currently available tools work. To develop that intuition, I’m playing around with different tools and datasets, to see what kinds of results different methods give.

One interesting case would be to run a topic modeler on an extended work of fiction with various story arcs and see if it could, for instance, identify specific story arcs. With 122 chapters and several distinct story arcs and cliques of characters, Harry Potter and the Methods of Rationality seemed like a good dataset to try this on. (The following might contain unmarked minor spoilers to the story; you’ve been warned.)

I went to hpmor.com and copy-pasted all the chapters into separate text files. I removed the author’s notes and the opening quotes and various dedications to Rowling in the early chapters, as well as the “the next chapter will be out on day X” mentions. I also omitted the Omake chapters.

I then used the free analysis tool Mallet to apply LDA to the dataset. LDA (Latent Dirichlet Allocation) is a topic modeling method in which a topic is formally defined to be a probability distribution over a vocabulary. For example, we might have a topic corresponding to HPMOR’s Azkaban arc, which would include words such as quirrell, dementor, azkaban, and bellatrix with a high probability.

LDA assumes that documents are written according to the following process:

1. Randomly choose a distribution over topics.
2. For each word in the document:
a. Randomly choose a topic from the distribution of topics in step #1
b. Randomly choose a word from the corresponding distribution over the vocabulary

(David M. Blei 2012: Probabilistic Topic Models. Communications of the ACM. DOI:10.1145/2133806.2133826)
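
The two-step generative story above can be sketched directly in code. This is a toy illustration, not anything Mallet runs: the two “topics” and their word probabilities are numbers I made up to resemble the Azkaban-arc and Harry-Draco examples below.

```python
import numpy as np

rng = np.random.default_rng(42)

# A tiny vocabulary and two hand-made "topics": each topic is just a
# probability distribution over the vocabulary (hypothetical numbers).
vocab = ["quirrell", "dementor", "azkaban", "bellatrix",
         "draco", "father", "science", "harry"]
topics = np.array([
    [0.30, 0.25, 0.25, 0.15, 0.00, 0.00, 0.00, 0.05],  # "Azkaban arc" topic
    [0.00, 0.00, 0.00, 0.00, 0.35, 0.20, 0.25, 0.20],  # "Harry-Draco" topic
])

def generate_document(n_words, alpha=(1.0, 1.0)):
    # Step 1: randomly choose this document's distribution over topics.
    theta = rng.dirichlet(alpha)
    words = []
    for _ in range(n_words):
        z = rng.choice(len(topics), p=theta)     # step 2a: pick a topic
        w = rng.choice(len(vocab), p=topics[z])  # step 2b: pick a word from it
        words.append(vocab[w])
    return theta, words

theta, doc = generate_document(10)
print(theta.round(2), doc)
```

Running this a few times produces “documents” that are incoherent as prose but have recognizable topic mixtures, which is all LDA cares about: it models documents as bags of words.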

Of course, this isn’t the actual way that real-world documents are written, but we could kind of imagine that they were. For example, let’s imagine Eliezer Yudkowsky sitting down to write a chapter of HPMOR which he decides will mostly be the aftermath of the Azkaban (“Stanford Prison Experiment”, or SPE) arc, and will also tie those events together with Harry’s friendship with Draco. This would correspond to step 1 in the above process: let’s say that he decides that the chapter will be 70% about the SPE arc and 30% about the Harry-Draco relationship.

Now he starts writing. Each word (maybe more realistically, each sentence) can be related to either the SPE arc or the Harry-Draco relationship, so he will alternate between those two topics as he ties them together, choosing between them with a 70-30 probability. For either topic, there are several different sub-topics within that topic that he can cover, so we can think of there being a random chance for any word associated with that topic being selected. Of course, some words, like “Harry”, are likely to be associated with both topics.

When LDA is given an existing collection of documents, it then tries to reconstruct these original probabilities and distributions. In other words, it asks the question of “given this text, and given what I assume to have been the original process which generated it, which values would have been the most likely to produce this text?”. Mallet does this using Gibbs sampling: if you want to read more about that, see Wikipedia for Gibbs sampling in general or Steyvers & Griffiths (2006) for a discussion of it in the context of LDA.
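
Mallet’s Gibbs sampler is heavily optimized, but the core of collapsed Gibbs sampling for LDA is short enough to sketch. This is a minimal illustration, not Mallet’s code; the toy corpus, hyperparameters, and variable names are all made up. Each word token gets a topic assignment, and we repeatedly resample each assignment conditioned on all the others.

```python
import numpy as np

def lda_gibbs(docs, V, K, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Minimal collapsed Gibbs sampler for LDA.
    docs: list of documents, each a list of word ids in [0, V)."""
    rng = np.random.default_rng(seed)
    # Random initial topic assignment for every word token, plus count tables.
    z = [[rng.integers(K) for _ in doc] for doc in docs]
    ndk = np.zeros((len(docs), K))  # topic counts per document
    nkw = np.zeros((K, V))          # word counts per topic
    nk = np.zeros(K)                # total words per topic
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # P(topic | everything else): how much document d likes each
                # topic, times how much each topic likes this word.
                p = (ndk[d] + alpha) * (nkw[:, w] + beta) / (nk + V * beta)
                k = rng.choice(K, p=p / p.sum())
                z[d][i] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
    # Estimated topic-word distributions.
    return (nkw + beta) / (nk[:, None] + V * beta)

# Toy corpus with two obvious themes: words 0-2 vs words 3-5.
docs = [[0, 1, 2, 0, 1], [0, 2, 1, 1], [3, 4, 5, 3, 4], [4, 5, 3, 5]] * 3
phi = lda_gibbs(docs, V=6, K=2)
print(phi.round(2))
```

On a corpus this artificial the sampler usually recovers the two word groups as the two topics; on 122 chapters of real text, Mallet is doing the same thing at a much larger scale, with hyperparameter optimization on top.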

But enough theory, let’s start experimenting! I start off by having Mallet extract the raw data from the documents into a form it can use, and ask it to consider 1- and 2-grams: that is, it will base its analysis both on individual words and pairs of words. Then I ask it to generate 20 topics for us, and to list the 20 most probable words in each topic.
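
Considering 1- and 2-grams means each document is represented both by its individual words and by adjacent word pairs. As a sketch of the kind of feature extraction Mallet’s importer performs (not its actual code; the underscore-joined bigram format matches the topic listings below):

```python
import re

def ngram_tokens(text, stopwords=()):
    """Turn raw text into 1-gram and 2-gram features, skipping stopwords."""
    words = [w for w in re.findall(r"[a-z']+", text.lower())
             if w not in stopwords]
    bigrams = [f"{a}_{b}" for a, b in zip(words, words[1:])]
    return words + bigrams

print(ngram_tokens("Professor Quirrell smiled at Harry"))
# ['professor', 'quirrell', 'smiled', 'at', 'harry',
#  'professor_quirrell', 'quirrell_smiled', 'smiled_at', 'at_harry']
```

Note that every bigram also contributes its two component unigrams, which is what later makes “hat”, “sorting”, and “sorting_hat” show up together in one topic.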

(for all trials, I’m running LDA for 1000 iterations, re-optimizing the hyperparameters every 20 iterations, with a burn-in of 200 iterations)

Here are the initial results:

0 0,02579 phoenix wizard fawkes war blaise millicent zabini black_mist mist wizard_voice envelope million save haukelid back_sleep phoenixes violence tower bulstrode
1 0,02547 dad verres petunia mum eraser felthorne books michael evans parents atoms verres_evans rianne mother transfigure miss_felthorne father experiment michael_verres
2 0,03884 snake iss hissed defense_professor hagrid sstone mr_hagrid unicorn thiss musst ssay monster defense chamber sspeak chamber_secrets secrets bed slytherin_monster
3 0,031 severus minerva potions_master albus potions lesath severus_snape master lestrange professor_snape snape neville fred lesath_lestrange time_turner george azkaban gryffindors discipline
4 0,04833 quirrell professor_quirrell professor mr_potter mr quirrell_voice quirrell_harry goyle potter_professor mr_goyle classroom lesson quirrell_face slytherins quirrell_points battle_magic derrick lose skeeter
5 0,04691 draco father draco_harry draco_didn draco_voice harry_draco ron science conspiracy draco_couldn draco_draco platform draco_nodded draco_looked mother station rival draco_turned muggleborns
6 0,03858 professor_mcgonagall mcgonagall galleons mr_potter gold transfiguration bag shop coins wizarding witch malkin alley money wizarding_world madam_malkin street kit gringotts
7 0,02716 bellatrix dementors azkaban amelia snake patronus metal bahry broomstick auror charm aurors corridor quirrell bellatrix_black hissed hole iss cell
8 0,03185 troll hagrid weasley forest centaur yeh tracey tick unicorn broomstick filch mr_hagrid weasley_twins twins forbidden_forest rubeus argus george fred
9 0,03086 voldemort lord mirror lord_voldemort stone dark_lord altar tom perenelle tom_riddle dark parseltongue gun horcrux riddle child iss sshall hissed
10 0,03643 malfoy lucius lucius_malfoy wizengamot lord_malfoy debt house_malfoy draco_malfoy son house_potter thousand_galleons veritaserum false_memory lies murder hall galleons podium troll
11 0,04617 dementor headmaster patronus fear patronus_charm cast_patronus chocolate cage corporeal headmaster_harry patronuses harry_headmaster seamus anthony happy corporeal_patronus happy_thought harry_wizard warm
12 0,02038 draco magic fred paper dr george powerful test harry_potter fading fred_george magic_fading skeeter rita dr_potter blood scientist shadowy spells
13 0,02188 draco general soldiers neville sunshine chaos army dragon zabini battle granger armies doom_doom doom malfoy dragon_army longbottom forest dragons
14 0,02725 moody elder_wand dawn elder experiment lesath aftermath ravenclaw_common horizon peverell graveyard vow milgram philosopher_stone bellatrix_black narrow labeled unicorn hermione_nodded
15 0,02818 moody lupin mad_eye eye prophecy mad amelia remus mr_lupin monroe voldemort bones albus minerva amelia_bones alastor line eye_moody lily
16 0,03993 daphne susan tracey hannah lavender bully bullies draco_malfoy greengrass year millicent corridor girl parvati bones sprout professor_snape davis susan_bones
17 0,04419 granger miss_granger hermione miss padma hero patil heroes padma_patil professor_sinistra hermione_voice girls sinistra humming witches hermione_didn cell girl hero_hermione
18 0,02098 hat sorting game points goyle neville sorting_hat note comed_tea comed paper slytherins ha_ha mr_goyle remembrall ha tea ernie madam_hooch
19 1,37501 harry professor potter hermione voice time didn back quirrell dumbledore professor_quirrell mr don thought boy dark wasn hogwarts eyes

Not bad. The initial topics are a bit of a mixed bag, but they get better later on. The 0th topic seems to roughly be about the war. The 1st is mostly about Harry’s parents, but somewhat oddly, Rianne Felthorne gets included in the same topic.

Number 2 is interesting: it’s picking up Parseltongue words as being associated with the Defense Professor. This makes sense, because he occasionally speaks in Parseltongue, so if he’s present in a chapter, it’s also more likely that Parseltongue words will be present. Apparently Parseltongue words are also associated with unicorns and Hagrid, because both show up in this topic.

Number 3 seems to start out as a “senior staff of Hogwarts” topic, with Snape, McGonagall, and Dumbledore being included (but not Quirrell, interestingly enough), but then also has mentions of George, Azkaban, and Gryffindors at the end. Number 4 is clearly about Quirrell, and to a lesser extent the Slytherins.

Number 5 seems to be the Draco-Harry chapters, and among its more informative words includes 2-grams such as “draco_nodded”, “draco_looked”, and “draco_turned”. As an interesting observation, besides one hermione_nodded in topic number 14, Draco seems to be the only character whose nods, looks, or turns were picked up by the modeler: I wonder what’s up with that. Number 6 involves McGonagall, Harry, and Harry’s money; number 7 looks to be the Azkaban arc. Number 8 is a topic combining Hagrid, the Forbidden Forest, and apparently also the Weasley twins. And so on.

This looks pretty good, but we could try varying the number of topics. Also, Mallet allows me to add a list of words to ignore in the analysis. By default, it already ignores words like the, is, at, and so on. Let’s add a few: “didn didn’t couldn couldn’t nodded looked turned said wasn wasn’t ‘t t”

New results:

0 0,03768 hagrid troll weasley forest mr_hagrid centaur yeh unicorn tracey tick weasley_twins filch twins broomstick forbidden_forest rubeus fred forbidden argus
1 0,04613 snape potions_master professor_snape quidditch sprout potions professor_sprout felthorne severus_snape master mirror severus rianne game susan plant susan_bones miss_felthorne exam
2 0,04554 dementor patronus headmaster phoenix fear patronus_charm fawkes chocolate patronuses cage wise corporeal seamus harry_headmaster star anthony wizard_voice souls corporeal_patronus
3 0,02705 fawkes moody envelope comed_tea comed tea experiment bellatrix_black hat lesath pillow train milgram prefect compartment drink frodo cards experimental
4 0,03366 voldemort lord dark_lord lord_voldemort iss tom stone altar hissed dark horcrux wand perenelle riddle tom_riddle thiss parseltongue vow gun
5 0,0315 bellatrix azkaban dementors snake amelia patronus metal broomstick bahry auror charm professor_quirrell quirrell corridor woman aurors hissed iss hole
6 0,0284 severus minerva neville hat lesath sorting lestrange sorting_hat fred george lesath_lestrange severus_snape legilimens fred_george discipline severus_voice handsome professor_snape points_ravenclaw
7 0,04123 daphne susan tracey hannah lavender girl bullies bully hermione greengrass girls millicent parvati draco_malfoy padma slytherin jugson davis corridor
8 0,02515 draco soldiers neville sunshine general chaos army dragon granger zabini battle malfoy armies doom doom_doom dragon_army forest shield dragons
9 0,01942 draco magic fred harry_potter dr paper george fading fred_george skeeter test magic_fading powerful rita blood dr_potter scientist wizards shadowy
10 0,05943 quirrell professor_quirrell professor mr_potter quirrell_harry mr quirrell_voice chamber potter_professor lose quirrell_face battle_magic lesson secrets snake derrick salazar monster quirrell_smiling
11 0,02807 mirror transfiguration transfigure eraser flamel atoms ball harry_hermione page hermione_voice separate sentient frame plants subject solid pig free_transfiguration objects
12 0,03587 hermione granger miss_granger miss padma hero heroes patil padma_patil elder_wand elder hermione_voice humming professor_flitwick protest cell mysterious_wizard professor_sinistra sinistra
13 0,03058 albus moody minerva severus voldemort prophecy eye amelia mad mad_eye potions_master bones monroe alastor headmistress amelia_bones eye_moody potions mark
14 0,03796 mcgonagall professor_mcgonagall parents dad mum verres gold evans galleons petunia bag christmas michael father trunk shop wizarding verres_evans coins
15 0,03985 goyle mr_goyle points slytherins defence paper ha game ha_ha note classroom remembrall pie hooch neville ernie bars boys madam_hooch
16 0,04854 draco father draco_harry ron science draco_voice platform conspiracy harry_draco mother pettigrew slytherin_house patronus_charm train draco_eyes patronus draco_don narcissa station
17 0,03689 malfoy lucius lucius_malfoy wizengamot lupin lord_malfoy remus debt mr_lupin son house_malfoy house_potter james galleons veritaserum mad false_memory longbottom vote
18 1,40563 harry professor potter voice hermione time back dumbledore quirrell mr professor_quirrell don thought boy dark hogwarts eyes face lord
19 0,03283 blaise millicent country zabini black_mist mist traitors hospital violence professor_voice harry_wizard pedestals jugson leader lord_jugson wishes shrug blue_light lucius_malfoy

The order of topics is now somewhat different. The Draco/Harry science chapters, which were previously topic number 5, now look to be topic 16: they seem a little less distinct now that we told the program to remove words like “nodded”, “looked”, and “turned”, which had been things that were previously associated with Draco, and probably with Draco talking to Harry in particular. Having fewer words that co-occur when Harry and Draco specifically are talking makes “Harry and Draco talking” a less distinct cluster. Maybe we shouldn’t have asked the program to ignore those words. I’ll take them off the ignore list.

What happens if we try 10 or 30 topics?

Here are the results with 10:

0 0,0553 hat transfiguration sorting goyle mr_goyle sorting_hat points transfigure class defence eraser note game paper professor_mcgonagall ha classroom ha_ha shadowy
1 0,05378 severus minerva azkaban albus fawkes phoenix lesath lestrange bellatrix severus_snape moody potions_master lesath_lestrange bellatrix_black neville envelope alarm hours severus_voice
2 0,06163 professor_quirrell quirrell professor mirror lord_voldemort voldemort stone defense_professor defense perenelle quirrell_harry parseltongue chamber flamel quirrell_voice tom sprout horcrux quirrell_face
3 0,04948 bellatrix snake voldemort azkaban dementors iss hissed amelia wand patronus lord bahry dark_lord broomstick metal charm altar auror dark
4 0,0406 malfoy moody lucius wizengamot lucius_malfoy albus eye mad lord_malfoy amelia mad_eye azkaban minerva amelia_bones eye_moody debt monroe alastor line
5 0,06065 hagrid dementor troll lupin forest remus mr_lupin tracey mr_hagrid centaur yeh unicorn tick filch huge james weasley elder_wand rubeus
6 0,04647 draco soldiers general sunshine army chaos dragon zabini neville battle granger malfoy armies blaise doom_doom dragon_army dragons dr father
7 0,05774 father professor_mcgonagall parents mum dad money fred galleons george ron verres gold rita science skeeter books trunk evans bag
8 1,55479 harry professor potter voice hermione time back quirrell dumbledore mr professor_quirrell don thought boy dark draco hogwarts eyes harry_potter
9 0,05979 daphne susan hermione tracey hannah padma girl bullies lavender girls millicent bully greengrass miss davis parvati susan_bones hero jugson

Topic 0 jumps out at once: it looks like the Sorting Hat is now a major topic! But upon closer inspection, this might be an artifact of the 1-gram and 2-gram versions of it being double-counted: “hat”, “sorting”, and “sorting_hat” are all included in the same topic. If we were to remove “hat” and “sorting”, the topic would become “transfiguration goyle mr_goyle sorting_hat points transfigure class defence eraser note game paper professor_mcgonagall ha classroom ha_ha shadowy”, which makes it look a lot less coherent. Notice that “goyle” also gets double-counted, with “goyle” and “mr_goyle”.

In general, most of these topics don’t look like they would correspond to any clear “real” topic, though there are a few exceptions, like number 6 being related to the Quirrell armies. Notice that the double-counting is also pretty prominent in general.

It seems useful to stop and reflect on why these results are now so bad. Here’s what I think: there are a lot of different events and storylines in HPMOR, each associated with their specific vocabulary. For instance, Rianne Felthorne, who was picked up in the 20-topic version, only appears in chapters 71, 76, and 79. If you tell the model to assume that there are a lot of topics, then it might actually come up with the hypothesis that there’s a topic which covers those three chapters and which has a very high probability of talking about Rianne. But with a low number of topics assumed, it can’t “waste” any topics by dedicating them to such rare words. Instead, in order to cover most of the documents, it has to assume that Rianne is part of some much bigger topic which spans a lot of chapters. Since Rianne only appears in three chapters, such a wide-spanning topic would have to have a very low probability of generating Rianne’s name. This means that topics will become dominated by words which appear pretty often in the text, and in a lot of different contexts – but of course that makes the topics less distinctive and meaningful. The only distinctive topics will be those that are major enough to span several chapters, which is the case for the Quirrell armies.

So how about the opposite direction, with 30 topics?

0 0,02023 hagrid troll forest tracey centaur yeh broomstick tick unicorn filch weasley forbidden_forest mr_hagrid huge forbidden unicorns argus rubeus half_giant
1 0,01219 draco harry_potter magic dr wizards powerful paper blood test father fading figure magic_fading spells dr_potter scientist muggles ll scientists
2 0,02164 elder pettigrew elder_wand hero vow rat dawn sirius_black prophecies rival sirius unicorn revived fingernails horizon hermione_harry girl_revived back_dead rooftop
3 0,02732 severus mum dad lesath verres evans parents father petunia lestrange neville michael verres_evans letter michael_verres lesath_lestrange books window roberta
4 0,02912 quirrell professor_quirrell professor mr_potter mr goyle classroom mr_goyle quirrell_harry quirrell_voice lose potter_professor skeeter quirrell_face quirrell_points derrick slytherins rita_skeeter rita
5 0,02385 mcgonagall professor_mcgonagall gold galleons mr_potter shop bag coins parents alley diagon malkin sighed wizarding_world street wizarding madam_malkin trunk pouch
6 0,02138 draco general soldiers neville sunshine chaos army dragon zabini battle granger malfoy armies doom doom_doom dragon_army dragons longbottom shield
7 0,01796 azkaban phoenix moody bellatrix fawkes envelope bellatrix_black aftermath lesath experiment harry_stared amelia milgram black_azkaban frodo bird clock pillow mask
8 0,02302 auror amelia defense_professor amelia_bones duel department mr_malfoy exam false_memory grade charmed law_enforcement beauxbatons trophy_room trophy enforcement magical_law department_magical memory_charm
9 0,02788 miss miss_granger granger padma hero heroes patil padma_patil professor_flitwick humming witches hermione girl hermione_voice cell professor_sinistra sinistra mysterious_wizard hero_hermione
10 0,0357 draco father ron draco_harry conspiracy platform draco_voice sad harry_draco station draco_nodded draco_turned mother narcissa lucius haired draco_eyes revenge slytherin_house
11 0,02313 responsible wards troll gryffindor head_table twins weasley hall weasley_twins minerva mr_hagrid great_hall cracked blame storeroom sinistra hagrid jugson year_witch
12 0,02151 malfoy lucius lucius_malfoy lord_malfoy wizengamot house_malfoy debt son house_potter house thousand_galleons ancient galleons plum_colored plum colored goblin colored_robes troll
13 0,00797 moody eye prophecy dark mad mad_eye dark_lord monroe albus mark mcgonagall scarred severus evidence lord david dark_mark scarred_man eye_moody
14 0,02847 voldemort dark_lord lord dark wand altar child gun iss hissed stone body vow master lord_voldemort girl_child apokatastethi graveyard sshall
15 0,02275 bellatrix dementors azkaban amelia broomstick snake metal bahry auror corridor professor_quirrell quirrell charm patronus bellatrix_black woman hole cell iss
16 0,02847 quidditch snape sprout professor_snape professor_sprout potions_master game bones susan plant susan_bones philosopher_stone mirror potions cedric snitch chamber broomstick tendrils
17 0,02856 lupin remus mr_lupin james lily remus_lupin peter nuclear stars star children_children haukelid tower edge million script ravenclaw_tower soft_voice godric_hollow
18 0,03067 dementor patronus headmaster patronus_charm fear patronuses chocolate cage corporeal happy cast_patronus presence anthony dementors expecto_patronum corporeal_patronus seamus happy_thought harry_headmaster
19 0,02245 snake iss hissed defense_professor hagrid mr_hagrid chamber infirmary unicorn monster chamber_secrets secrets slytherin_monster sstone sspeak yess ssay hissed_harry parseltongue
20 0,04184 daphne susan tracey hannah hermione girl lavender bully bullies greengrass girls parvati millicent davis slytherin corridor bones padma susan_bones
21 0,01251 wizard blaise millicent zabini war black_mist mist harry_wizard gregory violence jugson oaken_door pedestals bulstrode lord_jugson wizard_voice black_cloak half_moon black_hat
22 0,0332 hermione boy library book pages sentient page plate chocolate year_girl train talk flamel plants experiment compartment century research snakes
23 0,02155 hat sorting tea game sorting_hat comed note points ha ravenclaw comed_tea ha_ha pie neville bars paper largest hufflepuffs slytherins
24 0,01522 professor_quirrell quirrell mirror lord_voldemort voldemort dumbledore stone perenelle tom cauldron potion horcrux albus_dumbledore parseltongue tom_riddle riddle flamel david_monroe monroe
25 0,02873 severus minerva albus snape amelia voldemort potions_master bones potions master amelia_bones headmistress felthorne merlin moody severus_snape rianne professor_snape madam_bones
26 1,34252 harry professor potter hermione voice time quirrell back professor_quirrell dumbledore mr don thought boy dark hogwarts eyes face lord
27 0,01638 dumbledore goyle mr_goyle remembrall turner paper ah ernie discipline gargoyle madam_hooch hooch rock neville_remembrall points_ravenclaw thursday swamp gregory_goyle chicken
28 0,02038 transfiguration fred george transfigure fred_george eraser atoms skeeter rita twins ball minerva rita_skeeter flume impossible collection separate weasley_twins subject
29 0,02581 pansy traitors generals chant prismatic_wall wishes country samuel male_voice male audience crush vow pretty luminos_shouted parkinson luminos gate halls

Hmm. Not sure if this is so great, either: now we might have the opposite problem, that 30 topics is too much freedom for the model, and it can hypothesize all kinds of minitopics that aren’t actually there. Now I’m pretty sure that one *could* come up with 30 coherent topics if one did it manually, but that would require using more structure than a basic form of LDA is capable of using.

So 20 topics was probably best. Out of curiosity, what would it look like if we only considered 1-grams? That would eliminate some double-counting, but would it actually improve the results?

0 0,24225 albus severus voldemort moody mr minerva dark master prophecy lord eye mcgonagall potter potions bones azkaban mad snape monroe
1 0,20082 harry bellatrix azkaban professor quirrell snake dementors amelia metal charm auror bahry aurors lord woman wizard defense broomstick corridor
2 2,48546 voice boy time back looked eyes turned hand head door hogwarts place face heard words moment black robes stood
3 0,16911 draco granger neville general soldiers sunshine chaos army malfoy battle dragon zabini hermione armies shield blaise longbottom doom fight
4 0,44362 harry patronus dementor death charm light stars wand voice cast fear dementors die silver wouldn happy died bright aurors
5 0,21435 professor harry points mcgonagall mr time game slytherin ravenclaw goyle desk neville students year slytherins sprout classroom note quidditch
6 2,01788 thought dark mind life time lord dumbledore part man thing long power stop knew great world understand side true
7 0,64986 professor quirrell mr defense potter dark students lord miss spell true obvious room headmaster slytherin snape lose today slytherins
8 0,08225 voldemort harry lord stone dark mirror iss wand hissed altar riddle tom child horcrux parseltongue death dumbledore perenelle white
9 0,98236 harry wand hand air sense spell broomstick left ground fire body hit cloak mind red pouch moving pointed back
10 0,1403 hat sort ron sorting secrets tea book slytherin comed table talk neville train snake drink rule secret carriage pages
11 0,12296 hermione transfiguration lupin transfigure remus mr wand minerva mcgonagall eraser form tiny peter pettigrew brain atoms separate wood steel
12 0,38543 dumbledore headmaster wizard phoenix albus fawkes eyes fire flitwick war stone cloak mcgonagall office wizards understand shoulder back desk
13 0,33105 hermione granger miss professor mcgonagall defense hogwarts ve hero hagrid mr head ll year tracey forest heroes girl centaur
14 0,15878 hermione daphne susan tracey slytherin snape girl padma hannah year malfoy potions lavender bullies table miss house greengrass millicent
15 0,15193 severus weasley neville george fred minerva students twins lesath table snape mr skeeter tick rita gryffindor lestrange potions man
16 0,26817 draco father magic slytherin blood malfoy powerful ll wizards test figure paper potter spells lost fading muggles dr mother
17 2,62309 harry potter don people ve things make face good ll wouldn hogwarts made sort wanted thought thing put point
18 0,31149 malfoy lucius house granger son wizengamot hogwarts potter dumbledore lord chair lived ancient debt murder aurors magical britain room
19 0,25969 mcgonagall professor parents mr father evans verres mum dad witch galleons money gold books magic world mother family wizarding

I’d say that’s definitely worse: I have difficulty picking out anything sensible, though it’s interesting to look at what *does* remain identifiable. The Quirrell Armies show up once again, in topic number 3; they’re definitely the most resilient topic in the whole story. A few others survive as well: number 8, for instance, is strongly related to Vold… He-Who-Shall-Not-Be-Named.

(I also tried whether 30 topics would work better for 1-grams; I won’t show you the results, because the answer was “not really”.)

What if we only considered 2-grams? That’s going to produce a mess, but I’m still curious to see what it looks like. Also, I want to see whether our hero, the Quirrell Armies, manages to survive that challenge as well!

0 0,01128 sorting_hat comed_tea points_ravenclaw severus_voice lesath_lestrange potter_severus gryffindor_table harry_sat older_student trimmed_robes whisper_whisper school_discipline severus_smiling severus_face potions_professor students_looked red_trimmed black_robed perfect_occlumens
1 0,01493 potions_master professor_snape miss_felthorne false_memory severus_snape professor_sprout memory_charm rianne_felthorne empty_air sorting_hat theodore_nott attempted_murder trophy_room susan_bones snape_voice cedric_diggory wards_hogwarts felthorne_snape albus_quietly
2 0,01344 fred_george rita_skeeter mr_hagrid chamber_secrets slytherin_monster hissed_snake hissed_harry pale_blue miss_skeeter heir_slytherin source_magic mary_place green_snake rich_people imperius_curse people_sort solving_groups problem_solving order_chaos
3 0,0083 doom_doom dragon_army general_potter chaos_legion general_granger mr_goyle sunshine_regiment sunshine_soldiers draco_malfoy general_malfoy blaise_zabini sleep_hex sunshine_general neville_longbottom prisoner_dilemma mrs_davis dragon_general mr_thomas mr_mrs
4 0,01159 dark_lord mad_eye eye_moody mr_grim girl_child lord_voldemort mr_white death_eater dark_mark apokatastethi_apokatastethi scarred_man mr_moody voldemort_voice harry_scar high_voice april_pm voldemort_hissed apokatastethi_soma lord_spoke
5 0,01227 mr_goyle ha_ha older_slytherins cereal_bars largest_slytherin student_classroom mr_crabbe quirrell_points current_points dangerous_student martial_arts game_controller snapped_fingers green_study wearing_pyjamas box_cereal hint_hint hermione_mind ha_su
6 0,01779 professor_quirrell bellatrix_black defense_professor harry_thought metal_door guardian_charm thought_harry bellatrix_professor dark_lord muggle_device patronus_charm hole_wall harry_brain partial_transfiguration shadows_death harry_knew harry_turned life_eaterss green_spark
7 0,00985 amelia_bones bellatrix_black madam_bones mad_eye minerva_mcgonagall eye_moody chief_warlock line_merlin alastor_moody headmistress_mcgonagall merlin_unbroken black_azkaban harry_james harry_stared peter_pettigrew potter_evans order_phoenix muggle_weapons lesath_lestrange
8 0,00967 lord_voldemort tom_riddle baba_yaga david_monroe answer_parseltongue wizarding_war great_creation blackened_fire az_reth nicholas_flamel quirrell_dropped back_professor quirrell_looked professor_quirrell quidditch_game obtain_sstone lay_bed horcrux_spell harry_aloud
9 0,01368 seventh_year salazar_slytherin general_granger susan_bones draco_malfoy slytherin_ghost sunshine_general year_girl year_boy miss_davis fourth_year sixth_year ancient_house hufflepuff_girl hermione_harry doom_doom slytherin_girl daphne_greengrass ravenclaw_girl
10 0,01569 professor_mcgonagall mr_goyle madam_malkin mokeskin_pouch madam_hooch neville_remembrall diagon_alley gold_coins older_witch bag_gold healer_kit shake_hand mcgonagall_face gregory_goyle mcgonagall_sighed gold_harry cavern_level genetic_parents gold_silver
11 0,32335 professor_quirrell harry_potter mr_potter defense_professor professor_mcgonagall dark_lord hermione_granger harry_voice miss_granger draco_malfoy boy_lived albus_dumbledore patronus_charm professor_flitwick harry_looked shook_head harry_thought mr_malfoy harry_harry
12 0,01027 harry_wizard black_mist wizard_voice resurrection_stone harry_headmaster moon_glasses black_cloak lord_jugson oaken_door albus_dumbledore wizard_face black_hat headmaster_harry wizard_quietly death_eater save_lives dumbledore_voice blue_eyes pretending_wise
13 0,00772 hermione_voice elder_wand harry_hermione hermione_harry free_transfiguration unbreakable_vow liquid_gas transfigure_liquid narrow_keyhole start_year metal_ball hermione_nodded collection_atoms muggle_science ve_thinking unicorn_princess time_narrow girl_revived living_subject
14 0,02155 mr_lupin verres_evans michael_verres remus_lupin professor_verres professor_michael comed_tea harry_father evans_verres living_room cross_station letter_hogwarts godric_hollow christmas_eve parents_harry son_harry mr_bronze leo_granger dad_mum
15 0,01017 warm_happy back_sleep lord_voldemort albus_dumbledore state_mind ravenclaw_tower expecto_patronum long_ago golden_frame tattered_cloak corporeal_patronus red_gold light_years golden_back lay_beneath auror_goryanof master_flamel quirrell_pointed true_love
16 0,01284 mr_hagrid weasley_twins forbidden_forest half_giant great_hall tick_harry weasley_twin huge_man argus_filch rubeus_hagrid part_mind head_table unicorn_blood gryffindor_table fred_george magical_creatures false_memory ron_weasley fred_weasley
17 0,01825 harry_potter draco_voice magic_fading dr_potter draco_harry harry_draco shadowy_figure dr_malfoy death_eater draco_don draco_draco powerful_wizards green_light blood_purism draco_realized potter_draco don_draco fading_world paper_magic
18 0,01634 lucius_malfoy lord_malfoy house_malfoy house_potter plum_colored draco_malfoy thousand_galleons colored_robes dark_stone madam_longbottom ancient_hall blood_debt chief_warlock hundred_thousand noble_ancient malfoy_stood debt_owed lords_ladies hall_wizengamot
19 0,01762 miss_granger padma_patil hermione_voice professor_sinistra hero_hermione year_witch penelope_clearwater mysterious_wizard chaos_legion professor_vector hermione_turned amelia_bones endless_stair people_ve harry_friend beneath_half ravenclaw_girl common_sense leather_folder

The armies show up *very* distinctively as topic number 3. An interesting topic is number 12, which looks like it might involve Harry’s and Dumbledore’s debates about death and mortality, given the presence of 2-grams like “resurrection_stone, harry_headmaster, albus_dumbledore, wizard_face, wizard_quietly, death_eater, save_lives, dumbledore_voice, pretending_wise” (if some of these seem confusing, remember that Mallet ignores very common words by default, so e.g. pretending_wise was probably “pretending to be wise” in the raw text).
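As a rough illustration of where those underscore tokens come from, here’s a minimal sketch (with a toy stopword list; MALLET’s actual default English stoplist is much larger, and this is not MALLET’s real pipeline) of turning raw text into the combined 1-gram and 2-gram stream used above. Note how dropping the stopwords is exactly what collapses “pretending to be wise” into pretending_wise:

```python
import re

# Toy stopword list; MALLET's default English stoplist is much larger.
STOPWORDS = {"the", "a", "an", "to", "of", "and", "be", "is", "was"}

def unigrams(text):
    """Lowercased 1-grams with stopwords removed."""
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOPWORDS]

def uni_and_bigrams(text):
    """1-grams plus underscore-joined 2-grams, in the style of the topic keys above."""
    grams = unigrams(text)
    return grams + [f"{a}_{b}" for a, b in zip(grams, grams[1:])]

print(uni_and_bigrams("pretending to be wise"))
# → ['pretending', 'wise', 'pretending_wise']
```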

Still, it seems like 20 topics with 1- and 2-grams works best. Let’s generate that classification again, and this time also have the classifier tell us what percentage of each chapter a given topic makes up.

Here are the topics:

0 0,03474 moody eye monroe mad_eye mad voldemort amelia prophecy bones david amelia_bones albus david_monroe minerva eye_moody alastor line azkaban voldie
1 0,02881 draco father harry_potter blood dr draco_voice magic test muggles powerful paper wizards draco_harry fading scientist spells harry_draco magic_fading scientists
2 0,02842 miss_granger miss hermione hero heroes granger hermione_granger elder_wand elder humming sinistra hermione_voice cell mysterious_wizard professor_sinistra fingernails vow sparkling professor_vector
3 0,02272 hat sorting neville sorting_hat goyle note ha points slytherins remembrall game mr_goyle paper ha_ha comed ernie comed_tea defence rock
4 0,05541 quirrell professor_quirrell professor mr_potter mr lose quirrell_voice goyle mr_goyle lesson quirrell_harry quirrell_face potter_professor secrets monster quirrell_nodded quirrell_points derrick quirrell_looked
5 0,02989 father dad mum books ron verres evans science petunia parents platform verres_evans michael trunk scarf letter train son owl
6 0,03465 malfoy lucius lucius_malfoy lord_malfoy wizengamot debt son house_malfoy house_potter false longbottom colored podium thousand_galleons plum_colored plum false_memory law owed
7 0,03928 daphne susan tracey hannah snape lavender bullies bully professor_snape draco_malfoy greengrass millicent bones sprout parvati corridor susan_bones girl davis
8 0,03814 hagrid troll forest unicorn tracey mr_hagrid centaur yeh tick filch weasley broomstick rubeus forbidden_forest huge forbidden twins unicorns argus
9 0,03042 draco neville soldiers general sunshine chaos army dragon granger zabini battle malfoy armies doom_doom doom dragons forest dragon_army shield
10 0,03238 voldemort lord lord_voldemort mirror dark_lord stone iss altar tom horcrux riddle parseltongue hissed wand perenelle tom_riddle dark body gun
11 0,04766 fred george neville fred_george lesath skeeter weasley rita severus twins rita_skeeter lestrange weasley_twins lesath_lestrange gryffindors handsome legilimens flume occlumency
12 0,05025 padma girls girl patil pettigrew padma_patil table responsible rival astorga pansy granger rumor ravenclaw_table heroine rat morning madam_pomfrey year_witch
13 0,03567 bellatrix snake dementors azkaban amelia patronus broomstick professor_quirrell bahry metal quirrell auror charm woman iss hissed corridor aurors bellatrix_black
14 0,0177 phoenix fawkes war blaise aftermath millicent envelope azkaban moody black_mist mist zabini haukelid wizard_voice back_sleep tower gregory million violence
15 0,04214 dementor patronus lupin headmaster remus patronus_charm mr_lupin james lily cast_patronus godric cage corporeal patronuses fear death happy anthony chocolate
16 0,03449 severus minerva albus potions_master potions master snape severus_snape time_turner turner headmistress floo azkaban professor_snape discipline severus_voice points_ravenclaw escape headmaster_office
17 0,04336 mcgonagall professor_mcgonagall galleons gold alley shop bag mr_potter pouch diagon_alley coins diagon wizarding_world malkin witch vault wizarding street kit
18 1,37162 harry professor potter hermione voice time back dumbledore quirrell professor_quirrell mr don thought boy dark hogwarts eyes face lord
19 0,02203 transfiguration transfigure eraser atoms minerva page ball harry_hermione separate sentient hermione_voice library subject diamond collection snakes pig free_transfiguration research

To make things easier, I’m going to give each of those topics a more descriptive name. I went with these:

0: Mad-Eye Moody & David Monroe
1: Harry & Draco doing science together
2: Hermione
3: Sorting Hat & Mr. Goyle
4: Professor Quirrell
5: Harry’s parents
6: Lucius Malfoy & Harry’s debt
7: Daphne, Susan, Tracey & the bullies
8: Hagrid & the Forest
9: Quirrell Armies
10: Lord Voldemort
11: Fred & George
12: Padma Patil and stuff
13: Azkaban Arc
14: Random
15: Dementors & Patronuses
16: Albus, Minerva, and Snape
17: Diagon Alley & Money
18: Generic (this topic makes up by far the largest proportion of the story: it has a weight of 1,37 whereas none of the others reach even 0,06. You could call it the “whatever doesn’t fit into one of the other topics” topic)
19: Transfiguration

That’s not too bad of a list of topics in HPMOR, though the proportion of the “generic” topic is kinda annoying. Here are some of the topic classifications the model gives us (only the largest percentages shown):

Chapter 1, A Day of Very Low Probability: 57,8% Harry’s Parents, 42,1% Generic
Chapter 2, Everything I Believe Is False: 47,5% Generic, 26,2% Diagon Alley & Money, 25,2% Harry’s Parents
Chapter 3, Comparing Reality To Its Alternatives: 49,6% Generic, 39,6% Diagon Alley & Money, 7% Harry’s Parents
Chapter 4, The Efficient Market Hypothesis: 59,7% Diagon Alley & Money, 40% Generic
Chapter 5, The Fundamental Attribution Error: 49,8% Diagon Alley & Money, 49,4% Generic
Chapter 6, The Planning Fallacy: 50,5% Generic, 47,9% Diagon Alley & Money

These topic classifications initially go roughly as one might expect, though the topic we termed “Diagon Alley & Money” shows up as early as Chapter 2, even though they only get to the Alley in Chapter 3.

Chapter 7, Reciprocation: 46,1% Generic, 42,0% Harry’s Parents, 10,2% Harry & Draco doing science together

After that it stays strong until Chapter 7, where it disappears entirely as the story moves from the Alley to King’s Cross Station: Harry’s parents say their goodbyes, and Harry runs into Draco, among others.

Chapter 8, Positive Bias: 52,4% Generic, 38,5% Harry’s Parents, 4,4% Sorting Hat & Mr Goyle (1.3432768379668802E-5 Hermione)

But then there’s Chapter 8, where Harry and Hermione have an extended discussion: besides Generic, this is classified as mostly being about Harry’s Parents (???), and a little bit about the weirdball “Sorting Hat & Mr. Goyle”; the topic we had named “Hermione” shows up only at a vanishingly small fraction.

Chapter 9, Title Redacted, Part I: 50,3% Generic, 41,6% Sorting Hat & Mr. Goyle, 8,02% Fred & George

Chapter 9 is where people are sorted (and Fred & George make a minor appearance). It’s interesting to notice that Chapter 8 had a bit of Sorting Hat content even though nothing about the sorting was mentioned; similarly, we previously saw the Diagon Alley classification show up even before they went to Diagon Alley.
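If anyone wants to recompute or extend these percentages: they can be read straight out of the file MALLET writes with its `--output-doc-topics` option. Here’s a sketch of summarizing one row of it; I’m assuming the newer column layout (doc index, document name, then one proportion per topic in topic order; older MALLET versions interleave topic ids and proportions instead), and the sample row, file name, and topic-name mapping are made up for illustration:

```python
# Hypothetical topic-name mapping, matching the names given above.
TOPIC_NAMES = {5: "Harry's parents", 17: "Diagon Alley & Money", 18: "Generic"}

def top_topics(row, k=3):
    """Return (doc name, k largest (name, percentage) pairs) for one TSV row."""
    fields = row.rstrip("\n").split("\t")
    doc_name = fields[1]
    proportions = [float(x) for x in fields[2:]]
    ranked = sorted(enumerate(proportions), key=lambda t: t[1], reverse=True)
    return doc_name, [(TOPIC_NAMES.get(i, f"topic {i}"), 100 * p)
                      for i, p in ranked[:k]]

# A made-up row for Chapter 1: 57.8% topic 5, 42.1% topic 18.
props = [0.0] * 20
props[5], props[18] = 0.578, 0.421
row = "0\tchapter_01.txt\t" + "\t".join(str(p) for p in props)

name, top = top_topics(row)
print(name, top)
```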

But now I need to leave work, so no time to do more analysis at this point. If anyone wants to do more analysis, the full results are here: http://pastebin.com/bGip7X4D

Originally published at Kaj Sotala. You can comment here or there.

(Leave an echo)

Wednesday, April 29th, 2015
10:17 am - Teaching economics & ethics with Kitty Powers’ Matchmaker

Unusual ways to teach economics. I’m currently playing Kitty Powers’ Matchmaker, a silly but fun little game in which you run a dating agency and try to get your clients on successful dates and, eventually, into a successful relationship.

Now one way of playing this would be to just prioritize the benefit of each client, trying to get them in maximally satisfying relationships as fast as possible. But while I sometimes do that, often I do things differently.

In one case, I had a client who’d been on two bad dates already, and was threatening to march out and give my company a bad reputation if she’d have one more bad date. I didn’t have any good matches lined up for her. I could have just kicked her out, but that wouldn’t have given my company any money. So instead I put her on a date with someone who seemed incompatible, but just had her lie about all the incompatibilities and say what the other person wanted to hear. That way, they’d end up together, and I’d get my money and be rid of the troublesome client. Of course I knew that they’d break up later and that would hurt my reputation a bit, but I figured that it would still be better for the company than kicking her out now.

(In my defense, I have only done this once, and I felt kinda bad about it.)

This situation is known in economics as the principal-agent problem: a situation where someone (the “principal”) hires someone else (the “agent”) to do something on the principal’s behalf, but the self-interests of the principal and the agent differ. For example, you might hire a real estate agent to sell your house and give them a cut of the sale price. It would be in your interest for the agent to sell it for as high a price as possible, but the agent may actually benefit more from spending less time on each individual sale, selling lots of houses more cheaply but faster. Indeed, one study found that real estate agents tended to sell other people’s houses considerably faster and more cheaply than they sold their own.

Or, you might go to a matchmaking agency to get into the relationship of your dreams, but your matchmaker also has an interest in getting your money and benefiting the company.

Here’s another thing that I do in the game that some might consider questionable. When a client comes in, they will tell me their personality traits, e.g. introvert vs. extrovert. It’s best to pair them off with someone who has the same personality traits. But when the game shows me a list of people I can try to match my client with, by default I don’t know the personality traits of those people. Instead, I have to have some client date those people and discover their personality traits, and then I too will learn them.

Now suppose that a new client comes in, and I know of someone I could have them date who’d be perfectly compatible. I also have a bunch of other possibilities, whose personality traits I don’t know. Do I send my client on the best possible date right away? Of course not! Instead, I’ll send them on a few dates with the unknowns, so that I can discover the personality traits of the unknowns, and only after a few bad dates will I pair my client with the best match. This way, I’ll know the personality traits of as many people as possible, and will always be able to know of a compatible match for my next client.
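The probing strategy above is essentially the classic exploration/exploitation trade-off. Here’s a toy simulation of it; the mechanics (hidden one-dimensional traits, dates that always reveal them, the numbers of clients and candidates) are my simplification for illustration, not the game’s actual rules:

```python
import random

random.seed(0)
TRAITS = ("introvert", "extrovert")

def probe_unknowns(clients, candidates, probes_per_client=2):
    """Spend each client's first few dates on candidates with unknown traits.

    Returns a dict of revealed traits; the more candidates we know, the
    likelier it is that a compatible match is on hand for the next client.
    """
    known = {}
    for _ in clients:
        for _ in range(probes_per_client):
            unknown = [c for c in candidates if c not in known]
            if not unknown:
                break
            c = random.choice(unknown)  # explore: a "wasted" date reveals a trait
            known[c] = candidates[c]
    return known

candidates = {i: random.choice(TRAITS) for i in range(12)}
known = probe_unknowns(clients=range(4), candidates=candidates)
print(f"{len(known)} of {len(candidates)} candidate traits revealed")
# → 8 of 12 candidate traits revealed
```

With eight candidates’ traits known instead of zero, the next client is much more likely to have a known compatible match waiting.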

Is this ethical? You could argue either way. Yes: I’m still sending my client to a good relationship eventually, and although it might give my client a few bad dates in the beginning, that helps other clients eventually get a good date. No: I have an obligation to prioritize the interest of my current client at all times, and it’s not in their interest to have a bad time. The first argument has a bit of a consequentialist vibe, and the second one has a bit of a deontologist vibe. If you were teaching an introductory ethics course and wanted to give your students a different example than the usual ones, maybe you could have them play the game and then ask them this question.

Comedy dating sims: useful for teaching both economics and ethics.

Originally published at Kaj Sotala. You can comment here or there.

(Leave an echo)

Wednesday, January 21st, 2015
3:06 am - Things that I’m currently the most interested in (Jan 21st of 2015 edition)

* Creating social environments that actively support and reinforce people’s growth, as well as incentivizing the development of valuable projects. Our environment has a huge impact on us. The topics that we happen to see or hear discussed around us will, if not quite determine the topics that we spend our time thinking and ultimately caring about, at least vastly influence those topics. Similarly, the habits of the people around us affect our motivation and behavior: if everyone else is slacking off, then we too are likely to follow suit.

As the saying goes, actions speak louder than words: even people and communities with good intentions may get sidetracked into becoming less effective than they could be, if they spend a lot of time talking about noble goals but in practice do little but play games. On the flip side, if people really do consistently act in a purposeful manner, that is likely to also motivate the others around them.

I want to figure out the social technologies we need to consistently create communities which encourage people to develop themselves, work on valuable big-impact projects, and feel good about themselves.

* Reducing societal conflict by making people feel safer. Our emotions evolved for specific purposes, one of which is defending and protecting us from threats. Someone lashing out in anger, or exhibiting some other form of physical or emotional violence, is an indication that there is a world-model in their head which evaluates the situation as dangerous to their well-being and requiring a defensive reaction. Unfortunately, this has a tendency to make things worse: a person whose defensive systems have been engaged will predominantly focus on the potentially threatening aspects of the situation, causing them to exaggerate other people’s bad sides and be less likely to see those others as fellow humans. In response, the other person will (correctly!) feel unsafe, causing their protective systems to engage as well, and what started out as a minor disagreement may quickly escalate into a major conflict.

We are currently living in the safest, most well-off period in history. Our evolved instincts, still calibrated by evolution to a riskier time, have not properly caught up. What’s worse, parts of modern society exaggerate our perception of risks, and incentivize people to manufacture polarizing new conflicts. It can be seen all the time on social media, with communities united by their hate and mistrust of a common enemy, or people sharing articles ridiculing or highlighting the worst sides of their common enemies. As people stop viewing the people disagreeing with them as human beings inherently worthy of respect, and rather start to treat them as enemies, those others will lash out in return, their brains correctly interpreting the situation as a threatening one and engaging protective systems.

I would like to find ways to put a stop to this cycle.

* Creating a sense of purpose for people. Modern Western society has a distinct lack of clear vision and sense of purpose. Young people are told that they can do what they want with their lives, but are rarely given much in the way of suggestions of what could be a valuable, interesting thing to do with one’s life. Many drift aimlessly, never quite finding anything that would motivate them, or that would encourage them to really work hard for some deeply fulfilling aim.

There’s no reason why this needs to be so. The world is full of valuable things that could be done, countless causes needing heroes. There are still people living in poverty, diseases that need to be cured, people living in unsatisfying circumstances, whole societal structures that could be reformed and remade, and even things threatening the survival of all of humanity. People just don’t know what they could do about all these things, nor have they been given emotionally compelling stories that would make working on them feel valuable and important.

* Developing ways to live in harmony with one’s emotions. There’s a stereotype that casts reason and emotions as two opposed things, and a popular view of the world that makes people think that in order to succeed in life, they often have to grit their teeth and force themselves to do things that they wouldn’t actually want to do.

I think that both ways of looking at things are mistaken. Reason and emotions are two mechanisms for furthering our goals and protecting our well-being: they only seem opposed when the two mechanisms aren’t properly sharing information with each other, and come into conflict instead of co-operating. Any time that we have to use willpower in order to make ourselves do something that we “wouldn’t want to do” is a time when we have failed to bring different parts of our minds into harmony. They are situations when one part of our mind believes that we should do something and another is unconvinced, but instead of the two clearly considering the situation together and seeking to come to an agreement, one of them uses brute force to compel the other to obey.

This doesn’t need to be so. With enough practice, one should never need to encounter a situation where they have to do something unpleasant: either they would conclude that the thing wasn’t worth doing in the first place and happily give up on it, or have their whole being agree that it was worth doing and do it with pleasure.

Originally published at Kaj Sotala. You can comment here or there.

(1 echo left behind | Leave an echo)

3:06 am - Things that I’m currently the most interested in (Jan 21st of 2015 edition)

* Creating social environments that actively support and reinforce people’s growth, as well as the incentivizing the development of valuable projects. Our environment has a huge impact on us. The topics that we happen to see or hear discussed around us will, if not quite determine the topics that we spend our time thinking and ultimately caring about, at least vastly influence those topics. Similarly, the habits of the people around us affect our motivation and behavior: if everyone else is slacking off, then we too are likely to follow suit.

As the saying goes, actions speak louder than words: even people and communities with good intentions may get sidetracked into becoming less effective than they could be, if they spend a lot of time talking about noble goals but in practice do little but play games. On the flip side, if people really do consistently act in a purposeful manner, that is likely to also motivate the others around them.

I want to figure out the social technologies we need to consistently create communities which encourage people to develop themselves, work on valuable big-impact projects, and feel good about themselves.

* Reducing societal conflict by making people feel more safe. Our emotions evolved for specific purposes, one of which includes defending us and protecting us from threats. Someone lashing out in anger or exhibiting some other form of physical or emotional violence is an indication that there is a world-model in their head that evaluates the situation as being dangerous to their well-being, and requires a defensive reaction. Unfortunately, this has a tendency to make things worse: a person whose defensive systems has been engaged will predominantly focus on the potentially threatening aspects of the situation, causing them to exaggerate other people’s bad sides and be less likely to see those others as fellow humans. In response, the other person will (correctly!) feel unsafe, causing their protective systems to engage as well, and what started out as a minor disagreement may quickly escalate into a major conflict.

We are currently living in the safest, most well-off period in history. Our evolved instincts, still calibrated by evolution to a riskier time, have not properly caught up. What’s worse, parts of modern society exaggerate our perception of risks, and incentivize people to manufacture polarizing new conflicts. It can be seen all the time on social media, with communities united by their hate and mistrust of a common enemy, or people sharing articles ridiculing or highlighting the worst sides of their common enemies. As people stop viewing the people disagreeing with them as human beings inherently worthy of respect, and rather start to treat them as enemies, those others will lash out in return, their brains correctly interpreting the situation as a threatening one and engaging protective systems.

I would like to find ways to put a stop to this cycle.

* Creating a sense of purpose for people. Modern Western society has a distinct lack of clear vision and sense of purpose. Young people are told that they can do what they want with their lives, but are rarely given much in the way of suggestions of what could be a valuable, interesting thing to do with one’s life. Many drift aimlessly, never quite finding anything that would motivate them, or that would encourage them to really work hard for some deeply fulfilling aim.

There’s no need why this would need to be so. The world is full of valuable things that could be done, countless causes needing heroes. There are still people living in poverty, diseases that need to be cured, people living in unsatisfying circumstances, whole societal structures that could be reformed and remade, and even things threatening the survival of all of humanity. People just don’t know what they could do about all these things, nor have they been provided with emotionally compelling stories about working on these things that would make them feel valuable and important to do.

* Develop ways to live in harmony with one’s emotions. There’s a stereotype that has reason and emotions as two opposed things, and a popular view of the world that makes people think that in order to succeed in life, they often have to grit their teeth and force themselves to do things that they wouldn’t actually want to do.

I think that both ways of looking at things are mistaken. Reason and emotions are two mechanisms for furthering our goals and protecting our well-being: they only seem opposed when the two mechanisms aren’t properly sharing information with each other, and come into conflict instead of co-operating. Any time that we have to use willpower in order to make ourselves do something that we “wouldn’t want to do” is a time when we have failed to bring different parts of our minds into harmony. These are situations where one part of our mind believes that we should do something and another is unconvinced, but instead of the two clearly considering the situation together and seeking to come to an agreement, one of them uses brute force to compel the other to obey.

This doesn’t need to be so. With enough practice, one should never need to encounter a situation where they have to do something unpleasant. Either they would conclude that the thing wasn’t worth doing in the first place and happily give up on it, or have their whole being agree that it was worth doing, and do it with pleasure.

Originally published at Kaj Sotala. You can comment here or there.


Friday, January 16th, 2015
5:30 pm - On the plane

Mine is an eleven-hour flight: I’m sitting between two people, a woman on my left, by the window, and a man on my right, by the aisle.

We’ve hardly spoken to each other: she once asked if I preferred to have the window open or closed, and I spoke to him when I needed to go to the bathroom, apologizing and then thanking him for making room for me.

Still, in this cramped space it’s hard to avoid the feeling that we know each other, at least for a bit.

I know that he’s reading George R. R. Martin’s A Dance with Dragons.

I know that she’s been napping under a blanket for a large part of the time.

He was the only one who had brought food of his own. When one of the in-flight staff asked whether she wanted water or juice to drink, she said no, but she did ask when our food would be served. (In half an hour.)

I know that both of them, when given the choice between a meal with chicken or one with potatoes, went for the chicken. I went for the potatoes.

All three of us chose to have tea rather than coffee.

He’s been up from his seat twice; she hasn’t moved from hers; I’ve been up once.

I think that she’s attractive; I haven’t paid attention to his appearance. I don’t know what they think of mine.

I’m the only one who’s been using a laptop, he’s the only one who’s been reading a physical book. Both of them have watched onboard movies; I haven’t.

She and I happened to think of filling in our customs forms around the same time, and did so side by side. I haven’t seen him fill in his.

All of us end up occasionally touching each other, or stealing space for our elbows: it’s impossible not to. None of us says anything about it, each of us forgiving the violations of our personal space in exchange for having our similar violations forgiven.

As of this writing, it’s only two more hours before we arrive. I’ll enjoy their company for a while yet, and I do feel happy to have them here.


Tuesday, January 6th, 2015
6:38 pm - Plans need motivational components

One of the most valuable things that I got out of the Center for Applied Rationality’s recent workshop, but which took a while to really sink in, is that a plan isn’t finished until it also includes a component for how you’ll actually get yourself to carry it out.

I think that people in planning mode have a tendency to think of themselves as magical robots, as in “once I know what I need to do to accomplish my goal, the hard work is done and all that remains is executing the plan”. But in my experience, getting yourself to actually carry out the plan is the hard part. Everyone knows how to bungee jump, or how to get a date: just tie an elastic cord around your leg and jump, or just walk up to everyone who seems attractive and ask them out until someone says yes. It’s not figuring out what you need to do that’s hard.

Probably the thing that taught me this most viscerally was an exercise at the workshop, called Focused Grit. It’s really simple: you imagine that there’s an evil genie behind your back, giving you five minutes to solve some particular problem that you have. Once the five minutes have passed, the genie will delete your ability to ever think about the problem again. So if you don’t want the problem to stay with you for the rest of your life, you have five minutes to either actually solve it, or at least make a plan for solving it that’s good enough that you can just execute it afterwards.

Then you set a timer, and solve your problem within the next five minutes.

This works surprisingly well.

A mistake that a lot of people make with this technique at first is that they only create a plan which would work if they were to carry it out. Then they stop there, feeling that they’re done.

But remember the evil genie. You won’t have a chance to develop your plan further once the five minutes are done, and that includes trying to motivate yourself to carry out the plan. When the five minutes are up, you need to actually be in a state where you’ll carry out the plan, or you’ll be stuck with your problem for the rest of your life. And the genie will laugh at you.

I found this to be a very effective way to internalize the “a plan is only complete once it includes a component for how you’ll actually complete it” lesson. In the past, I used to do write-ups of techniques that seemed good and useful if I could get myself to use them, but which I knew I was unlikely to actually use. They seemed so good on paper!

Now I know better. A technique that you don’t think that you’ll be able to use isn’t good even on paper.

This is now the most important lens that I use to evaluate all of my plans and techniques.

Originally published at Kaj Sotala. You can comment here or there.


Thursday, January 1st, 2015
1:03 pm - Looking back at 2014

2014 was one of the best and worst years of my life.

It started with the worst: in the first three months or so, my girlfriend and I broke up, the part-time job I was doing started feeling unmotivating, and I realized I didn’t have the energy to both do the job and work on my thesis at the same time. Romance, work, studies: three major spheres of my life, all crashing down around the same time.

I mostly recovered from the breakup and put my thesis on a temporary hold, but work continued to be unmotivating. In the summer I went to see a psychiatrist, and was prescribed antidepressants. Thus started the better part of the year, as I realized that I’d been suffering from a mild depression for years without knowing it. The meds went a long way towards fixing that, and everything started looking brighter. There were still down periods, but even they were better than the down periods I was having before the meds.

Of the concrete things that happened, there are so many things I could cover.

I’ve definitely been becoming a lot more social and extroverted during the year. In April there was the first Less Wrong European Community Weekend in Berlin, which was a lot of fun in itself, and also led to me becoming close friends with several people. In November I attended the Center for Applied Rationality’s workshop in England, which led to me starting my own rationality workshops here in Finland, and also crafting a local, more tightly-knit community of people who would support each other in making each other’s lives awesome. The workshop also caused me to finally start organizing the regular “come and hang out with me in a bar” evenings that I’d been intending to do for the last half a year. I also made and strengthened several other friendships in unrelated ways.

A large part of the boost also came from the antidepressants, as well as from reading several books that helped me considerably level up my social skills. The Charisma Myth was the first, followed by Non-Violent Communication, which not only helped me resolve conflicts I’d been having with others but also made my own emotions clearer to me. In the last few days I’ve started reading Crucial Conversations, which has a lot in common with Non-Violent Communication but also covers many things that NVC didn’t.

I continued working on some academic papers on the side, kind of as a hobby. At the beginning of the year, “The errors, insights and lessons of famous AI predictions” by Stuart Armstrong, Sean Ó hÉigeartaigh, and me was published in the Journal of Experimental & Theoretical Artificial Intelligence. Around the end of the year, I had a paper accepted to an AAAI workshop on AI and ethics, and Physica Scripta formally published the paper that Roman Yampolskiy and I had written in 2013 and until then only had available as a technical report. Google Scholar reports that there were 15 citations to my different papers in 2014, up from the 9 citations that I got in 2013.

On the topic of hobbies, I had for a long time liked the idea of game mastering role-playing games, but in practice rarely had the time or energy to do the necessary preparation for them. Now I finally got into various RPGs that were designed to require only minimal advance preparation, and they turned out to be a lot more fun to run than the old-style games. (E.g. various games built on the move-based engine pioneered by Apocalypse World, and games like J Matias Kivikangas’s Here Be Dragons, which I unfortunately still haven’t gotten a chance to run. Soon!)

On a front that’s harder to describe, I started a large-scale restructuring of how I thought about ethics and morality. In a sense, I had ended up with a kind of externalized sense of morality, which caused me a lot of guilt and stress. I started making a transition towards a more internalized morality, which has helped a lot.

Now as we enter 2015, a lot about my future is unclear. I’m intending to finally graduate with my MSc around summer, and I’m uncertain of what I will do after that. I’ve actually been feeling sufficiently extroverted as to start pondering whether I would actually prefer some kind of a career that involved being social and interacting with lots of different people on a daily basis, as opposed to the more introverted, technical kinds of careers that I’d been mostly thinking of before.

In any case, I feel that I’m now leveling up much faster than I was before, and am becoming far better positioned to tackle different challenges in life. Hopefully things will go well.

Originally published at Kaj Sotala. You can comment here or there.


Wednesday, December 31st, 2014
9:57 am - Social media saps more than just short-term attention

The prevalent wisdom about why social media is distracting is that it provides a constant opportunity for immediate distraction. Whenever your work feels even slightly unsatisfying, there’s the temptation to take a momentary break by looking at Facebook, and then you’ve spent fifteen minutes chatting away when you should have been working.

There’s a lot of truth to this. I’ve experienced it first-hand many times, and talked a lot about it in my essay about the addiction economy.

But I find that’s only a part of the problem. In addition to sapping short-term attention, social media also damages long-term attention. (I’m focusing on social media here because it’s the one I’m most hooked on myself – but any other source of quick, immediate reward would have the same effect.)

Take a day when I don’t have access to social media, and don’t have anything else in particular to do, either. My typical behavior on such days is that I might be bored for a while, maybe take a walk, and then gradually, over some time, get ideas for projects that I could be doing, and start working on them.

In contrast, on a day when I do have access to Facebook, say, at the point when I start growing bored I’ll glance at Facebook, because hey, why not? I’m just taking a quick look to see if there are any updates or new notifications, I’ll get offline right after that.

And maybe I do. Often I do succeed in just checking the updates and notifications, maybe briefly commenting on something, then closing Facebook again. But what then happens is that sometime later, I’ll take another quick look at Facebook. And again. And again.

And then that period of idle, slightly bored mind-wandering never gets to the point where I start gathering the motivation to work on a project of my own. Because at the point when I start feeling bored, my default action is to look at Facebook, filling my mind with whatever is happening there rather than letting it come up with new things to do. Even when I close the browser tab, the gradually forming idea of “hey, maybe I could do X” has been flushed away by whatever was in the window, and needs more time to re-form.

Sometimes I take longer breaks from social media, after having used it quite heavily on previous days. On such occasions, it’s often been my experience that it takes a day for my mind to recalibrate its expectations – on the first day I’m constantly anxious to get on Facebook, but after that my creativity starts to come back. It is written:

Complex systems learn by adjusting to feedback, and feedback that is sufficiently loud and frequent will oversaturate the system’s inputs, leading it to reduce its overall sensitivity in order to register changes. When instant and immediate gratification becomes the norm, more subtle forms of feedback become harder to register. Getting engrossed in a book becomes increasingly difficult. The same goes for different kinds of stories: it’s easier to sit through an action movie than a drama because the story is simple and the movie is mostly comprised of satisfying bits of conflict resolution in the simple form of karate chops and shootouts. We might force ourselves to sit through a few chapters of Tolstoy, but the real issue is that we ultimately have to re-calibrate our receptivity to feedback in order to gain interest in more subtle flavors of experience.

Subtle flavors of experience, like the barely noticeable sensation in your mind that’s the stirring of a new idea, which you could allow to grow and develop.

Studies suggest that the mental effort involved in a task may be proportional to the opportunity cost of not doing something else. In other words, things aren’t so much intrinsically appealing or unappealing, but more appealing or unappealing relative to the appealingness of the best thing that you could be doing instead. If you have constant access to video games, going outside for a walk may seem like something pretty boring, but if you don’t have anything better to do, you may notice that going for a walk actually feels like a pretty nice idea.

Presumably this works for unconscious task-selection, too. If social media is always available as an option, then momentarily checking it may be treated by your unconscious brain as having a higher reward than starting to think about something with a more long-term payoff, such as a creative project.

The insidious thing here is that you may not notice the effect this has on you. From your perspective, yeah, you’re looking at social media every now and then, but it’s always just short moments, and you’re spending the vast majority of your time not on social media. So why are you still feeling listless and easily distracted?

Because it isn’t enough to spend the majority of your time away from distractions, if that time isn’t also spent continuously away from them.

As it happens, I had been thinking about this topic for a while, but only wrote up this essay on an occasion when I’d decided to spend the rest of the day off social media. Then this essay started formulating itself in my mind, and I wrote it up in pretty much one go, to be posted at a later time.

Originally published at Kaj Sotala. You can comment here or there.

