Functional Decision Theory

This is a summary of parts of MIRI’s FDT paper, “Functional Decision Theory: A New Theory of Instrumental Rationality” by Eliezer Yudkowsky and Nate Soares.

A decision theory is a way of choosing actions in a given situation. There are two competing decision theories that have been investigated for decades: causal decision theory (CDT) and evidential decision theory (EDT).

CDT asks: which action would cause the best outcome?

EDT asks: which action would I be most delighted to learn that I had taken?

These theories both perform well on many problems, but on certain problems each of them picks actions that we would intuitively regard as poor.

Functional decision theory is an alternative to these two forms of decision theory that performs better on all known test problems.

Why not CDT?

CDT works by asking: given exactly what I know now, which action would cause the best outcome? The process for figuring this out is to enumerate the available actions and calculate the payoff of each. Causal deciders have a model of the world that they manipulate to predict the future from the present. Intuitively, it seems like this should perform well.
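Here is a minimal sketch of that procedure in Python, with made-up actions and payoffs; the point is just that CDT scores each action against a fixed model of the world and takes the maximum:

```python
# A toy sketch of CDT's decision rule. The "world model" is reduced to a
# fixed payoff table; all names and numbers here are invented.
def cdt_choose(payoffs):
    """Pick the action that the world model says causes the highest payoff."""
    return max(payoffs, key=payoffs.get)

# Hypothetical payoffs: what each action would cause, everything else fixed.
payoffs = {"take_umbrella": 5, "leave_umbrella": 3}
print(cdt_choose(payoffs))  # -> take_umbrella
```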

Asking what would give you the best outcome in a given situation only works when the situation doesn’t depend on your own thought process. That rules out situations involving other people who are reasoning about you. Anyone who has played checkers knows the experience of trying to work out what their opponent will do in order to figure out their own best move.

Causal decision theory fails at reasoning about intelligent opponents in some spectacular ways.

Newcomb’s Problem

Newcomb’s problem goes like this:

Some super-powerful agent called Omega is known to be able to predict with perfect accuracy what anyone will do in any situation. Omega confronts a causal decision theorist with the following dilemma: “Here is a large box and a small box. The small box has $1000 in it. If I have predicted that you will only take the large box, then I have put $1 million into it. If I have predicted that you will take both boxes, then I have left the large box empty.”

Omega has already made their decision: the large box is already filled or already empty. Nothing the causal decision theorist does now can change that. The causal decision theorist will therefore take both boxes, because either way taking both means an extra $1000.

But of course Omega predicts this and the large box is empty.
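The trap is easy to see in code. This toy sketch uses the dollar amounts from the problem, with Omega’s perfect prediction modeled as a function of the agent’s action:

```python
# A sketch of why CDT two-boxes, and why that backfires.
def payoff(action, large_filled):
    """Money received, given the (already fixed) contents of the large box."""
    amount = 1_000_000 if large_filled else 0
    if action == "two_box":
        amount += 1_000
    return amount

# CDT's dominance argument: whatever the box holds, two-boxing pays $1000 more.
for large_filled in (True, False):
    assert payoff("two_box", large_filled) == payoff("one_box", large_filled) + 1_000

# But Omega fills the box based on a perfect prediction of the action:
def newcomb(action):
    large_filled = (action == "one_box")  # perfect predictor
    return payoff(action, large_filled)

print(newcomb("two_box"))  # 1000
print(newcomb("one_box"))  # 1000000
```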

Since causal decision theory doesn’t work on some problems that a human can easily solve, there must be a better way.

Evidential decision theorists will take only the large box in Newcomb’s problem. They’ll do this because they will think to themselves: “If I later received news that I had taken only one box, then I would know that I had received $1 million. I prefer that to the news that I took both boxes and got $1000, so I’ll take only the one box.”

So causal decision theory can be beaten on at least some problems.

Why not EDT?

An evidential decider works by considering the news that they have performed a certain action; whichever action would be the best news is the one they take. Evidential deciders don’t manipulate a model of the world to work out which action causes the best outcome. They simply calculate the probability of each payoff given each choice. Intuitively, this seems like it would be easy to take advantage of, and indeed it is.
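A toy sketch of that rule, with invented actions and conditional payoff distributions; note that there is no causal model here, just conditional probabilities:

```python
# A sketch of EDT's decision rule under toy assumptions. The agent scores each
# action by the expected payoff conditional on the news that it took that
# action; the distributions below are made up for illustration.
def edt_choose(conditional):
    """conditional: action -> list of (probability, payoff) given that action."""
    news_value = {a: sum(p * pay for p, pay in dist)
                  for a, dist in conditional.items()}
    return max(news_value, key=news_value.get)

# Hypothetical numbers for P(payoff | action):
conditional = {"act_a": [(0.5, 10), (0.5, 0)],
               "act_b": [(0.9, 4), (0.1, 0)]}
print(edt_choose(conditional))  # -> act_a (expected 5.0 vs 3.6)
```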

Evidential decision theorists can also be led astray on certain problems that a normal human will do well at.

Consider the problem of an extortionist who writes a letter to Eve the evidential decider. Eve and the extortionist both heard a rumor that her house had termites. The extortionist is just as good as Omega at predicting what people will do. The extortionist found out the truth about the termites, and then sent the following letter:

Dear Eve,

I heard a rumor that your house might have termites. I have investigated, and I now know for certain whether your house has termites. I have sent you this letter if and only if exactly one of the following is true:

a) Your house does not have termites, and you send me $1000.
b) Your house does have termites.

Sincerely,
The Notorious Termite Extortionist

Eve knows that it will cost more than $1000 to fix the termite problem. So when she receives the letter, she will think to herself:

If I learn later that I paid the extortionist, then that would mean that my house didn’t have termites. That is cheaper than the alternative, so I will pay the extortionist.

The problem here is that paying the extortionist has no impact on the termites at all. Eve can’t see that, because she doesn’t have a concrete model of the world that she uses to predict outcomes. She’s just naively computing the probability of an outcome given an action. That only works when she isn’t playing against an intelligent opponent.
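Here is Eve’s calculation spelled out, with a hypothetical repair cost of $10,000 (the story only says repairs cost more than $1000):

```python
# Eve's EDT reasoning on receiving the letter. REPAIR is an assumed number.
REPAIR, EXTORTION = 10_000, 1_000

# Conditional on the letter having arrived:
cost_if_pay = EXTORTION   # case (a): no termites, she pays the extortionist
cost_if_refuse = REPAIR   # case (b): termites, she pays for repairs
print(min([("pay", cost_if_pay), ("refuse", cost_if_refuse)],
          key=lambda t: t[1]))
# -> ('pay', 1000): paying is the "best news", though it affects no termites
```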

If the extortionist tried to use this strategy against a causal decision theorist and found no termites, the letter would never be sent: the extortionist would predict that the causal decider won’t pay, making both conditions of the letter false. (And if the house does have termites, the letter carries no threat at all; it just delivers the bad news.) A causal decision theorist never has to worry about being extorted this way.

Why FDT?

EDT does better in some situations, and CDT does better in others. This implies that you could outperform both by choosing the right decision theory for each context. That, in turn, suggests that a strictly better decision theory is possible, and MIRI’s functional decision theory is a candidate for exactly that.

Functional Decision Theory asks: what is the best thing to decide to do?

The functional decider has a model of the world that they use to predict outcomes, just like the causal decider. The difference is in how the model is used. A causal decider models changes in the world based on which action is taken. A functional decider models changes in the world based on which policy is used to decide.

A functional decision theorist would take only one box in Newcomb’s problem, and they would not succumb to the termite extortionist.

FDT and Newcomb’s problem

When presented with Newcomb’s problem, a functional decider asks which decision is best, not which action is best given what is already in the boxes.

If they decide to take only the one box, then they know that they will be predicted to make that decision. Thus they know that the large box will be filled with $1 million.

If they decide to take both boxes, then they know they will be predicted to take both boxes. So the large box will be empty.

Since the policy of deciding to take one box does better, that is the policy that they use.
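A sketch of that policy comparison, with Omega’s prediction modeled as the output of the very same policy being evaluated:

```python
# The functional decider's reasoning on Newcomb's problem, with the problem's
# payoffs. The key move: the prediction is the same function as the policy.
def outcome(policy):
    prediction = policy  # Omega computes the same function the agent does
    large = 1_000_000 if prediction == "one_box" else 0
    small = 1_000 if policy == "two_box" else 0
    return large + small

best = max(("one_box", "two_box"), key=outcome)
print(best, outcome(best))  # -> one_box 1000000
```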

FDT and the Termite Extortionist

Just like the causal decider, the functional decider will never get an extortion letter for a termite-free house. If there’s ever a rumor that the functional decider’s house has termites, the extortionist will investigate. If there are no termites, the extortionist will predict what the functional decider would do upon receiving the letter:

If I would pay upon receiving the letter, then the extortionist will predict this and send me the letter. If I would not pay, then the extortionist will predict that and not send it. It is better not to get the letter, so I will follow the policy of not paying.

The functional decider would not pay, even if they got the letter, because being the kind of agent who pays is exactly what would guarantee getting the letter.
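A sketch of that comparison, under two assumptions not in the story: repairs cost $10,000 and the rumor gives a 50% chance of termites.

```python
# FDT's policy comparison for the extortion letter. REPAIR and the 50/50
# termite probability are hypothetical numbers for illustration.
REPAIR, EXTORTION = 10_000, 1_000

def expected_cost(pays_if_letter):
    total = 0.0
    for termites, prob in ((True, 0.5), (False, 0.5)):
        a = (not termites) and pays_if_letter  # condition (a) of the letter
        b = termites                           # condition (b) of the letter
        letter_sent = (a != b)                 # sent iff exactly one holds
        cost = REPAIR if termites else 0
        if letter_sent and pays_if_letter:
            cost += EXTORTION
        total += prob * cost
    return total

print(expected_cost(True))   # 6000.0: a payer is extorted whenever possible
print(expected_cost(False))  # 5000.0: a refuser only ever pays for real repairs
```

The exact numbers don’t matter; the refuser always comes out ahead, because the extortion payment simply never enters their ledger.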

The differing circumstances for CDT and EDT

Newcomb’s problem involves a predictor that models the agent and determines the outcome.

The termite extortionist involves a predictor that models the agent, but imposes a cost that’s based on something that the agent cannot control (the termites).

What these two problems have in common is that the predictor’s behavior and the agent’s behavior are linked by something called subjunctive dependence.

Causal dependence between A and B: A causes B

Subjunctive dependence between A and B: A and B are computing the same function

FDT is to subjunctive dependence as CDT is to causal dependence.

A causal decider makes decisions by assuming that, if their decision changes, anything caused by that decision could change.

A functional decider makes decisions by assuming that, if the function they use to choose an action changes, anything else that depends on that function could change, including things that happened in the past. The functional decider doesn’t actually believe that their decision changes the past. They just recognize that the way they decide is evidence about what actually happened in the past, whenever those past events involved computing the same function that the functional decider is computing now.
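A toy illustration of the idea:

```python
# Subjunctive dependence: two physically separate processes (the agent
# deciding, Omega predicting) compute the same function, so they agree
# without either one causing the other.
def decision_function():
    return "one_box"

omega_prediction = decision_function()  # computed yesterday, inside Omega
agent_choice = decision_function()      # computed today, inside the agent

assert omega_prediction == agent_choice  # same function, same output
```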

Do you support yourself?

One final point in favor of functional decision theory is that it endorses its own use. A functional decider will make the same decision regardless of when they are asked to make it.

Consider a person trapped in a desert. They’re dying of thirst, and think that they are saved when a car drives by. The car rolls to a stop, and the driver says “I’ll give you a ride into town for $1000.”

Regardless of whether the person is a causal, evidential, or functional decider, they will pay the $1000 if they have it.

But now imagine that they don’t have any money on them.

“Ok,” says the driver, “then I’ll take you to an ATM in town and you can give me the money when we get there. Also, my name is Omega and I can completely predict what you will do.”

If the stranded desert-goer is a causal decider, then when they get to town they will see the problem this way:

I am already in town. If I pay $1000, then I have lost money and am still in town. If I pay nothing, then I have lost nothing and am still in town. I won’t pay.

The driver knows that they will be cheated, and so drives off without the thirsty causal decider.

If the desert-goer is an evidential decider, then once in town they’ll see things this way:

I am already in town. If I later received news that I had paid, then I would know I had lost money. If I received news that I hadn’t paid, then I would know that I had saved money. Therefore I won’t pay.

The driver, knowing they’re about to be cheated, drives off without the evidential decider.

If the desert-goer is a functional decider, then once in town they’ll see things this way:

If I decide to pay, I’ll be predicted to have decided to pay, and I will be in town and out $1000. If I decide not to pay, then I’ll be predicted to not pay, and I will be still in the desert. Therefore I will decide to pay.

So the driver takes them into town and they pay up.
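A sketch of the whole dilemma, with a hypothetical $1,000,000 value placed on surviving; the fare is the $1000 from the story:

```python
# The desert dilemma as FDT sees it. LIFE is an assumed utility for survival.
LIFE, FARE = 1_000_000, 1_000

def outcome(pays_in_town):
    if not pays_in_town:   # the driver predicts this and never gives a ride
        return 0           # still in the desert
    return LIFE - FARE     # driven to town, then pays at the ATM

best = max((True, False), key=outcome)
print(best, outcome(best))  # -> True 999000: the policy of paying wins
```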

The problem is that causal and evidential deciders can’t step outside their own algorithm far enough to see that they’d prefer to pay. If you gave them the explicit option to commit to paying before reaching town, they would take it.

Of course, functional deciders also can’t step out of their algorithm. Their algorithm is just better.