Posts Tagged ‘Nash equilibrium’

On Rationality

August 28, 2007

I want to expand on what I wrote previously in “A Simple But Challenging Game: Part II, this time focusing on Rosenthal’s Centipede Game. To remind you of the rules, in that game there are two players. The players, named Mutt and Jeff, start out with $2 each, and they alternate rounds. On the first round, Mutt can defect by stealing $2 from Jeff, and the game is over. Otherwise, Mutt cooperates by not stealing, and Nature gives Mutt $1. Then Jeff can defect and steal $2 from Mutt, and the game is over, or he can cooperate and Nature gives Jeff $1. This continues until one or the other defects, or each player has $100.

As I previously wrote, in this game, the Nash equilibrium is that Mutt should immediately defect on his first turn. This result is obtained by induction. When both players have $99, it is clearly in Mutt’s interest to steal from Jeff, so that the he will end with $101, and Jeff will end with $97. But that means that when Jeff has $98 and Mutt has $99, Jeff knows what Mutt will do if he cooperates, and can see that he should steal from Mutt, so that he will end with $100 and Mutt will end with $97. But of course that means that when both players have $98, Mutt can see that he should steal from Jeff, and so on, until one reaches the conclusion that Mutt should start the game by stealing from Jeff.

Of course, this Nash equilibrium behavior doesn’t really seem very wise (not to mention ethical), and experiments show that humans will not follow it. Instead they usually will cooperate until the end or near the end of the game, and thus obtain much more money than would “Nashists” who rigorously follow the conclusions of theoretical game theory.

Game theorists often like to characterize the behavior of Nashists as “rational,” which means that they need to explain the “irrational” behavior of humans in the Rosenthal Centipede Game. See for example, this economics web-page, which gives the following “possible explanations of ‘irrational’ behavior”:

There are two types of explanation to account for the divergence. The first assumes that the subject pool contains a certain proportion of altruists who place a positive weight in their utililty function on the payoff of their opponent. Also to the extent that selfish players believe that there is some probability that other players are altruists, they have an incentive to mimic altruistic behaviour by passing.

The second explanation considers the possibility of action errors. Errors in action, or ‘noisy’ play, may result from subjects experimenting with different strategies. Or simply from subjects pressing the wrong key.

Let’s step back for a second and consider what “rational” behavior should mean. A standard definition from economics is that a rational agent will act so as to maximize his expected utility. Let’s accept this definition of “rational.”

The first thing we should note is that “utility” is not usually the same as “pay-off” in a game. As noted in the first explanation above, many people get utility from helping other people get a pay-off. But there are many other differences between pay-offs and utility. You might lose utility from performing actions that seem unethical or unjust, and gain utility from performing actions that seem virtuous or just. You might want to minimize the risk in your pay-off as well as maximize the expected pay-off. You might value pay-offs in a non-linear way, so that the difference between $101 and $100 is very small in terms of utility.

Of course, this difference between pay-off and utility is very annoying theoretically. We’d really like the pay-offs to strictly represent utilities, but unfortunately for experiments, it is only possible to hand out dollars, not some abstract “utils.”

But suppose that the pay-offs in the Rosenthal Centipede Game really did represent utils. Would the game theory result really be “rational” even in that case? Would the only remaining explanation of cooperating behavior be that the players just don’t understand the situation and are making an error?

No. Remember that to be “rational,” an agent should maximize his expected utility. But he can only do that conditioned on some belief about the nature of the person he is playing with. That belief should take the form of a probability distribution for the possible strategies of his opponent. A Nashist rigidly reasons by backward induction that his opponent must always defect at the first opportunity. He even believes this if he plays second, and his opponent cooperates on the first turn! But is this the most accurate belief possible, or the one that will serve to maximize utility? Probably not.

A much more accurate belief could be based on the understanding that even people who understand the backward induction argument can reason beyond it and see that many of their opponents are nevertheless likely to cooperate for a long time, and therefore it pays to cooperate. If you believe that your opponent is likely to cooperate, it is completely “rational” to cooperate. And if this belief that other players are likely to cooperate is backed by solid evidence such as the fact that they started the game by cooperating, then the behavior of the Nashist, based on inaccurate beliefs that cannot be updated, is in fact quite “irrational,” because it does not maximize his utility.

Sophisticated game theorists do in fact understand these points very well, but they muddy the waters by unnecessarily overloading the term “rational” with a second meaning beyond the definition above; they in essence say that “rational” beliefs are those of the Nashist. For example, take a look at this 1995 paper about the centipede game by Nobel Laureate Robert Aumann. Aumann proves that “Common Knowledge of Rationality” (by which he which he essentially means the certain knowledge that all players must always behave as Nashists) will imply backward induction. He specifically adds the following disclaimer at the end of his paper:

We have shown that common knowledge of rationality (CKR) implies backward induction. Does that mean that in perfect information games, only the inductive choices are appropriate or wise? Would we always recommend the inductive choice?

Certainly not. CKR is an ideal (this is not a value judgement; “ideal” is meant as in “ideal gas”) condition that is rarely met in practice; when it is not met, the inductive choice may be not only unreasonable and unwise, but quite simply irrational. In Rosenthal’s (1982) centipede games, for example, even minute departures from CKR may make it incumbent on rational players to “stay in” until quite late in the game (Aumann, 1992); the resulting outcome is very far from that of backward induction. What we have shown is that if there is CKR, then one gets the backward induction outcome; we do not claim that CKR obtains or “should” obtain, and we make no recommendations.

This is all well and good, but why use the horribly misleading name “Common Knowledge of Rationality” for something that would be more properly called “Universal Insistence on Idiocy?”

I hope it is obvious by now why I am skeptical of explanations of various types of human behavior that are based on assuming that all humans are always Nashists, and even more skeptical of recommendations about how we should behave that are based on those same assumptions.

[Acknowledgement: I thank my son Adam for discussions about these issues.]

A Simple but Challenging Game: Part II

August 15, 2007

Here’s Part I again:

My 15-year-old son Adam likes game theory. He invented the following simple game, and asked me about it when I got on the phone with him while I was away at a conference last month (I’ve simplified and formalized the set-up slightly):

There are two players, each of whom is given a real number which is chosen randomly from a uniform distribution between 0.0 and 1.0. The players know their own number but not their opponent’s. One player moves first and has the choice of passing or challenging. If he challenges, both players reveal their number, and the player with the higher number receives a payoff of 1, while the other player receives a payoff of 0. If the first player passes, the second player has a choice of challenging or passing. If he challenges, again both players reveal their numbers and the player with the higher number receives a payoff of 1, while the other player receives a payoff of 0. If the second player also passes, both players receive a payoff of 1/2. They play the game one time, and are interested in maximizing their expected payoff.

What is the right strategy? For example, if you received the number 0.17, would you pass or challenge if you were the first player? What about if you were the second player? What would you do if the number you received was 0.0017?

I’ll tell you more in a later post, but for now why don’t you think about it….

Here’s the promised followup:

It is clear that if any player has the advantage, it’s the second player, because he gets some information from the first player, and can use it to make his decision. Nevertheless, the first player can adopt the strategy of always challenging, and thereby guarantee that he wins half the time. So apparently he should always challenge. This is the answer that was given by “Optionalstopping” in the comments. The same answer was given to me by the evolutionary game theorist Arne Traulsen (who has recently worked on a ground-breaking theory for the emergence of punishment), after I asked him about the game at a lunch conversation.

There is another way to arrive at the same answer. Assume that each player chooses a strategy parameterized by a single value; if he receives a number above that value, he challenges, while if he receives a number below that value he passes. If you work out the Nash equilibrium (basically, that means that both you and your opponent pick the strategy that gives you the best payoff assuming that the opponent is maximizing their payoff), you’ll find (I won’t bore you with the math) that the value for both players is zero–they should always challenge. My son Adam gave an intuitive version of this argument, without the math. Of course it’s true that probabilistic strategies are also possible. I haven’t proven it, but I strongly doubt that introducing probabilistic strategies will change the result that the Nash equilibrium is to always challenge for both players.

So at first I thought the matter was settled. But still, there is something very weird about this result. Would you really challenge if you were the first player and you received a .001? Would you really?

And what if you were the second player, and the first player passed, and you had a .000001? You know that the first player is not “following the rules” of Nash equilibrium. Are you really going to challenge because Nash tells you to? It’s obviously a crazy result! There must be a hole in the arguments.

So what’s wrong with the above arguments?

First let’s start with the Nash equilbrium arguments. By the way, many authors use the term “rational” for players that use strategies dictated by Nash equilibrium arguments, but I think “rational” and “irrational” are excessively loaded terms, so I prefer to instead say that a player that follows a strategy dictated by Nash equilibrium arguments is a “Nashist.”

If you are the second player, and the first player has passed, you can deduce that the first player is not a Nashist. So in order to make a correct play (maximize your expected winnings) you need to choose some probability distribution for what the first player’s strategy is, and then compute whether you will win more or less depending on whether you challenge. There doesn’t seem to be any obvious way to choose the probability distribution for the first player’s strategy, but you can definitely say that he is not certainly a Nashist! A Nashist is stuck (by definition of being a Nashist) believing that all other players are always Nashists, even in the face of clear evidence that they are not (you see why I don’t like to call Nashists “rational”) and would choose the strategy that followed from that obviously wrong belief; he would always challenge as the second player.

A non-Nashist, on the other hand, can come up with a reasonable probability distribution for the first player’s strategy, and come to the conclusion that he should pass if he is the second player and he has a .000001.

OK, so we can see why the second player might want to pass if he receives a .000001. What about the first player: is it wrong to be a Nashist? Should you pass or challenge if you get a .001, or a .000001, or a .000000001? A Nashist would be compelled, by the force of his “idealogy,” to challenge in each case. But you can make a good argument that that’s wrong. Instead, let’s say I am the first player and I have a .000001. I know that if I pass, the second player will be able to deduce that I’m not a Nashist, and will go through the argument given above for the second player. Now “all” I have to do is form my probability distribution for what his probability distribution of my strategy will be, and compute whether I will have more chance of winning depending whether I challenge or not. Again there’s no obvious way to choose these probability distributions, but it seems pretty clear to me to that reasonable probability distributions will give a result that says don’t challenge if you are the first player and you have a sufficiently low number.

Well what about the other argument, that says that the first player should always challenge, since he is at a disadvantage and if he always challenges, he’ll win half the time? It seems paradoxical to think that there is a strategy that can do better than winning half the time for the first player.

Of course, all Nashists will always win exactly half the time, whether they play first or second. If you are playing against someone who you know is a Nashist, it actually doesn’t matter what you do! But suppose instead that you are playing against an ordinary human. You should play forming the best possible probability distribution of what they will do. Many humans will challenge if they have a number above .5 and pass otherwise, whether or not they play first (a very bad strategy by the way). It is perfectly possible, indeed likely, that playing against a population of ordinary humans, there exists a strategy that wins more than half the time for the first player. I can’t prove that strategy exists by arguing in this way (one would need to run experiments to determine the probability distributions of strategies, and then it would be easy to compute), but I’m actually pretty confident that it does exist, and I’m also pretty confident that the strategy involves passing when you are given a .000001 as the first player.

So I don’t believe either argument for always challenging holds up, which is comforting, because always challenging does seem intuitively wrong. Unfortunately, I can’t tell you exactly what the optimal strategy is either, at least until you tell me what the true probability distribution is for player strategies.

images1.jpg

By the way, Arne recommended that I pick up Game Theory Evolving, by Herbert Gintis, for my son. It’s a wonderful book, full of interesting games and solved problems in game theory. Adam and I both love it. Gintis gives other examples showing that Nashists (he calls them “self-regarding agents”) can choose bizarre strategies, including “Rosenthal’s Centipede Game:”

The players, Mutt and Jeff, start out with $2 each, and they alternate rounds. On the first round, Mutt can defect by stealing $2 from Jeff, and the game is over. Otherwise, Mutt cooperates by not stealing, and Nature gives Mutt $1. Then Jeff can defect and steal $2 from Mutt, and the game is over, or he can cooperate and Nature gives Jeff $1. This continues until one or the other defects, or each player has $100.

In this game, the Nash equilibrium, obtained by induction by working backwards from the end of the game, when it is clearly “correct” to defect, is that Mutt should immediately defect on his first turn. So that’s what a Nashist would do, but fortunately humans are much more “rational” than Nashists!