Do People Play Nash Equilibrium? Lessons from Evolutionary Game Theory
by
George J. Mailath
Citation: George J. Mailath (1998). Do People Play Nash Equilibrium? Lessons from Evolutionary Game Theory. Journal of Economic Literature 36(3), 1347-1374. Language: English. Updated: October 30th, 2012.
1. Introduction
At the same time that noncooperative game theory has become a standard tool in economics, it has also come under increasingly critical scrutiny from theorists and experimentalists. Noncooperative game theory, like neoclassical economics, is built on two heroic assumptions: maximization (every economic agent is a rational decision maker with a clear understanding of the world) and consistency (the agent's understanding, in particular, expectations, of other agents' behavior, is correct; i.e., the overall pattern of individual optimizing behavior forms a Nash equilibrium). These assumptions are no less controversial in the context of noncooperative game theory than they are in neoclassical economics.
[...] theory in applications is problematic. The appropriate use of game theory requires understanding when its assumptions make sense and when they do not.
In some ways, the challenge of providing a compelling justification is not a new one. A major complaint other social scientists (and some economists) have about economic methodology is the central role of the maximization hypothesis. A common informal argument is that any agent not optimizing (in particular, any firm not maximizing profits) will be driven out by market forces. This is an evolutionary argument, and as is well known, Charles Darwin was led to the idea of natural selection from reading Thomas Malthus.2 But does such a justification work? Is Nash equilibrium (or some related concept) a good predictor of behavior?
2 "In October 1838, that is, fifteen months after I had begun my systematic enquiry, I happened to read for amusement 'Malthus on Population,' and being well prepared to appreciate the struggle for existence which everywhere goes on from long continued observation of the habits of animals and plants, it at once struck me that under these circumstances favorable variations would tend to be preserved, and unfavorable ones to be destroyed. The results of this would be the formation of new species. Here, then I had at last got a theory by which to work;" Charles Darwin (1887, p. 83).
While the parallel between noncooperative game theory and neoclassical economics is close, it is not perfect. Certainly, the question of whether agents maximize is essentially the same in both. Moreover, the consistency assumption also appears in neoclassical economics as the assumption that prices clear markets. However, a fundamental distinction between neoclassical economics and noncooperative game theory is that, while the many equilibria of a competitive economy almost always share many of the same properties (such as efficiency or its lack),3 the many equilibria of games often have dramatically different properties. While neoclassical economics does not address the question of equilibrium selection, game theory must.
Much of the work in evolutionary game theory is motivated by two basic questions:
 Do agents play Nash equilibrium?
 Given that agents play Nash equilibrium, which equilibrium do they play?
Evolutionary game theory formalizes and generalizes the evolutionary argument given above by assuming that more successful behavior tends to be more prevalent. The canonical model has a population of players interacting over time, with their behavior adjusting over time in response to the payoffs (utilities, profits) that various choices have historically received. These players could be workers, consumers, firms, etc. The focus of study is the dynamic behavior of the system. The crucial assumptions are that there is a population of players, these players are interacting, and the behavior is naive (in two senses: players do not believe, or understand, that their own behavior potentially affects future play of their opponents, and players typically do not take into account the possibility that their opponents are similarly engaged in adjusting their own behavior). It is important to note that successful behavior becomes more prevalent not just because market forces select against unsuccessful behavior, but also because agents imitate successful behavior.
3 Or perhaps economists have chosen to study only those properties that are shared by all equilibria. For example, different competitive equilibria have different income distributions.
Since evolutionary game theory studies populations playing games, it is also useful for studying social norms and conventions. Indeed, many of the motivating ideas are the same.4 The evolution of conventions and social norms is an instance of players learning to play an equilibrium. A convention can be thought of as a symmetric equilibrium of a coordination game. Examples include a population of consumers who must decide which type of good to purchase (in a world of competing standards); a population of workers who must decide how much effort to exert; a population of traders at a fair (market) who must decide how aggressively to bargain; and a population of drivers randomly meeting at intersections who must decide who gives way to whom.
Evolutionary game theory has provided a qualified affirmative answer to the first question: In a range of settings, agents do (eventually) play Nash. There is thus support for equilibrium analysis in environments where evolutionary arguments make sense. Equilibrium is best viewed as the steady state of a community whose members are myopically groping toward maximizing behavior. This is in marked contrast to the earlier view (which, as I said, lacks satisfactory foundation), according to which game theory and equilibrium analysis are the study of the interaction of (ultra) rational agents with a large amount of (common) knowledge.5
4 See, for example, Jon Elster (1989), Brian Skyrms (1996), Robert Sugden (1989), and H. Peyton Young (1996).
The question of which equilibrium is played has received a lot of attention, most explicitly in the refinements literature. The two most influential ideas in that literature are backward and forward induction. Backward induction and its extensions (subgame perfection and sequentiality) capture notions of credibility and sequential rationality. Forward induction captures the idea that a player's choice of current action can be informative about his future play. The concerns about adequate foundations extend to these refinement ideas. While evolutionary game theory does discriminate between equilibria, backward induction receives little support from evolutionary game theory. Forward induction receives more support. One new important principle for selecting an equilibrium, based on stochastic stability, does emerge from evolutionary game theory, and this principle discriminates between strict equilibria (something backward and forward induction cannot do).
The next section outlines the major justifications for Nash equilibrium, and the difficulties with them. In that section, I identify learning as the best available justification for Nash equilibrium. Section 3 introduces evolutionary game theory from a learning perspective. The idea that Nash equilibrium can be usefully thought of as an evolutionary stable state is described in Section 4. The question of which Nash equilibrium is played is then discussed in Section 5. As much as possible, I have used simple examples. Very few theorems are stated (and then only informally). Recent surveys of evolutionary game theory include Eric van Damme (1987, ch. 9), Michihiro Kandori (1997), Mailath (1992), and Jörgen Weibull (1995).
5 Even in environments where an evolutionary analysis would not be appropriate, equilibrium analysis is valuable in illuminating the strategic structure of the game.
2. The Question
Economics and game theory typically assume that agents are "rational" in the sense of pursuing their own welfare, as they see it. This hypothesis is tautological without further structure, and it is usually further assumed that agents understand the world as well as (if not better than) the researcher studying the world inhabited by these agents. This often requires an implausible degree of computational and conceptual ability on the part of the agents. For example, while chess is strategically trivial,6 it is computationally impossible to solve (at least in the foreseeable future).
Computational limitations, however, are in many ways less important than conceptual limitations of agents. The typical agent is not like Garry Kasparov, the world champion chess player who knows the rules of chess, but also knows that he doesn't know the winning strategy. In most situations, people do not know they are playing a game. Rather, people have some (perhaps imprecise) notion of the environment they are in, their possible opponents, the actions they and their opponents have available, and the possible payoff implications of different actions. These people use heuristics and rules of thumb (generated from experience) to guide behavior; sometimes these heuristics work well and sometimes they don't.7 These heuristics can generate behavior that is inconsistent with straightforward maximization. In some settings, the behavior can appear as if it was generated by concerns of equity, fairness, or revenge.
6 Since chess is a finite game of perfect information, it has been known since 1912 that either White can force a win, Black can force a win, or either player can force a draw (Ernst Zermelo 1912).
I turn now to the question of consistency. It is useful to first consider situations that have aspects of coordination, i.e., where an agent (firm, consumer, worker, etc.) maximizes his welfare by choosing the same action as the majority. For example, in choosing between computers, consumers have a choice between PCs based on Microsoft's operating systems and Apple-compatible computers. There is significantly more software available for Microsoft-compatible computers, due to the market share of the Microsoft computers, and this increases the value of the Microsoft-compatible computers. Firms must often choose between different possible standards.
The first example concerns a team of workers in a modern version of Jean-Jacques Rousseau's (1950) Stag Hunt.8 In the example, each worker can put in low or high effort, the team's total output (and so each worker's compensation) is determined by the minimum effort of all the workers, and effort is privately costly. Suppose that if all workers put in low effort, the team produces a per capita output of 3, while if all workers put in high effort, per capita output is 7. Suppose, moreover, the disutility of high effort is 2 (valued in the same units as output). We can thus represent the possibilities, as in Figure 1.9 It is worth emphasizing at this point that the characteristics that make the stag hunt game interesting are pervasive. In most organizations, the value of a worker's effort is increasing in the effort levels of the other workers.

                      minimum of other workers' efforts
                           high        low
worker's     high           5           0
effort       low            3           3

Figure 1. A "stag hunt" played by workers in a team

7 Both Andrew Postlewaite and Larry Samuelson have made the observation that in life there are no one-shot games and no "last and final offers." Thus, if an experimental subject is placed in an artificial environment with these properties, the subject's heuristic will not work well until it has adjusted to this environment. I return to this in my discussion of the ultimatum game.
8 Rousseau's (1950) stag hunt describes several hunters in the wilderness. Individually, each hunter can catch rabbits and survive. Acting together, the hunters can catch a stag and have a feast. However, in order to catch a stag, every hunter must cooperate in the stag hunt. If even one hunter does not cooperate (by catching rabbits), the stag escapes.
What should we predict to be the outcome? Consider a typical worker, whom I call Bruce for definiteness. If all the other workers are only putting in low effort, then the best choice for Bruce is also low effort: high effort is costly, and choosing high effort cannot increase output (since output is determined by the minimum effort of all the workers). Thus, if Bruce expects the other workers to put in low effort, then Bruce will also put in low effort. Since all workers find themselves in the same situation, we see that all workers choosing to put in low effort is a Nash equilibrium: each worker is behaving in his own best interest, given the behavior of the others. Now suppose the workers (other than Bruce) are putting in high effort. In this case, the best choice for Bruce is now high effort. While high effort is costly, Bruce's choice of high rather than low effort now does affect output (since Bruce's choice is the minimum) and so the increase in output (+4) is more than enough to justify the increase in effort. Thus, if Bruce expects all the other workers to be putting in high effort, then he will also put in high effort. As for the low-effort case, a description of behavior in which all workers choose high effort constitutes a Nash equilibrium.
9 The stag hunt game is an example of a coordination game. A pure coordination game differs from the game in Figure 1 by having zeroes in the off-diagonal elements. The game in Figure 13 is a pure coordination game.
These two descriptions of worker behavior (all choose low effort and all choose high effort) are internally consistent; they are also strict: Bruce strictly prefers to choose the same effort level as the other workers.10 This implies that even if Bruce is somewhat unsure about the minimum effort choice (in particular, as long as Bruce assigns a probability of no more than 0.4 to some worker choosing the other effort choice), this does not affect his behavior.
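The strictness claim and the 0.4 threshold can be checked with a few lines of arithmetic. The sketch below is illustrative only: the payoff table is an assumption read off the surrounding text (5 = 7 - 2 when everyone works hard, 3 for low effort, and 0 for high effort when some other worker shirks), and the names `PAYOFF`, `expected_payoff`, and `best_reply` are hypothetical.

```python
from fractions import Fraction

# Assumed payoffs, read off the text rather than the original figure:
# PAYOFF[own effort][minimum effort of the other workers]
PAYOFF = {
    "high": {"high": 5, "low": 0},
    "low": {"high": 3, "low": 3},
}


def expected_payoff(own, prob_others_high):
    """Expected payoff when the others' minimum effort is high with this probability."""
    p = Fraction(prob_others_high)
    return p * PAYOFF[own]["high"] + (1 - p) * PAYOFF[own]["low"]


def best_reply(prob_others_high):
    """Return the set of effort levels that maximize expected payoff."""
    values = {own: expected_payoff(own, prob_others_high) for own in PAYOFF}
    top = max(values.values())
    return {own for own, value in values.items() if value == top}


# Each symmetric profile is a strict equilibrium: the best reply is unique.
print(best_reply(1))               # others surely high
print(best_reply(0))               # others surely low
# High effort stays optimal while the chance of a low minimum is at most 2/5.
print(best_reply(Fraction(3, 5)))  # indifference exactly at the threshold
```

Under these assumed payoffs the indifference condition is 5p = 3, so high effort is a best reply whenever the probability p that the others' minimum is high is at least 0.6, which is the same as a doubt of at most 0.4.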
But are these two descriptions good predictions of behavior? Should we, as outside observers, be confident in a prediction that all the workers in Bruce's team will play one of the two Nash equilibria? And if so, why and which one? Note that this is not the same as asking if Bruce will choose an effort that is consistent with equilibrium. After all, both choices (low and high effort) are consistent with equilibrium, and so Bruce necessarily chooses an equilibrium effort.
Rather, the question concerns the behavior of the group as a whole. How do we rule out Bruce choosing high effort because he believes everyone else will, while Sheila (another worker on Bruce's team) chooses low effort because she believes everyone else will?11 This is, of course, ruled out by equilibrium considerations. But what does that mean? The critical feature of the scenario just described is that the expectations of Bruce or Sheila about the behavior of the other members of the team are incorrect, something that Nash equilibrium by definition does not allow.
As I said earlier, providing a compelling argument for Nash equilibrium is a major challenge facing noncooperative game theory today.12 The consistency in Nash equilibrium seems to require that players know what the other players are doing. But where does this knowledge come from? When or why is this a plausible assumption? Several justifications are typically given for Nash equilibria: preplay communication, self-fulfilling prophecies (consistent predictions), focal points, and learning.13
The idea underlying preplay communication is straightforward. Suppose the workers in Bruce's team meet before they must choose their effort levels and discuss how much effort they each will exert. If the workers reach an agreement that they all believe will be followed, it must be a Nash equilibrium (otherwise at least one worker has an incentive to deviate). This justification certainly has appeal and some range of applicability. However, it does not cover all possible applications. It also assumes that an agreement is reached, and, once reached, is kept. While it seems clear that an agreement will be reached and kept (and which one) in our stag hunt example (at least if the team is small!), this is not true in general. The first difficulty, discussed by Aumann (1990), is that an agreement may not be kept. Suppose we change the payoff in the bottom left cell of Figure 1 from 3 to 4 (so that a worker choosing low effort, when the minimum of the other workers' effort is high, receives a payoff of 4 rather than 3). In this version of the stag hunt, Bruce benefits from high-effort choices by the other workers, irrespective of his own choice. Bruce now has an incentive to agree to high effort, no matter what he actually intends to do (since this increases the likelihood of high effort by the other workers). But then reaching the agreement provides no information about the intended play of workers and so may not be kept. The second difficulty is that no agreement may be reached. Suppose, for example, the interaction has the characteristics of a battle-of-the-sexes game (Figure 2).
10 A strict Nash equilibrium is a Nash equilibrium in which, given the play of the opponents, each player has a unique best reply.
11 While the term "coordination failure" would seem to be an apt one to describe this scenario, that term is commonly understood (particularly in macroeconomics) to refer to coordination on an inefficient equilibrium (such as low effort chosen by all workers).
12 The best introspective (i.e., knowledge or epistemic) foundations for Nash equilibrium assume that each player's conjectures about the behavior of other players are known by all the players; see Robert Aumann and Adam Brandenburger (1995). This assumption does not appear to be a significant improvement over the original assumption of Nash behavior.
More generally, there is no compelling introspective argument for any useful equilibrium notion. The least controversial are those that do not require the imposition of a consistency condition, such as rationalizability (introduced by Douglas Bernheim 1984 and David Pearce 1984; for two players it is equivalent to the iterated deletion of strictly dominated strategies), the iterated deletion of weakly dominated strategies, and backward induction. However, in most games, rationalizability does little to constrain players' behavior, and as we will see below, the iterated deletion of weakly dominated strategies and backward induction are both controversial.
13 What if there is only one equilibrium? Does this by itself give us a reason to believe that the unique equilibrium will be played? The answer is no. It is possible that the unique Nash equilibrium yields each player their maximin values, while at the same time being riskier (in the sense that the Nash equilibrium strategy does not guarantee the maximin value). This is discussed by, for example, John Harsanyi (1977, p. 125) and Aumann (1985). David Kreps (1990a, p. 135) describes a complicated game with a unique equilibrium that is also unlikely to be played.
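Aumann's point about the modified stag hunt can be made concrete with a small comparison of the two payoff tables. This is a sketch under stated assumptions: the baseline payoffs (5, 0, 3, 3) are inferred from the text, the modification raises the low-against-high payoff from 3 to 4 as described, and the helper names are hypothetical.

```python
def make_game(low_vs_high):
    """Stag hunt payoffs: game[own effort][minimum effort of the others] (assumed values)."""
    return {
        "high": {"high": 5, "low": 0},
        "low": {"high": low_vs_high, "low": 3},
    }


def gains_from_others_high(game):
    """For each own effort, the gain when the others' minimum moves from low to high."""
    return {own: row["high"] - row["low"] for own, row in game.items()}


def is_symmetric_equilibrium(game, effort):
    """Everyone choosing `effort` is Nash if no unilateral deviation pays strictly more."""
    return all(game[effort][effort] >= game[dev][effort] for dev in game)


original = make_game(3)
modified = make_game(4)

# Original game: a worker planning low effort gains nothing from others' high effort.
print(gains_from_others_high(original))
# Modified game: every worker gains from others' high effort, whatever he plans to do.
print(gains_from_others_high(modified))
# Both symmetric profiles remain equilibria in the modified game.
print(is_symmetric_equilibrium(modified, "high"), is_symmetric_equilibrium(modified, "low"))
```

In the modified table a worker who intends to choose low effort still strictly prefers that the others choose high effort, which is why his agreeing to "high" reveals nothing about what he will actually do.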
Figure 2. A "battle-of-the-sexes" between an employer and a potential employee bargaining over wages. Each simultaneously makes a wage demand or offer. The worker is only hired if they agree. [Payoff matrix not recovered; the employer's choices are high wage and low wage.]
Such a game, which may describe a bargaining interaction, has several Pareto noncomparable Nash equilibria: there are several profitable agreements that can be reached, but the bargainers have opposed preferences over which agreement is reached. In this case, it is not clear that an agreement will be reached. Moreover, if the game does have multiple Pareto noncomparable Nash equilibria, then the preplay communication stage is itself a bargaining game and so perhaps should be explicitly modelled (at which point, the equilibrium problem resurfaces).14 Finally, there may be no possibility of preplay communication.
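To illustrate what "Pareto noncomparable Nash equilibria" means in a wage-bargaining game of this kind, the following sketch enumerates the pure-strategy equilibria of a 2x2 game. The payoff numbers are purely hypothetical (the entries of Figure 2 are not available here); they only encode the assumptions that agreement beats disagreement, the employer prefers the low wage, and the worker prefers the high wage.

```python
from itertools import product

ACTIONS = ("high wage", "low wage")

# Hypothetical payoffs (employer, worker), indexed by (employer's offer, worker's demand).
# Disagreement (mismatched wages) is assumed to give both sides 0.
PAYOFFS = {
    ("high wage", "high wage"): (1, 3),
    ("low wage", "low wage"): (3, 1),
    ("high wage", "low wage"): (0, 0),
    ("low wage", "high wage"): (0, 0),
}


def pure_nash_equilibria(payoffs, actions):
    """Return all action pairs from which neither player gains by deviating alone."""
    equilibria = []
    for offer, demand in product(actions, repeat=2):
        u_employer, u_worker = payoffs[(offer, demand)]
        employer_ok = all(payoffs[(d, demand)][0] <= u_employer for d in actions)
        worker_ok = all(payoffs[(offer, d)][1] <= u_worker for d in actions)
        if employer_ok and worker_ok:
            equilibria.append((offer, demand))
    return equilibria


def pareto_comparable(u, v):
    """True if one payoff vector is weakly better for every player than the other."""
    return all(x >= y for x, y in zip(u, v)) or all(x <= y for x, y in zip(u, v))


equilibria = pure_nash_equilibria(PAYOFFS, ACTIONS)
print(equilibria)
print(pareto_comparable(PAYOFFS[equilibria[0]], PAYOFFS[equilibria[1]]))
```

With these illustrative numbers both agreements are equilibria, and neither is better for both parties, so nothing in the game itself tells the bargainers which one to expect.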
The second justification of self-fulfilling prophecy runs as follows: If a theory uniquely predicting players' behaviors is known by the players in the game, then it must predict Nash equilibria (see Roger Myerson (1991, pp. 105-108) for an extended discussion of this argument). The difficulty, of course, is that the justification requires a theory that uniquely predicts player behavior, and that is precisely what is at issue.
The focal point justification, due to Thomas Schelling (1960), can be phrased as "if there is an obvious way to play in a game (derived from either the structure of the game itself or from the setting), then players will know what other players are doing." There are many different aspects of a game that can single out an "obvious way to play." For example, considerations of fairness may make equal divisions of a surplus particularly salient in a bargaining game. Previous experience suggests that stopping at red lights and going through green is a good strategy, while another possible strategy (go at red and stop at green) is not (even though it is part of another equilibrium). It is sometimes argued that efficiency is such an aspect: if an equilibrium gives a higher payoff to every player than any other equilibrium, then players "should not have any trouble coordinating their expectations at the commonly preferred equilibrium point" (John Harsanyi and Reinhard Selten 1988, p. 81). In our earlier stag hunt example, this principle (called payoff dominance by Harsanyi and Selten 1988) suggests that the high-effort equilibrium is the obvious way to play. On the other hand, the low-effort equilibrium is less risky, with Bruce receiving a payoff of 3, no matter what the other members of his team do. In contrast, it is possible that a choice of high effort yields a payoff of only 0. As we will see, evolutionary game theory has been particularly important in addressing this issue of riskiness and payoff dominance. See Kreps (1990a) for an excellent extended discussion and further examples of focal points.15
14 The preplay communication stage might also involve a correlating device, like a coin. For example, the players might agree to flip a coin: if heads, then the worker receives a high wage, while if tails, the worker receives a low wage.
Finally, agents may be able to learn to play an equilibrium. In order to learn to play an equilibrium, players must be playing the same game repeatedly, or at least, similar games that can provide valuable experience. Once all players have learned how their opponents are playing, and if all players are maximizing, then we must be at a Nash equilibrium. There are two elements to this learning story. The first is that, given maximizing behavior of players, players can learn the behavior of their opponents.16 The second is that players are maximizing. This involves, as I discussed earlier, additional considerations of learning. Even if a player knows how their opponents have played (for example, the player may be the "last mover"), they may not know what the best action is. A player will use their past experience, as well as the experience of other players (if that is available and relevant), to make forecasts as to the current or future behavior of opponents, as well as the payoff implications of different actions. Since learning itself changes the environment that the agents are attempting to learn (as other agents change their behavior in response to their own learning), the process of learning is quite subtle. Note that theories of learning have a focal point aspect, since histories (observed patterns of play of other agents) can serve as coordinating devices that make certain patterns of play "the obvious way to play," as in the traffic example from the previous paragraph.
15 Kreps (1990b, ch. 12) is a formal version of Kreps (1990a).
The discussion so far points out some of the problems with many of the standard justifications for Nash equilibrium and its two assumptions (maximization and consistency). Besides learning, another approach is to concede the lack of a sound foundation for consistency, but maintain the hypothesis that agents maximize. The question then is whether rationality and knowledge of the game (including the rationality of opponents) is enough, at least in some interesting cases, to yield usable predictions. The two (related) principles that people have typically used in applications are backward induction and the iterated deletion of weakly dominated strategies.17 In many games, these procedures identify a unique outcome. The difficulty is that they require an implausible degree of rationality and knowledge of other players' rationality. The two key examples here are Rosenthal's centipede game (so called because of the appearance of its extensive form; Figure 3 is a short centipede) and the finitely repeated prisoner's dilemma. The centipede is conceptually simpler, since it is a game of perfect information. The crucial feature of the game is that each player, when it is their turn to move, strictly prefers to end the game immediately (i.e., choose $E_n$), rather than have the opponent end the game on the next move by choosing $E_{n+1}$. Moreover, at each move, each player strictly prefers to have play proceed for a further two moves, rather than end immediately. For the game in Figure 3, if play reaches the last possible move, player I surely ends the game by choosing $E_3$ rather than $C_3$. Knowing this, player II should choose $E_2$. The induction argument then leads to the conclusion that player I necessarily ends the game on the first move.18 I suspect everyone is comfortable with the prediction that in a two-move game, player I stops the game immediately. However, while this logic is the same in longer versions, many researchers are no longer comfortable with the same prediction.19 It only requires one player to think that there is some chance that the other player is willing to play C initially to support playing C in early moves. Similarly, if we consider the repeated prisoner's dilemma, the logic of backward induction (together with the property that the unique one-period dominant-strategy equilibrium yields each player their maximin payoff) implies that even in early periods cooperation is not possible.
Figure 3. A short centipede game.
16 Nonevolutionary game theory work on learning has focused on the question of when maximizing players can in fact learn the behavior of their opponents. Examples of this work include Drew Fudenberg and David Kreps (1989), Drew Fudenberg and David Levine (1993), Faruk Gul (1996), and Ehud Kalai and Ehud Lehrer (1993).
17 Backward induction is also the basic principle underlying the elimination of equilibria relying on "incredible" threats.
18 This argument is not special to the extensive form. If the centipede is represented as a normal form game, this backward induction is mimicked by the iterated deletion of weakly dominated strategies. While this game has many Nash equilibria, they all involve the same behavior on the equilibrium path: player I chooses $E_1$. The equilibria only differ in the behavior of players off the equilibrium path.
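The backward induction argument for the short centipede can be traced mechanically. The sketch below assumes hypothetical payoffs, since the numbers in Figure 3 are not available here; they are chosen only to satisfy the two properties stated in the text (each mover prefers ending now to being ended on next move, and prefers play continuing two more moves to ending now).

```python
# Each decision node: (index of the player who moves, payoffs if that player ends here).
# Player I has index 0 and player II has index 1. All payoff numbers are hypothetical.
NODES = [
    (0, (1, 0)),  # node 1: player I may choose E1
    (1, (0, 2)),  # node 2: player II may choose E2
    (0, (3, 1)),  # node 3: player I may choose E3
]
FINAL = (2, 4)    # payoffs if player I chooses C3 at the last node


def backward_induction(nodes, final):
    """Solve from the last node backward; return the plan of moves and the outcome."""
    continuation = final
    plan = []
    for mover, end_payoffs in reversed(nodes):
        # The mover ends the game if ending pays strictly more than continuing.
        if end_payoffs[mover] > continuation[mover]:
            plan.append("E")
            continuation = end_payoffs
        else:
            plan.append("C")
    plan.reverse()
    return plan, continuation


plan, outcome = backward_induction(NODES, FINAL)
print(plan)     # move chosen at nodes 1, 2, 3
print(outcome)  # payoffs on the induced path
```

With these illustrative payoffs every node prescribes "end", so play stops at the first move with payoffs (1, 0), even though both players would do better if play reached the third node and ended there with (3, 1).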
3. Evolutionary Game Theory and Learning
The previous section argued that, of the various justifications that have been advanced for equilibrium analysis, learning is the least problematic. Evolutionary game theory is a particularly attractive approach to learning. In the typical evolutionary game-theoretic model, there is a population of agents, each of whose payoff is a function of not only how they behave, but also how the agents they interact with behave. At any point in time, behavior within the population is distributed over the different possible strategies, or behaviors. If the population is finite, a state (of the population) is a description of which agents are choosing which behavior. If the population is infinite, a state is a description of the fractions of the population that are playing each strategy. If a player can maximize, and knows the state, then he can choose a best reply. If he does not know the state of the population, then he must draw inferences about the state from whatever information he has. In addition, even given knowledge of the state, the player may not be able to calculate a best reply. Calculating a best reply requires that a player know all the strategies available and the payoff implications of all these strategies. The observed history of play is now valuable for two reasons. First, the history conveys information about how the opponents are expected to play. Second, the observed success or failure of various choices helps players determine what might be good strategies in the future. Imitation is often an important part of learning; successful behavior tends to be imitated. In addition, successful behavior will be taught. To the extent that players are imitating successful behavior and not explicitly calculating best replies, it is not necessary for players to distinguish between knowledge of the game being played and knowledge of how opponents are playing. Players need know only what was successful, not why it was successful.
19 Indeed, more general knowledge-based considerations have led some researchers to focus on the procedure of one round of weak domination followed by iterated rounds of strict domination. A nice (although technical) discussion is Eddie Dekel and Faruk Gul (1997).
Evolutionary game-theoretic models are either static or dynamic, and the dynamics are either in discrete or continuous time. A discrete time dynamic is a function that specifies the state of the population in period t + 1 as a function of the state in period t, i.e., given a distribution of behavior in the population, the dynamic specifies next period's distribution. A continuous time dynamic specifies the relative rates of change of the fractions of the population playing each strategy as a function of the current state. Evolutionary (also known as selection or learning) dynamics specify that behavior that is successful this period will be played by a larger fraction in the immediate future. Static models study concepts that are intended to capture stability ideas motivated by dynamic stories without explicitly analyzing dynamics.
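A discrete time dynamic of the kind just described can be sketched for the stag hunt. Several assumptions are made here and are not taken from the text: the population is infinite, agents are matched in pairs (so the "minimum of the others" is the opponent's effort), the payoffs are the assumed values 5, 0, 3, 3, and the update rule is the discrete-time replicator dynamic, which is only one example of a selection dynamic.

```python
def payoffs(x):
    """Expected payoffs (high, low) when a fraction x of the population plays high.

    Assumed stag hunt payoffs: high earns 5 against high and 0 against low;
    low earns 3 against either.
    """
    return 5.0 * x, 3.0


def step(x):
    """Map the state in period t to the state in period t + 1 (replicator update)."""
    f_high, f_low = payoffs(x)
    average = x * f_high + (1.0 - x) * f_low
    # The share of high effort grows exactly when it earns more than the average.
    return x * f_high / average


def simulate(x0, periods):
    """Iterate the dynamic from the initial fraction x0 of high-effort players."""
    x = x0
    for _ in range(periods):
        x = step(x)
    return x


print(simulate(0.7, 200))  # starts above 0.6: high effort takes over
print(simulate(0.5, 200))  # starts below 0.6: low effort takes over
```

Under these assumptions the state 0.6 is a rest point that separates the two basins: populations starting above it move toward everyone choosing high effort, and populations starting below it move toward everyone choosing low effort.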
An important point is that evolutionary dynamics do not build in any assumptions on behavior or knowledge, other than the basic principle of differential selection: apparently successful behavior increases its representation in the population, while unsuccessful behavior does not.
Evolutionary models are not structural models of learning or bounded ra tionality. While the motivation for the basic principle of differential selection involves an appeal to learning and bounded rationality, individuals are not explicitly modeled. The feature that successful behavior last period is an at tractive choice this period does seem to require that agents are naive learners. They do not believe, or understand, that their own behavior potentially af fects future play of their opponents, and they do not take into account the possibility that their opponents are similarly engaged in adjusting their own behavior. Agents do not look for pat terns in historical data. They behave as if the world is stationary, even though their own behavior should suggest to them it is not." Moreover, agents be have as if they believe that other agents' experience is relevant for them. Imita tion then seems reasonable. Note that the context here is important. This style
"It is difficult to build lnodels of boundedly rational agents who look for patterns. Masaki Aoyagi (1996) and Doron Sonsino (1997) are rare examples of lnodels of boundedly rational agents who can detect cycles.
of modeling does not lend itself to small numbers of agents. If there is only a small population, is it plausible to believe that the agents are not aware of this? And if agents are aware, then imitation is not a good strategy.
As we will see, evolutionary dynamics have the property that, in large populations, if they converge, then they converge to a Nash equilibrium.21 This property is a necessary condition for any reasonable model of social learning. Suppose we had a model in which behavior converged to something that is not a Nash equilibrium. Since the environment is eventually stationary and there is a behavior (strategy) available to some agent that yields a higher payoff, then that agent should eventually figure this out and so deviate.22
One concern sometimes raised about evolutionary game theory is that its agents are implausibly naive. This concern is misplaced. If an agent is boundedly rational, then he does not understand the model as written. Typically, this model is very simple (so that complex dynamic issues can be studied) and so the bounds of the rationality of the agents are often quite extreme. For example, agents are usually not able to detect any cycles generated by the dynamics. Why then are the agents not able to figure out what the modeler can? As in most of economic theory, the role of
21 More accurately, they converge to a Nash equilibrium of the game determined by the strategies that are played along the dynamic path. It is possible that the limit point fails to be a Nash equilibrium because a strategy that is not played along the path has a higher payoff than any strategy played along the path. Even if there is "drift" (see the ultimatum game below), the limit point will be close to a Nash equilibrium.
22 Of course, this assumes that this superior strategy is something the agent could have thought of. If the strategy is never played, then the agent might never think of it.
models is to improve our intuition and to deepen our understanding of how particular economic or strategic forces interact. For this literature to progress, we must analyze (certainly now, and perhaps forever) simple and tractable games. The games are intended as examples, experiments, and allegories. Modelers do not make assumptions of bounded rationality because they believe players are stupid, but rather that players are not as sophisticated as our models generally assume. In an ideal world, modelers would study very complicated games and understand how agents who are boundedly rational in some way behave and interact. But the world is not ideal and these models are intractable. In order to better understand these issues, we need to study models that can be solved. Put differently, the bounds on rationality are to be understood relative to the complexity of the environment.
4. Nash Equilibrium as an Evolutionary Stable State
Consider a population of traders that engage in randomly determined pairwise meetings. As is usual, I will treat the large population as being infinite. Suppose that when two traders meet, each can choose one of two strategies, "bold" and "cautious." If a trader has chosen to be "bold," then he will bargain aggressively, even to the point of losing a profitable trade; on the other hand, if a trader has chosen to be "cautious," then he will never lose a profitable trade. If a bold trader bargains with a cautious trader, a bargain will be struck that leaves the majority of gains from trade with the bold trader. If two cautious traders bargain, they equally divide the gains from trade. If two bold traders bargain, no agreement is reached. One meeting between two traders is depicted as the symmetric game in Figure 4.23
Behaviors with higher payoffs are more likely to be followed in the future. Suppose that the population originally consists only of cautious traders. If no trader changes his behavior, the population will remain completely cautious. Now suppose there is a perturbation that results in the introduction of some bold traders into the population. This perturbation may be the result of entry: perhaps traveling traders from another community have arrived; or experimentation: perhaps some of the traders are not sure that they are behaving optimally and try something different.24 In a population of cautious traders, bold traders also consummate their deals and receive a higher payoff than the cautious traders. So over time, the fraction of cautious traders in the population will decrease and the fraction of bold traders will increase. However, once there are enough bold traders in the population, bold traders no longer have an advantage (on average) over cautious traders (since two bold traders cannot reach an agreement), and so the fraction of bold traders will always be strictly less than one. Moreover, if the population consists entirely of bold traders, a cautious trader can successfully invade the population. The only stable population is divided between bold and cautious traders, with the precise fraction determined by payoffs. In our example, the stable population is equally divided between bold and cautious traders. This is the distribution with the property that bold and cautious traders have equal payoffs (and so also describes a mixed strategy Nash equilibrium). At that distribution, if the population is perturbed, so that, for example, slightly more than half the population is now bold while slightly less than half is cautious, cautious traders have a higher payoff, and so learning will lead to an increase in the number of cautious traders at the expense of bold traders, until balance is once again restored.
23 This is the Hawk-Dove game traditionally used to introduce the concept of an evolutionary stable strategy.
24 In the biological context, this perturbation is referred to as a mutation or an invasion.
Figure 4. Payoff to a trader following the row strategy against a trader following the column strategy (rows and columns: Bold, Cautious). The gains from trade are 4.
It is worth emphasizing that the final state is independent of the original distribution of behavior in the population, and that this state corresponds to the symmetric Nash equilibrium. Moreover, this Nash equilibrium is dynamically stable: any perturbation from this state is always eliminated.
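The parable can be illustrated numerically. The following is a minimal sketch, not the author's model. It assumes a discrete-time selection rule in which each behavior's share is rescaled by its payoff relative to the population average, and it assumes the payoffs 3 (bold against cautious), 1 (cautious against bold), 2 (cautious against cautious), and 0 (bold against bold). These values are inferred from the description above (gains from trade of 4, equal division between cautious traders, no agreement between bold traders, and a stable 50/50 population). The names `PAYOFF`, `step`, and `simulate` are hypothetical.

```python
# Payoff to the row behavior against the column behavior.
# ASSUMPTION: values inferred from the text (gains from trade are 4).
PAYOFF = {
    ("bold", "bold"): 0.0,
    ("bold", "cautious"): 3.0,
    ("cautious", "bold"): 1.0,
    ("cautious", "cautious"): 2.0,
}


def step(x_bold):
    """Map the bold share in period t to the bold share in period t + 1.

    ASSUMPTION: shares are rescaled by payoff relative to the population
    average, a selection rule chosen only for illustration.
    """
    x_cautious = 1.0 - x_bold
    pay_bold = (PAYOFF["bold", "bold"] * x_bold
                + PAYOFF["bold", "cautious"] * x_cautious)
    pay_cautious = (PAYOFF["cautious", "bold"] * x_bold
                    + PAYOFF["cautious", "cautious"] * x_cautious)
    average = x_bold * pay_bold + x_cautious * pay_cautious
    return x_bold * pay_bold / average


def simulate(x_bold, periods=200):
    """Iterate the selection rule from an interior starting share."""
    for _ in range(periods):
        x_bold = step(x_bold)
    return x_bold
```

Calling `simulate(0.01)` and `simulate(0.99)` starts the population almost entirely cautious or almost entirely bold; under these assumptions both paths approach an equal split. The starting share must be strictly below one, since an all-bold population has a zero average payoff under this rule.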
This parable illustrates the basics of an evolutionary game theory model, in particular, the interest in the dynamic behavior of the population. The next section describes the well-known notion of an evolutionary stable strategy, a static notion that attempts to capture dynamic stability. Section 4.2 then describes explicit dynamics, while Section 4.3 discusses asymmetric games.
4.1. Evolutionary Stable Strategies
In the biological setting, the idea that a stable pattern of behavior in a population should be able to eliminate any invasion by a "mutant" motivated John Maynard Smith and G. R. Price (1973) to define an evolutionary stable strategy (ESS).25 If a population pattern of behavior is to eliminate invading mutations, it must have a higher fitness than the mutant in the population that results from the invasion. In biology, animals are programmed (perhaps genetically) to play particular strategies and the payoff is interpreted as "fitness," with fitter strategies having higher reproductive rates (reproduction is asexual).
25 Good references on biological dynamics and evolution are Maynard Smith (1982) and Josef Hofbauer and Karl Sigmund (1988).
It will be helpful to use some notation at this point. The collection of available behaviors (strategies) is S and the payoff to the agent choosing i when his opponent chooses j is $\pi(i,j)$. We will follow most of the literature in assuming that there is only a finite number of available strategies. Any behavior in S is called pure. A mixed strategy is a probability distribution over pure strategies. While any pure strategy can be viewed as the mixed strategy that places probability one on that pure strategy, it will be useful to follow the convention that the term "mixed strategy" always refers to a mixed strategy that places strictly positive probability on at least two strategies. Mixed strategies have two leading interpretations as a description of behavior in a population: either the population is monomorphic, in which every member of the population plays the same mixed strategy, or the population is polymorphic, in which each member plays a pure strategy and the fraction of the population playing any particular pure strategy equals the probability assigned to that pure strategy by the mixed strategy.26 As will be clear, the notion of an evolutionary stable strategy is best understood by assuming that each agent can choose a mixed strategy and the population is originally monomorphic.
26 The crucial distinction is whether agents can play and inherit (learn) mixed strategies. If not, then any mixed strategy state is necessarily the result of a polymorphic population. On the other hand, even if agents can play mixed strategies, the population may be polymorphic with different agents playing different strategies.
Figure 5. The numbers in the matrix are the row player's payoff. The strategy a is an ESS in this game.
Definition 1. A (potentially mixed) strategy p is an Evolutionary Stable Strategy (ESS) if:
1. the payoff from playing p against p is at least as large as the payoff from playing any other strategy against p; and
2. for any other strategy q that has the same payoff as p against p, the payoff from playing p against q is strictly larger than the payoff from playing q against q.27
Thus, p is an evolutionary stable strategy if it is a symmetric Nash equilibrium, and if, in addition, when q is also a best reply to p, then p does better against q than q does. For example, a is an ESS in the game in Figure 5.
ESS is a static notion that attempts to capture dynamic stability. There are two cases to consider: The first is that p is a strict Nash equilibrium (see footnote 10). Then, p is the only best reply to itself (so p must be pure), and any agent playing q against a population whose members mostly play p receives a lower payoff (on average) than a p player. As a result, the fraction of the population playing q shrinks.
The other possibility is that p is not the only best reply to itself (if p is a mixed strategy, then any other mixture with the same or smaller support is also a best reply to p). Suppose q is a best
27 The payoff to p against q is $\pi(p,q) = \sum_{i,j} \pi(i,j) p_i q_j$. Formally, the strategy p is an ESS if, for all q, $\pi(p,p) \ge \pi(q,p)$, and if there exists $q \ne p$ such that $\pi(p,p) = \pi(q,p)$, then $\pi(p,q) > \pi(q,q)$.
reply to p. Then both an agent playing p and an agent playing q earn the same payoff against a population of p players. After the perturbation of the population by the entry of q players, however, the population is not simply a population of p players. There is also a small fraction of q players, and their presence will determine whether the q players are eliminated. The second condition in the definition of ESS guarantees that in the perturbed population, it is the p players who do better than the q players when they play against a q player.
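The two conditions in Definition 1 can be checked mechanically for a given payoff matrix. The sketch below is illustrative only: it tests a candidate p against a finite list of mutants supplied by the caller, using the strict inequality from the formal statement in footnote 27. The function names and the trader payoffs (inferred from the description of the bold/cautious game, with bold as strategy 0) are assumptions, not part of the original text.

```python
def payoff(matrix, p, q):
    """Expected payoff pi(p, q) = sum over i, j of pi(i, j) * p_i * q_j."""
    return sum(matrix[i][j] * p[i] * q[j]
               for i in range(len(p)) for j in range(len(q)))


def is_ess_against(matrix, p, mutants, tol=1e-9):
    """Check the two ESS conditions for p against a finite list of mutants.

    ASSUMPTION: only the supplied mutants are tested, so a True result is
    evidence for, not a proof of, evolutionary stability.
    """
    own = payoff(matrix, p, p)
    for q in mutants:
        if all(abs(a - b) < tol for a, b in zip(p, q)):
            continue  # q coincides with p, so it is not a mutant
        invader = payoff(matrix, q, p)
        if invader > own + tol:
            return False  # condition 1 fails: q is a better reply to p
        if abs(invader - own) <= tol:
            # q is an alternative best reply, so p must beat q against q
            if payoff(matrix, p, q) <= payoff(matrix, q, q) + tol:
                return False
    return True


# ASSUMPTION: trader payoffs inferred from the text; index 0 is bold.
TRADER = [[0, 3], [1, 2]]
MUTANTS = [[k / 20, 1 - k / 20] for k in range(21)]
```

With these inputs, `is_ess_against(TRADER, [0.5, 0.5], MUTANTS)` tests the 50/50 mixture, while `[1.0, 0.0]` and `[0.0, 1.0]` test the two pure strategies. Every mutant is an alternative best reply to the 50/50 mixture, so for that candidate only the second condition has any bite.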
4.2. The Replicator and Other More General Dynamics
While plausible, the story underlying ESS suffers from its reliance on the assumption that agents can learn (in the biological context, inherit) mixed strategies. ESS is only useful to the extent that it appropriately captures some notion of dynamic stability. Suppose individuals now choose only pure strategies. Define $p_i^t$ as the proportion of the population choosing strategy i at time t. The state of the population at time t is then $p^t = (p_1^t, \ldots, p_n^t)$, where n is the number of strategies (of course, $p^t$ is in the $(n-1)$-dimensional simplex). The simplest evolutionary dynamic one could use to investigate the dynamic properties of ESS is the replicator dynamic. In its simplest form, this dynamic specifies that the proportional rate of growth in a strategy's representation in the population, $p_i^t$, is given by the extent to which that strategy does better than the population average.28
The payoff to strategy i when the state of the population is $p^t$ is $\pi(i,p^t) = \sum_j \pi(i,j) p_j^t$, while the population average payoff is $\pi(p^t,p^t) = \sum_{i,j} \pi(i,j) p_i^t p_j^t$. The continuous time replicator dynamic is then:
$$\frac{dp_i^t}{dt} = p_i^t \times \left(\pi(i,p^t) - \pi(p^t,p^t)\right). \tag{1}$$
28 The replicator dynamic can also be derived from more basic biological arguments.
Thus, if strategy i does better than average, its representation in the population grows ($dp_i^t/dt > 0$), and if another strategy i' is even better, then its growth rate is also higher than that of strategy i. Equation (1) is a differential equation that, together with an initial condition, uniquely determines a path for the population that describes, for any time t, the state of the population.
A state is a rest point of the dynamics if the dynamics leave the state unchanged (i.e., $dp_i^t/dt = 0$ for all i). A rest point is Liapunov stable if the dynamics do not take states close to the rest point far away. A rest point is asymptotically stable if, in addition, any path (implied by the dynamics) that starts sufficiently close to the rest point converges to that rest point.
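A path of the continuous time replicator dynamic can be approximated numerically. The sketch below is a hedged illustration: it uses forward Euler steps with a fixed step size (a numerical convenience, not part of the model) and the bold/cautious payoffs inferred from the trader example, with bold as strategy 0. The names `replicator_rhs`, `integrate`, and `TRADER` are hypothetical.

```python
def replicator_rhs(matrix, p):
    """Rate of change of each share: p_i * (pi(i, p) - pi(p, p))."""
    n = len(p)
    fitness = [sum(matrix[i][j] * p[j] for j in range(n)) for i in range(n)]
    average = sum(p[i] * fitness[i] for i in range(n))
    return [p[i] * (fitness[i] - average) for i in range(n)]


def integrate(matrix, p, dt=0.01, steps=5000):
    """Approximate the path from state p with forward Euler steps.

    ASSUMPTION: a fixed-step Euler scheme is accurate enough here; a
    proper ODE solver would be preferable for serious work.
    """
    for _ in range(steps):
        rhs = replicator_rhs(matrix, p)
        p = [p[i] + dt * rhs[i] for i in range(len(p))]
    return p


# ASSUMPTION: trader payoffs inferred from the text; index 0 is bold.
TRADER = [[0, 3], [1, 2]]
```

For example, `integrate(TRADER, [0.1, 0.9])` follows a population that starts with 10 percent bold traders, and `replicator_rhs(TRADER, [0.0, 1.0])` evaluates the dynamic at the all-cautious state, where every rate of change is zero and the state is therefore a rest point.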
There are several features of the replicator dynamic to note. First, if a pure strategy is extinct (i.e., no fraction of the population plays that pure strategy) at any point of time, then it is never played. In particular, any state in which the same pure strategy is played by every agent (and so every other strategy is extinct) is a rest point of the replicator dynamic. So, being a rest point is not a sufficient condition for Nash equilibrium. This is a natural feature that we already saw in our discussion of the game in Figure 4: if everyone is bargaining cautiously and if traders are not aware of the possibilities of bold play, then there is no reason for traders to change their behavior (even though a rational agent who understood the payoff implications of the different strategies available would choose bold behavior rather than cautious).
Second, the replicator dynamic is not a best reply dynamic: strategies that are not best replies to the current population will still increase their representation in the population if they do better than average (this feature only becomes apparent when there are at least three available strategies). This again is consistent with the view that this is a model of boundedly rational learning, where agents do not understand the full payoff implications of the different strategies.
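The point that a strategy can grow without being a best reply is easy to see in a three-strategy example. The game below is hypothetical and chosen only for transparency: each strategy earns a fixed payoff (3, 2, or 0) regardless of the opponent. The function name `growth_rates` is likewise an assumption.

```python
def growth_rates(matrix, p):
    """Proportional growth rate of each strategy: pi(i, p) - pi(p, p)."""
    n = len(p)
    fitness = [sum(matrix[i][j] * p[j] for j in range(n)) for i in range(n)]
    average = sum(p[i] * fitness[i] for i in range(n))
    return [f - average for f in fitness]


# HYPOTHETICAL game: payoffs of 3, 2, and 0 whatever the opponent plays.
EXAMPLE = [[3, 3, 3], [2, 2, 2], [0, 0, 0]]
STATE = [0.2, 0.3, 0.5]
rates = growth_rates(EXAMPLE, STATE)
```

In this state the population average payoff lies between 0 and 2, so the middle strategy has a positive growth rate even though the first strategy is the unique best reply; it simply grows more slowly than the best reply does.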
Finally, the dynamics can have multiple asymptotically stable rest points. The asymptotic distribution of behavior in the population can depend upon the starting point. Returning to the stag hunt game of Figure 1, if a high fraction of workers has chosen high effort historically, then those workers who had previously chosen low effort would be expected to switch to high effort, and so the fraction playing high effort would increase. On the other hand, if workers have observed low effort, perhaps low effort will continue to be observed. Under the replicator dynamic (or any deterministic learning dynamic), if $p^0_{high} > 3/5$, then $p^t_{high} \to 1$, while if $p^0_{high} < 3/5$, then $p^t_{high} \to 0$. The equilibrium that players eventually learn is determined by the original distribution of players across high and low effort. If the original distribution is random (e.g., $p^0_{high}$ is determined as a realization of a uniform random variable), then the low effort equilibrium is 3/2 times as likely to arise as the high effort equilibrium. This notion of path dependence, that history matters, is important and attractive.
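Path dependence can be made concrete by following the replicator dynamic from many starting points. The sketch below is an illustration under stated assumptions: the stag hunt payoffs (5 for high effort against high effort, 0 for high against low, 3 for low effort against anything) are hypothetical values chosen so that high effort beats low effort exactly when more than 3/5 of the population chooses high effort, and the Euler integration is a numerical convenience.

```python
# ASSUMPTION: hypothetical stag hunt payoffs that reproduce the 3/5 threshold.
HIGH_VS_HIGH, HIGH_VS_LOW, LOW_PAYOFF = 5.0, 0.0, 3.0


def limit_share(p_high, dt=0.01, steps=5000):
    """Follow the high-effort share under the replicator dynamic (Euler steps)."""
    for _ in range(steps):
        pay_high = HIGH_VS_HIGH * p_high + HIGH_VS_LOW * (1.0 - p_high)
        average = p_high * pay_high + (1.0 - p_high) * LOW_PAYOFF
        p_high += dt * p_high * (pay_high - average)
    return p_high


def share_reaching_low(grid_size=100):
    """Fraction of evenly spaced starting points that end near all-low effort."""
    starts = [(k + 0.5) / grid_size for k in range(grid_size)]
    return sum(limit_share(s) < 0.5 for s in starts) / grid_size
```

Starting points below the threshold fall toward all-low effort and those above it rise toward all-high effort, so a uniformly drawn starting point lands in the low effort basin with probability 3/5 and in the high effort basin with probability 2/5.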
E. Zeeman (1980) and Peter Taylor and Leo Jonker (1978) have shown that if p is an ESS, it is asymptotically stable under the continuous time replicator dynamic, and that there are examples of asymptotically stable rest points of the replicator dynamic that are not ESS. If dynamics are specified that allow for mixed strategy inheritance, then p is an ESS if and only if it is asymptotically stable (see W. G. S. Hines 1980, Arthur Robson 1992, and Zeeman 1981). A point I will come back to is that both asymptotic stability and ESS are concerned with the stability of the system after a once and for all perturbation. They do not address the consequences of continual perturbations. As we will see, depending upon how they are modeled, continual perturbations can profoundly change the nature of learning.
While the results on the replicator dynamic are suggestive, the dynamics are somewhat restrictive, and there has been some interest in extending the analysis to more general dynamics.29 Interest has focused on two classes of dynamics. The first, monotone dynamics, roughly requires that on average, players switch from worse to better (not necessarily the best) pure strategies. The second, more restrictive, class, aggregate monotone dynamics, requires that, in addition, the switching of strategies has the property that the induced distribution over strategies in the population has a higher average payoff. It is worth noting that the extension to aggregate monotone dynamics is not that substantial: aggregate monotone dynamics are essentially multiples of the replicator dynamic (Samuelson and Jianbo Zhang 1992).
29 There has also been recent work exploring the link between the replicator dynamic and explicit models of learning. Tilman Börgers and Rajiv Sarin (1997) consider a single boundedly rational decision maker using a version of the Robert Bush and Frederick Mosteller (1951, 1955) model of positive reinforcement learning, and show that the equation describing individual behavior looks like the replicator dynamic. John Gale, Kenneth Binmore, and Larry Samuelson (1995) derive the replicator dynamic from a behavioral model of aspirations. Karl Schlag (1998) derives the replicator dynamic in a bandit setting, where agents learn from others.
Since, by definition, a Nash equilibrium is a strategy profile with the property that every player is playing a best reply to the behavior of the other players, every Nash equilibrium is a rest point of any monotone dynamic. However, since the dynamics may not introduce behavior that is not already present in the population, not all rest points are Nash equilibria. If a rest point is asymptotically stable, then the learning dynamics, starting from a point close to the rest point but with all strategies being played by a positive fraction of the population, converge to the rest point. Thus, if the rest point is not a Nash equilibrium, some agents are not optimizing and the dynamics will take the system away from that rest point. This is the first major message from evolutionary game theory: if the state is asymptotically stable, then it describes a Nash equilibrium.
4.3. Asymmetric Games and Nonexistence of ESS
The assumption that the traders are drawn from a single population and that the two traders in any bargaining encounter are symmetric is important. In order for two traders in a trading encounter to be symmetric in this way, their encounter must be on "neutral" ground, and not in one of their establishments.30 Another possibility is that each encounter occurs in one of the traders' establishments, and the behavior of the traders depends upon whether he is the visitor or the owner of the establishment. This is illustrated in Figure 6. This game has three Nash equilibria: The stable profile of a 50/50 mixture between cautious and bold behavior is still a mixed strategy equilibrium (in fact, it is the only symmetric equilibrium). There are also two pure strategy asymmetric equilibria (the owner is bold, while the visitor is cautious; and the owner is cautious, while the visitor is bold). Moreover, these two pure strategy equilibria are strict.
30 More accurately, the traders' behavior cannot depend on the location.
Figure 6. Payoff to two traders bargaining in the row trader's establishment (the column player is the visiting trader). The first number is the owner's payoff, the second is the visitor's. The gains from trade are 4.
Symmetric games, like that in Figure 6, have the property that the strategies available to the players do not depend upon their role, i.e., row (owner) or column (visitor). The assumption that in a symmetric game, agents cannot condition on their role is called no role identification. In most games of interest, the strategies available to a player also depend upon his role. Even if not, as in the example just discussed, there is often some distinguishing feature that allows players to identify their role (such as the row player being the incumbent and the column player being an entrant in a contest for a location). Such games are called asymmetric.
The notion of an ESS can be applied to such games by either changing the definition to allow for asymmetric mutants (as in Jeroen Swinkels 1992), or, equivalently, by symmetrizing the game. The symmetrized game is the game obtained by assuming that, ex ante, players do not know which role they will have, so that players' strategies specify behavior conditional on different roles, and first having a move of nature that allocates each player to a role. (In the trader example, a coin flip for each trader determines whether the trader stays "home" this period, or visits another establishment.) However, every ESS in such a symmetrized game is a strict equilibrium (Selten 1980). This is an important negative result, since most games (in particular, nontrivial extensive form games) do not have strict equilibria, and so ESSs do not exist for most asymmetric games.
Figure 7. The asymmetric version of the game in Figure 5. The strategy (a,a) is not an ESS of the symmetrized game, since the payoff earned by the strategy (a,a) when playing against (b,a) equals the payoff earned by the strategy (b,a) when playing against (b,a) (which equals $\frac{1}{2}\pi(a,a) + \frac{1}{2}\pi(a,b) = 3/2$).
The intuition for the nonexistence is helpful in what comes later, and is most easily conveyed if we take the monomorphic interpretation of mixed strategies. Fix a nonstrict Nash equilibrium of an asymmetric game, and suppose it is the row player that has available other best replies. Recall that a strategy specifies behavior for the agent in his role as the row player and also in his role as the column player. The mutant of interest is given by the strategy that specifies one of these other best replies for the row role and the existing behavior for the column role. This mutant is not selected against, since its expected payoff is the same as that of the remainder of the population. First note that all agents in their column role still behave the same as before the invasion. Then, in his row role, the mutant is (by assumption) playing an alternative best reply, and in his column role he receives the same payoff as every other column player (since he is playing the same as them). See Figure 7. The idea that evolutionary (or selection) pressures may not be effective against alternative best replies plays an important role in subsequent work on cheap talk and forward and backward induction.
Figure 8. The phase diagram for the two population trader game (horizontal axis: proportion of bold owners).
It is also helpful to consider the behavior of dynamics in a two-population world playing the trader game. There are two populations, owners and visitors. A state now consists of the pair (p,q), where p is the fraction of owners who are bold, while q is the fraction of visitors who are bold.31 There is a separate replicator dynamic for each population, with the payoff to a strategy followed by an owner depending only on q, and not p (this is the observation from the previous paragraph that row players do not interact with row players). While (p*,q*), where p* = q* = 1/2, is still a rest point of the two dimensional dynamical system describing the evolution of trader behavior, it is no longer asymptotically stable. The phase diagram is illustrated in Figure 8. If there is a perturbation, then owners and traders move toward one of the two strict pure equilibria. Both of the asymmetric equilibria are asymptotically stable.
31 This is equivalent to considering dynamics in the one-population model where the game is symmetrized and p is the fraction of the population who are bold in the owner role.
If the game is asymmetric, then we have already seen that the only ESS are strict equilibria. There is a similar lack of power in considering asymptotic stability for general dynamics in asymmetric games. In particular, asymptotic stability in asymmetric games implies "almost" strict Nash equilibria, and if the profile is pure, it is strict.32 For the special case of replicator dynamics, a Nash equilibrium is asymptotically stable if and only if it is strict (Klaus Ritzberger and Jörgen Weibull 1995).
5. Which Nash Equilibrium?
Beyond excluding non-Nash behavior as stable outcomes, evolutionary game theory has provided substantial insight into the types of behavior that are consistent with evolutionary models, and the types of behavior that are not.
Figure 9. Any state in which all agents in population 2 play L is a stable rest point (columns: Player 2 plays L or R).
5.1. Domination
The strongest positive results concern the behavior of the dynamics with respect to strict domination: If a strategy is strictly dominated by another (that is, it yields a lower payoff than the other strategy for all possible choices of the opponents), then over time that strictly dominated strategy will disappear (a smaller fraction of the population will play that strategy). Once that strategy has (effectively) disappeared, any strategy that is now strictly dominated (given the deletion of the original dominated strategy) will also now disappear.
32 Samuelson and Zhang (1992, p. 377) has the precise statement. See also Daniel Friedman (1991) and Weibull (1995) for general results on continuous time dynamics.
Figure 10. An extensive form game with the normal form in Figure 9.
There is an important distinction here between strict and weak domination. It is not true that weakly dominated strategies are similarly eliminated. Consider the game taken from Samuelson and Zhang (1992, p. 382) in Figure 9. It is worth noting that this is the normal form of the extensive form game with perfect information given in Figure 10.
point has mostly L-playing agents in population 2. If we model the dynamics for this game as a two-dimensional continuous time replicator dynamic, we have33
$$\frac{dp^t}{dt} = p^t \times \left(\pi_1(T,q^t) - \pi_1(p^t,q^t)\right)$$
and
$$\frac{dq^t}{dt} = q^t \times \left(\pi_2(p^t,L) - \pi_2(p^t,q^t)\right),$$
where $p^t$ is the fraction of population 1 playing T and $q^t$ is the fraction of population 2 playing L. The adjustment of the fraction of population 2 playing L reflects the strict dominance of L over R:
Since L always does better than R, if $q^t$ is interior (that is, there are both agents playing L and R in the population) the fraction playing L increases ($dq^t/dt > 0$), independently of the fraction in population 1 playing T, with the adjustment only disappearing as $q^t$ approaches 1. The adjustment of the fraction of population 1 playing T, on the other hand, depends on the fraction of population 2 playing R: if almost all of population 2 is playing L ($q^t$ is close to 1), then the adjustment in $p^t$ is small (in fact, arbitrarily small for $q^t$ arbitrarily close to 1), no matter what the value of $p^t$. The phase diagram is illustrated in Figure 11.
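The behavior described here can be explored numerically. The sketch below is hedged throughout: the payoff matrix of Figure 9 is not reproduced in this section, so the code assumes hypothetical payoffs with the same dominance structure (for player 1, T always pays 1 while B pays 1 against L and 0 against R; for player 2, L always pays 1 and R always pays 0). Under these assumed payoffs the two-population replicator dynamic reduces to $dp/dt = p(1-p)(1-q)$ and $dq/dt = q(1-q)$. The names `rhs` and `limit_state` and the Euler scheme are also assumptions.

```python
def rhs(p, q):
    """Two-population replicator dynamic for (p, q) = (share T, share L).

    ASSUMPTION: hypothetical payoffs in which T weakly dominates B and
    L strictly dominates R, as described in the text.
    """
    pay_T, pay_B = 1.0, q              # B falls short of T only against R
    avg_1 = p * pay_T + (1.0 - p) * pay_B
    pay_L, pay_R = 1.0, 0.0            # L always does better than R
    avg_2 = q * pay_L + (1.0 - q) * pay_R
    return p * (pay_T - avg_1), q * (pay_L - avg_2)


def limit_state(p, q, dt=0.01, steps=5000):
    """Follow the dynamic from an interior state with forward Euler steps."""
    for _ in range(steps):
        dp, dq = rhs(p, q)
        p, q = p + dt * dp, q + dt * dq
    return p, q
```

Comparing `limit_state(0.5, 0.5)` with `limit_state(0.2, 0.5)` shows the point made in the text: in both cases population 2 ends up playing L almost exclusively, but the share of population 1 playing T settles at a different level depending on the starting state, because the pressure against B vanishes as R disappears.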
It is also important to note that no rest point is asymptotically stable. Even the state with all agents in population 1 playing T and all agents in population 2 playing L is not asymptotically stable, because a perturbation that increases the fraction of population 2 playing R while leaving the entire population 1 playing T will not disappear: the system has been perturbed toward another rest
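33 The two-population version of (1) is:
$$\frac{dp_i^t}{dt} = p_i^t \times \left(\pi_1(i,q^t) - \pi_1(p^t,q^t)\right)$$
and
$$\frac{dq_j^t}{dt} = q_j^t \times \left(\pi_2(p^t,j) - \pi_2(p^t,q^t)\right),$$
where $\pi_k(i,j)$ is player k's payoff from the strategy pair (i,j) and $\pi_k(p,q) = \sum_{i,j} \pi_k(i,j) p_i q_j$.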
Figure 11. The phase diagram for the game with weak dominance (horizontal axis: fraction playing T, $p^t$).
point. Nothing prevents the system from "drifting" along the heavy lined horizontal segment in Figure 11. Moreover, each time the system is perturbed from a rest point toward the interior of the state space (so that there is a positive fraction of population 1 playing B and of population 2 playing R), it typically returns to a different rest point (and for many perturbations, to a rest point with a higher fraction of population 2 playing L). This logic is reminiscent of the intuition given earlier for the nonexistence of ESS in asymmetric games. It also suggests that sets of states will have better stability properties than individual states.
Recall that if a single strategy profile is "evolutionarily stable," then behavior within the population, once near that profile, converges to it and never leaves it. A single strategy profile describes the aggregate pattern of behavior within the population. A set of strategy profiles is a collection of such descriptions. Loosely, we can think of a set of strategy profiles as being "evolutionarily stable" if behavior within the population, once near any profile in the set, converges to some profile within it and never leaves the set. The important feature is that behavior within the population need not settle down to a steady state, rather it can "drift" between the different patterns within the "evolutionarily stable" set.
5.2. The Ultimatum Game and Backward Induction
The ultimatum game is a simple game with multiple Nash equilibria and a unique backward induction outcome. There is $1 to divide. The proposer proposes a division to the responder. The responder either accepts the division, in which case it is implemented, or rejects it, in which case both players receive nothing. If the proposer can make any proposal, the only backward induction solution has the responder accepting the proposal in which the proposer receives the entire dollar. If the dollar must be divided into whole pennies, there is another solution in which the responder rejects the proposal in which he receives nothing and accepts the proposal of 99 cents to the proposer and 1 cent to the responder. This prediction is uniformly rejected in experiments!
How do Gale, Binmore, and Samuelson (1995) explain this? The critical issue is the relative speeds of convergence. Both proposers and responders are "learning." The proposers are learning which offers will be rejected (this is learning as we have been discussing it). In principle, if the environment (terms of the proposal) is sufficiently complicated, responders may also have difficulty evaluating offers. In experiments, responders do reject as much as 30 cents. However, it is difficult to imagine that responders do not understand that 30 cents is better than zero. There are at least two possibilities that still allow the behavior of the responders to be viewed as learning. The first is that responders do not believe the rules of the game as described and take time to learn that the proposer really was making a take-it-or-leave-it offer. As mentioned in footnote 7, most people may not be used to take-it-or-leave-it offers, and it may take time for the responders to properly appreciate what this means. The second is that the monetary reward is only one ingredient in responders' utility functions, and that responders must learn what the "fair" offer is.
If proposers learn sufficiently fast relative to responders, then there can be convergence to a Nash equilibrium that is not the backward induction solution. In the backward induction solution, the responder gets almost nothing, so that the cost of making an error is low, while the proposer loses significantly if he misjudges the acceptance threshold of the responder. In fact, in a simplified version of the ultimatum game, Nash equilibria in which the responder gets a substantial share are stable (although not asymptotically stable). In a non-backward-induction outcome, the proposer never learns that he could have offered less to the responder (since he never observes such behavior). If he is sufficiently pessimistic about the responder's acceptance threshold, then he will not offer less, since a large share is better than nothing. Consider a simplified ultimatum game that gives the proposer and responder two choices: the proposer either offers even division or a small positive payment, and the responder only responds to the small positive payment (he must accept the equal division). Figure 12 is the phase diagram for this simplified game. While this game has some similarities to that in Figure 9, there is also an important difference. All the rest points inconsistent with the backward induction solution (with the crucial exception of the point labeled A) are Liapunov stable (but not asymptotically so). Moreover, in response to some infrequent perturbations, the system will effectively "move along" these rest points toward A. But once at A (unlike the corresponding stable rest point in Figure 11), the system will move far away. The rest point labeled A is not stable: Perturbations near A can move the system onto a trajectory that converges to the backward induction solution.
Figure 12. A suggestive phase diagram for a simplified version of the ultimatum game.
While this seems to suggest that non-backward-induction solutions are fragile, such a conclusion is premature. Since any model is necessarily an approximation, it is important to allow for drift. Binmore and Samuelson (1996) use the term drift to refer to unmodeled small changes in behavior. One way of allowing for drift would be to add to the learning dynamic an additional term reflecting a deterministic flow from one strategy to another that was independent of payoffs. In general, this drift would be small, and in the presence of strong incentives to learn, irrelevant. However, if players (such as the responders above) have little incentive to learn, then the drift term becomes more important. In fact, adding an arbitrarily small uniform drift term to the replicator dynamic changes the dynamic properties in a fundamental way. With no drift, the only asymptotically stable rest point is the backward induction solution. With drift, there can be another asymptotically stable rest point near the non-backward-induction Nash equilibria.
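The role of a drift term can be made concrete with a minimal Python sketch of a two-population replicator dynamic for a simplified ultimatum game. Everything specific here is an assumption made for illustration and is not taken from Figure 12: an even split pays (2, 2), an accepted low offer pays the proposer 3 and the responder 1, and a rejected low offer pays (0, 0). The variable x is the share of proposers making the low offer, y is the share of responders who accept it, and the names flow, simulate, drift_p, and drift_r are hypothetical.

```python
def flow(x, y, drift_p=0.0, drift_r=0.0):
    """Time derivatives (dx, dy) of the replicator dynamic with uniform drift.

    Assumed payoffs: the low offer earns the proposer 3*y, the even split
    earns 2; accepting earns a responder 1 more than rejecting whenever a
    low offer is received (which happens with frequency x).
    """
    # Payoff-driven (replicator) part, scaled down by the weight on drift.
    dx = (1 - drift_p) * x * (1 - x) * (3 * y - 2)
    dy = (1 - drift_r) * y * (1 - y) * x
    # Drift part: a payoff-independent flow toward the uniform mix (1/2, 1/2).
    dx += drift_p * (0.5 - x)
    dy += drift_r * (0.5 - y)
    return dx, dy


def simulate(x, y, steps=20000, dt=0.01, drift_p=0.0, drift_r=0.0):
    """Follow the dynamic with simple Euler steps and return the final state."""
    for _ in range(steps):
        dx, dy = flow(x, y, drift_p, drift_r)
        x += dt * dx
        y += dt * dy
    return x, y
```

Under these assumed payoffs, every state with x = 0 is a rest point when there is no drift, and none of them remains a rest point once the proposers' drift is positive. Which rest points then attract nearby states depends on the relative sizes of drift_p and drift_r, which is the sense in which drift matters most for the population with little incentive to learn.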
The ultimatum game is special in its simplicity. Ideas of backward induction and sequential rationality have been influential in more complicated games (like the centipede game, the repeated prisoner's dilemma, and alternating offer bargaining games). In general, backward induction has received little support from evolutionary game theory for more complicated games (see, for example, Ross Cressman and Karl Schlag (1995), Georg Nöldeke and Samuelson (1993), and Giovanni Ponti (1996)).
There are two populations, with agents in population 1 playing the role of player 1 and agents in population 2 playing the role of player 2. Any state in which all agents in population 2 play L is Liapunov stable: Suppose the state starts with almost all agents in population 2 playing L. There is then very little incentive for agents in the first population playing B to change behavior, since T is only marginally better (moreover, if the game played is in fact the extensive form of Figure 10, then agents in population 1 only have a choice to make if they are matched with one of the few agents choosing R). So, dynamics will not move the state far from its starting point.
5.3. Forward Induction and Efficiency
In addition to backward induction, the other major refinement idea is forward induction. In general, forward induction receives more support from evolutionary arguments than does backward induction. The best examples of this are in the context of cheap talk (starting with Akihiko Matsui 1991). Some recent papers are V. Bhaskar (1995), Andreas Blume, Yong-Gwan Kim, and Joel Sobel (1993), Kim and Sobel (1995), and Karl Wärneryd (1993); see Sobel (1993) and Samuelson (1993) for a discussion.
Forward induction is the idea that actions can convey information about the future intentions of players even off the equilibrium path. Cheap talk games (signaling games in which the messages are costless) are ideal to illustrate these ideas. Cheap talk games have both revealing equilibria (cheap talk can convey information) and the so-called babbling equilibria (messages do not convey any information because neither the sender nor the receiver expects them to), and forward induction has been used to eliminate some nonrevealing equilibria.
Consider the following cheap talk game: There are two states of the world, rain and sun. The sender knows the state and announces a message, rain or sun. On the basis of the message, the receiver chooses an action, picnic or movie. Both players receive 1 if the receiver's action agrees with the state of the world (i.e., picnic if sun, and movie if rain), and 0 otherwise. Thus, the sender's message is payoff irrelevant and so is "cheap talk." The obvious pattern of behavior is for the sender to signal the state by making a truthful announcement in each state and for the receiver to then choose the action that agrees with the state. In fact, since the receiver can infer the state from the announcement (if the announcement differs across states), there are two separating equilibrium profiles (truthful announcing, where the sender announces rain if rain and sun if sun; and false announcing, where the sender announces sun if rain and rain if sun).
A challenge for traditional noncooperative theory, however, is that babbling is also an equilibrium: The sender places equal probability on rain and sun, independent of the state. The receiver, learning nothing from the message, places equal probability on rain and sun. Consider ESS in the symmetrized game (where each player has equal probability of being the sender or receiver). It turns out that only separating equilibria are ESS. The intuition is in two parts. First, the babbling equilibrium is not an ESS: Consider the truthful entrant (who announces truthfully and responds to announcements by choosing the same action as the announcement). The payoff to this entrant is strictly greater than that of the babbling strategy against the perturbed population (both receive the same payoff when matched with the babbling strategy, but the truthful strategy does strictly better when matched with the truthful strategy). Moreover, the separating equilibria are strict equilibria, and so, ESSs. Suppose, for example, all players are playing the truthful strategy. Then any other strategy must yield strictly lower payoffs: Either, as a sender, the strategy specifies an announcement conditional on a state that does not correspond to that state, leading to an incorrect action choice by the truthful receiver, or, as a responder, the strategy specifies an incorrect action after a truthful announcement.
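The two-part argument can be checked by brute force. The following Python sketch is an illustration under stated assumptions rather than a transcription of any published computation: the two states are taken to be equally likely (the text does not specify the prior), a strategy in the symmetrized game is a pair (sender rule, receiver rule), and all function and variable names are hypothetical.

```python
from fractions import Fraction
from itertools import product

STATES = ("rain", "sun")
MESSAGES = ("rain", "sun")
ACTIONS = ("movie", "picnic")
CORRECT = {"rain": "movie", "sun": "picnic"}
HALF = Fraction(1, 2)


def pure(mapping):
    """Turn a deterministic rule into a behavior strategy (dict of distributions)."""
    return {key: {value: Fraction(1)} for key, value in mapping.items()}


def match_payoff(sender, receiver):
    """Expected common payoff of one sender-receiver match (states equally likely)."""
    total = Fraction(0)
    for state in STATES:
        for msg, p_msg in sender[state].items():
            for act, p_act in receiver[msg].items():
                if act == CORRECT[state]:
                    total += HALF * p_msg * p_act
    return total


def u(x, y):
    """Payoff to x against y when each is sender or receiver with probability 1/2."""
    return HALF * match_payoff(x[0], y[1]) + HALF * match_payoff(y[0], x[1])


def pure_strategies():
    """All 16 pure strategies of the symmetrized game."""
    return [(pure(dict(zip(STATES, msgs))), pure(dict(zip(MESSAGES, acts))))
            for msgs in product(MESSAGES, repeat=2)
            for acts in product(ACTIONS, repeat=2)]


def is_strict_equilibrium(x):
    """True if every other pure strategy earns strictly less against x."""
    return all(u(s, x) < u(x, x) for s in pure_strategies() if s != x)


def can_invade(entrant, incumbent):
    """True if the entrant violates the ESS conditions for the incumbent."""
    if u(entrant, incumbent) > u(incumbent, incumbent):
        return True
    return (u(entrant, incumbent) == u(incumbent, incumbent)
            and u(entrant, entrant) >= u(incumbent, entrant))


truthful = (pure({"rain": "rain", "sun": "sun"}),
            pure({"rain": "movie", "sun": "picnic"}))
false_announcing = (pure({"rain": "sun", "sun": "rain"}),
                    pure({"sun": "movie", "rain": "picnic"}))
babbling = ({s: {m: HALF for m in MESSAGES} for s in STATES},
            {m: {a: HALF for a in ACTIONS} for m in MESSAGES})
```

With these definitions, can_invade(truthful, babbling) holds because the truthful entrant ties with babbling against the babbling population but does strictly better against itself, while is_strict_equilibrium holds for both separating profiles.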
This simple example is driven by the crucial assumption that the number of messages equals the number of states. If there are more messages than states, then there are no strict equilibria, and so no ESSs. To obtain similar efficiency results for a larger class of games, we need to use set-valued solution concepts, such as cyclical stability (Itzhak Gilboa and Matsui 1991) used by Matsui (1991), and equilibrium evolutionary stability (Swinkels 1992) used by Blume, Kim, and Sobel (1993). Matsui (1991) and Kim and Sobel (1995) study coordination games with a preplay round of communication. In such games, communication can allow players to coordinate their actions. However, as in the above example, there are also babbling equilibria, so that communication appears not to guarantee coordination. Evolutionary pressures, on the other hand, destabilize the babbling equilibria. Blume, Kim, and Sobel (1993) study cheap talk signaling games like that of the example. Bhaskar (1995) obtains efficiency with noisy preplay communication, and shows that, in his context at least, the relative importance of noise and mutations is irrelevant.
The results in this area strongly suggest that evolutionary pressures can destabilize inefficient outcomes. The key intuition is that suggested by the example above. If an outcome is inefficient, then there is an entrant that is equally successful against the current population, but that can achieve the efficient outcome when playing against a suitable opponent. A crucial aspect is that the model allow the entrant to have sufficient flexibility to achieve this, and that is the role of cheap talk above.
Figure 13. (H,H) seems more likely than (L,L). (Payoff matrix: the row player and the column player each choose H or L; (H,H) pays 100,100.)
5.4. Multiple Strict Equilibria
Multiple best replies for a player raise the question of determining which of these best replies are "plausible" or "sensible." The refinements literature (of which backward and forward induction are a part) attempted to answer this question, and by so doing eliminate some equilibria as being uninteresting. Multiple strict equilibria raise a completely new set of issues. It is also worth recalling that any strict equilibrium is asymptotically stable under any monotonic dynamic. I argued earlier that this led to the desirable feature of history dependence. However, even at an intuitive level, some strict equilibria are more likely. For example, in the game described by Figure 13, (H,H) seems more likely than (L,L). There are several ways this can be phrased. Certainly, (H,H) seems more "focal," and if asked to play this game, I would play H, as would (I suspect) most people. Another way of phrasing this is to compare the basins of attraction of the two equilibria under monotone dynamics (they will all agree in this case, since the game is so simple),34 and observe that the basin of attraction of (H,H) is 100 times the size of that of (L,L). If we imagine that the initial condition is chosen randomly, then the H pattern of behavior is 100 times as likely to arise. I now describe a more recent perspective that makes the last idea more precise by eliminating the need to specify an initial condition.
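The factor of 100 can be reproduced with a short calculation. The Python sketch below assumes a single population playing a symmetric 2 × 2 game, so that the state is the share p of players choosing H and a monotone dynamic moves p toward whichever action is the better reply. Only the (H,H) payoff of 100 is taken from the discussion above; the payoff of 1 at (L,L) and of 0 after miscoordination are assumptions chosen to be consistent with the stated ratio, and the name basin_sizes is hypothetical.

```python
from fractions import Fraction


def basin_sizes(a_hh, a_hl, a_lh, a_ll):
    """Basin sizes (for H, for L) in a symmetric 2x2 coordination game.

    a_xy is the payoff to playing x against an opponent playing y. Both
    (H,H) and (L,L) are assumed to be strict equilibria.
    """
    gain_h = Fraction(a_hh - a_lh)  # advantage of H when the opponent plays H
    gain_l = Fraction(a_ll - a_hl)  # advantage of L when the opponent plays L
    # H is the better reply exactly when the share of H players exceeds p_star.
    p_star = gain_l / (gain_h + gain_l)
    return 1 - p_star, p_star


size_h, size_l = basin_sizes(100, 0, 0, 1)
ratio = size_h / size_l
```

Under the assumed payoffs the threshold share is 1/101, so the basin of H has length 100/101 and the basin of L has length 1/101.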
The motivation for ESS and (asymptotic) stability was a desire for robustness to a single episode of perturbation. It might seem that if learning operates at a sufficiently higher rate than the rate at which new behavior is introduced into the population, focusing on the dynamic implications of a single perturbation is reasonable. Dean Foster and Young (1990) have argued that the notions of an ESS and attractor of the replicator dynamic do not adequately capture long-run stability when there are continual small stochastic shocks. Young and Foster (1991) describe simulations and discuss this issue in the context of Robert Axelrod's (1984) computer tournaments.
There is a difficulty that must be confronted when explicitly modeling randomness. As I mentioned above, the standard replicator dynamic story is an idealization for large populations (specifically, a continuum). If the mutation-experimentation occurs at the individual level, there should be no aggregate impact; the resulting evolution of the system is deterministic and there are no "invasion events." There are two ways to approach this. One is to consider aggregate shocks (this is the approach of Foster and Young 1990 and Fudenberg and Christopher Harris 1992). The other is to consider a finite population and analyze the impact of individual experimentation. Michihiro Kandori, Mailath, and Rafael Rob (1993) study the implications of randomness on the individual level; in addition to emphasizing individual decision making, the paper analyzes the simplest model that illustrates the role of perpetual randomness.
34 A state is in the basin of attraction of an equilibrium if, starting at that state and applying the dynamic, eventually the system is taken to the state in which all players play the equilibrium strategy.
Consider again the stag hunt game, and suppose each work team consists of two workers. Suppose, moreover, that the firm has N workers. The relevant state variable is z, the number of workers who choose high effort. Learning implies that if high effort is a better strategy than low effort, then at least some workers currently choosing low effort will switch to high effort (i.e., $z_{t+1} > z_t$ if $z_t < N$). A similar property holds if low effort is a better strategy than high. The learning or selection dynamics describe, as before, a dynamic on the set of population states, which is now the number of workers choosing high effort. Since we are dealing with a finite set of workers, this can also be thought of as a Markov process on a finite state space. The process is Markov, because, by assumption, workers only learn from last period's experience. Moreover, both all workers choosing high effort and all workers choosing low effort are absorbing states of this Markov process.35 This is just a restatement of the observation that both states correspond to Nash equilibria.
Perpetual randomness is incorporated by assuming that, in each period, each worker independently switches his effort choice (i.e., experiments) with probability ε, where ε > 0 is to be thought of as small. In each period, there are two phases: the learning phase and the experimentation phase. Note that after the experimentation phase (in contrast to the learning phase), with positive probability, fewer workers may be choosing a best reply.
35 An absorbing state is a state that the process, once in, never leaves.
Attention now focuses on the behavior of the Markov chain with the perpetual randomness. Because of the experimentation, every state is reached with positive probability from any other state (including the states in which all workers choose high effort and all workers choose low effort). Thus, the Markov chain is irreducible and aperiodic. It is a standard result that such a Markov chain has a unique stationary distribution. Let p(ε) denote the stationary distribution. The goal has been to characterize the limit of p(ε) as ε becomes small. This, if it exists, is called the stochastically stable distribution (Foster and Young 1990) or the limit distribution. States in the support of the limit distribution are sometimes called long-run equilibria (Kandori, Mailath, and Rob 1993). The limit distribution is informative about the behavior of the system for positive but very small ε. Thus, for small degrees of experimentation, the system will spend almost all of its time with all players choosing the same action. However, every so often (but infrequently) enough players will switch action, which will then switch play of the population to the other action, until again enough players switch.
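The construction can be illustrated numerically. The Python sketch below builds the transition matrix for a learning phase followed by an experimentation phase and approximates the stationary distribution for a given experimentation probability. Several ingredients are assumptions made for illustration: the learning phase is a best-reply dynamic in which every worker adopts the currently better action, the share of high-effort partners is approximated by z/N, high effort pays 5 if reciprocated and 0 otherwise, and low effort pays 3. The function names are hypothetical.

```python
from math import comb


def binom_pmf(n, k, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p)**(n - k)


def after_learning(z, n_workers, v_high, v_low):
    """State after the learning phase under the assumed best-reply dynamic."""
    if v_high * z > v_low * n_workers:  # high effort earns more on average
        return n_workers
    if v_high * z < v_low * n_workers:  # low effort earns more on average
        return 0
    return z  # exact tie: nobody has a reason to switch


def transition_matrix(n_workers, eps, v_high=5, v_low=3):
    """One period: learning phase, then independent experimentation."""
    P = [[0.0] * (n_workers + 1) for _ in range(n_workers + 1)]
    for z in range(n_workers + 1):
        y = after_learning(z, n_workers, v_high, v_low)
        # i high-effort workers and j low-effort workers experiment.
        for i in range(y + 1):
            for j in range(n_workers - y + 1):
                prob = binom_pmf(y, i, eps) * binom_pmf(n_workers - y, j, eps)
                P[z][y - i + j] += prob
    return P


def stationary(P, squarings=60):
    """Approximate the stationary distribution by repeatedly squaring P."""
    size = len(P)
    for _ in range(squarings):
        P = [[sum(P[i][k] * P[k][j] for k in range(size)) for j in range(size)]
             for i in range(size)]
        # Renormalize each row so rounding errors do not compound.
        P = [[value / sum(row) for value in row] for row in P]
    return P[0]


def mass_on_low_basin(n_workers, eps):
    """Stationary probability of states in which low effort is the best reply."""
    pi = stationary(transition_matrix(n_workers, eps))
    return sum(pi[z] for z in range(n_workers + 1)
               if after_learning(z, n_workers, 5, 3) == 0)
```

With eight workers and the assumed payoffs, escaping the low-effort basin takes five simultaneous experiments while escaping the high-effort basin takes four, so mass_on_low_basin(8, eps) should rise toward one as eps falls.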
It is straightforward to show that the stochastically stable distribution exists; characterizing it is more difficult. Kandori, Mailath, and Rob (1993) is mostly concerned with the case of 2 × 2 symmetric games with two strict symmetric Nash equilibria. Any monotone dynamic divides the state space into the same two basins of attraction of the equilibria. The risk dominant equilibrium (Harsanyi and Selten 1988) is the equilibrium with the larger basin of attraction. The risk dominant equilibrium is "less risky" and may be Pareto dominated by the other equilibrium (the risk dominant equilibrium results when players choose best replies to beliefs that assign equal probability to the two possible actions of the opponent). In the stag hunt example, the risk dominant equilibrium is low effort and it is Pareto dominated by high effort. In Figure 13, the risk dominant equilibrium is H and it Pareto dominates L. Kandori, Mailath, and Rob (1993) show that the limit distribution puts probability one on the risk dominant equilibrium. The non-risk dominant equilibrium is upset because the probability of a sufficiently large number of simultaneous mutations that leave society in the basin of attraction of the risk dominant equilibrium is of higher order than that of a sufficiently large number of simultaneous mutations that cause society to leave that basin of attraction.
Figure 14. A new "stag hunt" played by workers in a team. (Payoff table: rows give the worker's effort, high or low; columns give the minimum of the other workers' efforts, high or low.)
This style of analysis allows us to formally describe strategic uncertainty. To make this point, consider again the workers involved in team production as a stag hunt game. For the payoffs as in Figure 1, risk dominance leads to the low effort outcome even if each work team has only two workers. Suppose, though, the payoffs are as in Figure 14, with V being the value of high effort (if reciprocated). For V > 6 and two-worker teams, the principles of risk dominance and payoff dominance agree: high effort. But now suppose each team requires n workers. While this is no longer a two player game, it is still true that (for large populations) the size of the basin of attraction is the determining feature of the example. For example, if V = 8, high effort has a smaller basin of attraction than low effort for all n ≥ 3. As V increases, the size of the team for which high effort becomes too risky increases.36 This attractively captures the idea that cooperation in large groups can be harder to achieve than cooperation in small groups, just due to the uncertainty that everyone will cooperate. While a small possibility of noncooperation by any one agent is not destabilizing in small groups (since cooperation is a strict equilibrium), it is in large ones.
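The dependence on team size can be worked out explicitly. The Python sketch below assumes that a worker choosing high effort earns V only if all n - 1 teammates also choose high effort (and 0 otherwise), that low effort earns 3 regardless, and that teammates are drawn independently from a large population. High effort is then a best reply when the share p of high-effort workers satisfies p^(n-1) V > 3, and it has the larger basin when this holds at p = 1/2. The payoff of 3 is inferred from the thresholds quoted in the text (V > 6 for two-worker teams), and the function names are hypothetical.

```python
from fractions import Fraction

LOW_PAYOFF = 3  # assumed payoff to low effort, inferred from the V > 6 threshold


def high_has_larger_basin(V, n):
    """True if high effort is the strict best reply when half the population is high."""
    return Fraction(1, 2) ** (n - 1) > Fraction(LOW_PAYOFF) / Fraction(V)


def largest_safe_team(V, max_n=64):
    """Largest team size n >= 2 for which high effort keeps the larger basin."""
    best = None
    for n in range(2, max_n + 1):
        if not high_has_larger_basin(V, n):
            break
        best = n
    return best
```

For V = 8 this gives a largest team size of two, matching the statement that high effort has the smaller basin for all n ≥ 3, and raising V raises the largest team size, matching the claim that larger rewards sustain cooperation in larger teams.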
In contrast to both ESS and replicator dynamics, there is a unique outcome. History does not matter. This is the result of taking two limits: first, time is taken to infinity (which justifies looking at the stationary distribution), and then the probability of mutation is taken to zero (looking at small rates). The rate at which the stationary distribution is approached from an arbitrary starting point is decreasing in population size (since the driving force is the probability of a simultaneous mutation by a fraction of the population). Motivated by this observation, Glenn Ellison (1993) studied a model with local interactions that has substantially faster rates of convergence. The key idea is that, rather than playing against a randomly drawn opponent from the entire population, each player plays only against a small number of neighbors. The neighborhoods are overlapping, however, so that a change of behavior in one neighborhood can (eventually) influence the entire population.
36 High effort has the larger basin of attraction if $(1/2)^{n-1} > 3/V$.
Since it is only for the case of 2 × 2 symmetric games that the precise modeling of the learning dynamic is irrelevant, extensions to larger games require specific assumptions about the dynamics. Kandori and Rob (1995), Nöldeke and Samuelson (1993), and Young (1993) generalize the best reply dynamic in various directions.
Kandori, Mailath, and Rob (1993), Young (1993), and Kandori and Rob (1995) study games with strict equilibria, and (as the example above illustrates) the relative magnitudes of probabilities of simultaneous mutations are important. In contrast, Samuelson (1994) studies normal form games with nonstrict equilibria, and Nöldeke and Samuelson (1993) study extensive form games. In these cases, since the equilibria are not strict, states that correspond to equilibria can be upset by a single mutation. This leads to the limit distribution having nonsingleton support. This is the reflection in the context of stochastic dynamics of the issues illustrated by discussion of Figures 9, 10, and 12. In general, the support will contain "connected" components, in the sense that there is a sequence of single mutations from one state to another state that will not leave the support. Moreover, each such state will be a rest point of the selection dynamic. The results on extensive forms are particularly suggestive, since different points in a connected component of the support correspond to different specifications of off-the-equilibrium-path behavior.
The introduction of stochastic dynamics does not, by itself, provide a general theory of equilibrium selection. James Bergin and Bart Lipman (1996) show that allowing the limiting behavior of the mutation probabilities to depend on the state gives a general possibility theorem: Any strict equilibrium of any game can be selected by an appropriate choice of state-dependent mutation probabilities. In particular, in 2 × 2 games, if the risk dominant strategy is "harder" to learn than the other (in the sense that the limiting behavior of the mutation probabilities favors the non-risk dominant strategy), then the risk dominant equilibrium will not be selected. On the other hand, if the state dependence of the mutation probabilities only arises because the probabilities depend on the difference in payoffs from the two strategies, the risk dominant equilibrium is selected (Lawrence Blume 1994). This latter state dependence can be thought of as being strategy neutral in that it only depends on the payoffs generated by the strategies, and not on the strategies themselves. The state dependence that Bergin and Lipman (1996) need for their general possibility result is perhaps best thought of as strategy dependence, since the selection of some strict equilibria only occurs if players find it easier to switch to (learn) certain strategies (perhaps for complexity reasons).
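A stylized calculation shows how state dependence can overturn the selection of the risk dominant equilibrium. The Python sketch below is a two-state reduction and not the general construction of Bergin and Lipman (1996): the population is assumed to sit at all-low or all-high effort after each learning phase, each worker experiments with probability ε at all-low and with probability ε raised to the power alpha_high at all-high, and the numbers of simultaneous experiments needed to escape each basin (five and four, for eight workers) are assumed for illustration. All names are hypothetical.

```python
from math import comb


def tail(n, k, p):
    """Probability that at least k of n independent workers experiment."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))


def weight_on_high(eps, alpha_high=1.0, n_workers=8, escape_low=5, escape_high=4):
    """Long-run share of periods that start from the all-high state.

    escape_low experiments are needed to leave the all-low basin and
    escape_high to leave the all-high basin (both assumed values).
    """
    leave_low = tail(n_workers, escape_low, eps)
    leave_high = tail(n_workers, escape_high, eps ** alpha_high)
    # Stationary distribution of the two-state chain between the basins.
    return leave_low / (leave_low + leave_high)
```

With a uniform experimentation rate (alpha_high = 1) the weight on high effort shrinks as ε falls, since leaving the low-effort basin takes more experiments. With alpha_high = 2, experiments away from high effort become rare enough that the comparison reverses (2 × 4 > 5) and the weight on high effort approaches one.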
Binmore, Samuelson, and Richard Vaughan (1995), who study the result of the selection process itself being the source of the randomness, also obtain different selection results. Finally, the matching process itself can also be an important source of randomness; see Young and Foster (1991) and Robson and Fernando Vega-Redondo (1996).
6. Conclusion
That any asymptotically stable rest point of an evolutionary dynamic is a Nash equilibrium is an important result. It shows that there are primitive foundations for equilibrium analysis. However, for asymmetric games, asymptotic stability is effectively equivalent to strict equilibria (which do not exist for many games of interest). To a large extent, this is due to the focus on individual states. If we instead consider sets of states (strategy profiles), as I discussed at the end of Section 5.1, there is hope for more positive results.37
The lack of support for the standard refinement of backward induction is in some ways a success. Backward induction has always been a problematic principle, with some examples (like the centipede game) casting doubt on its universal applicability. The reasons for the lack of support improve our understanding of when backward induction is an appropriate principle to apply.
The ability to discriminate between different strict equilibria and provide a formalization of the intuition of strategic uncertainty is also a major contribution of the area.
I suspect that the current evolutionary modeling is still too stylized to be used directly in applications. Rather, applied researchers need to be aware of what they are implicitly assuming when they do equilibrium analysis.
In many ways, there is an important parallel with the refinements literature. Originally, this literature was driven by the hope that theorists could identify the unique "right" equilibrium. If that original hope had been met, applied researchers need never worry about a multiplicity problem. Of course, that hope was not met, and we now understand that that hope, in principle, could never be met. The refinements literature still serves the useful role of providing a language to describe the properties of different equilibria. Applied researchers find the refinements literature of value for this reason, even though they cannot rely on it mechanically to eliminate "uninteresting" equilibria. The refinements literature is currently out of fashion because there were too many papers in which one example suggested a minor modification of an existing refinement and no persuasive general refinement theory emerged.
37 Sets of strategy profiles that are asymptotically stable under plausible deterministic dynamics turn out also to have strong Elon Kohlberg and Jean-François Mertens (1986) type stability properties (Swinkels 1993), in particular, the property of robustness to deletion of never weak best replies. This latter property implies many of the refinements that have played an important role in the refinements literature and signaling games, such as the intuitive criterion, the test of equilibrium domination, and D1 (In-Koo Cho and Kreps 1987). A similar result under different conditions was subsequently proved by Ritzberger and Weibull (1995), who also characterize the sets of profiles that can be asymptotically stable under certain conditions.
There is a danger that evolutionary game theory could end up like refinements. It is similar in that there was a lot of early hope and enthusiasm. And, again, there have been many perturbations of models and dynamic processes, not always well motivated. As yet, the overall picture is still somewhat unclear.
However, on the positive side, important insights are still emerging from evolutionary game theory (for example, the improving understanding of when backward induction is appropriate and the formalization of strategic uncertainty). Interesting games have many equilibria, and evolutionary game theory is an important tool in understanding which equilibria are particularly relevant in different environments.
REFERENCES
Aoyagi, Masaki. 1996. "Evolution of Beliefs and the Nash Equilibrium of Normal Form Games,"
J. Econ. Theory, 70, pp. 44469.
Aumann, Robert . 1985. "On the NonTransfer able Utility Va 1ue: A Comment on the Roth Shafer Examples," Econornetrica, 53, pp. 667
77. . 1990. "Nash Equilibria are Not SelfEn forcing," in Jean Jaskold Gabszewicz, Jean Fran~ois Richard, and Laurence A. Wolsey, pp. 201206. Aumann, Robert J, and Brandenburger, Adam.
1995. "Epistemic Conditions for Nash Equilib rium," Econornetrica, 63, pp. 116180.
1995. "Episternic Conditions for Nash Equilib rium," Econometrica, 63, pp. 116180.
Axelrod, Robert. 1984. The Evolution of Coopera tion. New York: Basic Books.
Bergin, James and Barton L. Li man 1996. "Evolution with StateDepenIent ' Mutations," Econometrica, 64, pp. 94356.
Bernheim, B. Douglas. 1984. "Rationalizable Stra te ic Behavior," Econometrica, 52, pp. 100728. Bhasfar, V 1995 "Noisy Communication md the
Evolution of Cooperation," U. St. Andrews.
Binmore, Kenneth G. and Larry Samuelson. 1996. "Evolutionary Drift and Equilibrium Selection," mimeo, U. Wisconsin.
Blume, Andreas, YongGwan Kim, and Joel Sobel. 1993. "Evolutionarv Stabilitv in Games of Com munication," ami is G Ecdn. Behavior, 5, pp. 54775.
Blume, Lawrence. 1994. "How Noise Matters," mimeo, Cornell U.
Bor ers, Tilman and Rajiv Sarin. 1997. "Learning T$rough Reinforcement and Replicator Dynamics,"J. Econ. Theory, 77, pp. 114.
Bush, Robert R. and Frederick Mosteller. 1951. "A Mathematical Model for Simple Learning," Psych. Rev., 58, pp. 31323.
. 1955. Stochastic Models of Learning.
New York: Wiley.
Cho, InKoo and David Kreps. 1987. "Signaling Games and Stable E~uilibria," Quart. 1. Econ., 102, pp 179221.
Creedy, John, Jeff Borland, and Jiirgen Eichber eer, eds. 1992. Recent Develo~ments in Game Theory. Hants, England: ~dw&d Elgar Publish ing Limited. Cressman, Ross and Karl H. Schla 1995. "The Dynamic (1n)Stability of Backwar% Induction," Technical report, Wilfred Laurier U. and Bonn u. Darwin, Charles. 1887. The Life and Letters of Charles Darwin, Including an Autobiographical Chapter. Francis Darwin, ed., second ed., Vol. 1, London: John Murray. Dekel, Eddie and Faruk Gul. 1997. "Rationality and Knowledge in Game Theory," in David M. Kreps and Kenneth F. Wallis, pp. 87172. Ellison, Glenn. 1993. "Learning, Local Interaction, and Coordination," Econometrica, 61, pp. 104771. Elster, Jon. 1989. "Social Norms and Economic Theory,"J. Econ. Perspectives, 3, pp. 99117. Foster, Dean and H. Peyton Young. 1990. "Stochastic Evolutionary Game Dynamics," Theor. Population Bio., 38, pp. 21932. Friedman, Daniel. 1991. "Evolutionary Games in Economics," Econometrica, 59, pp. 63766. Fudenberg, Drew and Christopher Harris. 1992. "Evolutionary Dynamics with Aggregate Shocks,"]. Econ. Theory, 57, pp. 42041. Fudenberg, Drew and David Kreps. 1989. "A The ory of Learning, Experimentation, and Equilib rium," mimeo, MIT and Stanford. Fudenberg, Drew and David Levine. 1993. "Steady State Learning and Nash Equilibrium," Econometrica, 61, p 54773. Gabszewicz, Jean Jas$old, Jean Franyois Richard, and Laurence A. Wolsey, eds. 1990. Economic DecisionMaking: Games, Econometrics, and Optimisation. Contributions in Honour of Jacques H. DrBze. New York: NorthHolland. Gale, John, Kenneth G. Binmore, and Larry Sam uelson. 1995. "Learning to be Imperfect: The Ultimatum Game," Games G Ecotz. Behavior, 8, 5690. Giita, Itzhak and Akihiko Matsui. 1991. "Social Stabilitv and Eauilibrium." Econometrica. 59. 85967. , , Faruk. 1996. "Rationality and Coherent Theories of Strategic Behavior," J. Econ. The ory, 70, pp. 131. Harsanyi, John C. 1977. 
Rational Behavior and Bargaining Equilibriz~in in Gaines and Social Situations. Carnbridge, UK: Cambridge U. Press. Harsanyi, John C. and Reinhard Selten. 1988. A General Theory of Equilibrium in Games. Cambridge: MIT Press. Hines, W. G. S. 1980. "Three Characterizations of Population Strategy Stability," J. Appl. Probability, 17, p 33340. Hofbauer, Josef.nd Karl Sigmund 1988 The The ory of Evolution and Dynamical Systems. Cambridge: Carnbridge U. Press. Kalai, Ehud and Ehud Lehrer. 1993. "Rational Learning Leads to Nash Equilibrium," Econometric~,61, p . 101945. Kandori, Michiliro 1997 "Evolutionary Game Theory in Economics," in David M. Kreps and Kenneth F. Wallis, p 24377. Kandori, Michihiro anSkafae1 Rob. 1995. "Evolution of Equilibria in the Long Run: A General Theory and Applications," J. Econ. Theory, 65, pp 383414. Kandori, Michihiro, George J. Mailath, and Rafael Rob. 1993. "Learning, Mutation, and Long Run Equilibria in Games," Econoinetrica, 61, pp. 2956. Kim, YongGwan and Joel Sobel. 1995. "An Evolu tionary Approach to PrePlay Communication," Econometrica, 63, pp. 118193. Kohlberg, Elon and JeanFrancois Mertens. 1986. "On the Strategic Stability of Equilibria," Econometrica, 54, pp. 100337. Kreps, David M. 1990a. Game Theory and Eco nomic Modelling. Oxford: Clarendon Press. . 1990b. A Course in Microeconornic Theory. Princeton, NJ: Princeton U. Press. Kreps, David M, and Kenneth F. Wallis, eds. 1997. Advances in Economics and Econometrics: Theory and ApplicationsSeventh World Congress of the Econometric Society, Vol. 1. Carnbridge: Carnbridge U. Press. Mailath, George J. 1992. "Introduction: Sympo 1374 Journal of Economic Literature, Vol. XXXVZ (September 1998) Matsui, Akihiko. 1991. "CheapTalk and Coopera tion in Society," J. Econ. Theory, 54, pp. 245 58. Maynard Smith, John. 1982. Evolution and the Theory of Games. Cambridge: Cambridge U. Press. Maynard Smith, John and G. R. Price. 1973. "The Logic of Animal Conflict," Nature, 246, pp. 15 18. 
Myerson, Roger B. 1991. Game Theory: Analysis of Conflict. Cambridge, MA: Harvard U. Press.
Nitecki, Z. and C. Robinson, eds. 1980. Global Theory of Dynamical Systems. Vol. 819 of Lecture Notes in Mathematics. Berlin: Springer-Verlag.
Noldeke, Georg and Larry Samuelson. 1993. "An Evolutionary Analysis of Backward and Forward Induction," Games & Econ. Behavior, 5, pp. 425-54.
Pearce, David. 1984. "Rationalizable Strategic Behavior and the Problem of Perfection," Econometrica, 52, pp. 1029-50.
Ponti, Giovanni. 1996. "Cycles of Learning in the Centipede Game," Technical Report, University College, London.
Ritzberger, Klaus and Jorgen W. Weibull. 1995. "Evolutionary Selection in Normal-Form Games," Econometrica, 63, pp. 1371-99.
Robson, Arthur J. 1992. "Evolutionary Game Theory," in John Creedy, Jeff Borland, and Jurgen Eichberger, pp. 165-78.
Robson, Arthur J. and Fernando Vega-Redondo. 1996. "Efficient Equilibrium Selection in Evolutionary Games with Random Matching," J. Econ. Theory, 70, pp. 65-92.
Rousseau, Jean-Jacques. 1950. "A Discourse on the Origin of Inequality," in The Social Contract and Discourses. New York: Dutton. Translated by G. D. H. Cole.
Samuelson, Larry. 1993. "Recent Advances in Evolutionary Economics: Comments," Econ. Letters, 42, pp. 313-19.
Samuelson, Larry. 1994. "Stochastic Stability in Games with Alternative Best Replies," J. Econ. Theory, 64, pp. 35-65.
Samuelson, Larry and Jianbo Zhang. 1992. "Evolutionary Stability in Asymmetric Games," J. Econ. Theory, 57, pp. 363-91.
Schelling, Thomas. 1960. The Strategy of Conflict. Cambridge, MA: Harvard U. Press.
Schlag, Karl H. 1998. "Why Imitate, and If So, How?" J. Econ. Theory, 78, pp. 130-56.
Selten, Reinhard. 1980. "A Note on Evolutionary Stable Strategies in Asymmetric Animal Conflicts," J. Theor. Bio., 84, pp. 93-101.
Skyrms, Brian. 1996. Evolution of the Social Contract. Cambridge: Cambridge U. Press.
Sobel, Joel. 1993. "Evolutionary Stability and Efficiency," Econ. Letters, 42, pp. 301-12.
Sonsino, Doron. 1997. "Learning to Learn, Pattern Recognition, and Nash Equilibrium," Games & Econ. Behavior, 18, pp. 286-331.
Sugden, Robert. 1989. "Spontaneous Order," J. Econ. Perspectives, 3, pp. 85-97.
Swinkels, Jeroen M. 1992. "Evolutionary Stability with Equilibrium Entrants," J. Econ. Theory, 57, pp. 306-32.
Swinkels, Jeroen M. 1993. "Adjustment Dynamics and Rational Play in Games," Games & Econ. Behavior, 5, pp. 455-84.
Taylor, Peter D. and Leo B. Jonker. 1978. "Evolutionary Stable Strategies and Game Dynamics," Math. Biosciences, 40, pp. 145-56.
van Damme, Eric. 1987. Stability and Perfection of Nash Equilibria. Berlin: Springer-Verlag.
Warneryd, Karl. 1993. "Cheap Talk, Coordination, and Evolutionary Stability," Games & Econ. Behavior, 5, pp. 532-46.
Weibull, Jorgen W. 1995. Evolutionary Game Theory. Cambridge: MIT Press.
Young, H. Peyton. 1993. "The Evolution of Conventions," Econometrica, 61, pp. 57-84.
Young, H. Peyton. 1996. "The Economics of Conventions," J. Econ. Perspectives, 10, pp. 105-22.
Young, H. Peyton and Dean Foster. 1991. "Cooperation in the Short and in the Long Run," Games & Econ. Behavior, 3, pp. 145-56.
Zeeman, E. 1980. "Population Dynamics from Game Theory," in Z. Nitecki and C. Robinson, pp. 471-97.
Zeeman, E. 1981. "Dynamics of the Evolution of Animal Conflicts," J. Theor. Bio., 89, pp. 249-70.
Zermelo, Ernst. 1912. "Uber eine Anwendung der Mengenlehre auf die Theorie des Schachspiels," Proceedings of the Fifth International Congress of Mathematicians, II, pp. 501-504.
Binmore, Kenneth G., Larry Samuelson, and Richard Vaughan. 1995. "Musical Chairs: Modeling Noisy Evolution," Games & Econ. Behavior, 11, pp. 1-35.
A major challenge facing noncooperative game theorists today is that of providing a compelling justification for these two assumptions. As I will argue here, many of the traditional justifications are not compelling. But without
1 University of Pennsylvania. Acknowledgments: I thank Robert Aumann, Steven Matthews, Loretta Mester, John Pencavel, three referees, and especially Larry Samuelson for their comments. Email: gmailath@econ.sas.upenn.edu.
such justification, the use of game the