Reasoning, Games, Action and Rationality: 2008

Tuesday, January 29, 2008

Presentations and papers

Thanks to Mark and Stefan for their presentations!

I remind you that the papers are due next Monday, 4th of February, and that 10% of the grade for the whole course is based on your contribution to this blog.

Monday, January 28, 2008

Room for the presentations: P1.14

Today's meeting will be in P1.14, on the first floor of Euclides building.

Tuesday, January 22, 2008

Room for today : P0.16

Today's lecture will be in P0.16, at 11.30.

Friday, January 18, 2008

Yet another change: more on correlated equilibria

The questions around correlated equilibrium deserve more scrutiny than what I have presented last week. In tomorrow's lecture I will come back to them instead of moving on to team reasoning. The schedule for the last two meetings will thus look like this:

Tuesday January 22nd. 11.30-13.30.
More on correlated equilibrium.
- The Common Prior assumption: the probabilistic and the relational cases, examples, and discussion/motivations.
- Aumann's theorem(s) : '87 and '05. Examples and discussion.
Monday January 28th. 11.30-13.30
- Presentations.
  For the presentations, your are expected to give a 10 to 15 min talk on either your survey paper or your personal contribution. If you have doubts or would need more references on the topic please contact me by email.
- If time permits, I will say a few words about team reasoning.

I will post the notes about correlated equilibrium after tomorrow's lecture.

Room change

Today we will meet in P0.18. As mentioned in the last lecture, we will talk about correlated equilibrium. Next lecture will be about team reasoning.

Wednesday, January 16, 2008

Notes for the second meeting - Part 2: Dominated strategies and Nash equilibrium

In this second part we will look at what are rational choices given, first, that the players are rational and that this is common belief and, second, that the players are rational and know each other's strategies. This will lead to two classical solution concepts in game theory: iterated elimination of strictly dominated strategies and Nash equilibrium.

Rationality and common belief of rationality.
Look at the game in Table 1.

Strategy B for Bob is irrational in a very strong sense. Whatever his or Ann's information are, it is never a rational thing to choose. Take any type and t_b for Bob. Choosing B is rational if and only if the following hold:

EV(B, t_b) ≥ EV(A, t_b)

That is, whenever:

π_bob(aB)λ(t_b)(a) + π_bob(bB)λ(t_b)(b) ≥ π_bob(aA)λ(t_b)(a) + π_bob(bA)λ(t_b)(b)

Putting in the values of the payoff function we obtain:

1λ(t_b)(a) + 0λ(t_b)(b) ≥ 2λ(t_b)(a) + 1λ(t_b)(b)

That is:

λ(t_b)(a) ≥ 2λ(t_b)(a) + λ(t_b)(b)

But then we would have that :

0 ≥ λ(t_b)(a) + λ(t_b)(b)

Which is impossible given that λ(t_b) induces a probability distribution, that is:
λ(t_b)(a) + λ(t_b)(b) = 1

Since we took an arbitrary type for Bob, and that the last observation is true irrespective of Ann's type set T_ann, we know that it is never rational for Bob to choose B. Indeed, this is what comes out intuitively if we look at the game matrix. Whatever Ann chooses, it is always better for Bob to choose A. This conclusion could also have been reached using relational models (I leave it as an exercise - note that knowledge of one's own strategy is also crucial here!).

I mentioned in the first lecture that classical game-theoretic analysis were always assuming that the players are rational and that this fact is common knowledge. Now, observe that there are a lot of possible states for this game where this assumption does not hold. For instance in any state where Bob plays B. But let us look at states where Ann and Bob are rational and where this is commonly known. At such states Ann must believe that Bob is rational or, more precisely, all of Ann's types must assign probability 0 to Bob playing B: to all pairs (B, t_b), for any type t_b of Bob. But then all of Ann's types must assign probability 1 to Bob playing A. It should then be clear that Ann can only be rational, given her belief in Bob's rationality, at states where she plays a. In other words, the only strategy profile that is rational for both players, given the fact that Ann believes that Bob is rational, is aA.

This strategy profile is the only one that survives iterated elimination strictly dominated strategy, one of the most well-known solution concept in game theory. In classical game theory, a solution concept can be seen as a proposal of what rational player will/should choose. Iterated elimination of strictly dominated strategies runs as follow. First, remove from the original game all strictly dominated strategies. (A strategy s_i of player i strategy is strictly dominated by another strategy s'_i when for all combinations of strategies s_(-i) of the other players, π_i(s_i, s_(-i)) < π_i(s'_i, s_(-i)).) Then look at the resulting sub-game, remove the strategies have become strictly dominated there, and repeat this process until the elimination does not remove any strategies. The profiles that survive this process are said to be iteratively non-dominated.

In our example above, elimination of strictly dominated strategies closely follows the reasoning steps we took by looking at what Ann and Bob would choose if they are rational and believe in each others' rationality. This is of course not a coincidence. In this simple game we only needed rationality for Bob an Ann and one level of interactive belief in rationality (Ann believing that Bob is rational). But in the general case, that is for arbitrary large (finite) strategic game, if all players are rational and this is common belief then they will choose a strategy that is iteratively non-dominated.

Theorem (Brandenburger and Dekel 1987; Tan and Werlang 1988): For any state (s, t) of a type structure for an arbitrary finite game G, if all players are rational and it is common belief that all players are rational at (s, t), then s is a iteratively non-dominated strategy profile.

This epistemic characterization result says that iteratively non-dominated strategies are rational choices for the players, under the assumption that they are rational and that this is common belief. This last assumption concerns specifically the players' mutual expectation. Under common belief of rationality, everyone expects everyone else to choose rationally, and this is enough to exclude the choice of iteratively dominated strategies. In comparison with the specific information structure that we studied in the coordination game example, this assumption is definitely less ad hoc. It indeed covers a huge set of possible expectations, the "only" requirement being common belief in rationality.

I mentioned that the introduction of information structure to represent the player's expectations turns the question "what is the rational thing to do?" into "where do a given set of expectations come from?" The epistemic characterization of iteratively non-dominated strategies answers that question by invoking classical game theoretic "intuitive" motivation for the iterated dominance solution (see e.g. section 1.1 in Myerson's book in the bibliography): if the players are rational and that is common knowledge, like classical game theorists generally assumed, then they will not play iteratively dominated strategies. In other words, if one grants that it is intuitively plausible to assume rationality and common belief in rationality, then one is committed to accept that the players will choose a iteratively non-dominated strategy.

For more on this, see A. Brandenburger's "The power of paradox" and G. Bonanno's presentation. K. Apt, in "The many faces of rationalizability", offers an encompassing and penetrating analysis of solution concepts like iterated elimination of dominated strategies. All these references are in the bibliography.
A dynamic view on iterated elimination of dominated strategies.
Strategy profiles which are not iteratively dominated can also be characterized in a more "dynamic" way, in which one recovers the algorithmic or dynamic character of the elimination process. An information structure (whether a relational model or a type structure), is indeed an essentially "static" representation of the game. The players' expectations are fixed by certain assumptions - in the present case rationality and common belief of rationality - and we or the players "compute" the rational choices from there.

In "Rational Dynamic..." (see the bibliography), J. van Benthem has proposed to look at iteratively non-dominated strategies as those who survive iterated announcements of rationality from the players. The idea is that by announcing publicly - that is to everyone - that they are rational, the players dynamically change each other's expectations.

Formally, these announcements take the form of operations which remove states in relational models. Here we will be mainly interested in announcement of rationality, as defined in the first part of these notes, but this style of analysis can be applied to various sorts of announcements. Following van Benthem, let us call a full relational model for a given game in strategic form the relational model whose states are in 1 to 1 correspondence with the strategy profiles, and where w ∼_i w' iff w_i = w'_i. Intuitively, a full relational model is just a copy of the original game where the player know their own strategies, but nothing more. They consider possible every other combination of actions of the other players.

Look at the full relational model for the game in Table 1. The reflexives loops are omitted.

As before, Ann's accessibility relation runs along the solid arrows, while Bob's runs on the dashed ones. If both players announce that they are rational, then they both "eliminate" or "rule out" the possibility that they are not rational. Following the definition of rationality in relational model that we saw before, this removes aB and bB, the two states where Bob is not rational (Ann is rational everywhere in this full model). This leaves us the following sub-model, where Ann is not rational at state bA (Bob is trivially rational in this reduced game, because he only has one action available). A second announcement of rationality thus eliminates this state, leaving only aA.

Again, the fact that by iterating announcements of rationality we end up in a sub-model where all states get assigned a iteratively non-dominated strategy is not a coincidence.

Theorem (van Benthem 2006): The following are equivalent for states w in full game models:
(a) f(w) is a iteratively non-dominated strategies,
(b) w is in the submodel which remains after any number of successive announcement of rationality.

In other words, by repeatedly announcing their own rationality the players shape each other's expectations in a way that leads them to play strategies which are not iteratively dominated.

There is of course an intuitive connection between this dynamic and the "static" characterization. Removing states in a model is an operation through which the players get more and more informed about each others, and adapt their expectations
accordingly. In full game models they are as uniformed as they can be, considering possible all actions of others. Upon learning that they are rational, they narrow the range of what they consider possible actions of others. By learning once more that the others are rational, this time with respect to this smaller set of possibilities, what they expect the others to do become even more precise, and so on until they reach a point where announcing rationality does not change the player's expectations anymore. In all the states that remain, all players are rational, and this is common knowledge. In other words, iterated announcement of rationality is a process through which the expectations of the players converge towards sufficient epistemic conditions to play iteratively non-dominated strategies.

Possible research questions: van Benthem's result is based on full relational models. What happen if we allow for bigger models, like the one we used to model Bob's two possible types in the coordination example? Along the same line: in full epistemic models the players are never wrong about the other's strategy choices. Their information is veridical. What happens is we drop this assumption? For new work in that direction, see Mihalache and Baltag & Smets (reference in Part 1). Finally, how to formulate public announcements of rationality in type structures? Probability update? Conditioning?

For (much) more on logics for public announcements:
- A. Baltag, L.S. Moss, and S. Solecki. The logic of public announcements, common knowledge and private suspicions. In TARK 98, 1998.
- H. van Ditmarsch, W. van de Hoek, and B. Kooi. Dynamic Epistemic Logic, volume 337 of Synthese Library Series. Springer, 2007.
Mutual knowledge of strategies and Nash equilibrium:
Iterated elimination of strictly dominated strategies is a very intuitive concept, but for many games it does not tell anything about what the players will or should choose. The coordination game that we studied before is a typical example. Rationality and common belief of rationality does not exclude any strategy choice for Ann or for Bob. Whether choosing A is rational for Bob depends on what he expects Ann will choose, and believing that she is rational does not help him to narrow his expectations to a or b.

But what if Bob knows (that is rightly believes) Ann's strategy choice? Intuitively, this is quite clear what he should do, that is what is his best response: coordinate with her. If he knows that she plays a, for instance, then playing A is clearly the only rational thing for him to choose, and vice versa if he knows that she plays b. The situation is entirely symmetrical for Ann. If she is sure that Bob plays A, and she is right about it, then her only rational choice is to choose a, and vice versa if she knows that Bob plays B. In terms of type structure, in all states where Ann is rational and her type knows Bob's strategy choice where Bob is also rational and his type knows Ann's strategy choices, they either play aA or bB. The same hold for relational models: they play either aA or bB in all states where Ann and Bob are rational and in all situation they considers possible they play the same strategy.

The strategy profiles aA and bB are Nash equilibrium of that game. This solution concept is usually described as the profiles where no player has an incentive to unilaterally change his strategy choice. In other words, a Nash equilibrium is a combination of strategies, one for each player, such they all play their best response given the strategy choices of the others. This is precisely what happens at the profiles aA and bB in our coordination example. At aA, for instance, a is the best response of Ann, given that Bob plays A, and similarly for Bob.

This key intuition being Nash equilibria, the idea of "best response given the action of others", is exactly what is ensured by the mutual knowledge of strategy in the example above. If Ann knows at a state that Bob chooses A, she is not only convinced of that, but she is also right. Bob is playing A at that state. By playing rationally given her information, Ann is thus playing her best response given Bob's strategy choice. This is not only the case in our small example, but also in general:

Theorem (R. Aumann & A. Brandenburger, 1995): For any two-players strategic game and epistemic model for that game, if at state s both players are rational and know each other's strategy choice, then the profile they play at s is a Nash equilibrium.

This epistemic characterization differs from the one of iterated elimination of dominated strategies in three respects. First, it requires mutual knowledge and not only beliefs. It is crucial that each player is right about the strategy choice of the other. Otherwise his response will simply fall off the mark. Second, it only requires one level of interactive knowledge. It is not necessary that both players know that they know each other's choices, nor is any higher-order level of information necessary. Finally, I have stated the result only for games with two players. One can find the general result in:
- R.J. Aumann and A. Brandenburger. Epistemic conditions for nash equilibrium.
  Econometrica, pages 1161–1180, 1995.
Nash equilibrium can also be characterized dynamically, much in the flavor of van Benthem's approach to iterated elimination of dominated strategies. Just like one only needs one level of interactive knowledge to ensure equilibrium choices, only one announcement is needed to narrow down a relational model to the equilibrium state: the announcement of each player's strategy.

Theorem. For any two-players strategic game and relational model for that game, if at state s after announcing player i's strategy choice in the original model j is rational, and after announcing player j's strategy choice in the original model i is rational, then the profile they play at s is a Nash equilibrium.

Again, there's an intuitive connection between the static and the dynamic characterizations of Nash equilibrium. Each announcement provides the players with all the information they need to make the best decision given the other's strategy choice. Observe, however, that the two announcements of strategy are not simultaneous nor successive. They are both made individually in the original model. A joint announcement of all player's strategy choice trivialize any posterior announcement of rationality. It leaves only one possible strategy profile to consider possible for all players, which makes them trivially rational. What is required is rather a "test" of each player's rationality posterior to the announcement of his opponent's choice. This is indeed what captures the idea that each strategy in a Nash equilibrium is a best response, given the choices of others.

To conclude, I will make two general remarks about epistemic characterizations of solutions concepts. First, the reader will probably have noticed that all the epistemic characterizations mentioned above are "if... then" statements. They provide sufficient conditions on the players' expectations to play according to such and such solution concept. What about necessary conditions? Surprisingly, this is a question which makes little sense, given the way we constructed our representations of the players' information. Given a fixed strategy profile s, for example one that is a Nash equilibrium, we can construct any state we want. The definition of a relational model or of a type structure does not impose any constraint on what information (type or accessibility relation) should be assigned to the player at a given state. In other words, we can always construct a states where the players do in fact play a Nash equilibrium at s, but out of sheer luck! They might be totally wrong about each other's choices or beliefs, or even be irrational given what they expect, but still playing a strategy which turns out to be the best response given what the other in fact chose. In view of this, most characterization results provide a weaker "converse direction" result: that if a strategy profile fall under such and such solution concept then we can construct a model where they have the expectations which match the epistemic characterization. This is obviously a much weaker statement (and usually much easier to prove), but one that is as good as one can get. van Benthem's dynamic characterization of iteratively non-dominated strategy is, however, an "if and only if". Question: does it has to do with the fact that he is working with "full models"?

The second remark concern the concern more precisely the epistemic conditions for Nash equilibrium. One can reasonably wonder how the players might ever come to know the strategy choice of their opponent. This is especially true in the coordination game that I used to introduce this solution concept. The whole puzzle about that game is indeed how would they come to have enough information about each other's action to successfully coordinate. By using directly mutual knowledge of strategy choice, the epistemic condition of Nash equilibrium brings little light on this particular problem. In the next two lectures we will look at two proposals which provides partially cope with this issue: correlated beliefs and team reasoning.

Monday, January 14, 2008

Notes for the second meeting - Part 1: Rationality and Beliefs

At the end of the last lecture we saw how relational structures and type spaces allow to precisely represent the player's mutual expectations in games, which are arguably the missing ingredient to understand game-theoretic rationality. In this lecture I will start by spelling out the notion of a rational strategy itself, and then we will look at a classical result for the epistemic program in game theory: the characterization of iterated removal of strictly dominated strategies. I will present two perspectives on this solution concept: the "static" one which uses rationality and common beliefs in rationality, and a "dynamic" one which rather use iterated announcements of rationality in relational models. To do that we will have to look more precisely are what beliefs, and common beliefs are in type spaces and relational models. We will finally look at the epistemic characterization of Nash equilibrium, both from a static and from a dynamic point of view.

Rationality and rational play in epistemic games.
Just like in decision theory, Bayesian or instrumental rationality in games is a matter of maximizing one's payoffs given one's expectation. The key difference is that in games the expectations are about the choices of other rational agents, who also form expectations about each other's choices. As we saw in the last lecture, relational models and type spaces are two precise ways to represent these inter-dependent expectations. With them it becomes relatively simple to define instrumental rationality in games.

In type spaces, the expected payoffs of player i playing strategy s_i, given that he is of type t_i, is defined as follows:
EV(s_i, t_i) = Σ_s'_(-i) π(s_i, s'_(-i))λ(t_i)(s'_(-i))
s'_(-i) are combinations of strategies, one for each player different from i. The expected value of playing (s_i, t_i) is thus the sum, over all such possible combinations of strategy s'_(-i) of the other players, of the payoff π(s_i, s'_i) that i gets by playing s_i against s'_(-i), weighted by the probability λ(t_i)(s'_(-i)) that i's type t_i assigns to the others playing s'_(-i). The term λ(t_i)(s'_(-i)) is a shorthand:
λ(t_i)(s'_(-i)) = Σ_t'_(-i)λ(t_i)(s'_(-i) & t_'(-i))
where λ(t_i)(s'_(-i) & t_'(-i)) is the probability that the image of λ(t_i) assigns to the pair (s'_(-i), t_'(-i)).

This definition of expected payoff in type spaces is indeed very close to the decision-theoretic idea of expected value/utility/payoffs. The payoff that player i gets from choosing a strategy s_i depends on what the other players choose. Player i has some expectations or information relative to these choices, which are encoded by his type t_i. Hence, the payoff that i can expect from choosing s_i is just the sum of the payoffs he would get by choosing this strategy against the various possible combinations of choices of the others, π(s_i, s'_(-i)), each weighted by how probable i thinks these choices are, that his by the probability he or his type t_i assigns to them.

A player's strategy s_i is rational, given that i is of type t_i, simply when it maximizes expected value. In other words, it is rational of i to choose s_i, given what he expects about the choices of his co-players, his type t_i, when no other strategy gives him a strictly higher expected value.

Consider the state (aA, t_a, u_b) in the example of the first lecture. Ann is indeed rational at that state: she is convinced that Bob will play A, against which a is the only best response. Her current strategy, a, gives her an expected payoff of 1:
EV_ann(a, t_a) = 1(1/2) + 1(1/2) + 0(0) + 0(0)
while the expected payoff of choosing b instead is 0:
EV_ann(b, t_a) = 0(1/2) + 0(1/2) + 1(0) + 1(0)
Bob, on the other hand, is not rational at that state. The problem is not that he his choosing a wrong strategy in itself. After all, he successfully coordinates with Ann at that state. Bob's irrationality stems from the fact that Bob mistakingly believes that Ann chooses b. Given this belief, Bob's best response would be to choose B, and not A:
EV_bob(A, u_b) = 1(0) + 0(1)
EV_bob(B, u_b) = 0(0) + 1(1)

One can easily check that, in this example, Ann and Bob are rational at (aA, t_a, t_b). Their respective strategy choices maximize expected value given their types.

Rational choices in relational models are also thought in Bayesian terms, with the difference that the payoffs and expectations are now defined qualitatively instead of in terms of utility values and probability functions. Look again at state (aA, t_a, u_b), but this time in the Kripke structure (which is state aA). Here Bob is sure that Ann plays b: he considers only one state possible, namely bA. But in that context he would strictly prefer the outcome bB. Switching his strategy choice from A to B would give him an outcome he strictly prefers, in all eventualities that he considers possible. Choosing A is, in other words, a very bad decision for Bob, thus the following definition of an irrational strategy in a relational model.

Let w_i be i's component of f(w), that is i's strategy in the profile that f assigns to w. Given a state w, I write w[s_i/w_i] be the state, if any, that is just like w except that w'_i = s_i. A player i is said to be irrational at state w when there is a s'_i ≠ w_i such that for all w' ∼_i w, w' ≤ w'[s'_i/w'_i] (observe that is it important here that the player know their own actions, so that w' _i = w_i). In words, a player is irrational at a state w where there is an action of his, different from the one he plays at w, which gives him an outcome he prefers in every situation he considers possible.

A player is rational when he is not irrational, that is when for each of his alternative action s'_i there is a state he considers possible where is current strategy choice is at least as good as s'_i. Again, knowledge of one's own choice is crucial here. In our model, Ann and Bob are both rational in that sense at aA.

In our example, the two models of the agents' expectations and their associated notions of rationality thus give the same answer to the question "what is it rational for Ann and Bob to choose?" They allows to assess precisely whether a player is choosing rationally, given what he believes. But they only partially answer the question of what is, in general, a rational play. Rationality as just defined depends entirely on the player's expectations, and one can legitimately wonder where do these expectations come from. For instance, what possible ground would Ann have for being convinced that Bob will play A?

The epistemic program in game theory takes one step towards answering that question, by looking at what are rational choices in games given some less ad hoc generic beliefs and expectations. For example, instead of looking at what are Ann's rational choices given that she is sure that Bob plays A, one might ask what is rational for her to play in all situations where she believes that Bob is rational or that she believes that he believes that they are both rational. This, of course, requires to define precisely these kinds of general beliefs in epistemic models.

Possible research questions: Is there a general correspondence between type spaces and relational models for games? Do their associated definiti0n of rationality always coincide? van Benthem (see bibliography) propose different definitions of a rational play in relational models; what would be their quantitative (type space) counterpart? What happens if we drop the knowledge of one's own strategy assumption? Look at:

D. Mihalache. Safe belief, rationality and backwards induction in games. Master’s thesis, Oxford University Computing Laboratory, September 2007.
A. Baltag and S. Smets. Knowledge, safe belief and dynamic rationality: logics for higher-order belief revision. Oxford and Vrije Universiteit Brussel, Unpublished manuscript.
Beliefs and common beliefs.

We saw in the last lecture what it for an agent i to believe that an event E occurs at state w in a relational model: it is just for E to occurs at all states that i considers possible at w. As E can be any set of states, one can easily define the "event" corresponding to player j being rational-R_j = {w : j is rational at w}- and with it identify the set of states where i believes that j is rational: B_i(R_j) = {w : ∀w' ∼_i w, w' ∈ Rat_j}. In our example, at states aA and bA Bob believes that Ann is rational.

Higher-order beliefs in rationality is defined in the exact same way. One can check, for instance, that in all states Ann considers it possible that Bob is not rational, and that Bob knows that she believes this - knowledge, in that context, is understood as belief which content is true.

These notions can also be defined in type spaces. Recall that there states are just tuples (s, t) where s is a strategy profile and t is a "profile" of types. In type structures beliefs are defined relative to types. An event for a type t_i is a set of pairs (s_(-j), t_(-j)), i.e. a set of strategy choice and types for all the other players. Type t_i of player i is said to believe the event E ⊆ S_(-i) x T_(-i) whenever the probability of E is 1 according to λ_i(t_i). In our example, Ann thus believes (in her only type) that Bob plays A because she assigns probability 1 to that event, which consists of the A column in the matrix specifying Ann's function λ_ann.

Higher order information is build in type structure along the same line as in relational models. Let B_i(E) = {t_i : t_i believes that E} be the set of i's types where he believes that E. Agent j believes that i believes that E when λ_j(t_j) assigns probability 1 to the event consisting of all pairs (s_(-j), t_(-j)) such that t_i ∈ B_i(E), which are just pairs where player i's type believes that E occurs. In our example, Ann does not believe that Bob believes that she plays a. This would boil down for her to assign probability 1 to Bob being of type t_b, which is not the case. Bob, however, knows - again taking knowledge as truthful belief - that Ann is uncertain about what he believes she will do.

From a given state in a relational model, or a given type in a type structure, one thus can compute any (finite) level of such higher-order information. This form what is often called the agent's belief hierarchy : starting from his beliefs about the other's strategy choices, the "ground facts", and building up to higher and higher order of beliefs about the other's beliefs.

One can also define various notions relative to the information of groups of agents. Common belief one of the most frequently used. An event is said to be common belief when all players believe it, all players believe that all players believe it, all players believe that all players believe it, and so on for any finite iteration of "all players believe that all players believe that". Alternatively, in a more fixed-point-like formulation, an event it common belief when all agents know that it is common belief.

In relational models, this translates as follows: Let ∼*_N be the reflexive and transitive closure of the union of all the accessibility relations of the players N. E is common knowledge at a state w when it occurs in all states w' ∼*_N w.

In type spaces, we use an inductive definition. Take a state (s, t) - recall that a state is a pair (strategy profile, type "profile"). The set of all states where event E is common believed to degree 0, write B^0(E), is the set of all states where all types at that state believes E. Take now the set of all states where E is common belief to degree _m. Then the set of state where E is common believed to degree m+1 when is defined as the set of pairs (s, t) where all types believe that E is common believed to degree m. The set of state where E is common belief is just the set of state where E is common believed to degree m, as m goes to infinity.

It is important to realize that, in both Kripke and type structures, common belief is not something that pertains to a single agent. It says something about what every agent in a group believes at a certain state. It is also highly interactive: it says something not only about the fact that all agents believe that an event occurs, but also about the fact that all agent believe that all others believe that, for any number of iteration of "believe that". It is, finally, a very strong notion, which is notoriously difficult to attain.

More on knowledge, beliefs and the difficulties which arises from common knowledge in interaction, see Aumann (Interactive Epistemology) and chapter 5 of Osborne and Rubinstein in the bibliography. One can find an interesting new take on the topic, with more references, in :
- P. Egre and Denis Bonnay, A Non-Standard Semantics for Inexact Knowledge with Introspection. 2006, in Proceedings of the ESSLLI 2006 Workshop Rationality and Knowledge, R. Parikh and S. Artemov (eds.)

Last minute room change

Today's lecture - January 14th - will be in P0.18 instead of P3.27. We will meet at 11.30, as planned.

Wednesday, January 9, 2008

10% for blogging

As I mentioned during the lecture, 10% of the grade will go to your participation on this blog. To get these points you have to post 2 questions related to one of the course note, and answer 2 questions from someone else. These are participation points, you will not be evaluated on the content. But note that the blog is moderated: I will simply not authorize questions or answers if they are totally off-topic.

Monday, January 7, 2008

Notes for the first meeting

General introduction:
Decision and game theory are two extensive mathematical toolboxes to study rational decision making. Both conceive of rationality as instrumental: choosing rationality is to choose what is thought to be the best mean to reach one's end.

Decision and game theoretic models are very similar in many respect, but they differ in a simple but crucial one, namely what determines the outcome of a decision. Decision theory is concerned with single agent decision making under uncertainty. It models individual decision makers, having to make a decision "against Nature", i.e. in an uncertain environment. In decision-theoretic situations, the choice of the decision maker together with the actual state of the world, which might be in turn dependent on some random happenings, completely determine the outcome of a decision. Game theory, on the other hand, is concerned with the interaction of rational decision makers. Here the outcome of a decision depends on the choices of all the agents involved in the situation.

From this small difference---one agent/many agents---comes a whole array of different problems and issues. Theses rest mainly on the fact that in decision-theoretic scenarios the decision maker's expectations are based on what he knows and believes about the state of his environment, while in game-theoretic scenarios expectations are based on what agents think the others will choose. In this project we will mainly look at how rational expectations determine choices in games. But as a start it is instructive to look first at how they determine choices in decision-theoretic scenario.
Rational expectations in decision under uncertainty.
Decision theory deals with two kinds of uncertainty: endogenous and exogenous. Situations of exogenous uncertainty are situations in which the results of the agent’s actions depend on random or non-deterministic occurrences in the environment. Exogenous uncertainty is thus the result of objective non-determinacy, external to the decision maker. Buying a lottery ticket is a typical example. Someone cannot simply choose to buy “the” winning ticket. All one can do is pick one and wait for the drawing to determine whether it is a winner. In models of decision making under exogenous uncertainly, the decision maker has to choose between different “lotteries” or simply probability distributions over the set of outcomes. Buying a lottery ticket, for instance, would then be represented as an action which gives the agent a certain probability of winning.

Situations of endogenous uncertainty, on the other hand, are situations in which the outcomes of the agent’s decisions depend on the actual state of the world, about which he has only partial information. The following is a classical example, from Savage (1954, p.17). Imagine an agent who is making an omelet. Suppose he has already broken five eggs into the bowl and he is about to break the sixth and last one. Whether this will result in a good omelet depends on the state of this last egg, which can be rotten or not. The agent, however, does not know whether the egg is rotten. In slightly more technical jargon, he lacks information about the actual state of the egg, and this makes the outcome of his decision uncertain. Note the contrast with buying a lottery ticket. There is no chance or randomness involved here. The egg is rotten or it is not. What matters is that the agent does not know the state of the egg. Decision theoretic models, for example the one proposed by Anscombe and Aumann (1963), usually represent such situations with a set of possible states. In the omelet example there would be two states, “the egg is rotten” and “the egg is not rotten”. The uncertainty over these states is represented either qualitatively, by an epistemic accessibility relation, or quantitatively, by a probability distribution.

The most wide-spread expression of instrumental rationality in the face of uncertainty, whether exogenous or endogenous, is Bayesian rationality also known as maximization of expected utility/payoffs/values. A decision problem usually consists of a set of states, representing the agent's uncertainty, a set of outcomes, which have a certain value for the agent, and a set of actions which are functions from the set of states to the set of outcomes. This simply encodes the idea that the result (outcome) of an agent's action depends on the state of the world. The agent's uncertainty is usually represented by a probability distribution over the set of states. The expected value of an action is the sum of the values of all its possible outcomes, weighted by their respective (objective and/or subjective) probability. To be instrumentally rational, in the Bayesian sense, is to choose the action, or one of the actions, which maximizes this sum. In other words, the agent is instrumentally rational when he chooses an action which gives the best expectations.

It is important to see that in these scenarios, the expectations of the agent are fixed by the probability distribution induced by the objective random event or by the (partial) information he has about the state of the world. These expectations are fixed in the sense
that they do not depend on the agent’s preferences and possible choices. They only depend on "Nature", the decision-making environment, which do not in turn form expectations about what the decision maker will choose. This is where decision-theoretic scenario differ from game-theoretic ones.

Some more references on decision theory:
- L.J. Savage. The Foundations of Statistics. Dover Publications, Inc., New York, 1954.
- R. Jeffrey. The Logic of Decision. McGraw-Hill, New-York, 1965.
- R.B. Myerson. Game Theory: Analysis of Conflict. Harvard University Press,
  1997 edition, 1991. Chapter 1.
- J.M. Joyce. "Bayesianism". In A.R. Mele and P. Rawling, editors, The Oxford Handbook of Rationality. Oxford University Press, 2004.
Games (I - basic definitions and concepts)
The key ingredients of a game-theoretic scenario are similar to those in decision theory. A given set of agents have to choose between a number of actions. The combination of these actions, one for each agent, determine an outcome which each agent values differently.

Two main models are used in game theory: extensive and strategic representations. Extensive representations are often represented as trees, as in the following figure.
The order of nodes in the tree represent the sequential structure of the game. In this example Bob plays first, choosing Hi or Lo, and Ann plays second, choosing between Hi or Lo if Bob played Hi, and hi or lo if Bob played Lo. At each node the thus player has a number of actions available. A strategy for a player i, in an extensive games is so to speak a plan for all eventualities. It is a function which specifies an action to take at each nodes in which player i has to make a decision.

For this project we will be mainly interested in games in strategic representation. These abstract away from the sequential structure. The players choose once and for all a strategy for the whole game. At this point it is worthwhile to get slightly more formal.

A game in strategic from contains a finite set of agents N and for each agent i ∈ N a finite set S_i of strategies. A strategy profile s is a combination of strategies, one for each agent. I will often use the notation s_(-i) to denote a combination of strategies for all players except i. Each agent is also endowed with a preference relation ≤_i over the set of profiles. This relation is usually assumed to be transitive, reflexive and total. In main stream game theory the preference relations are usually represented with a collections of real-valued payoff functions π_i which assign to each strategy profile a payoff to the player i. Games in strategic forms are often represented as matrices, as in Table 1.

This game is a two-players game. The row player, which I will call here Ann, has two available actions or strategies, a and b. Bob, the column player, has also two strategies, A and B. In this game the payoffs are identical for both players. If they coordinate, either on aA or bB they get 1. Otherwise they get 0.

The game depicted in Table 1, often called a "coordination" game, is of particular interest for us. What is the rational thing to choose, say, for Ann? Intuitively, it all depends on what she expects Bob to choose. If she thinks Bob will choose A, then it is rational for her to choose a, and symmetrically if she thinks Bob will choose B. But how can she come to have such beliefs?

In games in strategic from, is it generally assumed that the players choose simultaneously, without observing (or knowing) each other's choices, and without telling in advance to the others what they plan choose. This means that in our coordination game, Ann and Bob are left alone with their expectations. On the other hand, it is often assumed that both players are rational, and that this fact, together with the structure of the game, are common knowledge---we will come back to this later--- i.e. that they know that they are rational, they know that they know that they are rational, and so on. This means that in making her decision, both Ann and Bob look at the game, and try to anticipate each other's choice on the basis of what they know about the payoff structure and about each other's beliefs and rationality.

Observe here the crucial difference with the decision theoretic scenario sketched above. Here Ann's and Bob's expectations are tightly interconnected. Ann expects Bob to choose rationally. But she knows that Bob will do so on the basis of what he expects her to choose. This, in turns, also depends on what she thinks Bob thinks she thinks Bob will choose, and so on ad infinitum. The same holds for Bob. In decision theoretic scenario, on the other hand, the decision maker's expectations only stem from his uncertainty about the state of his environment, which does not depends on what the environment "expects" the decision maker to choose. In short, mutual expectations (knowledge and beliefs) are at the very heart of interactive reasoning in game theory.

For more on strategic games, see Myerson and Osborne & Rubinstein in the bibliography. The classic in the field is of course: J. von Neumann and O. Morgenstern. A Theory of Games and Economic Behaviour. Princeton University Press: Princeton, NJ, 1944.
Types and states; models of the players' informations.
Two models of the agents' information (knowledge and beliefs) are mainly used in the literature on what is called the epistemic program in game theory: the qualitative Kripke structures or relational models that we know from (epistemic) modal logic---these are for historical reasons called Aumann structures or partition structures in the economics literature---and the quantitative type spaces. I will first introduce them formally, and then look at an example.

A relational model of a game in strategic form is a set of state W, together with a collection of epistemic accessibility relations ∼_i, one for each player i , and a function f which gives, for each state in W, the strategy profile that is played at that state. This function is indeed assumed to be total, but does not have to be injective nor surjective.

Each state w of a relational model represent a possible play of the game. The strategy that each player chooses in that play is given by the function f. Observe that there need not be a one to one correspondence between the possible states and the set of profile. There can be many states in which the same profile is chosen, and there can be profiles which are chosen in no state of the model.

The information of the players at each state is give by the accessibility relations. If state w' is accessible from state w for player 1, this means that, at w, 1 considers that the actual play of the game might be w' . The relation is usually assumed to be transitive and Eucledian (if w ∼_i w' and w ∼_i w'' then w' ∼_i w''). An event E is just a set of states, i.e. E ⊆ W. An event E occurs at a state just in case that state is in E. A player is said to believe that a certain event occurs at a state when this event occurs all states that he considers might be the actual one. In other words, a player believes that a certain event occurs when he does not consider possible that it might not.

A type structure for a given strategic game consists, for each player i, of a set of types T_i and function λ_i which gives for each type in T_i a probability distribution on the set of pairs (s_(-i), t_(-i)) where s_(-i) is as above and t_(-i) is a combinations of types, one for each player except i. Types and their associated image under λ_i encode possible information state of the players in that game. Intuitively, the function λ_i gives for each type of player i his probabilistic beliefs in the others playing some strategies and being of a certain type, i.e. a probability distribution on the set of pairs (s_(-i), t_(-i)). A state in a type structure is a tuple (s, t) where s is a strategy profile and t is a combination of type, one for each player.

States in relational models and states in type structure thus encode very similar information. They specify a particular play of the game, i.e. a strategy profile, and the information the players have at that play. In relational models this information is represented by the players' accessibility relations. In type structures it is represented by the players' types.

Let us look at an example to make this clearer. Take the game pictured in Table 1. We are going to build a model of this game where both Bob and Ann are certain each other's strategy choice, but where Bob is certain about what Ann believes while she is uncertain about Bob's information.

That Bob is certain about Ann's information translates in her having only one type, i.e. T_ann = {t_a}. This is only for simplicity. We could do with more types for Ann and make λ_bob assign probability 1 to one of them. On the other hand, for Ann to be uncertain about Bob's information means that he has more than one type T_bob = {t_b, u_b}. The function λ_ann gives Ann's partial beliefs about Bob's types and strategies. An insightful way to specify these functions is by using matrices, as in Table 2.

The rows are Bob's types, the columns are his possible strategies. The values in the cells represent the player's degree of belief in the others playing some strategies and being of some type. In our case we want to represent the fact that Ann is sure that Bob will play A but that she is uncertain about what he believes. Ann's certainty about Bob's strategy choice boils down to put all the probability weight to the column where Bob plays A. This column sums up to one. One the other hand, Ann's uncertainty about Bob's information is represented by the fact that she assigns 1/2 to him being of type u_b, and 1/2 of him being of type t_b. In the first type (Table 4), Bob is certain that she will play a. In the second (Table 3), he is certain that she will play b. So Ann is sure that Bob is certain about what she will do. But she does not know whether Bob is convinced that she will do a or he is convinced that she will do b.

This type structure can be easily represented with a relational model, as in the figure below. Here we need eight states, i.e. two "copies" of each strategy profiles, one for each possible "type" of Bob. In the figure the blue states are those where Bob is of type u_b, while the red states are those where he is of type t_a.

Ann and Bob's information is represented by the arrows : plain for Ann and dashed for Bob. These should be seen as transitive, and the "reflexive" loops are omitted. There should be one at all states except at bA, bB, aA and aB for Bob, and aB, bB, aB and bB for Ann.

The fact that, at type t_b, Bob is convinced of Ann's strategy choice is a i represented by the fact that at all states where he is of that type he considers possible only states where she plays a, and vice-versa at type u_b. The same goes for Ann. She his sure that Bob will play A. At all states she only considers possible the states where he plays A. But she is uncertain about Bob's information. At state bA, where Bob believes that she will play b, she considers it possible that Bob thinks she will play a.

Observe that at state bA both have "true" beliefs about each other's choices. But this need not to be the case. At aA, for example, Bob believes that Ann plays b, while in fact she is playing a.

Observe that in this relational model each player knows his own strategy choice. This is revealed by the fact that Ann's accessibility relation only runs along the horizontal axis and that Bob's runs along the vertical axis, where they respectively play the same strategies. This is a general assumption in relational models for games, but the definition does not enforce this. Here this is the case because we built our model on the basis on a type structure in which, by definition, the players are fully and truthfully informed about their own strategy choices. Just look at the definition of the functions λ_i. They assign probabilities to combinations of types and strategies of others, not on i's type. Another interesting consequence of the fact that we built our relational model on the basis of a type space is that we obtain relations that are transitive and euclidian. This see not a coincidence, see: Halpern, J. Y. (1991). The relationship between knowledge, belief, and certainty. Annals of Mathematics and Artificial Intelligence 4, 301-322.

With these representations of the players' information in hand, we are ready to look at what it means to maximizing one's payoffs given one's information in game. This will be the topic of our next meeting. Until then I strongly recommend to look at Johan van Benthem's and Adam Brandenburger's paper in the bibliography.

Wednesday, January 2, 2008

First Meeting in P3.27

Our first meeting will be next Wednesday (9th) at 11.30 in P3.27, "Euclides" building.

I will first review briefly the basic game-theoretic definitions that we will use all through the project. We will then discuss the idea of rationality in games, especially in comparison with Bayesian or decision-theoretic rationality. We will finally look at two widely-used models of the player's informations: type spaces and relational models.

If you have questions of general interest about the topics, please post them as comments. Otherwise do not hesitate to send me an email.

Registration

It seems possible to officially register to the project via Studieweb. If for whatever reason it does not work for you, just come to the first meeting. We will take care of the official registration later.

Reasoning, Games, Action and Rationality