JID:YJETH AID:4052 /FLA

[m1+; v 1.147; Prn:2/07/2012; 11:47] P.1 (1-40)

Available online at www.sciencedirect.com

Journal of Economic Theory ••• (••••) •••–••• www.elsevier.com/locate/jet

An experiment on learning in a multiple games environment ✩

Veronika Grimm a,∗, Friederike Mengel b,c

a University of Erlangen–Nuremberg, Lehrstuhl für Volkswirtschaftslehre, insb. Wirtschaftstheorie, Lange Gasse 20, D-90403 Nürnberg, Germany
b University of Nottingham, School of Economics, University Park Campus, Nottingham NG7 2RD, United Kingdom
c Department of Economics, Maastricht University, PO Box 616, 6200 MD Maastricht, The Netherlands

Received 21 September 2010; final version received 29 February 2012; accepted 4 March 2012

Abstract

We study how players learn to make decisions if they face many different games. Games are drawn randomly from a set of either two or six games in each of 100 rounds. If there are few games or if extensive summary information is provided (or both), convergence to the unique Nash equilibrium generally occurs. Otherwise this is not the case. We demonstrate that there are learning spillovers across games, but that participants learn to play strategically equivalent games in the same way. Our design and analysis allow us to distinguish between different sources of complexity and between theoretical models of categorization.
© 2012 Elsevier Inc. All rights reserved.

JEL classification: C70; C73; C91

Keywords: Game theory; Learning; Multiple games; Experiments

✩ We thank Rene Fahr and two anonymous reviewers for helpful suggestions and seminar participants in Amsterdam, Barcelona, Irvine, CA, Cambridge, Granada, Jena, Maastricht, Santa Fe and Tilburg as well as Bart Lipman, Dirk Engelmann, Werner Gueth, Georg Kirchsteiger, Rosemarie Nagel, Hans-Theo Normann, Aljaz Ule and Eyal Winter for valuable comments. We also thank Meike Fischer and Michael Seebauer for excellent research assistance. Financial support by the Deutsche Forschungsgemeinschaft, the European Union (grant PIEF-2009-235973), the Dr. Theo and Friedl Schoeller Research Center for Business and Society, and the Spanish Ministry of Education and Science (grant SEJ 2004-02172) is gratefully acknowledged. * Corresponding author. Fax: +49 (0)911 5302 168. E-mail addresses: [email protected] (V. Grimm), [email protected] (F. Mengel).

0022-0531/$ – see front matter © 2012 Elsevier Inc. All rights reserved. http://dx.doi.org/10.1016/j.jet.2012.05.011


1. Introduction

The theory of learning in games has largely been concerned with agents who face one well-specified game and try to learn optimal actions in that game. In addition, it is often assumed that the description of the game (players, strategies and payoffs) is known by all players interacting in the game. Reality, though, is much more complex. Agents face many different games and, while they typically know their own preferences, they often do not know those of their opponents. In such cases, while they know the “label” of a game, they do not know the full strategic form. An important question is how people learn from their experience in these complex environments with many different games, and whether their behavior in a game is affected by some of the other games played.

This type of effect is relevant in many contexts, such as, for example, the design of institutions and organizations. Colleagues within an organization interact in many different types of strategic situations. The designer of an organization can affect who interacts with whom, and in which “games”. Understanding learning spillovers and categorization of games can enable the designer to make use of positive spillover effects (and to avoid negative ones) and hence to design workflows in more efficient ways. Categorization can also play a role at a more fundamental level, for example in explaining the emergence of cultural differences. If we think of institutions as creating game forms, then extrapolation across games within different sets of institutions can create those seemingly inconsistent patterns of behavior which are often referred to as cultural differences (see e.g. Bednar and Page [2]). If people growing up in different social backgrounds face a different distribution of games, then learning across games can provide an explanation for why they behave differently in any given game.

While there is a (small) number of theories about how agents extrapolate between different games (see the literature referred to below), there is little evidence on whether spillovers exist and, if so, how successful existing theories are in organizing or predicting actual human behavior in complex environments. In this paper we investigate experimentally how agents deal with situations involving many different games. We provide a clean test for learning spillovers and demonstrate that such spillovers do exist. We also show that the existence of spillovers is closely related to the complexity of the environment. Our design allows us to discriminate between different models of categorization and learning spillovers that have been proposed in the literature.

Our first set of treatments allows us to study the effects of complexity (measured by the number of games and the accessibility of information) on learning and convergence to equilibrium in simple normal-form games. In our experiment participants interact in 3 × 3 normal-form games for 100 rounds. The number of games that the participants face during the experiment is varied across treatments. In addition, information about (i) the opponent’s past behavior and (ii) the payoff matrices of the games is provided more or less explicitly across treatments. While we vary the complexity of the environment across treatments (by varying the number of games and the amount of information explicitly provided), the games themselves are simple. In particular, all games are acyclic and have a unique strict Nash equilibrium that coincides with the maxmin prediction.

We find clear evidence of learning spillovers in complex environments. If there are few games or if explicit information is provided (or both), convergence to the unique Nash equilibrium generally occurs. Otherwise this is not always the case, and play often converges to a distribution of actions which is non-Nash.

The fact that play fails to converge to Nash equilibrium only in the most complex environment suggests that higher cognitive costs may be responsible for the failure to converge in these treatments.


Consistent with this explanation, we find that participants scoring better in the “Cognitive Reflection Test” (Frederick [8]) choose Nash actions more often than other participants. Of course, this does not yet prove that learning spillovers occur, since behavior may be affected by the complexity of the environment in other ways. Hence, we conducted two additional treatments where we either relabel an action or change some payoffs in only two out of six games. We find that these changes affect behavior also in other games and conclude that there must be learning spillovers.

In spite of those learning spillovers, we can show that some fundamental predictions of game theory have empirical relevance even in our complex environments. In particular, we find that in games with a dominant strategy equilibrium, play converges to Nash equilibrium in all treatments.1 We also find that there is almost perfect consistency in aggregate behavior among pairs of games which are strategically equivalent.

Two additional treatments allow us to identify the source of complexity that is responsible for learning spillovers in our setting. We find that the difficulty of distinguishing games (caused by a non-permanent display of the payoff matrices of the games) is a major source of complexity in our experiment. Providing explicit summary information on one’s past opponents’ actions in the same game also seems to help convergence in some games, but the evidence is somewhat weaker. We demonstrate that our results can be captured by a model of categorization where some agents rely on coarse partitions and make choices using a best response correspondence based on the “average game” in each category. We also show that a model where agents form the same beliefs across all games in a category (as in the “analogy-based expectations equilibrium”, ABEE; Jehiel [14]) can explain behavior in some games, but not in others. Interestingly, the games where belief bundling can explain behavior are exactly those where we find evidence that providing information about past choice frequencies does help convergence to Nash equilibrium.

The paper is organized as follows. Section 2 reviews the related literature and points out the contribution of our paper. In Section 3 we describe the experimental design. In Section 4 we present the theoretical background and state our conjectures. Sections 5 and 6 present the results. Section 7 concludes. Some additional regression tables, proofs, and the experimental instructions can be found in Appendices A–E.

2. Related literature

Probably the best-known strand of literature dealing with similarity is case-based decision theory (Gilboa and Schmeidler [10,11]). Gilboa and Schmeidler argue that if agents face a decision problem for the first time, they will reason by analogy to similar situations they have faced in the past. They then derive a utility representation for an axiomatization of such a decision rule. The notion of similarity from case-based decision theory has been used by LiCalzi [20] to show that fictitious play learning across different games converges under certain conditions. Steiner and Stewart [34] show that there can be contagion in global games caused by coarse inference. Rubinstein [30] provides an explanation of the Allais paradox based on similarity. One property of these similarity judgments is that they are not transitive.2

1 Taken at face value, one implication of this result is to emphasize the value of dominant strategy implementation, since extrapolation across games in our experiment never leads participants to deviate from dominant strategies, while otherwise they do deviate from best responses in very obvious and strong ways.
2 See also Luce [21].


Another way to tackle the question of similarity is to assume that agents, who face a set of (many) different games or experiences, partition this set into equivalence classes (or categories) of games or experiences they see as analogous or do not distinguish. This yields a transitive notion of similarity. See, e.g., Samuelson [31], Jehiel [14], Fryer and Jackson [9], Mullainathan, Schwartzstein and Shleifer [26], Mengel [22], or Jehiel [16]. Among other things, these papers differ in whether partitions are exogenous (Jehiel [14], Jehiel and Koessler [15], Mullainathan, Schwartzstein and Shleifer [26], or Jehiel [16]) or endogenous (Samuelson [31], Fryer and Jackson [9], Mohlin [25], or Mengel [22]) and whether agents partition games (Samuelson [31], Jehiel [14,16], Mengel [22]) or other objects (Fryer and Jackson [9], Mullainathan, Schwartzstein and Shleifer [26], or Mohlin [25]).

In our paper we first demonstrate that learning spillovers do exist and may impede convergence to Nash equilibrium. We then distinguish three possible “modes of categorization” that differ with respect to the aspects of the strategic situation that may be subject to analogy-based reasoning. Let us explain this point in more detail. The behavioral assumptions underlying Nash equilibrium can be described as follows: agents form beliefs about the opponent’s choice and then use their best response correspondence to choose an action. Within any given equivalence class of games there are thus three ways in which an agent can fail to distinguish the games contained in that class. She can (i) form the same beliefs for all games contained in an equivalence class (belief bundling). She can (ii) apply the same best response correspondence to all the games in an equivalence class (best response bundling). Or she can (iii) do both: hold the same beliefs and apply the same best response correspondence. The latter implies choosing the same action in all games (action bundling). We refer to these three ways of (not) discriminating between games as three “modes of categorization”.

The most prominent example of a theory of belief bundling is Jehiel [14]. Jehiel assumes that agents partition games into equivalence classes and hold the same beliefs for all games in the same equivalence class. He proposes a so-called “analogy-based expectations equilibrium” (ABEE), where beliefs have to be correct (on average in each equivalence class) and agents must choose a best response to their beliefs in each game. Note that in an ABEE agents can choose different actions in two games belonging to the same equivalence class. Jehiel and Koessler [15] study the relation between analogy-based expectations and incomplete information games, and Jehiel [16] applies ABEE to auction design and demonstrates that an auction designer can exploit coarse beliefs.3 Another paper that models how agents bundle beliefs across a number of situations in the same category is Mullainathan, Schwartzstein and Shleifer [26]. However, while they study how coarse thinkers can be exploited, they do not study strategic interactions among coarse thinkers.

3 Note that ABEE is more general and can be applied, e.g., to agents distinguishing nodes (subgames) in an extensive form game. In this paper we study the difficulty of distinguishing normal-form games.

Models of action bundling are widespread in the automaton literature. In this literature there are complexity costs of playing more complicated strategies, which may involve distinguishing many games. See, among many others, Abreu and Rubinstein [1]. Samuelson [31] has presented a model where distinguishing games requires more states of a machine and where hence players may group together bargaining games in order to save the cost of an additional state. If two games are lumped together, then agents have to choose the same action in these games. Mengel [22] presents an evolutionary model where agents learn endogenously how to partition a set of games and which actions to choose in each equivalence class. She characterizes stable outcomes of the


evolutionary model and shows that certain well-known stability results can be reversed even for arbitrarily small costs of distinguishing games. As mentioned before, action bundling can be seen as the result of agents bundling both best responses and beliefs. This distinction is not formally worked out in the literature, since most of it is evolutionary and hence beliefs play only an implicit role.4 In this paper, however, we will distinguish explicitly between belief bundling, best response bundling and their “sum”, namely action bundling.

Some other authors have investigated similarity learning or analogy-based decisions in experiments. Haruvy and Stahl [17] let participants interact in 10 different normal-form games to see how and whether they extrapolate between similar games. They show that convergence to Nash equilibrium occurs faster if participants have been exposed to a strategically equivalent game before, and rationalize the data through Stahl’s n-level learning theory. Stahl and van Huyck [33] report results on play in different 2 × 2 stag hunt games and demonstrate that similarity plays a role in the participants’ decisions. Huck, Jehiel and Rutter [19] conducted experiments involving two different normal-form games, varying the degree of accessibility of information about the opponent’s behavior. They show that Jehiel’s [14] equilibrium concept of “analogy-based expectations” can rationalize their data. Their paper focuses on testing ABEE and on equilibrium play rather than learning. Hence, they do not distinguish between action, best response, and belief bundling, nor between sources of complexity. Several other studies have found more or less explicit evidence that there are learning spillovers between games. Examples are Weber [35] in a study of ‘feedback-less’ learning, Rapoport et al. [29], or Cooper and Kagel [4,5], among others. Other studies have analyzed learning across games which are very similar except for one or two parameters or payoffs. Grosskopf et al. [13] find support for case-based decision making as proposed by Gilboa and Schmeidler [10]. Selten et al. [32] let subjects submit strategies for tournaments with many different 3 × 3 games and find that the induced fraction of pure strategy Nash equilibria increases over time. There is also some literature showing that experienced players are able to extrapolate between familiar real-life situations and “similar” laboratory situations (see e.g. Palacios-Huerta and Volij [28], or Chi, Feltovich and Glaser [3]). In our experiment we find that differences in how people extrapolate between different games do not come from experience, but from differences in the willingness to engage in cognitive reflection. Subjects with a higher willingness to engage in cognitive reflection are better able to distinguish different games and come closer to Nash play.

An experiment by Oechssler and Schipper [27] is related to ours in that they study learning by participants who do not have information about the payoffs of their opponent. Interestingly, they find that, while players do not seem able to learn the payoffs of the opponent very well (they often answer questions about these payoffs wrongly), they do converge to Nash equilibrium most of the time. These findings are in line with our findings from the treatments with few games, where convergence to the unique Nash equilibrium occurs in all games, in spite of the fact that participants do not know their opponent’s payoffs.

4 However, the literature is aware of the fact that, e.g., propensities in reinforcement learning models can be interpreted as beliefs (see, for example, Hopkins [18] or Mengel [22]).

3. The experimental design

We let 437 participants anonymously interact in different normal-form games for 100 rounds. This enables us to study learning across the different games. We ran six treatments that differed


in (a) the number of different games the subjects were confronted with (either two or six) and (b) how difficult it was (in terms of cognitive effort) to access information about the games and/or the opponents’ play.5 Two additional treatments, where we relabel an action or change some payoffs in some games, provide a precise test for learning spillovers.

In order to have a clean environment in which to address our research questions, we had to create a situation where complexity arises solely from the difficulty of playing multiple games and not, for example, from playing a given game as such. One way to accomplish this is to prevent participants from engaging in strategic considerations within single games (such as eliminating strategies through iterated dominance). An important feature of our design is hence that throughout the experiment subjects could only see their own payoffs, but not the payoffs of their match.6 As a consequence, the only way they could find out about the opponent’s behavior was to learn about it from experience. Furthermore, this design feature had the advantage of largely eliminating other confounding factors, such as fairness concerns or efficiency considerations.

In addition, a number of constraints on the games had to be satisfied. First, we needed games such that models of categorization yield predictions different from Nash equilibrium if coarse categories are used, and such that action bundling, best response bundling, and belief bundling lead to (at least partly) different predictions. This implies that we had to use at least 3 × 3 games. On the other hand, we did not want to use larger games, in order to maximize the chances that game theory predicts well in these games if they are played in isolation (i.e. if issues of categorization play no role). To give standard theory (not accounting for learning spillovers) the best possible chances, the games in our experiment also have a unique strict Nash equilibrium.

Furthermore, games should be abstract (rather than, e.g., coordination or conflict games) in order to make sure that participants have no experience with similar games in real life, since unobserved extrapolation from real-life situations to the experiment could be a confounding factor. In order to make second-guessing of the opponent’s payoffs more difficult, we also decided to use asymmetric rather than symmetric games. Both of these design features also mitigate the possibility that participants associate the games with fairness or efficiency concerns.

Games

The normal-form games we chose are given in Table 1. Note that games 4, 5, and 6 are “copies” of games 1, 2, and 3 (payoffs are monotonically transformed by adding either 5 or 10 to all entries). In the two treatments with few games (labeled “F”), subjects played games 1 and 2. In the treatments with many games (labeled “M”), subjects played all games shown in Table 1.

All games have a unique strict Nash equilibrium. In games 2 and 5 this Nash equilibrium is in strictly dominant strategies, and games 3 and 6 are solvable by iterated strict dominance. In games 1 and 4, elimination of a weakly dominated strategy is needed to reach the unique Nash equilibrium. All games are acyclic (Young [36]) and the unique strict Nash equilibrium coincides with the maxmin prediction. Hence (a) it is easy to learn the Nash equilibrium if the game is the only game played and (b) there are no conflicting predictions of existing theories of learning in a single game.7

5 Participants obtained exactly the same information in all treatments, but in some treatments it was easier to access this information than in others.
6 Of course one can also argue that this is an assumption that is often satisfied in reality, in situations where there is little information about the opponent’s preferences.
7 The fact that the maxmin prediction does not conflict with the Nash prediction should make Nash choices even more attractive. Hence, if we observe a deviation from Nash, we will have a stronger result.
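The acyclicity claim can be illustrated with a short sketch. This is our own check, not the authors’ code, and it reads “acyclic” as absence of directed cycles in the strict best-reply graph (our reading of the notion in Young [36]), which is what guarantees that best-reply dynamics settle in the strict Nash equilibrium. We use Game 1 from Table 1; payoff pairs are (row player, column player).

```python
# Our own sketch: Game 1 from Table 1, payoffs as (row, column).
GAME1 = {
    ("A", "a"): (20, 20), ("A", "b"): (15, 10), ("A", "c"): (15, 10),
    ("B", "a"): (10, 15), ("B", "b"): (25, 10), ("B", "c"): (0, 10),
    ("C", "a"): (10, 15), ("C", "b"): (15, 35), ("C", "c"): (35, 0),
}
ROWS, COLS = ("A", "B", "C"), ("a", "b", "c")

def best_reply_edges(game):
    """Directed edges: one player switches to her (here unique) best reply."""
    edges = {}
    for r in ROWS:
        for c in COLS:
            succ = []
            br_row = max(ROWS, key=lambda r2: game[(r2, c)][0])
            br_col = max(COLS, key=lambda c2: game[(r, c2)][1])
            if br_row != r:
                succ.append((br_row, c))
            if br_col != c:
                succ.append((r, br_col))
            edges[(r, c)] = succ
    return edges

def has_cycle(edges):
    """Depth-first search over the best-reply graph for a directed cycle."""
    WHITE, GREY, BLACK = 0, 1, 2
    color = {v: WHITE for v in edges}
    def visit(v):
        color[v] = GREY
        for w in edges[v]:
            if color[w] == GREY or (color[w] == WHITE and visit(w)):
                return True
        color[v] = BLACK
        return False
    return any(color[v] == WHITE and visit(v) for v in edges)

acyclic = not has_cycle(best_reply_edges(GAME1))
# The only profile with no outgoing edge is the strict Nash equilibrium.
sinks = [v for v, succ in best_reply_edges(GAME1).items() if not succ]
```

Every best-reply path in Game 1 ends in (A, a), so the graph has a single sink and no cycles.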


Table 1
Payoff matrices. Payoffs associated with the unique strict Nash equilibrium are marked with an asterisk.

Game 1       a          b          c
A        *20, 20     15, 10     15, 10
B         10, 15     25, 10      0, 10
C         10, 15     15, 35     35,  0

Game 4       a          b          c
A        *25, 25     20, 15     20, 15
B         15, 20     30, 15      5, 15
C         15, 20     20, 40     40,  5

Game 2       a          b          c
A          5,  5     15, 20      5, 10
B         10, 15      5, 25     10, 10
C         20,  5    *25, 15     15, 10

Game 5       a          b          c
A         15, 15     25, 30     15, 20
B         20, 25     15, 35     20, 20
C         30, 15    *35, 25     25, 20

Game 3       a          b          c
A         15, 10    *20, 20     15, 15
B         15, 20     10, 15      5, 10
C         20,  5     15, 35     35,  0

Game 6       a          b          c
A         20, 15    *25, 25     20, 20
B         20, 25     15, 20     10, 15
C         25, 15     20, 40     40,  5

Table 2
Frequencies of the different games.

Game    1     2     3     4     5     6
F      50    50     –     –     –     –
M      13    13    17    30    15    12

We generated games randomly from the set of either two or six games (depending on the treatment) according to a uniform distribution. Table 2 shows how often each game occurred in the treatments with few games (F) and many games (M).8 In all treatments subjects were split equally into row and column players and kept this role throughout the experiment. They were randomly rematched in each round within groups of eight participants.

Information

After each round subjects were informed about the choice of their match and their own payoffs. In all treatments subjects received exactly the same information, but we varied how hard it was to extract and memorize the relevant information. In some treatments it required substantial reasoning and memorizing capacities. Subjects were not allowed to take notes throughout the experiment in any of the treatments.

In the treatments with high accessibility of information (FI and MI), payoff matrices for all the games (containing only the participants’ own payoffs) were printed in the instructions that were distributed at the beginning of the experiment and were displayed permanently on the decision screen.9 In each round subjects were, moreover, informed about the relative frequencies of each action choice by their past interaction partners in the last five rounds in which the same game was played.

8 We also ran an additional treatment (M-UD) where each game appeared either 16 or 17 times, in order to check the robustness of our results to small changes in the relative frequency with which the different games occurred. The evidence is reported in Section 6.4.
9 Both in the instructions and during the experiment, payoff information for row and column players was presented in exactly the same way.
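The draw-and-rematch protocol described above can be sketched in a few lines. This is an illustrative simulation of ours, not the experimental software: in each of 100 rounds one game is drawn uniformly at random from the treatment’s game set, and within a matching group of eight the four row players are randomly rematched one-to-one with the four column players. Roles are fixed throughout; the seed is arbitrary and only makes the example reproducible.

```python
import random

def simulate_matching_group(n_rounds=100, games=(1, 2, 3, 4, 5, 6), seed=7):
    rng = random.Random(seed)
    row_players = [0, 1, 2, 3]   # fixed row roles throughout the session
    col_players = [4, 5, 6, 7]   # fixed column roles
    history = []
    for _ in range(n_rounds):
        game = rng.choice(games)      # uniform draw over the game set
        partners = col_players[:]
        rng.shuffle(partners)         # random rematching within the group
        history.append((game, list(zip(row_players, partners))))
    return history

hist = simulate_matching_group()
```

Each round of `hist` records the drawn game and a perfect matching of the four row players to the four column players.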


In the treatments with low accessibility of information (F and M), the payoff matrices did not appear in the instructions but were only shown to the subjects for a limited time in each round before the software switched to the decision screen (where the payoffs were not displayed any more). There was also no explicit information given on average action choices of the opponent. However, in principle subjects could extract this information from the information provided to them on the action choice of their match in each round.

We furthermore conducted two treatments with many games and intermediate accessibility of information (M + G and M + F). In treatment M + G, payoff matrices (containing only the participants’ own payoffs) for all the games were printed in the instructions and displayed on the decision screen (just as in MI), but participants did not receive explicit information on average action choices of the opponent. In treatment M + F, participants were informed about the relative frequencies of each action choice by their past interaction partners in the last five rounds in which the same game was played, but we did not provide the payoff matrices in the instructions. Those two treatments allowed us to identify the source of complexity in our framework and to discriminate between different modes of categorization (see Section 5.4). It is important to emphasize once more that in all our treatments participants obtained exactly the same information. The only difference was in how easy it was to memorize all the different bits of information.

Complexity and cognitive reflection test

Note that if one tries to memorize the past action choices of opponents in the six different games, one has to memorize and update six “elements of information”. Partitioning the set of games into a number of categories can reduce this number. It is a widely accepted finding in psychology that the bound on the number of pieces of information humans can process simultaneously is seven (+/− two), or four (+/− two) for short-term memory (see, e.g., Miller [24], or Cowan [6]). Hence, with six games we hoped to get a division of participants into some that are able to memorize all the relevant information and some that are not able or willing to do so. In order to obtain information on the participants’ willingness to engage in cognitive effort, we let them participate in a cognitive reflection test (Frederick [8]) immediately after the experiment. We did not provide material incentives for correct answers in the test. The test is described in detail in Section 6.1, where we report the results and relate them to behavior observed in the experiment.

Spillovers

Strictly speaking, our six treatments do not allow us to conclude that treatment differences – if we find any – are due to spillover effects across games. In principle, complexity in our treatments could affect behavior in other ways. Thus, we ran two additional treatments:

M* In treatment M* all games except for games 1 and 4 are unchanged. In games 1 and 4, though, the row players’ payoffs for (A, a), (A, b) and (B, b) are reduced by 5.

M** In treatment M** all games are unchanged but actions B and C are relabeled for row players in games 2 and 5, i.e. the Nash equilibrium in these games is (B, b) in treatment M**.

In all other respects M* and M** coincide with M. Hence it seems fair to say that M, M* and M** are equally complex. To test whether learning spillovers exist, we will be interested in comparing behavior in treatments M, M*, and M** in those games that remain unchanged across treatments.

Other details

The experiment took place in December 2007 at the Cologne Laboratory for Experimental Research (four independent observations each of treatments F, FI, M, and MI),


Table 3
Treatments (number of matching groups in brackets).

                               Few games    Many games
Explicit information           FI (4)       MI (8)
No explicit information        F (4)        M (8), M* (7), M** (7), M-UD (4)
Only info about games          –            M + G (6)
Only info about frequencies    –            M + F (7)

in February 2010 at the BEE-Lab at Maastricht University (three observations each of treatments M* and M**), and in May 2011 in Maastricht (four additional observations of M, MI, M*, and M**, as well as seven observations of M + F, six observations of M + G, and four observations of treatment M-UD, which is described in Section 6.4).10 Table 3 summarizes our nine treatments together with the number of independent observations. Other than the treatments reported we did not conduct any additional sessions, nor did we conduct pilot studies. Each independent observation corresponds to 800 decisions.11 Participants were students at the Universities of Cologne and Maastricht with little or no prior exposure to game theory. Experimental sessions consisted of 16–32 participants and were computerized.12 Written instructions were distributed at the beginning of the experiment.13 Sessions lasted between 60 min (treatments with few games) and 90 min (treatments with many games), including reading the instructions, answering a post-experimental questionnaire and receiving payments. Students earned between 12.90 and 21.50 Euros in the experiment (including the show-up fee of 2.50 Euros). On average subjects earned a little more than 18 Euros (all included).

10 Three out of the seven independent observations of M + F were conducted within matching groups of only seven (instead of eight) players, due to non-showup of participants. In those matching groups one of the column players was randomly matched to two row players in each round. This explains the total number of 437 (instead of 440) participants we needed to generate our 55 observations in total. We pool these matching groups with the other four matching groups in M + F in the subsequent analysis, but no results hinge on that. Tables for each matching group separately are available upon request.
11 An exception is treatment M + F (see footnote 10).
12 The experiment was programmed and conducted with the software z-Tree (Fischbacher [7]). Subjects were recruited using the online recruitment system ORSEE by Greiner [12].
13 The English instructions for treatments M, M*, M** and M-UD can be found in Appendix E. In Cologne the same instructions were used, translated into German. Instructions for the remaining treatments are available upon request.

4. Theoretical background

This section contains the theoretical background and the conjectures we want to address with our experimental design.

4.1. Modes of categorization

We consider three modes of categorization. Agents may choose the same action in all games in the same category (action bundling), they may apply the same best response correspondence to all games in the same category (best response bundling), or they may hold the same


beliefs in all those games (belief bundling). In the following we describe these three modes of categorization in more detail.

Denote a game by γ ∈ {1, 2, 3, 4, 5, 6} with payoff matrix Π(γ). Denote by f(γ) the frequency with which game γ occurs. Denote by a_i(γ) the action player i chooses in game γ, by σ_{−i}(γ) player i's opponent's (mixed) choice in game γ, and by μ_i(γ) player i's beliefs about her opponent's choice in game γ. A category of games, denoted by G ⊆ {1, 2, 3, 4, 5, 6}, is a subset of games between which spillovers might take place. Denote by G_i a category held by player i and by

  Ḡ = [∑_{γ∈G} f(γ)]^{−1} ∑_{γ∈G} f(γ) Π(γ)

the average game matrix corresponding to category G. We refer to Ḡ_i as the (average) game induced by category G_i. Denote by μ_i(G) the average belief in category G and by

  σ̄_{−i}(G) = [∑_{γ∈G} f(γ)]^{−1} ∑_{γ∈G} f(γ) σ_{−i}(γ)

the average behavior of i's opponent across the games contained in category G. Finally, denote by BR_i(γ, μ_i(γ)) agent i's best response in game γ given belief μ_i(γ).

Best response bundling (BRB) Under BRB agents apply the same best response correspondence to all games in the same category. In an equilibrium with BRB we have that ∀i, G_i: μ_i(γ) = σ_{−i}(γ) and

  a_i(γ) ∈ BR_i(Ḡ_i, μ_i(γ)), ∀γ ∈ G_i.

Best response bundling can be thought of as arising from a process where agents learn about their own optimal behavior in each category, but where they may potentially have different beliefs about the opponents' behavior in different games. Hence, even if players categorize, their beliefs are not restricted to be the same in each game in a given category.

Belief bundling (BB) Under BB agents apply the same beliefs to all games in the same category. In an equilibrium with belief bundling we have ∀i, G_i: μ_i(G_i) = σ̄_{−i}(G_i) and

  a_i(γ) ∈ BR_i(γ, μ_i(G_i)), ∀γ ∈ G_i.

Belief bundling corresponds to a situation where agents learn about the opponents' behavior across different games and best respond to their beliefs in each game individually. Beliefs are correct on average across games. An equilibrium with belief bundling is known in the literature as an "analogy-based expectations equilibrium" (ABEE), a concept defined by Jehiel [14].14

Action bundling (AB) Under AB agents apply both the same beliefs and the same best response correspondence to all games in the same category. In an equilibrium with action bundling we have that ∀i, G_i: μ_i(G_i) = σ̄_{−i}(G_i) and

  a_i(γ) ∈ BR_i(Ḡ_i, μ_i(G_i)), ∀γ ∈ G_i.

14 We refer to this mode of categorization as "belief bundling" to be more explicit about how it differs from the other two modes of categorization we discuss.
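To fix ideas, the average-game construction and the way the bundling modes use it can be sketched in code. This is a minimal illustration of the definitions above; the payoff matrices, frequencies and beliefs below are invented examples, not the games used in the experiment.

```python
# A minimal sketch (not from the paper) of the average game G-bar and of
# best responses. Payoff matrices and frequencies are invented examples.

def average_game(category, payoff, freq):
    """Frequency-weighted average of row-player payoff matrices over a
    category, i.e. G-bar = (sum_g f(g))^(-1) * sum_g f(g) * Pi(g)."""
    total = sum(freq[g] for g in category)
    rows = len(payoff[next(iter(category))])
    cols = len(payoff[next(iter(category))][0])
    avg = [[0.0] * cols for _ in range(rows)]
    for g in category:
        w = freq[g] / total
        for r in range(rows):
            for c in range(cols):
                avg[r][c] += w * payoff[g][r][c]
    return avg

def best_response(matrix, belief):
    """Indices of row actions maximizing expected payoff against a mixed
    column belief."""
    expected = [sum(p * b for p, b in zip(row, belief)) for row in matrix]
    best = max(expected)
    return [i for i, e in enumerate(expected) if abs(e - best) < 1e-9]

# Two hypothetical 2x2 games bundled into one category, equally frequent:
payoff = {1: [[3, 0], [2, 2]], 2: [[1, 0], [0, 2]]}
freq = {1: 0.5, 2: 0.5}
G_bar = average_game({1, 2}, payoff, freq)   # [[2.0, 0.0], [1.0, 2.0]]
# Best response / action bundling: best respond to the AVERAGE game.
br_avg = best_response(G_bar, [0.5, 0.5])
# Belief bundling: best respond game by game, given the (bundled) belief.
br_game1 = best_response(payoff[1], [0.5, 0.5])
```

Under belief bundling the belief is shared across the category but the best response is computed per game, whereas under best response (or action) bundling the averaged matrix `G_bar` itself is the object the agent responds to.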


This strongest mode of categorization might correspond to a situation where agents do not explicitly form beliefs about others but simply try to learn what is best in a given category. This could be the case if agents are reinforcement learners.15 Alternatively, one could think of agents learning rules of thumb such as "In category G action A is optimal".

For a game contained in a singleton category we have G = {γ} and hence in equilibrium a_i(γ) ∈ BR_i(γ, μ_i(γ)). Hence for singleton categories all the above concepts reduce to Nash equilibrium. Note that only under action bundling will agents necessarily choose the same action in all games contained in the same category. Under both belief bundling and best response bundling they may choose different actions in two games even if those games are in the same category. It is also possible to view the equilibria described above as outcomes of a learning process, in which case beliefs can be learned or can be thought of as propensities.

4.2. Conjectures and hypotheses

A fundamental premise of our analysis is that learning spillovers matter if and only if the environment is sufficiently "complex". Complexity costs are often explicitly modeled in theoretical approaches with endogenous partitions; see, e.g., Samuelson [31] or Mengel [22]. In our experiment there are two dimensions of complexity: the number of games and the accessibility of information. Our first conjecture takes up this basic premise of much of the theoretical literature and asks whether behavior is affected by the complexity of the environment.

Conjecture 1 (Complexity). Behavior is affected by the complexity of the environment as measured by the number of games and the accessibility of information.

To gain insight into this conjecture we can test the Null-Hypothesis that there is no difference in behavior in any given game across treatments FI, F, MI, and M, which differ in the number of games and the accessibility of information.
Rejecting this hypothesis will lead us to conclude that complexity does matter.

Eventually we would like to understand whether participants try to reduce complexity by categorizing games. If they do indeed categorize, then there should be learning spillovers across the games contained in the same category. Observing significant differences across treatments FI, F, MI and M, however, does not prove that learning spillovers exist, since complexity may affect behavior in other ways. Our next conjecture is hence:

Conjecture 2 (Learning spillovers). There are learning spillovers.

To gain insight into this conjecture we can test the Null-Hypothesis that there is no difference in behavior in any given game across treatments M, M*, and M**. Those treatments are equally complex in terms of the number of games and the accessibility of information. Games 3 and 6 are exactly the same in all treatments, and games 1 and 4 are the same in treatments M and M**. Hence, if we observe differences across any of these treatments (in a game which is the same across these treatments), those differences must be due to learning spillovers.

15 Of course it is possible to think of the propensities in reinforcement models as beliefs.

The design of these treatments moreover allows us to test whether learning spillovers are invariant to action relabeling, a property of learning spillovers that is sometimes discussed in the literature. Such an invariance would, for example, hold true in a model where agents learn

and extrapolate strategic context across games.16 In other models, where agents learn e.g. about optimal actions or about the population, it will not hold. Invariance to action relabeling would be rejected if behavior differed in games 1, 3, 4, or 6 between treatments M** and M.

Our next two conjectures refer to a number of regularities with respect to learning spillovers. Understanding these regularities can help us (a) to understand to what extent spillovers can be modeled using standard tools and language from game theory and (b) to empirically discriminate between models which do or do not have the properties we test. A first fundamental question is whether strategically equivalent games are affected in the same way by learning spillovers.

Conjecture 3 (Strategic equivalence). Strategically equivalent games are affected in the same way by learning spillovers and are always played in the same way.

To gain insight into this conjecture we can test the Null-Hypothesis that behavior in our pairs of strategically equivalent games (games 1 and 4, 2 and 5, 3 and 6) is the same in all treatments. If we fail to reject this Null-Hypothesis, then we can be more optimistic that game theoretic models will be able to rationalize behavior.

In a second step we want to find out whether models of categorization can explain the observed behavior. Our treatments M + G and M + F allow us to gain insight into the source of complexity in treatment M.17 Understanding the source of complexity, in turn, should give us some insight into which mode of categorization participants may rely on to reduce complexity. Suppose, for example, that complexity in treatment M stemmed mostly from the difficulty of distinguishing one's past opponents' choice frequencies in the different games. It seems natural that participants would react to such complexity by bundling beliefs. If, on the other hand, complexity stemmed mostly from the difficulty of distinguishing games, it would seem natural that participants would aim at reducing this complexity by bundling best response correspondences. Treatment M + F eliminates the difficulty of distinguishing past choice frequencies, and treatment M + G eliminates the difficulty of distinguishing games.

Conjecture 4 (Distinguishing opponents' choices). Distinguishing opponents' choice frequencies across games is a major source of complexity in treatment M.

Conjecture 5 (Distinguishing games). Distinguishing game payoffs across games is a major source of complexity in treatment M.

To gain insight into Conjecture 4 we can test the Null-Hypothesis that behavior is no different across treatments M + F and M, and to gain insight into Conjecture 5 we can test the Null-Hypothesis that behavior is no different across treatments M + G and M. Since presumably participants reduce complexity caused by the difficulty of distinguishing past opponents' choices by bundling beliefs, and reduce complexity caused by the difficulty of distinguishing games by bundling best responses, testing these hypotheses can give us some insights into whether belief bundling or best response bundling (or both) play a role in our experiment. We can also distinguish the

16 See e.g. Haruvy and Stahl [17] or Mengel and Sciubba [23].
17 We thank a referee for pointing us in this direction.


Table 4
Relative frequencies of action choices in games 3 and 6, last 50 rounds. Nash equilibrium, (A, b), in bold numbers.

Game 3
(RP)   A      B      C        (CP)   a      b      c
MI     0.94   0.01   0.05     MI     0.01   0.99   0.00
M      0.50   0.01   0.49     M      0.01   0.98   0.01

Game 6
(RP)   A      B      C        (CP)   a      b      c
MI     0.85   0.01   0.13     MI     0.00   0.99   0.00
M      0.44   0.00   0.55     M      0.01   0.98   0.01

different modes of categorization via their equilibrium and comparative statics predictions. We will discuss those predictions in Section 5.4 and in Appendices C and D.18

5. Results

In the following sections we report the results of our experiments, discussing Conjectures 1 to 5 in sequence.

5.1. Complexity (Conjecture 1)

To understand the effect of complexity (manipulated across treatments via (i) the number of games and (ii) the amount of explicit feedback) we compare treatments F, FI, MI and M.

Games 2 and 5 In the "easiest" games, 2 and 5, both row and column players had a strictly dominant strategy. The Nash equilibrium in games 2 and 5 is (C, b). As expected, participants quickly learn to play these strategies in all treatments. In fact, in none of the treatments and for none of the player roles is the overall share of Nash choices below 98%.

Games 3 and 6 In games 3 and 6, in contrast, we observe an effect of complexity. The unique strict Nash equilibrium in these games is (A, b). Games 3 and 6 do not have a dominant strategy, but are solvable by iterated elimination of strictly dominated strategies.19 Remember, though, that this has to be learned by our participants. Table 4 shows the distribution of action choices in games 3 and 6 during the last 50 periods of the experiment. It can be seen clearly from Table 4 that, while the Nash equilibrium is learned in 94 percent (85 percent) of all cases in game 3 (6) in treatment MI, this percentage is much lower in treatment M. In fact, in treatment M only about 50 percent of row players choose the best response (A), in spite of the fact that column players virtually always choose b. Apparently, higher complexity, as measured by the amount of explicitly available information, affects convergence to Nash equilibrium. Fig. 1 illustrates this difference, and regression Table 6 (discussed in detail below) shows that the respective treatment differences are statistically significant.

Games 1 and 4 Now let us move to the "most difficult" games, 1 and 4. Both have a unique strict Nash equilibrium, which is given by (A, a). Table 5 and Fig. 1 show that participants do

18 Note that in addition we could test the hypotheses that "there is no difference in behavior across treatments M + F (M + G) and MI". Rejecting these hypotheses would lead us to conclude that the difficulty of distinguishing opponents' choices (games) is the unique source of complexity in treatment M.
19 B is dominated for the row player, and given that the row player does not play B, the column player should play b.


Fig. 1. The effect of complexity. Share of Nash choices by row players across treatments F, FI, MI and M.

Table 5
Relative frequencies of action choices in games 1 and 4, last 50 rounds. Nash equilibrium, (A, a), in bold numbers.

Game 1
(RP)   A      B      C        (CP)   a      b      c
FI     0.90   0.08   0.02     FI     0.92   0.07   0.00
F      0.96   0.04   0.00     F      0.95   0.03   0.02
MI     0.62   0.32   0.06     MI     0.74   0.26   0.00
M      0.37   0.50   0.13     M      0.45   0.55   0.00

Game 4
(RP)   A      B      C        (CP)   a      b      c
MI     0.76   0.20   0.03     MI     0.83   0.17   0.00
M      0.40   0.52   0.07     M      0.48   0.52   0.00

learn the strict Nash equilibrium in treatments F and FI, but clearly fail to do so in treatment M. While in treatment F the strict Nash equilibrium is played in 95 percent of all cases, less than 50 percent of players choose the Nash action if the number of different games is increased.20 Even in treatment MI, where explicit information is available, a significant fraction of participants fails to learn Nash play. Both dimensions of complexity (lack of summary information and number of games) have an effect on behavior.

20 There is convergence in treatment F even if we focus only on the first 13 occurrences of game 1.


Table 6
Multinomial logit regression on row players' choices (periods 51–100). Standard errors clustered by matching group ((Pr > X2) < 0.0001). Baseline is A-choices in treatment MI.

Action choice  (Game 1)             (Game 3)             (Game 4)             (Game 6)
B
  Constant     −0.6683 (0.5044)     −4.5035*** (0.4651)  −1.3130** (0.5601)   −4.5108*** (0.6970)
  F            −33.28*** (0.7183)
  FI           −1.7173** (0.6706)
  M            0.9699* (0.5882)     0.6323 (0.7010)      1.5787** (0.6626)    0.2552 (1.2421)
C
  Constant     −2.2942*** (0.6030)  −2.9360*** (0.4324)  −3.1957*** (0.6295)  −1.8253*** (0.2388)
  F            −1.0016 (0.9292)
  FI           −1.5050 (0.9207)
  M            1.2504 (0.7828)      2.9420*** (0.4899)   1.4312** (0.6895)    2.0526*** (0.3314)
Observations   896                  576                  1280                 640
Groups         24                   16                   16                   16

*** 1%. ** 5%. * 10%.

At this point it may be worthwhile to remember that in both treatments M and MI (as well as F and FI) the same amount of feedback is available in principle. However, in M and F participants are not presented with summary statistics about past behavior, nor are they reminded of the game payoffs on the decision screen. Hence, they have to exert cognitive effort to keep track of this information themselves. This cognitive effort is more costly the more different games there are.

Table 6 shows the results of a multinomial logit regression, where we cluster standard errors by matching group. The table shows the results of a restricted regression in which we only included treatments that are relevant for addressing the complexity conjecture. A regression table including all treatments can be found in Appendix A. The baseline is A-choices in treatment MI. Choosing MI as the baseline treatment reveals the most interesting treatment differences, since MI is an intermediate treatment.21 Note also that the table shows only treatment differences with respect to the non-baseline options B and C. However, since probabilities sum to one, inferences can also be drawn about A-choices. The coefficient on the constant for B shows that there are significantly fewer B-choices than A-choices in treatment MI in games 3, 4 and 6. The coefficient on the constant for C shows that there are significantly fewer C-choices than A-choices in treatment MI in all games.

21 Choosing M as a baseline would reveal significant coefficients throughout, since all treatments are significantly different from M, but would not answer the more interesting questions, such as whether MI is different from F. Table 26 in Appendix A shows the same regression with treatment M as the baseline.
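Since multinomial logit coefficients are on the log-odds scale relative to the baseline option, a coefficient translates into a multiplicative change in the odds. The following sketch illustrates this using the estimated game 4 coefficient on M for B-choices from Table 6; the helper function is our own illustration, not part of the paper's analysis.

```python
import math

def odds_ratio(coef):
    """A multinomial logit coefficient beta shifts log(Pr(B)/Pr(A)) by beta,
    so the odds Pr(B)/Pr(A) are multiplied by exp(beta)."""
    return math.exp(coef)

# Estimated coefficient on treatment M for B-choices in game 4 (Table 6):
beta_M_game4 = 1.5787
ratio = odds_ratio(beta_M_game4)
# The odds of choosing B rather than A in game 4 are roughly 4.8 times
# higher in treatment M than in the baseline treatment MI.
```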


The coefficients on the treatment variables are interpreted as follows. The significant coefficient on M (for B-choices) shows that in game 4 players are relatively more likely to choose B rather than A in treatment M compared to MI. In other words, there is an increased odds ratio (Pr(B)/Pr(A)) of B- vs. A-choices in treatment M compared to MI. Since almost no participants choose B in games 3 and 6 in any treatment there are, as expected, no significant treatment differences for B-choices in these games. The coefficient on M for C shows that there are relatively more C-choices than A-choices in games 3, 4 and 6 in treatment M compared to MI.

Result 1 (Complexity).
1. Higher complexity affects learning across games.
2. Dominant actions (in games 2 and 5) are played irrespective of the complexity of the environment.

The evidence that leads us to this conclusion is that behavior in games 1, 3, 4 and 6 differs significantly between treatment M on the one hand and treatments F, FI and MI on the other hand. Most strikingly, row players fail to best respond in games 3 and 6 at least 50% of the time in the last 50 periods of treatment M, in spite of the fact that almost all column players choose b.

5.2. Learning spillovers (Conjecture 2)

In the previous section we have seen that complexity hinders convergence to Nash equilibrium in all games except those that have a dominant strategy (games 2 and 5). This, however, does not yet prove the existence of learning spillovers, since complexity might affect behavior in a way that is unrelated to learning spillovers. Our treatments M* and M** are designed to provide a precise test for learning spillovers.

M vs. M** In treatment M** we switched the labels of actions B and C in games 2 and 5, such that the equilibrium in dominant strategies in games 2 and 5 changed from (C, b) to (B, b). If the effects observed in treatment M were not due to learning spillovers, then this relabeling should not affect behavior in games 1, 3, 4, and 6. Let us start again by considering games 3 and 6, where we have seen that in treatment M about 50 percent of row players fail to best respond to column players who almost always choose b. If behavior in games 3 and 6 changes in treatment M** compared to M, then this must be due to learning spillovers (from games 2 and 5 to games 3 and 6). And indeed the share of participants choosing C in games 3 and 6 is almost twice as high in treatment M as in M**. This effect is strongly significant, as the regression summarized in Table 9 illustrates. In games 1 and 4 we find no significant effect of the action relabeling in games 2 and 5 on B- and C-choices (compare also Table 9).

M vs. M* In treatment M* we reduced the row player payoffs of outcomes (A, a), (A, b) and (B, b) in games 1 and 4 by 5. The changes did not affect the equilibrium prediction, which is (A, a). Table 8 illustrates that, as expected, in M* we observe more C-choices in games 1 and 4 as compared to treatment M. This, of course, does not yield any insights on possible learning spillovers since in M* we did change payoffs in games 1 and 4.22

22 Note, though, that since the Nash equilibria in games 1 and 4 were unaffected by this change, treatment M* reveals once more that participants were unable to learn the Nash equilibrium in the more complex M-environments.


Table 7
Relative frequencies of action choices in games 3 and 6, last 50 rounds. Nash equilibrium, (A, b), in bold numbers.

Game 3
(RP)   A      B      C        (CP)   a      b      c
M      0.50   0.01   0.49     M      0.01   0.98   0.01
M*     0.46   0.01   0.53     M*     0.01   0.99   0.00
M**    0.71   0.04   0.25     M**    0.01   0.99   0.00

Game 6
(RP)   A      B      C        (CP)   a      b      c
M      0.44   0.00   0.55     M      0.01   0.98   0.01
M*     0.41   0.01   0.58     M*     0.00   1.00   0.00
M**    0.66   0.03   0.31     M**    0.01   0.99   0.00

Table 8
Relative frequencies of action choices in games 1 and 4, last 50 rounds. Nash equilibrium, (A, a), in bold numbers.

Game 1
(RP)   A      B      C        (CP)   a      b      c
M      0.37   0.50   0.13     M      0.45   0.55   0.00
M*     0.21   0.56   0.23     M*     0.43   0.56   0.00
M**    0.30   0.56   0.14     M**    0.53   0.46   0.00

Game 4
(RP)   A      B      C        (CP)   a      b      c
M      0.40   0.52   0.07     M      0.48   0.52   0.00
M*     0.23   0.57   0.19     M*     0.39   0.61   0.00
M**    0.46   0.49   0.05     M**    0.64   0.35   0.00

In games 3 and 6, which were not changed at all, the changes in games 1 and 4 seem to have increased the attractiveness of action C somewhat (compare Table 7), but the differences between M and M* are not significant. One reason why we may not observe significant learning spillovers in this case may be that our treatment variation was not strong enough, i.e. did not affect behavior sufficiently in games 1 and 4 themselves. The other reason could be that there are simply no spillover effects from games 1 and 4 to other games. We will come back to this later.

A multinomial logit regression (Table 9) confirms these impressions. There are significantly fewer C-choices in M** compared to M in games 3 and 6, which demonstrates that learning spillovers do occur. Furthermore, since the only change we made in M** compared to M was to relabel two actions, this result also demonstrates that learning spillovers are not invariant to action relabeling.

Result 2 (Learning spillovers).
1. Learning spillovers do occur in complex environments.
2. Learning spillovers are not invariant to action relabeling.

5.3. Strategic equivalence (Conjecture 3)

The previous two subsections have demonstrated that learning spillovers do occur in complex environments, where complexity was induced by a lack of feedback and the number of different games. One way to reduce the complexity induced by the occurrence of many different games is to reduce the number of elements to memorize by forming categories of games. A typical property of an efficient categorization will be that strategically equivalent games are placed in


Table 9
Multinomial logit regression on row players' choices (periods 51–100). Standard errors clustered by matching group ((Pr > X2) < 0.0001). Baseline is A-choices in treatment M.

Action choice  (Game 1)             (Game 3)             (Game 4)             (Game 6)
B
  Constant     0.30166 (0.3032)     −3.8712*** (0.5197)  0.2653 (0.3056)      −4.2556*** (1.0230)
  M*           0.6968* (0.4031)     −0.2162 (0.6646)     0.6289 (0.4464)      0.6180 (1.3921)
  M**          0.3098 (0.4673)      0.8922 (0.5893)      −0.2118 (0.5782)     1.2379 (1.0436)
C
  Constant     −1.0438** (0.5001)   −0.0210 (0.22282)    −1.7645*** (0.2789)  0.2273 (0.2277)
  M*           1.1260 (0.7111)      −0.1578 (0.2995)     1.5602** (0.7270)    0.1301 (0.2674)
  M**          0.2474 (0.6277)      −1.0279*** (0.3463)  −0.5491 (0.9061)     −0.9764*** (0.3200)
Observations   528                  788                  1232                 880
Groups         22                   22                   22                   22

*** 1%. ** 5%. * 10%.

Table 10
Strategic equivalence. Difference in choice frequencies between strategically equivalent games in the last 50 periods of treatment M.

(RP)            A       B       C        (CP)            a       b       c
Games (1)–(4)   −0.03   −0.02   0.06     Games (1)–(4)   −0.03   0.03    0.00
Games (2)–(5)   0.00    0.00    0.00     Games (2)–(5)   0.00    0.00    0.00
Games (3)–(6)   0.06    0.01    −0.06    Games (3)–(6)   0.00    0.00    0.00

the same category. (More precisely, if all other agents put strategically equivalent games in the same category, then no player can gain anything by not doing so.) From a more general point of view, if we observe that strategically equivalent games are played in the same way, we can be optimistic that behavior can be explained using the tools and language of game theory.

Table 10 shows the difference in choice frequencies between each pair of strategically equivalent games in treatment M.23 The table shows that there are no differences (exceeding 6 percentage points) between any pair of strategically equivalent games. None of the numbers in Table 10 is significantly different from zero according to a binomial test. Hence even in the most complex treatment M, where we do observe spillover effects, strategically equivalent games are played in the same manner. This is an important result because it shows that the strategic nature of games matters. This in turn suggests that there are regularities in learning across games that can be explained using game theoretic models. Details on individual behavior in this respect can be found in Section 6.2.

23 Tables for other treatments can be found in Appendix A.1.
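A binomial test of the kind referred to above can be sketched as follows: an exact two-sided test on how many groups show a difference of a given sign. This is a generic sketch; the counts in the example are invented for illustration and are not the paper's data.

```python
from math import comb

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial test: sum the probabilities of all outcomes
    that are no more likely under H0 than the observed count k."""
    probs = [comb(n, i) * p**i * (1 - p)**(n - i) for i in range(n + 1)]
    observed = probs[k]
    return sum(q for q in probs if q <= observed + 1e-12)

# Suppose 6 of 8 matching groups showed a positive frequency difference
# between two strategically equivalent games (hypothetical numbers):
p_value = binom_two_sided_p(6, 8)   # about 0.29, far from significance
```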


Table 11
Relative frequencies of action choices in games 3 and 6, last 50 rounds. Nash equilibrium is (A, b).

Game 3
(RP)   A      B      C        (CP)   a      b      c
M      0.50   0.01   0.49     M      0.01   0.98   0.01
M + F  0.53   0.02   0.44     M + F  0.00   1.00   0.00
M + G  0.72   0.04   0.27     M + G  0.01   0.99   0.00
MI     0.94   0.01   0.05     MI     0.01   0.99   0.00

Game 6
(RP)   A      B      C        (CP)   a      b      c
M      0.44   0.00   0.55     M      0.01   0.98   0.01
M + F  0.47   0.01   0.50     M + F  0.00   1.00   0.00
M + G  0.68   0.01   0.31     M + G  0.00   1.00   0.00
MI     0.85   0.01   0.13     MI     0.00   0.99   0.00

Result 3 (Strategic equivalence). Strategically equivalent games are played in the same way even in the most complex environment.

5.4. Belief, best response and action bundling (Conjectures 4 and 5)

The previous sections suggested that game theoretic models based on categorization may be able to rationalize most of the behavior we observe in treatment M. In this section we first ask what the main source of complexity in our experiment is, and then use all the evidence gathered so far to understand whether and how the different models of categorization presented in Section 4.1 can explain our data.

5.4.1. The source of complexity

In the following analysis we use treatments M, MI, M + F, and M + G to discriminate between the different modes of categorization and to identify the source of complexity in our experiment. Remember that in treatment MI participants (i) were given summary statistics about past behavior of their opponents in the different games and (ii) were shown the payoff matrices of the different games on the decision screens and in the instructions. In treatment M, participants received neither piece of information.24 Treatments M + F and M + G provide only information (i) or only information (ii), respectively. In M + F participants were shown summary statistics about past behavior of their opponents, but not the payoff matrices. In M + G they were shown the payoff matrices on the decision screen and in the instructions, but were not given summary statistics about past play.

Among the four treatments, M is clearly the most complex and MI the least complex environment. Comparison of M + F and M + G yields insights about the source of complexity in M. In particular, we can pin down to what extent complexity stems from the difficulty of distinguishing different games or from the difficulty of remembering the opponents' choice frequencies for the different games. This will also give us insight into whether participants reduce complexity by considering average games (best response bundling), by bundling beliefs, or by doing both (action bundling).

Games 3 and 6 We first consider games 3 and 6. Table 11 compares the shares of action choices in the last 50 rounds in those games across the treatments mentioned. As expected, both

24 They were only shown the payoff matrices for 10 seconds immediately preceding the decision screen. Hence, it required cognitive effort to memorize the payoff matrices as well as the past play of opponents.


Fig. 2. Varying feedback conditions. Predicted share of Nash (A) choices (from logit regressions on periods 51–100) in treatments M, MI, M + F and M + G.

treatments, M + F and M + G, range "in between" M and MI in the sense that the share of Nash choices is higher compared to M and lower compared to MI. More interestingly, there are more Nash choices in M + G than in M + F and, moreover, behavior in M + F seems very similar to behavior in treatment M. This is illustrated in Fig. 2.25 Regressions (see Table 12) confirm that there are no significant differences between M and M + F. In M + G, though, behavior is much closer to MI and significantly different from M. Hence, providing information about the games (on the decision screen) helps convergence to Nash equilibrium. This leads us to conclude that the difficulty of distinguishing different games is a major source of complexity in treatment M. Once it is removed, behavior is much closer to Nash equilibrium in games 3 and 6. This is consistent with theories of action and best response bundling, which assume that participants reduce complexity by averaging payoff matrices across the different games.

We do not observe significant differences between M + F and M, though. Hence, providing frequencies alone does not seem to help convergence to Nash equilibrium, while providing games on the decision screen does. Still, whereas we observe almost only Nash choices in MI, there remains a significant proportion of non-Nash choices in M + G. (The coefficients on MI

25 For the sake of clarity (with four treatments in one graph) the figure shows the results of logit regressions rather than actual shares as in Fig. 1.


Table 12
Multinomial logit regression on row players' choices (periods 51–100). Standard errors clustered by matching group ((Pr > X2) < 0.0001). Baseline is A-choices in treatment M.

Action choice  (Game 1)             (Game 3)             (Game 4)             (Game 6)
B
  Constant     0.3016 (0.3015)      −3.8712*** (0.5168)  0.2653 (0.3486)      −4.2556*** (1.0172)
  MI           −0.9696* (0.5860)    −0.6323 (0.6907)     −1.5783** (0.6528)   −0.2552 (1.2239)
  M + F        −0.6280 (0.6637)     0.7213 (0.7523)      −1.2270* (0.7419)    0.9306 (1.1667)
  M + G        −0.5667 (0.4409)     −1.1786 (1.0852)     −0.8747* (0.4758)    −0.8381 (1.3476)
C
  Constant     −1.0438** (0.4973)   −0.0210 (0.2269)     −1.7645*** (0.2773)  0.2273 (0.2264)
  MI           −1.2504* (0.7799)    −2.2942*** (0.4828)  −1.4312** (0.6794)   −2.0526*** (0.3266)
  M + F        0.0749 (0.8142)      −0.1756 (0.4034)     0.9022 (0.6426)      −0.1782 (0.4085)
  M + G        −0.5386 (0.6624)     −0.9512** (0.3500)   −0.2896 (0.6644)     −0.9904** (0.4797)
Observations   702                  1404                 1638                 1170
Groups         29                   29                   29                   29

*** 1%. ** 5%. * 10%.

and M + G are significantly different at 5% level.) The only difference between MI and M + G, however, is that in MI frequencies are provided while in M + G they are not. This suggests that misperception of relative frequencies may play a role as well, but that it seems to interact with the difficulty of distinguishing games. Games 1 and 4 Now consider games 1 and 4. In these games information on frequencies seems ex ante more valuable than in games 3 and 6, since the behavior of column players displays much more variation in games 1 and 4. And indeed in games 1 and 4 both providing information on games (as in M + G) and providing information on frequencies (as in M + F) affects behavior. Table 13 compares the shares of action choices in the last 50 rounds in those games across treatments. Both treatments, M + F and M + G, range “in between” M and MI in the sense that the share of Nash choices is higher compared to M and lower compared to MI. However, unlike in games 3 and 6, behavior in M + G and M + F is rather similar and overall effects are weaker. In game 4 the difference between M + G (or M + F) and treatment M is significant at the 10% level according to our regression in Table 12. In game 1 we do not observe significant differences, most likely because this game was played more often early on in the experiment. Fig. 2 illustrates the results. The overall evidence supports our earlier impression that – while the strongest and most significant effects are found between M + G and M – both, the difficulty of distinguishing games and the difficulty of remembering past choice frequencies of the opponent, are sources of complexity in our experiment.
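As a consistency check (a sketch, assuming action A is the omitted category as stated in the table note), the treatment-M constants in Table 12 can be converted back into predicted choice shares with the standard multinomial logit formula; for game 1 the constants 0.3016 (B) and −1.0438 (C) reproduce the shares 0.37, 0.50, 0.13 reported for treatment M in Table 13:

```python
import math

def mnl_shares(coefs):
    """Choice shares implied by multinomial logit coefficients
    (the baseline category has utility zero)."""
    expu = [1.0] + [math.exp(c) for c in coefs]
    total = sum(expu)
    return [e / total for e in expu]

# Game 1, treatment M: constants for B and C from Table 12, baseline A.
shares = mnl_shares([0.3016, -1.0438])  # approx. [0.37, 0.50, 0.13]
```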


Table 13
Relative frequencies of action choices in games 1 and 4, last 50 rounds. Nash equilibrium is (A, a).

Game 1   (RP)  A     B     C      (CP)  a     b     c
M              0.37  0.50  0.13         0.45  0.55  0.00
M + F          0.50  0.39  0.11         0.64  0.34  0.01
M + G          0.51  0.39  0.10         0.69  0.31  0.00
MI             0.62  0.31  0.06         0.74  0.26  0.00

Game 4   (RP)  A     B     C      (CP)  a     b     c
M              0.40  0.53  0.07         0.48  0.52  0.00
M + F          0.58  0.34  0.07         0.71  0.29  0.00
M + G          0.61  0.32  0.07         0.71  0.29  0.00
MI             0.76  0.21  0.03         0.82  0.17  0.00

Table 14
Row player behavior in each pair of strategically equivalent games in the last 50 periods of treatment M. Nash predictions in individual games are bold (A in games 1, 3, 4 and 6; C in games 2 and 5).

(RP)             A          B          C
Games (1)–(4)    0.37–0.40  0.50–0.52  0.07–0.13
Games (2)–(5)    0.00–0.01  0.00–0.01  0.99–1.00
Games (3)–(6)    0.44–0.50  0.00–0.01  0.49–0.55

Result 4 (Source of complexity). The difficulty of distinguishing games is a major source of complexity in our experiment. There is also evidence that remembering the opponents' choice frequencies creates additional complexity for participants.

The evidence supporting this result is that a comparison of treatments M + G and M reveals that in games 3 and 6 behavior differs significantly across the two treatments, with more Nash choices observed in M + G. However, there are still fewer Nash choices in M + G than in MI. We also find weakly significant treatment differences between M + F and M in games 1 and 4. Both results indicate that remembering choice frequencies is a source of complexity as well.

5.4.2. Modes of categorization

In this subsection we exploit the findings discussed in Sections 5.1 to 6.1 to show how a theory of categorization can explain our results. As before we seek to explain row player behavior only, but we will of course require it to be optimal given the empirically observed behavior by column players and the mode of categorization (footnote 26). Table 14 summarizes the behavior we want to explain. While in games 2 and 5 all participants choose the Nash action, about 50 percent of choices in games 1, 3, 4 and 6 are non-Nash, and these are the games we focus on. One difficulty with models of categorization is that the researcher has a considerable degree of freedom in choosing partitions, i.e. in choosing which games are assumed to be in the same category. Hence, instead of making arbitrary assumptions, we let the evidence presented in Sections 5.1 to 6.1 guide our choice of partition. Our first observation is that there is heterogeneity: most strikingly, in games 3 and 6 some row players choose the Nash action A, but many others choose C, even though all column players choose the same action (b) (footnote 27). In Section 6.1 we have seen that this heterogeneity is highly correlated with the willingness to engage in cognitive reflection. We furthermore rely on Section 5.2, where we identified significant spillovers from games 2 and 5 to games 3 and 6. Hence, we will assume that at least some players place those games in the same category. Finally, in Section 5.4.1 we identified the difficulty of distinguishing games as a major source of complexity in our experiment. We use these three findings (heterogeneity, spillovers from games 2 and 5 to games 3 and 6, and the difficulty of distinguishing games as a source of complexity) to present one example of a theory that can explain our results. The example involves the minimal amount of heterogeneity (and hence the smallest number of degrees of freedom) possible, but we also discuss alternative plausible explanations.

Footnote 26: We will discuss column player behavior briefly below.

Action or best response bundling
Given the evidence presented above we assume that some participants (those very willing to reflect) hold the finest partition (or alternatively partition {{1, 4}, {2, 5}, {3, 6}}) and choose best responses to observed behavior by the column players in each game (or category of strategically equivalent games). This clearly involves choosing the Nash actions C in games 2 and 5 and A in games 3 and 6. In games 1 and 4, column players choose a and b with about equal frequencies, in which case row players are indifferent between actions A and B; hence, depending on individual experience, row players may choose either A or B in those games. What about the other participants (those that choose C in games 3 and 6)?
Consistent with the evidence from Section 5.2, we assume that some participants rely on category {2, 3, 5, 6} and, consistent with the evidence from Section 5.4.1, that they engage in either action or best response bundling.

Prediction 1 (Action or best response bundling).
• Assume best response (or action) bundlers hold partition {{1, 4}, {2, 5}, {3, 6}}. Then row players choose A or B in category {1, 4} (footnote 28), C in category {2, 5} and A in category {3, 6} as best responses to observed behavior by column players.
• Assume best response (or action) bundlers hold partition {{2, 3, 5, 6}, {1, 4}}. Then, if f3 + f6 < 2(f2 + f5), row players choose C in all games in category {2, 3, 5, 6} as best responses to observed behavior by column players.

The latter claim follows simply from the fact that if f3 + f6 < 2(f2 + f5), then C is a dominant strategy for row players in the average game corresponding to category {2, 3, 5, 6} (footnote 29). Hence one explanation is that some participants hold partition {{1, 4}, {2, 3, 5, 6}} and others the finest partition (or partition {{1, 4}, {2, 5}, {3, 6}}), and that players engage in action or best response bundling. This explanation can account for around 90% of observed behavior. It cannot explain why around 10 percent of participants choose C in games 1 and 4; this percentage, however, decreases over time to almost zero (footnote 30).

Belief bundling
What about belief bundling? Irrespective of the partition, belief bundling cannot explain why players choose action C in games 3 and 6 (which is chosen in roughly 50% of all cases). To see this, note that choosing C in those games is optimal only if an agent believes that column players choose c (which they do not do in any game) or that they choose a with at least 50 percent probability. Column players, however, choose a with slightly less than 50 percent probability even in games 1 and 4 (the only games in which they choose a at all). Hence, as long as f3 + f6 > 0, there is no category G ⊇ {3, 6} and average belief μ(G) that can rationalize choosing C in games 3 and 6. Consistent with this argument, the evidence presented in Section 5.4.1 suggests that participants do not seem to be (substantially) confused about their opponents' past choice frequencies in these games. In Section 5.4.1 we also found some evidence that providing frequencies leads to more Nash choices in games 1 and 4. And indeed, B-choices in those games can be explained via belief bundling, as the following result demonstrates.

Prediction 2 (Belief bundling).
• Assume belief bundlers hold partition {{1, 4}, {2, 5}, {3, 6}}. Then row players choose A or B in category {1, 4}, C in category {2, 5} and A in category {3, 6} as best responses to observed behavior by column players.
• Assume belief bundlers hold partition {1, 2, 3, 4, 5, 6}. Then row players choose B in games 1 and 4, C in games 2 and 5 and A in games 3 and 6 as best responses to observed behavior by column players.

The latter part of the claim follows from the fact that C is a dominant strategy in games 2 and 5. In games 1 and 4, row players will believe that column players choose a with less than 50 percent probability (irrespective of the frequency with which different games occur). Hence they will best respond with B. And in games 3 and 6 they will best respond with A.

Result 5 (A theoretical explanation).
1. A model of best response bundling or action bundling in which some participants hold partition {{1, 4}, {2, 3, 5, 6}} and others the finest partition (or partition {{1, 4}, {2, 5}, {3, 6}}) can explain observed (row player) behavior in treatment M.
2. A model of belief bundling can explain B-choices by row players in games 1 and 4, but not C-choices in either games 1 and 4 or 3 and 6.

Footnote 27: Below we will see that these are indeed two "types" of players, rather than everyone choosing a mixed strategy.
Footnote 28: Either A or B is possible because variation in column player behavior is large enough that different row players have faced sufficiently different distributions of column player choices.
Footnote 29: Action and best response bundling are not distinguished here, since in category {2, 3, 5, 6} column players choose b almost all the time in all games. Hence a best response bundler who bundles beliefs in addition (action bundling) behaves in the same way as an agent that bundles only best responses.
Footnote 30: An explanation that can account for all behavior relies on a third type of player who holds the coarsest partition and engages in action or best response bundling, hence choosing C also in games 1 and 4. Indeed one may find it more natural to assume that participants who are not very willing to reflect use the coarsest partition. In Appendix C we show that the results can also be explained by some participants holding partition {1, 2, 3, 4, 5, 6} and others the finest partition (or partition {{1, 4}, {2, 5}, {3, 6}}). In this case, however, only best response bundling can explain the results. The reason is that action bundling prescribes that players choose the same action in all games in a given category, which is clearly inconsistent with about 50 percent of B-choices in games 1 and 4 on the one hand and about 50 percent of C-choices in games 3 and 6 on the other.
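Predictions 1 and 2 both rest on averaging across the games in a category: action or best response bundlers average payoff matrices, belief bundlers average beliefs about the opponent. A minimal sketch of this logic, using hypothetical payoff matrices (the experiment's actual games are not reproduced here):

```python
import numpy as np

# Hypothetical 3x3 row-player payoff matrices (rows: A, B, C; columns: a, b, c).
# Illustrative placeholders only -- not the payoff matrices of the experiment.
G_dom_C = np.array([[4.0, 4.0, 4.0],
                    [3.0, 3.0, 3.0],
                    [9.0, 9.0, 9.0]])   # a game in which C is strictly dominant
G_best_A = np.array([[9.0, 9.0, 9.0],
                     [3.0, 3.0, 3.0],
                     [4.0, 4.0, 4.0]])  # a game in which A is strictly dominant

def average_game(matrices, freqs):
    """Action/best response bundling: frequency-weighted average payoff matrix."""
    w = np.asarray(freqs, dtype=float)
    w = w / w.sum()
    return sum(wi * M for wi, M in zip(w, matrices))

def average_belief(beliefs, freqs):
    """Belief bundling: frequency-weighted average belief over column actions."""
    w = np.asarray(freqs, dtype=float)
    w = w / w.sum()
    return sum(wi * np.asarray(b, dtype=float) for wi, b in zip(w, beliefs))

def best_response(payoff, belief):
    """Row player's best response (0 = A, 1 = B, 2 = C) to a belief over a, b, c."""
    return int(np.argmax(payoff @ np.asarray(belief, dtype=float)))

# If the C-dominant game gets enough weight in the category (cf. the condition
# f3 + f6 < 2(f2 + f5)), C is the best response in the averaged game even though
# A would be optimal in G_best_A played in isolation.
avg = average_game([G_dom_C, G_best_A], freqs=[0.7, 0.3])

# Belief bundling instead averages the opponents' empirical frequencies:
mu = average_belief([[0.45, 0.55, 0.0], [0.0, 1.0, 0.0]], freqs=[0.5, 0.5])
```

A bundler holding the merged category thus chooses C in every game of the category, which is the mechanism behind non-Nash C-choices; a belief bundler instead best-responds game by game to the averaged belief mu, which can never place enough weight on a (or c) to rationalize C.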


Comparative statics
By comparing treatments M, M*, and M** we can furthermore check whether our data are in line with the comparative statics predictions of the different models of categorization. In Appendix D we show that under best response bundling, irrespective of the partition used, we should observe weakly more C-choices in M* compared to M and weakly fewer C-choices in M** compared to M (footnote 31). This is also what we observe. Belief bundling, on the contrary, would predict no change in behavior in M* compared to M and weakly more C-choices in M** compared to M, which is not in line with our evidence. To keep the paper short we refer the reader to Appendix D for more details and proofs.

Footnote 31: Of course, it is possible that different partitions arise in the three different treatments.

Column players
Our experiment was designed to study row player behavior, but let us take a brief look at whether column player behavior is consistent with theories of categorization. Column players choose b in games 2, 3, 5 and 6, which is the Nash equilibrium choice as well as the best response to empirical behavior by row players. In games 1 and 4, 45 to 48 percent of column players choose a, while 52 to 55 percent choose b. It is important to note that there is a priori no reason to believe that column players partition games in the same way as row players do, since the information (payoff matrices and feedback) they see and the experience they gather in the six games differ from those of row players. Theories of categorization can rationalize b-choices in games 1 and 4 via best response or action bundling using partition {1, 2, 3, 4, 5, 6}, {1, 3, 4, 6} or {1, 2, 4, 5}. Belief bundling can also rationalize b-choices given any of the previous partitions as long as an average belief is created that places enough weight on C. Hence, column player behavior cannot be used to discriminate between different modes of categorization.

6. Extensions and robustness

6.1. Cognitive reflection

In the context of our study it is natural to expect that the willingness of subjects to engage in cognitively costly reasoning should be correlated with their behavior in the different games. To analyze this issue we conducted a cognitive reflection test (Frederick [8]) at the end of the experiment. The test consists of the following three questions:
1. A bat and a ball cost Euro 1.10 in total. The bat costs Euro 1.00 more than the ball. How much does the ball cost?
2. If it takes 5 machines 5 minutes to make 5 widgets, how long would it take 100 machines to make 100 widgets?
3. In a lake, there is a patch of lily pads. Every day, the patch doubles in size. If it takes 48 days for the patch to cover the entire lake, how long would it take for the patch to cover half of the lake?
All these questions have an answer that immediately springs to mind (10 cents, 100 minutes, 24 days) but is wrong. The right answer (5 cents, 5 minutes, 47 days) can only be found by engaging in some cognitive reflection. Note that the test is not measuring intelligence, but


Table 15
Reflective, unreflective and other answers in the CRT in treatment M (percentage of individuals).

Question No.    1     2     3
Reflective      0.46  0.53  0.55
Unreflective    0.50  0.38  0.31
Other answers   0.04  0.09  0.14

Table 16
Relative frequencies of action choices of reflective and other individuals in treatment M in the last 50 rounds of play. (The equilibrium choice is A in each of these games.)

             Game 1             Game 4             Game 3             Game 6
             A     B     C      A     B     C      A     B     C      A     B     C
Reflective   0.36  0.50  0.14   0.58  0.40  0.00   0.60  0.00  0.40   0.64  0.00  0.36
Others       0.37  0.50  0.12   0.35  0.56  0.08   0.47  0.00  0.52   0.38  0.00  0.61

rather the willingness of subjects to engage in costly cognitive reflection. Table 15 summarizes the results of the cognitive reflection test in treatment M. In our experiment cognitive reflection is mostly required to memorize past payoffs and the past behavior of the opponents in previous games. (Remember that, since participants can only see their own payoffs, cognitive reflection here cannot be about the ability to perform iterated elimination of dominated strategies.) Table 16 reports action choices of row players in games 1, 3, 4, and 6 in treatment M separately for those subjects who answered all three questions correctly and for those who did not. Three correct answers indicate a high willingness to engage in costly cognitive reflection; hence, we refer to those subjects as "reflective". Table 16 shows that participants classified as "reflective" tend to choose A more often than others in games 3, 4 and 6. This effect is highly significant according to a Spearman correlation test (ρ = 0.1092**, 0.1915*** and 0.2155*** in games 3, 4 and 6, respectively). In game 1 there is no significant correlation between the frequency of A-choices and the variable "reflective" (Spearman test, p > 0.8485). We also find that all participants categorized as "reflective" choose the same action most of the time in games 3 and 6: among them, 72% always choose A, while the remaining 28% always choose C. Furthermore, 85% of the reflective participants show the same behavior in every pair of strategically equivalent games. Taken together, these results suggest that non-convergence in treatment M can indeed be attributed to the cognitive costs implied by the more complex environment.
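The reflective answers to the three CRT questions can be verified with elementary arithmetic:

```python
# 1. Bat and ball: ball + (ball + 1.00) = 1.10, so the ball costs 0.05.
ball = (1.10 - 1.00) / 2

# 2. One machine makes one widget in 5 minutes, so 100 machines need the same
#    5 minutes to make 100 widgets.
rate = 5 / (5 * 5)             # widgets per machine per minute
minutes = 100 / (100 * rate)

# 3. The patch doubles daily and covers the lake on day 48, so it covered half
#    of the lake one day earlier, on day 47.
days = 48 - 1
```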


Table 17
Individual level strategic equivalence. Share of participants that choose the same action "most of the time" in each pair of strategically equivalent games in the last 50 periods of treatment M. Individuals are classified according to the action they choose most often in each game (at least 66 percent of the time within the last 50 periods).

(RP)            A     B     C      (CP)            a     b     c
Games 1 and 4   0.50  0.58  0.00   Games 1 and 4   0.78  0.79  –
Games 3 and 6   0.92  –     0.87   Games 3 and 6   –     1     –

Result 6 (Cognitive reflection test).
1. "Reflective" participants choose the Nash action significantly more often in games 3, 4, and 6.
2. 85% of the "reflective" participants show the same behavior in every pair of strategically equivalent games.
3. All "reflective" participants always choose the same action in games 3 and 6 (72% choose action A and 28% action C) and in games 2 and 5.

6.2. Individual heterogeneity

In this subsection we take a look at individual heterogeneity. We focus (i) on the question of whether individuals choose the same action in strategically equivalent games and (ii) on payoff differences between participants. Understanding individual heterogeneity is important, for example, to gain insight into whether aggregate heterogeneity is due to the existence of different types of players or due to players using mixed strategies. For our analysis we consider only data from treatment M. To address the above questions, for each game we classify participants according to the action they choose "most often", as long as they choose this action at least 66 percent of the time in the last 50 periods. If there is no action a participant chooses at least 66 percent of the time in the last 50 periods, this participant is classified as "neither". This concerns two participants in game 1, one participant in game 3 and three participants in game 6.

Do individuals choose the same action in strategically equivalent games?
Table 17 shows the percentage of individuals that choose the same action "most of the time" in each pair of strategically equivalent games. We compute the shares by relating the number of participants that choose the same action in both games (e.g. action A in both games 1 and 4) to the sum of participants that played this action "most of the time" in either game.
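A minimal sketch of this share computation (the participant data below are hypothetical):

```python
def equivalence_share(modal_g1, modal_g4, action):
    """Share of participants choosing `action` most of the time in both games of
    a pair, relative to all who choose it most of the time in either game.
    Inputs map participant id -> modal action."""
    both = sum(1 for pid, a in modal_g1.items()
               if a == action and modal_g4.get(pid) == action)
    n1 = sum(1 for a in modal_g1.values() if a == action)
    n4 = sum(1 for a in modal_g4.values() if a == action)
    return 2 * both / (n1 + n4) if n1 + n4 else float("nan")

# Hypothetical modal choices of four participants in games 1 and 4:
g1 = {1: "A", 2: "A", 3: "B", 4: "C"}
g4 = {1: "A", 2: "B", 3: "B", 4: "A"}
share_A = equivalence_share(g1, g4, "A")  # 2 * 1 / (2 + 2) = 0.5
```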
To give an example, denote by A1 (A4) the number of participants that choose A most of the time in game 1 (game 4), and by A(1+4) the number of participants that choose A most of the time in both games 1 and 4. The share reported corresponds to 2 · A(1+4)/(A1 + A4). The table illustrates that in games 3 and 6 most participants choose the same action "most of the time". With respect to games 1 and 4 the picture is less clear. This could be due to some players choosing mixed strategies, or simply to the overall complexity of these games.

Payoff differences
Table 18 summarizes the payoffs obtained by the different "types" of players in our experiment. It can be seen that in all of games 1, 3, 4 and 6 players who choose C have


Table 18
Payoffs by game and action choice in games 1 and 4, as well as 3 and 6. Players are classified A (B, C, a, b) if they chose that action most of the time (at least 66 percent of the time within the last 50 periods).

Game 1   A     B     C     a     b
FI       18.5  17.8  17.5  18.2  15.1
F        18.7  –     –     18.7  –
MI       17.4  18.4  15.3  17.4  18.1
M        16.9  17.2  15.7  16.8  17.2

Game 3   A     C
MI       18.9  17.0
M        18.3  16.0

Game 4   A     B     C     a     b
MI       23.4  23.7  19.8  23.1  21.0
M        22.3  23.3  20.6  21.9  21.1

Game 6   A     C
MI       24.2  22.6
M        23.8  20.9
lower payoffs than others (Mann–Whitney, p < 0.0001 in games 3, 4 and 6 and p < 0.0261 in game 1).

Result 7 (Individual heterogeneity).
1. More than 87 percent of participants choose the same action most of the time in games 3 and 6, but only between 50 and 79 percent of participants do so in games 1 and 4.
2. Participants that choose C most often in games 1, 3, 4 and 6 earn significantly lower payoffs than other participants.

6.3. Learning

Our design implies that participants have to learn which action to choose in which game or category. In the following we study whether their choices in any given game are affected by their experience in strategically different games. This is a second way to gain insight into learning spillovers which does not hinge on treatment comparisons, and it provides an additional indication of how reliable the results derived in Section 5.2 are. To this end we run logit regressions where we use past payoffs in each pair of strategically equivalent games, for each action, as independent variables. Note that if we adopted an equilibrium perspective (where each player chooses a contingent plan of actions at the start of the experiment), such a regression would suffer from a huge endogeneity problem. Our fundamental view in this subsection is that participants learn which action to choose in each game (or category of games). Under this view there is no way in which past payoffs could be affected by current choices (and hence no endogeneity issue). Still, the results of the regression, while indicative, should be interpreted with care. We eliminated actions from the regression which are almost never chosen, i.e. in less than 3 percent of the cases (footnote 32). Table 19 summarizes the results of our regression.
Table 19
Multinomial logit regression. Standard errors clustered by matching group. Decisions in treatment M depending on previous payoffs of action A (B, C) in all pairs of strategically equivalent games. Baseline is A-choices ((Pr > χ2) < 0.0001).

Action choice   Game 1        Game 3        Game 4        Game 6
B
  Constant      16.674**      0.8739        −1.6659       −58.454**
  A Games14     −0.1470***    −0.0347       −0.3899**     0.0983***
  B Games14     0.0568***     −0.0364*      0.1519***     0.3643
  C Games14     −0.1751       0.0880*       −0.0247       0.1762
  C Games25     −0.4205**     0.0338        0.1832*       0.7798
  A Games36     0.0129        −0.2396**     0.0284        0.2562
  C Games36     −0.1052       −0.0704       0.0325        0.6260
C
  Constant      0.4797        −3.900*       −0.3351       −59.9414***
  A Games14     −0.1667***    −0.0203       −0.387**      −0.0711
  B Games14     −0.0519**     −0.0392*      −0.0155       −0.0251
  C Games14     −0.0007       0.0136        0.1358        0.0507
  C Games25     0.0137        0.1163***     0.2645**      1.5881***
  A Games36     −0.0516*      −0.2543***    −0.0411*      −0.4169**
  C Games36     0.1898        0.3802***     −0.1679       1.2479***
Observations    416           544           960           384
Groups          8             8             8             8

*** 1%. ** 5%. * 10%.

In games 1 and 4 we find that action B is chosen more often compared to A the higher the past payoffs obtained with this action (coefficients on (B Games14)) and the lower the past payoffs of A in games 1 and 4. C is chosen more often compared to A if A was unsuccessful in the past. So far these are not spillover effects but simple learning effects. We find few significant spillover effects for games 1 and 4. The only effect significant at the 5 percent level is that the success of C in games 2 and 5 seems to increase the probability with which C is chosen in game 4 and to decrease that of B in game 1. In games 3 and 6 we also observe intuitive learning effects: the more successful action C was in games 3 and 6 (and the less successful action A), the more often C is chosen relative to A. Those effects are strongly significant. In these games we also see a very strong spillover effect from games 2 and 5: the more successful action C was on average in those games, the more often participants choose C in games 3 and 6 as well (footnote 33). From games 1 and 4 we see mostly marginally significant effects, and those mostly on B-choices. Remember, however, that B was essentially never chosen in the last 50 periods of the experiment; hence those effects seem to be mostly due to learning in early stages of the experiment. In sum, the regression in Table 19 provides some evidence for our conjecture, based on the previous evidence, that some participants may be using partition {{1, 4}, {2, 3, 5, 6}}. However, the coarsest partition cannot be rejected either on the basis of this regression.

Footnote 32: The reason is that, since these actions are never chosen by most people, the variable "average payoff" will be zero for most people and take on a positive value for very few. If these few people display certain choice pathologies (which may well be the case, since they choose, e.g., strictly dominated actions), those will distort the regression results.
Footnote 33: Since, especially in the first 50 periods, some participants do deviate from the dominant strategy at times, we see some variation in average payoffs even for C-choices in games 2 and 5.
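The independent variables used above (average past payoff earned with each action within each pair of strategically equivalent games) could be constructed as in the following sketch; the history format and values are hypothetical:

```python
from collections import defaultdict

# Pairs of strategically equivalent games, matching the regressor names.
PAIR = {1: "14", 4: "14", 2: "25", 5: "25", 3: "36", 6: "36"}

def past_avg_payoffs(history):
    """Average payoff earned so far with each action in each pair of
    strategically equivalent games.  `history` is a chronological list of
    (game, action, payoff) tuples for one participant; the returned values
    are the regressors available before the next round."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for game, action, payoff in history:
        key = (action, PAIR[game])
        totals[key] += payoff
        counts[key] += 1
    return {key: totals[key] / counts[key] for key in totals}

# Hypothetical play history of one participant:
history = [(1, "B", 18.0), (4, "B", 20.0), (2, "C", 25.0)]
regressors = past_avg_payoffs(history)  # {("B", "14"): 19.0, ("C", "25"): 25.0}
```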


Table 20
Difference in relative frequencies of action choices in the last 50 periods between treatments M and M-UD (entries are M minus M-uniform).

(RP)     A      B      C       (CP)     a      b      c
Game 1   0.07   −0.08  0.02    Game 1   0.02   0.02   0.00
Game 2   0.00   0.00   0.00    Game 2   0.00   0.00   0.00
Game 3   −0.06  0.01   0.05    Game 3   0.01   −0.01  0.00
Game 4   0.06   −0.08  0.02    Game 4   −0.03  0.03   0.00
Game 5   0.00   0.00   0.00    Game 5   0.00   0.00   0.00
Game 6   0.01   −0.01  0.00    Game 6   0.00   0.00   0.00

Result 8 (Learning). We do observe spillover effects in learning. Spillover effects are particularly strong (and highly significant) from games 2 and 5 to games 3 and 6.

6.4. Robustness

The reader may have noted that, while we were drawing games from a uniform distribution, the resulting empirical distribution was not uniform. In particular, game 4 occurred more often than other games. We hence conducted an additional treatment (M-UD), in which each game appeared either 16 or 17 times. In this subsection we compare behavior in treatments M and M-UD. Table 20 shows that there are few differences in behavior between treatments M and M-UD, and the small differences observed are not significant (p > 0.417 according to a multinomial logit regression).

Result 9 (Robustness). Results are robust in the sense that slight changes in the frequencies with which each game occurs do not significantly alter observed behavior in any of the games played.

7. Conclusions

In this paper we investigated experimentally how agents learn to make decisions in a multiple games environment. Participants interacted in simple 3 × 3 normal-form games for 100 rounds. We varied the number of games across treatments. In addition, in some of the treatments it required substantial cognitive effort to acquire and remember information on payoff matrices and the opponents' past behavior, while in others it required less. We find that participants do extrapolate between games. If either there are few games or if explicit summary information about the games and the opponents' behavior is provided (or both), convergence to the unique Nash equilibrium generally occurs. Otherwise this is not the case and play converges to a non-Nash distribution of actions. In games with a dominant strategy equilibrium, however, convergence always takes place.
We identify significant learning spillovers across games and demonstrate that participants who are more reflective (in the sense of a higher score on the "Cognitive Reflection Test" of Frederick [8]) choose Nash actions more often. We furthermore used control treatments to identify the source of complexity in our experiment. The evidence points to the difficulty of distinguishing games in the first place, but missing summary information about other players' choices also affects behavior. We demonstrate that a model of best response or action bundling can explain a large part of our data, and we also discuss which part of the evidence can plausibly be explained by phenomena of belief bundling (or "analogy-based expectations", Jehiel [14]).


Our results have a number of interesting implications. For example, the fact that behavior in dominant strategy games seems unaffected by the complexity of the environment is suggestive for mechanism design and merits further research. More generally, if we understand how categorization of different games/mechanisms takes place, under which conditions extrapolation will emerge and how it will affect learning, then policy design can be much improved. Knowledge of how consumers learn across different everyday situations has implications for the contracts that firms offer to consumers and may also suggest regulatory interventions to impede exploitation of consumers by firms. The potential for future research in this area, both fundamental and applied, is huge.

Appendix A. Additional tables and graphs

A.1. Strategic equivalence and full regression

Table 21
Strategic equivalence. Difference in choice frequencies between strategically equivalent games in the last 50 periods of treatment M*.

(RP)            A      B      C       (CP)            a      b      c
Games (1)–(4)   −0.02  −0.01  0.03    Games (1)–(4)   0.04   −0.04  0.00
Games (2)–(5)   0.00   0.00   0.00    Games (2)–(5)   0.00   0.00   0.00
Games (3)–(6)   0.06   0.00   −0.06   Games (3)–(6)   0.00   0.00   0.00

Table 22
Strategic equivalence. Difference in choice frequencies between strategically equivalent games in the last 50 periods of treatment M**.

(RP)            A      B      C       (CP)            a      b      c
Games (1)–(4)   −0.16  0.06   0.09    Games (1)–(4)   0.02   0.03   −0.04
Games (2)–(5)   0.00   0.00   0.00    Games (2)–(5)   0.00   0.00   0.00
Games (3)–(6)   0.06   0.00   −0.06   Games (3)–(6)   0.02   −0.02  0.00

Table 23
Strategic equivalence. Difference in choice frequencies between strategically equivalent games in the last 50 periods of treatment M + F.

(RP)            A      B      C       (CP)            a      b      c
Games (1)–(4)   −0.11  0.09   0.02    Games (1)–(4)   −0.05  0.05   0.00
Games (2)–(5)   0.00   0.00   0.00    Games (2)–(5)   0.00   0.00   0.00
Games (3)–(6)   0.03   0.00   −0.04   Games (3)–(6)   0.00   0.00   0.00

Table 24
Strategic equivalence. Difference in choice frequencies between strategically equivalent games in the last 50 periods of treatment M + G.

(RP)              A      B      C     (CP)              a      b      c
Games (1)–(4)   −0.07   0.05   0.02   Games (1)–(4)   −0.01   0.01   0.00
Games (2)–(5)    0.00   0.00   0.00   Games (2)–(5)    0.00   0.00   0.00
Games (3)–(6)    0.04   0.00  −0.03   Games (3)–(6)    0.00   0.00   0.00
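Tables 21–24 report, for each action, the difference in choice frequencies between strategically equivalent games. As an illustration of how such a difference is computed, here is a minimal sketch in Python; the choice counts below are hypothetical (chosen only so that the output matches the first row of Table 21), not the experimental data.

```python
# Sketch: difference in choice frequencies between two strategically
# equivalent games, as reported in Tables 21-24. The counts are
# hypothetical illustrations, not the experimental data.

def choice_shares(counts):
    """Map each action to its share of choices, given a dict of counts."""
    total = sum(counts.values())
    return {action: n / total for action, n in counts.items()}

def equivalence_diff(counts_g, counts_h):
    """Per-action difference in choice shares between games g and h."""
    sg, sh = choice_shares(counts_g), choice_shares(counts_h)
    return {a: round(sg[a] - sh[a], 2) for a in sg}

# Hypothetical last-50-period counts for games 1 and 4 (row players):
game1 = {"A": 10, "B": 20, "C": 70}
game4 = {"A": 12, "B": 21, "C": 67}
print(equivalence_diff(game1, game4))  # {'A': -0.02, 'B': -0.01, 'C': 0.03}
```

A difference close to zero for every action, as in the tables, indicates that the two games are played in (almost) the same way.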


Table 25
Multinomial logit regression on row players' choices. All treatments, all periods. Standard errors (in parentheses) clustered by matching group ((Pr > χ²) < 0.0001). Baseline is A-choices in treatment MI.

Action choice   (Game 1)              (Game 3)              (Game 4)              (Game 6)

B
Constant        −0.2993 (0.3991)      −4.2219*** (0.4598)   −0.4634 (0.3339)      −4.3006*** (0.5204)
F               −2.3900*** (0.5295)   –                     –                     –
FI              −0.6770 (0.5638)      –                     –                     –
M                0.5515 (0.4891)       0.9542* (0.5793)      0.8288** (0.3947)    −0.0687 (1.1627)
M-uni            0.9538* (0.5436)      0.2820 (0.7127)       1.0911** (0.5012)     0.3987 (0.8377)
M*               1.4492*** (0.4880)    0.6219 (0.7101)       1.6546*** (0.4096)    0.7891 (0.8602)
M**              1.0637* (0.5580)      1.1894** (0.5145)     0.8724* (0.5124)      1.1366** (0.5457)
M+F              0.4031 (0.5577)       1.3017** (0.6745)     0.1702 (0.5545)       1.0751 (0.6878)
M+G              0.3931 (0.4849)      −0.3782 (0.7872)       0.3092 (0.4119)      −0.8754 (1.0197)

C
Constant        −1.113** (0.4374)     −1.153*** (0.1707)    −1.4483** (0.4311)    −1.2443*** (0.1707)
F               −1.315*** (0.5140)    –                     –                     –
FI              −0.9270* (0.5216)     –                     –                     –
M                0.8549* (0.5107)      1.5936*** (0.2727)    1.0860** (0.4311)     1.5933*** (0.2727)
M-uni            1.0671* (0.5697)      1.5105*** (0.4183)    0.9707* (0.5255)      1.7791*** (0.4684)
M*               1.9147*** (0.5370)    1.6073*** (0.2265)    2.1302*** (0.5486)    1.6347*** (0.2308)
M**              1.1022 (0.5322)       0.9883*** (0.2180)    0.7226 (0.5404)       0.6191** (0.2659)
M+F              0.9344* (0.7015)      1.4598*** (0.3198)    1.0290* (0.6278)      1.4793*** (0.3637)
M+G              0.3349 (0.5517)       1.1933** (0.2458)     0.6010 (0.5007)       0.7686* (0.3689)

Observations     4105                  3208                  5466                  2346
Groups             55                    47                    47                    47

*** 1%. ** 5%. * 10%.


A.2. Regressions for treatments MI, M, FI and F with M as baseline

Table 26
Multinomial logit regression on row players' choices (periods 51–100). Standard errors (in parentheses) clustered by matching group ((Pr > χ²) < 0.0001). Baseline is A-choices in treatment M.

Action choice   (Game 1)              (Game 3)              (Game 4)              (Game 6)

B
Constant         0.3016 (0.3026)      −3.8712*** (0.4651)    0.2653 (0.5601)      −4.5108*** (0.6970)
F               −34.25*** (0.5942)    –                     –                     –
FI              −2.6872*** (0.5356)   –                     –                     –
MI              −0.9699* (0.5882)     −0.6323 (0.7010)      −1.5787** (0.6626)    −0.2552 (1.2421)

C
Constant        −1.0438** (0.4992)    −0.0210 (0.4324)      −1.7645*** (0.2814)    0.2273 (0.2298)
F               −2.2520*** (0.8655)   –                     –                     –
FI              −2.7554*** (0.8562)   –                     –                     –
MI              −1.2504 (0.7828)      −2.9420*** (0.4899)   −1.4312** (0.6895)    −2.0526*** (0.3314)

Observations      896                   576                  1280                   640
Groups             24                    16                    16                    16

*** 1%. ** 5%. * 10%.

Appendix B. Subject pool effects

Since some of the sessions were run in Cologne and some in Maastricht, we compare behavior across the two labs to identify possible subject pool effects. Fig. 3 shows the share of A-choices over time in games 1, 3, 4 and 6 in treatments M and MI, separately by lab. There do not seem to be significant differences in behavior across the two labs. As an additional check, we also ran a multinomial logit regression in which we included a Maastricht dummy (together with treatment interactions). These regressions show no significant effects (p > 0.172), except for treatment MI in game 3, where there are more C-choices relative to A-choices in Maastricht (p = 0.0090).

Appendix C. Equilibria

In this appendix we compute equilibria for all three modes of categorization (belief bundling, best response bundling, and action bundling) given the coarsest category {1, 2, 3, 4, 5, 6}. The average game for this category is computed as follows (for space reasons we show only row player payoffs):


Fig. 3. Subject pool effects. Share of Nash choices by row players across treatments MI and M and sessions run in Maastricht and Cologne.

     ⎛ 20 15 15 ⎞      ⎛  5 15  5 ⎞            ⎛ 20 25 20 ⎞
  f1 ⎜ 10 25  0 ⎟ + f2 ⎜ 10  5 10 ⎟ + ··· + f6 ⎜ 20 15 10 ⎟
     ⎝ 10 15 35 ⎠      ⎝ 20 25 15 ⎠            ⎝ 25 20 40 ⎠

     ⎛ 5f2 + 15(f3 + f5) + 20(f1 + f6) + 25f4    ···   5f2 + 15(f1 + f3 + f5) + 20(f4 + f6)    ⎞
   = ⎜                  ⋮                         ⋱                     ⋮                      ⎟.
     ⎝ 10f1 + 15f4 + 20(f2 + f3) + 25f6 + 30f5   ···   15f2 + 25f5 + 35(f1 + f3) + 40(f4 + f6) ⎠
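Since the average game is linear in the frequencies, the coefficient on f_i in each corner entry must equal the corresponding payoff entry of game i. The following sketch checks this for the three payoff matrices displayed above (games 1, 2 and 6; the matrices of games 3–5 are not reproduced here):

```python
# Consistency check: in G = sum_i f_i * Pi_i, the coefficient of f_i in
# each corner formula must equal entry (r, c) of game i's payoff matrix.
# Only the matrices shown above (games 1, 2 and 6) are checked.

PI = {
    1: [[20, 15, 15], [10, 25, 0], [10, 15, 35]],
    2: [[5, 15, 5], [10, 5, 10], [20, 25, 15]],
    6: [[20, 25, 20], [20, 15, 10], [25, 20, 40]],
}

# Coefficient of f_i in the four displayed corner entries of the average game:
corner_coeff = {
    (0, 0): {1: 20, 2: 5, 3: 15, 4: 25, 5: 15, 6: 20},   # 5f2+15(f3+f5)+20(f1+f6)+25f4
    (0, 2): {1: 15, 2: 5, 3: 15, 4: 20, 5: 15, 6: 20},   # 5f2+15(f1+f3+f5)+20(f4+f6)
    (2, 0): {1: 10, 2: 20, 3: 20, 4: 15, 5: 30, 6: 25},  # 10f1+15f4+20(f2+f3)+25f6+30f5
    (2, 2): {1: 35, 2: 15, 3: 35, 4: 40, 5: 25, 6: 40},  # 15f2+25f5+35(f1+f3)+40(f4+f6)
}

for (r, c), coeffs in corner_coeff.items():
    for game, payoff in PI.items():
        assert coeffs[game] == payoff[r][c], (r, c, game)
print("corner formulas consistent with games 1, 2 and 6")
```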

Hence, equilibria will depend on the frequencies with which the different games occur. (This is the case because both the payoffs of the average game G and the frequencies μi(G) depend on the frequencies with which the games occur.) In the experiment we drew games from a uniform distribution; the realized frequencies, however, were not exactly uniform. If we assume that equilibria are learned rather than played from period 1 onwards, this is not innocuous. Hence, when we describe the set of equilibria we make explicit assumptions and conditions on game frequencies. All equilibria presume that all agents engage in the same mode of categorization and that all hold the same partition. In Appendix D we present some comparative statics predictions across our treatments.


C.1. Best response bundling

Equilibrium with best response bundling. Assume f2 + f5 ≥ 1/3. Then, if f1 + f4 ≥ 5/9, row players choose ai = (A, C, C, A, C, C) in the unique equilibrium and column players choose ai = (a, b, b, a, b, b). If f1 + f4 ≤ 5/9, row players choose ai = (C, C, C, C, C, C) in the unique equilibrium and column players choose ai = (b, b, b, b, b, b).

Proof. b is a dominant strategy for the column player in the average game whenever f2 + f5 ≥ 1/3 and whenever f1 + f4 ≤ (3/2)(f5 + f2) + (f3 + f6). Hence a sufficient condition is f2 + f5 ≥ 1/3 and f1 + f4 ≤ 1/2. If f1 + f4 ≥ (3/2)(f5 + f2) + (f3 + f6), then a is a best response to A. For the row player, if f2 + f5 ≥ 1/3, then whenever f3 + f6 ≤ 2/3, C is a best response to b in the average game. If in addition f1 + f4 ≤ (3/2)(f5 + f2) + (1/2)(f3 + f6), then C is a dominant strategy in the average game. The latter condition implies the condition above for column players, and it is violated (under the other assumptions) whenever f1 + f4 ≥ 5/9, in which case A is a best response to a. □

C.2. Belief bundling

Equilibrium with belief bundling (ABEE). Assume f1 + f4 < 1/2. Then, if f2 + f5 < 1/5, row players choose ai = (B, C, A, B, C, A) and column players choose ai = (a, b, b, a, b, b) in the unique equilibrium with belief bundling (or ABEE) for the coarsest partition. If f2 + f5 > 1/5, then row players choose ai = (B, C, A, B, C, A) and column players choose ai = (b, b, b, b, b, b) in the unique equilibrium.

Proof. For the row player, B is a best response to the average behavior of column players in games 1 and 4 if f1 + f4 < 1/2. C is a dominant strategy for the row player in games 2 and 5. A is a best response in games 3 and 6 whenever f1 + f4 < 1/2. For the column player, a sufficient condition for a to be a best response in games 1 and 4 is f2 + f5 < 1/5; otherwise b is a best response. In games 2 and 5, b is a strictly dominant strategy for the column player. And in games 3 and 6, b is a best response whenever f1 + f4 < 3/4. If the column player always chooses b, then it is easy to see that the row player's best response vector is ai = (B, C, A, B, C, A).

Uniqueness: First note that column players never choose c, since it is strictly dominated in each game. Then, row players never choose C in games 1 and 4. Row players will also never choose B in games 3 and 6, since it is strictly dominated. Row players will choose C in games 3 and 6 only if the column player chooses a with overall frequency exceeding 1/2. Since we assumed f1 + f4 < 1/2, this would require some column players to choose a in games 3 and 6, which they will do only if they expect row players to choose B often enough. But row players will choose C in games 2 and 5 and by assumption also in games 3 and 6. Given this, column players will choose b in games 3 and 6. This leaves only the two equilibria identified above. □

C.3. Action bundling

Equilibrium with action bundling. Assume f1 + f4 < 1/2. Then, if f3 + f6 < 2(f2 + f5), row players choose ai = (C, C, C, C, C, C) and column players choose ai = (b, b, b, b, b, b) in the unique equilibrium with action bundling. Otherwise, row players choose ai = (A, A, A, A, A, A) and column players ai = (b, b, b, b, b, b) in the unique equilibrium with action bundling.


Proof. Let us start with the column player. First note that c is strictly dominated in the average game. Also note that a is never a best response to C. a is a best response to B whenever f2 + f5 < 1/3; otherwise, b is a best response to B. a is a best response to A whenever 10(f1 + f4) > 15(f2 + f5) + 10(f3 + f6); a necessary condition for this is f1 + f4 > 1/2. For the row player, B is never a best response to a. A is a best response to a whenever 10(f1 + f4) > 15(f2 + f5) + 5(f3 + f6), which is implied by the condition that makes a a best response to A. Hence all candidate equilibria (under the assumption that f1 + f4 < 1/2) involve the column player choosing b in all games. B is a best response to b only if f1 + f4 ≥ 2/3. Finally, A is a best response to b whenever f3 + f6 > 2(f2 + f5), and otherwise C is a best response to b. □

Note that given the coarsest category, all modes of categorization predict the same behavior in each pair of strategically equivalent games. We have seen that belief bundling can explain a substantial part of the observed behavior, except for C-choices in games 3 and 6. Best response bundling can explain why participants choose C in games 3 and 6. However, to explain all the data we have to assume, e.g., that some players hold the finest partition (and mix A or B in games 1 and 4 while they choose A in games 3 and 6), while others hold the coarsest partition and choose C in games 3 and 6 (and either A or C in games 1 and 4). Action bundling given the coarsest partition clearly has little explanatory power for our data, since it implies that players choose the same action in all games, which is not what we predominantly observe (only 3 out of our 32 row players in treatment M choose the same action (C) most of the time in all games).

There are some difficulties with the equilibrium approach. Firstly, it assumes that all players engage in the same mode of bundling (either belief, best response or action bundling). Secondly, we cannot really know which partitions players use. We can get some indication via our treatment comparisons (e.g. M* and M** vs. M) and via regressions like those presented in Section 5.2, but we ultimately cannot read the minds of our participants. Hence, in the next section we study the comparative statics predictions of the different modes of bundling across our treatments.

Appendix D. Comparative statics

In this section we restrict ourselves to statements that are true irrespective of the partition (the categories) that agents employ. These comparative statics predictions provide an additional test (in addition to the comparison between treatments M, M + games and M + frequencies) for the different modes of bundling.

D.1. Best response bundling

Proposition 1 (BRB: M* vs. M). For any given distribution of partitions: if all agents are best response bundlers, then we should observe a (weak) increase in C-choices in all games in M* compared to M.

Proof. Since the payoffs of outcomes (A, a), (A, b) and (B, b) are reduced in games 1 and 4, these outcomes also have lower payoffs in the average game corresponding to any category containing game 1 or game 4 (or both). Hence C-choices should weakly increase in M* compared to M according to action bundling or best response bundling. □
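As a concrete illustration, the threshold conditions of Appendix C can be evaluated at uniform game frequencies, f_i = 1/6, the distribution from which games were drawn in the experiment. The sketch below simply checks the stated thresholds using exact rational arithmetic from the Python standard library:

```python
from fractions import Fraction

# Evaluate the Appendix C equilibrium conditions at uniform game
# frequencies f_i = 1/6 (the distribution used in the experiment).
f = {i: Fraction(1, 6) for i in range(1, 7)}
s25 = f[2] + f[5]  # weight on games 2 and 5
s14 = f[1] + f[4]  # weight on games 1 and 4
s36 = f[3] + f[6]  # weight on games 3 and 6

# Best response bundling: given the maintained assumption on f2 + f5,
# f1 + f4 <= 5/9 selects the all-C / all-b equilibrium.
brb_all_C = s14 <= Fraction(5, 9)

# Belief bundling: f1 + f4 < 1/2 and f2 + f5 > 1/5 select
# (B, C, A, B, C, A) for row players and all-b for column players.
bb_BCA = s14 < Fraction(1, 2) and s25 > Fraction(1, 5)

# Action bundling: with f1 + f4 < 1/2, all-C unless f3 + f6 > 2(f2 + f5).
ab_all_C = s14 < Fraction(1, 2) and not (s36 > 2 * s25)

print(brb_all_C, bb_BCA, ab_all_C)  # True True True
```

At (roughly) uniform frequencies, best response bundling and action bundling thus both predict all-C play by row players, while belief bundling predicts (B, C, A, B, C, A).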


Proposition 2 (BRB: M** vs. M). For any given distribution of partitions: if all agents are best response bundlers, then we should observe a (weak) decrease in C-choices and a (weak) increase in B-choices in all games in M** compared to M.

Proof. Since B instead of C is now dominant in games 2 and 5, B has higher payoffs and C lower payoffs in the average game corresponding to any category containing game 2 or game 5 (or both). □

D.2. Belief bundling

Proposition 3 (BB: M* vs. M). For any given distribution of partitions: if all agents are belief bundlers, then we should observe no change in behavior in M* compared to M.

Proof. To see whether belief bundling predicts a change in the behavior of row players, we have to study how their beliefs, i.e. the behavior of column players, are affected. First note that c is still strictly dominated for column players in all games. Hence there is no (rationalizable) belief which supports row players choosing C in games 1 and 4. But then (unless there are more C-choices in other games) the behavior of column players should not change in games 1 and 4. Could there be more C-choices in games 3 and 6? Only if there are more a-choices in some game. But there will only be more a-choices in a game if there are fewer C-choices, yielding a contradiction. Hence there cannot be more C-choices in any game and hence also no change in the behavior of column players. But then belief bundling predicts the same behavior for row players in M* and M.34 □

Proposition 4 (BB: M** vs. M). For any given distribution of partitions: if all agents are belief bundlers, then we should observe a (weak) increase in C-choices in games 3 and 6 and a (weak) increase in A-choices in games 1 and 4 in M** compared to M.

Proof. Column players (if anything) will believe that row players choose B more often. This will not affect their behavior in games 2 and 5, but will make it more likely that they choose a in games 1, 3, 4 and 6. If row players believe that column players choose a more often, they should (if anything) choose C more often in games 3 and 6 and A more often in games 1 and 4. □

34 If we assume a noisier model in which players choose all actions with positive probability, then C-choices of row players should increase in games 1 and 4, and hence also b-choices of column players, under belief bundling. This in turn should lead to fewer C-choices in games 3 and 6.

Appendix E. Instructions for treatments M, M*, M** and M-uniform

Welcome, and thank you for participating in this experiment. Please read these instructions carefully. If you have any questions, please raise your hand. An experimenter will come to you and answer your question. From now on you must not communicate with any other participant in the experiment. You are not allowed to use pencils, nor to use any paper other than these instructions. If you do not conform to these rules, we will have to exclude you from the experiment. Please also switch off your mobile phone now.


You will receive €2.50 just for showing up to the experiment. During the course of the experiment you can earn more money. How much you earn depends on your behavior as well as on the behavior of the other participants. During the experiment all money amounts are calculated in ECU (Experimental Currency Units), which will be converted into Euros at the exchange rate €1 = 130 ECU at the end of the experiment. All your decisions will be treated confidentially.

E.1. The experiment

The experiment consists of 100 rounds. In each round you will play one of several possible games with a randomly selected interaction partner. At the beginning of each round, your interaction partner is randomly determined. This random selection of your interaction partner takes place in each period, irrespective of previous periods. You cannot identify the other participants and hence cannot know whether you have interacted with your current interaction partner before. As a consequence, all of your decisions also remain anonymous to the other participants. Which game you play in any given round is determined randomly. At the beginning of each round you will be informed about which game is currently played and the payoffs associated with this game. In each game you can choose one of three possible actions: A, B or C. Your payment in each round depends on your action and the action your interaction partner has chosen. Your overall payment at the end of the experiment is the sum of the payoffs that you have earned in all rounds, in addition to the show-up fee of €2.50.

E.2. Summary

1. At the beginning of each round your interaction partner is determined randomly. All possible interaction partners have the same probability.
2. A game is randomly selected. We inform you which game was selected and show you the payoff table of the selected game.
3. You choose an action A, B or C.
4. We inform you about the action you and your interaction partner chose and about the payment you have received in this round.

During the experiment you will be shown the payoff table of the game chosen in each round. In the following we explain how to read such a payoff table.

Game ...
                            Your interaction partner chooses
                            a        b        c
You choose      A           1        2        3
                B           4        5        6
                C           7        8        9

In the table, your actions and the corresponding payments are given in red, and the possible actions of your interaction partner are given in blue. Your payment can be read from this table as follows.


• If you choose A and your interaction partner chose a, you will receive 1 ECU (upper left entry).
• If you choose B and your interaction partner chose a, you will receive 4 ECU (middle left entry).
• If you choose C and your interaction partner chose a, you will receive 7 ECU (lower left entry).
• If you choose A and your interaction partner chose b, you will receive 2 ECU (upper middle entry).
• If you choose B and your interaction partner chose b, you will receive 5 ECU (middle entry).
• If you choose C and your interaction partner chose b, you will receive 8 ECU (lower middle entry).
• If you choose A and your interaction partner chose c, you will receive 3 ECU (upper right entry).
• If you choose B and your interaction partner chose c, you will receive 6 ECU (middle right entry).
• If you choose C and your interaction partner chose c, you will receive 9 ECU (lower right entry).

In the upper left corner you can see to which game the table belongs. The payments shown in this table are an example and will not appear in the experiment.

Appendix F. Supplementary material

Supplementary material related to this article can be found online at http://dx.doi.org/10.1016/j.jet.2012.05.011.
