Int J Game Theory (2010) 39:53–69 DOI 10.1007/s00182-009-0193-2

Explicit formulas for repeated games with absorbing states

Rida Laraki

Accepted: 27 October 2009 / Published online: 1 December 2009 © Springer-Verlag 2009

Abstract Explicit formulas are given for the asymptotic value lim_{λ→0} v(λ) and the asymptotic minmax lim_{λ→0} w(λ) of finite λ-discounted absorbing games, together with new simple proofs for the existence of the limits as λ goes to zero. Similar characterizations for stationary Nash equilibrium payoffs are obtained. The results may be extended to absorbing games with compact metric action sets and jointly-continuous payoff functions.

Keywords Absorbing discounted games · Asymptotic analysis · Explicit formulas

1 Introduction

“When people interact, they have usually interacted in the past, and expect to do so again in the future. It is this ongoing element that is studied by the theory of repeated games”.1

I am honored to publish this paper in the special issue in memory of a great game theorist, Michael Maschler. My PhD dissertation was based in large part on his work with Robert Aumann on repeated games with incomplete information.

R. Laraki—Part time associated with Équipe Combinatoire et Optimisation, Université Paris 6.

1 Aumann and Maschler (1995, p. xi).

R. Laraki (B)
CNRS, Economics Department, Ecole Polytechnique, Paris, France
e-mail: [email protected]

R. Laraki
Équipe Combinatoire et Optimisation, Université Paris 6, Paris, France



Aumann and Maschler (1995) introduced repeated games with incomplete information in 1966–1968. Their approach was revolutionary in many respects and generated a large and deep literature. One of the main contributions was the conceptual distinction between the several ways of evaluating the stream of payoffs in long interactions. The finitely repeated game Γ_n has “a definite duration, n, on which the players can base their strategies. Indeed, optimal strategies in Γ_n may be quite different for different n”. On the other hand, in the infinitely repeated game Γ_∞ “the strategies are by definition independent of n. Thus, Γ_∞ reflects properties of the game Γ_n that hold “uniformly” in the duration n […] By using an optimal strategy in Γ_∞ (if there is one), a player guarantees in one fell swoop that in each sufficiently long finite truncation Γ_n, the outcome will not be appreciably worse”.2 If v_n denotes the value of the finitely repeated game Γ_n, lim_{n→∞} v_n “tells the analyst something about repetitions that are “long”, without his having to know how long. But lim v_n is only meaningful as the limit of values of games whose duration is precisely known to the players […] To analyze a situation in which the players themselves know only that the game is “long” […], Γ_∞ is the appropriate model”.3 Let v_∞ denote the uniform value of the repeated game (i.e. the value of Γ_∞, if it exists). In addition to v_∞ and lim v_n, “there are two specific models of “long” repetitions that warrant discussion. The first is the limit of the values v_λ of the discounted game Γ_λ as the discount rate goes to zero […] this is conceptually closer to lim v_n than to v_∞ […] [and] most of the above discussion applies when λ is substituted for n […] On the other hand, discounted games Γ_λ are like Γ_∞ —and unlike Γ_n— in that they have no fixed, commonly known last stage […] [and] it admits [optimal] strategies with some kind of stationarity property”.4

For repeated games with incomplete information on one side, Aumann and Maschler proved that v_∞ exists, so that the asymptotic values lim v_n and lim v_λ exist and are equal to v_∞. More importantly, they provided an explicit formula for the common value (their famous Cav(u) theorem) and showed that v_∞ does not always exist for repeated games with incomplete information on both sides. Mertens and Zamir (1971) proved the existence of the asymptotic values lim v_n = lim v_λ in repeated games with incomplete information on both sides and provided an elegant system of functional equations that characterizes the common limit. Aumann and Maschler extended the existence of v_∞ and its characterization to repeated games with incomplete information on one side, imperfect monitoring and state-dependent signaling (that is, the players do not fully observe the past moves of their opponents but only a signal that may depend, deterministically or stochastically, on the true state and the last moves). To study repeated games with symmetric incomplete information on both sides, imperfect monitoring and state-dependent public signaling, Kohlberg and Zamir (1974) reduced the existence of the uniform value in the deterministic case to the study of an absorbing game Γ*. Combined with the result of Kohlberg (1974), this implies the existence of the uniform value.

2 Aumann and Maschler (1995, p. 131).
3 Aumann and Maschler (1995, p. 132).
4 Aumann and Maschler (1995, p. 139). They used the notations Γ_δ and v_δ for Γ_λ and v_λ, respectively.



Repeated games with absorbing states, in short absorbing games, are stochastic games in which only one state is non-absorbing. Stochastic games are repeated games in which a state variable follows a Markov chain controlled by the actions of the players. Shapley (1953) introduced the two-player zero-sum model with finitely many states and actions (the finite model). He proved the existence of the value v_λ of the λ-discounted game5 by introducing a dynamic programming principle (called the Shapley operator). The idea of the Kohlberg and Zamir (1974) reduction is simple: each time an informative pair of actions is played, the identity of the true state is revealed (i.e. the game is absorbed). “At about this time, it was realized that the games Γ* are particular instances of “stochastic games” in the sense of Shapley […] Motivated by the above application to repeated games, Bewley and Kohlberg managed to prove that lim v_n exists for all stochastic games […] But though they tried hard, and obtained important partial results, they were unable to prove that v_∞ exists for all stochastic games. This difficult problem was finally solved (positively) by Mertens and Neyman”.6

The Kohlberg and Zamir reduction has been extended by Neyman and Sorin (1998) to establish the existence of uniform equilibria in multi-player repeated games with symmetric incomplete information and non-deterministic public signaling. A similar reduction has been used by Abreu et al. (1991) to establish their famous formula characterizing the Pareto-optimal trigger equilibrium payoffs in discounted repeated games with imperfect public monitoring as the discount factor goes to zero: a principal plan is played; if a bad public signal is observed, cooperation stops (i.e. the game is absorbed).

The first and most famous absorbing game example, the big match, was introduced in Gillette (1957). Blackwell and Ferguson (1968) proved that the big match admits a uniform value under the full monitoring assumption (observation of past actions). Without that observation, v_∞ may not exist, as shown in Blackwell and Ferguson (1968) and Coulomb (2001). Using an operator approach, Kohlberg (1974) proved the existence of the uniform (and hence asymptotic) values in any finite absorbing game with full monitoring. The operator approach uses the additional information obtained from the derivative of the Shapley operator at λ = 0 to deduce the existence of lim v_λ and its characterization via variational inequalities. Rosenberg and Sorin (2001) extended the Kohlberg operator approach for the asymptotic values to a large class of stochastic games that includes (1) compact and separately-continuous absorbing games7 and (2) repeated games with incomplete information on both sides (Aumann and Maschler 1995; Mertens and Zamir 1971). Mertens et al. (2009) combined the techniques of Mertens and Neyman (1981) and Rosenberg and Sorin (2001) to show the existence of the uniform value in compact and separately-continuous absorbing games with full monitoring.

An algebraic approach allowed Bewley and Kohlberg (1976a, b) to prove the existence of the asymptotic values lim v_λ = lim v_n in every finite stochastic game.

5 All the results on stochastic games cited in this introduction assume that the state at each stage is publicly known to the players and that the description of the game is common knowledge among the players.
6 Aumann and Maschler (1995, p. 217).
7 Meaning that action sets are compact and metric and payoff and transition functions are separately continuous.


The breakthrough came when Mertens and Neyman (1981) proved the existence of the uniform value v_∞ in every finite stochastic game with full monitoring. To study long interactions, fixed-point theorems are in general not sufficient and more sophisticated methods need to be devised (Aumann and Maschler 1995; Bewley and Kohlberg 1976a; Coulomb 2001; Laraki 2001a, b; Mertens and Neyman 1981; Mertens and Zamir 1971; Rosenberg and Sorin 2001).

Proving the existence of the asymptotic values or of the uniform value is an important theoretical contribution, but finding an explicit formula linking the data of the game to its value (as did Aumann and Maschler (1995) and Mertens and Zamir (1971)) allows numerical computations and enables the study of how changes in the underlying data affect the value of the game. Unfortunately, very few repeated games admit an explicit formula for the asymptotic or uniform values. In the large class of repeated games where lim v_λ = lim v_n = v_∞, it is sufficient to establish the asymptotic formula in the discounted model (because of its stationarity). Inspired by the tools developed in the theory of zero-sum differential games with fixed duration, Laraki (2001a, b) used a variational approach to characterize the asymptotic value of discounted stochastic games in which each player controls a martingale (including the models of Aumann and Maschler (1995) and Mertens and Zamir (1971)).

Following the line of research of Vrieze and Thuijsman (1989) and Flesch et al. (1996), a variational approach gives, for compact and jointly-continuous absorbing games, a new simple proof of the existence of lim v_λ and its characterization as the value of a one-shot game. When the probability of absorption (but not the payoff of absorption) is controlled by only one player (as in the big match of Gillette (1957)), the formula simplifies to the value of an underlying finite game. From Coulomb (2001), one may also deduce an involved formula for lim v_λ. Because his aim was different, he did not identify the associated asymptotic game, and his approach cannot be extended to compact absorbing games. Actually, Coulomb's work uses extensively the algebraic approach of Bewley and Kohlberg (1976a), which is only valid for the finite model.

The minmax w_λ of a multi-player λ-discounted absorbing game is the level at which a team of players can punish another player. Neyman (2003) proved the existence of the uniform minmax w_∞ in finite absorbing games with full monitoring, implying in particular that lim w_n = lim w_λ = w_∞. However, no explicit formula exists in the literature for lim w_λ, and it is not known whether this limit exists in infinite absorbing games. Our tools allow (1) a simple proof of the existence of the asymptotic minmax lim w_λ of any multi-player compact and jointly-continuous absorbing game, and (2) an explicit formula for lim w_λ.

Some of the results may be extended to obtain equations that asymptotic equilibrium payoffs of a multi-player game should satisfy as the discount factor goes to zero. Note that in the non-zero-sum framework, there may be a fundamental incompatibility between equilibrium payoffs E_λ of a discounted absorbing game Γ_λ as λ goes to zero and equilibrium payoffs E_∞ of the infinite game Γ_∞. Sorin (1986), in his famous Paris match, shows that lim E_λ and E_∞ may be disjoint. Thus, our formula for lim E_λ is not necessarily linked to E_∞. However, the formula could be useful for proving the existence of uniform equilibria. Actually, Vrieze and Thuijsman (1989) for 2-player absorbing games and Solan (1999) for 3-player absorbing games constructed uniform equilibria (with threats) using lim E_λ.


In Sects. 2 and 3 the zero-sum game is studied. Section 4 provides formulas for the asymptotic minmax. Section 5 deals with Nash equilibria of a multi-player game. The last section extends the results of the previous sections, established for finite games, to compact and jointly-continuous games.

2 The value

Consider two finite sets I and J, two (payoff) functions f, g from I × J to [−1, 1] and a (probability transition) function p from I × J to [0, 1]. The repeated game with absorbing states is played in discrete time as follows. At stage t = 1, 2, . . . (if the game is not yet absorbed), player I chooses i_t ∈ I and, simultaneously, player J chooses j_t ∈ J: (i) the payoff at stage t is f(i_t, j_t); (ii) with probability 1 − p(i_t, j_t) the game is absorbed and the payoff in all future stages s > t is g(i_t, j_t); and (iii) with probability p(i_t, j_t) the situation is repeated at stage t + 1. If the stream of payoffs is r(t), t = 1, 2, . . . , the λ-discounted payoff of the game is Σ_{t=1}^∞ λ(1 − λ)^{t−1} r(t). Player I maximizes the expected discounted payoff and player J minimizes it.

In the absorbing game described above, the game is over after absorption. One may define a general repeated game with absorbing states where, after absorption, a zero-sum repeated game in which the state never changes is reached. To play optimally in the discounted game after absorption, players should know the absorbing state that has been reached. A general absorbing game may be reduced to our more restrictive model by assuming that the absorbing payoff of an absorbing state is the value of the associated zero-sum game.

Players are allowed to use behavioral strategies. If the game is not absorbed at stage t, player I may choose his action i(t) at random according to some probability distribution8 x(t) ∈ Δ(I). Similarly, player J chooses his action j(t) at random according to some probability distribution9 y(t) ∈ Δ(J). If a player obtains some information during the game, his behavioral strategy may depend, at each stage, on all his past information up to this stage. As will be seen, Shapley (1953) proved that using such information is not necessary to play optimally in a discounted stochastic game: only the knowledge of the current state matters. Hence, in a discounted stochastic game, one needs only to assume that the states that the game visits are publicly known to the players, and that the description of the game is common knowledge.

Denote by M_+(I) = {α = (α^i)_{i∈I} : α^i ∈ [0, +∞)} the set of positive measures on I (the I-dimensional positive orthant). For any i and j, let p*(i, j) = 1 − p(i, j) and f*(i, j) = [1 − p(i, j)] × g(i, j). For any (α, j) ∈ M_+(I) × J and ϕ : I × J → [−1, 1], ϕ is extended linearly as follows: ϕ(α, j) = Σ_{i∈I} α^i ϕ(i, j). Note that Δ(I) ⊂ M_+(I).



8 Δ(I) = {(x^i)_{i∈I} : x^i ∈ [0, 1], Σ_{i∈I} x^i = 1} is the set of probabilities over I.



9 Δ(J) = {(y^j)_{j∈J} : y^j ∈ [0, 1], Σ_{j∈J} y^j = 1} is the set of probabilities over J.
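To make these definitions concrete, the finite data (f, g, p) can be stored as matrices and the discounted payoff of a stationary profile evaluated from the one-step recursion r = λ f(x, y) + (1 − λ)[p(x, y) r + f*(x, y)]. The following is a minimal sketch (the array layout and helper names are mine, not the paper's); it anticipates the closed form derived in Lemma 2 below.

```python
import numpy as np

# Quitting game of this section, action order (C, Q) for both players.
# p(i,j) = probability of NOT being absorbed; g = absorbing payoff; f = stage payoff.
# Stage payoffs at absorbing entries are set equal to the absorbing payoffs,
# a convention consistent with the r_lambda recursion in the example below.
f = np.array([[0.0, 1.0], [1.0, 0.0]])
g = np.array([[0.0, 1.0], [1.0, 0.0]])
p = np.array([[1.0, 0.0], [0.0, 0.0]])

p_star = 1.0 - p          # absorption probabilities p*(i, j)
f_star = p_star * g       # absorbing part f*(i, j) = p*(i, j) g(i, j)

def stationary_payoff(x, y, lam):
    """lam-discounted payoff of the stationary profile (x, y).

    Solves r = lam*f(x,y) + (1-lam)*(p(x,y)*r + f*(x,y)), i.e.
    r = (lam*f(x,y) + (1-lam)*f*(x,y)) / (lam*p(x,y) + p*(x,y)).
    """
    fxy, pxy = x @ f @ y, x @ p @ y
    fsxy, psxy = x @ f_star @ y, x @ p_star @ y
    return (lam * fxy + (1 - lam) * fsxy) / (lam * pxy + psxy)

# Both players continue with probability 0.9 each period, lam = 0.1:
print(stationary_payoff(np.array([0.9, 0.1]), np.array([0.9, 0.1]), 0.1))
```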


Lemma 1 (Shapley 1953) The λ-discounted game Γ_λ has a value, v(λ). It is the unique real in [−1, 1] satisfying

v(λ) = max_{x∈Δ(I)} min_{j∈J} [ λ f(x, j) + (1 − λ) p(x, j) v(λ) + (1 − λ) f*(x, j) ].    (1)
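Equation (1) exhibits v(λ) as the fixed point of an operator which, as recalled in Sect. 6, is a (1 − λ)-contraction, so it can be computed by straightforward fixed-point iteration, solving one matrix game per step. Below is a minimal sketch under the conventions of the previous snippet; matrix_game_value solves the standard LP formulation of a zero-sum matrix game with scipy (the helper names are mine, not part of any established API).

```python
import numpy as np
from scipy.optimize import linprog

def matrix_game_value(A):
    """Value of the zero-sum matrix game A (row player maximizes), via LP:
    maximize v subject to sum_i x_i A[i, j] >= v for all j, x in the simplex."""
    m, n = A.shape
    c = np.zeros(m + 1); c[-1] = -1.0                      # variables (x, v); minimize -v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])              # v - (x A)_j <= 0 for each column j
    b_ub = np.zeros(n)
    A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])  # sum_i x_i = 1
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]
    return linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds).x[-1]

def discounted_value(f, g, p, lam, tol=1e-10):
    """Iterate v -> val[ lam*f + (1-lam)*(p*v + f_star) ]; a (1-lam)-contraction."""
    f_star = (1.0 - p) * g
    v = 0.0
    while True:
        v_new = matrix_game_value(lam * f + (1 - lam) * (p * v + f_star))
        if abs(v_new - v) < tol:
            return v_new
        v = v_new

# For the quitting game encoded earlier, discounted_value(f, g, p, 0.01)
# should agree with the closed form (1 - sqrt(0.01)) / (1 - 0.01) ≈ 0.909 derived below.
```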

The asymptotic value, v, is the limit of the discounted values v(λ) as λ goes to zero. The existence of such a limit is already known from Kohlberg (1974). The first main result provides a new proof of its existence and its characterization as the value of a one-shot game simply related to the data of the game. From Eq. (1) and a martingale argument it may be deduced that player I has an optimal stationary strategy (that is, he plays the same mixed action x at each period). This implies in particular that the lemma holds even if the players have no memory or do not observe past actions. Note that those properties are valid in every discounted stochastic game (Shapley 1953) as soon as the states that the game visits are publicly known to the players.

Example: a quitting game

Here I = J = {C, Q}. The game is absorbed with probability 1 if one of the players chooses Q (for Quitting) and continues with probability 1 if both players choose C (for Continue); that is, p(C, C) = 1 and p(Q, C) = p(C, Q) = p(Q, Q) = 0. There are two absorbing payoffs, 1 and 0 (marked with a ∗). The absorbing payoff 1 = g(C, Q) = g(Q, C) is achieved at some period if (C, Q) or (Q, C) is played. The absorbing payoff 0 = g(Q, Q) is achieved if (Q, Q) is played. The game is non-absorbed with probability 1 if both players decide to continue and play (C, C). In that case, the stage payoff is 0 = f(C, C).

Consider the stationary strategy profile in which player I plays (x C, (1 − x) Q) at each period and player J plays (y C, (1 − y) Q), where x and y in [0, 1] are the stationary probabilities of playing C for players I and J, respectively. The corresponding discounted payoff r_λ(x, y) satisfies

r_λ(x, y) = x y (λ × 0 + (1 − λ) r_λ(x, y)) + ((1 − x) y + (1 − y) x),

so that

r_λ(x, y) = (x + y − 2 x y) / (1 − x y (1 − λ)).

Hence, the value v_λ ∈ [0, 1] satisfies

v_λ = max_{x∈[0,1]} min_{y∈[0,1]} (x + y − 2 x y) / (1 − x y (1 − λ))
    = min_{y∈[0,1]} max_{x∈[0,1]} (x + y − 2 x y) / (1 − x y (1 − λ)),

and it may be checked that

v_λ = x_λ = y_λ = (1 − √λ) / (1 − λ) → 1 as λ → 0.
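This closed form is easy to sanity-check numerically; the following is a minimal sketch (a grid max-min is only an approximation of the exact value):

```python
import numpy as np

def r(x, y, lam):
    # Discounted payoff of the stationary profile (x C + (1-x) Q, y C + (1-y) Q).
    return (x + y - 2 * x * y) / (1 - x * y * (1 - lam))

def grid_maxmin(lam, steps=2001):
    t = np.linspace(0.0, 1.0, steps)
    payoff = r(t[:, None], t[None, :], lam)   # payoff[i, j] = r(x_i, y_j, lam)
    return payoff.min(axis=1).max()           # max over x of min over y

for lam in (0.25, 0.1, 0.01):
    print(lam, grid_maxmin(lam), (1 - np.sqrt(lam)) / (1 - lam))
# The two columns agree up to the grid resolution, and both tend to 1 as lam -> 0.
```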


The value and the optimal strategies are not rational fractions of λ (but admit a Puiseux series expansion in powers of λ). Bewley and Kohlberg (1976a) show, using algebraic tools, that this property holds for all finite stochastic games and deduce from it the existence of lim v(λ).

Lemma 2 (Vrieze and Thuijsman 1989) v(λ) satisfies

v(λ) = max_{x∈Δ(I)} min_{j∈J} [ λ f(x, j) + (1 − λ) f*(x, j) ] / [ λ p(x, j) + p*(x, j) ].

Proof If, in the λ-discounted game, player I plays the stationary strategy x and player J plays a pure stationary strategy j ∈ J, the λ-discounted reward r(λ, x, j) satisfies

r(λ, x, j) = λ f(x, j) + (1 − λ) p(x, j) r(λ, x, j) + (1 − λ) f*(x, j).

Since 1 − (1 − λ) p(x, j) = 1 − p(x, j) + λ p(x, j) = λ p(x, j) + p*(x, j),

r(λ, x, j) = [ λ f(x, j) + (1 − λ) f*(x, j) ] / [ λ p(x, j) + p*(x, j) ].

The maximizer has a stationary optimal strategy and the minimizer has a pure stationary best reply: this proves the lemma. □

In the following, α ⊥ x means that for every i ∈ I, x^i > 0 ⇒ α^i = 0. Letting the discount factor tend to zero in Vrieze and Thuijsman's (1989) formula yields:

Theorem 3 As λ goes to zero, v(λ) converges to

v = max_{x∈Δ(I)} sup_{α∈M_+(I): α⊥x} min_{j∈J} [ (f*(x, j)/p*(x, j)) 1{p*(x,j)>0} + ((f(x, j) + f*(α, j))/(p(x, j) + p*(α, j))) 1{p*(x,j)=0} ].

The intuitive meaning of this formula is simple and is closely related to similar ideas in Coulomb (2001), Flesch et al. (1996), Vrieze and Thuijsman (1989) and others: x is the limit of the discounted optimal strategies x(λ) as λ → 0, and α is related to the second-order term (x(λ) − x)/λ. The max on x is achieved; the sup on α may not be attainable.

Proof Let w = lim_{n→∞} v(λ_n) be an accumulation point of v(λ).

Step 1: Consider an optimal stationary strategy x(λ_n) for player I and go to the limit using Shapley's dynamic programming principle. From Vrieze and Thuijsman's (1989) formula, there exists x(λ_n) ∈ Δ(I) such that for every j ∈ J,

v(λ_n) ≤ [ λ_n f(x(λ_n), j) + (1 − λ_n) f*(x(λ_n), j) ] / [ λ_n p(x(λ_n), j) + p*(x(λ_n), j) ].    (2)


By compactness of Δ(I), it may be supposed that x(λ_n) → x.

Case 1: p*(x, j) > 0. Letting λ_n go to zero implies w ≤ f*(x, j)/p*(x, j).

Case 2: p*(x, j) = Σ_{i∈I} x^i p*(i, j) = 0. Thus Σ_{i∈S(x)} p*(i, j) = 0, where S(x) = {i ∈ I : x^i > 0} is the support of x. Let α(λ_n) = ( (x^i(λ_n)/λ_n) 1{x^i=0} )_{i∈I} ∈ M_+(I), so that α(λ_n) ⊥ x. Consequently,

Σ_{i∈I} (x^i(λ_n)/λ_n) p*(i, j) = Σ_{i∉S(x)} (x^i(λ_n)/λ_n) p*(i, j) = Σ_{i∈I} α^i(λ_n) p*(i, j) = p*(α(λ_n), j),

and

Σ_{i∈I} (x^i(λ_n)/λ_n) f*(i, j) = Σ_{i∈I} α^i(λ_n) f*(i, j) = f*(α(λ_n), j),

so, from Eq. (2), and because p(x, j) = 1,

w ≤ lim inf_{n→∞} [ f(x, j) + (1 − λ_n) f*(α(λ_n), j) ] / [ p(x, j) + p*(α(λ_n), j) ].    (3)

Since J is finite, for any ε > 0 there is N(ε) such that, for every j ∈ J,

w ≤ [ f(x, j) + f*(α(λ_{N(ε)}), j) ] / [ p(x, j) + p*(α(λ_{N(ε)}), j) ] + ε.

Consequently, w ≤ v.

Step 2: Construct a strategy for player I proportional to x + λ_n α that guarantees v − ε in the λ_n-discounted game as λ_n → 0. Let (α_ε, x_ε) ∈ M_+(I) × Δ(I) be ε-optimal for the maximizer in the formula of v. For λ_n small enough, let x_ε(λ_n) be proportional to x_ε + λ_n α_ε (that is, x_ε(λ_n) = µ_n (x_ε + λ_n α_ε) for some µ_n > 0). Let r(λ_n) be the unique real in the interval [−1, 1] that satisfies

r(λ_n) = min_{j∈J} [ λ_n f(x_ε(λ_n), j) + (1 − λ_n) p(x_ε(λ_n), j) r(λ_n) + (1 − λ_n) f*(x_ε(λ_n), j) ].    (4)

By the linearity of f, p, f* and p* in x,

r(λ_n) = min_j [ λ_n f(x_ε + λ_n α_ε, j) + (1 − λ_n) f*(x_ε + λ_n α_ε, j) ] / [ λ_n p(x_ε + λ_n α_ε, j) + p*(x_ε + λ_n α_ε, j) ]
       = min_j [ λ_n f(x_ε, j) + λ_n² f(α_ε, j) + (1 − λ_n) f*(x_ε, j) + (1 − λ_n) λ_n f*(α_ε, j) ] / [ λ_n p(x_ε, j) + λ_n² p(α_ε, j) + p*(x_ε, j) + λ_n p*(α_ε, j) ].

Also, v(λ_n) ≥ r(λ_n), since r(λ_n) is the payoff of player I when he plays the stationary strategy x_ε(λ_n). Let j_{λ_n} ∈ J be an optimal stationary pure best response of player J against x_ε(λ_n) (an element of the arg min in (4)). Since J is finite and r(λ_n) is bounded, one can


switch to a subsequence and suppose that j_{λ_n} is constant (= j) and that r(λ_n) → r. If p*(x_ε, j) > 0 then r = f*(x_ε, j)/p*(x_ε, j). If p*(x_ε, j) = 0, clearly r = (f(x_ε, j) + f*(α_ε, j))/(p(x_ε, j) + p*(α_ε, j)). Consequently, w = lim v(λ_n) ≥ v − ε. □

Step 2 of the above proof shows that for each ε > 0, there are x_ε and α_ε ⊥ x_ε such that player I admits a 2ε-optimal strategy in the λ-discounted game proportional to x_ε + λ α_ε for all λ small enough (so that |v_λ − v| < ε). The quitting game example shows that a 0-optimal strategy of the λ-discounted game is not always of that form. This property permits the identification of the asymptotic value with the value of what may be called the asymptotic game (following a terminology initiated in Sorin (2002)), which we now define.

For any (α, β) ∈ M_+(I) × M_+(J) and ϕ : I × J → [−1, 1], ϕ is extended bilinearly by ϕ(α, β) = Σ_{i∈I, j∈J} α^i β^j ϕ(i, j). For player I, the strategy set in the asymptotic game is Θ(I) = {(x, α) ∈ Δ(I) × M_+(I) : α ⊥ x} and, similarly, for player J, Θ(J) = {(y, β) ∈ Δ(J) × M_+(J) : β ⊥ y}. The payoff function of the asymptotic game is

A(x, α, y, β) := (f*(x, y)/p*(x, y)) 1{p*(x,y)>0} + ((f(x, y) + f*(α, y) + f*(x, β))/(p(x, y) + p*(α, y) + p*(x, β))) 1{p*(x,y)=0}.
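For the quitting game, this payoff makes the non-attainment of the sup in Theorem 3 concrete. Take x = (1, 0) (pure C) and put second-order mass a on Q, i.e. α = (0, a). Against y = C the non-absorbing branch gives a/(1 + a) < 1, against y = Q the absorbing branch gives 1, so the worst case is a/(1 + a), and letting a → ∞ recovers v = 1 without the sup being attained. A small sketch of this computation (the function name is mine):

```python
# Asymptotic-game payoff A for the quitting game, with player I playing
# x = (1, 0) (always C) and alpha = (0, a) (second-order mass a on Q).
# Player J's choices are reduced here to y in {C, Q} and beta = (0, b).
def A_quitting(a, y_is_C, b=0.0):
    if y_is_C:  # p*(x, y) = 0: non-absorbing branch of the formula
        # (f(C,C) + f*(alpha, C) + f*(x, beta)) / (p(C,C) + p*(alpha, C) + p*(x, beta))
        return (0.0 + a * 1.0 + b * 1.0) / (1.0 + a * 1.0 + b * 1.0)
    # y = Q: p*(x, Q) = 1 > 0, absorbing branch f*(x, Q) / p*(x, Q) = 1
    return 1.0

for a in (1.0, 10.0, 100.0, 1000.0):
    worst = min(A_quitting(a, True), A_quitting(a, False))
    print(a, worst)   # a/(1+a): approaches v = 1, never reaching it
```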

The first two formulas of the following corollary state that the asymptotic game has a value, and that it coincides with the asymptotic value of the absorbing game.

Corollary 4 v satisfies the following equations:

v = sup_{(x,α)∈Θ(I)} inf_{(y,β)∈Θ(J)} A(x, α, y, β)
  = inf_{(y,β)∈Θ(J)} sup_{(x,α)∈Θ(I)} A(x, α, y, β)
  = sup_{(x,α)∈Θ(I)} inf_{y∈Δ(J)} [ (f*(x, y)/p*(x, y)) 1{p*(x,y)>0} + ((f(x, y) + f*(α, y))/(p(x, y) + p*(α, y))) 1{p*(x,y)=0} ].

Proof Consider an ε-optimal strategy x_ε(λ) proportional to x_ε + λ α_ε in the λ-discounted game. Against any strategy of player J proportional to y(λ) = y + λβ, this yields

v(λ) − ε ≤ [ λ f(x_ε + λα_ε, y + λβ) + (1 − λ) f*(x_ε + λα_ε, y + λβ) ] / [ λ p(x_ε + λα_ε, y + λβ) + p*(x_ε + λα_ε, y + λβ) ].

If p*(x_ε, y) > 0, this implies v = lim v(λ) ≤ f*(x_ε, y)/p*(x_ε, y). If p*(x_ε, y) = 0 then f*(x_ε, y) = 0. Using the multi-linearity of f, f*, p and p* and dividing by λ,

v(λ) − ε ≤ [ f(x_ε + λα_ε, y + λβ) + (1 − λ) f*(α_ε, y) + (1 − λ) f*(x_ε, β) + (1 − λ) λ f*(α_ε, β) ] / [ p(x_ε + λα_ε, y + λβ) + p*(α_ε, y) + p*(x_ε, β) + λ p*(α_ε, β) ].


Going to the limit,

v ≤ [ f(x_ε, y) + f*(α_ε, y) + f*(x_ε, β) ] / [ p(x_ε, y) + p*(α_ε, y) + p*(x_ε, β) ],

which holds for all (y, β). Thus,

v ≤ sup_{(x,α)∈Θ(I)} inf_{(y,β)∈Θ(J)} [ (f*(x, y)/p*(x, y)) 1{p*(x,y)>0} + ((f(x, y) + f*(α, y) + f*(x, β))/(p(x, y) + p*(α, y) + p*(x, β))) 1{p*(x,y)=0} ].

The other inequality is obtained similarly. Since the inf sup is always at least the sup inf, the first two equalities follow. Taking β = 0 in the last inequality implies

v ≤ sup_{(x,α)∈Θ(I)} inf_{y∈Δ(J)} [ (f*(x, y)/p*(x, y)) 1{p*(x,y)>0} + ((f(x, y) + f*(α, y))/(p(x, y) + p*(α, y))) 1{p*(x,y)=0} ],

and from the formula of v in Theorem 3 one obtains the last equality of the corollary. □

3 Absorption controlled by one player

Consider the following zero-sum absorbing game (the big match) introduced by Gillette (1957).

Here, I = {T, B} and J = {L, R}, with payoffs

       L    R
  T   1∗   0∗
  B   0    1

If player I plays Top, the game is absorbed with probability 1, and if he plays Bottom, the game continues with probability 1. Absorbing payoffs are marked with a ∗, as in the quitting-game example. It is easy to show that v(λ) = 1/2 and that the unique optimal strategy of player I is to play T with probability λ/(1 + λ). Consequently, v = 1/2, which also happens to be the value of the underlying one-shot game

  1  0
  0  1.

On the other hand, the asymptotic value of the quitting game is 1, which is not the value of the underlying one-shot game

  0  1
  1  0.

A natural question arises: what are the absorbing games for which v is the value of an underlying one-shot game? A game is partially controlled by player I if the transition function p(i, j) depends only on i (but not the payoff functions).

Proposition 5 If a zero-sum absorbing game is partially controlled by player I, the asymptotic value equals the value u of the underlying one-shot game, defined by

v = u = max_{x∈Δ(I)} min_{j∈J} [ Σ_{i∉I*} x^i f(i, j) + Σ_{i∈I*} x^i g(i, j) ],

where I* = {i : p*(i) > 0} is the set of absorbing actions of player I.
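Before the proof, a numerical illustration (a sketch reusing the hypothetical helpers of Sect. 2): for the big match, the stationary strategy (λ/(1+λ) T, 1/(1+λ) B) equalizes both columns of Lemma 2's reward at 1/2, for every λ.

```python
import numpy as np

# Big match, action order (T, B) x (L, R); p = probability of NOT absorbing.
f = np.array([[1.0, 0.0], [0.0, 1.0]])
g = np.array([[1.0, 0.0], [0.0, 1.0]])
p = np.array([[0.0, 0.0], [1.0, 1.0]])
p_star, f_star = 1.0 - p, (1.0 - p) * g

def reward(x, j, lam):
    # Lemma 2: r(lam, x, j) = (lam f(x,j) + (1-lam) f*(x,j)) / (lam p(x,j) + p*(x,j))
    return (lam * x @ f[:, j] + (1 - lam) * x @ f_star[:, j]) / \
           (lam * x @ p[:, j] + x @ p_star[:, j])

for lam in (0.5, 0.1, 0.01):
    x = np.array([lam / (1 + lam), 1 / (1 + lam)])
    print(lam, reward(x, 0, lam), reward(x, 1, lam))   # both columns give 1/2

# The one-shot game [[1,0],[0,1]] also has value 1/2, e.g. via the
# matrix_game_value sketch of Sect. 2, in agreement with Proposition 5.
```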

Coulomb (2001) proved a similar result for big match games.

Proof Step 1: v ≤ u. Let x_ε ∈ Δ(I) and α_ε ⊥ x_ε, α_ε ∈ M_+(I), be ε-optimal in the formula of v. If p*(x_ε) > 0 then

v − ε ≤ min_{j∈J} f*(x_ε, j)/p*(x_ε) = min_{j∈J} Σ_{i∈I} (x_ε^i p*(i)/p*(x_ε)) g(i, j)
      = min_{j∈J} Σ_{i∈I*} (x_ε^i p*(i)/p*(x_ε)) g(i, j) ≤ max_{z∈Δ(I*)} min_{j∈J} Σ_{i∈I*} z^i g(i, j) ≤ u.

If p*(x_ε) = 0 then x_ε^i = 0 for i ∈ I* and p*(i) = 0 when i ∉ I*, so that

v − ε ≤ min_{j∈J} [ f(x_ε, j) + f*(α_ε, j) ] / [ p(x_ε) + p*(α_ε) ]
      = min_{j∈J} [ Σ_{i∉I*} (x_ε^i/(p(x_ε) + p*(α_ε))) f(i, j) + Σ_{i∈I*} (α_ε^i p*(i)/(p(x_ε) + p*(α_ε))) g(i, j) ].

But when i ∉ I*, p(i) = 1, thus

v − ε ≤ min_{j∈J} [ Σ_{i∉I*} (x_ε^i p(i)/(p(x_ε) + p*(α_ε))) f(i, j) + Σ_{i∈I*} (α_ε^i p*(i)/(p(x_ε) + p*(α_ε))) g(i, j) ] ≤ u.

Step 2: v ≥ u. Let x_0 be optimal for player I in the one-shot matrix game associated with u. Define (x_1, α_1) as follows. If p*(x_0) = 0, let (x_1, α_1) = (x_0, 0); this clearly implies that v ≥ u. If p*(x_0) > 0, then for all i ∈ I* let x_1^i = 0 (so that x_1 is non-absorbing) and for i ∉ I* let α_1^i = 0. This will imply that

v ≥ min_{j∈J} [ f(x_1, j) + f*(α_1, j) ] / [ p(x_1) + p*(α_1) ]
  = min_{j∈J} [ Σ_{i∉I*} (x_1^i/(p(x_1) + p*(α_1))) f(i, j) + Σ_{i∈I*} (α_1^i p*(i)/(p(x_1) + p*(α_1))) g(i, j) ]
  = min_{j∈J} [ Σ_{i∉I*} (x_1^i p(i)/(p(x_1) + p*(α_1))) f(i, j) + Σ_{i∈I*} (α_1^i p*(i)/(p(x_1) + p*(α_1))) g(i, j) ].

Complete the definition of (x_1, α_1) as follows: for i ∉ I*, let x_0^i = x_1^i p(i)/(p(x_1) + p*(α_1)) (x_1 is proportional to x_0 on I \ I*), and for i ∈ I*, let x_0^i = α_1^i p*(i)/(p(x_1) + p*(α_1)) (α_1 is proportional to x_0 on I*). Consequently,

v ≥ min_{j∈J} [ Σ_{i∉I*} x_0^i f(i, j) + Σ_{i∈I*} x_0^i g(i, j) ] = u. □

4 The minmax

A team of N players (named I) plays against one player (J). Assume the finiteness of all the action sets: each player k in team I has a finite set of actions I_k, and player J has a finite set of actions J. Let I = I_1 × · · · × I_N, let f, g map I × J to [−1, 1], and let p : I × J → [0, 1]. The game is played as above, except that at each period the players in team I randomize independently (they are not allowed to correlate their random moves). Let Δ = Δ(I_1) × · · · × Δ(I_N), p*(·) = 1 − p(·), f*(·) = p*(·) × g(·) and M_+ = M_+(I_1) × · · · × M_+(I_N). For x ∈ Δ, j ∈ J, k = 1, . . . , N and α ∈ M_+, a function ϕ : I × J → [−1, 1] is extended multi-linearly as follows:

ϕ(x, j) = Σ_{i=(i_1,...,i_N)∈I} x_1^{i_1} × · · · × x_N^{i_N} ϕ(i, j),

ϕ(α_k, x_{−k}, j) = Σ_{i=(i_1,...,i_N)∈I} x_1^{i_1} × · · · × x_{k−1}^{i_{k−1}} × α_k^{i_k} × x_{k+1}^{i_{k+1}} × · · · × x_N^{i_N} ϕ(i, j).

Let w(λ) be the maximum payoff that team I can guarantee against player J. From Bewley and Kohlberg (1976a) and Neyman (2003) one can deduce the existence of lim w(λ). However, no explicit formula exists.

Theorem 6

w(λ) = max_{x∈Δ} min_{j∈J} [ λ f(x, j) + (1 − λ) f*(x, j) ] / [ λ p(x, j) + p*(x, j) ]

and, as λ → 0, w(λ) converges to

w = max_{x∈Δ} sup_{α∈M_+ : ∀k, α_k⊥x_k} min_{j∈J} [ (f*(x, j)/p*(x, j)) 1{p*(x,j)>0} + ((f(x, j) + Σ_{k=1}^N f*(α_k, x_{−k}, j))/(p(x, j) + Σ_{k=1}^N p*(α_k, x_{−k}, j))) 1{p*(x,j)=0} ]

  = max_{x∈Δ} sup_{α∈M_+ : ∀k, α_k⊥x_k} min_{y∈Δ(J)} [ (f*(x, y)/p*(x, y)) 1{p*(x,y)>0} + ((f(x, y) + Σ_{k=1}^N f*(α_k, x_{−k}, y))/(p(x, y) + Σ_{k=1}^N p*(α_k, x_{−k}, y))) 1{p*(x,y)=0} ].

Proof For the first formula, follow the ideas of the proof of Theorem 3 and Corollary 4. Let v = lim_{n→∞} w(λ_n), where λ_n → 0, be an accumulation point of w(λ).

Modifications in step 1 of Theorem 3: let x(λ_n) → x be such that for every j ∈ J,

w(λ_n) ≤ [ λ_n f(x(λ_n), j) + (1 − λ_n) f*(x(λ_n), j) ] / [ λ_n p(x(λ_n), j) + p*(x(λ_n), j) ].

Let y(λ_n) = x(λ_n) − x → 0, so that

p*(x(λ_n), j) = Σ_{i=(i_1,...,i_N)∈I} x_1^{i_1}(λ_n) × · · · × x_N^{i_N}(λ_n) p*(i, j)
             = Σ_{i∈I} (y_1^{i_1}(λ_n) + x_1^{i_1}) × · · · × (y_N^{i_N}(λ_n) + x_N^{i_N}) p*(i, j)
             = p*(x, j) + Σ_{k=1}^N p*(y_k(λ_n), x_{−k}, j) + o( Σ_{k=1}^N p*(y_k(λ_n), x_{−k}, j) ).

If p*(x, j) > 0 then v ≤ f*(x, j)/p*(x, j). If p*(x, j) = 0 and if α_k(λ_n) = ( (x_k^{i_k}(λ_n)/λ_n) 1{x_k^{i_k}=0} )_{i_k∈I_k} ∈ M_+(I_k), then α_k(λ_n) ⊥ x_k and

p*(x(λ_n), j)/λ_n = Σ_{k=1}^N p*(α_k(λ_n), x_{−k}, j) + o( Σ_{k=1}^N p*(α_k(λ_n), x_{−k}, j) ),

and the same is true for f*, so that

v ≤ lim inf_{n→∞} [ f(x, j) + Σ_{k=1}^N f*(α_k(λ_n), x_{−k}, j) ] / [ p(x, j) + Σ_{k=1}^N p*(α_k(λ_n), x_{−k}, j) ],

which implies that v ≤ w. Modifications in step 2 of Theorem 3: take (α^ε, x^ε) to be ε-optimal for the maximizer in the formula of w, define x_k^ε(λ_n) to be proportional to x_k^ε + λ_n α_k^ε for every k, and then use the Taylor expansion above and step 2 of Theorem 3 to deduce that v ≥ w − ε.

For the second formula, follow Corollary 4. For each ε > 0, the modification of step 2 just above implies that the players in team I have an ε-optimal strategy (x_k^ε(λ))_{k} in the λ-discounted game, where x_k^ε(λ) is proportional to x_k^ε + λ α_k^ε, for all λ small enough. This implies that for any y ∈ Δ(J),

w(λ) − ε ≤ [ λ f(x^ε(λ), y) + (1 − λ) f*(x^ε(λ), y) ] / [ λ p(x^ε(λ), y) + p*(x^ε(λ), y) ],

where the right-hand side is a fractional function of λ. Consequently, it admits a limit, which may be computed as in step 1 (using the multi-linearity of payoffs and transitions). This implies that

w ≤ max_{x∈Δ} sup_{α∈M_+ : ∀k, α_k⊥x_k} min_{y∈Δ(J)} [ (f*(x, y)/p*(x, y)) 1{p*(x,y)>0} + ((f(x, y) + Σ_{k=1}^N f*(α_k, x_{−k}, y))/(p(x, y) + Σ_{k=1}^N p*(α_k, x_{−k}, y))) 1{p*(x,y)=0} ].

The first formula for w and the fact that J ⊂ Δ(J) imply the other inequality. □

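A brute-force illustration of the first formula of Theorem 6 on a toy team game (all data below are hypothetical, chosen only to exercise the independent-mixing constraint): two team members each choose in {C, Q}, absorption occurs unless both continue, J has two actions, and w(λ) is evaluated on a grid of independent mixed actions.

```python
import itertools
import numpy as np

# Hypothetical 2-vs-1 team game; indices (i1, i2, j), actions 0 = C, 1 = Q.
f = np.zeros((2, 2, 2))
g = np.array([[[0.5, -0.5], [1.0, 0.0]],
              [[1.0, 0.0], [0.0, 1.0]]])
p = np.zeros((2, 2, 2)); p[0, 0, :] = 1.0      # non-absorbing only at (C, C)
p_star, f_star = 1.0 - p, (1.0 - p) * g

def w_lambda(lam, steps=101):
    best = -np.inf
    grid = np.linspace(0.0, 1.0, steps)
    for a, b in itertools.product(grid, grid):  # independent mixes of the two team members
        x1, x2 = np.array([a, 1 - a]), np.array([b, 1 - b])
        worst = np.inf
        for j in range(2):
            fx, px = x1 @ f[:, :, j] @ x2, x1 @ p[:, :, j] @ x2
            fsx, psx = x1 @ f_star[:, :, j] @ x2, x1 @ p_star[:, :, j] @ x2
            worst = min(worst, (lam * fx + (1 - lam) * fsx) / (lam * px + psx))
        best = max(best, worst)
    return best

for lam in (0.5, 0.1, 0.01, 0.001):
    print(lam, w_lambda(lam))   # the values stabilize as lam -> 0 (Theorem 6)
```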

5 Stationary Nash equilibria

Consider an N-player absorbing game where each player k ∈ {1, . . . , N} has a finite set of actions I_k. Define the payoff functions f_k : I → [−1, 1] and g_k : I → [−1, 1], k ∈ {1, . . . , N}, and a probability transition p : I → [0, 1], where I = I_1 × · · · × I_N. The game is played as above except that, if at stage t player k = 1, . . . , N chooses the action i_k^t ∈ I_k, then player k receives f_k(i_1^t, . . . , i_N^t), and if the game is absorbed he receives g_k(i_1^t, . . . , i_N^t). From Fink (1964), the λ-discounted game admits a stationary Nash equilibrium. A computation as in Vrieze and Thuijsman's formula above may be used to establish that x(λ) ∈ Δ, with corresponding payoff u(λ) = (u_1(λ), . . . , u_N(λ)) ∈ R^N, is a stationary equilibrium iff, for every player k, Fink's equations are satisfied:

x_k(λ) ∈ arg max_{x_k∈Δ(I_k)} [ λ f_k(x_k, x_{−k}(λ)) + (1 − λ) f_k*(x_k, x_{−k}(λ)) ] / [ λ p(x_k, x_{−k}(λ)) + p*(x_k, x_{−k}(λ)) ],

u_k(λ) = max_{x_k∈Δ(I_k)} [ λ f_k(x_k, x_{−k}(λ)) + (1 − λ) f_k*(x_k, x_{−k}(λ)) ] / [ λ p(x_k, x_{−k}(λ)) + p*(x_k, x_{−k}(λ)) ].
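Fink's equations are easy to verify numerically for a candidate profile: for fixed x_{−k}(λ) the objective is linear-fractional in x_k, hence maximized at a pure action. A minimal sketch (helper names mine), applied to the zero-sum quitting game of Sect. 2 viewed as a 2-player game with f_2 = −f_1, g_2 = −g_1, at the stationary equilibrium x_λ = y_λ = (1 − √λ)/(1 − λ):

```python
import numpy as np

def fink_residual(f1, g1, p, x, y, lam):
    """Max gain from a pure deviation for each player at the profile (x, y).

    Player 1 maximizes r1; player 2 maximizes -r1 (zero-sum specialization).
    Both residuals are ~0 iff Fink's equations hold at (x, y).
    """
    f_star = (1.0 - p) * g1
    def r1(xx, yy):
        num = lam * xx @ f1 @ yy + (1 - lam) * xx @ f_star @ yy
        den = lam * xx @ p @ yy + xx @ (1.0 - p) @ yy
        return num / den
    e = np.eye(2)
    u1 = r1(x, y)
    d1 = max(r1(e[i], y) for i in range(2)) - u1          # player 1's deviation gain
    d2 = max(-r1(x, e[j]) for j in range(2)) - (-u1)      # player 2 maximizes -r1
    return d1, d2

f1 = np.array([[0.0, 1.0], [1.0, 0.0]])
g1 = f1.copy()
p = np.array([[1.0, 0.0], [0.0, 0.0]])
lam = 0.25
c = (1 - np.sqrt(lam)) / (1 - lam)                        # equilibrium prob. of C
x = y = np.array([c, 1 - c])
print(fink_residual(f1, g1, p, x, y, lam))                # both residuals ~ 0
```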

The asymptotic game is defined as follows. The set of strategies of player k is Θ(I_k) = {(x_k, α_k) ∈ Δ(I_k) × M_+(I_k) : α_k ⊥ x_k} and the payoff function of player k is

A_k(x, α) = (f_k*(x)/p*(x)) 1{p*(x)>0} + ((f_k(x) + Σ_{j=1}^N f_k*(α_j, x_{−j}))/(p(x) + Σ_{j=1}^N p*(α_j, x_{−j}))) 1{p*(x)=0}.

Theorem 7 Let u = (u_1, . . . , u_N) ∈ [−1, 1]^N be an accumulation point of u(λ). Then u is a limit equilibrium payoff of the asymptotic game. More precisely, there exist x and a sequence of measures α_j(n) ⊥ x_j, j = 1, . . . , N, such that for every player k:

u_k = lim_n A_k(x, α(n))
    = lim_n [ (f_k*(x)/p*(x)) 1{p*(x)>0} + ((f_k(x) + Σ_{j=1}^N f_k*(α_j(n), x_{−j}))/(p(x) + Σ_{j=1}^N p*(α_j(n), x_{−j}))) 1{p*(x)=0} ]
    ≥ sup_{(x_k,α_k)∈Θ(I_k)} lim sup_n A_k(x_k, α_k, x_{−k}, α_{−k}(n))
    = sup_{(x_k,α_k)∈Θ(I_k)} ( lim sup_n [ (f_k*(x_k, x_{−k})/p*(x_k, x_{−k})) 1{p*(x_k,x_{−k})>0} + ((f_k(x_k, x_{−k}) + f_k*(α_k, x_{−k}) + Σ_{j≠k} f_k*(α_j(n), x_{−j}))/(p(x_k, x_{−k}) + p*(α_k, x_{−k}) + Σ_{j≠k} p*(α_j(n), x_{−j}))) 1{p*(x_k,x_{−k})=0} ] ).

One may ask: does every limit equilibrium payoff of the asymptotic game correspond to the limit of some λ_n-discounted equilibrium payoffs as λ_n goes to zero? The equations in the theorem and the proof below suggest that any limit equilibrium payoff of the asymptotic game is the limit of some ε_n-equilibrium payoffs of the λ_n-discounted game as ε_n and λ_n go to zero. The strategy of player k in the λ_n-discounted game would be proportional to x_k + λ_n α_k(n).

Proof Let x(λ_n) ∈ Δ be a stationary equilibrium of the λ_n-discounted absorbing game, let u(λ_n) = (u_1(λ_n), . . . , u_N(λ_n)) ∈ R^N be its payoff, and suppose w.l.o.g. that x(λ_n) → x and u(λ_n) → u. From Fink's equations, one deduces that

u_k(λ_n) = [ λ_n f_k(x(λ_n)) + (1 − λ_n) f_k*(x(λ_n)) ] / [ λ_n p(x(λ_n)) + p*(x(λ_n)) ].

If p*(x) > 0 then u_k = f_k*(x)/p*(x). If p*(x) = 0, define α_j(n) = ( (x_j^{i_j}(λ_n)/λ_n) 1{x_j^{i_j}=0} )_{i_j∈I_j} ∈ M_+(I_j), j = 1, . . . , N, so that α_j(n) ⊥ x_j. Consequently,

p*(x(λ_n))/λ_n = Σ_{j=1}^N p*(α_j(n), x_{−j}) + o( Σ_{j=1}^N p*(α_j(n), x_{−j}) ),

and the same is true for f*. Thus, considering a subsequence if necessary, one obtains

u_k = lim_n [ f_k(x) + Σ_{j=1}^N f_k*(α_j(n), x_{−j}) ] / [ p(x) + Σ_{j=1}^N p*(α_j(n), x_{−j}) ].

Again, from Fink's equations one deduces that for every α_k ⊥ x_k,

u_k(λ_n) ≥ [ λ_n f_k(x_k + λ_n α_k, x_{−k}(λ_n)) + (1 − λ_n) f_k*(x_k + λ_n α_k, x_{−k}(λ_n)) ] / [ λ_n p(x_k + λ_n α_k, x_{−k}(λ_n)) + p*(x_k + λ_n α_k, x_{−k}(λ_n)) ].

Using multi-linearity and defining α_j(n), j ≠ k, as above proves that

u_k ≥ lim sup_n [ (f_k*(x_k, x_{−k})/p*(x_k, x_{−k})) 1{p*(x_k,x_{−k})>0} + ((f_k(x_k, x_{−k}) + f_k*(α_k, x_{−k}) + Σ_{j≠k} f_k*(α_j(n), x_{−j}))/(p(x_k, x_{−k}) + p*(α_k, x_{−k}) + Σ_{j≠k} p*(α_j(n), x_{−j}))) 1{p*(x_k,x_{−k})=0} ]. □

6 Compact continuous games

Let us extend the model of zero-sum games. I and J are now assumed to be compact metric sets. The game is separately (resp. jointly) continuous if f, g and p are separately (resp. jointly) continuous functions on I × J. Δ(K), K = I, J, is the set of Borel probability measures on K, and M_+(K) is the set of Borel positive measures on K.


They are endowed with the weak* topology. For (α, β) ∈ M_+(I) × M_+(J) and ϕ : I × J → [−1, 1] measurable, ϕ(α, β) = ∫_{I×J} ϕ(i, j) dα(i) dβ(j). This framework was introduced in Rosenberg and Sorin (2001). Following the operator approach of Kohlberg (1974), Rosenberg and Sorin considered the Shapley operator r ↦ Φ(λ, r), where

Φ(λ, r) = max_{x∈Δ(I)} min_{y∈Δ(J)} [ λ f(x, y) + (1 − λ) p(x, y) r + (1 − λ) f*(x, y) ]
        = min_{y∈Δ(J)} max_{x∈Δ(I)} [ λ f(x, y) + (1 − λ) p(x, y) r + (1 − λ) f*(x, y) ].

The operator is well defined and the existence of the value is guaranteed by Sion's minmax theorem. As Shapley (1953) already noticed, the operator is (1 − λ)-contracting, so that the value v(λ) of the λ-discounted game is the unique fixed point of Φ(λ, ·). Kohlberg (1974), in finite absorbing games, and Rosenberg and Sorin (2001), in compact and separately-continuous absorbing games, proved the existence of v = lim v(λ) and provided a variational characterization of v using the information obtained from the derivative of Φ(λ, r) around λ ≈ 0. Notations for a multi-player absorbing game can be introduced similarly.

Theorem 8 If the game is compact and jointly-continuous, all the results proved above for finite games still hold (for lim v(λ), lim w(λ) and Nash equilibria).

Proof Let us show how the first part of Theorem 3 is modified. Let w = lim_{n→∞} v(λ_n), where λ_n → 0. Take an optimal strategy x(λ_n) of player I in the λ_n-discounted game and suppose w.l.o.g. that it converges to some x. Considering any strategy j of player J,

v(λ_n) ≤ [ λ_n f(x(λ_n), j) + (1 − λ_n) f*(x(λ_n), j) ] / [ λ_n p(x(λ_n), j) + p*(x(λ_n), j) ].

If p*(x, j) > 0 then w ≤ f*(x, j)/p*(x, j). If p*(x, j) = 0 then p*(i, j) = 0 on S(x), the support of x. Define α(λ_n) ∈ M_+(I) by dα(λ_n)(i) = (dx(λ_n)(i)/λ_n) 1{i∉S(x)}. Let s_n ≥ 0 be such that α(λ_n) = s_n σ(λ_n) with σ(λ_n) ∈ Δ(I), and assume w.l.o.g. that σ(λ_n) → σ and s_n → t ∈ [0, +∞] (by compactness of Δ(I)). Joint continuity, the fact that p(x, j) = 1, and the uniform bound 1 on payoffs imply that for any ε > 0 there is N(ε) such that, for all n ≥ N(ε) and all j ∈ J,

[ f(x(λ_n), j) + (1 − λ_n) f*(α(λ_n), j) ] / [ p(x(λ_n), j) + p*(α(λ_n), j) ]
  ≤ [ f(x, j) + ε + f*(α(λ_n), j) − λ_n f*(α(λ_n), j) ] / [ p(x, j) − ε + p*(α(λ_n), j) ]
  ≤ [ f(x, j) + f*(α(λ_n), j) ] / [ p(x, j) + p*(α(λ_n), j) ] + 2ε/(1 − ε) + λ_n.

Consequently,

w ≤ sup_{x∈Δ(I)} sup_{α∈M_+(I): α⊥x} min_{j∈J} [ (f*(x, j)/p*(x, j)) 1{p*(x,j)>0} + ((f(x, j) + f*(α, j))/(p(x, j) + p*(α, j))) 1{p*(x,j)=0} ].


Step 2 of Theorem 3 needs no modification. The other proofs are adapted in a similar way. □

Acknowledgment I would like to thank the two referees, Michel Balinski, Abraham Neyman, Eilon Solan, Sylvain Sorin and Xavier Venel for their very useful comments.

References

Abreu D, Milgrom P, Pearce D (1991) Information and timing in repeated partnerships. Econometrica 59:1713–1733
Aumann RJ, Maschler M (1995) Repeated games with incomplete information. MIT Press, Cambridge
Bewley T, Kohlberg E (1976a) The asymptotic theory of stochastic games. Math Oper Res 1:197–208
Bewley T, Kohlberg E (1976b) The asymptotic solution of a recursion equation occurring in stochastic games. Math Oper Res 1:321–336
Blackwell D, Ferguson T (1968) The big match. Ann Math Stat 39:159–163
Coulomb JM (2001) Repeated games with absorbing states and signaling structure. Math Oper Res 26:286–303
Fink AM (1964) Equilibrium in a stochastic N-person game. J Sci Hiroshima Univ 28:89–93
Flesch J, Thuijsman F, Vrieze K (1996) Recursive repeated games with absorbing states. Math Oper Res 21:1016–1022
Gillette D (1957) Stochastic games with zero stop probabilities. In: Tucker AW, Dresher M, Wolfe P (eds) Contributions to the theory of games, vol III. Annals of Mathematics Studies 39. Princeton University Press, Princeton, pp 179–187
Kohlberg E (1974) Repeated games with absorbing states. Ann Stat 2:724–738
Kohlberg E, Zamir S (1974) Repeated games of incomplete information: the symmetric case. Ann Stat 2:1040–1041
Laraki R (2001a) The splitting game and applications. Int J Game Theory 30:359–376
Laraki R (2001b) Variational inequalities, system of functional equations, and incomplete information repeated games. SIAM J Control Optim 40(2):516–524
Mertens J-F, Neyman A (1981) Stochastic games. Int J Game Theory 10:53–66
Mertens J-F, Zamir S (1971) The value of two-person zero-sum repeated games with lack of information on both sides. Int J Game Theory 1:39–64
Mertens J-F, Neyman A, Rosenberg D (2009) Absorbing games with compact action spaces. Math Oper Res 34:257–262
Neyman A (2003) Stochastic games: existence of the minmax. In: Neyman A, Sorin S (eds) Stochastic games and applications. Kluwer Academic Publishers, Dordrecht, pp 173–193
Neyman A, Sorin S (1998) Equilibria in repeated games with incomplete information: the general symmetric case. Int J Game Theory 27:201–210
Rosenberg D, Sorin S (2001) An operator approach to zero-sum repeated games. Isr J Math 121:221–246
Shapley LS (1953) Stochastic games. Proc Natl Acad Sci USA 39:1095–1100
Solan E (1999) Three-player absorbing games. Math Oper Res 24:669–698
Sorin S (1986) Asymptotic properties of a non zero-sum stochastic game. Int J Game Theory 15:101–107
Sorin S (2002) A first course on zero-sum repeated games. Springer, Berlin
Vrieze K, Thuijsman F (1989) On equilibria in repeated games with absorbing states. Int J Game Theory 18:293–310

