Communication equilibrium payoffs in repeated games with imperfect monitoring

Jérôme Renault (X and GIS “Sciences de la décision”), joint with Tristan Tomala (HEC)

TSE-GREMAQ, October 14, 2008

Introduction

Repeated games are dynamic interactions played in stages. In this lecture, the players repeat the same stage game over and over, and the stage game is perfectly known.
• In the standard model (with perfect monitoring), the actions played at a given stage are publicly observed before the next stage is reached. We have the Folk Theorem: the equilibrium payoffs of the repeated game are the feasible and individually rational payoffs.
• We study here the model with imperfect monitoring: at the end of each stage, each player receives a signal depending on the action profile (e.g., principal-agent problems). No general characterization of the equilibrium payoffs is known.


We assume that the players can communicate with an exogenous mediator between the stages, and consider the communication equilibrium payoffs of the repeated game (Myerson 1982, Forges 1985).

We characterize these payoffs in the general n-person case: they are the feasible payoffs which are robust to undetectable deviations and jointly rational. This extends the result of Lehrer (1992) and Mertens, Sorin, Zamir (1994) for 2-player games.


Outline:
I. The model of repeated games with imperfect monitoring
II. The standard case of perfect monitoring
III. Aspects of imperfect monitoring
IV. Communication equilibrium payoffs
V. Punishment levels
VI. Feasible payoffs robust to undetectable deviations
VII. Jointly rational payoffs
VIII. The characterization
IX. Elements of the proof
X. More on imperfect monitoring without a mediator
References


I. The model of repeated games with imperfect monitoring

Repeated games with imperfect monitoring (also called repeated games with signals, or supergames).

Data:
• A finite stage game G, given by a set of players N = {1, ..., n} and, for each player i, a set of actions A^i and a payoff function g^i : A → ℝ, where A = ∏_i A^i is the set of action profiles.
• An observation structure: for each player i, a finite set of signals U^i, and a signalling function f : A → ∆(U), where U = ∏_i U^i is the set of signal profiles.

Play: at every stage t = 1, 2, ..., the players independently choose an action in their own action set. If a_t ∈ A is the action profile chosen, a profile of signals u_t = (u_t^i)_i is drawn according to f(a_t). The stage payoff of player i is g^i(a_t), but all that player i learns before starting stage t + 1 is u_t^i.


Illustration: the prisoner’s dilemma (player 1 chooses the row, player 2 the column)

              C^2        D^2
    C^1     (3, 3)     (0, 4)
    D^1     (4, 0)     (1, 1)

(unique equilibrium payoff of the one-shot game: (1, 1)).
• Standard case of perfect monitoring: U^i = A, and u_t^i = a_t for each player i.
• Trivial observation for player i: U^i is a singleton (play in the dark).
• Public signals: all players receive the same signal (Fudenberg, Levine, Maskin 1994; Mailath, Morris 2002; Hörner, Olszewski 2006, ...).
• Observable payoffs (Tomala 1999, ...).


Strategies and payoffs

A strategy for player i: σ^i = (σ_t^i)_{t≥1}, where σ_t^i : (A^i × U^i)^{t−1} → ∆(A^i) gives the lottery played at stage t as a function of his current information. A strategy profile σ naturally induces a probability distribution over plays.

Average T-stage payoff for player i:

    γ_T^i(σ) = E_σ [ (1/T) ∑_{t=1}^{T} g^i(a_t) ].

For λ in (0, 1], λ-discounted payoff for player i:

    γ_λ^i(σ) = E_σ [ ∑_{t≥1} λ(1−λ)^{t−1} g^i(a_t) ].
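To make the two payoff criteria concrete, here is a minimal Python sketch (not part of the slides) evaluating both formulas on a single realized play of the prisoner’s dilemma above; the expectation E_σ disappears because a fixed sequence of action profiles is used.

```python
# Minimal sketch: T-stage average and lambda-discounted payoffs of player 1,
# evaluated on a fixed play of the prisoner's dilemma (payoffs from the slides).
G1 = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 4, ("D", "D"): 1}

def average_payoff(play, g):
    """(1/T) * sum_{t=1..T} g(a_t)."""
    return sum(g[a] for a in play) / len(play)

def discounted_payoff(play, g, lam):
    """sum_{t>=1} lam * (1-lam)**(t-1) * g(a_t), truncated at the end of the play."""
    return sum(lam * (1 - lam) ** t * g[a] for t, a in enumerate(play))

play = [("C", "C")] * 10 + [("D", "D")] * 90   # cooperate for 10 stages, then (D, D)
print(average_payoff(play, G1))                # (10*3 + 90*1) / 100 = 1.2
print(discounted_payoff(play, G1, lam=0.2))    # early cooperative stages get most of the weight
```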


Equilibrium payoffs

Definition: A uniform equilibrium of the repeated game is a strategy profile σ such that:
1) ∀ε > 0, σ is an ε-Nash equilibrium of every discounted game with low enough discount factor: ∃λ_0, ∀λ ≤ λ_0, ∀i ∈ N, ∀τ^i ∈ Σ^i, γ_λ^i(τ^i, σ^{−i}) ≤ γ_λ^i(σ) + ε, and
2) (γ_λ^i(σ))_{i∈N} converges, as λ goes to 0, to a limit called a (uniform) equilibrium payoff.
Denote by E∞ the set of equilibrium payoffs.

Remarks:
- long-term strategic aspects,
- the same definition can be given with the average payoffs γ_T^i(σ) = E_σ [ (1/T) ∑_{t=1}^T g^i(a_t) ],
- better robustness than lim_{λ→0} E_λ,
- no refinement here.

II. The standard case of perfect monitoring

Example: the prisoner’s dilemma with perfect monitoring

              C^2        D^2
    C^1     (3, 3)     (0, 4)
    D^1     (4, 0)     (1, 1)

(3, 3) is an equilibrium payoff of the repeated game: play C^i as long as your opponent does, otherwise play D^i forever.

[Figure: the set E∞ of equilibrium payoffs in the payoff plane (J1, J2) — the feasible and individually rational payoffs; axis ticks at 1 and 4.]
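A minimal sketch (assumptions: the grim-trigger profile described above, and a deviator who defects at stage 1 and is then punished with (D, D) forever) comparing the two discounted payoff streams of player 1; under these assumptions the deviation is unprofitable exactly when 4λ + (1 − λ) ≤ 3, i.e. λ ≤ 2/3.

```python
# Minimal sketch: discounted payoff of player 1 under grim trigger vs. a
# one-shot deviation followed by eternal (D, D) punishment.
def discounted(stage_payoff, lam, horizon=10_000):
    """Truncated version of sum_{t>=1} lam*(1-lam)**(t-1) * stage_payoff(t)."""
    return sum(lam * (1 - lam) ** (t - 1) * stage_payoff(t) for t in range(1, horizon + 1))

lam = 0.1
cooperate = discounted(lambda t: 3, lam)                  # (C, C) at every stage
deviate = discounted(lambda t: 4 if t == 1 else 1, lam)   # defect once, then (D, D) forever
print(cooperate, deviate)   # ~3.0 vs ~1.3: the deviation is unprofitable for lam <= 2/3
```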


The standard Folk theorem

Define two sets.

Feasible payoffs: conv g(A) = g(∆(A)), where g : A → ℝ^N is the vector payoff function.

Punishment level of player i (independent minmax of player i):

    v^i = min_{p^{−i} ∈ ∏_{j≠i} ∆(A^j)}  max_{p^i ∈ ∆(A^i)}  g^i(p^i, p^{−i}).

Individually rational payoffs: IR = {x = (x^i)_i ∈ ℝ^N : ∀i, x^i ≥ v^i}.

Standard Folk theorem: the equilibrium payoffs of the repeated game are the payoffs which are both feasible (they can be achieved) and individually rational (every player gets at least his punishment level):

    E∞ = g(∆(A)) ∩ IR.

Aumann (1981): the Folk theorem “has been generally known in the profession for at least 15 or 20 years, but has not been published; its authorship is obscure.”
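A quick worked check of this definition on the prisoner’s dilemma above (consistent with the punishment point (1, 1) in the previous figure): if player 2 plays C^2 with probability q, player 1’s actions give g^1(C^1, q) = 3q and g^1(D^1, q) = 4q + (1 − q) = 1 + 3q, so his best reply yields 1 + 3q, which is minimized at q = 0. Hence v^1 = v^2 = 1, and E∞ is the set of feasible payoffs with both coordinates at least 1.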


III. Aspects of imperfect monitoring

Signals do matter.

Example: the prisoner’s dilemma in the dark (trivial observation for both players):

              C^2        D^2
    C^1     (3, 3)     (0, 4)
    D^1     (4, 0)     (1, 1)

Here E∞ = {(1, 1)}.

With imperfect signals there may be, in general:
1) undetectable deviations,
2) incentives to play informative actions (Lehrer 1989, 1992),
3) a deviation which is detected by some players but not by others (R-Tomala 1998),
4) a deviation which is detected, but the identity of the deviator is unknown (Tomala 1998, R-Scarlatti-Scarsini 2005 and 2008),
5) in a punishment phase, a possibility to use the signals to correlate actions and to punish below the level v^i (this leads to new punishment levels, Gossner-Tomala 2007).

Open problem: compute E∞ in general (unknown even for two players).

We characterize the set of communication equilibrium payoffs C∞: phenomenon 5) disappears, and 2) and 3) are simplified.


IV. Communication equilibrium payoffs

Communication equilibrium payoffs (Myerson, Forges)

Add an exogenous mediator who can communicate with the players between the stages (no commitment power, no utility). A canonical representation leads to an extended game where, at each stage t:
• first, the mediator privately sends to each player i a “recommendation” r_t^i in A^i,
• then the stage game is played as usual: every player i plays some action a_t^i and observes the signal u_t^i,
• finally, each player i reports a message m_t^i in U^i to the mediator.

We now have a supergame with n + 1 players, in which the mediator has payoff zero.

Definition: A communication equilibrium is a uniform equilibrium of the extended game in which every player i plays his faithful strategy σ^{i∗}: play what is recommended, report what is observed. A communication equilibrium payoff of Γ is a limit payoff associated to a communication equilibrium (forgetting the mediator’s component). Denote by C∞ the set of communication equilibrium payoffs.


V. Punishment levels

Punishment levels: the correlated minmax of player i,

    w^i = min_{p^{−i} ∈ ∆(∏_{j≠i} A^j)}  max_{p^i ∈ ∆(A^i)}  g^i(p^i, p^{−i}),

instead of the independent minmax v^i = min_{p^{−i} ∈ ∏_{j≠i} ∆(A^j)} max_{p^i ∈ ∆(A^i)} g^i(p^i, p^{−i}).

Example with 3 players (payoffs of player 3 only; player 1 chooses the row, player 2 the column, player 3 the matrix):

   W:       L     R              E:       L     R
       T   −1     0                  T    0     0
       B    0     0                  B    0    −1

Here w^3 = −1/2 < v^3 = −1/4.

Rem: standard Folk theorem for communication equilibria (perfect monitoring): C∞ = g(∆(A)) ∩ IRc, where IRc = {x = (x^i)_i ∈ ℝ^N : ∀i, x^i ≥ w^i}.
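The gap between the two punishment levels can be checked numerically. Below is a minimal Python sketch (not from the slides) that brute-forces both minmax values of player 3 in the 3-player example above over a grid of mixed strategies; the exact optima −1/2 and −1/4 lie on the grid.

```python
# Minimal sketch: correlated vs. independent minmax of player 3 in the
# 3-player example above (only player 3's payoffs matter here).
import itertools

g3 = {("T", "L", "W"): -1, ("T", "R", "W"): 0, ("B", "L", "W"): 0, ("B", "R", "W"): 0,
      ("T", "L", "E"): 0,  ("T", "R", "E"): 0, ("B", "L", "E"): 0, ("B", "R", "E"): -1}
pairs = [("T", "L"), ("T", "R"), ("B", "L"), ("B", "R")]
grid = [i / 20 for i in range(21)]

def best_reply(q):
    """Player 3's best-reply payoff against a distribution q on {T,B} x {L,R}."""
    return max(sum(q[a12] * g3[a12 + (a3,)] for a12 in q) for a3 in ("W", "E"))

# Correlated minmax w3: players 1 and 2 may use any joint distribution (coarse grid).
w3 = min(best_reply(dict(zip(pairs, (x, y, z, 1 - x - y - z))))
         for x in grid for y in grid for z in grid if x + y + z <= 1 + 1e-9)

# Independent minmax v3: player 1 plays T w.p. p, player 2 plays L w.p. q, independently.
v3 = min(best_reply({("T", "L"): p * q, ("T", "R"): p * (1 - q),
                     ("B", "L"): (1 - p) * q, ("B", "R"): (1 - p) * (1 - q)})
         for p in grid for q in grid)

print(w3, v3)   # -0.5 and -0.25, matching w^3 = -1/2 < v^3 = -1/4 on the slide
```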


VI. Feasible payoffs robust to undetectable deviations

Feasible payoffs which are robust to undetectable deviations

Example: the prisoner’s dilemma again. P1 plays in the dark, U^2 = {a, b, c}, and P2’s signal is written next to each cell:

              C^2           D^2
    C^1     (3, 3) a      (0, 4) c
    D^1     (4, 0) b      (1, 1) c

(3, 3) ∈ C∞. Strategy of the mediator:
- On the main path, at stage t, recommend P2 to play C^2, and recommend P1 to play C^1 with probability 1 − 1/√t. Continue as long as P2’s reported signal matches P1’s recommended action; otherwise, go to the punishment phase.
- Punishment phase: punish forever, i.e., recommend (D^1, D^2) at every stage.

The players then have no incentive to deviate from their faithful strategies.
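A minimal simulation sketch (not from the slides) of the faithful path of this mediator strategy: the only payoff distortion comes from the stages where D^1 is recommended, whose frequency 1/√t averages out to about 2/√T, so the expected T-stage average payoff converges to (3, 3).

```python
# Minimal sketch: expected T-stage average payoffs on the faithful path, where at
# stage t the mediator recommends (C1, C2) w.p. 1 - 1/sqrt(t) and (D1, C2) otherwise.
from math import sqrt

payoff = {"CC": (3, 3), "DC": (4, 0)}     # the only profiles recommended on path

def expected_average(T):
    total = [0.0, 0.0]
    for t in range(1, T + 1):
        p_cc = 1 - 1 / sqrt(t)
        for i in (0, 1):
            total[i] += p_cc * payoff["CC"][i] + (1 - p_cc) * payoff["DC"][i]
    return tuple(x / T for x in total)

for T in (100, 10_000, 1_000_000):
    print(T, expected_average(T))   # tends to (3, 3); the gap shrinks like 1/sqrt(T)
```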


Suppose now that at some stage the mediator recommends some action profile a = (a^i)_i. If some player i can play an action b^i which:
- does not change the signals of the other players,
- gives player i at least as much information as a^i,
- and gives player i a better payoff,
then player i has no incentive to play a^i.

(Assume here that the signals are deterministic: each player i has a signalling function f^i : A → U^i.)

Definition: for every pair of actions a^i and b^i of player i, write b^i ≥ a^i if:
(i) ∀a^{−i} ∈ A^{−i}, ∀j ≠ i, f^j(b^i, a^{−i}) = f^j(a^i, a^{−i})   (a^i and b^i are equivalent),
(ii) ∀a^{−i}, b^{−i} ∈ A^{−i}, f^i(a^i, a^{−i}) ≠ f^i(a^i, b^{−i}) ⟹ f^i(b^i, a^{−i}) ≠ f^i(b^i, b^{−i})   (b^i is more informative than a^i).


Definition (Lehrer): The set of feasible payoffs which are robust to undetectable deviations is g(P), where

    P = { p ∈ ∆(A) : ∀i ∈ N, ∀a^i, b^i ∈ A^i such that b^i ≥ a^i,
          ∑_{a^{−i} ∈ A^{−i}} p(a^i, a^{−i}) g^i(a^i, a^{−i})  ≥  ∑_{a^{−i} ∈ A^{−i}} p(a^i, a^{−i}) g^i(b^i, a^{−i}) }.

Theorem (Lehrer 1992, MSZ 1994): for two-player games, C∞ = g(P) ∩ IRc.
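Below is a minimal Python sketch (not from the slides) of a membership test for P with deterministic signalling functions: it enumerates the pairs b^i ≥ a^i of the previous definition and checks the linear constraints above. Applied to the prisoner’s dilemma played in the dark (trivial signals for both players), it rejects the uniform distribution and accepts the Dirac mass on (D, D), consistent with E∞ = {(1, 1)} stated in Section III.

```python
# Minimal sketch: membership test for P with deterministic signalling functions
# f[i] : A -> U^i.  Example: the prisoner's dilemma "in the dark" (trivial signals).
import itertools

A = {1: ("C", "D"), 2: ("C", "D")}
g = {1: {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 4, ("D", "D"): 1},
     2: {("C", "C"): 3, ("C", "D"): 4, ("D", "C"): 0, ("D", "D"): 1}}
f = {1: lambda a: "*", 2: lambda a: "*"}          # both players play in the dark

def others(i):
    return [j for j in A if j != i]

def profile(i, ai, a_minus):
    """Rebuild the full action profile from player i's action and the others'."""
    d = dict(zip(others(i), a_minus)); d[i] = ai
    return tuple(d[j] for j in sorted(A))

def geq(i, bi, ai):
    """b^i >= a^i: same signals to the others, and at least as informative for i."""
    A_minus = list(itertools.product(*(A[j] for j in others(i))))
    equivalent = all(f[j](profile(i, bi, am)) == f[j](profile(i, ai, am))
                     for am in A_minus for j in others(i))
    more_informative = all(f[i](profile(i, ai, am)) == f[i](profile(i, ai, bm))
                           or f[i](profile(i, bi, am)) != f[i](profile(i, bi, bm))
                           for am in A_minus for bm in A_minus)
    return equivalent and more_informative

def in_P(p):
    for i in A:
        A_minus = list(itertools.product(*(A[j] for j in others(i))))
        for ai, bi in itertools.product(A[i], repeat=2):
            if bi != ai and geq(i, bi, ai):
                lhs = sum(p[profile(i, ai, am)] * g[i][profile(i, ai, am)] for am in A_minus)
                rhs = sum(p[profile(i, ai, am)] * g[i][profile(i, bi, am)] for am in A_minus)
                if lhs < rhs - 1e-9:
                    return False
    return True

uniform = {a: 0.25 for a in itertools.product(A[1], A[2])}
dirac_DD = {a: (1.0 if a == ("D", "D") else 0.0) for a in itertools.product(A[1], A[2])}
print(in_P(uniform), in_P(dirac_DD))   # False True: in the dark, only (D, D) survives
```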


VII. Jointly rational payoffs

Jointly rational payoffs

New phenomenon with at least 3 players: somebody has deviated, but who? Several suspected players may have to be punished simultaneously. This creates new constraints on the equilibrium payoffs.

Example with 3 players (player 1 chooses the row, player 2 the column, player 3 the matrix):

   W:        L          R
       T  (0,0,0)    (0,3,0)
       B  (3,0,0)    (1,1,0)

   M:        L          R
       T  (0,2,0)    (0,2,0)
       B  (0,2,0)    (0,2,0)

   E:        L          R
       T  (2,0,0)    (2,0,0)
       B  (2,0,0)    (2,0,0)

Player 3 always has payoff 0, and w^1 = w^2 = w^3 = 0.

[Figure: equilibrium payoffs of players 1 and 2 (axes J1, J2), under perfect observation (Folk theorem) and under trivial observation; axis ticks at 2 and 3.]


Same game as above; now P1 and P2 observe each other’s moves, and P3 is in the dark.

Is (0, 0, 0) ∈ C∞? Suppose we have an equilibrium where the mediator recommends playing (T, L, W). If player 1 deviates by playing B, he has to be punished, so P2 has to report it to the mediator, and the mediator has to recommend M to P3 in the future. But then P2 has a profitable deviation: falsely report that P1 played B, triggering the punishment M, under which P2 gets 2 instead of 0. Hence (0, 0, 0) ∉ C∞: it is impossible here to know whether the deviation comes from P1 or from P2.


In the same example, punishments by P3 have the form λM + (1 − λ)E, with λ ∈ [0, 1]: they give payoff 2(1 − λ) to player 1 and 2λ to player 2. For a target payoff x, such a punishment is effective only if x^1 ≥ 2(1 − λ) and x^2 ≥ 2λ.

[Figure: the resulting set of equilibrium payoffs of players 1 and 2 (axes J1, J2); axis ticks at 2 and 3.]


An action of player i in the extended game is called a decision (rule), an element of

    D^i = { d^i = (α^i, µ^i), with α^i : R^i → A^i and µ^i : R^i × U^i → M^i }

(here R^i = A^i is the set of recommendations and M^i = U^i the set of reports, cf. Section IV).

Consider the following scenario: the mediator recommends the action profile a, the players j ≠ i play faithfully, whereas player i plays according to a mixed decision δ^i. Denote by ψ^i(δ^i, a) ∈ ∆(U) the induced law of the messages received by the mediator, and by g^i_{δ^i}(a) the expected payoff of player i.

Given a subset of players J, the set of similar decisions of the players in J is defined as

    SD(J) = { (δ^i)_{i∈J} ∈ ∏_{i∈J} ∆(D^i) : ∀i, j ∈ J, ∀a ∈ A, ψ^i(δ^i, a) = ψ^j(δ^j, a) }.

SD(J) is a polytope. If a player in J deviates according to an element of SD(J), the mediator has to punish all the players in J simultaneously.


Rem: for a singleton J = {i}, SD({i}) = ∆(D^i): every deviation of player i makes player i a suspect.

The example again (same game as above): P1 and P2 observe each other’s moves, P3 is in the dark. Consider:
  d^1: play B, report R;
  d^2: play R, report B.
Then d = (d^1, d^2) is a pair of similar decisions for players 1 and 2: the mediator cannot distinguish between “player 1 is deviating with d^1” and “player 2 is deviating with d^2” (checked in the sketch below).
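A minimal Python check (not from the slides) that d^1 and d^2 induce the same message profile at the mediator for every recommendation, under the monitoring structure of this example (P1 observes P2’s move, P2 observes P1’s move, P3 observes nothing) and faithful reporting by the non-deviating players.

```python
# Minimal sketch: d^1 and d^2 generate the same messages at the mediator,
# whatever the recommended action profile, so they are "similar decisions".
import itertools

A1, A2, A3 = ("T", "B"), ("L", "R"), ("W", "M", "E")

def messages(deviator, a):
    """Message profile (m1, m2, m3) reported to the mediator when `deviator`
    uses his decision rule and the other two players are faithful."""
    a1, a2, a3 = a
    if deviator == 1:                 # d^1: play B, always report "R"
        played = ("B", a2, a3)
        report1 = "R"                 # P1's untruthful report
        report2 = played[0]           # P2 faithfully reports P1's observed move
    else:                             # d^2: play R, always report "B"
        played = (a1, "R", a3)
        report1 = played[1]           # P1 faithfully reports P2's observed move
        report2 = "B"                 # P2's untruthful report
    report3 = "*"                     # P3 is in the dark: his report carries no information
    return (report1, report2, report3)

print(all(messages(1, a) == messages(2, a)
          for a in itertools.product(A1, A2, A3)))   # True
```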


For each q in ∆(N), define the “punishment level” l(q) by

    l(q) = max_{δ ∈ SD(Supp q)}  min_{a ∈ A}  ∑_{i∈N} q^i g^i_{δ^i}(a)
         = min_{p ∈ ∆(A)}  max_{δ ∈ SD(Supp q)}  ∑_{a∈A} p(a) ∑_{i∈N} q^i g^i_{δ^i}(a).

The set of jointly rational payoffs is defined as

    JR = { x ∈ ℝ^N : ∀q ∈ ∆(N), x · q ≥ l(q) }.

Rem: if q is the Dirac measure on player i, then SD({i}) = ∆(D^i) and l(q) = w^i: jointly rational payoffs are always individually rational.


VIII. The characterization

Theorem (R-Tomala 2004): in a repeated game with imperfect monitoring, the communication equilibrium payoffs are the feasible payoffs which are both robust to undetectable deviations and jointly rational:

    C∞ = g(P) ∩ JR.

Remarks:
• Perfect observation: JR = IRc (Folk theorem).
• Trivial observation: C∞ is the set of correlated equilibrium payoffs of the stage game.
• For two-player games, JR = IR = IRc (with a single opponent, the correlated and independent minmax coincide): back to the Lehrer and MSZ result.
• C∞ is convex and compact, but need not be a polytope.
• Corollaries: E∞ ⊂ g(P) ∩ JR, and Eλ ⊂ g(P) ∩ JR for every λ.
• Case of random signals: with f : A → ∆(U), the characterization holds with P = { p ∈ ∆(A) : ∀i ∈ N, ∀δ^i ∈ ∆(D^i) such that ψ^i(δ^i, a) = f(a) for all a ∈ A, ∑_{a∈A} p(a) g^i(a) ≥ ∑_{a∈A} p(a) g^i_{δ^i}(a) }.


IX. Elements of the proof

Main idea of the proof: define an auxiliary 2-player repeated game with incomplete information. Player I corresponds to a “cheater” (a potential deviator in the original game); Player II corresponds to the mediator in the original game and has payoff zero.

Set of states of the auxiliary game: K = N (the state represents the identity of the deviator in the original game). Initially, one of the states is selected and announced to Player I only. At every stage, Player II selects an action in A, whereas Player I selects an action in D = ∏_i D^i, and the payoffs of Player I and the signals are defined by analogy with the original game.

There is a very strong analogy between the communication equilibrium payoffs of the original game and certain equilibrium payoffs of the auxiliary game. This leads to the study of equilibrium payoffs for a particular class of repeated games with incomplete information: lack of information on one side, known own payoffs, state-dependent signalling, and a specific “faithful” strategy for Player I.


Communication equilibrium payoffs in repeated games with imperfect monitoring IX. Elements of the proof

Main idea of the proof: define an auxiliary 2-player repeated game with incomplete information. PI corresponds to a “cheater" (potential deviator in the original game), PII corresponds to the mediator in the original game, he has payoff 0. Set of states in the auxiliary game : K = N (represents the identity of the deviator in the original game) Initially, one of the states is selected and announced to PI only. At every stage, PII selects an action in A, whereas PI selects an action in D = ∏i D i , and the payoffs for PI and the signals are defined by analogy with the original game. very strong analogy between communication equilibrium payoffs of the orignal game and some equilibrium payoffs of the auxiliary game −→ study of equilibrium payoffs for a particular class of repeated games with incomplete information: lack of info on one side, known own payoffs, state dependent signalling, specific “faithfull" strategy for PI. 24/29

Communication equilibrium payoffs in repeated games with imperfect monitoring IX. Elements of the proof

We prove a result for such games, using:
• A theorem of Kohlberg (1975) in the spirit of Blackwell approachability. Simplified idea: let C be a closed convex set in IR^K, and (x_n)_n be a bounded sequence in IR^K. Write x̄_n = (1/n) Σ_{i=1}^n x_i and y_n = P_C(x̄_n). Assume that for every n such that x̄_n ∉ C, the hyperplane H containing y_n and orthogonal to the segment [x̄_n, y_n] separates x̄_n from x_{n+1}. Then d(x̄_n, C) −→_{n→∞} 0.
[figure: x̄_n, its projection y_n onto C, the separating hyperplane H, and x_{n+1} lying on the C-side of H]
Used with C = IR_-^K. The direction q = (x̄_n − y_n)/‖x̄_n − y_n‖ ∈ ∆(N) leads to the definition of JR:
JR = { x ∈ IR^N : ∀q ∈ ∆(N), x · q ≥ max_{δ ∈ SD(Supp q)} min_{a ∈ A} Σ_{i∈N} q^i g^i_{δ^i}(a) }.
25/29
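A quick numerical illustration of the simplified lemma (a sketch under my own choice of the sequence (x_n), not taken from the proof): take C = IR_-^K; whenever the running average lies outside C, draw the next point weakly on the C-side of the separating hyperplane, and the distance of the average to C vanishes.

# Illustrative simulation of the averaging/projection lemma with C = IR_-^K.
import numpy as np

K = 3
rng = np.random.default_rng(0)

def proj_C(x):
    # Euclidean projection onto C = IR_-^K (the negative orthant): componentwise clipping
    return np.minimum(x, 0.0)

xbar = np.zeros(K)
for n in range(1, 20001):
    if np.all(xbar <= 0.0):
        x_next = rng.uniform(-1.0, 1.0, size=K)      # average already in C: no constraint
    else:
        y = proj_C(xbar)
        q = (xbar - y) / np.linalg.norm(xbar - y)    # unit direction from y_n to the average
        t = rng.uniform(-1.0, 1.0, size=K)
        t -= (t @ q) * q                             # tangential noise, orthogonal to q
        x_next = y - rng.uniform(0.0, 1.0) * q + t   # q . (x_next - y) <= 0: C-side of H
    xbar = ((n - 1) * xbar + x_next) / n             # running average

print(np.linalg.norm(xbar - proj_C(xbar)))           # d(average, C): close to 0 for large n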

Communication equilibrium payoffs in repeated games with imperfect monitoring IX. Elements of the proof

• Statistical tests in equilibrium strategies (as in Renault 00, or Lehrer 90). The game is played by blocks of stages of polynomial size: main path + statistical tests for deviations + very long (but not infinite) punishment phases. This requires an extension of Chebyshev's inequality without independence (for the actions played within a block).
(Lehrer) Let R_1, ..., R_n be Bernoulli random variables with parameter p, and Y_1, ..., Y_n be Bernoulli random variables such that for each m, R_m is independent of (R_1, ..., R_{m−1}, Y_1, ..., Y_m). Then
∀ε > 0,  P( | (R_1 Y_1 + ... + R_n Y_n)/n − p (Y_1 + ... + Y_n)/n | ≥ ε ) ≤ 1/(n ε²).

26/29
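A quick Monte Carlo sanity check of this inequality (a sketch; the particular Y_m below, which looks only at past R's, is just one admissible choice satisfying the independence requirement):

# Monte Carlo check of the inequality above (illustrative sketch).
import numpy as np

rng = np.random.default_rng(1)
p, n, eps, runs = 0.3, 200, 0.1, 5000
exceed = 0
for _ in range(runs):
    R = (rng.random(n) < p).astype(float)        # i.i.d. Bernoulli(p)
    Y = np.zeros(n)
    for m in range(n):
        # Y_m depends only on R_1, ..., R_{m-1}, so R_m is independent of (R_1..R_{m-1}, Y_1..Y_m)
        Y[m] = 1.0 if (m == 0 or R[:m].mean() > p) else 0.0
    lhs = abs(np.dot(R, Y) / n - p * Y.sum() / n)
    exceed += lhs >= eps
print(exceed / runs, "vs bound", 1.0 / (n * eps ** 2))   # empirical probability well below the bound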

Communication equilibrium payoffs in repeated games with imperfect monitoring X. More on imperfect monitoring without a mediator

More on imperfect monitoring without a mediator:
• Computing the punishment levels may be difficult. Payoffs for player 3 (P1 chooses T or B, P2 chooses L or R, P3 chooses W or E):
  if P3 plays W:       L    R          if P3 plays E:       L    R
                 T    −1    0                         T     0    0
                 B     0    0                         B     0   −1
Perfect monitoring: −1/4 (independent minmax).
P1 and P2 see each other, P3 in the dark: −1/2 (correlated minmax).
Assume now that P3 observes P2 only, P2 observes P1 only, and P1 plays in the dark. (Gossner-Tomala 07): what is the punishment level v?
v = −(1/2)(x² + (1−x)²), where x solves the binary-entropy equation −x log₂(x) − (1−x) log₂(1−x) = 1/2 (entropy in bits); numerically v ≈ −0.402.
27/29
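A short numerical check of this value (a sketch; the entropy is taken in bits here, which is what makes the stated v ≈ −0.402 come out):

# Solve the binary-entropy equation by bisection and evaluate the punishment level.
from math import log2

def H(x):                      # binary entropy, in bits
    return -x * log2(x) - (1 - x) * log2(1 - x)

lo, hi = 1e-9, 0.5             # H is increasing on (0, 1/2], with H(0+) = 0 and H(1/2) = 1
for _ in range(100):           # bisection for H(x) = 1/2
    mid = (lo + hi) / 2
    lo, hi = (mid, hi) if H(mid) < 0.5 else (lo, mid)
x = (lo + hi) / 2
v = -0.5 * (x ** 2 + (1 - x) ** 2)
print(x, v)                    # x ≈ 0.110, v ≈ -0.402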

Communication equilibrium payoffs in repeated games with imperfect monitoring X. More on imperfect monitoring without a mediator

• Playing more informative actions, even in case of trivial monitoring. Equilibrium paths may be complex: U^1 = {a, b, c}, P2 plays in the dark. Rows are P2's actions (G, D), columns are P1's actions (H, M, B); each cell gives the payoff pair (g^1, g^2) and P1's signal:
            H            M            B
  G     (0, 1), a    (1, 1), c    (0, 0), c
  D     (0, 1), b    (0, 0), c    (1, 1), c
An equilibrium is given by (σ^1, σ^2), where: σ^2 plays i.i.d. 1/2 G + 1/2 D at odd stages, and repeats its last action at even stages; σ^1 plays H at odd stages ("buying" the information), then M or B at even stages depending on the signal received. Equilibrium payoff: (1/2, 1), which cannot be obtained as a convex combination of payoffs where P1 is in best reply.

28/29
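The announced payoff (1/2, 1) can be checked by simulating the strategy pair directly (an illustrative sketch of the play described above):

# Simulation of the "buy the information" equilibrium play.
import random

payoff = {('G', 'H'): (0, 1), ('G', 'M'): (1, 1), ('G', 'B'): (0, 0),
          ('D', 'H'): (0, 1), ('D', 'M'): (0, 0), ('D', 'B'): (1, 1)}
signal1 = {('G', 'H'): 'a', ('D', 'H'): 'b'}     # all other profiles give P1 the signal c

random.seed(0)
T, tot1, tot2 = 10000, 0, 0
for t in range(1, T + 1):
    if t % 2 == 1:                               # odd stage
        a2 = random.choice(['G', 'D'])           # sigma^2: 1/2 G + 1/2 D
        a1 = 'H'                                 # sigma^1 "buys" the information
        last_a2, last_sig = a2, signal1.get((a2, a1), 'c')
    else:                                        # even stage
        a2 = last_a2                             # sigma^2 repeats its last action
        a1 = 'M' if last_sig == 'a' else 'B'     # sigma^1 exploits the signal
    g1, g2 = payoff[(a2, a1)]
    tot1 += g1
    tot2 += g2
print(tot1 / T, tot2 / T)                        # -> 0.5, 1.0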

Communication equilibrium payoffs in repeated games with imperfect monitoring X. More on imperfect monitoring without a mediator

• A deviation is detected, but the identity of the deviator is unknown. 3-player minority game: 3 players have to vote for one of two alternatives, A and B. The player (if any) who votes for the less-chosen alternative receives a reward of one euro. Between the stages, only the current majority alternative is publicly announced. Game with public signals and observable payoffs. Feasible payoffs: convex hull of {(1, 0, 0), (0, 1, 0), (0, 0, 1), (0, 0, 0)}. It can be shown that (0, 0, 0) ∈ E∞. (R-Scarlatti-Scarsini 05 and 08)

29/29
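A small sketch of the stage game and of its public signal, which also recovers the pure payoff vectors spanning the feasible set (illustrative only):

# Stage game of the 3-player minority game and its public signal.
from itertools import product

def stage(votes):                                # votes: a triple in {'A', 'B'}^3
    nA = votes.count('A')
    minority = 'A' if nA == 1 else ('B' if nA == 2 else None)   # None under unanimity
    payoffs = tuple(1 if v == minority else 0 for v in votes)
    majority = 'A' if nA >= 2 else 'B'           # the only public signal between stages
    return payoffs, majority

for votes in product('AB', repeat=3):
    print(votes, *stage(votes))
# Unanimity gives (0, 0, 0); otherwise exactly one player gets 1, so the pure payoff
# vectors are (0,0,0), (1,0,0), (0,1,0), (0,0,1), whose convex hull is the feasible set.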

Communication equilibrium payoffs in repeated games with imperfect monitoring References

D. Abreu, D. Pearce and E. Stacchetti. Toward a theory of discounted repeated games with imperfect monitoring. Econometrica, 58, 1041–1063, 1990.
R.J. Aumann and M. Maschler. Repeated games with incomplete information. With the collaboration of R. Stearns. Cambridge, MA: MIT Press, 1995.
R.J. Aumann and L.S. Shapley. Long-term competition—A game theoretic analysis. In N. Megiddo, editor, Essays on game theory, pages 1–15. Springer-Verlag, New York, 1994.
F. Forges. An approach to communication equilibria. Econometrica, 54, 1375–1385, 1985.
D. Fudenberg and E. Maskin. The Folk theorem in repeated games with discounting or with incomplete information. Econometrica, 54, 533–554, 1986.
D. Fudenberg, D.K. Levine and E. Maskin. The Folk theorem with imperfect public information. Econometrica, 62, 997–1039, 1994.
O. Gossner. The Folk theorem for finitely repeated games with mixed strategies. International Journal of Game Theory, 24, 95–107, 1995.
O. Gossner and T. Tomala. Secret correlation in repeated games with imperfect monitoring. Mathematics of Operations Research, 32, 413–424, 2007.
E. Kohlberg. Optimal strategies in repeated games with incomplete information. International Journal of Game Theory, 4, 7–24, 1975.
E. Lehrer. Lower equilibrium payoffs in two-player repeated games with non-observable actions. International Journal of Game Theory, 18, 57–89, 1989.
E. Lehrer. Nash equilibria of n-player repeated games with semi-standard information. International Journal of Game Theory, 19, 191–217, 1990.
E. Lehrer. Correlated equilibria in two-player repeated games with non-observable actions. Mathematics of Operations Research, 17, 175–199, 1992a.
E. Lehrer. On the equilibrium payoffs set of two-player repeated games with imperfect monitoring. International Journal of Game Theory, 20, 211–226, 1992b.
E. Lehrer. Two-player repeated games with nonobservable actions and observable payoffs. Mathematics of Operations Research, 17, 200–224, 1992c.
R.B. Myerson. Optimal coordination mechanisms in generalized principal-agent problems. Journal of Mathematical Economics, 10, 67–81, 1982.
R.B. Myerson. Multistage games with communication. Econometrica, 54, 323–358, 1986.
J. Renault and T. Tomala. Repeated proximity games. International Journal of Game Theory, 27, 539–559, 1998.
J. Renault. 2-player repeated games with lack of information on one side and state independent signalling. Mathematics of Operations Research, 25, 552–572, 2000.
J. Renault, S. Scarlatti and M. Scarsini. A Folk theorem for minority games. Games and Economic Behavior, 53, 208–230, 2005.
J. Renault, S. Scarlatti and M. Scarsini. Discounted and finitely repeated minority games with public signals. Mathematical Social Sciences, 56, 44–74, 2008.
J. Renault and T. Tomala. Communication equilibria in supergames. Games and Economic Behavior, 49, 313–344, 2004.

Thanks for your attention!

29/29
