Introduction Model Results Discussion
Observations on Cooperation Yuval Heller (Bar Ilan) and Erik Mohlin (Lund)
Erice 2017
Heller & Mohlin
Observations on Cooperation
1 / 22
Motivating Example
Alice interacts with a remote trader, Bob.
Both agents have opportunities to shirk or cheat.
Alice obtains anecdotal evidence about Bob’s actions in a couple of past interactions.
Alice considers this information when deciding how to act.
Alice is unlikely to interact with Bob again.
Future partners may ask Bob about Alice’s behavior.
Research question: Can cooperation be sustained in such environments?
Underlying Game: The Prisoner’s Dilemma (PD)
Payoffs (row player, column player):
            c             d
  c       1, 1        −l, 1+g
  d     1+g, −l        0, 0
g > 0: gain of a greedy player.
l > 0: loss if the partner defects.
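The payoff matrix above can be written down as a small lookup, which makes it easy to check that defection is dominant while mutual cooperation still beats mutual defection. This is only an illustrative sketch: the function name `pd_payoffs` and the sample values of g and l are assumptions, not part of the talk.

```python
def pd_payoffs(a1, a2, g, l):
    """Return (payoff to player 1, payoff to player 2) in the PD above.

    a1 and a2 are 'c' (cooperate) or 'd' (defect); g > 0 and l > 0.
    """
    if g <= 0 or l <= 0:
        raise ValueError("the slides assume g > 0 and l > 0")
    table = {
        ("c", "c"): (1, 1),
        ("c", "d"): (-l, 1 + g),
        ("d", "c"): (1 + g, -l),
        ("d", "d"): (0, 0),
    }
    return table[(a1, a2)]


g, l = 0.5, 2.0  # illustrative values only
# Defection is strictly better for player 1 whatever the partner does ...
assert pd_payoffs("d", "c", g, l)[0] > pd_payoffs("c", "c", g, l)[0]
assert pd_payoffs("d", "d", g, l)[0] > pd_payoffs("c", "d", g, l)[0]
# ... yet mutual cooperation gives each player more than mutual defection.
assert pd_payoffs("c", "c", g, l)[0] > pd_payoffs("d", "d", g, l)[0]
```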
Brief Summary of Results
1. Novel behavior supports stable cooperation (uniqueness in the restricted set of stationary strategies).
2. Stable cooperation requires observation of 2+ interactions.
3. Observation of partner’s past actions: g > l: only defection is stable. g < l: cooperation is stable (and robust to any noise).
4. Observation of action profiles: cooperation is stable iff g < (l+1)/2.
5. Optimal feedback: observing partner’s actions against cooperation.
Observation Structure and Environment Strategies and Steady States Solution Concept
Observation Structure and Environment
Basic stationary model (focus of the presentation):
Each player privately observes a sample of k actions played by his partner (against other opponents).
Agents are restricted to stationary strategies.
IID sampling from the partner’s stationary behavior.
Set of possible signals: m ∈ {0, 1, 2, ..., k} (interpreted as the number of observed defections).
Alternative model: unrestricted set of strategies, observing the last k actions. All the results hold except uniqueness.
Stationary Strategies
Definition (Strategy: s : {0, ..., k} → ∆({c, d}))
A mapping that assigns a mixed action to each possible observation.
Interpretation: the agent’s behavior conditional on the observed signal.
Strategy distribution: a distribution σ over the set of strategies (with finite support).
Interpretation: heterogeneous population.
Example of a Strategy Distribution
supp(σ) = {s_u, s_1, s_2}
s_u ≡ 50% (mixes equally between c and d after every signal)
s_1(m) = c if m = 0, d if m ≥ 1
s_2(m) = c if m ≤ 1, d if m ≥ 2
σ(s_u) = ε, σ(s_1) = (1−ε)/6, σ(s_2) = 5·(1−ε)/6
Consistent Signal Profile
Definition (Signal profile: θ : supp(σ) → ∆(M))
θ(s) is interpreted as the distribution of signals observed by agents who are matched with a partner who plays strategy s.
Definition (Consistent signal profile)
A signal profile θ and a strategy distribution σ jointly induce a behavior profile: a distribution of actions for each strategy.
The behavior profile induces a signal profile of observed actions.
Consistency: the induced signal profile is θ.
A strategy distribution may admit multiple consistent signal profiles. This multiplicity is ignored in this presentation.
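The consistency requirement is a fixed-point condition, and one way to make it concrete is to iterate it numerically. The sketch below assumes the basic model (the signal about a partner is a Binomial(k, β) count of defections, where β is that partner's overall defection rate) and uses the example distribution with s_u, s_1, s_2. The helper names, the choice ε = 0.06 and k = 2, and the starting point of zero defection are all assumptions for illustration. Because a distribution may admit several consistent profiles, the iteration finds only the one reached from that starting point.

```python
from math import comb


def binom_pmf(k, m, p):
    """Probability of observing m defections in k IID draws with defection rate p."""
    return comb(k, m) * p**m * (1 - p) ** (k - m)


def defect_prob(strategy, k, p):
    """Probability that `strategy` defects against a partner whose defection rate is p.

    `strategy` maps the signal m (number of observed defections) to a
    probability of defecting.
    """
    return sum(binom_pmf(k, m, p) * strategy(m) for m in range(k + 1))


def consistent_behavior(sigma, k, iterations=1000):
    """Iterate the consistency map, starting from zero defection.

    sigma maps a strategy name to (population weight, strategy function).
    Returns a dict: strategy name -> average defection rate.
    """
    beta = {name: 0.0 for name in sigma}
    for _ in range(iterations):
        # Each strategy meets partners drawn from sigma and reacts to their rates.
        beta = {
            name: sum(w * defect_prob(strat, k, beta[other])
                      for other, (w, _) in sigma.items())
            for name, (_, strat) in sigma.items()
        }
    return beta


eps, k = 0.06, 2  # illustrative values, not taken from the slides
sigma = {
    "s_u": (eps, lambda m: 0.5),
    "s_1": ((1 - eps) / 6, lambda m: 0.0 if m == 0 else 1.0),
    "s_2": (5 * (1 - eps) / 6, lambda m: 0.0 if m <= 1 else 1.0),
}
beta = consistent_behavior(sigma, k)
for name, rate in beta.items():
    print(f"{name}: defection rate {rate:.4f}")
```

The signal profile θ(s) then follows directly: it is the Binomial(k, β(s)) distribution for each strategy s.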
Commitment Strategies (“Crazy” Agents)
We refine our solution concept by requiring robustness to the presence of a few “crazy” agents (à la Kreps et al., 1982).
Definition (Distribution of Commitments: (Sc, λ))
Sc is a finite set of commitment strategies, and λ ∈ ∆(Sc) is a distribution over these strategies.
We assume that at least one of the commitment strategies is totally mixed.
Nash in Perturbed Environment
Definition (Perturbed environment)
A fraction ε of the population consists of committed agents, whose strategies are distributed according to λ ∈ ∆(Sc).
Definition (Nash equilibrium in a perturbed environment)
π(σ*) ≥ π_s(σ*) for every strategy s, where π(σ*) denotes the mean payoff of the 1 − ε “normal” agents and π_s(σ*) denotes the payoff to strategy s.
Perfect Equilibrium
Definition (Perfect equilibrium σ*)
The limit of Nash equilibria in some converging sequence of perturbed environments.
Definition (Strictly perfect action a*)
The limit behavior of Nash equilibria in any converging sequence of perturbed environments.
Taxonomy of PDs Observation of Actions Other Observation Structures
Results
Prisoner’s Dilemma - Taxonomy
Offensive (submodular; Takahashi, 2010) PD, l < g: stronger incentive to defect against a cooperative partner than against a defecting partner.
Defensive (supermodular) PD, l > g.
Acute PD, g > (l+1)/2: defection against a cooperator gains more than half of what the opponent loses.
Mild (mildly tempting) PD, g < (l+1)/2.
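The threshold (l+1)/2 comes from comparing the defector's gain g with the opponent's loss: a cooperator who is defected against gets −l instead of 1, a loss of l + 1. The two classifications are independent, so all four combinations can occur. The following sketch (the function name `classify_pd` and the sample parameters are assumptions made for illustration) labels a PD on both dimensions:

```python
def classify_pd(g, l):
    """Classify a PD with parameters g, l > 0 along the two dimensions above.

    Returns (incentive, temptation), where incentive is 'offensive' (g > l)
    or 'defensive' (g < l), and temptation is 'acute' (g > (l+1)/2) or
    'mild' (g < (l+1)/2). Knife-edge cases are reported as 'boundary'.
    """
    if g <= 0 or l <= 0:
        raise ValueError("the slides assume g > 0 and l > 0")
    if g > l:
        incentive = "offensive"
    elif g < l:
        incentive = "defensive"
    else:
        incentive = "boundary"
    threshold = (l + 1) / 2  # half of the opponent's loss l + 1
    if g > threshold:
        temptation = "acute"
    elif g < threshold:
        temptation = "mild"
    else:
        temptation = "boundary"
    return incentive, temptation


# One illustrative parameter pair per combination.
for g, l in [(0.5, 2.0), (1.8, 2.0), (0.5, 0.2), (3.0, 2.0)]:
    print(g, l, classify_pd(g, l))
```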
Stable Defection in any PD
Claim: Defection is a strictly perfect equilibrium action in any PD.
Defection is the Unique Outcome in Offensive PDs
Proposition
Assume an offensive PD (l < g) with observation of any number of actions. If σ* is a perfect equilibrium, then everyone defects.
Intuition: Assume to the contrary that σ* ≢ d.
The direct gain from defecting decreases in the partner’s probability of defection.
The indirect loss is independent of the current partner’s behavior.
⇒ Incumbents are less likely to defect when observing more defections.
⇒ If Alice always defects, she outperforms the incumbents.
Remark (Alternative model with unrestricted set of strategies)
A weaker result: full cooperation isn’t a perfect equilibrium.
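The first step of the intuition can be checked directly from the payoff matrix. Against a partner who defects with probability p, defecting rather than cooperating gains g if the partner cooperates (1+g instead of 1) and l if the partner defects (0 instead of −l), so the direct gain is (1−p)·g + p·l = g − p·(g − l). That expression falls in p exactly when g > l. The snippet below is a minimal sketch of this arithmetic; `direct_gain` and the sample numbers are illustrative assumptions.

```python
def direct_gain(p, g, l):
    """Immediate gain from defecting instead of cooperating against a partner
    who defects with probability p (payoffs from the PD matrix)."""
    return (1 - p) * g + p * l


# Offensive PD (g > l): the gain shrinks as the partner defects more often.
assert direct_gain(0.2, g=3.0, l=2.0) > direct_gain(0.8, g=3.0, l=2.0)
# Defensive PD (g < l): the gain grows as the partner defects more often.
assert direct_gain(0.2, g=1.0, l=2.0) < direct_gain(0.8, g=1.0, l=2.0)
```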
Stable Cooperation in Defensive PD
Proposition
Assume g ≤ l and observation of k ≥ 2 actions. Cooperation is strictly perfect. Moreover, there is essentially a unique strategy distribution that supports cooperation (uniqueness relies on the restriction to stationary strategies).
Essentially Unique Stable State
Everyone cooperates when observing no defections.
Everyone defects when observing ≥ 2 defections.
A fraction q of the incumbents, with 0 < q < 1/k, defect when observing 1 defection (i.e., q of the agents follow s_1, and the remaining 1−q follow s_2).
The value of q depends on the commitment strategies.
Other Observation Structures
What happens if the signal about the partner also depends on the behavior of other opponents against her?
We study three observation structures:
1. The entire action profile. Signals: {CC, DC, CD, DD}.
2. Mutual cooperation or not (= conflict). Signals: {CC, not-CC}.
3. Observing actions against cooperation. Signals: {CC, DC, ?D}.
Stable Cooperation when Observing Mutual Cooperation
Proposition
If players observe conflicts (i.e., CC or not) in at least two interactions, then cooperation is a perfect equilibrium iff the PD is mild (g < (l+1)/2).
Intuition
Mild PDs: The perfect state is similar to the previous results. Players condition their play on the number of observed conflicts.
Acute PDs: Involvement in a conflict has to be punished with probability of at least 1/2. Because both players have to be punished, each conflict induces at least one additional conflict ⇒ conflicts are “contagious”.
Stable Cooperation when Observing Action Profiles
Proposition
If players observe action profiles in at least two interactions, then cooperation is a perfect equilibrium iff the PD is mild (g < (l+1)/2).
Intuition
The high frequency of punishments required in acute PDs implies that the partner is more likely to defect when observing mutual defection (relative to observing the partner to be the sole defector).
⇒ Agents “punish” partners who defect against another defector ⇒ destabilizes cooperation.
In mild PDs, cooperation is stable (with essentially the same unique supporting behavior).
Observing Actions Against Cooperation
Proposition
If players observe actions against cooperation (i.e., {CC, DC, ?D}) in at least two interactions, then cooperation is perfect in any Prisoner’s Dilemma.
Intuition
No indirect loss of defecting against a defector, since it is not observed.
This makes it easier to incentivize agents to deter defection.
Providing more information to agents may harm cooperation.
Related Literature and Contribution Conclusion
Related literature (partial list): Community Enforcement
1. Contagious equilibria (e.g., Kandori, 1992; Ellison, 1994).
2. Applications of belief-free equilibria (Takahashi, 2010; Deb, 2012).
3. Image scoring (e.g., Nowak & Sigmund, 1998).
4. Exogenous reputation mechanisms (e.g., Sugden, 1986; Kandori, 1992).
5. Structured populations (Cooper & Wallace, 2004; Alger & Weibull, 2013).
6. Observation of preferences (e.g., Dekel et al., 2007; Herold, 2012).
Our Main Methodological Contributions
Robustness to a few crazy agents.
Companion Projects and Directions for Future Research
Companion working papers:
1. “When is Social Learning Path-Dependent?”: When does a distribution of stationary strategies uniquely determine the consistent behavior?
2. “Coevolution of deception and preferences”: Players can deceive others about their preferences and intentions.
Directions for future research:
Experiment to test the theoretical predictions.
Realistic, yet tractable, model of online feedback.
Studying non-negligible noise levels.
Conclusion
Introducing robustness against a few crazy agents into the setup of community enforcement (Prisoner’s Dilemma with random matching).
1. Unique novel behavior supports stable cooperation.
2. Stable cooperation requires observation of 2+ interactions.
3. Observation of partner’s past actions: g > l: only defection is stable. g < l: cooperation is stable.
4. Observation of action profiles: cooperation is stable iff g < (l+1)/2.
5. Optimal feedback: observing actions against cooperation.
Summary of Results - When is Cooperation Stable?
Category of PD                 Mild (g < (l+1)/2)     Acute (g > (l+1)/2)
                               Defen.    Offen.       Defen.    Offen.
Actions                        Y         N            Y         N
Conflicts / Action profiles    Y         Y            N         N
Actions against Coop.          Y         Y            Y         Y
Stable cooperation requires observation of 2+ interactions. Observing a single interaction: Cooperation is not stable if g > 1.
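The summary can be restated as a small decision rule. The sketch below assumes observation of at least two interactions and encodes only what the propositions state: observing actions supports cooperation when g ≤ l, observing conflicts or action profiles supports it iff g < (l+1)/2, and observing actions against cooperation supports it in any PD. The function name and the string labels for the observation structures are hypothetical.

```python
def cooperation_stable(structure, g, l):
    """Return True if cooperation is stable under the given observation
    structure (with k >= 2 observed interactions), per the summary above.

    structure is one of 'actions', 'conflicts', 'action_profiles',
    'actions_against_cooperation' (labels chosen for this sketch).
    """
    if g <= 0 or l <= 0:
        raise ValueError("the slides assume g > 0 and l > 0")
    if structure == "actions":
        return g <= l  # defensive PD; offensive PDs admit only defection
    if structure in ("conflicts", "action_profiles"):
        return g < (l + 1) / 2  # mild PD
    if structure == "actions_against_cooperation":
        return True  # any PD
    raise ValueError(f"unknown observation structure: {structure!r}")


# An offensive but mild PD: observing more (whole profiles) helps here,
# while observing only the partner's actions does not.
print(cooperation_stable("actions", g=0.5, l=0.2))
print(cooperation_stable("action_profiles", g=0.5, l=0.2))
```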
Backup Slides
Influence of Cheap Talk
Introducing cheap talk with an unrestricted language destabilizes the perfect equilibrium in which everyone defects.
Experimenting agents use a secret handshake to cooperate among themselves (Robson, 1990).
Implications (observation of actions + cheap talk):
Defensive PD: Only the cooperative equilibrium is stable.
Offensive PD: No stable equilibrium. The population state cycles between the defective and the cooperative equilibrium (as in the one-shot PD; see Wiseman & Yilankaya, 2001).
Steady States and Payoffs - Details
Fact (standard fixed-point argument)
Each strategy distribution admits a consistent behavior (not necessarily unique).
Example (k = 3; each agent plays the mode, i.e., the most frequently observed action)
Three consistent behaviors: full cooperation, no cooperation, uniform mixing.
The payoff of each incumbent strategy (s ∈ supp(σ)) and the average payoff in the population are defined in the standard way:
π_s(σ, η) = ∑_{s′} σ(s′) · π(η_s(s′), η_{s′}(s))
π(σ, η) = ∑_{s ∈ supp(σ)} σ(s) · π_s(σ, η)
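The k = 3 example can be verified by hand. If everyone cooperates with probability p and each agent plays the majority action in a sample of three, the induced cooperation rate is P(at least 2 of 3 cooperate) = 3p²(1−p) + p³, and a consistent behavior is a fixed point of this map. The sketch below (with the hypothetical helper `mode_response`, and assuming an odd k so that ties cannot occur) confirms that p = 0, p = 1/2, and p = 1 are fixed points, and that other values of p are not.

```python
from math import comb


def mode_response(p, k=3):
    """Cooperation rate induced when every agent samples k actions, each
    cooperative with probability p, and plays the majority action.
    Assumes k is odd, so there are no ties."""
    return sum(comb(k, m) * p**m * (1 - p) ** (k - m)
               for m in range(k // 2 + 1, k + 1))


# The three consistent behaviors from the example.
for p in (0.0, 0.5, 1.0):
    assert abs(mode_response(p) - p) < 1e-12

# Away from these points the map moves p toward 0 or 1, not toward 1/2.
assert mode_response(0.3) < 0.3
assert mode_response(0.7) > 0.7
```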
Illustration of Stable Cooperation in Defensive PD
Illustration of Unstable Cooperation in Offensive PD