Dynamic Random Utility∗

Mira Frick

Ryota Iijima

Tomasz Strzalecki

Abstract

Under dynamic random utility, an agent (or population of agents) solves a dynamic decision problem subject to evolving private information. We analyze the fully general and non-parametric model, axiomatically characterizing the implied dynamic stochastic choice behavior. Relative to static or i.i.d. versions of the model, a key new feature is that choices appear history dependent: different sequences of past choices reflect different private information of the agent, and hence typically lead to different distributions of current choices. Our main axioms impose discipline on the amount of history dependence that choices can display: We identify simple equivalence classes of past choices that reflect the same private information and thus have the same implications for current choices. Additional conditions allow us to distinguish between two sources of history dependence (randomly evolving utility vs. gradual learning about fixed but unknown tastes) and to characterize the relationship between choice and taste persistence. An agent’s observed choices allow for essentially unique identification of her underlying utility process/private information. We relate our model to dynamic discrete choice models widely used in empirical work, highlighting new modeling tradeoffs in this literature. Finally, we extend our analysis to allow past consumption to directly affect the agent’s utility process, accommodating models of habit formation and experimentation.



This version: 5 June 2017. Frick: Yale University; Iijima: Yale University; Strzalecki: Harvard University. This research was supported by the National Science Foundation grant SES-1255062. We thank David Ahn, Dirk Bergemann, Jetlir Duraj, Drew Fudenberg, Daria Khromenkova, Yves Le Yaouanq, Jay Lu, Ariel Pakes, Larry Samuelson, Michael Whinston, as well as seminar audiences at ASU Economic Theory Conference, Berkeley, Bocconi, BU, Caltech Choice Conference, Harvard–MIT, LMU, Northwestern, Rochester, Stanford, and Yale.

1 Introduction

Random utility models are widely used throughout economics. In the static model, the agent chooses from her choice set after observing the realization of a random utility function U . In the dynamic model, the agent solves a dynamic decision problem, subject to a stochastic process (Ut ) of utilities. The key feature of the model is an informational asymmetry between the agent (who knows her realized utility) and the analyst (who does not). In both the static and dynamic setting, this asymmetry gives rise to choice behavior that appears stochastic to the analyst but is deterministic from the point of view of the agent.1 In the dynamic setting, the informational asymmetry has an additional key implication: If (Ut ) displays serial correlation, then choices will appear history dependent to the analyst. For example, we expect the agent’s probability of voting Republican in 2020 to be different conditional on voting Republican in 2016 than conditional on voting Democrat in 2016. This is because her past voting behavior reveals relevant information about her past political preferences, which we expect to be at least somewhat persistent. History dependence due to serially correlated private information is pervasive in applications, from education and career choices in labor economics to consumer brand choices in marketing. Recognizing that “ignoring serial correlation in unobservables [...] can lead to serious misspecification errors” (Norets, 2009), the dynamic discrete choice literature studying these settings has developed and estimated a number of models that can accommodate history dependent choices. However, as highlighted by Pakes (1986), a limitation of all these models is that they rely on specific parametric forms of serial correlation, making it “difficult to determine the robustness of the conclusions to the stochastic assumptions chosen.” This paper provides the first analysis of the fully general and non-parametric model of dynamic random utility. 
Our contribution is threefold: First, we axiomatically characterize the implied dynamic stochastic choice behavior, identifying the precise form of history dependence that can arise under serially correlated private information. Our axiomatization answers for the dynamic model a question that has given rise to an extensive literature in the static setting (see Section 9.1) while overcoming challenges, notably a “limited observability” problem explained below, that are new to the dynamic domain. Second, dynamic stochastic choice data allows us to distinguish between models that coincide in static settings. In particular, we show how to behaviorally separate two central forms of private information, random taste shocks vs. learning, and to distinguish history dependence due to serially correlated private information from consumption dependence, where past consumption affects current choices by directly shaping the agent’s utility process, an important distinction emphasized by Heckman (1981). Finally, our analysis sheds new light on modeling tradeoffs in the dynamic discrete choice literature.

1 An equivalent interpretation of the model is that the analyst observes a fixed population of heterogeneous individuals. Throughout the paper, we use “the agent” to refer to both interpretations.

Our model generalizes the static random expected utility framework of Gul and Pesendorfer (2006) to decision trees as defined by Kreps and Porteus (1978). Each period t, the agent chooses from a menu of lotteries over current consumptions and continuation menus by maximizing a random vNM utility Ut . A history ht−1 = (A0 , p0 , . . . , At−1 , pt−1 ) summarizes that the agent chose lottery p0 from menu A0 , then was faced with A1 and chose p1 , and so on. Observed behavior at t is given by a history dependent choice distribution ρt (·|ht−1 ), specifying the choice frequency ρt (pt , At |ht−1 ) of pt from any menu At that can arise after ht−1 . Turning to the axiomatic characterization, our first main insight is the following: The fact that history dependence arises purely as a result of serial correlation in (Ut ) entails two history independence conditions. Each condition identifies simple equivalence classes of histories that reveal the same private information to the analyst; if ht−1 and g t−1 are equivalent, then ρt (·|ht−1 ) and ρt (·|g t−1 ) are required to coincide. The first condition, contraction history independence, imposes equivalence if ht−1 can be obtained from g t−1 by eliminating some options that are irrelevant to choices along the history g t−1 . The second condition, linear history independence, imposes equivalence if ht−1 and g t−1 are “linear combinations” of each other. Theorem 1 shows that our most general model, dynamic random expected utility (DREU), is fully characterized by these two independence conditions along with a continuity condition and Gul and Pesendorfer’s (2006) axioms that ensure static random utility maximization at each history. In DREU, the stochastic process (Ut ) is unrestricted. We next study the special case in which (Ut ) is given by a Bellman equation: The agent has separable preferences over current consumption and continuation problems and is forward-looking with a correct assessment of option value. 
This allows us to distinguish evolving utility, where the agent faces taste shocks that evolve randomly over time, from its special case, gradual learning, in which the agent learns over time about her fixed but unknown tastes—two forms of private information that are indistinguishable in the static setting. To this end, we introduce a novel incomplete and history dependent revealed preference relation that infers from the agent’s choices her preference conditional on any particular realization of her private information. Evolving utility is then characterized by adapting axioms from the menu-preference literature: separability, flexibility, and dynamic sophistication (Theorem 2). The additional behavioral content of gradual learning is encapsulated by a consumption stationarity axiom, reflecting the martingale property of beliefs, along with a constant intertemporal tradeoff axiom (Theorem 3). Proposition 1 establishes identification results for the three representations. In DREU, the agent’s ordinal private information is uniquely pinned down; evolving utility, and more so gradual learning, impose discipline on the cardinal private information (Ut ); additionally, gradual learning allows for unique identification of the discount factor. A key challenge throughout our analysis is the following “limited observability” problem: In contrast with the static setting, where the analyst observes choices from all possible menus,


in the dynamic setting each history of past choices restricts the set of current and future choice problems. Over time, this severely limits the history-dependent choice data on which axioms can be imposed and from which (Ut ) can be inferred. We overcome this problem by means of the following extrapolation procedure (Definition 3): For any menu At and history ht−1 that does not lead to At , we define the agent’s counterfactual choice distribution from At following ht−1 by extrapolating from the situation where the agent plans to make the sequence of choices captured by ht−1 , but is sometimes forced, with exogenous probability, to instead make a sequence of choices that does lead to menu At . Invoking linear history independence, the latter sequence can be specified such that it reveals the same private information as the original choice sequence ht−1 , thus justifying the extrapolation. This extrapolation procedure relies crucially on the inclusion of lotteries as choice objects. We discuss the connection with similar uses of plausibly exogenous randomization to perform counterfactual analyses in empirical and experimental work. Section 6 discusses the relationship with the dynamic discrete choice (DDC) model, which is the workhorse model for structural estimation (e.g., Rust (1994)). We find that DREU nests the DDC model, which in turn nests a version of evolving utility in which the agent’s private information is fully summarized by her past realizations of consumption utility. We show that our uniqueness results are complementary to identification results in the DDC literature. Moreover, we point out that the versions of the DDC model most often used in practice are incompatible with evolving utility, as the two make opposite predictions about option value. 
In the evolving utility model the agent has positive option value: she likes bigger menus, as they provide her with more flexibility and wants to make her decisions as late as she can to condition on as much information as possible. On the other hand, motivated by the need for well-behaved likelihoods, the DDC model typically imposes a full support and/or i.i.d. assumption on utility shocks. Under either of these assumptions, the agent sometimes prefers to commit to smaller menus; under the i.i.d. assumption, she additionally prefers to make her decisions as early as possible. Thus, these most commonly used versions of the DDC model display a negative option value, failing to capture Bayesian learning. This points to a possible modeling tradeoff between having well-behaved likelihoods and positive option value. Finally, in Section 7, we extend our model to additionally allow past consumption to directly influence the agent’s current behavior by shaping her current preferences. We refer to this as consumption dependence, while reserving the term history dependence for the phenomenon discussed so far, where observed current behavior depends on past choices (rather than actual consumption) because different choices reflect different private information. Prominent examples of consumption dependence include habit formation, where consuming a certain good in the past may make the agent like it more in the present; and active learning/experimentation, where the agent’s consumption provides information to her about some payoff-relevant state of


the world. Making use of the fact that each chosen lottery can result in multiple consumption outcomes, we adapt our characterization to this setting, providing behavioral foundations for these models and distinguishing history from consumption dependence.

2 Static vs. Dynamic Random Utility

For any set $Y$, denote by $K_f(Y)$ the set of all nonempty finite subsets of $Y$ and by $\Delta(Y)$ the set of all simple (i.e., finite support) lotteries on $Y$; henceforth, all references to lotteries are to simple lotteries. Whenever $Y$ is a separable metric space, we endow $\Delta(Y)$ with the induced Prokhorov metric and $K_f(Y)$ with the Hausdorff metric. Let $\mathbb{R}^Y$ denote the set of vNM utility indices over $Y$, which is endowed with the product topology and its induced Borel sigma-algebra. For any $U, U' \in \mathbb{R}^Y$, write $U \approx U'$ if $U$ and $U'$ represent the same preference on $\Delta(Y)$. For any finite set of lotteries $A \in K_f(\Delta(Y))$, let $M(A, U) := \operatorname{argmax}_{p \in A} U(p)$ denote the set of lotteries in $A$ that maximize $U$, where $U(p) := \sum_{y \in \operatorname{supp}(p)} U(y) p(y)$ denotes the expected utility of any $p \in \Delta(Y)$.
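As a small worked instance of this notation, take $Y = \{x, y\}$ with $U(x) = 1$ and $U(y) = 0$, and the menu $A = \{\delta_x, q\}$ with $q = \tfrac{1}{2}\delta_x + \tfrac{1}{2}\delta_y$. The outcomes, the utility values, and the symbol $\delta_x$ for the lottery that yields $x$ for sure are illustrative assumptions, not taken from the paper.

```latex
\begin{align*}
U(\delta_x) &= U(x) \cdot 1 = 1, \\
U(q) &= U(x) \cdot \tfrac{1}{2} + U(y) \cdot \tfrac{1}{2} = \tfrac{1}{2}, \\
M(A, U) &= \operatorname*{argmax}_{p \in A} U(p) = \{\delta_x\}.
\end{align*}
```

Any $U' = aU + b$ with $a > 0$ represents the same preference on $\Delta(Y)$, so $U' \approx U$ and $M(A, U') = M(A, U)$.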

2.1 Static Random Utility

We first briefly review the static model of random expected utility that will serve as the building block of our dynamic representation at each history. As mentioned in the Introduction, there are two equivalent interpretations of the model: a single agent with a random utility function or a population of agents with heterogeneous utilities. The model is based on Gul and Pesendorfer (2006), but allows for an infinite outcome space; this extension is necessary for our purposes, because in the dynamic setting the period-$t$ outcome space $X_t$, consisting of all pairs of current consumptions and continuation menus, will be infinite in all but the final period.

2.1.1 Agent’s problem

Let $X$ be an arbitrary separable metric space of outcomes. The agent makes choices from menus, which are finite sets of lotteries over $X$; the set of all menus is $\mathcal{A} := K_f(\Delta(X))$. Denote a typical menu by $A$ and a typical lottery by $p$. Let $(\Omega, \mathcal{F}^*, \mu)$ be a finitely-additive probability space. In each state of the world, the agent’s choices maximize her expected utility subject to her private information. Her payoff-relevant private information is captured by a sigma-algebra $\mathcal{F} \subseteq \mathcal{F}^*$ and an $\mathcal{F}$-measurable random vNM utility index $U : \Omega \to \mathbb{R}^X$. In case of indifference, ties are broken by a random vNM index $W : \Omega \to \mathbb{R}^X$, which is measurable with respect to $\mathcal{F}^*$. Thus, when faced with menu $A$, the agent chooses lottery $p$ in state $\omega$ if and only if $p$ maximizes $U(\omega)$ in $A$ and, in case of ties, additionally maximizes $W(\omega)$ among the $U(\omega)$-maximizers; that is, $p \in M(M(A, U(\omega)), W(\omega))$.

For tractability, we follow Ahn and Sarver (2013) in assuming that the agent’s payoff-relevant private information $(\mathcal{F}, U)$ is simple, i.e., (i) $\mathcal{F}$ is generated by a finite partition such that $\mu(\mathcal{F}(\omega)) > 0$ for every $\omega \in \Omega$, where $\mathcal{F}(\omega)$ denotes the cell of the partition that contains $\omega$; and (ii) each $U(\omega)$ is nonconstant and $U(\omega) \not\approx U(\omega')$ whenever $\mathcal{F}(\omega) \neq \mathcal{F}(\omega')$. Moreover, the tie-breaker $W$ is proper,2 ensuring that in each menu ties occur with probability 0; that is, $\mu(\{\omega \in \Omega : |M(A, W(\omega))| = 1\}) = 1$ for all $A \in \mathcal{A}$.

2.1.2 Analyst’s problem

The analyst does not observe the agent’s private information and thus cannot condition on events in $\mathcal{F}$ (equivalently, in the population interpretation, the analyst does not observe the identities of the agents, just aggregate choice frequencies). Because of this informational asymmetry, the agent’s choices appear stochastic to the analyst. His observations are summarized by a stochastic choice rule on $\mathcal{A}$, i.e., a map $\rho : \mathcal{A} \to \Delta(\Delta(X))$ such that $\sum_{p \in A} \rho(p, A) = 1$ for all $A \in \mathcal{A}$. Here $\rho(p, A)$ denotes the frequency with which the agent chooses lottery $p$ when faced with menu $A$. If the agent behaves as in the previous section, then the event that the agent chooses $p$ from $A$ is $C(p, A) := \{\omega \in \Omega : p \in M(M(A, U(\omega)), W(\omega))\}$. Thus, the analyst’s observations are consistent with the previous section if $\rho(p, A) = \mu(C(p, A))$ for all $p$ and $A$. The following definition summarizes the static model:

Definition 1. A static random expected utility (REU) representation of the stochastic choice rule $\rho$ is a tuple $(\Omega, \mathcal{F}^*, \mu, \mathcal{F}, U, W)$ such that $(\Omega, \mathcal{F}^*, \mu)$ is a finitely-additive probability space, the sigma-algebra $\mathcal{F} \subseteq \mathcal{F}^*$ and the $\mathcal{F}$-measurable utility $U : \Omega \to \mathbb{R}^X$ are simple, the $\mathcal{F}^*$-measurable tiebreaker $W : \Omega \to \mathbb{R}^X$ is proper, and $\rho(p, A) = \mu(C(p, A))$ for all $p$ and $A$.

2.1.3 Characterization

For finite outcome spaces $X$, static REU representations have been characterized by Gul and Pesendorfer (2006) and Ahn and Sarver (2013). As a preliminary technical contribution, we extend their characterization to arbitrary separable metric spaces $X$. The first four conditions of the following axiom are the same as in Gul and Pesendorfer (2006). The fifth condition is a slight modification of the finiteness condition in Ahn and Sarver (2013).

Axiom 0. (Random Expected Utility)
(i). Regularity: If $A \subseteq A'$, then for all $p \in A$, $\rho(p; A) \geq \rho(p; A')$.
(ii). Linearity: For any $A$, $p \in A$, $\lambda \in (0, 1)$, and $q$, $\rho(p; A) = \rho(\lambda p + (1 - \lambda) q; \lambda A + (1 - \lambda)\{q\})$.
(iii). Extremeness: For any $A$, $\rho(\operatorname{ext} A; A) = 1$.3
(iv). Mixture Continuity: $\rho(\cdot; \alpha A + (1 - \alpha) A')$ is continuous in $\alpha$ for all $A, A'$.
(v). Finiteness: There exists $K > 0$ such that for all $A$, there is $B \subseteq A$ with $|B| \leq K$ such that for every $p \in A \setminus B$, there are sequences $p^n \to^m p$ and $B^n \to^m B$ with $\rho(p^n; \{p^n\} \cup B^n) = 0$ for all $n$.

2 This property is sometimes called “regular” in the literature; we use the term “proper” to avoid confusion with the Regularity axiom (Axiom 0 (i)) below.

For condition (iv), $\alpha \mapsto \rho(\cdot; \alpha A + (1 - \alpha) A')$ is viewed as a map from $[0, 1]$ to $\Delta(\Delta(X))$, where $\Delta(\Delta(X))$ is endowed with the topology of weak convergence induced by the Prokhorov metric on $\Delta(X)$. For condition (v), convergence in mixture, denoted $\to^m$, on $\Delta(X)$ and $\mathcal{A}$ is defined as follows: For any $p \in \Delta(X)$ and sequence $\{p^n\}_{n \in \mathbb{N}} \subseteq \Delta(X)$, we write $p^n \to^m p$ if there exists $q \in \Delta(X)$ and a sequence $\{\alpha_n\}_{n \in \mathbb{N}}$ with $\alpha_n \to 0$ such that $p^n = \alpha_n q + (1 - \alpha_n) p$ for all $n$. Similarly, for any sequence $\{B^n\}_{n \in \mathbb{N}} \subseteq \mathcal{A}$, we write $B^n \to^m p$ if there exists $B \in \mathcal{A}$ and a sequence $\{\alpha_n\}_{n \in \mathbb{N}}$ with $\alpha_n \to 0$ such that $B^n = \alpha_n B + (1 - \alpha_n)\{p\}$ for all $n$. Finally, for any $A \in \mathcal{A}$ and sequence $(A^n)_{n \in \mathbb{N}} \subseteq \mathcal{A}$, we write $A^n \to^m A$ if for each $p \in A$, there is a sequence $\{B_p^n\}_{n \in \mathbb{N}} \subseteq \mathcal{A}$ such that $B_p^n \to^m p$ and $A^n = \bigcup_{p \in A} B_p^n$ for all $n$.

Theorem 0. The stochastic choice rule $\rho$ on $\mathcal{A}$ admits an REU representation if and only if $\rho$ satisfies Axiom 0.

Proof. See Supplementary Appendix F.
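To make Definition 1 and the Regularity condition concrete, the following is a minimal sketch of a finite-state REU representation. The three states, their utilities and tie-breakers, and the names `STATES`, `LOTTERIES`, and `rho` are hypothetical. As a simplification, the sketch only verifies that the tie-breaker selects a unique lottery on the menus it evaluates, rather than properness on all menus.

```python
from fractions import Fraction as F

# Degenerate lotteries over outcomes x, y, z (outcome -> probability).
LOTTERIES = {"dx": {"x": F(1)}, "dy": {"y": F(1)}, "dz": {"z": F(1)}}

# Hypothetical finite state space: (mu(omega), U(omega), W(omega)).
STATES = [
    (F(1, 2), {"x": 1, "y": 1, "z": 0}, {"x": 2, "y": 1, "z": 0}),
    (F(1, 4), {"x": 0, "y": 1, "z": 0}, {"x": 0, "y": 2, "z": 1}),
    (F(1, 4), {"x": 0, "y": 0, "z": 1}, {"x": 0, "y": 1, "z": 2}),
]


def expected_utility(u, p):
    """U(p) = sum over the support of p of U(y) * p(y)."""
    return sum(u[y] * prob for y, prob in p.items())


def maximizers(menu, u):
    """M(A, U): the lotteries in `menu` with maximal expected utility."""
    values = {name: expected_utility(u, LOTTERIES[name]) for name in menu}
    best = max(values.values())
    return [name for name in menu if values[name] == best]


def rho(menu):
    """rho(p, A) = mu(C(p, A)), with ties in U broken by W."""
    freq = {name: F(0) for name in menu}
    for mass, u, w in STATES:
        chosen = maximizers(maximizers(menu, u), w)
        assert len(chosen) == 1, "tie-breaker must single out one lottery"
        freq[chosen[0]] += mass
    return freq


small = rho(("dx", "dy"))
large = rho(("dx", "dy", "dz"))
# Regularity (Axiom 0 (i)): adding dz cannot raise any choice frequency.
regular = all(small[p] >= large[p] for p in small)
```

In this sketch `dy` is chosen with frequency 1/2 from the smaller menu and 1/4 from the larger one, so Regularity holds, as Theorem 0 requires of any REU representation.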

2.2 Dynamic Random Utility

In many economic applications, the agent solves a dynamic decision problem subject to evolving, and in general serially correlated, private information. As in the static model, an equivalent interpretation is in terms of a population of agents with heterogeneous and evolving utilities, and we will move freely between the two interpretations.4 The following two examples illustrate some new features of the dynamic setting.

Example 1 (School choice). A growing literature in labor economics studies individuals’ school and curriculum choices, recognizing the importance of such choices for eventual labor market outcomes. In this vein, in Figure 1 (left) a student first decides whether to enroll in a regular high school or in a charter school that offers some benefits such as smaller class sizes. After enrolling, the student must pick a foreign language option. The regular school offers Spanish (S), Mandarin (M), and Japanese (J), while the charter school offers only Spanish and Mandarin.

3 Here $\operatorname{ext} A$ denotes the set of extreme points of $A$.
4 A special case of the model where all information is resolved at the beginning of time corresponds to a population of heterogeneous agents with fully persistent preferences, or a single agent with random but fully persistent preferences.

With serially correlated private information, choices will appear history dependent. Failure to account for this may lead to misspecified models, including spurious violations of random utility. For example, suppose enrollment shares for each language are as in Figure 1 (left); in particular, the share of students choosing Mandarin is larger at the regular (40%) than at the charter school (25%). Ignoring history dependence, this behavior appears inconsistent with random utility maximization as it violates Regularity (Axiom 0 (i)).5 However, it is entirely consistent with utility maximization subject to serially correlated private information: because of the dynamic selection effect, the distributions of preferences in the regular and charter schools are different, which leads to an apparent violation of the Regularity axiom. In Section 3 we will characterize precisely what kind of choice behavior is consistent with dynamic random utility. Another implication of history dependence is limited observability: Unlike in the static setting, where the analyst has (at least in principle) access to choice frequencies from all menus, in the dynamic setting certain past choices rule out certain future menus, so that over time the analyst observes choices from a more and more limited domain. For example, we cannot observe the counterfactual frequency with which charter school students would choose Japanese if it were available. In practice, however, many charter schools ration their seats via lotteries in some years, a fact that is widely exploited in the empirical literature on school choice to generate quasi-experimental variation.6 This is illustrated in Figure 1 (right), where upon applying to charter school each student is admitted with probability λ and must attend regular school otherwise.
We will see in Section 3.1 that the analyst can extrapolate the share of charter school students who would choose Japanese by looking at Japanese enrollment among students who applied to charter school but were rejected by the lottery.

5 Here we treat language $x$ at regular school and at charter school as the same option. Differences across schools in instruction quality, class size etc. are incorporated into the period 0 instantaneous payoffs to charter vs. regular.
6 E.g., Abdulkadiroglu, Angrist, Narita, and Pathak (forthcoming); Angrist, Hull, Pathak, and Walters (forthcoming); Deming (2011); Deming, Hastings, Kane, and Staiger (2014).

Figure 1: School choice. (Left: language enrollment shares at the regular school, S 20%, M 40%, J 40%, and at the charter school, S 75%, M 25%. Right: the same tree when charter applicants are admitted by lottery with probability λ and attend the regular school with probability 1 − λ.)

Example 2 (Brand choice dynamics). A large literature in marketing studies consumer brand choice dynamics. A widely documented phenomenon is consumption persistence, whereby consumers who chose brand $z$ yesterday are more likely to choose $z$ again today than consumers who chose $z'$ yesterday. Our approach is helpful in distinguishing various explanations of this phenomenon. First, this behavior can be viewed as a manifestation of history dependence due to serially correlated utilities: If tastes $u_t$ are heterogeneous and somewhat persistent, then consumers who preferred $z$ yesterday are also more likely to prefer it today. Section 5.2 formalizes this by characterizing precisely which utility processes $(u_t)$ give rise to consumption persistence. Here past consumption has no causal connection with current consumption; it simply provides information about the consumer’s preferences to the analyst. An alternative explanation is consumption dependence, where consuming $z$ yesterday directly affects the consumer’s preference today, for example due to a process of habit formation. Section 7 characterizes choice behavior under consumption dependence, identifying precisely when the analyst must attribute a causal role to past consumption.

A related question concerns different interpretations of serially correlated utilities. The most general possibility is that the agent is subject to arbitrary correlated taste shocks. An important special case of this (e.g., Erdem and Keane (1996)) is a consumer with a fixed but unknown utility $\tilde{u}$, about which she learns over time; in this case, $u_t$ represents her expectation of $\tilde{u}$ given period $t$ beliefs. Our analysis in Section 4 identifies the additional behavioral implications of the learning model. Moreover, in the case of learning, Section 7 again enables us to distinguish between learning that is independent of past consumption (e.g., learning from advertising in Erdem and Keane 1996) and active learning/experimentation, where past consumption itself is the source of information (e.g., learning from experience in Erdem and Keane 1996).

In what follows, we develop and analyze a general model of dynamic random utility that encompasses these and similar examples.

2.2.1 Agent’s problem

The agent faces a decision tree, as defined by Kreps and Porteus (1978). There are finitely many periods $t = 0, 1, \ldots, T$. There is a finite set $Z$ of instantaneous consumptions. Each period $t$, the agent chooses from a period-$t$ menu, which is a finite set of lotteries over the period-$t$ outcome space $X_t$. The spaces $X_t$ are defined recursively. The final period outcome space $X_T := Z$ is just the space of instantaneous consumptions; the set of all period-$T$ menus is $\mathcal{A}_T := K_f(\Delta(X_T))$. In all earlier periods $t \leq T - 1$, the outcome space $X_t := Z \times \mathcal{A}_{t+1}$ consists of all pairs of current period consumptions and next period continuation menus; the set of period-$t$ menus is $\mathcal{A}_t := K_f(\Delta(X_t))$. Denote a typical period-$t$ lottery by $p_t \in \Delta(X_t)$ and a typical menu by $A_t \in \mathcal{A}_t$. The agent’s choice of $p_t \in A_t$ determines both her instantaneous consumption $z_t$ and the menu $A_{t+1}$ from which she will choose next period; let $p_t^Z \in \Delta(Z)$ and $p_t^A \in \Delta(\mathcal{A}_{t+1})$ denote the respective marginal distributions. It can be verified recursively that each $X_t$ is a separable metric space under the appropriate topologies (see Lemma 12).

As in the static model, let $(\Omega, \mathcal{F}^*, \mu)$ be a finitely-additive probability space. Under dynamic random expected utility (DREU), in each state of the world, the agent’s choices each period maximize her expected utility subject to her dynamically evolving private information. The agent’s payoff-relevant private information is captured by a filtration $(\mathcal{F}_t)_{0 \leq t \leq T} \subseteq \mathcal{F}^*$ and an $\mathcal{F}_t$-adapted process of random vNM utility indices $U_t : \Omega \to \mathbb{R}^{X_t}$ over $X_t$. This allows for arbitrary serial correlation of utilities, but does not allow the utility process to depend on past consumption; Section 7 relaxes the latter restriction. In case of indifference, ties at each $t$ are broken by a random $\mathcal{F}^*$-measurable vNM utility index $W_t : \Omega \to \mathbb{R}^{X_t}$, where we impose dynamic analogs of simplicity and properness that we define at the end of this section. Thus, as before, when faced with menu $A_t$ in period $t$, the agent chooses lottery $p_t$ in state $\omega$ if and only if $p_t \in M(M(A_t, U_t(\omega)), W_t(\omega))$.

DREU is a very general model because it imposes no particular structure on the family $(U_t)$.
This is the most parsimonious setting in which to isolate the behavioral implications of serially correlated private information.7 However, most economic applications additionally assume that the agent is forward-looking and correctly takes her future behavior into account. To capture this, we also analyze the special case of evolving utility, where there is an $\mathcal{F}_t$-adapted process of state-dependent felicity functions $u_t : \Omega \to \mathbb{R}^Z$ over instantaneous consumptions and a discount factor8 $\delta > 0$ such that $U_T = u_T$ and $U_t$ is given by the Bellman equation

$$U_t(z_t, A_{t+1}) = u_t(z_t) + \delta\, \mathbb{E}\left[ \max_{p_{t+1} \in A_{t+1}} U_{t+1}(p_{t+1}) \,\middle|\, \mathcal{F}_t \right] \tag{1}$$

for all $t \leq T - 1$ and $(z_t, A_{t+1}) \in X_t$.

7 DREU could also accommodate various “behavioral” effects, such as temptation, regret, imperfect foresight, etc. but we do not pursue this direction in the current paper.
8 The discount factor is fully identified under gradual learning, while evolving utility only implies that $\delta > 0$.

Finally, as discussed in Example 2, an important special case of evolving utility arises when the agent has a fixed but unknown felicity about which she learns over time. To identify the added behavioral content of this, we study gradual learning, which additionally imposes that there is some $\mathcal{F}^*$-measurable state-dependent felicity $\tilde{u} : \Omega \to \mathbb{R}^Z$ such that for all $t$, we have9

$$u_t = \mathbb{E}[\tilde{u} \mid \mathcal{F}_t]. \tag{2}$$
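The Bellman equation (1) can be sketched by backward induction on a finite state space. The example below is a minimal illustration with $T = 1$; the two states, the felicities, the trivial period-0 information, the discount factor 9/10, and all function names are hypothetical choices, not taken from the paper.

```python
from fractions import Fraction as F

# Hypothetical two-state example with T = 1 and Z = {"a", "b"}.
MU = {"w1": F(1, 2), "w2": F(1, 2)}
# F_0 is trivial (a single cell); in period 1 the agent knows the state.
CELL_0 = {"w1": ("w1", "w2"), "w2": ("w1", "w2")}
# Period-0 felicity u_0 (constant across states, hence F_0-measurable).
FELICITY_0 = {w: {"a": F(1), "b": F(0)} for w in MU}
# Period-1 felicity u_1, which equals U_1 because T = 1.
FELICITY_1 = {"w1": {"a": F(1), "b": F(0)},
              "w2": {"a": F(0), "b": F(1)}}


def expected_utility(u, p):
    """Expected utility of lottery p (consumption -> probability)."""
    return sum(u[z] * prob for z, prob in p.items())


def continuation_value(menu_1, omega):
    """E[ max_{p_1 in A_1} U_1(p_1) | F_0 ] evaluated at state omega."""
    cell = CELL_0[omega]
    mass = sum(MU[w] for w in cell)
    total = sum(
        MU[w] * max(expected_utility(FELICITY_1[w], p) for p in menu_1)
        for w in cell
    )
    return total / mass


def utility_0(z, menu_1, omega, delta):
    """Equation (1) at t = 0: U_0(z_0, A_1) = u_0(z_0) + delta * E[max U_1 | F_0]."""
    return FELICITY_0[omega][z] + delta * continuation_value(menu_1, omega)


DELTA = F(9, 10)
only_a = [{"a": F(1)}]               # continuation menu {delta_a}
both = [{"a": F(1)}, {"b": F(1)}]    # continuation menu {delta_a, delta_b}
commit = utility_0("a", only_a, "w1", DELTA)
flexible = utility_0("a", both, "w1", DELTA)
```

In this sketch the larger continuation menu is worth more (19/10 versus 29/20), in line with the positive option value that the Introduction attributes to evolving utility.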

For all three models, we impose the following dynamic analogs of simplicity and properness. The pair $(\mathcal{F}_t, U_t)_{0 \leq t \leq T}$ is simple, i.e., (i) each $\mathcal{F}_t$ is generated by a finite partition such that $\mu(\mathcal{F}_t(\omega)) > 0$ for every $\omega \in \Omega$, where $\mathcal{F}_t(\omega)$ again denotes the cell of the partition that contains $\omega$; and (ii) each $U_t(\omega)$ is nonconstant and $U_t(\omega) \not\approx U_t(\omega')$ whenever $\mathcal{F}_t(\omega) \neq \mathcal{F}_t(\omega')$ and $\mathcal{F}_{t-1}(\omega) = \mathcal{F}_{t-1}(\omega')$.10 The tiebreakers $(W_t)_{0 \leq t \leq T}$ are proper, i.e., (i) $\mu(\{\omega \in \Omega : |M(A_t, W_t(\omega))| = 1\}) = 1$ for all $A_t \in \mathcal{A}_t$; (ii) conditional on $\mathcal{F}_T(\omega)$, $W_0, \ldots, W_T$ are independent; and (iii) $\mu(W_t \in B_t \mid \mathcal{F}_T(\omega)) = \mu(W_t \in B_t \mid \mathcal{F}_t(\omega))$ for all $t$.11

2.2.2 Analyst’s problem

As in the static setting, the agent’s choices in each period $t$ appear stochastic to the analyst, because he does not have access to the agent’s private information. The novel feature of the dynamic setting is that the analyst can observe the agent’s past choices. With serially correlated utilities, these choices convey some information about the payoff-relevant private information $\mathcal{F}_t$, so that the agent’s behavior additionally appears history dependent to the analyst. This is captured by a dynamic stochastic choice rule $\rho$, which for any period $t$ and history of past choices summarizes the observed choice frequencies from any menu $A_t$ that can arise after this history.

We define choice frequencies and histories recursively. Choice frequencies in period 0 are given by a (static) stochastic choice rule $\rho_0 : \mathcal{A}_0 \to \Delta(\Delta(X_0))$ on $\mathcal{A}_0$; thus, $\sum_{p_0 \in A_0} \rho_0(p_0; A_0) = 1$ for all $A_0$ and $\rho_0(p_0; A_0)$ denotes the frequency with which the agent chooses lottery $p_0$ when faced with menu $A_0$. The choices that occur with strictly positive probability under $\rho_0$ define the set of all period 0 histories $\mathcal{H}_0 := \{(A_0, p_0) : \rho_0(p_0; A_0) > 0\}$. For any history $h^0 = (A_0, p_0) \in \mathcal{H}_0$, let $\mathcal{A}_1(h^0) := \operatorname{supp} p_0^A$ denote the set of period 1 menus that follow $h^0$ with positive probability.

For each $t = 1, \ldots, T$ and history $h^{t-1} \in \mathcal{H}_{t-1}$, choice frequencies following $h^{t-1}$ are given by a stochastic choice rule $\rho_t(\cdot \mid h^{t-1}) : \mathcal{A}_t(h^{t-1}) \to \Delta(\Delta(X_t))$ on the set $\mathcal{A}_t(h^{t-1})$ of period $t$ menus that follow $h^{t-1}$ with positive probability; thus, $\sum_{p_t \in A_t} \rho_t(p_t; A_t \mid h^{t-1}) = 1$ for all $A_t \in \mathcal{A}_t(h^{t-1})$ and $\rho_t(p_t; A_t \mid h^{t-1})$ denotes the frequency with which the agent chooses $p_t$ when faced with menu $A_t$ after history $h^{t-1}$. The set of period-$t$ histories is $\mathcal{H}_t := \{(h^{t-1}, A_t, p_t) : h^{t-1} \in \mathcal{H}_{t-1} \text{ and } A_t \in \mathcal{A}_t(h^{t-1}) \text{ and } \rho_t(p_t; A_t \mid h^{t-1}) > 0\}$; this contains all sequences $(A_0, p_0, \ldots, A_t, p_t)$ of choices up to time $t$ that arise with positive probability. Finally, for each $t \leq T - 1$, the set of period $t + 1$ menus that follow history $h^t = (h^{t-1}, A_t, p_t)$ with positive probability is $\mathcal{A}_{t+1}(h^t) := \operatorname{supp} p_t^A$ and the set of period-$t$ histories that lead to $A_{t+1}$ with positive probability is $\mathcal{H}_t(A_{t+1}) := \{h^t \in \mathcal{H}_t : A_{t+1} \in \mathcal{A}_{t+1}(h^t)\}$.

9 Gradual learning is a model of passive learning, because the agent’s choices do not affect her filtration $\mathcal{F}_t$. The more general model in Section 7 can accommodate active learning, or experimentation, where each period the agent obtains additional information from her consumption $z_t$.
10 For $t = 0$, we let $\mathcal{F}_{t-1}(\omega) := \Omega$ for all $\omega$.
11 (ii) rules out additional serial correlation of tiebreakers, over and above the serial correlation inherent in the agent’s payoff-relevant private information $\mathcal{F}_T(\omega)$. (iii) ensures that to the extent that period-$t$ tie breaking relies on payoff-relevant private information, it can rely only on the information $\mathcal{F}_t(\omega)$ available at $t$.

Two features of the primitive are worth noting: First, reflecting limited observability, for each $t \geq 1$ and history $h^{t-1} \in \mathcal{H}_{t-1}$, the stochastic choice rule $\rho_t(\cdot \mid h^{t-1})$ is defined only on the subset $\mathcal{A}_t(h^{t-1}) \subseteq \mathcal{A}_t$ of period $t$ menus that arise with positive probability after $h^{t-1}$, typically very few menus. Nevertheless, Section 3.2 will show that under DREU the analyst can extrapolate from $\rho_t(\cdot \mid h^{t-1})$ to a well-defined stochastic choice rule on the whole of $\mathcal{A}_t$. Second, histories $h^{t-1} = (A_0, p_0, \ldots, A_{t-1}, p_{t-1})$ only summarize the agent’s past choices and do not keep track of realized consumptions $z_k \in \operatorname{supp} p_k^Z$. This is without loss in the current model where utilities are not affected by past consumption, but will be relaxed in the model with consumption-dependence in Section 7.

Under DREU, the private information revealed to the analyst by history $h^{t-1} = (A_0, p_0, \ldots, A_{t-1}, p_{t-1})$ is given by the event $C(h^{t-1}) := \bigcap_{k=0}^{t-1} C(p_k, A_k)$, where for each $k$ the event $C(p_k, A_k) := \{\omega \in \Omega : p_k \in M(M(A_k, U_k(\omega)), W_k(\omega))\}$ that the agent chooses $p_k$ when faced with $A_k$ is defined as in the static model.12 Thus, the analyst’s observations are consistent with DREU if the frequency with which the agent chooses $p_t$ from $A_t$ following history $h^{t-1}$ is equal to the conditional probability $\mu[C(p_t, A_t) \mid C(h^{t-1})]$ of the event $C(p_t, A_t)$ given $C(h^{t-1})$. The following definition summarizes the dynamic model:

Definition 2.
A dynamic random expected utility (DREU) representation of the dynamic stochastic choice rule $\rho$ is a tuple $(\Omega, \mathcal{F}^*, \mu, (\mathcal{F}_t, U_t, W_t)_{0 \le t \le T})$ such that $(\Omega, \mathcal{F}^*, \mu)$ is a finitely-additive probability space, the filtration $(\mathcal{F}_t) \subseteq \mathcal{F}^*$ and the $\mathcal{F}_t$-adapted utility process $U_t : \Omega \to \mathbb{R}^{X_t}$ are simple, the $\mathcal{F}^*$-measurable tiebreaking process $W_t : \Omega \to \mathbb{R}^{X_t}$ is proper, and for all $p_t \in A_t$ and $h^{t-1} \in \mathcal{H}_{t-1}(A_t)$,

$$\rho_t(p_t; A_t | h^{t-1}) = \mu\left[ C(p_t, A_t) \mid C(h^{t-1}) \right], \tag{3}$$

where for $t = 0$, we abuse notation by letting $C(h^{t-1}) := \Omega$ and $\rho_0(p_0; A_0 | h^{-1}) := \rho_0(p_0; A_0)$. An evolving utility representation is a DREU representation along with an $\mathcal{F}_t$-adapted process of felicities $u_t : \Omega \to \mathbb{R}^Z$ and a discount factor $\delta > 0$ such that (1) holds. A gradual learning representation is an evolving utility representation along with an $\mathcal{F}^*$-measurable felicity $\tilde{u} : \Omega \to \mathbb{R}^Z$ such that (2) holds.

12 Note that $C(h^{t-1})$ does not keep track of the random realizations of menus $A_k \in \operatorname{supp} p_{k-1}^A$ along the sequence $h^{t-1}$, as this exogenous randomness does not reveal anything about the agent’s private information.
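As a rough illustration of equation (3), the following sketch computes choice probabilities as conditional probabilities on a small finite state space. The states, probabilities, item names, and utilities are all hypothetical; menus contain only deterministic items, continuation menus are ignored, and ties are assumed away so that the tiebreaker $W_t$ plays no role.

```python
from fractions import Fraction

# Hypothetical finite state space: each state has a probability and one utility
# function (a dict over items) per period. Ties are assumed away, so the
# tiebreaker W_t plays no role in this sketch.
STATES = {
    "w1": (Fraction(1, 4), [{"x": 2, "y": 1}, {"a": 2, "b": 1}]),
    "w2": (Fraction(1, 4), [{"x": 2, "y": 1}, {"a": 1, "b": 2}]),
    "w3": (Fraction(1, 2), [{"x": 1, "y": 2}, {"a": 1, "b": 2}]),
}


def event(period, choice, menu):
    """C(p_k, A_k): states in which `choice` maximizes U_k(omega) on `menu`."""
    return {
        w for w, (_, utils) in STATES.items()
        if all(utils[period][choice] >= utils[period][other] for other in menu)
    }


def prob(ev):
    """mu of an event, given as a set of state names."""
    return sum(STATES[w][0] for w in ev)


def rho(choice, menu, history=()):
    """Equation (3): mu[C(p_t, A_t) | C(h^{t-1})], history = ((A_0, p_0), ...)."""
    t = len(history)
    cond = set(STATES)  # C(h^{-1}) := Omega when the history is empty
    for k, (past_menu, past_choice) in enumerate(history):
        cond &= event(k, past_choice, past_menu)
    return prob(event(t, choice, menu) & cond) / prob(cond)


# Period-1 choice probabilities depend on which period-0 choice was observed.
after_x = rho("a", ("a", "b"), ((("x", "y"), "x"),))
after_y = rho("a", ("a", "b"), ((("x", "y"), "y"),))
print(after_x, after_y)
```

In this toy population, observing the period-0 choice of `x` rather than `y` changes the predicted period-1 choice distribution, which is the history dependence the model is designed to capture.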

2.2.3 Discussion

Lotteries as choice objects: In addition to allowing us to model choice behavior under risk, including lotteries in the domain of choice simplifies our analysis, as it allows us to rely on the static framework of Gul and Pesendorfer (2006) instead of the more complicated one of Falmagne (1978). Lotteries play a similar technical role in the original work of Kreps and Porteus (1978), by letting them rely on the vNM framework.13 From a conceptual point of view, we will see in Section 3.2 that lotteries are crucial in overcoming the aforementioned limited observability problem and we illustrate the availability of lotteries for this purpose with examples from experimental and empirical work. Relatedly, lotteries are key in inferring the agent’s history dependent revealed preference in Section 4.1 and in disentangling history dependence due to serially correlated private information from consumption dependence in Section 7.

Interpretation of data: As with static stochastic choice, the dynamic stochastic choice rule ρ admits two equivalent interpretations: The analyst either (i) repeatedly observes a single agent solve each decision tree;14 or (ii) observes a large population of agents (with heterogeneous and evolving utilities) solve each decision tree once. In either case, ρ captures the limiting choice frequencies as the number of observations/population size tends to infinity. Assuming the statistical inference problem away in this manner is also typical in the econometric analysis of identification. In any application, the data set will of course be finite. However, studying behavior on the full domain is an important step in uncovering all the assumptions that are behind the model; moreover, statistical tests are often directly inspired by axioms.15

Dynamic stochastic choice vs. ex ante preference: In our framework, the analyst observes the distribution of choices at each node of each decision tree; as we pointed out, the randomness in choice comes from an informational asymmetry between the agent and the analyst that occurs in each period. The existing decision theoretic literature (e.g. Gul and Pesendorfer, 2004; Krishna and Sadowski, 2014) typically studies a deterministic preference over decision trees: there is a special time period in which there is no informational asymmetry.16 Compared with this literature, our approach identifies the implications of the model about behavior in each period, and also across periods, not just at the beginning of time. Moreover, our model is more general, as it relaxes the assumption that the agent and the analyst are on the same footing in period zero. We discuss these issues in more detail in Section 9.1. In Section 4.1 we show how to extract the deterministic preference relation from the stochastic choice data.

Role of axioms: In addition to their usual positive and normative role, we view our axioms as serving an equally important purpose as conceptual tools that elucidate key properties of any dynamic random utility model and facilitate comparisons between different versions of the model. For example, our axioms in Section 3.1 clarify the nature of history dependence that can arise under any dynamic random expected utility model; our axioms in Section 4.3 identify the additional behavioral content of gradual learning relative to evolving utility; and our comparison of the evolving utility and DDC models in Section 6 draws on the axioms to uncover that the two make opposite predictions about option value.

13 Likewise, the ambiguity aversion literature extensively relies on the Anscombe and Aumann (1963) framework rather than the more complicated one of Savage (1972); the notable exceptions include Gilboa (1987) and Epstein (1999). Similarly, the menu-preference literature uses lotteries (e.g. Dekel, Lipman, and Rustichini, 2001) to improve upon the uniqueness and comparative statics results of Kreps (1979).

14 Here, the agent’s utilities are assumed to evolve according to the same process $U_t$ at each observation.

15 For example, Hausman and McFadden (1984) develop a test of the IIA axiom that characterizes the logit model. Likewise, Kitamura and Stoye (2013) develop axiom-based tests of the static random utility model.

16 Ahn and Sarver (2013) study a two period model where in the first period there is no asymmetry so a deterministic preference is observed, while in the second period the agent receives a private signal so observed choices are random.

3 Characterization of DREU

DREU is characterized by four axioms, which we present in the following subsections. The first subsection presents two history independence axioms that capture the key new implications of the dynamic relative to the static model. Building on this, the next subsection shows how the analyst can extrapolate from each ρt (·|ht−1 ) to an extended choice rule on the whole of At , thus overcoming the limited observability problem. The final subsection then imposes the static REU conditions as well as a technical history continuity axiom on this extended choice rule.

3.1 History Independence Axioms

Our first two axioms identify two cases in which histories $h^{t-1}$ and $g^{t-1}$ reveal the same information to the analyst. Capturing the fact that history dependence arises in DREU only through the private information revealed by past choices, the axioms require that period-$t$ choice behavior be the same after two such histories.

Given $h^{t-1} = (A_0, p_0, \ldots, A_{t-1}, p_{t-1}) \in \mathcal{H}_{t-1}$, let $(h^{t-1}_{-k}, (A_k', p_k'))$ denote the sequence of the form $(A_0, p_0, \ldots, A_k', p_k', \ldots, A_{t-1}, p_{t-1})$.17 We say that $g^{t-1} \in \mathcal{H}_{t-1}$ is contraction equivalent to $h^{t-1}$ if for some $k$, we have $g^{t-1} = (h^{t-1}_{-k}, (B_k, p_k))$, where $A_k \subseteq B_k$ and $\rho_k(p_k, A_k | h^{k-1}) = \rho_k(p_k, B_k | h^{k-1})$.18 That is, $g^{t-1}$ and $h^{t-1}$ differ only in period $k$, where under $g^{t-1}$, the agent chooses lottery $p_k$ from menu $B_k$, while under $h^{t-1}$, she chooses the same lottery $p_k$ from the contraction $A_k \subseteq B_k$; moreover, conditional on $h^{k-1}$, the choice of $p_k$ from $A_k$ and the choice of $p_k$ from $B_k$ occur with the same probability. Axiom 1 requires that choice behavior be the same after $h^{t-1}$ and $g^{t-1}$.

Axiom 1 (Contraction History Independence). For all $t \le T$, if $g^{t-1} \in \mathcal{H}_{t-1}(A_t)$ is contraction equivalent to $h^{t-1} \in \mathcal{H}_{t-1}(A_t)$, then $\rho_t(\cdot, A_t | h^{t-1}) = \rho_t(\cdot, A_t | g^{t-1})$.

To see the idea, suppose for simplicity that $T = 1$, in which case the axiom requires that for any $p_0 \in A_0 \subseteq B_0$ if $\rho_0(p_0, A_0) = \rho_0(p_0, B_0)$, then $\rho_1(\cdot, A_1 | A_0, p_0) = \rho_1(\cdot, A_1 | B_0, p_0)$ for any $A_1 \in \operatorname{supp} p_0^A$. In general, the event that $p_0$ is the best element of menu $B_0$ is a subset of the event that $p_0$ is the best element of the smaller menu $A_0 \subseteq B_0$; thus, observing $g^0 = (B_0, p_0)$ may reveal more information about the agent’s possible period-0 preferences than $h^0 = (A_0, p_0)$. However, since we additionally know that $\rho_0(p_0, A_0) = \rho_0(p_0, B_0)$, the event that $p_0$ is best in $A_0$ but not in $B_0$ must have probability 0; in other words, we must put zero probability on any preference that selects $p_0$ from $A_0$ but not from $B_0$. Given this, $h^0$ and $g^0$ reveal the same information, and hence call for the same predictions for period-1 choices. The following example illustrates this with a concrete story in the population setting.

Example 3. There are two convenience stores, A and B, each carrying three types of milk (whole, 2%, and 1%). Each store has a stable set of weekly customers whose stochastic process of preferences is identical at both stores.19 Suppose that in week 0, store A’s delivery of 1% milk breaks down unexpectedly. The purchasing shares at each store are given in Tables 1 and 2.

product    market share
whole      40%
2%         60%

Table 1: Market shares at store A

product    market share
whole      40%
2%         35%
1%         25%

Table 2: Market shares at store B

Consider a customer of store A, Alice, and a customer of store B, Barbara, who both buy whole milk in week 0. If in week 1 all types of milk are available again at both stores, then Contraction History Independence implies that Alice and Barbara’s choice probabilities will be the same. This is true because we have the same information about Alice and Barbara. Since at store A only whole and 2% milk were available in week 0, the possible week-0 preferences of Alice are $w \succ 2 \succ 1$ or $w \succ 1 \succ 2$ or $1 \succ w \succ 2$. By contrast, since store B stocked all three types of milk, Barbara’s possible preferences are $w \succ 2 \succ 1$ or $w \succ 1 \succ 2$. However, since we additionally know that the share of customers purchasing whole milk in week 0 was the same at both stores, $\rho_0(w, \{w, 1, 2\}) = \rho_0(w, \{w, 2\}) = 0.4$, we can also conclude that no customers had the ranking $1 \succ w \succ 2$ in week 0. Therefore, the analyst’s prediction is the same, since the stochastic process that governs the transition from week-0 to week-1 preferences is the same for Barbara and Alice and in both cases the analyst conditions on exactly the same week-0 event $\{w \succ 2 \succ 1,\ w \succ 1 \succ 2\}$. ▲

17 In general this is not a history, but it is if $A_k' \in \operatorname{supp} p_{k-1}^A$ and $A_{k+1} \in \operatorname{supp} (p_k')^A$ and $\rho_k(p_k', A_k' | h^{k-1}) > 0$.

18 This induces an equivalence relation on $\mathcal{H}_{t-1}$ by taking the symmetric and transitive closure. Axiom 1 implies that period-$t$ choice behavior is history independent within each equivalence class.

19 For simplicity, we assume in the following that all preferences are strict.

The second history independence axiom takes into account that the agent is an expected utility maximizer. Under expected utility maximization, choosing $p_k$ from $A_k$ reveals the same information about the agent’s utility as choosing $\lambda p_k + (1-\lambda) q_k$ from $\lambda A_k + (1-\lambda)\{q_k\}$. More generally, for a menu $B_k$, if we know that the agent chose some option of the form $\lambda p_k + (1-\lambda) q_k$ from $\lambda A_k + (1-\lambda) B_k$ but we do not know what $q_k$ was, this again reveals the same information as choosing $p_k$ from $A_k$. This suggests the following notion of equivalence: We say that a finite set of histories $G^{t-1} \subseteq \mathcal{H}_{t-1}$ is linearly equivalent to $h^{t-1} = (A_0, p_0, \ldots, A_{t-1}, p_{t-1}) \in \mathcal{H}_{t-1}$ if

$$G^{t-1} = \{(h^{t-1}_{-k}, (\lambda A_k + (1-\lambda) B_k,\ \lambda p_k + (1-\lambda) q_k)) : q_k \in B_k\}$$

for some $k$, $B_k$, and $\lambda \in (0, 1]$. That is, $G^{t-1}$ is the collection of histories that differ from $h^{t-1}$ only at period $k$: Under $h^{t-1}$, the agent chooses $p_k$ from menu $A_k$, while $G^{t-1}$ summarizes all possible choices of the form $\lambda p_k + (1-\lambda) q_k$ from the menu $\lambda A_k + (1-\lambda) B_k$. By the above reasoning, $G^{t-1}$ reveals the same information about the agent as $h^{t-1}$. Thus, Axiom 2 requires period-$t$ choice behavior following the set of histories $G^{t-1}$ to be the same as conditional on $h^{t-1}$. To state this formally, define the choice distribution from $A_t$ following $G^{t-1} \subseteq \mathcal{H}_{t-1}(A_t)$,

$$\rho_t(\cdot, A_t | G^{t-1}) := \frac{\sum_{g^{t-1} \in G^{t-1}} \rho(g^{t-1})\, \rho_t(\cdot, A_t | g^{t-1})}{\sum_{g^{t-1} \in G^{t-1}} \rho(g^{t-1})},$$

to be the weighted average of all choice distributions $\rho_t(\cdot, A_t | g^{t-1})$ following histories in $G^{t-1}$, where for each $g^{t-1} = (\hat{A}_0, \hat{p}_0, \ldots, \hat{A}_{t-1}, \hat{p}_{t-1})$ its weight $\rho(g^{t-1}) := \prod_{k=0}^{t-1} \rho_k(\hat{p}_k, \hat{A}_k | g^{k-1})$ corresponds to the probability of the sequence of choices summarized by $g^{t-1}$.20

Axiom 2 (Linear History Independence). For all $t \le T$, if $G^{t-1} \subseteq \mathcal{H}_{t-1}(A_t)$ is linearly equivalent to $h^{t-1} \in \mathcal{H}_{t-1}(A_t)$, then $\rho_t(\cdot, A_t | h^{t-1}) = \rho_t(\cdot, A_t | G^{t-1})$.
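The weighted average in Axiom 2 can be checked numerically in a toy expected-utility population. The sketch below is only illustrative: the four states, their utilities over three prizes, the period-1 items `a` and `b`, and the menus `A`, `B` are hypothetical, utilities are chosen so that no ties arise, and continuation menus are ignored.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical population: (probability, period-0 utility over prizes (x, y, z),
# period-1 utility over items). Utilities are chosen so that no ties arise.
STATES = [
    (F(1, 4), (3, 2, 1), {"a": 2, "b": 1}),
    (F(1, 4), (3, 1, 2), {"a": 1, "b": 2}),
    (F(1, 4), (1, 3, 2), {"a": 1, "b": 2}),
    (F(1, 4), (1, 2, 3), {"a": 2, "b": 1}),
]


def eu(u, p):
    """Expected utility of lottery p (a probability vector) under utility u."""
    return sum(ui * pi for ui, pi in zip(u, p))


def mix(lam, p, q):
    """The lottery lam * p + (1 - lam) * q."""
    return tuple(lam * pi + (1 - lam) * qi for pi, qi in zip(p, q))


def chooses(u, p, menu):
    return all(eu(u, p) >= eu(u, q) for q in menu)


def rho0(p0, menu0):
    """Period-0 probability of choosing p0 from menu0."""
    return sum(pr for pr, u0, _ in STATES if chooses(u0, p0, menu0))


def rho1(item, menu1, p0, menu0):
    """Period-1 choice probability conditional on the history (menu0, p0)."""
    joint = sum(pr for pr, u0, u1 in STATES
                if chooses(u0, p0, menu0) and all(u1[item] >= u1[j] for j in menu1))
    return joint / rho0(p0, menu0)


def rho1_after_set(item, menu1, G):
    """Weighted average of rho1(. | g) over histories g in G, with weights rho(g)."""
    total = sum(rho0(p, menu) for menu, p in G)
    return sum(rho0(p, menu) * rho1(item, menu1, p, menu) for menu, p in G) / total


x, y, z = (1, 0, 0), (0, 1, 0), (0, 0, 1)
A, B, lam = (x, y), (y, z), F(1, 2)
mixed_menu = tuple(mix(lam, p, q) for p, q in product(A, B))
G = [(mixed_menu, mix(lam, x, q)) for q in B]  # histories linearly equivalent to (A, x)

lhs = rho1("a", ("a", "b"), x, A)
rhs = rho1_after_set("a", ("a", "b"), G)
print(lhs, rhs)
```

Each individual history in `G` reveals strictly more than the choice of `x` from `A` (it also pins down the choice from `B`), but averaging over all of `G` with the weights $\rho(g^{t-1})$ integrates that extra information out again.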

3.2 Limited Observability

Recall that unlike the static setting, where the analyst observes choices from all possible menus, the dynamic setting presents a limited observability problem: At each history $h^{t-1}$ of past choices, $\rho_t(\cdot | h^{t-1})$ is only defined on the set $\mathcal{A}_t(h^{t-1})$ of menus that occur with positive probability after $h^{t-1}$ (typically very few menus). For the rest of the paper, it is key to overcome this problem: Otherwise we do not have enough data to verify whether observed choices at history $h^{t-1}$ are consistent with random utility maximization or to identify whether the agent’s utility process belongs to the evolving utility class or the more specific gradual learning class. The inclusion of lotteries among the agent’s choice objects together with Axiom 2 allows us to do so.

The idea is based on Linear History Independence and generalizes the “linear extrapolation” procedure illustrated in the school choice example (Example 1). Consider any menu $A_t$ (e.g., the three language menu in the example) and some history $h^{t-1}$ (e.g., choosing charter school) that does not lead to $A_t$. We define the agent’s counterfactual choice distribution from $A_t$ following $h^{t-1}$ by extrapolating from the situation where the agent plans to make the sequence of choices captured by $h^{t-1}$, but is sometimes forced to instead make a sequence of choices (e.g., enroll in regular school) that does lead to menu $A_t$. More precisely, we consider a degenerate history $d^{t-1} = (\{q_0\}, q_0, \ldots, \{q_{t-1}\}, q_{t-1})$ such that $A_t \in \operatorname{supp} q_{t-1}^A$ and replace $h^{t-1} = (A_0, p_0, \ldots, A_{t-1}, p_{t-1})$ with $g^{t-1} := \lambda h^{t-1} + (1-\lambda) d^{t-1}$,21 where at every period $k \le t-1$, the agent faces menu $\lambda A_k + (1-\lambda)\{q_k\}$ and chooses lottery $\lambda p_k + (1-\lambda) q_k$.22 As discussed preceding Linear History Independence, under expected utility maximization the latter sequence of choices reveals the same information about the agent as $h^{t-1}$. This motivates extrapolating from $g^{t-1}$ to define choice behavior following $h^{t-1}$.

Define the set of degenerate period-$(t-1)$ histories by $\mathcal{D}_{t-1} := \{d^{t-1} \in \mathcal{H}_{t-1} : d^{t-1} = (\{q_k\}, q_k)_{k=0}^{t-1} \text{ where } q_k \in \Delta(X_k)\ \forall k \le t-1\}$.

Definition 3. For any $t \ge 1$, $A_t \in \mathcal{A}_t$, and $h^{t-1} \in \mathcal{H}_{t-1}$, define

$$\rho_t^{h^{t-1}}(\cdot; A_t) := \rho_t(\cdot; A_t | \lambda h^{t-1} + (1-\lambda) d^{t-1}) \tag{4}$$

for some $\lambda \in (0, 1]$ and $d^{t-1} \in \mathcal{D}_{t-1}$ such that $\lambda h^{t-1} + (1-\lambda) d^{t-1} \in \mathcal{H}_{t-1}(A_t)$.

It follows from Axiom 2 (Linear History Independence) that $\rho_t^{h^{t-1}}(\cdot; A_t)$ is well-defined: Lemma 15 shows that the RHS of (4) does not depend on the specific choice of $\lambda$ and $d^{t-1}$. Moreover, $\rho_t^{h^{t-1}}(\cdot; A_t)$ coincides with $\rho_t(\cdot; A_t | h^{t-1})$ whenever $h^{t-1} \in \mathcal{H}_{t-1}(A_t)$. In the following, we do not distinguish between the extended and nonextended version of $\rho_t$ and use $\rho_t(\cdot; A_t | h^{t-1})$ to denote both.

As illustrated by Example 1 in the context of school choice, random assignment is prevalent in many real-world economic environments and is an important tool to obtain quasi-experimental variation in the empirical literature. While this literature typically leverages such random variation to identify the causal effect of current choices on next-period outcomes (e.g., test scores in the case of school choice), Definition 3 suggests exploiting it to make counterfactual inferences about next-period choices. Even more readily, lotteries over next-period choice problems can be generated in the laboratory, and a growing literature in experimental economics makes use of this to perform extrapolation procedures akin to Definition 3.23

20 Note that $\rho(g^{t-1})$ does not keep track of the probabilities $\hat{p}_k^A(\hat{A}_{k+1})$, since these pertain to exogenous randomization and do not reveal any private information.

21 In order for $\lambda h^{t-1} + (1-\lambda) d^{t-1} := (\lambda A_k + (1-\lambda)\{q_k\},\ \lambda p_k + (1-\lambda) q_k)_{k=0}^{t-1}$ to be a well-defined history, it suffices that $\lambda A_k + (1-\lambda)\{q_k\} \in \operatorname{supp} q_{k-1}^A$ for all $k = 1, \ldots, t-1$. This can be ensured by appropriately choosing each $q_k$, working backwards from period $t-1$.

22 If the agent is indifferent to the timing of randomization, this is equivalent to her facing menu $A_k$ and choosing $p_k$ with probability $\lambda$ and being forced to choose $q_k$ with probability $(1-\lambda)$.

3.3 History-Dependent REU and History Continuity Axioms

For each $h^{t-1}$, the extended choice distribution $\rho_t(\cdot | h^{t-1})$ from Definition 3 is a stochastic choice rule on the whole of $\mathcal{A}_t$. The next axiom imposes the standard static REU conditions from Axiom 0 on each $\rho_t(\cdot | h^{t-1})$. Note that conditioning $\rho_t$ on past histories is key here; without controlling for past choices, choice behavior at time $t$ will in general violate the REU axioms, as illustrated in Example 1.

Axiom 3 (History-dependent REU). For all $t \le T$ and $h^{t-1}$, $\rho_t(\cdot | h^{t-1})$ satisfies Axiom 0.24

Our final axiom reflects the way in which tie-breaking can affect the observed choice distribution. We first define menus and histories without ties directly from choice behavior. The idea is that menus without ties are characterized by the fact that slightly perturbing their elements has no effect on choice probabilities.25 We capture such perturbations using convergence in mixture, as defined following Axiom 0.

Definition 4. For any $0 \le t \le T$ and $h^{t-1} \in \mathcal{H}_{t-1}$, the set of period-$t$ menus without ties conditional on history $h^{t-1}$ is denoted $\mathcal{A}_t^*(h^{t-1})$26 and consists of all $A_t \in \mathcal{A}_t$ such that for any $p_t \in A_t$ and any sequences $p_t^n \to^m p_t$ and $B_t^n \to^m A_t \setminus \{p_t\}$, we have

$$\lim_{n \to \infty} \rho_t(p_t^n, B_t^n \cup \{p_t^n\} | h^{t-1}) = \rho_t(p_t, A_t | h^{t-1}).$$

For $t = 0$, we write $\mathcal{A}_0^* := \mathcal{A}_0^*(h^{t-1})$. The set of period $t$ histories without ties is $\mathcal{H}_t^* := \{h^t = (A_0, p_0, \ldots, A_t, p_t) \in \mathcal{H}_t : A_{t'} \in \mathcal{A}_{t'}^*(h^{t'-1}) \text{ for all } t' \le t\}$.

23 E.g., in a recent experimental study of temptation and self-control, Toussaert (2016) presents subjects with lotteries over next-period menus to differentiate between so-called random Strotz agents and Gul and Pesendorfer (2001) agents. For related uses of lotteries in lab experiments, see Augenblick, Niederle, and Sprenger (2015).

24 Lemma 12 verifies that $X_t$ is a separable metric space. Then Mixture Continuity and Finiteness make use of the same convergence notions as defined following Axiom 0.

25 Lu (2016) and Lu and Saito (2016) use an alternative approach by taking non-additive stochastic choice rules as their primitive that directly describe whether certain menus involve ties or not. Their approach requires that ties occur with probability either zero or one, so is not applicable to our setting. Our perturbation-based approach is similar in spirit to Ahn and Sarver (2013).

26 Note that $\mathcal{A}_t^*(h^{t-1}) \not\subseteq \mathcal{A}_t(h^{t-1})$ because the first set contains all menus without ties (we use history $h^{t-1}$ here only to determine where ties could occur) while the second set contains only menus that occur with positive probability given the sequence of choices $h^{t-1}$ (typically very few menus).

The following axiom relates choice distributions after nearby histories. To state this formally, we extend convergence in mixture to histories: We say $h^{t,n} \to^m h^t$ if $h^{t,n} = (A_0^n, p_0^n, \ldots, A_t^n, p_t^n)$ and $h^t = (A_0, p_0, \ldots, A_t, p_t)$ satisfy $A_{t'}^n \to^m A_{t'}$ and $p_{t'}^n \to^m p_{t'}$ for each $t'$.

Axiom 4 (History Continuity). For all $t \le T-1$, $A_{t+1}$, $p_{t+1}$, and $h^t$,

$$\rho_{t+1}(p_{t+1}; A_{t+1} | h^t) \in \operatorname{co}\left\{\lim_n \rho_{t+1}(p_{t+1}; A_{t+1} | h^{t,n}) : h^{t,n} \to^m h^t,\ h^{t,n} \in \mathcal{H}_t^*\right\}.$$

In general, if period t histories are slightly altered, we expect subsequent period t + 1 choice behavior to be adjusted continuously, except when there was tie-breaking in the past. If the agent chose pt from At as a result of tie-breaking, then slightly altering the choice problem can change the set of states at which pt would be chosen and hence lead to a discontinuous change in the private information revealed by the choice of pt . The history continuity condition restricts the types of discontinuities ρt+1 can admit, ruling out situations in which choices after some history are completely unrelated to choices after any nearby history. Specifically, the fact that choice behavior after ht can be expressed as a mixture of behavior after some nearby histories without ties reflects the way in which the agent’s tie-breaking procedure may vary with her payoff-relevant private information.
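The perturbation test behind Definition 4, and the discontinuity that tie-breaking creates, can be seen in a small static example. This is a sketch under stated assumptions: the two states, their utilities over prizes $(x, y, z)$, and the uniform tie-breaking rule are hypothetical, and the code checks a single perturbing sequence rather than all sequences required by the definition.

```python
from fractions import Fraction as F

# Hypothetical two-state example over prizes (x, y, z). In the second state the
# agent is indifferent between x and y, so the menu {x, y} involves a tie there.
STATES = [(F(1, 2), (2, 1, 0)), (F(1, 2), (1, 1, 0))]
x, y, z = (1, 0, 0), (0, 1, 0), (0, 0, 1)


def eu(u, p):
    return sum(ui * pi for ui, pi in zip(u, p))


def mix(lam, p, q):
    return tuple(lam * pi + (1 - lam) * qi for pi, qi in zip(p, q))


def rho(p, menu):
    """Choice probability of p from menu; ties are broken uniformly (an assumption)."""
    total = F(0)
    for pr, u in STATES:
        best = max(eu(u, s) for s in menu)
        maximizers = [s for s in menu if eu(u, s) == best]
        if p in maximizers:
            total += pr / len(maximizers)
    return total


def perturbed_toward(target, other, n):
    """rho of x perturbed slightly toward `target`, chosen against `other`."""
    xn = mix(1 - F(1, n), x, target)
    return rho(xn, (xn, other))


# Menu {x, y}: perturbing x toward z resolves the tie against x, so the limit
# of the perturbed choice probabilities differs from rho(x, {x, y}).
print(rho(x, (x, y)), [perturbed_toward(z, y, n) for n in (3, 10, 100)])

# Menu {x, z}: no state is indifferent, and this perturbation changes nothing.
print(rho(x, (x, z)), [perturbed_toward(y, z, n) for n in (3, 10, 100)])
```

The jump from 3/4 to 1/2 for the menu $\{x, y\}$ is the kind of discontinuity that Axiom 4 disciplines: the event revealed by choosing $x$ changes abruptly once the tie is removed.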

3.4 Representation Theorem

Theorem 1. For any dynamic stochastic choice rule ρ, the following are equivalent:
(i). ρ satisfies Axioms 1–4.
(ii). ρ admits a DREU representation.

The proof of Theorem 1 appears in Appendix B. In Section 8, we sketch the argument for sufficiency in the two-period setting (T = 1).

4 Evolving Utility vs. Gradual Learning

4.1 History-dependent revealed preference

In the following subsections, we characterize evolving utility and gradual learning. Both models impose additional restrictions on the agent’s realized utilities $U_t(\omega)$. However, in our setting, the link between the agent’s stochastic and history-dependent choice behavior and her underlying state-dependent utilities is less straightforward than under deterministic choice. To make this link, we identify a collection of incomplete and history-dependent revealed preference relations. For each history $h^t$ and any $q_t, r_t \in \Delta(X_t)$, $q_t \succsim_{h^t} r_t$ captures that the agent prefers $q_t$ to $r_t$ at any state of the world $\omega$ that could give rise to history $h^t$; that is, $U_t(\omega)(q_t) \ge U_t(\omega)(r_t)$ for all $\omega \in C(h^t)$.

To see the idea, consider any history $h^0 = (A_0, p_0)$ and suppose that

$$\rho_0\left(\tfrac{1}{2} p_0 + \tfrac{1}{2} r_0;\ \tfrac{1}{2} A_0 + \tfrac{1}{2}\{q_0, r_0\}\right) = 0. \tag{5}$$

Then $U_0(\omega)(q_0) \ge U_0(\omega)(r_0)$ for all $\omega \in C(h^0)$. Indeed, for an expected-utility maximizer, it is optimal to choose $\tfrac{1}{2} p_0 + \tfrac{1}{2} r_0$ from menu $\tfrac{1}{2} A_0 + \tfrac{1}{2}\{q_0, r_0\}$ if and only if it is optimal to choose $p_0$ from $A_0$ and to choose $r_0$ from $\{q_0, r_0\}$.27 Thus, if $\tfrac{1}{2} p_0 + \tfrac{1}{2} r_0$ is never chosen from $\tfrac{1}{2} A_0 + \tfrac{1}{2}\{q_0, r_0\}$, this reveals that the agent prefers $q_0$ to $r_0$ whenever she would select $p_0$ from $A_0$. Conversely, if $U_0(\omega)(q_0) \ge U_0(\omega)(r_0)$ for all $\omega \in C(h^0)$, then (5) continues to hold as long as $q_0$ and $r_0$ are perturbed appropriately to eliminate potential ties. More generally, this suggests the following definition:

Definition 5. For each $t \le T-1$ and $h^t = (h^{t-1}, A_t, p_t) \in \mathcal{H}_t$, the relation $\succsim_{h^t}$ on $\Delta(X_t)$ is defined as follows: For any $q_t, r_t \in \Delta(X_t)$, we have $q_t \succsim_{h^t} r_t$ if there exist $q_t^n \to^m q_t$ and $r_t^n \to^m r_t$ such that

$$\rho_t\left(\tfrac{1}{2} p_t + \tfrac{1}{2} r_t^n;\ \tfrac{1}{2} A_t + \tfrac{1}{2}\{q_t^n, r_t^n\} \,\middle|\, h^{t-1}\right) = 0 \quad \text{for all } n.$$

Let $\sim_{h^t}$ and $\succ_{h^t}$ respectively denote the symmetric and asymmetric component of $\succsim_{h^t}$.28 We show in Appendix C that when ρ admits a DREU representation, then $q_t \succsim_{h^t} r_t$ if and only if $U_t(\omega)(q_t) \ge U_t(\omega)(r_t)$ for all $\omega \in C(h^t)$.29 Thus, $\succsim_{h^t}$ captures the desired notion of revealed preference.
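The logic of condition (5) can be traced in a toy population. The following sketch is illustrative only: the three states and their utilities over four prizes are hypothetical, all rankings are strict so that ties (and hence the perturbing sequences of Definition 5) can be ignored, and only period 0 is modeled.

```python
from fractions import Fraction as F
from itertools import product

# Hypothetical population: (probability, utility over prizes (x, y, q, r)),
# with strict rankings so that ties can be ignored.
STATES = [
    (F(1, 3), (4, 1, 3, 2)),
    (F(1, 3), (3, 2, 4, 1)),
    (F(1, 3), (1, 4, 2, 3)),
]
x, y, q, r = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)


def eu(u, p):
    return sum(ui * pi for ui, pi in zip(u, p))


def mix(lam, p, s):
    return tuple(lam * pi + (1 - lam) * si for pi, si in zip(p, s))


def rho0(p, menu):
    """Period-0 probability of choosing p from menu (no ties assumed)."""
    return sum(pr for pr, u in STATES if all(eu(u, p) >= eu(u, s) for s in menu))


def revealed_pref(q0, r0, p0, A0):
    """Condition (5): 1/2 p0 + 1/2 r0 is never chosen from 1/2 A0 + 1/2 {q0, r0}."""
    half = F(1, 2)
    menu = tuple(mix(half, a, b) for a, b in product(A0, (q0, r0)))
    return rho0(mix(half, p0, r0), menu) == 0


def statewise_pref(q0, r0, p0, A0):
    """Direct check: U_0(omega)(q0) >= U_0(omega)(r0) for every omega in C(h^0)."""
    return all(eu(u, q0) >= eu(u, r0) for _, u in STATES
               if all(eu(u, p0) >= eu(u, a) for a in A0))


A0 = (x, y)
for p0 in A0:
    for first, second in ((q, r), (r, q)):
        print(revealed_pref(first, second, p0, A0), statewise_pref(first, second, p0, A0))
```

In this example the choice-based test and the state-by-state comparison agree for every history: after choosing $x$ the population is revealed to rank $q$ above $r$, and after choosing $y$ the ranking is reversed.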

4.2 Evolving Utility

To characterize evolving utility, the following three axioms employ $\succsim_{h^t}$ to translate conditions from the deterministic menu-choice literature to our setting. First, Separability (e.g., Fishburn (1970), Theorem 11.1) ensures that utility in every state of the world has an additively separable form $U_t(z_t, A_{t+1}) = u_t(z_t) + V_t(A_{t+1})$:

Axiom 5 (Separability). For all $t \le T-1$, $h^t$, $z_t, z_t'$ and $A_{t+1}, A_{t+1}'$, we have $\tfrac{1}{2}(z_t, A_{t+1}) + \tfrac{1}{2}(z_t', A_{t+1}') \sim_{h^t} \tfrac{1}{2}(z_t', A_{t+1}) + \tfrac{1}{2}(z_t, A_{t+1}')$.

The next axiom translates conditions from Dekel, Lipman, and Rustichini (2001) that ensure that $V_t(A_{t+1})$ captures the option value contained in menu $A_{t+1}$, i.e., that $V_t(A_{t+1}) = E[\max_{p_{t+1} \in A_{t+1}} \hat{U}_{t+1}(p_{t+1}) \mid \mathcal{F}_t]$ for some random utility function $\hat{U}_{t+1}$. Part (i) is Kreps’s (1979) preference for flexibility axiom; it says that the agent always weakly prefers bigger menus. Part (ii) ensures that the agent cannot affect the filtration. Part (iii) ensures that $\succsim_{h^t}$ is continuous and part (iv) that it induces a nontrivial preference over continuation menus.30

Axiom 6 (DLR Menu Preference). For all $t \le T-1$ and $h^t$, the following hold:31
(i). Monotonicity: For any $z_t$ and $A_{t+1} \subseteq A_{t+1}'$, we have $(z_t, A_{t+1}') \succsim_{h^t} (z_t, A_{t+1})$.
(ii). Indifference to Timing: For any $z_t$, $A_{t+1}$, $A_{t+1}'$, and $\alpha \in (0, 1)$, we have $(z_t, \alpha A_{t+1} + (1-\alpha) A_{t+1}') \sim_{h^t} \alpha (z_t, A_{t+1}) + (1-\alpha)(z_t, A_{t+1}')$.
(iii). Continuity: $\succsim_{h^t}$ is continuous.32
(iv). Menu Nondegeneracy: There exist $A_{t+1}', A_{t+1}$ such that $(z_t, A_{t+1}') \succ_{h^t} (z_t, A_{t+1})$ for all $z_t$.

The final axiom adapts the sophistication axiom due to Ahn and Sarver (2013). We require that at any history $h^t$, the agent values a menu $A_{t+1}'$ strictly more than its subset $A_{t+1}$ if and only if she in fact chooses something from $A_{t+1}' \setminus A_{t+1}$ with strictly positive probability following $h^t$. This axiom ensures that the agent correctly anticipates her future utility, that is, $\hat{U}_{t+1} = U_{t+1}$.

Axiom 7 (Sophistication). For all $t \le T-1$, $h^t \in \mathcal{H}_t$, and $A_{t+1} \subseteq A_{t+1}' \in \mathcal{A}_{t+1}^*(h^t)$, the following are equivalent:
(i). $\rho_{t+1}(p_{t+1}; A_{t+1}' | h^t) > 0$ for some $p_{t+1} \in A_{t+1}' \setminus A_{t+1}$.
(ii). $(z_t, A_{t+1}') \succ_{h^t} (z_t, A_{t+1})$ for all $z_t$.

Theorem 2. Suppose that ρ admits a DREU representation. The following are equivalent:
(i). ρ satisfies Axioms 5–7.
(ii). ρ admits an evolving utility representation.

Proof. See Appendix C.

30 $\succsim_{h^t}$ also satisfies a version of the DLR finiteness axiom (Axiom DLR 6 in Ahn and Sarver (2013)). However, in the presence of Sophistication, this property is inherited from the finiteness axiom on ρ (Axiom 3 (v)), so we do not need to impose it as a separate condition.

31 In the following, we identify any $(z_t, A_{t+1}) \in X_t$ with the Dirac lottery $\delta_{(z_t, A_{t+1})} \in \Delta(X_t)$.

32 That is, for all $p_t \in \Delta(X_t)$, the upper and lower contour sets $\{q_t : q_t \succsim_{h^t} p_t\}$ and $\{q_t : p_t \succsim_{h^t} q_t\}$ are closed in $\Delta(X_t)$ endowed with the topology of weak convergence (recall that by Lemma 12, $X_t$ is a separable metric space). Alternatively, it is enough to require that, for any $z_t$ and $A_{t+1}$, both $\{B_{t+1} : (z_t, A_{t+1}) \succsim_{h^t} (z_t, B_{t+1})\}$ and $\{B_{t+1} : (z_t, B_{t+1}) \succsim_{h^t} (z_t, A_{t+1})\}$ are closed in $\mathcal{A}_{t+1}$.
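The separable form $U_t(z_t, A_{t+1}) = u_t(z_t) + V_t(A_{t+1})$ and the content of Monotonicity and Sophistication can be illustrated with a small computation. The sketch assumes a hypothetical two-point conditional distribution of next-period utilities over three items, with no ties; it writes the separable form exactly as in the text above and does not model any discounting.

```python
from fractions import Fraction as F

# Hypothetical conditional distribution (given the current information F_t) of
# next-period utility over items; utilities are chosen so that no ties arise.
NEXT = [
    (F(1, 2), {"a": 3, "b": 1, "c": 0}),
    (F(1, 2), {"a": 1, "b": 2, "c": 0}),
]


def option_value(menu):
    """V_t(A_{t+1}) = E[max_{p in A_{t+1}} U_{t+1}(p) | F_t]."""
    return sum(pr * max(u[i] for i in menu) for pr, u in NEXT)


def current_utility(felicity, menu):
    """Separable form U_t(z_t, A_{t+1}) = u_t(z_t) + V_t(A_{t+1})."""
    return felicity + option_value(menu)


def prob_chosen_outside(big, small):
    """Probability that the next-period choice from `big` lies outside `small`."""
    return sum(pr for pr, u in NEXT if max(big, key=u.get) not in small)


small = ("a",)
for big in (("a", "b"), ("a", "c")):
    strictly_better = option_value(big) > option_value(small)
    chosen_outside = prob_chosen_outside(big, small) > 0
    print(big, option_value(big), strictly_better, chosen_outside)
```

Adding `b` to the menu raises its value because `b` is sometimes chosen, while adding `c` leaves the value unchanged because `c` is never chosen; this is the equivalence that Axiom 7 asserts when the anticipated and actual future utilities coincide.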

4.3 Gradual Learning

Gradual learning specializes evolving utility to the case in which the agent’s consumption preference is time-invariant but unknown to her and she is learning about it over time. The additional behavioral content of this is captured by restrictions on the agent’s preference %ht over streams of consumption lotteries. Fix t ≤ T − 1. Given a sequence `t , . . . , `T ∈ ∆(Z) of consumption lotteries, let the stream (`t , . . . , `T ) be the period-t lottery that at every future period τ ≥ t yields consumption according to `τ . Formally, for any consumption lottery ` ∈ ∆(Z) and menu At+1 ∈ At+1 , define (`, At+1 ) ∈ ∆(Xt ) to be the period-t lottery that yields current consumption according P to ` and yields continuation menu At+1 for sure; i.e., (`, At+1 ) := zt ∈Z `(zt )δ(zt ,At+1 ) . Then (`t , . . . , `T ) := (`t , At+1 ) ∈ ∆(Xt ), where the sequence of menus At+1 , . . . , AT is defined recursively from period T backwards by AT := {`T } ∈ AT and As := {(`s , As+1 )} ∈ As for all s = t + 1, . . . , T − 1. We write (`t , . . . , `τ , m, . . . , m) if `τ +1 = . . . = `T = m for some m ∈ ∆(Z) | {z } T −τ times

and τ ≥ t, and we do not specify the number of m-entries when there is no risk of confusion. The key axiom capturing learning is the following: Axiom 8 (Stationary Consumption Preference). For all t ≤ T − 1, `, m, n ∈ ∆(Z), and ht , (`, n, . . . , n) ht (m, n, . . . , n) if and only if (n, `, n, . . . , n) ht (n, m, n, . . . , n). Axiom 8 implies that at any history ht , the agent’s consumption utility today and her expected consumption utility tomorrow induce the same preference over consumption lotteries. The following example illustrates the connection with learning. Example 4. Suppose the agent faces a choice between two providers of some service, e.g., two hairdressers or dentists. Based on her current information, she believes provider ` to be better than m (n denotes no consumption), so will select the former if choosing between walkin appointments today. If her desired appointment date is next week, then the agent may in general prefer to delay her decision, because she may be able to acquire more information about ` or m in the meantime. However, Axiom 8 says that if she is forced to decide today whether to go to ` or m next week (say, because advance booking is required), then she must again prefer to commit to `. This is because if the agent currently believes ` to be better than m, then by the martingale property of beliefs she should expect her information next week to still favor ` on average. N The next axiom ensures that the agent’s time preference is deterministic and time-invariant. Suppose that for some consumption lotteries ` and m there is a weight α that makes the agent indifferent between getting (` today and m tomorrow) and the α-mixture of the two lotteries in both periods. Provided the agent is not indifferent between ` and m today, then because 22

her expected consumption preference tomorrow is the same as today, this weight identifies the agent’s discount factor. The axiom asserts that this weight, and hence the agent’s discount factor, is independent of today’s state and time period. We say that `, m ∈ ∆(Z) are ht nonindifferent if (`, n, . . . , n) 6∼ht (m, n, . . . , n) for some n ∈ ∆(Z). Axiom 9 (Constant Intertemporal Tradeoff). For all t, tˆ ≤ T − 1, if `, m are ht -nonindifferent ˆm and `, ˆ are g tˆ-nonindifferent, then for all α ∈ [0, 1] and n ∈ ∆Z: (`, m, n, . . . , n) ∼ht (α` + (1 − α)m, α` + (1 − α)m, n, . . . , n) ⇐⇒ ˆ m, (`, ˆ n, . . . , n) ∼gtˆ (α`ˆ + (1 − α)m, ˆ α`ˆ + (1 − α)m, ˆ n, . . . , n). Axioms 9 has no bite if the agent is indifferent between all consumption lotteries at ht . To rule this out, we impose the following condition: Condition 1 (Consumption Nondegeneracy). For all t ≤ T − 1 and ht , there exist ht nonindifferent `, m ∈ ∆(Z). Theorem 3. Suppose that ρ admits an evolving utility representation and Condition 1 is satisfied. The following are equivalent: (i). ρ satisfies Axioms 8 and 9. (ii). ρ admits a gradual learning representation. The proof is in Appendix D. The argument for sufficiency proceeds in three steps. Consider an evolving utility representation (Ω, F ∗ , µ, (Ft , Ut , Wt , ut )) of ρ, where we can perform approP priate normalizations to ensure that z∈Z ut (ω)(z) = 0 for all ω and t and that the discount factor is 1. Fix any ω and t ≤ T − 1. We first show that Axiom 8 implies that ut (ω) and uˆt (ω) := E[ut+1 | Ft (ω)] represent the same preference over consumption lotteries. Thus, there exists a (possibly state and time-dependent) δt (ω) such that uˆt (ω) = δt (ω)ut (ω). Next, note that if ut (ω)(`) 6= ut (ω)(m), then the unique weight α that makes the agent indifferent between (` today and m tomorrow) and (α` + (1 − α)m both today and tomorrow) is 1+δ1t (ω) . Hence, Axiom 9 together with Condition 1 implies that δt (ω) ≡ δ > 0 is state and time-invariant. 
Finally, the above shows that the process $(\delta^{-t} u_t)$ is a martingale, so that $\delta^{-t} u_t(\omega) = E[\delta^{-T} u_T \mid \mathcal F_t(\omega)]$. Thus, replacing $U_t$ with $\delta^{-t} U_t$ and $u_t$ with $\delta^{-t} u_t$ yields a gradual learning representation of ρ, where $\tilde u = E[\delta^{-T} U_T \mid \mathcal F_T]$.

We also note that $\delta$ will be strictly less than 1 if and only if ρ additionally satisfies the following impatience axiom: for all $t \le T-1$, $h^t$, and $\ell, m, n \in \Delta(Z)$, if $(\ell, n, \ldots, n) \succ_{h^t} (m, n, \ldots, n)$, then $(\ell, m, n, \ldots, n) \succ_{h^t} (m, \ell, n, \ldots, n)$.

A natural generalization of gradual learning is to replace the discount factor $\delta$ in (2) with a random variable $\delta : \Omega \to \mathbb R_{++}$ that is measurable with respect to time 0 private information $\mathcal F_0$. This captures the idea of a population of agents with heterogeneous discount factors, each of whom is learning over time about their fixed but unknown consumption utility. An analogous characterization can be obtained in this case: The only difference is that instead of imposing Axiom 9 on arbitrary histories $h^t$ and $g^{\hat t}$, we require that $h^t$ be a subhistory of $g^{\hat t}$.
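The indifference weight used in the proof sketch of Theorem 3 can be checked with a short calculation. The sketch below is only an illustration under stated assumptions: the function names and the numerical felicities are hypothetical, the discount factor is normalized as in the text so that tomorrow's expected felicity is $\delta$ times today's, and only the first two periods of the consumption stream are compared.

```python
from fractions import Fraction

def stream_value(u_today, u_tomorrow, delta):
    """Value of a two-period stream when tomorrow's felicity is weighted by delta."""
    return u_today + delta * u_tomorrow

def indifference_weight(delta):
    """Candidate weight alpha = 1 / (1 + delta) from the proof sketch."""
    return 1 / (1 + delta)

def mixture_gap(u_l, u_m, delta, alpha):
    """Value of (l today, m tomorrow) minus value of the alpha-mixture in both periods."""
    u_mix = alpha * u_l + (1 - alpha) * u_m
    return stream_value(u_l, u_m, delta) - stream_value(u_mix, u_mix, delta)

# Hypothetical numbers: felicity 3 for l, felicity 1 for m, delta = 9/10.
delta = Fraction(9, 10)
alpha = indifference_weight(delta)
gap_at_alpha = mixture_gap(Fraction(3), Fraction(1), delta, alpha)
gap_at_half = mixture_gap(Fraction(3), Fraction(1), delta, Fraction(1, 2))
```

With these numbers the gap vanishes at $\alpha = 1/(1+\delta)$ and is nonzero at $\alpha = 1/2$, which is why observing the indifference weight pins down $\delta$ whenever the two lotteries are nonindifferent.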

5 Properties of the Representations

5.1 Uniqueness

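The following proposition, which we prove in Supplementary Appendix H, summarizes the uniqueness properties of DREU, evolving utility, and gradual learning.

Proposition 1. Suppose ρ admits a DREU representation $D = (\Omega, \mathcal F^*, \mu, (\mathcal F_t, U_t, W_t))$. Consider $\hat D = (\hat\Omega, \hat{\mathcal F}^*, \hat\mu, (\hat{\mathcal F}_t, \hat U_t, \hat W_t))$, where $(\hat\Omega, \hat{\mathcal F}^*, \hat\mu)$ is a finitely-additive probability space; $(\hat{\mathcal F}_t) \subseteq \hat{\mathcal F}^*$ is a filtration; and $\hat U_t, \hat W_t : \hat\Omega \to \mathbb R^{X_t}$ for all $t$. Then $\hat D$ is a DREU representation of ρ if and only if $(\hat W_t)$ is $\hat{\mathcal F}^*$-measurable and proper and for each $t$ there exists a finite partition $\hat\Pi_t$ generating $\hat{\mathcal F}_t$, a bijection $\phi_t : \Pi_t \to \hat\Pi_t$, and $\mathcal F_t$-measurable functions $\alpha_t : \Omega \to \mathbb R_{++}$ and $\beta_t : \Omega \to \mathbb R$ such that for all $\omega \in \Omega$:

(i). $\mu(\mathcal F_0(\omega)) = \hat\mu(\phi_0(\mathcal F_0(\omega)))$ and $\mu(\mathcal F_t(\omega) \mid \mathcal F_{t-1}(\omega)) = \hat\mu(\phi_t(\mathcal F_t(\omega)) \mid \phi_{t-1}(\mathcal F_{t-1}(\omega)))$ if $t \ge 1$;
(ii). $U_t(\omega) = \alpha_t(\omega)\hat U_t(\hat\omega) + \beta_t(\omega)$ whenever $\hat\omega \in \phi_t(\mathcal F_t(\omega))$;
(iii). $\mu[\{W_t \in B_t(\omega)\} \mid \mathcal F_t(\omega)] = \hat\mu[\{\hat W_t \in B_t(\omega)\} \mid \phi_t(\mathcal F_t(\omega))]$ for any $B_t(\omega)$ such that $B_t(\omega) = \{w \in \mathbb R^{X_t} : p_t \in M(M(A_t, U_t(\omega)), w)\}$ for some $p_t \in A_t \in \mathcal A_t$.

If $(D, (u_t), \delta)$ is an evolving utility representation of ρ, then $(\hat D, (\hat u_t), \hat\delta)$ is an evolving utility representation of ρ if and only if (i)-(iii) hold and additionally

(iv). $\alpha_t(\omega) = \alpha_0(\omega)\left(\frac{\hat\delta}{\delta}\right)^t$ for all $\omega \in \Omega$ and $t = 0, \ldots, T$;
(v). $u_t(\omega) = \alpha_t(\omega)\hat u_t(\hat\omega) + \gamma_t(\omega)$ whenever $\hat\omega \in \phi_t(\mathcal F_t(\omega))$, where $\gamma_T(\omega) := \beta_T(\omega)$ and $\gamma_t(\omega) := \beta_t(\omega) - \delta E[\beta_{t+1} \mid \mathcal F_t(\omega)]$ if $t \le T-1$.

If $(D, (u_t), \delta)$ is a gradual learning representation of ρ and ρ satisfies Condition 1, then $(\hat D, (\hat u_t), \hat\delta)$ is a gradual learning representation of ρ if and only if (i)-(v) hold and additionally

(vi). $\delta = \hat\delta$;
(vii). $\beta_t(\omega) = \frac{1-\delta^{T-t+1}}{1-\delta} E[\beta_T \mid \mathcal F_t(\omega)]$ for all $\omega$ and $t$.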

Points (i) and (ii) of Proposition 1 show that in DREU, the agent’s choices uniquely determine her underlying stochastic process of ordinal payoff-relevant private information, while point (iii) shows that the (ordinal) distribution of tie-breakers is pinned down for choices featuring ties. This is the period-by-period dynamic analog of known identification results for static REU representations (Proposition 4 in Ahn and Sarver (2013)). Point (iv) shows that evolving utility implies strictly sharper identification than DREU of the agent’s cardinal private information: In particular, the random scaling factor used to transform $\hat\delta^t \hat U_t$ into $\delta^t U_t$ is given by $\alpha_0(\omega)$, and hence is the same in all periods and measurable with respect to period-0 private information. This allows for meaningful intertemporal comparisons such as “in state $\omega$, the additional period-$t$ utility for $p_t$ over $q_t$ is greater than the additional discounted period-$(t+1)$ utility for $p_{t+1}$ over $q_{t+1}$” and cross-state comparisons such as “the additional period-$t$ utility for $p_t$ over $q_t$ is greater in state $\omega$ than in state $\omega' \in \mathcal F_0(\omega)$.” This generalizes the main identification result (Theorem 2) in Ahn and Sarver (2013): In a two-period setting without consumption in period 0, they obtain $U_1(\omega) = \alpha\hat U_1(\hat\omega) + \beta_1(\omega)$, where $\alpha$ is constant since they do not allow for period-0 private information. Finally, gradual learning, unlike evolving utility, allows for unique identification of the discount factor (point (vi)) and entails even sharper identification of cardinal private information (point (vii)).
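The ordinal nature of the identification in point (ii) can be illustrated with a toy menu: a positive affine transformation of a vNM utility leaves the set of maximizers unchanged, while a negative scaling does not. This is a minimal sketch, not part of the formal result; the outcome labels, the menu, and the helper names are hypothetical.

```python
from fractions import Fraction as F

def expected_utility(lottery, utility):
    """Expected utility of a lottery given as {outcome: probability}."""
    return sum(prob * utility[x] for x, prob in lottery.items())

def best_options(menu, utility):
    """Indices of the lotteries in `menu` that maximize expected utility."""
    values = [expected_utility(p, utility) for p in menu]
    top = max(values)
    return {i for i, v in enumerate(values) if v == top}

def affine(utility, alpha, beta):
    """Apply the transformation u -> alpha * u + beta outcome by outcome."""
    return {x: alpha * v + beta for x, v in utility.items()}

# Hypothetical outcomes a, b, c and a three-lottery menu.
U = {"a": F(2), "b": F(1), "c": F(0)}
menu = [{"a": F(1)}, {"b": F(1)}, {"a": F(1, 2), "c": F(1, 2)}]

same_scale = best_options(menu, affine(U, F(3), F(-5)))   # alpha > 0
flipped = best_options(menu, affine(U, F(-1), F(0)))      # alpha < 0
```

Here the maximizer is unchanged for $\alpha = 3$, $\beta = -5$ but changes once the sign of $\alpha$ flips, consistent with the requirement that $\alpha_t$ take values in $\mathbb R_{++}$.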

5.2 Choice and Taste Persistence

As discussed in Example 2, consumption persistence is a widely documented phenomenon, for instance in the marketing literature on brand choice. This section introduces two formalizations of this notion and characterizes under which felicity processes $u_t$ the evolving utility model can give rise to each, without needing to attribute a direct causal role to past consumption.

Our first notion of consumption persistence captures the idea that (absent ties) the agent is more likely to choose consumption lottery $p$ today if she consumed $p$ yesterday than if she consumed $q$ yesterday, provided today’s menu does not include any new consumption options relative to yesterday’s menu. To state this formally we focus on atemporal menus, i.e., menus that do not feature any intertemporal tradeoffs, so that the agent’s choices are governed solely by her preference over current consumption. Suppose $t \le T-1$. Menu $A_t$ is atemporal if for any $p_t, q_t \in A_t$, we have $p_t^A = q_t^A$. We write $A_t^Z := \{p_t^Z : p_t \in A_t\} \subseteq \Delta(Z)$. All period-$T$ menus $A_T$ are atemporal with $A_T^Z = A_T$.

Definition 6. ρ features consumption persistence if for any $h^t = (h^{t-1}, A_t, p_t)$, $g^t = (h^{t-1}, A_t, q_t) \in \mathcal H_t$ and $p_{t+1} \in A_{t+1} \in \mathcal A^*_{t+1}(h^t) \cap \mathcal A^*_{t+1}(g^t)$ with $A_t$ and $A_{t+1}$ atemporal, $A^Z_{t+1} \subseteq A^Z_t$, and $p^Z_t = p^Z_{t+1}$, we have
\[
\rho_{t+1}(p_{t+1}; A_{t+1} \mid h^{t-1}, A_t, p_t) \ge \rho_{t+1}(p_{t+1}; A_{t+1} \mid h^{t-1}, A_t, q_t).
\]

To obtain a characterization of consumption persistence, we impose the assumption that there are two consumption lotteries $\overline\ell$ and $\underline\ell$ such that the agent strictly prefers the former to the latter at all histories. This condition is innocuous if, for example, the outcome space includes a monetary dimension.

Condition 2 (Uniformly Ranked Pair). There exist $\overline\ell, \underline\ell \in \Delta(Z)$ such that for all $t$, $h^t$, and $m \in \Delta(Z)$, we have $(\overline\ell, m, \ldots, m) \succ_{h^t} (\underline\ell, m, \ldots, m)$.

If ρ admits an evolving utility representation, consumption persistence is equivalent to a particular form of taste persistence. For any felicity $u$ and at any history, tomorrow’s felicity is more likely to be (equivalent to) a felicity in any convex set containing $u$ if today’s felicity was $u$ than if today’s felicity was $u'$. Formally, given any set $D \subseteq \mathbb R^Z$ of felicities, let $[D] := \{w \in \mathbb R^Z : w \approx v \text{ for some } v \in D\}$.

Proposition 2. Suppose ρ admits an evolving utility representation $(\Omega, \mathcal F^*, \mu, (\mathcal F_t, U_t, W_t, u_t))$ and Condition 2 holds. Then the following are equivalent:
(i). ρ features consumption persistence;
(ii). for any $u, u' \in \mathbb R^Z$, convex $D \subseteq \mathbb R^Z$ with $u \in D$, and $h^{t-1}$ with $\mu(\{u_t \approx u\} \mid C(h^{t-1})), \mu(\{u_t \approx u'\} \mid C(h^{t-1})) > 0$, we have
\[
\mu(\{u_{t+1} \in [D]\} \mid C(h^{t-1}) \cap \{u_t \approx u\}) \ge \mu(\{u_{t+1} \in [D]\} \mid C(h^{t-1}) \cap \{u_t \approx u'\}).
\]

An alternative notion, which is neither implied by nor implies consumption persistence, is consumption inertia. This says that if an agent chose a particular consumption lottery yesterday, then (absent ties) she will continue to choose it with positive probability today, as long as today’s menu does not include any new consumption options relative to yesterday’s menu.

Definition 7. ρ features consumption inertia if for any $h^t = (h^{t-1}, A_t, p_t) \in \mathcal H_t$ and $p_{t+1} \in A_{t+1} \in \mathcal A^*_{t+1}(h^t)$ with $A_t$ and $A_{t+1}$ atemporal, $A^Z_{t+1} \subseteq A^Z_t$, and $p^Z_t = p^Z_{t+1}$, we have $\rho_{t+1}(p_{t+1}; A_{t+1} \mid h^{t-1}, A_t, p_t) > 0$.
In the presence of Condition 1 (Consumption Nondegeneracy) from Section 4.3, consumption inertia is equivalent to an alternative notion of taste persistence which requires that after any history at which the agent’s period-$t$ consumption preference was given by $u$, her period-$(t+1)$ consumption preference remains $u$ with positive probability.

Proposition 3. Suppose that ρ admits an evolving utility representation $(\Omega, \mathcal F^*, \mu, (\mathcal F_t, U_t, W_t, u_t))$ and Condition 1 is satisfied. Then the following are equivalent:
(i). ρ features consumption inertia;
(ii). for any $u \in \mathbb R^Z$ and $h^{t-1}$ with $\mu(\{u_t \approx u\} \mid C(h^{t-1})) > 0$, we have $\mu(\{u_{t+1} \approx u\} \mid C(h^{t-1}) \cap \{u_t \approx u\}) > 0$.

The next section illustrates the different implications of the two notions by considering an evolving utility representation in which the felicity $u_t$ follows a finite Markov chain.

5.2.1 Application: Markov Chain

Let $\mathcal U = \{u_1, \ldots, u_m\}$ denote the set of possible felicities, where $u_i \not\approx u_j$ for any $i \ne j$. Let $M$ be an irreducible transition matrix, where $M_{ij}$ denotes the probability that period-$(t+1)$ utility is $u_j$ conditional on period-$t$ utility being $u_i$. The initial distribution is assumed to have full support, but need not be the stationary distribution.

Corollary 1. Consider an evolving utility representation generated by the finite Markov chain $(\mathcal U, M)$.
(i). Assume that for any $i \notin \{j, k, l\}$,33 we have $u_i \notin [\mathrm{co}\{u_j, u_k, u_l\}]$. Then ρ features consumption persistence if and only if the Markov chain is a renewal process, i.e., there exist $\alpha \in [0, 1)$ and $\nu \in \Delta(\mathcal U)$ such that $M_{ii} = \alpha + (1-\alpha)\nu(u_i)$ and $M_{ij} = (1-\alpha)\nu(u_j)$ for all $i \ne j$.
(ii). ρ features consumption inertia if and only if $M_{ii} > 0$ for every $i$.

Point (i) above imposes a regularity condition on the structure of $\mathcal U$. This assumption is generically satisfied if the outcome space is rich enough relative to the number of utility functions. The following example does not satisfy the condition. It features consumption persistence under a non-renewal process.

Example 5 (Random walk over a line). Consider an evolving utility representation where the felicity process is of the form $u_t = w + \alpha(x_t)v$, where $w, v \in \mathbb R^Z$ are fixed felicities, $\alpha : \mathbb Z \to \mathbb R$ is a strictly increasing function, and $x_t$ follows a random walk over $\mathbb Z$: it remains at its current value with probability $p$, increases by one with probability $\frac{1-p}{2}$, and decreases by one with probability $\frac{1-p}{2}$. This agent displays consumption persistence if and only if $p \ge \frac{1}{3}$ and consumption inertia if and only if $p > 0$.
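The two conditions in Corollary 1 are easy to check for a given transition matrix. The following sketch assumes that the regularity condition of point (i) holds and that the matrix is a valid irreducible stochastic matrix with at least two states; the function names and the three-state chain are hypothetical illustrations rather than objects from the paper. For a stochastic matrix, the renewal form holds exactly when the off-diagonal entries of each column coincide and the implied $\alpha$ lies in $[0, 1)$.

```python
from fractions import Fraction as F

def renewal_matrix(alpha, nu):
    """Build M with M_ii = alpha + (1 - alpha) * nu_i and M_ij = (1 - alpha) * nu_j."""
    n = len(nu)
    return [[(alpha if i == j else 0) + (1 - alpha) * nu[j] for j in range(n)]
            for i in range(n)]

def is_renewal(M):
    """True if the stochastic matrix M (at least two states) has the renewal form."""
    n = len(M)
    c = []  # c[j] = common off-diagonal entry of column j, i.e. (1 - alpha) * nu_j
    for j in range(n):
        off_diagonal = {M[i][j] for i in range(n) if i != j}
        if len(off_diagonal) != 1:
            return False
        c.append(off_diagonal.pop())
    alpha = 1 - sum(c)
    return 0 <= alpha < 1

def has_positive_diagonal(M):
    """Condition of Corollary 1(ii): M_ii > 0 for every i."""
    return all(M[i][i] > 0 for i in range(len(M)))

def symmetric_chain(p):
    """Hypothetical 3-state chain: stay with prob. p, move to each other state with (1 - p) / 2."""
    q = (1 - p) / 2
    return [[p if i == j else q for j in range(3)] for i in range(3)]

M = renewal_matrix(F(1, 2), [F(1, 2), F(1, 4), F(1, 4)])
cycle = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
```

In the hypothetical symmetric three-state chain, the renewal form requires $\alpha = (3p - 1)/2 \ge 0$, so the same threshold $p \ge 1/3$ appears as in Example 5, although that example concerns a random walk on a line and is not itself a renewal process.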

6 Comparison with Dynamic Discrete Choice

The dynamic discrete choice literature uses a utility specification that is a close cousin of evolving utility. In fact, many models, e.g., Pakes’s (1986) patent renewal model, belong to the intersection of the two classes. However, the often used full support i.i.d. specification of the dynamic discrete choice model is inconsistent with evolving utility because it underestimates option value.

33 Here we allow j, k, and l to be pairwise equal.

6.1 The DDC Model

The general DDC model features a state variable with two components: the first one is jointly observed by both the agent and the analyst and the second one is private to the agent. To facilitate comparisons, we abstract away from the former: the only jointly observable state variable is the menu of available decisions $A_t$. The DDC literature typically defines choices only between deterministic options; in what follows we study this restriction of our domain. Let $Y_T := Z$ and $\mathcal A^d_T := \mathcal K_f(Y_T)$ and recursively for $t \le T-1$ define $Y_t := Z \times \mathcal A^d_{t+1}$ and $\mathcal A^d_t := \mathcal K_f(Y_t)$.34

Definition 8. The DDC model is a restriction of DREU to deterministic decision trees that additionally satisfies the Bellman equation
\[
U_t(z_t, A_{t+1}) = v_t(z_t) + \varepsilon_t^{(z_t, A_{t+1})} + \delta E\Big[\max_{p_{t+1} \in A_{t+1}} U_{t+1}(p_{t+1}) \,\Big|\, \mathcal F_t\Big], \tag{6}
\]
where the functions $v_t : Z \to \mathbb R$ are deterministic; $\varepsilon_t : \Omega \to \mathbb R^{Y_t}$ is a vector of zero-mean payoff shocks, each entry $\varepsilon_t^{(z_t, A_{t+1})}$ corresponding to a deterministic option $(z_t, A_{t+1}) \in Y_t$; the filtration $(\mathcal F_t)$ is generated by $(\varepsilon_t)$; and $\delta \in (0, 1)$.

To guarantee nondegenerate likelihoods, the DDC literature assumes that $\varepsilon_t$ are continuously distributed with full support, see e.g., Rust (1994). Although the above definition assumes that $\varepsilon_t$ has finite support, the likelihoods will still be nondegenerate if $\varepsilon_t$ takes sufficiently large values with positive probability; we refer to this as the large support assumption. Another assumption often made in the DDC literature is that $\varepsilon_t$ are i.i.d.35

Definition 9. A DDC model has large support i.i.d. shocks (denoted DDC$_{l.i.}$) if $\varepsilon_t^{(z_t, A_{t+1})}$ and $\varepsilon_\tau^{(x_\tau, B_{\tau+1})}$ are independently and identically distributed random variables for all $(z_t, A_{t+1})$ and $(x_\tau, B_{\tau+1})$ and $\mu\big(\varepsilon_t^{(z_t, A_{t+1})} > D_t\big) > 0$ for all $(z_t, A_{t+1})$, where $D_t := \sup_{(z_t, A_{t+1}), (x_t, B_{t+1}) \in Y_t} E[U_t(z_t, A_{t+1}) - U_t(x_t, B_{t+1})]$.

34 Alternatively, we could study an extension of the DDC model to lotteries. One natural candidate is a linear extension, under which the DDC model is a special case of DREU. Other extensions to lotteries are possible, but they are less satisfactory, as they violate Axiom 0 and lead to counterintuitive comparative statics as pointed out in the static setting by Apesteguia and Ballester (2017). Our results in this section are independent of the extension since they hold on the subdomain $\mathcal A^d_t$.

35 See, e.g., Miller (1984), Rust (1989), Hendel and Nevo (2006), Kennan and Walker (2011), Sweeting (2011), and Gowrisankaran and Rysman (2012), all of whom assume in addition the extreme value distribution of $\varepsilon$.

Under this assumption, each option $(z_t, A_{t+1})$ gets its own shock, with an identical distribution irrespective of the size and nature of the continuation menu $A_{t+1}$. Moreover, the magnitude of this shock can be large enough to exceed the utility difference between any two options.

Under evolving utility the amount of randomness in choices from the menu $\{(z_t, A_{t+1}), (z_t, B_{t+1})\}$ depends on how much of the utility shock to payoffs in $t+1$ is realized already in period $t$: the more the agent learns in period $t$, the more random her choices are. In the extreme case where $u_t$ is uncorrelated over time, choices in period 0 are deterministic. Thus, the randomness of choices in this model is a reflection of the agent’s learning. On the other hand, DDC$_{l.i.}$ introduces an additional source of variability of the continuation value that is unrelated to the agent’s expectations about fundamentals (the $E[\cdot \mid \mathcal F_t]$ part of the Bellman equation) but is purely a mechanical effect of the variability of $\varepsilon_t$ across different continuation menus.

An important consequence of this assumption is that the agent misperceives option value and sometimes likes to commit to strictly smaller menus. Let $A_t := \{(z_t, A_{t+1}), (z_t, B_{t+1})\}$ where $A_{t+1} \subset B_{t+1}$. It follows immediately from (6) and Definition 9 that $\rho_t((z_t, A_{t+1}), A_t \mid h^t) \in (0, 0.5]$ after any history $h^t$. That is, with positive probability the agent chooses to commit to the smaller continuation menu $A_{t+1}$. Intuitively, this happens whenever $\varepsilon_t^{(z_t, A_{t+1})}$ exceeds $\varepsilon_t^{(z_t, B_{t+1})}$ by more than the expected utility difference of the two menus. On the other hand, from Axiom 6 it follows that in the evolving utility model, absent ties, $\rho_t((z_t, A_{t+1}), A_t \mid h^t) = 0$.

This phenomenon can also be illustrated in the context of Pakes’s (1986) model, where at each point in time the firm chooses whether to renew the patent at a cost or to let it irreversibly lapse.
The prediction of that model is that if the patent renewal fee is set to zero, the firm will hold on to the patent as long as possible. In contrast, DDC$_{l.i.}$ predicts that with positive probability the firm lets the patent lapse and thus decides not to capitalize on the pure option value of holding a costless patent.36

36 Pakes’s (1986) maintained assumption is that the instantaneous payoff is positive with probability one. Thus, under zero renewal costs, letting the patent lapse leads both to a smaller continuation menu and smaller instantaneous payoff.

Perhaps an even more striking illustration of negative value of information is that with probability bigger than 0.5 the DDC$_{l.i.}$ agent does not take advantage of costless information. Suppose that there are three periods $t = 0, 1, 2$ and the payoff in periods 0 and 1 is $x$ irrespective of the decision of the agent; for simplicity assume that the utility of $x$ is always zero. The payoff in period 2 is either $y$ or $z$, depending on the decision of the agent. The agent can choose between $y$ and $z$ either in period 1 or in period 2; the decision when to choose is made in period 0. Formally, in period 0 the agent faces the menu $A_0 = \{(x, A_1^{\text{early}}), (x, A_1^{\text{late}})\}$ and in period 1 she faces either the menu $A_1^{\text{early}} = \{(x, \{y\}), (x, \{z\})\}$ or the menu $A_1^{\text{late}} = \{(x, \{y, z\})\}$, depending on her period-0 choice. This is illustrated in Figure 2.

[Figure 2: Decision Timing. Decision tree: in period 0 the agent chooses between $(x, A_1^{\text{early}})$ and $(x, A_1^{\text{late}})$; after $(x, A_1^{\text{early}})$ she chooses between $(x, \{y\})$ and $(x, \{z\})$ in period 1; after $(x, A_1^{\text{late}})$ she receives $(x, \{y, z\})$ and chooses between $y$ and $z$ in period 2.]

In the evolving utility model, absent ties, the agent chooses to make decisions late with probability 1, because waiting an extra period gives her more information, which enables her to better tailor her choice to the state.37 To see this, note that
\[
U_0(x, A_1^{\text{early}}) = E\big[\max\{E[u_2(y) \mid \mathcal F_1],\, E[u_2(z) \mid \mathcal F_1]\} \,\big|\, \mathcal F_0\big],
\]
\[
U_0(x, A_1^{\text{late}}) = E\big[E[\max\{u_2(y), u_2(z)\} \mid \mathcal F_1] \,\big|\, \mathcal F_0\big].
\]
By the conditional Jensen inequality and convexity of the max operator, the agent always weakly prefers to decide late. Moreover, this preference is strict at $\omega$ as long as there exist $\omega', \omega'' \in \mathcal F_0(\omega)$ with $\mathcal F_1(\omega') = \mathcal F_1(\omega'')$ such that $u_2(y) - u_2(z)$ changes sign on $\{\omega', \omega''\}$. This preference for late decisions does not hold in the DDC$_{l.i.}$ model, where we have
\[
U_0(x, A_1^{\text{early}}) = \varepsilon_0^{(x, A_1^{\text{early}})} + E\big[\max\{\delta^2 v_2(y) + \delta\varepsilon_1^{(x, \{y\})},\ \delta^2 v_2(z) + \delta\varepsilon_1^{(x, \{z\})}\}\big],
\]
\[
U_0(x, A_1^{\text{late}}) = \varepsilon_0^{(x, A_1^{\text{late}})} + E\big[\max\{\delta^2 v_2(y) + \delta^2\varepsilon_2^{y},\ \delta^2 v_2(z) + \delta^2\varepsilon_2^{z}\}\big].
\]

The simplest case to analyze is when $v_2(y) = v_2(z)$: In this case, the comparison of the continuation values boils down to the comparison between $\delta E[\max\{\varepsilon_1^{(x, \{y\})}, \varepsilon_1^{(x, \{z\})}\}]$ and $\delta^2 E[\max\{\varepsilon_2^{y}, \varepsilon_2^{z}\}]$. Since the shocks $\varepsilon$ are i.i.d. with ex ante mean zero and $\delta \in (0, 1)$, the former dominates the latter, so that the agent chooses to decide early with probability greater than 0.5. Intuitively, the option of choosing early is attractive because early choices lead to early payoffs of $\varepsilon$, while deferring the choice delays those payoffs. Proposition 4 shows that this conclusion holds for any values of $v_2(y)$ and $v_2(z)$.

37 A related finding is Theorem 2 of Krishna and Sadowski (2016), who show that one agent prefers to commit to a constant consumption plan more than another agent if and only if his utility process is more autocorrelated, in other words, when he expects to learn less in the future.
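The comparison of early and late continuation values in the i.i.d. shock model can be reproduced by exact enumeration. The sketch below is a hedged illustration only: it assumes a hypothetical two-point shock distribution (values $\pm 1$ with equal probability), breaks period-0 ties uniformly, and uses made-up function names; none of these choices come from the paper.

```python
from fractions import Fraction as F
from itertools import product

# Assumed zero-mean shock distribution: (value, probability) pairs.
SHOCK = [(F(-1), F(1, 2)), (F(1), F(1, 2))]

def expected_max(a, b, scale):
    """E[max{a + scale*e1, b + scale*e2}] for independent shocks e1, e2 drawn from SHOCK."""
    return sum(p1 * p2 * max(a + scale * e1, b + scale * e2)
               for (e1, p1), (e2, p2) in product(SHOCK, SHOCK))

def early_versus_late(v_y, v_z, delta):
    """Continuation values of deciding early and late, and the period-0 probability of early."""
    early = expected_max(delta**2 * v_y, delta**2 * v_z, delta)     # shocks arrive in period 1
    late = expected_max(delta**2 * v_y, delta**2 * v_z, delta**2)   # shocks arrive in period 2
    prob_early = F(0)
    # Each period-0 option carries its own independent shock; ties are split evenly (assumption).
    for (e_early, p1), (e_late, p2) in product(SHOCK, SHOCK):
        diff = (e_early + early) - (e_late + late)
        prob_early += p1 * p2 * (1 if diff > 0 else F(1, 2) if diff == 0 else 0)
    return early, late, prob_early

early, late, prob_early = early_versus_late(F(0), F(0), F(9, 10))
_, _, prob_early_unequal = early_versus_late(F(1), F(0), F(9, 10))
```

With $v_2(y) = v_2(z) = 0$ and $\delta = 9/10$, the early continuation value exceeds the late one because the same expected maximum of shocks is discounted once rather than twice, and the agent decides early with probability above one half under these assumed shocks.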

Proposition 4. In the DDC$_{l.i.}$ model $\rho_0((x, A_1^{\text{early}}), A_0) \ge 0.5$.38

A special case of this result for logit shocks was proved by Fudenberg and Strzalecki (2015). Proposition 4 generalizes their result to all possible distributions of shocks. Fudenberg and Strzalecki (2015) also introduced a choice aversion parameter that scales the desire for flexibility and for late decisions; however, a model that always values flexibility and always leads to late choices does not correspond to any parameter value. One of the motivations of this paper is to develop a model, evolving utility, that has precisely these properties. In the next section we discuss the tradeoffs involved with choosing between the two kinds of models.

6.2 A Modeling Tradeoff

Evolving utility is a model of Bayesian rational agents, while the DDC$_{l.i.}$ model values commitment and prefers making decisions early rather than late. This casts doubt on the typical interpretation of $\varepsilon$ as “unobserved state variables” because no Bayesian rational agent would receive information that $\varepsilon_t^{(z,A)} > \varepsilon_t^{(z,B)}$ when $A$ is a strict subset of $B$. Another interpretation of $\varepsilon$ in the DDC literature is that they capture “mistakes” or some small deviations from perfect rationality. However, Proposition 4 shows that those deviations are not small as they happen with probability bigger than a half. Thus, DDC$_{l.i.}$ is a model of a different class of preferences than evolving utility, with patterns of behavior that are typical of agents who suffer from temptation or other behavioral problems but are inconsistent with Bayesian rationality.

Since the two models capture different preferences, the choice between them depends on the particular application. If behavioral elements, such as the preference for commitment, are important in a given application, the DDC$_{l.i.}$ model might be more suitable, whereas if the agents are profit maximizing firms, the evolving utility model might be more appropriate, as the DDC$_{l.i.}$ model would underestimate option value and can lead to potentially biased parameter estimates.39

Furthermore, the two models have different statistical properties: DDC$_{l.i.}$ has nondegenerate likelihoods so that all options are chosen with a positive probability, while evolving utility can lead to degenerate likelihoods whereby an alternative is chosen with probability zero. The former property is more desirable from an estimation viewpoint; however, as discussed above, the DDC$_{l.i.}$ preferences may be appropriate only in some settings. We are not aware of a statistically well-behaved model that captures preferences similar to evolving utility; however, the following model is a step in this direction.

Definition 10. A DDC model has shocks to payoffs (denoted DDC$_{s.t.p.}$) if $\varepsilon_t^{(z_t, A_{t+1})} = \varepsilon_t^{(z_t, B_{t+1})} =: \varepsilon_t^{z_t}$ for all $z_t$ and $A_{t+1}, B_{t+1}$.

38 As evident from the proof, this result only relies on the i.i.d. assumption and does not depend on the support of $\varepsilon_t$.

39 The quantitative importance of such biases is an empirical question, which is beyond the scope of this paper.

A particularly tractable subclass is one where $\varepsilon_t^{z_t}$ are independent across $t$ and $z_t$. In general, DDC$_{s.t.p.}$ is a special case of evolving utility, where $u_t(z_t) := v_t(z_t) + \varepsilon_t^{z_t}$ and $(\mathcal F_t)$ is generated by $(u_t)$. Thus, in this model the agent does not receive any purely informative signals about future payoffs: all information about the future has immediate payoff consequences through $\varepsilon_t$. In many applications this is a reasonable modeling choice; for example the Pakes (1986) model is a DDC model with shocks to payoffs.

6.3 Identification

There is an extensive literature on identification of DDC models. Those results rely mostly on the jointly observable variables, which we have excluded from our analysis, while keeping the menu fixed in every period. On the other hand, our uniqueness results in Section 5.1 rely on the assumption that the analyst observes choices from different menus. More precisely, in the DDC literature the analyst may observe choices from different menus, but menus are determined by the jointly observable variable which also is an argument of the utility function, which prevents a clean separation. Identification results in static models sometimes do utilize menu variation; however, we are not aware of similar results in the dynamic setting.40 As a result, the two sets of results are mostly complementary. In the rest of this section we identify the most relevant points of contact, but an exhaustive comparison is beyond the scope of this paper.

Manski (1993) and Rust (1994) show that in a DDC model it is not possible to distinguish a myopic agent ($\delta = 0$) from a patient agent ($\delta > 0$). We show that in the evolving utility model these two cases can be distinguished based on menu variation. As Magnac and Thesmar (2002) show, the discount factor can be identified using additional assumptions on how the utility function depends on the jointly observable variable. A similar result holds for evolving utility, for example under the assumption that each alternative $z$ consists of wealth and a consumption bundle and the utility of wealth is separable and state-independent. As we show, another special case when the discount factor can be identified is the gradual learning model.

Many of the results about the identification of the utility function $v$ assume a known distribution of $\varepsilon$ that is conditionally independent, see e.g., Magnac and Thesmar (2002).41 Although the per-period utilities are not non-parametrically identified, certain differences in value functions are. Our approach partially identifies the distribution of $u$ (up to positive affine transformations), which corresponds to identifying $v$ and the distribution of $\varepsilon$. This is similar to the partial identification results of Norets and Tang (2013), who also relax the known distribution of $\varepsilon$ assumption. However, Norets and Tang (2013) maintain the conditional independence assumption, whereas our uniqueness result holds also under any possible pattern of serial correlation.42

40 For example the BLP-style models (Berry, Levinsohn, and Pakes, 1995) of static demand use variation of choice set across markets. In dynamic models exclusion restrictions on terminal states and renewal states are often used (Magnac and Thesmar, 2002); intuitively, they play a similar role to menu variation. Another intuitively related set of results is due to Norets and Tang (2013) who use exogenous variation in the transition probabilities of observed variables.

41 Rust (1994) discusses identification of $v$ in a deterministic choice model without unobservable shocks.

42 Kasahara and Shimotsu (2009) obtain identification results for finite mixtures of conditionally independent DDC models and Hu and Shum (2012) generalize these results to other forms of serial correlation. However, those papers only have reduced form results: they identify the choice probabilities conditional on unobserved heterogeneity and the mixing probabilities; they are not after the identification of utility functions.

7 Extension: Consumption Dependence

Our analysis thus far has focused on isolating a form of history dependence where choices in period $t$ depend on histories $h^{t-1} = (A_0, p_0, \ldots, A_{t-1}, p_{t-1})$ of past choices purely due to the fact that $h^{t-1}$ partially reveals the agent’s serially correlated private information. For this analysis the sequence $(z_0, \ldots, z_{t-1})$ of the agent’s past consumptions was immaterial, because the fact that $z_k \in \operatorname{supp} p_k^Z$ was realized does not reveal any additional private information to the analyst. In many settings, however, an additional channel through which the agent’s past choices can directly affect her current choices is that her past consumption may change her current preferences. Two prominent examples of this phenomenon, which we call consumption dependence, are habit formation (e.g., Becker and Murphy (1988)), where consuming a certain good in the past may make the agent like it more in the present; and active learning/experimentation, where the agent’s consumption provides information to her about some payoff-relevant state of the world, as modeled for instance by the multi-armed bandit literature (e.g., Robbins (1952), Gittins and Jones (1972)).

The present section, in conjunction with Supplementary Appendix K which contains all details, shows that our main insights extend to settings with consumption dependence. To this end, we enrich our primitive: A history $\mathfrak h^{t-1} = (A_0, p_0, z_0, \ldots, A_{t-1}, p_{t-1}, z_{t-1})$ now summarizes not only that at each period $k \le t-1$ the agent faced menu $A_k$ and chose $p_k$, but also that the agent’s realized consumption was $z_k \in \operatorname{supp} p_k^Z$. Conditional on $\mathfrak h^{t-1}$, the analyst observes the frequency $\rho_t(p_t, A_t \mid \mathfrak h^{t-1})$ with which the agent chooses $p_t$ from any menu $A_t$ such that $(z_{t-1}, A_t) \in \operatorname{supp} p_{t-1}$.

Theorem 5 in Appendix K shows that natural adaptations of Axioms 1–4 to this setting are equivalent to ρ admitting a consumption-dependent DREU (CDREU) representation: At each time $t$, the agent’s choices maximize her vNM utility $U_{s_t} \in \mathbb R^{X_t}$, which is determined by a subjective state $s_t$ drawn from a finite state space $S_t$. There is an initial distribution $\mu_0 \in \Delta(S_0)$, and at each $t$, today’s state $s_t$ and consumption $z_t$ jointly determine the distribution $\mu_{t+1}^{s_t, z_t} \in \Delta(S_{t+1})$ over tomorrow’s states. The full formal specification of the representation, including its tie-breaking rule, is in Appendix K. The adaptations of Axioms 1 and 2 still impose history independence of $\rho_t$ across histories $\mathfrak h^{t-1}$ and $\mathfrak g^{t-1}$ that are contraction equivalent or linearly equivalent. The sole difference is that in order for such $\mathfrak h^{t-1}$ and $\mathfrak g^{t-1}$ to reveal the same private information, we now additionally require that they entail the same sequences $(z_0, \ldots, z_{t-1})$ of realized consumptions; otherwise $\mathfrak h^{t-1}$ and $\mathfrak g^{t-1}$ may correspond to different distributions of subjective states due to the consumption dependence of the transition distributions $\mu_{k+1}^{s_k, z_k}$.

As before, we also characterize two important special cases of CDREU that feature dynamic sophistication. Consumption-dependent evolving utility requires $U_{s_t}$ to be given by a Bellman equation of the form
\[
U_{s_t}(z_t, A_{t+1}) = u_{s_t}(z_t) + \delta E\Big[\max_{p_{t+1} \in A_{t+1}} U_{s_{t+1}}(p_{t+1}) \,\Big|\, s_t, z_t\Big], \tag{7}
\]
where the expectation operator is with respect to $\mu_{t+1}^{s_t, z_t}$. Theorem 6 shows that consumption-dependent evolving utility is obtained from CDREU by additionally imposing analogs of the DLR Menu Preference and Sophistication axioms used to characterize evolving utility. Unlike with evolving utility, (7) does not imply any analog of Separability, as current consumption $z_t$ can influence preferences over continuation menus $A_{t+1}$ through its effect on the distribution over tomorrow’s utilities $U_{s_{t+1}}$.

Consumption-dependent evolving utility can accommodate a number of behavioral phenomena where past consumption directly affects current utility. Example 6 below illustrates this for the case of habit formation;43 related phenomena include preference for variety (e.g., McAlister (1982), Rustichini and Siconolfi (2014)), memorable consumption (Gilboa, Postlewaite, and Samuelson, 2016), and endogenous discounting (e.g., Uzawa (1968), Becker and Mulligan (1997)).44 In contrast with most existing models of these phenomena, our formulation allows the effect of past consumption on today’s utility to be stochastic, which is arguably a realistic feature in many contexts.

Example 6 (Habit formation). Suppose $\mathcal V \subseteq \mathbb R^Z$ is a finite set of consumption utilities. There is an initial distribution $\pi_0 \in \Delta(\mathcal V)$ and a map $\pi : \mathcal V \times Z \to \Delta(\mathcal V)$, capturing the stochastic transition from today’s consumption utility and consumption to tomorrow’s consumption utility. At each period $t$ and current consumption utility $v_t \in \mathcal V$, the agent maximizes
\[
U_t^{v_t}(z_t, A_{t+1}) = v_t(z_t) + \delta \int \max_{p_{t+1} \in A_{t+1}} U_{t+1}^{v_{t+1}}(p_{t+1}) \, d\pi(v_t, z_t)(v_{t+1})
\]
for $t \le T-1$ and $U_T^{v_T}(z_T) = v_T(z_T)$. To illustrate how this representation can capture a stochastic form of habit formation, suppose for simplicity that $Z = \{0, 1\}$ and $\mathcal V = \{v^0, v^1\}$, where $v^1(1) = v^0(0) = 1$ and $v^1(0) = v^0(1) = 0$, and let $\pi_{ijk} := \pi(v^i, j)(v^k)$ for $i, j, k \in \{0, 1\}$. Then the agent displays habit formation if $\pi_{111} > \pi_{101}$ and $\pi_{011} > \pi_{001}$, so that she is more likely to prefer 1 to 0 today if she preferred 1 yesterday and/or consumed 1 yesterday.

43 Several papers have characterized versions of habit formation focusing on deterministic choice, e.g., Gul and Pesendorfer (2007), Rozen (2010). To the best of our knowledge, the non-axiomatic work by Gilboa and Pazgal (2001) is the only stochastic choice model.

44 Higashi, Hyogo, and Takeoka (2014) characterize a stochastic version of endogenous discounting, but have as their primitive deterministic ex-ante preferences over infinite-horizon decision problems.

Another important special case of consumption-dependent evolving utility is active learning, where
\[
u_{s_t} = E\big[u_{s_{t+1}} \mid s_t, z_t\big] \quad \forall z_t. \tag{8}
\]

Capturing the fact that felicity is fixed but unknown to the agent who learns about it over time, (8) requires that her expectation of tomorrow’s felicity equal today’s felicity. Unlike with gradual learning, today’s consumption zt can have an effect on tomorrow’s felicity by affecting tomorrow’s information; but unlike with general consumption-dependent evolving utility, this effect is purely informational, and hence zt does not affect the agent’s expectation of tomorrow’s felicity. As with gradual learning, this additional discipline allows unique identification of the discount factor. Theorem 7 shows that active learning is obtained from consumption-dependent evolving utility by additionally imposing analogs of Axioms 8 and 9 used to characterize gradual learning, as well as a weak form of Separability. The latter requires that if from tomorrow on the agent is committed to a particular stream of consumption lotteries (`t+1 , . . . , `T ), then her preference over today’s consumption zt does not depend on (`t+1 , . . . , `T ). This captures the idea that consumption lottery streams, unlike general continuation menus At+1 , only entail degenerate future choices; hence, today’s consumption does not have any informational value in this case, so that the agent evaluates today’s consumption myopically, based solely on today’s felicity ust . Example 7 below shows that active learning nests the standard independent multi-armed bandit model, as well as models that allow for correlation of arms (e.g., Easley and Kiefer 1988, Aghion, Bolton, Harris, and Jullien 1991).45 (The active learning model is more general than the one from Example 7 as it allows for very general signal structure, not necessarily i.i.d. conditional on the state.) Example 7 (Experimentation). Now period-t states correspond to period-t beliefs about a state of the world θ ∈ Θ, which captures uncertainty about the true underlying consumption utility u˜θ ∈ RZ . 
There is a prior $\pi_0 \in \Delta(\Theta)$ and consuming any particular $z$ produces a signal about $\Theta$ whose distribution is i.i.d. over time conditional on $z$ and $\theta$. (This specification is general enough to allow the agent to learn about the utility of item $x$ from consuming item $z$; imposing a further product structure on $\Theta$ leads to the usual independent bandit case.) The signal structure induces a map $\pi : \Delta(\Theta) \times Z \to \Delta(\Delta(\Theta))$ from prior beliefs $\nu \in \Delta(\Theta)$ and consumptions $z$ to distributions $\pi(\nu, z) \in \Delta(\Delta(\Theta))$ over posterior beliefs. At each period $t$ and belief $\nu_t$, the agent maximizes

$$U_t^{\nu_t}(z_t, A_{t+1}) = \int \tilde{u}_\theta(z_t)\, d\nu_t(\theta) + \delta \int \max_{p_{t+1} \in A_{t+1}} U_{t+1}^{\nu_{t+1}}(p_{t+1})\, d\pi(\nu_t, z_t)(\nu_{t+1})$$

for $t \le T-1$ and $U_T^{\nu_T}(z_T) = \int \tilde{u}_\theta(z_T)\, d\nu_T(\theta)$. Note that by the martingale property of beliefs we have $\nu = \int \nu'\, d\pi(\nu, z)(\nu')$ for all $\nu$ and $z$, so that

$$u_{\nu_t} := \int \tilde{u}_\theta\, d\nu_t(\theta) = \iint \tilde{u}_\theta\, d\nu_{t+1}(\theta)\, d\pi(\nu_t, z_t)(\nu_{t+1}) =: E[u_{\nu_{t+1}} \mid \nu_t, z_t]$$

for all $\nu_t, z_t$, as required by (8).

Finally, Heckman (1981) highlights the importance of distinguishing between (what we term) history dependence and consumption dependence, so as to avoid spuriously attributing a causal role to past consumption when observed behavior could instead be explained through serially correlated private information, such as persistent taste heterogeneity. The following condition allows us to make this distinction:

45 Hyogo (2007), Cooke (2016) and Piermont, Takeoka, and Teper (2016) axiomatize related models, taking as their primitive deterministic ex ante preferences over decision problems.

Axiom 10 (Consumption Independence). For all $t \le T-1$, if $h^{t-1} = (A_0, p_0, z_0, \ldots, A_{t-1}, p_{t-1}, z_{t-1})$ and $g^{t-1} = (A_0, p_0, z_0', \ldots, A_{t-1}, p_{t-1}, z_{t-1}')$, then $\rho_t(\cdot \mid h^{t-1}) = \rho_t(\cdot \mid g^{t-1})$.

Consumption independence states that sequences (z0 , . . . , zt−1 ) of realized consumptions are in fact immaterial for observed choice behavior. When ρ admits a CDREU representation, it is easy to see that it admits a DREU representation if and only if consumption independence is satisfied.46 Thus, this condition captures precisely the observed choice behavior that can be explained through serially correlated private information alone.
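The martingale argument behind condition (8) in Example 7 can be checked numerically. The following is a minimal sketch, assuming a hypothetical two-state world, two items, and a binary signal; all names (`THETA`, `UTILITY`, `SIGNAL`, `posteriors`, `felicity`) and all numbers are illustrative and not taken from the paper. It verifies that, whichever item is consumed today, the expected felicity of every item tomorrow equals its felicity today.

```python
from fractions import Fraction as F

# Hypothetical primitives (illustrative only, not from the paper).
THETA = ["good", "bad"]
UTILITY = {"good": {"x": F(1), "y": F(0)}, "bad": {"x": F(-1), "y": F(0)}}
# SIGNAL[z][theta][sig]: probability of signal `sig` after consuming z in state theta.
SIGNAL = {
    "x": {"good": {"hi": F(3, 4), "lo": F(1, 4)}, "bad": {"hi": F(1, 4), "lo": F(3, 4)}},
    "y": {"good": {"hi": F(1, 2), "lo": F(1, 2)}, "bad": {"hi": F(1, 2), "lo": F(1, 2)}},
}


def posteriors(prior, z):
    """Return (signal probability, posterior belief) pairs, i.e. the map pi(nu, z)."""
    out = []
    for sig in ("hi", "lo"):
        p_sig = sum(prior[th] * SIGNAL[z][th][sig] for th in THETA)
        if p_sig == 0:
            continue  # signals of probability zero induce no posterior
        post = {th: prior[th] * SIGNAL[z][th][sig] / p_sig for th in THETA}
        out.append((p_sig, post))
    return out


def felicity(belief, item):
    """Expected consumption utility of `item` under `belief`."""
    return sum(belief[th] * UTILITY[th][item] for th in THETA)


def expected_next_felicity(prior, z, item):
    """E[u_{t+1}(item) | prior, z]: average felicity over induced posteriors."""
    return sum(p * felicity(post, item) for p, post in posteriors(prior, z))


prior = {"good": F(2, 3), "bad": F(1, 3)}
for z in ("x", "y"):
    for item in ("x", "y"):
        # Condition (8): today's consumption does not shift expected felicity.
        assert expected_next_felicity(prior, z, item) == felicity(prior, item)
```

Exact rational arithmetic is used so that the equality in (8) holds exactly rather than up to floating-point error. Consuming `x` is informative here and consuming `y` is not, yet both leave expected felicity unchanged, which is the sense in which the effect of consumption is purely informational.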

8 Proof Sketch of Theorem 1

The proof of Theorem 1 appears in Appendix B. Here we sketch the argument for sufficiency in the two-period setting ($T = 1$).

Step 1: Static random expected utility representations. In Supplementary Appendix F we prove Theorem 0, which extends the characterization of static random expected utility in Gul and Pesendorfer (2006) and Ahn and Sarver (2013) to arbitrary separable metric spaces of outcomes. Since each $X_t$ ($t = 0, 1$) is a separable metric space (Lemma 12), Axiom 3 (History-dependent REU) together with this theorem yields a static REU representation $(\Omega_0, \mathcal{F}_0^*, \mathcal{F}_0, \mu_0, U_0, W_0)$ of $\rho_0$ and, for each $h^0 \in \mathcal{H}_0$, a static REU representation $(\Omega_1^{h^0}, \mathcal{F}_1^{*h^0}, \mu_1^{h^0}, \mathcal{F}_1^{h^0}, U_1^{h^0}, W_1^{h^0})$ of $\rho_1(\cdot \mid h^0)$. Thus far, there is no relationship between the period-0 and period-1 representations. In the following, we use Axioms 1, 2, and 4 to combine them into a DREU representation, which requires $\rho_1(p_1, A_1 \mid A_0, p_0)$ to be represented as a conditional probability, with respect to a single underlying probability space $\Omega$, of the event $C(p_1, A_1)$ given the event $C(p_0, A_0)$.

46 Reducing consumption-dependent evolving utility (respectively, active learning) to evolving utility (respectively, gradual learning) additionally requires Separability.

Figure 3: Suppose $S_0 = \{s_0^1, s_0^2, s_0^3\}$ with corresponding utilities $U_0^1, U_0^2, U_0^3$. Menu $D_0$ is a separating menu from which $q_0^i$ is chosen precisely in state $s_0^i$. In menu $A_0 = \{p_0, r_0\}$, $p_0$ is chosen with probability 1 in state $s_0^1$; tied with $r_0$ in $s_0^2$; and never chosen in $s_0^3$. In $\hat{A}_0 = \frac{1}{2} A_0 + \frac{1}{2} D_0$, $p_0$ is replaced with three copies $\{p_0^1, p_0^2, p_0^3\}$: each $p_0^i$ is chosen in state $s_0^i$ with the same probability with which $p_0$ is chosen at $s_0^i$ and is never chosen otherwise. Step 3 shows choice probabilities following $(\hat{A}_0, p_0^i)$ are the same as following $(D_0, q_0^i)$. Step 4 shows choice probabilities following $(A_0, p_0)$ are a weighted sum of choice probabilities following $(\hat{A}_0, p_0^i)$, with weights given by $\mu_0(s_0^i \mid C_0(p_0, A_0))$. Combined with the static representation of $\rho_1(\cdot \mid D_0, q_0^i)$ (Step 2), this yields $\rho_1(p_1, A_1 \mid A_0, p_0) = \sum_{i=1}^{3} \mu_1^{s_0^i}(C_1^i(p_1, A_1))\, \mu_0(s_0^i \mid C_0(p_0, A_0))$.

Step 2: Period-1 choices conditional on period-0 states. Let $S_0$ be the finite partition of $\Omega_0$ that generates $\mathcal{F}_0$. We refer to cells $s_0 \in S_0$ as states and let $U_{s_0} = U_0(\omega)$ for any $\omega \in s_0 \in S_0$. For any history $(A_0, p_0)$, define $\mathcal{U}_0(A_0, p_0) := \{U_{s_0} : s_0 \in S_0 \text{ and } p_0 \in M(A_0, U_{s_0})\}$ to be the set of period-0 utilities consistent with the choice of $p_0$ from $A_0$. Since $(U_0, \mathcal{F}_0)$ is simple, each $U_{s_0}$ is nonconstant and induces a different preference, so by standard arguments (Lemma 13 in the appendix) we can find a menu $D_0 = \{q_0^{s_0} : s_0 \in S_0\}$ that strictly separates all states, i.e., such that for any $s_0^* \in S_0$ we have $\mathcal{U}_0(D_0, q_0^{s_0^*}) = \{U_{s_0^*}\}$. Figure 3 shows an example. For the remainder of this proof sketch, fix such a separating menu $D_0$ and define $\rho_1^{s_0}(p_1, A_1) := \rho_1(p_1, A_1 \mid D_0, q_0^{s_0})$ for each $A_1$ and $p_1$. The representation of $\rho_1(\cdot \mid D_0, q_0^{s_0})$ obtained in Step 1 then constitutes a static REU representation $(\Omega_1^{s_0}, \mathcal{F}_1^{*s_0}, \mu_1^{s_0}, U_1^{s_0}, W_1^{s_0})$ of $\rho_1^{s_0}$.

Step 3: $\rho_1^{s_0}$ is well-defined. We now use Linear History Independence, Contraction History Independence, and History Continuity to show that for any $(B_0, q_0) \in \mathcal{H}_0$ such that $\mathcal{U}_0(B_0, q_0) = \{U_{s_0}\}$, we have $\rho_1(\cdot, A_1 \mid B_0, q_0) = \rho_1^{s_0}(\cdot, A_1)$. To see this, assume first that $M(B_0, U_{s_0}) = \{q_0\}$, i.e., $q_0$ is the unique maximizer of $U_{s_0}$ in $B_0$. Define $r_0 := \frac{1}{2} q_0 + \frac{1}{2} q_0^{s_0}$, $\tilde{B}_0 := \frac{1}{2} B_0 + \frac{1}{2} q_0^{s_0}$, and $\tilde{D}_0 := \frac{1}{2} D_0 + \frac{1}{2} q_0$. Then $\mathcal{U}_0(\tilde{B}_0, r_0) = \mathcal{U}_0(\tilde{B}_0 \cup \tilde{D}_0, r_0) = \mathcal{U}_0(\tilde{D}_0, r_0) = \{U_{s_0}\}$. From the static REU representation of $\rho_0$ and because $M(B_0, U_{s_0}) = \{q_0\}$, it follows that

$$\rho_0(r_0, \tilde{B}_0) = \rho_0(r_0, \tilde{B}_0 \cup \tilde{D}_0) = \rho_0(r_0, \tilde{D}_0) = \mu_0(s_0). \tag{9}$$

But then

$$\rho_1(\cdot, A_1 \mid B_0, q_0) = \rho_1(\cdot, A_1 \mid \tilde{B}_0, r_0) = \rho_1(\cdot, A_1 \mid \tilde{B}_0 \cup \tilde{D}_0, r_0) = \rho_1(\cdot, A_1 \mid \tilde{D}_0, r_0) = \rho_1(\cdot, A_1 \mid D_0, q_0^{s_0}) = \rho_1^{s_0}(\cdot, A_1),$$

where the first and fourth equalities follow from Axiom 2 (Linear History Independence), the second and third equalities from Axiom 1 (Contraction History Independence) and (9), and the final equality holds by definition. Finally, using Axiom 4 (History Continuity), we can extend this argument to the case where in state $s_0$, $q_0$ is tied with other lotteries in $B_0$.

Step 4: Splitting histories into states. Now consider a general history $h^0 = (A_0, p_0)$. By mixing with the separating menu $D_0$, we can decompose $\rho_1(\cdot \mid h^0)$ into a weighted sum of choice probabilities conditional on each state $s_0$, where the weight on $s_0$ is the $\mu_0$-conditional probability of $s_0$ given history $h^0$. Concretely, let $\hat{A}_0 := \frac{1}{2} A_0 + \frac{1}{2} D_0$, and for any $s_0 \in S_0$, let $p_0^{s_0} := \frac{1}{2} p_0 + \frac{1}{2} q_0^{s_0}$. This is depicted in Figure 3. Note that by construction of $D_0$ and the representation of $\rho_0$, we have that $\rho_0(p_0^{s_0}, \hat{A}_0) = \mu_0(C_0(p_0, A_0) \mid s_0)\, \mu_0(s_0)$, where $C_0(p_0, A_0)$ is the event in $\Omega_0$ that $p_0$ is chosen from $A_0$. Moreover, whenever $\rho_0(p_0^{s_0}, \hat{A}_0) > 0$, then $\mathcal{U}_0(\hat{A}_0, p_0^{s_0}) = \{U_{s_0}\}$, so Step 3 together with the representation of $\rho_1^{s_0}$ implies $\rho_1(p_1, A_1 \mid \hat{A}_0, p_0^{s_0}) = \rho_1^{s_0}(p_1, A_1) = \mu_1^{s_0}(C_1^{s_0}(p_1, A_1))$, where $C_1^{s_0}(p_1, A_1)$ is the event in $\Omega_1^{s_0}$ that $p_1$ is chosen from $A_1$. Then

$$\begin{aligned}
\rho_1(p_1, A_1 \mid A_0, p_0) &= \rho_1\!\left(p_1, A_1 \,\Big|\, \hat{A}_0, \tfrac{1}{2} p_0 + \tfrac{1}{2} D_0\right) = \sum_{s_0 \in S_0} \rho_1(p_1, A_1 \mid \hat{A}_0, p_0^{s_0})\, \frac{\rho_0(p_0^{s_0}, \hat{A}_0)}{\sum_{s_0' \in S_0} \rho_0(p_0^{s_0'}, \hat{A}_0)} \\
&= \sum_{s_0 \in S_0} \mu_1^{s_0}(C_1^{s_0}(p_1, A_1))\, \frac{\mu_0(C_0(p_0, A_0) \mid s_0)\, \mu_0(s_0)}{\sum_{s_0' \in S_0} \mu_0(C_0(p_0, A_0) \mid s_0')\, \mu_0(s_0')} \\
&= \sum_{s_0 \in S_0} \mu_1^{s_0}(C_1^{s_0}(p_1, A_1))\, \mu_0(s_0 \mid C_0(p_0, A_0)).
\end{aligned} \tag{10}$$

Indeed, the first equality follows from Linear History Independence, the second equality from the definition of $\rho_1$ conditional on a set of histories, the third from the observations of the preceding paragraph, and the fourth from Bayes' rule.

Step 5: Completing the proof. Now define $\Omega = \bigcup_{s_0 \in S_0} s_0 \times \Omega_1^{s_0}$. In the natural way, the partitions $S_0$ of $\Omega_0$ and $S_1^{s_0}$ of $\Omega_1^{s_0}$ induce a finitely generated filtration on $\Omega$, and the random utilities and tie-breakers on $\Omega_0$ and $\Omega_1^{s_0}$ induce processes of utilities and tie-breakers on $\Omega$.47 Define $\mu$ on $\Omega$ by $\mu(E_0 \times E_1) = \mu_0(E_0) \times \mu_1^{s_0}(E_1)$ for any measurable $E_0 \subseteq s_0$, $E_1 \subseteq \Omega_1^{s_0}$. From the construction of $\Omega$ and (10), it is then easy to see that $\rho_1(p_1, A_1 \mid h^0) = \mu(C(p_1, A_1) \mid C(p_0, A_0))$, where $C(p_t, A_t)$ denotes the event in $\Omega$ that $p_t$ is chosen from $A_t$. Thus, $\rho$ admits a DREU representation, as required.

47 Specifically, let $\mathcal{F}_0$ be generated by the partition $\{s_0 \times \Omega_1^{s_0} : s_0 \in S_0\}$ and $\mathcal{F}_1$ by the partition $\{s_0 \times s_1 : s_0 \in S_0, s_1 \in S_1^{s_0}\}$. For any $(\omega_0, \omega_1) \in s_0 \times \Omega_1^{s_0}$, let $U_0(\omega_0, \omega_1) = U_0(\omega_0)$, $U_1(\omega_0, \omega_1) = U_1^{s_0}(\omega_1)$ and $W_0(\omega_0, \omega_1) = W_0(\omega_0)$, $W_1(\omega_0, \omega_1) = W_1^{s_0}(\omega_1)$.
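The decomposition in Step 4 is an application of Bayes' rule and can be illustrated with a small numerical sketch. The three states and all probabilities below are hypothetical, loosely patterned on Figure 3 (with the tie in the second state assumed to be broken 50/50); the function names are illustrative and not part of the paper.

```python
from fractions import Fraction as F

# Hypothetical primitives (illustrative only).
mu0 = {"s1": F(1, 2), "s2": F(1, 4), "s3": F(1, 4)}      # prior over period-0 states
# mu0(C0(p0, A0) | s0): probability that p0 is chosen from A0 in each state.
choose_p0 = {"s1": F(1), "s2": F(1, 2), "s3": F(0)}
# State-dependent period-1 choice probabilities rho_1^{s0}(p1, A1).
rho1_state = {"s1": F(9, 10), "s2": F(1, 2), "s3": F(1, 10)}


def posterior_over_states(prior, likelihood):
    """Bayes' rule: mu0(s0 | C0(p0, A0)) from mu0(s0) and mu0(C0(p0, A0) | s0)."""
    joint = {s: likelihood[s] * prior[s] for s in prior}
    total = sum(joint.values())
    if total == 0:
        raise ValueError("the conditioning history has probability zero")
    return {s: joint[s] / total for s in joint}


def rho1_given_history(prior, likelihood, rho1_by_state):
    """Last line of (10): posterior-weighted sum of state-dependent probabilities."""
    post = posterior_over_states(prior, likelihood)
    return sum(rho1_by_state[s] * post[s] for s in prior)


def rho1_as_conditional(prior, likelihood, rho1_by_state):
    """Same object computed as a conditional probability on the product space."""
    joint_both = sum(prior[s] * likelihood[s] * rho1_by_state[s] for s in prior)
    prob_history = sum(prior[s] * likelihood[s] for s in prior)
    return joint_both / prob_history


value = rho1_given_history(mu0, choose_p0, rho1_state)
assert value == rho1_as_conditional(mu0, choose_p0, rho1_state)
```

With these illustrative numbers the posterior puts weight 4/5 on the first state and 1/5 on the second, so both functions return 41/50. The agreement of the two computations is the content of Step 5: the weighted sum in (10) coincides with a conditional probability under the product measure.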


9 Discussion

9.1 Related Literature

An extensive literature studies axiomatic characterizations of random utility models in the static setting (Barberà and Pattanaik, 1986; Block and Marschak, 1960; Falmagne, 1978; Luce, 1959; McFadden and Richter, 1990).48 Our approach incorporates as its static building block the elegant axiomatization of Gul and Pesendorfer (2006) and Ahn and Sarver (2013). A technical contribution of our paper is the extension of their result to an infinite outcome space, which is needed since the space of continuation problems in the dynamic model is infinite. A recent paper by Lu and Saito (2016) studies period-0 random choice between consumption lottery streams and attributes the source of randomness of choice to the stochastic discount factor.

The axiomatic literature on dynamic stochastic choice is very sparse. Our choice domain is exactly the one of Kreps and Porteus (1978); however, while they study deterministic choice in each period, we focus on random choice in each period. To the best of our knowledge, Fudenberg and Strzalecki (2015) is the only other axiomatic study of stochastic choice in general decision trees, but they focus on the special parametric case of logit utility shocks that are i.i.d. over time, while we characterize a fully non-parametric dynamic random utility model and allow for serially correlated utilities.49 Because of their i.i.d. assumption, their representation does not give rise to history dependent choice behavior and cannot accommodate phenomena such as learning, choice persistence, and consumption dependence; likewise, challenges such as limited observability do not arise in their setting.

Ahn and Sarver (2013) study a model where the analyst observes deterministic period 0 choices between menus and period 1 random choices from menus. They show how to connect the analysis of Gul and Pesendorfer (2006) and of Dekel, Lipman, and Rustichini (2001) to obtain better identification properties.
Their sophistication axiom plays a key role in our characterization of evolving utility. Our work generalizes theirs by allowing for information to arrive in period 0, which makes it necessary to study random choices also in period 0. As discussed, this allows us to study the behavioral implications of serially correlated private information and to make a connection to the dynamic discrete choice literature.

The literature on static and deterministic preferences over menus (Dekel, Lipman, and Rustichini, 2001; Dekel, Lipman, Rustichini, and Sarver, 2007; Dillenberger, Lleras, Sadowski, and Takeoka, 2014; Kreps, 1979) assumes that in period 0 the agent does not receive any information, but anticipates receiving information in period 1. This leads to a preference for bigger menus over smaller ones to capitalize on option value. The papers closest to ours are Krishna and Sadowski (2014) and Krishna and Sadowski (2016) who study preferences over infinite horizon dynamic choice problems that are stationary versions of our evolving utility representation (making them unsuited to study gradual learning). The main difference is that they assume the existence of a special period 0 that involves no randomness, thanks to which they can focus on period 0 deterministic preferences. By contrast, we allow information to arrive in each period (and therefore focus on random choice) and study the predictions of the model for behavior in each period (instead of just in period 0). This allows us to make more precise comparisons with models of dynamic discrete choice, which by design look at behavior in each period and are meant to be estimated on data sets that involve multiperiod decisions.

Within the dynamic discrete choice literature, seminal models include Miller (1984), Wolpin (1984), Pakes (1986), and Rust (1987); see Rust (1994), Aguirregabiria and Mira (2010) and Arcidiacono and Ellickson (2011) for surveys. Our results provide an axiomatic foundation for these models; we also relate our uniqueness results to the identification results in that literature. A prominent special case of the general model is the so-called dynamic logit model used by Rust (1989), Hendel and Nevo (2006), Kennan and Walker (2011), Sweeting (2011), and Gowrisankaran and Rysman (2012).

48 Lu (2016) studies a model with an objective state space and choice between acts which traces all randomness of choice to random arrival of signals. This is similar in spirit to our gradual learning representation; however, our states are subjective and utility can be state-dependent. Another recent contribution by Apesteguia, Ballester, and Lu (2017) considers a setting in which choice options are linearly ordered.

49 Gul, Natenzon, and Pesendorfer (2014) and Ke (2016) study settings in which the agent receives an outcome only once at the end of a decision tree, and characterize stochastic choice models of bounded rationality. There is also non-axiomatic work studying various special cases of our representation on limited domains, where the agent makes a one-time consumption choice at an endogenous or exogenous stopping time, e.g., Fudenberg, Strack, and Strzalecki (2016) and Natenzon (2016).
As mentioned, Fudenberg and Strzalecki (2015) provide an axiomatization of this model; theirs is also the first paper to note that this model induces a negative value of information. Our results in Section 6 generalize theirs.

9.2 Conclusion

This paper provides the first analysis of the fully general and non-parametric model of dynamic random utility. When utilities are serially correlated, a key new feature relative to the static and i.i.d. model is that choices appear history dependent, a pervasive phenomenon in economic applications. We axiomatically characterize the implied dynamic stochastic choice behavior, identifying the precise form of history dependence that can arise.

Stochastic choice data in dynamic domains lets us distinguish important models that coincide in the static setting. In particular, choices that arise from learning rather than from more general taste shocks display a form of stationary consumption preference, capturing the martingale property of beliefs. Moreover, by distinguishing between past choices and realized consumption, we can separate history dependence due to serially correlated utilities from models of habit formation and experimentation, where past consumption directly affects the agent's utility process, and characterize when phenomena such as consumption persistence can be explained through the former channel alone.

Our analysis has implications for the dynamic discrete choice literature. By utilizing menu variation, we provide identification results that are complementary to those in the DDC literature. Moreover, unlike some of the commonly used DDC models, our evolving utility representation specifies payoff shocks in a manner that implies a positive option value.

We also provide several methodological contributions that we believe will prove useful for future work on stochastic choice: a solution to the limited observability problem that arises from the fact that in dynamic settings past choices typically restrict future opportunity sets; a way to infer, and impose additional structure on, the agent's preference conditional on any particular realization of her private information; and an extension to infinite outcome spaces of Gul and Pesendorfer's (2006) and Ahn and Sarver's (2013) characterization of static random expected utility. Finally, some techniques developed in this paper naturally carry over from the multi-period to the multi-agent setting, and in ongoing work we exploit this to study strategic interactions under correlated private information.

Appendix: Main Proofs

The appendix is structured as follows:

• Section A defines equivalent versions of DREU, evolving utility, and gradual learning, as well as other important terminology that is used throughout the appendix.
• Sections B–D prove Theorems 1–3.
• Section E collects together several lemmas that are used throughout Sections B–D.

The supplementary appendix contains the following additional material:

• Section F proves Theorem 0 (the static REU representation result for arbitrary separable metric spaces of outcomes), which is used in the proof of Theorem 1.
• Section G proves Proposition 5 from Section A.
• Sections H, I, and J collect together proofs for Sections 5.1, 5.2, and 6, respectively.
• Finally, Section K contains formal definitions and representation theorems for the consumption dependent representations from Section 7.

A Equivalent Representations

Instead of working with probabilities over the grand state space Ω, our proofs of Theorems 1–3 will employ equivalent versions of our representations, called S-based representations, that look at one-step-ahead conditionals. Section A.1 defines S-based representations. Section A.2 establishes the equivalence between DREU, evolving utility, and gradual learning representations and their respective S-based analogs. Section A.3 introduces important terminology regarding the relationship between states and histories that will be used throughout the proofs of Theorems 1–3.

A.1 S-based Representations

For any $X \in \{X_0, \ldots, X_T\}$, $A \in \mathcal{K}(\Delta(X))$, $p \in \Delta(X)$, let $N(A, p) := \{U \in \mathbb{R}^X : p \in M(A, U)\}$ and $N^+(A, p) := \{U \in \mathbb{R}^X : \{p\} = M(A, U)\}$.

Definition 11. A random expected utility (REU) form on $X \in \{X_0, \ldots, X_T\}$ is a tuple $(S, \mu, \{U_s, \tau_s\}_{s \in S})$ where

(i). $S$ is a finite state space and $\mu$ is a probability measure on $S$;
(ii). for each $s \in S$, $U_s \in \mathbb{R}^X$ is a nonconstant utility over $X$;
(iii). for each $s \in S$, the tie-breaking rule $\tau_s$ is a finitely-additive probability measure on the Borel σ-algebra on $\mathbb{R}^X$ and is proper, i.e., $\tau_s(N^+(A, p)) = \tau_s(N(A, p))$ for all $A$, $p$.

Given any REU form $(S, \mu, \{U_s, \tau_s\}_{s \in S})$ on $X_i$ and any $s \in S$, $A_i \in \mathcal{A}_i$, and $p_i \in \Delta(X_i)$, define $\tau_s(p_i, A_i) := \tau_s(\{w \in \mathbb{R}^{X_i} : p_i \in M(M(A_i, U_s), w)\})$.

Definition 12. An S-based DREU representation of $\rho$ consists of tuples $(S_0, \mu_0, \{U_{s_0}, \tau_{s_0}\}_{s_0 \in S_0})$, $(S_t, \{\mu_t^{s_{t-1}}\}_{s_{t-1} \in S_{t-1}}, \{U_{s_t}, \tau_{s_t}\}_{s_t \in S_t})_{1 \le t \le T}$ such that for all $t = 0, \ldots, T$, we have:

DREU1: For all $s_{t-1} \in S_{t-1}$, $(S_t, \mu_t^{s_{t-1}}, \{U_{s_t}, \tau_{s_t}\}_{s_t \in S_t})$ is an REU form on $X_t$ such that50

(a) $U_{s_t} \not\approx U_{s_t'}$ for any distinct pair $s_t, s_t' \in \operatorname{supp}(\mu_t^{s_{t-1}})$;
(b) $\operatorname{supp}(\mu_t^{s_{t-1}}) \cap \operatorname{supp}(\mu_t^{s_{t-1}'}) = \emptyset$ for any distinct pair $s_{t-1}, s_{t-1}'$;
(c) $\bigcup_{s_{t-1} \in S_{t-1}} \operatorname{supp} \mu_t^{s_{t-1}} = S_t$.

DREU2: For all $p_t$, $A_t$, and $h^{t-1} = (A_0, p_0, A_1, p_1, \ldots, A_{t-1}, p_{t-1}) \in \mathcal{H}_{t-1}(A_t)$,51

$$\rho_t(p_t, A_t \mid h^{t-1}) = \frac{\sum_{(s_0, \ldots, s_t) \in S_0 \times \ldots \times S_t} \prod_{k=0}^{t} \mu_k^{s_{k-1}}(s_k)\, \tau_{s_k}(p_k, A_k)}{\sum_{(s_0, \ldots, s_{t-1}) \in S_0 \times \ldots \times S_{t-1}} \prod_{k=0}^{t-1} \mu_k^{s_{k-1}}(s_k)\, \tau_{s_k}(p_k, A_k)}.$$

An S-based evolving utility representation of $\rho$ is an S-based DREU representation such that for all $t = 0, \ldots, T$, we additionally have:

EVU: For all $s_t \in S_t$, there exists $u_{s_t} \in \mathbb{R}^Z$ such that for all $z_t \in Z$, $A_{t+1} \in \mathcal{A}_{t+1}$, we have

$$U_{s_t}(z_t, A_{t+1}) = u_{s_t}(z_t) + V_{s_t}(A_{t+1}),$$

where $V_{s_t}(A_{t+1}) := \sum_{s_{t+1}} \mu_{t+1}^{s_t}(s_{t+1}) \max_{p_{t+1} \in A_{t+1}} U_{s_{t+1}}(p_{t+1})$ for $t \le T-1$ and $V_{s_T} \equiv 0$.

An S-based gradual learning representation is an S-based evolving-utility representation such that additionally:

GL: There exists $\delta > 0$ such that for all $t = 0, \ldots, T-1$ and $s_t \in S_t$, we have

$$u_{s_t} = \frac{1}{\delta} \sum_{s_{t+1}} \mu_{t+1}^{s_t}(s_{t+1})\, u_{s_{t+1}}.$$

50 For $t = 0$, we abuse notation by letting $\mu_t^{s_{t-1}}$ denote $\mu_0$ for all $s_{t-1}$.

51 For $t = 0$, we again abuse notation by letting $\rho_t(\cdot \mid h^{t-1})$ denote $\rho_0(\cdot)$ for all $h^{t-1}$.
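To make the ratio in DREU2 concrete, the following is a minimal computational sketch, not the authors' construction. It assumes a hypothetical encoding in which `mu[k]` maps pairs `(s_prev, s_k)` to transition probabilities (with `s_prev = None` at `k = 0`, mirroring the convention of footnote 50) and `tau(k, s, p, A)` returns the tie-breaking probability; the function name `dreu2` and the toy example are illustrative.

```python
from fractions import Fraction as F
from itertools import product


def dreu2(states, mu, tau, history, current):
    """Evaluate the DREU2 ratio for rho_t(p_t, A_t | h^{t-1}).

    states  : list [S_0, ..., S_t] of lists of state labels
    mu      : mu[k][(s_prev, s_k)] -> probability (s_prev is None when k == 0)
    tau     : callable (k, s_k, p_k, A_k) -> tie-breaking probability
    history : list [(A_0, p_0), ..., (A_{t-1}, p_{t-1})]
    current : pair (A_t, p_t)
    """
    t = len(history)
    choices = history + [current]

    def weight(path, upto):
        # Product over k < upto of mu_k^{s_{k-1}}(s_k) * tau_{s_k}(p_k, A_k).
        w, prev = 1, None
        for k in range(upto):
            A, p = choices[k]
            w *= mu[k].get((prev, path[k]), 0) * tau(k, path[k], p, A)
            prev = path[k]
        return w

    num = sum(weight(path, t + 1) for path in product(*states[: t + 1]))
    den = sum(weight(path, t) for path in product(*states[:t]))
    if den == 0:
        raise ValueError("history has probability zero under the representation")
    return num / den


# Toy two-period example without ties: each state has a unique favorite option.
states = [["a", "b"], ["a1", "a2", "b1"]]
mu = [
    {(None, "a"): F(1, 2), (None, "b"): F(1, 2)},
    {("a", "a1"): F(1, 3), ("a", "a2"): F(2, 3), ("b", "b1"): F(1)},
]
favorite = {"a": "p", "b": "q", "a1": "x", "a2": "y", "b1": "x"}


def tau(k, s, p, A):
    return 1 if favorite[s] == p else 0


prob_x_after_p = dreu2(states, mu, tau, [("A0", "p")], ("A1", "x"))
prob_x_after_q = dreu2(states, mu, tau, [("A0", "q")], ("A1", "x"))
```

In this toy example, choosing `p` in period 0 reveals state `a`, after which `x` is chosen with probability 1/3, whereas choosing `q` reveals state `b`, after which `x` is chosen with probability 1. This is the history dependence that the representation is designed to capture. For `t = 0` the denominator is an empty product equal to 1, so the function returns the static probability.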


A.2 Equivalence Result

Proposition 5. Let ρ be a dynamic stochastic choice rule.

(i). ρ admits a DREU representation if and only if ρ admits an S-based DREU representation.
(ii). ρ admits an evolving utility representation if and only if ρ admits an S-based evolving utility representation.
(iii). ρ admits a gradual learning representation if and only if ρ admits an S-based gradual learning representation.

Proof. See Supplementary Appendix G.

A.3 Relationship between Histories and States

Throughout the proofs of Theorems 1–3 we will make use of the following terminology concerning the relationship between histories and states. Fix any $t \in \{0, \ldots, T\}$. Suppose that $(S_{t'}, \{\mu_{t'}^{s_{t'-1}}\}_{s_{t'-1} \in S_{t'-1}}, \{U_{s_{t'}}, \tau_{s_{t'}}\}_{s_{t'} \in S_{t'}})$ satisfy DREU1 and DREU2 from Definition 12 for each $t' \le t$.

Fix any state $s_t^* \in S_t$. We let $\operatorname{pred}(s_t^*)$ denote the unique predecessor sequence $(s_0^*, \ldots, s_{t-1}^*) \in S_0 \times \ldots \times S_{t-1}$, given by assumptions DREU1 (b) and (c), such that $s_{k+1}^* \in \operatorname{supp}(\mu_{k+1}^{s_k^*})$ for each $k = 0, \ldots, t-1$. Given any history $h^t = (A_0, p_0, \ldots, A_t, p_t)$, we say that $s_t^*$ is consistent with $h^t$ if $\prod_{k=0}^{t} \tau_{s_k^*}(p_k, A_k) > 0$.

For any $k = 0, \ldots, t$, $s_k \in S_k$, $p_0 \in A_0 \in \mathcal{A}_0$, and $p_{k+1} \in A_{k+1} \in \mathcal{A}_{k+1}$, let

$$\mathcal{U}_{s_k}(A_{k+1}, p_{k+1}) := \{U_{s_{k+1}} : s_{k+1} \in \operatorname{supp} \mu_{k+1}^{s_k} \text{ and } p_{k+1} \in M(A_{k+1}, U_{s_{k+1}})\};$$

$$\mathcal{U}_0(A_0, p_0) := \{U_{s_0} : s_0 \in S_0 \text{ and } p_0 \in M(A_0, U_{s_0})\}.$$

A separating history for $s_t^*$ is a history $h^t = (B_0, q_0, \ldots, B_t, q_t)$ such that $\mathcal{U}_{s_{k-1}^*}(B_k, q_k) = \{U_{s_k^*}\}$ for all $k = 0, \ldots, t$ and $h^t \in \mathcal{H}_t^*$, where we abuse notation by letting $\mathcal{U}_{s_{-1}^*}(B_0, q_0)$ denote $\mathcal{U}_0(B_0, q_0)$. Note that separating histories are required to be histories without ties. We record the following properties:

Lemma 1. Fix any $s_t^* \in S_t$ with $\operatorname{pred}(s_t^*) = (s_0^*, \ldots, s_{t-1}^*)$. Suppose $h^t = (B_0, q_0, \ldots, B_t, q_t)$ satisfies $\mathcal{U}_{s_{k-1}^*}(B_k, q_k) = \{U_{s_k^*}\}$ for all $k = 0, \ldots, t$. Then for all $k = 0, \ldots, t$, $s_k^*$ is the only state in $S_k$ that is consistent with $h^k$.

Proof. Fix any $\ell = 0, \ldots, t$. First, consider $s_\ell' \in S_\ell \setminus \{s_\ell^*\}$, with $\operatorname{pred}(s_\ell') = (s_0', \ldots, s_{\ell-1}')$. Let $k \le \ell$ be smallest such that $s_k' \ne s_k^*$. Then $s_k' \in \operatorname{supp} \mu_k^{s_{k-1}^*}$, so $\mathcal{U}_{s_{k-1}^*}(B_k, q_k) = \{U_{s_k^*}\}$ implies that $q_k \notin M(B_k, U_{s_k'})$. Thus, $\tau_{s_k'}(q_k, B_k) = 0$, whence $s_\ell'$ is not consistent with $h^\ell$.

Next, to show that $s_\ell^*$ is consistent with $h^\ell$, note that $\rho_\ell(q_\ell, B_\ell \mid h^{\ell-1}) > 0$, so DREU2 implies

$$\sum_{(s_0, \ldots, s_\ell) \in S_0 \times \ldots \times S_\ell} \prod_{k=0}^{\ell} \mu_k^{s_{k-1}}(s_k)\, \tau_{s_k}(q_k, B_k) > 0. \tag{11}$$

Now, if $(s_0, \ldots, s_{\ell-1}) \ne \operatorname{pred}(s_\ell)$, then $\prod_{k=0}^{\ell} \mu_k^{s_{k-1}}(s_k) = 0$. And if $(s_0, \ldots, s_{\ell-1}) = \operatorname{pred}(s_\ell)$ but $s_\ell \ne s_\ell^*$, then the first paragraph shows $\prod_{k=0}^{\ell} \tau_{s_k}(q_k, B_k) = 0$. Hence, (11) reduces to $\prod_{k=0}^{\ell} \mu_k^{s_{k-1}^*}(s_k^*)\, \tau_{s_k^*}(q_k, B_k) > 0$, whence $s_\ell^*$ is consistent with $h^\ell$.
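The notions of predecessor sequence and consistency used in Lemma 1 can be illustrated with a short sketch. The encoding is hypothetical: `mu[k]` maps pairs `(s_prev, s_k)` to probabilities, `tau(k, s, p, A)` is a tie-breaking probability, and the toy tree and the names `pred`, `is_consistent`, and `consistent_states` are illustrative rather than taken from the paper.

```python
from fractions import Fraction as F

# Hypothetical two-period state tree satisfying DREU1 (b) and (c).
mu = [
    {(None, "a"): F(1, 2), (None, "b"): F(1, 2)},
    {("a", "a1"): F(1, 3), ("a", "a2"): F(2, 3), ("b", "b1"): F(1)},
]
favorite = {"a": "p", "b": "q", "a1": "x", "a2": "y", "b1": "x"}


def tau(k, s, p, A):
    # No ties in this toy example: each state chooses its favorite for sure.
    return 1 if favorite[s] == p else 0


def pred(t, s_t, mu):
    """Unique predecessor sequence (s_0, ..., s_{t-1}) of a period-t state."""
    seq, s = [], s_t
    for k in range(t, 0, -1):
        parents = [sp for (sp, sk), prob in mu[k].items() if sk == s and prob > 0]
        if len(parents) != 1:
            raise ValueError("DREU1 (b)/(c) fails: predecessor is not unique")
        s = parents[0]
        seq.append(s)
    return tuple(reversed(seq))


def is_consistent(t, s_t, mu, tau, history):
    """s_t is consistent with h^t if every tau along pred(s_t), s_t is positive."""
    path = pred(t, s_t, mu) + (s_t,)
    return all(tau(k, path[k], p, A) > 0 for k, (A, p) in enumerate(history))


def consistent_states(t, states_t, mu, tau, history):
    return [s for s in states_t if is_consistent(t, s, mu, tau, history)]


h1 = [("A0", "p"), ("A1", "x")]
survivors = consistent_states(1, ["a1", "a2", "b1"], mu, tau, h1)
```

Here the history `h1` plays the role of a separating history for `a1`: state `a2` is ruled out by the period-1 choice and state `b1` by the period-0 choice of its predecessor, so `a1` is the only consistent state, in line with Lemma 1.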

Lemma 2. Every $s_t^* \in S_t$ admits a separating history.

Proof. Fix any $s_t^* \in S_t$ with $\operatorname{pred}(s_t^*) = (s_0^*, \ldots, s_{t-1}^*)$. By Lemma 13 and DREU1 (a), there exist menus $B_0 = \{q_0(s_0) : s_0 \in S_0\} \in \mathcal{A}_0$ and $B_k(s_{k-1}) = \{q_k(s_k) : s_k \in \operatorname{supp} \mu_k^{s_{k-1}}\} \in \mathcal{A}_k$ for each $k = 1, \ldots, t$ and $s_{k-1} \in S_{k-1}$ such that $\mathcal{U}_0(B_0, q_0(s_0)) = \{U_{s_0}\}$ for all $s_0 \in S_0$ and $\mathcal{U}_{s_{k-1}}(B_k(s_{k-1}), q_k(s_k)) = \{U_{s_k}\}$ for all $s_k \in \operatorname{supp} \mu_k^{s_{k-1}}$. Moreover, we can assume that $B_{k+1}(s_k) \in \operatorname{supp} q_k(s_k)^A$ for all $k = 0, \ldots, t-1$ and $s_k \in S_k$, by letting each $q_k(s_k)$ put small enough weight on $(z, B_{k+1}(s_k))$ for some $z \in Z$. Then $h^t := (B_0, q_0(s_0^*), \ldots, B_t(s_{t-1}^*), q_t(s_t^*)) \in \mathcal{H}_t$. Moreover, since $\mathcal{U}_{s_{k-1}^*}(B_k(s_{k-1}^*), q_k(s_k^*)) = \{U_{s_k^*}\}$, Lemma 1 implies that for all $k = 0, \ldots, t$, $s_k^*$ is the only state consistent with $h^k$. Additionally, for all $k = 0, \ldots, t$ and $s_k \in \operatorname{supp} \mu_k^{s_{k-1}^*}$, we have $M(B_k(s_{k-1}^*), U_{s_k}) = \{q_k(s_k)\}$ by construction. Hence, by Lemma 14, we have $B_k(s_{k-1}^*) \in \mathcal{A}_k^*(h^{k-1})$. Thus $h^t \in \mathcal{H}_t^*$, so $h^t$ is a separating history for $s_t^*$.

B Proof of Theorem 1

B.1 Proof of Theorem 1: Sufficiency

Suppose $\rho$ satisfies Axioms 1–4. To show that $\rho$ admits a DREU representation, it suffices, by Proposition 5, to construct an S-based DREU representation for $\rho$. We proceed by induction on $t \le T$.

First consider $t = 0$. Since $\rho_0$ satisfies Axiom 3 and $X_0$ is a separable metric space by Lemma 12, the existence of $(S_0, \mu_0, \{U_{s_0}, \tau_{s_0}\}_{s_0 \in S_0})$ satisfying DREU1 and DREU2 from Definition 12 is immediate from Theorem 0, which extends Gul and Pesendorfer's (2006) and Ahn and Sarver's (2013) characterization result for static S-based REU representations to separable metric spaces and which we prove in Supplementary Appendix F.

Suppose next that $0 \le t < T$ and that we have constructed $(S_{t'}, \{\mu_{t'}^{s_{t'-1}}\}_{s_{t'-1} \in S_{t'-1}}, \{U_{s_{t'}}, \tau_{s_{t'}}\}_{s_{t'} \in S_{t'}})$ satisfying DREU1 and DREU2 for each $t' \le t$. We now construct $(S_{t+1}, \{\mu_{t+1}^{s_t}\}_{s_t \in S_t}, \{U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}})$ satisfying DREU1 and DREU2.

B.1.1 Defining $\rho_{t+1}^{s_t}$ and $(S_{t+1}, \{\mu_{t+1}^{s_t}\}_{s_t \in S_t}, \{U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}})$:

To this end, we first pick an arbitrary separating history $h^t(s_t)$ for each $s_t \in S_t$ (this exists by Lemma 2) and define

$$\rho_{t+1}^{s_t}(\cdot, A_{t+1}) := \rho_{t+1}(\cdot, A_{t+1} \mid h^t(s_t))$$

for all $A_{t+1} \in \mathcal{A}_{t+1}$. Note that here $\rho_{t+1}(\cdot \mid h^t(s_t))$ is the extended version of $\rho_{t+1}(\cdot \mid h^t(s_t))$ given in Definition 3; by Axiom 2 and Lemma 15, the specific choice of $\lambda \in (0, 1]$ and $d^{t-1} \in D^{t-1}$ used in the extension procedure does not matter.

By Axiom 3 and the fact that $X_{t+1}$ is separable metric (Lemma 12), Theorem 0 applied to $\rho_{t+1}^{s_t}$ yields an REU form $(S_{t+1}^{s_t}, \mu_{t+1}^{s_t}, \{U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}^{s_t}})$ on $X_{t+1}$ such that $U_{s_{t+1}} \not\approx U_{s_{t+1}'}$ for any distinct pair $s_{t+1}, s_{t+1}' \in S_{t+1}^{s_t}$ and such that

$$\rho_{t+1}^{s_t}(p_{t+1}, A_{t+1}) = \sum_{s_{t+1} \in S_{t+1}^{s_t}} \mu_{t+1}^{s_t}(s_{t+1})\, \tau_{s_{t+1}}(p_{t+1}, A_{t+1})$$

for all $p_{t+1}$ and $A_{t+1}$. Without loss, we can assume that $S_{t+1}^{s_t}$ and $S_{t+1}^{s_t'}$ are disjoint whenever $s_t \ne s_t'$. Set $S_{t+1} := \bigcup_{s_t \in S_t} S_{t+1}^{s_t}$ and extend $\mu_{t+1}^{s_t}$ to a probability measure on $S_{t+1}$ by setting $\mu_{t+1}^{s_t}(s_{t+1}) = 0$ for all $s_{t+1} \in S_{t+1} \setminus S_{t+1}^{s_t}$.

By construction, it is immediate that $(S_{t+1}, \{\mu_{t+1}^{s_t}\}_{s_t \in S_t}, \{U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}})$ thus defined satisfies DREU1 and that

$$\rho_{t+1}^{s_t}(p_{t+1}, A_{t+1}) = \sum_{s_{t+1} \in S_{t+1}} \mu_{t+1}^{s_t}(s_{t+1})\, \tau_{s_{t+1}}(p_{t+1}, A_{t+1}) \tag{12}$$

for all $p_{t+1}$ and $A_{t+1}$. It remains to show that DREU2 is also satisfied.

B.1.2 $\rho_{t+1}^{s_t}$ is well-behaved:

To this end, Lemma 3 below first shows that the definition of $\rho_{t+1}^{s_t}$ is well-behaved, in the sense that for any history $h^t$ that can only arise in state $s_t$, $\rho_{t+1}^{s_t} = \rho_{t+1}(\cdot \mid h^t)$.

Lemma 3. Fix any $s_t^* \in S_t$ with $\operatorname{pred}(s_t^*) = (s_0^*, \ldots, s_{t-1}^*)$. Suppose $h^t = (A_0, p_0, \ldots, A_t, p_t) \in \mathcal{H}_t$ satisfies $\mathcal{U}_{s_{k-1}^*}(A_k, p_k) = \{U_{s_k^*}\}$ for all $k = 0, 1, \ldots, t$. Then for any $A_{t+1} \in \mathcal{A}_{t+1}$, $\rho_{t+1}(\cdot, A_{t+1} \mid h^t) = \rho_{t+1}^{s_t^*}(\cdot, A_{t+1})$.

Proof. Step 1: Let $\tilde{h}^t = (\tilde{A}_0, \tilde{p}_0, \ldots, \tilde{A}_t, \tilde{p}_t)$ denote the separating history for $s_t^*$ used to define $\rho_{t+1}^{s_t^*}$. We first prove the Lemma under the assumption that $h^t \in \mathcal{H}_t^*$, i.e., that $h^t$ is itself a separating history for $s_t^*$.52 Pick $(r_0, \ldots, r_t) \in \Delta(X_0) \times \ldots \times \Delta(X_t)$ such that $A_{t+1} \in \operatorname{supp} r_t^A$ and for all $k = 0, \ldots, t-1$,

$$\operatorname{supp}(r_k^A) \supseteq \{B_{k+1}, \tilde{B}_{k+1}, B_{k+1} \cup \tilde{B}_{k+1}\},$$

where $B_\ell := \frac{1}{3} A_\ell + \frac{1}{3}\{\tilde{p}_\ell\} + \frac{1}{3}\{r_\ell\}$ and $\tilde{B}_\ell := \frac{1}{3} \tilde{A}_\ell + \frac{1}{3}\{p_\ell\} + \frac{1}{3}\{r_\ell\}$ for $\ell = 0, \ldots, t$. Define $q_\ell := \frac{1}{3} p_\ell + \frac{1}{3} \tilde{p}_\ell + \frac{1}{3} r_\ell$.

Note that since $h^t, \tilde{h}^t \in \mathcal{H}_t^*$ and $\mathcal{U}_{s_{k-1}^*}(A_k, p_k) = \mathcal{U}_{s_{k-1}^*}(\tilde{A}_k, \tilde{p}_k) = \{U_{s_k^*}\}$, Lemma 14 implies that $M(A_k, U_{s_k^*}) = \{p_k\}$ and $M(\tilde{A}_k, U_{s_k^*}) = \{\tilde{p}_k\}$ for all $k = 0, 1, \ldots, t$. By linearity of the $U_s$, we then also have

$$\mathcal{U}_{s_{k-1}^*}(B_k, q_k) = \mathcal{U}_{s_{k-1}^*}(\tilde{B}_k, q_k) = \mathcal{U}_{s_{k-1}^*}(B_k \cup \tilde{B}_k, q_k) = \{U_{s_k^*}\}$$

and

$$M(B_k, U_{s_k^*}) = M(\tilde{B}_k, U_{s_k^*}) = M(B_k \cup \tilde{B}_k, U_{s_k^*}) = \{q_k\}.$$

This implies that for all $k = 0, \ldots, t$ and $s_k \in \operatorname{supp} \mu_k^{s_{k-1}^*}$,

$$\tau_{s_k}(q_k, B_k) = \tau_{s_k}(q_k, \tilde{B}_k) = \tau_{s_k}(q_k, B_k \cup \tilde{B}_k) = \begin{cases} 1 & \text{if } s_k = s_k^* \\ 0 & \text{otherwise.} \end{cases}$$

By DREU2 of the inductive hypothesis, it follows that for all $k = 0, \ldots, t-1$,

$$\begin{aligned}
\mu_t^{s_{t-1}^*}(s_t^*) &= \rho_t(q_t, B_t \mid B_0, q_0, \ldots, B_{t-1}, q_{t-1}) = \rho_t(q_t, \tilde{B}_t \mid \tilde{B}_0, q_0, \ldots, \tilde{B}_{t-1}, q_{t-1}) \\
&= \rho_t(q_t, B_t \cup \tilde{B}_t \mid B_0, q_0, \ldots, B_{k-1}, q_{k-1}, B_k \cup \tilde{B}_k, q_k, \ldots, B_{t-1} \cup \tilde{B}_{t-1}, q_{t-1}) \\
&= \rho_t(q_t, B_t \cup \tilde{B}_t \mid \tilde{B}_0, q_0, \ldots, \tilde{B}_{k-1}, q_{k-1}, B_k \cup \tilde{B}_k, q_k, \ldots, B_{t-1} \cup \tilde{B}_{t-1}, q_{t-1}),
\end{aligned}$$

whence repeated application of Axiom 1 (Contraction History Independence) yields

$$\rho_{t+1}(\cdot, A_{t+1} \mid B_0, q_0, \ldots, B_t, q_t) = \rho_{t+1}(\cdot, A_{t+1} \mid B_0 \cup \tilde{B}_0, q_0, \ldots, B_t \cup \tilde{B}_t, q_t) = \rho_{t+1}(\cdot, A_{t+1} \mid \tilde{B}_0, q_0, \ldots, \tilde{B}_t, q_t). \tag{13}$$

Moreover, by Axiom 2 (Linear History Independence) and Lemma 15, we have

$$\rho_{t+1}(\cdot, A_{t+1} \mid h^t) = \rho_{t+1}(\cdot, A_{t+1} \mid B_0, q_0, \ldots, B_t, q_t) \quad \text{and} \quad \rho_{t+1}(\cdot, A_{t+1} \mid \tilde{h}^t) = \rho_{t+1}(\cdot, A_{t+1} \mid \tilde{B}_0, q_0, \ldots, \tilde{B}_t, q_t). \tag{14}$$

Combining (13) and (14) we obtain that $\rho_{t+1}(\cdot, A_{t+1} \mid h^t) = \rho_{t+1}(\cdot, A_{t+1} \mid \tilde{h}^t) =: \rho_{t+1}^{s_t^*}(\cdot, A_{t+1})$. This proves the Lemma for histories $h^t \in \mathcal{H}_t^*$.

Step 2: Now suppose that $h^t \notin \mathcal{H}_t^*$. Take any sequence of histories $h^{t,n} \to^m h^t$ with $h^{t,n} = (A_0^n, p_0^n, \ldots, A_t^n, p_t^n) \in \mathcal{H}_t^*$ for each $n$. Note that such a sequence exists by Axiom 4 (History Continuity). We claim that for all large enough $n$, $\mathcal{U}_{s_{k-1}^*}(A_k^n, p_k^n) = \{U_{s_k^*}\}$ for all $k = 0, \ldots, t$. Suppose for a contradiction that we can find a subsequence $(h^{t,n_\ell})_{\ell=1}^{\infty}$ for which this claim is violated. Note that for all $\ell$, $\rho_k(p_k^{n_\ell}, A_k^{n_\ell} \mid h^{k-1,n_\ell}) > 0$ for all $k = 0, \ldots, t$ (by the fact that $h^{t,n_\ell}$ is a well-defined history). Hence, DREU2 for $k \le t$ implies that we can find $s_{t,n_\ell}' \in S_t$ with $\operatorname{pred}(s_{t,n_\ell}') = (s_{0,n_\ell}', \ldots, s_{t-1,n_\ell}')$ and $(s_{0,n_\ell}', \ldots, s_{t,n_\ell}') \ne (s_0^*, \ldots, s_t^*)$ such that $U_{s_{k,n_\ell}'} \in \mathcal{U}_{s_{k-1,n_\ell}'}(A_k^{n_\ell}, p_k^{n_\ell})$ for all $k = 0, \ldots, t$. Moreover, since $S_0 \times \ldots \times S_t$ is finite, by choosing the subsequence $(h^{t,n_\ell})$ appropriately, we can assume that $(s_{0,n_\ell}', \ldots, s_{t,n_\ell}') = (s_0', \ldots, s_t') \ne (s_0^*, \ldots, s_t^*)$ for all $\ell$. Pick the smallest $k$ such that $s_k' \ne s_k^*$ and pick any $q_k \in A_k$. Since $A_k^{n_\ell} \to^m A_k$ we can find $q_k^{n_\ell} \in A_k^{n_\ell}$ with $q_k^{n_\ell} \to^m q_k$. For all $\ell$ we have $U_{s_k'} \in \mathcal{U}_{s_{k-1}'}(A_k^{n_\ell}, p_k^{n_\ell})$, so $U_{s_k'}(p_k^{n_\ell}) \ge U_{s_k'}(q_k^{n_\ell})$, whence $U_{s_k'}(p_k) \ge U_{s_k'}(q_k)$ by linearity of $U_{s_k'}$. Moreover, by choice of $k$, $s_k' \in \operatorname{supp} \mu_k^{s_{k-1}'} = \operatorname{supp} \mu_k^{s_{k-1}^*}$. Thus, $U_{s_k'} \in \mathcal{U}_{s_{k-1}^*}(A_k, p_k) = \{U_{s_k^*}\}$. But $s_k' \ne s_k^*$, so by DREU1 (a) of the inductive hypothesis $U_{s_k'} \not\approx U_{s_k^*}$, a contradiction.

By the previous paragraph, for large enough $n$, $h^{t,n}$ satisfies the assumption of the Lemma. Since $h^{t,n} \in \mathcal{H}_t^*$, Step 1 then shows that $\rho_{t+1}(p_{t+1}, A_{t+1} \mid h^{t,n}) = \rho_{t+1}^{s_t^*}(p_{t+1}, A_{t+1})$ for all large enough $n$ and all $p_{t+1}$. By Axiom 4 (History Continuity), this implies that for all $p_{t+1}$

$$\rho_{t+1}(p_{t+1}, A_{t+1} \mid h^t) \in \operatorname{co}\{\lim_n \rho_{t+1}(p_{t+1}, A_{t+1} \mid h^{t,n}) : h^{t,n} \to^m h^t,\ h^{t,n} \in \mathcal{H}_t^*\} = \{\rho_{t+1}^{s_t^*}(p_{t+1}, A_{t+1})\},$$

which completes the proof.

52 Note that $\mathcal{U}_{s_{k-1}^*}(A_k, p_k) = \{U_{s_k^*}\}$ for all $k = 0, 1, \ldots, t$ does not by itself imply that $h^t$ is a history without ties.

B.1.3



t ρt+1 (·|ht ) is a weighted average of ρst+1 :

The next lemma shows that ρt+1 (·|ht ) can be expressed as a weighted average of the state-dependent t t choice distributions ρst+1 , where the weight on each ρst+1 corresponds to the probability of st conditional

on history ht .

48

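Lemma 4. For any $p_{t+1} \in A_{t+1}$ and $h^t = (A_0, p_0, \ldots, A_t, p_t) \in \mathcal{H}_t(A_{t+1})$, we have

$$\rho_{t+1}(p_{t+1}, A_{t+1}|h^t) = \frac{\sum_{(s_0,\ldots,s_t) \in S_0 \times \cdots \times S_t} \prod_{k=0}^{t} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(A_k, p_k)\,\rho^{s_t}_{t+1}(p_{t+1}, A_{t+1})}{\sum_{(s_0,\ldots,s_t) \in S_0 \times \cdots \times S_t} \prod_{k=0}^{t} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(A_k, p_k)}.$$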

Proof. Let $\{s^1_t, \ldots, s^m_t\}$ denote the set of states in $S_t$ that are consistent with history $h^t$ (as defined in Section A.3). For each $j$, let $\hat h^t(j) = (B^j_0, q^j_0, \ldots, B^j_t, q^j_t)$ be a separating history for state $s^j_t$. We can assume that for each $k = 1, \ldots, t$, $q^j_{k-1}$ puts small weight on $(z, \tfrac12 A_k + \tfrac12 B^j_k)$ for some $z$, so that $h^t(j) := \tfrac12 h^t + \tfrac12 \hat h^t(j) \in \mathcal{H}_t(A_{t+1})$ for all $j$.

Note first that for all $j = 1, \ldots, m$, we have

$$\rho(h^t(j)) = \prod_{k=0}^{t} \mu^{s^j_{k-1}}_k(s^j_k)\,\tau_{s^j_k}(p_k, A_k). \tag{15}$$

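Indeed, observe that

$$\begin{aligned}
\rho(h^t(j)) &= \prod_{k=0}^{t} \rho_k\big(\tfrac12 p_k + \tfrac12 q^j_k,\ \tfrac12 A_k + \tfrac12 B^j_k \,\big|\, \tfrac12 h^{k-1} + \tfrac12 \hat h^{k-1}(j)\big) \\
&= \sum_{(s_0,\ldots,s_t)} \prod_{k=0}^{t} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}\big(\tfrac12 p_k + \tfrac12 q^j_k,\ \tfrac12 A_k + \tfrac12 B^j_k\big) \\
&= \prod_{k=0}^{t} \mu^{s^j_{k-1}}_k(s^j_k)\,\tau_{s^j_k}\big(\tfrac12 p_k + \tfrac12 q^j_k,\ \tfrac12 A_k + \tfrac12 B^j_k\big) \\
&= \prod_{k=0}^{t} \mu^{s^j_{k-1}}_k(s^j_k)\,\tau_{s^j_k}(p_k, A_k).
\end{aligned}$$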

The first equality holds by definition. The second equality follows from DREU2 of the inductive hypothesis. For the final two equalities, note that since $\hat h^t(j)$ is a separating history for $s^j_t$, we have for all $k = 0, \ldots, t$ that $\mathcal{U}_{s^j_{k-1}}(B^j_k, q^j_k) = \{U_{s^j_k}\}$ with $\{q^j_k\} = M(B^j_k, U_{s^j_k})$ (by Lemma 14). Also, since $s^j_t$ is consistent with $h^t$, $\tau_{s^j_k}(p_k, A_k) > 0$ for all $k = 0, \ldots, t$. This implies that for every $s_k \in \operatorname{supp} \mu^{s^j_{k-1}}_k$, $\tau_{s_k}(\tfrac12 p_k + \tfrac12 q^j_k, \tfrac12 A_k + \tfrac12 B^j_k) > 0$ if and only if $s_k = s^j_k$, yielding the third equality. It also implies that $M(\tfrac12 A_k + \tfrac12 B^j_k, U_{s^j_k}) = M(\tfrac12 A_k + \tfrac12 \{q^j_k\}, U_{s^j_k})$, so that $\tau_{s^j_k}(\tfrac12 p_k + \tfrac12 q^j_k, \tfrac12 A_k + \tfrac12 B^j_k) = \tau_{s^j_k}(\tfrac12 p_k + \tfrac12 q^j_k, \tfrac12 A_k + \tfrac12 \{q^j_k\}) = \tau_{s^j_k}(p_k, A_k)$, yielding the fourth equality.

Now let $H^t := \{h^t(j) : j = 1, \ldots, m\} \subseteq \mathcal{H}_t(A_{t+1})$. Note that by repeated application of Axiom 2, we have that

$$\rho_{t+1}(p_{t+1}, A_{t+1}|h^t) = \rho_{t+1}(p_{t+1}, A_{t+1}|H^t). \tag{16}$$

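Moreover, we have that

$$\begin{aligned}
\rho_{t+1}(p_{t+1}, A_{t+1}|H^t) &= \frac{\sum_{j=1}^{m} \rho(h^t(j))\,\rho_{t+1}(p_{t+1}, A_{t+1}|h^t(j))}{\sum_{j=1}^{m} \rho(h^t(j))} \\
&= \frac{\sum_{j=1}^{m} \prod_{k=0}^{t} \mu^{s^j_{k-1}}_k(s^j_k)\,\tau_{s^j_k}(p_k, A_k)\,\rho_{t+1}(p_{t+1}, A_{t+1}|h^t(j))}{\sum_{j=1}^{m} \prod_{k=0}^{t} \mu^{s^j_{k-1}}_k(s^j_k)\,\tau_{s^j_k}(p_k, A_k)} \\
&= \frac{\sum_{j=1}^{m} \prod_{k=0}^{t} \mu^{s^j_{k-1}}_k(s^j_k)\,\tau_{s^j_k}(p_k, A_k)\,\rho^{s^j_t}_{t+1}(p_{t+1}, A_{t+1})}{\sum_{j=1}^{m} \prod_{k=0}^{t} \mu^{s^j_{k-1}}_k(s^j_k)\,\tau_{s^j_k}(p_k, A_k)} \\
&= \frac{\sum_{(s_0,\ldots,s_t) \in S_0 \times \cdots \times S_t} \prod_{k=0}^{t} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(A_k, p_k)\,\rho^{s_t}_{t+1}(p_{t+1}, A_{t+1})}{\sum_{(s_0,\ldots,s_t) \in S_0 \times \cdots \times S_t} \prod_{k=0}^{t} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(A_k, p_k)}.
\end{aligned} \tag{17}$$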

Indeed, the first equality holds by definition of choice conditional on a set of histories. The second equality follows from Equation (15). Note next that since $\hat h^t(j)$ is a separating history for $s^j_t$ and $s^j_t$ is consistent with $h^t$, we have that $\mathcal{U}_{s^j_{k-1}}(\tfrac12 A_k + \tfrac12 B^j_k, \tfrac12 p_k + \tfrac12 q^j_k) = \{U_{s^j_k}\}$ for each $k$. Hence, Lemma 3 implies that $\rho_{t+1}(p_{t+1}, A_{t+1}|h^t(j)) = \rho^{s^j_t}_{t+1}(p_{t+1}, A_{t+1})$, yielding the third equality. Finally, note that if $(s_0, \ldots, s_t) \in S_0 \times \cdots \times S_t$ with $(s_0, \ldots, s_t) \neq (s^j_0, \ldots, s^j_t)$ for all $j$, then either $s_t \notin \{s^1_t, \ldots, s^m_t\}$, or $s_t = s^j_t$ for some $j$ but $(s_0, \ldots, s_{t-1}) \neq \operatorname{pred}(s^j_t)$. In either case, $\prod_{k=0}^{t} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(A_k, p_k) = 0$, yielding the final equality. Combining (16) and (17), we obtain the desired conclusion.
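The ratio in Lemma 4 is a Bayesian update over state paths, which can be sketched numerically. The following Python fragment is only an illustration under assumed inputs: the function name, the dictionary encoding of $\mu$ and $\tau$, and all numbers are hypothetical and not part of the paper.

```python
from itertools import product

def history_conditional_choice(states, mu, tau, rho_next):
    """Evaluate the ratio in Lemma 4 for one fixed history h^t.

    states[k]        : list of period-k state labels, k = 0, ..., t
    mu[k][(prev, s)] : transition probability mu_k^{prev}(s); prev is None for k = 0
    tau[k][s]        : tau_s(p_k, A_k), probability that state s picks the observed p_k
    rho_next[s]      : rho_{t+1}^{s}(p_{t+1}, A_{t+1}) for each period-t state s
    """
    numerator = 0.0
    denominator = 0.0
    for path in product(*states):  # all (s_0, ..., s_t) in S_0 x ... x S_t
        weight = 1.0
        prev = None
        for k, s in enumerate(path):
            # Missing keys encode mu_k^{prev}(s) = 0, i.e. s is not a successor of prev.
            weight *= mu[k].get((prev, s), 0.0) * tau[k][s]
            prev = s
        numerator += weight * rho_next[path[-1]]
        denominator += weight
    if denominator == 0.0:
        raise ValueError("history has probability zero under this representation")
    return numerator / denominator


# Hypothetical two-period example (t = 1); all numbers are made up.
states = [["a", "b"], ["a1", "a2", "b1"]]
mu = [
    {(None, "a"): 0.5, (None, "b"): 0.5},
    {("a", "a1"): 0.5, ("a", "a2"): 0.5, ("b", "b1"): 1.0},
]
tau = [
    {"a": 1.0, "b": 0.0},               # the period-0 choice rules out state b
    {"a1": 1.0, "a2": 0.5, "b1": 1.0},
]
rho_next = {"a1": 0.9, "a2": 0.3, "b1": 0.0}

value = history_conditional_choice(states, mu, tau, rho_next)
print(round(value, 6))
```

In this example only the paths (a, a1) and (a, a2) carry positive weight, with conditional probabilities 2/3 and 1/3, so the result is the corresponding average of 0.9 and 0.3.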

B.1.4



Completing the proof:

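Finally, combining Lemma 4 with the representation of $\rho^{s_t}_{t+1}$ in (12) yields that for any $h^t = (A_0, p_0, \ldots, A_t, p_t) \in \mathcal{H}_t(A_{t+1})$

$$\begin{aligned}
\rho_{t+1}(p_{t+1}, A_{t+1}|h^t) &= \frac{\sum_{(s_0,\ldots,s_t) \in S_0 \times \cdots \times S_t} \prod_{k=0}^{t} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(A_k, p_k) \sum_{s_{t+1} \in S_{t+1}} \mu^{s_t}_{t+1}(s_{t+1})\,\tau_{s_{t+1}}(p_{t+1}, A_{t+1})}{\sum_{(s_0,\ldots,s_t) \in S_0 \times \cdots \times S_t} \prod_{k=0}^{t} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(A_k, p_k)} \\
&= \frac{\sum_{(s_0,\ldots,s_t,s_{t+1}) \in S_0 \times \cdots \times S_t \times S_{t+1}} \prod_{k=0}^{t+1} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(A_k, p_k)}{\sum_{(s_0,\ldots,s_t) \in S_0 \times \cdots \times S_t} \prod_{k=0}^{t} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(A_k, p_k)}.
\end{aligned}$$

Thus, $(S_{t+1}, \{\mu^{s_t}_{t+1}\}_{s_t \in S_t}, \{U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}})$ also satisfies requirement DREU2, completing the proof.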


B.2

Proof of Theorem 1: Necessity

Suppose $\rho$ admits a DREU representation. By Proposition 5, $\rho$ admits an S-based DREU representation. By Lemma 16, for each $t$ and $h^t \in \mathcal{H}_t$, the (static) stochastic choice rule $\rho_t(\cdot|h^t) : \mathcal{A}_t \to \Delta(\Delta(X_t))$ given by the extended version of $\rho$ from Definition 3 also satisfies DREU2. In other words, $\rho_t(\cdot|h^t)$ admits an S-based REU representation (see Definition 13). Thus, Theorem 4 implies that Axiom 3 holds. It remains to verify that Axioms 1, 2, and 4 are satisfied.

Claim 1. $\rho$ satisfies Axiom 1 (Contraction History Independence).

Proof. Take any $h^{t-1} = (h^{t-1}_{-k}, (A_k, p_k))$, $\hat h^{t-1} = (h^{t-1}_{-k}, (B_k, p_k)) \in \mathcal{H}_{t-1}(A_t)$ such that $B_k \supseteq A_k$ and $\rho_k(p_k; A_k|h^{k-1}) = \rho_k(p_k; B_k|h^{k-1})$. From DREU2 for $\rho_k$, $\rho_k(p_k; A_k|h^{k-1}) = \rho_k(p_k; B_k|h^{k-1})$ implies that

$$\sum_{(s_0,\ldots,s_k)} \prod_{l=0}^{k-1} \mu^{s_{l-1}}_l(s_l)\,\tau_{s_l}(p_l, A_l)\,\mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(p_k, A_k) = \sum_{(s_0,\ldots,s_k)} \prod_{l=0}^{k-1} \mu^{s_{l-1}}_l(s_l)\,\tau_{s_l}(p_l, A_l)\,\mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(p_k, B_k). \tag{18}$$

Since $B_k \supseteq A_k$ implies $\tau_{s_k}(p_k, A_k) \ge \tau_{s_k}(p_k, B_k)$ for all $s_k$, the only way for (18) to hold is if $\tau_{s_k}(p_k, A_k) = \tau_{s_k}(p_k, B_k)$ for all $s_k$ consistent with $h^k$. Thus,

$$\rho_t(p_t; A_t|h^{t-1}) = \frac{\sum_{(s_0,\ldots,s_t) \in S_0 \times \cdots \times S_t} \prod_{l=0}^{t} \mu^{s_{l-1}}_l(s_l)\,\tau_{s_l}(p_l, A_l)}{\sum_{(s_0,\ldots,s_{t-1}) \in S_0 \times \cdots \times S_{t-1}} \prod_{l=0}^{t-1} \mu^{s_{l-1}}_l(s_l)\,\tau_{s_l}(p_l, A_l)} = \rho_t(p_t; A_t|\hat h^{t-1}),$$

as required.



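Claim 2. $\rho$ satisfies Axiom 2 (Linear History Independence).

Proof. Take any $A_t$, $h^{t-1} = (A_0, p_0, \ldots, A_{t-1}, p_{t-1}) \in \mathcal{H}_{t-1}(A_t)$, and $H^{t-1} \subseteq \mathcal{H}_{t-1}(A_t)$ of the form $H^{t-1} = \{(h^{t-1}_{-k}, (\lambda A_k + (1-\lambda) B_k, \lambda p_k + (1-\lambda) q_k)) : q_k \in B_k\}$ for some $k < t$, $\lambda \in (0, 1)$, and $B_k = \{q^j_k : j = 1, \ldots, m\} \in \mathcal{A}_k$. Let $\tilde A_k := \lambda A_k + (1-\lambda) B_k$, and for each $j = 1, \ldots, m$, let $\tilde p^j_k := \lambda p_k + (1-\lambda) q^j_k$ and $\tilde h^{t-1}(j) := (h^{t-1}_{-k}, (\tilde A_k, \tilde p^j_k))$.

By DREU2, for all $p_t$, we have

$$\rho_t(p_t; A_t|h^{t-1}) = \frac{\sum_{(s_0,\ldots,s_t)} \prod_{\ell=0}^{t} \mu^{s_{\ell-1}}_\ell(s_\ell)\,\tau_{s_\ell}(p_\ell, A_\ell)}{\sum_{(s_0,\ldots,s_{t-1})} \prod_{\ell=0}^{t-1} \mu^{s_{\ell-1}}_\ell(s_\ell)\,\tau_{s_\ell}(p_\ell, A_\ell)}. \tag{19}$$

Moreover, by definition

$$\rho_t(p_t; A_t|H^{t-1}) = \frac{\sum_{j=1}^{m} \rho(\tilde h^{t-1}(j))\,\rho_t(p_t; A_t|\tilde h^{t-1}(j))}{\sum_{j=1}^{m} \rho(\tilde h^{t-1}(j))},$$

where for each $j = 1, \ldots, m$, DREU2 yields

$$\rho_t(p_t; A_t|\tilde h^{t-1}(j)) = \frac{\sum_{(s_0,\ldots,s_t)} \Big(\prod_{\ell=0,\ldots,t;\,\ell \neq k} \mu^{s_{\ell-1}}_\ell(s_\ell)\,\tau_{s_\ell}(p_\ell, A_\ell)\Big)\,\mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(\tilde p^j_k, \tilde A_k)}{\sum_{(s_0,\ldots,s_{t-1})} \Big(\prod_{\ell=0,\ldots,t-1;\,\ell \neq k} \mu^{s_{\ell-1}}_\ell(s_\ell)\,\tau_{s_\ell}(p_\ell, A_\ell)\Big)\,\mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(\tilde p^j_k, \tilde A_k)}$$

and

$$\begin{aligned}
\rho(\tilde h^{t-1}(j)) &:= \prod_{\ell=0,\ldots,t-1;\,\ell \neq k} \rho_\ell(p_\ell; A_\ell|\tilde h^{\ell-1})\,\rho_k(\tilde p^j_k; \tilde A_k|\tilde h^{k-1}) \\
&= \sum_{(s_0,\ldots,s_{t-1})} \Big(\prod_{\ell=0,\ldots,t-1;\,\ell \neq k} \mu^{s_{\ell-1}}_\ell(s_\ell)\,\tau_{s_\ell}(p_\ell, A_\ell)\Big)\,\mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(\tilde p^j_k, \tilde A_k).
\end{aligned}$$

Combining and rearranging, we obtain

$$\rho_t(p_t; A_t|H^{t-1}) = \frac{\sum_{(s_0,\ldots,s_t)} \Big(\prod_{\ell=0,\ldots,t;\,\ell \neq k} \mu^{s_{\ell-1}}_\ell(s_\ell)\,\tau_{s_\ell}(A_\ell, p_\ell)\Big)\,\mu^{s_{k-1}}_k(s_k) \sum_{j=1}^{m} \tau_{s_k}(\tilde p^j_k, \tilde A_k)}{\sum_{(s_0,\ldots,s_{t-1})} \Big(\prod_{\ell=0,\ldots,t-1;\,\ell \neq k} \mu^{s_{\ell-1}}_\ell(s_\ell)\,\tau_{s_\ell}(A_\ell, p_\ell)\Big)\,\mu^{s_{k-1}}_k(s_k) \sum_{j=1}^{m} \tau_{s_k}(\tilde p^j_k, \tilde A_k)}. \tag{20}$$

But observe that for all $s_k$,

$$\begin{aligned}
\sum_{j=1}^{m} \tau_{s_k}(\tilde p^j_k, \tilde A_k) &= \sum_{j=1}^{m} \tau_{s_k}\big(\{w \in \mathbb{R}^{X_k} : \tilde p^j_k \in M(M(\tilde A_k, U_{s_k}), w)\}\big) \\
&= \sum_{q_k \in B_k} \tau_{s_k}\big(\{w \in \mathbb{R}^{X_k} : p_k \in M(M(A_k, U_{s_k}), w) \text{ and } q_k \in M(M(B_k, U_{s_k}), w)\}\big) \\
&= \tau_{s_k}\big(\{w \in \mathbb{R}^{X_k} : p_k \in M(M(A_k, U_{s_k}), w)\}\big) = \tau_{s_k}(p_k, A_k),
\end{aligned} \tag{21}$$

where the second equality follows from linearity of the representation, the third equality from the fact that $\tau_{s_k}$ is a proper finitely-additive probability measure on $\mathbb{R}^{X_k}$, and the remaining equalities hold by definition. Combining (19), (20), and (21), we obtain $\rho_t(p_t; A_t|h^{t-1}) = \rho_t(p_t; A_t|H^{t-1})$, as required.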



Claim 3. $\rho$ satisfies Axiom 4 (History Continuity).

Proof. Fix any $A_t$, $p_t \in A_t$, and $h^{t-1} = (A_0, p_0, \ldots, A_{t-1}, p_{t-1}) \in \mathcal{H}_{t-1}$. Let $S_{t-1}(h^{t-1}) \subseteq S_{t-1}$ denote the set of period-$(t-1)$ states that are consistent with $h^{t-1}$. Define $\rho^{s_{t-1}}_t(p_t; A_t) := \sum_{s_t} \mu^{s_{t-1}}_t(s_t)\,\tau_{s_t}(p_t, A_t)$ for each $s_{t-1}$. By Lemma 16,

$$\begin{aligned}
\rho_t(p_t; A_t|h^{t-1}) &= \frac{\sum_{(s_0,\ldots,s_t) \in S_0 \times \cdots \times S_t} \prod_{k=0}^{t} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(p_k, A_k)}{\sum_{(s_0,\ldots,s_{t-1}) \in S_0 \times \cdots \times S_{t-1}} \prod_{k=0}^{t-1} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(p_k, A_k)} \\
&= \frac{\sum_{(s_0,\ldots,s_{t-1}) \in S_0 \times \cdots \times S_{t-1}} \prod_{k=0}^{t-1} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(p_k, A_k) \sum_{s_t \in S_t} \mu^{s_{t-1}}_t(s_t)\,\tau_{s_t}(p_t, A_t)}{\sum_{(s_0,\ldots,s_{t-1}) \in S_0 \times \cdots \times S_{t-1}} \prod_{k=0}^{t-1} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(p_k, A_k)}.
\end{aligned}$$

Hence, $\rho_t(p_t; A_t|h^{t-1}) \in \operatorname{co}\{\rho^{s_{t-1}}_t(p_t; A_t) : s_{t-1} \in S_{t-1}(h^{t-1})\}$. Fix any $s^*_{t-1} \in S_{t-1}(h^{t-1})$. To prove the claim, it is sufficient to show that

$$\rho^{s^*_{t-1}}_t(p_t; A_t) \in \Big\{\lim_n \rho_t(p_t; A_t|h^{t-1}_n) : h^{t-1}_n \to^m h^{t-1},\ h^{t-1}_n \in \mathcal{H}^*_{t-1}\Big\}.$$

To this end, let $\operatorname{pred}(s^*_{t-1}) = (s^*_0, \ldots, s^*_{t-2})$ and let $\bar h^{t-1} = (B_0, q_0, \ldots, B_{t-1}, q_{t-1}) \in \mathcal{H}^*_{t-1}$ be a separating history for $s^*_{t-1}$. By Lemma 17, for each $k = 0, \ldots, t-1$, we can find sequences $A^n_k \in \mathcal{A}^*_k(\bar h^{k-1})$ and $p^n_k \in A^n_k$ such that $A^n_k \to^m A_k$, $p^n_k \to^m p_k$ and $\mathcal{U}_{s^*_{k-1}}(A^n_k, p^n_k) = \{U_{s^*_k}\}$ for all $n$ and all $k = 0, \ldots, t-1$. Working backwards from $k = t-2$, we can inductively replace $A^n_k$ and $p^n_k$ with a mixture putting small weight on $(z, A^n_{k+1})$ for some $z$ to ensure that $A^n_{k+1} \in \operatorname{supp} p^{n,A}_k$ for all $k \le t-2$ while maintaining the properties in the previous sentence. Then by construction $h^{t-1}_n := (A^n_0, p^n_0, \ldots, A^n_{t-1}, p^n_{t-1}) \in \mathcal{H}^*_{t-1}(A_t)$ and $h^{t-1}_n$ is a separating history for $s^*_{t-1}$, which by Lemma 16 implies

$$\begin{aligned}
\rho_t(p_t; A_t|h^{t-1}_n) &= \frac{\Big(\prod_{k=0}^{t-1} \mu^{s^*_{k-1}}_k(s^*_k)\,\tau_{s^*_k}(p_k, A_k)\Big) \sum_{s_t \in S_t} \mu^{s^*_{t-1}}_t(s_t)\,\tau_{s_t}(p_t, A_t)}{\prod_{k=0}^{t-1} \mu^{s^*_{k-1}}_k(s^*_k)\,\tau_{s^*_k}(p_k, A_k)} \\
&= \sum_{s_t} \mu^{s^*_{t-1}}_t(s_t)\,\tau_{s_t}(p_t, A_t) =: \rho^{s^*_{t-1}}_t(p_t; A_t)
\end{aligned}$$

for each $n$. Since $h^{t-1}_n \to^m h^{t-1}$, this verifies the desired claim.

C

Proof of Theorem 2

C.1

Implications of $\succsim_{h^t}$

We begin with a preliminary lemma that characterizes the implications of the revealed dominance order $\succsim_{h^t}$.

Lemma 5. Suppose that $\rho$ admits an S-based DREU representation. Consider any $t \le T-1$, $h^t = (A_0, p_0, \ldots, A_t, p_t) \in \mathcal{H}_t$, and $q_t, r_t \in \Delta(X_t)$.

(i). If $q_t \succsim_{h^t} r_t$, then $U_{s_t}(q_t) \ge U_{s_t}(r_t)$ for all $s_t$ consistent with $h^t$.

(ii). Suppose there exist $g, b \in \Delta(X_t)$ such that $U_{s_t}(g) > U_{s_t}(b)$ for all $s_t$ consistent with $h^t$. If $U_{s_t}(q_t) \ge U_{s_t}(r_t)$ for all $s_t$ consistent with $h^t$, then $q_t \succsim_{h^t} r_t$.

(iii). If $h^t$ is a separating history for $s_t$, then $q_t \succsim_{h^t} r_t$ if and only if $U_{s_t}(q_t) \ge U_{s_t}(r_t)$.

Proof. (i): We prove the contrapositive. Suppose there exists $s_t$ consistent with $h^t$ such that $U_{s_t}(q_t) < U_{s_t}(r_t)$. Since $s_t$ is consistent with $h^t$, we have $\prod_{k=0}^{t} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(p_k, A_k) > 0$ for $\operatorname{pred}(s_t) = (s_0, \ldots, s_{t-1})$. Moreover, $U_{s_t}(q_t) < U_{s_t}(r_t)$ implies that for any $q^n_t \to^m q_t$ and $r^n_t \to^m r_t$, we have $U_{s_t}(q^n_t) < U_{s_t}(r^n_t)$ for large enough $n$. But then for all large enough $n$, we have $\tau_{s_t}(\tfrac12 p_t + \tfrac12 r^n_t; \tfrac12 A_t + \tfrac12 \{q^n_t, r^n_t\}) = \tau_{s_t}(p_t, A_t) > 0$, whence by Lemma 16, $\rho_t(\tfrac12 p_t + \tfrac12 r^n_t; \tfrac12 A_t + \tfrac12 \{q^n_t, r^n_t\}|h^{t-1}) > 0$. Thus, by definition, $q_t \not\succsim_{h^t} r_t$.

(ii): Let $S_t(h^t)$ denote the set of $s_t$ consistent with $h^t$. Suppose $U_{s_t}(q_t) \ge U_{s_t}(r_t)$ for all $s_t \in S_t(h^t)$. Then picking $g, b \in \Delta(X_t)$ such that $U_{s_t}(g) > U_{s_t}(b)$ for all $s_t \in S_t(h^t)$ and letting $q^n_t := \tfrac{n}{n+1} q_t + \tfrac{1}{n+1} g$ and $r^n_t := \tfrac{n}{n+1} r_t + \tfrac{1}{n+1} b$ for all $n$, we have $q^n_t \to^m q_t$, $r^n_t \to^m r_t$, and $U_{s_t}(q^n_t) > U_{s_t}(r^n_t)$ for all $n$ and $s_t \in S_t(h^t)$. Consider any $(s_0, \ldots, s_{t-1}, s_t) \in S_0 \times \cdots \times S_{t-1} \times S_t$. Then either $s_t \in S_t(h^t)$, in which case $\tau_{s_t}(\tfrac12 p_t + \tfrac12 r^n_t; \tfrac12 A_t + \tfrac12 \{q^n_t, r^n_t\}) = 0$ for all $n$, so that $\prod_{k=0}^{t-1} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(p_k, A_k)\,\mu^{s_{t-1}}_t(s_t)\,\tau_{s_t}(\tfrac12 p_t + \tfrac12 r^n_t; \tfrac12 A_t + \tfrac12 \{q^n_t, r^n_t\}) = 0$. Or $s_t \notin S_t(h^t)$, in which case $\prod_{k=0}^{t-1} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(p_k, A_k)\,\mu^{s_{t-1}}_t(s_t)\,\tau_{s_t}(p_t, A_t) = 0$, so that again $\prod_{k=0}^{t-1} \mu^{s_{k-1}}_k(s_k)\,\tau_{s_k}(p_k, A_k)\,\mu^{s_{t-1}}_t(s_t)\,\tau_{s_t}(\tfrac12 p_t + \tfrac12 r^n_t; \tfrac12 A_t + \tfrac12 \{q^n_t, r^n_t\}) = 0$. By Lemma 16, this implies $\rho_t(\tfrac12 p_t + \tfrac12 r^n_t; \tfrac12 A_t + \tfrac12 \{q^n_t, r^n_t\}|h^{t-1}) = 0$ for all $n$, i.e., $q_t \succsim_{h^t} r_t$.

(iii): Finally, suppose $h^t$ is a separating history for $s_t$. If $q_t \succsim_{h^t} r_t$, then $U_{s_t}(q_t) \ge U_{s_t}(r_t)$ by part (i). For the converse, note that since $U_{s_t}$ is non-constant, there exist $g, b \in \Delta(X_t)$ such that $U_{s_t}(g) > U_{s_t}(b)$. Since $s_t$ is the only state consistent with $h^t$ (recall Lemma 1), part (ii) implies that if $U_{s_t}(q_t) \ge U_{s_t}(r_t)$ then $q_t \succsim_{h^t} r_t$.
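The perturbation used in part (ii), which turns weak utility rankings into strict ones while converging to the original lotteries, can be checked numerically. The sketch below is an illustration only: lotteries are encoded as probability lists over three made-up outcomes, and the helper names and utility vectors are hypothetical.

```python
from fractions import Fraction as F

def mix(weight, p, q):
    """Return the lottery weight*p + (1 - weight)*q (lotteries as probability lists)."""
    return [weight * a + (1 - weight) * b for a, b in zip(p, q)]

def expected_utility(u, p):
    """Expected utility of lottery p under the vNM utility vector u."""
    return sum(ui * pi for ui, pi in zip(u, p))

def perturbed_pair(q, r, g, b, n):
    """q^n = n/(n+1) q + 1/(n+1) g and r^n = n/(n+1) r + 1/(n+1) b."""
    w = F(n, n + 1)
    return mix(w, q, g), mix(w, r, b)

# Hypothetical data: two states consistent with the history, three outcomes.
utilities = [[2, 1, 0], [1, 1, 0]]
q = [F(1, 2), F(1, 2), F(0)]
r = [F(0), F(1), F(0)]       # the second state is exactly indifferent between q and r
g = [1, 0, 0]                # strictly better than b in both states
b = [0, 0, 1]

# Strict ranking of the perturbed lotteries in every state, for every n checked.
strict = all(
    expected_utility(u, qn) > expected_utility(u, rn)
    for n in range(1, 51)
    for u in utilities
    for qn, rn in [perturbed_pair(q, r, g, b, n)]
)
print(strict)
```

Exact rational arithmetic is used so that the strict inequalities are not an artifact of floating-point rounding.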

C.2



Proof of Theorem 2: Sufficiency

Throughout this section, we assume that ρ admits a DREU representation and satisfies Axioms 5–7. We will show that ρ admits an evolving utility representation. By Proposition 5, it is sufficient to construct an S-based evolving utility representation. Sections C.2.1–C.2.5 accomplish this.

C.2.1

Recursive Construction up to t

The construction proceeds recursively. Suppose that $t \le T-1$. Assume that we have obtained $(S_{t'}, \{\mu^{s_{t'-1}}_{t'}\}_{s_{t'-1} \in S_{t'-1}}, \{U_{s_{t'}}, \tau_{s_{t'}}\}_{s_{t'} \in S_{t'}})$ for each $t' \le t$ such that DREU1 and DREU2 hold for all $t' \le t$ and EVU holds for all $t' \le t-1$ (see Definition 12 for the statements of these conditions).

Note that the base case $t = 0$ is true because of the fact that $\rho$ admits a DREU representation and by Proposition 5 (the requirement that EVU holds for $t' \le t-1$ is vacuous here). To complete the proof, we will construct $(S_{t+1}, \{\mu^{s_t}_{t+1}\}_{s_t \in S_t}, \{U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}})$ such that DREU1 and DREU2 hold for $t' \le t+1$ and EVU holds for $t' \le t$.

C.2.2

Properties of Ust

Using Lemma 5, the following lemma translates Axioms 5 (Separability) and 6 (DLR Menu Preference) into properties of $U_{s_t}$.

Lemma 6. For any $s_t \in S_t$, there exist functions $u_{s_t} : Z \to \mathbb{R}$ and $V_{s_t} : \mathcal{A}_{t+1} \to \mathbb{R}$ such that

(i). $U_{s_t}(z_t, A_{t+1}) = u_{s_t}(z_t) + V_{s_t}(A_{t+1})$ holds for each $(z_t, A_{t+1})$.

(ii). $V_{s_t}$ is continuous.

(iii). $V_{s_t}$ is monotone, i.e., $V_{s_t}(A'_{t+1}) \ge V_{s_t}(A_{t+1})$ for any $A_{t+1} \subseteq A'_{t+1}$.

(iv). $V_{s_t}$ is linear, i.e., $V_{s_t}(\alpha A_{t+1} + (1-\alpha) A'_{t+1}) = \alpha V_{s_t}(A_{t+1}) + (1-\alpha) V_{s_t}(A'_{t+1})$ for all $A_{t+1}, A'_{t+1}$ and $\alpha \in (0, 1)$.

Moreover, there exist $C'_{t+1}, C_{t+1} \in \mathcal{A}_{t+1}$ such that $V_{s_t}(C'_{t+1}) > V_{s_t}(C_{t+1})$ for all $s_t \in S_t$.

Proof. Fix any $s_t \in S_t$ and a separating history $h^t$ for $s_t$, the existence of which is guaranteed by Lemma 2.

For (i), it suffices, by standard arguments, to show that $U_{s_t}(z_t, A_{t+1}) + U_{s_t}(z'_t, A'_{t+1}) = U_{s_t}(z'_t, A_{t+1}) + U_{s_t}(z_t, A'_{t+1})$ for all $z_t, z'_t$, and $A_{t+1}, A'_{t+1}$. Fix any $z_t, z'_t$, and $A_{t+1}, A'_{t+1}$. By Axiom 5 (Separability), we have $\tfrac12 (z_t, A_{t+1}) + \tfrac12 (z'_t, A'_{t+1}) \sim_{h^t} \tfrac12 (z'_t, A_{t+1}) + \tfrac12 (z_t, A'_{t+1})$, whence Lemma 5 implies that $\tfrac12 U_{s_t}(z_t, A_{t+1}) + \tfrac12 U_{s_t}(z'_t, A'_{t+1}) = \tfrac12 U_{s_t}(z'_t, A_{t+1}) + \tfrac12 U_{s_t}(z_t, A'_{t+1})$. This proves that there exist functions $u_{s_t} : Z \to \mathbb{R}$ and $V_{s_t} : \mathcal{A}_{t+1} \to \mathbb{R}$ such that $U_{s_t}(z_t, A_{t+1}) = u_{s_t}(z_t) + V_{s_t}(A_{t+1})$ for each $(z_t, A_{t+1})$, as required.

For (ii), note that $\succsim_{h^t}$ is complete (by $h^t$ being a separating history for $s_t$ and Lemma 5) and also continuous (by Axiom 6 (iii) (Continuity)). Thus Lemma 5 implies that $U_{s_t}(p_t) = u_{s_t}(p^Z_t) + V_{s_t}(p^A_t)$ is continuous in $p_t$, which ensures the continuity of $V_{s_t}(\cdot)$.

For (iii), suppose $A_{t+1} \subseteq A'_{t+1}$ and fix any $z_t$. By Axiom 6 (i) (Monotonicity), we have $(z_t, A'_{t+1}) \succsim_{h^t} (z_t, A_{t+1})$. By Lemma 5, this implies that $U_{s_t}(z_t, A'_{t+1}) \ge U_{s_t}(z_t, A_{t+1})$, whence $V_{s_t}(A'_{t+1}) \ge V_{s_t}(A_{t+1})$ by (i).

For (iv), fix any $A_{t+1}, A'_{t+1}$, $z_t$, and $\alpha \in (0, 1)$. Axiom 6 (ii) (Indifference to Timing) implies $(z_t, \alpha A_{t+1} + (1-\alpha) A'_{t+1}) \sim_{h^t} \alpha (z_t, A_{t+1}) + (1-\alpha)(z_t, A'_{t+1})$, which by Lemma 5 implies $U_{s_t}((z_t, \alpha A_{t+1} + (1-\alpha) A'_{t+1})) = U_{s_t}(\alpha (z_t, A_{t+1}) + (1-\alpha)(z_t, A'_{t+1}))$. By linearity and separability of $U_{s_t}$, this implies $V_{s_t}(\alpha A_{t+1} + (1-\alpha) A'_{t+1}) = \alpha V_{s_t}(A_{t+1}) + (1-\alpha) V_{s_t}(A'_{t+1})$, as required.

Finally, for the "moreover" part, again consider any $s^*_t$ and separating history $h^t$ for $s^*_t$. By Axiom 6 (iv) (Non-degenerate Menu Preference), there exist $A'_{t+1}(s^*_t)$, $A_{t+1}(s^*_t)$, and $z_t$ such that $(z_t, A'_{t+1}(s^*_t)) \succ_{h^t} (z_t, A_{t+1}(s^*_t))$. Thus, Lemma 5 (iii) implies $U_{s^*_t}(z_t, A'_{t+1}(s^*_t)) > U_{s^*_t}(z_t, A_{t+1}(s^*_t))$, so $V_{s^*_t}(A'_{t+1}(s^*_t)) > V_{s^*_t}(A_{t+1}(s^*_t))$ by part (i). Now let $C'_{t+1} := \bigcup_{s^*_t \in S_t} \big(A'_{t+1}(s^*_t) \cup A_{t+1}(s^*_t)\big)$ and let $C_{t+1} := \sum_{s^*_t \in S_t} \tfrac{1}{|S_t|} A_{t+1}(s^*_t)$. Then for all $s_t$ and $s'_t$, by monotonicity of $V_{s_t}$, we have $V_{s_t}(C'_{t+1}) \ge V_{s_t}(A_{t+1}(s'_t))$, where by construction this inequality is strict whenever $s_t = s'_t$. By linearity of $V_{s_t}$, this implies $V_{s_t}(C'_{t+1}) > V_{s_t}(C_{t+1})$.



Corollary C.1. Fix any $t \le T-1$ and $h^t \in \mathcal{H}_t$. Then $q_t \succsim_{h^t} r_t$ if and only if $U_{s_t}(q_t) \ge U_{s_t}(r_t)$ for all $s_t$ consistent with $h^t$.

Proof. The "only if" direction is a restatement of part (i) of Lemma 5. For the "if" direction, let $C'_{t+1}$ and $C_{t+1}$ be as in the "moreover" part of Lemma 6. Pick any $z \in Z$ and let $g_{t+1} := \delta_{(z, C'_{t+1})}$ and $b_{t+1} := \delta_{(z, C_{t+1})}$. Then by Lemma 6, $U_{s_t}(g_{t+1}) > U_{s_t}(b_{t+1})$ for all $s_t$. Hence, the "if" direction is immediate from part (ii) of Lemma 5.

C.2.3



Construction of Random Utility in Period t + 1

Since $\rho$ admits a DREU representation, it admits an S-based DREU representation by Proposition 5, so in particular we can obtain $(S_{t+1}, \{\mu^{s_t}_{t+1}\}_{s_t \in S_t}, \{\tilde U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}})$ satisfying DREU1 and DREU2 at $t+1$. For any $s_t \in S_t$, define $\rho^{s_t}_{t+1}$ by $\rho^{s_t}_{t+1}(p_{t+1}, A_{t+1}) := \sum_{s_{t+1}} \mu^{s_t}_{t+1}(s_{t+1})\,\tau_{s_{t+1}}(p_{t+1}, A_{t+1})$ for all $p_{t+1}, A_{t+1}$.

C.2.4

Sophistication and Finiteness of Menu Preference

Before completing the representation, we establish two more lemmas. Using Axiom 7 (Sophistication), the first lemma ensures that for each $s_t$, $\rho^{s_t}_{t+1}$ and the preference over $\mathcal{A}_{t+1}$ induced by $V_{s_t}$ satisfy Axioms 1 and 2 in Ahn and Sarver (2013).

Lemma 7. For any $s_t \in S_t$, separating history $h^t$ for $s_t$, and $A_{t+1} \subseteq A'_{t+1} \in \mathcal{A}^*_{t+1}(h^t)$, the following are equivalent:

(i). $\rho^{s_t}_{t+1}(A'_{t+1} \setminus A_{t+1}; A'_{t+1}) > 0$.

(ii). $V_{s_t}(A'_{t+1}) > V_{s_t}(A_{t+1})$.

Proof. Pick any separating history $h^t$ for $s_t$. Note that by DREU2 at $t+1$ and Lemma 16, we have $\rho_{t+1}(A'_{t+1} \setminus A_{t+1}; A'_{t+1}|h^t) = \rho^{s_t}_{t+1}(A'_{t+1} \setminus A_{t+1}; A'_{t+1})$. Moreover, by Corollary C.1 and Lemma 6 (i), $V_{s_t}(A'_{t+1}) > V_{s_t}(A_{t+1})$ if and only if $(z_t, A'_{t+1}) \succ_{h^t} (z_t, A_{t+1})$ for all $z_t$. By Axiom 7, this implies that $V_{s_t}(A'_{t+1}) > V_{s_t}(A_{t+1})$ if and only if $\rho^{s_t}_{t+1}(A'_{t+1} \setminus A_{t+1}; A'_{t+1}) > 0$, as claimed.



The next lemma shows that because of Lemma 7, the finiteness of each $\operatorname{supp} \mu^{s_t}_{t+1}$ is enough to ensure that the preference over $\mathcal{A}_{t+1}$ induced by each $V_{s_t}$ satisfies Axiom DLR 6 (Finiteness) introduced by Ahn and Sarver (2013):

Lemma 8. For each $s_t \in S_t$, there is $K_{s_t} > 0$ such that for any $A_{t+1}$, there is $B_{t+1} \subseteq A_{t+1}$ such that $|B_{t+1}| \le K_{s_t}$ and $V_{s_t}(A_{t+1}) = V_{s_t}(B_{t+1})$.

Proof. Fix any $s_t \in S_t$ and a separating history $h^t$ for $s_t$. Let $S_{t+1}(s_t) := \operatorname{supp} \mu^{s_t}_{t+1}$. We will show that $K_{s_t} := |S_{t+1}(s_t)|$ is as required.

Step 1: First consider any $A_{t+1} \in \mathcal{A}^*_{t+1}(h^t)$. Then by Lemma 14, for each $s_{t+1} \in S_{t+1}(s_t)$ we have $|M(A_{t+1}, \tilde U_{s_{t+1}})| = 1$. Letting $B_{t+1} := \bigcup_{s_{t+1} \in S_{t+1}(s_t)} M(A_{t+1}, \tilde U_{s_{t+1}})$, we then have that $|B_{t+1}| \le K_{s_t}$ and $\rho^{s_t}_{t+1}(A_{t+1} \setminus B_{t+1}; A_{t+1}) = 0$. By Lemma 7, this implies that $V_{s_t}(A_{t+1}) = V_{s_t}(B_{t+1})$, as required.

Step 2: Next take any $A_{t+1} \notin \mathcal{A}^*_{t+1}(h^t)$. By Lemma 17, we can find a sequence $A^n_{t+1} \to^m A_{t+1}$ with $A^n_{t+1} \in \mathcal{A}^*_{t+1}(h^t)$ for all $n$. Then by Step 1, we can find $B^n_{t+1} \subseteq A^n_{t+1}$ for all $n$ such that $|B^n_{t+1}| \le K_{s_t}$ and $V_{s_t}(A^n_{t+1}) = V_{s_t}(B^n_{t+1})$. By definition of $\to^m$, for each $q_{t+1} \in A_{t+1}$, there exists $D_{t+1}(q_{t+1}) \in \mathcal{A}_{t+1}$ and a sequence $\alpha_n(q_{t+1}) \to 0$ such that $B^n_{t+1} \subseteq \bigcup_{q_{t+1} \in A_{t+1}} \big(\alpha_n(q_{t+1}) D_{t+1}(q_{t+1}) + (1 - \alpha_n(q_{t+1}))\{q_{t+1}\}\big)$ for all $n$. Hence, since $|B^n_{t+1}| \le K_{s_t}$ for all $n$, restricting to a subsequence if necessary, there is $B_{t+1} \subseteq A_{t+1}$ such that $B^n_{t+1} \to^m B_{t+1}$ and such that $|B_{t+1}| \le K_{s_t}$. Finally, by continuity of $V_{s_t}$ (Lemma 6 (ii)), we have $V_{s_t}(B_{t+1}) = V_{s_t}(A_{t+1})$, as required.

C.2.5



Completing the representation

Recall that in Section C.2.3, we have obtained $(S_{t+1}, \{\mu^{s_t}_{t+1}\}_{s_t \in S_t}, \{\tilde U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}})$ satisfying DREU1 and DREU2 at $t+1$. We now show that for each $s_{t+1} \in S_{t+1}$ there exist $\alpha_{s_{t+1}} > 0$ and $\beta_{s_{t+1}} \in \mathbb{R}$ such that after replacing $\tilde U_{s_{t+1}}$ with $U_{s_{t+1}} := \alpha_{s_{t+1}} \tilde U_{s_{t+1}} + \beta_{s_{t+1}}$, we additionally have that EVU holds at time $t$.

Fix any $s_t$ and let $S_{t+1}(s_t) := \operatorname{supp} \mu^{s_t}_{t+1}$. Note that by DREU1 at $t+1$ and since we have defined $\rho^{s_t}_{t+1}$ by $\rho^{s_t}_{t+1}(p_{t+1}, A_{t+1}) := \sum_{s_{t+1} \in S_{t+1}(s_t)} \mu^{s_t}_{t+1}(s_{t+1})\,\tau_{s_{t+1}}(p_{t+1}, A_{t+1})$ for all $p_{t+1}$ and $A_{t+1}$, it follows that $(S_{t+1}(s_t), \mu^{s_t}_{t+1}, \{\tilde U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}(s_t)})$ is an S-based REU representation of $\rho^{s_t}_{t+1}$ (see Definition 13).

Since all the $U_{s_{t+1}}$ are non-constant and induce different preferences over $\Delta(X_{t+1})$ for distinct $s_{t+1}, s'_{t+1} \in S_{t+1}(s_t)$ and since $V_{s_t}$ is nonconstant by Lemma 6, we can find a finite set $Y \subseteq X_{t+1}$ such that (i) $V_{s_t}$ is non-constant on $\mathcal{A}_{t+1}(Y) := \{B_{t+1} \in \mathcal{A}_{t+1} : \bigcup_{p_{t+1} \in B_{t+1}} \operatorname{supp}(p_{t+1}) \subseteq Y\}$; (ii) for each $s_{t+1} \in S_{t+1}(s_t)$, $\tilde U_{s_{t+1}}$ is non-constant on $Y$; and (iii) for each distinct pair $s_{t+1}, s'_{t+1} \in S_{t+1}(s_t)$, $\tilde U_{s_{t+1}} \not\approx \tilde U_{s'_{t+1}}$ on $Y$.

Observe that by Lemmas 6 and 8, the preference $\succsim_{s_t}$ on $\mathcal{A}_{t+1}(Y)$ induced by $V_{s_t}$ satisfies Axioms DLR 1–6 (Weak Order, Continuity, Independence, Monotonicity, Nontriviality, Finiteness) in Ahn and Sarver (2013) (henceforth AS), so by Corollary S1 in AS $\succsim_{s_t}$ admits a DLR representation (see Definition S1 in AS). Moreover, since $\rho^{s_t}_{t+1}$ admits an S-based REU representation (what AS call a GP representation), so does its restriction to $\mathcal{A}_{t+1}(Y)$. Finally, by Lemma 7, the pair $(\succsim_{s_t}, \rho^{s_t}_{t+1})$ satisfies AS's Axioms 1 and 2 on $\mathcal{A}_{t+1}(Y)$. Thus, by Theorem 1 in AS, we can find a DLR-GP representation of $(\succsim_{s_t}, \rho^{s_t}_{t+1})$ on $\mathcal{A}_{t+1}(Y)$, i.e., an S-based REU representation $(\hat S_{t+1}(s_t), \hat\mu^{s_t}_{t+1}, \{\hat U_{s_{t+1}}, \hat\tau_{s_{t+1}}\}_{s_{t+1} \in \hat S_{t+1}(s_t)})$ of $\rho^{s_t}_{t+1}$ on $\mathcal{A}_{t+1}(Y)$ such that $\succsim_{s_t}$ restricted to $\mathcal{A}_{t+1}(Y)$ is represented by $\hat V_{s_t}$, where $\hat V_{s_t}(A_{t+1}) := \sum_{s_{t+1} \in \hat S_{t+1}(s_t)} \hat\mu^{s_t}_{t+1}(s_{t+1}) \max_{p_{t+1} \in A_{t+1}} \hat U_{s_{t+1}}(p_{t+1})$. Since $V_{s_t}$ also represents $\succsim_{s_t}$ restricted to $\mathcal{A}_{t+1}(Y)$, standard arguments yield $\hat\alpha_{s_t} > 0$ and $\hat\beta_{s_t} \in \mathbb{R}$ such that for all $A_{t+1} \in \mathcal{A}_{t+1}(Y)$, we have $V_{s_t}(A_{t+1}) = \hat\alpha_{s_t} \hat V_{s_t}(A_{t+1}) + \hat\beta_{s_t}$, whence

$$V_{s_t}(A_{t+1}) = \sum_{s_{t+1} \in \hat S_{t+1}(s_t)} \hat\mu^{s_t}_{t+1}(s_{t+1}) \max_{p_{t+1} \in A_{t+1}} U_{s_{t+1}}(p_{t+1}), \tag{22}$$

where $U_{s_{t+1}} = \hat\alpha_{s_t} \hat U_{s_{t+1}} + \hat\beta_{s_t}$. By the uniqueness properties of S-based REU representations (Proposition 4 in AS), $(\hat S_{t+1}(s_t), \hat\mu^{s_t}_{t+1}, \{U_{s_{t+1}}, \hat\tau_{s_{t+1}}\}_{s_{t+1} \in \hat S_{t+1}(s_t)})$ still constitutes an S-based REU representation of $\rho^{s_t}_{t+1}$ on $\mathcal{A}_{t+1}(Y)$. Applying Proposition 4 in AS again, since $(S_{t+1}(s_t), \mu^{s_t}_{t+1}, \{\tilde U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}(s_t)})$ also represents $\rho^{s_t}_{t+1}$ on $\mathcal{A}_{t+1}(Y)$, we can assume after relabeling that $S_{t+1}(s_t) = \hat S_{t+1}(s_t)$, $\hat\mu^{s_t}_{t+1} = \mu^{s_t}_{t+1}$ and that for each $s_{t+1} \in S_{t+1}(s_t)$, there exist constants $\alpha_{s_{t+1}} > 0$ and $\beta_{s_{t+1}} \in \mathbb{R}$ such that

$$U_{s_{t+1}}(x_{t+1}) = \alpha_{s_{t+1}} \tilde U_{s_{t+1}}(x_{t+1}) + \beta_{s_{t+1}} \tag{23}$$

for each $x_{t+1} \in Y \subseteq X_{t+1}$. Since $\tilde U_{s_{t+1}}$ is defined on $X_{t+1}$, we can extend $U_{s_{t+1}}$ to the whole space $X_{t+1}$ by (23). Then $U_{s_{t+1}}$ and $\tilde U_{s_{t+1}}$ represent the same preference over $\Delta(X_{t+1})$, so since $(S_{t+1}(s_t), \mu^{s_t}_{t+1}, \{\tilde U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}(s_t)})$ satisfies DREU1 and DREU2, so does $(S_{t+1}(s_t), \mu^{s_t}_{t+1}, \{U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}(s_t)})$.

It remains to show that (22) holds for all $A_{t+1} \in \mathcal{A}_{t+1}$, so that EVU is satisfied at $s_t$. To see this, consider any $A_{t+1} \in \mathcal{A}_{t+1}$ and choose a finite set $Y' \subseteq X_{t+1}$ such that $Y \cup \bigcup_{p_{t+1} \in A_{t+1}} \operatorname{supp}(p_{t+1}) \subseteq Y'$. As above, we can again apply Theorem 1 in AS to obtain a DLR-GP representation $(\bar S_{t+1}(s_t), \bar\mu^{s_t}_{t+1}, \{\bar U_{s_{t+1}}, \bar\tau_{s_{t+1}}\}_{s_{t+1} \in \bar S_{t+1}(s_t)})$ of the pair $(\succsim_{s_t}, \rho^{s_t}_{t+1})$ restricted to $\mathcal{A}_{t+1}(Y')$. But since this also yields a DLR-GP representation of $(\succsim_{s_t}, \rho^{s_t}_{t+1})$ restricted to $\mathcal{A}_{t+1}(Y)$, by the uniqueness property of DLR-GP representations (Theorem 2 in AS), we can assume that $\bar S_{t+1}(s_t) = S_{t+1}(s_t)$, $\bar\mu^{s_t}_{t+1} = \mu^{s_t}_{t+1}$ and that there exist $\bar\alpha_{s_t} > 0$ and $\bar\beta_{s_{t+1}} \in \mathbb{R}$ such that $\bar U_{s_{t+1}} = \bar\alpha_{s_t} U_{s_{t+1}} + \bar\beta_{s_{t+1}}$ for each $s_{t+1} \in S_{t+1}(s_t)$. Since $\succsim_{s_t}$ is represented on $\mathcal{A}_{t+1}(Y')$ by $\bar V_{s_t}(B_{t+1}) := \sum_{s_{t+1} \in S_{t+1}(s_t)} \mu^{s_t}_{t+1}(s_{t+1}) \max_{p_{t+1} \in B_{t+1}} \bar U_{s_{t+1}}(p_{t+1})$ and since $\bar\alpha_{s_t}$ depends only on $s_t$ (and not on $s_{t+1}$), it follows that $\succsim_{s_t}$ is also represented on $\mathcal{A}_{t+1}(Y')$ by $V'_{s_t}(B_{t+1}) := \sum_{s_{t+1} \in S_{t+1}(s_t)} \mu^{s_t}_{t+1}(s_{t+1}) \max_{p_{t+1} \in B_{t+1}} U_{s_{t+1}}(p_{t+1})$. Thus, the linear functions $V_{s_t}$ and $V'_{s_t}$ represent the same preference on $\mathcal{A}_{t+1}(Y')$ and coincide on $\mathcal{A}_{t+1}(Y)$, so they must also coincide on $\mathcal{A}_{t+1}(Y')$. Thus, (22) holds at $A_{t+1}$.

This shows that EVU holds at $t$. Combining this with the inductive hypothesis, it follows that $(S_{t'}, \{\mu^{s_{t'-1}}_{t'}\}_{s_{t'-1} \in S_{t'-1}}, \{U_{s_{t'}}, \tau_{s_{t'}}\}_{s_{t'} \in S_{t'}})$ satisfies DREU1 and DREU2 for all $t' \le t+1$ and EVU for all $t' \le t$, as required.
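The menu value in (22) is an expected maximum of linear utilities, which is why it inherits the monotonicity and linearity properties listed in Lemma 6. The following Python sketch checks these two properties on a small made-up example; the function names, the encoding of menus as lists of probability tuples, and all numbers are assumptions for illustration only.

```python
from fractions import Fraction as F
from itertools import product

def menu_value(menu, mu, utils):
    """V(A) = sum_s mu(s) * max_{p in A} U_s(p), with U_s linear in the lottery p."""
    return sum(
        mu[s] * max(sum(u * x for u, x in zip(utils[s], p)) for p in menu)
        for s in mu
    )

def mix_menus(a, menu1, menu2):
    """Minkowski mixture a*A + (1 - a)*A' = {a*p + (1 - a)*p' : p in A, p' in A'}."""
    return [
        tuple(a * x + (1 - a) * y for x, y in zip(p, q))
        for p, q in product(menu1, menu2)
    ]

# Hypothetical next-period states, beliefs, utilities, and menus over three outcomes.
mu = {"s1": F(1, 3), "s2": F(2, 3)}
utils = {"s1": (1, 0, 0), "s2": (0, 1, 0)}
A = [(1, 0, 0), (0, 0, 1)]
A_prime = [(0, 1, 0), (F(1, 2), F(1, 2), 0)]

a = F(1, 4)
v_mix = menu_value(mix_menus(a, A, A_prime), mu, utils)
v_lin = a * menu_value(A, mu, utils) + (1 - a) * menu_value(A_prime, mu, utils)
v_union = menu_value(A + A_prime, mu, utils)   # value of the larger menu A ∪ A'
print(v_mix, v_lin, v_union)
```

Linearity holds because the maximum of a linear function over a Minkowski mixture splits into the mixture of the two maxima; monotonicity holds because enlarging the menu can only raise each state's maximum.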

C.3

Proof of Theorem 2: Necessity

Suppose that $\rho$ admits an evolving utility representation. Then by Proposition 5, $\rho$ admits an S-based evolving utility representation $(S_t, \{\mu^{s_{t-1}}_t\}_{s_{t-1} \in S_{t-1}}, \{U_{s_t}, u_{s_t}, \tau_{s_t}\}_{s_t \in S_t})$.

We first show that for every $t \le T-1$, there exist $g_t, b_t \in \Delta(X_t)$ such that $U_{s_t}(g_t) > U_{s_t}(b_t)$ for all $s_t \in S_t$. By separability of $U_{s_t}$, it is sufficient to find menus $C'_{t+1}, C_{t+1}$ such that $V_{s_t}(C'_{t+1}) > V_{s_t}(C_{t+1})$ for all $s_t$. Note first that for any $s_{t+1} \in S_{t+1}$, since $U_{s_{t+1}}$ is nonconstant, we can find $g_{t+1}(s_{t+1}), b_{t+1}(s_{t+1}) \in \Delta(X_{t+1})$ such that $U_{s_{t+1}}(g_{t+1}(s_{t+1})) > U_{s_{t+1}}(b_{t+1}(s_{t+1}))$. Let $C'_{t+1} := \{g_{t+1}(s_{t+1}), b_{t+1}(s_{t+1}) : s_{t+1} \in S_{t+1}\}$, and for every $s_t$, let $A_{t+1}(s_t) := \{b_{t+1}(s_{t+1})\}$ for some $s_{t+1} \in \operatorname{supp} \mu^{s_t}_{t+1}$. Then $V_{s_t}(C'_{t+1}) \ge V_{s_t}(A_{t+1}(s'_t))$ for all $s_t, s'_t$, with strict inequality for $s_t = s'_t$. Hence, letting $C_{t+1} := \sum_{s_t \in S_t} \tfrac{1}{|S_t|} A_{t+1}(s_t)$, linearity implies $V_{s_t}(C'_{t+1}) > V_{s_t}(C_{t+1})$ for all $s_t$, as required.

By Lemma 5, the previous paragraph implies that for all $t \le T-1$, $h^t$ and $q_t, r_t$, we have $q_t \succsim_{h^t} r_t$ if and only if $U_{s_t}(q_t) \ge U_{s_t}(r_t)$ for all $s_t$ consistent with $h^t$. Axioms 5 (Separability) and 6 (i)–(ii) (Monotonicity and Indifference to Timing) are then straightforward to verify from the representation. Moreover, $C'_{t+1}$ and $C_{t+1}$ from the previous paragraph satisfy $(z_t, C'_{t+1}) \succ_{h^t} (z_t, C_{t+1})$ for all $h^t$ and $z_t$, implying Axiom 6 (iv) (Nondegeneracy).

To show Axiom 7 (Sophistication), consider any $t \le T-1$, $h^t$, $z_t$, and $A_{t+1} \subseteq A'_{t+1} \in \mathcal{A}^*(h^t)$. Since $A'_{t+1} \in \mathcal{A}^*(h^t)$, Lemma 14 implies that $\rho_{t+1}(A'_{t+1} \setminus A_{t+1}; A'_{t+1}|h^t) > 0$ holds if and only if there exists some $s_t$ consistent with $h^t$ such that $\max_{p_{t+1} \in A'_{t+1}} U_{s_{t+1}}(p_{t+1}) > \max_{p_{t+1} \in A_{t+1}} U_{s_{t+1}}(p_{t+1})$ for some $s_{t+1} \in \operatorname{supp} \mu^{s_t}_{t+1}$, which by the representation is equivalent to $V_{s_t}(A'_{t+1}) > V_{s_t}(A_{t+1})$. By Lemma 5, this is equivalent to $(z_t, A_{t+1}) \not\succsim_{h^t} (z_t, A'_{t+1})$, which by Monotonicity is in turn equivalent to $(z_t, A'_{t+1}) \succ_{h^t} (z_t, A_{t+1})$.

Finally, to show Axiom 6 (iii) (Continuity), note first that for each $s_{T-1} \in S_{T-1}$, $\sum_{s_T \in S_T} \mu^{s_{T-1}}_T(s_T) \max_{p_T \in A_T} U_{s_T}(p_T)$ is continuous in menu $A_T$. Assuming inductively that for each $k \ge t+1$ and $s_k \in S_k$, $\sum_{s_{k+1} \in S_{k+1}} \mu^{s_k}_{k+1}(s_{k+1}) \max_{p_{k+1} \in A_{k+1}} U_{s_{k+1}}(p_{k+1})$ is continuous in menu $A_{k+1}$, it also follows that for each $t$ and $s_t \in S_t$, $\sum_{s_{t+1} \in S_{t+1}} \mu^{s_t}_{t+1}(s_{t+1}) \max_{p_{t+1} \in A_{t+1}} U_{s_{t+1}}(p_{t+1})$ is continuous in menu $A_{t+1}$. Thus for each $s_t$, $U_{s_t}(p_t)$ is continuous in $p_t$. Then Continuity follows as, for each $p_t$, $\{q_t : q_t \succsim_{h^t} p_t\} = \bigcap_{s_t \text{ consistent with } h^t} \{q_t : U_{s_t}(q_t) \ge U_{s_t}(p_t)\}$ and $\{q_t : p_t \succsim_{h^t} q_t\} = \bigcap_{s_t \text{ consistent with } h^t} \{q_t : U_{s_t}(p_t) \ge U_{s_t}(q_t)\}$ are closed.

D

Proof of Theorem 3

D.1

Proof of Theorem 3: Sufficiency

Suppose that $\rho$ admits an evolving utility representation and that Condition 1 and Axioms 8 (Stationary Consumption Preference) and 9 (Constant Intertemporal Tradeoff) hold. By Proposition 5, $\rho$ admits an S-based evolving utility representation $(S_t, \{\mu^{s_{t-1}}_t\}_{s_{t-1} \in S_{t-1}}, \{U_{s_t}, u_{s_t}, \tau_{s_t}\}_{s_t \in S_t})_{t=0,\ldots,T}$. Up to adding appropriate constants to each utility $u_{s_t}$ and $U_{s_t}$, we can ensure that $\sum_{z \in Z} u_{s_t}(z) = 0$ for all $t = 0, \ldots, T$ and $s_t \in S_t$ without affecting that $(S_t, \{\mu^{s_{t-1}}_t\}_{s_{t-1} \in S_{t-1}}, \{U_{s_t}, u_{s_t}, \tau_{s_t}\}_{s_t \in S_t})_{t=0,\ldots,T}$ is an S-based evolving utility representation of $\rho$. We will show that this representation is in fact an S-based gradual learning representation, i.e., that there exists a discount factor $\delta \in (0, 1)$ such that for all $t \le T-1$ and $s_t$, we have $u_{s_t} = \tfrac{1}{\delta} \sum_{s_{t+1}} \mu^{s_t}_{t+1}(s_{t+1})\,u_{s_{t+1}}$. By Proposition 5, this implies that $\rho$ admits a gradual learning representation.

Condition 1 implies that each $u_{s_t}$ is nonconstant:

Lemma 9. For each $t = 0, \ldots, T-1$ and $s_t \in S_t$, there exist $\ell, m \in \Delta(Z)$ such that $u_{s_t}(\ell) \neq u_{s_t}(m)$.

Proof. Consider any $t = 0, \ldots, T-1$, $s_t \in S_t$ and separating history $h^t$ for $s_t$. By Condition 1, there exist $\ell, m, n \in \Delta(Z)$ such that $(\ell, n, \ldots, n) \not\sim_{h^t} (m, n, \ldots, n)$. Then Lemma 5 (iii) implies that $U_{s_t}((\ell, n, \ldots, n)) \neq U_{s_t}((m, n, \ldots, n))$, whence $u_{s_t}(\ell) \neq u_{s_t}(m)$, as required.

For any $t = 0, \ldots, T-1$ and $s_t \in S_t$ and $\ell \in \Delta(Z)$, let

$$E[u_{t+1}(\ell)|s_t] := \sum_{s_{t+1}} \mu^{s_t}_{t+1}(s_{t+1})\,u_{s_{t+1}}(\ell)$$

denote the expected period $t+1$ consumption utility of $\ell$ at state $s_t$. Stationary Consumption Preference implies that $u_{s_t}$ and $E[u_{t+1}|s_t]$ induce the same preference over $\Delta(Z)$:

Lemma 10. For all $\ell, m \in \Delta(Z)$, $t = 0, \ldots, T-1$, and $s_t \in S_t$, $E[u_{t+1}(\ell)|s_t] > E[u_{t+1}(m)|s_t] \iff u_{s_t}(\ell) > u_{s_t}(m)$.

Proof. Fix any $\ell, m, n \in \Delta(Z)$, $t = 0, \ldots, T-1$, $s_t \in S_t$ and separating history $h^t$ for $s_t$. Note that $u_{s_t}(\ell) > u_{s_t}(m)$ if and only if $U_{s_t}((\ell, n, \ldots, n)) > U_{s_t}((m, n, \ldots, n))$, which by Lemma 5 (iii) is in turn equivalent to $(\ell, n, \ldots, n) \succ_{h^t} (m, n, \ldots, n)$. Likewise, $E[u_{t+1}(\ell)|s_t] > E[u_{t+1}(m)|s_t]$ if and only if $U_{s_t}((n, \ell, n, \ldots, n)) > U_{s_t}((n, m, n, \ldots, n))$, which by Lemma 5 (iii) is equivalent to $(n, \ell, n, \ldots, n) \succ_{h^t} (n, m, n, \ldots, n)$. Thus, the claim is immediate from Axiom 8.

Given Lemma 10, Constant Intertemporal Tradeoff now allows us to obtain a time-invariant and non-random discount factor $\delta > 0$.

Lemma 11. There exists $\delta \in (0, 1)$ such that for all $t = 0, \ldots, T-1$ and $s_t \in S_t$, we have $u_{s_t} = \tfrac{1}{\delta} E[u_{t+1}|s_t]$.

Proof. Fix any $t, \hat t \le T-1$, $s_t \in S_t$, $\hat s_{\hat t} \in S_{\hat t}$, and separating histories $h^t$ for $s_t$ and $\hat h^{\hat t}$ for $\hat s_{\hat t}$. By Lemma 10, $u_{s_t}$ and $E[u_{t+1}|s_t]$ induce the same preference over $\Delta(Z)$, and moreover, $u_{s_t}$ is nonconstant by Lemma 9. Hence, there exist constants $\gamma_{s_t} > 0$, $\beta_{s_t} \in \mathbb{R}$ such that $u_{s_t} = \gamma_{s_t} E[u_{t+1}|s_t] + \beta_{s_t}$. Since we have normalized consumption utilities such that $\sum_{z \in Z} u_{s_{t'}}(z) = 0$ for any $t'$ and $s_{t'}$, we must have $\beta_{s_t} = 0$. Similarly, there exists $\hat\gamma_{\hat s_{\hat t}} > 0$ such that $u_{\hat s_{\hat t}} = \hat\gamma_{\hat s_{\hat t}} E[u_{\hat t+1}|\hat s_{\hat t}]$.

Let $\delta_{s_t} := \tfrac{1}{\gamma_{s_t}}$ and $\hat\delta_{\hat s_{\hat t}} := \tfrac{1}{\hat\gamma_{\hat s_{\hat t}}}$. We first show that $\delta_{s_t} = \hat\delta_{\hat s_{\hat t}}$. By Condition 1, there exist $h^t$-nonindifferent $\ell, m \in \Delta(Z)$ and $\hat h^{\hat t}$-nonindifferent $\hat\ell, \hat m \in \Delta(Z)$. For any $\alpha \in (0, 1)$ and $n \in \Delta(Z)$, Lemma 5 (iii) along with the above implies

$$\begin{aligned}
&(\alpha\ell + (1-\alpha)m,\ \alpha\ell + (1-\alpha)m,\ n, \ldots, n) \sim_{h^t} (\ell, m, n, \ldots, n) \\
&\iff (1 + \delta_{s_t})\big(\alpha u_{s_t}(\ell) + (1-\alpha) u_{s_t}(m)\big) = u_{s_t}(\ell) + \delta_{s_t} u_{s_t}(m) \\
&\iff \alpha = \frac{1}{1 + \delta_{s_t}},
\end{aligned}$$

where the final equivalence holds because $u_{s_t}(\ell) \neq u_{s_t}(m)$ (Lemma 9). Likewise, we have $(\alpha\hat\ell + (1-\alpha)\hat m, \alpha\hat\ell + (1-\alpha)\hat m, n, \ldots, n) \sim_{\hat h^{\hat t}} (\hat\ell, \hat m, n, \ldots, n)$ if and only if $\alpha = \tfrac{1}{1 + \hat\delta_{\hat s_{\hat t}}}$. Since by Axiom 9, we have $(\alpha\ell + (1-\alpha)m, \alpha\ell + (1-\alpha)m, n, \ldots, n) \sim_{h^t} (\ell, m, n, \ldots, n)$ if and only if $(\alpha\hat\ell + (1-\alpha)\hat m, \alpha\hat\ell + (1-\alpha)\hat m, n, \ldots, n) \sim_{\hat h^{\hat t}} (\hat\ell, \hat m, n, \ldots, n)$, this implies $\delta_{s_t} = \hat\delta_{\hat s_{\hat t}} =: \delta$.

 This completes the proof that ρ admits an S-based gradual learning representation.
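The identification step in the proof of Lemma 11, where the indifference weight pins down the discount factor through $\alpha = 1/(1+\delta)$, can be verified with exact arithmetic. The sketch below is illustrative only: the function names and the utility values are hypothetical, and the equation solved is the one displayed in the proof.

```python
from fractions import Fraction as F

def indifference_weight(u_l, u_m, delta):
    """Solve (1 + delta) * (a*u_l + (1 - a)*u_m) = u_l + delta*u_m for a."""
    if u_l == u_m:
        raise ValueError("l and m must not be indifferent")
    # Rearranging gives (1 + delta) * a * (u_l - u_m) = u_l - u_m.
    return (u_l - u_m) / ((1 + delta) * (u_l - u_m))

def implied_discount_factor(alpha):
    """Invert alpha = 1 / (1 + delta)."""
    return (1 - alpha) / alpha

delta = F(9, 10)
# Two unrelated (made-up) utility pairs yield the same weight, as Axiom 9 requires.
alpha_1 = indifference_weight(F(3), F(1), delta)
alpha_2 = indifference_weight(F(-2), F(5), delta)

# Check the defining equation directly for the first pair.
lhs = (1 + delta) * (alpha_1 * F(3) + (1 - alpha_1) * F(1))
rhs = F(3) + delta * F(1)
print(alpha_1, alpha_2, implied_discount_factor(alpha_1), lhs == rhs)
```

The weight does not depend on the utility levels of the two consumption lotteries, only on the discount factor, which is what allows the discount factor to be read off from choices.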

D.2

Proof of Theorem 3: Necessity

Suppose that ρ admits a gradual learning representation and Condition 1 holds. By Proposition 5, s

ρ admits an S-based gradual learning representation (St , {µt t−1 }st−1 ∈St−1 , {Ust , ust , τst }st ∈St )t=0,...,T with discount factor δ > 0. The same argument as in the proof of the necessity direction of Theorem 2 shows that for all t ≤ T − 1, ht and qt , rt ∈ ∆(Xt ), we have qt %ht rt if and only if Ust (qt ) ≥ Ust (rt ) for all st consistent with ht . Given this, Axiom 8 is equivalent to the statement that for all st , ust and E[ut+1 |st ] represent the same preference over ∆(Z). But this is immediate from the fact that for all st , we have ust = 1 δ E[ut+1 |st ].

Finally, to establish Axiom 9, consider any $t \le T-1$, $h^t$, and $h^t$-nonindifferent $\ell, m \in \Delta(Z)$. By the second paragraph, for any $\alpha \in [0,1]$ and $n \in \Delta(Z)$, we have $(\alpha\ell + (1-\alpha)m,\ \alpha\ell + (1-\alpha)m,\ n, \ldots, n) \sim_{h^t} (\ell, m, n, \ldots, n)$ if and only if $U_{s_t}((\alpha\ell + (1-\alpha)m,\ \alpha\ell + (1-\alpha)m,\ n, \ldots, n)) = U_{s_t}((\ell, m, n, \ldots, n))$ for all $s_t$ consistent with $h^t$. Since $u_{s_t} = \frac{1}{\delta}E[u_{t+1}|s_t]$, this is equivalent to

\[
(1+\delta)\bigl(\alpha u_{s_t}(\ell) + (1-\alpha)u_{s_t}(m)\bigr) = u_{s_t}(\ell) + \delta u_{s_t}(m) \quad \text{for all } s_t \text{ consistent with } h^t. \tag{24}
\]

But since $\ell, m$ are $h^t$-nonindifferent, there is some $s_t^*$ consistent with $h^t$ such that $u_{s_t^*}(\ell) \neq u_{s_t^*}(m)$, whence (24) is equivalent to $\alpha = \frac{1}{1+\delta}$. Since this holds for all $h^t$ and $h^t$-nonindifferent $\ell, m$, this establishes Axiom 9.
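The step from (24) to the value of the mixing weight can be checked numerically. The following is a minimal sketch, not part of the original proof: the function names `indifference_weight` and `gap`, the discount factor 9/10, and the felicity values are all hypothetical choices made for illustration. It uses exact rational arithmetic to confirm that the weight 1/(1 + δ) makes both sides of (24) coincide in every state, whatever the felicities of ℓ and m.

```python
from fractions import Fraction


def indifference_weight(delta):
    """Weight alpha = 1 / (1 + delta) that solves equation (24) when u(l) != u(m)."""
    return 1 / (1 + Fraction(delta))


def gap(alpha, delta, u_l, u_m):
    """Left-hand side minus right-hand side of equation (24) for a single state."""
    return (1 + delta) * (alpha * u_l + (1 - alpha) * u_m) - (u_l + delta * u_m)


# Hypothetical discount factor and two hypothetical states, each given as (u(l), u(m)).
delta = Fraction(9, 10)
alpha = indifference_weight(delta)
states = [(Fraction(3), Fraction(1)), (Fraction(-2), Fraction(5))]

for u_l, u_m in states:
    # The gap vanishes in every state, so the same alpha works for all of them.
    print(alpha, gap(alpha, delta, u_l, u_m))
```

Any other weight leaves a nonzero gap in a state where the two felicities differ, which is why a single nonindifferent state suffices to pin down the weight.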

E Additional Lemmas

Lemma 12. For all t = 0, . . . , T , Xt is a separable metric space, where XT := Z is endowed with the discrete metric and for all t ≤ T − 1, we recursively endow ∆(Xt+1 ) with the induced topology of weak convergence, At+1 := Kf (∆(Xt+1 )) with the induced Hausdorff topology, and Xt := Z × At+1 with the induced product topology.


Proof. By standard arguments, for any separable metric space (Y, d): (a) the set P(Y ) of Borel probability measures on Y endowed with the topology of weak convergence is a separable metric space metrized by the Prokhorov metric πd induced by d (e.g., Theorem 15.12 in Aliprantis and Border (2006)); (b) the set K(Y ) of nonempty compact subsets of Y endowed with the Hausdorff distance induced by d is a separable metric space (e.g., Khamsi and Kirk (2011) p. 40); (c) every dense subspace of Y is separable. We now prove the claim inductively, working backwards from period T . Since XT := Z is finite, the claim is immediate. Consider t < T and suppose that Xτ is a separable metric space for all τ ≥ t + 1. By (a) above, P(Xt+1 ) endowed with the induced Prokhorov metric is separable, and since ∆(Xt+1 ) is dense in P(Xt+1 ) (e.g., Theorem 15.10 in Aliprantis and Border (2006)) so is ∆(Xt+1 ) by (c). Then by (b) above, K(∆(Xt+1 )) endowed with the induced Hausdorff metric is separable, and since At+1 := Kf (∆(Xt+1 )) is dense in K(∆(Xt+1 )) (e.g., Lemma 0 in Gul and Pesendorfer (2001)), so is At+1 . Finally, Xt := Z × At+1 endowed with the product of the discrete metric and the Hausdorff metric is separable, as required.



Lemma 13. Let $Y$ be any set (possibly infinite) and let $\{U_s : s \in S\} \subseteq \mathbb{R}^Y$ be a collection of nonconstant vNM utility functions indexed by a finite set $S$ such that $U_s \not\approx U_{s'}$ for any distinct $s, s' \in S$. Then there is a collection of lotteries $\{p^s : s \in S\} \subseteq \Delta(Y)$ such that $U_s(p^s) > U_s(p^{s'})$ for any distinct $s, s' \in S$.

Proof. By the finiteness of $S$, there is a finite set $Y' \subseteq Y$ such that for each $s$ the restriction $U_s|_{Y'}$ to $Y'$ is nonconstant and for any distinct $s, s'$, $U_s|_{Y'} \not\approx U_{s'}|_{Y'}$ (that is, there exist $p, q \in \Delta(Y')$ such that $U_s(p) \ge U_s(q)$ and $U_{s'}(p) < U_{s'}(q)$). By Lemma 1 in Ahn and Sarver (2013), there is a collection of lotteries $\{p^s : s \in S\} \subseteq \Delta(Y')$ such that $U_s(p^s) = U_s|_{Y'}(p^s) > U_s|_{Y'}(p^{s'}) = U_s(p^{s'})$ for any distinct $s, s'$. □
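For a finite outcome set, separating lotteries of the kind asserted in Lemma 13 can be written down explicitly. The sketch below is one standard construction and is not claimed to be the one used in Ahn and Sarver (2013): each utility vector is centered and scaled to unit length, and the lottery for state s tilts the uniform lottery slightly in that direction. The helper names `normalize`, `separating_lotteries`, and `expected_utility`, and the step size 1/(2m), are assumptions made for illustration. Because distinct normalized directions have inner product strictly below one, each state strictly prefers its own lottery.

```python
import math


def normalize(u):
    """Center a utility vector and scale it to unit Euclidean length."""
    mean = sum(u) / len(u)
    centered = [x - mean for x in u]
    norm = math.sqrt(sum(x * x for x in centered))
    if norm == 0:
        raise ValueError("utility must be nonconstant")
    return [x / norm for x in centered]


def separating_lotteries(utilities):
    """Return one lottery per utility: the uniform lottery tilted toward that utility."""
    m = len(utilities[0])
    # Entries of a unit vector lie in [-1, 1], so each probability stays at least 1/(2m).
    eps = 1.0 / (2 * m)
    return [[1.0 / m + eps * d for d in normalize(u)] for u in utilities]


def expected_utility(u, p):
    """Expected utility of lottery p under utility vector u."""
    return sum(ui * pi for ui, pi in zip(u, p))


# Three hypothetical utilities over three outcomes, no two of them affinely equivalent.
utilities = [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0], [0.0, 2.0, 1.0]]
lotteries = separating_lotteries(utilities)
for s, u in enumerate(utilities):
    print(s, [round(expected_utility(u, p), 4) for p in lotteries])
```

In each printed row, the entry in position s is the largest, which is the separation property of the lemma restricted to this finite example.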

Lemma 14. Fix $t = 0, \ldots, T$. Suppose $(S_{t'}, \{\mu_{t'}^{s_{t'-1}}\}_{s_{t'-1}\in S_{t'-1}}, \{U_{s_{t'}}, \tau_{s_{t'}}\}_{s_{t'}\in S_{t'}})$ satisfy DREU1 and DREU2 for all $t' \le t$. Take any $h^{t-1} \in \mathcal{H}_{t-1}$ and let $S(h^{t-1}) \subseteq S_{t-1}$ denote the set of states consistent with $h^{t-1}$. Then for any $A_t \in \mathcal{A}_t$, the following are equivalent:

(i). $A_t \in \mathcal{A}_t^*(h^{t-1})$

(ii). For each $s_{t-1} \in S(h^{t-1})$ and $s_t \in \operatorname{supp}\mu_t^{s_{t-1}}$, $|M(A_t, U_{s_t})| = 1$.

Proof. (i) $\Longrightarrow$ (ii): We prove the contrapositive. Suppose that there is $s_{t-1} \in S(h^{t-1})$ and $s_t \in \operatorname{supp}\mu_t^{s_{t-1}}$ such that $|M(A_t, U_{s_t})| > 1$. Pick any $p_t \in M(A_t, U_{s_t})$ such that $\tau_{s_t}(p_t, A_t) > 0$. Since $U_{s_t}$ is nonconstant, we can find lotteries $\underline{r}, \overline{r} \in \Delta(X_t)$ such that $U_{s_t}(\underline{r}) < U_{s_t}(\overline{r})$. Fix any sequence $\alpha_n \in (0,1)$ with $\alpha_n \to 0$. Let $p_t^n := \alpha_n\underline{r} + (1-\alpha_n)p_t$. For every $q_t \in A_t \setminus \{p_t\}$, let $\underline{q}_t^n := \alpha_n\underline{r} + (1-\alpha_n)q_t$ and $\overline{q}_t^n := \alpha_n\overline{r} + (1-\alpha_n)q_t$. Let $\underline{B}_t^n := \{\underline{q}_t^n : q_t \in A_t \setminus \{p_t\}\}$, let $\overline{B}_t^n := \{\overline{q}_t^n : q_t \in A_t \setminus \{p_t\}\}$, and let $B_t^n := \underline{B}_t^n \cup \overline{B}_t^n$. Then $B_t^n \to^m A_t \setminus \{p_t\}$ and $p_t^n \to^m p_t$. Moreover, since $|M(A_t, U_{s_t})| > 1$, there exists $q_t \in A_t \setminus \{p_t\}$ such that $U_{s_t}(\alpha_n\overline{r} + (1-\alpha_n)q_t) > U_{s_t}(p_t^n)$ for all $n$, so that $\tau_{s_t}(p_t^n, B_t^n \cup \{p_t^n\}) = 0$. Furthermore, note that for all $s_t' \in S_t \setminus \{s_t\}$, we have $N(M(A_t, U_{s_t'}), p_t) = N(M(\underline{B}_t^n \cup \{p_t^n\}, U_{s_t'}), p_t^n) \supseteq N(M(B_t^n \cup \{p_t^n\}, U_{s_t'}), p_t^n)$, so that $\tau_{s_t'}(p_t, A_t) \ge \tau_{s_t'}(p_t^n, B_t^n \cup \{p_t^n\})$ for all $n$. Letting $\mathrm{pred}(s_{t-1}) = (s_0, \ldots, s_{t-2})$, Lemma 16 then implies that for all $n$,

\[
\begin{aligned}
&\rho_t(p_t; A_t | h^{t-1}) - \rho_t(p_t^n; B_t^n \cup \{p_t^n\} | h^{t-1}) \\
&\quad = \frac{\sum_{s_0', \ldots, s_t'} \Bigl(\prod_{k=0}^{t-1} \mu_k^{s_{k-1}'}(s_k')\tau_{s_k'}(p_k, A_k)\Bigr)\mu_t^{s_{t-1}'}(s_t')\bigl(\tau_{s_t'}(p_t, A_t) - \tau_{s_t'}(p_t^n, B_t^n \cup \{p_t^n\})\bigr)}{\sum_{s_0', \ldots, s_{t-1}'} \prod_{k=0}^{t-1} \mu_k^{s_{k-1}'}(s_k')\tau_{s_k'}(p_k, A_k)} \\
&\quad \ge \frac{\Bigl(\prod_{k=0}^{t-1} \mu_k^{s_{k-1}}(s_k)\tau_{s_k}(p_k, A_k)\Bigr)\mu_t^{s_{t-1}}(s_t)\tau_{s_t}(p_t, A_t)}{\sum_{s_0', \ldots, s_{t-1}'} \prod_{k=0}^{t-1} \mu_k^{s_{k-1}'}(s_k')\tau_{s_k'}(p_k, A_k)} > 0.
\end{aligned}
\]

Since the last line does not depend on $n$, this implies $\lim_{n\to\infty} \rho_t(p_t^n; B_t^n \cup \{p_t^n\} | h^{t-1}) < \rho_t(p_t; A_t | h^{t-1})$. By definition of $\mathcal{A}_t^*$, this means $A_t \notin \mathcal{A}_t^*(h^{t-1})$.

(ii) $\Longrightarrow$ (i): Suppose $A_t$ satisfies (ii). Consider any $p_t \in A_t$, $p_t^n \to^m p_t$, $B_t^n \to^m A_t \setminus \{p_t\}$. Consider any $s_{t-1} \in S(h^{t-1})$ and $s_t \in \operatorname{supp}\mu_t^{s_{t-1}}$. By (ii), we either have $M(A_t, U_{s_t}) = \{p_t\}$ or $p_t \notin M(A_t, U_{s_t})$. In the former case, $U_{s_t}(p_t) > U_{s_t}(q_t)$ for all $q_t \in A_t \setminus \{p_t\}$. But then, for all $n$ large enough, linearity of $U_{s_t}$ implies $U_{s_t}(p_t^n) > U_{s_t}(q_t^n)$ for all $q_t^n \in B_t^n$, i.e., $\tau_{s_t}(p_t, A_t) = \lim_n \tau_{s_t}(p_t^n, B_t^n \cup \{p_t^n\}) = 1$. In the latter case, $U_{s_t}(p_t) < U_{s_t}(q_t)$ for some $q_t \in A_t \setminus \{p_t\}$. But then, for all $n$ large enough, linearity of $U_{s_t}$ implies $U_{s_t}(p_t^n) < U_{s_t}(q_t^n)$ for all $q_t^n \in B_t^n$ such that $q_t^n \to^m q_t$, i.e., $\tau_{s_t}(p_t, A_t) = \lim_n \tau_{s_t}(p_t^n, B_t^n \cup \{p_t^n\}) = 0$.

Thus, for all $s_{t-1} \in S(h^{t-1})$ and $s_t \in \operatorname{supp}\mu_t^{s_{t-1}}$, we have $\tau_{s_t}(p_t, A_t) = \lim_n \tau_{s_t}(p_t^n, B_t^n \cup \{p_t^n\})$. Hence, the representation in Lemma 16 implies that for all $n$ sufficiently large, $\rho_t(p_t^n; B_t^n \cup \{p_t^n\} | h^{t-1}) = \rho_t(p_t; A_t | h^{t-1})$, as required.



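Lemma 15. Suppose that $\rho$ satisfies Axiom 2. Fix $t \ge 1$, $A_t \in \mathcal{A}_t$, $h^{t-1} = (A_0, p_0, \ldots, A_{t-1}, p_{t-1}) \in \mathcal{H}_{t-1}$, and $\lambda = (\lambda_n)_{n=0}^{t-1}$, $\hat\lambda = (\hat\lambda_n)_{n=0}^{t-1} \in (0,1]^t$. Suppose $d^{t-1} = (\{q_n\}, q_n)_{n=0}^{t-1}$, $\hat d^{t-1} = (\{\hat q_n\}, \hat q_n)_{n=0}^{t-1} \in \mathcal{D}_{t-1}$ satisfy $\lambda h^{t-1} + (1-\lambda)d^{t-1},\ \hat\lambda h^{t-1} + (1-\hat\lambda)\hat d^{t-1} \in \mathcal{H}_{t-1}(A_t)$, where $\lambda h^{t-1} + (1-\lambda)d^{t-1} := (\lambda_n A_n + (1-\lambda_n)\{q_n\},\ \lambda_n p_n + (1-\lambda_n)q_n)_{n=0}^{t-1}$ and $\hat\lambda h^{t-1} + (1-\hat\lambda)\hat d^{t-1}$ is defined analogously. Then

\[
\rho_t(\cdot; A_t | \lambda h^{t-1} + (1-\lambda)d^{t-1}) = \rho_t(\cdot; A_t | \hat\lambda h^{t-1} + (1-\hat\lambda)\hat d^{t-1}),
\]

and hence, $\rho_t^{h^{t-1}}(\cdot; A_t) = \rho_t(\cdot; A_t | \lambda h^{t-1} + (1-\lambda)d^{t-1})$.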

Proof. Let $k := \max\{n = 0, \ldots, t-1 : q_n \neq \hat q_n\}$ be the last entry at which $d^{t-1}$ and $\hat d^{t-1}$ differ, where we set $k = -1$ if $q_n = \hat q_n$ for all $n = 0, \ldots, t-1$. We prove the claim by induction on $k$.

Suppose first that $k = -1$, i.e., that $d^{t-1} = \hat d^{t-1}$. If $\lambda_0 > \hat\lambda_0$, then the 0-th entry of $\lambda h^{t-1} + (1-\lambda)d^{t-1}$ can be written as an appropriate mixture of the 0-th entry of $\hat\lambda h^{t-1} + (1-\hat\lambda)\hat d^{t-1}$ with $(A_0, p_0)$; if $\lambda_0 \le \hat\lambda_0$, then the 0-th entry of $\lambda h^{t-1} + (1-\lambda)d^{t-1}$ can be written as an appropriate mixture of the 0-th entry of $\hat\lambda h^{t-1} + (1-\hat\lambda)\hat d^{t-1}$ with $(\{q_0\}, q_0)$. In either case, Axiom 2 implies that $\rho_t(\cdot; A_t | \hat\lambda h^{t-1} + (1-\hat\lambda)\hat d^{t-1})$ is unaffected after replacing the 0-th entry of $\hat\lambda h^{t-1} + (1-\hat\lambda)\hat d^{t-1}$ with the 0-th entry of $\lambda h^{t-1} + (1-\lambda)d^{t-1}$. Continuing this way, we can successively apply Axiom 2 to replace each entry of $\hat\lambda h^{t-1} + (1-\hat\lambda)\hat d^{t-1}$ with the corresponding entry of $\lambda h^{t-1} + (1-\lambda)d^{t-1}$ without affecting $\rho_t$. This yields the desired conclusion.

Suppose the claim holds whenever $k \le m-1$ for some $0 \le m \le t-1$. We show that the claim continues to hold for $k = m$. Note first that we can assume that

\[
\begin{aligned}
&\tfrac12 h^{t-1} + \tfrac12 d^{t-1},\ \tfrac12 h^{t-1} + \tfrac12 \hat d^{t-1} \in \mathcal{H}_{t-1}(A_t); \\
&\tfrac23 B_m + \tfrac13\{\hat q_m\},\ \{\tfrac12 q_m + \tfrac12 \hat q_m\} \in \operatorname{supp} q_{m-1}^A; \\
&\tfrac23 \hat B_m + \tfrac13\{q_m\},\ \{\tfrac12 q_m + \tfrac12 \hat q_m\} \in \operatorname{supp} \hat q_{m-1}^A,
\end{aligned}
\tag{25}
\]

where $B_m := \tfrac12 A_m + \tfrac12\{q_m\}$, $\hat B_m := \tfrac12 A_m + \tfrac12\{\hat q_m\}$, $r_m := \tfrac12 p_m + \tfrac12 q_m$, and $\hat r_m := \tfrac12 p_m + \tfrac12 \hat q_m$.

Indeed, we can find a sequence of lotteries $(\ell_n)_{n=0}^{t-1}$ such that for all $n = 1, \ldots, t-1$

\[
\begin{aligned}
&\lambda_n A_n + (1-\lambda_n)\{o_n\},\ \tfrac12 A_n + \tfrac12\{o_n\},\ \hat\lambda_n A_n + (1-\hat\lambda_n)\{\hat o_n\},\ \tfrac12 A_n + \tfrac12\{\hat o_n\},\ \{o_n\} \in \operatorname{supp} \ell_{n-1}^A; \\
&\tfrac23 B_m + \tfrac13\{\hat o_m\},\ \tfrac23 \hat B_m + \tfrac13\{o_m\},\ \{\tfrac12 o_m + \tfrac12 \hat o_m\} \in \operatorname{supp} \ell_{m-1}^A,
\end{aligned}
\]

where $o_n := \tfrac12 q_n + \tfrac12 \ell_n$ and $\hat o_n := \tfrac12 \hat q_n + \tfrac12 \ell_n$. Letting $c^{t-1} := (\{o_n\}, o_n)_{n=0}^{t-1}$ and $\hat c^{t-1} := (\{\hat o_n\}, \hat o_n)_{n=0}^{t-1}$, we have that $c^{t-1}, \hat c^{t-1} \in \mathcal{D}_{t-1}$, $\lambda h^{t-1} + (1-\lambda)c^{t-1},\ \hat\lambda h^{t-1} + (1-\hat\lambda)\hat c^{t-1} \in \mathcal{H}_{t-1}(A_t)$, and the last entry at which $c^{t-1}$ and $\hat c^{t-1}$ differ is $m$. Moreover, repeated application of Axiom 2 implies

\[
\begin{aligned}
\rho_t(\cdot; A_t | \lambda h^{t-1} + (1-\lambda)d^{t-1}) &= \rho_t(\cdot; A_t | \lambda h^{t-1} + (1-\lambda)c^{t-1}); \\
\rho_t(\cdot; A_t | \hat\lambda h^{t-1} + (1-\hat\lambda)\hat d^{t-1}) &= \rho_t(\cdot; A_t | \hat\lambda h^{t-1} + (1-\hat\lambda)\hat c^{t-1}).
\end{aligned}
\]

Thus, we can replace $d^{t-1}$ and $\hat d^{t-1}$ with $c^{t-1}$ and $\hat c^{t-1}$ if need be and guarantee that (25) is satisfied.

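Given (25), $\tfrac12 h^{t-1} + \tfrac12 d^{t-1},\ \tfrac12 h^{t-1} + \tfrac12 \hat d^{t-1} \in \mathcal{H}_{t-1}(A_t)$, so the base case of the proof implies

\[
\begin{aligned}
\rho_t(\cdot; A_t | \lambda h^{t-1} + (1-\lambda)d^{t-1}) &= \rho_t(\cdot; A_t | \tfrac12 h^{t-1} + \tfrac12 d^{t-1}); \\
\rho_t(\cdot; A_t | \hat\lambda h^{t-1} + (1-\hat\lambda)\hat d^{t-1}) &= \rho_t(\cdot; A_t | \tfrac12 h^{t-1} + \tfrac12 \hat d^{t-1}).
\end{aligned}
\tag{26}
\]

Also, (25) guarantees that $\bigl((\tfrac12 h^{t-1} + \tfrac12 d^{t-1})_{-m}, (\tfrac23 B_m + \tfrac13\{\hat q_m\}, \tfrac23 r_m + \tfrac13 \hat q_m)\bigr)$ and $\bigl((\tfrac12 h^{t-1} + \tfrac12 \hat d^{t-1})_{-m}, (\tfrac23 \hat B_m + \tfrac13\{q_m\}, \tfrac23 \hat r_m + \tfrac13 q_m)\bigr)$ are well-defined histories in $\mathcal{H}_{t-1}(A_t)$. Thus, Axiom 2 implies

\[
\begin{aligned}
\rho_t(\cdot; A_t | \tfrac12 h^{t-1} + \tfrac12 d^{t-1}) &= \rho_t\bigl(\cdot; A_t | (\tfrac12 h^{t-1} + \tfrac12 d^{t-1})_{-m}, (\tfrac23 B_m + \tfrac13\{\hat q_m\}, \tfrac23 r_m + \tfrac13 \hat q_m)\bigr); \\
\rho_t(\cdot; A_t | \tfrac12 h^{t-1} + \tfrac12 \hat d^{t-1}) &= \rho_t\bigl(\cdot; A_t | (\tfrac12 h^{t-1} + \tfrac12 \hat d^{t-1})_{-m}, (\tfrac23 \hat B_m + \tfrac13\{q_m\}, \tfrac23 \hat r_m + \tfrac13 q_m)\bigr).
\end{aligned}
\tag{27}
\]

But note that

\[
\begin{aligned}
(\tfrac23 B_m + \tfrac13\{\hat q_m\},\ \tfrac23 r_m + \tfrac13 \hat q_m)
&= \bigl(\tfrac13 A_m + \tfrac23\{\tfrac12 q_m + \tfrac12 \hat q_m\},\ \tfrac13 p_m + \tfrac23(\tfrac12 q_m + \tfrac12 \hat q_m)\bigr) \\
&= (\tfrac23 \hat B_m + \tfrac13\{q_m\},\ \tfrac23 \hat r_m + \tfrac13 q_m).
\end{aligned}
\]

Thus, $\bigl((\tfrac12 h^{t-1} + \tfrac12 d^{t-1})_{-m}, (\tfrac23 B_m + \tfrac13\{\hat q_m\}, \tfrac23 r_m + \tfrac13 \hat q_m)\bigr)$ is an entry-wise mixture of $h^{t-1}$ with the degenerate history $e^{t-1} := \bigl((d^{t-1})_{-m}, (\{\tfrac12 q_m + \tfrac12 \hat q_m\}, \tfrac12 q_m + \tfrac12 \hat q_m)\bigr)$ and similarly $\bigl((\tfrac12 h^{t-1} + \tfrac12 \hat d^{t-1})_{-m}, (\tfrac23 \hat B_m + \tfrac13\{q_m\}, \tfrac23 \hat r_m + \tfrac13 q_m)\bigr)$ is an entry-wise mixture of $h^{t-1}$ with the degenerate history $\hat e^{t-1} := \bigl((\hat d^{t-1})_{-m}, (\{\tfrac12 q_m + \tfrac12 \hat q_m\}, \tfrac12 q_m + \tfrac12 \hat q_m)\bigr)$. But the last entry at which $e^{t-1}$ and $\hat e^{t-1}$ differ is strictly smaller than $m$. Hence, applying the inductive hypothesis, we obtain

\[
\rho_t\bigl(\cdot; A_t | (\tfrac12 h^{t-1} + \tfrac12 d^{t-1})_{-m}, (\tfrac23 B_m + \tfrac13\{\hat q_m\}, \tfrac23 r_m + \tfrac13 \hat q_m)\bigr)
= \rho_t\bigl(\cdot; A_t | (\tfrac12 h^{t-1} + \tfrac12 \hat d^{t-1})_{-m}, (\tfrac23 \hat B_m + \tfrac13\{q_m\}, \tfrac23 \hat r_m + \tfrac13 q_m)\bigr).
\tag{28}
\]

Combining (26), (27), and (28) yields

\[
\rho_t(\cdot; A_t | \lambda h^{t-1} + (1-\lambda)d^{t-1}) = \rho_t(\cdot; A_t | \hat\lambda h^{t-1} + (1-\hat\lambda)\hat d^{t-1}),
\]

as required.

Finally, let $\hat d^{t-1}$ and $\hat\lambda \in (0,1]$ be the choices from Definition 12 such that $\rho_t^{h^{t-1}}(\cdot; A_t) := \rho_t(\cdot; A_t | \hat\lambda h^{t-1} + (1-\hat\lambda)\hat d^{t-1})$. Then the above implies that $\rho_t^{h^{t-1}}(\cdot; A_t) = \rho_t(\cdot; A_t | \lambda h^{t-1} + (1-\lambda)d^{t-1})$, as claimed. □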

Lemma 16. Fix $t = 0, \ldots, T$. Suppose $(S_{t'}, \{\mu_{t'}^{s_{t'-1}}\}_{s_{t'-1}\in S_{t'-1}}, \{U_{s_{t'}}, \tau_{s_{t'}}\}_{s_{t'}\in S_{t'}})$ satisfy DREU1 and DREU2 for all $t' \le t$. Then the extended version of $\rho$ from Definition 3 also satisfies DREU2 for all $t' \le t$, i.e., for all $p_{t'}$, $A_{t'}$, and $h^{t'-1} = (A_0, p_0, \ldots, A_{t'-1}, p_{t'-1}) \in \mathcal{H}_{t'-1}$,$^{53}$ we have

\[
\rho_{t'}(p_{t'}, A_{t'} | h^{t'-1}) = \frac{\sum_{(s_0, \ldots, s_{t'}) \in S_0 \times \ldots \times S_{t'}} \prod_{k=0}^{t'} \mu_k^{s_{k-1}}(s_k)\tau_{s_k}(p_k, A_k)}{\sum_{(s_0, \ldots, s_{t'-1}) \in S_0 \times \ldots \times S_{t'-1}} \prod_{k=0}^{t'-1} \mu_k^{s_{k-1}}(s_k)\tau_{s_k}(p_k, A_k)}.
\]

Proof. If $h^{t'-1} \in \mathcal{H}_{t'-1}(A_{t'})$, the claim is immediate from DREU2. So suppose $h^{t'-1} \notin \mathcal{H}_{t'-1}(A_{t'})$. Let $\lambda \in (0,1)$ and $d^{t'-1} = (\{q_\ell\}, q_\ell)_{\ell=0}^{t'-1} \in \mathcal{D}_{t'-1}$ be the choices from Definition 3 such that $\lambda h^{t'-1} + (1-\lambda)d^{t'-1} \in \mathcal{H}_{t'-1}(A_{t'})$ and $\rho_{t'}(p_{t'}, A_{t'} | h^{t'-1}) := \rho_{t'}(p_{t'}, A_{t'} | \lambda h^{t'-1} + (1-\lambda)d^{t'-1})$. Note that for all $k \le t'$, $s_k \in S_k$, and $w \in \mathbb{R}^{X_k}$, we have $p_k \in M(M(A_k, U_{s_k}), w)$ if and only if $\lambda p_k + (1-\lambda)q_k \in M(M(\lambda A_k + (1-\lambda)\{q_k\}, U_{s_k}), w)$. Hence, $\tau_{s_k}(p_k, A_k) = \tau_{s_k}(\lambda p_k + (1-\lambda)q_k, \lambda A_k + (1-\lambda)\{q_k\})$. Thus, the claim follows from DREU2 applied to the history $\lambda h^{t'-1} + (1-\lambda)d^{t'-1} \in \mathcal{H}_{t'-1}(A_{t'})$. □
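The ratio in Lemma 16 is straightforward to evaluate once the state spaces are finite and the tie-breaking probabilities along the history are known. The following is a minimal sketch under stated assumptions: the data layout (`states`, `mu0`, `mu`, `tau`) and the function names are hypothetical, and each `tau[k][s]` is taken as a given number standing for the probability that state s selects the observed (or, in the last period, the candidate) lottery from the period-k menu. The code sums path weights over all state sequences for the numerator and over sequences one period shorter for the denominator.

```python
from itertools import product


def path_weight(path, mu0, mu, tau):
    """Product over k of mu_k^{s_{k-1}}(s_k) * tau_{s_k}(p_k, A_k) along one state path."""
    weight = 1.0
    for k, s in enumerate(path):
        transition = mu0[s] if k == 0 else mu[k][path[k - 1]][s]
        weight *= transition * tau[k][s]
    return weight


def conditional_choice_prob(t, states, mu0, mu, tau):
    """Choice probability in period t given the history, as a ratio of path sums."""
    numerator = sum(path_weight(p, mu0, mu, tau) for p in product(*states[: t + 1]))
    # For t = 0 the product over no periods yields one empty path of weight 1.
    denominator = sum(path_weight(p, mu0, mu, tau) for p in product(*states[:t]))
    return numerator / denominator


# Hypothetical two-period example: tau[0] describes the observed period-0 choice,
# tau[1] the candidate period-1 choice.
states = [["a", "b"], ["x", "y"]]
mu0 = {"a": 0.5, "b": 0.5}
mu = [None, {"a": {"x": 0.8, "y": 0.2}, "b": {"x": 0.1, "y": 0.9}}]
tau = [{"a": 1.0, "b": 0.5}, {"x": 1.0, "y": 0.0}]
print(conditional_choice_prob(1, states, mu0, mu, tau))
```

In this example the period-0 choice is more likely under state a than under state b, so conditioning on it shifts weight toward a and raises the probability of the period-1 choice favored after a; this is the history dependence the representation is designed to capture.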

Lemma 17. Fix $t = 0, \ldots, T$. Suppose $(S_{t'}, \{\mu_{t'}^{s_{t'-1}}\}_{s_{t'-1}\in S_{t'-1}}, \{U_{s_{t'}}, \tau_{s_{t'}}\}_{s_{t'}\in S_{t'}})$ satisfy DREU1 and DREU2 for all $t' \le t$. Fix any $s_{t-1} \in S_{t-1}$, separating history $h^{t-1}$ for $s_{t-1}$, and $A_t \in \mathcal{A}_t$. Then there exists a sequence $A_t^n \to^m A_t$ such that $A_t^n \in \mathcal{A}_t^*(h^{t-1})$ for all $n$. Moreover, given any $s_t^* \in \operatorname{supp}\mu_t^{s_{t-1}}$ and $p_t^* \in M(A_t, U_{s_t^*})$, we can ensure in this construction that there is $p_t^n(s_t^*) \in A_t^n$ with $p_t^n(s_t^*) \to^m p_t^*$ such that $U_{s_t}(A_t^n, p_t^n(s_t^*)) = \{U_{s_t^*}\}$ for all $n$.

Proof. Let $S_t(s_{t-1}) := \operatorname{supp}\mu_t^{s_{t-1}}$. By DREU1, we can find a finite $Y_t \subseteq X_t$ such that (i) for any $s_t \in S_t(s_{t-1})$, $U_{s_t}$ is non-constant over $Y_t$; (ii) for any distinct $s_t, s_t' \in S_t(s_{t-1})$, $U_{s_t} \not\approx U_{s_t'}$ over $Y_t$; and (iii) $\bigcup_{p_t \in A_t} \operatorname{supp} p_t \subseteq Y_t$. By (i) and (ii) and Lemma 13, we can find a menu $D_t := \{q_t^{s_t} : s_t \in S_t(s_{t-1})\} \subseteq \Delta(Y_t)$ such that $M(D_t, U_{s_t}) = \{q_t^{s_t}\}$ for all $s_t \in S_t(s_{t-1})$. Define $b_t := \sum_{y \in Y_t} \frac{1}{|Y_t|}\delta_y \in \Delta(Y_t)$. For each $s_t \in S_t(s_{t-1})$, pick $z^{s_t} \in \operatorname{argmax}_{y \in Y_t} U_{s_t}$ and let $g_t^{s_t} := \delta_{z^{s_t}}$. By (i), we have $U_{s_t}(g_t^{s_t}) > U_{s_t}(b_t)$ for all $s_t \in S_t(s_{t-1})$. Hence, there exists $\alpha \in (0,1)$ small enough such that for all $s_t \in S_t(s_{t-1})$, we have $U_{s_t}(\hat q_t^{s_t}) > U_{s_t}(b_t)$, where $\hat q_t^{s_t} := \alpha q_t^{s_t} + (1-\alpha)g_t^{s_t}$. Note that setting $\hat D_t := \{\hat q_t^{s_t} : s_t \in S_t(s_{t-1})\}$, we still have $M(\hat D_t, U_{s_t}) = \{\hat q_t^{s_t}\}$.

For each $s_t \in S_t(s_{t-1})$, pick some $p_t(s_t) \in M(A_t, U_{s_t})$. For the "moreover" part, we can ensure that $p_t(s_t^*) = p_t^*$. Fix any sequence $(\varepsilon_n)$ from $(0,1)$ such that $\varepsilon_n \to 0$. For each $n$ and $s_t \in S_t(s_{t-1})$, let $p_t^n(s_t) := (1-\varepsilon_n)p_t(s_t) + \varepsilon_n\hat q_t^{s_t}$. And for each $r_t \in A_t$, let $r_t^n := (1-\varepsilon_n)r_t + \varepsilon_n b_t$. Finally, let $A_t^n := \{p_t^n(s_t) : s_t \in S_t(s_{t-1})\} \cup \{r_t^n : r_t \in A_t\}$. Note that $A_t^n \to^m A_t$. Moreover, by construction, for all $s_t \in S_t(s_{t-1})$ and $n$, we have $M(A_t^n, U_{s_t}) = \{p_t^n(s_t)\}$: Indeed, $U_{s_t}(p_t^n(s_t)) > U_{s_t}(r_t^n)$ for all $r_t \in A_t$ since $U_{s_t}(p_t(s_t)) \ge U_{s_t}(r_t)$ and $U_{s_t}(\hat q_t^{s_t}) > U_{s_t}(b_t)$; and $U_{s_t}(p_t^n(s_t)) > U_{s_t}(p_t^n(s_t'))$ for all $s_t' \neq s_t$, since $U_{s_t}(p_t(s_t)) \ge U_{s_t}(p_t(s_t'))$ and $U_{s_t}(\hat q_t^{s_t}) > U_{s_t}(\hat q_t^{s_t'})$. Since $s_{t-1}$ is the only state consistent with $h^{t-1}$, Lemma 14 implies that $A_t^n \in \mathcal{A}_t^*(h^{t-1})$, as required.

Finally, for the "moreover" part, note that we ensured that $p_t(s_t^*) = p_t^*$. Hence $p_t^n(s_t^*)$ constructed above has the desired property that $p_t^n(s_t^*) \to^m p_t^*$ and $U_{s_t}(A_t^n, p_t^n(s_t^*)) = \{U_{s_t^*}\}$ for all $n$.

53 For $t' = 0$, we abuse notation by letting $\rho_{t'}(\cdot | h^{t'-1})$ denote $\rho_0(\cdot)$ for all $h^{t'-1}$.
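The tie-breaking perturbation in the proof of Lemma 17 can be traced on a small finite example. This is a sketch under stated assumptions rather than the authors' construction verbatim: outcomes are indexed 0, ..., m-1, utilities are plain vectors, the separating lotteries come from a simple "tilt the uniform lottery" construction (an illustrative stand-in for Lemma 13), and all function names are hypothetical. Given a menu in which states are indifferent between several items, `perturb_menu` returns the tilted lotteries, one per state, together with the perturbed menu in which each state has a unique maximizer.

```python
import math


def eu(u, p):
    """Expected utility of lottery p under utility vector u."""
    return sum(ui * pi for ui, pi in zip(u, p))


def mix(w, p, q):
    """Mixture w * p + (1 - w) * q of two lotteries."""
    return [w * pi + (1 - w) * qi for pi, qi in zip(p, q)]


def separating(utilities):
    """Illustrative separating lotteries: uniform lottery tilted toward each utility."""
    m = len(utilities[0])
    out = []
    for u in utilities:
        mean = sum(u) / m
        centered = [x - mean for x in u]
        norm = math.sqrt(sum(x * x for x in centered))
        out.append([1 / m + x / (2 * m * norm) for x in centered])
    return out


def perturb_menu(menu, utilities, eps):
    """Return (tilted lotteries, perturbed menu) following the steps of the proof."""
    m = len(utilities[0])
    b = [1 / m] * m  # uniform lottery b_t
    q = separating(utilities)
    # g^s: degenerate lottery on a best outcome for state s.
    g = [[1.0 if y == max(range(m), key=lambda z: u[z]) else 0.0 for y in range(m)]
         for u in utilities]
    alpha = 0.5  # common weight; shrink until every state strictly prefers q_hat^s to b
    while any(eu(u, mix(alpha, qs, gs)) <= eu(u, b) for u, qs, gs in zip(utilities, q, g)):
        alpha /= 2
    q_hat = [mix(alpha, qs, gs) for qs, gs in zip(q, g)]
    selected = [max(menu, key=lambda p: eu(u, p)) for u in utilities]  # p_t(s_t)
    tilted = [mix(1 - eps, p, qh) for p, qh in zip(selected, q_hat)]   # p_t^n(s_t)
    shrunk = [mix(1 - eps, r, b) for r in menu]                        # r_t^n
    return tilted, tilted + shrunk


# Hypothetical menu on which both states are indifferent between the two items.
utilities = [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0]]
menu = [[0.5, 0.0, 0.5], [0.0, 1.0, 0.0]]
tilted, menu_n = perturb_menu(menu, utilities, eps=0.01)
for s, u in enumerate(utilities):
    print(s, [round(eu(u, p), 5) for p in menu_n])
```

As the perturbation size shrinks, every lottery in the perturbed menu approaches an element of the original menu, while ties remain broken at each step; this mirrors the role of the sequence ε_n in the proof.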



References

Abdulkadiroglu, A., J. D. Angrist, Y. Narita, and P. A. Pathak (forthcoming): “Research design meets market design: Using centralized assignment for impact evaluation,” Econometrica.
Aghion, P., P. Bolton, C. Harris, and B. Jullien (1991): “Optimal learning by experimentation,” The Review of Economic Studies, 58(4), 621–654.
Aguirregabiria, V., and P. Mira (2010): “Dynamic discrete choice structural models: A survey,” Journal of Econometrics, 156(1), 38–67.
Ahn, D. S., and T. Sarver (2013): “Preference for flexibility and random choice,” Econometrica, 81(1), 341–361.
Aliprantis, C. D., and K. C. Border (2006): Infinite Dimensional Analysis: A Hitchhiker’s Guide. Springer, Berlin; London.
Angrist, J., P. Hull, P. A. Pathak, and C. Walters (forthcoming): “Leveraging lotteries for school value-added: Testing and estimation,” Quarterly Journal of Economics.
Anscombe, F. J., and R. J. Aumann (1963): “A definition of subjective probability,” The Annals of Mathematical Statistics, 34(1), 199–205.
Apesteguia, J., M. Ballester, and J. Lu (2017): “Single-Crossing Random Utility Models,” Econometrica.
Apesteguia, J., and M. A. Ballester (2017): “Monotone Stochastic Choice Models: The Case of Risk and Time Preferences,” Journal of Political Economy.
Arcidiacono, P., and P. B. Ellickson (2011): “Practical methods for estimation of dynamic discrete choice models,” Annu. Rev. Econ., 3(1), 363–394.
Augenblick, N., M. Niederle, and C. Sprenger (2015): “Working over time: Dynamic inconsistency in real effort tasks,” The Quarterly Journal of Economics, p. qjv020.
Barberá, S., and P. Pattanaik (1986): “Falmagne and the rationalizability of stochastic choices in terms of random orderings,” Econometrica, pp. 707–715.
Becker, G. S., and C. B. Mulligan (1997): “The endogenous determination of time preference,” The Quarterly Journal of Economics, 112(3), 729–758.
Becker, G. S., and K. M. Murphy (1988): “A theory of rational addiction,” Journal of Political Economy, 96(4), 675–700.

Berry, S., J. Levinsohn, and A. Pakes (1995): “Automobile prices in market equilibrium,” Econometrica, pp. 841–890.
Block, D., and J. Marschak (1960): “Random Orderings And Stochastic Theories of Responses,” in Contributions To Probability And Statistics, ed. by I. O. et al. Stanford: Stanford University Press.
Cooke, K. (2016): “Preference discovery and experimentation,” Theoretical Economics.
De Oliveira, H., T. Denti, M. Mihm, and M. K. Ozbek (2016): “Rationally inattentive preferences and hidden information costs,” Theoretical Economics, pp. 2–14.
Dekel, E., B. Lipman, and A. Rustichini (2001): “Representing preferences with a unique subjective state space,” Econometrica, 69(4), 891–934.
Dekel, E., B. L. Lipman, A. Rustichini, and T. Sarver (2007): “Representing Preferences with a Unique Subjective State Space: A Corrigendum,” Econometrica, 75(2), 591–600.
Deming, D. J. (2011): “Better schools, less crime?,” The Quarterly Journal of Economics, p. qjr036.
Deming, D. J., J. S. Hastings, T. J. Kane, and D. O. Staiger (2014): “School choice, school quality, and postsecondary attainment,” The American Economic Review, 104(3), 991–1013.
Dillenberger, D., J. S. Lleras, P. Sadowski, and N. Takeoka (2014): “A theory of subjective learning,” Journal of Economic Theory, 153, 287–312.
Easley, D., and N. M. Kiefer (1988): “Controlling a stochastic process with unknown parameters,” Econometrica: Journal of the Econometric Society, pp. 1045–1064.
Epstein, L. G. (1999): “A definition of uncertainty aversion,” The Review of Economic Studies, 66(3), 579–608.
Erdem, T., and M. P. Keane (1996): “Decision-making under uncertainty: Capturing dynamic brand choice processes in turbulent consumer goods markets,” Marketing Science, 15(1), 1–20.
Ergin, H., and T. Sarver (2010): “A unique costly contemplation representation,” Econometrica, 78(4), 1285–1339.
Falmagne, J. (1978): “A representation theorem for finite random scale systems,” Journal of Mathematical Psychology, 18(1), 52–72.
Fishburn, P. (1970): Utility theory for decision making.
Fishburn, P. C. (1984): “On Harsanyi’s Utilitarian Cardinal Welfare Theorem,” Theory and Decision, 17, 21–28.
Fudenberg, D., P. Strack, and T. Strzalecki (2016): “Speed, Accuracy, and the Optimal Timing of Choices.”
Fudenberg, D., and T. Strzalecki (2015): “Dynamic logit with choice aversion,” Econometrica, 83(2), 651–691.
Gilboa, I. (1987): “Expected utility with purely subjective non-additive probabilities,” Journal of Mathematical Economics, 16(1), 65–88.
Gilboa, I., and A. Pazgal (2001): “Cumulative discrete choice,” Marketing Letters, 12(2), 119–130.
Gilboa, I., A. Postlewaite, and L. Samuelson (2016): “Memorable consumption,” Journal of Economic Theory, 165, 414–455.
Gittins, J. C., and D. M. Jones (1972): A dynamic allocation index for the sequential design of experiments. University of Cambridge, Department of Engineering.
Gowrisankaran, G., and M. Rysman (2012): “Dynamics of Consumer Demand for New Durable Goods,” mimeo.
Gul, F., P. Natenzon, and W. Pesendorfer (2014): “Random Choice as Behavioral Optimization,” Econometrica, 82(5), 1873–1912.
Gul, F., and W. Pesendorfer (2001): “Temptation and self-control,” Econometrica, 69(6), 1403–1435.
Gul, F., and W. Pesendorfer (2004): “Self-control and the theory of consumption,” Econometrica, 72(1), 119–158.
Gul, F., and W. Pesendorfer (2006): “Random expected utility,” Econometrica, 74(1), 121–146.
Gul, F., and W. Pesendorfer (2007): “Harmful addiction,” The Review of Economic Studies, 74(1), 147–172.
Hausman, J., and D. McFadden (1984): “Specification tests for the multinomial logit model,” Econometrica: Journal of the Econometric Society, pp. 1219–1240.
Heckman, J. J. (1981): “Heterogeneity and state dependence,” in Studies in Labor Markets, pp. 91–140. University of Chicago Press.
Hendel, I., and A. Nevo (2006): “Measuring the implications of sales and consumer inventory behavior,” Econometrica, 74(6), 1637–1673.
Higashi, Y., K. Hyogo, and N. Takeoka (2014): “Stochastic endogenous time preference,” Journal of Mathematical Economics, 51, 77–92.
Hu, Y., and M. Shum (2012): “Nonparametric identification of dynamic models with unobserved state variables,” Journal of Econometrics, 171(1), 32–44.
Hyogo, K. (2007): “A subjective model of experimentation,” Journal of Economic Theory, 133(1), 316–330.
Kasahara, H., and K. Shimotsu (2009): “Nonparametric identification of finite mixture models of dynamic discrete choices,” Econometrica, 77(1), 135–175.
Ke, S. (2016): “A Dynamic Model of Mistakes,” working paper.
Kennan, J., and J. Walker (2011): “The effect of expected income on individual migration decisions,” Econometrica, 79(1), 211–251.
Khamsi, M. A., and W. A. Kirk (2011): An introduction to metric spaces and fixed point theory, vol. 53. John Wiley & Sons.
Kitamura, Y., and J. Stoye (2013): “Nonparametric analysis of random utility models: testing.”
Kreps, D. (1979): “A representation theorem for ‘preference for flexibility’,” Econometrica, pp. 565–577.
Kreps, D., and E. Porteus (1978): “Temporal Resolution of Uncertainty and Dynamic Choice Theory,” Econometrica, 46(1), 185–200.
Krishna, R. V., and P. Sadowski (2014): “Dynamic preference for flexibility,” Econometrica, 82(2), 655–703.
Krishna, V., and P. Sadowski (2016): “Randomly Evolving Tastes and Delayed Commitment,” mimeo.
Lu, J. (2016): “Random choice and private information,” Econometrica, 84(6), 1983–2027.
Lu, J., and K. Saito (2016): “Random intertemporal choice,” Discussion paper, mimeo.
Luce, D. (1959): Individual choice behavior. John Wiley.
Magnac, T., and D. Thesmar (2002): “Identifying dynamic discrete decision processes,” Econometrica, 70(2), 801–816.
Manski, C. F. (1993): “Dynamic choice in social settings: Learning from the experiences of others,” Journal of Econometrics, 58(1-2), 121–136.
McAlister, L. (1982): “A dynamic attribute satiation model of variety-seeking behavior,” Journal of Consumer Research, 9(2), 141–150.
McFadden, D., and M. Richter (1990): “Stochastic rationality and revealed stochastic preference,” Preferences, Uncertainty, and Optimality, Essays in Honor of Leo Hurwicz, pp. 161–186.
Miller, R. A. (1984): “Job matching and occupational choice,” The Journal of Political Economy, pp. 1086–1120.
Natenzon, P. (2016): “Random choice and learning.”
Norets, A. (2009): “Inference in dynamic discrete choice models with serially correlated unobserved state variables,” Econometrica, 77(5), 1665–1682.
Norets, A., and X. Tang (2013): “Semiparametric Inference in dynamic binary choice models,” The Review of Economic Studies, p. rdt050.
Pakes, A. (1986): “Patents as options: Some estimates of the value of holding European patent stocks,” Econometrica, 54, 755–784.
Piermont, E., N. Takeoka, and R. Teper (2016): “Learning the Krepsian state: Exploration through consumption,” Games and Economic Behavior, 100, 69–94.
Rao, K. P. S. B., and M. B. Rao (2012): Theory of Charges: A Study of Finitely Additive Measures. Academic Press.
Robbins, H. (1952): “Some aspects of the sequential design of experiments,” Bulletin of the American Mathematical Society, 58(5), 527–535.
Rozen, K. (2010): “Foundations of intrinsic habit formation,” Econometrica, 78(4), 1341–1373.

Rust, J. (1987): “Optimal replacement of GMC bus engines: An empirical model of Harold Zurcher,” Econometrica, pp. 999–1033.
Rust, J. (1989): “A Dynamic Programming Model of Retirement Behavior,” in The Economics of Aging, ed. by D. Wise, pp. 359–398. University of Chicago Press: Chicago.
Rust, J. (1994): “Structural estimation of Markov decision processes,” Handbook of Econometrics, 4, 3081–3143.
Rustichini, A., and P. Siconolfi (2014): “Dynamic theory of preferences: Habit formation and taste for variety,” Journal of Mathematical Economics, 55, 55–68.
Savage, L. J. (1972): The foundations of statistics. Courier Corporation.
Sweeting, A. (2011): “Dynamic product positioning in differentiated product markets: The effect of fees for musical performance rights on the commercial radio industry,” mimeo.
Toussaert, S. (2016): “Eliciting temptation and self-control through menu choices: a lab experiment on curiosity,” mimeo.
Uzawa, H. (1968): “Time preference, the consumption function, and optimum asset holdings,” Value, capital and growth: papers in honor of Sir John Hicks. The University of Edinburgh Press, Edinburgh, pp. 485–504.
Wolpin, K. I. (1984): “An estimable dynamic stochastic model of fertility and child mortality,” The Journal of Political Economy, pp. 852–874.

Supplementary Appendix to Dynamic Random Utility

Mira Frick, Ryota Iijima, and Tomasz Strzalecki

F Proof of Theorem 0

F.1 Preliminaries

In this section we prove Theorem 0, which extends the characterizations of REU representations in Gul and Pesendorfer (2006) and Ahn and Sarver (2013) to allow for an arbitrary separable metric space $X$ of outcomes. Refer to Section 2.1 of the main text for all relevant notation and terminology. Throughout, we fix some $y^* \in X$ and let $\tilde{\mathbb{R}}^X = \{0\} \times \mathbb{R}^{X \setminus \{y^*\}}$ denote the set of utility functions $u$ in $\mathbb{R}^X$ that are normalized by $u(y^*) = 0$. We first define the static analog of S-based representations introduced in Appendix A:

Definition 13. An S-based REU representation of $\rho$ is a tuple $(S, \mu, \{U_s, \tau_s\}_{s\in S})$ such that

(i). $S$ is a finite state space and $\mu$ is a probability measure on $S$ such that $\operatorname{supp}(\mu) = S$

(ii). for each $s \in S$, the utility $U_s \in \tilde{\mathbb{R}}^X$ is nonconstant and $U_s \not\approx U_{s'}$ for $s \neq s'$

(iii). for each $s \in S$, the tie-breaking rule $\tau_s$ is a proper finitely-additive probability measure on $\tilde{\mathbb{R}}^X$ endowed with the Borel σ-algebra

(iv). for all $p \in \Delta(X)$ and $A \in \mathcal{A}$,

\[
\rho(p; A) = \sum_{s \in S} \mu(s)\tau_s(p, A),
\]

where $\tau_s(p, A) := \tau_s(\{u \in \tilde{\mathbb{R}}^X : p \in M(M(A, U_s), u)\})$.

Analogous arguments as for the DREU part of Proposition 5 yield the equivalence of S-based REU representations and static REU representations.

Proposition 6. Let $\rho$ be a stochastic choice rule on $\mathcal{A}$. Then $\rho$ admits an REU representation if and only if it admits an S-based REU representation.

Proof. Analogous to Proposition 5 (i). □
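On menus where no state faces a tie, the formula in part (iv) of Definition 13 reduces to summing the weights of the states whose unique maximizer is the lottery in question, because the tie-breaking probability is then 1 for that lottery and 0 for the rest. The sketch below illustrates only this tie-free case; the function name `reu_choice_probs`, the data layout, and the decision to raise an error when a tie occurs (instead of modeling a tie-breaking rule) are assumptions made for illustration.

```python
from fractions import Fraction


def reu_choice_probs(menu, mu, utilities):
    """Choice probabilities rho(p; A) on a menu where every state has a unique maximizer.

    menu      -- list of lotteries, each a list of probabilities over outcomes
    mu        -- dict mapping each state to its probability
    utilities -- dict mapping each state to a utility vector over outcomes
    """
    probs = [0] * len(menu)
    for s, weight in mu.items():
        values = [sum(ui * pi for ui, pi in zip(utilities[s], p)) for p in menu]
        best = max(values)
        winners = [i for i, v in enumerate(values) if v == best]
        if len(winners) != 1:
            # Ties would require the tie-breaking rule tau_s, which this sketch omits.
            raise ValueError(f"state {s!r} has a tie on this menu")
        probs[winners[0]] += weight
    return probs


# Hypothetical two-state example over three outcomes (outcome 0 plays the role of y*).
mu = {"s1": Fraction(1, 3), "s2": Fraction(2, 3)}
utilities = {"s1": [0, 1, 2], "s2": [0, 2, 1]}
menu = [[0, 1, 0], [0, 0, 1]]
print(reu_choice_probs(menu, mu, utilities))
```

Here state s1 picks the second lottery and state s2 the first, so the choice frequencies simply reproduce the state probabilities.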



Thus, Theorem 0 is equivalent to the following result, which we prove throughout the rest of this section.

Theorem 4. The stochastic choice rule $\rho$ on $\mathcal{A}$ admits an S-based REU representation $(S, \mu, \{U_s, \tau_s\}_{s\in S})$ if and only if $\rho$ satisfies Axiom 0.

Remark 1. Note that because $X$ may be infinite, continuity of each $U_s$ in the representation is not directly implied by linearity. However, in constructing an evolving utility representation in Section C, we can make use of Axiom 6 (iii) (Continuity) to show that each $U_{s_t}$ is continuous (see Lemma 6). In the static setting, an alternative approach is to impose the following axiom. As in Section 3.3, let $\mathcal{A}^*$ denote the collection of menus without ties, i.e., the set of all $A \in \mathcal{A}$ such that for any $p \in A$ and any sequences $p^n \to^m p$ and $B^n \to^m A \setminus \{p\}$, we have $\lim_{n\to\infty} \rho(p^n; B^n \cup \{p^n\}) = \rho(p; A)$.

Axiom 11 (Continuity). $\rho : \mathcal{A}^* \to \Delta(\Delta(X))$ is continuous.

Here $\mathcal{A}$ is endowed with the Hausdorff topology induced by the Prokhorov metric $\pi$ on $\Delta(X)$, and $\mathcal{A}^*$ with the relative topology. We have the following proposition. Since we do not make use of this result anywhere in the paper, the proof is omitted but available on request.

Proposition 7. Suppose $\rho$ admits an S-based REU representation $(S, \mu, \{U_s, \tau_s\}_{s\in S})$. Then each utility $U_s$ is continuous if and only if $\rho$ additionally satisfies Axiom 11. ▲

Additional notation: For any $Y \subseteq X$, let $\mathcal{A}(Y) := \{A \in \mathcal{A} : \forall p \in A, \operatorname{supp}(p) \subseteq Y\} \subseteq \mathcal{A}$ denote the space of all menus consisting only of lotteries with support in $Y$. Note that for each $A \in \mathcal{A}$, there is a finite $Y$ such that $A \in \mathcal{A}(Y)$. We denote by $\rho^Y$ the restriction of $\rho$ to $\mathcal{A}(Y)$, which can be seen as a map from $\mathcal{A}(Y)$ to $\Delta(\Delta(Y))$. If $y^* \in Y$, we write $\tilde{\mathbb{R}}^Y := \{0\} \times \mathbb{R}^{Y \setminus \{y^*\}}$. For any $A \in \mathcal{A}(Y)$ and $p \in \Delta(X)$, let $N_Y(A, p) := \{u \in \tilde{\mathbb{R}}^Y : p \in M(A, u)\}$ and let $N_Y^+(A, p) := \{u \in \tilde{\mathbb{R}}^Y : \{p\} = M(A, u)\}$. Note that $N_Y(\{p\}, p) = N_Y^+(\{p\}, p) = \tilde{\mathbb{R}}^Y$ and that $N_Y(A, p) = N_Y^+(A, p) = \emptyset$ if $p \notin A$. Let

\[
\mathcal{N}(Y) := \{N_Y(A, p) : A \in \mathcal{A}(Y) \text{ and } p \in \Delta(X)\}, \qquad
\mathcal{N}^+(Y) := \{N_Y^+(A, p) : A \in \mathcal{A}(Y) \text{ and } p \in \Delta(X)\}.
\]

We will consider both the Borel σ-algebra on $\tilde{\mathbb{R}}^Y$ and its subalgebra $\mathcal{F}(Y)$ that is generated by $\mathcal{N}(Y) \cup \mathcal{N}^+(Y)$. A finitely-additive probability measure $\nu^Y$ on either of these algebras is called proper if $\nu^Y(N_Y(A, p)) = \nu^Y(N_Y^+(A, p))$ for any $A \in \mathcal{A}(Y)$ and $p \in \Delta(X)$. Whenever $Y = X$, we omit $Y$ from the description of $N_Y(A, p)$, $N_Y^+(A, p)$, $\mathcal{N}(Y)$, $\mathcal{N}^+(Y)$, and $\mathcal{F}(Y)$.
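For a finite outcome set, membership of a utility vector in the sets N(A, p) and N⁺(A, p) is a direct check on the set of maximizers. The sketch below is illustrative only: the function names and the encoding of a menu as a list of probability vectors (with the lottery p identified by its index in the menu) are assumptions, and exact integer arithmetic is used so that ties are detected reliably.

```python
def maximizers(menu, u):
    """Indices of the lotteries in `menu` that maximize expected utility under u."""
    values = [sum(ui * pi for ui, pi in zip(u, p)) for p in menu]
    best = max(values)
    return [i for i, v in enumerate(values) if v == best]


def in_N(menu, i, u):
    """True if u lies in N(A, p): lottery i is among the maximizers of u on the menu."""
    return i in maximizers(menu, u)


def in_N_plus(menu, i, u):
    """True if u lies in N+(A, p): lottery i is the unique maximizer of u on the menu."""
    return maximizers(menu, u) == [i]


# Hypothetical menu of three degenerate lotteries; outcome 0 plays the role of y*,
# so every utility vector has first coordinate 0.
menu = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
print(in_N(menu, 0, [0, 1, 1]), in_N_plus(menu, 0, [0, 1, 1]))
print(in_N(menu, 0, [0, 2, 1]), in_N_plus(menu, 0, [0, 2, 1]))
```

The two sets differ exactly on utilities that produce ties, such as the first vector above or the constant utility; a proper measure assigns such utilities zero weight, which is what the equality of the two measures in the definition of properness expresses.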

F.2 Proof of Theorem 4: Sufficiency

F.2.1 Outline

The proof proceeds as follows:

(i). In Section F.2.2, we use conditions (i)–(iv) of Axiom 0 and Theorem 2 in Gul and Pesendorfer (2006) to construct, for each finite $Y \subseteq X$, a proper finitely-additive probability measure $\nu^Y$ on $\mathcal{F}(Y)$ representing $\rho^Y$, in the sense that $\rho^Y(p; A) = \nu^Y(N_Y(A, p))$ for all $A, p$. Given the fact that each $\rho^Y$ is derived from the same $\rho$, it is easy to check that the family $\{\mathcal{F}(Y), \nu^Y\}$ is Kolmogorov consistent. We can then find a proper finitely-additive probability measure $\nu$ on $\mathcal{F}$ extending all the $\nu^Y$ (and hence representing $\rho$).

(ii). The support of $\nu$ is defined by

\[
\operatorname{supp}(\nu) := \Bigl(\bigcup\{V \in \mathcal{F} : V \text{ is open and } \nu(V) = 0\}\Bigr)^c.
\]

In Section F.2.3, we use part (v) of Axiom 0 to show that $\operatorname{supp}\nu$ is finite (up to positive affine transformation of utilities) and contains at least one non-constant utility function. While Axiom 0 (v) is similar to the finiteness axiom in Ahn and Sarver (2013), this step requires more work in our setting. A key technical challenge is that unlike in Ahn and Sarver, it is not clear in our infinite outcome space setting how to normalize utilities to ensure that $N(A, p)$-sets are compact. Compact sets $C$ have the useful property (used repeatedly by Ahn and Sarver) that if $C \cap \operatorname{supp}\nu = \emptyset$, then $\nu(C) = 0$. Lemma 22 exploits the geometry of $N(A, p)$-sets to show that this property continues to hold for $N(A, p)$-sets in our setting, even though they are not compact.

(iii). In Section F.2.4, we proceed in a similar way to the proof of Theorem S3 in Ahn and Sarver (2013) (again using Lemma 22 to circumvent technical difficulties). Letting $S := \{s_1, \ldots, s_L\}$ denote the equivalence classes of nonconstant utilities in $\operatorname{supp}\nu$, we find separating neighborhoods $B_s \in \mathcal{F}$ of each $s$ such that $\nu(B_s) > 0$. We then define $\mu(s) = \nu(B_s)$ and $\tau_s(V) = \frac{\nu(V \cap B_s)}{\nu(B_s)}$ and show that this yields an S-based REU representation of $\rho$.

F.2.2 Construction of ν

In this section, we construct a proper finitely-additive probability measure ν on F that represents ρ, i.e., such that for all A ∈ A and p ∈ A, we have ρ(p; A) = ν(N (A, p)) = ν(N + (A, p))). First consider any finite Y ⊆ X with y ∗ ∈ Y . By Axiom 0 (i)–(iv) (Regularity, Linearity, Extremeness, and Mixture Continuity), Theorem 2 in Gul and Pesendorfer (2006) ensures that there is a proper finitely-additive probability measure ν Y on F Y such that ρY (p; A) = ν Y (NY (A, p)) = ν Y (NY+ (A, p)) for all A ∈ A(Y ) and p ∈ A. 0

Claim 4. For any finite Y 0 ⊇ Y 3 y ∗ , (ν Y , F(Y 0 )) and (ν Y , F(Y )) are Kolmogorov consistent, i.e., for any E ∈ F(Y ), we have 0

ν Y (E × RY

0 rY

) = ν Y (E).

(29) 0

0

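Proof. To see this, note first that the LHS of (29) is well-defined, since $E \times \mathbb{R}^{Y' \setminus Y} \in \mathcal{F}^{Y'}$ by Lemma 21 (iv). Note next that by Lemma 21 (iii), $E$ is of the form $\bigcup_{i=1}^n N_Y(A_i, p_i) \cap N_Y^+(B_i, q_i)$ for some finite $n$ and $A_i, B_i \in \mathcal{A}(Y)$. Let $E'$ be obtained from $E$ by replacing each $N_Y(A_i, p_i)$ with $N_Y^+(A_i, p_i)$. By Lemma 21 (ii), $E' = \bigcup_{i=1}^n N_Y^+(C_i, r_i)$ for some family $\{C_i\} \subseteq \mathcal{A}(Y)$. Moreover, since both $\nu^Y$ and $\nu^{Y'}$ are proper, we have that $\nu^Y(E) = \nu^Y(E')$ and $\nu^{Y'}(E \times \mathbb{R}^{Y' \setminus Y}) = \nu^{Y'}(E' \times \mathbb{R}^{Y' \setminus Y})$. Hence, it suffices to prove that $\nu^{Y'}(E' \times \mathbb{R}^{Y' \setminus Y}) = \nu^Y(E')$.

For this, it is enough to show that for any collection of sets $N_1^+, \dots, N_n^+ \in \mathcal{N}^+(Y) := \{N^+(A,p) : A \in \mathcal{A}(Y)\}$, we have $\nu^Y\big(\bigcup_{i=1}^n N_i^+\big) = \nu^{Y'}\big(\bigcup_{i=1}^n N_i^+ \times \mathbb{R}^{Y' \setminus Y}\big)$. We prove this by induction on the number of sets. For the base case, note that for any $N^+(A,p) \in \mathcal{N}^+(Y)$, we have
\[
\nu^{Y'}\big(N^+(A,p) \times \mathbb{R}^{Y' \setminus Y}\big) = \rho^{Y'}(p; A) := \rho(p; A) =: \rho^Y(p; A) = \nu^Y(N^+(A,p)).
\]
Suppose next that the claim is true for every collection of $m$ sets, where $m < n$. Then
\begin{align*}
\nu^Y\Big(\bigcup_{i=1}^{m+1} N_i^+\Big)
&= \nu^Y\Big(\bigcup_{i=1}^{m} N_i^+\Big) + \nu^Y\big(N_{m+1}^+\big) - \nu^Y\Big(\bigcup_{i=1}^{m} \big(N_i^+ \cap N_{m+1}^+\big)\Big) \\
&= \nu^{Y'}\Big(\bigcup_{i=1}^{m} N_i^+ \times \mathbb{R}^{Y' \setminus Y}\Big) + \nu^{Y'}\big(N_{m+1}^+ \times \mathbb{R}^{Y' \setminus Y}\big) - \nu^{Y'}\Big(\bigcup_{i=1}^{m} \big(N_i^+ \cap N_{m+1}^+\big) \times \mathbb{R}^{Y' \setminus Y}\Big) \\
&= \nu^{Y'}\Big(\bigcup_{i=1}^{m+1} N_i^+ \times \mathbb{R}^{Y' \setminus Y}\Big),
\end{align*}
where the second equality follows from the inductive hypothesis and the fact that $N_i^+ \cap N_{m+1}^+ \in \mathcal{N}^+(Y)$ by Lemma 21 (ii).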
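The inductive step above rests on the finite-additivity identity $\nu(A \cup B) = \nu(A) + \nu(B) - \nu(A \cap B)$ applied with $A = \bigcup_{i \le m} N_i^+$ and $B = N_{m+1}^+$. The following is a minimal sketch that checks this identity on a toy finite space; the grid of points, the uniform measure, and the three events are illustrative assumptions and are not objects from the paper.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy setup: a finite stand-in for the utility space and a
# uniform finitely additive measure nu on its subsets.
points = list(product(range(4), repeat=2))
weight = Fraction(1, len(points))

def nu(event):
    """Measure of an event (a set of points) under the uniform toy measure."""
    return weight * len(event)

def union(sets):
    """Union of an iterable of sets."""
    out = set()
    for s in sets:
        out |= s
    return out

# Three arbitrary events standing in for N_1^+, N_2^+, N_3^+.
N = [
    {pt for pt in points if pt[0] > pt[1]},
    {pt for pt in points if pt[0] + pt[1] >= 4},
    {pt for pt in points if pt[1] == 0},
]

m = 2
# Left side: measure of the union of the first m+1 events.
lhs = nu(union(N[: m + 1]))
# Right side: the decomposition used in the first equality of the inductive step.
rhs = nu(union(N[:m])) + nu(N[m]) - nu(union(n & N[m] for n in N[:m]))
```

Exact rational arithmetic is used so that the two sides can be compared with `==` rather than up to floating-point tolerance.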

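Now define $\nu$ on $\mathcal{F}$ by setting $\nu(E) := \nu^Y(\operatorname{proj}_{\tilde{\mathbb{R}}^Y} E)$ for any finite $Y \ni y^*$ such that $E = \operatorname{proj}_{\tilde{\mathbb{R}}^Y} E \times \mathbb{R}^{X \setminus Y}$ and $\operatorname{proj}_{\tilde{\mathbb{R}}^Y} E \in \mathcal{F}^Y$. By Lemma 21 (iv) such a $Y$ exists. Moreover, given Kolmogorov consistency of the family $\{\nu^Y\}_{Y \subseteq X}$, this is well-defined. Finally, it is immediate that $\nu$ is a proper finitely-additive probability measure and that $\nu$ represents $\rho$.

F.2.3 Finiteness of supp ν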

The support of a finitely-additive probability measure $\nu$ is defined by
\[
\operatorname{supp}(\nu) := \Big(\bigcup \{V \in \mathcal{F} : V \text{ is open and } \nu(V) = 0\}\Big)^c.
\]
The next lemma invokes Axiom 0 (v) (Finiteness) to show that the support of $\nu$ constructed in the previous section contains finitely many equivalence classes of utility functions and contains at least one nonconstant function. We use $\mathbf{0}$ to denote the unique constant utility function in $\tilde{\mathbb{R}}^X$.

Lemma 18. Let $K$ be as in the statement of the Finiteness Axiom and let $\mathrm{Pref}(\Delta(X))$ denote the set of all preferences over $\Delta(X)$. Then
\[
\#\{\succsim\, \in \mathrm{Pref}(\Delta(X)) : \succsim \text{ is represented by some } u \in \operatorname{supp}(\nu) \setminus \{\mathbf{0}\}\} = L, \quad \text{where } 1 \le L \le K.
\]

Proof. We first show that $L \le K$. If not, then we can find utilities $\{u_1, \dots, u_{K+1}\} \subseteq \operatorname{supp}(\nu)$ such that each $u_i$ is non-constant over $X$ and $u_i \not\approx u_j$ for all $i \ne j$. By Lemma 13, we can find a menu $A = \{p^i : i = 1, \dots, K+1\} \in \mathcal{A}$ such that $u_i \in N^+(A, p^i)$ for each $i$. Take any $B \subseteq A$ with $|B| \le K$. Then $p^i \notin B$ for some $i$. Fix any sequences $p^i_n \to^m p^i$ and $B_n \to^m B$. By definition, this means that there exist $r \in \Delta(X)$ and $\alpha_n \to 0$ such that $p^i_n = \alpha_n r + (1 - \alpha_n) p^i$ for all $n$, and that for each $q \in B$ there exist $B_q \in \mathcal{A}$ and $\beta_n(q) \to 0$ such that $B_n = \bigcup_{q \in B} \big(\beta_n(q) B_q + (1 - \beta_n(q))\{q\}\big)$ for all $n$. Now, $B$ and each $B_q$ are finite, and $u_i$ is linear with $u_i \cdot p^i > u_i \cdot q$ for all $q \in B$. Hence, there is $N$ such that for all $n \ge N$, $u_i \cdot p^i_n > u_i \cdot q_n$ for all $q_n \in B_n$. Thus, $u_i \in N^+(\{p^i_n\} \cup B_n, p^i_n)$ for all $n \ge N$. But since $u_i \in \operatorname{supp}(\nu)$ and $N^+(\{p^i_n\} \cup B_n, p^i_n)$ is an open set in $\mathcal{F}$, the definition of $\operatorname{supp}(\nu)$ then implies that $\nu(N^+(\{p^i_n\} \cup B_n, p^i_n)) > 0$ for all $n \ge N$. But then $\rho(p^i_n; \{p^i_n\} \cup B_n) = \nu(N^+(\{p^i_n\} \cup B_n, p^i_n)) > 0$ for all $n \ge N$, contradicting Finiteness.

Next we show that $L \ge 1$. Indeed, if $L = 0$, then for any $A \in \mathcal{A}$ with $|A| \ge 2$ and for any $p \in A$, we have $(N(A,p) \setminus \{\mathbf{0}\}) \cap \operatorname{supp}\nu = \emptyset$. By Lemma 22 below, this implies that $\nu(N^+(A,p)) = 0$ for any $p \in A$. But since $\nu$ represents $\rho$, $\rho(p; A) = \nu(N^+(A,p))$ for any $p \in A$, so we have $\sum_{p \in A} \rho(p; A) = 0$, which is a contradiction. ∎

F.2.4 Constructing the REU representation

Let $\succsim_1, \dots, \succsim_L$ denote all the preferences represented by some non-constant utility in $\operatorname{supp}(\nu)$, where by Lemma 18 we know that $L$ is finite and $L \ge 1$. For each $i = 1, \dots, L$, pick some $u_i \in \operatorname{supp}\nu$ representing $\succsim_i$. For any $u \in \tilde{\mathbb{R}}^X$, let $[u] := \{u' \in \tilde{\mathbb{R}}^X : u' \approx u\}$. By Lemma 13, we can find $A := \{p_1, \dots, p_L\} \in \mathcal{A}$ such that $u_i \in N^+(A, p_i)$ for all $i = 1, \dots, L$. Let $B_{u_i} := N^+(A, p_i)$ for all $i$. By construction, $[u_i] \subseteq B_{u_i}$ and $B_{u_i} \cap B_{u_j} = \emptyset$ for $j \ne i$. Moreover, by the definition of $\operatorname{supp}(\nu)$, we have $\nu(B_{u_i}) > 0$ for each $i$, since $B_{u_i} \in \mathcal{F}$ is open and $u_i \in B_{u_i} \cap \operatorname{supp}(\nu) \ne \emptyset$.

Let $S := \{u_1, \dots, u_L\}$ and define the function $\mu : S \to [0,1]$ by $\mu(s) = \nu(B_s)$ for each $s \in S$. We claim that $\mu$ defines a full-support probability measure on $S$. For this it remains to show that $\sum_s \mu(s) = 1$. Since $\sum_s \mu(s) = \sum_s \nu(B_s) = \nu\big(\bigcup_{s \in S} B_s\big)$, it suffices to prove the following claim:

Lemma 19. $\nu\big(\bigcup_{s \in S} B_s\big) = 1$.

Proof. It suffices to prove that $\nu\big(\tilde{\mathbb{R}}^X \setminus \bigcup_{s \in S} B_s\big) = 0$. Note that $\tilde{\mathbb{R}}^X = \bigcup_{i=1}^L N(A, p_i)$, since $A = \{p_1, \dots, p_L\}$. Thus,
\[
\tilde{\mathbb{R}}^X \setminus \bigcup_{s \in S} B_s \subseteq \bigcup_{i=1}^L \big(N(A, p_i) \setminus N^+(A, p_i)\big).
\]
By finite additivity of $\nu$, this implies that
\[
\nu\Big(\tilde{\mathbb{R}}^X \setminus \bigcup_{s \in S} B_s\Big) \le \sum_{i=1}^L \nu\big(N(A, p_i) \setminus N^+(A, p_i)\big) = 0,
\]
where the last equality follows from properness of $\nu$.

Next, we define a set function $\tau_s : \mathcal{F} \to \mathbb{R}_+$ for each $s \in S$ by setting
\[
\tau_s(V) := \frac{\nu(V \cap B_s)}{\nu(B_s)}
\]
for each $V \in \mathcal{F}$. Since $\nu(B_s) > 0$ for all $s \in S$, this is well-defined. Moreover, since $\nu$ is a proper finitely-additive probability measure on $\mathcal{F}$, so is $\tau_s$.

Note that for all $A \in \mathcal{A}$ and $p \in \Delta(X)$, $\{u \in \tilde{\mathbb{R}}^X : p \in M(M(A,s), u)\} = N(M(A,s), p) \in \mathcal{F}$, so $\tau_s(\{u \in \tilde{\mathbb{R}}^X : p \in M(M(A,s), u)\})$ is well-defined. The next lemma will allow us to complete the representation:

Lemma 20. For each $A \in \mathcal{A}$ and $p \in A$,
\[
\nu(N(A,p)) = \sum_{s \in S} \mu(s)\, \tau_s\big(\{u \in \tilde{\mathbb{R}}^X : p \in M(M(A,s), u)\}\big).
\]

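Proof. We first show that for each $s \in S$, $\operatorname{supp}\tau_s \setminus \{\mathbf{0}\} = [s]$. To see that $[s] \subseteq \operatorname{supp}\tau_s \setminus \{\mathbf{0}\}$, consider any $u \in [s]$ and any open $V \in \mathcal{F}$ such that $u \in V$. By Lemma 21 (iii), $V$ is a finite union of finite intersections of sets in $\mathcal{N} \cup \mathcal{N}^+$. Hence, since each element of $\mathcal{N} \cup \mathcal{N}^+$ is closed under positive affine transformations, so is $V$. Thus, $u \in V$ implies $s \in V$. But then $V \cap B_s \in \mathcal{F}$ is open and contains $s$, and hence $\nu(V \cap B_s) > 0$ since $s \in \operatorname{supp}\nu$. This proves $u \in \operatorname{supp}\tau_s \setminus \{\mathbf{0}\}$.

To see that $\operatorname{supp}\tau_s \setminus \{\mathbf{0}\} \subseteq [s]$, consider any $u \ne \mathbf{0}$ such that $u \notin [s]$. It suffices to show that there exists an open $V \in \mathcal{F}$ such that $u \in V$ and $\tau_s(V) = 0$. If $u \approx s'$ for some $s' \in S \setminus \{s\}$, then $V = B_{s'}$ is as required since $B_{s'} \cap B_s = \emptyset$ and $u \in B_{s'}$. If there is no $s' \in S \setminus \{s\}$ such that $u \approx s'$, then $u \notin \operatorname{supp}\nu$. But then there exists an open $V \in \mathcal{F}$ such that $u \in V$ and $\nu(V) = 0$, so also $\tau_s(V) = 0$.

By Lemma 23 below, this implies that $\tau_s(N(A,p)) = \tau_s(N(M(A,s), p))$ for any $A \in \mathcal{A}$ and $p \in A$. This implies that for any $A \in \mathcal{A}$ and $p \in A$,
\begin{align*}
\sum_{s \in S} \mu(s)\, \tau_s\big(\{u \in \tilde{\mathbb{R}}^X : p \in M(M(A,s), u)\}\big)
&= \sum_{s \in S} \mu(s)\, \tau_s(N(M(A,s), p)) \\
&= \sum_{s \in S} \mu(s)\, \tau_s(N(A,p)) \\
&= \sum_{s \in S} \nu(N(A,p) \cap B_s) \\
&= \nu\Big(N(A,p) \cap \bigcup_{s \in S} B_s\Big) \\
&= \nu(N(A,p)),
\end{align*}
where the last equality follows from Lemma 19. ∎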
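The construction $\mu(s) = \nu(B_s)$, $\tau_s(V) = \nu(V \cap B_s)/\nu(B_s)$ can be illustrated on a finite toy space. The sketch below assumes a five-point space with hypothetical weights in which the neighborhoods $B_s$ are disjoint and cover everything except a null point (the analog of Lemma 19); none of these objects come from the paper. It then checks that mixing the conditionals $\tau_s$ with weights $\mu(s)$ recovers $\nu$.

```python
from fractions import Fraction

# Hypothetical toy: five "utilities" with weights nu_w. The neighborhoods B
# are disjoint and cover all points of positive weight; point "z" is null.
nu_w = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 6),
        "d": Fraction(1, 3), "z": Fraction(0)}
B = {"s1": {"a", "b"}, "s2": {"c", "d"}}

def nu(V):
    """Toy measure of a set of points."""
    return sum((nu_w[x] for x in V), Fraction(0))

# mu(s) = nu(B_s): weight of each state.
mu = {s: nu(Bs) for s, Bs in B.items()}

def tau(s, V):
    """Conditional measure tau_s(V) = nu(V ∩ B_s) / nu(B_s)."""
    return nu(V & B[s]) / nu(B[s])

def mixture(V):
    """Sum over states of mu(s) * tau_s(V)."""
    return sum((mu[s] * tau(s, V) for s in B), Fraction(0))
```

For any event `V`, `mixture(V)` equals `nu(V)` in this toy example precisely because the complement of the union of the `B[s]` has measure zero, mirroring the role of Lemma 19 in the last equality of the proof.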



For any $s \in S = \{u_1, \dots, u_L\}$, we write $U_s := s$. We claim that $(S, \mu, \{U_s, \tau_s\}_{s \in S})$ is an S-based REU representation of $\rho$. Indeed, by construction, $U_s$ is non-constant for all $s$, $U_s \not\approx U_{s'}$ for any distinct $s, s' \in S$, and $\mu$ is a full-support probability measure on $S$. Moreover, each $\tau_s$ is a proper finitely-additive probability measure on $\tilde{\mathbb{R}}^X$ endowed with the algebra $\mathcal{F}$. By standard arguments (cf. Rao and Rao (2012)), we can extend $\tau_s$ to a proper finitely-additive probability measure on the Borel σ-algebra on $\tilde{\mathbb{R}}^X$. Finally, Lemma 20 and the fact that $\nu$ represents $\rho$ imply that for all $A \in \mathcal{A}$ and $p \in A$, we have $\rho(p; A) = \sum_{s \in S} \mu(s)\, \tau_s(p, A)$, as required.

F.3 Proof of Theorem 4: Necessity

Suppose that $\rho$ admits an S-based REU representation $(S, \mu, \{U_s, \tau_s\}_{s \in S})$. We show that $\rho$ satisfies Axiom 0. Observe first that for any finite $Y \subseteq X$ with $y^* \in Y$, $(S, \mu, \{U_s{\upharpoonright}_Y, \tau_s{\upharpoonright}_Y\}_{s \in S})$ constitutes an S-based REU representation of $\rho^Y$, where $U_s{\upharpoonright}_Y$ denotes the restriction of $U_s$ to $Y$ and $\tau_s{\upharpoonright}_Y$ is given by $\tau_s{\upharpoonright}_Y(B) = \tau_s(B \times \mathbb{R}^{X \setminus Y})$ for any Borel set $B$ on $\mathbb{R}^Y$. Thus, by Theorem S3 in Ahn and Sarver (2013), $\rho^Y$ satisfies Regularity, Linearity, Extremeness, and Mixture Continuity.

To show that $\rho$ satisfies Regularity, consider any $p \in A \subseteq A'$. Pick a finite $Y \subseteq X$ with $y^* \in Y$ such that $A, A' \in \mathcal{A}(Y)$. By definition, $\rho(p; A) = \rho^Y(p; A)$ and $\rho(p; A') = \rho^Y(p; A')$. Hence, by Regularity for $\rho^Y$, we have $\rho(p; A) \ge \rho(p; A')$, as required. Similarly, we can show that $\rho$ satisfies Linearity, Extremeness, and Mixture Continuity by using the fact that for each finite $Y$, each $\rho^Y$ satisfies these axioms.

Finally, to show that $\rho$ satisfies Finiteness, let $K := |S|$ and consider any $A \in \mathcal{A}$. For each $s \in S$, pick any $q_s \in M(A, U_s)$, and define $B := \{q_s : s \in S\}$. Note that $|B| \le K$. If $B = A$, then Finiteness is trivially satisfied. If $B \subsetneq A$, then pick any $p \in A \setminus B$. We can pick a large enough finite $Y \subseteq X$ such that each $U_s$ is non-constant on $Y$ and $U_s{\upharpoonright}_Y \not\approx U_{s'}{\upharpoonright}_Y$ for any distinct $s, s' \in S$. Let $r \in \Delta(Y)$ be given by $r(y) := \frac{1}{|Y|}$ for each $y \in Y$. For each $s \in S$, pick any $y_s \in \operatorname{argmax}_{y \in Y} U_s(y)$. Note that $U_s(y_s) > U_s(r)$. Define $B^n := \frac{n-1}{n} B + \frac{1}{n} \{y_s : s \in S\}$ and $p^n := \frac{n-1}{n} p + \frac{1}{n} r$. Then $B^n \to^m B$ and $p^n \to^m p$. Moreover, for all large enough $n$, we have $U_s\big(\frac{n-1}{n} q_s + \frac{1}{n} y_s\big) > U_s(p^n)$ for each $s \in S$. Thus, $\rho(p^n; \{p^n\} \cup B^n) = 0$, proving Finiteness.
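The perturbation used in the Finiteness argument can be traced numerically. The sketch below assumes a hypothetical three-outcome set $Y$, two states with made-up utilities, and a three-element menu; it builds $p^n$ and the perturbed maximizers $\frac{n-1}{n} q_s + \frac{1}{n} y_s$ and checks that every state strictly prefers its perturbed maximizer to $p^n$, which is what forces $\rho(p^n; \{p^n\} \cup B^n) = 0$.

```python
from fractions import Fraction as F

# Hypothetical primitives: outcomes Y = {0, 1, 2}; lotteries are probability
# vectors over Y; two states with linear (expected) utilities.
U = {"s1": [F(3), F(1), F(0)], "s2": [F(0), F(1), F(3)]}

def eu(u, lottery):
    """Expected utility of a lottery under utility vector u."""
    return sum(ui * li for ui, li in zip(u, lottery))

def mix(a, p, q):
    """The mixture a*p + (1-a)*q of two lotteries."""
    return [a * pi + (1 - a) * qi for pi, qi in zip(p, q)]

def point(y):
    """Degenerate lottery on outcome y."""
    return [F(int(i == y)) for i in range(3)]

q = {"s1": point(0), "s2": point(2)}   # q_s: a maximizer of U_s in the menu A
p = point(1)                            # an element of A outside B = {q_s}
r = [F(1, 3)] * 3                       # uniform lottery on Y
# y_s: degenerate lottery on a best outcome for state s.
y = {s: point(max(range(3), key=lambda k: U[s][k])) for s in U}

def dominated(n):
    """True if every state strictly prefers its perturbed q_s to p^n."""
    a = F(n - 1, n)
    p_n = mix(a, p, r)
    return all(eu(U[s], mix(a, q[s], y[s])) > eu(U[s], p_n) for s in U)
```

In this particular toy example the strict preference holds for every $n \ge 1$, whereas the argument in the text only needs it for all sufficiently large $n$.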

F.4 Additional Lemmas for Section F

F.4.1 Properties of N(A, p) sets

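Lemma 21. Fix any $X' \subseteq X$ with $y^* \in X'$. For any collection $\mathcal{S}$, we let $\mathcal{U}(\mathcal{S})$ denote the set of all finite unions of elements of $\mathcal{S}$.

(i). If $E \in \mathcal{N}(X')$ (resp. $E \in \mathcal{N}^+(X')$), then $E^c \in \mathcal{U}(\mathcal{N}^+(X'))$ (resp. $E^c \in \mathcal{U}(\mathcal{N}(X'))$).

(ii). If $E_1, E_2 \in \mathcal{N}(X')$ (resp. $E_1, E_2 \in \mathcal{N}^+(X')$), then $E_1 \cap E_2 \in \mathcal{N}(X')$ (resp. $E_1 \cap E_2 \in \mathcal{N}^+(X')$).

(iii). $\mathcal{F}(X')$ is the set of all $E$ such that $E = \bigcup_{\ell \in L} M_\ell \cap N_\ell$ for some finite index set $L$ and $M_\ell \in \mathcal{N}(X')$, $N_\ell \in \mathcal{N}^+(X')$ for each $\ell \in L$.

(iv). $\mathcal{F}(X')$ is the set of all $E$ for which there exists a finite $Y \subseteq X'$ with $y^* \in Y$ and $E^Y \in \mathcal{F}(Y)$ such that $E = E^Y \times \mathbb{R}^{X' \setminus Y}$.

Proof. (i): If $E = N(A,p) \in \mathcal{N}(X')$, then $E^c = \bigcup_{q \in A \setminus \{p\}} N^+(\{p,q\}, q) \in \mathcal{U}(\mathcal{N}^+(X'))$ if $p \in A$ and $E^c = \tilde{\mathbb{R}}^{X'} \in \mathcal{U}(\mathcal{N}^+(X'))$ if $p \notin A$. Similarly, if $E = N^+(A,p) \in \mathcal{N}^+(X')$, then $E^c = \bigcup_{q \in A \setminus \{p\}} N(\{p,q\}, q) \in \mathcal{U}(\mathcal{N}(X'))$ if $p \in A$ and $E^c = \tilde{\mathbb{R}}^{X'} \in \mathcal{U}(\mathcal{N}(X'))$ if $p \notin A$.

(ii): If $N(A_1, p_1), N(A_2, p_2) \in \mathcal{N}(X')$, then $N(A_1, p_1) \cap N(A_2, p_2) = N\big(\tfrac{1}{2} A_1 + \tfrac{1}{2} A_2, \tfrac{1}{2} p_1 + \tfrac{1}{2} p_2\big) \in \mathcal{N}(X')$. The same argument goes through replacing all instances of $N$ with $N^+$.

(iii): By standard results, $\mathcal{F}(X')$ can be described as follows: Let $\mathcal{F}_0(X')$ denote the set of all elements of $\mathcal{N}(X') \cup \mathcal{N}^+(X')$ and their complements. Let $\mathcal{F}_1(X')$ denote the set of all finite intersections of elements of $\mathcal{F}_0(X')$. Then $\mathcal{F}(X')$ is the set of all finite unions of elements of $\mathcal{F}_1(X')$. By part (i), $\mathcal{F}_0(X') = \mathcal{U}(\mathcal{N}(X')) \cup \mathcal{U}(\mathcal{N}^+(X'))$ is the collection of all finite unions of elements of $\mathcal{N}(X')$ and of all finite unions of elements of $\mathcal{N}^+(X')$. By part (ii), $\mathcal{F}_1(X') = \mathcal{F}_0(X') \cup \mathcal{I}(X')$, where $\mathcal{I}(X')$ consists of all finite unions of the form $\bigcup_{\ell \in L} M_\ell \cap N_\ell$, where $M_\ell \in \mathcal{N}(X')$ and $N_\ell \in \mathcal{N}^+(X')$ for each $\ell \in L$. Note that $\tilde{\mathbb{R}}^{X'} \in \mathcal{N}(X') \cap \mathcal{N}^+(X')$, since $\tilde{\mathbb{R}}^{X'} = N_{X'}(\{p\}, p) = N^+_{X'}(\{p\}, p)$ for any $p \in \Delta(X')$. Thus, $\mathcal{F}_0(X') = \mathcal{U}(\mathcal{N}(X')) \cup \mathcal{U}(\mathcal{N}^+(X')) \subseteq \mathcal{I}(X')$. Hence, $\mathcal{F}_1(X') = \mathcal{I}(X') = \mathcal{F}(X')$.

(iv): Note first that for any $N_{X'}(A,p) \in \mathcal{N}(X')$ (resp. $N^+_{X'}(A,p) \in \mathcal{N}^+(X')$) and any finite $Y \subseteq X'$ with $y^* \in Y$ and $A \in \mathcal{A}(Y)$, we have $N_{X'}(A,p) = N_Y(A,p) \times \mathbb{R}^{X' \setminus Y}$ (resp. $N^+_{X'}(A,p) = N^+_Y(A,p) \times \mathbb{R}^{X' \setminus Y}$). Now fix any $E \in \mathcal{F}(X')$. By part (iii), we have a finite index set $L$ and $M_\ell \in \mathcal{N}(X')$, $N_\ell \in \mathcal{N}^+(X')$ for each $\ell \in L$ such that $E = \bigcup_{\ell \in L} M_\ell \cap N_\ell$. By the first sentence, we can then pick a finite $Y \subseteq X'$ with $y^* \in Y$ such that for each $\ell$, we have $M_\ell = M_\ell^Y \times \mathbb{R}^{X' \setminus Y}$ and $N_\ell = N_\ell^Y \times \mathbb{R}^{X' \setminus Y}$, where $M_\ell^Y \in \mathcal{N}(Y)$ and $N_\ell^Y \in \mathcal{N}^+(Y)$. Then $E = E^Y \times \mathbb{R}^{X' \setminus Y}$, where $E^Y := \bigcup_{\ell \in L} M_\ell^Y \cap N_\ell^Y \in \mathcal{F}(Y)$. Conversely, if $E^Y \in \mathcal{F}(Y)$, then by part (iii), $E^Y$ is of the form $\bigcup_{\ell \in L} M_\ell^Y \cap N_\ell^Y$ for some finite collection of $M_\ell^Y \in \mathcal{N}(Y)$ and $N_\ell^Y \in \mathcal{N}^+(Y)$. Then by the first sentence, $M_\ell := M_\ell^Y \times \mathbb{R}^{X' \setminus Y} \in \mathcal{N}(X')$ and $N_\ell := N_\ell^Y \times \mathbb{R}^{X' \setminus Y} \in \mathcal{N}^+(X')$, so $E^Y \times \mathbb{R}^{X' \setminus Y} = \bigcup_{\ell \in L} (M_\ell \cap N_\ell) \in \mathcal{F}(X')$ by part (iii). ∎

F.4.2 Properties of proper finitely-additive probability measures on F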

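Lemma 22. Let $\nu$ be a proper finitely-additive probability measure on $\mathcal{F}$ and suppose that $(N(A,p) \setminus \{\mathbf{0}\}) \cap \operatorname{supp}\nu = \emptyset$ for some $A \in \mathcal{A}$ and $p \in A$, where $\mathbf{0}$ denotes the unique constant utility in $\tilde{\mathbb{R}}^X$. Then $\nu(N^+(A,p)) = \nu(N(A,p)) = 0$.

Proof. Since $(N(A,p) \setminus \{\mathbf{0}\}) \cap \operatorname{supp}\nu = \emptyset$, we have
\[
N(A,p) \setminus \{\mathbf{0}\} \subseteq (\operatorname{supp}\nu)^c := \bigcup \{V \in \mathcal{F} : V \text{ open and } \nu(V) = 0\}.
\]
Thus, for some possibly infinite index set $I$, there exists a family $\{V_i\}_{i \in I}$, with $V_i \in \mathcal{F}$ open and $\nu(V_i) = 0$ for each $i$, such that
\[
N(A,p) \setminus \{\mathbf{0}\} \subseteq \bigcup_{i \in I} V_i.
\]
We now show that there is a finite subset $\{i_1, \dots, i_n\} \subseteq I$ such that
\[
N(A,p) \setminus \{\mathbf{0}\} \subseteq \bigcup_{j=1}^n V_{i_j}.
\]
To see this, define $L(A,p) := (N(A,p) \cap [-1,1]^X) \setminus \{\mathbf{0}\}$. Note that since $[-1,1]^X$ is compact in $\mathbb{R}^X$ (by Tychonoff's theorem) and $N(A,p)$ is closed in $\tilde{\mathbb{R}}^X$, $L(A,p)$ is compact in the relative topology on $\tilde{\mathbb{R}}^X \setminus \{\mathbf{0}\}$. Hence, since $L(A,p) \subseteq N(A,p) \setminus \{\mathbf{0}\}$ is covered by $\bigcup_{i \in I} V_i$ and each $V_i$ is open, it has a finite subcover $\bigcup_{j=1}^n V_{i_j}$.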

We claim that $N(A,p) \setminus \{\mathbf{0}\}$ is also covered by $\bigcup_{j=1}^n V_{i_j}$. To see this, consider any $u^* \in N(A,p) \setminus \{\mathbf{0}\}$. We can find a finite $Y \subseteq X$ such that $y^* \in Y$, $u^*{\upharpoonright}_Y$ is not constant, $N(A,p) = N_Y(A,p) \times \mathbb{R}^{X \setminus Y}$, and for each $j = 1, \dots, n$, $V_{i_j} = V^Y_{i_j} \times \mathbb{R}^{X \setminus Y}$ for some $V^Y_{i_j} \in \mathcal{F}^Y$ (see Lemma 21 (iv)). Since $Y$ is finite, there exists $\alpha > 0$ small enough such that $\alpha u^*(y) \in [-1,1]$ for all $y \in Y$. Define $u \in \tilde{\mathbb{R}}^X$ by $u{\upharpoonright}_Y = \alpha u^*{\upharpoonright}_Y$ and $u(x) = 0$ for all $x \in X \setminus Y$. Note that $u \in N(A,p)$: Indeed, $u^* \in N(A,p) = N_Y(A,p) \times \mathbb{R}^{X \setminus Y}$, $u{\upharpoonright}_Y = \alpha u^*{\upharpoonright}_Y$, and $N_Y(A,p)$ is closed under positive scaling. Moreover, $u$ is not constant, since $u^*{\upharpoonright}_Y$ is not constant. Finally, $u \in [-1,1]^X$. This shows $u \in L(A,p)$. Since $L(A,p)$ is covered by $\bigcup_{j=1}^n V_{i_j}$, there exists $j$ such that $u \in V_{i_j} = V^Y_{i_j} \times \mathbb{R}^{X \setminus Y}$. But note that $V^Y_{i_j}$ is closed under positive scaling, since by Lemma 21 (iii) it is a finite union of sets which are closed under positive scaling. Since $u{\upharpoonright}_Y = \alpha u^*{\upharpoonright}_Y$, this implies $u^* \in V_{i_j}$.

The above shows that $N(A,p) \setminus \{\mathbf{0}\}$ is covered by $\bigcup_{j=1}^n V_{i_j}$, and hence so is $N^+(A,p)$. But since $\nu(V_{i_j}) = 0$ for all $j = 1, \dots, n$ and $\nu$ is finitely additive, it follows that $\nu(N^+(A,p)) = 0$. Moreover, by properness of $\nu$, this implies $\nu(N(A,p)) = 0$. ∎

Lemma 23. Suppose $\nu$ is a proper finitely-additive probability measure on $\mathcal{F}$ and $\operatorname{supp}\nu \setminus \{\mathbf{0}\} = [u]$ for some $u \in \tilde{\mathbb{R}}^X$. Then for any $A \in \mathcal{A}$ and $p \in A$, we have $\nu(N(A,p)) = \nu(N(M(A,u), p))$.

Proof. Fix any $A \in \mathcal{A}$ and $p \in A$. Note first that for any $q \in A$,
\[
q \notin M(A,u) \;\Rightarrow\; \nu(N(A,q)) = 0. \tag{30}
\]
Indeed, if $q \notin M(A,u)$, then $\emptyset = [u] \cap N(A,q) = (N(A,q) \setminus \{\mathbf{0}\}) \cap \operatorname{supp}\nu$. But then Lemma 22 implies that $\nu(N(A,q)) = 0$, as claimed.

Suppose now that $p \notin M(A,u)$. Then (30) implies that $\nu(N(A,p)) = 0$. Moreover, $N(B,p) := \emptyset$ if $p \notin B$, so also $\nu(N(M(A,u), p)) = 0$, as required.

Suppose next that $p \in M(A,u)$. Then
\[
N(A,p) \subseteq N(M(A,u), p) \subseteq N(A,p) \cup \bigcup_{q \in A \setminus M(A,u)} N(A,q),
\]
so that
\[
\nu(N(A,p)) \le \nu(N(M(A,u), p)) \le \nu(N(A,p)) + \sum_{q \in A \setminus M(A,u)} \nu(N(A,q)) = \nu(N(A,p)),
\]
where the last equality follows from (30). This again shows that $\nu(N(A,p)) = \nu(N(M(A,u), p))$, as required. ∎

F.4.3 Menus without ties

Lemma 24. Suppose $\rho$ is represented by the REU form $(S, \mu, \{U_s, \tau_s\}_{s \in S})$. Then for any $A \in \mathcal{A}$, the following are equivalent:

(i). $A \in \mathcal{A}^*$.

(ii). For each $s \in S$, $|M(A, U_s)| = 1$.

(iii). For each $p \in A$, $N(A,p) \cap \{U_s : s \in S\} = N^+(A,p) \cap \{U_s : s \in S\}$.

Proof. The equivalence of (ii) and (iii) is immediate from the definitions. The proof that (i) is equivalent to (ii) is entirely analogous to that of Lemma 14 and is thus omitted. ∎

G Proof of Proposition 5

The following three subsections prove Proposition 5, that is, the equivalence between DREU, evolving utility, gradual learning and their respective S-based analogs.

G.1 DREU

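“If” direction: Suppose $\rho$ admits an S-based DREU representation $(S_t, \{\mu_t^{s_{t-1}}\}_{s_{t-1} \in S_{t-1}}, \{U_{s_t}, \tau_{s_t}\}_{s_t \in S_t})_{t=0,\dots,T}$. We will construct a DREU representation $(\hat\Omega, \hat{\mathcal{F}}^*, \hat\mu, (\hat{\mathcal{F}}_t, \hat U_t, \hat W_t))$.

Consider the space $G := \prod_{t=0}^T \big(S_t \times \mathbb{R}^{X_t}\big)$ of all sequences of states and tie-breaking utilities. Let $\hat\Omega := \{(s_0, W_0, \dots, s_T, W_T) \in G : \prod_{k=0}^T \mu_k^{s_{k-1}}(s_k) > 0\}$. Let $\hat{\mathcal{F}}^*$ be the restriction to $\hat\Omega$ of the product sigma-algebra of the discrete sigma-algebra on $\prod_{t=0}^T S_t$ and the product Borel sigma-algebra on $\prod_{t=0}^T \mathbb{R}^{X_t}$. For each $K = (\{s_0\}, K_0, \dots, \{s_T\}, K_T) \in \hat{\mathcal{F}}^*$, let $\hat\mu(K) = \prod_{t=0}^T \mu_t^{s_{t-1}}(s_t)\, \tau_{s_t}(K_t)$; by finiteness of $\prod_{t=0}^T S_t$, $\hat\mu$ extends to a finitely-additive probability measure on $\hat\Omega$ in the natural way.

Let $\hat\Pi_t$ be the finite partition of $\hat\Omega$ whose cells are all the cylinders $C(s_0, \dots, s_t) := \{\hat\omega \in \hat\Omega : \operatorname{proj}_{S_0 \times \dots \times S_t}(\hat\omega) = (s_0, \dots, s_t)\}$. Let $\hat{\mathcal{F}}_t$ be the sigma-algebra generated by $\hat\Pi_t$; by definition of $\hat\Omega$, $\hat\mu(\hat{\mathcal{F}}_t(\hat\omega)) > 0$ for all $\hat\omega \in \hat\Omega$. Also, $\hat{\mathcal{F}}_t(\hat\omega) = \bigcup_{\hat\omega' \in \hat{\mathcal{F}}_t(\hat\omega)} \hat{\mathcal{F}}_{t+1}(\hat\omega')$, so $(\hat{\mathcal{F}}_t)_{0 \le t \le T} \subseteq \hat{\mathcal{F}}^*$ is a filtration. Define $\hat U_t : \hat\Omega \to \mathbb{R}^{X_t}$ by $\hat U_t(\hat\omega) = U_{s_t}$ where $\operatorname{proj}_{S_t}(\hat\omega) = s_t$. Note that $(\hat U_t)$ is adapted to $(\hat{\mathcal{F}}_t)$ and that $\hat U_t(\hat\omega)$ is nonconstant for each $\hat\omega$ since each $U_{s_t}$ is nonconstant. Finally, if $\hat{\mathcal{F}}_{t-1}(\hat\omega) = \hat{\mathcal{F}}_{t-1}(\hat\omega')$ and $\hat{\mathcal{F}}_t(\hat\omega) \ne \hat{\mathcal{F}}_t(\hat\omega')$, then $\operatorname{proj}_{S_{t-1}}(\hat\omega) = \operatorname{proj}_{S_{t-1}}(\hat\omega') = s_{t-1}$ and $\operatorname{proj}_{S_t}(\hat\omega) = s_t \ne s_t' = \operatorname{proj}_{S_t}(\hat\omega')$ for some $s_{t-1} \in S_{t-1}$ and $s_t, s_t' \in \operatorname{supp}\mu_t^{s_{t-1}}$. By DREU1 (a), this implies $\hat U_t(\hat\omega) := U_{s_t} \not\approx U_{s_t'} =: \hat U_t(\hat\omega')$. Thus, $(\hat{\mathcal{F}}_t, \hat U_t)$ are simple.

Define $\hat W_t : \hat\Omega \to \mathbb{R}^{X_t}$ by $\hat W_t(\hat\omega) = W_t$ where $\operatorname{proj}_{\mathbb{R}^{X_t}}(\hat\omega) = W_t$. Note that for all $A_t$,
\[
\hat\mu\big(\{\hat\omega \in \hat\Omega : |M(A_t, \hat W_t)| = 1\}\big) = \sum_{(s_0, \dots, s_T)} \prod_{k=0}^T \mu_k^{s_{k-1}}(s_k)\, \tau_{s_t}\big(\{W_t \in \mathbb{R}^{X_t} : |M(A_t, W_t)| = 1\}\big) = 1,
\]
since each $\tau_{s_t}$ is proper. Thus, $(\hat W_t)$ satisfies part (i) of the properness requirement for DREU. Moreover, for any $\hat{\mathcal{F}}_T(\hat\omega) = C(s_0, \dots, s_T)$ and any sequence $(B_t)$ of Borel sets $B_t \subseteq \mathbb{R}^{X_t}$, the definition of $\hat\mu$ implies
\[
\hat\mu\Big(\bigcap_{t=0}^T \{\hat W_t \in B_t\} \,\Big|\, C(s_0, \dots, s_T)\Big) = \prod_{t=0}^T \tau_{s_t}(B_t) = \prod_{t=0}^T \hat\mu\big(\{\hat W_t \in B_t\} \mid C(s_0, \dots, s_t)\big). \tag{31}
\]
Since $\hat{\mathcal{F}}_T(\hat\omega) = C(s_0, \dots, s_T)$ implies $\hat{\mathcal{F}}_t(\hat\omega) = C(s_0, \dots, s_t)$ for all $t \le T$, this shows that $(\hat W_t)$ also satisfies parts (ii) and (iii) of the properness requirement.

Finally, to see that $(\hat\Omega, \hat{\mathcal{F}}^*, \hat\mu, (\hat{\mathcal{F}}_t, \hat U_t, \hat W_t))$ represents $\rho$, fix any $h^t = (A_0, p_0, \dots, A_t, p_t) \in \mathcal{H}_t$.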

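Then
\begin{align*}
\hat\mu(C(h^t)) &= \hat\mu\Big(\bigcap_{k=0}^t \{\hat\omega \in \hat\Omega : p_k \in M(M(A_k, \hat U_k(\hat\omega)), \hat W_k(\hat\omega))\}\Big) \\
&= \sum_{C(s_0, \dots, s_t) \in \hat\Pi_t} \hat\mu(C(s_0, \dots, s_t))\, \hat\mu\Big(\bigcap_{k=0}^t \{\hat\omega \in \hat\Omega : p_k \in M(M(A_k, \hat U_k), \hat W_k)\} \,\Big|\, C(s_0, \dots, s_t)\Big) \\
&= \sum_{(s_0, \dots, s_t) \in S_0 \times \dots \times S_t} \prod_{k=0}^t \mu_k^{s_{k-1}}(s_k)\, \hat\mu\Big(\bigcap_{k=0}^t \{\hat\omega \in \hat\Omega : p_k \in M(M(A_k, U_{s_k}), \hat W_k)\} \,\Big|\, C(s_0, \dots, s_t)\Big) \\
&= \sum_{(s_0, \dots, s_t) \in S_0 \times \dots \times S_t} \prod_{k=0}^t \mu_k^{s_{k-1}}(s_k)\, \tau_{s_k}(p_k, A_k),
\end{align*}
where the third equality follows from the definition of $\hat\mu$ and $\hat U$, and the final equality follows from (31). Thus, as required, we have
\[
\hat\mu(C(p_t, A_t) \mid C(h^{t-1})) = \frac{\hat\mu(C(h^t))}{\hat\mu(C(h^{t-1}))} = \frac{\sum_{(s_0, \dots, s_t)} \prod_{k=0}^{t} \mu_k^{s_{k-1}}(s_k)\, \tau_{s_k}(p_k, A_k)}{\sum_{(s_0, \dots, s_{t-1})} \prod_{k=0}^{t-1} \mu_k^{s_{k-1}}(s_k)\, \tau_{s_k}(p_k, A_k)} = \rho_t(p_t; A_t \mid h^{t-1}),
\]
where the final equality holds by DREU2.

“Only if” direction: Take any DREU representation $(\Omega, \mathcal{F}^*, \mu, (\mathcal{F}_t, U_t, W_t))$ of $\rho$. We will construct an S-based DREU representation $(S_t, \{\hat\mu_t^{s_{t-1}}\}_{s_{t-1} \in S_{t-1}}, \{\hat U_{s_t}, \tau_{s_t}\}_{s_t \in S_t})_{t=0,\dots,T}$.

For each $t$, let $S_t := \{\mathcal{F}_t(\omega) : \omega \in \Omega\}$ denote the partition generating $\mathcal{F}_t$, which is finite since $(\mathcal{F}_t)$ is simple. Each $\hat\mu_{t+1}^{s_t}$ is defined to be the one-step-ahead conditional of $\mu$, i.e., $\hat\mu_0(s_0) := \mu(s_0)$ for all $s_0 \in S_0$ and $\hat\mu_{t+1}^{s_t}(s_{t+1}) := \mu(s_{t+1} \mid s_t)$ for all $s_t \in S_t$, $s_{t+1} \in S_{t+1}$. This is well-defined since $\mu(\mathcal{F}_t(\omega)) > 0$ for all $\omega$. For each $s_t \in S_t$, define $\hat U_{s_t} := U_t(\omega)$ if $\omega \in s_t$; this is well-defined as $(U_t)$ is $\mathcal{F}_t$-adapted, and each $\hat U_{s_t}$ is nonconstant since each $U_t(\omega)$ is nonconstant. Finally, for any Borel set $B_t \subseteq \mathbb{R}^{X_t}$, define $\tau_{s_t}(B_t) := \mu(\{W_t \in B_t\} \mid s_t)$. This is well-defined since $W_t$ is $\mathcal{F}^*$-measurable. Moreover, because $\mu(\{\omega \in \Omega : |M(A_t, W_t(\omega))| = 1\}) = 1$ for all $A_t$ and $|S_t|$ is finite, it follows that $\tau_{s_t}(N(A_t, p_t)) = \tau_{s_t}(N^+(A_t, p_t))$ for all $p_t$, i.e., $\tau_{s_t}$ is proper. Thus, each $(S_t, \hat\mu_t^{s_{t-1}}, \{\hat U_{s_t}, \tau_{s_t}\}_{s_t \in S_t})$ is an REU form on $X_t$.

Moreover, (a) for any distinct $s_t, s_t' \in \operatorname{supp}(\hat\mu_t^{s_{t-1}})$, we have $\omega, \omega'$ such that $\mathcal{F}_{t-1}(\omega) = s_{t-1} = \mathcal{F}_{t-1}(\omega')$ and $\mathcal{F}_t(\omega) = s_t \ne \mathcal{F}_t(\omega') = s_t'$. Thus, $\hat U_{s_t} = U_t(\omega) \not\approx U_t(\omega') = \hat U_{s_t'}$, since $(U_t, \mathcal{F}_t)$ is simple. Also, since $(\mathcal{F}_t)$ is a filtration, the partition $S_t$ refines the partition $S_{t-1}$, so that (b) for any distinct $s_{t-1}, s_{t-1}'$, we have $\operatorname{supp}(\hat\mu_t^{s_{t-1}}) \cap \operatorname{supp}(\hat\mu_t^{s_{t-1}'}) = \emptyset$. Since additionally $\mu(s_t) > 0$ for all $s_t \in S_t$, we have (c) $\bigcup_{s_{t-1} \in S_{t-1}} \operatorname{supp}\hat\mu_t^{s_{t-1}} = S_t$. Thus, DREU1 is satisfied.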
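To see that DREU2 holds, observe that for each $h^t = (A_0, p_0, \dots, A_t, p_t) \in \mathcal{H}_t$, we have
\begin{align*}
\mu(C(h^t)) &= \sum_{s_T \in S_T} \mu(s_T)\, \mu(C(h^t) \mid s_T) \\
&= \sum_{s_T \in S_T} \mu(s_T)\, \mu\Big(\bigcap_{k=0}^t \{\omega \in \Omega : p_k \in M(M(A_k, U_k), W_k)\} \,\Big|\, s_T\Big) \\
&= \sum_{\substack{(s_0, \dots, s_T) \\ \exists \omega \in \Omega\, \forall t:\ s_t = \mathcal{F}_t(\omega)}} \mu(s_T)\, \mu\Big(\bigcap_{k=0}^t \{p_k \in M(M(A_k, U_{s_k}), W_k)\} \,\Big|\, s_T\Big) \\
&= \sum_{\substack{(s_0, \dots, s_T) \\ \exists \omega \in \Omega\, \forall t:\ s_t = \mathcal{F}_t(\omega)}} \mu(s_T) \prod_{k=0}^t \mu\big(\{p_k \in M(M(A_k, U_{s_k}), W_k)\} \mid s_k\big) \\
&= \sum_{\substack{(s_0, \dots, s_t) \\ \exists \omega \in \Omega\, \forall k \le t:\ s_k = \mathcal{F}_k(\omega)}} \prod_{k=0}^t \mu_k^{s_{k-1}}(s_k) \prod_{k=0}^t \tau_{s_k}(p_k, A_k) \\
&= \sum_{(s_0, \dots, s_t) \in S_0 \times \dots \times S_t} \prod_{k=0}^t \mu_k^{s_{k-1}}(s_k) \prod_{k=0}^t \tau_{s_k}(p_k, A_k),
\end{align*}
where the third equality follows from the fact that $(U_t)$ is $\mathcal{F}_t$-adapted, the fourth equality follows from part (ii) of the properness assumption on $(W_t)$, the final equality follows from the fact that $\prod_{k=0}^t \mu_k^{s_{k-1}}(s_k) = 0$ whenever $(s_0, \dots, s_t) \ne (\mathcal{F}_0(\omega), \dots, \mathcal{F}_t(\omega))$ for all $\omega$, and the remaining equalities hold by definition. Since $\rho_t(p_t; A_t \mid h^{t-1}) = \frac{\mu(C(h^t))}{\mu(C(h^{t-1}))}$ by (3), this shows that DREU2 holds.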
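The history probability in DREU2 is a sum over state paths of products of transition probabilities and state-wise choice probabilities, and the conditional choice probability is a ratio of two such sums. A minimal sketch for a hypothetical two-period example follows; the state names, transition probabilities, and the values `tau[k][s]` (standing in for $\tau_{s_k}(p_k, A_k)$ along a fixed history) are all illustrative assumptions.

```python
from fractions import Fraction as F

# Hypothetical two-period example (t = 0, 1).
mu0 = {"a": F(1, 2), "b": F(1, 2)}             # initial distribution on S_0
mu1 = {"a": {"a1": F(1, 3), "a2": F(2, 3)},    # transitions mu_1^{s_0}; supports
       "b": {"b1": F(1)}}                      # are disjoint across s_0
# tau[k][s]: assumed probability that state s selects p_k from A_k in period k.
tau = [{"a": F(3, 4), "b": F(1, 4)},
       {"a1": F(1), "a2": F(1, 2), "b1": F(0)}]

def prob_history(t):
    """Sum over paths (s_0, ..., s_t) of prod_k mu_k(s_k) * tau_{s_k}(p_k, A_k)."""
    total = F(0)
    for s0 in mu0:
        w = mu0[s0] * tau[0][s0]
        if t == 0:
            total += w
        else:
            total += w * sum(mu1[s0][s1] * tau[1][s1] for s1 in mu1[s0])
    return total

# Conditional probability of choosing p_1 from A_1 after history (A_0, p_0).
rho_1 = prob_history(1) / prob_history(0)
```

Here `prob_history(0)` plays the role of $\mu(C(h^{0}))$ and `prob_history(1)` that of $\mu(C(h^{1}))$, so `rho_1` is the ratio appearing in (3).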

G.2 Evolving Utility

“If” direction: Suppose $\rho$ admits an S-based evolving utility representation $(S_t, \{\mu_t^{s_{t-1}}\}_{s_{t-1} \in S_{t-1}}, \{U_{s_t}, u_{s_t}, \tau_{s_t}\}_{s_t \in S_t})_{t=0,\dots,T}$. Let $(\hat\Omega, \hat{\mathcal{F}}^*, \hat\mu, (\hat{\mathcal{F}}_t, \hat U_t, \hat W_t))$ denote the corresponding DREU representation of $\rho$ obtained in the “if” direction for DREU. In addition, define $\hat u_t : \hat\Omega \to \mathbb{R}^Z$ for each $t$ by $\hat u_t(\hat\omega) := u_{s_t}$ whenever $\operatorname{proj}_{S_t}(\hat\omega) = s_t$. Note that the process $(\hat u_t)$ is $\hat{\mathcal{F}}_t$-adapted. Moreover, for each $\hat\omega = (s_0, W_0, \dots, s_T, W_T)$, we have $\hat U_T(\hat\omega) = U_{s_T} = u_{s_T} = \hat u_T(\hat\omega)$ and for each $t \le T-1$ and $(z_t, A_{t+1})$,
\begin{align*}
\hat U_t(\hat\omega)(z_t, A_{t+1}) &= U_{s_t}(z_t, A_{t+1}) \\
&= u_{s_t}(z_t) + \sum_{s_{t+1} \in S_{t+1}} \mu_{t+1}^{s_t}(s_{t+1}) \max_{p_{t+1} \in A_{t+1}} U_{s_{t+1}}(p_{t+1}) \\
&= \hat u_t(\hat\omega)(z_t) + \sum_{s_{t+1} \in S_{t+1}} \hat\mu(s_{t+1} \mid s_t) \max_{p_{t+1} \in A_{t+1}} U_{s_{t+1}}(p_{t+1}) \\
&= \hat u_t(\hat\omega)(z_t) + E\Big[\max_{p_{t+1} \in A_{t+1}} \hat U_{t+1}(p_{t+1}) \,\Big|\, \hat{\mathcal{F}}_t(\hat\omega)\Big],
\end{align*}
where we let $\hat\mu(s_{t+1} \mid s_t) := \hat\mu\big(C(s_0, \dots, s_{t+1}) \mid C(s_0, \dots, s_t)\big)$. Thus we constructed an evolving utility representation with $\delta = 1$.

“Only if” direction: Suppose $\rho$ admits an evolving utility representation $(\Omega, \mathcal{F}^*, \mu, (\mathcal{F}_t, U_t, u_t, \delta, W_t))$. We take another evolving utility representation $(\Omega, \mathcal{F}^*, \mu, (\mathcal{F}_t, U_t', u_t', 1, W_t))$ that is constructed by setting $u_t' := \delta^t u_t$ for each $t$. Clearly this represents the same $\rho$. Based on this second representation, let $(S_t, \{\hat\mu_t^{s_{t-1}}\}_{s_{t-1} \in S_{t-1}}, \{\hat U_{s_t}, \tau_{s_t}\}_{s_t \in S_t})_{t=0,\dots,T}$ denote the corresponding S-based DREU representation obtained in the “only if” direction for DREU. In addition, for each $s_t$, define $\hat u_{s_t} \in \mathbb{R}^Z$ by $\hat u_{s_t} = u_t'(\omega)$ for any $\omega \in s_t$; this is well-defined as $(u_t')$ is $\mathcal{F}_t$-adapted. Reversing the argument in the previous part, we can verify that $\hat u_{s_T} = \hat U_{s_T}$ for each $s_T$ and
\[
\hat U_{s_t}(z_t, A_{t+1}) = \hat u_{s_t}(z_t) + \sum_{s_{t+1}} \hat\mu_{t+1}^{s_t}(s_{t+1}) \max_{p_{t+1} \in A_{t+1}} \hat U_{s_{t+1}}(p_{t+1})
\]
for each $s_t$ with $t \le T-1$.
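The S-based recursion $U_{s_t}(z_t, A_{t+1}) = u_{s_t}(z_t) + \sum_{s_{t+1}} \mu_{t+1}^{s_t}(s_{t+1}) \max_{p_{t+1} \in A_{t+1}} U_{s_{t+1}}(p_{t+1})$ can be evaluated directly by backward induction. The sketch below assumes a hypothetical two-period problem with one initial state and two terminal states, and it simplifies by letting continuation menus contain deterministic outcomes rather than lotteries; all names and numbers are illustrative.

```python
from fractions import Fraction as F

# Hypothetical primitives: terminal-period felicities u1[s1][z], period-0
# felicities u0[s0][z], and transition probabilities mu1[s0][s1].
u1 = {"h": {"apple": F(2), "pear": F(0)},
      "l": {"apple": F(0), "pear": F(1)}}
u0 = {"s": {"apple": F(1), "pear": F(1)}}
mu1 = {"s": {"h": F(1, 4), "l": F(3, 4)}}

def U0(s0, z, menu):
    """Period-0 utility of consuming z today and facing `menu` tomorrow.

    In the terminal period U_{s1} = u_{s1}, so the continuation value is the
    expected maximum of u1[s1] over the menu.
    """
    cont = sum(mu1[s0][s1] * max(u1[s1][p] for p in menu) for s1 in mu1[s0])
    return u0[s0][z] + cont
```

For instance, `U0("s", "apple", ["apple", "pear"])` evaluates to `F(9, 4)`, which exceeds the value of either singleton menu because the optimal continuation choice differs across the two terminal states.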

G.3 Gradual learning

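“If” direction: Suppose $\rho$ admits an S-based gradual learning representation $(S_t, \{\mu_t^{s_{t-1}}\}_{s_{t-1} \in S_{t-1}}, \{U_{s_t}, u_{s_t}, \tau_{s_t}\}_{s_t \in S_t}, \delta)_{t=0,\dots,T}$. Let $(\hat\Omega, \hat{\mathcal{F}}^*, \hat\mu, (\hat{\mathcal{F}}_t, \hat U_t, \hat u_t, \hat W_t))$ denote the corresponding evolving utility representation obtained in the “if” direction for evolving utility. In addition, define $\hat\delta := \delta$. Note that for each $\hat\omega = (s_0, W_0, \dots, s_T, W_T)$ and $t \le T-1$, we have
\[
\hat u_t(\hat\omega) = u_{s_t} = \frac{1}{\delta} \sum_{s_{t+1}} \mu_{t+1}^{s_t}(s_{t+1})\, u_{s_{t+1}} = \frac{1}{\hat\delta} E[\hat u_{t+1} \mid \hat{\mathcal{F}}_t(\hat\omega)].
\]
Iterating expectations, this yields $\hat u_t(\hat\omega) = \hat\delta^{t-T} E[\hat u_T \mid \hat{\mathcal{F}}_t(\hat\omega)] = \hat\delta^{t-T} E[\hat U_T \mid \hat{\mathcal{F}}_t(\hat\omega)]$. Replace $\hat U_t$ with $\hat U_t' := \hat\delta^{T-t} \hat U_t$ for each $t$. By Proposition 1, $(\hat\Omega, \hat{\mathcal{F}}^*, \hat\mu, (\hat{\mathcal{F}}_t, \hat U_t', \hat W_t))$ is still a DREU representation of $\rho$. Moreover, for each $t \le T-1$, we have
\begin{align*}
\hat U_t'(\hat\omega)(z_t, A_{t+1}) &= \hat\delta^{T-t} \hat u_t(\hat\omega)(z_t) + \hat\delta^{T-t} E\Big[\max_{p_{t+1} \in A_{t+1}} \hat U_{t+1}(p_{t+1}) \,\Big|\, \hat{\mathcal{F}}_t(\hat\omega)\Big] \\
&= E[\hat U_T'(z_t) \mid \hat{\mathcal{F}}_t(\hat\omega)] + \hat\delta\, E\Big[\max_{p_{t+1} \in A_{t+1}} \hat U_{t+1}'(p_{t+1}) \,\Big|\, \hat{\mathcal{F}}_t(\hat\omega)\Big].
\end{align*}
Thus, $(\hat\Omega, \hat{\mathcal{F}}^*, \hat\mu, (\hat{\mathcal{F}}_t, \hat U_t', \hat W_t), \hat\delta)$ is a gradual learning representation of $\rho$.

“Only if” direction: Suppose that $\rho$ admits a gradual learning representation $(\Omega, \mu, (\mathcal{F}_t, U_t, W_t), \delta)$. Let $U_t' := \delta^{t-T} U_t$ for all $t$. By Proposition 1, $(\Omega, \mu, (\mathcal{F}_t, U_t', W_t))$ is still a DREU representation of $\rho$. Moreover, let $u_t' := \delta^{t-T} u_t$, where $u_t(\omega) := E[U_T \mid \mathcal{F}_t(\omega)]$. By (2), for each $\omega$, we have $U_T'(\omega) = U_T(\omega) = u_T(\omega) = u_T'(\omega)$ and for all $t \le T-1$,
\begin{align*}
U_t'(\omega)(z_t, A_{t+1}) &= u_t'(\omega)(z_t) + \delta^{t-T} \delta\, E\Big[\max_{p_{t+1} \in A_{t+1}} U_{t+1}(p_{t+1}) \,\Big|\, \mathcal{F}_t(\omega)\Big] \\
&= u_t'(\omega)(z_t) + E\Big[\max_{p_{t+1} \in A_{t+1}} U_{t+1}'(p_{t+1}) \,\Big|\, \mathcal{F}_t(\omega)\Big].
\end{align*}
Thus, $(\Omega, \mu, (\mathcal{F}_t, U_t', u_t', W_t))$ is an evolving utility representation of $\rho$. Let $(S_t, \{\hat\mu_t^{s_{t-1}}\}_{s_{t-1} \in S_{t-1}}, \{\hat U_{s_t}', \hat u_{s_t}', \tau_{s_t}\}_{s_t \in S_t})_{t=0,\dots,T}$ denote the corresponding S-based evolving utility representation of $\rho$ obtained in the “only if” direction for evolving utility. In addition, define $\hat\delta := \delta$. Then for each $t \le T-1$ and $\omega$ with $\mathcal{F}_t(\omega) = s_t$, we have
\[
\hat u_{s_t}' = u_t'(\omega) = \hat\delta^{t-T} E[U_T \mid \mathcal{F}_t(\omega)] = \frac{1}{\hat\delta} E[u_{t+1}' \mid \mathcal{F}_t(\omega)] = \frac{1}{\hat\delta} \sum_{s_{t+1}} \hat\mu_{t+1}^{s_t}(s_{t+1})\, \hat u_{s_{t+1}}'.
\]
Thus $(S_t, \{\hat\mu_t^{s_{t-1}}\}_{s_{t-1} \in S_{t-1}}, \{\hat U_{s_t}', \hat u_{s_t}', \tau_{s_t}\}_{s_t \in S_t}, \hat\delta)_{t=0,\dots,T}$ is an S-based gradual learning representation of $\rho$. ∎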

H Proof of Proposition 1

H.1 “If” directions:

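DREU: Note first that because $(U_t)$, $(\alpha_t)$, and $(\beta_t)$ are $\mathcal{F}_t$-adapted and $(\mathcal{F}_t, U_t)$ are simple, (i) and (ii) imply $(\hat U_t)$ is $(\hat{\mathcal{F}}_t)$-adapted and $(\hat{\mathcal{F}}_t, \hat U_t)$ is simple. Consider any $h^t = (p_0, A_0, \dots, p_t, A_t) \in \mathcal{H}_t$. Then
\begin{align*}
\mu(C(h^t)) &= \sum_{\mathcal{F}_T(\omega) \in \Pi_T} \mu(\mathcal{F}_T(\omega))\, \mu\Big(\bigcap_{k=0}^t \{p_k \in M(M(A_k, U_k), W_k)\} \,\Big|\, \mathcal{F}_T(\omega)\Big) \\
&= \sum_{\mathcal{F}_t(\omega) \in \Pi_t} \prod_{k=0}^t \mu(\mathcal{F}_k(\omega) \mid \mathcal{F}_{k-1}(\omega))\, \mu\big(\{W_k \in N(M(A_k, U_k(\omega)), p_k)\} \mid \mathcal{F}_k(\omega)\big) \\
&= \sum_{\mathcal{F}_t(\omega) \in \Pi_t} \prod_{k=0}^t \hat\mu\big(g_k(\mathcal{F}_k(\omega)) \mid g_{k-1}(\mathcal{F}_{k-1}(\omega))\big)\, \hat\mu\big(\{\hat W_k \in N(M(A_k, U_k(\omega)), p_k)\} \mid g_k(\mathcal{F}_k(\omega))\big) \\
&= \sum_{\hat{\mathcal{F}}_t(\hat\omega) \in \hat\Pi_t} \prod_{k=0}^t \hat\mu\big(\hat{\mathcal{F}}_k(\hat\omega) \mid \hat{\mathcal{F}}_{k-1}(\hat\omega)\big)\, \hat\mu\big(\{\hat W_k \in N(M(A_k, \hat U_k(\hat\omega)), p_k)\} \mid \hat{\mathcal{F}}_k(\hat\omega)\big) \\
&= \sum_{\hat{\mathcal{F}}_T(\hat\omega) \in \hat\Pi_T} \hat\mu(\hat{\mathcal{F}}_T(\hat\omega))\, \hat\mu\Big(\bigcap_{k=0}^t \{p_k \in M(M(A_k, \hat U_k), \hat W_k)\} \,\Big|\, \hat{\mathcal{F}}_T(\hat\omega)\Big) \\
&= \hat\mu(\hat C(h^t)),
\end{align*}
where the second equality follows from properness of $(W_t)$ and $\mathcal{F}_t$-adaptedness of $(U_t)$, the third equality follows from assumptions (i) and (iii), the fourth equality from the fact that $g_t$ is a bijection and assumption (ii), the fifth equality from the properness of $(\hat W_t)$ and $\hat{\mathcal{F}}_t$-adaptedness of $(\hat U_t)$, and the first and last equalities hold by definition. Since $D$ represents $\rho$, this implies
\[
\rho_t(p_t, A_t \mid h^{t-1}) = \frac{\mu(C(h^t))}{\mu(C(h^{t-1}))} = \frac{\hat\mu(\hat C(h^t))}{\hat\mu(\hat C(h^{t-1}))} = \hat\mu\big[\hat C(p_t, A_t) \mid \hat C(h^{t-1})\big].
\]
Thus, $\hat D$ is a DREU representation of $\rho$ as required.

Evolving utility: By the “if” direction for DREU, $\hat D$ is a DREU representation of $\rho$. Moreover, by assumption (v) and since $(u_t)$, $\alpha_0$, $(\beta_t)$ are $(\mathcal{F}_t)$-adapted, it follows that $(\hat u_t)$ is $(\hat{\mathcal{F}}_t)$-adapted. It remains to show that $(\hat D, (\hat u_t), \hat\delta)$ satisfies (1). From assumptions (ii), (iv), and (v) it is immediate that $\hat U_T = \hat u_T$. Moreover, for all $t \le T-1$, and $\omega \in \Omega$, $\hat\omega \in g_t(\mathcal{F}_t(\omega))$, we have
\begin{align*}
\alpha_t(\omega) \hat U_t(\hat\omega)(z, A_{t+1}) &= U_t(\omega)(z, A_{t+1}) - \beta_t(\omega) \\
&= u_t(\omega)(z) - \beta_t(\omega) + \delta E_\mu\Big[\max_{p_{t+1} \in A_{t+1}} U_{t+1}(p_{t+1}) \,\Big|\, \mathcal{F}_t(\omega)\Big] \\
&= \alpha_t(\omega) \hat u_t(\hat\omega)(z) - \delta E_\mu[\beta_{t+1} \mid \mathcal{F}_t(\omega)] + \delta E_{\hat\mu}\Big[\alpha_{t+1} \max_{p_{t+1} \in A_{t+1}} \hat U_{t+1}(p_{t+1}) \,\Big|\, \hat{\mathcal{F}}_t(\hat\omega)\Big] + \delta E_\mu[\beta_{t+1} \mid \mathcal{F}_t(\omega)] \\
&= \alpha_t(\omega) \Big(\hat u_t(\hat\omega)(z) + \hat\delta E_{\hat\mu}\Big[\max_{p_{t+1} \in A_{t+1}} \hat U_{t+1}(p_{t+1}) \,\Big|\, \hat{\mathcal{F}}_t(\hat\omega)\Big]\Big),
\end{align*}
where the first equality follows from (ii), the second from (1) for $(D, (u_t), \delta)$, the third from (i), (ii), and (v) (and the fact $g_t$ is a bijection), and the fourth by (iv). Thus, $(\hat D, (\hat u_t), \hat\delta)$ satisfies (1).

Gradual learning: By the “if” direction for DREU, $\hat D$ is a DREU representation of $\rho$. Since $\delta = \hat\delta$ by (vi), it remains to show that $(\hat D, \hat\delta)$ satisfies (2). For all $t \le T-1$, $(z, A_{t+1})$, and $\omega \in \Omega$, $\hat\omega \in g_t(\mathcal{F}_t(\omega))$, we have
\begin{align*}
\alpha_0(\omega) \hat U_t(\hat\omega)(z, A_{t+1}) + \beta_t(\omega) &= U_t(\omega)(z, A_{t+1}) \\
&= E_\mu[U_T(z) \mid \mathcal{F}_t(\omega)] + \delta E_\mu\Big[\max_{p_{t+1} \in A_{t+1}} U_{t+1}(p_{t+1}) \,\Big|\, \mathcal{F}_t(\omega)\Big] \\
&= \alpha_0(\omega) E_{\hat\mu}[\hat U_T(z) \mid \hat{\mathcal{F}}_t(\hat\omega)] + E_\mu[\beta_T \mid \mathcal{F}_t(\omega)] + \alpha_0(\omega) \delta E_{\hat\mu}\Big[\max_{p_{t+1} \in A_{t+1}} \hat U_{t+1}(p_{t+1}) \,\Big|\, \hat{\mathcal{F}}_t(\hat\omega)\Big] + \delta E_\mu[\beta_{t+1} \mid \mathcal{F}_t(\omega)] \\
&= \alpha_0(\omega) \Big(E_{\hat\mu}[\hat U_T(z) \mid \hat{\mathcal{F}}_t(\hat\omega)] + \hat\delta E_{\hat\mu}\Big[\max_{p_{t+1} \in A_{t+1}} \hat U_{t+1}(p_{t+1}) \,\Big|\, \hat{\mathcal{F}}_t(\hat\omega)\Big]\Big) + \frac{1 - \delta^{T-t+1}}{1 - \delta} E_\mu[\beta_T \mid \mathcal{F}_t(\omega)],
\end{align*}
where the first equality follows from (ii) and (iv), the second from (2) for $(D, \delta)$, the third from (i), (ii), and (iv) (and the fact $g_t$ is a bijection), and the final equality from (vii). Since $\beta_t(\omega) = \frac{1 - \delta^{T-t+1}}{1 - \delta} E[\beta_T \mid \mathcal{F}_t(\omega)]$ by (vii), this shows that $(\hat D, \hat\delta)$ satisfies (2).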
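The final equality in the gradual learning step combines $E_\mu[\beta_T \mid \mathcal{F}_t]$ with $\delta E_\mu[\beta_{t+1} \mid \mathcal{F}_t]$ using the formula for $\beta_t$ quoted from (vii). At the level of coefficients this amounts to the identity $1 + \delta \cdot \frac{1 - \delta^{T-t}}{1 - \delta} = \frac{1 - \delta^{T-t+1}}{1 - \delta}$. The short check below verifies this identity in exact arithmetic for one hypothetical choice of $\delta \ne 1$ and $T$; the function name `coeff` is introduced only for illustration.

```python
from fractions import Fraction as F

def coeff(delta, T, t):
    """The weight (1 - delta^(T-t+1)) / (1 - delta) on E[beta_T | F_t]."""
    return (1 - delta ** (T - t + 1)) / (1 - delta)

# Hypothetical discount factor and horizon.
delta, T = F(9, 10), 5

# For each t <= T-1: coefficient at t equals 1 plus delta times coefficient at t+1.
checks = [coeff(delta, T, t) == 1 + delta * coeff(delta, T, t + 1)
          for t in range(T)]
```

The case $\delta = 1$ is excluded here because the closed form divides by $1 - \delta$.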

H.2 “Only if” directions:

DREU: Properness of $(\hat W_t)$ and the fact that each $\hat{\mathcal{F}}_t$ is generated by a finite partition $\hat\Pi_t$ are immediate from $\hat D$ being a DREU representation. Throughout the proof, for any $t$ and $E_t = \mathcal{F}_t(\omega) \in \Pi_t$, we let $U_t(E_t)$ denote $U_t(\omega)$ and likewise for $\hat U$; this is well-defined by adaptedness. We construct the sequence $(g_t, \alpha_t, \beta_t)$ inductively, dealing with the base case $t = 0$ and the inductive step simultaneously. Suppose $t \ge 0$ and that we have constructed $(g_{t'}, \alpha_{t'}, \beta_{t'})$ satisfying (i)–(iii) for all $t' < t$ (disregard the latter assumption if $t = 0$).

If $t > 0$, fix any $E_{t-1} = \mathcal{F}_{t-1}(\omega^*) \in \Pi_{t-1}$, let $\hat E_{t-1} := g_{t-1}(E_{t-1})$, and let $\Pi_t(E_{t-1}) := \{E_t = \mathcal{F}_t(\omega) \in \Pi_t : \mathcal{F}_{t-1}(\omega) = E_{t-1}\}$ and $\hat\Pi_t(\hat E_{t-1}) := \{\hat E_t = \hat{\mathcal{F}}_t(\hat\omega) \in \hat\Pi_t : \hat{\mathcal{F}}_{t-1}(\hat\omega) = \hat E_{t-1}\}$. As in the proof of Lemma 2, we can repeatedly apply Lemma 13 to find a separating history for $E_{t-1} = \mathcal{F}_{t-1}(\omega^*)$, i.e., a history $h^{t-1} = (B_0, q_0, \dots, B_{t-1}, q_{t-1}) \in \mathcal{H}^*_{t-1}$ such that $\{\omega \in \Omega : q_k \in M(B_k, U_k(\omega))\} = \mathcal{F}_k(\omega^*)$ for all $k = 0, \dots, t-1$. By inductive hypothesis $h^{t-1}$ is then also a separating history for $\hat E_{t-1}$. Thus, by Lemma 14 (and the translation to S-based DREU in Proposition 5), $C(h^{t-1}) = E_{t-1}$ and $\hat C(h^{t-1}) = \hat E_{t-1}$. If $t = 0$, then in the following we let $E_{t-1} := \Omega$, $\hat E_{t-1} := \hat\Omega$, $\Pi_t(E_{t-1}) := \Pi_0$, $\hat\Pi_t(\hat E_{t-1}) := \hat\Pi_0$, and we disregard all references to the separating history.

Enumerate $\Pi_t(E_{t-1}) = \{E_t^i : i = 1, \dots, m\}$ with corresponding utilities $U_t^i := U_t(E_t^i)$ and $\hat\Pi_t(\hat E_{t-1}) = \{\hat E_t^j : j = 1, \dots, \hat m\}$ with corresponding utilities $\hat U_t^j := \hat U_t(\hat E_t^j)$. Since $(\mathcal{F}_t, U_t)$ and $(\hat{\mathcal{F}}_t, \hat U_t)$ are both simple, we have $\mu(E_t^i) > 0$ for all $i$ and $U_t^i \not\approx U_t^{i'}$ for $i \ne i'$, and likewise $\hat\mu(\hat E_t^j) > 0$ for all $j$ and $\hat U_t^j \not\approx \hat U_t^{j'}$ for $j \ne j'$. Note that for every $j$ there exists a unique $i(j)$ such that $U_t^{i(j)} \approx \hat U_t^j$. Indeed, if such an $i(j)$ exists it is unique because all the $U_t^i$ represent different preferences. And the desired $i(j)$ exists, since otherwise by Lemma 13, we can find a menu $B_t = \{q_t^i : i = 1, \dots, m\} \cup \{\hat q_t^j\}$ such that $M(B_t, U_t^i) = \{q_t^i\}$ for each $i$ and $M(B_t, \hat U_t^j) = \{\hat q_t^j\}$. We can additionally assume (by replacing $h^{t-1}$ with an appropriate mixture if need be) that $h^{t-1} \in \mathcal{H}^*_{t-1}(B_t)$. Since $D$ and $\hat D$ both represent $\rho$, we obtain
\[
0 = \mu[C(\hat q_t^j, B_t) \mid E_{t-1}] = \rho_t(\hat q_t^j; B_t \mid h^{t-1}) = \hat\mu[\hat C(\hat q_t^j, B_t) \mid \hat E_{t-1}] \ge \hat\mu(\hat E_t^j \mid \hat E_{t-1}) > 0,
\]
a contradiction. Similarly, for every $i$, there exists a unique $j(i)$ such that $\hat U_t^{j(i)} \approx U_t^i$. Thus, defining $g_t : \Pi_t(E_{t-1}) \to \hat\Pi_t(\hat E_{t-1})$ by $g_t(E_t^i) = \hat E_t^{j(i)}$ yields a bijection. By construction, $U_t(E_t^i) \approx \hat U_t(g_t(E_t^i))$ for all $i$, so we can find $\alpha_t(E_t^i) \in \mathbb{R}_{++}$ and $\beta_t(E_t^i) \in \mathbb{R}$ such that $U_t(E_t^i) = \alpha_t(E_t^i) \hat U_t(g_t(E_t^i)) + \beta_t(E_t^i)$. Defining $\alpha_t(\omega) = \alpha_t(\mathcal{F}_t(\omega))$ and $\beta_t(\omega) = \beta_t(\mathcal{F}_t(\omega))$, this yields $\mathcal{F}_t$-measurable maps $\alpha_t, \beta_t : E_{t-1} \to \mathbb{R}$ such that (ii) holds for all $\omega \in E_{t-1}$. Moreover, applying Lemma 13 again, we can find a menu $D_t = \{r_t^i : i = 1, \dots, m\}$ such that $M(D_t, U_t^i) = \{r_t^i\}$ for each $i$. Again, slightly perturbing the separating history $h^{t-1}$ for $E_{t-1}$ if need be, we can assume that $h^{t-1} \in \mathcal{H}^*_{t-1}(D_t)$. Then by the representation, $\mu(E_t^i \mid E_{t-1}) = \rho_t(r_t^i; D_t \mid h^{t-1}) = \hat\mu(g_t(E_t^i) \mid \hat E_{t-1})$ for all $i$, yielding (i).

To show (iii), consider any $p_t \in A_t$, where we can again assume $h^{t-1} \in \mathcal{H}^*_{t-1}(\tfrac{1}{2} A_t + \tfrac{1}{2} D_t)$. Let $B_t^i := \{w \in \mathbb{R}^{X_t} : p_t \in M(M(A_t, U_t(E_t^i)), w)\}$. Note that by (ii), $B_t^i = \{w \in \mathbb{R}^{X_t} : p_t \in M(M(A_t, \hat U_t(g_t(E_t^i))), w)\}$. Thus, $\mu(\{W_t \in B_t^i\} \mid E_t^i) = \mu(C(p_t, A_t) \mid E_t^i)$ and $\hat\mu(\{\hat W_t \in B_t^i\} \mid g_t(E_t^i)) = \hat\mu(\hat C(p_t, A_t) \mid g_t(E_t^i))$. But since $D$ and $\hat D$ both represent $\rho$ and by choice of $D_t$,
\begin{align*}
\mu(E_t^i \mid E_{t-1})\, \mu[C(p_t, A_t) \mid E_t^i] &= \mu\big[C(\tfrac{1}{2} p_t + \tfrac{1}{2} r_t^i, \tfrac{1}{2} A_t + \tfrac{1}{2} D_t) \mid E_{t-1}\big] \\
&= \rho_t\big(\tfrac{1}{2} p_t + \tfrac{1}{2} r_t^i; \tfrac{1}{2} A_t + \tfrac{1}{2} D_t \mid h^{t-1}\big) \\
&= \hat\mu\big[\hat C(\tfrac{1}{2} p_t + \tfrac{1}{2} r_t^i, \tfrac{1}{2} A_t + \tfrac{1}{2} D_t) \mid \hat E_{t-1}\big] \\
&= \hat\mu(g_t(E_t^i) \mid \hat E_{t-1})\, \hat\mu[\hat C(p_t, A_t) \mid g_t(E_t^i)],
\end{align*}
which implies $\mu[C(p_t, A_t) \mid E_t^i] = \hat\mu[\hat C(p_t, A_t) \mid g_t(E_t^i)]$, since by (i) we have $\mu(E_t^i \mid E_{t-1}) = \hat\mu(g_t(E_t^i) \mid \hat E_{t-1})$. Thus, $\mu(\{W_t \in B_t^i\} \mid E_t^i) = \hat\mu(\{\hat W_t \in B_t^i\} \mid g_t(E_t^i))$, as required.

Finally, note that the collection $\{\Pi_t(E_{t-1}) : E_{t-1} \in \Pi_{t-1}\}$ partitions $\Pi_t$, and similarly $\{\hat\Pi_t(\hat E_{t-1}) : \hat E_{t-1} \in \hat\Pi_{t-1}\}$ partitions $\hat\Pi_t$. Thus, applying the above construction for every $E_{t-1} \in \Pi_{t-1}$ yields a bijection $g_t : \Pi_t \to \hat\Pi_t$ and $\mathcal{F}_t$-measurable maps $\alpha_t : \Omega \to \mathbb{R}_{++}$ and $\beta_t : \Omega \to \mathbb{R}$ such that (i)–(iii) are satisfied.

Evolving utility: The “only if” part for DREU yields sequences $(g_t, \alpha_t, \beta_t)$ such that (i)–(iii) are satisfied. It remains to show that (iv) and (v) hold.
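Throughout the proof, for any $E_t = \mathcal{F}_t(\omega) \in \Pi_t$, we sometimes use $U_t(E_t), \alpha_t(E_t), \beta_t(E_t)$ to denote $U_t(\omega), \alpha_t(\omega), \beta_t(\omega)$; this is well-defined since $U_t, \alpha_t, \beta_t$ are $\mathcal{F}_t$-measurable. We also let $\mathcal{F}_{t-1}(E_t) := \mathcal{F}_{t-1}(\omega)$; this is well-defined since $\mathcal{F}_t(\omega) = \mathcal{F}_t(\omega')$ implies $\mathcal{F}_{t-1}(\omega) = \mathcal{F}_{t-1}(\omega')$, as $(\mathcal{F}_t)$ is a filtration.

For (iv), fix any $\omega$ and $t \leq T - 1$. Let $E_t := \mathcal{F}_t(\omega)$ and pick any $A_{t+1}$, $B_{t+1}$ and $z_t$. Then
\[\begin{aligned}
& U_t(E_t)(z_t, A_{t+1}) - U_t(E_t)(z_t, B_{t+1}) \\
&= \alpha_t(E_t)\big(\hat U_t(g_t(E_t))(z_t, A_{t+1}) - \hat U_t(g_t(E_t))(z_t, B_{t+1})\big) \\
&= \alpha_t(E_t) \sum_{\hat E_{t+1} \in \hat\Pi_{t+1}} \hat\mu(\hat E_{t+1} \mid g_t(E_t)) \big[\max_{A_{t+1}} \hat U_{t+1}(\hat E_{t+1}) - \max_{B_{t+1}} \hat U_{t+1}(\hat E_{t+1})\big] \\
&= \alpha_t(E_t) \sum_{E_{t+1} \in \Pi_{t+1}} \hat\mu(g_{t+1}(E_{t+1}) \mid g_t(E_t)) \big[\max_{A_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1})) - \max_{B_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1}))\big] \\
&= \alpha_t(E_t) \sum_{E_{t+1} \in \Pi_{t+1}} \mu(E_{t+1} \mid E_t) \big[\max_{A_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1})) - \max_{B_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1}))\big] \\
&= \alpha_t(E_t) \sum_{E_{t+1} \text{ s.t. } \mathcal{F}_t(E_{t+1}) = E_t} \mu(E_{t+1} \mid E_t) \big[\max_{A_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1})) - \max_{B_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1}))\big],
\end{aligned} \tag{32}\]
where the first equality holds by (ii), the second equality follows from $\hat D$ being an evolving utility representation, the third equality from the fact that $g_t$ is a bijection, the fourth equality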

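from (i), and the fifth equality from the fact that $\mu(\mathcal{F}_{t+1}(\omega') \mid E_t) > 0$ iff $\mathcal{F}_t(\omega') = E_t$. At the same time, we have
\[\begin{aligned}
& U_t(E_t)(z_t, A_{t+1}) - U_t(E_t)(z_t, B_{t+1}) \\
&= \sum_{E_{t+1} \in \Pi_{t+1}} \mu(E_{t+1} \mid E_t) \big[\max_{A_{t+1}} U_{t+1}(E_{t+1}) - \max_{B_{t+1}} U_{t+1}(E_{t+1})\big] \\
&= \sum_{E_{t+1} \in \Pi_{t+1}} \mu(E_{t+1} \mid E_t)\,\alpha_{t+1}(E_{t+1}) \big[\max_{A_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1})) - \max_{B_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1}))\big] \\
&= \sum_{E_{t+1} \text{ s.t. } \mathcal{F}_t(E_{t+1}) = E_t} \mu(E_{t+1} \mid E_t)\,\alpha_{t+1}(E_{t+1}) \big[\max_{A_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1})) - \max_{B_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1}))\big],
\end{aligned} \tag{33}\]
where the first equality follows from $D$ being an evolving utility representation, the second equality from (ii), and the third equality from the fact that $\mu(\mathcal{F}_{t+1}(\omega') \mid E_t) > 0$ iff $\mathcal{F}_t(\omega') = E_t$. Combining (32) and (33), we have that for all $A_{t+1}$ and $B_{t+1}$,
\[\begin{aligned}
& \sum_{E_{t+1} \text{ s.t. } \mathcal{F}_t(E_{t+1}) = E_t} \mu(E_{t+1} \mid E_t)\,\alpha_t(E_t) \big[\max_{A_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1})) - \max_{B_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1}))\big] \\
&= \sum_{E_{t+1} \text{ s.t. } \mathcal{F}_t(E_{t+1}) = E_t} \mu(E_{t+1} \mid E_t)\,\alpha_{t+1}(E_{t+1}) \big[\max_{A_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1})) - \max_{B_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1}))\big].
\end{aligned} \tag{34}\]
Since $(\hat{\mathcal{F}}_t, \hat U_t)$ is simple and $g_t$ is a bijection, $\hat U_{t+1}(g_{t+1}(E_{t+1})) \not\approx \hat U_{t+1}(g_{t+1}(E'_{t+1}))$ for all distinct $E_{t+1}, E'_{t+1}$ with $\mathcal{F}_t(E_{t+1}) = E_t = \mathcal{F}_t(E'_{t+1})$. So by Lemma 13, we can find a menu $A_{t+1} := \{q_{t+1}^{E_{t+1}} : \mathcal{F}_t(E_{t+1}) = E_t\}$ such that for all $E_{t+1}$ with $\mathcal{F}_t(E_{t+1}) = E_t$ we have $M(A_{t+1}, \hat U_{t+1}(g_{t+1}(E_{t+1}))) = \{q_{t+1}^{E_{t+1}}\}$. Let $E^*_{t+1} := \mathcal{F}_{t+1}(\omega)$ and let $B_{t+1} = A_{t+1} \setminus \{q_{t+1}^{E^*_{t+1}}\}$. Then in (34), $[\max_{A_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1})) - \max_{B_{t+1}} \hat U_{t+1}(g_{t+1}(E_{t+1}))] \neq 0$ iff $E_{t+1} = E^*_{t+1}$. Hence, (34) implies $\alpha_t(\omega) = \alpha_t(E_t) = \alpha_{t+1}(E^*_{t+1}) = \alpha_{t+1}(\omega)$. Since this is true for all $t \leq T - 1$, (iv) follows.

For (v), note that the claim for $T$ is immediate from (ii) and the fact that $U_T = u_T$, $\hat U_T = \hat u_T$. Next, fix any $\omega \in \Omega$, $\hat\omega \in g_t(\mathcal{F}_t(\omega))$, $t \leq T - 1$, and $(z, \{p_{t+1}\})$. Then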

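\[\begin{aligned}
U_t(\omega)(z, \{p_{t+1}\}) &= u_t(\omega)(z) + E_\mu[U_{t+1}(p_{t+1}) \mid \mathcal{F}_t(\omega)] \\
&= u_t(\omega)(z) + \alpha_0(\omega) E_{\hat\mu}[\hat U_{t+1}(p_{t+1}) \mid \hat{\mathcal{F}}_t(\hat\omega)] + E_\mu[\beta_{t+1} \mid \mathcal{F}_t(\omega)],
\end{aligned} \tag{35}\]
where the first equality follows from $(D, (u_t))$ being an evolving utility representation and the second equality from (i), (ii), (iv) (and the fact that $g_t$ is a bijection). At the same time, we have
\[\begin{aligned}
U_t(\omega)(z, \{p_{t+1}\}) &= \alpha_0(\omega)\hat U_t(\hat\omega)(z, \{p_{t+1}\}) + \beta_t(\omega) \\
&= \alpha_0(\omega)\hat u_t(\hat\omega)(z) + \alpha_0(\omega) E_{\hat\mu}[\hat U_{t+1}(p_{t+1}) \mid \hat{\mathcal{F}}_t(\hat\omega)] + \beta_t(\omega),
\end{aligned} \tag{36}\]
where the first equality follows from (ii) and (iv) and the second equality from $(\hat D, (\hat u_t))$ being an evolving utility representation. Combining (35) and (36) yields the desired claim.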

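Gradual learning: Since $\rho$ admits a gradual learning representation and satisfies Condition 1, $\rho$ satisfies Axiom 9. Letting $\alpha^*$ be as in Axiom 9, we first show that $\delta = \frac{1-\alpha^*}{\alpha^*} = \hat\delta$. Indeed, combining the proofs of Proposition 5 (equivalence of gradual learning and S-based gradual learning representations) and of the necessity direction of Theorem 3, it is easy to see that for all $t \leq T - 1$, $h^t$, and $q_t, r_t \in \Delta(X_t)$, we have $q_t \succsim_{h^t} r_t$ if and only if $U_t(\omega)(q_t) \geq U_t(\omega)(r_t)$ for all $\omega \in C(h^t)$. Fix any $\omega \in \Omega$ and $h^0 = (A_0, p_0) \in \mathcal{H}^*_0$ such that $\{\omega' : p_0 \in M(A_0, U_0(\omega'))\} = \mathcal{F}_0(\omega)$ (which exists by Lemma 13). Then by Lemma 14, $C(h^0) = \mathcal{F}_0(\omega)$. So by Condition 1 applied to $h^0$, we can find $\ell, m, n \in \Delta(Z)$ such that $U_0(\omega)(\ell, n, \ldots, n) \neq U_0(\omega)(m, n, \ldots, n)$. Moreover, by Axiom 9 applied to $h^0$, we have $U_0(\omega)(\alpha^*\ell + (1-\alpha^*)m, \alpha^*\ell + (1-\alpha^*)m, n, \ldots, n) = U_0(\omega)(\ell, m, n, \ldots, n)$. Note that by (2) and iterated expectations, we have $U_0(\omega)(\ell^0, \ldots, \ell^T) = \sum_{k=0}^T \delta^k u_0(\omega)(\ell^k)$ for any stream of consumption lotteries $(\ell^0, \ldots, \ell^T)$, where $u_0(\omega) := E[U_T \mid \mathcal{F}_0(\omega)]$. Hence, by the preceding observations, we have $u_0(\omega)(\ell) \neq u_0(\omega)(m)$ and $(1+\delta)\big(\alpha^* u_0(\omega)(\ell) + (1-\alpha^*) u_0(\omega)(m)\big) = u_0(\omega)(\ell) + \delta u_0(\omega)(m)$. This implies $\delta = \frac{1-\alpha^*}{\alpha^*}$. The same argument for $\hat D$ shows $\hat\delta = \frac{1-\alpha^*}{\alpha^*}$, proving (vi).

Next, define $U'_t := \delta^{t-T} U_t$, $\hat U'_t := \hat\delta^{t-T} \hat U_t$, $u'_t := \delta^{t-T} u_t$, $\hat u'_t := \hat\delta^{t-T} \hat u_t$, where $u_t(\omega) := E[U_T \mid \mathcal{F}_t(\omega)]$, $\hat u_t(\hat\omega) := E[\hat U_T \mid \hat{\mathcal{F}}_t(\hat\omega)]$ for all $\omega, \hat\omega$. By the “if” part for DREU, $D' := (\Omega, \mathcal{F}^*, \mu, (\mathcal{F}_t, U'_t, W_t))$ and $\hat D' := (\hat\Omega, \hat{\mathcal{F}}^*, \hat\mu, (\hat{\mathcal{F}}_t, \hat U'_t, \hat W_t))$ are still DREU representations of $\rho$. Moreover, by (2), for each $\omega$, we have $U'_T(\omega) = U_T(\omega) = u_T(\omega) = u'_T(\omega)$ and for all $t \leq T - 1$
\[\begin{aligned}
U'_t(\omega)(z_t, A_{t+1}) &= u'_t(\omega)(z_t) + \delta^{t-T}\delta\, E[\max_{p_{t+1} \in A_{t+1}} U_{t+1}(p_{t+1}) \mid \mathcal{F}_t(\omega)] \\
&= u'_t(\omega)(z_t) + E[\max_{p_{t+1} \in A_{t+1}} U'_{t+1}(p_{t+1}) \mid \mathcal{F}_t(\omega)].
\end{aligned}\]
Thus, $(D', (u'_t))$ is an evolving utility representation of $\rho$ and likewise for $(\hat D', (\hat u'_t))$. By the “only if” direction for evolving utility, there exist $g_t, \alpha_t, \beta'_t$ such that $(D', (u'_t))$ and $(\hat D', (\hat u'_t))$ satisfy properties (i)–(v). Then (i) and (iii) hold for $D$ and $\hat D$ as well. Moreover, for any $t$, $\omega$ and $\hat\omega \in g_t(\mathcal{F}_t(\omega))$, we have $U'_t(\omega) = \alpha_0(\omega)\hat U'_t(\hat\omega) + \beta'_t(\omega)$. But since $U'_t = \delta^{t-T} U_t$ and $\hat U'_t = \hat\delta^{t-T} \hat U_t$ and $\delta = \hat\delta$, this implies $U_t(\omega) = \alpha_0(\omega)\hat U_t(\hat\omega) + \beta_t(\omega)$, where $\beta_t := \delta^{T-t}\beta'_t$. Thus, $g_t, \alpha_t, \beta_t$ satisfy (i)–(iv) for $D$ and $\hat D$.

Finally, note that by (v) for $(D', (u'_t))$ and $(\hat D', (\hat u'_t))$, we have $u'_t(\omega) = \alpha_0(\omega)\hat u'_t(\hat\omega) + \beta'_t(\omega) - E[\beta'_{t+1} \mid \mathcal{F}_t(\omega)]$ for all $t \leq T - 1$. Equivalently, $u_t(\omega) = \alpha_0(\omega)\hat u_t(\hat\omega) + \beta_t(\omega) - \delta E[\beta_{t+1} \mid \mathcal{F}_t(\omega)]$. Combining this with the fact that $u_t(\omega) = E_\mu[U_T \mid \mathcal{F}_t(\omega)] = \alpha_0(\omega) E_{\hat\mu}[\hat U_T \mid \hat{\mathcal{F}}_t(\hat\omega)] + E_\mu[\beta_T \mid \mathcal{F}_t(\omega)]$ and $\hat u_t(\hat\omega) = E_{\hat\mu}[\hat U_T \mid \hat{\mathcal{F}}_t(\hat\omega)]$ yields $\beta_t(\omega) = E[\beta_T \mid \mathcal{F}_t(\omega)] + \delta E[\beta_{t+1} \mid \mathcal{F}_t(\omega)]$. Then an easy inductive argument shows $\beta_t(\omega) = \frac{1-\delta^{T-t+1}}{1-\delta} E[\beta_T \mid \mathcal{F}_t(\omega)]$, proving (vii). □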
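The two algebraic steps in the gradual learning argument can be written out explicitly. The block below is only an expanded restatement of the displayed identities, writing $u$ for $u_0(\omega)$; the closed form for $\beta_t$ is stated under the assumption $\delta \neq 1$, since otherwise the ratio is not defined.

```latex
% Step 1: solving for delta. Both streams share the tail (n, ..., n),
% so only the first two periods matter.
\begin{align*}
(1+\delta)\bigl(\alpha^* u(\ell) + (1-\alpha^*) u(m)\bigr) - u(\ell) - \delta u(m)
  &= \bigl((1+\delta)\alpha^* - 1\bigr)\bigl(u(\ell) - u(m)\bigr) = 0 .
\end{align*}
% Since u(\ell) \neq u(m), this forces (1+\delta)\alpha^* = 1, i.e.
\[
  \delta = \frac{1-\alpha^*}{\alpha^*} .
\]
% Step 2: backward induction for beta_t (assuming \delta \neq 1).
% Base case t = T: \beta_T = E[\beta_T \mid \mathcal{F}_T].
% Inductive step, using the tower property of conditional expectation:
\begin{align*}
\beta_t
  &= E[\beta_T \mid \mathcal{F}_t] + \delta\, E[\beta_{t+1} \mid \mathcal{F}_t] \\
  &= E[\beta_T \mid \mathcal{F}_t]
     + \delta \sum_{j=0}^{T-t-1} \delta^{j}\, E[\beta_T \mid \mathcal{F}_t] \\
  &= \sum_{j=0}^{T-t} \delta^{j}\, E[\beta_T \mid \mathcal{F}_t]
   = \frac{1-\delta^{T-t+1}}{1-\delta}\, E[\beta_T \mid \mathcal{F}_t] .
\end{align*}
```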

I Proofs for Section 5.2

I.1 Proof of Proposition 2

(ii)⇒(i): Consider any $(h^{t-1}, A_t, p_t), (h^{t-1}, A_t, q_t) \in \mathcal{H}_t$ and $p_{t+1} \in A_{t+1} \in \mathcal{A}^*_{t+1}(h^{t-1}, A_t, p_t) \cap \mathcal{A}^*_{t+1}(h^{t-1}, A_t, q_t)$ such that $A_t$ and $A_{t+1}$ are atemporal, $A^Z_{t+1} \subseteq A^Z_t$ and $p^Z_{t+1} = p^Z_t =: p$. Let $\mathcal{U}_t(p) := \{u_t(\omega) : \omega \in C(h^{t-1}, A_t, p_t)\}$ and $\mathcal{U}_t(q) := \{u_t(\omega) : \omega \in C(h^{t-1}, A_t, q_t)\}$. Note that since $A_{t+1}$ features no ties, Lemma 14 implies $C(p_{t+1}, A_{t+1}) = \{\omega : p_{t+1} \in M(A_{t+1}, U_{t+1}(\omega))\}$, which since $A_{t+1}$ is atemporal is in turn equal to $\{\omega : p \in M(A^Z_{t+1}, u_{t+1}(\omega))\}$. Hence, by the representation,
\[\begin{aligned}
\rho_{t+1}(p_{t+1}; A_{t+1} \mid h^{t-1}, A_t, p_t) &= \mu(\{p \in M(A^Z_{t+1}, u_{t+1})\} \mid C(h^{t-1}, A_t, p_t)) \\
&\geq \min_{u \in \mathcal{U}_t(p)} \mu(\{p \in M(A^Z_{t+1}, u_{t+1})\} \mid C(h^{t-1}) \cap \{u_t \approx u\}).
\end{aligned} \tag{37}\]
Likewise,
\[\begin{aligned}
\rho_{t+1}(p_{t+1}; A_{t+1} \mid h^{t-1}, A_t, q_t) &= \mu(\{p \in M(A^Z_{t+1}, u_{t+1})\} \mid C(h^{t-1}, A_t, q_t)) \\
&\leq \max_{u' \in \mathcal{U}_t(q)} \mu(\{p \in M(A^Z_{t+1}, u_{t+1})\} \mid C(h^{t-1}) \cap \{u_t \approx u'\}).
\end{aligned} \tag{38}\]
Pick $u \in \mathcal{U}_t(p)$ (respectively, $u' \in \mathcal{U}_t(q)$) which achieve the min (respectively, max) in (37) (respectively, in (38)). Let $\{u^1_{t+1}, \ldots, u^m_{t+1}\} := \{u_{t+1}(\omega) : \omega \in C(h^{t-1}, A_t, p_t) \cup C(h^{t-1}, A_t, q_t) \text{ and } p \in M(A^Z_{t+1}, u_{t+1}(\omega))\}$ and let $D := \mathrm{co}\{u, u^1_{t+1}, \ldots, u^m_{t+1}\}$. Note that since $A^Z_t \supseteq A^Z_{t+1}$ and since $p^Z_t = p$, we have $p \in M(A^Z_{t+1}, u)$. Hence, $C(h^{t-1}) \cap \{u_t \approx u\} \cap \{p \in M(A^Z_{t+1}, u_{t+1})\} = C(h^{t-1}) \cap \{u_t \approx u\} \cap [D]$, and likewise $C(h^{t-1}) \cap \{u_t \approx u'\} \cap \{p \in M(A^Z_{t+1}, u_{t+1})\} = C(h^{t-1}) \cap \{u_t \approx u'\} \cap [D]$. Thus,
\[\begin{aligned}
\mu(\{p \in M(A^Z_{t+1}, u_{t+1})\} \mid C(h^{t-1}) \cap \{u_t \approx u\}) &= \mu([D] \mid C(h^{t-1}) \cap \{u_t \approx u\}) \\
&\geq \mu([D] \mid C(h^{t-1}) \cap \{u_t \approx u'\}) \\
&= \mu(\{p \in M(A^Z_{t+1}, u_{t+1})\} \mid C(h^{t-1}) \cap \{u_t \approx u'\}),
\end{aligned} \tag{39}\]
where the inequality holds by (ii). Combining (37), (38), and (39) yields $\rho_{t+1}(p_{t+1}; A_{t+1} \mid h^{t-1}, A_t, p_t) \geq \rho_{t+1}(p_{t+1}; A_{t+1} \mid h^{t-1}, A_t, q_t)$, as required.

(i)⇒(ii): We prove the contrapositive. Suppose that for some $u, u' \in \mathbb{R}^Z$ and $h^{t-1}$ with $C(h^{t-1}) \cap \{u_t \approx u\} \neq \emptyset \neq C(h^{t-1}) \cap \{u_t \approx u'\}$, and convex $D \subseteq \mathbb{R}^Z$ with $u \in D$, we have
\[ \mu(\{u_{t+1} \in [D]\} \mid C(h^{t-1}) \cap \{u_t \approx u\}) < \mu(\{u_{t+1} \in [D]\} \mid C(h^{t-1}) \cap \{u_t \approx u'\}). \tag{40}\]
Let $\mathcal{U}_{t+1}$ be the set of possible realizations of $u_{t+1}$ conditional on the event $C(h^{t-1}) \cap \{\omega : u_t \approx u \text{ or } u_t \approx u'\}$. Let $\mathcal{U}_t$ be the set of possible realizations of $u_t$ conditional on the event $C(h^{t-1})$. Utility functions in these sets are all non-constant by Condition 2 (Uniformly Ranked Pair). Enumerate $\{u^1_{t+1}, \ldots, u^m_{t+1}\} := \mathcal{U}_{t+1} \cap [D]$ and $\{\hat u^1_{t+1}, \ldots, \hat u^n_{t+1}\} := \mathcal{U}_{t+1} \setminus [D]$. Fix any $p^Z \in \mathrm{int}\,\Delta(Z)$. Note that for any $j = 1, \ldots, n$, $\hat u^j_{t+1}$ does not belong to $[\mathrm{co}\{u, u^1_{t+1}, \ldots, u^m_{t+1}\}]$. Thus, by Lemma 25, for each $j = 1, \ldots, n$, we can find a vector $w^j \in \mathbb{R}^Z$ with $\sum_z w^j(z) = 0$ such that $\hat u^j_{t+1} \cdot w^j > 0 \geq u^i_{t+1} \cdot w^j,\ u \cdot w^j$ for any $i = 1, \ldots, m$. For each $j$, we construct $q^Z(j) \in \Delta(Z)$ such that the vector $q^Z(j) - p^Z$ (in $\mathbb{R}^Z$) is proportional to $w^j$.$^{54}$ Thus for each $j = 1, \ldots, n$ and $i = 1, \ldots, m$, we have $\hat u^j_{t+1} \cdot (q^Z(j) - p^Z) > 0 \geq u^i_{t+1} \cdot (q^Z(j) - p^Z),\ u \cdot (q^Z(j) - p^Z)$.

Pick a uniformly ranked pair of consumption lotteries $\bar\ell, \underline\ell \in \Delta(Z)$ from Condition 2. Since $u \cdot (\bar\ell - \underline\ell) > 0$ and $u^i_{t+1} \cdot (\bar\ell - \underline\ell) > 0$ for each $i = 1, \ldots, m$, setting $\tilde p^Z := (1-\varepsilon)p^Z + \varepsilon(\bar p - p)$ for some $\varepsilon > 0$, we have $0 > u^i_{t+1} \cdot (q^Z(j) - \tilde p^Z),\ u \cdot (q^Z(j) - \tilde p^Z)$ for any $i = 1, \ldots, m$ and $j = 1, \ldots, n$. We can pick $\varepsilon$ sufficiently small so that $\tilde p^Z$ is indeed well-defined (i.e., belongs to $\Delta(Z)$) and that $\hat u^j_{t+1} \cdot (q^Z(j) - \tilde p^Z) > 0$ holds for each $j = 1, \ldots, n$. Finally, subject to perturbations of the lotteries $\{\tilde p^Z\} \cup \{q^Z(j) : j = 1, \ldots, n\}$, we can assume without loss$^{55}$ that $u_t(q^Z(j)) \neq u_t(q^Z(j'))$ for each $u_t \in \mathcal{U}_t$ and distinct $j, j' = 1, \ldots, n$ while preserving the fact that $\hat u^j_{t+1} \cdot (q^Z(j) - \tilde p^Z) > 0 > u^i_{t+1} \cdot (q^Z(j) - \tilde p^Z),\ u \cdot (q^Z(j) - \tilde p^Z)$ for any $i = 1, \ldots, m$ and $j = 1, \ldots, n$.

Construct an atemporal menu $A_{t+1}$ such that $A^Z_{t+1} = \{\tilde p^Z\} \cup \{q^Z(j) : j = 1, \ldots, n\}$ and such that for each $p \in A^Z_{t+1}$ there is a unique $p_{t+1} \in A_{t+1}$ such that $p^Z_{t+1} = p$. Note that since $\hat u^j_{t+1} \cdot (q^Z(j) - \tilde p^Z) > 0 > u^i_{t+1} \cdot (q^Z(j) - \tilde p^Z)$ for any $j = 1, \ldots, n$ and $i = 1, \ldots, m$, we have
\[ \mu(\tilde p^Z \in M(u_{t+1}, A^Z_{t+1}) \mid C(h^{t-1}), u_t \approx u) = \mu(\{u_{t+1} \in [D]\} \mid C(h^{t-1}), u_t \approx u). \]
Let $\{[u^1_t], \ldots, [u^k_t]\}$ denote the collection of equivalence classes of utilities in $\mathcal{U}_t$, and assume without loss that $u \in [u^1_t]$. By Lemma 13, we construct a collection of consumption lotteries $\{r^Z(l) : l = 1, \ldots, k\}$ such that $u^l_t(r^Z(l)) > u^l_t(r^Z(l'))$ for any distinct $l, l' = 1, \ldots, k$. Pick $\varepsilon' > 0$ sufficiently small such that $\tilde p^Z + \varepsilon'(r^Z(l) - r^Z(1)) \in \Delta(Z)$ for all $l = 2, \ldots, k$; such an $\varepsilon'$ exists as $\tilde p^Z$ is in the interior of $\Delta(Z)$. Thus, we can construct an atemporal menu $A_t$ such that $A^Z_t := \{\tilde p^Z\} \cup \{q^Z(j) : j = 1, \ldots, n\} \cup \{\tilde p^Z + \varepsilon'(r^Z(l) - r^Z(1)) : l = 2, \ldots, k\}$ and such that for each $p \in A^Z_t$ there is a unique $p_t \in A_t$ with $p^Z_t = p$. Recall that by construction $u_t(q^Z(j)) \neq u_t(q^Z(j'))$ for each $u_t \in \mathcal{U}_t$ and distinct $j, j' = 1, \ldots, n$. Moreover, for any $u_t \in \mathcal{U}_t$, $u_t(\tilde p^Z + \varepsilon'(r^Z(l) - r^Z(1)))$ is non-constant in $\varepsilon'$. Therefore, for small enough $\varepsilon' > 0$, we can guarantee that $|M(u_t, A^Z_t)| = 1$ for all $u_t \in \mathcal{U}_t$. Since $A_t$ is atemporal and each lottery in $A_t$ has a unique corresponding projection in $A^Z_t$, this ensures that $A_t \in \mathcal{A}^*_t(h^{t-1})$ (by Lemma 14). Since $u^l_t \cdot (r^Z(l) - r^Z(1)) > 0 > u \cdot (q^Z(j) - \tilde p^Z),\ u \cdot (r^Z(l) - r^Z(1))$ for each $j = 1, \ldots, n$ and $l = 2, \ldots, k$, we have $\mu(\tilde p^Z \in M(u_t, A^Z_t) \mid C(h^{t-1})) = \mu(\{u_t \approx u\} \mid C(h^{t-1}))$.

54 Note that such a construction is possible because $p^Z$ is in the interior of $\Delta(Z)$.
Let $p_t$ denote the unique lottery in $A_t$ such that $p^Z_t = \tilde p^Z$ and let $p_{t+1}$ denote the unique lottery in $A_{t+1}$ such that $p^Z_{t+1} = \tilde p^Z$. Since $u \not\approx u'$, which implies $u'(\tilde p^Z + \varepsilon'(r^Z(l) - r^Z(1))) > u'(\tilde p^Z)$ for the index $l$ such that $u' \in [u^l_t]$, there is a lottery $q_t \in A_t$ different from $p_t$ such that $M(A^Z_t, u') = \{q^Z_t\}$. Also, $|M(u_{t+1}, A^Z_{t+1})| = 1$ for all $u_{t+1} \in \mathcal{U}_{t+1}$, and thus $A_{t+1} \in \mathcal{A}^*_{t+1}(h^{t-1}, A_t, p_t) \cap \mathcal{A}^*_{t+1}(h^{t-1}, A_t, q_t)$.

In case $A_t \notin \mathcal{A}_t(h^{t-1})$, we can construct another history $\hat h^{t-1} = \lambda h^{t-1} + (1-\lambda)d^{t-1}$ by mixing $h^{t-1}$ with an appropriate degenerate history $d^{t-1}$ (as in the construction of the extended $\rho$, Definition 3) such that $C(h^{t-1}) = C(\hat h^{t-1})$ and $A_t \in \mathcal{A}_t(\hat h^{t-1})$. By the previous paragraphs, we have $A^Z_t \supseteq A^Z_{t+1}$ and $\rho_{t+1}(p_{t+1}; A_{t+1} \mid \hat h^{t-1}, A_t, p_t) = \mu(\{u_{t+1} \in [D]\} \mid C(h^{t-1}), u_t \approx u)$ and $\rho_{t+1}(p_{t+1}; A_{t+1} \mid \hat h^{t-1}, A_t, q_t) = \mu(\{u_{t+1} \in [D]\} \mid C(h^{t-1}), u_t \approx u')$. But then (40) implies that $\rho_{t+1}(p_{t+1}; A_{t+1} \mid \hat h^{t-1}, A_t, p_t) < \rho_{t+1}(p_{t+1}; A_{t+1} \mid \hat h^{t-1}, A_t, q_t)$, which is a violation of consumption persistence. □

55 Recall that each $u_t \in \mathcal{U}_t$ is non-constant.

Lemma 25. Take any finite $Y$ and a finite set of non-constant utilities $\{u^1, \ldots, u^m\} \subseteq \mathbb{R}^Y$. Then for any non-constant $u \in \mathbb{R}^Y$, the following are equivalent:

(i). for any $w \in \mathbb{R}^Y$ such that $\sum_{y \in Y} w(y) = 0$, $[\forall i = 1, \ldots, m,\ u^i \cdot w \leq 0] \Rightarrow u \cdot w \leq 0$.

(ii). $u \in [\mathrm{co}\{u^1, \ldots, u^m\}]$.

Proof. The result follows from the utilitarian aggregation theorem, e.g., Theorem 2 in Fishburn (1984). □

I.2 Proof of Proposition 3

(ii)⇒(i): Take any $t$, history $h^{t-1}$ and atemporal menus $A_t \in \mathcal{A}_t(h^{t-1})$ and $A_{t+1} \in \mathcal{A}^*_{t+1}(h^{t-1}, A_t, p_t)$ such that $A^Z_{t+1} \subseteq A^Z_t$. This implies the existence of a possible realization of period $t$ consumption utility $u_t = u$ conditional on $C(h^{t-1})$ such that $u(p^Z_t) > u(q^Z_t)$ for all $q^Z_t \in A^Z_t \setminus \{p^Z_t\}$. By (ii) there exists a possible realization of period $t+1$ consumption utility $u_{t+1} = u' \approx u$ conditional on $C(h^{t-1}) \cap \{u_t \approx u\}$. Note that, for any such $u'$, $A^Z_{t+1} \subseteq A^Z_t$ ensures $u'(p^Z_{t+1}) > u'(q^Z_{t+1})$ for all $q^Z_{t+1} \in A^Z_{t+1} \setminus \{p^Z_{t+1}\}$. This implies $\rho_{t+1}(p_{t+1}; A_{t+1} \mid h^{t-1}, A_t, p_t) \geq \mu(\{u_{t+1} \approx u\} \mid C(h^{t-1}), u_t \approx u) > 0$.

(i)⇒(ii): Suppose for a contradiction that there is some $t$ and history $h^{t-1}$ such that $\mu(\{u_{t+1} \approx u\} \mid C(h^{t-1}), u_t \approx u) = 0$ for some $u$ such that $\mu(u_t \approx u \mid C(h^{t-1})) > 0$. Let $([u^1], \ldots, [u^m])$ denote the possible equivalence classes of consumption utilities that can realize in either period $t$ or $t+1$, and suppose without loss that $u \in [u^1]$. Note that they are all non-constant by Condition 1 (Consumption Nondegeneracy). By applying Lemma 13 to consumption lotteries, we construct a collection of consumption lotteries $\{p^Z(i) : i = 1, \ldots, m\} \subseteq \Delta(Z)$ such that $u^i(p^Z(i)) > u^i(p^Z(j))$ for any distinct pair $i, j = 1, \ldots, m$ and $u^i \in [u^i]$. Based on this we take an atemporal menu $A_t$ such that $A^Z_t = \{p^Z(i) : i = 1, \ldots, m\}$. We also construct an atemporal menu $A_{t+1}$ such that $A^Z_{t+1} = \{p^Z(i) : i = 1, \ldots, m\}$. Let $p_t$ and $p_{t+1}$ denote the lotteries in $A_t$ and $A_{t+1}$ such that $p^Z_t = p^Z_{t+1} = p^Z(1)$. In this construction, without loss we require the support of each lottery in $A_t$ to contain $A_{t+1}$. By construction $A_t \in \mathcal{A}^*_t$ and $A_{t+1} \in \mathcal{A}^*_{t+1}$ (Lemma 14). Note in particular that, conditional on $C(h^{t-1})$, $p^Z_t \in M(A^Z_t, u_t)$ only if $u_t \approx u$, so that $C(h^{t-1}, A_t, p_t) = C(h^{t-1}) \cap \{u_t \approx u\}$ since there is no tie at $A_t$.

In case $A_t \notin \mathcal{A}_t(h^{t-1})$, by mixing $h^{t-1}$ with an appropriate degenerate history, we choose another history $\hat h^{t-1}$ that has the property that $C(h^{t-1}) = C(\hat h^{t-1})$ and $A_t \in \mathcal{A}_t(\hat h^{t-1})$. But
\[ \rho_{t+1}(p_{t+1}; A_{t+1} \mid \hat h^{t-1}, A_t, p_t) = \mu(\{u_{t+1} \approx u\} \mid C(h^{t-1}), u_t \approx u) = 0, \]
which contradicts consumption inertia. □

I.3 Proof of Corollary 1

First part, “only if”:

We consider the case $m \geq 2$ as otherwise the desired statement trivially holds with any $\alpha$. First, take any distinct pair of indices $i, j = 1, \ldots, m$. By consumption persistence and its characterization (Proposition 2),
\[ M_{ii} = \mu(\{u_1 \in [u^i]\} \mid u_0 = u^i) \geq \mu(\{u_1 \in [u^i]\} \mid u_0 = u^j) = M_{ji}. \tag{41}\]
(Note that by assumption both $u^i$ and $u^j$ arise with positive probability in period 0.) Next, take any distinct pair of indices $i, j = 1, \ldots, m$ and let $D = \mathrm{co}\{u^i, u^j\}$. Note that by assumption there is no $k \neq i, j$ such that $u^k \in [D]$. By consumption persistence and its characterization (Proposition 2),
\[ M_{ii} + M_{ij} = \mu(\{u_1 \in [D]\} \mid u_0 = u^i) = \mu(\{u_1 \in [D]\} \mid u_0 = u^j) = M_{jj} + M_{ji}. \tag{42}\]
We first consider the case $m = 2$. Since $1 = M_{11} + M_{12} = M_{22} + M_{21}$, we have $M_{11} - M_{21} = M_{22} - M_{12} =: \alpha$, which is nonnegative by (41). Since the Markov chain is irreducible, $M_{21}, M_{12} > 0$, which also ensures $\alpha < 1$. By setting $\nu(u^1) = \frac{M_{21}}{1-\alpha}$ and $\nu(u^2) = \frac{M_{12}}{1-\alpha}$, we obtain the desired form.

To consider the case $m \geq 3$, take any distinct triple of indices $i, j, k = 1, \ldots, m$ and let $D' = \mathrm{co}\{u^i, u^j, u^k\}$. By assumption there is no $l \neq i, j, k$ such that $u^l \in [D']$. By consumption persistence and its characterization (Proposition 2),
\[ M_{ii} + M_{ij} + M_{ik} = \mu(\{u_1 \in [D']\} \mid u_0 = u^i) = \mu(\{u_1 \in [D']\} \mid u_0 = u^j) = M_{jj} + M_{ji} + M_{jk}. \]
Combined with (42), this implies that $M_{ik} = M_{jk}$ for any $i, j, k$. Thus define $\beta_k := M_{ik}$ for each $k$. Here $\beta_k > 0$, because otherwise $\sum_{i \neq k} M_{ik} = 0$, which implies that the Markov chain is not irreducible. By (42), $M_{ii} - M_{ji} = M_{jj} - M_{ij}$ for any $i, j$, and thus $M_{ii} - \beta_i = M_{jj} - \beta_j =: \alpha$ for any $i, j$. Since the Markov chain is irreducible, $\alpha < 1$ (as otherwise $M_{ii} = 1$ for any $i$). We set $\nu(u^j) := \frac{\beta_j}{1-\alpha}$ for each $u^j$, which leads to the desired form.

First part, “if”: Take any $t$ and $[u], [u'] \subseteq \mathbb{R}^Z$ that correspond to possible realizations of period $t$ consumption utilities. For any $E_{t-1}$,
\[\begin{aligned}
\mu(\{u_{t+1} \in [D]\} \mid E_{t-1}, u_t \approx u) &= \alpha + (1-\alpha) \sum_{u^j \in [D]} \nu(u^j) \\
&\geq \alpha\,\mu(\{u_t \in [D]\} \mid E_{t-1}, u_t \approx u') + (1-\alpha) \sum_{u^j \in [D]} \nu(u^j) \\
&= \mu(\{u_{t+1} \in [D]\} \mid E_{t-1}, u_t \approx u').
\end{aligned}\]
Thus $\rho$ features choice persistence by Proposition 2.

Second part: The proof follows immediately by applying Proposition 3 and thus is omitted.
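As a numerical illustration of the “only if” construction (a sketch only, not part of the proof), the following Python snippet builds a transition matrix of the form $M_{ij} = \alpha 1\{i=j\} + (1-\alpha)\nu(u^j)$ and then recovers $\alpha$ and $\nu$ with the formulas used above: $\beta_k = M_{ik}$ for some $i \neq k$, $\alpha = M_{ii} - \beta_i$, and $\nu(u^j) = \beta_j/(1-\alpha)$. The function names `build` and `decompose` and the example numbers are hypothetical; exact rational arithmetic is used so that the comparison is not blurred by rounding.

```python
from fractions import Fraction

def build(alpha, nu):
    """Transition matrix M[i][j] = alpha * 1{i == j} + (1 - alpha) * nu[j]."""
    m = len(nu)
    return [[(alpha if i == j else 0) + (1 - alpha) * nu[j] for j in range(m)]
            for i in range(m)]

def decompose(M):
    """Recover (alpha, nu) from a matrix assumed to have the form above."""
    m = len(M)
    # beta_k is any off-diagonal entry of column k; we read it from row k+1 (mod m).
    beta = [M[(k + 1) % m][k] for k in range(m)]
    alpha = M[0][0] - beta[0]            # alpha = M_ii - beta_i (here i = 0)
    nu = [b / (1 - alpha) for b in beta]  # nu(u^j) = beta_j / (1 - alpha)
    return alpha, nu

# Hypothetical example with three utilities.
alpha = Fraction(1, 4)
nu = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
M = build(alpha, nu)
alpha_hat, nu_hat = decompose(M)

# Inequality (41): staying probabilities dominate the other entries of the column.
persistent = all(M[i][i] >= M[j][i] for i in range(3) for j in range(3))
```

With $\alpha \geq 0$, the check `persistent` mirrors inequality (41), since $M_{ii} - M_{ji} = \alpha$ for every $j \neq i$.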




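J Proofs for Section 6

Proof of Proposition 4. Let $E$ denote the support of the distribution of $(\varepsilon_1^{(x,\{y\})}, \varepsilon_1^{(x,\{z\})})$, which is also the support of the distribution of $(\varepsilon_2^y, \varepsilon_2^z)$. Fix a function $\sigma : E \to [0, 1]$ such that
\[ \sigma(\varepsilon^y, \varepsilon^z) \in \operatorname*{argmax}_{\alpha \in [0,1]} \ \alpha(\delta^2 v_2(y) + \delta^2 \varepsilon^y) + (1-\alpha)(\delta^2 v_2(z) + \delta^2 \varepsilon^z) \]
for each $(\varepsilon^y, \varepsilon^z) \in E$. Then
\[\begin{aligned}
& E_0[\max\{\delta^2 v_2(y) + \delta^2 \varepsilon_2^y,\ \delta^2 v_2(z) + \delta^2 \varepsilon_2^z\}] \\
&= E_0[\sigma(\varepsilon_2^y, \varepsilon_2^z)(\delta^2 v_2(y) + \delta^2 \varepsilon_2^y) + (1 - \sigma(\varepsilon_2^y, \varepsilon_2^z))(\delta^2 v_2(z) + \delta^2 \varepsilon_2^z)] \\
&= \delta^2(\alpha^* v_2(y) + (1-\alpha^*) v_2(z)) + \delta^2 E_0[\sigma(\varepsilon_2^y, \varepsilon_2^z)\varepsilon_2^y + (1 - \sigma(\varepsilon_2^y, \varepsilon_2^z))\varepsilon_2^z],
\end{aligned}\]
where $\alpha^* := E_0[\sigma(\varepsilon_2^y, \varepsilon_2^z)]$. Note that since $\varepsilon_2$ has mean zero, $\delta^2(\alpha^* v_2(y) + (1-\alpha^*) v_2(z))$ is the expected value the agent would obtain from $A^{\mathrm{late}}$ if in period 2 she chooses $y$ with probability $\alpha^*$ regardless of the realization of $\varepsilon_2$. This implies that the term $\delta^2 E_0[\sigma(\varepsilon_2^y, \varepsilon_2^z)\varepsilon_2^y + (1 - \sigma(\varepsilon_2^y, \varepsilon_2^z))\varepsilon_2^z]$ is nonnegative. At the same time,
\[\begin{aligned}
& E_0[\max\{\delta^2 v_2(y) + \delta \varepsilon_1^{(x,\{y\})},\ \delta^2 v_2(z) + \delta \varepsilon_1^{(x,\{z\})}\}] \\
&\geq E_0[\sigma(\varepsilon_1^{(x,\{y\})}, \varepsilon_1^{(x,\{z\})})(\delta^2 v_2(y) + \delta \varepsilon_1^{(x,\{y\})}) + (1 - \sigma(\varepsilon_1^{(x,\{y\})}, \varepsilon_1^{(x,\{z\})}))(\delta^2 v_2(z) + \delta \varepsilon_1^{(x,\{z\})})] \\
&= \delta^2(\alpha^* v_2(y) + (1-\alpha^*) v_2(z)) + \delta E_0[\sigma(\varepsilon_2^y, \varepsilon_2^z)\varepsilon_2^y + (1 - \sigma(\varepsilon_2^y, \varepsilon_2^z))\varepsilon_2^z],
\end{aligned}\]
where the equality uses the i.i.d. assumption on $\varepsilon_1$ and $\varepsilon_2$. Since the expectation term is nonnegative and $\delta \geq \delta^2$ when $\delta \leq 1$, it follows that $E_0[\max\{\delta^2 v_2(y) + \delta \varepsilon_1^{(x,\{y\})},\ \delta^2 v_2(z) + \delta \varepsilon_1^{(x,\{z\})}\}] \geq E_0[\max\{\delta^2 v_2(y) + \delta^2 \varepsilon_2^y,\ \delta^2 v_2(z) + \delta^2 \varepsilon_2^z\}]$. Thus the desired claim follows by the i.i.d. assumption on $\varepsilon_0$. □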
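The final inequality can be checked by exact enumeration in a toy case. The sketch below is illustrative only and rests on assumptions that are not in the text: each shock takes the values $-1$ and $+1$ with equal probability (so it has mean zero), the two shocks are independent of each other, and $\delta = 0.9$, $v_2(y) = 1$, $v_2(z) = 0.8$. The helper name `expected_max` is hypothetical. It compares the expected maximum when the shocks are scaled by $\delta$ (the period-1 shocks $\varepsilon_1$) with the case where they are scaled by $\delta^2$ (the period-2 shocks $\varepsilon_2$).

```python
from itertools import product

def expected_max(vy, vz, delta, scale, shocks=(-1.0, 1.0)):
    """E[max{delta^2*vy + scale*ey, delta^2*vz + scale*ez}] for shocks that are
    uniform on `shocks` and independent across the two options (an assumption)."""
    pairs = list(product(shocks, repeat=2))
    total = sum(max(delta**2 * vy + scale * ey, delta**2 * vz + scale * ez)
                for ey, ez in pairs)
    return total / len(pairs)

# Hypothetical parameter values with delta <= 1.
delta, vy, vz = 0.9, 1.0, 0.8

# Shocks scaled by delta (period-1 shocks) versus delta^2 (period-2 shocks).
early = expected_max(vy, vz, delta, scale=delta)
late = expected_max(vy, vz, delta, scale=delta**2)
```

In this example `early` is at least as large as `late`, in line with the inequality in the proof: a larger mean-zero shock scale raises the expected maximum.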

K Consumption Dependence

K.1 Enriched Primitive

To accommodate consumption dependence, we enrich the stochastic choice data studied in previous sections: We not only keep track of past choices of lotteries from menus, but also of the corresponding realized consumptions.

The dynamic stochastic choice rule $\rho$ is again defined recursively. The observed choice distribution at period 0 is summarized by a map $\rho_0 : \mathcal{A}_0 \to \Delta(\Delta(X_0))$ such that $\sum_{p_0 \in A_0} \rho_0(p_0; A_0) = 1$ for all $A_0$. The set of enriched period 0 histories $\mathfrak{H}_0 := \{(A_0, p_0, z_0) : \rho_0(p_0, A_0) > 0 \text{ and } z_0 \in \operatorname{supp} p_0^Z\}$ summarizes all choices $p_0$ from $A_0$ and realized consumptions $z_0$ that jointly occur with positive probability. For any history $\mathfrak{h}^0 = (A_0, p_0, z_0) \in \mathfrak{H}_0$, let $\mathcal{A}_1(\mathfrak{h}^0) := \{A_1 \in \mathcal{A}_1 : (z_0, A_1) \in \operatorname{supp} p_0\}$ denote the set of period 1 menus that follow $\mathfrak{h}^0$ with positive probability. For each $t = 1, \ldots, T$ and history $\mathfrak{h}^{t-1} \in \mathfrak{H}_{t-1}$, observed period $t$ choices following $\mathfrak{h}^{t-1}$ are summarized by a map $\rho_t(\cdot \mid \mathfrak{h}^{t-1}) : \mathcal{A}_t(\mathfrak{h}^{t-1}) \to \Delta(\Delta(X_t))$ such that $\sum_{p_t \in A_t} \rho_t(p_t; A_t \mid \mathfrak{h}^{t-1}) = 1$ for all $A_t \in \mathcal{A}_t(\mathfrak{h}^{t-1})$. The set of enriched period-$t$ histories is denoted by
\[ \mathfrak{H}_t := \{(\mathfrak{h}^{t-1}, A_t, p_t, z_t) : \mathfrak{h}^{t-1} \in \mathfrak{H}_{t-1};\ A_t \in \mathcal{A}_t(\mathfrak{h}^{t-1});\ \rho_t(p_t; A_t \mid \mathfrak{h}^{t-1}) > 0;\ z_t \in \operatorname{supp} p_t^Z\}. \]
For each $t \leq T - 1$, the set of period $t+1$ menus that follow history $\mathfrak{h}^t = (\mathfrak{h}^{t-1}, A_t, p_t, z_t)$ with positive probability is $\mathcal{A}_{t+1}(\mathfrak{h}^t) := \{A_{t+1} \in \mathcal{A}_{t+1} : (z_t, A_{t+1}) \in \operatorname{supp} p_t\}$ and the set of period-$t$ histories that lead to $A_{t+1}$ with positive probability is $\mathfrak{H}_t(A_{t+1}) := \{\mathfrak{h}^t \in \mathfrak{H}_t : A_{t+1} \in \mathcal{A}_{t+1}(\mathfrak{h}^t)\}$.

Finally, for each $t = 1, \ldots, T$ and consumption stream $z^{t-1} = (z_0, \ldots, z_{t-1}) \in Z^t$, let $\mathfrak{H}_{t-1}^{z^{t-1}} := \{\mathfrak{h}^{t-1} \in \mathfrak{H}_{t-1} : \mathfrak{h}^{t-1} = (A_0, p_0, z_0, \ldots, A_{t-1}, p_{t-1}, z_{t-1}) \text{ for some } A_0, p_0, \ldots, A_{t-1}, p_{t-1}\}$ denote the set of histories that give rise to consumption stream $z^{t-1}$; and let $\mathfrak{H}_{t-1}^{z^{t-1}}(A_t) := \mathfrak{H}_{t-1}^{z^{t-1}} \cap \mathfrak{H}_{t-1}(A_t)$.

K.2 Representations

We define the consumption-dependent versions of our representations by extending the S-based representations introduced in Section A. Equivalent analogs of the Ω-based representations from the main text can be defined, but are omitted to save space.

Definition 14. A consumption-dependent DREU (CDREU) representation of $\rho$ consists of tuples $(S_0, \mu_0, \{U_{s_0}, \tau_{s_0}\}_{s_0 \in S_0})$, $(S_t, \{\mu_t^{s_{t-1}, z_{t-1}}\}_{s_{t-1} \in S_{t-1}, z_{t-1} \in Z}, \{U_{s_t}, \tau_{s_t}\}_{s_t \in S_t})_{1 \leq t \leq T}$ such that for all $t = 0, \ldots, T$, we have:

CDREU1: For all $s_{t-1} \in S_{t-1}$ and $z_{t-1} \in Z$, $(S_t, \mu_t^{s_{t-1}, z_{t-1}}, \{U_{s_t}, \tau_{s_t}\}_{s_t \in S_t})$ is an REU form on $X_t$ such that$^{56}$

(a) $U_{s_t} \not\approx U_{s'_t}$ for any distinct $s_t, s'_t \in \operatorname{supp}(\mu_t^{s_{t-1}, z_{t-1}})$;

(b) $\operatorname{supp}(\mu_t^{s_{t-1}, z_{t-1}}) \cap \operatorname{supp}(\mu_t^{s'_{t-1}, z'_{t-1}}) = \emptyset$ for any distinct pairs $(s_{t-1}, z_{t-1}), (s'_{t-1}, z'_{t-1})$;

(c) $\bigcup_{s_{t-1} \in S_{t-1}, z_{t-1} \in Z} \operatorname{supp} \mu_t^{s_{t-1}, z_{t-1}} = S_t$.

CDREU2: For all $p_t$, $A_t$, and $\mathfrak{h}^{t-1} = (A_k, p_k, z_k)_{k=0}^{t-1} \in \mathfrak{H}_{t-1}(A_t)$,$^{57}$
\[ \rho_t(p_t, A_t \mid \mathfrak{h}^{t-1}) = \frac{\sum_{(s_0, \ldots, s_t) \in S_0 \times \ldots \times S_t} \prod_{k=0}^{t} \mu_k^{s_{k-1}, z_{k-1}}(s_k)\,\tau_{s_k}(p_k, A_k)}{\sum_{(s_0, \ldots, s_{t-1}) \in S_0 \times \ldots \times S_{t-1}} \prod_{k=0}^{t-1} \mu_k^{s_{k-1}, z_{k-1}}(s_k)\,\tau_{s_k}(p_k, A_k)}. \]

A consumption-dependent evolving utility representation of $\rho$ is a CDREU representation such that for all $t = 0, \ldots, T$, we additionally have:

CEVU: For all $s_t \in S_t$, there exists $u_{s_t} \in \mathbb{R}^Z$ such that for all $z_t \in Z$, $A_{t+1} \in \mathcal{A}_{t+1}$, we have
\[ U_{s_t}(z_t, A_{t+1}) = u_{s_t}(z_t) + V_{s_t, z_t}(A_{t+1}), \]
where $V_{s_t, z_t}(A_{t+1}) := \sum_{s_{t+1}} \mu_{t+1}^{s_t, z_t}(s_{t+1}) \max_{p_{t+1} \in A_{t+1}} U_{s_{t+1}}(p_{t+1})$ for $t \leq T - 1$ and $V_{s_T, z_T} \equiv 0$.

An active learning representation is a consumption-dependent evolving utility representation such that additionally:

CGL: There exists $\delta > 0$ such that for all $t = 0, \ldots, T - 1$, $z_t \in Z$, and $s_t \in S_t$, we have$^{58}$
\[ u_{s_t} = \frac{1}{\delta} \sum_{s_{t+1}} \mu_{t+1}^{s_t, z_t}(s_{t+1})\, u_{s_{t+1}}. \]
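To make the formula in CDREU2 concrete, the following Python sketch evaluates the ratio for a small hypothetical example. Everything in it is an assumption chosen for illustration: the state and lottery labels, the dictionary layout for $\mu_0$ and $\mu_k^{s_{k-1}, z_{k-1}}$, and the function names `forward_weights` and `rho`. Rather than enumerating all state sequences, it uses a forward recursion over states, which yields the same sums because each product factorizes period by period.

```python
def forward_weights(mu0, mu, tau, steps):
    """Return {s_k: sum over state paths ending in s_k of prod_j mu_j * tau_j}.

    steps is a list of (A_j, p_j, z_j); mu[k][(s_prev, z_prev)] maps states to
    probabilities; tau(k, s, p, A) is the choice probability of p from A in state s.
    """
    A0, p0, _ = steps[0]
    w = {s: pr * tau(0, s, p0, A0) for s, pr in mu0.items()}
    for k in range(1, len(steps)):
        A, p, _ = steps[k]
        z_prev = steps[k - 1][2]  # transitions depend on realized consumption
        new_w = {}
        for s_prev, w_prev in w.items():
            for s, pr in mu[k].get((s_prev, z_prev), {}).items():
                new_w[s] = new_w.get(s, 0.0) + w_prev * pr * tau(k, s, p, A)
        w = new_w
    return w

def rho(mu0, mu, tau, history, A_t, p_t):
    """Choice probability of p_t from A_t after `history`, as in CDREU2."""
    steps = list(history) + [(A_t, p_t, None)]
    num = sum(forward_weights(mu0, mu, tau, steps).values())
    if not history:          # period 0: the denominator is an empty product
        return num
    # Histories occur with positive probability, so the denominator is nonzero.
    den = sum(forward_weights(mu0, mu, tau, list(history)).values())
    return num / den

# Hypothetical two-period example: each state deterministically picks one lottery.
choice = {"a": "x", "b": "y", "a1": "x", "a2": "y", "b1": "y"}
tau = lambda k, s, p, A: 1.0 if choice[s] == p else 0.0
mu0 = {"a": 0.5, "b": 0.5}
mu = {1: {("a", "c"): {"a1": 0.75, "a2": 0.25}, ("b", "c"): {"b1": 1.0}}}
menu = ("x", "y")

r0 = rho(mu0, mu, tau, [], menu, "x")                  # period-0 choice of x
r_x = rho(mu0, mu, tau, [(menu, "x", "c")], menu, "x")  # after choosing x
r_y = rho(mu0, mu, tau, [(menu, "y", "c")], menu, "x")  # after choosing y
```

In this example the period-1 choice of `x` depends on the period-0 choice: the two histories reveal different period-0 states and therefore lead to different transition distributions.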

56 For $t = 0$, we abuse notation by letting $\mu_t^{s_{t-1}, z_{t-1}}$ denote $\mu_0$ for all $s_{t-1}, z_{t-1}$.

57 For $t = 0$, we again abuse notation by letting $\rho_t(\cdot \mid \mathfrak{h}^{t-1})$ denote $\rho_0(\cdot)$ for all $\mathfrak{h}^{t-1}$.

58 Note that subject to multiplying $U_{s_t}$ and $u_{s_t}$ by $\delta^{T-t}$ for each $t$ and $s_t$, this yields the representation in equation (8) in the main text.

K.3 Characterization

K.3.1 Consumption-Dependent DREU

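Consumption-dependent DREU is equivalent to natural analogs of Axioms 1–4 used to characterize DREU. The only difference is that in defining contraction equivalence, linear equivalence, the extended version of $\rho$, and convergence of histories, we need to restrict attention to histories whose sequences of realized consumptions coincide.

Given $\mathfrak{h}^{t-1} = (A_0, p_0, z_0, \ldots, A_{t-1}, p_{t-1}, z_{t-1}) \in \mathfrak{H}_{t-1}$, let $(\mathfrak{h}^{t-1}_{-k}, (A'_k, p'_k, z'_k))$ denote the sequence of the form $(A_0, p_0, z_0, \ldots, A'_k, p'_k, z'_k, \ldots, A_{t-1}, p_{t-1}, z_{t-1})$.$^{59}$ We say that $\mathfrak{g}^{t-1} \in \mathfrak{H}_{t-1}$ is contraction equivalent to $\mathfrak{h}^{t-1}$ if for some $k$, we have $\mathfrak{g}^{t-1} = (\mathfrak{h}^{t-1}_{-k}, (B_k, p_k, z_k))$, where $A_k \subseteq B_k$ and $\rho_k(p_k, A_k \mid \mathfrak{h}^{k-1}) = \rho_k(p_k, B_k \mid \mathfrak{h}^{k-1})$. We say that a finite set of histories $\mathfrak{H}^{t-1} \subseteq \mathfrak{H}_{t-1}$ is linearly equivalent to $\mathfrak{h}^{t-1}$ if
\[ \mathfrak{H}^{t-1} = \{(\mathfrak{h}^{t-1}_{-k}, (\lambda A_k + (1-\lambda)B_k, \lambda p_k + (1-\lambda)q_k, z_k)) : q_k \in B_k\} \]
for some $k$, $B_k$, and $\lambda \in (0, 1]$.

Axiom 12 (Contraction History Independence*). If $\mathfrak{g}^{t-1} \in \mathfrak{H}_{t-1}(A_t)$ is contraction equivalent to $\mathfrak{h}^{t-1} \in \mathfrak{H}_{t-1}(A_t)$, then $\rho_t(\cdot, A_t \mid \mathfrak{h}^{t-1}) = \rho_t(\cdot, A_t \mid \mathfrak{g}^{t-1})$.

Axiom 13 (Linear History Independence*). If $\mathfrak{H}^{t-1} \subseteq \mathfrak{H}_{t-1}(A_t)$ is linearly equivalent to $\mathfrak{h}^{t-1} \in \mathfrak{H}_{t-1}(A_t)$, then $\rho_t(\cdot; A_t \mid \mathfrak{h}^{t-1}) = \rho_t(\cdot; A_t \mid \mathfrak{H}^{t-1})$.

As before, $\rho_t(\cdot; A_t \mid \mathfrak{H}^{t-1})$ is shorthand for
\[ \rho_t(\cdot; A_t \mid \mathfrak{H}^{t-1}) := \frac{\sum_{\mathfrak{g}^{t-1} \in \mathfrak{H}^{t-1}} \rho(\mathfrak{g}^{t-1})\,\rho_t(\cdot, A_t \mid \mathfrak{g}^{t-1})}{\sum_{\mathfrak{g}^{t-1} \in \mathfrak{H}^{t-1}} \rho(\mathfrak{g}^{t-1})}, \]
where for any $\mathfrak{g}^{t-1} = (\hat A_0, \hat p_0, \hat z_0, \ldots, \hat A_{t-1}, \hat p_{t-1}, \hat z_{t-1})$, $\rho(\mathfrak{g}^{t-1}) := \prod_{k=0}^{t-1} \rho_k(\hat p_k, \hat A_k \mid \mathfrak{g}^{k-1})$.$^{60}$

Define the set of degenerate period-$(t-1)$ histories by $\mathfrak{D}_{t-1} := \{\mathfrak{d}^{t-1} \in \mathfrak{H}_{t-1} : \mathfrak{d}^{t-1} = (\{q_k\}, q_k, z_k)_{k=0}^{t-1} \text{ where } q_k \in \Delta(X_k)\ \forall k \leq t-1\}$, and write $\mathfrak{D}_{t-1}^{z^{t-1}}$ for the subset of degenerate histories that give rise to consumption stream $z^{t-1}$.

Definition 15. For any $t \geq 1$, $A_t \in \mathcal{A}_t$, $z^{t-1}$, and $\mathfrak{h}^{t-1} \in \mathfrak{H}_{t-1}^{z^{t-1}}$, define
\[ \rho_t^{\mathfrak{h}^{t-1}}(\cdot; A_t) := \rho_t(\cdot; A_t \mid \lambda \mathfrak{h}^{t-1} + (1-\lambda)\mathfrak{d}^{t-1}) \tag{43}\]
for some $\lambda \in (0, 1]$ and $\mathfrak{d}^{t-1} \in \mathfrak{D}_{t-1}^{z^{t-1}}$ such that $\lambda \mathfrak{h}^{t-1} + (1-\lambda)\mathfrak{d}^{t-1} \in \mathfrak{H}_{t-1}^{z^{t-1}}(A_t)$.$^{61}$

As before, it follows from Axiom 13 that the RHS of (43) does not depend on the specific choice of $\lambda$ and $\mathfrak{d}^{t-1}$. Moreover, $\rho_t^{\mathfrak{h}^{t-1}}(\cdot; A_t)$ coincides with $\rho_t(\cdot; A_t \mid \mathfrak{h}^{t-1})$ whenever $\mathfrak{h}^{t-1} \in \mathfrak{H}_{t-1}(A_t)$. Thus, in the following we again do not distinguish between the extended and nonextended version of $\rho_t$ and use $\rho_t(\cdot; A_t \mid \mathfrak{h}^{t-1})$ to denote both.

59 In general this is not a history in $\mathfrak{H}_{t-1}$, but it is if $(z_{k-1}, A'_k) \in \operatorname{supp} p_{k-1}$ and $(z_k, A_{k+1}) \in \operatorname{supp} p'_k$ and $\rho_k(p'_k; A'_k \mid \mathfrak{h}^{k-1}) > 0$.

60 As for the weights $\rho(g^{t-1})$ in Section 3.1, here we again do not keep track of the probabilities $\hat p_k(\hat z_k, \hat A_{k+1})$, as these do not reveal any private information to the analyst.

61 As before, we define $\lambda \mathfrak{h}^{t-1} + (1-\lambda)\mathfrak{d}^{t-1} := (\lambda A_k + (1-\lambda)\{q_k\}, \lambda p_k + (1-\lambda)q_k, z_k)_{k=0}^{t-1}$, where $\mathfrak{h}^{t-1} = (A_k, p_k, z_k)_{k=0}^{t-1}$ and $\mathfrak{d}^{t-1} = (\{q_k\}, q_k, z_k)_{k=0}^{t-1}$.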

Axiom 14 (Random Expected Utility*). For all $\mathfrak{h}^{t-1} \in \mathfrak{H}_{t-1}$, conditions (i)–(v) of Axiom 3 hold after replacing all instances of $h^{t-1}$ with $\mathfrak{h}^{t-1}$.

For any $0 \leq t \leq T$, we define the set $\mathcal{A}^*_t(\mathfrak{h}^{t-1})$ of period-$t$ menus without ties conditional on history $\mathfrak{h}^{t-1} \in \mathfrak{H}_{t-1}$ and the set $\mathfrak{H}^*_t$ of period-$t$ histories without ties as in the main text, replacing each instance of $h$ and $\mathcal{H}$ in Definition 4 with $\mathfrak{h}$ and $\mathfrak{H}$. We extend convergence in mixture to histories in $\mathfrak{H}_t$ by requiring additionally that each history in the sequence gives rise to the same sequence of consumptions: We write $\mathfrak{h}^{t,n} \to^m \mathfrak{h}^t$ for any $\mathfrak{h}^t = (A_0, p_0, z_0, \ldots, A_t, p_t, z_t)$, $\mathfrak{h}^{t,n} = (A_0^n, p_0^n, z_0, \ldots, A_t^n, p_t^n, z_t) \in \mathfrak{H}_t$ such that $A^n_{t'} \to^m A_{t'}$ and $p^n_{t'} \to^m p_{t'}$ for each $t'$.

Axiom 15 (History Continuity*). For all $0 \leq t \leq T - 1$, $A_{t+1}$, $p_{t+1}$, and $\mathfrak{h}^t \in \mathfrak{H}_t$,
\[ \rho_{t+1}(p_{t+1}; A_{t+1} \mid \mathfrak{h}^t) \in \mathrm{co}\{\lim_n \rho_{t+1}(p_{t+1}; A_{t+1} \mid \mathfrak{h}^{t,n}) : \mathfrak{h}^{t,n} \to^m \mathfrak{h}^t,\ \mathfrak{h}^{t,n} \in \mathfrak{H}^*_t\}. \]

Theorem 5. The following are equivalent:

(i). $\rho$ satisfies Axioms 12–15.

(ii). $\rho$ admits a CDREU representation.

K.3.2 Consumption-Dependent Evolving Utility

Consumption-dependent evolving utility does not in general satisfy the Separability axiom, as current consumption affects the expected utility over continuation menus through its effect on the transition distributions $\mu_{t+1}^{s_t, z_t}$. Instead, it is fully characterized by analogs of Axioms 6 (DLR Menu Preference) and 7 (Sophistication). To state these, we make use of the same history-dependent dominance relation as in Definition 5, except that histories now live in the enriched space $\mathfrak{H}_t$:

Definition 16. For each $t \leq T$ and $\mathfrak{h}^t = (\mathfrak{h}^{t-1}, A_t, p_t, z_t) \in \mathfrak{H}_t$, define relations $\succsim_{\mathfrak{h}^t}$, $\sim_{\mathfrak{h}^t}$, and $\succ_{\mathfrak{h}^t}$ on $\Delta(X_t)$ as follows. For any $q_t, r_t \in \Delta(X_t)$, $q_t \succsim_{\mathfrak{h}^t} r_t$ if there exist $q_t^n \to^m q_t$ and $r_t^n \to^m r_t$ such that
\[ \rho_t(\tfrac12 p_t + \tfrac12 r_t^n; \tfrac12 A_t + \tfrac12 \{q_t^n, r_t^n\} \mid \mathfrak{h}^{t-1}) = 0 \]
for all $n$. Let $\sim_{\mathfrak{h}^t}$ and $\succ_{\mathfrak{h}^t}$ respectively denote the symmetric and asymmetric component of $\succsim_{\mathfrak{h}^t}$.

As before, $q_t \succsim_{\mathfrak{h}^t} r_t$ reveals that at all states $s_t$ consistent with $\mathfrak{h}^t$ the agent prefers $q_t$ to $r_t$. Note that since the final realized consumption $z_t$ does not reveal any additional information about $s_t$, $\succsim_{\mathfrak{h}^t}$ depends only on $(\mathfrak{h}^{t-1}, A_t, p_t)$ and not on $z_t$. We have the following analogs of Axioms 6 and 7:

Axiom 16 (DLR Menu Preference*). For all $t \leq T - 1$ and $\mathfrak{h}^t$, conditions (i)–(iv) of Axiom 6 hold after replacing all instances of $h^t$ with $\mathfrak{h}^t$.

Axiom 17 (Sophistication*). For all $t \leq T - 1$, $\mathfrak{h}^t = (\mathfrak{h}^{t-1}, A_t, p_t, z_t) \in \mathfrak{H}_t$, and $A_{t+1} \subseteq A'_{t+1} \in \mathcal{A}^*_{t+1}(\mathfrak{h}^t)$, the following are equivalent:

(i). $\rho_{t+1}(p_{t+1}; A'_{t+1} \mid \mathfrak{h}^t) > 0$ for some $p_{t+1} \in A'_{t+1} \setminus A_{t+1}$.

(ii). $(z_t, A'_{t+1}) \succ_{\mathfrak{h}^t} (z_t, A_{t+1})$.

A slight departure from Axiom 7 is that because both the agent’s preference over period $t+1$ menus and her period $t+1$ choices may be influenced by her period $t$ consumption, point (ii) of Axiom 17 only imposes a preference for $A'_{t+1}$ over $A_{t+1}$ when the corresponding period $t$ consumption is the consumption $z_t$ that arises under $\mathfrak{h}^t$.

Theorem 6. Suppose that $\rho$ admits a CDREU representation. The following are equivalent:

(i). $\rho$ satisfies Axioms 16–17.

(ii). $\rho$ admits a consumption-dependent evolving utility representation.

K.3.3 Active Learning

As with gradual learning, active learning is characterized by additional restrictions on the agent’s preference over streams of consumption lotteries. In contrast with consumption-dependent evolving utility, active learning restores a weak form of Separability, requiring that the agent’s payoff to today’s consumption be independent of tomorrow’s consumption lottery stream. This captures the idea that since consumption lottery streams only entail degenerate future choices, today’s consumption does not have any informational value in this case and hence yields the same payoff regardless of the future stream:

Axiom 18 (Stream Separability). For all $\mathfrak{h}^t$, $z, z' \in Z$, $(\ell_k)_{k=t+1}^T, (m_k)_{k=t+1}^T \in \Delta(Z)$,
\[ \tfrac12 (z, \ell_{t+1}, \ldots, \ell_T) + \tfrac12 (z', m_{t+1}, \ldots, m_T) \sim_{\mathfrak{h}^t} \tfrac12 (z', \ell_{t+1}, \ldots, \ell_T) + \tfrac12 (z, m_{t+1}, \ldots, m_T). \]

Additionally, active learning entails analogs of Axioms 8–9 that we used to characterize gradual learning:

Axiom 19 (Stationary Consumption Preference*). For all $t \leq T - 1$, $\ell, m, n \in \Delta(Z)$, and $\mathfrak{h}^t$, $(\ell, n, \ldots, n) \succ_{\mathfrak{h}^t} (m, n, \ldots, n)$ if and only if $(n, \ell, n, \ldots, n) \succ_{\mathfrak{h}^t} (n, m, n, \ldots, n)$.

We say that $\ell, m \in \Delta(Z)$ are $\mathfrak{h}^t$-nonindifferent if $(\ell, n, \ldots, n) \not\sim_{\mathfrak{h}^t} (m, n, \ldots, n)$ for some $n \in \Delta(Z)$.

Axiom 20 (Constant Intertemporal Tradeoff*). If $\ell, m$ are $\mathfrak{h}^t$-nonindifferent and $\hat\ell, \hat m$ are $\mathfrak{g}^{\hat t}$-nonindifferent, then for all $\alpha \in [0, 1]$ and $n \in \Delta(Z)$:
\[ (\ell, m, n, \ldots, n) \sim_{\mathfrak{h}^t} (\alpha\ell + (1-\alpha)m, \alpha\ell + (1-\alpha)m, n, \ldots, n) \iff (\hat\ell, \hat m, n, \ldots, n) \sim_{\mathfrak{g}^{\hat t}} (\alpha\hat\ell + (1-\alpha)\hat m, \alpha\hat\ell + (1-\alpha)\hat m, n, \ldots, n). \]

As before, the axiom is vacuous unless we impose Consumption Nondegeneracy:

Condition 3 (Consumption Nondegeneracy*). For all $t \leq T - 1$ and $\mathfrak{h}^t$, there exist $\mathfrak{h}^t$-nonindifferent $\ell, m \in \Delta(Z)$.

When Condition 3 is satisfied, Axioms 18–20 fully characterize the gap between consumption-dependent evolving utility and the special case of active learning:

Theorem 7. Suppose $\rho$ admits a consumption-dependent evolving utility representation and satisfies Condition 3. The following are equivalent:

(i). $\rho$ satisfies Axioms 18–20.

(ii). $\rho$ admits an active learning representation.

K.4 Proof Sketches for Theorems 5–7

K.4.1 Theorem 5

We first extend the terminology from Section A.3 to the enriched setting. Suppose that $(S_{t'}, \{\mu_{t'}^{s_{t'-1}, z_{t'-1}}\}_{(s_{t'-1}, z_{t'-1}) \in S_{t'-1} \times Z}, \{U_{s_{t'}}, \tau_{s_{t'}}\}_{s_{t'} \in S_{t'}})$ satisfy CDREU1 and CDREU2 from Definition 14 for each $t' \leq t$. Fix any state $s^*_t \in S_t$ and consumption $z^*_t \in Z$. We let $\mathrm{pred}(s^*_t)$ denote the unique predecessor sequence $(s^*_0, z^*_0, \ldots, s^*_{t-1}, z^*_{t-1}) \in S_0 \times Z \times \ldots \times S_{t-1} \times Z$, given by assumptions CDREU1 (b) and (c), such that $s^*_{k+1} \in \operatorname{supp}(\mu_{k+1}^{s^*_k, z^*_k})$ for each $k = 0, \ldots, t-1$. Given any history $\mathfrak{h}^t = (A_0, p_0, z_0, \ldots, A_t, p_t, z_t)$, we say that $(s^*_t, z^*_t)$ is consistent with $\mathfrak{h}^t$ if $\prod_{k=0}^t \tau_{s^*_k}(p_k, A_k) > 0$ and $z_k = z^*_k$ for all $k = 0, \ldots, t$. A separating history for $(s^*_t, z^*_t)$ is a history $\mathfrak{h}^t = (B_0, q_0, z_0, \ldots, B_t, q_t, z_t) \in \mathfrak{H}^*_t$ such that $\mathcal{U}_{s^*_{k-1}, z^*_{k-1}}(B_k, q_k) = \{U_{s^*_k}\}$ and $z_k = z^*_k$ for all $k = 0, \ldots, t$, where $\mathcal{U}_{s^*_{k-1}, z^*_{k-1}}(B_k, q_k) := \{U_{s_k} : s_k \in \operatorname{supp} \mu_k^{s^*_{k-1}, z^*_{k-1}} \text{ and } q_k \in M(B_k, U_{s_k})\}$.$^{62}$ Analogs of Lemmas 1 and 2 are readily established; in particular, any $(s^*_t, z^*_t) \in S_t \times Z$ admits a separating history.

Sufficiency: The sufficiency direction of the proof proceeds as in Section ??. Specifically, assuming that for some $t \le T-1$ we have constructed $(S_{t'}, \{\mu_{t'}^{s_{t'-1},z_{t'-1}}\}_{(s_{t'-1},z_{t'-1}) \in S_{t'-1} \times Z}, \{U_{s_{t'}}, \tau_{s_{t'}}\}_{s_{t'} \in S_{t'}})$ satisfying CDREU1 and CDREU2 for each $t' \le t$, we construct $(S_{t+1}, \{\mu_{t+1}^{s_t,z_t}\}_{(s_t,z_t) \in S_t \times Z}, \{U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}})$ satisfying CDREU1 and CDREU2 as follows: For any $(s_t, z_t) \in S_t \times Z$, we define $\rho_{t+1}^{s_t,z_t}(\cdot, A_{t+1}) := \rho_{t+1}(\cdot, A_{t+1} \mid h^t(s_t, z_t))$, where $h^t(s_t, z_t)$ is an arbitrary separating history for $(s_t, z_t)$. Using Axiom 14 and Theorem 4, we then find $(S_{t+1}, \{\mu_{t+1}^{s_t,z_t}\}_{(s_t,z_t) \in S_t \times Z}, \{U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}})$ such that CDREU1 holds and
\[
\rho_{t+1}^{s_t,z_t}(p_{t+1}, A_{t+1}) = \sum_{s_{t+1} \in S_{t+1}} \mu_{t+1}^{s_t,z_t}(s_{t+1})\, \tau_{s_{t+1}}(p_{t+1}, A_{t+1}). \tag{44}
\]

Analogous arguments as in Section ?? yield analogs of Lemma 3, showing that for any history $h^t$ that can only arise at $(s_t, z_t)$, we have $\rho_{t+1}^{s_t,z_t} = \rho_{t+1}(\cdot \mid h^t)$; and of Lemma 4, showing that for any $h^t = (A_0, p_0, z_0, \dots, A_t, p_t, z_t) \in \mathcal{H}_t(A_{t+1})$, we have
\[
\rho_{t+1}(p_{t+1}, A_{t+1} \mid h^t) = \frac{\sum_{(s_0, \dots, s_t) \in S_0 \times \dots \times S_t} \prod_{k=0}^{t} \mu_k^{s_{k-1},z_{k-1}}(s_k)\, \tau_{s_k}(A_k, p_k)\, \rho_{t+1}^{s_t,z_t}(p_{t+1}, A_{t+1})}{\sum_{(s_0, \dots, s_t) \in S_0 \times \dots \times S_t} \prod_{k=0}^{t} \mu_k^{s_{k-1},z_{k-1}}(s_k)\, \tau_{s_k}(A_k, p_k)}. \tag{45}
\]

62 As usual, $\mathcal{U}_{s_{-1}^*, z_{-1}^*}(B_0, q_0)$ denotes $\mathcal{U}_0(B_0, q_0) := \{U_{s_0} : s_0 \in S_0 \text{ and } q_0 \in M(B_0, U_{s_0})\}$.


Combining (45) and (44) shows that $(S_{t+1}, \{\mu_{t+1}^{s_t,z_t}\}_{(s_t,z_t) \in S_t \times Z}, \{U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}})$ also satisfies CDREU2.
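The weighting in (45) can be illustrated numerically. The following is a minimal sketch for the two-period case ($t = 0$) with finitely many states, assuming the tie-breaking rules $\tau$ are tabulated as dictionaries keyed by (choice, menu) pairs. The function name, the data layout, and all numbers are hypothetical and chosen only for illustration; they are not part of the model's definition.

```python
def next_period_choice_prob(mu0, tau0, mu1, tau1, p0, A0, z0, p1, A1):
    """Two-period instance of (45): probability of choosing p1 from A1
    after the history (A0, p0, z0).

    mu0:  dict state -> prior probability of the period-0 state
    tau0: dict state -> dict (choice, menu) -> choice probability in period 0
    mu1:  dict (state, consumption) -> dict next_state -> transition probability
    tau1: dict next_state -> dict (choice, menu) -> choice probability in period 1
    """
    numerator = 0.0
    denominator = 0.0
    for s0, prior in mu0.items():
        # Weight of s0: prior times likelihood of the observed period-0 choice.
        weight = prior * tau0[s0].get((p0, A0), 0.0)
        # State-contingent period-1 choice probability, as in (44).
        rho_state = sum(
            prob * tau1[s1].get((p1, A1), 0.0)
            for s1, prob in mu1[(s0, z0)].items()
        )
        numerator += weight * rho_state
        denominator += weight
    if denominator == 0.0:
        raise ValueError("history has zero probability under the representation")
    return numerator / denominator


# Hypothetical example: two period-0 states with disjoint successor states.
A0 = ("p", "q")
A1 = ("x", "y")
mu0 = {"a": 0.5, "b": 0.5}
tau0 = {
    "a": {("p", A0): 1.0, ("q", A0): 0.0},
    "b": {("p", A0): 0.5, ("q", A0): 0.5},
}
mu1 = {("a", "z"): {"a1": 1.0}, ("b", "z"): {"b1": 1.0}}
tau1 = {
    "a1": {("x", A1): 0.9, ("y", A1): 0.1},
    "b1": {("x", A1): 0.3, ("y", A1): 0.7},
}

example_prob = next_period_choice_prob(mu0, tau0, mu1, tau1, "p", A0, "z", "x", A1)
print(example_prob)
```

In this example, observing the choice of `p` shifts weight toward state `a`, so the history-dependent probability of `x` lies between the two state-contingent probabilities 0.9 and 0.3, closer to the former.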

Necessity: Necessity of Axioms 12–15 is established using analogous arguments to Section ??.

K.4.2 Theorem 6

We first record the following analog of Lemma 5, which admits an analogous proof. Given $h^t = (h^{t-1}, A_t, p_t, z_t)$, we say that $s_t$ is consistent with $h^t$ if $(s_t, z_t)$ is consistent with $h^t$, and we call $h^t$ a separating history for $s_t$ if it is a separating history for $(s_t, z_t)$.

Lemma 26. Suppose that $\rho$ admits a CDREU representation. Consider any $t \le T$, $h^t = (A_0, p_0, z_0, \dots, A_t, p_t, z_t) \in \mathcal{H}_t$, and $q_t, r_t \in \Delta(X_t)$.

(i). If $q_t \succsim_{h^t} r_t$, then $U_{s_t}(q_t) \ge U_{s_t}(r_t)$ for all $s_t$ consistent with $h^t$.

(ii). Suppose there exist $g, b \in \Delta(X_t)$ such that $U_{s_t}(g) > U_{s_t}(b)$ for all $s_t$ consistent with $h^t$. If $U_{s_t}(q_t) \ge U_{s_t}(r_t)$ for all $s_t$ consistent with $h^t$, then $q_t \succsim_{h^t} r_t$.

(iii). If $h^t$ is a separating history for $s_t$, then $q_t \succsim_{h^t} r_t$ if and only if $U_{s_t}(q_t) \ge U_{s_t}(r_t)$.

Sufficiency: We proceed by an analogous inductive argument as in Section ??. Assume that for some $t \le T-1$, we have obtained $(S_{t'}, \{\mu_{t'}^{s_{t'-1},z_{t'-1}}\}_{(s_{t'-1},z_{t'-1}) \in S_{t'-1} \times Z}, \{U_{s_{t'}}, \tau_{s_{t'}}\}_{s_{t'} \in S_{t'}})$ such that CDREU1 and CDREU2 hold for each $t' \le t$ and CEVU holds for each $t' \le t-1$. For any $(s_t, z_t) \in S_t \times Z$, define $u_{s_t} \in \mathbb{R}^Z$ and $V_{s_t,z_t} : \mathcal{A}_{t+1} \to \mathbb{R}$ by $u_{s_t} \equiv 0$ and $V_{s_t,z_t}(A_{t+1}) := U_{s_t}(z_t, A_{t+1})$. Applying Axiom 16 yields the following analog of Lemma 6, with an analogous proof. Note that because we do not impose Separability, $V_{s_t,z_t}$ depends on $z_t$.

Lemma 27. For all $(s_t, z_t)$, $V_{s_t,z_t}$ is continuous, monotone, and linear. Moreover, there exist $C'_{t+1}, C_{t+1} \in \mathcal{A}_{t+1}$ such that $V_{s_t,z_t}(C'_{t+1}) > V_{s_t,z_t}(C_{t+1})$ for all $(s_t, z_t)$.

Applying the "moreover" part of Lemma 27 also yields the obvious analog of Corollary C.1. Since $\rho$ admits a CDREU representation, we can obtain $(S_{t+1}, \{\mu_{t+1}^{s_t,z_t}\}_{(s_t,z_t) \in S_t \times Z}, \{\tilde U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}})$ satisfying CDREU1 and CDREU2 at $t+1$. For all $(s_t, z_t) \in S_t \times Z$, define $\rho_{t+1}^{s_t,z_t}$ by $\rho_{t+1}^{s_t,z_t}(p_{t+1}, A_{t+1}) := \sum_{s_{t+1}} \mu_{t+1}^{s_t,z_t}(s_{t+1})\, \tau_{s_{t+1}}(p_{t+1}, A_{t+1})$. Let $\succsim_{s_t,z_t}$ denote the preference over $\mathcal{A}_{t+1}$ induced by $V_{s_t,z_t}$. Using Axiom 17, an analogous argument as for Lemma 7 shows that for all $s_t, z_t$, the pair $(\succsim_{s_t,z_t}, \rho_{t+1}^{s_t,z_t})$ satisfies AS's Axioms 1 and 2, and an analogous argument as for Lemma 8 establishes that $\succsim_{s_t,z_t}$ satisfies AS's Axiom DLR 6. Given this, we proceed as in Section C.2.5 (replacing each instance of $s_t$ with an instance of $(s_t, z_t)$) to obtain $\alpha_{s_{t+1}} > 0$ and $\beta_{s_{t+1}} \in \mathbb{R}$ such that $U_{s_{t+1}} := \alpha_{s_{t+1}} \tilde U_{s_{t+1}} + \beta_{s_{t+1}}$ satisfies
\[
V_{s_t,z_t}(A_{t+1}) = \sum_{s_{t+1}} \mu_{t+1}^{s_t,z_t}(s_{t+1}) \max_{p_{t+1} \in A_{t+1}} U_{s_{t+1}}(p_{t+1}).
\]
Thus, replacing $\tilde U_{s_{t+1}}$ with $U_{s_{t+1}}$, we have that $(S_{t+1}, \{\mu_{t+1}^{s_t,z_t}\}_{(s_t,z_t) \in S_t \times Z}, \{U_{s_{t+1}}, \tau_{s_{t+1}}\}_{s_{t+1} \in S_{t+1}})$ satisfies not only CDREU1 and CDREU2, but also CEVU, as required.


Necessity: As in Section C.3, we first show that for all $t \le T-1$, there exist $g_t, b_t \in \Delta(X_t)$ such that $U_{s_t}(g_t) > U_{s_t}(b_t)$ for all $s_t \in S_t$, for which it is sufficient to find $C'_{t+1}, C_{t+1} \in \mathcal{A}_{t+1}$ such that $V_{s_t,z_t}(C'_{t+1}) > V_{s_t,z_t}(C_{t+1})$ for all $(s_t, z_t) \in S_t \times Z$. As before, we let $C'_{t+1} := \{g_{t+1}(s_{t+1}), b_{t+1}(s_{t+1}) : s_{t+1} \in S_{t+1}\}$, where $g_{t+1}(s_{t+1}), b_{t+1}(s_{t+1}) \in \Delta(X_{t+1})$ are such that $U_{s_{t+1}}(g_{t+1}(s_{t+1})) > U_{s_{t+1}}(b_{t+1}(s_{t+1}))$. For each $s_t, z_t$, we set $A_{t+1}(s_t, z_t) := \{b_{t+1}(s_{t+1})\}$ for some $s_{t+1} \in \mathrm{supp}\, \mu_{t+1}^{s_t,z_t}$. Then $V_{s_t,z_t}(C'_{t+1}) \ge V_{s_t,z_t}(A_{t+1}(s'_t, z'_t))$ for all $s_t, z_t, s'_t, z'_t$, with strict inequality for $(s_t, z_t) = (s'_t, z'_t)$. Hence, letting $C_{t+1} := \sum_{(s_t,z_t) \in S_t \times Z} \frac{1}{|S_t \times Z|} A_{t+1}(s_t, z_t)$, we have $V_{s_t,z_t}(C'_{t+1}) > V_{s_t,z_t}(C_{t+1})$ for all $s_t, z_t$. Given this, we can proceed as in Section C.3, using part (ii) of Lemma 26 to derive Axiom 16 (i), (ii), (iv) and Axiom 17. Part (iii) of Axiom 16 is also established in an analogous manner to Section C.3.

K.4.3 Theorem 7

We first show that when $\rho$ admits a consumption-dependent evolving utility (CEVU) representation, Axiom 18 (Stream Separability) is equivalent to the following form of separability over consumption lottery streams:

Lemma 28. Suppose $\rho$ admits a CEVU representation with utilities $U_{s_t}$. Then $\rho$ satisfies Axiom 18 if and only if for all $t \le T-1$, $s_t$, $z, z'$ and $(\ell_k)_{k=t+1}^T, (m_k)_{k=t+1}^T \in \Delta(Z)$, we have
\[
U_{s_t}(z, \ell_{t+1}, \dots, \ell_T) - U_{s_t}(z', \ell_{t+1}, \dots, \ell_T) = U_{s_t}(z, m_{t+1}, \dots, m_T) - U_{s_t}(z', m_{t+1}, \dots, m_T). \tag{46}
\]

Proof. Fix any $t \le T-1$. By the necessity direction of the proof of Theorem 6, there exist $g_t, b_t \in \Delta(X_t)$ such that $U_{s_t}(g_t) > U_{s_t}(b_t)$ for all $s_t \in S_t$. Then Lemma 26 (i)–(ii) implies that for all $h^t$ and $q_t, r_t$, we have $q_t \sim_{h^t} r_t$ if and only if $U_{s_t}(q_t) = U_{s_t}(r_t)$ for all $s_t$ consistent with $h^t$. Then Axiom 18 holds if and only if for all $t \le T-1$, $s_t$, $z, z'$ and $(\ell_k)_{k=t+1}^T, (m_k)_{k=t+1}^T \in \Delta(Z)$, we have
\[
\tfrac12 U_{s_t}(z, \ell_{t+1}, \dots, \ell_T) + \tfrac12 U_{s_t}(z', m_{t+1}, \dots, m_T) = \tfrac12 U_{s_t}(z', \ell_{t+1}, \dots, \ell_T) + \tfrac12 U_{s_t}(z, m_{t+1}, \dots, m_T),
\]
which in turn is equivalent to (46). □
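The final equivalence in the proof of Lemma 28 can be spelled out in one step. The following is a brief sketch, assuming only that each $U_{s_t}$ is linear in lotteries, so that a 50-50 mixture of two streams is evaluated as the average of their utilities. The shorthand $\ell = (\ell_{t+1}, \dots, \ell_T)$ and $m = (m_{t+1}, \dots, m_T)$ is introduced here for brevity.

```latex
\begin{align*}
U_{s_t}\!\left(\tfrac12 (z, \ell) + \tfrac12 (z', m)\right)
  &= U_{s_t}\!\left(\tfrac12 (z', \ell) + \tfrac12 (z, m)\right) \\
\iff\quad \tfrac12 U_{s_t}(z, \ell) + \tfrac12 U_{s_t}(z', m)
  &= \tfrac12 U_{s_t}(z', \ell) + \tfrac12 U_{s_t}(z, m) \\
\iff\quad U_{s_t}(z, \ell) - U_{s_t}(z', \ell)
  &= U_{s_t}(z, m) - U_{s_t}(z', m).
\end{align*}
```

The second line uses linearity, and the third multiplies by 2 and moves the $z'$ terms across, which gives (46).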

Sufficiency: Suppose that $\rho$ admits a CEVU representation $(S_t, \{\mu_t^{s_{t-1},z_{t-1}}\}_{s_{t-1} \in S_{t-1}, z_{t-1} \in Z}, \{U_{s_t}, u_{s_t}, \tau_{s_t}\}_{s_t \in S_t})_{0 \le t \le T}$ and satisfies Condition 3 and Axioms 18–20. Steps 1 and 2 below first perform two normalizations. Step 3 then proceeds in an analogous manner to the sufficiency direction of Theorem 3.

Step 1: We first show that replacing each $U_{s_t}$ and $u_{s_t}$ with a suitable $\hat U_{s_t}$ and $\hat u_{s_t}$ continues to yield a CEVU representation of $\rho$ which additionally satisfies
\[
\hat U_{s_t}(\ell_t, \dots, \ell_T) = \hat u_{s_t}(\ell_t) + \hat W_{s_t}(\ell_{t+1}, \dots, \ell_T) \tag{47}
\]
for all $s_t$ and $(\ell_k)_{k=t}^T \in \Delta(Z)$, where $\hat W_{s_t}(\ell_{t+1}, \dots, \ell_T) := \sum_{s_{t+1}} \mu_{t+1}^{s_t,z_t}(s_{t+1})\, \hat U_{s_{t+1}}(\ell_{t+1}, \dots, \ell_T)$ does not depend on $z_t$.

To see this, fix any $z_0^*, \dots, z_T^* \in Z$. For all $s_0$, define $\hat U_{s_0} := U_{s_0}$ and $\hat u_{s_0}(z_0) := \hat U_{s_0}(z_0, z_1^*, \dots, z_T^*) - \hat U_{s_0}(z_0^*, z_1^*, \dots, z_T^*)$ for all $z_0$. For all $s_1$, define $\hat U_{s_1} := U_{s_1} + (u_{s_0}(z_0) - \hat u_{s_0}(z_0))$ and $\bar u_{s_1} := u_{s_1} + (u_{s_0}(z_0) - \hat u_{s_0}(z_0))$, where $(s_0, z_0)$ are the unique state-outcome pair such that $s_1 \in \mathrm{supp}\, \mu_1^{s_0,z_0}$. Note that replacing each $U_{s_0}$ with $\hat U_{s_0}$, $u_{s_0}$ with $\hat u_{s_0}$, $U_{s_1}$ with $\hat U_{s_1}$, and $u_{s_1}$ with $\bar u_{s_1}$, and keeping all utilities in periods $t \ge 2$ the same continues to yield a CEVU representation of $\rho$: Indeed, $U_{s_i}$ and $\hat U_{s_i}$ represent the same preference over $\Delta(X_i)$, so CDREU1 and CDREU2 remain satisfied. Moreover, CEVU holds at $s_1$ because $\hat U_{s_1}$ and $\bar u_{s_1}$ are obtained from $U_{s_1}$ and $u_{s_1}$ by adding the same constant, and CEVU holds at $s_0$ because $\hat U_{s_0} = U_{s_0}$ and $\hat U_{s_1} = U_{s_1} + (u_{s_0}(z_0) - \hat u_{s_0}(z_0))$ for all $s_1 \in \mathrm{supp}\, \mu_1^{s_0,z_0}$ implies
\[
\hat U_{s_0}(z_0, A_1) = u_{s_0}(z_0) + \sum_{s_1} \mu_1^{s_0,z_0}(s_1) \max_{p_1 \in A_1} U_{s_1}(p_1) = \hat u_{s_0}(z_0) + \sum_{s_1} \mu_1^{s_0,z_0}(s_1) \max_{p_1 \in A_1} \hat U_{s_1}(p_1).
\]

Then, for all $z_0$ and $(\ell_k)_{k=1}^T \in \Delta(Z)$, we have
\[
\sum_{s_1} \mu_1^{s_0,z_0}(s_1)\, \hat U_{s_1}(\ell_1, \dots, \ell_T) = \hat U_{s_0}(z_0, \ell_1, \dots, \ell_T) - \hat u_{s_0}(z_0) = \hat U_{s_0}(z_0^*, \ell_1, \dots, \ell_T),
\]
where the last equality follows from the fact that $\hat u_{s_0}(z_0) := \hat U_{s_0}(z_0, z_1^*, \dots, z_T^*) - \hat U_{s_0}(z_0^*, z_1^*, \dots, z_T^*) = \hat U_{s_0}(z_0, \ell_1, \dots, \ell_T) - \hat U_{s_0}(z_0^*, \ell_1, \dots, \ell_T)$ by Lemma 28. Thus, for all $(\ell_k)_{k=0}^T \in \Delta(Z)$, we have $\hat U_{s_0}(\ell_0, \ell_1, \dots, \ell_T) = \hat u_{s_0}(\ell_0) + \hat W_{s_0}(\ell_1, \dots, \ell_T)$, where $\hat W_{s_0}(\ell_1, \dots, \ell_T) := \sum_{s_1} \mu_1^{s_0,z_0}(s_1)\, \hat U_{s_1}(\ell_1, \dots, \ell_T)$ does not depend on $z_0$, whence (47) holds at $s_0$.

Next, suppose that for some $t \ge 1$, we have obtained a CEVU representation of $\rho$ by replacing each $U_{s_0}, u_{s_0}, \dots, U_{s_t}, u_{s_t}$ with $\hat U_{s_0}, \hat u_{s_0}, \dots, \hat U_{s_t}, \bar u_{s_t}$ and keeping utilities in periods $t+1, \dots, T$ the same, and suppose that (47) holds for all $t' < t$. For all $s_t$ and $z_t$, define $\hat u_{s_t}(z_t) := \hat U_{s_t}(z_t, z_{t+1}^*, \dots, z_T^*) - \hat U_{s_t}(z_t^*, z_{t+1}^*, \dots, z_T^*)$. For all $s_{t+1}$, define $\hat U_{s_{t+1}} := U_{s_{t+1}} + (\bar u_{s_t}(z_t) - \hat u_{s_t}(z_t))$ and $\bar u_{s_{t+1}} := u_{s_{t+1}} + (\bar u_{s_t}(z_t) - \hat u_{s_t}(z_t))$, where $(s_t, z_t)$ are the unique state-outcome pair such that $s_{t+1} \in \mathrm{supp}\, \mu_{t+1}^{s_t,z_t}$. Then the same argument as in the previous paragraph shows that after replacing $\bar u_{s_t}$ with $\hat u_{s_t}$ and $U_{s_{t+1}}$ and $u_{s_{t+1}}$ with $\hat U_{s_{t+1}}$ and $\bar u_{s_{t+1}}$, we continue to have a CEVU representation of $\rho$ which now additionally satisfies (47) in period $t$. Proceeding inductively, we obtain the desired representation.

Step 2: Next, we show that replacing each $\hat U_{s_t}$ and $\hat u_{s_t}$ with a suitable $\tilde U_{s_t}$ and $\tilde u_{s_t}$ continues to yield a CEVU representation of $\rho$ which again satisfies (47) and such that additionally
\[
\sum_{z \in Z} \tilde u_{s_t}(z) = 0 \tag{48}
\]
holds for all $s_t$. Indeed, let $\tilde u_{s_t}(z_t) := \hat u_{s_t}(z_t) - \gamma_{s_t}$ for all $s_t$ and $z_t$, where $\gamma_{s_t} := \frac{1}{|Z|} \sum_{z \in Z} \hat u_{s_t}(z)$. Then (48) is immediate. Inductively define $\tilde U_{s_t}$ by $\tilde U_{s_T} := \tilde u_{s_T}$ and $\tilde U_{s_t}(z_t, A_{t+1}) := \tilde u_{s_t}(z_t) + \sum_{s_{t+1}} \mu_{t+1}^{s_t,z_t}(s_{t+1}) \max_{p_{t+1} \in A_{t+1}} \tilde U_{s_{t+1}}(p_{t+1})$ for all $s_t$. To show that $\tilde U_{s_t}$ and $\tilde u_{s_t}$ are as required, it suffices to prove that for all $t \le T-1$ and $s_t$,
\[
\tilde U_{s_t} = \hat U_{s_t} - (\gamma_{s_t} + \beta_{s_t}), \tag{49}
\]
where $\beta_{s_t} := |Z|^{t-T} \sum_{(z_{t+1}, \dots, z_T) \in Z^{T-t}} \hat W_{s_t}(z_{t+1}, \dots, z_T)$. Indeed, (49) implies that $\tilde U_{s_t}$ and $\hat U_{s_t}$ represent the same preference over $\Delta(X_t)$, so that replacing each $\hat U_{s_t}$ and $\hat u_{s_t}$ with $\tilde U_{s_t}$ and $\tilde u_{s_t}$ continues to yield a CEVU representation of $\rho$. Moreover, (49) implies that $\sum_{s_{t+1}} \mu_{t+1}^{s_t,z_t}(s_{t+1})\, \tilde U_{s_{t+1}}(\ell_{t+1}, \dots, \ell_T) = \hat W_{s_t}(\ell_{t+1}, \dots, \ell_T) - \beta_{s_t}$ is independent of $z_t$ for all $s_t$, so that (47) continues to hold as well.


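To show (49), note first that for each $s_{T-1}$ we have
\begin{align*}
\tilde U_{s_{T-1}}(z_{T-1}, A_T) &= \hat u_{s_{T-1}}(z_{T-1}) - \gamma_{s_{T-1}} + \sum_{s_T} \mu_T^{s_{T-1},z_{T-1}}(s_T) \Big( \max_{p_T \in A_T} \hat u_{s_T}(p_T) - \gamma_{s_T} \Big) \\
&= \hat U_{s_{T-1}}(z_{T-1}, A_T) - \gamma_{s_{T-1}} - |Z|^{-1} \sum_{z \in Z} \sum_{s_T} \mu_T^{s_{T-1},z_{T-1}}(s_T)\, \hat u_{s_T}(z) \\
&= \hat U_{s_{T-1}}(z_{T-1}, A_T) - \big( \gamma_{s_{T-1}} + \beta_{s_{T-1}} \big),
\end{align*}
where the final equality holds because $\sum_{s_T} \mu_T^{s_{T-1},z_{T-1}}(s_T)\, \hat u_{s_T}(z) = \hat W_{s_{T-1}}(z)$ by (47). Next, assume that (49) holds for all $t' > t$. By (47), we have for all $s_{t+1}$ that
\begin{align*}
\sum_{(z_{t+1}, \dots, z_T) \in Z^{T-t}} \hat U_{s_{t+1}}(z_{t+1}, \dots, z_T)
&= |Z|^{T-(t+1)} \sum_{z_{t+1}} \hat u_{s_{t+1}}(z_{t+1}) + |Z| \sum_{(z_{t+2}, \dots, z_T)} \hat W_{s_{t+1}}(z_{t+2}, \dots, z_T) \\
&= |Z|^{T-t} \big( \gamma_{s_{t+1}} + \beta_{s_{t+1}} \big). \tag{50}
\end{align*}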
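Identity (50) is a counting argument, and it can be sanity-checked numerically. The sketch below is illustrative only: it fixes one state, draws arbitrary values for $\hat u$ and $\hat W$, imposes the additive form (47), and compares both sides of (50). The function name and the random inputs are hypothetical; `length` stands for the number of coordinates $T - t$ in $(z_{t+1}, \dots, z_T)$.

```python
import itertools
import random


def both_sides_of_50(u_hat, w_hat, n_outcomes, length):
    """Return (lhs, rhs) of identity (50) for a single state.

    u_hat: list of flow utilities, one per outcome in Z
    w_hat: dict mapping each tail (z_{t+2}, ..., z_T) to a continuation value
    length: number of coordinates T - t in (z_{t+1}, ..., z_T)
    """
    tails = list(itertools.product(range(n_outcomes), repeat=length - 1))
    # Left side: sum of U_hat = u_hat + W_hat over all outcome sequences, using (47).
    lhs = sum(u_hat[z] + w_hat[tail] for z in range(n_outcomes) for tail in tails)
    # gamma is the average flow utility; beta is the average continuation value.
    gamma = sum(u_hat) / n_outcomes
    beta = sum(w_hat[tail] for tail in tails) / n_outcomes ** (length - 1)
    rhs = n_outcomes ** length * (gamma + beta)
    return lhs, rhs


rng = random.Random(0)
n_outcomes, length = 3, 4
u_hat = [rng.uniform(-1.0, 1.0) for _ in range(n_outcomes)]
w_hat = {
    tail: rng.uniform(-1.0, 1.0)
    for tail in itertools.product(range(n_outcomes), repeat=length - 1)
}
lhs, rhs = both_sides_of_50(u_hat, w_hat, n_outcomes, length)
print(abs(lhs - rhs) < 1e-9)
```

Each value of $\hat u$ is counted once per tail and each value of $\hat W$ once per first coordinate, which is where the factors $|Z|^{T-(t+1)}$ and $|Z|$ in (50) come from.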

Thus,
\begin{align*}
\tilde U_{s_t}(z_t, A_{t+1}) &= \hat U_{s_t}(z_t, A_{t+1}) - \gamma_{s_t} - \sum_{s_{t+1}} \mu_{t+1}^{s_t,z_t}(s_{t+1}) \big( \gamma_{s_{t+1}} + \beta_{s_{t+1}} \big) \\
&= \hat U_{s_t}(z_t, A_{t+1}) - \gamma_{s_t} - |Z|^{t-T} \sum_{s_{t+1}} \mu_{t+1}^{s_t,z_t}(s_{t+1}) \sum_{(z_{t+1}, \dots, z_T) \in Z^{T-t}} \hat U_{s_{t+1}}(z_{t+1}, \dots, z_T) \\
&= \hat U_{s_t}(z_t, A_{t+1}) - (\gamma_{s_t} + \beta_{s_t}),
\end{align*}
where the first equality holds by (49) applied to $\tilde U_{s_{t+1}}$, the second equality follows from (50), and the final equality holds because $\sum_{s_{t+1}} \mu_{t+1}^{s_t,z_t}(s_{t+1})\, \hat U_{s_{t+1}}(z_{t+1}, \dots, z_T) = \hat W_{s_t}(z_{t+1}, \dots, z_T)$ by (47). Thus, (49) holds at $t$ as well.

Step 3: Finally, we argue as in Section D.1 that the CEVU representation with utilities $\tilde U_{s_t}$ and $\tilde u_{s_t}$ is an active learning representation, i.e., that there exists $\delta > 0$ such that for all $s_t$ and $z_t$ we have
\[
\tilde u_{s_t} = \frac{1}{\delta} \sum_{s_{t+1}} \mu_{t+1}^{s_t,z_t}(s_{t+1})\, \tilde u_{s_{t+1}}. \tag{51}
\]

To see this, note first that by (47) and Condition 3, $\tilde u_{s_t}$ is nonconstant for each $s_t$. Moreover, for any $s_{T-1} \in S_{T-1}$ and $\ell_{T-1}, \ell_T \in \Delta(Z)$, we have $\tilde U_{s_{T-1}}(\ell_{T-1}, \ell_T) = \tilde u_{s_{T-1}}(\ell_{T-1}) + E[\tilde u_T(\ell_T) \mid s_{T-1}]$, where $E[\tilde u_T(\ell_T) \mid s_{T-1}] := \sum_{s_T} \mu_T^{s_{T-1},z_{T-1}}(s_T)\, \tilde u_{s_T}(\ell_T)$ does not depend on $z_{T-1}$ by (47). Arguing exactly as in Lemma 10, we can invoke Axiom 19 to show that $E[\tilde u_T \mid s_{T-1}]$ and $\tilde u_{s_{T-1}}$ represent the same preference over $\Delta(Z)$. By (48) and because $\tilde u_{s_{T-1}}$ is nonconstant, this implies that there exists $\delta_{s_{T-1}} > 0$ for each $s_{T-1}$ such that $E[\tilde u_T \mid s_{T-1}] = \delta_{s_{T-1}} \tilde u_{s_{T-1}}$. Invoking Axiom 20 and arguing as in Lemma 11, we can show that $\delta_{s_{T-1}} = \delta_{s'_{T-1}} =: \delta$ for all $s_{T-1}, s'_{T-1}$.


Next, assuming that we have established (51) for all $t' \ge t$, (47) implies that
\[
\tilde U_{s_{t-1}}(\ell_{t-1}, \dots, \ell_T) = \tilde u_{s_{t-1}}(\ell_{t-1}) + \sum_{k=0}^{T-t} \delta^k E[\tilde u_t(\ell_{t+k}) \mid s_{t-1}],
\]
where $E[\tilde u_t \mid s_{t-1}] := \sum_{s_t} \mu_t^{s_{t-1},z_{t-1}}(s_t)\, \tilde u_{s_t}$ does not depend on $z_{t-1}$ by (47). Again invoking (48), Axioms 19–20, and similar arguments as in Lemmas 10–11, we can then show that (51) also holds at $t-1$.

Necessity: Suppose $\rho$ admits an active learning representation. Then for each $t \le T-1$, $s_t$, and $(\ell_k)_{k=t}^T \in \Delta(Z)$, we have $U_{s_t}(\ell_t, \dots, \ell_T) = \sum_{k=0}^{T-t} \delta^k u_{s_t}(\ell_{t+k})$. Then (46) holds, which by Lemma 28 implies Axiom 18. Moreover, Axioms 19–20 are verified in an analogous manner to the necessity direction of Theorem 3.
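The step from the discounted-sum expression for $U_{s_t}$ in the necessity argument to (46) can be made explicit. The following is a brief sketch that uses only that expression, evaluated at streams whose period-$t$ consumption is the degenerate outcome $z$ or $z'$:

```latex
\begin{align*}
U_{s_t}(z, \ell_{t+1}, \dots, \ell_T) - U_{s_t}(z', \ell_{t+1}, \dots, \ell_T)
  &= \Big( u_{s_t}(z) + \sum_{k=1}^{T-t} \delta^k u_{s_t}(\ell_{t+k}) \Big)
   - \Big( u_{s_t}(z') + \sum_{k=1}^{T-t} \delta^k u_{s_t}(\ell_{t+k}) \Big) \\
  &= u_{s_t}(z) - u_{s_t}(z').
\end{align*}
```

The right-hand side does not involve the continuation stream, so the same value is obtained with $(m_{t+1}, \dots, m_T)$ in place of $(\ell_{t+1}, \dots, \ell_T)$, which is exactly (46).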
