Identification in Discrete Markov Decision Models

Sorawoot Srisuma
University of Surrey

11 December 2013

Abstract

We derive conditions for the identification of the structural parameters in Markov decision models under the assumptions of Rust (1987) when the payoff function is parametrically specified. Identification in this class of dynamic problems is difficult to establish since the parameters of interest enter the value function nonlinearly, and the value function is only defined implicitly as a fixed point of some functional equation. We show that it is sufficient to verify identification in the pseudo-model, which is more tractable as it was originally designed to reduce the computational burden in the estimation problem, for the identification of the data generating parameter of the underlying model. Our results extend naturally to a class of dynamic discrete action games commonly used in empirical industrial organization.

JEL Classification Numbers: C5, C14, C44. Keywords: Identification, Discrete Decision Processes, Markovian Games.

I am grateful to Xiaohong Chen, Arthur Lewbel, Oliver Linton and Hashem Pesaran for helpful comments. School of Economics, University of Surrey, Guildford, Surrey, GU2 7XH, United Kingdom. E-mail address: [email protected]

1 Introduction

The discrete Markov decision model surveyed in Rust (1994) provides a useful framework to study dynamic decision problems in many areas of economics. Importantly, it also forms the foundation for modeling dynamic games in the empirical industrial organization literature (see Aguirregabiria and Mira (2010) and Bajari, Hong and Nekipelov (2013) for recent surveys). Several estimation methodologies have been proposed to estimate a class of decision problems that satisfies additive separability and conditional independence (Rust (1987)), where the main object of interest is the finite dimensional structural parameter that parameterizes the payoff function of the economic agent. The identification conditions for these parameters are difficult to establish due to the nonlinear structure inherent in the dynamic optimization framework. This paper derives more tractable conditions for the identification of the structural parameters in these models.

Existing identification results in this literature are nonparametric in nature. In a single agent setting, a well-known negative result of Rust (1994) says that without any structure a dynamic decision model is nonparametrically not identified. Magnac and Thesmar (2002) show nonparametric identification of the payoff function is possible under additive separability and conditional independence assumptions by imposing further restrictions on the payoffs. Pesendorfer and Schmidt-Dengler (2008) use a similar approach to provide conditions to identify dynamic discrete games, and Bajari, Chernozhukov, Hong and Nekipelov (2009) extend their results to games where the observed state variables are continuous.
Although nonparametric identification results are fundamental, our work is motivated by the fact that most of the methodology and empirical literatures employ a parametric payoff function.¹ One obstacle in the study of identification can be attributed to the nonlinearity generated by the recursive structure of the dynamic programming problem in the form of a value function. The intractability of value functions is a well-known computational issue for estimation. Indeed, the earlier emphasis in the literature has been on developing feasible estimation methods focusing on the value function, which generally has no closed form and is defined implicitly as a fixed point of some nonlinear functional equation (e.g. see Rust (1994, 1996)). An alternative estimation approach, aimed at reducing the computational burden, following Hotz and Miller (1993), instead estimates a "pseudo-model" in two steps, where the value function is replaced by an easy-to-compute policy value function in the first stage. The two-step approach is particularly attractive for the estimation of dynamic games as it has been shown to circumvent incomplete models that may arise due to the presence of multiple equilibria. Examples of two-step estimators include: (single agent problems) Hotz and Miller (1993), Hotz, Miller, Sanders and Smith (1994), Aguirregabiria and Mira (2002), Srisuma and Linton (2012); (dynamic games) Aguirregabiria and Mira (2007), Bajari, Benkard and Levin (2007), Bajari, Chernozhukov, Hong and Nekipelov (2012), Pakes, Ostrovsky and Berry (2007), Pesendorfer and Schmidt-Dengler (2008).

¹There are some notable exceptions when an outcome variable is also available, which we do not assume, e.g. see Heckman and Navarro (2007).

The identification problem we consider in this paper is set in a parametric framework. We allow infinite dimensional nuisance parameters to be present in the model as long as these objects are nonparametrically identified, hence our results are also directly useful for semiparametric estimators in the literature (e.g. Bajari et al. (2012) and Srisuma and Linton (2012)). General conditions for the identification of nonlinear parametric models have been provided by Wegge (1965) and Rothenberg (1971).² However, these results are often difficult to apply to a specific nonlinear model. Our strategy is to study identification in the pseudo-model. We first show that identification in the pseudo-model is sufficient for the identification of the true model. We then exploit the fact that the policy value function can generally be defined as a linear transformation of the payoff function, so that the pseudo-model provides a tractable framework for the parameterization of the structural parameters. We then give conditions on the observables for the model to be identified. We also emphasize that establishing the identification of the pseudo-model is important in itself, since it is necessary for consistent estimation by all two-step estimators cited above.

We focus on the single agent model to develop our intuition and arguments. Our identification strategy for single agent models is directly applicable to a popular class of dynamic games where players are known to play pure strategies almost surely. Thus, to avoid additional notation and discussion of the multiplicity problem specific to games, we shall present our main results for a single agent model where the setup is more transparent.

The paper is organized as follows. Section 2 introduces the decision problem of interest and defines the model, pseudo-model and identification concept. Section 3 gives the main results of the paper. Section 4 briefly describes how our results can be extended to dynamic games. Section 5 concludes with a discussion on the identification of related decision problems. All proofs can be found in the Appendix.

2 Markov Decision Processes

Our starting point is the basic framework of a general single agent decision problem described in Rust (1994), which we use to define the statistical model derived from the decision problem and to illustrate the intractability of the parameterization. We introduce the pseudo-model through the pseudo-decision problem in 2.2. We define and discuss identifiability in 2.3.

²Other important results on the identification of parameters that may enter a system of equations nonlinearly, without specifying parametric distributional assumptions, can be found in Fisher (1961, 1965, 1966), Brown (1983), Roehrig (1988), Benkard and Berry (2006) and Komunjer (2012).

2.1 Basic Setup

We consider the decision process of a forward looking agent who solves an infinite horizon intertemporal decision problem. The random variables in the model are the control (choice) and state variables, denoted by $a_t \in A$ and $s_t \in S$ respectively. The action set $A$ is finite, $A = \{0, 1, \ldots, K\}$. The state $s_t = (x_t, \varepsilon_t) \in X \times E$ can be partitioned into two components where, from the perspective of the econometrician, $x_t$ is observable and $\varepsilon_t = (\varepsilon_t(0), \ldots, \varepsilon_t(K))$ is unobservable. At each period $t$, the agent observes $s_t$ and chooses an action $a_t(s_t)$ in order to maximize her discounted expected utility. The payoffs are time separable and are represented by $u_\theta(a_t, s_t)$ in each period, which is known up to $\theta \in \Theta$, where $\Theta$ is a subset of $\mathbb{R}^p$. The agent's action in period $t$ affects the uncertain future states according to the first order Markov transition law $F(s_{t+1} | s_t, a_t)$. The next period utility is subject to discounting at the rate $\beta \in (0, 1)$. Formally, for any time $t$, the agent is represented by a triple of primitives $(\theta, \beta, F)$, who is assumed to behave according to an optimal decision rule, $\{a_\tau(s_\tau)\}_{\tau=t}^{\infty}$, in solving the following sequential problem

$$W_\theta(s_t) = \max_{\{a_\tau(s_\tau)\}_{\tau=t}^{\infty}} E\left[\sum_{\tau=t}^{\infty} \beta^{\tau-t} u_\theta(a_\tau(s_\tau), s_\tau) \,\Big|\, s_t\right] \quad \text{s.t. } a_\tau(s_\tau) \in A \text{ for all } \tau \geq t, \tag{1}$$

where $W_\theta$ denotes the value function. Under some regularity conditions, there exists a stationary Markovian optimal policy function $\alpha_\theta : S \to A$, so that $\alpha_\theta(s_t) = \alpha_\theta(s_{t+\tau})$ for any $s_t = s_{t+\tau}$ and any $t, \tau$, where

$$\alpha_\theta(s_t) = \arg\max_{a \in A} \{u_\theta(a, s_t) + \beta E[W_\theta(s_{t+1}) | s_t, a_t = a]\}, \tag{2}$$

and $W_\theta$ is the unique solution to Bellman's equation

$$W_\theta(s_t) = \max_{a \in A} \{u_\theta(a, s_t) + \beta E[W_\theta(s_{t+1}) | s_t, a_t = a]\}. \tag{3}$$

Sufficient conditions for the existence of stationary optimal policies for the decision problem described above can be found in Section 3 of Rust (1994). Note that $\alpha_\theta$ and $W_\theta$ also generally depend on $(\beta, F)$.

We suppress the explicit dependence on these primitives for notational ease since our focus is on the identification of $\theta$, as $(\beta, F)$ are often either assumed to be known or nonparametrically identified in practice. We shall also impose the following modeling assumptions throughout the paper.

Assumption M1: The discounting factor $\beta$ is known.

Assumption M2: The transitional distribution of the states has the following factorization: $F(x_{t+1}, \varepsilon_{t+1} | x_t, \varepsilon_t, a_t) = Q(\varepsilon_{t+1} | x_{t+1}) G(x_{t+1} | x_t, a_t)$, where $Q$ is the conditional distribution function of $\varepsilon_t$ given $x_t$, and $G$ denotes the transition law of $x_{t+1}$ conditioning on $x_t$ and $a_t$.

Assumption M3: (i) $u_\theta$ is additive separable, $u_\theta(a_t, x_t, \varepsilon_t) = \pi_\theta(a_t, x_t) + \sum_{a \in A} \varepsilon_t(a) \mathbf{1}[a_t = a]$; (ii) for each $x$, the conditional distribution of $\varepsilon_t$ given $x_t = x$ is absolutely continuous with respect to the Lebesgue measure with full support on $\mathbb{R}^{K+1}$, and the first moment of $\varepsilon_t(a)$ exists for all $a$.

Assumption M4: $x_t$ has finite support, $X = \{x^1, \ldots, x^J\}$.

Assumptions M1 - M4 are standard in the (identification and estimation) literature for Markov decision problems (e.g. see Rust (1987), Hotz and Miller (1993), Aguirregabiria and Mira (2002), Magnac and Thesmar (2002)). Under M1 - M3 the primitives of the model reduce to $\theta$. We define the statistical model for the Markov decision processes as follows.

Definition S1: The true model is a collection of conditional choice probabilities $\{P_\theta\}_{\theta \in \Theta}$ such that for all $a$:

$$P_\theta(a | x_t) = \Pr[\alpha_\theta(x_t, \varepsilon_t) = a | x_t] \text{ a.s.}, \tag{4}$$

where

$$\alpha_\theta(s_t) = \arg\max_{a \in A} \{u_\theta(a, s_t) + \beta E[W_\theta(s_{t+1}) | x_t, a_t = a]\},$$
$$W_\theta(s_t) = \max_{a \in A} \{u_\theta(a, s_t) + \beta E[W_\theta(s_{t+1}) | x_t, a_t = a]\}.$$

Note that the conditioning set for the discounted expected future returns at $s_{t+1}$ only depends on $(x_t, a_t)$, since $\varepsilon_t$ carries no additional information about $s_{t+1}$ under M2. The intractability of the parameterization in $P_\theta$ is well-documented in the methodology literature, as the following simple example from Rust (1987) illustrates.

Example: At each period the manager inspects a bus, and decides whether he should replace the existing engine or service it. The tradeoffs are: there is an immediate fixed cost of replacement, but the maintenance costs and risks of future breakdowns increase as the bus continues to run on an old engine. Here $a_t$ takes value 1 if the decision is to replace the engine and 0 otherwise, $x_t$ denotes the mileage on the bus, and $\varepsilon_t = (\varepsilon_t(0), \varepsilon_t(1))$ is a vector of unobserved costs (of goodwill etc.) associated with choices 0 and 1 respectively. The payoff function, which represents the cost of operating the bus, is described as follows:

$$u_\theta(a_t, s_t) = \begin{cases} -\theta_1 - c(0; \theta_2) + \varepsilon_t(1) & \text{if } a_t = 1 \\ -c(x_t; \theta_2) + \varepsilon_t(0) & \text{if } a_t = 0 \end{cases},$$

where $\theta = (\theta_1, \theta_2)$, with $\theta_1$ denoting the fixed engine replacement cost, and $c(x; \theta_2)$ is a parametric function describing the maintenance costs, parameterized by $\theta_2$. The transition law for the mileage is taken to be known, otherwise $G$ can be estimated (parametrically or nonparametrically) independently of $\theta$; Rust (1987) discretizes $X$ and estimates $G$ using a multinomial distribution. Costs of future periods are then discounted by $\beta$. In the special case when $\varepsilon_t$ has i.i.d. extreme value distribution, the choice probability of replacing the engine with mileage $x$ is:

$$P_\theta(1|x) = \frac{\exp\{-\theta_1 - c(0; \theta_2) + \beta E[W_\theta(s_{t+1}) | x_t = x, a_t = 1]\}}{\exp\{-\theta_1 - c(0; \theta_2) + \beta E[W_\theta(s_{t+1}) | x_t = x, a_t = 1]\} + \exp\{-c(x; \theta_2) + \beta E[W_\theta(s_{t+1}) | x_t = x, a_t = 0]\}}, \tag{5}$$

which takes a familiar logit form with the addition of a dynamic term. However, $E[W_\theta(s_{t+1}) | x_t = \cdot, a_t = \cdot]$ generally does not have a closed form; it is defined implicitly as a fixed point of a functional map, $\Gamma_\theta : \mathcal{W} \to \mathcal{W}$, where $\mathcal{W}$ is a space of functions whose domain is $A \times X$, such that for all $(a, x) \in A \times X$ and $w \in \mathcal{W}$:

$$\Gamma_\theta(w)(a, x) = \beta \int E\left[\max_{a' \in A} \{u_\theta(a', s_t) + w(a', x_t)\} \,\Big|\, x_t = x'\right] dG(x' | x, a). \tag{6}$$

By applying Blackwell's theorem and its generalization it is often straightforward to show that $\Gamma_\theta$ is a contraction map (e.g. Theorem 1 in Rust (1987)); in addition, under the conditional logit framework $\Gamma_\theta$ simplifies to

$$\Gamma_\theta(w)(a, x) = \beta \int \log\left(\exp\{-\theta_1 - c(0; \theta_2) + w(1, x')\} + \exp\{-c(x'; \theta_2) + w(0, x')\}\right) dG(x' | x, a).$$

Thus $\beta E[W_\theta(s_{t+1}) | x_t = \cdot, a_t = \cdot]$, hence $P_\theta$, generally depends on $\theta$ in a complicated manner. In fact, the expression in the previous display is unusually transparent, since the function $E[\max_{a' \in A} \{u_\theta(a', s_t) + w(a', x_t)\} | x_t = \cdot]$, which enters $\Gamma_\theta$ in equation (6), generally does not have a closed form when $\varepsilon_t$ does not have i.i.d. extreme value distribution.
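As a concrete illustration of this fixed-point computation, the sketch below iterates the conditional-logit version of the map in (6) on a small invented mileage grid. The linear maintenance cost c(x; θ₂) = θ₂·x, the grid, and all parameter values are our own illustrative assumptions, not the specification or estimates of Rust (1987); the Euler-constant shift in the log-sum-exp formula is omitted since it cancels in the choice probabilities.

```python
import numpy as np

# Illustrative primitives (NOT Rust's (1987) specification or estimates):
# linear maintenance cost c(x; th2) = th2 * x on a 5-point mileage grid.
beta, th1, th2 = 0.95, 4.0, 0.5
X = np.arange(5, dtype=float)        # mileage states
J = len(X)

# G(x'|x, a): keeping the engine (a = 0) moves mileage up one step (capped);
# replacing it (a = 1) resets mileage to 0.
G0 = np.zeros((J, J))
G0[np.arange(J), np.minimum(np.arange(J) + 1, J - 1)] = 1.0
G1 = np.tile(np.eye(J)[0], (J, 1))

def gamma(w):
    """One application of the contraction map in (6); under i.i.d. extreme
    value errors the inner E[max{...}] is log-sum-exp (Euler-constant shift
    omitted, it cancels in the choice probabilities)."""
    lse = np.log(np.exp(-th1 - th2 * 0.0 + w[1]) + np.exp(-th2 * X + w[0]))
    return beta * np.vstack([G0 @ lse, G1 @ lse])

# Fixed-point iteration: w(a, x) approximates beta*E[W(s_{t+1})|x_t=x, a_t=a].
w = np.zeros((2, J))
for _ in range(5000):
    w_new = gamma(w)
    if np.max(np.abs(w_new - w)) < 1e-12:
        break
    w = w_new

# Implied replacement probabilities, as in (5).
v1 = -th1 - th2 * 0.0 + w[1]
v0 = -th2 * X + w[0]
P_replace = np.exp(v1) / (np.exp(v1) + np.exp(v0))
```

At the fixed point the implied replacement probability is increasing in mileage, as the maintenance-cost tradeoff suggests; a full-solution estimator would repeat this entire iteration at every trial value of θ.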

2.2 Pseudo-Decision Problem

Estimation procedures that solve the dynamic optimization problem for every $\theta$ in order to evaluate $P_\theta$ are often called full-solution approaches. These are typically characterized by the need to approximate some functional of $W_\theta$ by solving a nonlinear functional equation, for instance, as in the Example above, by applying fixed-point iteration on $\Gamma_\theta$ in equation (6). We next introduce the pseudo-model that underlies the popular two-step approach of Hotz and Miller (1993), which is designed to reduce the computational burden of estimating dynamic decision problems.

Suppose we are interested in $\theta_0 \in \Theta$, which generates the data according to $\alpha_{\theta_0}$, that equals $\alpha_\theta$ for $\theta = \theta_0$ as defined in (2). A ($\theta_0$-)policy value function, $V_\theta$, is the discounted expected payoff an economic agent would receive from making her decision based on $\alpha_{\theta_0}$ in every period:

$$V_\theta(s_t) = E\left[\sum_{\tau=t}^{\infty} \beta^{\tau-t} u_\theta(\alpha_{\theta_0}(s_\tau), s_\tau) \,\Big|\, s_t\right].$$

We define the pseudo-model as follows.

Definition S2: The pseudo-model is a collection of conditional choice probabilities $\{\widetilde{P}_\theta\}_{\theta \in \Theta}$ such that for all $a$:

$$\widetilde{P}_\theta(a | x_t) = \Pr[\widetilde{\alpha}_\theta(s_t) = a | x_t] \text{ a.s.}, \tag{7}$$

where

$$\widetilde{\alpha}_\theta(s_t) = \arg\max_{a \in A} \{u_\theta(a, s_t) + \beta E[V_\theta(s_{t+1}) | x_t, a_t = a]\},$$
$$V_\theta(s_t) = u_\theta(\alpha_{\theta_0}(s_t), s_t) + \beta E[V_\theta(s_{t+1}) | s_t].$$

$\widetilde{P}_\theta$ describes the distribution of the optimal action that solves a pseudo-decision problem, which differs from the true decision problem in (4) by replacing $W_\theta$ with $V_\theta$. In particular, the pseudo-problem is a static optimization problem in the sense that $V_\theta$ does not incorporate Bellman's principle of optimality. Furthermore, by construction $\widetilde{\alpha}_{\theta_0} = \alpha_{\theta_0}$, hence $\widetilde{P}_{\theta_0} = P_{\theta_0}$. In what follows we shall simply denote $P_{\theta_0}$ by $P$.

For estimation purposes, the main insight of Hotz and Miller (1993) centers on applying an inversion result (see Lemma HM below) so that $E[V_\theta(s_{t+1}) | x_t = \cdot, a_t = \cdot]$ can be estimated using a nonparametric estimator of $P$ for each $\theta$. Since $V_\theta$ is a solution to a linear equation, in contrast to $W_\theta$ which is defined through the max operator, evaluating $\widetilde{P}_\theta$ numerically is significantly easier than $P_\theta$, as the former does not require one to solve a dynamic optimization problem. The inherent linear structure that defines $V_\theta$ is crucial in our identification analysis, since we claim the parameterization of $\theta$ in $E[V_\theta(s_{t+1}) | x_t = \cdot, a_t = \cdot]$ is much more tractable relative to $E[W_\theta(s_{t+1}) | x_t = \cdot, a_t = \cdot]$. Importantly, by the law of iterated expectations, under M1: $E[V_\theta(s_{t+1}) | x_t, a_t] = E[E[V_\theta(s_{t+1}) | x_{t+1}] | x_t, a_t]$, where $E[V_\theta(s_t) | x_t]$ is the solution to a linear equation

$$E[V_\theta(s_t) | x_t] = E[u_\theta(\alpha_{\theta_0}(s_t), s_t) | x_t] + \beta E[E[V_\theta(s_{t+1}) | x_{t+1}] | x_t],$$

which is obtained by taking the conditional expectation of $V_\theta(s_t)$ w.r.t. $x_t$. Let $m_\theta = E[V_\theta(s_t) | x_t = \cdot]$ and $g_\theta = E[V_\theta(s_{t+1}) | x_t = \cdot, a_t = \cdot]$; then we can succinctly summarize their relationships with the per period payoff function $\pi_\theta$ in linear functional notation as follows:

$$m_\theta = r_\theta + \beta L m_\theta, \tag{8}$$
$$g_\theta = H m_\theta, \tag{9}$$

where $r_\theta$ is the ex-ante expected payoff given state $x_t$ using $\alpha_{\theta_0}$, namely $r_\theta(x) = E[u_\theta(\alpha_{\theta_0}(s_t), s_t) | x_t = x]$ for any $x \in X$; for any $m : X \to \mathbb{R}$ the linear operator $L$ generates the expected next-period value of its operand, $Lm(x) = E[m(x_{t+1}) | x_t = x]$ for any $x \in X$; and $H$ is another conditional expectation operator that generates the choice specific expected next-period values of its operand, $Hm(x, a) = E[m(x_{t+1}) | x_t = x, a_t = a]$ for any $(a, x) \in A \times X$. In contrast to a general solution of a fixed point, the structure of $g_\theta$ is more tractable. In particular, $L$ and $H$ are linear operators that do not depend on $\theta$, and $(I - \beta L)^{-1}$ exists (see below), so that $g_\theta = H(I - \beta L)^{-1} r_\theta$. We now revisit the engine replacement problem introduced earlier and consider the choice probabilities for the pseudo-model.

Example (continued): The choice probability of replacing the engine with mileage $x$ implied by the pseudo-decision problem is:

$$\widetilde{P}_\theta(1|x) = \frac{\exp\{-\theta_1 - c(0; \theta_2) + \beta g_\theta(1, x)\}}{\exp\{-\theta_1 - c(0; \theta_2) + \beta g_\theta(1, x)\} + \exp\{-c(x; \theta_2) + \beta g_\theta(0, x)\}},$$

which differs from $P_\theta(1|x)$ (defined in (5)) by replacing $E[W_\theta(s_{t+1}) | x_t = \cdot, a_t = \cdot]$ with $g_\theta$. In order to decompose $g_\theta$, one starts with $r_\theta$ and solves for $m_\theta$. Since $X$ is finite, equation (8) can then be represented by the following matrix equation

$$\begin{pmatrix} m_\theta(x^1) \\ \vdots \\ m_\theta(x^J) \end{pmatrix} = \begin{pmatrix} r_\theta(x^1) \\ \vdots \\ r_\theta(x^J) \end{pmatrix} + \beta \begin{pmatrix} l_{11} & \cdots & l_{1J} \\ \vdots & \ddots & \vdots \\ l_{J1} & \cdots & l_{JJ} \end{pmatrix} \begin{pmatrix} m_\theta(x^1) \\ \vdots \\ m_\theta(x^J) \end{pmatrix},$$

where $L = (l_{ij})$ is a $J \times J$ matrix with $l_{ij} = \Pr[x_{t+1} = x^j | x_t = x^i]$. Using Hotz and Miller (1993)'s inversion theorem (see Lemma HM below), it can be shown that $r_\theta(x^j) = P(1|x^j)(-\theta_1 - c(0; \theta_2) + \gamma - \log P(1|x^j)) + P(0|x^j)(-c(x^j; \theta_2) + \gamma - \log P(0|x^j))$, where $\gamma$ is Euler's constant and $P$ denotes the choice probabilities induced by $\theta_0$. Since $\beta L$ is a stochastic matrix multiplied by $\beta \in (0, 1)$, by the dominant diagonal theorem the matrix $I - \beta L$ is invertible and $m_\theta$ can be uniquely recovered from $(I - \beta L)^{-1} r_\theta$. Since $A$ is finite, equation (9) can also be represented by a matrix equation. For each $a$:

$$\begin{pmatrix} g_\theta(a, x^1) \\ \vdots \\ g_\theta(a, x^J) \end{pmatrix} = \begin{pmatrix} h_{11}^a & \cdots & h_{1J}^a \\ \vdots & \ddots & \vdots \\ h_{J1}^a & \cdots & h_{JJ}^a \end{pmatrix} \begin{pmatrix} m_\theta(x^1) \\ \vdots \\ m_\theta(x^J) \end{pmatrix},$$

where $H^a = (h_{ij}^a)$ is a $J \times J$ matrix with $h_{ij}^a = \Pr[x_{t+1} = x^j | x_t = x^i, a_t = a]$. By letting $H = [H^{0\top}, H^{1\top}]^\top$, $g_\theta$ can then be represented by the $2J$ dimensional vector $((H^0 m_\theta)^\top, (H^1 m_\theta)^\top)^\top$.
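The linear-algebra steps in this example can be sketched directly. The transition matrices, choice probabilities and ex-ante payoffs below are invented for a generic two-action, three-state model; the sketch solves equation (8) for m_θ, applies equation (9), and checks the Neumann series representation of (I − βL)⁻¹ discussed in footnote 3.

```python
import numpy as np

# Invented two-action, three-state example of equations (8)-(9):
# m = r + beta*L*m  and  g = H*m.
beta, J = 0.9, 3
rng = np.random.default_rng(0)

H0 = rng.dirichlet(np.ones(J), size=J)   # H^0: rows are transition laws given a = 0
H1 = rng.dirichlet(np.ones(J), size=J)   # H^1: ... given a = 1
P1 = rng.uniform(0.2, 0.8, size=J)       # choice probabilities P(1|x)
P0 = 1.0 - P1

# Unconditional transition L(x'|x) = sum_a P(a|x) H^a(x'|x).
L = P0[:, None] * H0 + P1[:, None] * H1

# Some ex-ante expected payoff r(x) = E[u(a_t, s_t)|x_t = x].
r = rng.normal(size=J)

# Since beta*L has spectral radius at most beta < 1, I - beta*L is invertible.
m = np.linalg.solve(np.eye(J) - beta * L, r)     # equation (8)
g = np.vstack([H0 @ m, H1 @ m])                  # equation (9), rows indexed by a

# The Neumann series sum_t (beta*L)^t r recovers the same m (footnote 3).
m_series = sum(np.linalg.matrix_power(beta * L, t) @ r for t in range(500))
assert np.allclose(m, m_series)
```

No fixed-point iteration over a nonlinear map is needed: one linear solve and two matrix products deliver g_θ for any trial value of the payoffs, which is the computational point of the pseudo-model.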

The example above illustrates that the parameterization in $E[V_\theta(s_{t+1}) | x_t = \cdot, a_t = \cdot]$ is generally more tractable than that of $E[W_\theta(s_{t+1}) | x_t = \cdot, a_t = \cdot]$; in particular, the former can generally be written as a linear combination of elements in $\pi_\theta$. In the estimation literature, $r_\theta$ has been shown to be identified under various modeling assumptions (see the list of two-step estimator papers cited in the introduction), and $L$ is nonparametrically identified under weak conditions. Even if we relax M4 and allow $x_t$ to be a continuous variable, equations (8) and (9) still hold. In that more general case $m_\theta$ is the solution to a type II integral equation, which solves a well-posed inverse problem since the operator $\beta L$ is a contraction map in commonly used metric spaces.³ $H$ is also identified as a conditional expectation of the observables, therefore $g_\theta$ is identified for every $\theta$. Numerous estimators for $(r_\theta, m_\theta, g_\theta, L, H)$ have been proposed in the literature and have also been shown to be uniformly consistent over any compact subset of $\Theta$ and $A \times X$. Since $g_\theta = H(I - \beta L)^{-1} r_\theta$, where $H(I - \beta L)^{-1}$ is a nonparametric object that can typically be represented or approximated by a matrix, objective functions that involve $E[V_\theta(s_{t+1}) | x_t = \cdot, a_t = \cdot]$, through (7), are generally much easier to compute than those with $E[W_\theta(s_{t+1}) | x_t = \cdot, a_t = \cdot]$ (see (4)), which is the main motivation for using the pseudo-model.

2.3 Scope of Identification

We define identification using the notion of observational equivalence between two parameters in terms of their implied distribution functions. Let $P_\theta(\cdot | x_t)$ denote any conditional probability function defined on $A$ that is known up to the unknown parameter $\theta \in \Theta$.

Definition I1: For a collection of conditional choice probabilities $\{P_\theta\}_{\theta \in \Theta}$, two parameter points $\theta$ and $\theta'$ are said to be observationally equivalent in $\{P_\theta\}_{\theta \in \Theta}$ if $P_\theta(a | x_t) = P_{\theta'}(a | x_t)$ almost surely for all $a$ in $A$.

³Suppose $X$ is some compact subset of $\mathbb{R}^L$ and let $\mathcal{B}$ be a space of bounded real-valued functions defined on $X$. Consider the Banach space $(\mathcal{B}, \|\cdot\|)$ equipped with the sup-norm, i.e. $\|\phi\| = \sup_{x \in X} |\phi(x)|$ for any $\phi \in \mathcal{B}$. For any $x \in X$, $\beta L \phi(x) = \beta E[\phi(x_{t+1}) | x_t = x]$, so it follows that $|\beta L \phi(x)| \leq \beta \sup_{x \in X} |\phi(x)|$. In other words $\|\beta L \phi\| \leq \beta \|\phi\|$, hence the operator norm of $\beta L$ is bounded above by $\beta$. Since $\beta \in (0, 1)$, $\beta L$ is a contraction. Therefore the inverse of $I - \beta L$ exists. Furthermore, it is a linear bounded operator and admits a Neumann series representation, $\sum_{\tau=0}^{\infty} (\beta L)^\tau$ (see Kreyszig (1989)).

Definition I2: A collection of conditional choice probabilities $\{P_\theta\}_{\theta \in \Theta}$ is said to be identified at $\theta_0$ if there is no other $\theta$ in $\Theta$ which is observationally equivalent.

We say that a collection of conditional choice probabilities $\{P_\theta\}_{\theta \in \Theta}$ is identified if it is identified at every point $\theta$ in $\Theta$. The definition of identification employed in this paper is chosen to reflect the estimation problem, where most known objective functions in the literature are explicit functionals of the conditional choice probabilities. Comparing to related work in the literature: similar to Magnac and Thesmar (2002), our definition of identification takes the primitive ($\theta$) as the structure of the model and uses its implied distribution function ($P_\theta$) as the reduced form, whilst Rust (1994, Section 3.5) defines the decision rule ($\alpha_\theta$) as the reduced form. The two definitions are equivalent under M1 - M4 in the sense that for any $\theta, \theta' \in \Theta$:

$$\alpha_\theta(s_t) = \alpha_{\theta'}(s_t) \text{ a.s.} \iff P_\theta(s_t) = P_{\theta'}(s_t) \text{ a.s.}$$

In Section 3 we shall focus on the identification of the pseudo-model. Our first result shows that the identification of $\{\widetilde{P}_\theta\}_{\theta \in \Theta}$ has an immediate implication for the identification of $\{P_\theta\}_{\theta \in \Theta}$.

Proposition 1. Suppose the pseudo-model $\{\widetilde{P}_\theta\}_{\theta \in \Theta}$ is identified at $\theta_0$. If $P_{\theta_0}$ is generated from the decision rule $\alpha_{\theta_0}$ that satisfies (2) with primitives $\theta_0$, then there can be no other decision rule $\alpha_\theta$ that satisfies (2) with primitives $\theta \neq \theta_0$ such that $\alpha_\theta(s_t) = \alpha_{\theta_0}(s_t)$ almost surely.

3 Identification

3.1 Equivalent Condition for Identification

Let $v_\theta = \pi_\theta + \beta g_\theta$, following the notations used in (8) and (9), so that for all $(a, x)$: $v_\theta(a, x) = \pi_\theta(a, x) + \beta g_\theta(a, x)$. Then we can write the choice probabilities of the pseudo-model as follows:

$$\widetilde{P}_\theta(a|x) = \Pr[v_\theta(a, x_t) + \varepsilon_t(a) \geq v_\theta(a', x_t) + \varepsilon_t(a') \text{ for all } a' \neq a \,|\, x_t = x].$$

Define $\Delta v_\theta(a, x)$ to be $v_\theta(a, x) - v_\theta(0, x)$ for all $(a, x)$. Under M1 - M4, Hotz and Miller (1993)'s inversion theorem (also see Lemma 8 of Matzkin (1991)) allows $\{\Delta v_\theta(a, x_t)\}_{a > 0}$ to be uniquely recovered from $\{\widetilde{P}_\theta(a | x_t)\}_{a > 0}$. Therefore the identifiability condition reduces to whether the parameterization of $\theta$ in $\Delta v_\theta$ is unique on $A \times X$. A version of the recoverability result is stated below for ease of reference.
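In the special case of i.i.d. extreme value errors the inversion underlying this recoverability result has a well-known closed form, Δv(a, x) = log P̃(a|x) − log P̃(0|x); the sketch below verifies the round trip on invented values.

```python
import numpy as np

# With i.i.d. extreme value errors the map from differenced values to choice
# probabilities has the closed-form logit inverse, verified on invented values.
def choice_probs(dv):
    """Map differenced values (dv_1, ..., dv_K) to (P(0|x), ..., P(K|x))."""
    expv = np.exp(np.concatenate([[0.0], dv]))   # normalization dv_0 = 0
    return expv / expv.sum()

def invert(P):
    """Recover the differenced values from the choice probabilities."""
    return np.log(P[1:]) - np.log(P[0])

dv = np.array([0.7, -1.2])                       # invented values, K = 2
P = choice_probs(dv)
assert np.allclose(invert(P), dv)                # the inversion is exact
```

Outside the logit case the inverse has no closed form, but the Lemma below guarantees it exists.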


For all $a > 0$ we define a map $\varphi_{a,x} : \mathbb{R}^K \to [0, 1]$, such that for any real $K$ vector $\widetilde{v} = (\widetilde{v}_1, \ldots, \widetilde{v}_K)$:

$$\varphi_{a,x}(\widetilde{v}) = \int_{\varepsilon_a = -\infty}^{\infty} \int_{\varepsilon_0 = -\infty}^{\widetilde{v}_a + \varepsilon_a} \int_{\varepsilon_1 = -\infty}^{\widetilde{v}_a - \widetilde{v}_1 + \varepsilon_a} \cdots \int_{\varepsilon_{a-1} = -\infty}^{\widetilde{v}_a - \widetilde{v}_{a-1} + \varepsilon_a} \int_{\varepsilon_{a+1} = -\infty}^{\widetilde{v}_a - \widetilde{v}_{a+1} + \varepsilon_a} \cdots \int_{\varepsilon_K = -\infty}^{\widetilde{v}_a - \widetilde{v}_K + \varepsilon_a} dQ(\varepsilon_0, \ldots, \varepsilon_K | x).$$

By construction $\widetilde{P}_\theta(a|x) = \varphi_{a,x}(\Delta v_\theta(1, x), \ldots, \Delta v_\theta(K, x))$ for all $(a, x)$. Then define the map $\varphi_x = (\varphi_{1,x}, \ldots, \varphi_{K,x})^\top : \mathbb{R}^K \to \Delta^K$, where $\Delta^K$ denotes the $K$ simplex. The following Lemma

follows from Proposition 1 of Hotz and Miller (1993).

Lemma HM: Suppose Assumptions M1 - M4 hold. Then $\varphi_x$ is invertible for all $x \in X$.

The invertibility of $\varphi_x$ can be used to recover the indifference-value vector from any given choice probabilities (and vice-versa). The identification problem of $\{\widetilde{P}_\theta\}_{\theta \in \Theta}$ is then equivalent to the question whether $\Delta v_\theta$ is uniquely parameterized on $A \times X$. Since $A$ and $X$ are finite, we can represent $\Delta v_\theta$ in matrix form. The following notations shall be useful for Theorems 1 and 2.

Preliminary Notations for Theorems 1 and 2

For all $\theta$, let $v_\theta^a = (v_\theta(a, x^1), \ldots, v_\theta(a, x^J))^\top$ for all $a$, and $v_\theta = (v_\theta^{0\top}, \ldots, v_\theta^{K\top})^\top$; define $\pi_\theta^a$ and $\pi_\theta$ analogously. For any $k$ let: $I_k$ denote an identity matrix of size $k$; $H$ denote the block-diagonal matrix $\mathrm{diag}(H^0, H^1, \ldots, H^K)$, where $H^a$ denotes a $J \times J$ matrix whose $ij$-th element is $\Pr[x_{t+1} = x^j | x_t = x^i, a_t = a]$; $\Delta H$ denote the block-diagonal matrix $\mathrm{diag}(H^1 - H^0, \ldots, H^K - H^0)$; $M_k = (I_k \otimes M)$, where $M = (I_J - \beta L)^{-1}$ and $L$ denotes a $J \times J$ matrix whose $ij$-th element is $\Pr[x_{t+1} = x^j | x_t = x^i]$; and

$$\mathbf{P}_0 = \begin{pmatrix} P^0 & P^1 & \cdots & P^K \\ \vdots & \vdots & \ddots & \vdots \\ P^0 & P^1 & \cdots & P^K \end{pmatrix}, \qquad \mathbf{P}_1 = \begin{pmatrix} P^1 & P^2 & \cdots & P^K \\ \vdots & \vdots & \ddots & \vdots \\ P^1 & P^2 & \cdots & P^K \end{pmatrix},$$

where $P^k = \mathrm{diag}(P(k|x^1), \ldots, P(k|x^J))$. Define $\Delta v_\theta^a = (v_\theta(a, x^1) - v_\theta(0, x^1), \ldots, v_\theta(a, x^J) - v_\theta(0, x^J))^\top$ for all $a > 0$, and $\Delta v_\theta = (\Delta v_\theta^{1\top}, \ldots, \Delta v_\theta^{K\top})^\top$; define $\Delta\pi_\theta^a$ and $\Delta\pi_\theta$ analogously. Let $D$ denote the $JK \times J(K+1)$ matrix that performs the transformation $D v_\theta = \Delta v_\theta$. Finally, for any $\theta, \theta'$, let $\pi(\theta, \theta') = \pi_\theta - \pi_{\theta'}$, $v(\theta, \theta') = v_\theta - v_{\theta'}$, $\Delta\pi(\theta, \theta') = \Delta\pi_\theta - \Delta\pi_{\theta'}$ and $\Delta v(\theta, \theta') = \Delta v_\theta - \Delta v_{\theta'}$. Then the following representation Lemmas hold.

Lemma 1. Suppose Assumptions M1 - M4 hold. Then for all $\theta, \theta'$:

$$\Delta v(\theta, \theta') = D(I_{J(K+1)} + \beta H M_{K+1} \mathbf{P}_0) \, \pi(\theta, \theta').$$

Lemma 2. Suppose Assumptions M1 - M4 and R1 (defined below) hold. Then for all $\theta, \theta'$:

$$\Delta v(\theta, \theta') = (I_{JK} + \beta \Delta H M_K \mathbf{P}_1) \, \Delta\pi(\theta, \theta').$$

There are two notable features of the representation Lemmas. First, for any $\theta, \theta'$, the RHS of $\Delta v(\theta, \theta')$ can always be written as a product of some matrix that can be identified from the data and a vector that depends on the structural parameters. This follows directly from the fact that all future discounted expected returns can be written as linear transforms of $\pi(\theta, \theta')$. The other is that $\Delta v(\theta, \theta')$ only depends on the distribution of $\varepsilon_t$ through the equilibrium induced transition and choice probabilities. Therefore conditions for identification consist of only two separate sources: one on the parameterization in $\pi_\theta$, and the other on matrices that can be identified from the data.

3.2 Main Results

Theorem 1: Suppose Assumptions M1 - M4 hold. Then $\{\widetilde{P}_\theta\}_{\theta \in \Theta}$ is identified at $\theta_0$ if and only if the intersection between $\{\pi(\theta_0, \theta) : \theta \in \Theta \setminus \{\theta_0\}\}$ and the null space of $D(I_{J(K+1)} + \beta H M_{K+1} \mathbf{P}_0)$ is empty.

It immediately follows from Lemma 1 that the intersection between the null space of $D(I_{J(K+1)} + \beta H M_{K+1} \mathbf{P}_0)$ and $\{\pi(\theta_0, \theta) : \theta \in \Theta \setminus \{\theta_0\}\}$ completely characterizes the set of $\widetilde{P}_\theta$'s that are observationally equivalent to $\widetilde{P}_{\theta_0}$. However, the implication of the characterization may not be immediately obvious since it can be difficult to visualize geometrically. We also note that the possibility of non-identification is not necessarily trivial if the parameterization of $\pi_\theta$ is too rich, since $D(I_{J(K+1)} + \beta H M_{K+1} \mathbf{P}_0)$ is always non-injective (as it has more columns than rows).⁴

⁴This is familiar from the literature on nonparametric identification, in which case $\{\pi(\theta_0, \theta) : \theta \in \Theta \setminus \{\theta_0\}\}$ can be thought of as an arbitrary set of points around (but excluding) the origin in $\mathbb{R}^{J(K+1)}$; see the nonidentification result in Magnac and Thesmar (2002, Proposition 2) in a two-period model for a similar intuition.

Our next two sets of results are easier to verify. Each makes use of an additional restriction on the payoff function commonly found in the literature. The first assumes knowledge of the payoff from an outside option. The second relies on the linear-in-parameters specification.

Assumption R1: For all $\theta, x$, $\pi_\theta(0, x) = \pi(x)$.

Theorem 2: Suppose Assumptions M1 - M4 and R1 hold, and $\Delta\pi(\theta, \theta')$ is not a null vector whenever $\theta \neq \theta'$. Then $\{\widetilde{P}_\theta\}_{\theta \in \Theta}$ is identified at $\theta_0$ if and only if the intersection between $\{\Delta\pi(\theta_0, \theta) : \theta \in \Theta \setminus \{\theta_0\}\}$ and the null space of $I_{KJ} + \beta \Delta H M_K \mathbf{P}_1$ is empty.

The proof follows immediately from Lemma 2. Assumption R1 is commonly used in the nonparametric identification literature, and can also be motivated in particular empirical applications (e.g. in entry games $a_t = 0$ often denotes firms not participating, hence getting zero payoff). Note, however, that in contrast to static discrete choice models, R1 is not a normalization condition without loss of generality; cf. Theorem 1 (also see Theorems 3 and 4 below). The requirement that $\pi_\theta$ is well-parameterized, in the sense that there are no redundant parameters, is obviously a necessary condition for identification. Different specifications of $\pi_\theta$ can be checked on a case-by-case basis. Since $I_{KJ} + \beta \Delta H M_K \mathbf{P}_1$ is a square matrix, we have a simple sufficient condition for identification independent of $\theta$.

Corollary 1: Suppose Assumptions M1 - M4 and R1 hold, and whenever $\theta \neq \theta'$ there exists $(a, x)$ so that $\Delta\pi_\theta(a, x) - \Delta\pi_{\theta'}(a, x) \neq 0$. If $I_{KJ} + \beta \Delta H M_K \mathbf{P}_1$ is non-singular then $\{\widetilde{P}_\theta\}_{\theta \in \Theta}$ is identified.

It is worth noting a feature of our results thus far: the conditions for identification consist of two separate sources, one on the parameterization in $\pi_\theta$ and the other on matrices that can be identified from the data. This follows directly from the fact that all future discounted expected returns can be written as linear transforms of $\pi_\theta$. Next we consider the case when $\pi_\theta$ is linear-in-parameters, where we can revert back to the linear functional notations instead of the cumbersome vectorized notations used for Theorems 1 and 2.

Assumption R2: The payoff function is linear-in-parameters, i.e. $\pi_\theta(a, x) = \pi(a, x)^\top \theta$ for all $(a, x, \theta)$, where $\pi$ is a $p$ dimensional vector of real functions.

For Theorems 3 and 4 it shall be useful to represent $\{\Delta v_0(a, x)\}_{a > 0}$ using a matrix $\Delta V(x)$, where

$$\Delta V(x) = \begin{pmatrix} \Delta v_0(1, x)^\top \\ \vdots \\ \Delta v_0(K, x)^\top \end{pmatrix} \text{ for all } x, \tag{10}$$

and $\Delta v_0$ is defined within the following Lemma.

Lemma 3. Under Assumptions M1 - M4 and R2, for all $\theta, \theta'$: $\Delta v_\theta(a, x) - \Delta v_{\theta'}(a, x) = \Delta v_0(a, x)^\top (\theta - \theta')$, where $v_0 = \pi + \beta H (I - \beta L)^{-1} r_0$ with $r_0(x) = E[\pi(\alpha_{\theta_0}(s_t), x_t) | x_t = x]$ for all $x$, and $\Delta v_0(a, x) = v_0(a, x) - v_0(0, x)$ for all $x$ and $a > 0$.
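To fix ideas, the engine replacement example satisfies R2 with p = 2 once a functional form for c is chosen. The sketch below assumes the linear cost c(x; θ₂) = θ₂·x (our illustrative choice) together with invented transition and choice probabilities, builds Δv₀ componentwise as in Lemma 3, and exhibits the implied linearity of Δv_θ in θ.

```python
import numpy as np

# Engine replacement under R2 with the illustrative cost c(x; th2) = th2 * x:
#   pi_theta(1, x) = -th1 - c(0; th2)  ->  basis pi(1, x) = (-1, 0)
#   pi_theta(0, x) = -c(x; th2)        ->  basis pi(0, x) = (0, -x)
beta, J = 0.9, 4
X = np.arange(J, dtype=float)
rng = np.random.default_rng(3)

H0 = rng.dirichlet(np.ones(J), size=J)      # invented transitions given a = 0
H1 = rng.dirichlet(np.ones(J), size=J)      # ... and given a = 1
P1 = rng.uniform(0.2, 0.8, size=J)          # invented choice probabilities P(1|x)
P0 = 1.0 - P1
L = P0[:, None] * H0 + P1[:, None] * H1     # unconditional transition matrix

pi0 = np.stack([np.zeros(J), -X], axis=1)   # pi(0, x) in R^p, rows indexed by x
pi1 = np.stack([-np.ones(J), np.zeros(J)], axis=1)

# r0(x) = E[pi(alpha(s_t), x_t) | x_t = x], computed componentwise in R^p.
r0 = P0[:, None] * pi0 + P1[:, None] * pi1
Mr0 = np.linalg.solve(np.eye(J) - beta * L, r0)   # (I - beta*L)^{-1} r0
v0_a0 = pi0 + beta * (H0 @ Mr0)                   # v0(0, x)
v0_a1 = pi1 + beta * (H1 @ Mr0)                   # v0(1, x)
dv0 = v0_a1 - v0_a0                               # stacks dV(x)' rows, J x p

# Lemma 3: dv_theta(a, x) = dv0(a, x)' theta is linear in theta.
theta = np.array([4.0, 0.5])
dv_theta = dv0 @ theta
```

Here dv0 stacks the rows ΔV(x) of equation (10) for the single differenced action, so the identification question becomes a rank condition on these basis vectors.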

Under the linear-in-parameters specification, Lemma 3 shows that the identification of $\{\widetilde{P}_\theta\}_{\theta \in \Theta}$ is analogous to the familiar no-multicollinearity condition in a linear model. We give conditions for identification of this special case in the next Theorem.

Theorem 3: Suppose Assumptions M1 - M4 and R2 hold, and $E[\Delta V(x_t)^\top \Delta V(x_t)]$ exists. Then $\{\widetilde{P}_\theta\}_{\theta \in \Theta}$ is identified if and only if $E[\Delta V(x_t)^\top \Delta V(x_t)]$ has rank $p$.

is known

to satisfy a set of linear restrictions. Theorem 4: Suppose Assumptions M1 - M5 hold, and E f

2

and

: R

>

=

0g

where, for some d < p, R is a p

be a d dimensional vector. Then fP g

2

h

V (xt )> d

i V (xt ) exists. Let

0

=

matrix with full column rank,

is identi…ed if and only if there exists a p

0

d) matrixh D whose column space equals the orthogonal complement of the column space of R i > such that D> E V (xt ) V (xt ) D has rank p d. (p

The requirement on the column space of $D$ in Theorem 4 does not impose any restriction, since such a matrix always exists (although it is not unique): $\mathbb{R}^p$ can always be written as a direct sum of the column space of (any matrix) $R$ and its orthogonal complement. The essence of the proof is a reparameterization of $\theta$ that incorporates the linear constraints and reduces the dimension of the free parameters; such a method is familiar from the classical study of constrained least squares problems (for instance, see Amemiya (1985, Section 1.4.2)). Under R2 the identification problem is determined by a system of linear equations. If the rank conditions in Theorems 3 and 4 are satisfied, then the likelihood criterion, and any other objective function that depends on the choice probabilities from the pseudo-model, will be able to deliver a consistent estimator of $\theta_0$

under standard regularity conditions. Our identification conditions also have finite sample implications. Since the estimator of $\Delta V$ is easy to compute, the empirical analogs of $E[\Delta V(x_t)^\top \Delta V(x_t)]$ and $D^\top E[\Delta V(x_t)^\top \Delta V(x_t)] D$ can (and should) be analyzed prior to the estimation stage. For instance, suppose $\{a_n, x_n, x_n'\}_{n=1}^N$ is a random sample from $(\alpha(s_t), x_t, x_{t+1})$ and $\Delta\widehat{V}$ is the estimator of $\Delta V$. If $\frac{1}{N}\sum_{n=1}^N \Delta\widehat{V}(x_n)^\top \Delta\widehat{V}(x_n)$ does not have full rank, then the intersection of the null spaces of $\{\Delta\widehat{V}(x_n)\}_{n=1}^N$ is non-empty. Thus, by applying Lemma HM, for each $\theta_0$ there are uncountably many other $\theta$'s that imply the same choice probabilities, so that any objective function constructed from the choice probabilities cannot have a unique optimizer; analogous conclusions can also be drawn from studying the rank of $D^\top \left(\frac{1}{N}\sum_{n=1}^N \Delta\widehat{V}(x_n)^\top \Delta\widehat{V}(x_n)\right) D$.
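The pre-estimation diagnostic described above amounts to two matrix-rank computations. The sketch below shows one way such a check could look, where `dV_hat` stands in for the estimated matrices $\Delta\widehat{V}(x_n)$; the data are synthetic, with a deliberately collinear third column so that the unrestricted check fails.

```python
import numpy as np

def rank_diagnostics(dV_hat, D=None, tol=1e-10):
    """Rank checks on M = (1/N) sum_n dV_hat[n]' dV_hat[n] and, if a
    restriction-complement matrix D is supplied, on D' M D.
    dV_hat has shape (N, K, p)."""
    N = dV_hat.shape[0]
    M = np.einsum('nka,nkb->ab', dV_hat, dV_hat) / N  # p x p averaged Gram matrix
    out = {'unrestricted': int(np.linalg.matrix_rank(M, tol=tol))}
    if D is not None:
        out['restricted'] = int(np.linalg.matrix_rank(D.T @ M @ D, tol=tol))
    return out

rng = np.random.default_rng(2)
base = rng.normal(size=(100, 1, 2))             # K = 1 non-baseline action, two free columns
third = base.sum(axis=2, keepdims=True)         # third column is an exact linear combination
dV_hat = np.concatenate([base, third], axis=2)  # shape (100, 1, 3), so p = 3

print(rank_diagnostics(dV_hat))  # rank 2 < p = 3: the unrestricted check fails
```

A deficient unrestricted rank signals exactly the multiplicity of observationally equivalent $\theta$'s discussed above; the restricted check then tells whether linear restrictions of the Theorem 4 type restore identification.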

Comments on Relaxing M4: Conceptually, we can naturally extend our identification results so that $x_t$ also contains continuously distributed state variables (see Srisuma and Linton (2012)). In particular, for Theorems 1 and 2 and Corollary 1, $\Delta v_{\theta,\theta_0}$ remains a linear transform of $\Delta\pi_{\theta,\theta_0}$ for Lemmas 1 and 2 respectively. However, the sets $\{\Delta\pi_{\theta,\theta_0} : \theta \in \Theta \setminus \{\theta_0\}\}$ and $\{\Delta v_{\theta,\theta_0} : \theta \in \Theta \setminus \{\theta_0\}\}$ become subsets of functions in some Banach spaces, and the matrices of conditional choice and transition probabilities are to be replaced by linear integral operators, although general conditions

for injectivity of linear operators are more difficult to characterize and verify. On the other hand, Theorems 3 and 4 immediately accommodate continuously distributed $x_t$ without any modification.

We end this section by illustrating how our results can be used, based on observable objects, to show the identification of specific parametric models that are otherwise generally intractable.

Example (continued): We consider parametric models similar to those studied in Rust (1987), generated from the following functional forms:⁵ (a) $c(x,\theta_2) = \exp(\theta_2 x)$, and (b) $c(x,\theta_2) = \theta_2 c_0(x)$ for some known $c_0(\cdot)$. We assume that $x^j < x^{j+1}$ and $\Theta = \mathbb{R}^2_{++}$. In order to study identification we construct $\Delta v_\theta(a,x)$. Since $K = 1$, the preliminary notations introduced for Lemmas 1 and 2 can be simplified further. In what follows we let $\iota$ denote the $J \times 1$ vector of ones, $P = \mathrm{diag}\left(P(1|x^1), \ldots, P(1|x^J)\right)$ and $c(\theta_2) = (c(x^1,\theta_2), \ldots, c(x^J,\theta_2))^\top$. Note that we can write

$$\pi_\theta(a,x) = [-\theta_1 - c(0,\theta_2)]\,a + [-c(x,\theta_2)]\,(1-a).$$

Since $\pi_\theta(1,x) - \pi_\theta(0,x) = -\theta_1 - c(0,\theta_2) + c(x,\theta_2)$, it follows that

$$\Delta\pi_\theta = (-\theta_1 - c(0,\theta_2))\,\iota + c(\theta_2).$$

The vector of differences in future expected payoffs, $\{g_\theta(1,x^j) - g_\theta(0,x^j)\}_{j=1}^J$, is $-\beta\Delta H M_1 \left(P\,\Delta v_\theta + (I_J - P)\,c(\theta_2)\right)$, where $\Delta H, M_1$ are now square matrices of size $J$. Then we can write $\Delta v_\theta$ as

$$\Delta v_\theta = -(\theta_1 + c(0,\theta_2))\,(I_J + \beta\Delta H M_1 P)^{-1}\iota + (I_J + \beta\Delta H M_1 P)^{-1}(I_J - \beta\Delta H M_1(I_J - P))\,c(\theta_2).$$

Next we impose specific functional forms for $c(x,\theta_2)$.

Case (a): $c(x,\theta_2) = \exp(\theta_2 x)$. For any $(\theta, \theta_0) \in \Theta^2$, since $c(0,\theta_2) = 1$ for all $\theta_2$, we have

$$\Delta v_{\theta,\theta_0} = -(\theta_1 - \theta_{01})\,(I_J + \beta\Delta H M_1 P)^{-1}\iota + (I_J + \beta\Delta H M_1 P)^{-1}(I_J - \beta\Delta H M_1(I_J - P))\,(c(\theta_2) - c(\theta_{02})).$$

Note that (a) does not satisfy either R1 or R2. Therefore the condition for identification of $\{P_\theta\}_{\theta\in\Theta}$ at any point $\theta_0$, as characterized by Theorem 1, is equivalent to whether there exists any other $\theta \in \Theta$ such that $\Delta v_{\theta,\theta_0} = 0$, which has the geometric interpretation that $\theta_0$ is identifiable if the one-dimensional curve $\theta_2 \mapsto (I_J + \beta\Delta H M_1 P)^{-1}(I_J - \beta\Delta H M_1(I_J - P))(c(\theta_2) - c(\theta_{02}))$ does not intersect the line spanned by $(I_J + \beta\Delta H M_1 P)^{-1}\iota$ in $\mathbb{R}^J$. If we impose R1, for instance by taking $\theta_{01}$ to be known, then

$$\Delta v_{\theta,\theta_0} = (I_J + \beta\Delta H M_1 P)^{-1}(I_J - \beta\Delta H M_1(I_J - P))\,(c(\theta_2) - c(\theta_{02})).$$

Then, a sufficient condition for identification, following from Corollary 1, is for $I_J - \beta\Delta H M_1(I_J - P)$ to be non-singular,⁶ since $c(\theta_2) \neq c(\theta_{02})$ for any $\theta_2 \neq \theta_{02}$. Such a condition is not necessary, however. Even if $I_J - \beta\Delta H M_1(I_J - P)$ is rank deficient, $\theta_0$ may still be identified. In particular, we can relax the requirement on the null space of $I_J - \beta\Delta H M_1(I_J - P)$ by using information on the structure of $c(x,\theta_2)$. For example, given that $\exp(\theta_2 x^{j+1}) - \exp(\theta_{02} x^{j+1}) > \exp(\theta_2 x^j) - \exp(\theta_{02} x^j)$ for any $\theta_2 > \theta_{02}$ and $j = 1, 2$,⁷ $\{P_\theta\}_{\theta\in\Theta}$ will also be identified even if the null space of $I_J - \beta\Delta H M_1(I_J - P)$ is non-empty but does not contain vectors of the form $d = (d_1, d_2, d_3)^\top$ where $d_3 > d_2 > d_1 > 0$.

Case (b): $c(x,\theta_2) = \theta_2 c_0(x)$. For any $(\theta, \theta_0) \in \Theta^2$, we have

$$\Delta v_{\theta,\theta_0} = -(\theta_1 - \theta_{01})\,(I_J + \beta\Delta H M_1 P)^{-1}\iota + (\theta_2 - \theta_{02})\,(I_J + \beta\Delta H M_1 P)^{-1}(I_J - \beta\Delta H M_1(I_J - P))\,c_0.$$

Let $\Delta V = \left[-(I_J + \beta\Delta H M_1 P)^{-1}\iota : (I_J + \beta\Delta H M_1 P)^{-1}(I_J - \beta\Delta H M_1(I_J - P))\,c_0\right]$; then it is clear that the sufficient and necessary condition for identification is the full column rank of $\Delta V$, which is equivalent to the condition in Theorem 3. In particular (cf. Lemma 3), $\Delta v_0(1,x) = (\Delta v_{01}(1,x), \Delta v_{02}(1,x))^\top$, so that $-(I_J + \beta\Delta H M_1 P)^{-1}\iota$ and $(I_J + \beta\Delta H M_1 P)^{-1}(I_J - \beta\Delta H M_1(I_J - P))c_0$ are $(\Delta v_{01}(1,x^1), \ldots, \Delta v_{01}(1,x^J))^\top$ and $(\Delta v_{02}(1,x^1), \ldots, \Delta v_{02}(1,x^J))^\top$ respectively.

⁵ We assume the specifications in (a) and (b) for notational simplicity. Rust (1987, p.1015) considers four functional forms, (i) to (iv), for $c(x,\theta)$. Rust's specification (ii) only differs from our (a) by an additional multiplicative (scaling) parameter. Rust's specifications (i), (iii) and (iv) also satisfy R2, where the latter two are special cases of our (b).
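For Case (b) the Theorem 3 condition reduces to checking that a $J \times 2$ matrix has full column rank. The following sketch builds $\Delta V$ from made-up primitives ($\beta$, $P$, $c_0$, and an identity stand-in for $\Delta H M_1$) purely to illustrate the computation; none of these values come from a model or data.

```python
import numpy as np

J, beta = 3, 0.9
iota = np.ones(J)
P = np.diag([0.3, 0.5, 0.7])   # P(1|x^j) on the diagonal (assumed values)
dH_M1 = np.eye(J)              # stand-in for the J x J matrix DeltaH @ M1
c0 = np.array([1.0, 2.5, 4.0])  # known c0(x^j), increasing in x (assumed values)

A = np.eye(J) + beta * dH_M1 @ P                 # I_J + beta * DeltaH M1 P
B = np.eye(J) - beta * dH_M1 @ (np.eye(J) - P)   # I_J - beta * DeltaH M1 (I_J - P)

Ainv = np.linalg.inv(A)
dV = np.column_stack([-Ainv @ iota, Ainv @ B @ c0])  # J x 2 matrix DeltaV

# Theorem 3 with p = 2: identification iff DeltaV'DeltaV has rank 2,
# i.e. the two columns are not collinear.
print(np.linalg.matrix_rank(dV.T @ dV) == 2)
```

With discrete $x_t$ and strictly positive state probabilities, the rank of $E[\Delta V(x_t)^\top \Delta V(x_t)]$ equals the column rank of the stacked $\Delta V$, which is what the last line checks.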

4 Dynamic Games

Recently there has been a growing interest in extending econometric methods developed for single agent models to estimate empirical games, especially in the empirical industrial organization literature. However, an additional difficulty that arises in models with strategic interactions is the problem of multiple equilibria. Specifically, for every $\theta$, the game may have more than one (if any) equilibrium that predicts different distributions of optimal actions. But the pseudo-model generally does not suffer from problems of multiple equilibria when players are known to play pure strategies almost surely. Our results in the previous section can be directly applied to the dynamic games studied in Aguirregabiria and Mira (2007), Bajari, Benkard and Levin (2007), Pakes, Ostrovsky and Berry (2007) and Pesendorfer and Schmidt-Dengler (2008). To save space and avoid repetition we do not list out the various conditions and theorems for the game counterparts to those of single agent problems.

⁶ This condition is equivalent to $\beta\Delta H M_1(I_J - P)$ not having $-1$ as an eigenvalue.
⁷ $\exp(\theta_2 x)$ is a strictly supermodular function.

5 Concluding Remarks

Existing identification results for nonlinear parametric models can be hard to use to verify the identification of Markov decision models and dynamic games. We show that identification of the pseudo-model is generally simpler to establish and can also be used to show identification of the data generating parameter of the underlying model. Therefore our results are useful for two-step estimators, and for other estimation methods whose objective functions are analytically less tractable but which need to assume the model is identified for consistent estimation (such as the nested fixed point algorithm of Rust (1987), the iterative versions of the estimators in Aguirregabiria and Mira (2002, 2007), and the constrained optimization method proposed by Egesdal, Lai and Su (2013)). We derive our identification results directly by studying the implied expected payoffs rather than the choice probabilities, since the latter are unnecessarily complicated by a further one-to-one, but nonlinear, transformation (Hotz and Miller (1993)). Relatedly, the structural parameters can also be estimated directly based on the implied expected payoffs instead of the traditional choice probabilities; Sanches, Silva Jr. and Srisuma (2013) recently show there are computational advantages in doing so without any theoretical cost (or gain). The linear structure that defines the policy value functions is a general property that is also present in other Markov decision problems and dynamic games outside the discrete choice framework. For instance, a closely related class of decision problems imposes increasing differences conditions on the payoff functions instead of additive separability; see Hong and Shum (2010) in a single agent setting, and Bajari et al. (2007) and Srisuma (2013) for games. However, the lack of separability means we can no longer apply Hotz and Miller's inversion lemma, and the characterization of identification of the pseudo-model for these problems is therefore less tractable than for the ones considered in this paper.
Nevertheless, the general local identification results in Wegge (1965) and Rothenberg (1971) can be applied to any pseudo-model, which is also sufficient for the local identification of the actual model for the same reasons as implied by Proposition 1.⁸

⁸ Establishing local identification can be informative in this literature since several estimators have been proposed to estimate the same (pseudo-)model, and it is sometimes unclear whether the failure to identify the parameter of interest is a property of the model or a consequence of a particular methodology (see Srisuma (2013)).


Appendix

Proof of Proposition

Proof of Proposition 1. Suppose there exist $\theta \neq \theta_0$ such that $\alpha_\theta(s_t) = \alpha_{\theta_0}(s_t)$ almost surely. Then $V_{\theta_0}$ must equal $W_{\theta_0}$ by the definitions of the value functions and policy value functions. However, this implies $\widetilde{\alpha}_\theta(s_t) = \widetilde{\alpha}_{\theta_0}(s_t)$ almost surely, contradicting the assumption that the pseudo-model is identified. $\blacksquare$

Proofs of Lemmas

Proof of Lemma 1. Immediate.

Proof of Lemma 2. Immediate.

Proof of Lemma 3. Under the linear-in-parameters specification, $r_\theta = \theta^\top r_0 + r_1$, where $r_0 = E[\pi_0(\alpha_\theta(s_t), x_t) \mid x_t = \cdot\,]$ and $r_1 = E[\sum_{a \in A} \varepsilon_t(a) 1[\alpha_\theta(s_t) = a] \mid x_t = \cdot\,]$. Since $(I - \beta L)^{-1}$ and $H$ are linear operators, $v_\theta$ can be written as $\theta^\top v_0 + v_1$ such that $v_0 = H(I - \beta L)^{-1} r_0$ and $v_1 = H(I - \beta L)^{-1} r_1$. For all $x, a > 0$ and $j = 0, 1$, let $\Delta v_j(a,x)$ denote $v_j(a,x) - v_j(0,x)$, and let $\Delta v_\theta(a,x)$ denote $v_\theta(a,x) - v_\theta(0,x)$, so that $\Delta v_\theta(a,x) = \theta^\top \Delta v_0(a,x) + \Delta v_1(a,x)$. $\blacksquare$

Proofs of Theorems

Proof of Theorem 1. Immediate from Lemma 1.

Proof of Theorem 2. Immediate from Lemma 2.

Proof of Theorem 3. Suppose $\{P_\theta\}_{\theta\in\Theta}$ is identified. If $E[\Delta V(x_t)^\top \Delta V(x_t)]$ has rank less than $p$, then there exists $\theta \neq \theta_0$ such that $\Delta V(x_t)\theta = \Delta V(x_t)\theta_0$ almost surely, which is equivalent to $\Delta v_\theta(a, x_t) = \Delta v_{\theta_0}(a, x_t)$ almost surely for all $a > 0$, since $\Delta v_1(a, x_t)$ does not depend on $\theta$. Lemma HM then implies $P_\theta(a|x_t) = P_{\theta_0}(a|x_t)$ almost surely for all $a \in A$, which leads to a contradiction. For the converse, suppose $\{P_\theta\}_{\theta\in\Theta}$ is not identified. Then, using Lemma HM, there exists $\theta \neq \theta_0$ such that $\Delta v_\theta(a, x_t) = \Delta v_{\theta_0}(a, x_t)$ almost surely for all $a > 0$, which is equivalent to $\Delta V(x_t)\theta = \Delta V(x_t)\theta_0$ almost surely. However, this contradicts the full rank condition of $E[\Delta V(x_t)^\top \Delta V(x_t)]$. $\blacksquare$

Proof of Theorem 4. Let $B = [R, D]^\top$. By construction $B$ is non-singular since $D^\top R = 0$. For any $\theta$,

$$\Delta V(x_t)\theta = \Delta V(x_t) B^{-1} B \theta = \left[\Delta V(x_t) R (R^\top R)^{-1}, \; \Delta V(x_t) D (D^\top D)^{-1}\right] \begin{bmatrix} R^\top \\ D^\top \end{bmatrix} \theta = \Delta V(x_t) R (R^\top R)^{-1} \lambda_1 + \Delta V(x_t) D (D^\top D)^{-1} \lambda_2,$$

where $\lambda_1 = R^\top \theta$ and $\lambda_2 = D^\top \theta$. Suppose $\{P_\theta\}_{\theta\in\Theta}$ is identified. If $D^\top E[\Delta V(x_t)^\top \Delta V(x_t)] D$ has rank less than $p - d$, then there exist $\lambda_2 \neq \lambda_2'$ such that $\Delta V(x_t) D (D^\top D)^{-1} \lambda_2 = \Delta V(x_t) D (D^\top D)^{-1} \lambda_2'$ almost surely. Since $B$ is non-singular there exist $\theta \neq \theta_0$ in $\Theta$ such that $D^\top \theta = \lambda_2$ and $D^\top \theta_0 = \lambda_2'$, so that $\Delta V(x_t)\theta = \Delta V(x_t)\theta_0$ almost surely. However, by Lemma HM, this leads to a contradiction. For the converse, using Lemma HM, suppose there exist $\theta \neq \theta_0$ in $\Theta$ such that $\Delta V(x_t)\theta = \Delta V(x_t)\theta_0$ almost surely. Then $\Delta V(x_t) D (D^\top D)^{-1} D^\top (\theta - \theta_0) = 0$ almost surely, where $D^\top (\theta - \theta_0)$ is not a null-vector (since $B$ is non-singular and $R^\top (\theta - \theta_0) = 0$). However, this contradicts the full rank condition of $D^\top E[\Delta V(x_t)^\top \Delta V(x_t)] D$. $\blacksquare$
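The orthogonal decomposition used at the start of this proof, $\theta = R(R^\top R)^{-1}\lambda_1 + D(D^\top D)^{-1}\lambda_2$, can be verified numerically. A small sketch with made-up matrices (all names and dimensions here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
p, d = 4, 2
R = rng.normal(size=(p, d))        # hypothetical restriction matrix
U, _, _ = np.linalg.svd(R)
D = U[:, d:]                       # orthogonal complement of col(R), so D'R = 0

dV = rng.normal(size=(1, p))       # a single DeltaV(x) row (K = 1)
theta = rng.normal(size=p)

lam1, lam2 = R.T @ theta, D.T @ theta
recomposed = (dV @ R @ np.linalg.inv(R.T @ R) @ lam1
              + dV @ D @ np.linalg.inv(D.T @ D) @ lam2)
print(np.allclose(dV @ theta, recomposed))
```

The identity holds because the two terms are the orthogonal projections of $\theta$ onto the column spaces of $R$ and $D$, which sum to the identity when the two spaces are orthogonal complements.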

References

[1] Aguirregabiria, V., and P. Mira (2002): "Swapping the Nested Fixed Point Algorithm: A Class of Estimators for Discrete Markov Decision Models," Econometrica, 70, 1519-1543.

[2] Aguirregabiria, V., and P. Mira (2007): "Sequential Estimation of Dynamic Discrete Games," Econometrica, 75, 1-53.

[3] Aguirregabiria, V., and P. Mira (2010): "Dynamic Discrete Choice Structural Models: A Survey," Journal of Econometrics, 156, 38-67.

[4] Amemiya, T. (1985): Advanced Econometrics, Harvard University Press.

[5] Bajari, P., C.L. Benkard, and J. Levin (2007): "Estimating Dynamic Models of Imperfect Competition," Econometrica, 75, 1331-1370.

[6] Bajari, P., H. Hong and D. Nekipelov (2013): "Econometrics for Game Theory," Advances in Economics and Econometrics: Theory and Applications, 10th World Congress.

[7] Bajari, P., V. Chernozhukov, H. Hong and D. Nekipelov (2012): "Semiparametric Estimation of a Dynamic Game of Incomplete Information," Working paper, University of Minnesota.

[8] Benkard, C.L., and S. Berry (2006): "On the Nonparametric Identification of Nonlinear Simultaneous Equations Models: Comment on Brown (1983) and Roehrig (1988)," Econometrica, 74, 1429-1440.

[9] Brown, B.W. (1983): "The Identification Problem in Systems Nonlinear in the Variables," Econometrica, 51, 175-196.

[10] Chernozhukov, V., H. Hong and E. Tamer (2007): "Estimation and Confidence Regions for Parameter Sets in Econometric Models," Econometrica, 75, 1243-1284.

[11] Egesdal, M., Z. Lai and C. Su (2013): "Estimating Dynamic Discrete-Choice Games of Incomplete Information," Working paper, University of Chicago Booth School of Business.

[12] Fisher, F.M. (1961): "Identifiability Criteria in Nonlinear Systems," Econometrica, 29, 574-590.

[13] Fisher, F.M. (1965): "Identifiability Criteria in Nonlinear Systems: A Further Note," Econometrica, 33, 197-205.

[14] Fisher, F.M. (1966): The Identification Problem in Economics. McGraw-Hill, New York.

[15] Heckman, J. and S. Navarro (2007): "Dynamic Discrete Choice and Dynamic Treatment Effects," Journal of Econometrics, 136, 341-396.

[16] Hong, H. and M. Shum (2010): "Pairwise-Difference Estimation of a Dynamic Optimization Model," Review of Economic Studies, 77, 273-304.

[17] Hotz, V., and R.A. Miller (1993): "Conditional Choice Probabilities and the Estimation of Dynamic Models," Review of Economic Studies, 60, 497-531.

[18] Hotz, V., R.A. Miller, S. Sanders and J. Smith (1994): "A Simulation Estimator for Dynamic Models of Discrete Choice," Review of Economic Studies, 61, 265-289.

[19] Komunjer, I. (2012): "Global Identification in Nonlinear Models with Moment Restrictions," Econometric Theory, 28, 719-729.

[20] Kreyszig, E. (1989): Introductory Functional Analysis with Applications, Wiley Classics Library.

[21] Magnac, T. and D. Thesmar (2002): "Identifying Dynamic Discrete Decision Processes," Econometrica, 70, 801-816.

[22] Maskin, E. and J. Tirole (2001): "Markov Perfect Equilibrium I: Observable Actions," Journal of Economic Theory, 100, 191-219.

[23] Matzkin, R.L. (1991): "Semiparametric Estimation of Monotone and Concave Utility Functions for Polychotomous Choice Models," Econometrica, 59, 1315-1327.

[24] Pakes, A., M. Ostrovsky, and S. Berry (2007): "Simple Estimators for the Parameters of Discrete Dynamic Games (with Entry/Exit Examples)," RAND Journal of Economics, 38, 373-399.

[25] Pesendorfer, M., and P. Schmidt-Dengler (2008): "Asymptotic Least Squares Estimators for Dynamic Games," Review of Economic Studies, 75, 901-928.

[26] Roehrig, C.S. (1988): "Conditions for Identification in Nonparametric and Parametric Models," Econometrica, 56, 433-447.

[27] Rothenberg, T.J. (1971): "Identification in Parametric Models," Econometrica, 39, 577-591.

[28] Rust, J. (1987): "Optimal Replacement of GMC Bus Engines: An Empirical Model of Harold Zurcher," Econometrica, 55, 999-1033.

[29] Rust, J. (1994): "Structural Estimation of Markov Decision Processes," Handbook of Econometrics, vol. 4, eds. R.F. Engle and D. McFadden. North-Holland.

[30] Rust, J. (1996): "Numerical Dynamic Programming in Economics," Handbook of Computational Economics, vol. 1, eds. H.M. Amman, D.A. Kendrick and J. Rust. Elsevier.

[31] Srisuma, S. (2013): "Minimum Distance Estimators for Dynamic Games," Quantitative Economics, 4, 549-583.

[32] Srisuma, S. and O.B. Linton (2012): "Semiparametric Estimation of Markov Decision Processes with Continuous State Space," Journal of Econometrics, 166, 320-341.

[33] Wegge, L. (1965): "Identifiability Criteria for a System of Equations as a Whole," Australian Journal of Statistics, 67-77.
