Binary Quantile Regression with Local Polynomial Smoothing

Songnian Chen and Hanghui Zhang
Department of Economics, HKUST

December 2012


Roadmap

1. Models
2. Motivations
   - Intuitions on MSC and SMSC
   - Links between SMSC and nonparametric kernel regression
   - Local constant fit vs. local linear fit
   - Literature review & pros and cons of existing methods
3. Identification
4. Estimation method
5. Assumptions and asymptotic properties of the estimator
6. Extensions
   - From median to quantiles (Kordas 2006) and counterfactual probability
   - Panel-data model
7. Monte Carlo simulations
8. Wrap-up

Motivations: Model Setup

This paper is concerned with estimating the standard binary choice model
$$Y = 1\{X'\beta + \varepsilon > 0\}$$
The disturbance term $\varepsilon$ is restricted in ways that ensure identification of $\beta$:

1. Parametric restriction specifying the distribution of $\varepsilon$, e.g., Probit or Logit model: MLE or NLS estimator.
2. Conditional median restriction $\mathrm{med}(\varepsilon \mid X) = 0$: Manski's maximum score estimator; Horowitz's smoothed maximum score estimator.

Our estimator: binary median regression with local polynomial smoothing. Two building blocks: SMSC and the local polynomial nonparametric regression estimator.

Motivations: Drawbacks of MSC and SMSC

Why propose a new method? MSC and SMSC each have their own drawbacks.

1. MSC:
   - Usually has relatively small bias in finite samples;
   - $n^{1/3}$ rate of convergence;
   - Non-standard asymptotic distribution (a functional of a Gaussian process with unknown parameters);
   - The bootstrap is invalid.

Motivations: Drawbacks of MSC and SMSC

2. SMSC:
   - Faster convergence rate: $n^{2/5}$ for the usual kernel and $n^{\gamma}$ for higher-order kernels, where $\gamma \in [2/5, 1/2)$;
   - Relatively larger bias in finite samples;
   - Very sensitive to the choice of bandwidth in the smoothing device, especially for higher-order kernels;
   - The bias-corrected estimator is not systematic and thus not stable.

Motivations: Drawbacks of MSC and SMSC

[Figure: absolute mean bias and RMSE of the slope and intercept estimates plotted against the bandwidth, for smsc, locallinear, high-order-smsc, and localpolynomial. Design HM1, n = 250; $\beta_0 = 0.5$, $\beta_1 = 1$.]

Motivations: Drawbacks of MSC and SMSC

[Figure: absolute mean bias and RMSE of the slope and intercept estimates plotted against the bandwidth, for smsc, locallinear, high-order-smsc, and localpolynomial. Design HM1, n = 500; $\beta_0 = 0.5$, $\beta_1 = 1$.]

Motivations: Drawbacks of MSC and SMSC

[Figure: absolute mean bias and RMSE plotted against the bandwidth, for smsc, locallinear, high-order-smsc, and localpolynomial. Panel design, n = 250; $\beta = 1$.]

Motivations: Drawbacks of MSC and SMSC

[Figure: absolute mean bias and RMSE plotted against the bandwidth, for smsc, locallinear, high-order-smsc, and localpolynomial. Panel design, n = 500; $\beta = 1$.]

Motivations: (Smoothed) Maximum Score Estimator (SMSC)

Manski's estimator (1975, 1985) maximizes the objective function
$$\max_b \frac{1}{n}\sum_{i=1}^n \big[\,1\{Y_i > 0\}\,1\{X_i'b > 0\} + 1\{Y_i \le 0\}\,1\{X_i'b \le 0\}\,\big]$$
The objective function counts an observation only when $Y_i$ shares the same sign as $X_i'b$.

Alternatively, the score is the number of correct predictions we would make if we predicted $Y_i$ to be 1 whenever $X_i'b > 0$ and 0 otherwise:
$$\max_b \frac{1}{n}\sum_{i=1}^n (2Y_i - 1)\,1\{X_i'b > 0\}$$

Horowitz's (1992) smoothed version, SMSC:
$$\max_b \frac{1}{n}\sum_{i=1}^n (Y_i - 0.5)\,K\!\left(\frac{X_i'b}{h}\right)$$
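To fix ideas, here is a minimal numerical sketch of both objectives (not from the paper; the Gaussian-CDF smoother, the data-generating step, and all names are illustrative choices):

```python
import numpy as np
from scipy.stats import norm

def msc_score(b, Y, X):
    """Manski's maximum score objective, in its (2Y - 1) form."""
    return np.mean((2 * Y - 1) * (X @ b > 0))

def smsc_score(b, Y, X, h):
    """Horowitz's smoothed maximum score objective; norm.cdf plays the
    role of the smooth kernel function K."""
    return np.mean((Y - 0.5) * norm.cdf(X @ b / h))

# Example call on simulated data, evaluated at the true coefficient vector:
rng = np.random.default_rng(0)
n = 500
X = np.column_stack([rng.normal(size=n), rng.normal(size=n)])
Y = (X @ np.array([1.0, 1.0]) + rng.logistic(size=n) > 0).astype(float)
print(msc_score(np.array([1.0, 1.0]), Y, X),
      smsc_score(np.array([1.0, 1.0]), Y, X, h=0.5))
```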

Motivations: Maximum Score Estimator (MSC)

The MSC:
$$\max_b \frac{1}{n}\sum_{i=1}^n (Y_i - 0.5)\,1\{X_i'b > 0\}$$

Intuition: under the median restriction $\mathrm{med}(\varepsilon \mid X) = 0$,
$$\{Y = 1 \mid X\} \;\Leftrightarrow\; \{\varepsilon > -X'\beta \mid X\}$$
and therefore
$$\mathrm{med}(\varepsilon \mid X) = 0 \;\Rightarrow\; \big[\Pr(Y = 1 \mid X) > 0.5 \;\Leftrightarrow\; X'\beta > 0\big]$$

Motivations: Link between SMSC and Kernel Nonparametric Regression

Normalize $\beta_1 = 1$ and write $\beta = (1, \tilde\beta')'$, $X_i = (X_{1i}, \tilde X_i')'$. The SMSC estimator $\hat\beta_{SMC}$ satisfies the first-order condition
$$\frac{1}{n}\sum_{i=1}^n (Y_i - 0.5)\,\tilde X_i\, k_h(X_i'\hat\beta_{SMC}) = 0, \qquad k_h(\cdot) = \frac{1}{h}\,k(\cdot/h)$$
Equivalently, the FOC can be formulated as
$$\hat E_n\big[(Y - 0.5)\tilde X \mid X'\hat\beta_{SMC} = 0\big] = 0$$
where
$$\hat E_n\big[(Y_i - 0.5)\tilde X_i \mid X_i'\hat\beta_{SMC} = 0\big] = \frac{\sum_{i=1}^n k_h(X_i'\hat\beta_{SMC})\,(Y_i - 0.5)\,\tilde X_i}{\sum_{i=1}^n k_h(X_i'\hat\beta_{SMC})}$$

Motivations: Link between SMSC and Kernel Nonparametric Regression

The FOC of SMSC can be viewed as a moment-based estimator derived from the conditional moment condition, replacing the population moment with a sample analogue obtained by kernel nonparametric regression at the point zero.

Population conditional moment: $E\big[(Y - 0.5)\tilde X \mid X'\beta = 0\big] = 0$.

There is thus an intrinsic link between the finite-sample and asymptotic properties of SMSC and those of kernel nonparametric regression.

Our idea: replace the kernel (local constant) regression with local linear or local polynomial regression in constructing the sample analogue of the conditional moment, as in the sketch below.
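The sample analogue above is just a Nadaraya-Watson regression evaluated at index value zero. A sketch (illustrative; the Gaussian kernel is an assumption, not the paper's choice):

```python
import numpy as np

def kh(u, h):
    """Gaussian kernel density scaled by the bandwidth h."""
    return np.exp(-0.5 * (u / h) ** 2) / (np.sqrt(2 * np.pi) * h)

def nw_moment_at_zero(b, Y, X, h):
    """Kernel (local constant) estimate of E[(Y - 0.5) Xtilde | X'b = 0],
    where the first column of X is the normalized regressor X1 and
    Xtilde = X[:, 1:]. Setting this vector to zero is the SMSC FOC."""
    w = kh(X @ b, h)                          # weights concentrate where X'b is near 0
    num = (w * (Y - 0.5))[:, None] * X[:, 1:]
    return num.sum(axis=0) / w.sum()
```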

Motivations: Link between SMSC and Kernel Nonparametric Regression: Three Remarks

1. Random-coefficients models:
   Model 1: $Y = 1\{X_1 + \tilde X'(\tilde\beta + v) + \varepsilon > 0\}$
   Model 2: $Y = 1\{X_1 + \tilde X'\tilde\beta(U) > 0\}$
   where in the first model the composite error is $\varepsilon^* = \tilde X'v + \varepsilon$ with $\mathrm{med}(\varepsilon^* \mid \tilde X) = 0$, and in the second model $U \sim U[0, 1]$ and $\beta(\cdot)$ is an increasing function of $U$.
2. Let $Z = X'\beta$ and let $g_Z(z)$ be the density of $Z$ at $z$. It is assumed that $g_Z(z)$ is bounded away from zero.
3. It is impossible to formulate a maximization problem whose FOC corresponds to the conditional moment based on local polynomial smoothing techniques (one can pass from an M-estimator to a moment-based estimator, NOT the reverse).

Motivations: Local Constant Fit vs. Local Linear Fit

Let $(X_i, Y_i)$ denote an i.i.d. random sample from the DGP $Y = m(X) + \varepsilon$, with conditional mean and conditional variance denoted respectively by
$$m(x_0) = E(Y \mid X = x_0) \quad\text{and}\quad \sigma^2(x_0) = \mathrm{Var}(Y \mid X = x_0)$$
The kernel nonparametric regression estimator of $m(x_0)$ is given by
$$\hat m(x_0) = \hat E_n(Y \mid X = x_0) = \frac{\sum_{i=1}^n k_h(X_i - x_0)\,Y_i}{\sum_{i=1}^n k_h(X_i - x_0)}$$
The kernel estimator is a local constant approximation: with $w_i(x_0) = k_h(X_i - x_0)$,
$$\hat m_c(x_0) = \arg\min_c \frac{1}{n}\sum_{i=1}^n (Y_i - c)^2\, w_i(x_0) = \frac{\sum_{i=1}^n w_i(x_0)\,Y_i}{\sum_{i=1}^n w_i(x_0)}$$

Motivations: Local Constant Fit vs. Local Linear Fit

Suppose that locally the regression function $m$ can be approximated by a linear function; for $x$ in a neighborhood of $x_0$, a Taylor expansion gives
$$m(x) \approx m(x_0) + m'(x_0)(x - x_0)$$
This suggests using a locally weighted linear regression:
$$\big(\hat m_l(x_0), \hat m_l'(x_0)\big) = \arg\min_{c_0, c_1} \frac{1}{n}\sum_{i=1}^n \big(Y_i - c_0 - c_1(X_i - x_0)\big)^2\, w_i(x_0)$$
$\hat m_l(x_0)$ can be expressed explicitly as a weighted sum,
$$\hat m_l(x_0) = \frac{\sum_{i=1}^n w_i^l(x_0)\,Y_i}{\sum_{i=1}^n w_i^l(x_0)}$$
where $w_i^l(x_0) = w_i(x_0)\big(S_{n2} - (X_i - x_0)\,S_{n1}\big)$ with $S_{nj} = \sum_{i=1}^n w_i(x_0)(X_i - x_0)^j$.
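A compact sketch of the general local polynomial fit at a point via weighted least squares (illustrative code; p = 0 reproduces the kernel fit and p = 1 the local linear weights above):

```python
import numpy as np

def local_poly_fit(x0, X, Y, h, p=1):
    """Local polynomial estimate of m(x0): the intercept of a
    kernel-weighted degree-p polynomial regression of Y on (X - x0)."""
    u = X - x0
    w = np.exp(-0.5 * (u / h) ** 2)              # Gaussian weights (an arbitrary choice)
    Z = np.vander(u, N=p + 1, increasing=True)   # columns: 1, u, ..., u^p
    WZ = Z * w[:, None]
    coef = np.linalg.solve(Z.T @ WZ, WZ.T @ Y)   # weighted least squares
    return coef[0]                               # fitted value at x0
```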

Motivations: Local Constant Fit vs. Local Linear Fit

Advantages of the local linear fit:

1. For an interior point $x_0$, the local constant estimator assigns symmetric weights to both sides of $x_0$, which creates a large bias under non-equispaced designs. A local linear estimator, however, adapts automatically to random designs through an asymmetric weighting scheme: e.g., if there are more data points on the left side of $x_0$, then $S_{n1}$ turns out negative, i.e., $w_i^l(x_0)$ gives more weight to the right side.
2. Asymptotic bias: the design-adaptive property of the local linear estimator.

   Method               p.w. asymptotic bias                     p.w. asymptotic variance
   Local constant fit   $\propto m''(x) + 2m'(x)f'(x)/f(x)$      $V_n$
   Local linear fit     $\propto m''(x)$                         $V_n$

Motivations: Local Constant Fit vs. Local Linear Fit: An Example

Example. Consider the following DGP:
$$Y_i = \sin(X_i) + \varepsilon_i, \qquad X_i \sim N(0.5, 1), \qquad \varepsilon_i \sim U[-1, 1]$$
This is a random, non-equispaced design. At the interior point $x_0 = 1$, the local constant estimator assigns symmetric weights to both sides of $x = 1$. In this random design, the estimator overweights the points on the left-hand side and hence creates a downward bias.

Motivations: Local Constant Fit vs. Local Linear Fit: An Example

[Figure: kernel nonparametric regression and local linear estimates of $m(x_0) = \sin(x_0)$ at $x_0 = 1$. Left panel: kernel, local linear, and local polynomial fits against the curve $y = \sin(x)$. Right panel: the local weight functions (kernel weight vs. local linear weight).]
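A small simulation in the spirit of this example (a sketch; the sample size, seed, and bandwidth are arbitrary choices) contrasts the two fits at $x_0 = 1$ using the local linear weights from the previous slides:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
X = rng.normal(0.5, 1.0, size=n)            # random, non-equispaced design
Y = np.sin(X) + rng.uniform(-1, 1, size=n)

x0, h = 1.0, 0.4
u = X - x0
w = np.exp(-0.5 * (u / h) ** 2)             # kernel weights w_i(x0)

m_nw = np.sum(w * Y) / np.sum(w)            # local constant (kernel) fit
S1, S2 = np.sum(w * u), np.sum(w * u**2)    # S_n1, S_n2
wl = w * (S2 - u * S1)                      # local linear weights w_i^l(x0)
m_ll = np.sum(wl * Y) / np.sum(wl)

print(f"true {np.sin(x0):.3f}  kernel {m_nw:.3f}  local linear {m_ll:.3f}")
```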

The Pros and Cons: MSC, SMSC and SMSLC

1. MSC: usually has relatively small bias in finite samples; $n^{1/3}$ rate of convergence; non-standard asymptotic distribution (a functional of a Gaussian process with unknown parameters); the bootstrap is invalid.
2. SMSC: faster convergence rate, $n^{2/5}$ for the usual kernel and $n^{\gamma}$ for higher-order kernels, where $\gamma \in [2/5, 1/2)$; relatively larger bias in finite samples; very sensitive to the choice of bandwidth in the smoothing device, especially for higher-order kernels; the bias-corrected estimator is not systematic and thus not stable.
3. SMSLC: same convergence rate as SMSC; systematic bias correction; less sensitive to the choice of bandwidth; weaker assumptions for identification and for establishing the asymptotic properties of our estimator; widely applicable to non-multiplicative heteroskedastic models, random-coefficients models, panel-data models, and other types of models.

Literature Review

1. MSC and SMSC: Manski (1975 [521 citations], 1985 [386], 1987); Horowitz (1992 [386], 1993, 2009); Charlier et al. (1995); Cavanagh (1987); Kim and Pollard (1990) [452]; Pollard (1993); Delgado et al. (2001); Abrevaya and Huang (2005); Manski and Tamer (2002) [223]; Sung et al. (2012), etc.
2. Bias correction and developments: Iglesias (2010); Kotlyarova et al. (2010); Blevins and Khan (2012); Khan (2013), etc.
3. Local polynomial estimator and its semiparametric applications: Fan and Gijbels (1996) [3302]; Ruppert and Wand (1994); Chen (1999); Chen and Khan (2000); Hahn, Todd and van der Klaauw (2001); Heckman, Ichimura and Todd (1997, 1998); Linton (1995, 2002); Powell and Khan (2001); Wang et al. (2010), etc.
4. Applications of MSC and SMSC: Kyriazidou (1997) [348]; Abrevaya (2000); Fox (2008), etc.

Identification: Manski's Identification

Sufficient conditions for point identification of $\beta$.

Theorem (Manski 1985). Let $\mathrm{med}(\varepsilon \mid X) = 0$ and $\beta_1 = 1$. Then $\beta$ is point-identified if
(a) the support of the distribution of $X$ is not contained in any proper linear subspace of $R^d$;
(b) $X_1 \mid \tilde X = \tilde x$ has an everywhere positive density a.s.

Proof sketch.
First step: $\Pr(Y = 1 \mid X = x) = 1 - \Pr(\varepsilon \le -x'\beta \mid X = x)$, thus $\Pr(Y = 1 \mid X = x) > 0.5$ iff $x'\beta > 0$.
Second step: let $S_1(b) = \{x : x'\beta < 0 \le x'b\}$ and $S_2(b) = \{x : x'b < 0 \le x'\beta\}$; then $\beta$ is point-identified if $\Pr[S_1(b) \cup S_2(b)] > 0$ for every $b \ne \beta$, which holds whenever (a) and (b) of the theorem hold.

Identification: Horowitz's Identification

Theorem (Horowitz 1992, 2009). Let $\mathrm{med}(\varepsilon \mid X) = 0$ and $\beta_1 = 1$. Then $\beta$ is point-identified if for some $\theta > 0$ there are an interval $I_\theta = [-\theta, \theta]$ and a set $N_\theta \subset R^{d-1}$ such that
(a) $N_\theta$ is not contained in any proper linear subspace of $R^{d-1}$;
(b) $\Pr(\tilde X \in N_\theta) > 0$;
(c) for almost every $\tilde x \in N_\theta$, the distribution of $X'\beta$ conditional on $\tilde X = \tilde x$ has a probability density that is everywhere positive on $I_\theta$.

Identification: Two Important Lemmas

Lemma 1. Suppose (i) $\mathrm{med}(\varepsilon \mid X) = 0$ a.s.; (ii) $X_1$ given $\tilde X$ is continuously distributed; (iii) the parameter space $B$ is compact and, for each $b \in B$, $\Pr(\tilde X'\tilde\delta = 0 \mid X'b = 0) < 1$ for any $\tilde\delta \ne 0$, i.e., $E(\tilde X \tilde X' \mid X'b = 0)$ is of full rank a.s. Then for any $b \in B$ with $b \ne \beta$,
$$E\big[(Y - 0.5)\tilde X \mid X'b = 0\big] \ne 0$$

Lemma 2. Under Horowitz's (1992, 2009) identification assumptions, for any $\tilde\delta \ne 0$,
$$\Pr(\tilde X'\tilde\delta = 0 \mid X'\beta = 0) < 1$$

Identification: Two Important Lemmas

Definition (Admissible parameter space).
$$B = \big\{b : \Pr(\tilde X'\tilde\delta = 0 \mid X'b = 0) < 1 \text{ for any } \tilde\delta \ne 0\big\}$$
or, equivalently, $B = \{b : E(\tilde X \tilde X' \mid X'b = 0) \text{ is of full rank a.s.}\}$.

The fundamental insight from Lemma 1 is that any $b$ without this property would be inadmissible, so we work with the admissible parameter space.
The fundamental insight from Lemma 2 is that the conditional moment condition has the true value $\beta$ as its unique solution among the admissible values.

Identification: Two Important Lemmas

Proof sketch. Let $\tilde\delta = \tilde b - \tilde\beta$ and write $m(b, 0)'\tilde\delta = E\big[(Y - 0.5)\,\tilde X'\tilde\delta \mid X'b = 0\big]$. Note that
$$m(b, 0)'\tilde\delta = E\Big[\big(F(0 \mid X) - F(\tilde X'\tilde\delta \mid X)\big)\,\tilde X'\tilde\delta \,\Big|\, X'b = 0\Big]$$
and, since $F(\cdot \mid X)$ is strictly increasing near its median, $F(0 \mid X) - F(\tilde X'\tilde\delta \mid X)$ and $\tilde X'\tilde\delta$ have opposite signs whenever $\tilde X'\tilde\delta \ne 0$. Thus $m(b, 0) = 0$ implies
$$\Pr\big(\tilde X'\tilde\delta = 0 \mid X'b = 0\big) = 1$$
and hence $\tilde\delta = 0$, because of (ii) and the fact that $E(\tilde X \tilde X' \mid X'b = 0)$ is of full rank.

Estimation Method

Moment-based estimator (or Z-estimator): solve
$$e_0'\,\hat\gamma(b) = 0$$
subject to
$$\det\left(\frac{\sum_{i=1}^n k_c(X_i'b / h_c)\,\tilde X_i \tilde X_i'}{\sum_{i=1}^n k_c(X_i'b / h_c)}\right) > c_0 \quad\text{and}\quad \frac{1}{n h_c}\sum_{i=1}^n k_c\!\left(\frac{X_i'b}{h_c}\right) > c_0$$
where $\hat\gamma(b)$ is defined as
$$\hat\gamma(b) = \big(\mathbf X(b)'\,\mathbf W(b)\,\mathbf X(b)\big)^{-1}\,\mathbf X(b)'\,\mathbf W(b)\,\mathbf U$$
and
$$\mathbf X(b) = \big[(X_i'b)^j\big],\; i = 1, \dots, n;\; j = 0, \dots, p \qquad (n \times (p+1))$$
$$\mathbf W(b) = \mathrm{diag}\big\{k_h(X_i'b)\big\},\; i = 1, \dots, n \qquad (n \times n)$$
$$\mathbf U = \big[(Y_i - 0.5)\,\tilde X_i'\big],\; i = 1, \dots, n \qquad (n \times (d-1))$$
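A sketch of $\hat\gamma(b)$ as a weighted least squares fit (illustrative implementation; the Gaussian kernel and the default p = 3 are placeholder choices, not the paper's):

```python
import numpy as np

def gamma_hat(b, Y, X, h, p=3):
    """gamma_hat(b): regress (Y - 0.5) * Xtilde on powers of the index X'b
    with kernel weights k_h(X'b); returns a (p+1) x (d-1) matrix."""
    v = X @ b                                      # the scalar index X_i'b
    w = np.exp(-0.5 * (v / h) ** 2) / h            # k_h(X_i'b), Gaussian kernel
    Z = np.vander(v, N=p + 1, increasing=True)     # X(b): columns 1, v, ..., v^p
    U = (Y - 0.5)[:, None] * X[:, 1:]              # U: rows (Y_i - 0.5) Xtilde_i'
    WZ = Z * w[:, None]
    return np.linalg.solve(Z.T @ WZ, WZ.T @ U)

def moment_at_zero(b, Y, X, h, p=3):
    """e0' gamma_hat(b): the estimated conditional moment at X'b = 0."""
    return gamma_hat(b, Y, X, h, p)[0]
```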

Estimation Method: Work with the Admissible Parameter Set

For SMSLL and SMSLC we make a simple yet effective adjustment to impose the full-rank condition: solve the constrained moment equation by minimizing
$$\big\|\, e_0'\,\hat\gamma(b)\, A_n(b) \,\big\|$$
where $A_n(b) = \frac{1}{n h_c}\sum_{i=1}^n X_i X_i'\, k_c(X_i'b / h_c)$, which can be viewed as a consistent estimator of $E(X_i X_i' \mid X_i'b = 0)\, g_{Z_b}(0)$.

Multiplication by $A_n(b)$ can be thought of as imposing a penalty on values of $b$ for which $A_n(b)$ is close to singular. With this adjustment, it appears that choosing $c_0$ to be a very small number, such as 0.01, is sufficient to rule out problematic values. The bandwidth $h_c$ used in estimating $A_n(b)$ is set to $1.06\,h_{rot}$, where $h_{rot}$ is Silverman's (1986) rule-of-thumb bandwidth.
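Putting the pieces together, a sketch of the penalized criterion and its outer minimization (illustrative: to make the matrix product conform we build $A_n(b)$ from $\tilde X$ rather than the full $X$, the hard penalty is our device, and Nelder-Mead is just one possible derivative-free optimizer):

```python
import numpy as np
from scipy.optimize import minimize

def criterion(btilde, Y, X, h, hc, p=3, c0=0.01):
    """|| e0' gamma_hat(b) A_n(b) ||, with a large penalty when the
    admissibility constraints (determinant and density bounds) fail."""
    n = len(Y)
    b = np.concatenate(([1.0], btilde))            # beta_1 normalized to 1
    v = X @ b
    kc = np.exp(-0.5 * (v / hc) ** 2) / np.sqrt(2 * np.pi)   # k_c(X'b / h_c)
    Xt = X[:, 1:]
    An = (Xt * kc[:, None]).T @ Xt / (n * hc)      # A_n(b), built from Xtilde here
    fhat = kc.sum() / (n * hc)                     # (1 / n h_c) sum of k_c
    if fhat <= c0 or np.linalg.det(An / fhat) <= c0:
        return 1e10                                # outside the admissible set
    m0 = moment_at_zero(b, Y, X, h, p)             # from the previous sketch
    return np.linalg.norm(m0 @ An)

# res = minimize(criterion, btilde_start, args=(Y, X, h, hc), method="Nelder-Mead")
```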

Assumptions and Asymptotic Properties: Key Assumptions

Assumption 1: $\{Y_i, X_i : i = 1, 2, \dots, n\}$ is an i.i.d. random sample of $(Y, X)$ generated from the model $Y = 1\{X'\beta + \varepsilon > 0\}$, with $X \in R^d$.

Assumption 2: $\mathrm{Median}(\varepsilon \mid X = x) = 0$ for almost every $x$.

Assumption 3: $\beta_1 = 1$ and $\tilde\beta$ is an interior point of $\tilde B$, a compact subset of $R^{d-1}$.

Assumption 4: For almost every $\tilde x = (x_2, \dots, x_d)$, $X_1$ conditional on $\tilde X = \tilde x$ is absolutely continuously distributed with respect to the Lebesgue measure. The components of $\tilde X$ are bounded.

Assumptions and Asymptotic Properties: Key Assumptions

Let $Z = X'\beta$, let $F(\cdot \mid z, \tilde x)$ denote the conditional cumulative distribution function of $\varepsilon$ given $(Z, \tilde X) = (z, \tilde x)$, and let $m(z) = E\big[(F(-z \mid z, \tilde X) - 0.5)\,\tilde X' \mid Z = z\big]$.

Assumption 5: (i) $m(z)$ and $g(z \mid \tilde x)$ are continuous in $z$ and uniformly bounded for almost every $\tilde x$; (ii) for all $z$ in a neighborhood of 0, and for almost all $\tilde x$, $g^{(1)}(z \mid \tilde x)$, $F^{(1)}(-z \mid z, \tilde x)$, and all the components of $m^{(\ell)}(z)$ are uniformly bounded and continuous in $z$, for each integer $\ell$ such that $1 \le \ell \le p + 1$.

Assumption 6: (i) The kernel function $k$ is a continuous and symmetric density function with bounded support, and $k^{(1)}$ is continuous with bounded support; (ii) the kernel function $k_c$ is a density function, with $h_c \to 0$ and $n h_c \to \infty$ as $n \to \infty$.

Assumption 7: $Q = E\big[\tilde X \tilde X'\, F^{(1)}(0 \mid 0, \tilde X)\, g(0 \mid \tilde X)\big]$ is nonsingular.

Assumptions and Asymptotic Properties: Remarks

1. We do NOT make explicit assumptions on the support of the conditional density of $Z$ given $\tilde X$;
2. The smoothness conditions are mainly imposed on the regression function $m(z)$, not separately on $F(-z \mid z, \tilde x)$ and $g(z \mid \tilde x)$; in particular, only minimal smoothness of $g_Z(\cdot)$ at 0 is required;
3. Nonsingularity of $Q$ implies that $E(\tilde X \tilde X' \mid X'\beta = 0)$ is nonsingular, and thus there exists some $c_0$ such that $\det E(\tilde X \tilde X' \mid X'\beta = 0) > c_0$ and $g_Z(0) > c_0$.

Assumptions and Asymptotic Properties: Asymptotic Normality (Main Theorem)

Theorem. (i) Under Assumptions 1-4, if $h \to 0$, $h_c \to 0$, $\ln n / (nh) \to 0$, and $\ln n / (n h_c) \to 0$ as $n$ goes to infinity, then $\tilde b_n$ is consistent. Furthermore, under Assumptions 1-7: (ii) if $n h^{2p+3} \to \lambda$ as $n$ goes to infinity, then
$$\sqrt{nh}\,\big(\tilde b_n - \tilde\beta\big) \stackrel{d}{\to} N\big(\lambda^{1/2}\, Q^{-1} A_{CZ},\; Q^{-1} D\, Q^{-1}\big)$$
where
$$A_{CZ} = \alpha_A\,\frac{1}{(p+1)!}\,\big[m^{(p+1)}(0)\big]'\, g_Z(0)$$
and
$$D = \frac{1}{4}\int \bar k(u)^2\, du\; E\big[\tilde X \tilde X'\, g(0 \mid \tilde X)\big], \qquad \bar k(t) = e_0'\, S_p^{-1}\,(1, t, \dots, t^p)'\, k(t)$$

Assumptions and Asymptotic Properties: Bias Correction

The asymptotic bias term for the smoothed maximum score estimator is of the form
$$\lambda^{1/2} Q^{-1} \bar A_H = \lambda^{1/2} Q^{-1} A_{CZ} + \lambda^{1/2} Q^{-1} A_H$$
where
$$A_H = \alpha_A \sum_{\ell=1}^{p} \frac{1}{\ell!\,(p + 1 - \ell)!}\,\big[m^{(\ell)}(0)\big]'\, g_Z^{(p+1-\ell)}(0)$$
The asymptotic bias of our estimator does NOT depend on the design density of $Z$ at 0 or its various derivatives: this is its design-adaptive nature. The phenomenon is reminiscent of the difference between the asymptotic biases of the kernel nonparametric regression and local (linear) polynomial regression estimators, and it tends to deliver smaller finite-sample biases.

Extensions: Quantile Binary Choice Model

Quantile restriction: $Q_\tau(\varepsilon \mid X = x) = 0$ for almost every $x$. The SMSC for the quantile binary choice model:
$$\max_b S_{H\tau}(b) = \sum_{i=1}^n \big(Y_i - (1 - \tau)\big)\, K\!\left(\frac{X_i'b}{h}\right)$$
Define the local polynomial based binary $\tau$-th quantile regression estimator $\tilde b_{\tau n}$ as a solution to $e_0'\,\hat\gamma_\tau(b) = 0$, subject to the same constraints as in the median case, where
$$\hat\gamma_\tau(b) = \big(\mathbf X(b)'\,\mathbf W(b)\,\mathbf X(b)\big)^{-1}\,\mathbf X(b)'\,\mathbf W(b)\,\mathbf U_\tau$$
with
$$\mathbf U_\tau = \big[\big(Y_i - (1 - \tau)\big)\,\tilde X_i'\big],\; i = 1, \dots, n \qquad (n \times (d-1))$$
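Relative to the median case, only the dependent-variable transform changes; a one-line variant of the earlier `gamma_hat` sketch (illustrative):

```python
def gamma_hat_tau(b, Y, X, h, tau, p=3):
    """Quantile version: (Y - (1 - tau)) replaces (Y - 0.5). Shifting Y by
    0.5 - (1 - tau) lets us reuse the median-case gamma_hat unchanged."""
    return gamma_hat(b, Y - (1 - tau) + 0.5, X, h, p)
```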

Extensions: Panel Data Models: Setup

Binary choice panel data model with fixed effects:
$$Y_{it} = 1\{X_{it}'\beta + \alpha_i + \varepsilon_{it} > 0\}, \qquad t = 1, 2, \dots, T$$
where $T$ is fixed and $\alpha_i$ is the time-invariant fixed effect. $\varepsilon_{i1}$ and $\varepsilon_{i2}$ have the same distribution (stationarity) conditional on $(X_{i1}, X_{i2}, \alpha_i)$.

Abrevaya's generalized fixed-effects regression model:
$$D(Y_{it}) = F\big(X_{it}'\beta, \alpha_i, \varepsilon_{it}\big)$$
where $\alpha_i$ is the time-invariant fixed effect, which may be correlated with $X_{it}$ or the error disturbances; the function $D$ is weakly increasing and $F$ is strictly increasing in its first and third arguments; $D(Y_{it})$ is observed instead of $Y_{it}$.

Example: $Y_{it} = 1\big\{\alpha_{1i}\,(X_{it}'\beta)^{\alpha_{2i}} + \alpha_{3i} + \varepsilon_{it} > 0\big\}$, where $\alpha_{1i}, \alpha_{2i} > 0$.

Extensions: Panel Data Models: Intuition

MSC for the binary choice panel data model (shown here in its smoothed form):
$$\max_b S_{Hnp}(b) = \sum_{i=1}^n (Y_{i1} - Y_{i2})\, K\!\left(\frac{\Delta X_i'b}{h}\right) = \sum_{i=1}^n \big(1\{Y_{i1} - Y_{i2} \ge 0\} - 1\{Y_{i1} - Y_{i2} \le 0\}\big)\, K\!\left(\frac{\Delta X_i'b}{h}\right)$$
Intuition:
$$Y_1 \ge Y_2 \;\Leftrightarrow\; \varepsilon_1 - \varepsilon_2 \ge -(X_1 - X_2)'\beta, \qquad Y_2 \ge Y_1 \;\Leftrightarrow\; \varepsilon_2 - \varepsilon_1 \ge -(X_2 - X_1)'\beta$$
and since $\varepsilon_1$ and $\varepsilon_2$ are identically distributed conditional on $(X_1, X_2, \alpha)$,
$$\Delta X'\beta > 0 \;\Rightarrow\; \Pr(Y_1 \ge Y_2 \mid X_1, X_2, \alpha) > \Pr(Y_2 \ge Y_1 \mid X_1, X_2, \alpha)$$

Extensions: Panel Data Models: Estimator

SMSC for the binary choice panel data model:
$$\max_b S_{Hnp}(b) = \sum_{i=1}^n (Y_{i1} - Y_{i2})\, K\!\left(\frac{\Delta X_i'b}{h}\right)$$
The local polynomial based panel smoothed maximum score estimator can be defined as a solution to
$$e_0'\,\big(\Delta\mathbf X(b)'\,\Delta\mathbf W(b)\,\Delta\mathbf X(b)\big)^{-1}\,\Delta\mathbf X(b)'\,\Delta\mathbf W(b)\,\Delta\mathbf U = 0$$
subject to
$$\det\left(\frac{\sum_{i=1}^n k_c(\Delta X_i'b / h_c)\,\Delta\tilde X_i\,\Delta\tilde X_i'}{\sum_{i=1}^n k_c(\Delta X_i'b / h_c)}\right) > c_0 \quad\text{and}\quad \frac{1}{n h_c}\sum_{i=1}^n k_c\!\left(\frac{\Delta X_i'b}{h_c}\right) > c_0$$
where
$$\Delta\mathbf U = \big[(Y_{i1} - Y_{i2})\,\Delta\tilde X_i'\big],\; i = 1, \dots, n \qquad (n \times (d-1))$$
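The panel estimator reuses the cross-sectional machinery on differenced data; a sketch of the substitution (illustrative, for T = 2, with the same placeholder kernel as before):

```python
import numpy as np

def panel_moment_at_zero(b, Y1, Y2, X1, X2, h, p=3):
    """Panel analogue of e0' gamma_hat(b): local polynomial fit of
    (Y_i1 - Y_i2) * dXtilde on powers of the differenced index dX'b."""
    dX = X1 - X2                                   # Delta X_i
    v = dX @ b
    w = np.exp(-0.5 * (v / h) ** 2) / h            # k_h(Delta X_i'b)
    Z = np.vander(v, N=p + 1, increasing=True)
    dU = (Y1 - Y2)[:, None] * dX[:, 1:]            # Delta U rows
    WZ = Z * w[:, None]
    return np.linalg.solve(Z.T @ WZ, WZ.T @ dU)[0]
```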

Monte Carlo Simulation: Designs

We compare five estimators: MSC, SMSC, HSMSC, SMSLL, SMSLC.

Five designs: two homoskedastic designs, two multiplicative heteroskedastic designs, and a random-coefficients design for the cross-sectional case, plus one design for the panel data case.

$$Y = 1\{X_1 + \beta_1 X_2 + \beta_0 + \sigma(X)\,\varepsilon > 0\}, \qquad \beta_1 = 1,\; \beta_0 = 0.5$$
with $X_1 \sim N(0, 2)$ and $X_2 \sim \chi^2_{(1)} - 1$; $\varepsilon \sim N(0, 1)$ (Design 1.1, HM1), or $\varepsilon$ follows a median-standardized mixture of Gamma and normal distributions, $\tilde\varepsilon \sim 0.5\,\mathrm{Gamma} + 0.5\,N(0, 1)$ (Design 1.2, HM2); $\sigma(X) = \exp(X_1 X_2 / 3)$ (Designs 2.1 and 2.2, HT).
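For replication, a sketch of the HM1/HT data-generating process (illustrative; N(0, 2) is read here as variance 2, and the HM2 mixture is omitted because its exact standardization is not fully specified above):

```python
import numpy as np

def generate_hm1(n, beta1=1.0, beta0=0.5, heteroskedastic=False, seed=0):
    """Design 1.1 (HM1): Y = 1{X1 + beta1*X2 + beta0 + sigma(X)*eps > 0};
    heteroskedastic=True gives the HT variant sigma(X) = exp(X1*X2/3)."""
    rng = np.random.default_rng(seed)
    X1 = rng.normal(0.0, np.sqrt(2.0), size=n)     # N(0, 2), read as variance 2
    X2 = rng.chisquare(1, size=n) - 1.0            # chi-squared(1) minus 1
    eps = rng.normal(size=n)
    sigma = np.exp(X1 * X2 / 3.0) if heteroskedastic else 1.0
    Y = (X1 + beta1 * X2 + beta0 + sigma * eps > 0).astype(float)
    return Y, np.column_stack([X1, X2, np.ones(n)])
```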

Monte Carlo Simulation: Designs

Random-coefficients model (RC):
$$Y = 1\{X_1 + \beta_1(U_1)\,X_2 + \beta_0(V_1) + 0.5\,\varepsilon > 0\}$$
where $\beta_1(U_1) = U_1 + \beta_1$ and $\beta_0(V_1) = V_1 + \beta_0$ with $\beta_0 = \beta_1 = 1$; $X_2$ is drawn from $U[-2, 2]$ and $X_1 = X_3 + 0.5 X_2 + 0.5$ with $X_3$ standard normal and independent of $X_2$; $V_1$ is uniformly distributed over $[0, 1]$; and $\varepsilon$ is an equal mixture of $U_2$ and $U_3$, where $(U_1, U_2, U_3)$ follows the joint normal distribution
$$\begin{pmatrix} U_1 \\ U_2 \\ U_3 \end{pmatrix} \sim N\left(\begin{pmatrix} 0 \\ -2 \\ 2 \end{pmatrix},\; \begin{pmatrix} 1 & 0.5 & -0.5 \\ 0.5 & 0.75 & 0 \\ -0.5 & 0 & 0.75 \end{pmatrix}\right)$$

Monte Carlo Simulation: Designs

Panel data model:
$$Y_{it} = 1\{\exp(\alpha_{1i})\,(X_{1it} + \beta X_{2it}) + \alpha_{2i} + \varepsilon_{it} > 0\}, \qquad t = 1, 2$$
where $\beta = 1$, and $\alpha_{1i}$ and $\alpha_{2i}$ are independently drawn from $U[-1, 0]$. The regressor $X_{1it}$ follows $N(0, 4)$ in both periods and $X_{2i1} \sim \chi^2_{(1)}$ in the first period, while in the second period $X_{2i2}$ has a time drift with a perturbation: $X_{2i2} = 1 + X_{2i1} + U[-\gamma, \gamma]$, where $\gamma = 2.5$. The error $\varepsilon_{it}$ follows a logistic distribution with zero mean and unit variance, independent across $i$ and over $t$. In this model the fixed effect contains both additive and multiplicative elements. The support of $\Delta X_{2i}$ is mainly controlled by the constant $\gamma$.

Monte Carlo Simulation: Population Moment Function

For a specific design we can plot the population moment function $m(t, \beta) = E\big[(Y - 0.5)\tilde X \mid X'\beta = t\big]$.

Design: $Y = 1\{X_1 - X_2 - 1 + \varepsilon > 0\}$, where $X_1 \sim N(0.5, 1)$, and $X_2$ and $\varepsilon$ are $N(0, 1)$.

[Figure: the population moment function $m(t, \beta)$ plotted against $t$ for the slope and intercept components ($\beta_1$, $\beta_0$); $m(0, \beta) = 0$.]

Monte Carlo Simulation: Numerical Results

Objective function plot, panel data model, sample size n = 250.

[Figure: left panel, objective functions of MSC and SMSC for the panel data model (c = 0.6, n = 250); right panel, objective function of SMSLC for the panel data model (c = 0.6, n = 250), each plotted against the parameter $\beta$.]

Monte Carlo Simulation: Numerical Results

Objective function plot, homoscedastic model, sample size n = 250


Wrap Up

Local polynomial smoothing based estimators for both the cross-sectional and panel-data binary choice models; weaker assumptions but more favorable finite-sample and asymptotic properties than the smoothed maximum score estimators. Our limited Monte Carlo experiments indicate that our estimators, especially the ones based on local cubic regression techniques, offer dramatic improvements.

Possible future research: choice-based sampling, etc.

Comments are welcome. Thanks!
