
Asymptotic Distribution of Factor Augmented Estimators for Panel Regression∗

Ryan Greenaway-McGrevy
Bureau of Economic Analysis, Washington, D.C.

Chirok Han
Department of Economics, Korea University

Donggyu Sul
Department of Economics, University of Texas at Dallas

September 2008

Abstract

In this paper we derive asymptotic theory for linear panel regression augmented with estimated common factors. We give conditions under which the estimated factors can be used in place of the latent factors in the regression equation. For the principal component estimates of the factor space it is shown that these conditions are satisfied when $T/N \to 0$ and $N/T^3 \to 0$ under regularity. Monte Carlo studies verify the asymptotic theory.

Keywords: Panel Data, Factor Augmented Regression, Principal Components, Cross Section Dependence, Serial Correlation. JEL Classification: C33

∗ The views expressed herein are those of the authors and not necessarily those of the Bureau of Economic Analysis or the Department of Commerce. We thank I. Choi, C.J. Kim, Su Lianjun, R.H. Moon, Jin Sainan, K. Tanaka, Y. J. Whang, the editors and anonymous referees for helpful comments. A previous version of this paper was circulated under the title “Estimating and Testing Idiosyncratic Equations using Cross Section Dependent Panel Data”.


1 Introduction

Recent panel-data research has incorporated strong cross-section dependence into the conventional panel regression model by introducing a factor structure into the regression error. A prototypical panel regression with a factor error structure is given by

$$y_{it} = \beta' x_{it} + u_{it}, \qquad u_{it} = \lambda_i' F_t + \varepsilon_{it} \tag{1}$$

where $F_t$ is an $r$-vector of latent common factors in the regression error $u_{it}$, $\lambda_i$ is a vector of factor loadings, and $x_{it}$ is a vector of explanatory variables (see, e.g., Pesaran, 2006; Bai, 2009a). No restriction is imposed on the relationship between the regressor $x_{it}$ and the common component $\lambda_i' F_t$, so the conventional least squares (LS) or least-squares dummy variable (LSDV) methods may yield inconsistent estimators due to endogeneity.

Several methods have recently been developed to consistently estimate the $\beta$ parameter. Ahn, Lee and Schmidt (ALS, 2006) extend the single factor model of Ahn, Lee and Schmidt (2001) to allow multiple factors, and provide estimation methods based on moment restrictions on the error term (e.g., white noise or parametric ARMA structure) for small $T$ (time-series observations). Pesaran (2006) proposes filtering out common factors by including the cross-sectional averages of $(y_{it}, x_{it}')'$ in a regression. Under regularity this ‘common correlated effects’ (CCE) estimator is consistent provided a rank condition is satisfied (e.g., the number of observed variables in the equation is at least as large as $r$). Recently, Bai (2009a) proposes estimating $\beta$ jointly with the factor space $\{F_1, \ldots, F_T\}$ and factor loadings $\{\lambda_1, \ldots, \lambda_N\}$ by minimizing the sum of squared residuals

$$\sum_{i=1}^{N} \sum_{t=1}^{T} \left( y_{it} - \beta' x_{it} - \lambda_i' F_t \right)^2$$

subject to the normalizations that $T^{-1} \sum_{t=1}^{T} F_t F_t' = I_k$ and that $N^{-1} \sum_{i=1}^{N} \lambda_i \lambda_i'$ is diagonal, with $\lambda_i \in \mathbb{R}^k$ and $F_t \in \mathbb{R}^k$ for given $k$. Bai (2009a) shows that this LS estimator is consistent as $N, T \to \infty$ as long as $k \ge r$, without requiring the rank condition of Pesaran (2006), and permitting general weak dependence and heteroskedasticity in the error term $\varepsilon_{it}$. Note that Bai (2009a) controls for common factors of the regression ‘residuals’ (i.e., $y_{it} - \beta' x_{it}$), whereas Pesaran (2006), by using the cross-section averages of $(y_{it}, x_{it}')'$, controls for the common factors of the ‘observable’ variables (i.e., $y_{it}$ and $x_{it}$).

Alternatively one may augment the panel regression with some other factor estimate from the observable variables, such as the principal components (PC) estimate. In fact, Kapetanios and Pesaran (2007) consider a version of this factor augmented estimator and study the finite sample properties by means of small Monte Carlo experiments. Giannone and Lenza (2008) use a similar factor augmented panel regression to estimate the international saving-investment relationship, in which global factors are extracted from the observables.

One purpose of the present paper is to establish formal asymptotics for these factor augmented panel regressions. As yet, rigorous asymptotics have not been derived, although some authors have suggested conditions under which the PC estimate can replace the common factors in the panel regression without affecting the limiting distribution of the LS estimator. Specifically, based on his Theorem 1, Bai (2003, p. 146) states that the conditions $\sqrt{T}/N \to 0$ and $N, T \to \infty$ are sufficient under regularity for replacing unobservable common factors with the PC estimate in time series models (Stock and Watson, 2002; Bernanke, Boivin, and Eliasz, 2005; Bai and Ng, 2006). Kapetanios and Pesaran (2007) and Giannone and Lenza (2008) conjecture that this condition also applies to panel regression models such as (1). However, in the pooled panel regression, the fact that the factor loadings are individually specific confounds this generalization, and $\sqrt{T}/N \to 0$ is in fact not sufficient for using PC estimated factors in place of the true factors. Instead we find that $T/N \to 0$ and $N/T^3 \to 0$ are sufficient for the replacement of the unobservable common factors with the PC estimates under regularity. This finding is supported by intuitive explanation and is verified by simulation. In establishing these results we provide general conditions under which any factor estimate can replace the true common factors in the regression, so that our results can be straightforwardly applied to various other factor estimates such as GMM estimators (ALS, 2006) or efficient PC estimates (Choi, 2008). This is our first contribution.

Another purpose of this paper is to propose and establish asymptotics for a straightforward one-step dynamic estimator. This method is of practical interest because it exhibits greater efficiency in the presence of serial correlation in the regression errors.

The remainder of the paper consists of three sections. In the next section we explain the model and the estimators. Section 3 derives the asymptotic properties of those estimators, and provides Monte Carlo studies to verify the established theorems. Section 4 concludes. To save space, we do not provide technical proofs of the theorems presented herein. The proofs are available from the authors upon request.

Throughout, ‘$\to_p$’ denotes convergence in probability, and ‘$\Rightarrow$’ convergence in distribution, as $N \to \infty$ and $T \to \infty$ jointly; $\|A\| = [\mathrm{tr}(A'A)]^{1/2}$; ‘LLN’ is an acronym for ‘law of large numbers’, and ‘CLT’ for ‘central limit theory’.

2 Model and Estimators In vector notation, (1) is given by (2)

 =   +    +   3

where  = (1       )0 ,   = (1       )0 ,  = (1       )0 , and  = (1       )0 . Following convention (e.g. Pesaran, 2006; Bai, 2009a), we permit  to be arbitrarily correlated with the unobservable     In many factor-error regressions this correlation is modelled through a latent factor structure in  . That is, when (3)

 =     +  

  we permit     to be correlated with   . Let   denote the common factors to  , and let  denote the associated factor loading vector. We define  to be the  ×  matrix consisting of a subset of the columns of   and   such that  −1  0  is nonsingular, and   =   and   =   for some selection matrices  and  . That is,  contains all unique columns in   and   , and when  and  share the same factor, it is included only once as a column in  . Then    =    −      =         (  −   ), which implies that  =   for some selection matrix  . Thus, by augmenting the panel regression equation  =   +  with  , we can always control for   . But since  is unobservable, this estimation method is infeasible. Instead we consider augmenting the regression with the estimated common factors. It is worth noting that this treatment partials out more variation in  than necessary for the identification of , thus it may lead to loss of efficiency.1 In contrast, Bai (2009a) controls for the residual common factors only (i.e., common factors in  − 0 ), and it is conjectured that his estimator is asymptotically efficient if  are  over  and  (Bai, 2009a, Corollary 1). In practical applications, factor number estimation is important, and Bai (2009b) provides a criterion which gives a consistent factor number estimator. However, the small sample performance of the LS estimator with the factor number estimated this way might be compromised, possibly leading to a multi-modal sampling distribution of the  estimate. This problem is especially likely for data generating processes in which fewer common factors exist in  − 0  than in  − 0  for some  6=  We have defined  as the maximal common factor set of the observable variables (  0 )0   listed without duplication. Let  and  be such that     =   and   =   . 
  (When   =   and   =   as above, we have  =    and  =   − P  −12 0    .) Let  ≡  =1   for any column vectors  and  . We impose the following restrictions on (2)–(3), which are expressed as high level assumptions for simplicity. Sufficient fundamental conditions are stated in the remarks to follow.

Assumption A

(i) $T^{-1} F'F$ is convergent and asymptotically nonsingular;

1 This was pointed out by an anonymous referee on a previous draft of this paper.

P P 0 0 0 −1 (ii)  −1  =1 vec(   )vec(   ) =  (1), and  =1 (  +   ) is convergent and asymptotically nonsingular; P P 0 −1 0 2 (iii) Let  ≡  −1  =1   , and  =  =1 vec( )vec( ) . We have (k k ) = ( 2 ) and (k k2 ) = ( 2 ); p (iv) If lim    ∞, then the maximal eigenvalues of  and  are  ((1 +  )2 ); P 0 (v) (k k2 ) = ( ) and (k  k2 ) = ( ) for all , where   ≡  −12  =1   with  denoting the th column of  ; P 0 (vi) (k 0  k2 ) = ( ) and (k 0   k2 ) = ( ), where   ≡  −12  =1   , for all ; (vii) (k  k2 ) = ( 2 ) for all ; (viii) (k   k2 ) = ( 2 ) and (k 0   k2 ) = ( 2 ) for all . Assumptions A(i) and A(ii) are conventional in the approximate factor model literature. The second part of condition (ii) ensures that each common factor has a nontrivial contribution to the variance of at least one of the elements of (0   )0 , so that the regularity conditions of Bai (2003) are satisfied (see Kapetanios and Pesaran, 2007). Assumption A(iii) means that  P  P  P  1 P [    ] = (1)  2  2 =1 =1 =1 =1

which holds if [    ] is uniformly bounded. The same arguments apply to the  part. Assumption A(iv) for  is taken from a result in Yin, Bai and Krishnaiah (1988, Theorem 3.1) and Bai and Ng (2002b). For example, it holds for  if  is an element of   , where  is the  × matrix of  random variables with finite fourth moments, and where the eigenvalues of 0  and 0  are uniformly bounded. A similar treatment can be made to permit weak dependence and heteroskedasticity among the elements of  . Assumption A(v) is motivated P P as follows: Note that k k2 =  −1   0  0  . If |(0   |1       )| ≤   and |(0  )| ≤ ¯ for some universal constant ¯ , then we have   P  P  ¯   =1 =1 P P which is ( ) if  −1  =1 =1    ∞. The same remarks apply to the second part. This assumption relates to Lemmas 1(ii) and (iv) of Bai and Ng (2002). Assumption A(vi) means i h  P  P  P  1 P    0   0 = (1)  =1 =1 =1 =1

(k k2 ) ≤


which holds if (0  0  ) is uniformly bounded and  P  P  P  1 P (  ) = (1)  =1 =1 =1 =1

under the assumption that  is independent of the common factors and the factor loadings. Similar arguments apply to the second part of the assumption. Assumption A(vii) is satisfied P P   if the second moments of the elements of   , i.e.,  −1  =1 =1 (    ), are uniformly bounded. This would be satisfied in general unless cross section dependence is too strong so, e.g.,   does not follow a CLT element-wise. The first part of Assumption A(viii) can be written as ( 0 0    ) = ( 2 ). Intuitively, each element of  −12    is likely bounded and asymptotically random (if    = 0), thus the sum of its squared elements would be bounded in the mean. To illustrate this more rigorously, let  be scalar. Assumption A(viii) means  P  P   P  P 1 P [     0 ] = (1)  2 =1 =1 =1 =1 =1

If |(  |1         )| ≤   for all , then the left hand side (which is nonnegative) is bounded by  P  P  P  1 P   |(  0  )|  =1 =1 =1 =1 

If furthermore |(  | )| ≤  for all  and , and if |(0  )| is uniformly bounded, then the above quantity is bounded by a universal constant times # ∙ " ¸  P  P     P  P  P P 1 P 1 P 1       =  ·    =1 =1 =1 =1    =1 =1   =1 =1 

Thus the first part of Assumption A(viii) holds if the right-hand side of the above displayed expression is bounded, which is a weak condition. The second part can be examined in a similar manner.

Our interest is a pooled regression of $y_i$ on $X_i$ augmented with an estimate $\hat F$ of the common factors $F$. This factor augmented estimator (FAE) is defined as

$$\hat\beta_{FAE} \equiv \Big( \sum_{i=1}^{N} X_i' M_{\hat F} X_i \Big)^{-1} \sum_{i=1}^{N} X_i' M_{\hat F}\, y_i \tag{4}$$

where $M_A \equiv I - A (A'A)^{-1} A'$ for any full column rank matrix $A$. We also consider improving efficiency by estimating an equation that includes lagged defactored variables as regressors. More specifically, one can estimate $\beta$ by fitting

$$\ddot y_{it} = \ddot x_{it}' \beta + \ddot x_{it-1}' \theta + \rho\, \ddot y_{it-1} + \text{error} \tag{5}$$

where $\ddot y_{it}$ and $\ddot x_{it}'$ are the $t$-th rows of $\ddot y_i \equiv M_{\hat F}\, y_i$ and $\ddot X_i \equiv M_{\hat F} X_i$ respectively. The $\beta$ estimate from this regression, denoted as $\hat\beta_{SFAE}$, is called the FAE under serial correlation (SFAE in short). When the common factors $F$ are estimated using the PC method, we call the resulting feasible estimator (4) the principal component augmented estimator (PCAE) and denote it by $\hat\beta_{PCA}$. Similarly, the SFAE estimator from (5) using the PC factor estimates is called the PCAE under serial correlation (SPCAE in short), and it is denoted by $\hat\beta_{SPCA}$.

3 Asymptotics Replacing   with the full factor set  , we can write (2) as (6)

 =   +   +  

where  satisfies   =    as explained previously. In this section, we provide asymptotics for the factor augmented estimators ˆPCA and ˆSPCA defined in the previous section.
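The PCAE described in Section 2 (PC factors extracted from the pooled observables, followed by pooled least squares on the data with those factors projected out) can be sketched numerically as follows. This is a minimal sketch under stated assumptions, not the authors' code: the function names `pc_factors`, `annihilator` and `factor_augmented_estimate`, the array layout, and the simulated design in the usage lines are all hypothetical, and the number of factors `r` is treated as known.

```python
import numpy as np

def pc_factors(Z, r):
    """PC estimate of r factors from a T x M matrix Z of pooled observables."""
    T = Z.shape[0]
    eigval, eigvec = np.linalg.eigh(Z @ Z.T)   # eigenvalues come back in ascending order
    top = eigvec[:, ::-1][:, :r]               # eigenvectors of the r largest eigenvalues
    return np.sqrt(T) * top                    # normalization F'F / T = I_r

def annihilator(A):
    """M_A = I - A (A'A)^{-1} A' for a full-column-rank matrix A."""
    return np.eye(A.shape[0]) - A @ np.linalg.solve(A.T @ A, A.T)

def factor_augmented_estimate(y, X, F_hat):
    """Pooled LS of y_i on X_i after projecting out F_hat.

    y is N x T, X is N x T x p, F_hat is T x r.
    """
    M = annihilator(F_hat)
    p = X.shape[2]
    A, b = np.zeros((p, p)), np.zeros(p)
    for yi, Xi in zip(y, X):
        MXi = M @ Xi
        A += Xi.T @ MXi      # X_i' M X_i
        b += MXi.T @ yi      # X_i' M y_i (M is symmetric)
    return np.linalg.solve(A, b)

# Hypothetical design (not the paper's DGP): two factors, scalar regressor, beta = 1.
rng = np.random.default_rng(0)
N, T, r, beta = 50, 30, 2, 1.0
F = rng.standard_normal((T, r))
lam, gam = rng.standard_normal((N, r)), rng.standard_normal((N, r))
x = gam @ F.T + rng.standard_normal((N, T))
y = beta * x + lam @ F.T + rng.standard_normal((N, T))
X = x[:, :, None]

F_hat = pc_factors(np.hstack([y.T, x.T]), r)          # factors of the pooled (y, x) panel
beta_pcae = factor_augmented_estimate(y, X, F_hat)    # feasible estimator
beta_infeasible = factor_augmented_estimate(y, X, F)  # uses the true factors
```

Passing the true `F` instead of `F_hat` gives the infeasible counterpart, so the difference `beta_pcae - beta_infeasible` is the quantity whose normalized behavior the asymptotic results below describe.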

3.1 Ordinary Factor Augmented Estimator

We first consider an infeasible factor-augmented estimator, denoted as $\hat\beta_{I,FAE}$, which is obtained from a pooled regression of $M_F y_i$ on $M_F X_i$. It satisfies

$$\hat\beta_{I,FAE} = \beta + \Big( \sum_{i=1}^{N} X_i' M_F X_i \Big)^{-1} \sum_{i=1}^{N} X_i' M_F\, \varepsilon_i .$$

The estimator is infeasible since the factors are assumed to be known. When $\varepsilon_{it}$ is stationary over $t$, a $\sqrt{NT}$ convergence rate would be obtained under regularity, and the limit distribution of $\sqrt{NT}(\hat\beta_{I,FAE} - \beta)$ is also naturally obtained from the behavior of $(NT)^{-1} \sum_{i=1}^{N} X_i' M_F X_i$ and $(NT)^{-1/2} \sum_{i=1}^{N} X_i' M_F \varepsilon_i$. We maintain the following assumption.

Assumption B As $N, T \to \infty$, $(NT)^{-1} \sum_{i=1}^{N} X_i' M_F X_i \to_p \Sigma$, which is nonsingular, and $(NT)^{-1/2} \sum_{i=1}^{N} X_i' M_F \varepsilon_i \Rightarrow N(0, \Omega_{FAE})$ for some $\Omega_{FAE}$.

When $F$ is replaced with an estimate $\hat F$, the corresponding FAE satisfies

$$\hat\beta_{FAE} = \beta + \Big( \sum_{i=1}^{N} X_i' M_{\hat F} X_i \Big)^{-1} \sum_{i=1}^{N} X_i' M_{\hat F} (\varepsilon_i + F \gamma_i) \tag{7}$$

The properties of $\hat\beta_{FAE}$ depend heavily on the last term. In particular, consistency requires that $\mathrm{plim}\, (NT)^{-1} \sum_{i=1}^{N} X_i' M_{\hat F} (\varepsilon_i + F\gamma_i) = 0$, while in order for $\hat\beta_{FAE}$ to have an unbiased limiting distribution, we require $(NT)^{-1/2} \sum_{i=1}^{N} X_i' M_{\hat F} (\varepsilon_i + F\gamma_i)$ to be centered at zero.

We will consider ‘consistent’ factor estimates $\hat F$ in the sense that

$$T^{-1} (\hat F - FH)' (\hat F - FH) = O_p(\delta_{NT}^{-2}) \quad \text{for some } \delta_{NT} \to \infty \tag{8}$$

where $H$ is asymptotically nonsingular. (Note that $F\gamma_i = FH H^{-1}\gamma_i$, so any nonsingular transformation of the columns of $F$ can also be regarded as common factors. See Bai and Ng, 2002a.) In (8), $\delta_{NT}$ is a function of $N$ and $T$. For example, Bai and Ng (2002a) show that for the PC estimator, $\delta_{NT} = \min[N, T]^{1/2}$. Theorem 1 below gives asymptotic theory for the ordinary FAE. Given that the PC factor estimation method is popular in practice, we also provide theory for the PCAE $\hat\beta_{PCA}$. For the PCAE, we take the results given in Bai and Ng (2002a) as a high level assumption in Theorem 1. We have the following results as $N, T \to \infty$.

Theorem 1 Under Assumption A:
(i) if (8) is satisfied for some $\delta_{NT} \to \infty$ and asymptotically nonsingular $H$, then $\mathrm{plim}\, \hat\beta_{FAE} = \beta$;
(ii) if the conditions in (i) hold, and if (a) $T^{1/2} \delta_{NT}^{-2} \to 0$, (b)  0 ˆ  =  (1), and (c)  0 ˆ  =  (   ), then $\sqrt{NT}(\hat\beta_{FAE} - \hat\beta_{I,FAE}) \to_p 0$;
(iii) if $T/N \to 0$, $N/T^3 \to 0$ and if the PC estimator $\hat F$ satisfies (8) for $\delta_{NT}^2 = \min(N, T)$, then $\sqrt{NT}(\hat\beta_{PCA} - \hat\beta_{I,FAE}) \to_p 0$.

Remarks.

1. Theorem 1(i) states that the consistency of an ordinary FAE requires only the consistency of the associated factor estimator in the sense of (8). For example, the PC estimator is consistent when $N, T \to \infty$ in the presence of serial correlation and weak cross-sectional dependence in the idiosyncratic errors of $y_{it}$ and $x_{it}$ (Bai and Ng, 2002a). If $T$ is fixed and the errors are uncorrelated over $t$, then various $\sqrt{N}$-consistent factor estimates are available (e.g., ALS, 2006, classical PC estimates, etc.).

2. Theorem 1(ii) means that under three additional conditions, the ordinary FAE and its infeasible counterpart are asymptotically equivalent up to order $(NT)^{-1/2}$, thus having the same asymptotic distribution. When $T/N \to 0$, condition (c) implies condition (b). Note that conditions (a)–(c) are not automatically satisfied by consistent factor estimates and should be checked for each factor estimate. If $T$ is fixed and $\hat F$ is $\sqrt{N}$-consistent (e.g., ALS, 2006), then all the conditions are satisfied.

3. The condition that $\delta_{NT}^2 = \min(N, T)$ is a result derived by Bai and Ng (2002a, Theorem 1) from a set of fundamental assumptions. See Bai and Ng (2002a) for full discussion.


4. Theorem 1(iii) gives conditions under which the PC estimate satisfies the requirements of Theorem 1(ii), such that $\sqrt{NT}(\hat\beta_{PCA} - \hat\beta_{I,FAE}) \to_p 0$. These conditions merit some discussion. Notably, they differ from the $\sqrt{T}/N \to 0$ condition given in Kapetanios and Pesaran (2007) and Giannone and Lenza (2008). The condition that $N/T^3 \to 0$ and $T/N \to 0$ is justified as follows. When the idiosyncratic errors of $y_{it}$ and $x_{it}$ are serially correlated, the factor estimate may be biased for small $T$. If $N$ increases fast while $T$ increases too slowly, then the remaining small bias can be amplified greatly (when multiplied by $\sqrt{NT}$), so the limiting distribution can be biased. The condition that $N/T^3 \to 0$ precludes this possibility. On the other hand, if $T$ grows too fast compared to $N$, then the discrepancy between $FH$ and $\hat F$ may accumulate, possibly resulting in a biased asymptotic distribution. This possibility is precluded by the condition that $T/N \to 0$.

5. Theorem 1(iii) suggests that if $N$ is large compared to $T$ and if $T$ is not too small, then the asymptotic distribution of the PCAE is equivalent to that of its infeasible counterpart. If instead $T$ is large relative to $N$, the asymptotic distribution of $\sqrt{NT}(\hat\beta_{PCA} - \beta)$ is biased. In that case, one can simply estimate the factor loadings of $(y_{it}, x_{it}')'$ first and then augment the panel regression with the estimated factor loadings. By switching the roles of $N$ and $T$, and of the common factors and the factor loadings, we can see that this ‘factor loading augmented’ estimator is asymptotically equivalent to the corresponding infeasible estimator if $N/T \to 0$ and $T/N^3 \to 0$. However, if $N/T \to c > 0$ then the asymptotic distribution of the FAE is biased. A bias correction as proposed in Bai (2009a) for this case would be an interesting future research topic.

We next consider the computation of standard errors for the ordinary FAE under the assumption that the random variables are independent across $i$ such that

$$\Omega_{FAE} = \lim_{N,T \to \infty} \frac{1}{NT} \sum_{i=1}^{N} E\left( X_i' M_F\, \varepsilon_i \varepsilon_i' M_F X_i \right) \tag{9}$$

When the FAE is equivalent to its infeasible counterpart, the variance of the asymptotic distribution of $\sqrt{NT}(\hat\beta_{FAE} - \beta)$ is $V_{FAE} \equiv \Sigma^{-1} \Omega_{FAE} \Sigma^{-1}$, because of Theorem 1(ii) and Assumption B. The $\Sigma$ term is naturally estimated by $\hat\Sigma \equiv (NT)^{-1} \sum_{i=1}^{N} X_i' M_{\hat F} X_i$, and $\Omega_{FAE}$ in (9) is estimated by

$$\hat\Omega_{FAE} \equiv \frac{1}{NT} \sum_{i=1}^{N} X_i' M_{\hat F}\, \hat\varepsilon_i \hat\varepsilon_i' M_{\hat F} X_i, \qquad \hat\varepsilon_i \equiv y_i - X_i \hat\beta_{FAE} \tag{10}$$

under the assumption of cross sectional independence for $\varepsilon_{it}$.
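The sandwich form just described can be sketched in code: with $M$ the projection that removes the estimated factors, $\hat\Sigma$ averages $X_i' M X_i$, $\hat\Omega$ averages the outer products of $X_i' M \hat\varepsilon_i$ with $\hat\varepsilon_i = y_i - X_i\hat\beta$, and the variance estimate is $\hat\Sigma^{-1}\hat\Omega\hat\Sigma^{-1}$. This is a minimal sketch assuming cross-sectional independence; the function name `fae_sandwich` and the array layout are hypothetical, and the tiny input is chosen only so the arithmetic can be followed by hand.

```python
import numpy as np

def fae_sandwich(y, X, F_hat, beta_hat):
    """Sandwich variance estimate for sqrt(NT) (beta_hat - beta).

    y is N x T, X is N x T x p, F_hat is T x r, beta_hat has length p.
    Returns (V_hat, standard errors of beta_hat).
    """
    N, T, p = X.shape
    M = np.eye(T) - F_hat @ np.linalg.solve(F_hat.T @ F_hat, F_hat.T)
    Sigma = np.zeros((p, p))
    Omega = np.zeros((p, p))
    for yi, Xi in zip(y, X):
        MXi = M @ Xi
        score = MXi.T @ (yi - Xi @ beta_hat)   # X_i' M eps_hat_i
        Sigma += Xi.T @ MXi                    # X_i' M X_i
        Omega += np.outer(score, score)
    Sigma /= N * T
    Omega /= N * T
    Sigma_inv = np.linalg.inv(Sigma)
    V_hat = Sigma_inv @ Omega @ Sigma_inv
    return V_hat, np.sqrt(np.diag(V_hat) / (N * T))

# Hand-checkable input: N = 1, T = 2, one regressor, one constant "factor".
y = np.array([[2.0, 5.0]])
X = np.array([[[1.0], [3.0]]])
F_hat = np.ones((2, 1))
V_hat, se = fae_sandwich(y, X, F_hat, np.array([1.0]))
```

In this toy input the projection simply demeans over time, so $M X_1 = (-1, 1)'$, the residuals are $(1, 2)'$, the score is 1, and the function returns a variance of 0.5 and a standard error of 0.5.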

Pesaran (2006) proposes another method of estimating the asymptotic variance. Let ˆ be the individual feasible  estimate, i.e., ˆ = (0 ˆ  )−1 0 ˆ  . Then we have ˆ =  + (0 ˆ  )−1 0 ˆ ( +   ), thus 0 ˆ  + 0 ˆ   = 0 ˆ  (ˆ − ) where the second term on the left hand side is negligible in the sense that its second sample P 0 0 0 moment ( )−1  =1  ˆ     ˆ  asymptotically disappears. Thus FAE can also be estimated by 1 P 0 ˆ ¯ ˆ ¯0 0 ¯ = 1 P ˆ  (11) ˜FAE = =1  ˆ  ( − )( − )  ˆ     =1 According to supplementary simulations (not reported) this variance estimate performs quite well even in small samples. The analysis so far is based on the supposition that the factor numbers are known or correctly estimated. When the factor numbers are unknown, they can be consistently estimated using the selection critertia suggested by Bai and Ng (2002a), for example. A simple Monte Carlo experiment is conducted to verify Theorem 1. We generate data from 0  = 0  + 0   +  with scalar  =   +  . Here  ,  and { }=1 follow independent AR(1) processes based on  (0 1) innovations and autoregressive coefficients  equal to 0.5; and   and  are   (1 1) independent of the other variables and also mutually independent. The number of common factors is 2 and in each replication we estimate the factor number using the Bai-Ng IC2 () criterion with max = 4. We set  = 1 and  = −2. For each of the generated samples, we compute the infeasible FAE (with true factors  ) and the PCAE (using ˆ estimated from the pooled variables). In order to verify Theorem 1(iii), √ Table 1 reports the mean and variance of  (ˆPCA − ˆI , FAE ) from 10,000 replications. Because √ the infeasible estimator is unbiased (results not reported), the mean of  (ˆPCA − ˆI , FAE ) is the normalized bias of the PCAE. 
Note the (absolute) bias has a “U” shape as $N$ increases for given $T$ (this is particularly prominent for small $T$). This would imply that bias results when growth in $T$ is too slow (compared to $N$), which partly illustrates the necessity of the $N/T^3 \to 0$ condition. Also, as $T \to \infty$ for fixed $N$, the absolute bias either increases for small $N$, or has a “U” shape for large $N$, which suggests the requirement that $T/N \to 0$ in order for the bias to diminish. In addition the bias does not dissipate as $N$ and $T$ grow with $N = T$, which is also in accordance with the $T/N \to 0$ condition.
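A short worked example may help to see how the two rate conditions $T/N \to 0$ and $N/T^3 \to 0$ interact with the time series condition $\sqrt{T}/N \to 0$. The polynomial path $N = T^a$ below is an illustrative parameterization introduced here, not one used in the paper.

```latex
\[
N = T^{a}: \qquad
\frac{T}{N} = T^{\,1-a} \to 0 \iff a > 1, \qquad
\frac{N}{T^{3}} = T^{\,a-3} \to 0 \iff a < 3, \qquad
\frac{\sqrt{T}}{N} = T^{\,1/2-a} \to 0 \iff a > \tfrac{1}{2}.
\]
```

Under this parameterization both panel conditions hold exactly when $1 < a < 3$. The design $N = T$ corresponds to $a = 1$, which satisfies $\sqrt{T}/N \to 0$ but not $T/N \to 0$, in line with the bias that persists along the $N = T$ cells of Table 1.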

3.2 FAE under Serial Correlation

In this subsection we provide asymptotic results for the SFAE, which is proposed in order to enhance efficiency in the presence of serial correlation in $\varepsilon_{it}$. The analysis is similar to the

previous case: We consider an infeasible estimator first, and then show that a feasible estimator is asymptotically equivalent to the infeasible counterpart under certain conditions. Let ˙ ≡   and ˙  ≡   , and let ˙  and ˙ 0 denote the -th rows of ˙  and ˙  respectively. The infeasible estimator ˆI , SFAE of  under serial correlation is the estimated coefficient of ˙  when ˙ is regressed on ˙  , ˙ −1 and ˙−1 by pooled least squares. (To specify a higher AR order, one can simply use more lagged variables on the right hand side. For example, if an AR(2) specification is to be fitted, ˙  , ˙ −1 , ˙ −2 , ˙ −1 and ˙−2 appear on the right hand side.) Importantly,  is consistently estimated by this autoregressive estimation, despite ˙−1 being correlated with the regression error. (This is shown by Phillips and Sul, 2007, p. 169, for the case when case for  = (1 )0 .) Under regularity similar to Assumption B for a panel CLT, the infeasible estimator ˆI , SFAE is 0 asymptotically normal. The assumption below ensures this holds. Let  ≡ [˙ 0  ˙ −1  ˙ −1 ]0 , which is the infeasible de-factored regressor vector. Let ˙  be the −th row of ˙  ≡   . We assume the following. P P 0 Assumption C As  → ∞ and  → ∞, ( )−1  =1 =1   → Σ , which is nonP P  singular, and ( )−12    ˙ −1 ), is asymptotically =1 =1 (  −   ), where   ≡  (˙ normal. The SFAE ˆSFAE is obtained by replacing  with an estimate ˆ satisfying (8). That is, letting ¨  ≡  ˆ  and  ˆ =  − ˆ (ˆ 0 ˆ )−1 ˆ 0 , the feasible estimator ˆSFAE is ¨ ≡ ˆ  and    ¨  ,  ¨ −1 and ¨−1 , where ¨ and  ¨ 0 are the -th rows of ¨ obtained by regressing ¨ on  ¨  respectively. The PCAE estimator for this autoregressive case is the feasible estimator and  using the PC estimator. As stated above, ˆSPCA denotes the feasible estimator with PC estimated factors. 
We have Theorem 2 The results in Theorem 1 also hold when ˆFAE , ˆI ,FAE and ˆPCA are replaced with ˆSFAE , ˆI ,SFAE and ˆSPCA , respectively. The required conditions of Theorem 2 are identical to those of Theorem 1. To evaluate the asymptotic variance, we note that the autoregressive estimator is alge¨ 0  +   where ¨ ≡ braically identical to the slope estimate from the regression on ¨ =  0 ˆ1 − ˆ¨−1 with  ¨ −1 ¨ −1 respec¨ −  ˆ and ˆ1 being the estimated coefficients of ¨−1 and  tively from the autoregressive feasible regression, viz., ¸−1 ∙ √ 1 P P ¨ 1 P P ¨ ¨ 0 ˆ ¨ 0 ) √ (12)  (SFAE − ) =  −  =1 =1   =1 =1  (¨   11
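The autoregressive regression behind the SFAE (defactored $y_{it}$ on defactored $x_{it}$, $x_{it-1}$ and $y_{it-1}$ by pooled least squares) can be sketched as follows. This is a minimal sketch rather than the authors' implementation: the function name `sfae`, the array layout, and the simulated AR(1) design are hypothetical, and the usage lines pass the true factors, so they illustrate the infeasible version.

```python
import numpy as np

def sfae(y, X, F_hat):
    """Pooled LS of defactored y_it on defactored x_it, x_{it-1} and y_{it-1}.

    y is N x T, X is N x T x p, F_hat is T x r.
    Returns (beta_hat, all coefficients ordered as [x_t, x_{t-1}, y_{t-1}]).
    """
    N, T, p = X.shape
    M = np.eye(T) - F_hat @ np.linalg.solve(F_hat.T @ F_hat, F_hat.T)
    y_dd = y @ M                              # rows are (M y_i)' because M is symmetric
    X_dd = np.einsum("ts,isk->itk", M, X)     # M X_i for every i
    W = np.concatenate(
        [X_dd[:, 1:, :], X_dd[:, :-1, :], y_dd[:, :-1, None]], axis=2
    ).reshape(-1, 2 * p + 1)                  # regressors for t = 2, ..., T
    target = y_dd[:, 1:].reshape(-1)
    coef, *_ = np.linalg.lstsq(W, target, rcond=None)
    return coef[:p], coef

# Hypothetical design: two factors, scalar regressor, AR(1) errors with coefficient 0.5.
rng = np.random.default_rng(1)
N, T, r, beta, rho = 50, 40, 2, 1.0, 0.5
F = rng.standard_normal((T, r))
lam, gam = rng.standard_normal((N, r)), rng.standard_normal((N, r))
shocks = rng.standard_normal((N, T))
eps = np.zeros((N, T))
eps[:, 0] = shocks[:, 0]
for t in range(1, T):
    eps[:, t] = rho * eps[:, t - 1] + shocks[:, t]
x = gam @ F.T + rng.standard_normal((N, T))
y = beta * x + lam @ F.T + eps

beta_sfae, coef = sfae(y, x[:, :, None], F)
```

Replacing `F` with a PC estimate of the factors would give a feasible counterpart in the spirit of the SPCAE; an AR(2) specification would add second lags to `W` in the same way.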

Because of the asymptotic equivalence of the feasible and infeasible estimators under Theorem 2, and under the assumption that  are cross section independent, the variance of the numerator is approximated by ∙ ¸∙  ¸0  P ¨ P 1 P 0 0 ¨  ) ¨  )  ¨  (¨ (13)  −   −   (¨   =1 =1 =1

Now (13) can be estimated by replacing  with ˆSFAE , and the asymptotic variance of (12) is estimated using the usual sandwich form. Alternatively Pesaran’s (2006) method can again be employed. Specifically, for each , let ˜ = (P  ¨   ¨ 0 )−1 P  ¨  ¨ , which is the individual autoregressive estimator is obtained by   ¨  . Then we have regressing ¨ on    P ¨   ¨ 0 (˜ − ) = P  ¨  (¨ ¨ 0 )   − 

=1

=1

i hP i0 P hP ¨ ¨ 0 ˜  0 ˜ ¨ ¨     so (13) equals ( )−1  (  − ) (  − ) . This can be es      =1 =1 =1 timated by replacing  with the average of ˜ over . According to simulations this alternative estimator performs well even when the sample size is small, a result similar to that obtained for the ordinary FAE. Table 2 presents simulation results for the SPCAE in terms of its normalized difference from √ the infeasible FAE under serial correlation, i.e.,  (ˆSPCA − ˆI , SFAE ). Samples are generated as described at the end of Section 3.1. In terms of bias the results are similar to the that of the √ PCAE, but the variance of  (ˆSPCA − ˆI , SFAE ) is shown to be considerably lower than that of the PCAE in Table 1. We also compare the PCAE and the SPCAE in Table 3 to show that the SPCAE is more efficient than the PCAE when the regression errors  are serially dependent (in the DGP  follows an AR(1) with coefficient 0.5). As we expect, SPCAE exhibits a considerable variance reduction relative to PCAE.

4 Conclusion

In this paper we establish asymptotics for linear panel regression estimators augmented with estimated common factors. A specific rate condition ($T/N \to 0$ and $N/T^3 \to 0$) is derived for the asymptotic equivalence of PC-augmented panel estimators to their infeasible counterparts. These conditions are different from those for time series models augmented with factors (i.e., that $\sqrt{T}/N \to 0$; see Bai and Ng, 2006). Monte Carlo studies support these new asymptotic results. We also derive asymptotics for a one-step dynamic estimator which can achieve efficiency gains in the presence of serial correlation in the idiosyncratic regression error.

References

Ahn, S. C., Y.-H. Lee, and P. Schmidt (2001): GMM estimation of linear panel data models with time-varying individual effects, Journal of Econometrics, 101, 219–255.

Ahn, S. C., Y.-H. Lee, and P. Schmidt (2006): Panel Data Models with Multiple Time-Varying Individual Effects, mimeo, Arizona State University.

Bai, J. (2003): Inferential theory for factor models of large dimensions, Econometrica, 71, 135–171.

Bai, J. (2009a): Panel data models with interactive fixed effects, Econometrica, 77, 1229–1279.

Bai, J. (2009b): Supplement to “Panel Data Models with Interactive Fixed Effects”: Technical details and proofs (Econometrica, Vol. 77, No. 4, July 2009, 1229–1279), Econometrica Supplementary Material.

Bai, J., and S. Ng (2002a): Determining the number of factors in approximate factor models, Econometrica, 70, 191–221.

Bai, J., and S. Ng (2002b): Determining the number of factors in approximate factor models, Errata. http://www.columbia.edu/~sn2294/papers/correctionEcta2.pdf.

Bai, J., and S. Ng (2006): Confidence intervals for diffusion index forecasts and inference for factor-augmented regressions, Econometrica, 74, 1133–1150.

Choi, I. (2008): Efficient Estimation of Factor Models, mimeo, Sogang University.

Giannone, D., and M. Lenza (2008): The Feldstein-Horioka fact, European Central Bank working paper No. 873.

Greenaway-McGrevy, R., C. Han, and D. Sul (2008): Estimating the Number of Common Factors in Serially Dependent Approximate Factor Models, mimeo, the University of Auckland.

Kapetanios, G., and M. H. Pesaran (2007): Alternative approaches to estimation and inference in large multifactor panels: Small sample results with an application to modelling of asset returns, in G. Phillips and E. Tzavalis (eds.), The Refinement of Econometric Estimation and Test Procedures: Finite Sample and Asymptotic Analysis, Cambridge University Press, Cambridge.

Pesaran, M. H. (2006): Estimation and inference in large heterogeneous panels with a multifactor error structure, Econometrica, 74, 967–1012.

Phillips, P. C. B., and D. Sul (2007): Bias in dynamic panel estimation with fixed effects, incidental trends and cross section dependence, Journal of Econometrics, 137, 162–188.

Stock, J. H., and M. W. Watson (2002): Forecasting using principal components from a large number of predictors, Journal of the American Statistical Association, 97, 1167–1179.

Table 1: Mean and variance of $(NT)^{1/2}(\hat\beta_{PCA} - \hat\beta_{I,FAE})$

Mean

T \ N       25       50      100      200     1000     2000     4000
15      -1.542   -1.041   -0.749   -0.581   -0.370   -0.373   -0.407
25      -1.814   -1.272   -0.919   -0.672   -0.364   -0.316   -0.300
50      -2.447   -1.756   -1.241   -0.884   -0.425   -0.311   -0.244
100     -3.447   -2.450   -1.747   -1.237   -0.561   -0.402   -0.290
200     -4.854   -3.450   -2.456   -1.741   -0.784   -0.554   -0.397

Variance

T \ N       25       50      100      200     1000     2000     4000
15       1.746    0.954    0.784    0.696    0.801    0.972    0.966
25       0.932    0.573    0.472    0.412    0.371    0.377    0.376
50       0.608    0.392    0.293    0.237    0.195    0.192    0.190
100      0.518    0.307    0.197    0.142    0.104    0.097    0.096
200      0.515    0.274    0.149    0.100    0.056    0.050    0.048

Table 2: Mean and variance of $(NT)^{1/2}(\hat\beta_{SPCA} - \hat\beta_{I,SFAE})$

Mean

T \ N       25       50      100      200     1000     2000     4000
15      -1.536   -1.034   -0.745   -0.572   -0.370   -0.374   -0.409
25      -1.806   -1.266   -0.911   -0.666   -0.360   -0.313   -0.300
50      -2.443   -1.747   -1.243   -0.888   -0.426   -0.317   -0.254
100     -3.435   -2.453   -1.742   -1.237   -0.563   -0.405   -0.297
200     -4.841   -3.450   -2.455   -1.737   -0.782   -0.555   -0.397

Variance

T \ N       25       50      100      200     1000     2000     4000
15       1.452    0.624    0.462    0.392    0.498    0.805    0.833
25       0.640    0.285    0.209    0.166    0.138    0.136    0.137
50       0.339    0.184    0.114    0.078    0.050    0.047    0.046
100      0.304    0.157    0.085    0.049    0.023    0.019    0.018
200      0.340    0.156    0.074    0.040    0.013    0.010    0.008

Table 3: Comparison of PCAE and SPCAE

Bias ×100, PCAE

T \ N       25       50      100      200
25      -7.277   -3.643   -1.820   -0.978
50      -6.910   -3.501   -1.732   -0.885
100     -6.926   -3.464   -1.776   -0.885
200     -6.842   -3.454   -1.739   -0.880

Bias ×100, SPCAE

T \ N       25       50      100      200
25      -7.254   -3.601   -1.801   -0.972
50      -6.882   -3.504   -1.745   -0.887
100     -6.884   -3.453   -1.763   -0.874
200     -6.850   -3.459   -1.732   -0.875

Variance ×1000, PCAE

T \ N       25       50      100      200
25       3.283    1.469    0.719    0.349
50       1.480    0.711    0.352    0.174
100      0.726    0.358    0.173    0.083
200      0.379    0.175    0.087    0.043

Variance ×1000, SPCAE

T \ N       25       50      100      200
25       2.477    1.059    0.513    0.254
50       0.964    0.459    0.229    0.113
100      0.463    0.226    0.106    0.052
200      0.239    0.109    0.053    0.026
