A note on the identification of dynamic economic models with generalized shock processes

Claire Reicher∗
Institut für Weltwirtschaft
Kiellinie 66
24105 Kiel, Germany

This version: July 24, 2015

Abstract

DSGE models with generalized shock processes, such as shock processes which follow a VAR, have been an active area of research in recent years. Unfortunately, the structural parameters governing DSGE models are not identified when the driving process behind the model follows an unrestricted VAR. This finding implies that parameter estimates derived from recent attempts to estimate DSGE models with generalized driving processes should be treated with caution, and that there always exists a tradeoff between identification and the risk of model misspecification. However, these results also make it easier to address the issue of model misspecification by making it computationally easier to check the validity of cross-equation restrictions.

Word count: 5661.

∗ Phone: +49 (0)431 8814 300. Email: claire dot reicher at ifw hyphen kiel dot de. I wish to thank Steffen Ahrens, Martin Plödt, Jens Boysen-Hogrefe, Dominik Groll, Vincenzo Caponi, Henning Weber, Ignat Stepanok, seminar participants at the IfW, conference participants at the Verein für Sozialpolitik, Francesco Zanetti, and two anonymous referees for their patient advice and valuable feedback. Frank Schorfheide has also made valuable comments. All remaining errors are mine.

JEL: C13, C32, E00

Keywords: Identification, DSGE models, wedges, shocks, maximum likelihood

1 Introduction

The estimation of dynamic models which feature generalized shock processes has become an active area of research in recent years. This research is motivated by the intuitively appealing idea that the orthogonality restrictions typically placed on shocks in DSGE models are arbitrary and restrictive, and that these restrictions carry with them a risk of misspecification. To address this issue, Ireland (2004) looks at a model with observation errors which follow a VAR(1), while Cúrdia and Reis (2012) use Bayesian methods to estimate a large-scale dynamic model whose shock process follows a VAR(1). This set of approaches stands in contrast with most estimation exercises, such as that of Smets and Wouters (2007), which have typically relied upon stronger restrictions on the underlying shock processes. This contrast is important, since both Ireland (2004) and Cúrdia and Reis (2012) present results which suggest substantial differences between their estimates and those of Smets and Wouters (2007), which those authors attribute to the more general nature of their estimated shock processes. Implicit in this exercise is the idea that off-diagonal elements in the VAR might reflect model misspecification. To investigate this idea, Cúrdia and Reis also engage in a series of model checks based on their estimated VAR process. Based on these model checks, they argue that it is important to model the correlation between government spending and productivity in a way that is typically not done in DSGE models.

While the motivation behind generalizing the shock process in DSGE models is appealing, it appears that generalizing the shock process too far can result in a complete lack of identification for deep model parameters. 
This is a problem alluded to by del Negro and Schorfheide (2009), and it occurs for the simple reason that a model with an unrestricted VAR shock process can approximate an unrestricted VAR in the observables arbitrarily well for any valid set of deep parameter values. Therefore, the concentrated likelihood function is completely flat with respect to those parameters, and those parameters hence are unidentified. This problem with identification also occurs asymptotically under Bayesian estimation. In order to avoid this problem and achieve identification, therefore, it is necessary to impose meaningful restrictions on the driving process governing the model.

Analytical results suggest that this problem is a general one, with examples occurring in a three-equation New Keynesian model with VAR wedges and also in a model with VAR observation errors. For instance, even the basic three-equation New Keynesian model with a Taylor rule, aggregate supply equation, intertemporal asset pricing equation, and AR(1) errors has eleven parameters, and that model is known to have a poorly-behaved likelihood function. The same model with VAR(1) errors has twenty parameters in total, while a simple unrestricted VAR(1) in the observables has only fifteen parameters. As a result, a model with VAR(1) errors is computationally difficult to estimate, and so it is useful to know that such a model faces theoretical problems with identification.

In a broader context, these findings imply that there will always remain some need to make restrictive identifying assumptions in order to achieve identification, although as Cúrdia and Reis (2012) point out, allowing for VAR(1) errors can aid in model evaluation when used judiciously. In fact, some of the results presented here can aid in these types of model evaluation efforts, so long as one is willing to give up on the identification of deep model parameters.

The nonidentification result fits into a rapidly growing literature on the identification of structural parameters in DSGE models.[1] The traditional way to assess identification in finite samples has been to check that the information matrix is of full rank, following Rothenberg (1971). More recently, Canova and Sala (2009) provide a set of diagnostics intended to detect possible nonidentification when matching impulse responses, while del Negro and Schorfheide (2009) and Consolo, Favero, and Paccagnini (2009) discuss the identification of DSGE models within the context of DSGE-VAR and DSGE-FAVAR approaches. Meanwhile, Iskrev (2010) and Komunjer and Ng (2011) discuss further conditions governing autocovariances under which economic models may or may not be locally identified in a classical setting. In a Bayesian setting, Koop, Pesaran, and Smith (2013) propose to assess asymptotic identification by examining the rate of decay of posterior variances as subsamples get larger, in order to get a sense of whether the model seems to be converging toward some mode. These methods to assess identification are highly useful, but they generally require the actual estimation of a model, which can be computationally difficult or impossible when a model is large or poorly-behaved, as are the models discussed here. 
This difficulty occurs because a likelihood-based estimation strategy behaves poorly, in a computational sense, when the likelihood function is flat. In the context of these problems, the main contribution of the results presented here is to offer a theoretical basis as to why certain models suffer from a lack of identification and the computational difficulties that pertain thereto, without needing to attempt to estimate those models. Furthermore, the results presented here rely on a global concept of identification, in contrast with the local concepts of identification used throughout the main part of the literature.[2]

[1] Cúrdia and Reis (2012) present an excellent review of the previous literature on the relationship between identification and the specification of shocks in DSGE models. This discussion presents only a brief summary.

[2] There is also some discussion in the literature about the identification of model parameters in specific theoretical contexts. Cochrane (2011), for instance, discusses the ways in which the parameters which govern unstable eigenvalues in DSGE models may not show up in the data and hence are not identified. The DSGE literature contains many such examples.

2 The issue of identification

Implicit in this treatment is a global concept of identification, whereby a full set of model parameters $m \in M$ is said to be (fully and globally) identified given a finite sample of observables $Z$ and a likelihood function (or similar objective function) $\Omega(m; Z)$. Identification is defined as follows: the model is identified if there exists a value $m^*$ such that, for any $m^\# \in M$ with $m^\# \neq m^*$, the likelihood function satisfies $\Omega(m^*; Z) > \Omega(m^\#; Z)$ with strict inequality. This is equivalent to saying that a model is identified when the likelihood function has a unique set of parameter values at which it attains its global maximum, or that the maximum likelihood estimator is unique. This concept of identification corresponds closely with the mechanics of maximum likelihood estimation, under which identification is a global property of the likelihood function. Therefore, this concept is more restrictive than the concept of Rothenberg (1971) and the main part of the subsequent literature, since that concept focuses on local identification.

In the context of the current concept of identification, $m$ contains a set of deep model parameters $\theta$ and a set of nuisance parameters governing the driving process, or wedges, in the language of Chari, Kehoe, and McGrattan (2007), Zanetti (2008), Kersting (2008), and Šustek (2011). These wedges follow an unrestricted VAR process. The main result is that in a DSGE context, there is no unique $m$ which maximizes the likelihood function when $m$ contains enough nuisance parameters to mimic an unrestricted VAR in observables. This occurs even though $m$ can be partially identified; that is, for a given value of $\theta$, the nuisance parameters can be identified, and vice versa, since the likelihood function exhibits a ridge in relation to the axes defined by these parameter sets. However, letting all of the elements of $m$ vary precludes global and, in this case, local identification for the elements of $\theta$. 
This result suggests that attempts to estimate DSGE models with generalized shock processes are likely to lose full identification but may retain partial identification. Intuitively, this result is based on the existence of a mapping between an unrestricted VAR in the wedges and an unrestricted VAR in the observables, and vice versa. The existence of these mappings implies that, when there are no restrictions on the law of motion for the wedges, the VAR for the wedges can replicate an unrestricted VAR in the observables. This in turn implies that the maximized likelihood of the data given the deep parameters of the DSGE model takes on a constant value for any value of the deep parameters, and so the deep parameters of the model are not identified. Given that this result implies a flat concentrated likelihood function in θ, this result implies both the global and local nonidentification of θ.
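The distinction between partial identification and a flat concentrated likelihood can be illustrated with a deliberately simple toy problem. The sketch below is not taken from the models discussed in this note: it assumes scalar data with mean `a + b`, where `a` stands in for the deep parameters and `b` for the nuisance parameters. For any fixed `a`, the maximizing `b` is unique, yet the likelihood concentrated with respect to `b` takes the same value for every `a`.

```python
import numpy as np

def loglik(a, b, y):
    """Gaussian log-likelihood of y_t ~ N(a + b, 1), up to an additive constant."""
    return -0.5 * np.sum((y - (a + b)) ** 2)

def concentrated_loglik(a, y):
    """Maximize over the nuisance parameter b for fixed a (closed form: b = mean(y) - a)."""
    b_hat = np.mean(y) - a
    return loglik(a, b_hat, y)

# Hypothetical data; the seed and sample size are arbitrary choices for illustration.
rng = np.random.default_rng(0)
y = rng.normal(loc=1.0, scale=1.0, size=200)

# Profile of the concentrated likelihood over a grid of values for a.
grid = np.linspace(-5.0, 5.0, 11)
profile = np.array([concentrated_loglik(a, y) for a in grid])
print(profile.max() - profile.min())  # numerically zero: the profile is flat in a
```

The ridge runs along the line where `a + b` equals the sample mean, which is the toy analogue of the ridge between θ and the wedge parameters described above.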

2.1 Mapping from the DSGE model to the VAR model

This result relies on the existence of a mapping between a VAR for wedges and a VAR for observables, and vice versa. A DSGE model with VAR wedges is defined as a set $m = \{\theta, \{F_i\}, S\}$, where $\theta$ is a set of deep model parameters, and $\{\{F_i\}, S\}$ are a set of parameters which describe the reduced-form law of motion for a set of shocks or wedges $w_t$. These wedges are of dimension $k$ by one, and so the matrices which comprise $\{\{F_i\}, S\}$ are each $k$ by $k$. The wedges follow a law of motion which can be represented as a (possibly infinite-order) VAR, such that:

$$
w_t = \sum_{i=1}^{\infty} F_i w_{t-i} + \zeta_t, \quad \text{where } E\zeta_t \zeta_t' = S. \tag{1}
$$

This model class covers most typical DSGE models. In most typical DSGE models, $\{F_i\}$ and $S$ are highly restricted. Most commonly, for instance, the off-diagonal elements of these matrices are restricted to equal zero. The main methodological innovation of Ireland (2004) and Cúrdia and Reis (2012) is to relax these restrictions by allowing $\{F_i\}$ and $S$ to take on a broad range of values. The complete linearized system underlying the DSGE model $\{\theta, \{F_i\}, S\}$, including wedges and observables, can be represented using the notation of Sims (2002), such that:

$$
\Gamma_{0,0} x_{t+1} = \Gamma_{1,0} x_t + \Pi_0 \eta_{0,t+1} + \Psi_0 \zeta_{t+1}. \tag{2}
$$

The matrices $\Gamma_{0,0}$, $\Gamma_{1,0}$, $\Pi_0$, and $\Psi_0$ are functions of the deep model parameters $\theta$ and the VAR coefficients $\{F_i\}$. The vector $x_t$ contains the observables $z_t$, the wedges $w_t$, and any other auxiliary variables included in the model. The endogenous expectational errors $\eta_{0,t}$ are functions of the primitive shocks $\zeta_t$. For the sake of what follows, $\zeta_t$, $w_t$, and $z_t$ are of the same dimensionality $k$ by one. Furthermore, the observables $z_t$ are linked to the system $x_t$ through the observation equation $z_t = Hx_t$, with restrictions on the observation matrix $H$ to ensure that the model is relevant in explaining the data.

This set of assumptions can be relaxed somewhat without loss of generality, in the following ways. While it is necessary for the dimensionality of $\zeta_t$ and $w_t$ to be weakly greater than that of $z_t$, in order to avoid stochastic singularity, if the dimensionality of $\zeta_t$ and $w_t$ were to be strictly greater than that of $z_t$, then the mapping from the DSGE to the VAR model could still be unique. This corresponds with the standard idea that there need to be at least as many shocks as observables. However, the reverse mapping would become indeterminate, in which case, for the present exercise, it suffices in proving existence to choose the mapping that corresponds with a dimensionality for $\zeta_t$ and $w_t$ equal to that of $z_t$.

As del Negro and Schorfheide (2009) show, given that this setup results in a solution to the overall model, the law of motion of the observables $z_t$ can then be represented as a VAR process in observables, which is defined as a set $V = \{\{\Phi_i\}, \Sigma\}$, such that:

$$
z_t = \sum_{i=1}^{\infty} \Phi_i z_{t-i} + \varepsilon_t, \quad \text{where } E\varepsilon_t \varepsilon_t' = \Sigma. \tag{3}
$$

In this setup, the matrices $\{\Phi_i\}$ are a set of $k$ by $k$ matrices of reduced-form VAR coefficients which govern the evolution of $z_t$, and $\Sigma$ is the covariance matrix of the reduced-form innovations (or expectational errors) to $z_t$, which are given by $\varepsilon_t$. These matrices are a function of the underlying DSGE model parameters $\{\theta, \{F_i\}, S\}$, and the resulting VAR could possibly be of infinite order, as shown by Ravenna (2007). As a matter of notation, it is possible to define $V(m)$ or $V(\{\theta, \{F_i\}, S\})$ as the mapping from $\{\theta, \{F_i\}, S\}$ to $V$ described by (3). This set of results, taken together, forms the backbone of the DSGE-VAR literature, and these results justify discussing DSGE models as special cases of VARs.

2.2 Mapping from the VAR model to the DSGE model

There is also a mapping which works the other way, from the law of motion for observables given by (3) to the law of motion for wedges given by (1). This mapping exists given $\theta$, in the case where there are no restrictions on $\{\{F_i\}, S\}$. To derive this mapping, the model can again be represented using the notation of Sims (2002), but this time treating the law of motion $V$ from equation (3) as given. The variables $x_t$ again contain the observables $z_t$, the wedges $w_t$, and any other auxiliary variables included in the model. As before, $\varepsilon_t$, $w_t$, and $z_t$ have the same dimension $k$ by one, and furthermore, the model is assumed to be stationary and locally determinate.[3] Formally, the system contains the law of motion for the data (3) plus any model equations. The system in the notation of Sims (2002) now takes the form:

$$
\Gamma_{0,1} x_{t+1} = \Gamma_{1,1} x_t + \Pi_1 \eta_{1,t+1} + \Psi_1 \varepsilon_{t+1}. \tag{4}
$$

[3] In cases where some of the model objects and observables are not I(0), as in the random-walk consumption model discussed by Fernández-Villaverde et al. (2007), it may be necessary to take first differences of the appropriate objects and/or impose cointegration restrictions before proceeding. The issue of indeterminacy is handled more easily by imposing one of the admissible equilibria. In addition, it is helpful to track endogenous state variables (such as capital) to avoid VAR truncation error.

The matrices $\Gamma_{0,1}$, $\Gamma_{1,1}$, $\Pi_1$, and $\Psi_1$ are now functions of the deep model parameters $\theta$ and the VAR coefficients $\{\Phi_i\}$. The endogenous expectational errors $\eta_{1,t}$ are now functions of the reduced-form shocks $\varepsilon_t$, and the wedges $w_t$ are linked to the system through the observation equation $w_t = Dx_t$, where $D$ is an observation matrix. This system has a reduced-form solution of the form:

$$
x_t = A_1 x_{t-1} + B_1 \varepsilon_t, \tag{5}
$$

for some matrices $A_1$ and $B_1$ which usually have to be derived numerically. Iterating this solution yields the wedges as a function of the history of the innovations to the data, such that:

$$
w_t = Dx_t = D \sum_{i=0}^{\infty} A_1^i L^i B_1 \varepsilon_t, \tag{6}
$$

where $L$ denotes the lag operator. The expectational error for $w_t$, denoted by $\zeta_t$, equals $DB_1 \varepsilon_t$. Substituting this relationship into (6) gives the infinite-order MA process which governs the evolution of the wedges:

$$
w_t = D \sum_{i=0}^{\infty} A_1^i L^i B_1 (DB_1)^{-1} \zeta_t, \tag{7}
$$

which, since the system implied by $A_1$ is strictly stationary, can be written as $w_t = D(I - A_1 L)^{-1} B_1 (DB_1)^{-1} \zeta_t$. Assuming that the construction of $D$ ensures that the matrix which premultiplies $\zeta_t$ is of full rank, which in turn ensures invertibility, this system can be written in the VAR form given by:

$$
w_t = \sum_{i=1}^{\infty} F_i w_{t-i} + \zeta_t, \quad \text{where } E\zeta_t \zeta_t' = S, \tag{8}
$$

which again may be infinite in order. The results from equation (8) imply that for every set $\{V, \theta\}$ which is compatible with the model having a solution, there exists an underlying mapping from $\{V, \theta\}$ to $\{\{F_i\}, S\}$ which is given by $\{\{F_i(\theta; V)\}, S(\theta; V)\}$. In practice, it is also possible to follow the logic of del Negro and Schorfheide (2009) and approximate this mapping using the covariance matrix of $w_t$ implied by $V$ and $\theta$, although the examples discussed below allow for an exact set of solutions. Altogether, the existence of both of these mappings implies that it is econometrically equivalent to treat the wedges as a function of the data or the data as a function of the wedges as in the previous section. Knowing the law of motion of one, conditional on $\theta$, gives the law of motion of the other.
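The step from the MA representation (7) to the VAR representation (8) can be sketched numerically. The code below is a minimal illustration under stated assumptions: the solution matrices `A1`, `B1`, and `D` are taken as given (the example values are hypothetical), the VAR is truncated at `n_lags`, and the function names are not from the original text. It uses the standard recursion for inverting an MA polynomial with leading coefficient equal to the identity, $F_i = C_i - \sum_{j=1}^{i-1} F_j C_{i-j}$, where $C_i = D A_1^i B_1 (DB_1)^{-1}$.

```python
import numpy as np

def wedge_var_coefficients(A1, B1, D, n_lags):
    """Truncated VAR coefficients F_1..F_n implied by the MA form in equation (7)."""
    DB1_inv = np.linalg.inv(D @ B1)
    # MA coefficients C_i = D A1^i B1 (D B1)^{-1}; C_0 is the identity by construction.
    C = []
    A_power = np.eye(A1.shape[0])
    for _ in range(n_lags + 1):
        C.append(D @ A_power @ B1 @ DB1_inv)
        A_power = A_power @ A1
    # Invert the MA polynomial lag by lag: F_i = C_i - sum_{j<i} F_j C_{i-j}.
    F = []
    for i in range(1, n_lags + 1):
        F_i = C[i].copy()
        for j in range(1, i):
            F_i -= F[j - 1] @ C[i - j]
        F.append(F_i)
    return F

def wedge_innovation_covariance(B1, D, Sigma):
    """S = (D B1) Sigma (D B1)', since zeta_t = D B1 eps_t."""
    DB1 = D @ B1
    return DB1 @ Sigma @ DB1.T

# Hypothetical example in which x_t consists only of the wedges (D is the identity).
A1 = np.array([[0.5, 0.1], [0.0, 0.3]])
B1 = np.array([[1.0, 0.2], [0.0, 1.0]])
D = np.eye(2)
F = wedge_var_coefficients(A1, B1, D, n_lags=3)
S = wedge_innovation_covariance(B1, D, np.eye(2))
```

In this special case the wedges follow an exact VAR(1), so the recursion returns `A1` at the first lag and zeros at higher lags. In a model with additional state variables the higher-order coefficients would generally be nonzero.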

2.3 Main result: Nonidentification under an unrestricted $\{\{F_i\}, \Sigma_w\}$

It is possible, based on the results from the previous two sections, to prove that $\theta$ is not identified in cases where there are no restrictions on $\{\{F_i\}, S\}$, for a class of extremum estimators that includes maximum-likelihood estimators. The proof is by construction. First, the objective function for the estimation problem for the DSGE model takes the form $\Omega(V(\{\theta, \{F_i\}, S\}); Z)$ given a stacked matrix of observables $Z$. This objective function could be the likelihood function of a VAR on the observables, although it may take a related form (like the objective function for a GMM problem), so long as it is well-defined over the space of all possible $V$.

To analyze the properties of this class of estimators, a few more definitions are necessary. The value $V^*(Z)$ is defined as the value of $V$ at which the function $\Omega(V; Z)$ is maximized, such that the maximum value of the objective function is given by $\Omega^* = \Omega(V^*(Z); Z)$. This implies that there exist no values of $\{\theta, \{F_i\}, S\}$ such that $\Omega(V(\{\theta, \{F_i\}, S\}); Z) > \Omega^*$.

Given this setup, it can be proven that $\theta$ is unidentified when there are no restrictions on $\{\{F_i\}, S\}$. Formally, this statement is equivalent to saying that for any value of $\theta$, the maximum value of $\Omega$ over $\{\{F_i\}, S\}$ always satisfies $\Omega(V(\{\theta, \{F_i\}, S\}); Z) = \Omega^*$. First, it can be shown that for any $\theta$, there always exists a set of values for $\{\{F_i\}, S\}$ such that $\Omega(V(\{\theta, \{F_i\}, S\}); Z) \geq \Omega^*$. One element of this set is given by $\{\{F_i(\theta; V^*(Z))\}, S(\theta; V^*(Z))\}$, based on the mapping given by Section 2.2. To show that this choice in fact satisfies the above inequality, it can be shown that $\Omega(V(\{\theta, \{F_i(\theta; V^*(Z))\}, S(\theta; V^*(Z))\}); Z) = \Omega(V^*(Z); Z)$, since it must be the case that $V^*(Z) = V(\{\theta, \{F_i(\theta; V^*(Z))\}, S(\theta; V^*(Z))\})$, based on the mapping given by Section 2.1. 
Therefore, for any $\theta$, there exists a set of values for $\{\{F_i\}, S\}$ such that $\Omega(V(\{\theta, \{F_i\}, S\}); Z) \geq \Omega^*$. Given this result, and the fact that there exist no values of $V(\{\theta, \{F_i\}, S\})$ and hence of $\{\{F_i\}, S\}$ such that $\Omega(V(\{\theta, \{F_i\}, S\}); Z) > \Omega^*$, then for any value of $\theta$, the maximum value of $\Omega$ always exactly satisfies $\Omega(V(\{\theta, \{F_i\}, S\}); Z) = \Omega^*$.

As this proof shows, the problem with respect to the identification of $\theta$ lies in the ease with which one may obtain a valid set $\{\{F_i\}, S\}$ given by the mapping $\{\{F_i(\theta; V^*(Z))\}, S(\theta; V^*(Z))\}$. In particular, as $\theta$ varies in the absence of restrictions on this set, $\{F_i\}$ and $S$ have enough degrees of freedom to simply adjust in order to bring the wedge-driving process (1) completely into line with an unrestricted VAR (3). A successful likelihood-based estimation strategy, therefore, must meaningfully restrict the set to which $\{\{F_i\}, S\}$ can belong. In practice, this would imply some combination of zero restrictions, meaningful lag length restrictions when the true VAR lag length is greater than one, or the use of outside information (like first moments) in an efficient way.[4]

While this result seems to be negative, it should not be entirely surprising, since it mirrors a century of work on the identification of systems of equations. A simple, classic system of supply and demand equations is illustrative in this context. To estimate a supply and demand system with two equations and data on quantity and prices, it is necessary to make additional identifying restrictions or to bring in outside information. One way to do this is through instrumental variables, where some set of shocks is assumed to be uncorrelated with another set of shocks; the classic example given by Wright (1928) involves taking shocks to the productivity of land as orthogonal to shocks to the demand for butter and flaxseed. Wright uses this orthogonality assumption in order to estimate the elasticities of demand for these two commodities. The same situation holds in DSGE models with respect to orthogonality assumptions, in which case it is necessary to make meaningful restrictions on $\{F_i\}$ and $S$ in order to identify $\theta$. Typical restrictions placed in the DSGE literature are to assume that the off-diagonal elements of $\{F_i\}$ and $S$ are zero or that first moments (or external studies) contain useful information, for instance, using information from labor's average share of income to identify labor's share in a Cobb-Douglas production function. It is simply not possible to dispense with restrictions of this sort in the absence of other meaningful prior information. In a sense, the original critique made by Sims (1980) of the simultaneous equations literature cannot be fully reconciled with DSGE models, in that there will always be some degree to which a DSGE model (or any kind of structural model) must place unbelievable restrictions on the data in order for the data to deliver believable parameter estimates. 
Otherwise, the likelihood function is likely to exhibit a ridge in relation to the axes defined by θ and {{Fi }, S}, and this ridge allows for partial identification but precludes full identification.
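The argument of this section can be condensed into a short chain of inequalities. The shorthand $\Omega^{c}(\theta)$ for the concentrated objective function is introduced here purely for illustration and is not notation used elsewhere in this note.

$$
\begin{aligned}
\Omega^{c}(\theta) &\equiv \max_{\{F_i\},\, S} \Omega(V(\{\theta, \{F_i\}, S\}); Z), \\
\Omega^{c}(\theta) &\leq \max_{V} \Omega(V; Z) = \Omega^*, \\
\Omega^{c}(\theta) &\geq \Omega(V(\{\theta, \{F_i(\theta; V^*(Z))\}, S(\theta; V^*(Z))\}); Z) = \Omega(V^*(Z); Z) = \Omega^*.
\end{aligned}
$$

The upper bound holds because every admissible model implies some $V$, and the lower bound holds because the mapping from Section 2.2 delivers a wedge process that reproduces $V^*(Z)$ exactly. Together the two bounds give $\Omega^{c}(\theta) = \Omega^*$ for every $\theta$ that is compatible with the model having a solution, which is the sense in which the concentrated objective function is flat in $\theta$.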

3 Practical examples of nonidentification

3.1 A simple three-equation model

The problem of nonidentification can be illustrated by a simple example based on the textbook three-equation New Keynesian model mentioned in the introduction. In this model, the output gap $y_t$ is related to the inflation gap $\pi_t$ through an aggregate supply equation in which the parameter $\kappa$ reflects the effect of inflation on output, and $\beta$ reflects the rate of time preference. Output is also related to future output, inflation, and current interest rates through an aggregate demand equation, in which the parameter $\sigma$ governs the willingness of consumers to substitute across time. Interest rates are governed by a Taylor rule which relates interest rates to inflation and output through the coefficients $\phi_\pi$ and $\phi_y$, respectively. In the current example there is no interest rate smoothing, for the sake of simplicity. The system, with wedges $w_t = [w_t^s \; w_t^d \; w_t^i]'$, is expressed by the following three equations:

$$
y_t = \kappa \pi_t - \kappa \beta E_t \pi_{t+1} + w_t^s; \tag{9}
$$

$$
y_t = -\frac{1}{\sigma}(i_t - E_t \pi_{t+1}) + E_t y_{t+1} + w_t^d; \tag{10}
$$

and

$$
i_t = \phi_\pi \pi_t + \phi_y y_t + w_t^i. \tag{11}
$$

[4] This problem does not go away when using Bayesian methods, since a flat likelihood implies that the posterior will tend to inherit the properties of the prior. This leads to the type of asymptotic problems with identification and convergence discussed by Koop, Pesaran, and Smith (2013). In fact, a comparison of the priors and posteriors in the analysis of Cúrdia and Reis (2012) suggests that the concentrated likelihood in their model with VAR shocks, for most parameters, is relatively flat.

The wedges $w_t = [w_t^s \; w_t^d \; w_t^i]'$ represent reduced-form disturbances to aggregate supply, aggregate demand, and monetary policy, respectively, and the observables form the vector $z_t = [y_t \; \pi_t \; i_t]'$. This system, written in the canonical form given by (4), takes the following form, assuming that the observables actually follow a VAR(1) with a coefficient matrix $\Phi$, whose $(i, j)$ element is given by $\Phi_{ij}$, such that:[5]

$$
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 \\
0 & \kappa\beta & 0 & 0 & 0 & 0 \\
-1 & -1/\sigma & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} y_{t+1} \\ \pi_{t+1} \\ i_{t+1} \\ w^s_{t+1} \\ w^d_{t+1} \\ w^i_{t+1} \end{bmatrix}
=
\begin{bmatrix}
\Phi_{11} & \Phi_{12} & \Phi_{13} & 0 & 0 & 0 \\
\Phi_{21} & \Phi_{22} & \Phi_{23} & 0 & 0 & 0 \\
\Phi_{31} & \Phi_{32} & \Phi_{33} & 0 & 0 & 0 \\
-1 & \kappa & 0 & 1 & 0 & 0 \\
-1 & 0 & -1/\sigma & 0 & 1 & 0 \\
\phi_y & \phi_\pi & -1 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} y_t \\ \pi_t \\ i_t \\ w^s_t \\ w^d_t \\ w^i_t \end{bmatrix}
+
\begin{bmatrix}
1 & 0 & 0 \\
0 & 1 & 0 \\
0 & 0 & 1 \\
0 & \kappa\beta & 0 \\
-1 & -1/\sigma & 0 \\
0 & 0 & 0
\end{bmatrix}
\begin{bmatrix} \varepsilon^y_{t+1} \\ \varepsilon^\pi_{t+1} \\ \varepsilon^i_{t+1} \end{bmatrix}. \tag{12}
$$

In this case, it turns out that the wedges $w_t$ are a simple linear function of the observables $z_t$, and this fact greatly facilitates finding the law of motion for the wedges, which in a more general case may have to be approximated numerically. To see this, the bottom three lines of the system can be rewritten as obeying:

$$
\begin{bmatrix} 0 & \kappa\beta & 0 \\ -1 & -1/\sigma & 0 \\ 0 & 0 & 0 \end{bmatrix}
\begin{bmatrix} E_t y_{t+1} \\ E_t \pi_{t+1} \\ E_t i_{t+1} \end{bmatrix}
=
\begin{bmatrix} -1 & \kappa & 0 \\ -1 & 0 & -1/\sigma \\ \phi_y & \phi_\pi & -1 \end{bmatrix}
\begin{bmatrix} y_t \\ \pi_t \\ i_t \end{bmatrix}
+
\begin{bmatrix} w^s_t \\ w^d_t \\ w^i_t \end{bmatrix},
$$

so after substituting in the law of motion for the observables and rearranging,

$$
\begin{bmatrix} w^s_t \\ w^d_t \\ w^i_t \end{bmatrix}
=
\left(
\begin{bmatrix} 0 & \kappa\beta & 0 \\ -1 & -1/\sigma & 0 \\ 0 & 0 & 0 \end{bmatrix} \Phi
-
\begin{bmatrix} -1 & \kappa & 0 \\ -1 & 0 & -1/\sigma \\ \phi_y & \phi_\pi & -1 \end{bmatrix}
\right)
\begin{bmatrix} y_t \\ \pi_t \\ i_t \end{bmatrix},
$$

or equivalently,

$$
\begin{bmatrix} y_t \\ \pi_t \\ i_t \end{bmatrix}
=
\left(
\begin{bmatrix} 0 & \kappa\beta & 0 \\ -1 & -1/\sigma & 0 \\ 0 & 0 & 0 \end{bmatrix} \Phi
-
\begin{bmatrix} -1 & \kappa & 0 \\ -1 & 0 & -1/\sigma \\ \phi_y & \phi_\pi & -1 \end{bmatrix}
\right)^{-1}
\begin{bmatrix} w^s_t \\ w^d_t \\ w^i_t \end{bmatrix},
$$

which can be represented by writing $w_t = Jz_t$ or $z_t = J^{-1} w_t$, respectively, for a square matrix $J$. Substituting the latter representation of the mapping between the data and wedges into the law of motion for the data (3) gives the law of motion for the wedges: $J^{-1} w_t = \Phi J^{-1} w_{t-1} + \varepsilon_t$, so that the wedges follow a VAR(1) of their own, such that:

$$
w_t = J \Phi J^{-1} w_{t-1} + J \varepsilon_t, \tag{13}
$$

[5] In typical implementations of the Sims (2002) algorithm, the bottom row of coefficients governing the Taylor rule is placed on the $t + 1$ side and not on the $t$ side.

which takes the form given by equation (1). The important thing to note is that any possible VAR process for $z_t$ implied by $\Phi$ and $\Sigma$ maps one-to-one into a valid VAR process for $w_t$ where $F = J \Phi J^{-1}$ and $S = J \Sigma J'$, unless one puts some meaningful restriction on the latter objects. Therefore, the likelihood of this particular model is equal to the likelihood of an unrestricted VAR on the data. This problem occurs because the parameters $\{\kappa, \beta, \sigma, \phi_\pi, \phi_y\}$ do not place any meaningful restrictions on the wedge process, which therefore retains all of the flexibility needed to perfectly match an unrestricted VAR(1) on the data. This finding should not be surprising, since this particular DSGE model has twenty free parameters (five structural parameters plus the fifteen parameters of the wedge process), while an unrestricted VAR model has fifteen parameters. The fifteen parameters of the VAR driving process for the wedges can adjust to match the fifteen parameters of an estimated VAR process which governs the observables, for any valid set of structural parameters that go into $J$.
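The mapping in this example is easy to check numerically. The sketch below builds $J$ from the two coefficient matrices used in the derivation, computes $F = J \Phi J^{-1}$ and $S = J \Sigma J'$, and then maps back to the observables. The numerical values of `Phi`, `Sigma`, and the two structural parameter vectors are arbitrary illustrative choices rather than estimates, and the function names are hypothetical.

```python
import numpy as np

def wedge_map(theta, Phi):
    """J such that w_t = J z_t, with z_t = (y_t, pi_t, i_t)' and VAR(1) matrix Phi."""
    kappa, beta, sigma, phi_pi, phi_y = theta
    lead = np.array([[0.0, kappa * beta, 0.0],
                     [-1.0, -1.0 / sigma, 0.0],
                     [0.0, 0.0, 0.0]])
    current = np.array([[-1.0, kappa, 0.0],
                        [-1.0, 0.0, -1.0 / sigma],
                        [phi_y, phi_pi, -1.0]])
    return lead @ Phi - current

def wedge_process(theta, Phi, Sigma):
    """Wedge VAR(1) parameters F = J Phi J^{-1} and S = J Sigma J'."""
    J = wedge_map(theta, Phi)
    return J, J @ Phi @ np.linalg.inv(J), J @ Sigma @ J.T

def implied_observable_var(J, F, S):
    """Map the wedge process back to the VAR(1) for the observables."""
    J_inv = np.linalg.inv(J)
    return J_inv @ F @ J, J_inv @ S @ J_inv.T

# Illustrative reduced-form VAR(1) for the observables (assumed values).
Phi = np.array([[0.5, 0.1, 0.0], [0.0, 0.6, 0.1], [0.1, 0.0, 0.7]])
Sigma = np.array([[1.0, 0.2, 0.0], [0.2, 1.0, 0.1], [0.0, 0.1, 1.0]])

# Two different structural parameter vectors (kappa, beta, sigma, phi_pi, phi_y).
theta_a = (0.1, 0.99, 1.0, 1.5, 0.5)
theta_b = (0.3, 0.95, 2.0, 2.0, 0.2)

recovered = [implied_observable_var(*wedge_process(th, Phi, Sigma))
             for th in (theta_a, theta_b)]

# Parameter counts for k = 3: VAR(1) coefficients plus a symmetric covariance matrix.
k = 3
n_var_params = k * k + k * (k + 1) // 2
n_dsge_params = 5 + n_var_params
```

Both structural parameter vectors reproduce the same `Phi` and `Sigma` once the wedge process is allowed to adjust, so a likelihood evaluated on the observables cannot distinguish between them. The counts `n_var_params` and `n_dsge_params` correspond to the fifteen and twenty parameters mentioned above.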

3.2 A model with VAR measurement errors

The problem of nonidentification also appears in a model with VAR measurement errors, and for the same reason: overparameterization. Generalizing the approach of Ireland (2004) to a situation with more than one lag, the model with VAR measurement errors can be written as having a structural block $x_{0,t}$ such that:

$$
x_{0,t} = A x_{0,t-1} + B \zeta_{0,t}, \quad \text{where } E\zeta_{0,t} \zeta_{0,t}' = \Sigma_{w,0}, \tag{14}
$$

where the coefficient matrices $A$ and $B$ are functions of deep parameters $\theta$. Additionally, the model contains a law of motion for the observation errors $w_{1,t}$ given by:

$$
w_{1,t} = \sum_{i=1}^{\infty} F_{i,1} w_{1,t-i} + \zeta_{1,t}, \quad \text{where } E\zeta_{1,t} \zeta_{1,t}' = \Sigma_{w,1}, \tag{15}
$$

and an observation equation for the observables $z_t$ given by:

$$
z_t = H_0 x_{0,t} + H_1 w_{1,t}, \tag{16}
$$

for a set of observation matrices $H_0$ and $H_1$. Both the observables $z_t$ and the observation errors $w_{1,t}$ are of size $k$ by one, while there are no restrictions on the dimensionality of $\zeta_{0,t}$. In this setup, the main result from Section 2.3 holds, such that the value of the maximized objective function for the extremum estimator does not depend on $\theta$, so that $\theta$ is unidentified. To see this, one can set $H_1$ to an identity matrix and set $\Sigma_{w,0}$ to a matrix of zeros, so that $x_{0,t}$ equals a vector of zeros. As a result, $z_t = w_{1,t}$. In this case, for any value of $\theta$, the maximum likelihood estimator of $\{\{F_{i,1}\}, \Sigma_{w,1}\}$ is given simply by the maximum likelihood estimator $V^* = \{\{\Phi_i^*\}, \Sigma^*\}$, which in turn is easily estimated. At these values, the objective function takes on its value of $\Omega^*$, which does not depend at all on $\theta$. Since the value of the objective function does not depend on $\theta$, $\theta$ is unidentified.[6]

As with the simple three-equation model, this is a situation where the shock process is specified in a general enough way that it can soak up all of the variation in observables for any values of the structural parameters $\theta$. In this situation, the observation error terms act like another set of model wedges on which it is necessary to place restrictions in order to identify parameters of interest. If this is not done, then the model has enough parameters to explain any systematic movements in the data, whether or not these movements are explained by the underlying structural model.

4 Conclusion

In conclusion, DSGE models with VAR wedges may suffer from problems with identification if, for any values of the deep model parameters θ, the VAR process governing the wedges is flexible enough to replicate an unrestricted VAR. Using likelihood-based methods does not make it possible to identify θ in this set of circumstances, since there is always some shock process governing the wedges which can generate the patterns seen in the data. Therefore, the problems identified by Ireland (2004), Cúrdia and Reis (2012), and others with existing identification schemes do not appear to have a completely satisfactory solution, in that it does not seem to be possible to dispense with a priori statements regarding the nature of the wedge process without losing the ability to identify model parameters. With this need for identifying assumptions, there will always remain a risk of model misspecification.

[6] In general, there are additional options that allow Σw,0 to take on other values; the main point is that it is not possible for Ω to exceed Ω∗.

While this set of results on the identification and estimation of deep parameters is negative, there is a positive side to these results, and that positive side involves diagnosing model misspecification. To choose one example, in addition to estimating a DSGE model, Cúrdia and Reis (2012) use their estimation methodology to check which wedges exhibit high correlations with each other. They find, based on their posterior distributions of θ and {{F_i}, S}, that government spending and productivity wedges are particularly highly correlated with each other, and they argue that this finding should motivate future work.

In light of their emphasis on model checking, the results presented here suggest a more efficient way to go about this, so long as one is willing to give up on estimating θ. Specifically, one could estimate V∗ = {{Φ∗_i}, Σ∗} either as a point estimate or as a posterior distribution and then back out {{F_i}, S}. These {{F_i}, S}, in turn, can be used to check the types of restrictions explored by Cúrdia and Reis. As a result, while it might not make sense to seek to identify components of θ by allowing for a VAR shock process, such a methodology might make it easier to engage in model checking. This emphasis on model checking is more in line with the work on DSGE models and VARs by del Negro and Schorfheide (2004), An and Schorfheide (2007), and del Negro and Schorfheide (2009), the last of whom also allude to some of the same identification issues discussed here. Altogether, these model checking methods require significant judgment and effort on the part of the researcher; this is a consequence of a general tradeoff between model fit and identification.

Some practical situations in which this model-checking methodology might be of use include those situations identified by Giacomini (2013), in which a structural VAR estimated using standard types of restrictions may fail to identify a true structural shock. Examples include situations in which lags in the process of implementing fiscal policy result in "fiscal foresight", like those situations discussed by Leeper (2010) and Leeper, Walker, and Yang (2013), or situations in which "news shocks" about future changes to total factor productivity precede these changes, like the situation discussed by Barsky and Sims (2011). Both sets of situations imply that, when an SVAR or a structural model does not take news about future shocks into account, estimates of structural shocks will not correspond to their true values. In the context of wedges, this type of misspecification would show up as wedges that follow a VAR process with nonzero off-diagonal coefficients, rather than a set of uncorrelated AR(1) processes. By detecting and evaluating what these off-diagonal entries look like, it should be possible to get a better handle on the misspecification that underlies these entries, and to use the insights gained in order to build better-specified models.
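As a rough illustration of the off-diagonal check described above, the following Python sketch fits a VAR(1) to a set of wedge series by least squares and reports t-statistics for each coefficient. It is only a sketch under stated assumptions: the wedge series are simulated rather than backed out from an estimated V∗ (that mapping is model-specific and is not reproduced here), the lag order is fixed at one, the errors are treated as homoskedastic, and the names `simulate_var1`, `fit_var1`, and `coef_tstats` are hypothetical helpers introduced for this example.

```python
import numpy as np

def simulate_var1(F, S, T, seed=0):
    """Simulate w_t = F w_{t-1} + e_t with e_t ~ N(0, S) (illustrative data only)."""
    rng = np.random.default_rng(seed)
    n = F.shape[0]
    chol = np.linalg.cholesky(S)
    w = np.zeros((T, n))
    for t in range(1, T):
        w[t] = F @ w[t - 1] + chol @ rng.standard_normal(n)
    return w

def fit_var1(w):
    """Least-squares estimates of F and S in w_t = F w_{t-1} + e_t (no intercept)."""
    X, Y = w[:-1], w[1:]
    # lstsq solves X B = Y column by column, so B is the transpose of F.
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ B
    S_hat = resid.T @ resid / (len(Y) - w.shape[1])
    return B.T, S_hat

def coef_tstats(w, F_hat, S_hat):
    """t-statistics for each element of F_hat, assuming homoskedastic errors."""
    X = w[:-1]
    XtX_inv = np.linalg.inv(X.T @ X)
    # Var(F_hat[i, j]) = S_hat[i, i] * XtX_inv[j, j]
    se = np.sqrt(np.outer(np.diag(S_hat), np.diag(XtX_inv)))
    return F_hat / se

# Assumed "true" process: wedge 2 responds to lagged wedge 1, but not vice versa.
F_true = np.array([[0.9, 0.0],
                   [0.3, 0.5]])
S_true = np.eye(2)

w = simulate_var1(F_true, S_true, T=5000)
F_hat, S_hat = fit_var1(w)
tstats = coef_tstats(w, F_hat, S_hat)

print("Estimated F:\n", np.round(F_hat, 3))
print("t-statistics:\n", np.round(tstats, 2))
```

In this simulated setting, a large t-statistic on the (2,1) entry of the estimated coefficient matrix would flag a dependence between wedges that a model with uncorrelated AR(1) wedges rules out. In an application, the simulated `w` would be replaced by wedges recovered from the data, and a posterior distribution could take the place of the t-statistics.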

References

An, Sungbae, and Frank Schorfheide, 2007. "Bayesian Analysis of DSGE Models". Econometric Reviews 26(2-4), pages 113-172.
Barsky, Robert B., and Eric R. Sims, 2011. "News shocks and business cycles". Journal of Monetary Economics 58(3), pages 273-289.
Canova, Fabio, and Luca Sala, 2009. "Back to square one: Identification issues in DSGE models". Journal of Monetary Economics 56(4), pages 431-449.
Chari, Varadarajan V., Patrick Kehoe, and Ellen McGrattan, 2007. "Business Cycle Accounting". Econometrica 75(3), pages 781-836.
Cochrane, John, 2011. "Determinacy and Identification in Taylor Rules: A Critical Review". Journal of Political Economy 119(3), June 2011.
Consolo, Agostino, Carlo A. Favero, and Alessia Paccagnini, 2009. "On the statistical identification of DSGE models". Journal of Econometrics 150(1), pages 99-115.
Cúrdia, Vasco, and Ricardo Reis, 2012. "Correlated Disturbances and U.S. Business Cycles". Manuscript, Columbia University.
del Negro, Marco, and Frank Schorfheide, 2004. "Priors from General Equilibrium Models for VARs". International Economic Review 45(2), pages 643-673.
del Negro, Marco, and Frank Schorfheide, 2009. "Monetary Policy Analysis with Potentially Misspecified Models". American Economic Review 99(4), pages 1415-1450.
Fernández-Villaverde, Jesús, Juan F. Rubio-Ramírez, Thomas J. Sargent, and Mark W. Watson, 2007. "ABCs (and Ds) of understanding VARs". American Economic Review 97(3), pages 1021-1026.
Giacomini, Raffaella, 2013. "The relationship between VAR and DSGE models". Advances in Econometrics 32, pages 1-25.
Ireland, Peter, 2004. "A method for taking models to the data". Journal of Economic Dynamics and Control 28(6), pages 1205-1226.
Iskrev, Nikolay, 2010. "Local identification in DSGE models". Journal of Monetary Economics 57(2), pages 189-202.

Kersting, Erasmus K., 2008. "The 1980s recession in the UK: A business cycle accounting perspective". Review of Economic Dynamics 11(1), pages 179-191.
Komunjer, Ivana, and Serena Ng, 2011. "Dynamic Identification of Dynamic Stochastic General Equilibrium Models". Econometrica 79(6), pages 1995-2032.
Koop, Gary, M. Hashem Pesaran, and Ron P. Smith, 2013. "On Identification of Bayesian DSGE Models". Journal of Business and Economic Statistics 31(3), pages 300-314.
Leeper, Eric M., 2010. "Monetary science, fiscal alchemy". Proceedings - Economic Policy Symposium - Jackson Hole, Federal Reserve Bank of Kansas City, pages 361-434.
Leeper, Eric M., Todd B. Walker, and Shu-Chun Susan Yang, 2013. "Fiscal Foresight and Information Flows". Econometrica 81(3), pages 1115-1145.
Ravenna, Federico, 2007. "Vector autoregressions and reduced form representations of DSGE models". Journal of Monetary Economics 54(7), pages 2048-2064.
Rothenberg, Thomas J., 1971. "Identification in Parametric Models". Econometrica 39(3), pages 577-591.
Sims, Christopher A., 1980. "Macroeconomics and reality". Econometrica 48(1), pages 1-48.
Sims, Christopher A., 2002. "Solving Linear Rational Expectations Models". Computational Economics 20(1-2), pages 1-20.
Smets, Frank, and Rafael Wouters, 2007. "Shocks and Frictions in US Business Cycles: A Bayesian DSGE Approach". American Economic Review 97(3), pages 586-606.
Šustek, Roman, 2011. "Monetary Business Cycle Accounting". Review of Economic Dynamics 14(4), pages 592-612.
Wright, Philip G., 1928. The Tariff on Animal and Vegetable Oils. New York: Macmillan.
Zanetti, Francesco, 2008. "Labor and investment frictions in a real business cycle model". Journal of Economic Dynamics and Control 32(10), pages 3294-3314.
