Movements in the Equity Premium: Evidence from a Time-Varying VAR

Massimiliano De Santis∗
Department of Economics
Dartmouth College

September 9, 2007

Abstract

Previous literature has recognized the importance of regime changes in the calculation of ex-ante equity premia. However, the methodologies used to estimate equity premia only allow for very restrictive forms of regime transitions. This paper addresses the issue by postulating an evolving model for the law of motion of dividend growth, consumption growth, and the dividend-price ratio. Model parameters are then used to compute conditional and unconditional U.S. equity premia. We substantially extend and confirm previous work on the declining equity premium, and uncover important macroeconomic factors driving the equity premium. We find that the equity premium has declined, particularly from 1950 to 1971 and from 1988 to 2000. Our results point to changing consumption volatility as an important priced factor. We find that volatility of consumption growth is a good indicator of economic uncertainty and, as such, its changes are reflected in expected returns and are priced by the market. We also find that not accounting for parameter time variation induces large pricing errors.

∗ I am grateful to Timothy Cogley, Oscar Jorda, Jason Lepore, Louis Makowski, Klaus Nehring, Giorgio Primiceri, Martine Quinzii, Kevin Salyer, Aaron Smith, and participants at the European Financial Management Meetings (Basel) and the Money and Macro Research Group Annual Conference (Rethymno) for many helpful comments. Special thanks go to Timothy Cogley, whose suggestions substantially improved a previous version of the paper, and to Louis Makowski and Jason Lepore for numerous enjoyable discussions.

1 Introduction

Previous literature has recognized the importance of regime changes in the calculation of ex-ante equity premia. However, the methodologies used to estimate equity premia only allow for very restrictive forms of regime changes. For example, Blanchard (1993) uses rolling samples to estimate conditional equity premia. Jagannathan, McGrattan and Scherbina (2000), and Fama and French (2002) use non-overlapping subsamples to estimate unconditional equity premia. In this paper we use an optimal filter that allows for a wide class of regime transitions to efficiently estimate ex-ante equity premia.¹

Calculation of expected equity returns and equity premia is crucial in applications such as capital budgeting and portfolio allocation decisions. The work cited above emphasizes that the use of historical averages of excess returns may result in a poor estimate of the ex-ante equity premium. Without changes in the equity premium, a high ex-post average of excess returns indicates high risk and/or high risk aversion. But if the equity premium changes over time, high realized excess returns can be due to a drop in risk premia over the sample, rather than high risk or high risk aversion. A more precise measure of the expected equity premium can then be calculated from the yield derived from present value relations.

There are two related problems with the use of present value formulas to estimate expected returns. First, they require a log-linearization of returns around a steady-state value of the dividend-price ratio. Second, they require a specification of the law of motion of dividend growth. The usual assumption is that dividend growth follows a stationary ARMA process. While this is a convenient simplifying assumption, there is no reason to assume that dividend growth follows a stationary process. For example, the Modigliani-Miller theorem states that firm value maximization does not constrain the form of dividend policy.
Firm managers then have no incentive to follow a constant law of motion for dividends. This may lead to inconsistent estimation of expected returns. If prices are invariant to dividend policy while dividends are subject to regime changes, the law of motion of the dividend-price ratio (or dividend yield) will also be time-varying. As an additional cause of instability in the dividend yield, savers' and investors' perception of risk might change over time. This implies that log-linearization around an invariant steady-state value may be inappropriate. Since the approximating constants enter as geometric weights in the infinite sum of future dividend growth required to estimate expected returns, small changes in the steady-state value of the dividend-price ratio may imply a large bias in estimated expected returns. If the laws of motion of dividend growth and the dividend-price ratio evolve, so should their joint behavior with other macroeconomic variables like consumption growth.

¹ The distinction between conditional and unconditional will be defined below. In this paper we provide estimates of both conditional and unconditional equity premia.

In this paper we address these issues by modelling dividend growth, the dividend-price ratio, and consumption growth as a reduced-form VAR with two ingredients: time-varying coefficients and a time-varying variance-covariance matrix. In contrast to previous literature that examines the behavior of the ex-ante equity premium over time, we use an optimal filter to provide Bayesian estimates of the annual equity premium from 1928 to 2002 that use the entire sample. Also in contrast to previous literature, we include consumption in the system to relate movements of expected returns and equity premia to sources of macroeconomic risk, as measured by fluctuations in per capita consumption growth. If consumption is determined by optimal behavior of agents under time-varying risk, it should summarize information about risk premia and be of value in forecasting future returns, while keeping the number of variables in our VAR to a minimum. For example, Parker and Julliard (2005) find that consumption contains information about expected returns at multi-year horizons for a cross section of stock portfolios.

Results from our empirical model substantially extend and confirm previous work of Blanchard (1993), Siegel (1999), Jagannathan, McGrattan and Scherbina (2000), and Fama and French (2002) on the declining equity premium. We find that the equity premium in recent years is closer to levels implied by standard consumption models and that it has been declining in the post-war period from the unusually high levels of the 1930s and 1940s.
This decline suggests that the high equity returns in the post-war period may represent the end of a high equity premium, as opposed to a puzzle. Furthermore, we find a common low-frequency component between the volatility of consumption growth and the level of the equity premium. We also perform an exploratory data analysis in search of clues about factors that drive risk premia at business cycle frequencies. Results point to changing consumption volatility as an important priced factor. We find that volatility of consumption growth is a good indicator of economic uncertainty and, as such, its changes are reflected in expected returns and are priced by the market.

As an added bonus, our time-varying methodology yields interesting results about dividend and return predictability. In particular, we find that accounting for parameter time variation substantially increases the ability of the price-dividend ratio to forecast future dividend growth. Not accounting for parameter instability induces large pricing errors, because too much variation in the dividend yield is attributed to expected returns. This possibility was discussed by Martin Evans (1998). While not interested in the equity premium, Evans tests present value relationships in the spirit of Campbell and Shiller (1988), allowing the dividend process to switch between two regimes. Apart from the different focus, we do not restrict dividend growth to switch between two regimes. Discrete-switching models impose either a finite number of recurring states or a finite number of non-recurring states, and the switch between regimes is a discrete jump. These models may well describe rapid changes in the joint behavior of the variables of interest, but seem less suitable to capture changes in aggregate stock market behavior, where aggregation among agents smooths out most of the changes. Finally, VAR parameters may vary as a result of economy-wide changes other than dividends, such as changes in preferences or risk attitudes, that also seem better represented by smooth transitions.

Other work has looked at the movements in the equity premium over time using present value relations. Our approach is similar to that of Blanchard (1993). Blanchard does not use a VAR framework to infer the equity premium, but he recognizes that the relationship between fundamentals and the premium varies over time, and thus chooses to use 40-year rolling samples, his concern being unstable inflation. Like Blanchard, Fama and French (2002) and Jagannathan, McGrattan and Scherbina (2000) provide evidence that the unconditional equity premium has declined in the last 50 years and suggest that high realized returns over the period are a consequence of a declining equity premium. Both papers base their analysis on unconditional measures computed from ten-year subsamples. Therefore they implicitly assume that the stochastic process underlying stock prices is stable within each decade.
The procedure used in this paper extends the work above by providing estimates of the expected equity premium at each point in time that use the entire sample. It is left to the data to decide how much weight to give to observations far from date t. We also go a step further and provide some evidence on the macroeconomic variables that drive the equity premium both in the long run and at business cycle frequencies.² By introducing consumption growth in the VAR, we are also able to link our work to recent developments in the relationship between asset prices and macroeconomic risk, as measured by volatility of consumption. Consumption volatility is found to be time-varying and predictable by valuation ratios. Recent work by Bansal, Khatchatrian, and Yaron (2005) and Lettau, Ludvigson, and Wachter (2005) shows that this relationship is consistent with existing general equilibrium models. Here, using our conditional measures of the equity premium, we provide direct empirical evidence on the relationship between consumption volatility and expected excess returns.

² A paper by Pastor and Stambaugh (2001) is also worth citing. Although with a different focus, they also use a Bayesian technique to estimate the equity premium. They model structural breaks in the equity premium using a univariate approach (for returns) based on the risk-return trade-off. The idea is that changes in return volatility can be used as an instrument that has power to detect breaks in the equity premium.

The remainder of the paper is organized as follows. Section 2 reports stability tests on VAR equations for dividend yield, dividend growth, and consumption growth. Section 3 presents the model for expected excess returns and the econometric model. Section 4 motivates the Bayesian inference, specifies the priors used in the analysis, and gives an overview of the Gibbs sampler. Section 5 discusses the results, and Section 6 concludes the paper. An appendix at the end of the paper provides robustness checks of our Bayesian inference.

2 Stability Tests

Stability tests are conducted using both quarterly and annual data on the log dividend-price ratio ($\delta_t$), log dividend growth ($\Delta d_t$), and log per-capita consumption growth ($\Delta c_t$). Data on dividends and prices refer to the S&P 500 and are downloaded from Robert Shiller's website, as is the CPI used to convert to real figures. Data on personal consumption and population are from the FRED II website. We use two samples, a post-World War II quarterly sample and a long annual sample. Quarterly data span the period 1947.I - 2004.II, and annual data span the period 1890-2004. The focus on post-World War II quarterly data is particularly useful for our analysis of the equity premium and macroeconomic variables of Section 5.4, whereas the long sample is more useful for our analysis of long-run changes in the equity premium.³

³ In this section, we extend and confirm previous evidence of an unstable dividend growth process presented in Evans (1998), who uses Shiller's annual data, and in Timmermann (2001), who uses Shiller's monthly data.

The top panel of Table 1 presents summary statistics and Phillips-Perron unit root tests for $\delta_t$, $\Delta c_t$, $\Delta d_t$. The left panel reports results for the post-war sample, and the right panel does the same for the long sample. The lower part of the table presents residual analysis for AR(2) models of each variable. Let us focus on this part first. Ljung-Box statistics (lag length 12) indicate that two lags are enough to model the dynamics of dividend yields, consumption growth, and dividend growth in quarterly data, as absence of autocorrelation is not rejected in the residuals of any of the estimated equations. Moving to unit root tests on the variables, the Phillips-Perron test statistic does not reject the unit root hypothesis for dividend yields, although it does reject the hypothesis for dividend growth and consumption growth. One possible reason for failing to reject a unit root in $\delta_t$ is that the log dividend yield is a very persistent process, and so the data are not informative enough to distinguish between the two types of processes. Alternatively, and this is the view we take here, the time series model for $\delta_t$ may not be stable over the sample, while being stationary within sub-intervals of the sample.

To explore this hypothesis, we present results from the efficient test statistic of Graham Elliott and Ulrich K. Müller (2006). Under Elliott and Müller's test, the null hypothesis of a stable regression model is tested against the alternative of an unstable regression model, without assuming a specific law of motion for the vector of regression coefficients. For example, a stable process for the dividend yield $\delta_t$,

$$\delta_t = x_t'\beta + u_t, \qquad t = 1, 2, \ldots, T,$$

is tested against the unstable model

$$\delta_t = x_t'\beta_t + u_t, \qquad t = 1, 2, \ldots, T,$$

where $T$ is the sample size, $u_t$ is a disturbance term, and $\beta$ is the vector of regression coefficients. The vector $\beta$ is allowed to vary in the unstable model, hence the subscript $t$. But the law of motion of $\beta_t$ is not specified. Elliott and Müller show that, asymptotically, there is no power improvement in specifying a law of motion, which is typically unknown. In other words, their proposed test is asymptotically efficient. The test statistic, denoted by QLL in Table 2, is calculated using a simple six-step process. The process consists of calculating OLS residuals in the stable model and running an auxiliary regression using an appropriate transformation of the OLS residuals.

Table 2 reports the results from the tests conducted on both samples. Consider the dividend yield first. The second column of Table 2 shows the values of the QLL statistic for the $\delta_t$ equation of the VAR(2). The VAR includes $\Delta d_t$, $\Delta c_t$, and $\delta_t$, in this order. The QLL statistic rejects stable coefficients for the equation at the 1% significance level in both the post-war sample and the long sample. In the third column of the table, contemporaneous levels of $\Delta d_t$ and $\Delta c_t$ are added as regressors in the same VAR(2) equation to guard against the possibility that rejection is due to lack of appropriate conditioning information. The QLL statistics again reject parameter stability at the 1% level in both samples.
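The six-step calculation described above can be sketched in Python. This is only an illustration under stated assumptions and not the code behind Table 2: the function name `qll_statistic` is hypothetical, the steps follow our reading of Elliott and Müller (2006) and should be checked against that paper, and the long-run covariance in step 2 is replaced by a simple heteroskedasticity-robust estimate that assumes $x_t u_t$ is serially uncorrelated.

```python
import numpy as np

def qll_statistic(y, X):
    """Sketch of a qLL-type statistic for instability of all coefficients in y = X b + u.

    Assumption: x_t * u_t is serially uncorrelated, so its long-run covariance
    is estimated by the sample second-moment matrix of x_t * e_t.
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    T, k = X.shape
    # Step 1: OLS residuals from the stable (constant-coefficient) model.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ beta
    # Step 2: covariance estimate of x_t * e_t.
    Xe = X * e[:, None]
    V = Xe.T @ Xe / T
    # Step 3: U_t = V^{-1/2} x_t e_t, using the symmetric inverse square root.
    vals, vecs = np.linalg.eigh(V)
    V_inv_sqrt = vecs @ np.diag(vals ** -0.5) @ vecs.T
    U = Xe @ V_inv_sqrt
    # Step 4: w_t = r * w_{t-1} + (U_t - U_{t-1}), w_1 = U_1, with r = 1 - 10/T.
    r = 1.0 - 10.0 / T
    w = np.empty_like(U)
    w[0] = U[0]
    for t in range(1, T):
        w[t] = r * w[t - 1] + (U[t] - U[t - 1])
    # Step 5: regress each column of w on r^t (no intercept); sum squared residuals.
    trend = r ** np.arange(1, T + 1)
    coef = trend @ w / (trend @ trend)
    ssr = np.sum((w - np.outer(trend, coef)) ** 2)
    # Step 6: scale by r and subtract the sum of squared U_t.
    return r * ssr - np.sum(U ** 2)

# Illustrative use on simulated data with stable coefficients (made-up numbers).
rng = np.random.default_rng(0)
T = 200
x = np.column_stack([np.ones(T), rng.standard_normal(T)])
y_stable = x @ np.array([1.0, 0.5]) + rng.standard_normal(T)
print(qll_statistic(y_stable, x))
```

The null of stable coefficients is rejected when the statistic falls below the critical value for the relevant number of regressors. With serially correlated data, step 2 would need a proper long-run variance estimator.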


Table 1: Time Series Properties

                                      Post War                            Long Sample
                         $\delta_t$ $\Delta d_t$ $\Delta c_t$   $\delta_t$ $\Delta d_t$ $\Delta c_t$
Summary Stats
  Mean                       -4.776        0.010        0.022       -3.207        0.011        0.021
  Stdev                       0.420        0.083        0.035        0.409        0.115        0.035
  Max                        -3.984        0.400        0.180       -2.315        0.405        0.105
  Min                        -5.884       -0.317       -0.052       -4.448       -0.436       -0.099
Unit Root Tests
  PP Stat                    -1.905      -10.639      -14.843       -3.069       -9.127      -11.490
  10% Critical Value         -3.136       -3.136       -3.136       -3.155       -3.155       -3.155
  1% Critical Value          -3.991       -3.991       -3.991       -4.005       -4.005       -4.005
Autoregressions
  $\mu$                      -4.828        0.013        0.022       -3.261        0.010        0.021
  $\phi_1$                    1.090        0.517        0.019        0.855        0.160       -0.072
  $\phi_2$                   -0.105        0.080        0.285        0.035       -0.215        0.060
Autocorrelations
  Lag 1                       0.004       -0.022       -0.001       -0.004       -0.008        0.007
  Lag 2                      -0.023       -0.086        0.050       -0.236       -0.020       -0.007
  Lag 3                       0.029        0.045        0.009        0.157       -0.028       -0.107
  Lag 12                      0.065       -0.129       -0.022       -0.143       -0.086       -0.019
  Ljung-Box (12 lags)         6.612       17.972       13.908       18.583        9.874       18.460
  Critical Value             21.026       21.026       21.026       21.026       21.026       21.026

Notes: $\delta_t$ is the log of the dividend-price ratio, $\Delta d_t$ is log dividend growth, and $\Delta c_t$ is log consumption growth. PP Stat indicates the Phillips-Perron test for unit roots. The test includes a constant term and a linear trend. The coefficients $\mu$, $\phi_1$, $\phi_2$ are coefficients of an AR(2) model for the respective variables. The number of lags included in the calculation of the Ljung-Box statistic is 12, and the 5% critical value of the $\chi^2_{12}$ distribution is 21.03.
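The residual diagnostics in the lower part of Table 1 can be illustrated with a short Python sketch. The function names `fit_ar2` and `ljung_box` are hypothetical, the data are simulated, and the sketch assumes that $\mu$ denotes the process mean (intercept divided by $1 - \phi_1 - \phi_2$), which is how we read the table; it is not the code used to produce the reported numbers.

```python
import numpy as np

def fit_ar2(x):
    """OLS fit of x_t = c + phi1*x_{t-1} + phi2*x_{t-2} + e_t.

    Returns (mu, phi1, phi2) and the residuals, with mu = c / (1 - phi1 - phi2).
    """
    x = np.asarray(x, dtype=float)
    Y = x[2:]
    Z = np.column_stack([np.ones(len(Y)), x[1:-1], x[:-2]])
    (c, phi1, phi2), *_ = np.linalg.lstsq(Z, Y, rcond=None)
    resid = Y - Z @ np.array([c, phi1, phi2])
    mu = c / (1.0 - phi1 - phi2)
    return (mu, phi1, phi2), resid

def ljung_box(resid, lags=12):
    """Ljung-Box Q = T(T+2) * sum_{k=1}^{lags} rho_k^2 / (T-k)."""
    e = np.asarray(resid, dtype=float)
    e = e - e.mean()
    T = len(e)
    denom = e @ e
    q = 0.0
    for k in range(1, lags + 1):
        rho_k = (e[k:] @ e[:-k]) / denom   # lag-k sample autocorrelation
        q += rho_k ** 2 / (T - k)
    return T * (T + 2) * q

# Simulated AR(2) series with made-up coefficients, for illustration only.
rng = np.random.default_rng(1)
x = np.zeros(5000)
for t in range(2, len(x)):
    x[t] = 0.1 + 0.5 * x[t - 1] + 0.2 * x[t - 2] + rng.standard_normal()

params, resid = fit_ar2(x)
print(params, ljung_box(resid, lags=12))
```

As in the notes to Table 1, a Q statistic with 12 lags is compared with the 5% critical value of 21.03; values below it do not reject the absence of residual autocorrelation.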


Table 2: Tests for Parameter Instability

                      $\delta_t$          $\delta_t$ VAR(2),  $\Delta d_t$  $|\Delta d_t|$  $\Delta c_t$
                          VAR(2)  $\Delta d_t$, $\Delta c_t$        VAR(2)          VAR(2)        VAR(2)
Post War
  QLL                     -11631                      -16542        -20581          -35611         -8590
  1% Critical Value       -45.85                      -56.46        -45.85          -45.85        -45.85
Long Sample
  QLL                      -2531                       -2379         -1005           -1902         -1595
  1% Critical Value       -45.85                      -56.46        -45.85          -45.85        -45.85

Notes: The QLL statistic tests the null hypothesis of stable parameters against the alternative of non-stable parameters with an unspecified law of motion. The null hypothesis is rejected for small values of the statistic.

Moving to log dividend growth, the QLL statistic rejects parameter stability in the VAR(2) model at the 1% level (see the fourth column). We also conduct the test on the absolute value of dividend growth, $|\Delta d_t|$, a proxy for realized volatility of the process. There is even stronger evidence of instability in this series, which indicates that there are important non-linearities in the time series behavior of this fundamental variable for price determination. The VAR(2) equation with stable parameters is also rejected for log consumption growth at the 1% level of significance. To summarize, we find significant evidence of parameter instability in the behavior of $\delta_t$, $\Delta c_t$, $\Delta d_t$ over the sample considered. This and the existing evidence reported in Timmermann (2001) and Evans (1998) give support to a time-varying specification of the joint behavior of the series.⁴

⁴ We also conduct Bai and Perron (1998) and Hansen's (1992) tests of parameter instability and find similar evidence. Results (not reported) are available in an appendix upon request.

3 The Model of Expected Excess Returns

To estimate expected excess returns, we estimate a time-varying VAR on dividend growth in excess of the risk free rate, consumption growth, and dividend yields. As we detail below, we use VAR forecasts along with Campbell and Shiller's (1988a,b) log-linear approximation of returns to infer excess returns. That is, we do not use returns data directly. It is possible to use the same VAR methodology and use excess returns data directly, along with consumption growth and dividend yields, to calculate expected excess returns.⁵ But notice that because returns are a nonlinear function of dividends and prices, apart from approximation error, the two VARs yield the same forecasts of excess returns, as they use the same information. Further, the number of parameters to estimate would be the same; thus, in terms of computational complexity, the two approaches are equivalent. Even if our way of proceeding is sometimes referred to as a present value approach (or dynamic Gordon model), it does not impose any more economic assumptions than using returns directly would. As shown in Campbell and Shiller (1988a), and also below, the only extra cost is the use of the log-linearization of returns.⁶ The benefit from this procedure is that it allows us, in a three-variable VAR, to estimate the equation for dividend growth. In particular, it allows us to analyze the effect of parameter time variation on the predictability of "fundamentals," as measured by dividend growth. We now proceed to the derivation of our measures of expected excess returns.

⁵ Following the predictability literature, the dividend yield would still need to be included to improve forecasts of returns.

⁶ The model becomes a present value model when additional assumptions are made about discount rates, e.g. the assumption that the discount rate is constant, or the assumption that the discount rate is a power function of aggregate consumption growth, as in standard consumption CAPM models.

3.1 Measures of Expected Excess Returns

As mentioned above, we use the log-linear approximation of returns proposed by Campbell and Shiller (1988a,b) to estimate expected excess returns from the time series on dividends and prices. The approximation starts from the definition of stock returns between time $t$ and $t+1$:

$$1 + R_{t+1} \equiv \frac{P_{t+1} + D_{t+1}}{P_t} = \frac{P_{t+1}}{P_t}\left(1 + \frac{D_{t+1}}{P_{t+1}}\right).$$

Denoting log returns by $h_{t+1}$ and the logs of price and dividend by lowercase letters, we can write:

$$\begin{aligned}
h_{t+1} \equiv \log(1 + R_{t+1}) &= p_{t+1} - p_t + \log\left(1 + \exp(d_{t+1} - p_{t+1})\right) \\
&= -(d_{t+1} - p_{t+1}) + (d_t - p_t) + \Delta d_{t+1} + \log\left(1 + \exp(d_{t+1} - p_{t+1})\right) \\
&= \delta_t + \Delta d_{t+1} - \delta_{t+1} + \log\left(1 + \exp(\delta_{t+1})\right),
\end{aligned}$$

where the second line is obtained by simply adding and subtracting log dividend growth $\Delta d_{t+1}$, and the third line by rearranging and defining $\delta_t \equiv d_t - p_t$. The nonlinear function $\log(1 + \exp(\delta_{t+1}))$ can be linearly approximated around a value of $\delta_{t+1}$. Campbell and Shiller (1988a,b) approximate the function around the unconditional mean $\bar{\delta}$, obtaining

$$h_{t+1} \simeq k + \delta_t + \Delta d_{t+1} - \rho\,\delta_{t+1}, \tag{3.1}$$

where $k$ and $\rho$ are constants derived from the approximation, defined by $\rho = 1/(1 + \exp(\bar{\delta}))$ and $k = -\log(\rho) - (1 - \rho)\log(1/\rho - 1)$. In this paper, we are interested in log excess returns $h_{t+1} - r^f_{t+1}$, obtained by subtracting the risk free rate $r^f_{t+1}$ from both sides of (3.1):

$$h_{t+1} - r^f_{t+1} \simeq k + \delta_t + (\Delta d_{t+1} - r^f_{t+1}) - \rho\,\delta_{t+1}.$$
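To make the role of the approximating constants concrete, the following Python sketch evaluates $\rho$ and $k$ at a given mean log dividend-price ratio and compares the exact term $\log(1 + \exp(\delta))$ with its linearization $k + (1 - \rho)\delta$. The function names are hypothetical, and the value $-3.207$ is simply the long-sample mean of $\delta_t$ from Table 1, used here for illustration.

```python
import math

def loglinear_constants(delta_bar):
    """Constants of the log-linear approximation around delta_bar."""
    rho = 1.0 / (1.0 + math.exp(delta_bar))
    k = -math.log(rho) - (1.0 - rho) * math.log(1.0 / rho - 1.0)
    return rho, k

def exact_term(delta):
    """Exact nonlinear term log(1 + exp(delta))."""
    return math.log1p(math.exp(delta))

def approx_term(delta, rho, k):
    """First-order approximation k + (1 - rho) * delta."""
    return k + (1.0 - rho) * delta

delta_bar = -3.207                       # long-sample mean of delta_t in Table 1
rho, k = loglinear_constants(delta_bar)
for delta in (-4.4, -3.207, -2.3):       # roughly the long-sample min, mean, max
    err = exact_term(delta) - approx_term(delta, rho, k)
    print(f"delta={delta:7.3f}  approximation error={err:.6f}")
```

The approximation is exact at $\bar{\delta}$ and, because the exact term is convex, understates it elsewhere; the error grows as $\delta$ moves away from the point of approximation, which is the concern when the mean of $\delta_t$ shifts over time.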

Campbell and Shiller derived the linear approximation on the assumption that the dividend yield is a stationary process, and so chose the sample mean as the point of approximation. But if the process for $\delta_t$ is non-stationary and $\bar{\delta}$ changes over time, a better approximation would let $k$ and $\rho$ vary over time. Even though Campbell and Shiller's approximated return series $h_t$ with constant parameters is highly correlated with actual returns (the correlation coefficient is 0.99), a time-varying approximation can make a difference for calculating expected returns, as we show below. We therefore define the time-varying approximation for excess returns as

$$h_{t+1} - r^f_{t+1} \simeq k_t + \delta_t + (\Delta d_{t+1} - r^f_{t+1}) - \rho_t\,\delta_{t+1}, \tag{3.2}$$

where the approximation parameters are allowed to vary over time. The (approximate) first-order difference equation in $\delta_t$ implicitly defined by (3.2) can be solved forward for $\delta_t$, imposing the terminal condition $\lim_{j\to\infty} \rho_t^j \delta_{t+j} = 0$. Taking expected values conditional on an information set containing $\delta_t$ and the approximating constants, we obtain

$$\delta_t \simeq -\frac{k_t}{1 - \rho_t} + \sum_{j=0}^{\infty} \rho_t^j E_t\left(h_{t+j+1} - r^f_{t+j+1}\right) - \sum_{j=0}^{\infty} \rho_t^j E_t\left(\Delta d_{t+j+1} - r^f_{t+j+1}\right). \tag{3.3}$$

So the log dividend-price ratio is approximately equal to a constant, plus a weighted sum of future expected excess returns, minus a weighted sum of future expected dividend growth in excess of future risk free rates. The first sum on the right hand side of (3.3) is a weighted sum of future expected excess returns whose weights sum to $1/(1 - \rho_t)$, so multiplying it by $(1 - \rho_t)$ gives a weighted average. Thus, at time $t$, we can use (3.3) to obtain a measure of the conditional equity premium $EP_c$:

$$\begin{aligned}
EP_{c,t} &\equiv (1 - \rho_t) \sum_{j=0}^{\infty} \rho_t^j E_t\left(h_{t+j+1} - r^f_{t+j+1}\right) \\
&= k_t + (1 - \rho_t)\left(\delta_t + \sum_{j=0}^{\infty} \rho_t^j E_t\left(\Delta d_{t+j+1} - r^f_{t+j+1}\right)\right).
\end{aligned} \tag{3.4}$$

This measure of the equity premium is simply the average yield of stocks in excess of the risk free rate. Notice now that even small changes in $\rho_t$ may have a large impact on the measurement of the equity premium, as the error propagates in the infinite sum.⁷ From the above equation, the usefulness of the log-linear approximation should become clear: we can now use a linear model to forecast future dividends in excess of the risk free rate, and calculate the equity premium. In what follows, we will model $\delta_t$, $(\Delta d_t - r^f_t)$, and $\Delta c_t$ as a VAR with time-varying parameters. Then we can use estimated VAR parameters to calculate the expectations on the right hand side as

$$\sum_{j=0}^{\infty} \rho_t^j E_t\left(\Delta d_{t+j+1} - r^f_{t+j+1}\right) = s_d'\left[\frac{\mu_t}{1 - \rho_t} + F_t\left(I - \rho_t F_t\right)^{-1}\xi_t\right], \tag{3.5}$$

where $F_t$ contains the time-$t$ VAR parameters rewritten in state-space form, $\xi_t$ contains the state vector in deviation from the (time-varying) unconditional means, and $s_d$ is a vector that selects the dividend series from the VAR (see Hamilton, 1994, p. 259). The unconditional means are computed as $\mu_t = (I - F_t)^{-1} c_t$, where $c_t$ is the vector of intercepts in the VAR. This is analogous to what one would do with a constant-parameter VAR. Here we use a different set of parameters at each date. The expectation $E_t$ is therefore conditional on the variables at time $t$ ($y_t$ in the VAR), and conditional on VAR parameters $\theta_t \equiv (F_t, \mu_t)$. If we condition only on $\theta_t$ we can get a measure of the time-$t$ unconditional equity premium. Consider equation (3.2) again. Taking the expected value conditional on $\theta_t$ yields

$$EP_{u,t} \equiv \mu_t(h - r^f) \simeq k_t + \mu_t(\delta)(1 - \rho_t) + \mu_t(\Delta d - r^f), \tag{3.6}$$

where $\mu_t(\cdot)$ are the time-varying unconditional means from the VAR. This is our unconditional measure of the equity premium, $EP_u \equiv \mu_t(h - r^f)$. $EP_u$ can also be derived by averaging over $y_t$ in (3.4). In both $EP_c$ and $EP_u$, we use the time-varying unconditional mean of $\delta_t$, $\mu_t(\delta)$, to evaluate $k_t$ and $\rho_t$. Our conditional measure of the equity premium $EP_c$ is an average excess yield on the risky asset, and can be thought of as the average expected excess return over a period of, say, 15-20 years in annual data. At each date $t$, expected 15-20 year annualized excess returns will depend on the price level relative to dividends at time $t$. If stocks are expensive relative to dividends compared to some mean-reverting value, excess yields will be lower.

⁷ For example, assuming the true parameters equal the posterior median of our estimates in the following sections, not allowing for a time-varying approximation leads to a median absolute mispricing of 85 basis points and an average mispricing of 41 basis points in quarterly data.

Our choice of $EP_{c,t}$ needs further explanation. Notice in fact that, given VAR parameters, it is possible to calculate the one-period-ahead forecasts $E_t(h_{t+1} - r^f_{t+1})$. We choose, instead, to use an average yield on stocks over and above the risk free rate. There are two basic reasons for our choice. First, excess returns are almost unpredictable at quarterly and yearly frequencies, which causes these forecasts to vary highly from one quarter or year to the next, often assuming implausible values. By contrast, long-horizon returns are much more predictable. Second, for the main applications of the equity premium, such as capital budgeting decisions or wealth allocation across asset classes, a long horizon is more appropriate. Our measure is also consistent with previous studies such as Blanchard (1993), Jagannathan, McGrattan, and Scherbina (2000), and Fama and French (2002), who also defined the equity premium as an average yield on stocks above the risk free rate. The second measure, $EP_u$, represents expected returns as if one bought stocks at their mean price relative to dividends.
The fact that our unconditional measure varies over time is meant to capture non-stationarity of $\delta$ due to structural shifts in dividend policy, productivity, or preferences that change expected returns and/or expected growth rates. Notice that this way of calculating the premium, using $\Delta d_t - r^f_t$, will automatically yield a real equity premium, as inflation corrections cancel out. So we do not have to worry about calculating expected inflation in our measure of the premium.
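As a rough numerical illustration of equations (3.4) to (3.6), the sketch below computes both premium measures for one date from a given set of VAR parameters. It assumes a VAR(1), so that the companion matrix $F_t$ is just the coefficient matrix, and all numbers and function names (`discounted_sum`, `equity_premia`) are made up for the example rather than taken from our estimates.

```python
import numpy as np

def loglinear_constants(mean_delta):
    """rho_t and k_t evaluated at the (time-t) unconditional mean of delta."""
    rho = 1.0 / (1.0 + np.exp(mean_delta))
    k = -np.log(rho) - (1.0 - rho) * np.log(1.0 / rho - 1.0)
    return rho, k

def discounted_sum(F, mu, xi, rho, select):
    """Right-hand side of (3.5): select' [mu/(1-rho) + F (I - rho F)^{-1} xi]."""
    n = F.shape[0]
    tail = F @ np.linalg.solve(np.eye(n) - rho * F, xi)
    return select @ (mu / (1.0 - rho) + tail)

def equity_premia(F, mu, y, i_div=0, i_dp=2):
    """Conditional (3.4) and unconditional (3.6) premia at one date.

    Variable order assumed: (dividend growth minus risk free rate,
    consumption growth, log dividend-price ratio).
    """
    s_d = np.eye(len(mu))[i_div]
    rho, k = loglinear_constants(mu[i_dp])
    pv = discounted_sum(F, mu, y - mu, rho, s_d)
    ep_c = k + (1.0 - rho) * (y[i_dp] + pv)
    ep_u = k + (1.0 - rho) * mu[i_dp] + mu[i_div]
    return ep_c, ep_u

# Hypothetical time-t parameters and data (illustration only).
F = np.array([[0.2, 0.1, 0.05],
              [0.0, 0.3, 0.00],
              [0.0, 0.0, 0.90]])
mu = np.array([0.01, 0.02, -3.2])
y = np.array([0.03, 0.01, -3.5])
print(equity_premia(F, mu, y))
```

When the state equals its unconditional mean ($\xi_t = 0$), the two measures coincide, which is the sense in which $EP_u$ averages over $y_t$ in (3.4). In the time-varying model the same calculation would be repeated at each date, and for each posterior draw, with that date's parameters.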


3.2 The Econometric Model

Given the strong evidence of parameter instability in VAR equations presented in Table 2, we model the joint behavior of $(\Delta d_t - r^f_t)$, $\Delta c_t$, and $\delta_t$ (in this order) as a VAR with time-varying parameters:

$$y_t = X_t'\theta_t + u_t, \qquad X_t' = I_n \otimes [1, y_{t-1}', \ldots, y_{t-k}'], \tag{3.7}$$

where $y_t$ includes the three observed variables. In general this is an $n \times 1$ vector. $\otimes$ denotes the Kronecker product, so in general $X_t'$ is an $n \times k$ matrix. $\theta_t$ is the $k \times 1$ vector of coefficients. The $u_t$ are disturbance terms with variance-covariance matrix $\Omega_t$. Without loss of generality, consider the following decomposition of $\Omega_t$:

$$\Omega_t = A_t^{-1}\,\Sigma_t\,A_t^{-1\prime}, \tag{3.8}$$

where $A_t$ is lower triangular,

$$A_t = \begin{bmatrix}
1 & 0 & \cdots & 0 \\
a_{21,t} & 1 & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
a_{n1,t} & \cdots & a_{nn-1,t} & 1
\end{bmatrix},$$

and $\Sigma_t^{1/2}$ is the diagonal matrix

$$\Sigma_t^{1/2} = \begin{bmatrix}
\sigma_{1,t} & 0 & \cdots & 0 \\
0 & \sigma_{2,t} & \ddots & \vdots \\
\vdots & \ddots & \ddots & 0 \\
0 & \cdots & 0 & \sigma_{n,t}
\end{bmatrix}.$$

It follows that (3.7) is equivalent to

$$y_t = X_t'\theta_t + A_t^{-1}\Sigma_t^{1/2}\varepsilon_t. \tag{3.9}$$
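The decomposition in (3.8) can be checked numerically with a few lines of Python. The helper `build_omega` and the numerical values are hypothetical; the sketch only illustrates how the free elements of $A_t$ (stacked by rows) and the standard deviations on the diagonal of $\Sigma_t^{1/2}$ map into a valid covariance matrix.

```python
import numpy as np

def build_omega(alpha, sigma):
    """Return A_t and Omega_t = A^{-1} Sigma A^{-1}' for given free elements.

    alpha: below-diagonal elements of A stacked by rows (a21, a31, a32, ...).
    sigma: standard deviations on the diagonal of Sigma^{1/2}.
    """
    n = len(sigma)
    A = np.eye(n)
    A[np.tril_indices(n, k=-1)] = alpha      # row-by-row fill below the diagonal
    B = np.linalg.inv(A) @ np.diag(sigma)    # A^{-1} Sigma^{1/2}
    return A, B @ B.T

# Illustrative values for n = 3.
A, Omega = build_omega(np.array([0.5, -0.2, 0.3]), np.array([0.1, 0.2, 0.3]))
print(Omega)
print(A @ Omega @ A.T)   # diagonal, with the variances sigma_i^2 on the diagonal
```

Because $A_t$ has ones on the diagonal and the $\sigma_{i,t}$ are positive, any values of the free elements deliver a symmetric positive definite $\Omega_t$, which is what makes it convenient to let these elements drift over time.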

The drifting coefficients are meant to capture possible nonlinearities or time variation in the lag structure of the model. The multivariate time-varying variance-covariance matrix captures possible heteroskedasticity of the shocks and time variation in the simultaneous relations among the variables in the system. In the context of time-varying VAR models, a similar specification has been proposed by Primiceri (2005) and Cogley and Sargent (2005), though the latter has a time-invariant $A_t$ matrix. As emphasized in Primiceri (2005), a time-varying $A_t$ is highly desirable if the objective is to model time variation in a simultaneous equation model.

Let $\alpha_t$ be the vector of non-zero and non-one elements of the matrix $A_t$ (stacked by rows) and $\sigma_t$ be the vector of diagonal elements of $\Sigma_t^{1/2}$. The model's time-varying parameters evolve as follows:

$$\theta_t = \theta_{t-1} + \nu_t, \tag{3.10}$$

$$\alpha_t = \alpha_{t-1} + \zeta_t, \tag{3.11}$$

$$\log \sigma_t = \log \sigma_{t-1} + \eta_t, \tag{3.12}$$

with the distributional assumptions regarding $(\varepsilon_t, \nu_t, \zeta_t, \eta_t)$ stated below. The time-varying parameters $\theta_t$ and the free elements of $A_t$ are modelled as driftless random walks, and the standard deviations are assumed to evolve as geometric random walks. Thus, the model belongs to the class of stochastic volatility models, which constitutes an alternative to ARCH models. The crucial difference with ARCH is that the variances generated by (3.12) are unobservable components. Equations (3.9)-(3.12) form a state space representation for the model. (3.9) is termed the observation equation, and (3.10)-(3.12) are the state equations.

An undesirable feature of the random walk assumption is that the process hits any upper or lower bound with probability one. Our objective, though, is to uncover the values of the parameters $\theta_t$, $A_t$ and $\sigma_t$ as they evolve in our finite sample. As long as (3.10)-(3.12) are thought to be in place for a finite period of time, the random walk assumption should be quite innocuous, and it provides flexibility while reducing the number of parameters in the estimation procedure. This is particularly true if, quite plausibly, the variances of parameter innovations are small. All the innovations in the model are assumed to be jointly normally distributed with a block diagonal covariance matrix:

$$V = \mathrm{Var}\begin{pmatrix} \varepsilon_t \\ \nu_t \\ \zeta_t \\ \eta_t \end{pmatrix} = \begin{bmatrix}
I_n & 0 & 0 & 0 \\
0 & Q & 0 & 0 \\
0 & 0 & S & 0 \\
0 & 0 & 0 & W
\end{bmatrix}, \tag{3.13}$$

where $I_n$ is the identity matrix of dimension $n$, and $Q$, $S$, and $W$ are positive definite matrices. We will further assume that $S$ is block diagonal, with blocks corresponding to parameters belonging to separate equations in the structural model. This assumption simplifies inference and increases the efficiency of the estimation algorithm.
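To fix ideas about the state space form (3.9)-(3.12), the following Python sketch simulates data from a small version of the model. It is an illustration under simplifying assumptions that are ours and not part of the model above: one lag, $Q$, $S$, and $W$ equal to scalar multiples of the identity, and arbitrary starting values. The function name `simulate_tvp_var` is hypothetical.

```python
import numpy as np

def simulate_tvp_var(T=200, n=3, q=1e-5, s=1e-4, w=1e-3, seed=0):
    """Simulate a one-lag TVP-VAR with stochastic volatility.

    Assumptions (illustrative): Q = q*I, S = s*I, W = w*I, one lag,
    and ad hoc initial states.
    """
    rng = np.random.default_rng(seed)
    k = n * (n + 1)                          # per equation: intercept + n lag coefficients
    theta = np.zeros(k)
    for i in range(n):
        theta[i * (n + 1) + 1 + i] = 0.5     # initial own-lag coefficient
    alpha = np.zeros(n * (n - 1) // 2)       # free elements of A_t
    log_sigma = np.log(np.full(n, 0.05))
    y_prev = np.zeros(n)
    y = np.empty((T, n))
    thetas = np.empty((T, k))
    sigmas = np.empty((T, n))
    for t in range(T):
        # State equations (3.10)-(3.12): driftless random walks.
        theta = theta + np.sqrt(q) * rng.standard_normal(k)
        alpha = alpha + np.sqrt(s) * rng.standard_normal(alpha.size)
        log_sigma = log_sigma + np.sqrt(w) * rng.standard_normal(n)
        # Observation equation (3.9): y_t = X_t' theta_t + A_t^{-1} Sigma_t^{1/2} eps_t.
        regressors = np.concatenate(([1.0], y_prev))
        X_t = np.kron(np.eye(n), regressors[None, :])    # n x k
        A = np.eye(n)
        A[np.tril_indices(n, k=-1)] = alpha
        eps = rng.standard_normal(n)
        y[t] = X_t @ theta + np.linalg.solve(A, np.exp(log_sigma) * eps)
        thetas[t], sigmas[t] = theta, np.exp(log_sigma)
        y_prev = y[t]
    return y, thetas, sigmas

y_sim, theta_path, sigma_path = simulate_tvp_var()
print(y_sim.shape, theta_path.shape, sigma_path.shape)
```

Modelling $\log \sigma_t$ rather than $\sigma_t$ keeps the simulated standard deviations positive at every date, and small innovation variances produce the slowly drifting parameter paths the specification is meant to capture.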


4 Estimation

The model in (3.9)-(3.12) is basically a regression model with random coefficients and covariances. The Bayesian framework, which views parameters as random variables, is the most natural approach in this setting. The Kalman filter, which is the algorithm used to make inferences about the history of $\theta_t$, also fits naturally in a Bayesian framework (see Meinhold and Singpurwalla, 1983). This section gives an overview of the estimation strategy and the algorithm used in estimation.

Two other important reasons make Bayesian methods particularly suitable for this class of models. First, if the variance of the time-varying coefficients is small, as one would expect here, then the maximum likelihood estimator is biased towards a constant-coefficient VAR. As a consequence, numerical optimization methods are very likely to get stuck in uninteresting regions of the likelihood (see for instance Stock and Watson, 1998, for a discussion of the subject). The second, related reason is that maximum likelihood requires numerical optimization in a high-dimensional problem. Multiple peaks are highly probable in such a nonlinear model. This makes MLE quite unreliable, if in fact a peak is reached at all. Here, the problem of estimating a high-dimensional parameter vector is dealt with by means of the Gibbs sampler, which allows us to divide the task into smaller and simpler ones. The Gibbs sampler is a stochastic algorithm, and as such it is more likely to escape local maxima. Finally, our Gibbs sampling methodology delivers smoothed series, i.e. estimates conditional on observing the entire sample, which we view as the natural object to consider in this context.

The view of the paper is the following. We are given a sample $(y_1, \ldots, y_T)$ and want to make inference about expected excess returns over the time interval $1, \ldots, T$.
Under the assumption that the TVP-VAR well represents the dynamic relationship among the variables in yt , expectations are a nonlinear function of unknown VAR parameters. Thus, the inference about these expectations is an inference about a nonlinear function of the parameters. When making the inference about a nonlinear function of the parameters, it seems natural to use the entire observed sample.8 Put differently, we are not assuming that agents form expectations using our VAR at each point in time after observing yt , in which case out-of-sample forecasts would be more appropriate. Rather, we view the time-varying VAR as a useful model for conducting the inference of what expectations might have been, given the full sample. This seems to be the interpretation of Campbell and Shiller (1988) (and the following literature on tests of present value relations). They use a constant parameter VAR estimated over the entire sample to infer expected future dividend growth as in equation (3.5) at each point in time.

8 The Bayesian procedure will produce a posterior distribution of our object of interest, the expectations of excess returns.

4.1

Priors

We choose prior distributions following Cogley and Sargent (2005) and Primiceri (2005). The choice is based on intuitiveness and statistical convenience of the distributions for the application at hand. Following the Bayesian literature, θt , At , Σt will be called "parameters" and the elements of Q, S, W "hyperparameters". The hyperparameters are assumed to be distributed as independent inverse-Wishart random matrices. The Wishart distribution can be thought of as the multivariate analog of the chi-square, and it is used to impose positive definiteness of the blocks of V as defined in (3.13). The prior is

\[
p(V) = IW\left(V^{-1}, T_0\right),
\]

where IW (Sc, df ) represents the inverse-Wishart with scale matrix Sc and degrees of freedom df . The priors for the initial states of the regression coefficients, the covariances, and the log volatilities, p(θ0 ), p(α0 ), p(log σ0 ), are conveniently assumed to be normally distributed, independent of each other and of the hyperparameters. These assumptions, together with (3.10)-(3.12), imply normal priors for the evolving parameters. For instance, the vector of covariance states evolves according to p(αt+1 |αt , S) ∼ N (αt , S), and similarly for coefficient states and volatility states. The normal prior on θ is standard. Primiceri (2005), Smith and Kohn (2002), as well as Cogley and Sargent (2005), use the same decomposition of Ωt and place a similar prior on the elements of A. The log-normal prior on the volatility parameters is common in the stochastic volatility literature modelling ηt as Gaussian (see Kim, Shephard and Chib, 1998). Such a prior is not conjugate, but has the advantage of tractability. While we make convenient distributional assumptions, our priors are only weakly informative and we have no a priori reason to believe these assumptions should affect parameter estimates of our reduced form VAR in any economically significant way.
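To make the random-walk priors on the evolving parameters concrete, the following minimal Python sketch simulates one path of states obeying p(αt+1 |αt , S) ∼ N (αt , S). The function name, the dimension, and the value of S are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def simulate_random_walk_states(alpha0, S, T, seed=0):
    """Simulate alpha_{t+1} = alpha_t + nu_t with nu_t ~ N(0, S)."""
    rng = np.random.default_rng(seed)
    alpha0 = np.asarray(alpha0, dtype=float)
    # Draw all T innovations at once; each row is one N(0, S) shock.
    shocks = rng.multivariate_normal(np.zeros(alpha0.size), S, size=T)
    # Cumulative sums of the shocks give the random-walk path after alpha0.
    path = alpha0 + np.cumsum(shocks, axis=0)
    return np.vstack([alpha0, path])  # shape (T + 1, dim)

# Hypothetical example: three covariance states, small innovation variance.
S = 0.01 * np.eye(3)
states = simulate_random_walk_states(np.zeros(3), S, T=50)
```

A small S keeps consecutive states close to each other, which is the sense in which the prior favors slowly drifting parameters.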


4.2

Overview of the Simulation method

The complete Gibbs sampling procedure is detailed in the appendices of Primiceri (2005) or Cogley and Sargent (2005). Here I sketch the MCMC algorithm used to sample from the joint posterior of (θT , AT , ΣT , V ). Here and throughout the paper, a superscript T denotes complete histories of data (e.g. θT = θ1′ , . . . , θT′ ). Sampling from the joint posterior directly is complicated, so sampling is carried out in four steps by sequentially drawing from the conditional posterior of the four blocks of parameters: coefficients θT , simultaneous relations AT , variances ΣT , and hyperparameters V . The sampler can be summarized as follows:

1. Initialize AT , ΣT , and V .
2. Sample θT from p(θT |Y T , AT , ΣT , V ).
3. Sample AT from p(AT |Y T , θT , ΣT , V ).
4. Sample ΣT from p(ΣT |Y T , θT , AT , V ).
5. Sample V by sampling Q, W, S from p(Q, W, S|Y T , θT , AT , ΣT ).
6. Go to 2.

Conditional on AT and ΣT , the state space form given by (3.9) and (3.10) is linear and Gaussian. Therefore, the conditional posterior of θT is a product of Gaussian densities and θT can be drawn using a standard simulation smoother (see for instance Fruhwirth-Schnatter, 1994, or Cogley and Sargent, 2005). This consists of drawing an initial state θ0 and running the Kalman filter forward to produce a trajectory of parameters; from the terminal state, a backward recursion then produces the required "smoothed" draws (i.e. draws of the θ's given Y T ). Similarly, the posterior of AT conditional on θT and ΣT is a product of normal densities, so AT is drawn in the same way. Drawing from the conditional posterior of ΣT is a little more involved because the conditional state-space representation for log σt is not Gaussian. This stage of the Gibbs sampler uses a method proposed by Kim, Shephard and Chib (1998). This consists of transforming the non-Gaussian state space form into an approximately Gaussian one (by using a mixture of normal distributions), which allows us again to use the standard simulation smoother conditional on a member of the mixture. Finally, drawing from the conditional posterior of the hyperparameters (V ) is standard, since it is a product of independent inverse-Wishart distributions.


After a transitional period (“burn-in” period), the sequence of draws of the four blocks from their respective conditional posteriors converges to a draw from the joint distribution p(θT , AT , ΣT , V |Y ).
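The logic of alternating between a simulation smoother and a hyperparameter draw can be illustrated on a deliberately simplified problem. The Python sketch below is not the four-block sampler of the paper: it assumes a scalar regression y_t = x_t θ_t + e_t with a random-walk coefficient, a known measurement variance, and an inverse-gamma draw for the innovation variance q as the scalar analog of the inverse-Wishart step. It uses a forward-filter, backward-sample recursion in the spirit of the simulation smoothers cited above; all function names, priors, and synthetic data are hypothetical.

```python
import numpy as np

def ffbs(y, x, sigma2, q, m0, P0, rng):
    """Draw theta_1..theta_T for y_t = x_t*theta_t + e_t, theta_t = theta_{t-1} + v_t."""
    T = len(y)
    m = np.empty(T)  # filtered means E[theta_t | y_1..y_t]
    P = np.empty(T)  # filtered variances
    m_prev, P_prev = m0, P0
    for t in range(T):
        P_pred = P_prev + q                  # prediction step (random walk)
        f = x[t] ** 2 * P_pred + sigma2      # forecast variance of y_t
        k = P_pred * x[t] / f                # Kalman gain
        m[t] = m_prev + k * (y[t] - x[t] * m_prev)
        P[t] = P_pred - k * x[t] * P_pred
        m_prev, P_prev = m[t], P[t]
    theta = np.empty(T)
    theta[-1] = rng.normal(m[-1], np.sqrt(P[-1]))   # draw the terminal state
    for t in range(T - 2, -1, -1):                  # backward recursion
        g = P[t] / (P[t] + q)
        mean = m[t] + g * (theta[t + 1] - m[t])
        var = P[t] * (1.0 - g)
        theta[t] = rng.normal(mean, np.sqrt(var))
    return theta

def draw_q(theta, nu0, s0, rng):
    """Inverse-gamma draw for q given the state path (diffuse initial state ignored)."""
    d = np.diff(theta)
    shape = 0.5 * (nu0 + d.size)
    scale = 0.5 * (s0 + np.sum(d ** 2))
    return scale / rng.gamma(shape)  # scale / Gamma(shape, 1) is InvGamma(shape, scale)

def gibbs(y, x, sigma2=1.0, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    q = 0.01                                  # initialize the hyperparameter
    draws_theta, draws_q = [], []
    for _ in range(n_iter):
        theta = ffbs(y, x, sigma2, q, m0=0.0, P0=10.0, rng=rng)  # states | q
        q = draw_q(theta, nu0=2.0, s0=0.02, rng=rng)             # q | states
        draws_theta.append(theta)
        draws_q.append(q)
    return np.array(draws_theta), np.array(draws_q)

# Synthetic data with a slowly drifting coefficient (illustrative only).
data_rng = np.random.default_rng(1)
T = 80
x = data_rng.normal(size=T)
true_theta = 0.5 + np.cumsum(data_rng.normal(scale=0.05, size=T))
y = x * true_theta + data_rng.normal(size=T)
theta_draws, q_draws = gibbs(y, x)
```

Each row of `theta_draws` is one smoothed history of the coefficient, so posterior medians over time can be read off column by column after discarding an initial burn-in.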

4.3

Prior Calibration

The priors are calibrated on a constant parameter VAR(2) estimated using an initial sample of 36 observations for the long sample of annual data, and 50 observations for the post war sample of quarterly data. This corresponds to the years 1892-1927 in annual data and 1947.Q1-1956.Q4 in the quarterly sample. Priors for parameters and hyperparameters are modeled as follows:

\[
\begin{aligned}
\theta_0 &\sim N\big(\hat{\theta}_{OLS},\, V(\hat{\theta}_{OLS})\big) \\
A_0 &\sim N\big(\hat{A}_{OLS},\, V(\hat{A}_{OLS})\big) \\
\log \sigma_0 &\sim N\big(\log \hat{\sigma}_{OLS},\, I_n\big) \\
Q &\sim IW\big(k_Q^2\, T_0\, V(\hat{\theta}_{OLS}),\, T_0\big) \\
W &\sim IW\big(k_W^2\, I_n,\, 4\big) \\
S_1 &\sim IW\big(k_S^2\, V(\hat{A}_{1,OLS}),\, 2\big) \\
S_2 &\sim IW\big(k_S^2\, V(\hat{A}_{2,OLS}),\, 3\big)
\end{aligned}
\tag{4.1}
\]

The prior on θ0 is standard. For σ0 we simply use the log of the OLS estimate. The prior on A is calibrated using the residuals from the OLS regressions, $\hat{u}_t = A_0^{-1} \Sigma_0 \varepsilon_t$. Since A0 is lower triangular, we can get estimates of the coefficients in A by regressing $\hat{u}_{t,2}$ on $\hat{u}_{t,1}$, and $\hat{u}_{t,3}$ on $\hat{u}_{t,2}$ and $\hat{u}_{t,1}$. The regressions also provide estimates of $V(\hat{A}_{OLS})$. The priors for the hyperparameters are inverse-Wishart with scale matrices set to a fraction of the OLS covariance matrix of the respective parameters. So for Q, the scale matrix is $k_Q^2$ times the covariance of the OLS estimates for θ0 , times T0 . We set kQ = 0.025. With kQ = 0.025 our prior attributes 2.5% of the estimated total variation in parameters to time variation. This is a fairly conservative value, which lets the likelihood add variability if needed. T0 , the prior degrees of freedom, is set to 22, the minimum required for the prior to be proper (22=dim(Q)+1). We multiply the variance by T0 so that we have a scale matrix, as opposed to a covariance. Cogley and Sargent (2005) and Primiceri (2005) use similar values. For kW and kS I choose the same values as Primiceri (2005), i.e. 0.01 and 0.1 respectively. Some robustness checks are discussed in Appendix A. These values seem to be plausible for both data sets and the conclusions drawn below are not altered for alternative sensible specifications of the parameters.
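The calibration of the prior on θ0 and of the scale matrix for Q can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the training data are synthetic, the function name is hypothetical, and the covariance of the stacked OLS coefficients is taken to be the usual Kronecker product of the residual covariance with the inverse moment matrix of the regressors, which the paper does not spell out.

```python
import numpy as np

def var_ols_prior(Y, lags=2, k_Q=0.025):
    """OLS on a training sample -> prior mean, prior covariance, IW scale for Q."""
    T, n = Y.shape
    # Regressors: a constant plus `lags` lags of every variable.
    X = np.hstack([np.ones((T - lags, 1))] +
                  [Y[lags - j:T - j] for j in range(1, lags + 1)])
    Y_dep = Y[lags:]
    XtX_inv = np.linalg.inv(X.T @ X)
    B = XtX_inv @ X.T @ Y_dep            # (1 + n*lags) x n, one column per equation
    resid = Y_dep - X @ B
    dof = X.shape[0] - X.shape[1]
    Sigma_u = resid.T @ resid / dof      # residual covariance matrix
    theta_hat = B.T.reshape(-1)          # coefficients stacked equation by equation
    V_theta = np.kron(Sigma_u, XtX_inv)  # covariance of theta_hat (assumed form)
    T0 = theta_hat.size + 1              # dim(Q) + 1 degrees of freedom
    Q_scale = k_Q ** 2 * T0 * V_theta    # scale matrix of the prior on Q
    return theta_hat, V_theta, Q_scale, T0

# Hypothetical training sample: 36 annual observations on 3 variables.
rng = np.random.default_rng(0)
Y_train = rng.normal(size=(36, 3))
theta_hat, V_theta, Q_scale, T0 = var_ols_prior(Y_train)
```

With three variables, two lags and a constant there are 21 coefficients, so `T0` equals 22, matching the degrees of freedom stated in the text.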

5

Results from the Time Varying VAR

We now present the results from the estimated time varying VAR. As in Section 2, we use both the post war sample and the long sample. As described earlier, most of the data is available from Robert Shiller’s web site, and from FRED II. The only difference between the two samples (apart from the time span and frequency) is in the definition of the risk free rate. The post war quarterly sample uses the return on the three month T-bill, while the long annual sample uses the annual return on six month commercial paper. This different choice does not affect the dynamics of the equity premium. While we use both samples for the entire analysis, we focus on the annual sample to uncover movements in the last 75 years and relate this to the discussion on the declining equity premium. We then focus on the quarterly sample to further explore movements in the equity premium during the last four decades and to relate this to some recent literature on the premium and macroeconomic risk that uses the same data set. To estimate the VARs, we repeat the algorithm described above 24,000 times, dropping the first 4,000 draws (burn in period) and keeping one every two draws of the remaining 20,000 (thinning ratio). This yields a sample of 10,000 draws. We use posterior draws to compute expected returns and risk premia as detailed in Section 3. Before presenting the results, it is worth mentioning that our equity premium measures are conditional on the index used, the S&P 500. While it may be argued that the S&P 500 index is too narrow a measure for overall market performance, Campbell and Shiller (1988a) document striking similarities between the S&P 500 and the CRSP index over the period 1926-1986. The indices have a correlation coefficient of 0.985 in annual data. Their mean is 0.044 and 0.042 respectively, and the standard deviations are 0.200 and 0.208. Similar results are found for dividends and dividend-price ratio series. 
Also, the S&P 500 accounts for roughly 75% of U.S. securities by value.
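The bookkeeping of the simulation design described above (24,000 iterations, a burn-in of 4,000, and one draw kept out of every two) can be checked with a short Python snippet. The variable names are illustrative.

```python
n_total, burn_in, thin = 24_000, 4_000, 2

# Indices of the retained draws: skip the burn-in, then keep every second draw.
kept = range(burn_in, n_total, thin)
n_kept = len(kept)
```

Here `n_kept` evaluates to 10,000, the posterior sample size reported in the text; with the draws stored in an array, the same selection would be `draws[burn_in::thin]`.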

5.1

Time Variation in VAR Parameters

While our main focus is on the equity premium, and we view the VAR as a reduced form dynamic model for dividends, prices, and consumption, some of the parameters in the VAR equations have interesting and direct economic meaning, so we begin by describing the levels and dynamics of some of these coefficients. For brevity, we only present results for the long sample here, but the post war sample shows very similar characteristics.

Table 3: Constant Parameters VAR, Annual Data.

Equation      c               yt−1: (∆d − rf)   yt−1: ∆c        yt−1: δ         yt−2: (∆d − rf)   yt−2: ∆c        yt−2: δ
(∆d − rf)t    -0.02 [-0.24]   0.28 [2.96]       0.25 [1.35]     0.01 [0.17]     -0.16 [-1.75]     -0.19 [-1.16]   -0.02 [-0.24]
∆ct           0.01 [0.22]     -0.09 [-1.92]     0.28 [2.91]     0.15 [4.00]     -0.10 [-2.13]     0.12 [1.37]     -0.15 [-3.79]
δt            -0.19 [-1.45]   0.11 [0.82]       -0.24 [-0.92]   0.82 [7.93]     0.10 [0.75]       0.02 [0.08]     0.12 [1.18]

Notes: The table presents VAR estimates, with t-ratios reported in square brackets.

Table 3 presents estimates and t-statistics (in square brackets) of a constant parameter VAR (CP-VAR), which will be used as a comparison. Of particular interest is the equation for δt , the log dividend yield, for which a unit root could not be rejected and which showed substantial evidence of instability. Figure 1 shows the median of the VAR parameter on δt−1 over time, along with the median absolute error (mae). The corresponding coefficient for a constant parameter VAR is 0.82 (see fourth coefficient in the δt equation). When the parameter is allowed to vary, the coefficient is much lower, particularly for the earlier part of the sample. The median coefficient starts at about 0.6 (see left scale), and it is over 0.80 only in 1994. A substantial portion of the persistence in δt is thus attributable to parameter time variation. The graph of the median absolute error of the estimated coefficient over time (right scale) also shows that the coefficient is estimated quite precisely.

Figure 1: Autoregressive Coefficient of Log Dividend Yield.

[Plot, 1928-2003: med(θt ) on the left scale and mae(θt ) on the right scale, with a dashed line at 0.82.]

Notes: med(θt ) is the posterior median, and mae(θt ) is the median absolute error of the coefficient on δt−1 in the VAR equation for δt . The dashed line at 0.82 is the corresponding estimate in the CP-VAR model.

The equation for excess dividend growth (∆d − rf )t is effectively a reduced form model for fundamentals. The constant parameter model shows that only lagged values of excess dividends have forecasting power. Notice in particular that lagged values of δt are not statistically significant: the price dividend ratio does not seem to forecast one- or two-year-ahead dividend growth. This result is well known, and it is used to argue that because the dividend yield does not forecast future dividends, it must forecast expected returns.9 Figure 2 graphs the median coefficient of δt−1 in the equation for (∆d − rf )t , as well as its median absolute error. The coefficient starts at about 5 percent in 1928, five times larger than in the CP-VAR, but the median absolute error is about the same size. The mae(θ) then declines steadily while the median of θt increases rapidly: by 1956 the median is 8.5 percent, more than twice the mae. It keeps increasing after that, and it exceeds 10 percent from 1960 onward. Since the standard deviation of the log dividend yield is about 40 percent, a one-sigma increase in the dividend yield forecasts a 4 percent (0.10 × 0.40) increase in one-year-ahead dividend growth in excess of the risk free rate. This predictability seems quite strong, and suggests that modelling yt as a CP-VAR leads to a loss of information about future dividends. In particular, the linear model will attribute too much of the variation in δt to expected excess returns, thus overstating their predictability. An example of how parameter time variation in fundamentals may induce predictability in expected returns if not properly modelled is presented in Evans (1998). Evans' example assumes a recurrent Markov switching model for the parameters in the equation for (∆d − rf )t , but the idea generalizes to our setting.

Figure 2: Forecasting Power of Dividend Yield.

[Plot of med(θ) and mae(θ), 1928-2003; vertical axis from 0 to 0.16.]

Notes: med(θt ) is the posterior median, and mae(θt ) is the median absolute error of the coefficient on δt−1 in the VAR equation for (∆d − rf )t .

The CP-VAR equation of log consumption growth shows that lagged values of both consumption growth and the dividend yield have forecasting power. These results are roughly consistent with results in Hall (1978). Accounting for parameter time variation shows that lagged consumption growth has a greater effect on current consumption growth (the coefficient increases to 0.42-0.46), while the effect of δt−1 is reduced (the coefficient is between 10 and 6 percent, slowly declining over time).10 We now present results about the measures of expected excess returns, starting with low frequency movements of the equity premium, and then moving to higher frequency variation.

9 See Cochrane (2001) for a discussion.

5.2

The Declining Equity Premium

Our measures of the equity premium for the long sample of annual data are reported in Figure 3. Recall that in this case the equity premium is defined as expected excess returns on the S&P 500 relative to 6-month commercial paper. Figure 6 does the same for the post war quarterly sample, which uses the three-month T-bill as the risk-free rate. The first noticeable fact is the decline of the equity premium over the past 75 years. This is reflected in both our conditional and unconditional measures.

10 These results are not presented here, but are available upon request.


Figure 3: Expected Returns and Expected Excess Returns, Annual Data

[Plot of EPc and EPu, 1928-2003; vertical axis from 0 to 0.09.]

Notes: EPc is the posterior median of the conditional equity premium, and EPu is the posterior median of the unconditional equity premium, both for the long annual sample.

The sample mean of realized excess returns for the period 1928-2004 is 6.5%. Our measure of the unconditional equity premium is close to this value only for the period 1928-46, when it is constant at about 6.4%. These values correct for Jensen's inequality.11 Ex-post excess returns for the period 1928-46 average about 6%. From 1946 to 1971 we observe a continuous decline, which becomes sharper from 1963. In this sub-sample, the realized excess return is 8.6%, but the unconditional mean return moves from 6% to 3.5% (from 5% to about 2.5% using log-returns). This confirms the conjecture of both Fama and French (2002) and Jagannathan, McGrattan and Scherbina (2000) that ex-post returns give a distorted view of expected excess returns on equities and are a result of a declining equity premium. Similarly, notice that our measure of log expected excess returns stays constant at about 2.5% between 1971 and 1988, or 3.2% in terms of expected excess returns. The ex-post excess returns during the period average 3.2%. Succinctly, during periods of constant expected returns, ex-post returns are a better measure of expected returns than in periods of changing expected returns. The evidence of this negative correlation is summarized in Figure 4. This should warn us about the use of ex-post returns in equity valuation, a point also made by Jagannathan, McGrattan and Scherbina (2000) and Siegel (1999).

11 The VAR produces log excess returns. We correct our measure using the estimated variance of returns. The correction is on average 1.25%.

Figure 4: Ex-Ante and Ex-Post Excess Returns

[Plot of Ex-Post and EPc, 1928-2003; vertical axis from 0 to 0.09.]

Notes: EPc is the posterior median of the conditional equity premium, and Ex-Post is the average excess return realized between 1900 and time t.
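The Jensen's inequality correction mentioned in footnote 11 can be illustrated with a short Python sketch. The paper does not give the exact formula, so the snippet assumes the common lognormal adjustment, which adds half the variance of returns to the log premium; the function name and the numbers are hypothetical.

```python
def jensen_adjusted_premium(log_premium, return_variance):
    """Approximate simple-return premium: log premium plus half the return variance."""
    return log_premium + 0.5 * return_variance

# Illustrative numbers only: a 2.5% log premium and a return variance of 0.025
# (a standard deviation of roughly 15.8%) imply a correction of 1.25%.
adjusted = jensen_adjusted_premium(0.025, 0.025)
```

Under this assumption, the average correction of 1.25% quoted in footnote 11 would correspond to an average return variance of about 0.025.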

How can we explain the long run decline of the equity premium? Part of the high equity premium of the period 1928-46 can be explained by the turbulent years of the Great Depression. The feeling of aversion to the stock market generated by the volatile years during and after the Great Depression lasted until well after the war. The early thirties were indeed a period of higher volatility for both dividend growth and dividend yield, as can be seen from estimated volatilities (see Figure 5), particularly for dividend growth. Also, this is a period in which participation in the stock market was quite limited and mutual funds were not available to savers. This made it harder to share risks across people. Fear of a catastrophic event, limited participation, and costly diversification can explain the high EPc and EPu of the thirties and why the premium stayed high for so long. Investors in the 1930s could not know for certain that the U.S. would be the most successful capitalist country in history. Even a small probability of a catastrophic event like the Great Depression can generate a substantial premium, see Rietz (1988). Even with the economy getting out of the depression, investors' revision of beliefs about the economy could be very slow, given the size of the depression. This could explain the persistence and slow decline of the premium afterwards. This idea is pursued in a recent paper by Cogley and Sargent (2005), who argue that a large shock (such as the Great Depression) may cause a drastic change in agents' beliefs that is slow to reverse. Further, as agents learn and revise their beliefs, expected returns decline and prices increase, giving rise to the negative correlation between ex-ante and ex-post returns, as shown in Figure 4.12 As memories of the Great Depression started to fade, the premium gradually declined until 1971.

Figure 5: Volatility of Excess Dividend Growth, Consumption Growth, and log Dividend-Prices (Annual)

[Plot of σ(∆d − rf )t , σ(∆c)t , and σ(δ)t , 1928-2003, with separate left and right scales.]

Notes: The figure shows posterior medians of σ(δ)t , σ(∆c)t (left scale), and σ(∆d−rf )t (right scale).

The increased desirability of stocks over this period (and therefore the declining premium) can be further reinforced by the perception that the business cycle has become less severe over time. A measure of the severity of the business cycle is the conditional standard deviation of consumption growth. Macroeconomic risk measured this way increased between 1928 and 1946 (see Figure 5). After 1946, it declines until 1970, strongly supporting the idea of declining macro-risk. The unconditional equity premium is more or less constant in the period 1976-1988, though volatility of consumption growth keeps increasing until 1981. The premium then declines again from 1988 to 2002. Increased diversification from the availability of index mutual funds and other new financial instruments in the seventies offsets the increased volatility of consumption growth in the period 1976-1988 and, as a result, the premium remained more or less constant. The premium then declines again with lower uncertainty and greater opportunities for portfolio diversification.

A similar interpretation applies to our conditional measure EPc . Recall from its definition that EPc can be considered as an approximation to an average expected excess return on the stock market over a period of, say, 15-20 years, given the state of the economy at time t. EPc peaks in 1951 and declines more or less steadily from 1952 to 1973. It then peaks in 1975 and 1985 before starting to decline again. The seventies are a period of greater uncertainty (see Figures 5 and 7) and higher inflation, and as a consequence, EPc stops declining and it is about constant until 1988. EPc peaks in 1975 and stays high until 1985, at a level of about 4%. It then declines to historical lows. There is an increase in EPc in 1987-88, probably spurred by the brief shock of 1987, and the market uncertainty that surrounded it. Adjusting for Jensen's inequality, the unconditional equity premium in 2004 is about 4%, rather than the 6% recorded by Mehra and Prescott (1985).

The quarterly measures are basically a blow-up of the annual counterparts for the period 1961-2002, and tell a similar story. There is a sharp drop in the equity premium starting in 1994 which is not as pronounced in annual data. One possible explanation is a drastic regime shift in the payout policy of corporations at the end of the sample that is not well captured by our model. Grullon and Michaely (2002) report evidence for the period 1972-2000 that repurchases have become an important source of payout for U.S. corporations and that firms finance their share repurchases with funds that would otherwise be used to increase dividends. The repurchase tendency was reversed at the beginning of the 21st century, and both measures of the equity premium increase as a consequence afterwards.

12 Martin Weitzman (forthcoming) also emphasizes the importance of parameter uncertainty, instability and rare events as potential explanations of the historically large equity premium.


Figure 6: Expected Returns and Expected Excess Returns, Quarterly Data

[Plot of EPc and EPu, 1956-2001; vertical axis from -0.01 to 0.06.]

Notes: EPc is the posterior median of the conditional equity premium, and EPu is the posterior median of the unconditional equity premium, both for the post war quarterly sample.

5.3

Economic Importance of the Time Varying VAR

We presented statistical evidence of parameter variation in Section 2, where we conducted stability tests on VAR equations. A natural question is whether parameter time variation is important economically. We have already addressed this question partially in Section 5.1, where we compared the time varying VAR model (TVP-VAR) with the constant parameter VAR (CP-VAR). In this section we contrast our measure of the conditional equity premium EPc with the equity premium derived from (i) a CP-VAR, and (ii) a CP-VAR recursively estimated over the sample (Rec-VAR). We know from our statistical tests that the CP-VAR is rejected in favor of a model with unstable parameters. But modeling expected returns by means of a constant parameter VAR (either estimated over the full sample, or recursively estimated) is a much simpler way to calculate the cost of capital. So, how much do we lose by using a simple constant parameter VAR if the true model is statistically better represented by a VAR with evolving parameters? We now show that using a CP-VAR leads to substantial mispricing. Figure 8 plots our measure EPc,t with the analogous CP-VAR estimate and the Rec-VAR estimate. Figure 9 plots our unconditional estimates EPu,t with the unconditional estimate from the CP-VAR (a straight line), and the estimate from the recursively estimated VAR. First consider the conditional measures. The CP-VAR is much closer to our measure than is the Rec-VAR, and while all three measures seem to have some common short run movements, the mispricing of both the CP-VAR and the Rec-VAR measures relative to EPc,t is large and systematic. For the CP-VAR measure, the median absolute pricing error relative to EPc is 51 basis points, and the average absolute error is 62 basis points. Further, there are extended periods during which the difference from our EPc,t is greater than 50 basis points in either direction. See the figure between 1959 and 1965, from the late 1970s to 1985, and from the early 1990s onward. The high variability of the Rec-VAR measure of the equity premium, due to large revisions of parameter estimates over the sample, reflects the considerable amount of uncertainty in out-of-sample VAR forecasts. The use of the entire sample in calculating EPc,t and the CP-VAR measure considerably reduces this uncertainty. Figure 9 paints a similar picture; in this case too the unconditional equity premium from the Rec-VAR shows considerable variation, in addition to being negative for extended periods of time.

Figure 7: Volatility of Excess Dividend Growth, Consumption Growth, and log Dividend-Prices (Quarterly)

[Plot of σ(∆d − rf )t , σ(δ)t , and σ(∆c)t , 1953-1998, with separate left and right scales.]

Notes: The figure shows posterior medians of σ(δ)t , σ(∆d − rf )t (left scale), and σ(∆c)t (right scale).


Figure 8: Comparison of Conditional Equity Premium Measures

[Plot of CP-VAR, EPc, and Rec-VAR, 1956-2001; vertical axis from -0.04 to 0.08.]

Notes: The figure plots the posterior median of EPc,t along with expected excess returns from a constant parameter VAR (CP-VAR), and a recursively estimated VAR (Rec-VAR).

Figure 9: Comparison of Unconditional Equity Premium Measures

[Plot of CP-VAR, EPu, and Rec-VAR, 1956-2001; vertical axis from -0.04 to 0.08.]

Notes: The figure plots the posterior median of EPu,t along with unconditional expected excess returns from a constant parameter VAR (CP-VAR), and a recursively estimated VAR (Rec-VAR).

We now present a more complete comparison between the CP-VAR and the TVP-VAR using the posterior distribution of EPc,t . In Section 3 we derived our measure of expected returns by inferring it from the dynamic growth model: given the dividend price ratio and expected future dividend growth in excess of the risk free rate, we used the model to calculate expected excess returns. Under the null hypothesis that the TVP-VAR and the CP-VAR are equivalent, the pricing errors should be small and should not have a systematic component. If the two models are not equivalent, then the pricing errors will be large and will have a systematic component. For each draw from the parameter posterior, we use the time-varying parameters to calculate a history of EPc,t as in (3.3). Then we calculate a history of percent deviations of the constant parameter equity premium. Denoting by $\overline{EP}_t$ the conditional equity premium from a CP-VAR, we calculate a history of the following pricing errors:

\[
\hat{e}_t \equiv \frac{\overline{EP}_t - EP_{c,t}}{EP_{c,t}}.
\]

For each time series of the errors êt , we calculate the median of the absolute value, the median of the series, and the first autocorrelation. Then we calculate the 5%, 25%, 50%, 75%, and 95% quantiles of these statistics over the posterior draws. The quantiles are reported in Table 4. The median absolute error is large. The 5% quantile is 47%. That is, by using a CP-VAR, an analyst would make a median pricing error of at least 47% with 95% probability. The distribution of the median pricing error shows that at least 99% of the time the equity premium from the CP-VAR is too low: stocks are overpriced relative to safe T-bills. At least 75% of the time the mispricing exceeds 29%: the equity premium from a CP-VAR is 29% lower. Finally, the posterior distribution of the autocorrelation coefficient tells us that the pricing errors are positively correlated. While the distribution does include zero and a negative interval, the median autocorrelation coefficient is 25%, and the 75% quantile is 64%, and the correlation coefficient is positive 93% of the time.13
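The computation of the pricing-error statistics just described can be sketched in Python. This is a minimal illustration under stated assumptions: the posterior draws are replaced by synthetic numbers, the function name is hypothetical, and the first autocorrelation is computed as the sample correlation between each error series and its own first lag.

```python
import numpy as np

def pricing_error_stats(ep_tvp_draws, ep_cp):
    """ep_tvp_draws: (n_draws, T) histories of EP_c,t; ep_cp: (T,) CP-VAR premium."""
    # Pricing error of the CP-VAR relative to each posterior history.
    e = (ep_cp[None, :] - ep_tvp_draws) / ep_tvp_draws
    med_abs = np.median(np.abs(e), axis=1)
    med = np.median(e, axis=1)
    # First-order autocorrelation of each error history.
    a, b = e[:, 1:], e[:, :-1]
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    rho = (a * b).sum(axis=1) / np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))
    stats = np.column_stack([med_abs, med, rho])
    probs = [0.05, 0.25, 0.50, 0.75, 0.95]
    # Rows are quantiles over posterior draws, columns are the three statistics.
    return np.quantile(stats, probs, axis=0)

# Synthetic stand-ins for posterior draws and the CP-VAR premium (illustrative only).
rng = np.random.default_rng(0)
ep_tvp_draws = 0.04 + 0.005 * rng.normal(size=(500, 40))
ep_cp = np.full(40, 0.03)
table = pricing_error_stats(ep_tvp_draws, ep_cp)
```

The returned array has the same layout as Table 4, with one row per quantile and one column per statistic.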

13 In calculating the êt errors, we are assuming that the true model is the TVP-VAR here. Given the statistical evidence in Section 2 this seems appropriate. Since the TVP-VAR includes the CP-VAR as a special case, even if the true model were the CP-VAR, the errors should be small and should not contain a predictable component.

Table 4: Pricing Errors

Quantile    med(|êt|)    med(êt)    ρ(êt, êt−1)
q.05        0.47         -0.98      -0.03
q.25        0.63         -0.63      0.03
q.50        0.82         -0.45      0.25
q.75        0.98         -0.29      0.51
q.95        1.15         -0.06      0.79

Notes: med denotes the median over each time series, ρ(·) is the correlation coefficient, êt is defined in the text, and q.05 , . . . , q.95 denote quantiles of the posterior distributions of the statistics in the columns.

We take the results in Table 4 seriously as indicating that the constant parameter VAR is missing an important component of expected returns. This is consistent with the evidence discussed in Section 5.1 about the increased predictability of dividend growth, once we allow for parameter time variation.

5.4

Equity Premium and Macroeconomic Risk

In this section we explore the relationship between expected returns and selected macroeconomic variables. We summarize the co-movements of our measure of the equity premium and variables that should contain, or have been found to contain, information about the premium. We run exploratory regressions here, and while we recognize that the regressions may be subject to some measurement error and do not have a full structural interpretation, we claim that this is a useful exercise that provides some new empirical evidence. Further, the variables we include in the regressions are justified by existing literature that tries to build a bridge between the behavior of the stock market and macroeconomic risk, and the related literature on the predictability of stock returns. As some of the variables depend on our estimation procedure, they may have some complicated time series properties, so the standard errors are Newey-West autocorrelation-corrected for lags of ten periods in all the regressions discussed below. Also, we use posterior medians of estimated quantities such as equity premia and volatility of consumption growth, i.e. we do not conduct a fully Bayesian analysis. In other words, we view the prior on the parameters, and the posterior, as theoretical tools to obtain useful economic objects, without attaching to them subjective significance.14

We regress our conditional measure of the equity premium on inflation, dividend growth (in deviation from the time-varying unconditional mean), a measure of the consumption-wealth ratio in logs (cayt ), the payout ratio (dividends over earnings), and our estimate of the volatility of consumption growth (σt (∆ct+1 )). We are particularly interested in the relationship between the equity premium and the volatility of consumption growth, which we take as a measure of macroeconomic uncertainty. Results are presented in Table 5. We use both quarterly and annual data, though we focus our discussion on the quarterly data set, as it is the data set on which the literature on macroeconomic risk has mostly focused. To focus on short-run movements of the equity premium, we use EPc,t − EPu,t as the dependent variable; subtracting EPu,t filters out long-run movements in the equity premium, which also reduces the risk of spurious regressions. For completeness, the right panel of the table also presents regressions using EPc,t .15 The reason for including inflation in a regression of the equity premium comes from the large body of research which links inflation to the premium through the interaction of inflation and taxation, inflation and risk, or inflation and money illusion.16 The consumption-wealth ratio is the cayt variable computed by Lettau and Ludvigson (2001). Data on this variable, both annual and quarterly, can be downloaded from their webpages. The idea behind the variable is that deviations of consumption from a long-run trend with wealth should contain information about expected returns. Under some assumptions, Lettau and Ludvigson compute the variable from a co-integrating relationship between consumption and wealth derived from the intertemporal budget constraint.
It is plausible to think that the payout ratio should predict expected returns. Payout ratios move in response to cyclical variation in earnings and to permanent changes in future expected growth. If payout ratios are higher because companies anticipate higher future growth, then this information should be reflected in prices and expected returns.

14 This is consistent with the discussion in Bickel and Doksum (2000).

15 Notice that while our estimation procedure imposes that innovations to volatilities are uncorrelated with innovations to autoregressive parameters, our measure of the equity premium is a nonlinear function of the autoregressive parameters and the data, thus it will not, in general, be uncorrelated with σ(∆c). Our assumption greatly simplifies estimation and we see no a priori reason to believe that the assumption would bias the results in any particular way.

16 That investors may be comparing nominal rates on bonds to dividend yields for stocks was argued by Modigliani and Cohn (1979), who predicted that if inflation came down, the equity premium would also decrease. The prediction may have held in the eighties, during which both inflation and the premium declined.

Table 5: Equity Premium and Fundamentals

Quarterly Regressions

                       EPc − EPu                          EPc
Variable     Coeff.    Std.Err.  t-ratio      Coeff.     Std.Err.  t-ratio
const.      -0.0222    0.0052    -4.31       -0.02510    0.0135    -1.86
πt           0.1157    0.0423     2.73        0.26638    0.1278     2.08
∆dt         -0.0062    0.0133    -0.46        0.04453    0.0241     1.85
cayt         0.1310    0.0375     3.50        0.41302    0.1075     3.84
payout      -0.0072    0.0085    -0.84        0.03929    0.0244     1.61
σt (∆c)      0.5565    0.1945     2.86        0.73421    0.5165     1.42
R2           0.66                             0.61

Annual Regressions

                       EPc − EPu                          EPc
Variable     Coeff.    Std.Err.  t-ratio      Coeff.     Std.Err.  t-ratio
const.      -0.0293    0.0079    -3.71       -0.0290     0.0208    -1.40
πt           0.0162    0.0430     0.38       -0.1582     0.0811    -1.95
∆dt          0.0097    0.0090     1.07        0.0253     0.0117     2.16
cayt         8.5E-06   3.6E-06    2.36        6.5E-06    8.2E-06    0.79
payout       0.0167    0.0065     2.55        0.0415     0.0160     2.60
σt (∆c)      0.6405    0.1615     3.97        1.1953     0.2856     4.18
R2           0.33                             0.64

Notes: EPc is the conditional equity premium. EPu is the unconditional equity premium. It is subtracted from EPc in the left panel to filter out a low-frequency component from the conditional equity premium. πt is inflation, ∆dt is dividend growth, and cayt is the consumption-wealth ratio variable of Lettau and Ludvigson (2001). The variable payout is the ratio of dividends to earnings. σt (∆c) is the standard deviation of consumption growth conditional on time t. Standard errors and t-ratios are 10-lag autocorrelation consistent.

The R² from the quarterly regression of EPc − EPu is high, above 60%. The effect of consumption-growth volatility on short-run movements of our measure of the equity premium is positive and significant. The coefficient is also economically significant: a 1% increase in the volatility of consumption growth implies a 0.56% increase in the conditional equity premium EPc relative to EPu .17 Evidence on inflation is mixed: its coefficient is positive and significant in quarterly data, but not significant in annual data. The payout ratio is not significant in the quarterly regression of EPc − EPu , but it is significant in annual data and significant at the 10% level in the quarterly regression of EPc . This is consistent with the idea that low-frequency movements in the payout ratio forecast future returns, whereas cyclical fluctuations do not, as they are mostly driven by cyclical variation in earnings. The cayt variable is positive and significant in both data sets, corroborating the results in Lettau and Ludvigson (2001). Notice that while cay should capture information about the level of permanent income, the volatility of consumption growth should capture information about agents’ uncertainty about permanent income. Our results indicate that both dimensions are important determinants of the equity premium.

Because our measure of the equity premium is a long-horizon measure, it is important to emphasize that our findings imply that the right-hand-side variables have long-horizon predictive power for equity premia. Thus, our findings are directly comparable to the literature on return predictability, which focuses on long-horizon regressions.
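The significance statements above can be checked directly from the coefficients and standard errors reported in Table 5 for the EPc − EPu regressions. The short Python sketch below recomputes the t-ratios as coefficient over standard error. The 5% two-sided critical value of 1.96 is an assumption of this example (an asymptotic normal approximation), and the dictionary keys are labels chosen here for readability.

```python
# (coefficient, standard error) pairs copied from Table 5,
# dependent variable EPc - EPu.
QUARTERLY = {
    "inflation": (0.1157, 0.0423),
    "dividend growth": (-0.0062, 0.0133),
    "cay": (0.1310, 0.0375),
    "payout": (-0.0072, 0.0085),
    "consumption volatility": (0.5565, 0.1945),
}
ANNUAL = {
    "inflation": (0.0162, 0.0430),
    "dividend growth": (0.0097, 0.0090),
    "cay": (8.5e-06, 3.6e-06),
    "payout": (0.0167, 0.0065),
    "consumption volatility": (0.6405, 0.1615),
}

CRIT_5PCT = 1.96  # assumption: asymptotic two-sided 5% critical value


def t_ratio(coef, se):
    """t-ratio as coefficient divided by its standard error."""
    return coef / se


def is_significant(coef, se, crit=CRIT_5PCT):
    """True when the absolute t-ratio exceeds the critical value."""
    return abs(t_ratio(coef, se)) > crit


for name in QUARTERLY:
    q = is_significant(*QUARTERLY[name])
    a = is_significant(*ANNUAL[name])
    print(f"{name:24s} quarterly: {q!s:5s} annual: {a!s:5s}")
```

Small differences between the recomputed and the tabulated t-ratios are to be expected, since the table reports rounded coefficients and standard errors.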

17 These values are annualized.

6

Summary and Conclusions

This paper estimates expected returns and expected equity premia in U.S. stock markets using a valuation formula and allowing both the simultaneous relation between dividend growth, consumption growth, and the dividend-price ratio, and their dynamic lag structure, to vary over time. We motivate the importance of time-varying parameters both theoretically and statistically. In particular, the paper focuses on the size and movements of the equity premium over the last 75 years, and on the relationship between the equity premium and sources of macroeconomic risk. We extend and confirm previous work on the declining equity premium and run exploratory data analysis in search of clues about factors determining movements in the equity premium.

We find that the equity premium has declined and that it is now much closer to levels predicted by standard consumption models. This implies that asset pricing models that aim at understanding the relationship between risk and returns should try to match an equity premium closer to 4% than to 6% when replicating aggregate stock market features. De Santis (2007) shows that a simple generalization of Mehra and Prescott’s (1985) model can generate a premium of 6% with a relative risk aversion as low as 5. The model can explain a 4% equity premium with a relative risk aversion slightly above 3, a very plausible value.

A low estimate for the equity premium has important consequences for portfolio allocation decisions, the cost of capital, and how much of Social Security funds should be put into stocks. As one explanation for the lower premium, Jagannathan, McGrattan, and Scherbina (2000) point to institutional changes that occurred in the U.S. in the last 30 years, technological improvements in particular. Other reasons for a lower premium may include greater opportunities for portfolio diversification. Since the 1970s, there have been enormous changes in the financial instruments available to the public: think of money market funds, floating-rate notes, index mutual funds, emerging market funds, equity REITs, zero-coupon bonds, S&P index futures and options, and many more. Better risk sharing has reduced individual risk, and possibly aggregate risk, thus making the economy safer, as indicated by our measures of consumption volatility.

Our second finding is a statistically and economically significant relationship between the premium and the volatility of consumption growth, which underscores the role of consumption growth for risk. Consumption growth has been given a relatively low weight in asset pricing because it is thought to be too smooth and too close to i.i.d. in quarterly data (thus the equity premium puzzle). As Bansal and Yaron (2004) show, a small persistent component in the volatility of consumption growth can be hardly detectable in the data and yet have important implications for asset prices. Our results indicate the presence of persistent, and economically meaningful, time-varying conditional volatility of consumption growth.

As an added bonus from our flexible approach, we show evidence that accounting for parameter time variation is important for forecasting dividend growth, and therefore expected returns. Relatedly, we show that estimating expected returns and excess returns using a constant-parameter VAR induces substantial mispricing.

A

Convergence and Robustness

A.1

Convergence Checks

I perform the usual informal checks (changing the starting point of the Markov chain, the number of draws, the burn-in period and the “thinning” ratio) and some more formal ones, such as the Geweke (1992) $\chi^2$ convergence diagnostic (CD) and relative numerical efficiency (RNE), and a check based on the standardized CUSUM statistic proposed by Yu and Mykland (1994).18 The $\chi^2$ statistic of Geweke compares the estimate of a posterior mean from the first $N_A$ draws with the estimate from the last $N_B$ draws of the chain. If the two subsamples are well separated, they should be independent. If the number of draws is sufficiently large, the following statistic is distributed $\chi^2$ with one degree of freedom:

\[ CD(\theta) = \frac{(\theta_A - \theta_B)^2}{nse_A^2 + nse_B^2}, \]

where $\theta_i$ is the estimate of the posterior mean of the parameter, and $nse_i^2$ is the square of its numerical standard error, formed from subsample $i$. In other words, $\theta_i$ is the sample mean of the Monte Carlo draws, and $nse_i^2$ is an estimate of the variance of the sample mean. Given that the Monte Carlo sample is a Markov chain, it is not an independent sample, and the variance of the sample mean is a function of the variance and autocovariances of the process generating the draws. The variance of the sample mean is therefore estimated with a spectral estimator evaluated at zero, $S_\theta(0)$. In this application, the number of draws is 10,000, $N_A$ is the first 2,000 draws, $N_B$ is the last 5,000 draws, and the variance of the two sample means is estimated using Newey-West weights on a number of autocovariances equal to 10% of $N_i$. The Newey-West estimator corresponds to the Bartlett spectral estimator (see Hamilton, 1994, p. 167).

18 Gauss routines that perform the checks described in this section are available upon request.

Geweke’s RNE is a measure of the efficiency of the algorithm relative to the i.i.d. case. If the Monte Carlo sample is i.i.d., the variance of the sample mean is the variance of the population divided by the sample size, say $\gamma_0/N$. Given the dependence of the draws, the variance is instead $2\pi S_\theta(0)/N = (\gamma_0 + 2\sum_{j \ge 1} \gamma_j)/N$. The relative numerical efficiency is

\[ RNE(\theta) = \frac{\gamma_0}{2\pi S_\theta(0)} = \Big(1 + 2\sum_{j \ge 1} \rho_j\Big)^{-1}, \]

where $\rho_j = \gamma_j/\gamma_0$ is the autocorrelation of the draws at lag $j$. Notice that this statistic is not bounded between zero and one, and values greater than one indicate that the variance of the sample mean in the chain is smaller than in the i.i.d. case. This is desirable and means that convergence can be achieved with a relatively smaller number of draws.

Given the $N$ draws, a standardized version of the Yu and Mykland (1994) convergence check is the statistic

\[ CS_t(\theta) = \frac{\frac{1}{t}\sum_{n=1}^{t} \theta^{(n)} - \mu_\theta}{\sigma_\theta}, \]

where $\mu_\theta$ and $\sigma_\theta$ are the empirical mean and standard deviation of the $N$ draws. If the Markov chain converges, the graph of $CS_t(\theta)$ against $t$ should converge smoothly towards zero. This statistic is computed for randomly selected parameters.

Results for $CD(\theta)$ and $RNE(\theta)$ are reported in Table 6. They are quite satisfactory: the algorithm is efficient according to RNE for most sets of parameters, although not as efficient in the case of the hyperparameters. A value above 0.5 for the RNE is very satisfactory. For the matrix $V$, the value of 0.06 is low, but acceptable, given the high number of iterations in the Markov chain. The value is similar to that in Primiceri (2005), who reports $1/RNE$ equal to 18 for the hyperparameters, which corresponds to an RNE of 0.056. A value of $RNE = 0.06$ means that the algorithm requires 16.7 times more iterations than the i.i.d. case for a given level of precision. Given that our results are unchanged by changes in the starting points of the chain and by a greater number of draws, we believe that 10,000 draws are enough and conclude that the algorithm is converging to the ergodic distribution. This is confirmed by the CD statistics and by graphs of randomly selected $CS_t$ (not reported). The CD statistics are all well below the $\chi^2_1$ critical value of 3.84. Medians are between 0.36 and 0.77 and convergence is not rejected for even a single parameter.
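The three diagnostics described in this subsection can be sketched in a few lines of Python for a single chain of scalar draws. This is not the Gauss code mentioned in footnote 18: the function names, the use of NumPy, the choice of the truncation lag as the integer part of 10% of the (sub)sample size, and the simulated AR(1) chain are assumptions made for this example.

```python
import numpy as np


def long_run_variance(x, frac=0.10):
    """Bartlett (Newey-West) estimate of gamma_0 + 2 * sum_j gamma_j,
    i.e. 2*pi*S(0), using int(frac * len(x)) autocovariances."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    n_lags = int(frac * n)
    lrv = d @ d / n                                   # gamma_0
    for j in range(1, n_lags + 1):
        gamma_j = d[j:] @ d[:-j] / n                  # autocovariance at lag j
        lrv += 2.0 * (1.0 - j / (n_lags + 1.0)) * gamma_j
    return lrv


def geweke_cd(draws, n_a=2000, n_b=5000):
    """Squared difference of subsample means over the sum of squared
    numerical standard errors (first n_a draws vs last n_b draws)."""
    draws = np.asarray(draws, dtype=float)
    a, b = draws[:n_a], draws[-n_b:]
    nse2_a = long_run_variance(a) / a.size
    nse2_b = long_run_variance(b) / b.size
    return (a.mean() - b.mean()) ** 2 / (nse2_a + nse2_b)


def rne(draws):
    """Relative numerical efficiency: gamma_0 / (2*pi*S(0))."""
    draws = np.asarray(draws, dtype=float)
    return draws.var() / long_run_variance(draws)


def cusum_path(draws):
    """Standardized running mean CS_t for t = 1, ..., N."""
    draws = np.asarray(draws, dtype=float)
    t = np.arange(1, draws.size + 1)
    return (np.cumsum(draws) / t - draws.mean()) / draws.std()


# Hypothetical chain: 10,000 draws from a persistent AR(1) process.
rng = np.random.default_rng(0)
chain = np.zeros(10_000)
for i in range(1, chain.size):
    chain[i] = 0.9 * chain[i - 1] + rng.normal()

print("CD :", geweke_cd(chain))
print("RNE:", rne(chain))
print("last CS_t:", cusum_path(chain)[-1])
```

By construction the CUSUM path ends at zero, so the informative part of the check is how smoothly the path approaches zero; for a persistent chain such as the simulated one, the RNE falls well below one, in line with the interpretation of $1/RNE$ as the multiple of i.i.d. draws required for a given precision.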

A.2

Robustness to Prior Calibration

Robustness checks are conducted for the parameters kQ , kS , kW . Given the large number of parameters, the robustness checks are based on the long-run values of the VAR, µt|T , and on the impact on the diagnostic statistics of the Markov chain. Varying kQ does not affect the behavior of the long-run values of the VAR variables, unless extreme values are used. With values of the same order of magnitude as the benchmark level (0.025), from 0.01 to 0.05, neither the long-run posteriors nor the diagnostic statistics are affected. The long-run trajectory of dividend growth is affected by high values, say greater than 0.1, and large values of kQ produce implausible values for the long-run trajectory of the dividend yield. The long-run value of dividend growth becomes negative at some dates, meaning that time variation in the parameters captures noise. The condition on unit-root parameters does not affect the amount of time variation in the θ’s in an appreciable way (if anything, it reduces it), and it does not affect time variation in Ωt . Changing kW or kS does not have any particular impact on the behavior of the long-run values or on the convergence properties of the algorithm. The most important parameter is clearly kQ . The reason is that Q affects the amount of time variation and is a large matrix, possibly singular. The fact that the model behaves sensibly with a value of kQ close to what is used in other research with quarterly data (Cogley and Sargent, 2005, and Primiceri, 2005) is taken as a good sign.

Table 6: Convergence Diagnostics

20th Autocorrelation
Parameter block   Median      Mean        Min         Max        10%         90%
θT                0.01189     0.01213    -0.01817     0.05877   -0.0052      0.02846
A                 0.01306     0.01827    -0.02812     0.08148   -0.00185     0.04305
Σ                -0.00126    -0.0011     -0.02842     0.02842   -0.01443     0.01156
V                 0.11797     0.12056    -0.01664     0.42024    0.05679     0.18159

RNE
Parameter block   Median      Mean        Min         Max        10%         90%
θT                0.61489     0.65593     0.14499     2.45819    0.29221     1.08640
A                 0.67781     0.63823     0.11269     1.82291    0.16013     1.24765
Σ                 0.99855     1.09247     0.38967     3.17533    0.58277     1.69928
V                 0.05998     0.08196     0.01341     1.0552     0.03679     0.11555

CD
Parameter block   Median      Mean        Min         Max        10%         90%
θT                0.51137     1.04140     3E-08       2.45819    0.02275     2.92498
A                 0.36281     0.94815     1.79E-06    1.82291    0.01137     2.74856
Σ                 0.76759     1.57906     6.63E-06    3.17533    0.03186     4.02212
V                 0.50669     1.16395     0.00017     1.0552     0.01355     2.80397

Notes: RNE is the Geweke (1992) relative numerical efficiency, a measure of the efficiency of the algorithm. CD is the Geweke (1992) $\chi^2_1$ convergence diagnostic.

References

Bai, J. and P. Perron (1998): “Estimating and testing linear models with multiple structural changes,” Econometrica, 66, 47–78.
Bansal, R., V. Khatchatrian, and A. Yaron (2005): “Interpretable asset markets?” European Economic Review, 49, 531–560.
Bansal, R. and A. Yaron (2004): “Risks for the long run: A potential resolution of asset pricing puzzles,” Journal of Finance, 59, 1491–1509.
Bickel, P. J. and K. A. Doksum (2000): Mathematical Statistics: Basic Ideas and Selected Topics, volume I, Prentice Hall, 2nd edition.
Blanchard, O. (1993): “Movements in the equity premium,” Brookings Papers on Economic Activity, 75–138.
Campbell, J. Y. and R. Shiller (1988a): “The dividend-price ratio and expectations of future dividends and discount factors,” Review of Financial Studies, 1, 195–227.
Campbell, J. Y. and R. Shiller (1988b): “Stock prices, earnings, and expected dividends,” Journal of Finance, 43, 661–676.
Cogley, T. and T. Sargent (2005): “Drifts and volatilities: Monetary policies and outcomes in the post WWII U.S.” Review of Economic Dynamics, 8.
Cogley, T. and T. Sargent (March 2005): “The market price of risk and the equity premium: A legacy of the great depression?” Manuscript, 32 pages.
De Santis, M. (2007): “Demystifying the equity premium,” Dartmouth College, manuscript.
Elliott, G. and U. K. Muller (2006): “Efficient tests for general persistent time variation in regression coefficients,” Review of Economic Studies, 73, 907–940.
Evans, M. D. (1998): “Dividend variability and stock market swings,” The Review of Economic Studies, 65, 711–740.
Fama, E. F. and K. R. French (2002): “The equity premium,” The Journal of Finance, 57, 637–659.
Fruhwirth-Schnatter, S. (1994): “Data augmentation and dynamic linear models,” Journal of Time Series Analysis, 15, 183–202.
Grullon, G. and R. Michaely (2002): “Dividends, share repurchases, and the substitution hypothesis,” Journal of Finance, 57, 1649–1684.
Hall, R. (1978): “Stochastic implications of the life cycle permanent income hypothesis: Theory and evidence,” Journal of Political Economy, 86, 971–987.
Hamilton, J. (1994): Time Series Analysis, Princeton University Press.
Hansen, B. E. (1992): “Testing for parameter instability in linear models,” Journal of Policy Modeling, 14, 517–533.
Jagannathan, R., E. McGrattan, and A. Scherbina (2000): “The declining U.S. equity premium,” Federal Reserve Bank of Minneapolis Quarterly Review, 24, 3–19.
Kim, S., N. Shephard, and S. Chib (1998): “Stochastic volatility: Likelihood inference and comparison with ARCH models,” Review of Economic Studies, 65, 361–393.
Lettau, M. and S. Ludvigson (2001a): “Consumption, aggregate wealth, and expected stock returns,” Journal of Finance, 56, 815–849.
Lettau, M., S. C. Ludvigson, and J. A. Wachter (Forthcoming): “The declining equity premium: What role does macroeconomic risk play?” Review of Financial Studies.
Mehra, R. and E. Prescott (1985): “The equity premium: A puzzle,” Journal of Monetary Economics, 15, 145–162.
Meinhold, R. and N. Singpurwalla (1983): “Understanding the Kalman filter,” The American Statistician, 37, 123–127.
Parker, J. A. and C. Julliard (2005): “Consumption risk and the cross section of stock returns,” Journal of Political Economy, 113, 185–222.
Pastor, L. and R. Stambaugh (2001): “The equity premium and structural breaks,” The Journal of Finance, 56, 1207–1245.
Primiceri, G. E. (2005): “Time varying structural vector autoregression and monetary policy,” The Review of Economic Studies, 72.
Rietz, T. (1988): “The equity risk premium: A solution?” Journal of Monetary Economics, 21, 117–132.
Siegel, J. J. (1999): “The shrinking equity premium,” Journal of Portfolio Management, 26, 10–17.
Smith, M. and R. Kohn (2002): “Parsimonious covariance matrix estimation for longitudinal data,” Journal of the American Statistical Association, 97, 1141–1153.
Stock, J. and M. Watson (1998): “Asymptotically median unbiased estimation of coefficient variance in a time varying parameter model,” Journal of the American Statistical Association, 93, 349–358.
Timmermann, A. (2001): “Structural breaks, incomplete information and stock prices,” Journal of Business and Economic Statistics, 19, 299–314.
Weitzman, M. L. (forthcoming): “Prior-sensitive expectations and asset-return puzzles,” American Economic Review.
