How Sensitive are Sales Prices to Online Price Estimates in the Real Estate Market?

Yong Suk Lee and Yuya Sasaki∗

March 12, 2014

Abstract

This paper investigates how sensitive sales prices are to online price estimates in the real estate market. In our preliminary national and MSA-level analysis, we reject, in most MSAs and at the national level, the null hypothesis that online price estimates do not Granger cause actual transaction prices. This macro-level evidence is followed by microeconometric analysis using house-level data. To account for correlated house-specific unobservables, we use the differences between listing prices and online price estimates as proxies to form a partially linear model. To account for correlated neighborhood-specific unobservables, we use neighborhood first differencing. Using house price estimates and sales prices collected from Zillow.com, we find that the elasticity of sales price with respect to the Zillow estimate is one, controlling for the aforementioned unobservables as well as observed house attributes. Our results imply that online price estimates can have a large impact on real estate price dynamics.

Keywords: real estate pricing, online price estimates, hedonic valuation, neighborhood panel data, proxies

JEL Codes: D82, R21, R31, R32

∗ Lee: Williams College; Sasaki: Johns Hopkins University. The authors thank Danny Guo and Simmon Kim for data collection and research assistance.

1 Introduction

Like other types of assets, the price of real estate is determined by the observed and unobserved attributes of the asset. Houses, especially single family houses, exhibit unobserved heterogeneity across various dimensions. Same sized bedrooms can be valued differently depending on the location of the window. The topography of same sized lots can affect the value of the property. Neighborhood amenities, like nearby schools and parks, are important determinants of property prices. While no two houses in general are alike, houses have been priced based on what appraisers or brokers refer to as comparables: observationally similar houses that were recently sold in the same or a nearby neighborhood. Pricing adjustments are made to reflect the differences between the house of interest and the comparable houses. In other words, the pricing of real estate takes into account the information of other real estate prices.

With the advancement of the internet, one can easily search sales price information for a large number of properties. Furthermore, there are online services that provide their own property estimates for free based on the property and neighborhood attributes, as well as the sales prices of comparable properties. Does the availability of such price information impact actual sales prices? How large is the extent of this impact?

Economists have long been interested in how information, or the lack of information, impacts asset prices. Easley and O’Hara (1987, 2004) show that large trades in the securities market reflect better information and impact security prices, and that investors demand higher returns on stocks for which there is less public information. Researchers have found evidence that information impacts sales prices in the real estate market as well. Levitt and Syverson (2008) show that informational advantage translates to higher sales prices by examining properties owned by real estate brokers. They find that realtors sell their own houses at about 4 percent higher prices. The recent financial crisis has triggered interest in the impact of foreclosures on house prices. Foreclosures can potentially impact the price of non-foreclosure houses by conveying new information about unobserved neighborhood attributes, or more directly by being included as comparables. Campbell et al. (2011) find that foreclosed homes

lower prices of nearby houses by about 1 percent.¹ The real estate market is prone to imperfect information, and market participants will likely value information that can help price the asset. Sellers may generally have an informational advantage over buyers about the condition of properties.² In many cases, the buyer and seller may simply not know the exact values of certain heterogeneous components of the property. In this paper, we analyze the value of real estate price information and estimate how sensitive sales prices are to online price estimates in the real estate market.

In order to estimate the impact of information on prices in the real estate market, we propose a reduced-form pricing equation as a convex combination of the third-party price estimate and the self valuation of a property. The main challenge for estimation is to control for the unobserved house and neighborhood attributes in the model. We propose a method that nonparametrically proxies for unobserved house-specific attributes by using the difference between the listing price and the online price estimate. Our method is similar to the approaches used in the production literature, where researchers have used observed inputs and investments to nonparametrically proxy for unobserved technologies. We also control for unobserved neighborhood attributes by first differencing properties within the same neighborhood. The literature has dealt with unobserved area-specific attributes by using boundary discontinuities (Black 1999, Bayer et al. 2005) or quasi-experimental research designs (Chay and Greenstone 2005). Bajari et al. (2012) propose a method that relies less on the research design and more on the structural assumption that prior sales prices can be used to control for time-varying unobservable attributes in a hedonic regression. Similarly, we estimate the extended hedonic model by relying on the structural assumption that prior list prices contain unobserved house-specific information and on the data requirement of having at least two properties per neighborhood.

We collect home value estimates, list prices, sales prices, and house and neighborhood attributes from Zillow.com, an online real estate information provider, for 1,200 houses across 30 Metropolitan Statistical Areas (MSAs) in the US. We find that the elasticity of sales prices with respect to the Zillow home price estimates is one. The results are robust to how we calculate the proxy variable that controls for unobserved house attributes. These results imply that online price estimates can have a big impact on real estate price dynamics. Additionally, we explore possible factors that might explain the variation in the elasticity estimates across the 30 MSAs. The share of the population with a bachelor's degree or above significantly explains the variation in the elasticity estimates across MSAs.

The paper is organized as follows. Section 2 presents the MSA level analysis. Section 3 presents the microeconometric model and its estimation strategies. Section 4 explains the house level data collected from the real estate website Zillow.com. In Section 5, we present our elasticity estimates. Section 6 concludes and discusses the implications.

¹ Real life examples of markets for information, like car reports for used cars or online reviews for restaurants, more directly speak to the value people put on information.

² The seller may know of a problem that may not be apparent to one who has not lived in the house for multiple days or even to a home inspector (e.g. seasonal drafts, neighbor issues, etc.).

2 The MSA Level Analysis

As a preliminary step, we first examine the hypothesis that online property price estimates impact actual sales prices at the aggregate level. If real estate price information directly impacts house prices, we expect the relation to hold at an aggregate level as well. Specifically, we test whether Zillow’s median price estimates Granger cause the median sales price as reported by Zillow across 30 MSAs in the US. The 30 MSAs were chosen based on Zillow’s MSA level report and the availability of individual sales price information.³ Table 1 lists the 30 MSAs and the summary statistics of the median sales price and Zillow estimates for three bedroom single family houses. The MSA level data is available from Zillow’s research division, and we collect monthly data from October 2008 to April 2013.⁴ The following two subsections introduce the empirical methodology for the MSA level analysis, and the third subsection presents empirical results.

³ Section 4 describes the selection of MSAs in more detail.

⁴ The data is available at http://www.zillow.com/blog/research/data/

2.1 Granger Causality in VAR

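For each MSA, we denote Zillow’s median log house price estimate at time $t$ by $Z_t$. The median log sales price at time $t$ is denoted by $Y_t$. We assume that they jointly follow the $p$-th order vector autoregressive (VAR($p$)) process:

\[
\begin{pmatrix} Z_t \\ Y_t \end{pmatrix}
=
\begin{pmatrix} A_{0,1} \\ A_{0,2} \end{pmatrix}
+ \sum_{q=1}^{p}
\begin{pmatrix} A_{q,1,1} & A_{q,1,2} \\ A_{q,2,1} & A_{q,2,2} \end{pmatrix}
\begin{pmatrix} Z_{t-q} \\ Y_{t-q} \end{pmatrix}
+
\begin{pmatrix} \varepsilon_{t,1} \\ \varepsilon_{t,2} \end{pmatrix}
\tag{2.1}
\]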

We say that $Z_t$ does not Granger cause $Y_t$ if $A_{q,2,1} = 0$ for all $q = 1, \dots, p$. A test of this null hypothesis can be conducted by the Wald test on $(A_{1,2,1}, \dots, A_{p,2,1})$. Let $A_2 = (A_{0,2}, A_{1,2,1}, A_{1,2,2}, \dots, A_{p,2,1}, A_{p,2,2})$ be the $(2p+1)$-dimensional vector of the coefficients in the second row of the above VAR model (2.1). Let $\hat{A}_2$ denote its consistent estimate, and let $\hat{\Sigma}_{A_2}$ denote a consistent estimate of the variance matrix of the coefficient estimate $\hat{A}_2$. The Wald statistic is computed by

\[
\hat{W} = \hat{A}_2' R' \left( R \hat{\Sigma}_{A_2} R' \right)^{-1} R \hat{A}_2 ,
\]

where $R$ is the $p \times (2p+1)$ restriction matrix whose $2r$-th column is one for each row $r = 1, \dots, p$, and all the other elements are zero. Under the null hypothesis $H_0 : R A_2 = \vec{0}$, this Wald statistic $\hat{W}$ asymptotically follows the chi-square distribution with $p$ degrees of freedom. We report this statistic and the associated p-value for the test of Granger causality.
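The Wald test just described can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the series are simulated rather than taken from Zillow, the variance matrix uses the textbook homoskedastic OLS formula (the paper does not say which variance estimator it uses), and the helper names `lagged_design`, `chi2_sf` and `granger_wald` are hypothetical.

```python
import math
import numpy as np

def lagged_design(z, y, p):
    """Rows (1, Z_{t-1}, Y_{t-1}, ..., Z_{t-p}, Y_{t-p}) for t = p+1, ..., T."""
    T = len(y)
    cols = [np.ones(T - p)]
    for q in range(1, p + 1):
        cols += [z[p - q:T - q], y[p - q:T - q]]
    return np.column_stack(cols)

def chi2_sf(x, k):
    """Upper tail probability of a chi-square variable with k degrees of freedom."""
    if x <= 0:
        return 1.0
    h = x / 2.0
    a, q = (1.0, math.exp(-h)) if k % 2 == 0 else (0.5, math.erfc(math.sqrt(h)))
    while a < k / 2.0 - 1e-9:
        # Recurrence Q(a + 1, h) = Q(a, h) + h^a e^{-h} / Gamma(a + 1)
        q += h ** a * math.exp(-h) / math.gamma(a + 1.0)
        a += 1.0
    return q

def granger_wald(z, y, p):
    """Wald statistic and p-value for H0: the p lags of z do not enter the y equation."""
    X = lagged_design(z, y, p)
    target = y[p:]
    coef = np.linalg.lstsq(X, target, rcond=None)[0]   # estimate of A_2
    resid = target - X @ coef
    sigma2 = resid @ resid / (len(target) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)              # assumption: homoskedastic errors
    R = np.zeros((p, 2 * p + 1))
    for r in range(1, p + 1):
        R[r - 1, 2 * r - 1] = 1.0                      # 2r-th column (1-based) selects A_{r,2,1}
    Ra = R @ coef
    W = float(Ra @ np.linalg.solve(R @ cov @ R.T, Ra))
    return W, chi2_sf(W, p)

# Simulated example in which z does Granger cause y.
rng = np.random.default_rng(0)
T = 200
z, y = np.zeros(T), np.zeros(T)
for t in range(1, T):
    z[t] = 0.5 * z[t - 1] + rng.normal()
    y[t] = 0.3 * y[t - 1] + 0.8 * z[t - 1] + rng.normal()

W, p_value = granger_wald(z, y, p=2)
print(f"Wald statistic = {W:.2f}, p-value = {p_value:.4f}")
```

With real data, `z` and `y` would be the monthly median log Zillow estimate and median log sales price for one MSA, and the test would be repeated for each MSA.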

2.2 Model Selection

There is arbitrariness in the choice of the order $p$ of the VAR model (2.1). Some commonly used approaches to selecting $p$ include the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). We conduct the hypothesis testing after selecting the order $p$ of the VAR process by choosing the minimum AIC or BIC. However, we remark that these approaches have some drawbacks in terms of consistency of model selection and validity of post-selection inference.
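As an illustration of lag selection by information criteria, the sketch below fits a VAR($p$) by least squares for $p = 1, \dots, p_{\max}$ on a common estimation sample and picks the order that minimizes each criterion. The criterion formulas are the standard multivariate ones (log determinant of the residual covariance plus a penalty), which is an assumption, since the paper does not spell out its exact definitions. The data are simulated and the function name is hypothetical.

```python
import numpy as np

def var_information_criteria(data, p_max):
    """Return {p: (aic, bic)} for VAR(p), p = 1..p_max, on the sample t = p_max+1..T.

    data is a (T, K) array with one column per series.
    """
    T, K = data.shape
    n = T - p_max
    target = data[p_max:]
    out = {}
    for p in range(1, p_max + 1):
        lags = [data[p_max - q:T - q] for q in range(1, p + 1)]
        X = np.column_stack([np.ones(n)] + lags)
        coef = np.linalg.lstsq(X, target, rcond=None)[0]
        resid = target - X @ coef
        sigma = resid.T @ resid / n                 # residual covariance matrix
        logdet = np.linalg.slogdet(sigma)[1]
        k = K * (K * p + 1)                         # number of estimated coefficients
        out[p] = (logdet + 2.0 * k / n, logdet + np.log(n) * k / n)
    return out

# Simulated bivariate VAR(2); the coefficient values are arbitrary.
rng = np.random.default_rng(1)
T = 300
A1 = np.array([[0.4, 0.0], [0.5, 0.2]])
A2 = np.array([[0.2, 0.0], [0.3, 0.1]])
data = np.zeros((T, 2))
for t in range(2, T):
    data[t] = A1 @ data[t - 1] + A2 @ data[t - 2] + rng.normal(size=2)

ic = var_information_criteria(data, p_max=6)
p_aic = min(ic, key=lambda p: ic[p][0])
p_bic = min(ic, key=lambda p: ic[p][1])
print("order chosen by AIC:", p_aic, "| order chosen by BIC:", p_bic)
```

Because the BIC penalty per coefficient exceeds the AIC penalty whenever the sample has more than about eight observations, the BIC never selects a longer lag than the AIC on the same sample.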

A recently popular method of model selection in econometrics is the least absolute shrinkage and selection operator (LASSO: Tibshirani, 1996). In particular, the adaptive LASSO (Zou, 2006) enjoys the oracle property as well as consistency of the model selection. This method works as follows. Let $\hat{A}$ denote a preliminary consistent estimate of the parameters $A$ in model (2.1) without model restriction, e.g., the least squares estimate under a choice of large order $p$. The adaptive LASSO estimate $\hat{A}^*$ is obtained by the $L_1$-penalized least squares problem

\[
\hat{A}^* = \arg\min_{A} \sum_{t=p+1}^{T}
\left\|
\begin{pmatrix} Z_t \\ Y_t \end{pmatrix}
- \begin{pmatrix} A_{0,1} \\ A_{0,2} \end{pmatrix}
- \sum_{q=1}^{p}
\begin{pmatrix} A_{q,1,1} & A_{q,1,2} \\ A_{q,2,1} & A_{q,2,2} \end{pmatrix}
\begin{pmatrix} Z_{t-q} \\ Y_{t-q} \end{pmatrix}
\right\|^2
+ \lambda_T \left(
\sum_{r=1}^{2} \frac{|A_{0,r}|}{|\hat{A}_{0,r}|^{\kappa}}
+ \sum_{r=1}^{2} \sum_{c=1}^{2} \sum_{q=1}^{p} \frac{|A_{q,r,c}|}{|\hat{A}_{q,r,c}|^{\kappa}}
\right).
\]

The theory (Zou, 2006) requires that the tuning parameters $\kappa > 0$ and $\lambda_T > 0$ satisfy the asymptotic orders $\lambda_T / \sqrt{T} \to 0$ and $\lambda_T T^{(\kappa-1)/2} \to \infty$ as $T \to \infty$. In practice, however, $T$ is fixed given a finite sample, and thus this asymptotic guideline may not be useful. Therefore, we present empirical estimation results for each of several values of the tuning parameter, and examine their robustness.
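A minimal sketch of the adaptive LASSO for the sales price equation is given below. It relies on the fact that the penalized criterion is separable across the two rows of the VAR, so each equation can be solved on its own. The solver is a plain coordinate descent written for illustration, the intercept is penalized as in the displayed criterion, and the data and $\lambda$ values are hypothetical; none of this is claimed to be the authors' implementation.

```python
import numpy as np

def lagged_design(z, y, p):
    """Rows (1, Z_{t-1}, Y_{t-1}, ..., Z_{t-p}, Y_{t-p}) for t = p+1, ..., T."""
    T = len(y)
    cols = [np.ones(T - p)]
    for q in range(1, p + 1):
        cols += [z[p - q:T - q], y[p - q:T - q]]
    return np.column_stack(cols)

def soft_threshold(a, b):
    return np.sign(a) * max(abs(a) - b, 0.0)

def adaptive_lasso(X, y, lam, kappa=1.0, n_iter=500):
    """Minimize ||y - X b||^2 + lam * sum_j |b_j| / |b_ols_j|^kappa by coordinate descent."""
    b_ols = np.linalg.lstsq(X, y, rcond=None)[0]    # preliminary consistent estimate
    weights = 1.0 / np.abs(b_ols) ** kappa
    b = b_ols.copy()
    col_ss = (X ** 2).sum(axis=0)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            partial = y - X @ b + X[:, j] * b[j]    # residual excluding coordinate j
            b[j] = soft_threshold(X[:, j] @ partial, lam * weights[j] / 2.0) / col_ss[j]
    return b

def selected_order(b, p):
    """Largest lag q with a non-zero coefficient on Z_{t-q} or Y_{t-q} (0 if none)."""
    active = [q for q in range(1, p + 1) if b[2 * q - 1] != 0.0 or b[2 * q] != 0.0]
    return max(active, default=0)

# Simulated data: only the first lag matters in the y equation.
rng = np.random.default_rng(2)
T, p = 300, 4
z, y = np.zeros(T), np.zeros(T)
for t in range(1, T):
    z[t] = 0.5 * z[t - 1] + rng.normal()
    y[t] = 0.3 * y[t - 1] + 0.8 * z[t - 1] + rng.normal()

X, target = lagged_design(z, y, p), y[p:]
b = adaptive_lasso(X, target, lam=1.0)
b_big = adaptive_lasso(X, target, lam=1e12)         # a huge penalty removes every term
for lam in (1.0, 10.0, 100.0):                      # illustrative values only
    fit = adaptive_lasso(X, target, lam=lam)
    print(f"lambda = {lam:6.1f}: selected lag order = {selected_order(fit, p)}")
```

Reporting the selected order over a grid of tuning values, as in the loop at the end, mirrors the robustness exercise described above.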

2.3 Results from the MSA Level Analysis

Table 2 presents the tests of Granger causality of the median sales price $Y_t$ by Zillow’s median price estimate $Z_t$ for each MSA. The lag order is chosen by each selection criterion with the maximum $p$ set to 10. Column (1) presents results when the optimal $p$ is chosen using the minimum AIC. The optimal lag ranges from 7 to 10, with 10 being the most common. In all 30 cities but one, Boston, the joint hypothesis that all the coefficients on $Z_{t-q}$ are zero is rejected at the 10 percent level. Column (2) uses the minimum BIC to choose $p$, and the joint hypothesis tests reject the null for all cities except Boston, Denver, and Philadelphia. Results from our preferred adaptive LASSO method with tuning parameters of 0.5, 1.0, and 2.0 are presented in columns (3) through (5). The optimal $p$ tends to be smaller in these columns than in columns (1) and (2), but the general results are similar. Other than in three to five cities, we reject the null hypothesis that the online price estimates do not Granger cause sales prices. The last row of Table 2 presents results for the entire US. The selected lag orders are 5 and 6 in the LASSO models of columns (4) and (5) and 10 in the other models. The p-values imply that Zillow’s median price estimate Granger causes actual sales prices at the national level in all five models. The MSA and national level aggregate results indicate that real estate price information may impact actual transaction prices.

3 The Methods of Micro Level Analysis

Our analysis based on aggregate data and the VAR model in the previous section is limited in its ability to support causal inference. To illustrate the relationship between online price estimates and sales prices more convincingly, we seek consistent and robust evidence from microeconometric analysis using house-level data. This section presents a model and empirical methods to this end.

3.1 The Extended Hedonic Model

We propose a microeconometric method that estimates the impact of real estate price information on sales prices. Specifically, we extend the traditional hedonic framework to one that incorporates the potential effects of real estate price information, in particular, the individual house level price estimate provided by Zillow. The following is a list of economic factors that may potentially affect transaction prices for house $i$ in a neighborhood $N_i$:

• $X_i$: A vector of house-specific amenities including: lot size, square footage, number of bedrooms, number of bathrooms, and year built.

• $U_i$: The value of unobserved house-specific amenities including: floor plans and appliances.

• $V_{N_i}$: The value of unobserved neighborhood-specific amenities including: public schools, crime, curb appeal, environmental quality, and other public services.

• $Z_i$: Home price information, i.e. Zillow’s price estimates, that real estate market participants can observe.

The standard hedonic pricing models forecast the transaction price as follows:

\[
Y_i = \underbrace{\alpha + X_i \beta}_{\text{Regression}}
+ \underbrace{\overbrace{V_{N_i}}^{\text{Residual 1}} + \overbrace{U_i}^{\text{Residual 2}} + \overbrace{\varepsilon_i}^{\text{Residual 3}}}_{\text{Residual}} .
\tag{3.1}
\]

To elucidate the problem that we face in our study, we decompose the usual residual into three components: the first reflects the value $V_{N_i}$ of neighborhood-specific amenities, the second reflects the value $U_i$ of house-specific amenities, and the third represents idiosyncratic errors $\varepsilon_i$. The standard hedonic pricing model (3.1) assumes that sellers and/or buyers take the vector of house-specific amenities $(X_i, U_i)$ and the value of neighborhood-specific amenities $V_{N_i}$ into account when making decisions about transaction prices $Y_i$ in equilibrium. Econometricians estimate the reduced-form marginal coefficients $\beta$, called contributory values, for the part $X_i$ of the house-specific amenities that they can observe in data. We hypothesize that agents may also take into account the home price information $Z_i$, which is produced by real estate information providers like Zillow, when proposing transaction prices. This hypothesis reflects the possibility that buyers and sellers are not fully confident in their own home evaluation based on the information about the house and the neighborhood, and therefore tend to use the measure $Z_i$ provided by third parties. In this light, we propose an extended reduced-form equilibrium pricing model simply as the convex combination of outside and self valuations:

\[
Y_i = \gamma Z_i + (1 - \gamma)\left[ \alpha + X_i \beta + V_{N_i} + U_i + \varepsilon_i \right].
\tag{3.2}
\]

The expression in the square brackets in the second term, $\alpha + X_i \beta + V_{N_i} + U_i + \varepsilon_i$, constitutes those factors used in the traditional hedonic pricing model (3.1). Further, we add the first term $\gamma Z_i$ to reflect the potential effects of the home price information $Z_i$ on transaction prices $Y_i$. As such, the parameter $\gamma$ may be interpreted as the degree of agents’ reliance on the third-party information. Our null hypothesis that the home price estimates $Z_i$ do not impact the actual transaction prices is thus represented by the equality $\gamma = 0$, which is readily testable once a $\sqrt{N}$-consistent estimate of $\gamma$ is obtained.

The OLS estimators of the parameters $\alpha$, $\beta$, and $\gamma$ would be consistent if $(V_{N_i}, U_i)$ were mean independent of both $Z_i$ and $X_i$. However, this mean independence assumption is hard to justify for at least two reasons. First, the unobserved house-specific amenities $U_i$ are likely to be correlated with the observed house-specific amenities $X_i$. Second, and more importantly in our study, the introduction of $Z_i$ in the extended pricing model (3.2) causes another source of endogeneity. To see this, it may help to think of how the home price information $Z_i$ is generated by real estate information providers. Although these service agencies do not disclose their formulas, the estimates $Z_i$ are constructed using the recent transaction data in the neighborhood $N_i$ of house $i$. (See Section A.1 in the appendix for the case of Zillow.) As such, the statistical independence $Z_i \perp\!\!\!\perp V_{N_i}$ between the price estimate and unobserved neighborhood characteristics, or the corresponding mean independence, will probably not hold even if we control for the observed house-specific amenities $X_i$. We therefore propose two approaches to handle these two sources of endogeneity in the subsequent sections.

3.2 Proxy Variable

To control for the endogenous unobserved house-specific amenities $U_i$, we follow the proxy variable approach often taken in the production literature (Olley and Pakes, 1992; Levinsohn and Petrin, 2003), which is formalized by Wooldridge (2009). Specifically, we construct a proxy variable using listing prices, denoted by $L_i$. Sellers can perceive house-specific amenities $U_i$ that econometricians cannot observe. They may add these values to benchmark hedonic valuations $H_i$ when proposing their listing prices, i.e.,

\[
L_i = H_i + g(U_i).
\]

List prices may differ from the online hedonic estimates $H_i$ for various reasons. List prices tend to start high since the seller predicts that the negotiation process will ultimately result in a lower sales price. How quickly the seller needs to sell the property could also impact the list price. The function $g$ thus captures the seller’s adjustment of the self-valuation of $U_i$. Note that the identity function $g(u) = u$ implies that there is no markup or markdown in the listing prices above the observed and unobserved value of the house.⁵ Finally, to take this structure into estimation of the parameters, we assume that $g$ is strictly increasing so that its inverse $g^{-1}$ exists. With this inverse function, we can recover the unobserved house-specific amenities $U_i$ by

\[
U_i = g^{-1}(L_i - H_i).
\]

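Substituting this expression in (3.2) yields

\[
\begin{aligned}
Y_i &= \gamma Z_i + (1 - \gamma)\left[ \alpha + X_i \beta + V_{N_i} + g^{-1}(L_i - H_i) + \varepsilon_i \right] \\
    &= \gamma Z_i + \tilde{\alpha} + X_i \tilde{\beta} + \tilde{\gamma} V_{N_i} + \tilde{g}(L_i - H_i) + \tilde{\varepsilon}_i ,
\end{aligned}
\tag{3.3}
\]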

where $\tilde{\alpha} := (1 - \gamma)\alpha$, $\tilde{\beta} := (1 - \gamma)\beta$, $\tilde{\gamma} := (1 - \gamma)$, $\tilde{g} := (1 - \gamma) g^{-1}(\cdot)$, and $\tilde{\varepsilon}_i := (1 - \gamma)\varepsilon_i$ are shorthand notations. This operation removes one of the two sources of endogeneity, namely $U_i$, and it thus remains to handle the other unobserved variable $V_{N_i}$. To estimate the parameters in the presence of the additive nonparametric function $\tilde{g}$ when $V_{N_i}$ is known, we can use the method of Robinson (1988), which Olley and Pakes (1992) and Levinsohn and Petrin (2003) use for the similar purpose of handling proxy variables nonparametrically.

⁵ We find that initial list prices are higher than the prior hedonic estimates in about 68 percent and lower in about 32 percent of the observations in our sample.

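This method works as follows. Write $\tilde{U}_i = L_i - H_i$ for shorthand. If the mean independence $E[\varepsilon_i \mid U_i] = 0$ is true, then taking conditional expectations of (3.3) gives

\[
E[Y_i \mid \tilde{U}_i] = \gamma E[Z_i \mid \tilde{U}_i] + \tilde{\alpha} + E[X_i \mid \tilde{U}_i]\tilde{\beta} + \tilde{\gamma} E[V_{N_i} \mid \tilde{U}_i] + \tilde{g}(\tilde{U}_i).
\]

Subtracting this from (3.3), we obtain

\[
Y_i - E[Y_i \mid \tilde{U}_i] = \gamma \left( Z_i - E[Z_i \mid \tilde{U}_i] \right) + \left( X_i - E[X_i \mid \tilde{U}_i] \right)\tilde{\beta} + \tilde{\gamma}\left( V_{N_i} - E[V_{N_i} \mid \tilde{U}_i] \right) + \tilde{\varepsilon}_i .
\]

If the contributory value $V_{N_i}$ of neighborhood $N_i$ were observed, then $\gamma$ may be $\sqrt{N}$-consistently estimated by the OLS of $Y_i - E[Y_i \mid L_i - H_i]$ on $Z_i - E[Z_i \mid L_i - H_i]$, $X_i - E[X_i \mid L_i - H_i]$, and $V_{N_i} - E[V_{N_i} \mid L_i - H_i]$, where the nonparametric regressions $E[Y_i \mid L_i - H_i]$, $E[Z_i \mid L_i - H_i]$, $E[X_i \mid L_i - H_i]$ and $E[V_{N_i} \mid L_i - H_i]$ are pre-estimated using the kernel method.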
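The two-step procedure above (kernel pre-estimation of the conditional means, then OLS on the residuals) can be sketched as follows. The sketch makes several simplifying assumptions that are not in the paper: the data are simulated, there is no neighborhood effect, the kernel is Gaussian with a rule-of-thumb bandwidth, and the nonparametric term is taken to be $\sin(2u)$ purely for illustration. The function names are hypothetical.

```python
import numpy as np

def nw_fit(u, V, h):
    """Nadaraya-Watson estimate of E[V | u] at the sample points (Gaussian kernel).

    u is an (n,) array of proxy values, V an (n, m) array of variables to smooth.
    """
    d = (u[:, None] - u[None, :]) / h
    K = np.exp(-0.5 * d ** 2)
    return (K @ V) / K.sum(axis=1)[:, None]

def robinson(y, W, u, h):
    """Partially linear estimator: OLS of y - E[y|u] on W - E[W|u]."""
    V = np.column_stack([y, W])
    resid = V - nw_fit(u, V, h)
    return np.linalg.lstsq(resid[:, 1:], resid[:, 0], rcond=None)[0]

# Simulated data. u plays the role of the proxy L_i - H_i and enters y nonlinearly.
rng = np.random.default_rng(3)
n = 500
u = rng.normal(size=n)
z = u + rng.normal(size=n)              # price estimate, correlated with the proxy
x = 0.5 * u + rng.normal(size=n)        # one observed attribute
y = 0.9 * z + 0.3 * x + np.sin(2 * u) + 0.2 * rng.normal(size=n)

h = 1.06 * u.std() * n ** (-0.2)        # rule-of-thumb bandwidth (an assumption)
gamma_hat, beta_hat = robinson(y, np.column_stack([z, x]), u, h)
print(f"gamma_hat = {gamma_hat:.3f} (true value 0.9)")
print(f"beta_hat  = {beta_hat:.3f} (true value 0.3)")
```

No intercept is needed in the second-step regression because the constant $\tilde{\alpha}$ is removed together with $\tilde{g}$ when the conditional means are subtracted.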

3.3 Local First Differencing

The previous section introduced a way to control for house-specific unobservables $U_i$, provided that the contributory value $V_{N_i}$ of neighborhood $N_i$ is observed. If we have multiple observations per neighborhood, however, we do not need to observe $V_{N_i}$, since we can take first differences within a neighborhood to eliminate the $V_{N_i}$ terms. Note that $N_i = N_j$ clearly implies $V_{N_i} = V_{N_j}$. Hence, we can take the difference of (3.3) between two properties $i$ and $j$ to obtain the equation

\[
Y_i - Y_j = \gamma (Z_i - Z_j) + (X_i - X_j)\tilde{\beta} + \tilde{g}(L_i - H_i) - \tilde{g}(L_j - H_j) + \tilde{\varepsilon}_i - \tilde{\varepsilon}_j
\tag{3.4}
\]

for any pair $(i, j)$ such that $N_i = N_j$, i.e., within the same neighborhood. This operation, mechanically identical to the method of first differencing for panel data analysis, removes the neighborhood fixed effect $V_{N_i}$. For this sort of first-differenced partially linear equation, Li and Stengos (1996) extend Robinson’s method (see the previous section). Specifically, $\gamma$ may be $\sqrt{N}$-consistently estimated by the OLS of $Y_i - Y_j - E[Y_i - Y_j \mid L_i - H_i, L_j - H_j]$ on $Z_i - Z_j - E[Z_i - Z_j \mid L_i - H_i, L_j - H_j]$ and $X_i - X_j - E[X_i - X_j \mid L_i - H_i, L_j - H_j]$, where the nonparametric regressions $E[Y_i - Y_j \mid L_i - H_i, L_j - H_j]$, $E[Z_i - Z_j \mid L_i - H_i, L_j - H_j]$, and $E[X_i - X_j \mid L_i - H_i, L_j - H_j]$ are pre-estimated using the kernel method.⁶
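The sketch below combines within-neighborhood differencing with the residual-on-residual regression, in the spirit of the estimator described above rather than as a faithful implementation of Li and Stengos (1996). Everything specific is an assumption made for illustration: the simulated data, the use of consecutive pairs of houses within each neighborhood, the product Gaussian kernel with a single bandwidth, and the choice of $\tanh$ for the nonparametric term.

```python
import numpy as np

def nw_fit(U, V, h):
    """Nadaraya-Watson estimate of E[V | U] at the sample points; U is (n, d), V is (n, m)."""
    d2 = (((U[:, None, :] - U[None, :, :]) / h) ** 2).sum(axis=2)
    K = np.exp(-0.5 * d2)
    return (K @ V) / K.sum(axis=1)[:, None]

# Simulated houses: 150 neighborhoods with 4 houses each.
rng = np.random.default_rng(4)
n_hoods, per = 150, 4
hood = np.repeat(np.arange(n_hoods), per)
n = hood.size
v_hood = rng.normal(size=n_hoods)[hood]          # unobserved neighborhood value V_N
u = rng.normal(size=n)                           # proxy L_i - H_i
z = v_hood + u + rng.normal(size=n)              # price estimate, correlated with both
x = rng.normal(size=n)                           # one observed attribute
y = 0.9 * z + 0.3 * x + 0.5 * v_hood + np.tanh(u) + 0.2 * rng.normal(size=n)

# Naive pooled OLS ignores both unobservables.
X_naive = np.column_stack([np.ones(n), z, x])
gamma_naive = np.linalg.lstsq(X_naive, y, rcond=None)[0][1]

# Pairs (i, j) of consecutive houses that share a neighborhood.
i = np.flatnonzero(hood[:-1] == hood[1:])
j = i + 1
D = np.column_stack([y[i] - y[j], z[i] - z[j], x[i] - x[j]])   # differencing removes V_N
U2 = np.column_stack([u[i], u[j]])                             # conditioning variables

h = U2.std() * len(i) ** (-1.0 / 6.0)            # bandwidth choice is an assumption
resid = D - nw_fit(U2, D, h)
gamma_fd, beta_fd = np.linalg.lstsq(resid[:, 1:], resid[:, 0], rcond=None)[0]

print(f"naive OLS gamma       = {gamma_naive:.3f}")
print(f"first-differenced fit = {gamma_fd:.3f} (true value 0.9)")
```

Overlapping pairs make the differenced errors correlated within a neighborhood, so standard errors from the second step would need to account for that; the sketch reports point estimates only.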

4 Data

We collect house level data from Zillow, one of the major online real estate information providers. There are many websites providing real estate information. Zillow provides individual house price estimates called Zestimates. The estimates are available regardless of whether the property is on the market or not.⁷ Zillow does not disclose the formula that they use to generate their price estimates, but they mention that they use the physical attributes of the property, tax assessments, and prior and current transactions of the property itself and the comparable properties nearby (see Appendix A). In addition to their current house price estimates, Zillow provides their past estimates, current and past listing prices when available, the most recent sales price, and past sales prices when available. We collect the sales date and price, the Zillow estimate at the time of sale, the estimates one, two, three, and six months before the sale, the initial listing price, and historical sales and listing prices when available. In addition, a rich set of house-specific and neighborhood/town-specific information is available for $X_i$. We collect the address of the house, square footage, number of bedrooms, number of bathrooms, lot size, year built, and property tax. Zillow also provides nearby school names and the school ratings from GreatSchools.org.

In collecting a sample, we make sure to include multiple houses from each neighborhood for neighborhood first differencing. The following procedure is employed to collect our sample of house level data. We first choose 30 MSAs where Zillow provides both their price estimates and the sales price information.⁸ For each MSA we find the 10 neighborhoods with median Zestimates closest to the MSA median Zestimate and collect data on four houses per neighborhood. If there are fewer than 10 neighborhoods in an MSA, we collect four more houses from existing neighborhoods, starting with neighborhoods that have median Zestimates closest to the MSA median Zestimate. Within each neighborhood, we restrict the search to single family homes that are 2000 sqft or above, have 3 bedrooms or more and 2 bathrooms or more, and were last sold between July 2012 and July 2013. For each neighborhood, we narrow down to houses that have Zestimates closest to the ZIP code median Zestimate and that list the same set of nearby public elementary schools. We then randomly select the first four houses that have non-missing information on Zestimate at time of sale, sales price, initial listing price, number of beds and baths, house square footage, lot size, and year built. We also record the number of skips for each MSA, if there were any skips due to missing information.⁹ This procedure returns 40 houses in each of the 30 MSAs for a total of 1,200 observations. Table 3 Panel A summarizes the characteristics of these houses.

We also collect information on the accuracy of Zillow’s estimates by MSA. This information is provided by Zillow Research and is available online. The variables provided are the number of homes on Zillow, the number of homes with Zillow estimates, and the error margin between Zillow’s estimates and sales prices. Furthermore, we collect MSA level population, land area, number of households, and educational attainment from the census. Table 3 Panel B presents the summary statistics for the additional MSA level variables.

⁶ Baltagi and Li (2002ab) propose alternative methods to estimate first-differenced partially linear models with discussions on asymptotic properties of the estimators. They suggest that the nonparametric pre-estimations be done with series approximation instead of the kernel method in order to take advantage of the additivity between $\tilde{g}(L_i - H_i)$ and $\tilde{g}(L_j - H_j)$.

⁷ Currently there are many real estate websites. Many of these websites are brokerage websites where listing and selling of properties on the market is the main business model. These websites belong to local multiple listing services (MLS), which are local associations where real estate agents share their property listings. Other websites, for example Zillow and Trulia, are not real estate brokerage firms but mainly serve as information providers to various parties interested in the real estate market. Their business model aims to gain a wide audience and profit from advertisement fees, not through brokerage fees.

⁸ We first refer to the MSA Zestimate accuracy file. The MSA file contains 30 MSAs. However, in a few MSAs sale prices were not reported and we replaced those MSAs with other MSAs where accuracy estimates and sales prices were available. The Zestimate accuracy file is accessible at http://www.zillow.com/howto/DataCoverageZestimateAccuracy.htm.

⁹ On average there were about 16 skips per MSA. We examined whether the number of skips impacts the distribution of $\gamma$ estimates across cities by including the number of skips in the OLS regression in Table 7. Including the number of skips does not alter the coefficient estimate on education, and the coefficient estimate on the number of skips is statistically indistinguishable from zero.
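The sampling rules in this section (pick the neighborhoods whose median Zestimate is closest to the MSA median, then screen houses on size, sale date and non-missing fields) can be expressed compactly. The sketch below uses made-up neighborhood names, values and dictionary keys; it illustrates the selection logic only and is not the authors' collection script.

```python
from datetime import date

def closest_to_target(values, target, k):
    """Names of the k entries whose value is closest to target."""
    return sorted(values, key=lambda name: abs(values[name] - target))[:k]

REQUIRED = ("zestimate_at_sale", "sale_price", "list_price", "beds", "baths",
            "sqft", "lot_size", "year_built", "sale_date")

def is_eligible(house):
    """Screening rules quoted in the text; the field names are hypothetical."""
    if any(house.get(key) is None for key in REQUIRED):
        return False                       # counted as a skip in the paper
    return (house["sqft"] >= 2000 and house["beds"] >= 3 and house["baths"] >= 2
            and date(2012, 7, 1) <= house["sale_date"] <= date(2013, 7, 31))

# Hypothetical neighborhood medians within one MSA.
neighborhood_median = {"A": 310_000, "B": 255_000, "C": 298_000, "D": 402_000, "E": 289_000}
msa_median = 300_000
chosen = closest_to_target(neighborhood_median, msa_median, k=3)

base = dict(zestimate_at_sale=305_000, sale_price=300_000, list_price=315_000, beds=3,
            baths=2, sqft=2200, lot_size=7000, year_built=1995, sale_date=date(2013, 1, 15))
houses = [base, dict(base, lot_size=None), dict(base, sqft=1800)]
eligible = [is_eligible(h) for h in houses]
print(chosen, eligible)
```

In the paper the first step uses `k = 10` neighborhoods per MSA and the screening continues until four eligible houses are found in each neighborhood.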

5 Evidence from the Micro Level Analysis

5.1 The Impact of Real Estate Information on Sales Prices

We first implement our procedure on the full sample and then by MSA. Table 4 presents the full sample results. Each cell in columns (1) through (6) displays the estimate of $\gamma$. Columns (1) through (3) report results when we do not perform the neighborhood first differencing, so we are not controlling for the unobserved neighborhood component. Column (1) presents estimates when we do not proxy for the unobserved house-specific characteristics; column (2) controls for the unobserved house-specific characteristics by including a linear proxy, which is the difference between the listing price and the previous Zillow estimate; and column (3) uses the nonparametric nonlinear proxy in place of the linear proxy, as described in Section 3.2. Columns (4) through (6) report results for specifications that parallel columns (1) through (3) but are also based on the neighborhood first-differencing procedure in order to control for the unobserved neighborhood characteristics, as described in Section 3.3. Each row represents the combination of the Zillow estimate of interest, $Z_i$ in equation (3.2), and the Zillow estimate used in calculating the proxy variable, $H_i$ in equation (3.3).

In columns (1) through (3), when the neighborhood unobservables are not controlled for, the estimate of $\gamma$ ranges from 0.825 to 1.013. The estimates decrease when we control for the endogenous unobserved house-specific amenities $U_i$ using a proxy. In general, the estimates based on the nonparametric nonlinear proxy are smaller than the estimates based on the linear proxy. When we further control for the unobserved neighborhood-specific amenity $V_{N_i}$ by first differencing observations within neighborhoods, the estimates of $\gamma$ decrease relative to the parallel specifications in columns (1) through (3). Overall, the estimates indicate that controlling for unobserved house and neighborhood attributes decreases the estimated partial effects of Zillow’s online estimates on sales prices. The estimates in column (6), which control for both $U_i$ and $V_{N_i}$, are our preferred estimates for $\gamma$. Columns (7) and (8) conduct hypothesis tests where the null hypotheses are $\gamma = 0$ and $\gamma = 1$. All estimates of $\gamma$ are statistically different from

zero at the 1 percent level. For some specifications, we cannot reject the null that $\gamma = 1$ at the 5 percent level. The result that the estimates are close to one implies that the online house price estimate may well be a major factor in determining the sales price.

If the underlying model indeed follows equation (3.2), so that the coefficient on the observable attributes $X_i$ is $(1 - \gamma)\beta$, then $\gamma = 1$ implies that the estimated coefficients on $X_i$ should equal zero. We report test results on the joint hypothesis that all the coefficients on the observable attributes $X_i$ are zero. The p-values are reported in Table 5, where each column represents the same specification as described in Table 4. We focus on the column (5) and (6) results, which use proxy variables and implement the neighborhood first differencing. Even though we are testing a very restrictive hypothesis, we cannot reject the null hypothesis that all the beta coefficients are zero at the 5 percent level in any of the specifications in column (5). Similarly, in column (6), we cannot reject the null hypothesis that all the beta coefficients are zero in any specification at the 1 percent level, and we cannot reject the null in all but one specification at the 5 percent level. Given the availability of online price estimates, participants in the real estate market no longer seem to rely on their own hedonic assessments of house or neighborhood attributes. Another interpretation of these results is that the scalar index, namely $Z_i$, is sufficient for the decisions by the market players. The multiple dimensions of the information, $X_i$, $U_i$ and $V_{N_i}$, are available for them to look at, but they are redundant given the scalar index $Z_i$. The results indicate that information is the prime driver of house prices.

The magnitude of one for $\gamma$ may initially seem surprising. However, once one thinks about how property transactions are made, this finding may seem less surprising. A hedonic framework is often used to estimate the marginal valuation of one attribute in a composite good. It is an ex post revelation of people’s marginal willingness to pay for an attribute. For instance, we can back out the marginal value of a bedroom from a hedonic regression. However, the market participant, be it the seller, buyer, or broker, would rarely use the estimates from a hedonic method to price the composite good, the house. One of the most common methods to value property is to use recent sales prices of comparable properties in the neighborhood and then to make marginal adjustments based on the different attributes of the houses. Hence, the finding that online third-party price estimates can serve as an important determinant of sales prices is not that surprising.

We next examine how the estimates of $\gamma$ vary across MSAs. Table 6 presents the impact of one-month prior Zillow estimates on sales prices in each of the 30 MSAs, using various benchmark online estimates in calculating the nonlinear proxy variable. For each estimate we conduct hypothesis tests where the null hypotheses are $\gamma = 0$ and $\gamma = 1$. In column (2), which uses two-month prior online estimates in the nonlinear proxy, all estimates are statistically different from zero at the 10 percent level except for Charlotte and Riverside. The estimates vary considerably across MSAs, ranging from 0.377 in Riverside to 1.276 in Nashville. Furthermore, many of the estimates are statistically indistinguishable from one even at the MSA level. Using three-month or six-month prior Zillow estimates in the nonlinear proxy returns similar results across the different MSAs.¹⁰

5.2 What Explains the Variation in the Elasticity Estimates?

What might explain the variation in the value of information in the real estate market, and hence in the elasticity estimates, across MSAs? The size of the housing market, the availability of third-party estimates, or the precision of the estimates could affect people's reliance on online house price estimates. One could also hypothesize that the education or income level of the population affects the use of online real estate information. A simple correlation analysis finds that the educational attainment of the MSA population, specifically the share of the population aged 25 or above with a bachelor's degree or above, explains the variation in the γ estimates. Figure 1 presents a scatter plot of the two variables. The γ estimates in the figure are from the specification used in Table 6 column (1). The positive correlation is stark. MSAs with a higher share of the population with a college education or above tend to value online real estate price estimates more. Table 7 examines the statistical significance of this relation in an OLS regression. Column (1) presents the bivariate regression. The coefficient estimate on the educational attainment variable is 1.82 and statistically significant at the 1 percent level. A 10 percentage point increase in the share with a bachelor's degree or above is associated with a γ estimate that is higher by about 0.18. The coefficient estimates on the education variable are robust to additionally including the set of MSA size variables in column (2) or the set of Zillow coverage variables in column (3). In column (4), we add the median error between Zillow estimates and sales prices, and the coefficient estimate on education barely changes. In column (5), we control for a categorical variable used by Zillow to indicate how accurate Zillow estimates are. The variable ranges from one to four, with four indicating the most accurate and two the least accurate. Category one is for cities where accuracy is unknown. The coefficient estimate on education drops slightly and is statistically significant at the 10 percent level. Finally, in column (6) we additionally include median family income. Median family income is highly correlated with the share of the population with a college education or above. The coefficient estimate on the college share variable increases, but the standard error increases as well, most likely due to collinearity with income. Overall, Table 7 indicates that real estate sales prices are more sensitive to online price estimates in MSAs with a more educated population.

[10] Appendix Table 1 presents the MSA results with the different specifications as used in Table 3.
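As a quick check of the magnitude quoted above, the column (1) coefficient of 1.82 on the bachelor's degree share, applied to a 10 percentage point (0.10) increase in that share, gives the stated change in the γ estimate. This is only the arithmetic implied by a linear fit, not an additional result:

```latex
\Delta \hat{\gamma} \approx 1.82 \times 0.10 = 0.182 \approx 0.18
```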

6 Conclusion

We investigate how sensitive sales prices are to online price estimates in the real estate market. For our micro data analysis, we propose a reduced-form equilibrium pricing model as a convex combination of the third-party price estimate and the self-valuation of properties. Our method nonparametrically proxies for unobservable house attributes by using the difference between the listing price and the online price estimate, and controls for unobserved neighborhood attributes by neighborhood first differencing. We collect house price estimates, sales and list prices, and various house and neighborhood attributes from Zillow.com across 30 MSAs in the US. The empirical results obtained in this paper show that the elasticity of sales price with respect to the Zillow estimate is one. In addition, we find that the population share of college educated and above significantly explains the variation in the elasticity estimates across MSAs, although this simple correlation analysis is not the main feature of our analysis. The literature has found that information impacts asset prices, in particular in the securities market. We add to this literature evidence that information is valued to a large extent in the real estate market. Our finding of unit elasticity between sales prices and online price estimates in the real estate market may have significant implications. If information is more important than fundamentals in determining real estate prices, then how information is generated could have a big impact on real estate price dynamics. One may conjecture that the prevalence of online real estate information and the reliance on such information may have contributed partially to the recent boom and bust in the real estate market.

References

Bajari, Patrick, Jane Cooley, Kyoo il Kim, and Christopher Timmins. 2012. “A Rational Expectations Approach to Hedonic Price Regressions with Time-Varying Unobserved Product Attributes: The Price of Pollution,” American Economic Review, 102(5): 1898-1926.

Baltagi, Badi and Dong Li. 2002a. “Series Estimation of Partially Linear Panel Data Models with Fixed Effects,” Annals of Economics and Finance, 3: 103-116.

Baltagi, Badi and Qi Li. 2002b. “On Instrumental Variable Estimation of Semiparametric Dynamic Panel Data Models,” Economics Letters, 76: 1-9.

Bayer, Patrick, Fernando Ferreira, and Robert McMillan. 2007. “A Unified Framework for Measuring Preferences for Schools and Neighborhoods,” Journal of Political Economy, 114(4): 588-638.

Black, Sandra. 1999. “Do Better Schools Matter? Parental Valuation of Elementary Education,” Quarterly Journal of Economics, 114(2): 577-599.

Campbell, John Y., Stefano Giglio, and Parag Pathak. 2011. “Forced Sales and House Prices,” American Economic Review, 101: 2108-2131.

Chay, Kenneth and Michael Greenstone. 2005. “Does Air Quality Matter? Evidence from the Housing Market,” Journal of Political Economy, 113(2): 376-424.

Easley, David, Soeren Hvidkjaer, and Maureen O’Hara. 2002. “Is Information Risk a Determinant of Asset Returns?” Journal of Finance, 57(5): 2185-2221.

Easley, David and Maureen O’Hara. 1987. “Price, Trade Size, and Information in Securities Markets,” Journal of Financial Economics, 19: 69-90.

Easley, David and Maureen O’Hara. 2004. “Information and the Cost of Capital,” Journal of Finance, 59(4): 1553-1583.

Levinsohn, James and Amil Petrin. 2003. “Estimating Production Functions Using Inputs to Control for Unobservables,” Review of Economic Studies, 70: 317-341.

Levitt, Steven D. and Chad Syverson. 2008. “Market Distortions When Agents Are Better Informed: The Value of Information In Real Estate Transactions,” Review of Economics and Statistics, 90(4): 599-611.

Li, Qi and Thanasis Stengos. 1996. “Semiparametric Estimation of Partially Linear Panel Data Models,” Journal of Econometrics, 71: 389-397.

Olley, Steven and Ariel Pakes. 1992. “The Dynamics of Productivity in the Telecommunications Equipment Industry,” Econometrica, 64: 1263-1297.

Robinson, Peter. 1988. “Root-N-Consistent Semiparametric Regression,” Econometrica, 56: 931-954.

Rosen, Sherwin. 1974. “Hedonic Prices and Implicit Markets: Product Differentiation in Pure Competition,” Journal of Political Economy, 82(1): 34-55.

Stiglitz, Joseph E. 2000. “The Contributions of The Economics of Information to Twentieth Century Economics,” The Quarterly Journal of Economics,

Tibshirani, Robert. 1996. “Regression Shrinkage and Selection via the Lasso,” Journal of the Royal Statistical Society: Series B, 58: 267-288.

Wooldridge, Jeffrey M. 2009. “On Estimating Firm-Level Production Functions Using Proxy Variables to Control for Unobservables,” Economics Letters, 104: 112-114.

Zou, Hui. 2006. “The Adaptive Lasso and Its Oracle Properties,” Journal of the American Statistical Association, 101: 1418-1429.

A Appendix

A.1 Zillow estimates

Zillow does not disclose the formula for how their hedonic price estimates are produced, but they mention which data they use. According to their website, some of the data that they use include:

• Physical attributes: Location, lot size, square footage, number of bedrooms and bathrooms and many other details.
• Tax assessments: Property tax information, actual property taxes paid, exceptions to tax assessments and other information provided in the tax assessors’ records.
• Prior and current transactions: Actual sale prices over time of the home itself and comparable recent sales of nearby homes.

Table 1. The list of Metropolitan Statistical Areas and the summary statistics of variables used in the MSA level VAR analysis

                         Median Sales Price                  Median Zillow Estimates
MSA                      Oct. 2008  Oct. 2010  Oct. 2012     Oct. 2008  Oct. 2010  Oct. 2012
Atlanta                  194240     148650     154635        155200     131700     112600
Baltimore                270735     274823     259375        255700     230000     220800
Boston                   321025     331100     322985        323300     313500     315000
Charlotte                176225     180775     176900        150600     137200     136000
Chicago                  238100     210125     195925        221000     183900     161600
Cincinnati               136300     127400     123600        147700     143900     141460
Columbus                 154200     148200     157175        137200     128900     126700
Denver                   239645     240925     246325        213600     208200     224400
Las Vegas                226500     138105     132500        199100     127700     122500
Los Angeles              458752     405300     400000        435300     415200     405600
Miami-Fort Lauderdale    240200     157523     161050        196400     143300     149800
Minneapolis              220000     195965     195750        207500     180300     173400
Nashville                160885     171750     171725        155100     148500     148400
New York                 399750     386652     366450        396400     364500     343100
Orlando                  200500     121250     130900        175500     127400     123700
Philadelphia             224748     223575     213550        216300     199900     186500
Phoenix                  215933     149125     165250        189300     133700     154100
Pittsburgh               126800     121800     131425        102800     105900     111100
Portland                 258036     239100     233750        259500     226300     226800
Providence               242250     231225     207965        237600     227200     211000
Riverside                286500     198900     202500        236300     193200     192000
Sacramento               289950     228050     221750        267700     226400     217900
St. Louis                163075     150825     154057        141200     135400     127100
San Diego                401500     355750     358900        373900     364900     362800
San Francisco            572925     483250     480800        536600     499000     512400
San Jose                 606100     556425     564000        605700     561800     610400
Seattle                  330875     309225     296612        330400     278900     267300
Tampa                    168475     130196     124165        147000     117700     111700
Virginia Beach           220441     222750     215125        223100     210900     195700
Washington DC            348750     358245     339717        329500     307700     320200

Notes: The median sales price and Zillow estimates are for three bedroom single family houses.

Table 2. MSA level VAR results

                         (1) Minimum AIC   (2) Minimum BIC   (3) LASSO (0.5)   (4) LASSO (1.0)   (5) LASSO (2.0)
MSA                      p (df)  p-val     p (df)  p-val     df      p-val     df      p-val     df      p-val
Atlanta                  10      0.000     9       0.000     10      0.000     8       0.000     10      0.000
Baltimore                10      0.000     10      0.000     10      0.000     9       0.000     10      0.006
Boston                   10      0.159     10      0.159     10      0.153     10      0.186     10      0.159
Charlotte                10      0.010     9       0.017     10      0.010     10      0.009     6       0.367
Chicago                  9       0.000     9       0.000     9       0.000     9       0.000     9       0.000
Cincinnati               10      0.001     10      0.001     10      0.000     9       0.000     9       0.000
Columbus                 10      0.000     5       0.000     9       0.000     2       0.276     7       0.000
Denver                   10      0.084     7       0.602     10      0.142     10      0.084     9       0.077
Las Vegas                10      0.003     10      0.003     7       0.000     4       0.000     7       0.000
Los Angeles              9       0.000     9       0.000     10      0.000     8       0.000     9       0.000
Miami-Fort Lauderdale    10      0.000     10      0.000     10      0.000     10      0.000     10      0.000
Minneapolis              7       0.000     5       0.000     10      0.005     10      0.003     9       0.003
Nashville                10      0.061     10      0.061     8       0.114     9       0.029     6       0.135
New York                 10      0.044     6       0.000     9       0.007     10      0.021     10      0.000
Orlando                  10      0.000     10      0.000     10      0.000     10      0.000     10      0.000
Philadelphia             10      0.003     4       0.239     10      0.003     9       0.010     9       0.001
Phoenix                  10      0.000     9       0.000     10      0.000     9       0.000     10      0.000
Pittsburgh               10      0.000     10      0.000     9       0.000     10      0.002     10      0.000
Portland                 10      0.065     10      0.065     9       0.039     4       0.000     2       0.232
Providence               10      0.000     10      0.000     10      0.000     9       0.000     10      0.000
Riverside                10      0.000     10      0.000     8       0.000     10      0.000     8       0.000
Sacramento               10      0.000     10      0.000     9       0.000     10      0.000     8       0.000
St. Louis                10      0.014     9       0.011     10      0.025     8       0.007     6       0.002
San Diego                9       0.000     6       0.000     10      0.000     10      0.000     9       0.000
San Francisco            10      0.000     10      0.000     10      0.042     10      0.005     9       0.000
San Jose                 10      0.000     10      0.000     9       0.000     8       0.000     4       0.000
Seattle                  10      0.000     10      0.000     9       0.000     8       0.000     9       0.000
Tampa                    10      0.000     8       0.000     10      0.000     10      0.000     10      0.000
Virginia Beach           10      0.034     4       0.054     9       0.021     4       0.566     3       0.415
Washington DC            10      0.000     10      0.000     10      0.000     10      0.000     10      0.000
United States            10      0.009     10      0.009     10      0.006     5       0.000     6       0.015

Notes: The analysis was performed on monthly data over the period between Oct. 2008 and Apr. 2013.

Table 3. Summary statistics

Variable                                      Mean      Std. Dev.  Min      Max        Obs

Panel A: House level data
Sales price                                   320977    243894     10000    2950000    1200
Zillow estimate when sold                     322056    230858     36000    2600000    1200
Zillow estimate 1 month prior to sale         319858    230901     38000    2600000    1200
Zillow estimate 1 month prior to sale         320484    246462     37000    3.34E+06   1200
Zillow estimate 1 month prior to sale         315433    230222     16100    2600000    1200
Zillow estimate 1 month prior to sale         313439    239434     26000    3080000    1200
List price                                    340065    250090     19900    2900000    1199
Number of bedrooms                            3.85      0.87       3        9          1200
Number of bathrooms                           2.68      0.64       2        6          1199
Square footage                                2373      547        2000     10890      1200
Year built                                    1960      37         1810     2013       1200

Panel B: MSA level data
Percent of 25 and above population
  with bachelor degree or above               0.34      0.07       0.187    0.475      30
Median family income, 2007                    69083     11047      51554    97095      30
Number of households, 2007                    1353754   1348206    46675    6717007    30
Land area                                     6040.7    4783.9     1600.9   27259.9    30
Population, 2008                              2763774   2836117    165829   1.29E+07   30
Houses on Zillow                              1332274   1043634    159506   4976087    30
Houses with Zillow Estimates                  1248442   977717.8   155126   4542490    30
Zillow estimate accuracy                      3.17      0.87       1        4          30

Table 4. The micro-econometric analysis: estimates of γ (effect of online price estimates on sales prices) for the pooled sample Estimates of γ (Effect of online price estimates on sales prices) Y - Sales Prices

No Local First Differencing

All 30 MSAs Pooled

No Proxy (1)

Linear Proxy (2)

Nonlinear Proxy (3)

Z - 1 Month Before H - 2 Months Before

1.002 (0.028)

Z - 1 Month Before H - 3 Months Before

Z - 1 Month Before H - 6 Months Before

1.013 (0.029)

Z - 2 Months Before H - 3 Months Before

Local First Differencing No Proxy (4)

Linear Proxy (5)

Nonlinear Proxy (6)

1.005 (0.031)

0.805 (0.120)

0.988 (0.027)

0.999 (0.032)

0.994 (0.029)

0.985 (0.031)

0.856 (0.117)

0.833 (0.142)

0.808 (0.125)

p-Value

Local First Diff Nonlinear Proxy

H0: γ = 0 (7)

H0: γ = 1 (8)

0.993 (0.047)

0.000

0.887

0.832 (0.113)

1.003 (0.041)

0.000

0.944

0.805 (0.120)

0.893 (0.053)

0.000

0.043

0.292 (0.201)

0.684 (0.225)

0.002

0.160

Z - 2 Months Before H - 6 Months Before

0.874 (0.121)

0.856 (0.119)

0.825 (0.133)

0.266 (0.190)

0.273 (0.192)

0.570 (0.203)

0.005

0.034

Z - 3 Months Before H - 6 Months Before

1.001 (0.031)

0.982 (0.299)

0.970 (0.031)

0.582 (0.100)

0.604 (0.095)

0.857 (0.055)

0.000

0.009

Notes: The γ estimates measure the impact of online price estimates on actual sales prices. Z is the online price estimate of interest and H is the online price estimate used in calculating the nonlinear proxy. Different numbers of months prior to sales were used for the different Z and H combinations. Standard errors are in parentheses.


Table 5. The micro-econometric analysis: tests of the joint hypothesis that β’ = 0 based on the Wald statistic using the pooled sample p-value for the joint hypothesis that β’ = 0 based on the Wald statistics Y - Sales Prices

No Local First Differencing Linear Proxy (2)

Nonlinear

Z - 1 Month Before H - 2 Months Before

0.015

Z - 1 Month Before H - 3 Months Before

All 30 MSAs Pooled

Z - 1 Month Before H - 6 Months Before

No Proxy (1)

0.019

Z - 2 Months Before H - 3 Months Before

Local First Differencing Linear Proxy (5)

Nonlinear

0.011

0.089

0.050

0.018

0.045

0.264

0.016

0.065

0.138

0.200

0.371

0.101

0.075

0.723

0.474

Proxy (3)

No Proxy (4)

0.078

Proxy (6)

Z - 2 Months Before H - 6 Months Before

0.024

0.131

0.181

0.300

0.522

0.874

Z - 3 Months Before H - 6 Months Before

0.003

0.009

0.005

0.036

0.134

0.291

Notes: The γ estimates measure the impact of online price estimates on actual sales prices. Z is the online price estimate of interest and H is the online price estimate used in calculating the nonlinear proxy. Different numbers of months prior to sales were used for the different Z and H combinations. Standard errors are in parentheses.
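The joint tests reported in Table 5 rest on a Wald statistic of the form W = b' V^{-1} b, compared against a chi-square distribution with as many degrees of freedom as tested coefficients. The sketch below illustrates the mechanics for the two-coefficient case only, where the chi-square survival function has the closed form exp(-W/2). The function name `wald_test_2` and all numbers are hypothetical and are not taken from the paper's data or its actual covariance estimates.

```python
import math


def wald_test_2(b, V):
    """Wald statistic and p-value for H0: b = (0, 0).

    b is a pair of coefficient estimates and V their 2x2 covariance matrix.
    Illustrative only: restricted to two coefficients.
    """
    (v11, v12), (v21, v22) = V
    det = v11 * v22 - v12 * v21
    if det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    # Quadratic form b' V^{-1} b, using the closed-form inverse of a 2x2 matrix.
    w = (v22 * b[0] ** 2 - (v12 + v21) * b[0] * b[1] + v11 * b[1] ** 2) / det
    # For 2 degrees of freedom the chi-square survival function is exp(-w / 2).
    return w, math.exp(-w / 2.0)


# Hypothetical coefficient estimates and covariance matrix.
b_hat = (0.02, -0.01)
V_hat = ((0.0004, 0.0001), (0.0001, 0.0009))
w_stat, p_value = wald_test_2(b_hat, V_hat)
print(f"W = {w_stat:.3f}, p-value = {p_value:.3f}")
```

With more than two attributes, the same quadratic form would be evaluated with a general matrix inverse and a chi-square distribution with the corresponding degrees of freedom.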


Table 6. The micro-econometric analysis: estimates of γ (effect of online price estimates on sales prices) by MSA

Local First Differencing with Nonparametric Proxy. Columns (1)-(3): Benchmark Hedonic Value 2 Months Prior to Sales; columns (4)-(6): 3 Months Prior to Sales; columns (7)-(9): 6 Months Prior to Sales. Within each group: estimate of γ (standard error), p-value for H0: γ=0, p-value for H0: γ=1.

                         2 Months Prior                    3 Months Prior                    6 Months Prior
                         Est of γ       γ=0     γ=1        Est of γ       γ=0     γ=1        Est of γ       γ=0     γ=1
MSA                      (1)            (2)     (3)        (4)            (5)     (6)        (7)            (8)     (9)
Atlanta                  0.675 (0.149)  0.000   0.029      0.698 (0.153)  0.000   0.048      0.698 (0.154)  0.000   0.049
Baltimore                0.773 (0.087)  0.000   0.009      0.748 (0.090)  0.000   0.005      0.793 (0.135)  0.000   0.126
Boston                   1.036 (0.089)  0.000   0.683      1.013 (0.126)  0.000   0.920      0.765 (0.240)  0.001   0.326
Charlotte                0.170 (0.405)  0.675   0.040      0.168 (0.426)  0.693   0.051      0.401 (0.313)  0.200   0.056
Chicago                  0.922 (0.080)  0.000   0.328      0.927 (0.082)  0.000   0.377      0.890 (0.084)  0.000   0.192
Cincinnati               0.873 (0.084)  0.000   0.131      0.828 (0.082)  0.000   0.037      0.841 (0.098)  0.000   0.106
Columbus                 1.255 (0.171)  0.000   0.136      1.180 (0.186)  0.000   0.334      1.045 (0.183)  0.000   0.806
Denver                   1.065 (0.137)  0.000   0.635      0.849 (0.151)  0.000   0.319      0.491 (0.152)  0.001   0.001
Las Vegas                0.583 (0.354)  0.100   0.239      0.475 (0.344)  0.167   0.127      0.431 (0.313)  0.169   0.069
Los Angeles              0.505 (0.208)  0.015   0.017      0.607 (0.258)  0.019   0.128      0.408 (0.129)  0.002   0.000
Miami-Fort Lauderdale    0.592 (0.211)  0.005   0.053      0.483 (0.226)  0.032   0.022      0.289 (0.188)  0.125   0.000
Minneapolis              1.019 (0.139)  0.000   0.890      1.019 (0.165)  0.000   0.909      1.003 (0.165)  0.000   0.984
Nashville                1.276 (0.088)  0.000   0.002      1.156 (0.141)  0.000   0.271      1.199 (0.119)  0.000   0.094
New York                 1.201 (0.315)  0.000   0.519      1.168 (0.335)  0.000   0.615      1.018 (0.361)  0.005   0.961
Orlando                  0.865 (0.158)  0.000   0.395      1.025 (0.195)  0.000   0.897      0.827 (0.176)  0.000   0.327
Philadelphia             1.282 (0.258)  0.000   0.273      1.203 (0.252)  0.000   0.421      1.200 (0.245)  0.000   0.415
Phoenix                  1.131 (0.165)  0.000   0.427      1.184 (0.207)  0.000   0.376      1.080 (0.181)  0.000   0.658
Pittsburgh               1.022 (0.071)  0.000   0.755      0.994 (0.082)  0.000   0.942      0.939 (0.084)  0.000   0.471
Portland                 0.725 (0.146)  0.000   0.059      0.801 (0.138)  0.000   0.151      0.569 (0.197)  0.004   0.029
Providence               0.611 (0.239)  0.011   0.104      0.648 (0.225)  0.004   0.118      0.809 (0.197)  0.000   0.332
Riverside                0.377 (0.263)  0.152   0.018      0.271 (0.261)  0.298   0.005      0.112 (0.254)  0.660   0.000
Sacramento               0.703 (0.247)  0.004   0.230      0.674 (0.245)  0.006   0.183      0.520 (0.238)  0.029   0.044
San Diego                0.867 (0.164)  0.000   0.416      0.754 (0.158)  0.000   0.121      0.703 (0.148)  0.000   0.050
San Francisco            1.211 (0.311)  0.000   0.467      1.186 (0.305)  0.000   0.543      0.889 (0.369)  0.016   0.765
San Jose                 1.091 (0.193)  0.000   0.637      0.937 (0.217)  0.000   0.770      0.761 (0.242)  0.002   0.323
Seattle                  1.086 (0.146)  0.000   0.555      1.117 (0.137)  0.000   0.395      0.994 (0.144)  0.000   0.970
St. Louis                0.853 (0.113)  0.000   0.190      0.951 (0.101)  0.000   0.630      0.778 (0.126)  0.000   0.079
Tampa                    1.223 (0.069)  0.000   0.001      1.229 (0.078)  0.000   0.003      1.203 (0.067)  0.000   0.002
Virginia Beach           0.434 (0.323)  0.179   0.080      0.521 (0.295)  0.077   0.105      0.405 (0.242)  0.094   0.014
Washington DC            0.986 (0.055)  0.000   0.797      0.994 (0.056)  0.000   0.920      0.968 (0.239)  0.000   0.895

Notes: The γ estimates measure the impact of one month prior online price estimates on actual sales prices. Online estimates two, three, and six months prior to sales were used in calculating the nonlinear proxy. Robust standard errors are in parentheses.

Table 7. Determinants of the γ estimate

Dependent variable: Estimate of γ (Effect of online price estimates on sales prices)

                                             (1)                (2)                (3)                (4)                (5)                (6)
Percent bachelor degree or higher            1.820*** (0.484)   1.858*** (0.603)   1.761** (0.722)    1.816** (0.839)    1.635* (0.906)     1.924 (1.396)
Ln(number of households 2007)                                   -0.00422 (0.0354)  -0.0250 (0.0309)   -0.0218 (0.0537)   -0.0281 (0.0618)   -0.0397 (0.0791)
Ln(land area)                                                   0.0219 (0.0979)    -0.0553 (0.124)    -0.0484 (0.129)    -0.0578 (0.134)    -0.0642 (0.142)
Ln(population 2008)                                             0.00957 (0.0383)   -0.00256 (0.0475)  -0.00372 (0.0589)  0.00142 (0.0635)   -0.00912 (0.0813)
Ln(number of houses on Zillow)                                                     1.038 (1.223)      0.950 (1.107)      1.094 (1.253)      1.014 (1.373)
Ln(number of houses with Zillow estimates)                                         -0.958 (1.225)     -0.877 (1.143)     -1.007 (1.268)     -0.903 (1.422)
Zillow estimate median error                                                                          1.104 (3.105)
Log(median family income 2007)                                                                                                              -0.252 (0.871)
Zillow estimate accuracy dummy variables                                                                                 Y                  Y
Observations                                 30                 30                 30                 29                 30                 30
R-squared                                    0.189              0.191              0.230              0.232              0.235              0.238

Notes: Estimates are based on the specification that uses nonlinear proxy and local first differencing. The γ estimates measure the impact of one month prior online price estimates on actual sales prices. The observation drops by one in column (4) because Zillow did not report their median error for St. Louis. The Zillow estimate accuracy dummy variables are four categorical variables based on the degree of error between actual sales price and Zillow’s price estimate. Four implies the most accurate (lowest error) and two implies the least accurate (highest error). Category one is for cities where accuracy is unknown, i.e. St. Louis. The two month prior online estimates were used in calculating the nonlinear proxy. Robust standard errors are in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Figure 1. Scatterplot between the elasticity estimate and educational attainment across MSAs

[Scatter plot not reproduced. Vertical axis: estimate of gamma. Horizontal axis: percent bachelor degree or above in 2010. Each point is labeled with its MSA.]

Notes: Estimates are based on the specification that uses nonlinear proxy and local first-differencing. The gamma estimates measure the impact of one month prior online price estimates on actual sales prices. The two month prior online estimates were used in calculating the nonlinear proxy.

Appendix Table 1. Estimates of γ (effect of online price estimates on sales prices) by MSA using the 6 months prior estimate in calculating the proxy variable

Estimates of γ (Effect of Online Estimates 1 Month Prior to Sales). Standard errors in parentheses. The p-values refer to the Local First Differencing, Nonlinear Proxy specification.

                         No Local First Differencing                     Local First Differencing                        p-Value
MSA                      No Proxy       Linear Proxy   Nonlinear Proxy   No Proxy        Linear Proxy    Nonlinear Proxy  H0: γ=0  H0: γ=1
Atlanta                  0.822 (0.090)  0.811 (0.077)  0.827 (0.069)     0.573 (0.160)   0.488 (0.140)   0.698 (0.154)    0.000    0.049
Baltimore                1.196 (0.095)  1.010 (0.116)  0.941 (0.059)     1.036 (0.197)   0.968 (0.211)   0.793 (0.135)    0.000    0.126
Boston                   1.113 (0.071)  1.110 (0.070)  1.095 (0.079)     1.385 (0.563)   1.384 (0.563)   0.765 (0.240)    0.001    0.326
Charlotte                0.648 (0.569)  1.248 (0.420)  0.776 (0.462)     -1.004 (1.558)  -0.656 (1.173)  0.401 (0.313)    0.200    0.056
Chicago                  0.958 (0.548)  0.901 (0.053)  0.919 (0.049)     0.887 (0.102)   0.860 (0.102)   0.890 (0.084)    0.000    0.192
Cincinnati               1.024 (0.073)  0.982 (0.065)  0.956 (0.064)     0.854 (0.097)   0.798 (0.095)   0.841 (0.098)    0.000    0.106
Columbus                 1.121 (0.065)  1.035 (0.051)  1.059 (0.056)     1.212 (0.271)   1.170 (0.343)   1.045 (0.183)    0.000    0.806
Denver                   0.900 (0.208)  0.608 (0.139)  0.625 (0.120)     0.550 (0.376)   0.510 (0.320)   0.491 (0.152)    0.001    0.001
Las Vegas                0.818 (0.210)  0.715 (0.182)  0.834 (0.166)     0.698 (0.355)   0.627 (0.347)   0.431 (0.313)    0.169    0.069
Los Angeles              0.793 (0.152)  0.791 (0.142)  0.793 (0.108)     0.157 (0.201)   0.147 (0.189)   0.408 (0.129)    0.002    0.000
Miami-Fort Lauderdale    1.130 (0.108)  0.737 (0.098)  0.793 (0.139)     0.781 (0.462)   0.577 (0.317)   0.289 (0.188)    0.125    0.000
Minneapolis              0.903 (0.153)  1.098 (0.090)  1.090 (0.110)     0.578 (0.290)   0.692 (0.278)   1.003 (0.165)    0.000    0.984
Nashville                1.342 (0.070)  0.917 (0.076)  0.994 (0.092)     1.471 (0.146)   1.089 (0.138)   1.199 (0.119)    0.000    0.094
New York                 0.980 (0.106)  0.932 (0.098)  0.953 (0.100)     1.264 (0.335)   1.388 (0.346)   1.018 (0.361)    0.005    0.961
Orlando                  1.173 (0.105)  0.872 (0.082)  0.878 (0.088)     1.187 (0.211)   1.125 (0.152)   0.827 (0.176)    0.000    0.327
Philadelphia             1.131 (0.131)  0.988 (0.131)  0.993 (0.133)     1.178 (0.297)   1.062 (0.321)   1.200 (0.245)    0.000    0.415
Phoenix                  0.775 (0.189)  0.734 (0.170)  0.756 (0.170)     0.875 (0.157)   0.886 (0.153)   1.080 (0.181)    0.000    0.658
Pittsburgh               0.942 (0.071)  0.888 (0.051)  0.934 (0.041)     1.053 (0.106)   0.954 (0.139)   0.939 (0.084)    0.000    0.471
Portland                 1.256 (0.133)  1.103 (0.108)  1.081 (0.100)     0.521 (0.224)   0.557 (0.222)   0.569 (0.197)    0.004    0.029
Providence               0.939 (0.092)  0.891 (0.080)  0.887 (0.093)     0.876 (0.202)   0.838 (0.292)   0.809 (0.197)    0.000    0.332
Riverside                0.740 (0.179)  0.720 (0.210)  0.384 (0.174)     0.121 (0.252)   0.051 (0.234)   0.112 (0.254)    0.660    0.000
Sacramento               1.131 (0.281)  0.710 (0.193)  0.676 (0.159)     0.900 (0.295)   0.626 (0.234)   0.520 (0.238)    0.029    0.044
San Diego                0.891 (0.127)  0.699 (0.106)  0.727 (0.107)     1.307 (0.159)   1.264 (0.147)   0.703 (0.148)    0.000    0.050
San Francisco            0.810 (0.186)  0.894 (0.140)  1.033 (0.169)     0.509 (0.292)   0.676 (0.285)   0.889 (0.369)    0.016    0.765
San Jose                 0.856 (0.078)  0.897 (0.057)  0.940 (0.052)     0.678 (0.252)   0.761 (0.279)   0.761 (0.242)    0.002    0.323
Seattle                  0.947 (0.103)  0.934 (0.086)  0.939 (0.086)     1.074 (0.128)   1.059 (0.139)   0.994 (0.144)    0.000    0.970
St. Louis                1.014 (0.160)  0.873 (0.108)  0.926 (0.107)     0.694 (0.195)   0.756 (0.138)   0.778 (0.126)    0.000    0.079
Tampa                    1.163 (0.108)  0.856 (0.073)  1.086 (0.063)     1.423 (0.145)   1.177 (0.123)   1.203 (0.067)    0.000    0.002
Virginia Beach           0.932 (0.129)  1.001 (0.062)  1.040 (0.085)     0.137 (0.261)   0.133 (0.243)   0.405 (0.242)    0.094    0.014
Washington DC            0.951 (0.048)  0.965 (0.032)  0.974 (0.030)     0.351 (0.394)   0.453 (0.349)   0.968 (0.239)    0.000    0.895

Notes: The γ estimates measure the impact of one month prior online price estimates on actual sales prices. The six month prior online estimates were used in calculating the nonlinear proxy. Robust standard errors are in parentheses.

Appendix Table 2. Determinants of the γ estimate (3 month prior estimates used in the nonlinear proxy)

Dependent variable: Estimate of γ (Effect of online price estimates on sales prices)

                                             (1)                (2)                (3)                (4)                (5)
Percent bachelor degree or higher            1.584** (0.623)    1.579** (0.670)    1.361* (0.724)     1.451* (0.833)     1.496 (0.888)
Ln(number of households 2007)                                   -0.00979 (0.0399)  -0.0417 (0.0317)   -0.0231 (0.0519)   -0.0199 (0.0552)
Ln(land area)                                                   -0.00427 (0.106)   -0.114 (0.128)     -0.109 (0.133)     -0.126 (0.132)
Ln(population 2008)                                             0.0117 (0.0364)    -0.0120 (0.0453)   -0.00572 (0.0568)  -0.00400 (0.0594)
Ln(number of houses on Zillow)                                                     1.084 (1.290)      1.095 (1.218)      1.194 (1.309)
Ln(number of houses with Zillow estimates)                                         -0.960 (1.291)     -0.990 (1.238)     -1.103 (1.311)
Zillow estimate median error                                                                          0.306 (2.997)
Zillow estimate accuracy dummy variables                                                                                 Y
Observations                                 30                 30                 30                 29                 30
R-squared                                    0.151              0.153              0.226              0.227              0.243

Notes: Estimates are based on the specification that uses nonlinear proxy and local first differencing. The γ estimates measure the impact of one month prior online price estimates on actual sales prices. The observation drops by one in column (4) because Zillow did not report their median error for St. Louis. The Zillow estimate accuracy dummy variables are four categorical variables based on the degree of error between actual sales price and Zillow’s price estimate. Four implies the most accurate (lowest error) and two implies the least accurate (highest error). Category one is for cities where accuracy is unknown, i.e. St. Louis. The three month prior online estimates were used in calculating the nonlinear proxy. Robust standard errors are in parentheses. *** p<0.01, ** p<0.05, * p<0.1.
