Introduction

Housing markets play an important role in the economy. For example, Bertaut and Starr-McCluer (2002) show that residential properties accounted for about one quarter of aggregate household wealth in the United States in the late 1990s, and Tracy and Schneider (2001) show that housing wealth accounts for about two-thirds of the wealth of the median U.S. household. Changes in home prices and trading volume seem to have significant economic impacts on builders, brokers, lenders, appraisers, and furniture consumption, as well as on local property tax collections and related local government budgets, in addition to local affordability and wealth. A rapid surge in home prices and trading volume after 2000 was seen across many areas of the United States, and it has been followed by a recent decline in house prices and trading volume. These phenomena have generated much discussion about whether the U.S. has been in a “housing bubble” (see, e.g., Case and Shiller 2003). Despite the importance of housing markets and the economic and policy implications of changes in home prices and trading volume, some important aspects of housing markets are not well understood.

A well-known pattern in the housing market is that prices and trading volume seem to correlate with each other: trading activity tends to be more intense (i.e., more transactions and less time on the market before sale) when prices are rising than when they are falling. The positive correlation between prices and trading volume appears to be inconsistent with standard rational expectations asset market models, in which housing prices are present discounted values of the future service streams (see, e.g., Poterba 1984). A conventional interpretation of the correlation is that price changes cause changes in trading volume. The causal relation is built on one of three factors: equity constraints (see footnote 1), nominal loss aversion (homeowners are less willing to sell their homes in a falling market to avoid realized losses), or the option value of homeowners (homeowners wait to sell when the upside benefits exceed net carrying costs, see Cauley and Pavlov 2002). Stein (1995), Genesove and Mayer (1997), Lamont and Stein (1999), and Chan (2001) provide theoretical and empirical evidence for equity constraints of home sellers. Genesove and Mayer (2001), Cauley and Pavlov (2002), and Engelhardt (2003) provide evidence for nominal loss aversion.

Although research regarding the causal relation between prices and trading volume greatly improves our understanding of the dynamics of the housing market, a few important questions have not been satisfactorily answered. First, is a positive price–volume correlation widely observed across markets? It is striking that there is mixed evidence regarding the relation between prices and trading volume, and the evidence is from either aggregate national level data, or from small panel data (up to 22 metropolitan areas). While a positive price–volume correlation is found by Stein (1995), Berkovec and Goodman (1996), Andrew and Meen (2003), and Ortalo-Magné and Rady (2004), a negative relation is found by Follain and Velz (1995) and Hort (2000), and no significant relation is found in commercial real estate by Leung and Feng (2005).

Second, does the causal relation from prices to trading volume necessarily explain the contemporaneous price–volume correlation? The causal relation, though strongly supported by empirical evidence, more naturally implies a lead–lag relation instead of a positive correlation. While it is possible that a lead–lag relation at high frequency helps generate a contemporaneous correlation at low frequency, or a correlation at the same frequency due to possible positive autocorrelations of prices, to date no empirical study has been conducted to assess the extent to which the causal relation from house prices to trading volume helps explain the price–volume correlation.

Third, is the price–volume correlation necessarily, or solely, due to the causal relation between prices and trading volume? Houses are not only assets, but also consumption goods. While the supply of many assets such as common stocks may be fixed in the short term, the aggregate demand and supply for housing in a market is often elastic. In fact, Smith (1976), Hanushek and Quigley (1980), DiPasquale and Wheaton (1994), and Malpezzi and Maclennan (2001), among others, provide evidence of negative price elasticity of housing demand and positive price elasticity of housing supply. Therefore, shocks to the housing market may affect both home prices and trading volume, and thus cause them to move together, which may lead to a price–volume correlation. Further, there is a solid theoretical foundation for the co-movements of prices and trading volume in housing markets. Wheaton (1990) provides theoretical evidence that exogenous variables such as demand shocks can affect both vacancy and sales time, which usually relate negatively to turnover, and prices in housing markets. More theories along this line are proposed by Krainer (2001), Ortalo-Magné and Rady (2006), and Novy-Marx (2007). While the theories suggest that the price–volume correlation could simply reflect co-movements, there is no empirical study of such co-movements and the extent to which they help explain the price–volume correlation (see footnote 2). Moreover, Wheaton (1990) also predicts that trading volume itself can affect house prices: a higher rate of successful matching between buyers and sellers reduces the supply of for-sale units; therefore, sellers adjust their reservation prices upward (see footnote 3). This causal relation from trading volume to prices might also help explain the price–volume correlation in the housing market, although there is no empirical study of this possibility at this moment.

This paper aims to shed light on these three questions using an unusually large panel dataset which comprises 114 metropolitan statistical areas (MSAs) in the U.S. and covers a sample period from 1990:2 to 2002:2. First, we fit to the data a bivariate VAR model with both prices and volume (measured with turnover) being endogenous, and estimate how exogenous variables, such as conditions in the labor market, the mortgage market, and the financial market, and lagged endogenous variables affect both prices and volume in housing markets. We test Granger causality from prices to trading volume (the Stein 1995 theory) and from trading volume to prices (the Wheaton 1990 theory), to compare and contrast the two theories. In this step, we also estimate three alternative specifications of the VAR model. Specifically, we separate positive changes in house prices from negative changes to test an asymmetric relation between house prices and volume implied by Stein (1995), which arises because equity constraints are binding in falling markets. Further, housing markets are well known for being heterogeneous, particularly in terms of supply elasticity. Therefore, we break down our sample into two groups of MSAs, with above and below median supply elasticity respectively, and estimate the model and conduct Granger causality tests for each group.

Second, we empirically analyze determinants of prices and trading volume in housing markets and investigate the existence and magnitude of the co-movements between prices and trading volume, as well as to what extent they help explain the price–volume correlation. For each specification of the VAR model, we decompose the changes in prices and volume respectively into two components: the fitted part (explained by our VAR model) and the residual, and investigate if our model captures the price volume correlation in the data. Further, we decompose the fitted values into three components: a price-caused component (explained by lagged prices), a trading volume-caused component (explained by lagged trading volume), and a co-movement component (explained by exogenous changes in the economy), and study how each component helps explain the fitted price–volume correlation. Finally, we use impulse response functions to describe the responses of prices and trading volume to shocks.

This paper provides original insights into the determinants of prices and trading volume in the housing market. We find that both house prices and trading volume are significantly affected by changes in the labor market, which include changes in total non-agricultural employment, average household income, and the unemployment rate. The housing market is also significantly affected by the level and trend of mortgage rates (we use the national average interest rate for 30-year fixed rate mortgages). When the mortgage rate is high and when it is falling, both home prices and trading volume are low. Interestingly, the stock market performance also has a statistically significant effect on house prices. When the S&P 500 index is high (level) or when it shows a downturn (trend), home prices tend to be low and trading volume tends to be high.

We find strong evidence that home prices Granger cause trading volume. Moreover, it is the decreases in prices, not the increases, that affect future trading volume, which is direct evidence in support of Stein (1995). We also find some evidence that trading volume Granger causes home prices, which derives mainly from markets with low supply elasticity. This appears to indicate that in markets where supply can easily adjust, trading volume does not seem to affect future prices. The fact that trading volume more significantly affects prices in supply constrained markets seems consistent with Wheaton (1990). Overall, we find supporting evidence for both Stein (1995) and Wheaton (1990).

We find a statistically significant positive price–volume correlation in the housing market, and this correlation seems to be explained by co-movements of house prices and trading volume instead of the causal relations between prices and volume. Specifically, we find the positive correlation is almost completely explained by the home prices and trading volume fitted by our panel VAR model. In addition, we find that the price-caused components of prices and volume are negatively correlated, which indicates that the Granger causality from prices to trading volume does not appear to help explain the positive price–volume correlation. We also find that the trading volume-caused components of prices and volume are positively correlated in markets with high supply elasticity, but negatively correlated in markets with low supply elasticity. Therefore, the Granger causality from trading volume to prices does not seem to provide a good explanation for the average price–volume correlation that is positive. Finally, the co-movement components of prices and volume are significantly positively correlated for markets with both high and low supply elasticity; therefore exogenous shocks seem to explain the positive price–volume correlation well. Overall, our empirical evidence suggests that home prices and trading volume indeed Granger cause each other, but the causal relations do not appear to be driving the positive price–volume correlation, at least not at quarterly frequency.

This paper is original in four aspects. First, this is the first study that investigates the contemporaneous price–volume correlation in the housing market using a large panel data set comprising a large number of markets (114 MSAs) that are arguably distinct from each other. Second, this paper is the first to test the Granger causality between house prices and trading volume and to compare and contrast Stein’s (1995) and Wheaton’s (1990) theories. Third, this paper is the first to empirically study the co-movements of prices and trading volume caused by exogenous economic/demographic shocks. Finally, this paper is the first to assess the extent to which the co-movements of prices and volume and the causality between them, respectively, help explain the price–volume correlation.

The paper proceeds as follows. The next section presents the econometric model. Section “Model Specifications and Data” discusses the specification of the model and the data. Section “Empirical Evidence” presents the empirical evidence. Section “Conclusions” provides conclusions.

Econometric Model

The Model

We use a bivariate VAR model to analyze the determinants of both house prices and trading volume. This approach has a few important merits. First, it allows us to directly test the Granger causality between prices and trading volume. Wheaton (1990) suggests not only that turnover and house prices are jointly determined, but also that greater market turnover itself can generate higher house prices by reducing sales time and increasing seller reservations, which predicts that turnover Granger causes house price changes. On the other hand, Stein (1995) and others suggest that house price changes should Granger cause turnover, due to equity constraints, loss aversion, or the option value of homeowners. While all the above predictions provide important theoretical insights, they have not been tested using large cross-sectional time series data in the literature.

Second, this approach enables us to better understand the determination of house prices and trading volume, and decompose prices and volume respectively into four components: the component determined by exogenous variables; the component determined by lagged prices; the component determined by lagged trading volume; and the component determined by other unknown variables. This decomposition allows us to calculate the price–volume correlation using each of the four components of the price and volume and assess the direction and magnitude of the correlation due to each of the components.

We now build the bivariate panel VAR model. We assume that both the equilibrium housing price level and turnover are functions of quarterly dummy variables, exogenous variables and lagged endogenous variables.

$$\left( {\begin{array}{*{20}c}{p_{i,t} } \\{q_{i,t} } \\\end{array} } \right) = \left( {\begin{array}{*{20}c}{a_i } \\{b_i } \\\end{array} } \right)t + \sum\limits_{s = 1,2,3,4} {A_s d_s } + \sum\limits_{s = 1}^k {B_s \left( {\begin{array}{*{20}c}{p_{i,t - s} } \\{q_{i,t - s} } \\\end{array} } \right)} + CX_{i,t} + \left( {\begin{array}{*{20}c}{\varepsilon _{i,t}^p } \\{\varepsilon _{i,t}^q } \\\end{array} } \right)$$
(1)

In Eq. 1, \(d_s\) is a dummy variable for the sth quarter, which equals 1 if period t is the sth quarter and 0 otherwise. For the ith MSA in period t, \(p_{i,t}\) denotes the log of the equilibrium price, \(q_{i,t}\) denotes the log of the turnover (measured as the ratio of existing single family home sales to the units of existing single family homes), \(X_{i,t}\) is a k by 1 vector of exogenous variables that affect either the demand or supply in the market, and \(\varepsilon _{i,t}^p\) and \(\varepsilon _{i,t}^q\) are error terms. The coefficients \(a_i\) and \(b_i\) are scalars, \(A_s\) is a 2 by 1 vector, \(B_s\) is a 2 by 2 matrix, and \(C\) is a 2 by k matrix, with k being the number of exogenous variables in \(X_{i,t}\). All variables on the right side of the equation can affect either the demand or supply in the housing market, and thus ultimately determine the equilibrium price level and turnover. While the functional forms of the demand and supply curves themselves are interesting, this paper focuses on the aggregate effect of the explanatory variables because it appears sufficient to help us test the two lines of theories regarding the price–volume correlation. It is not this paper’s research goal to estimate the demand or supply curve of houses.

Four points are worth noting in Eq. 1. First, the equation includes lagged market prices as explanatory variables, and thus allows them to affect the equilibrium price and trading volume. This accommodates the causal relation from market prices to trading volume as predicted by Stein (1995).

Second, the model allows lagged trading volume to affect both the price and turnover, which essentially allows market participants to update their private valuations based on historical trading volume. This enables us to test the Granger causality from trading volume to prices, which Wheaton (1990) suggests. In addition, this accommodates the feedback effects proposed by Novy-Marx (2007), which suggests that a demand shock may increase the buyer-to-seller ratio in the market, and thus reduce the time on the market and increase the turnover of housing units. Changes in trading volume, consequently, can help sellers update their information set and thus change their asking prices, which shifts the supply curve.

Third, the prices in our model are nominal prices. We chose nominal prices instead of real prices because an important theory that aims to explain the price–volume correlation relies on nominal loss aversion of homeowners (see, e.g., Genesove and Mayer 2001; Engelhardt 2003). Moreover, existing research suggests that people often make financial decisions in nominal terms. For example, Shafir et al. (1997) argue that money illusion is common in a wide variety of contexts. In particular, they find that a majority of survey respondents focus on nominal rather than real gains in assessing hypothetical gains/losses when selling a house.

Finally, our model controls for the heterogeneity in the housing market in two ways. First, our model includes MSA-specific dummies, which would capture all unobserved time-invariant MSA characteristics, such as geographic attributes. Second, our model includes economic variables at the MSA level, which help capture local economic conditions that are time-variant. However, our results should be interpreted with caution: the estimated parameters should be treated as averages across the MSAs in our sample or subsamples, and our analysis can be interpreted as analysis of an average MSA. Note that this is not necessarily a problem—the theories we test are general and should apply to all MSAs; therefore, results from an average MSA serve our research purposes.

Since our data include price indices (with the index level normalized to 100 for 1995:1) rather than actual prices, we cannot estimate (1) directly. Instead, we estimate the first-order difference of (1):

$$\left( {\begin{array}{*{20}c}{\Delta p_{i,t} } \\{\Delta q_{i,t} } \\\end{array} } \right) = \left( {\begin{array}{*{20}c}{a_i } \\{b_i } \\\end{array} } \right) + \sum\limits_{s = 1,2,4} {A_s d_s } + \sum\limits_{s = 1}^k {B_s \left( {\begin{array}{*{20}c}{\Delta p_{i,t - s} } \\{\Delta q_{i,t - s} } \\\end{array} } \right)} + C\Delta X_{i,t} + \left( {\begin{array}{*{20}c}{\nu _{i,t}^p } \\{\nu _{i,t}^q } \\\end{array} } \right).$$
(2)

We assume the error terms have zero means and are orthogonal to all explanatory variables. The quarterly dummies in (2) are first-order differences of the dummies in (1), but we use the same notations to simplify the illustrations. The system in (2) is essentially a fixed-effect panel VAR model. In our estimation, we use the within transformation to eliminate MSA dummies, so variables in (2) become demeaned.
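The within transformation and the equation-by-equation estimation described above can be sketched as follows. This is a minimal illustration on made-up numbers, assuming NumPy is available. The function names `within_demean` and `fixed_effect_ols` are hypothetical and do not come from the paper, and ordinary least squares on the demeaned data is one common way to estimate each equation of a fixed-effect panel model, not necessarily the exact procedure used here.

```python
import numpy as np

def within_demean(values, groups):
    """Subtract each group's (here: each MSA's) mean from its own observations."""
    values = np.asarray(values, dtype=float)
    groups = np.asarray(groups)
    out = np.empty_like(values)
    for g in np.unique(groups):
        mask = groups == g
        out[mask] = values[mask] - values[mask].mean(axis=0)
    return out

def fixed_effect_ols(y, X, groups):
    """OLS of y on X after removing group means, which absorbs the MSA intercepts."""
    y_w = within_demean(y, groups)
    X_w = within_demean(X, groups)
    coef, *_ = np.linalg.lstsq(X_w, y_w, rcond=None)
    return coef

# Toy panel (made up): two hypothetical MSAs with different intercepts
# and a common slope of 2 on a single regressor.
msa = np.array([0, 0, 0, 1, 1, 1])
x = np.array([1.0, 2.0, 4.0, 0.0, 3.0, 5.0])
intercepts = np.array([10.0, -5.0])
y = intercepts[msa] + 2.0 * x

slope = fixed_effect_ols(y, x.reshape(-1, 1), msa)
print(slope)  # close to [2.], regardless of the MSA intercepts
```

In the model of Eq. 2 the regressor matrix would hold the quarterly dummies, the lagged changes in prices and turnover, and the changes in the exogenous variables, and the same regression would be run once for each of the two equations.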

Tests and Analysis

Based on the results of estimating the model in (2), we conduct the following analysis. First, we test the null hypotheses that house prices do not Granger cause trading volume, and trading volume does not Granger cause house prices. The null hypothesis that house prices (trading volume) do not Granger cause trading volume (house prices) essentially imposes the constraint that the coefficients of all lagged prices (trading volume) are 0 in the second (first) equation of (2), which can be easily tested with an F-test. These hypotheses are expected to be rejected if the theories by Stein (1995) and Wheaton (1990) are valid.
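The F-test described above compares the residual sum of squares of a restricted regression (lags of the other variable excluded) with that of the unrestricted regression. The sketch below illustrates the statistic on simulated series, assuming NumPy is available. The function names and the simulated coefficients are hypothetical, the data are assumed to be already demeaned, and the degrees-of-freedom count is simplified, so this is an illustration of the idea and not the paper's exact test.

```python
import numpy as np

def rss(y, X):
    """Residual sum of squares from an OLS regression of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(resid @ resid)

def granger_f_stat(y, X_other, X_tested):
    """F-statistic for H0: all coefficients on the columns of X_tested are zero."""
    X_u = np.column_stack([X_other, X_tested])
    rss_restricted = rss(y, X_other)
    rss_unrestricted = rss(y, X_u)
    n_restrictions = X_tested.shape[1]
    # Simplification: in a fixed-effect panel the degrees of freedom would
    # also be reduced by the number of MSA means that were removed.
    dof = len(y) - X_u.shape[1]
    return ((rss_restricted - rss_unrestricted) / n_restrictions) / (rss_unrestricted / dof)

# Simulated (not real) series in which lagged price changes do move turnover.
rng = np.random.default_rng(0)
T = 200
dp = rng.normal(size=T)
dq = np.zeros(T)
for t in range(1, T):
    dq[t] = 0.3 * dq[t - 1] + 0.5 * dp[t - 1] + 0.1 * rng.normal()

y = dq[1:]                           # turnover change in quarter t
own_lag = dq[:-1].reshape(-1, 1)     # lagged turnover change (kept under H0)
price_lag = dp[:-1].reshape(-1, 1)   # lagged price change (excluded under H0)
f_stat = granger_f_stat(y, own_lag, price_lag)
print(f_stat)  # a large value rejects "prices do not Granger cause volume"
```

Swapping the roles of the two series gives the test of whether trading volume Granger causes prices.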

Second, we investigate the existence and magnitude of the price–volume correlation. The price–volume correlation is defined as the correlation between changes in home prices, i.e. \(\Delta p_{i,t}\), and changes in trading volume, i.e. \(\Delta q_{i,t}\) (both are demeaned using the within transformation). We do not use the correlation between \(p_{i,t}\) and \(q_{i,t}\) because prices have trends and are not stationary, while the trading volume is bounded between 0 and 1; therefore, the correlation between them in a long sample period does not seem to make much economic sense.

Third, we assess how well the fitted prices and volume in our model (explained by both exogenous economic changes and lagged prices and volume) help explain the price–volume correlation. We decompose \(\Delta p_{i,t}\) and \(\Delta q_{i,t}\) respectively into the fitted values and residuals (unexplained by our model), and then calculate the correlation between the fitted price and fitted volume and the correlation between the residuals, respectively. We compare the “fitted” correlation and the “residual” correlation with the raw price–volume correlation. The comparison helps us assess how well our model captures the price–volume correlation overall.

Fourth, we analyze the degree to which the Granger causality between prices and trading volume and the co-movements of the price and volume help explain the price–volume correlation respectively. This time, we decompose the fitted values of \(\Delta p_{i,t}\) and \(\Delta q_{i,t}\) respectively into three parts: the price-caused component (explained by lagged prices), the trading volume-caused component (explained by lagged trading volume) and the co-movement component (explained by other variables). We then assess the significance and magnitude of the correlations for different components and investigate how well each component helps explain the fitted price–volume correlation.
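The three-way decomposition can be illustrated with a short sketch, assuming NumPy is available. All numbers below are made up: the regressors are random placeholders instead of true lags, and the coefficient vectors stand in for estimates of the rows of \(B_s\) and \(C\). The function names `split_fitted` and `corr` are hypothetical.

```python
import numpy as np

def split_fitted(lag_dp, lag_dq, dX, b_price, b_volume, c):
    """Split one equation's fitted values into the price-caused,
    volume-caused, and co-movement components."""
    return {
        "price": lag_dp @ b_price,     # explained by lagged price changes
        "volume": lag_dq @ b_volume,   # explained by lagged turnover changes
        "comove": dX @ c,              # explained by exogenous variables
    }

def corr(a, b):
    """Pearson correlation between two one-dimensional arrays."""
    return float(np.corrcoef(a, b)[0, 1])

rng = np.random.default_rng(1)
n = 300
lag_dp = rng.normal(size=(n, 2))   # placeholder for two lags of price changes
lag_dq = rng.normal(size=(n, 2))   # placeholder for two lags of turnover changes
dX = rng.normal(size=(n, 3))       # placeholder for three exogenous variables

# Made-up coefficients for the price equation and the volume equation.
price_eq = split_fitted(lag_dp, lag_dq, dX,
                        np.array([0.4, 0.1]), np.array([0.05, 0.02]),
                        np.array([0.3, 0.2, -0.1]))
volume_eq = split_fitted(lag_dp, lag_dq, dX,
                         np.array([-0.2, 0.3]), np.array([0.5, 0.1]),
                         np.array([0.6, 0.1, -0.3]))

for name in ("price", "volume", "comove"):
    print(name, corr(price_eq[name], volume_eq[name]))
```

One property of this construction is worth keeping in mind: with a single lag, each causal component of prices is a scalar multiple of the same component of volume, so its correlation is exactly +1 or -1 and only its sign is informative. With several lags the correlation can take any value between -1 and 1.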

Finally, we study how shocks in exogenous variables affect the dynamics of the price and trading volume in the housing market. We construct and plot impulse response functions to describe how prices and trading volume react to exogenous shocks respectively. The impulse response functions help shed light on the economic sources of the price–volume correlation.
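As a rough illustration of how such impulse responses can be traced out, the sketch below iterates a one-lag version of the system in (2) after a one-time shock, assuming NumPy is available. The lag matrix `B`, the impact vector, and the function name `impulse_response` are made-up assumptions for illustration; the paper's own lag length and estimates are not used here.

```python
import numpy as np

def impulse_response(B, impact, horizons):
    """Responses of (change in log price, change in log turnover) to a
    one-time shock, for a one-lag VAR with coefficient matrix B."""
    responses = [np.asarray(impact, dtype=float)]
    for _ in range(horizons):
        # Each quarter's response is the lag matrix applied to the previous one.
        responses.append(B @ responses[-1])
    return np.array(responses)

B = np.array([[0.5, 0.0],
              [0.2, 0.5]])         # made-up lag coefficients
impact = np.array([1.0, 1.0])      # made-up same-quarter effect of the shock
irf = impulse_response(B, impact, horizons=2)
print(irf)
# rows are horizons 0, 1, 2: [1, 1], [0.5, 0.7], [0.25, 0.45]

# Because the model is in first differences, cumulating the responses
# gives the responses of the log levels.
print(np.cumsum(irf, axis=0))
```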

Model Specifications and Data

Model Specifications

This section discusses our choice of exogenous variables \(X_{i,t}\) in the panel VAR model. We categorize variables that may affect the demand and/or supply in the housing market as labor market related, mortgage market related, and financial market related variables. Note that our estimation uses the demeaned first-order differences of the log values of these variables.

Changes in the labor market and local demographic conditions likely affect housing demand and/or supply for several reasons. First, rising immigration and the growth of the local economy and/or population may increase demand for dwellings such as single family homes. Therefore, we include the total non-agricultural employment as an exogenous variable. Second, changes in income may increase housing demand. Consequently, we include the average household income as another exogenous variable. Third, changes in the unemployment rate imply that the number of people who need to search for jobs in and out of a specific area is changing, which likely affects the housing demand and supply in the area. As a result, we include the unemployment rate as the third labor market related variable.

Mortgage market conditions likely affect house prices and turnover as well, for borrowing cost is another ostensible exogenous variable that affects housing demand and supply. We consider two variables that may be relevant. The first one is the mortgage rate per se. It is plausible that home buyers are less financially constrained when mortgage rates are lower. The second one is the trend in mortgage rates. Among other possibilities, potential buyers have the real option to delay their home purchases until mortgage rates are more favorable. Consequently, when mortgage rates seem to be falling, potential buyers may choose to postpone their home purchases, and housing demand may decrease. A measure of the trend in quarter t is the change of mortgage rates from quarter t – 1 to quarter t. Since the autocorrelation of the change in mortgage rate is indeed positive and fairly large (0.13), the change from quarter t – 1 to quarter t does help capture the trend.

In our estimation, we use the national average interest rate of 30-year fixed rate mortgages and its first-order difference to capture the mortgage rate level and trend. Since to our knowledge there is no theory that articulates the specific effects of these two variables, the interpretation of the coefficients demands caution. Fortunately, our tests rely on the aggregate effects of these two variables, not the specific manner in which they affect the housing market.

The stock market may also affect the housing market, even though the effects might be complicated and ambiguous. First, the well known wealth effect suggests that an increase in wealth may increase consumption, including consumption of housing. Therefore, a booming stock market may increase housing demand. Second, a booming stock market may help mitigate the liquidity constraints of moving families, for they have the option to use proceeds from selling stocks to help defray down payments for new homes. This might affect both the demand and the supply in the housing market, given that many families are simultaneously buyers and sellers. Finally, houses may appear to be less attractive assets when investors believe that stocks are better investments. The competing effect may reduce housing demand. While we lack rigorous theories with unambiguous predictions regarding the effects of the stock market, we try to use two variables to capture the effects: the S&P 500 index level, which may proxy for the financial wealth and/or constraints of households, and the gross return of the S&P 500 index, which may proxy for the trend of the stock market. In our estimation, we essentially use the first-order and the second-order differences of the S&P 500 index.

Data

This paper compiles data from five sources. First, the U.S. Bureau of Census (BOC) provides quarterly estimates for single family housing units for 209 MSAs in 1990:2 and for 280 MSAs in 2000:2. The difference in the number of MSAs is mainly due to changes in MSA boundaries and the establishment of new MSAs. Second, the Office of Federal Housing Enterprise Oversight (OFHEO) provides transaction-based quarterly house price indices at the MSA level (using BOC 1999 MSA definitions). Third, Moody’s Economy.com provides quarterly measurements for existing single family home sales, total nonagricultural employment, average household income, population, single family home permits, and the unemployment rate at the MSA level (using BOC 1999 MSA definitions). The sources for these variables are respectively the National Association of Realtors (NAR), Bureau of Labor Statistics (BLS), Bureau of Census, and Internal Revenue Service (IRS). IRS records seem to be used to estimate migration between MSAs, which is then used to estimate population. Fourth, NAR provides the time series of the national average interest rate for 30-year fixed rate mortgages. Finally, CRSP provides the time series of the S&P 500 index.

The sample period in our analysis is from 1990:2 to 2002:2, and the time frequency is quarterly. We hope to fit our model to high frequency data, since the causal relation between prices and trading volume is more likely to be identified in high frequency data. The highest frequency we are able to obtain is quarterly.

Our analysis uses MSAs that satisfy the following three requirements. First, a qualifying MSA needs to exist in 1990:2 (1990 Census period) and 2000:2 (2000 Census period), and to have single family housing unit data in the two periods. Second, the MSA also needs to have unchanged definitions and boundaries in the sample period. To check whether the boundaries have changed, we first check if the MSA name has changed from 1990:2 to 2000:2. If the name remains the same, we manually check the BOC historical records of boundary changes to verify that the MSA has unchanged boundaries. 114 MSAs satisfy the first two requirements. The third requirement is that MSAs cannot have any of the following variables missing: existing single family home sales, total nonagricultural employment, the unemployment rate, average household income, single family home permits, and the house price index. Using all three requirements, we end up with 114 MSAs (see footnote 4), which are listed in the Appendix.

In our analysis, trading volume is measured by turnover, which is defined as the ratio of existing single family home sales to the stock of existing single family homes. Since we only observe the number of single family units for the two Census quarters, we estimate the units in other time periods using the following formula.

$${\text{unit}}_{i,t} = {\text{unit}}_{i,t - 1} + {\text{completion}}_{i,t} - {\text{demolish}}_{i,t} $$
(3)

In (3), for MSA i in period t, \({\text{unit}}_{i,t}\) is the number of single family units, \({\text{completion}}_{i,t}\) is the number of units completed, and \({\text{demolish}}_{i,t}\) is the number of units demolished.

We estimate \({\text{completion}}_{i,t}\) using permit information as well as the relation among permits, starts, and completions. According to the BOC, actual starts in new home building are on average 2.5% more than permits, and completions are 3.5% fewer than starts. Therefore, we assume that 100 issued permits will lead to \(100 \times 1.025 \times 0.965 = 98.9\) new units. It is also worth noting that new units may not be completed in the same quarter in which permits are issued. According to the BOC, almost all (97% to be exact) construction starts in the same quarter in which the permit is issued. Further, 20% of starts are completed in the same quarter, 49% in the next quarter, 19% in the third quarter, 7% in the fourth quarter, and 6% in the fifth quarter or beyond. The above relations help us estimate \({\text{completion}}_{i,t}\) using the following equation.

$${\text{completion}}_{i,t} = 0.989 \times \left( {0.2{\text{permit}}_{i,t} + 0.49{\text{permit}}_{i,t - 1} + 0.19{\text{permit}}_{i,t - 2} + 0.07{\text{permit}}_{i,t - 3} + 0.06{\text{permit}}_{i,t - 4} } \right)$$
(4)
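Eq. 4 is a weighted sum of current and lagged permits, which the following sketch applies to a made-up permit series using only the Python standard library. The function name `estimate_completions` and the sample numbers are hypothetical. The completion shares are copied from the text and sum to 1.01, presumably because of rounding.

```python
PERMIT_TO_COMPLETION = 0.989  # 1.025 starts per permit x 0.965 completions per start, rounded as in Eq. 4
LAG_SHARES = (0.20, 0.49, 0.19, 0.07, 0.06)  # shares completed 0 to 4 quarters after the permit

def estimate_completions(permits):
    """Apply Eq. 4 to a chronological list of quarterly permit counts.

    Returns one value per quarter. The first four entries are None because
    Eq. 4 needs the current permit count and four lags.
    """
    completions = []
    for t in range(len(permits)):
        if t < len(LAG_SHARES) - 1:
            completions.append(None)
            continue
        weighted = sum(share * permits[t - lag] for lag, share in enumerate(LAG_SHARES))
        completions.append(PERMIT_TO_COMPLETION * weighted)
    return completions

permits = [100, 100, 100, 100, 100, 200]  # made-up permit counts
print(estimate_completions(permits))
```

With constant permits of 100 per quarter, the formula returns 0.989 × 101 completions, and a jump in permits shows up in the same quarter with a weight of only 0.20.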

We estimate demolitions after estimating completions. We assume the same number of demolished units per period (see footnote 5), and calculate the demolished units per quarter using the following equation.

$${\text{demolish}}_{i,t} = \frac{{\text{unit}}_{i,1990:2} + \sum\nolimits_{t = 1990:2}^{2000:2} {{\text{completion}}_{i,t} } - {\text{unit}}_{i,2000:2} }{40}$$
(5)

After we estimate the completed and demolished units, we use Eq. 3 to estimate the units of single family homes in each of the non-census quarters.
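The reconstruction of the housing stock between the two Census quarters can be sketched as below, using only the Python standard library. The sketch assumes that the constant per-quarter demolition figure is chosen so that rolling Eq. 3 forward from the 1990:2 Census count reproduces the 2000:2 Census count 40 quarters later. That is our reading of the procedure, and the function name and numbers are hypothetical.

```python
def estimate_units(unit_start, unit_end, completions):
    """Roll the housing stock forward with Eq. 3 under constant demolitions.

    unit_start  -- units observed in the first Census quarter
    unit_end    -- units observed in the second Census quarter
    completions -- estimated completions for each quarter after the first
                   Census quarter, up to and including the second one
    """
    n_quarters = len(completions)
    # Constant demolitions that make the two Census counts consistent.
    demolish = (unit_start + sum(completions) - unit_end) / n_quarters
    units = [unit_start]
    for completed in completions:
        units.append(units[-1] + completed - demolish)
    return demolish, units

# Made-up example: 1,000 units in the first Census quarter, 1,200 in the
# second, and 10 completions in each of the 40 quarters in between.
demolish, units = estimate_units(1000, 1200, [10] * 40)
print(demolish)   # 5.0
print(units[-1])  # 1200.0, matching the second Census count
```

Turnover in each quarter is then existing home sales divided by the corresponding entry of `units`.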

As a robustness check, we also estimate the existing single family units using an alternative method based on the relation between population and housing units. Our data indicate that single family housing units are almost perfectly correlated with population across MSAs and over time. In fact, two cross-sectional regressions of single family units on population in 1990:2 and 2000:2, respectively, both generate R-squares around 0.99 when accurate housing units and population data are available from the BOC.

Furthermore, a regression of the ratio of housing units to population in 2000:2 against the same ratio in 1990:2 generates an R-square of 0.98, which seems to indicate that the housing unit–population ratio is highly stable across time. Therefore, for each MSA, we estimate the ratio of single family housing units to population in a given period with a time–distance weighted average of the ratios in 1990:2 and 2000:2. We then estimate the existing housing units in that period with the product of population and the estimated ratio. The turnover estimated with the alternative method is highly correlated with the turnover estimated using permits. In a regression of the turnover estimated with permits on the turnover estimated with population/unit ratio, the coefficient is 1.02, and the R-square is 0.98. In addition, all our empirical findings remain unchanged when we use the turnover estimated from population/unit ratios. Therefore, we only report the results using turnover estimated with permits.
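The alternative method can be sketched as follows, using only the Python standard library. The text does not spell out the weighting formula, so the sketch assumes that "time–distance weighted" means linear interpolation between the two Census ratios, with more weight on the Census quarter that is closer in time. The function name and numbers are hypothetical, and quarters after 2000:2 are not handled.

```python
def units_from_population(population, ratio_1990, ratio_2000,
                          quarters_since_1990, total_quarters=40):
    """Estimate housing units as population times an interpolated units/population ratio."""
    if not 0 <= quarters_since_1990 <= total_quarters:
        raise ValueError("quarter must lie between the two Census quarters")
    # Weight on the 2000:2 ratio grows linearly as the quarter approaches 2000:2.
    weight_2000 = quarters_since_1990 / total_quarters
    ratio = (1 - weight_2000) * ratio_1990 + weight_2000 * ratio_2000
    return population * ratio

# Made-up example: halfway between the Census quarters, ratios 0.30 and 0.34.
print(units_from_population(1000, 0.30, 0.34, quarters_since_1990=20))  # about 320
```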

Figure 1 plots the deciles of the 114 home price indices, where the deciles are formed by the total change from the beginning to the end of the sample period. Figure 2 plots the across-MSA 25th percentile, median, and 75th percentile of the estimated market turnover. Figure 3 plots the across-MSA 25th percentile, median, and 75th percentile of non-agricultural employment, average household income, and the unemployment rate, as well as the time series of the 30-year fixed rate mortgage rate. Since we estimate our VAR model using first-order differences of log values of these variables, we provide some statistics for the first-order differences in Table 1, including across-MSA averages of their means, medians, variances if applicable, autocorrelations, and correlations, as well as t-statistics if applicable.

Fig. 1

Home price indices for the U.S. MSAs. This figure plots the deciles of OFHEO quarterly home price indices (nominal) in the U.S. from 1990:2 to 2002:2, where the deciles are created using the total appreciation of home prices in the sample period. The index levels are normalized to 1 in 1990:2

Fig. 2

Quarterly turnover in the U.S. MSA housing markets. This figure reports the 25th percentile, median, and 75th percentile of the quarterly turnover in the 114 MSA housing markets from 1990:2 to 2002:2. Turnover is defined as the ratio of existing home sales to the stock of existing homes

Fig. 3

Time series of some MSA and national level variables. This figure reports the 25th percentile, median, and 75th percentile of quarterly total non-agricultural employment, average household income, and the unemployment rate across 114 MSAs, together with the national average 30-year Fixed Rate Mortgage interest rate (FRM rate). The sample period is from 1990:2 to 2002:2

Table 1 Data summary

Empirical Evidence

Determinants of Prices and Trading Volume

We first estimate different specifications of the fixed effect panel VAR model in (2). The first specification includes contemporaneous exogenous variables and serves as the benchmark because of its simplicity.Footnote 6 The second separates the positive values from the negative values of log differences in house price indices and thus allows for asymmetric effects of house prices on turnover, which helps us directly test Stein’s (1995) theory. Malpezzi and Maclennan (2001) and others point out that the determination of house prices varies dramatically across time and across markets with different supply elasticity. To understand possible heterogeneity in the determination of house prices and turnover, and whether it affects the price–volume correlation, we re-estimate the first specification separately for MSAs with above-median and below-median long term supply elasticity. The measure we use for long term supply elasticity is the ratio of the change in population to the change in the house price index over the sample period. Holding constant the increase in population in a market, the greater the increase in house prices, the “tighter” the market and the lower the long term supply elasticity. Figure 4 plots the histogram of the long term supply elasticity across MSAs.

Fig. 4

Histogram of the long term housing supply elasticity across MSAs. This figure plots the histogram of the estimated long term housing supply elasticity across 114 MSAs in the sample. The long term elasticity for a MSA is estimated with the ratio of the change in population to the change in the house price index from 1990:2 to 2002:2 in the MSA
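The long term supply elasticity measure and the median split used for the subsample estimations can be illustrated as follows. This is a sketch only: it assumes that both "changes" are proportional changes over 1990:2 to 2002:2 and that MSAs exactly at the median are placed in the tight group; neither detail is stated in the text, and the MSA labels and numbers are hypothetical.

```python
from statistics import median


def supply_elasticity(pop_start, pop_end, hpi_start, hpi_end):
    """Ratio of the population change to the house price index change.

    Assumption: both changes are measured as proportional changes.
    """
    pop_change = (pop_end - pop_start) / pop_start
    hpi_change = (hpi_end - hpi_start) / hpi_start
    return pop_change / hpi_change


def split_by_median(elasticity_by_msa):
    """Split MSAs into loose (above-median) and tight (at or below median) groups."""
    cutoff = median(elasticity_by_msa.values())
    loose = sorted(m for m, e in elasticity_by_msa.items() if e > cutoff)
    tight = sorted(m for m, e in elasticity_by_msa.items() if e <= cutoff)
    return loose, tight


# Hypothetical elasticities for four MSAs.
elasticities = {"A": 0.4, "B": 1.2, "C": 0.8, "D": 2.0}
loose_msas, tight_msas = split_by_median(elasticities)
```

For a given population increase, a larger price increase lowers the ratio, which is why low values of the measure identify tight markets.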

We use AIC to choose the optimal lag order for the endogenous variables, which is 3 for all specifications. We use only three quarterly dummies to avoid the multicollinearity that four dummies would create under the within transformation. The model is estimated with feasible GLS that allows for heteroskedasticity across MSAs. We calculate t-statistics using heteroskedasticity-robust standard errors according to Kezdi (2003). Kezdi (2003) shows that these robust standard errors remain valid under serial correlation and heteroskedasticity of any kind, as well as under unit roots and unequal spacing. They also have good small sample properties.
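The within transformation referred to above removes MSA fixed effects by subtracting each MSA's own time mean from every observation. The following minimal sketch shows the operation on a single variable; the data layout (a list of `(msa, value)` pairs) and the function name are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict


def within_transform(rows):
    """Demean values by MSA.

    rows: list of (msa, value) pairs, one per MSA-quarter observation.
    Returns the values in the same order, each minus its MSA's mean.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for msa, value in rows:
        totals[msa] += value
        counts[msa] += 1
    means = {msa: totals[msa] / counts[msa] for msa in totals}
    return [value - means[msa] for msa, value in rows]


# Hypothetical observations for two MSAs.
demeaned = within_transform([("A", 1.0), ("A", 3.0), ("B", 10.0)])
```

Because the demeaned data carry no MSA-specific intercept, a full set of four quarterly dummies would be redundant, which is consistent with the use of three dummies.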

Table 2 reports the estimation results of the first specification. The results indicate that labor market shocks affect both prices and trading volume in housing markets. On the one hand, total non-agricultural employment and average household income both have positive and statistically significant effects on prices and trading volume. On the other hand, the unemployment rate has an interesting impact on housing markets. Increases in the unemployment rate raise home prices (albeit insignificantly) and significantly reduce trading volume. This seems to be consistent with the spatial lock-in phenomenon (see Chan 2001, for example), in which home sellers who are hurt by higher unemployment are likely to be financially constrained and thus need to raise their asking prices (so that the proceeds from selling their homes are large enough to repay their mortgage and provide a down payment on a new home). This behavior shifts the supply curve upwards and thus causes lower trading volume and higher transaction prices.

Table 2 Panel VAR estimation

Table 2 confirms the importance of the mortgage market in the determination of prices and trading volume. When the mortgage rate is high (low), home prices are significantly lower (higher), and so is trading volume. Furthermore, when the mortgage rate is on a rising (falling) trend, both home prices and trading volume increase (decrease), which is consistent with rational behavior by potential homebuyers, although other possibilities cannot be ruled out. When the mortgage rate is rising, potential homebuyers would be better off purchasing sooner and locking in the mortgage rate. When the rate is falling, by contrast, it seems rational for them to wait and postpone their home purchases. The effects of the mortgage rate level and trend are consistent with shifts in housing demand.

It is interesting that the stock market has significant effects on the price and turnover in the housing market. First, home prices are significantly lower, and trading volume is (insignificantly) higher, when the S&P 500 index is higher, which appears to indicate a rightward shift of the supply curve (a decrease in sellers’ asking prices). This finding seems consistent with Stein (1995) and related work: the more financial wealth a household has (possibly due to higher stock prices), the less likely the household is to be financially constrained, and thus the lower the asking price it sets. Second, home prices are significantly higher and trading volume is (insignificantly) lower when the stock market shows an uptrend. It is premature to draw any conclusions regarding the economic mechanism; however, we conjecture that the private valuations of homeowners may be affected by their expectations of the future economy. A booming stock market may create higher housing demand in the future, and homeowners may adjust their private valuations upward accordingly, which may shift the supply curve upward. We leave the exploration of this possibility for future research.

Table 2 also suggests that lagged prices and trading volume significantly affect home prices and trading volume. The first-order autoregressive coefficients are significantly negative for both prices and trading volume. Prices and volume therefore tend to reverse in the next quarter, which may indicate the adjustment of the housing market to exogenous shocks. An adjustment of market supply to a demand shock appears consistent with the feedback effects predicted by Novy-Marx (2007). The negative coefficients of lagged prices and trading volume are also consistent with the overshooting of home prices predicted by Ortalo-Magné and Rady (2006).

Table 2 also reports the tests of Granger causality between prices and turnover. We find strong evidence that prices Granger cause turnover (F statistic 4.811, p-value 0.002). This directly supports the theory of Stein (1995) and is consistent with the empirical evidence provided by Chan (2001), Engelhardt (2003), Genesove and Mayer (1997, 2001), and others. At the same time, we find weak evidence that turnover Granger causes prices (F statistic 2.436, p-value 0.063), which provides some support for the part of the theory in Wheaton (1990) suggesting that turnover might affect prices, particularly in tight markets.
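For reference, a Granger causality test of this kind is commonly computed as an F test of the joint restriction that all lags of one endogenous variable have zero coefficients in the other variable's equation. The textbook form below is a sketch under that assumption. The symbols \(RSS_R\), \(RSS_U\), \(q\), \(N\), and \(K\) are introduced here for illustration only, and the statistic reported in Table 2 may instead be a robust Wald-type variant.

$$F = \frac{\left( RSS_R - RSS_U \right) / q}{RSS_U / \left( N - K \right)}$$

Here \(RSS_U\) is the residual sum of squares of the unrestricted equation, \(RSS_R\) is that of the equation re-estimated without the lags being tested, \(q\) is the number of excluded lags (3 under the lag order chosen by AIC), \(N\) is the number of observations, and \(K\) is the number of parameters in the unrestricted equation.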

Table 3 reports the results of the second specification, which separates positive values from negative values of the lagged log differences of house prices and thus accommodates asymmetric effects of house prices on turnover. While almost all results in Table 2 remain, we find that, in the equation with turnover as the dependent variable, the coefficient on the negative part of the one-quarter-lagged log difference of house prices is significantly positive at the 1% level, while the coefficients on the positive part and on all other lagged house price terms are insignificant. This indicates that decreases in house prices reduce market turnover, but increases in house prices do not have significant effects. The result is consistent with the theories in Stein (1995) and related work, which suggest that equity constraints or loss aversion due to decreasing house prices reduce market trading volume.
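The asymmetric specification relies on splitting each lagged log difference of house prices into a non-negative part and a non-positive part that enter the regression as separate regressors. A minimal sketch of that split follows; the function name and the sample values are hypothetical.

```python
def split_signed(dlog_hp):
    """Split log differences into positive and negative parts.

    Each observation satisfies x = pos + neg, with pos >= 0 and neg <= 0,
    so the two parts can carry different regression coefficients.
    """
    pos = [max(x, 0.0) for x in dlog_hp]
    neg = [min(x, 0.0) for x in dlog_hp]
    return pos, neg


# Hypothetical quarterly log differences of a house price index.
pos_part, neg_part = split_signed([0.02, -0.01, 0.0])
```

With this coding, a positive coefficient on the negative part means that a price decline (a negative regressor value) lowers subsequent turnover growth, while an insignificant coefficient on the positive part means price increases have no detectable effect.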

Table 3 Panel VAR estimation: asymmetric effects of house price changes

Tables 4 and 5 report the results for MSAs with high and low supply elasticity, respectively. While our earlier results remain, these tables reveal interesting differences across markets with different supply elasticity. First, turnover Granger causes prices in tight markets (MSAs with low supply elasticity) but not in loose markets (MSAs with high supply elasticity). In relation to Wheaton (1990), this seems to indicate that sellers are more likely to raise their reservation prices in response to increasing trading volume in markets with an inelastic supply of housing. Therefore, in tighter markets, due to the lack of new homes, sellers are able to profit more from increasing housing demand. Second, the results seem to suggest that homebuyers in tight markets are less financially constrained. The first piece of evidence is that house prices in tight markets are less sensitive to mortgage interest rate levels and trends. This may have interesting implications for the risk of home equity: although houses tend to be more expensive in tight markets, they are less risky in the sense that their prices are less sensitive to mortgage interest rates. The second piece of evidence is that growth in average household income has weaker effects on house prices in tight markets than in loose markets, which seems to suggest that income is less likely to be a binding financial constraint for homebuyers in tight markets with high house prices. Third, growth in employment has stronger effects on house prices in tight markets than in loose markets, which is sensible given the low supply elasticity in tight markets.

Table 4 Panel VAR estimation: high supply elasticity MSAs
Table 5 Panel VAR estimation: low supply elasticity MSAs

Overall, we have four main findings. First, exogenous variables such as employment, household income, and the mortgage rate play significant roles in determining prices and trading volume in the housing market, which supports the theories (e.g. Wheaton 1990) that point to the effects of exogenous variables as a possible explanation of the price–volume correlation. Second, our results reject the hypothesis that prices do not Granger cause trading volume at the 1% level, and reject the hypothesis that trading volume does not Granger cause prices at the 10% level (at the 5% level for markets with low supply elasticity). The Granger causality tests provide strong evidence for Stein (1995) and related work, and some evidence for Wheaton (1990). Third, decreases in house prices reduce trading volume while increases in house prices do not affect trading volume, which is strong evidence supporting Stein (1995) and related work. Fourth, when we break the MSAs into two groups with high and low supply elasticity, we find interesting heterogeneity. Trading volume Granger causes prices in tight markets, but not in loose markets. House prices in tight markets are less sensitive to variables related to financial constraints on homebuyers, including mortgage interest rates and average household income. Moreover, growth in employment has stronger effects on house prices in tight markets.

Decomposing the Price–volume Correlation

This section analyzes which relations—the Granger causality between prices and trading volume or the effects of exogenous variables—help explain the price–volume correlation in housing markets. We first calculate the raw price–volume correlation for each MSA using the series of home appreciation rates and changes in trading volume. Then, based on results from estimating the panel VAR model, we decompose the home appreciation rates and trading volume (both demeaned due to the within transformation) into fitted components and residuals.

$$\begin{array}{c} hp_{i,t} = \widehat{hp}_{i,t} + u_{i,t} \\ to_{i,t} = \widehat{to}_{i,t} + v_{i,t} \end{array}$$
(6)

We then calculate the correlation between \(\widehat{hp}_{i,t}\) and \(\widehat{to}_{i,t}\), as well as the correlation between \(u_{i,t}\) and \(v_{i,t}\), which is the component of the price–volume correlation that cannot be explained by our model.
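For a single MSA, the three correlations implied by the decomposition in (6) can be computed as in the sketch below. It assumes the demeaned series and their fitted values are already available as equal-length lists; the function names and the sample numbers are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def decompose_correlations(hp, hp_fit, to, to_fit):
    """Raw, fitted, and residual price-volume correlations for one MSA."""
    u = [a - b for a, b in zip(hp, hp_fit)]  # price residuals
    v = [a - b for a, b in zip(to, to_fit)]  # turnover residuals
    return {
        "raw": pearson(hp, to),
        "fitted": pearson(hp_fit, to_fit),
        "residual": pearson(u, v),
    }


# Hypothetical series for one MSA.
result = decompose_correlations(
    hp=[1.0, 2.0, 3.0, 4.0], hp_fit=[1.0, 2.0, 3.0, 5.0],
    to=[2.0, 4.0, 6.0, 8.0], to_fit=[2.0, 4.0, 6.0, 7.0],
)
```

Repeating this for every MSA gives the cross-section of correlations whose averages are reported.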

Panel A in Table 6 reports the across-MSA averages of the raw price–volume correlations, the “fitted” correlations, and the correlations between residuals, using the estimation results in Tables 2, 3, 4 and 5: the benchmark specification in Table 2, the asymmetric specification in Table 3, and the subsample estimations for high supply elasticity markets (Table 4) and low supply elasticity markets (Table 5). The table also reports t-statistics from two-sided tests of the hypothesis that the correlations have zero mean.
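A test that the across-MSA correlations have zero mean can be sketched as a one-sample t-statistic. The text does not spell out the exact formula used, so the version below, the sample mean divided by its standard error, is an assumption; the input values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def t_stat_zero_mean(correlations):
    """One-sample t-statistic for H0: the mean correlation is zero."""
    n = len(correlations)
    return mean(correlations) / (stdev(correlations) / sqrt(n))


# Hypothetical per-MSA correlations.
t_value = t_stat_zero_mean([0.1, 0.2, 0.3])
```

With 114 MSAs the statistic would be compared against critical values of a t distribution with 113 degrees of freedom under this assumed form.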

Table 6 Price–volume correlations

We have a few interesting findings. First, we find evidence of a statistically significant positive price–volume correlation. The raw correlation is 0.048 (0.072 in markets with high elasticity and 0.025 in markets with low elasticity) and is significant at the 1% level (but insignificant for markets with low elasticity). Second, we find strong evidence of positive correlations between “fitted” prices and volume. The “fitted” price–volume correlations are much higher than the raw price–volume correlations. They are 0.168 and 0.165 for the benchmark and asymmetric specifications, and 0.214 and 0.108 for the sub-samples of MSAs with high and low supply elasticity. All “fitted” price–volume correlations are significant at the 1% level. Finally, the correlations between residuals are always lower than the raw price–volume correlations and are insignificant. Panel A seems to indicate that our model captures the price–volume correlation well.

To investigate the extent to which the price–volume correlation is explained by the Granger causality between prices and trading volume and by the exogenous shocks, we further decompose the “fitted” parts of both \(hp_{i,t}\) and \(to_{i,t}\) into three components: the component explained by lagged prices, the component explained by lagged trading volume, and the component explained by all other variables. We then calculate the correlation between the same component of house prices and trading volume, and thus have three correlations: the correlation between price-caused prices and trading volume (“Price-caused” in the table), the correlation between trading volume-caused prices and trading volume (“Turnover-caused” in the table), and the correlation between prices and trading volume caused by other variables (“Co-movements” in the table).
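In symbols, the three-way decomposition can be written as follows. The superscripts \(P\), \(TO\), and \(X\) and the symbols \(\rho_P\), \(\rho_{TO}\), and \(\rho_X\) are notation assumed here for illustration; they do not appear in the original. Writing the fitted values from (6) as \(\widehat{hp}_{i,t}\) and \(\widehat{to}_{i,t}\):

$$\begin{array}{c} \widehat{hp}_{i,t} = hp^{P}_{i,t} + hp^{TO}_{i,t} + hp^{X}_{i,t} \\ \widehat{to}_{i,t} = to^{P}_{i,t} + to^{TO}_{i,t} + to^{X}_{i,t} \end{array}$$

$$\rho_{P} = \operatorname{corr}\left( hp^{P}_{i,t}, to^{P}_{i,t} \right), \quad \rho_{TO} = \operatorname{corr}\left( hp^{TO}_{i,t}, to^{TO}_{i,t} \right), \quad \rho_{X} = \operatorname{corr}\left( hp^{X}_{i,t}, to^{X}_{i,t} \right)$$

Here the \(P\) components collect the terms in lagged prices, the \(TO\) components the terms in lagged turnover, and the \(X\) components all remaining regressors. The “fitted” correlation is not simply the sum of these three, because covariances across different components also contribute to it.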

Panel B of Table 6 reports the three types of price–volume correlations and the corresponding t-statistics from two-sided tests of the hypothesis that the correlations have zero mean. We have a few interesting findings. First, the “co-movement” component of the price–volume correlation is statistically significant at the 1% level under all specifications and for all subsamples. Moreover, the correlation ranges from 0.529 to 0.589 across specifications and subsamples, which is much higher than the raw correlation and the “fitted” correlation. Second, the “price-caused” component is statistically significant at the 1% level but is negative. The negative “price-caused” component of the price–volume correlation seems to be caused by the positive effect of prices on future turnover combined with the negative autocorrelation of prices at quarterly frequency. Third, the “turnover-caused” component is significantly positive on average across MSAs, but varies dramatically across markets: it is significantly positive for MSAs with high supply elasticity, but significantly negative for MSAs with low supply elasticity. This seems to be caused by the negative autocorrelation of turnover and the different effects of turnover on house prices in different markets: in loose (tight) markets, higher turnover tends to reduce (increase) house prices in the next quarter.

Overall, our results provide strong evidence of the existence of positive price–volume correlations at quarterly frequency. Furthermore, the positive correlation seems to be fully explained by the fitted prices and volume in our model. A novel finding is that the positive price–volume correlation appears to be mainly caused by co-movements of prices and volume due to exogenous shocks. Lagged prices seem to lead to a negative price–volume correlation in all markets, and lagged trading volume leads to a positive price–volume correlation in MSAs with high supply elasticity but a negative one in MSAs with low supply elasticity. Thus neither lagged prices nor lagged trading volume seems to explain the positive price–volume correlation well.

Impulse Response Analysis

We use impulse-response functions to provide a more intuitive description of how shocks in exogenous variables generate the co-movements of the price and turnover in housing markets. The impulse-response functions are constructed using estimation results of the first specification (Table 2). We build the analysis on the level model in (1) instead of the first-order difference model since the level model seems more intuitive. As a result, the impulse responses are for the absolute price level and turnover in the market, not their changes. Also, the benchmark case is a market in which all exogenous variables remain unchanged, and thus the price and turnover do not change over time.

Conventionally, the shock introduced equals one standard deviation of the underlying variable, which, however, does not seem to be the most appropriate approach in our study. First, most exogenous variables in our study are not mean-stationary; instead, they have trends and cycles. It is not clear how to define a meaningful standard deviation for these non-stationary variables. Second, most MSAs have experienced fairly smooth growth in the sample period. For these MSAs, standard deviations of the growth rates of economic variables are very small, and do not appear to represent meaningful shocks. As a result, we define a shock as a 5% absolute change in the level of the underlying variable.

We construct a conventional type of impulse response functions that are based on one shock in one variable and no shock in others. It is worth noting that this simple approach is often not suitable to study impulses in endogenous variables. A shock in an endogenous variable often has contemporaneous effects not only on the endogenous variable itself but also on other endogenous variables. Hence, it is inappropriate to assume a shock on one endogenous variable while keeping other endogenous variables fixed. To address this composition effect (defined by Koop et al. 1996), researchers often use either orthogonalized impulse responses or generalized impulse responses. However, since we are interested in how shocks in exogenous variables affect both the price and turnover, it appears reasonable to entertain perturbations in an exogenous variable, while assuming no extra shocks in other exogenous or endogenous variables.

To construct the impulse response functions, we first let all contemporaneous exogenous variables (except the one representing the source of shock), lagged endogenous variables and intercepts be 0, and then introduce a 5% one-time shock in the variable that represents the source of the shock. Since the VAR system is a log linear system, a shock that equals log (1.05) implies that the corresponding variable has an unexpected increase of 5%. The values of the price and trading volume over time are then calculated by repeatedly plugging into the VAR system all estimated coefficients and the lagged endogenous variables.
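The construction just described can be sketched for a two-variable system (price and turnover) as follows. This is an illustration under stated assumptions, not the authors' code: the system is taken to be linear in logs, with lag matrices `A[k]` on the endogenous variables and a vector `b` of contemporaneous coefficients on the shocked exogenous variable, and all coefficient values shown are hypothetical rather than the estimates in Table 2.

```python
from math import exp, log


def impulse_response(A, b, shock=0.05, horizon=8):
    """Responses of the endogenous variables to a one-time exogenous shock.

    A: list of lag matrices; A[k][i][j] is the coefficient of variable j at
       lag k+1 in the equation for variable i (hypothetical values).
    b: contemporaneous coefficients on the shocked exogenous variable.
    Returns, for each period, the level of each variable relative to the
    benchmark (1.0 means no deviation).
    """
    p, n = len(A), len(b)
    lags = [[0.0] * n for _ in range(p)]  # lagged log deviations, most recent first
    path = []
    for t in range(horizon):
        x = log(1.0 + shock) if t == 0 else 0.0  # one-time shock of log(1.05)
        y = [
            b[i] * x
            + sum(A[k][i][j] * lags[k][j] for k in range(p) for j in range(n))
            for i in range(n)
        ]
        path.append([exp(v) for v in y])  # convert log deviations to levels
        lags = ([y] + lags)[:p]           # plug the new values back in as lags
    return path


# Hypothetical one-lag system: variable 0 is the price, variable 1 is turnover.
A_example = [[[0.5, 0.0], [0.0, 0.5]]]
b_example = [0.2, 1.0]
responses = impulse_response(A_example, b_example)
```

Setting intercepts, other exogenous variables, and initial lags to zero makes the benchmark path equal to 1 in every period, so each entry of `responses` reads directly as a multiple of the benchmark level.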

Figure 5 plots the dynamic responses of both the price and turnover in the housing market to a 5% exogenous increase in the total non-agricultural employment, the average household income, the unemployment rate, and the mortgage rate, respectively. We do not report the standard deviations of the responses since we are interested in the patterns of the expected responses, not the statistical significance. The pre-shock values of both the price and turnover are 1, which means the values are 1 times the values in the benchmark case. Values greater than 1 suggest positive deviations from the benchmark level. For example, 1.02 means the variable is 2% higher than the benchmark level.

Fig. 5

Responses of home prices and market turnover to exogenous shocks. This figure reports the responses of home prices and market turnover (both in level) to a 5% one-period shock in the total non-agricultural employment, average household income, the unemployment rate, and the mortgage rate (in level). To construct the responses, we first let all contemporaneous exogenous variables (except the one representing the source of shock), lagged endogenous variables and intercepts be 0, and then introduced a 5% one-time shock in the variable that represents the economic source of the shock. The values of the price and trading volume over time are then calculated by repeatedly plugging into the VAR system all estimated coefficients and the lagged endogenous variables. In each graph, the series that has a higher absolute value of deviation from 1 in Period 1 is for turnover, while the other is for house prices

Note that the responses to shocks in the mortgage rate should be interpreted with caution. Empirically, changes in the mortgage rate also result in changes in the trend, so the aggregate effects will be more complicated than what the impulse response functions show. However, these two functions can be interpreted as thought experiments: holding the effect of the trend fixed, the impulse response functions show the net effect of a change in the level, which is useful to know.

Note that in Fig. 5, the series that have higher absolute values of deviations from 1 in Period 1 are for turnover, while the other series are for house prices. We observe a few interesting patterns. First, trading volume reacts much more dramatically to exogenous shocks than prices do, which corroborates Andrew and Meen (2003) and Hort (2000). For instance, after a 5% increase in the average household income, trading volume increases by about 5%, while the price increases by less than 1%. This is consistent with the conventional wisdom that, in real estate markets, changes in trading volume represent changes in market conditions more accurately than changes in prices do (see Berkovec and Goodman 1996, for instance).

Second, some shocks appear to generate co-movements of the price and volume, while others seem to cause the price and volume to move in opposite directions. We call the first type of shocks Type I shocks, and the second type of shocks Type II shocks. Since the price–volume correlation is positive in our sample, it is very likely that our sample is exposed to more Type I shocks than Type II shocks. However, one should be cautious that the price–volume correlation in a market can be negative, particularly if Type II shocks dominate Type I shocks. The positive price–volume correlation in our data might be a small sample phenomenon in the sense that we happen to be in an economy where Type I shocks dominate in frequency and/or magnitude.

The third finding is that overshooting of the price and volume is very common. In particular, the overshooting of trading volume is observed in all four scenarios. This is consistent with the theories of Novy-Marx (2007) and Ortalo-Magné and Rady (2006), which both imply or predict overshooting, though they rely on different mechanisms.

Conclusions

Using an unusually large panel data set consisting of housing markets in 114 MSAs from 1990 to 2002, we study the determinants of home prices and trading volume in the housing market. We find that the housing market is affected by shocks in the labor market, mortgage market and the stock market. Moreover, house prices Granger cause trading volume, but the effects are asymmetric: decreases in house prices lead to lower trading volume, while increases in house prices have no effect. We also find that trading volume Granger causes prices, but only in markets with low supply elasticity. Our results also provide insights regarding heterogeneity across markets. In tight markets (low supply elasticity), house prices are less sensitive to mortgage rates and average household income, but more sensitive to growth in employment.

We find a significant and positive price–volume correlation at quarterly frequency. Our model captures the price–volume correlation well: after controlling for the price–volume correlation explained by our model, there is no significant price–volume correlation left. Furthermore, we find that the Granger causality of prices on trading volume appears to lead to a negative price–volume correlation, while the Granger causality of trading volume on prices leads to a positive price–volume correlation in markets with high supply elasticity, but a negative price–volume correlation in markets with low supply elasticity. Therefore, the Granger causality between prices and trading volume does not help explain the price–volume correlation well, while the co-movements of prices and volume, which are caused by shocks in exogenous variables, are significant, substantial, and positive for all markets. Using impulse response functions, we find that trading volume reacts more dramatically to economic shocks than home prices do. We also observe overshooting of trading volume in the adjustment process to shocks.