Abstract
Search theory shows that real property prices and marketing durations are simultaneously determined and positively related. Yet, empirical studies find positive, negative, and insignificant parameter estimates on the time-on-the-market (TOM) variable in price models. Using a dataset well suited to the research question, this article investigates reasons for the divergence between the theoretical and empirical results. Our test equations examine the quality of instrumental variables, severe overpricing, atypicality, structure quality, loss aversion, market tightness as well as measures unique to our data such as sellers’ income levels, reasons for sale, and urgency. We find that weak instrumental variables account for the varied empirical relations between transaction prices and TOM.
House sales vary by price and by the time on the market (TOM). Search-theoretic models such as Wheaton (1990) and Krainer and LeRoy (2002) show that prices and TOM depend simultaneously on the probability of sale.Footnote 1 This leads to a positive relation between prices and TOM in the theory. In a survey of the literature, Benefield et al. (2014) find 197 models that include TOM as a potential price determinant. Of these models, only 24 show a positive relation while 73 report statistically insignificant parameter estimates and 100 find a significant negative relation. This study seeks to reconcile these divergent results by examining a number of possible explanations. We focus on the impact of TOM on transaction prices because the parameter estimate on transaction price in TOM models routinely exhibits the expected positive relation.
When housing specifications exhibit inverse relations between prices and TOM, researchers often cite the theoretical model of Taylor (1999) and attribute the negative parameter estimates to overpricing or a structural defect. When homeowners set their list prices much higher than market expectations they can experience a relatively long TOM. If a list price is sufficiently high such that a reduction is required to induce a transaction, enough time may have passed that a negative relation is a possibility. But this is an asymmetric relation because homes that are underpriced will not systematically spend more time on the market. Severe overpricing therefore is a possible omitted variable that leads to a unique market outcome at the upper end of the TOM distribution. We consider two approaches to examine the severe overpricing rationale.
Another rationale for a longer TOM from Taylor (1999) is when prior prospective purchasers discover a flaw in the property that is not apparent to a new potential buyer. TOM is thus a measure of structural quality and new prospective buyers become suspicious of a property that has been on the market for an extended time period (stigmatization). To test this hypothesis, we include the measure of structural quality from Genesove and Mayer (2001) in our equations.
Severe overpricing and quality can be thought of as omitted variables in price models. Using a rich dataset of property-level transactions from the National Association of Realtors (NAR), we investigate a number of other variables typically omitted from housing studies due to data unavailability. By design, the NAR collects the economic and demographic information that can affect the expected price-TOM tradeoff including hard-to-observe variables such as sellers’ levels of urgency, age, and income.Footnote 2
The literature also offers other economic explanations that can affect the price-TOM tradeoff and are usually not included in housing studies. Genesove and Mayer (2001) and Hayunga and Pace (2016) find that homeowners who expect to realize selling prices lower than their original purchase prices will set higher list prices, which can extend TOM. There is also an evolving literature examining the ratio of buyers to sellers, which is often termed market tightness or colloquially as hot versus cold markets. We include the Carrillo (2013) index to examine this explanation. We additionally consider aggregate demand and supply measures as alternatives to the market tightness variables. Our equations also control for atypicality due to studies such as Haurin (1988) finding relatively longer TOM for atypical properties compared to standard houses.
Simultaneity is another issue that we investigate in two ways. The first is the use of ordinary least squares (OLS) versus a system of simultaneous equations. Since, as mentioned, housing prices and marketing durations are jointly determined, TOM and its error terms are not independent of the vector of error terms in the price models. This can cause the OLS estimator to be biased and inconsistent. Approximately 40 studies out of the 68 examined by Benefield et al. (2014) use OLS models.
Two-stage least squares (2SLS) is commonly used to address the joint determination. The ability of 2SLS to control for the simultaneity is dependent on the quality of the instruments. It is well known that when instruments are only weakly correlated with the endogenous regressor 2SLS is biased towards the OLS estimator. We investigate the strength of the TOM instrumental variable (IV) using the NAR data, which provides high-quality instruments of marketing durations.
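The mechanism described above can be illustrated with a small simulation. The sketch below is not the article's estimation: the data-generating process, the parameter values, and the single instrument z are all assumptions chosen only to show how correlated errors can push the OLS slope on TOM negative even when the true slope is positive, and how the first-stage F statistic separates a strong instrument from a weak one.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)

def simulate(pi, n=20000, beta=0.05, seed=0):
    """Simulate price and TOM with correlated shocks (all values hypothetical).

    pi is the first-stage coefficient on the instrument z.
    Returns (ols_slope, iv_slope, first_stage_F).
    """
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]        # instrument
    v = [rng.gauss(0, 1) for _ in range(n)]        # TOM shock
    e = [rng.gauss(0, 0.5) for _ in range(n)]      # idiosyncratic price noise
    u = [-0.5 * vi + ei for vi, ei in zip(v, e)]   # price shock, correlated with v
    tom = [pi * zi + vi for zi, vi in zip(z, v)]
    price = [beta * t + ui for t, ui in zip(tom, u)]

    ols = cov(tom, price) / cov(tom, tom)
    iv = cov(z, price) / cov(z, tom)               # just-identified 2SLS
    # With one instrument, the first-stage F equals the squared t statistic.
    pi_hat = cov(z, tom) / cov(z, z)
    mz, mt = sum(z) / n, sum(tom) / n
    resid = [(t - mt) - pi_hat * (zi - mz) for t, zi in zip(tom, z)]
    s2 = sum(r * r for r in resid) / (n - 2)
    f_stat = pi_hat ** 2 * cov(z, z) * (n - 1) / s2
    return ols, iv, f_stat

ols_strong, iv_strong, f_strong = simulate(pi=1.0)   # strong instrument
ols_weak, iv_weak, f_weak = simulate(pi=0.01)        # weak instrument
```

In this setup the true slope is 0.05, yet OLS is negative in both cases. The IV estimate is close to the true value only when the first-stage F is large; with the weak instrument it is erratic and should not be trusted.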
After considering the many explanations, our tests demonstrate that weak instrumental variables account for the varied slope coefficients on TOM in price models. In OLS specifications and 2SLS models with weak instruments, TOM is negative. Conversely, we consistently find the expected positive relation when test statistics indicate a high-quality TOM IV. This positive finding does not change when we introduce potential omitted variable bias, with the inclusion of the unique measures from the NAR data, or with the consideration of the other factors such as atypicality or market tightness.
Of final note, when comparing the models that exhibit positive versus negative or insignificant TOM slope coefficients, we observe that the economic and statistical significances of other covariates do not change greatly. This fact provides encouragement to studies that find negative or insignificant TOM parameter estimates as our equations suggest that the alternative TOM slopes will probably not materially affect the inferences concerning other variables of interest.
Data Sample
To conduct this study, we utilize the answers to the NAR survey used to create the annual Profile of Home Buyers and Sellers report. The NAR mails out a survey with over 100 questions to a random sample of recent home buyers. NAR obtains the consumer names from Experian, which maintains an extensive database of home buyers derived from county records. Because the data come from Experian and not only NAR members, the sample includes homes sold by owners without the assistance of a Realtor®.
While the survey is sent to homebuyers, we primarily use the property and demographic information of owners of previous homes for our study. The nature of our investigation of TOM necessitates this focus. For instance, the amount of time buyers are in the market is recorded, but the marketing durations of purchased homes are not recorded and there is no information on sellers of the purchased home.
The trade-off in using transactions of owners with homes that have sold or are trying to sell is that we remove first-time home buyers.Footnote 3 We find that only about 1.6% of the NAR responders are first-time homebuyers, which means that approximately 50 transactions are removed from our sample of over 3000.
Our full-panel sample consists of property-level transactions of single family homes and townhouses from the 2010 to 2012 annual surveys. The 2010 survey includes sales from 2009. We restrict our sample to the surveys beginning in 2010 for multiple reasons. Foremost, earlier questionnaires do not ask the questions required for the analysis. For example, the 2009 questionnaire does not ask whether a responder is a first-time seller or the number of bedrooms and bathrooms. We also desire to control for structural quality and loss aversion, which we can obtain using the original prices paid at purchase. Earlier surveys did not ask for this variable.
In constructing the data panel we remove observations with extreme or suspicious answers as well as those that lack relevant data. For instance, we drop six records that indicate a home age between 300 and 815 years and four observations indicating owner’s age less than 21 years old. We also remove 19 survey responses that indicate TOM in excess of 4 years. We further trim both ends of the transaction price distribution at the 1% level to focus on the more typical housing stock.
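The screening rules above can be sketched as a simple filter. The field names (home_age, owner_age, tom_weeks, price) are hypothetical, and the rank-based cutoff used for the 1% trim is an assumption, since the text does not state how the percentiles are computed.

```python
def clean_sample(records, trim=0.01):
    """Apply the screening rules to a list of dict records.

    Field names are hypothetical placeholders for the survey variables.
    """
    kept = [
        r for r in records
        if r["home_age"] < 300            # drop implausible home ages
        and r["owner_age"] >= 21          # drop owners younger than 21
        and r["tom_weeks"] <= 4 * 52      # drop TOM in excess of 4 years
    ]
    # Trim both tails of the transaction price distribution.
    # The rank-based cutoff is an assumption, not the article's stated rule.
    prices = sorted(r["price"] for r in kept)
    k = int(len(prices) * trim)
    if k == 0:
        return kept
    lo, hi = prices[k], prices[-k - 1]
    return [r for r in kept if lo <= r["price"] <= hi]
```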
A potential drawback of the dataset is the low response rate, which is approximately 10% for the sample period. Despite the low response rate, the dataset befits our analysis primarily due to the many hard-to-observe variables that can impact price-TOM preferences. Further, Genesove and Han (2012) show the distribution of the NAR survey responses is similar to that of other surveys that have higher response rates but lack the attributes important to our analysis. Additionally, the inclusion of the many economic and demographic attributes provides a linear approximation of a Heckman selection term that helps mitigate sample-selection bias. For instance, the propensity to respond to the survey may depend on responders’ demographic characteristics. Retired sellers may have lower opportunity costs of time while higher income owners have greater opportunity costs.
Table 1 reports the many sample attributes. The median sold residence is 25 years old with 3 bedrooms, 2 bathrooms, and 2000 square feet. The median original purchase price is $177,600 with a selling price of $220,000. The mean (median) TOM is 19.05 (10) weeks. The median period owners hold their properties is 8 years.
We group the many additional characteristics as seller motivations, demographics, search costs, and structural atypicalities. As to their motivations, sellers provide multiple measures, one of which is urgency. Springer (1996) and Knight (2002) use keywords like “motivated” and “must sell” in the comments field of MLS datasets to proxy for seller urgency. Springer (1996) finds that motivated sellers experience a decrease in transaction prices and an increase in the TOM; however, the models do not control for the simultaneity of the two outcomes. Knight (2002) finds no effect on transaction prices. Keyword studies such as these are challenging because the precise level of sellers’ urgency is hard to measure as entering keywords in MLS comes at little cost. Instead of having to proxy for urgency, the NAR survey asks the sellers’ levels of urgency. Seventeen percent of sellers express high urgency, 44% indicate some urgency, and 39% answer as not being urgent.
Another general category of motivation is the sellers’ reasons for selling. Nineteen percent of the respondents indicate job relocation, 8% note a change in the family status such as a divorce or the birth of a child, 3% indicate cash constraints resulting in the property being too expensive to keep, and 4% of the sample indicate a desire to avoid foreclosure. In addition, short sales constitute another 3% of the sample. The reasons indicating the property is too expensive to keep and a desire to avoid foreclosure provide insight into the financing of the property. In addition to property taxes, insurance, and maintenance–the last being something that can be deferred–the major commitment for many homeowners is the monthly mortgage payment. The property being too expensive or the owners specifically noting they want to sell to avoid foreclosure can impact price-TOM preferences when the mortgage is a constraint or when the home value is close to the mortgage amount such as in Genesove and Mayer (1997, 2001).Footnote 4
Regarding demographics, a majority of the responders are white, speak English, and were born in the U.S. While 93% of the responders identify themselves as white, this is less than the almost 96% reported in Harding et al. (2003b). First-time sellers represent 38% of the sample.
The household composition is revealed through three factors: income, number of children, and number of earners. For anonymity, the NAR survey reports household income within one of sixteen levels, which we report in Table 1. Income is distributed unimodally across the levels with the largest percentage being incomes between $100,000 and $125,000. Our analysis examines each of these categories coded as a binary variable and also a continuous measure of log income using the midpoint of each category with the top-coded upper category set to $1.5 M. The results are invariant to the income measure so we report each category.
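The continuous income measure can be sketched as follows. The function name and the way brackets are passed in are illustrative assumptions; only the midpoint rule and the $1.5 M top-code come from the text.

```python
import math

TOP_CODE = 1_500_000  # value assigned to the open-ended top category

def log_income(lower, upper=None):
    """Log of the bracket midpoint; upper=None marks the top-coded bracket."""
    if upper is None:
        return math.log(TOP_CODE)
    return math.log((lower + upper) / 2)
```

For the modal bracket of $100,000 to $125,000, the midpoint is $112,500, so the measure is the log of 112,500.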
The number of children in the household can affect the price-TOM tradeoff. For instance, Harding et al. (2003b) find that having school-age children can reduce bargaining power and thus impact prices, especially during the school year. The effect of the number of household earners on prices and liquidity is unclear. Multiple earners should exhibit higher household incomes and greater house prices, although the income measures may control for this. Alternatively, multiple earners may decrease the motivation to sell if one earner holds high-quality employment, although the urgency variables will probably capture at least some impact of this characteristic.
The sample has three general measures of search costs. Since moving further away from the current residence should increase search costs, the first is the separation distance between the sold and purchased residences using the ZIP code at each location. We analyze separation distance in two ways. The first is the natural log of the continuous variable using the actual distance. The second is to group distances into bins to investigate if there is any additional nonlinearity. The results and conclusions are invariant to the proxy so we report the slope coefficients for each bin.
Using the ZIP code of the sold property, we control for approximately 455 locational fixed effects at the three-digit ZIP code level. Fixed effects control for the market tightness and thinness. Locational fixed effects also may help control for local tax regimes, labor markets, and additional demographics.
The other measures of search costs are selling to a friend or acquaintance and additional marketing methods beyond MLS listing, including holding an open house and using magazine, flyer, print, and television advertising. Each is coded as a binary variable.
We also control for atypicality following Harding et al. (2003a) to compute measures of unique structural characteristics. The controls represent approximately 1% of the sample for each type. We define a new home as 2 years old or less and an old home as 120 years old or more. A large home is greater than 5000 square feet and a small home is less than 900 square feet. A home has many bathrooms if there are 5 or more and many bedrooms if there are 6 or more.
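The thresholds above translate directly into binary indicators. The function and key names below are hypothetical; the cutoffs are those stated in the text.

```python
def atypical_flags(age, sqft, baths, beds):
    """Return 0/1 atypicality indicators using the cutoffs stated in the text."""
    return {
        "new_home": int(age <= 2),
        "old_home": int(age >= 120),
        "large_home": int(sqft > 5000),
        "small_home": int(sqft < 900),
        "many_baths": int(baths >= 5),
        "many_beds": int(beds >= 6),
    }
```

The median residence in the sample (25 years old, 2000 square feet, 2 bathrooms, 3 bedrooms) receives a zero on every indicator.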
Harding et al. (2003a) also include the inverse Mills ratio (IMR) as an atypical feature. The IMR may measure price-liquidity preferences because atypical homes may experience thinner markets and possibly spend more time on the market than more typical properties before finding a buyer desiring the particular characteristics of the atypical home. Since the sample includes properties that have not sold in each survey year, we compute the IMR using the Heckman two-step procedure to correct for possible sample selection bias from modeling prices using only sold observations. Appendix 2 provides the first-stage probit model to compute the IMR.
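For readers unfamiliar with the second step of the Heckman procedure, the sketch below computes the inverse Mills ratio from a fitted probit index. It assumes the index (the linear predictor from the first-stage probit) is already available; the probit itself is described in Appendix 2 and is not reproduced here.

```python
import math

def inverse_mills(index):
    """Inverse Mills ratio: standard normal pdf over cdf, at the probit index."""
    pdf = math.exp(-0.5 * index ** 2) / math.sqrt(2 * math.pi)
    cdf = 0.5 * (1 + math.erf(index / math.sqrt(2)))
    return pdf / cdf
```

The ratio falls as the predicted probability of sale rises, so properties that are unlikely to sell carry a larger correction term. For very negative index values the cdf approaches zero, and a numerically safer formulation would be needed.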
Empirical Findings
All of our tests model transaction prices as the primary dependent variable. We fit a limited-information simultaneous system using 2SLS in which we instrument for the secondary dependent variable (TOM). Even though TOM is a duration variable, one must be careful to use a model for TOM that is linear in the parameters to stay within the standard 2SLS framework. One advantage of this framework is that consistency of 2SLS does not depend on the correct specification of the first stage regression (Kelejian 1971).Footnote 5 We do not investigate the alternative situation where TOM is the primary dependent variable and price is the secondary dependent variable because instrumented prices routinely show a positive relation when used in a TOM equation. If one moved to a full-information simultaneous equation system, it might be possible to use the hazard nature of the TOM variable to improve efficiency. This opens up some future research possibilities but lies beyond the scope of this study.
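One way to write the two-equation system described above is given below. The notation is an assumption introduced for illustration: P is the transaction price, X collects the included exogenous variables, Z collects the excluded instruments, and the hat denotes the first-stage fitted value.

```latex
\begin{aligned}
\ln \mathit{TOM}_i &= Z_i'\pi + X_i'\delta + \nu_i
  && \text{(first stage, reduced form)} \\
\ln P_i &= \alpha + \beta\,\widehat{\ln \mathit{TOM}}_i + X_i'\gamma + \varepsilon_i
  && \text{(second stage, structural price equation)}
\end{aligned}
```

The coefficient of interest is β, the partial elasticity of price with respect to TOM. Simultaneity means that ν and ε are correlated, which is why the fitted value rather than observed TOM enters the price equation.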
We begin the empirical analysis by determining the naïve relation between prices and TOM using OLS. The first model considers independent variables typical to MLS and public-record datasets. To this specification, we then add the NAR data to determine if any are omitted variables that alter the TOM slope coefficient.
We next use a system of equations and 2SLS to control for simultaneity and investigate the strength of possible instruments of TOM. The first specification considers MLS-type variables, which test statistics indicate yields a weak TOM IV. The second model uses the NAR data to form a strong TOM IV based upon the test statistics. The OLS and weak-IV models exhibit negative or insignificant slope coefficients on TOM whereas the specification with a strong IV produces the expected positive relation between TOM and prices.
With identification of a strong IV, we next consider omitted variable bias by incrementally removing determinants that correlate with prices. We begin by withholding explanatory variables that are not typical to studies using MLS and public record data and end by removing all possible determinants except the TOM IV and the temporal and spatial fixed effects. In each case, TOM exhibits a positive slope coefficient.
The remainder of the paper considers the aforementioned other possible explanations found in the literature (e.g., severe overpricing). We include in each specification the strong TOM IV as well as all the NAR variables. In each of these tests, TOM is consistently positive.
OLS Models
The analysis begins with OLS models that use the natural log of uninstrumented TOM as an independent variable. Table 2 reports the results. In the first model, we restrict the explanatory variables to those typically found in MLS and public-records datasets. The second model introduces the economic and demographic information. In either case, TOM is inversely correlated with prices.
Simultaneity and a System of Equations
We now control for the simultaneity between the transaction outcomes using 2SLS, which we do throughout the remainder of this article.Footnote 6 We first investigate the use of instruments available in typical MLS datasets. The challenge is that price and marketing durations are generally determined by identical variables. Turnbull and Dombrow (2006) propose a method to address this issue using local market competition measures. Since ours is a national study and the market competition variables have limited impact, we empirically find variables within our data that correlate with TOM but not prices. Using the MLS-type dataset, we find one instrument, which is detached single-family residences (SFR) versus townhouses. A rationale for the TOM relation is that detached SFR take longer to sell due to greater heterogeneity compared to townhouses.
Model (1) in Table 3 reports the first-stage reduced-form models of TOM. We include both the instruments and the included exogenous variables in Table 3 to show the additional determinants of TOM, which will be important in later investigation of omitted variables.
Using the predicted value of TOM from the MLS-type Model (1) equation in Table 3, we model the structural price equation and report the results in Table 4. We note first that the parameter estimate on TOM in Table 4 is insignificant. We also observe that the endogeneity statistics reported at the bottom of the column indicate a weak IV.
We next consider the full information set in both the reduced form model in Table 3 and the structural equations in Table 4. In the first stage, we add variables that exhibit multivariate correlation with TOM but do not exhibit a relation with prices. These are selling to an acquaintance and various marketing methods. Selling to an acquaintance can reduce the marketing duration if an agreement is reached early or even before the property is listed to the general public. Marketing methods can impact TOM through additional publicity that can increase buyers’ rate of arrival. Somewhat unexpectedly, the results demonstrate that the marketing methods increase TOM. This is consistent with owners not finding initial matches and invoking additional publicity. Model (2) in Table 3 reports the updated reduced form equation.
Model (2) in Table 4 presents the results with the improved TOM IV as well as the additional economic and demographic measures as covariates in an unrestricted structural equation. The endogeneity statistics at the bottom of the column demonstrate that TOM is indeed endogenous with transaction prices and that the TOM IV is strong and not overidentified. The partial R² is 0.150 with an F statistic of 58.73. The partial elasticity of 0.043 on TOM is positive and significant.
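The two instrument-strength statistics quoted above can be computed from first-stage sums of squared residuals. The sketch below uses the textbook homoskedastic formulas; whether the reported statistics are computed this way or with a robust variant is not stated, so treat this as an assumption. The argument names are hypothetical.

```python
def first_stage_strength(ssr_restricted, ssr_unrestricted,
                         n_obs, n_params, n_instruments):
    """Partial R-squared and F statistic for the excluded instruments.

    ssr_restricted:   first-stage SSR without the excluded instruments
    ssr_unrestricted: first-stage SSR with them
    n_params:         number of parameters in the unrestricted first stage
    """
    gain = ssr_restricted - ssr_unrestricted
    partial_r2 = gain / ssr_restricted
    f_stat = (gain / n_instruments) / (ssr_unrestricted / (n_obs - n_params))
    return partial_r2, f_stat
```

A common rule of thumb treats a first-stage F below about 10 as a sign of weak instruments.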
There are a few additional relations of note in Model (2) in Table 4. The first is structural quality. Stigmatization is often cited when housing researchers find an inverse relation between transaction prices and TOM. However, the effect should be idiosyncratic for a few lemon properties and not systematic across the price distribution. Nonetheless, quality is clearly an important part of housing service flow and frequently omitted in empirical housing models.
We thus follow Genesove and Mayer (2001) to compute quality as the residuals from a hedonic model of values at the time of purchase. Quality is the portion of the previous transaction prices that the regression did not predict and the unobserved home qualities that partially determine the prices at the time of the original purchases. To the extent these qualities do not change significantly over time, the residuals are a relevant measure for their impact on future transaction prices.
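The residual-based quality measure can be sketched with a deliberately simplified hedonic model. The sketch assumes a single regressor (log square footage) so that it runs without external libraries; the actual hedonic specification contains many more attributes.

```python
def hedonic_residuals(log_sqft, log_purchase_price):
    """Residuals from a one-regressor hedonic fit of original purchase prices.

    A positive residual marks a home that sold for more than its observed
    attributes predict, which is read as higher unobserved quality.
    """
    n = len(log_sqft)
    mx = sum(log_sqft) / n
    my = sum(log_purchase_price) / n
    sxx = sum((x - mx) ** 2 for x in log_sqft)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log_sqft, log_purchase_price))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x)
            for x, y in zip(log_sqft, log_purchase_price)]
```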
Consistent with buyers preferring higher quality homes ceteris paribus, we find the slope coefficient is significant and positive in all of the price specifications. We also observe that TOM and quality are not correlated in the first-stage Model (2) in Table 3, which is empirical evidence that does not support the quality/stigmatization hypothesis.
Concerning other predictive variables, the results demonstrate that urgency, income, and race are price determinants. Sellers exhibiting greater urgency experience lower transaction prices, which is consistent with higher discount rates. There is a monotonic increase in transaction prices with higher incomes. This finding is consistent with households having greater (lower) incomes owning homes with more (less) service flow, which can include locational and neighborhood effects. The results also demonstrate that African American sellers realize a decrease in transaction prices while Asian sellers experience an increase.
Omitted Variables
The results thus far demonstrate that the parameter estimates on TOM switch signs between the OLS models in Table 2 and the 2SLS specification in Table 4. Because the IV specifications are joint tests of omitted and instrumental variables, we next isolate the omitted-variable hypothesis by removing significant determinants that are generally not observed in MLS samples and thus can potentially lead to biases.
There are TOM determinants in the reduced form equations in Table 3 that are also correlated with prices in the structural equation in Table 4. These are sellers’ incomes, ages, and races along with short sales and holding periods. If these determinants are withheld from the model, the coefficient estimate on TOM may be biased. To test for omitted-variable bias, we remove these variables from the previously unrestricted model while modeling the TOM IV using the high-quality instruments from Model (2) in Table 3.
Model (1) in Table 5 reports the initial findings. In addition to the other covariates displaying the same sign and significance as in Table 4, the results demonstrate that the TOM IV maintains a positive and statistically significant effect on transaction prices. The statistical tests at the bottom of the column indicate that TOM is endogenous with prices and the IV is not weak. Thus, with a strong IV, the TOM estimate is not impacted by the omitted correlated variables.
Since withholding variables that impact both TOM and prices does not change the inferences, we investigate other specifications in which sellers may change their price-TOM preferences and possibly cause bias if a significant determinant is not included in the specification. While not found in our data, a higher degree of urgency can cause a seller to prefer an increase in the probability of sale. To accomplish this, a seller can reduce their list and reservation prices, which should decrease TOM. We therefore reconsider Model (1) without the urgency variables. We do not present the results because they are not materially altered. We also consider the omission of separation distance since sellers moving greater distances may prefer quicker sales to avoid property maintenance costs as an absentee owner as well as higher transaction costs due to travelling back to the previous location to complete a sales transaction. Again, we find the results are invariant to the removal of the separation distance measures and thus do not present them.
We next examine a specification that omits all of the additional economic and demographic information and uses the information found in typical MLS datasets. This test differs from Model (1) in Table 4 in that it includes a strong TOM IV. Model (2) in Table 5 reports the results. The high-quality TOM measure maintains an endogenous, positive, and significant relation with prices. The IV is strong with a partial R² of 0.148 and an F statistic equal to 49.51.
Lastly, we consider an acutely restricted model that omits all covariates except the strong TOM IV and the spatial and temporal fixed effects. Model (3) in Table 5 shows that TOM still positively impacts transaction prices.
Severe Overpricing
At times the literature attributes an inverse correlation between prices and TOM to severe overpricing such as in Knight (2002). The argument is that if homeowners set their list prices much higher than the expected market value, the home will sit on the market until the homeowner reduces the asking price or withdraws the property from the market. While on the market sellers may reduce their reservation prices (Huang and Palmquist, 2001) and/or gain information about the market value and subsequently decrease their list prices. The reduced list prices should result in transaction prices near the expected market value or maybe even less than the expected value. Combining the extended TOM with prices close to expected market values, a negative relation can result between the transaction outcomes.
It is critical to note, however, that this is an asymmetric relation at the upper tail of the TOM distribution that does not negate the first-order positive relation between prices and TOM, and therefore should not yield an overall negative coefficient on TOM. It is asymmetric because homes that are underpriced for quick sale should not consistently spend more TOM than expected. Instead, the severe overpricing is potentially an omitted variable.
There is an empirical challenge in measuring severe overpricing due to additional functional-form simultaneity. The severe overpricing cannot simply be measured as the difference between list prices and transaction prices and included as an independent variable, which is sometimes seen in the extant literature and termed degree of overpricing. The first reason is that this difference does not consider just the severe overpricing but also typical overpricing by sellers allowing for negotiation with potential buyers. Moreover, since the other covariates are predicting transaction prices, the simple price difference is a close proxy for list prices and, thus, not appropriate to include as an explanatory variable in a transaction price specification. Similar issues exist in using predicted list or transaction prices as possible inputs to the measure of severe overpricing. Conceptually, the price model can include an IV for the severe overpricing in a system of equations, but we find the challenge is finding high-quality instruments that separately measure TOM and severe overpricing.
We thus address the additional simultaneity using two methods. The first is to include a quadratic term to better fit the tails of the TOM distribution. The second is to explore breakpoints in the TOM IV to isolate the severe overpricing. Table 6 reports the findings.
In Model (1), we change from using the natural log of TOM to the level and add the TOM quadratic. The specifications include the full set of covariates as in Model (2) in Table 4 as well as the strong set of TOM instruments specified in Model (2) of Table 3. We use these unrestricted reduced-form and structural equations in all remaining models.
The combined slope coefficients in Model (1) demonstrate the positive first-order relation but at a decreasing rate. The parameter estimate on the quadratic narrowly misses statistical significance at the 5% level (p-value = 0.057). The combined relation indicates lower (higher) priced homes will experience shorter (longer) TOM, but to a point. For severely overpriced homes, TOM can become quite long and not result in a higher price. At the other end of the distribution, the TOM on extensively underpriced homes is truncated from below at zero and homeowners can expect to spend some amount of time marketing the property.
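The shape implied by a positive linear term and a negative quadratic term can be made concrete with a short sketch. The coefficient values used in the usage note are purely hypothetical and are not taken from Table 6.

```python
def marginal_effect(b1, b2, tom_weeks):
    """Derivative of b1*TOM + b2*TOM**2 with respect to TOM."""
    return b1 + 2 * b2 * tom_weeks

def turning_point(b1, b2):
    """TOM at which the marginal effect of TOM on price reaches zero."""
    if b2 >= 0:
        raise ValueError("a turning point requires a negative quadratic term")
    return -b1 / (2 * b2)
```

With hypothetical values b1 = 0.004 and b2 = -0.00002, the marginal effect is positive at short durations and reaches zero at 100 weeks, after which additional time on the market is associated with lower prices.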
In Model (2), we use the log of TOM and explore breakpoints. The first we find is at 52 weeks. The results demonstrate a positive partial elasticity on the first TOM segment and an insignificant parameter estimate on the segment that is greater than 52 weeks. In Model (3), we set the breakpoint at 65 weeks. The partial elasticity on the shorter TOM segment increases compared to Model (2). Consistent with the severe overpricing hypothesis, the slope coefficient on the segment beyond this breakpoint is an economically significant negative partial elasticity; however, it is not statistically significant.
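A breakpoint specification of this kind is often built as a piecewise-linear spline in log TOM. The construction below is one common approach and an assumption here, since the text does not spell out how the two segments are formed.

```python
import math

def tom_segments(tom_weeks, breakpoint_weeks=52):
    """Split log TOM into a segment below and a segment above the breakpoint.

    Returns (below, above): 'below' is capped at the log of the breakpoint,
    'above' is the excess of log TOM over it (zero for shorter durations).
    """
    log_tom = math.log(tom_weeks)
    knot = math.log(breakpoint_weeks)
    return min(log_tom, knot), max(log_tom - knot, 0.0)
```

Entering both segments as regressors lets the price elasticity differ on either side of the breakpoint while keeping the fitted relation continuous.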
Local Market Dynamics
A national study such as ours mitigates a type of sample selection bias since it does not concentrate within one local market. Alternatively, the total summation of TOM into one slope coefficient could suffer from aggregation across markets, regions, or states of nature. To mitigate this issue, each of the specifications includes ZIP code and annual fixed effects. We find that regressing prices on just the ZIP code fixed effects explains a creditable 23% of the price variation.
We next consider alternative measures of local market dynamics as possible omitted variables. There is an evolving literature examining prices and TOM based upon the state of the market related to tightness, which is the ratio of buyers to sellers. The central idea is that housing liquidity varies over time and different states of nature. In tight or hot markets the number of potential buyers actively seeking alternative housing service flows increases relative to sellers. This can result in rising house prices, higher sales volumes, and higher market liquidity. The greater liquidity should translate into lower TOM. The transaction outcomes are reversed in cold markets. Thus, the relation between prices and TOM is negative. Krainer (1999), Krainer (2001), Krainer and LeRoy (2002), and Novy-Marx (2009) provide theoretical models and calibrations while Carrillo and Pope (2012), Carrillo (2013), Gan (2013), and Carrillo, de Wit, and Larson (2015) contribute recent empirical studies.
Note, though, that the negative relation in this literature is a function of changes in prices and TOM or the comovement of prices and liquidity over time. Hot and cold markets are relative to another market geographically or to a previous time period within the same market. This is not the focus of the housing literature using hedonic price models with TOM as a possible determinant.
Search-theoretic models do however allow for varying levels of buyers’ arrival rates and economic shocks at local levels.Footnote 7 Therefore, to confirm that the TOM relation is not an aggregation across markets, we first include the market tightness index of Carrillo (2013) in our unrestricted structural equation.
The Carrillo (2013) index, denoted as θ, is a relative measure of the bargaining power between sellers and buyers. Since ours is a national study, we follow the estimation for aggregate data. We bin transactions at the 2-digit ZIP code level to have sufficient observations to calculate θ. To estimate the structural parameters, we compute per bin the mean log list prices, mean log transaction prices, the share of transactions that occurred at a price below the list price, and the mean number of days that a property stays on the market. We then calculate the structural parameters following the equations in Carrillo (2013). The final parameter computed is the bargaining measure, which ranges from 0 to 1 with higher (lower) values indicating greater bargaining power for sellers (buyers).
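The per-bin inputs described above can be sketched as follows. This is a minimal illustration that assumes each transaction is a tuple of ZIP code, list price, transaction price, and days on the market. The function name `bin_summary` and the sample records are hypothetical. The sketch stops at the summary statistics and does not reproduce the structural equations of Carrillo (2013).

```python
import math
from collections import defaultdict

def bin_summary(transactions, zip_digits=2):
    """Summarise transactions by ZIP-code prefix.

    `transactions` holds (zip_code, list_price, sale_price, days_on_market) tuples.
    """
    bins = defaultdict(list)
    for zip_code, list_price, sale_price, days in transactions:
        bins[zip_code[:zip_digits]].append((list_price, sale_price, days))

    summary = {}
    for prefix, rows in bins.items():
        n = len(rows)
        summary[prefix] = {
            "n": n,
            "mean_log_list": sum(math.log(lp) for lp, _, _ in rows) / n,
            "mean_log_sale": sum(math.log(sp) for _, sp, _ in rows) / n,
            # share of transactions that closed below the list price
            "share_below_list": sum(sp < lp for lp, sp, _ in rows) / n,
            "mean_days": sum(d for _, _, d in rows) / n,
        }
    return summary

# Made-up records for illustration only.
sample = [
    ("60614", 200000, 190000, 30),
    ("60601", 100000, 100000, 10),
    ("58102", 150000, 155000, 5),
]
stats = bin_summary(sample)
print(stats["60"])
```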
The descriptive statistics on θ are included in Table 1. The average value is 0.33 with a minimum of 0.10 and a maximum of 0.84. Consistent with our priors concerning market conditions during the sample period, we find that the observations with the lowest θ are all transactions around the Chicago MSA, about 2% of the sample. Conversely, the highest θ markets include Fargo, North Dakota and many of the cities that suffered the most during the financial crisis from 2007 to 2009 and rebounded during the sample period. These include Los Angeles, Sacramento, and Fresno, California along with Albuquerque, New Mexico.
We employ θ in two ways. In the first, we use the continuous variable. In the second, we convert θ to binary variables to mitigate another possible endogeneity issue. Table 7 details the findings using the first method. In Models (1) and (2) we restrict the specification to only θ and a constant. As expected, the results demonstrate that the hotter (colder) markets experience an increase (decrease) in prices in Model (1) and a decrease (increase) in TOM in Model (2). In Model (3), we include the TOM IV along with the covariates but omit the spatial fixed effects because they measure similar treatments as θ and we find they are highly collinear. Prices continue to exhibit a positive relation with θ in Model (3). Further, the partial elasticity on the TOM IV is positively and significantly correlated with transaction prices.
The computation of θ, while nonlinear, includes the difference in list and transaction prices as well as the level of TOM. The fact that sellers’ bargaining power is a function of the market fundamentals that we are examining introduces the possibility of simultaneity between θ and the price and TOM disturbances. To mitigate the possible issue, we replace the continuous variable with two binary variables. We set Hot (Cold) equal to one when the value of the index places the observation in the upper (lower) quartile of the θ distribution and zero otherwise.
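The conversion of θ into the Hot and Cold indicators can be sketched as below. The quartile cut points come from Python's `statistics.quantiles` with its default interpolation, which is an assumption since the text does not specify how ties or cut points are handled. The function name `hot_cold_flags` and the θ values are illustrative.

```python
import statistics

def hot_cold_flags(thetas):
    """Return (hot, cold) 0/1 lists for the upper and lower quartiles of theta."""
    q1, _, q3 = statistics.quantiles(thetas, n=4)   # three quartile cut points
    hot = [1 if t > q3 else 0 for t in thetas]
    cold = [1 if t < q1 else 0 for t in thetas]
    return hot, cold

# Illustrative theta values spanning the reported range of 0.10 to 0.84.
thetas = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.84]
hot, cold = hot_cold_flags(thetas)
print(hot)
print(cold)
```

Observations in the middle two quartiles receive zero on both indicators and serve as the omitted category.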
Table 8 reports the results. In Models (1) and (2) we again restrict the specification to only a constant term and the hot and cold measures to confirm the base relation. Hotter markets continue to experience an increase in prices and a decrease in TOM. Properties in cold markets do not initially experience a change in prices or TOM. This changes in Model (3), which is the unrestricted specification with all other covariates and the TOM IV. Now transactions in colder markets experience price decreases of approximately 5%. We note that the partial elasticity on TOM continues to be positive and significant.
A different method to measure local market dynamics is to use aggregate demand and supply measures in place of θ and spatial fixed effects. Following Genesove and Han (2012), we include population and income at the MSA level as demand drivers. Homeowners with high incomes have a greater willingness to pay for consumption amenities so that improvements in these amenities will be accompanied by higher income people moving into the local area. Population can also impact housing demand through channels such as employment opportunities, consumption amenities, and zoning laws. On the supply side, we include population density following the literature such as Green, Malpezzi, and Mayo (2005), who find that density is a negative predictor of supply inelasticity.
We examine the macroeconomic variable levels as well as their first differences (Δ) computed as continuously-compounded percentages. We investigate the long-run population levels and changes using the year 2000 US Census values compared to the 2010 levels. This considers the housing stock. We also examine short-run changes using year 2008 compared to 2010 levels, which investigates the housing flow. We find the results and conclusions regarding the elasticity of TOM are invariant to the supply and demand variables. Table 9 reports the equations with Model (1) including the demand-side drivers and Model (2) adding population density levels. In both instances, the TOM IV continues to exhibit a positive and significant elasticity with transaction prices.
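As a sketch of the first-difference computation, a continuously compounded percentage change in a variable x between a base year t_0 and an end year t_1 can be written as follows. Scaling by 100 and leaving the change unannualized are assumptions, since the text does not state either choice.

\[ \Delta x_{t_0,t_1} = 100 \times \ln\left(\frac{x_{t_1}}{x_{t_0}}\right) \]

For the long-run population change, t_0 = 2000 and t_1 = 2010; for the short-run change, t_0 = 2008 and t_1 = 2010.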
Loss Aversion
Empirical studies such as Genesove and Mayer (1997, 2001) and Hayunga and Pace (2016) find that sellers are loss averse and reluctant to realize a loss when the current transaction price is expected to be below the original price paid for their home. Sellers consequently set list prices higher than expected, which can impact marketing durations such that loss aversion may be a unique omitted variable in price equations. Since the original purchase price is observable in the NAR dataset, we are able to compute the proxy for loss aversion using the method laid out by Genesove and Mayer (1997, 2001).
When expected losses are added to our standard unrestricted model, we observe in Table 10 a positive and significant correlation with transaction prices, which is consistent with Genesove and Mayer (2001) and Hayunga and Pace (2016). We also include the binary variables of hot and cold markets, which are significant in the expected direction. Again, the relation between TOM and prices continues to be positive and significant.
Localized Markets
Real estate trades in local markets, and while we account for this aspect using spatial fixed effects, these controls along with the atypicality variables may introduce two additional considerations. The first is that fixed effects across the entire US at the 3-digit ZIP code level can result in just a couple of observations per ZIP code control for our sample. Another concern with a national study is that the atypicality variables may not generalize across the entire country. For instance, slightly more than 1% of the sample consists of homes greater than 5000 square feet. While this is a large home in a major metropolitan area like New York City, it may be less atypical in a more rural area in, say, Texas.
Due to these considerations, our last test restricts the sample to homes in five large metropolitan areas in the northeastern US, which are Boston, New York City, Philadelphia, Baltimore, and Washington, D.C. With 559 observations, the sample is large enough to include a sufficient number of observables within each ZIP code area when we bin at the 2-digit level. We also observe that the atypicality variables still measure the extreme tails of the distributions for old, new, small, and large homes as well as those properties with many bathrooms and bedrooms.
The model in Table 11 reports the results from this test. In the specification, we include the same unrestricted Model (2) from Table 4 as well as the expected nominal loss variable. We do not include the hot and cold measures because they are multicollinear with the ZIP code controls. We find that the TOM coefficient of 0.048 is similar in economic magnitude to the previous equations; however, the slope coefficient is not significant at the 5% level. The p-value is 0.06. We note that with fewer observations the standard error increases by approximately 50%, which explains the reduced statistical significance.
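A back-of-the-envelope check shows how the reported estimate and p-value relate to the standard error. The sketch assumes a two-sided test and a normal approximation to the t distribution. The implied standard errors are derived here for illustration and are not figures reported in the tables.

```python
from statistics import NormalDist

coef = 0.048      # TOM estimate reported for Table 11
p_value = 0.06    # two-sided p-value reported in the text

# Normal approximation to the t distribution (an assumption of this sketch).
z = NormalDist().inv_cdf(1 - p_value / 2)
implied_se = coef / z

# If this standard error is roughly 50% larger than in the full sample,
# the full-sample standard error would be about two-thirds of it.
implied_full_sample_se = implied_se / 1.5

print(round(z, 2), round(implied_se, 4), round(implied_full_sample_se, 4))
```

Under these assumptions, a coefficient of similar size paired with the smaller standard error would clear the 5% threshold, which matches the explanation given for the reduced significance.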
Conclusion
Due to fixed location and heterogeneous products, buyers who desire new housing must search for a property with the service flow that meets their tastes and preferences. On the other side of the transaction, sellers do not know which buyers will be a good match for their property. Thus, search theory and the matching process are fundamental to real estate markets. Search-theoretic models show that the probability of sale simultaneously affects both the expected prices and marketing durations of properties, which leads to a positive relation between these outcomes in equilibrium. The empirical research, however, is mixed with a number of models finding insignificant or negative relations. The negative relation is disconcerting because one conclusion is that underpriced homes systematically remain on the market longer than expected.
This study examines explanations for the divergence of the empirical and theoretical results. Our first finding is that, while recent articles have taken more care, many studies still fail to account for the joint determination of prices and marketing durations, which can cause biased results.
Of the studies that address simultaneity, a system of equations is typically modeled using two-stage least squares and instrumental variables. The presence of instrumental variables in a system does not guarantee that the signs on estimated parameters will agree with those predicted from theory. This can result from omitted variables or from weak instruments. Using a unique dataset that provides many variables omitted in other studies, we find that specifications that include a strong TOM instrumental variable produce the expected positive relation between TOM and prices. Alternatively, models with a weak TOM instrumental variable will generate the negative or insignificant parameter estimates on TOM.
When comparing the equations that exhibit negative and insignificant versus positive slope coefficients on TOM we note that the parameter estimates on the other independent variables do not change dramatically. Our tests suggest that the negative or insignificant slopes probably will not materially affect the inferences concerning other variables of interest in price regressions that include TOM.
Notes
1. Appendix 1 provides a brief summary of search models.
2. Some house price studies examine research topics that fit data aggregated at the census tract, county, or larger spatial area. Similar to the NAR dataset, these data usually offer more information about the social, economic, and demographic characteristics of market participants summarized at the geographic area of interest. However, these data are not individual transactions. The American Housing Survey is a source of buyers’ and sellers’ demographic information along with property-level information but does not record marketing durations.
3. We also do not observe sales when the buyer does not purchase another home. By the nature of the survey, this is another necessary trade-off in order to use the rich information provided by the survey responses.
4. Since the surveys are sent to individual sellers, there are no transactions for REO or foreclosed properties.
5. See Angrist and Pischke (2008, pp. 189–192) for a more recent discussion of this.
6. Appendix 3 provides a brief explanation of the econometric issues with simultaneity and omitted variable bias in housing studies as well as IV test statistics.
7. Similar to market tightness, considering an additional market characteristic in the form of varying arrival rates of buyers is consistent with search theory. For instance, the static model of Wheaton (1990) allows for changes in the rate of matching, which is shown to impact TOM. In the quasi-steady state model of Williams (1995), the pricing function is derived in continuous time, which allows changes in local employment or income. Comparative statics of the Williams (1995) model also show that an increase in buyers’ arrival rate increases liquidity. Additionally, Krainer and LeRoy (2002) use standard search theory to motivate two states of nature, hot and cold markets.
8. Parente and Santos Silva (2012) caution that an insignificant statistic on the overidentifying-restrictions test does not preclude a lack of power of the test due to invalid instruments. Kiviet (2017) highlights the importance of an unambiguous hypothesis to mitigate model misspecification as a reason for the insignificant test statistic. Accordingly, the test hypothesis of our paper is solely focused on the relation of TOM in price equations.
References
Angrist, J. D., & Pischke, J.-S. (2008). Mostly harmless econometrics. Princeton: Princeton University Press.
Benefield, J., Cain, C., & Johnson, K. (2014). A review of literature utilizing simultaneous modeling techniques for property price and time-on-market. Journal of Real Estate Literature, 22(2), 149–175.
Carrillo, P. E. (2013). To sell or not to sell: measuring the heat of the housing market. Real Estate Economics, 41(2), 310–346. https://doi.org/10.1111/reec.12003
Carrillo, P. E., & Pope, J. C. (2012). Are homes hot or cold potatoes? The distribution of marketing time in the housing market. Regional Science and Urban Economics, 42(1-2), 189–197. https://doi.org/10.1016/j.regsciurbeco.2011.08.010
Carrillo, P. E., de Wit, E. R., & Larson, W. (2015). Can tightness in the housing market help predict subsequent home price appreciation? Evidence from the United States and the Netherlands. Real Estate Economics, 43(3), 609–651. https://doi.org/10.1111/1540-6229.12082
Gan, Q. (2013). Optimal selling mechanism, auction discounts and time on market. Real Estate Economics, 41(2), 347–383. https://doi.org/10.1111/reec.12002
Genesove, D., & Han, L. (2012). Search and matching in the housing market. Journal of Urban Economics, 72(1), 31–45. https://doi.org/10.1016/j.jue.2012.01.002
Genesove, D., & Mayer, C. J. (1997). Equity and time to sale in the real estate market. The American Economic Review, 87(3), 255–269.
Genesove, D., & Mayer, C. J. (2001). Loss aversion and seller behavior: evidence from the housing market. Quarterly Journal of Economics, 116(4), 1233–1260.
Green, R. K., Malpezzi, S., & Mayo, S. K. (2005). Metropolitan-specific estimates of the price elasticity of supply of housing, and their sources. The American Economic Review, 95(2), 334–339. https://doi.org/10.1257/000282805774670077
Harding, J. P., Rosenthal, S. S., & Sirmans, C. F. (2003b). Estimating bargaining power in the market for existing homes. The Review of Economics and Statistics, 85(1), 178–188. https://doi.org/10.1162/003465303762687794
Harding, J., Knight, J., & Sirmans, C. F. (2003a). Estimating bargaining effects in hedonic models: Evidence from the housing market. Real Estate Economics, 31(4), 601–622. https://doi.org/10.1046/j.1080-8620.2003.00078.x
Haurin, D. R. (1988). The duration of marketing time of residential housing. Journal of the American Real Estate and Urban Economics Association, 16(4), 396–410. https://doi.org/10.1111/1540-6229.00463
Hayunga, D. K., & Pace, R. K. (2016). List prices in the US housing market. Journal of Real Estate Finance and Economics, 55(2), 155–184.
Huang, J.-C., & Palmquist, R. B. (2001). Environmental conditions, reservation prices, and time on the market for housing. Journal of Real Estate Finance and Economics, 22(2/3), 203–219. https://doi.org/10.1023/A:1007891430162
Kelejian, H. H. (1971). Two stage least squares and econometric systems linear in the parameters but nonlinear in the endogenous variables. Journal of the American Statistical Association, 66(334), 373–374. https://doi.org/10.1080/01621459.1971.10482270
Kiviet, J. F. (2017). Discriminating between (in)valid external instruments and (in)valid exclusion restrictions. Journal of Econometric Methods, 6(1), 1–9.
Knight, J. (2002). Listing price, time on market, and ultimate selling price: Causes and effects of listing price changes. Real Estate Economics, 30(2), 213–237. https://doi.org/10.1111/1540-6229.00038
Krainer, J. (1999). Real Estate Liquidity. Economic Review–Federal Reserve Bank of San Francisco, 3, 14–26.
Krainer, J. (2001). A theory of liquidity in residential real estate markets. Journal of Urban Economics, 49(1), 32–53. https://doi.org/10.1006/juec.2000.2180
Krainer, J., & LeRoy, S. F. (2002). Equilibrium valuation of illiquid assets. Economic Theory, 19(2), 223–242. https://doi.org/10.1007/PL00004214
Novy-Marx, R. (2009). Hot and cold markets. Real Estate Economics, 37(1), 1–22. https://doi.org/10.1111/j.1540-6229.2009.00232.x
Parente, P. M., & Santos Silva, J. (2012). A cautionary note on tests of overidentifying restrictions. Economics Letters, 115(2), 314–317. https://doi.org/10.1016/j.econlet.2011.12.047
Springer, T. M. (1996). Single-family housing transactions: seller motivations, price, and marketing time. Journal of Real Estate Finance and Economics, 13, 237–254.
Stock, J. H., & Yogo, M. (2005). Testing for weak instruments in linear IV regression. In D. W. Andrews & J. H. Stock (Eds.), Festschrift in honor of Thomas Rothenberg. Cambridge: Cambridge University Press. https://doi.org/10.1017/CBO9780511614491.006
Stock, J. H., Wright, J. H., & Yogo, M. (2002). A survey of weak instruments and weak identification in generalized method of moments. Journal of Business & Economic Statistics, 20(4), 518–529. https://doi.org/10.1198/073500102288618658
Taylor, C. R. (1999). Time-on-the-market as a sign of quality. Review of Economic Studies, 66(3), 555–578. https://doi.org/10.1111/1467-937X.00098
Turnbull, G. K., & Dombrow, J. (2006). Spatial competition and shopping externalities: evidence from the housing market. The Journal of Real Estate Finance and Economics, 32(4), 391–408. https://doi.org/10.1007/s11146-006-6959-4
Wheaton, W. C. (1990). Vacancy, search, and prices in a housing market matching model. Journal of Political Economy, 98(6), 1270–1292. https://doi.org/10.1086/261734
Williams, J. T. (1995). Pricing real assets with costly search. Review of Financial Studies, 8(1), 55–90. https://doi.org/10.1093/rfs/8.1.55
Wooldridge, J. M. (1995). Score diagnostics for linear models estimated by two stage least squares. In G. S. Maddala, P. C. Phillips, & T. N. Srinivasan (Eds.), Advances in econometrics and quantitative economics: Essays in honor of professor C. R. Rao (pp. 66–87). Oxford: Blackwell.
Acknowledgements
Hayunga gratefully acknowledges financial support from the Terry-Sanford Research Award. We thank the National Association of Realtors for the data.
Appendices
Appendix 1: A Search-Theoretic Model
Following Krainer (1999) and Krainer and LeRoy (2002), this section presents a summary of a standard search and matching model. Search models are optimal stopping models and illustrate how best to balance the cost of delaying the sale against the value of the option to try again. Search and matching models generally function in a stationary environment with one searcher per period. This is not to say that search models do not account for shocks. For instance, the seminal model of Wheaton (1990) specifically details the impact of changes in aggregate supply or demand. The important insight in Eqs. (2) and (4) that should guide empirical examination of real property is that prices and TOM are functions of the probability of sale (μ) and thus simultaneously determined.
Consider the market where homeowners attempt to sell their properties to potential buyers who do not live in homes currently and thus search for new properties to purchase. Sellers will choose a price in order to maximize q, the expected value of having a home on the market. Let \( \mu \left(\overset{\sim }{p}\right) \) be the probability that the owner will sell her home when the list price is \( \overset{\sim }{p} \). With \( 1-\mu \left(\overset{\sim }{p}\right) \) the probability of not selling the home, the owner then chooses a price to maximize the expected value of having the house on the market as
where β is a discount rate of future cash flows less than one. The expected value of marketing the house in the next period is given by q, where the expectation reflects the fact that the next period’s value is uncertain. The optimal pricing strategy will satisfy the first-order necessary condition of (1) as
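For reference, the seller's problem and its first-order condition described above can be sketched in LaTeX. The notation is an assumption built from the surrounding definitions: \( \tilde{p} \) is the list price, E[q] is the expected value of marketing the home next period, and \( \mu'(\tilde{p}) \) is the derivative of the sale probability with respect to the list price. The exact form in Krainer (1999) and Krainer and LeRoy (2002) may differ in detail.

\[ q = \max_{\tilde{p}} \; \mu(\tilde{p})\,\tilde{p} + \left[1-\mu(\tilde{p})\right]\beta\,E[q] \]

\[ \mu(\tilde{p}) + \mu'(\tilde{p})\left(\tilde{p} - \beta\,E[q]\right) = 0 \]

The second line follows from differentiating the objective with respect to \( \tilde{p} \). Because \( \mu'(\tilde{p}) < 0 \), the condition balances the gain from a higher price if the home sells against the lower probability of selling at that price.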
Turning to the buyer’s perspective, the value of the home under consideration is not the housing unit per se, but the flow of services it provides. Buyers will approach the market looking to consume the housing service flow, denoted as γ, at the beginning of the next period to maximize their lifetime expected utility. For buyers who are contemplating buying a home with service flow γ for price p, the optimal strategy is to buy if the expected value of the house, net of price, is greater than the option to search again next period. Note, the flow is a reservation service flow and plays the same role as a seller’s reservation price. That is, each buyer has a reservation service flow or reservation dividend, γ∗, at which he or she is indifferent between buying a house for the asking price \( \overset{\sim }{p} \) or continuing to search. A buyer then continues searching if she draws γ < γ∗ and buys the property if she draws γ ≥ γ∗.
To close the model, observe the connection between the dividend from the reservation service flow and the probability of sale from the seller’s optimization. That is, the probability of sale is simply that a buyer draws a dividend greater than the reservation dividend such that
where the seller’s value of \( \mu \left(\overset{\sim }{p}\right) \) simplifies to μ in equilibrium and the idiosyncratic dividend is distributed uniformly on the unit interval [0,1].
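Under the stated uniform distribution on [0,1], the link between the sale probability and the reservation dividend can be sketched as follows. This is a reconstruction from the surrounding definitions and not necessarily the exact expression used in the cited models.

\[ \mu = \Pr\left(\gamma \ge \gamma^{\ast}\right) = 1 - \gamma^{\ast} \]

A higher reservation dividend, which accompanies a higher asking price, therefore lowers the per-period probability of sale one for one.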
In equilibrium, Krainer (2001) shows how to expand the reservation service flow of buyers to determine the TOM. As in Krainer (1999) and Krainer and LeRoy (2002), the expected TOM is
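As a sketch of why TOM depends on μ, assume that each period brings an independent sale with probability μ, so that the number of periods until sale is geometrically distributed. Two common conventions follow, and which one the cited papers adopt is not confirmed here.

\[ E[\text{periods until sale}] = \sum_{k=1}^{\infty} k\,\mu\,(1-\mu)^{k-1} = \frac{1}{\mu}, \qquad E[\text{unsold periods before sale}] = \frac{1-\mu}{\mu} \]

Under either convention expected TOM falls as μ rises. Since the seller's optimal price also depends on μ, both outcomes are functions of the same probability of sale.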
Appendix 2: Sample Selection
Since the NAR survey is sent to both buyers and sellers, we can potentially control for those responders who have their home listed but may be less motivated to sell and thus have unique price-TOM preferences. A small percentage of the buyers have purchased a new home but have not sold their previous residence. The unsold properties occur in each survey year. We thus test for potential selection bias using the traditional Heckman two-step procedure and compute the inverse Mills ratio (IMR). The first step estimates a probit model of whether the house has sold or not. Table 12 reports the specification modeling the propensity to sell, with sold properties set equal to one and unsold properties set to zero.
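The IMR used in the second step can be sketched as below. The sketch assumes the fitted probit index (the linear predictor from the first step) is already available and does not estimate the probit itself. The function name `inverse_mills_ratio` is hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def inverse_mills_ratio(probit_index):
    """IMR for a selected (sold) observation: phi(index) / Phi(index)."""
    return _STD_NORMAL.pdf(probit_index) / _STD_NORMAL.cdf(probit_index)

# Homes with a low predicted propensity to sell receive a larger correction term.
for index in (-1.0, 0.0, 1.0):
    print(index, round(inverse_mills_ratio(index), 4))
```

In the second step the IMR enters the price equation as an additional regressor, and a significant coefficient on it would indicate selection on whether the home has sold.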
Appendix 3: Potential Endogeneity Issues in Housing Specifications
Endogeneity is a primary concern when modeling the sale prices and marketing durations of housing transactions, specifically simultaneity and omitted variables. The simultaneity problem arises because prices and TOM are determined in equilibrium such that it can be argued that causation can exist in either direction between the two outcomes. Without specifying the interdependent structural equations in a system, the first problem is that the disturbance term is correlated with TOM in a price equation, which is a violation of an OLS assumption. Moreover, because TOM and prices are both the independent and dependent variables in the system, the error terms between the two structural equations are expected to be correlated.
To illustrate the simultaneity bias, we simplify the structural equations of price and TOM to just these two measures as
To derive the bias from estimating Eq. (5) using OLS, the population estimate of the slope coefficient can be expressed as
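A standard two-equation version of this argument can be sketched in LaTeX. The notation is an assumption of this sketch: p is the transaction price, t is TOM, \( \alpha_1 \) and \( \alpha_2 \) are the structural slopes, and the disturbances \( u_1 \) and \( u_2 \) have variances \( \sigma_1^2 \), \( \sigma_2^2 \) and covariance \( \sigma_{12} \), with constants and other covariates suppressed.

\[ p = \alpha_1 t + u_1, \qquad t = \alpha_2 p + u_2 \]

Solving for the reduced form of t gives \( t = (\alpha_2 u_1 + u_2)/(1-\alpha_1\alpha_2) \), so t is correlated with \( u_1 \) and the OLS slope in the price equation converges to

\[ \operatorname{plim}\,\hat{\alpha}_1^{OLS} = \frac{\operatorname{Cov}(t,p)}{\operatorname{Var}(t)} = \alpha_1 + \frac{\operatorname{Cov}(t,u_1)}{\operatorname{Var}(t)}, \qquad \operatorname{Cov}(t,u_1) = \frac{\alpha_2\sigma_1^2 + \sigma_{12}}{1-\alpha_1\alpha_2} \]

With \( \alpha_1\alpha_2 < 1 \), the sign of the bias is the sign of \( \alpha_2\sigma_1^2 + \sigma_{12} \). A sufficiently negative covariance between the disturbances can therefore push the OLS estimate below zero even when the structural slope \( \alpha_1 \) is positive.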
The standard treatment for the joint determination of the structural equations in housing studies is to find an instrument or multiple instruments for the TOM regressor. Valid instruments will satisfy relevance and exclusion conditions. If the instruments are only marginally relevant or weak, standard IV point estimates, hypothesis tests, and confidence intervals are unreliable and can produce spurious findings (Stock, Wright, and Yogo, 2002; Stock and Yogo, 2005). When the TOM IV is weak, the bias of 2SLS approaches the OLS bias because the instruments have no impact on prices through the TOM IV. Since we find the biased OLS parameter estimate is negative, a weak TOM IV will tend towards an inverse or insignificant relation with prices.
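The weak-instrument argument can be illustrated with a small Monte Carlo sketch. All parameter values are arbitrary choices for illustration and are not estimates from the paper: the true price-TOM slope is set to a positive 0.05, the disturbances are negatively correlated so that OLS is biased downward, and a single instrument drives TOM with first-stage coefficient `pi`. With one instrument and one endogenous regressor, 2SLS reduces to the simple IV ratio used here.

```python
import random
import statistics

def simulate_once(rng, n, pi, beta=0.05):
    """Return (OLS, IV) slope estimates of price on TOM for one simulated sample."""
    z = [rng.gauss(0, 1) for _ in range(n)]              # instrument
    v = [rng.gauss(0, 1) for _ in range(n)]              # TOM disturbance
    u = [-0.5 * vi + rng.gauss(0, 1) for vi in v]        # price disturbance, correlated with v
    tom = [pi * zi + vi for zi, vi in zip(z, v)]         # first stage
    price = [beta * ti + ui for ti, ui in zip(tom, u)]   # structural price equation

    def cov(a, b):
        ma, mb = statistics.fmean(a), statistics.fmean(b)
        return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

    ols = cov(tom, price) / cov(tom, tom)
    iv = cov(z, price) / cov(z, tom)   # just-identified 2SLS equals this IV ratio
    return ols, iv

def median_estimates(pi, reps=300, n=500, seed=0):
    """Median OLS and IV estimates across simulated samples."""
    rng = random.Random(seed)
    draws = [simulate_once(rng, n, pi) for _ in range(reps)]
    return (statistics.median(d[0] for d in draws),
            statistics.median(d[1] for d in draws))

strong_ols, strong_iv = median_estimates(pi=1.0)
weak_ols, weak_iv = median_estimates(pi=0.005)
print(f"strong instrument: OLS {strong_ols:.3f}, IV {strong_iv:.3f}")
print(f"weak instrument:   OLS {weak_ols:.3f}, IV {weak_iv:.3f}")
```

In this design the strong-instrument IV estimate centers on the true positive slope, while the weak-instrument IV estimate centers near the negative OLS value, which mirrors the pattern described above.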
To confirm that our instruments are sufficiently correlated with TOM, we report both Shea’s partial R² and the F statistic for each IV regression. The partial R² measures the correlation between the TOM IV and the instruments after partialling out the effect of the exogenous variables. We note that the F statistic is generally statistically significant even with a marginal IV. We accordingly take the recommendation of Stock, Wright, and Yogo (2002) that the F statistic be approximately 10 or more as well as confirm both the 2SLS relative biases at 5% and worst-case rejection rates of 5% following Stock and Yogo (2005).
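The relevance check can be sketched for the simplest case of one instrument and no other exogenous covariates, where the partial R² coincides with the ordinary first-stage R² and the F statistic equals the squared t statistic on the instrument. The function names are hypothetical, and the threshold of 10 is the rule of thumb cited above.

```python
import statistics

def first_stage_strength(instrument, tom):
    """R-squared and F statistic from regressing TOM on one instrument and a constant."""
    n = len(tom)
    mz, mt = statistics.fmean(instrument), statistics.fmean(tom)
    szz = sum((z - mz) ** 2 for z in instrument)
    stt = sum((t - mt) ** 2 for t in tom)
    szt = sum((z - mz) * (t - mt) for z, t in zip(instrument, tom))
    r2 = szt ** 2 / (szz * stt)
    f_stat = r2 * (n - 2) / (1 - r2)   # equals the squared t statistic on the instrument
    return r2, f_stat

def passes_rule_of_thumb(f_stat, threshold=10.0):
    """Apply the F >= 10 rule of thumb for instrument relevance."""
    return f_stat >= threshold

# Made-up data: an instrument that tracks TOM closely.
z = list(range(10))
t = [2 * zi + (0.1 if zi % 2 else -0.1) for zi in z]
r2, f_stat = first_stage_strength(z, t)
print(round(r2, 4), passes_rule_of_thumb(f_stat))
```

With additional exogenous covariates, the same calculation applies after both TOM and the instrument are residualized on those covariates.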
The exclusion condition requires a zero covariance between the instruments and the regression residuals in the structural equation. This criterion implies that the only effect of the instruments on house prices is through the endogenous TOM. The exclusion condition cannot be tested directly because the disturbances from the structural equation are unobservable. However, when the number of instruments exceeds the endogenous regressors, the model is said to be overidentified and we can perform a secondary test of whether additional instruments are valid. We check the Wooldridge (1995) robust score test of overidentifying restrictions for each structural equation. We find that when the instruments are not predictive of TOM but correlate with price the robust score test is significant, which suggests overidentification.Footnote 8 Alternatively, when test statistics indicate the TOM IV is endogenous with prices and not weak and the overidentification test is insignificant, we find the TOM parameter estimates are consistently positive.
Omitted-variable bias is the other general econometric concern in housing studies and occurs when a model should include an influential explanatory variable but does not. To briefly illustrate assume that the true economic relation is given by the standard regression model of
where y is a vector of n observations of house transaction prices and xk is a column vector of n observations of variable xk, k = 1, …, K, which can be assembled in a n × K data matrix X with the first column usually a vector of ones to produce a constant term. The unobservable explanatory variable(s) is vector w and u is a vector containing the n disturbances. The relation between the regressand and regressors is identified in the β and δ.
Our test results find that sellers’ ages, incomes, and races are some characteristics that correlate with both transaction prices and TOM. Considering these attributes are routinely omitted in hedonic price specifications, the estimable population regression is y = Xβ + v, where v = wδ + u is a composite error term. Due to correlation with both transaction outcomes, the omitted variables can lead to a biased TOM slope estimate in a price model. The bias term can be either positive or negative.
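The textbook form of this bias can be sketched in LaTeX using the symbols defined above, where \( \hat{\beta} \) is the OLS estimator from the regression that omits w. This is the standard result, shown here as an illustration and not as a derivation taken from the paper.

\[ y = X\beta + w\delta + u, \qquad E\left[\hat{\beta} \mid X, w\right] = \beta + \left(X'X\right)^{-1}X'w\,\delta \]

The second term is the coefficient vector from regressing the omitted variable w on X, scaled by δ. Its element for TOM is nonzero whenever w is correlated with TOM conditional on the other regressors, and its sign depends on the signs of both that correlation and δ.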
Hayunga, D.K., Pace, R.K. The Impact of TOM on Prices in the US Housing Market. J Real Estate Finan Econ 58, 335–365 (2019). https://doi.org/10.1007/s11146-018-9657-0
DOI: https://doi.org/10.1007/s11146-018-9657-0