House sales vary by price and by the time on the market (TOM). Search-theoretic models such as Wheaton (1990) and Krainer and LeRoy (2002) show that prices and TOM depend simultaneously on the probability of sale.Footnote 1 This leads to a positive relation between prices and TOM in the theory. In a survey of the literature, Benefield et al. (2014) find 197 models that include TOM as a potential price determinant. Of these models, only 24 show a positive relation while 73 report statistically insignificant parameter estimates and 100 find a significant negative relation. This study seeks to reconcile these divergent results by examining a number of possible explanations. We focus on the impact of TOM on transaction prices because the parameter estimate on transaction price in TOM models routinely exhibits the expected positive relation.

When housing specifications exhibit inverse relations between prices and TOM, researchers often cite the theoretical model of Taylor (1999) and attribute the negative parameter estimates to overpricing or a structural defect. When homeowners set their list prices much higher than market expectations they can experience a relatively long TOM. If a list price is sufficiently high such that a reduction is required to induce a transaction, enough time may have passed that a negative relation is a possibility. But this is an asymmetric relation because homes that are underpriced will not systematically spend more time on the market. Severe overpricing therefore is a possible omitted variable that leads to a unique market outcome at the upper end of the TOM distribution. We consider two approaches to examine the severe overpricing rationale.

Another rationale for a longer TOM from Taylor (1999) is when prior prospective purchasers discover a flaw in the property that is not apparent to a new potential buyer. TOM is thus a measure of structural quality and new prospective buyers become suspicious of a property that has been on the market for an extended time period (stigmatization). To test this hypothesis, we include the measure of structural quality from Genesove and Mayer (2001) in our equations.

Severe overpricing and quality can be thought of as omitted variables in price models. Using a rich dataset of property-level transactions from the National Association of Realtors (NAR), we investigate a number of other variables typically omitted from housing studies due to data unavailability. By design, the NAR collects the economic and demographic information that can affect the expected price-TOM tradeoff including hard-to-observe variables such as sellers’ levels of urgency, age, and income.Footnote 2

The literature also offers other economic explanations that can affect the price-TOM tradeoff and are usually not included in housing studies. Genesove and Mayer (2001) and Hayunga and Pace (2016) find that homeowners who expect to realize selling prices lower than their original purchase prices will set higher list prices, which can extend TOM. There is also an evolving literature examining the ratio of buyers to sellers, which is often termed market tightness or colloquially as hot versus cold markets. We include the Carrillo (2013) index to examine this explanation. We additionally consider aggregate demand and supply measures as alternatives to the market tightness variables. Our equations also control for atypicality due to studies such as Haurin (1988) finding relatively longer TOM for atypical properties compared to standard houses.

Simultaneity is another issue that we investigate in two ways. The first is the use of ordinary least squares (OLS) versus a system of simultaneous equations. Since, as mentioned, housing prices and marketing durations are jointly determined, TOM and its error terms are not independent of the vector of error terms in the price models. This can cause the OLS estimator to be biased and inconsistent. Approximately 40 studies out of the 68 examined by Benefield et al. (2014) use OLS models.

Two-stage least squares (2SLS) is commonly used to address the joint determination. The ability of 2SLS to control for the simultaneity is dependent on the quality of the instruments. It is well known that when instruments are only weakly correlated with the endogenous regressor 2SLS is biased towards the OLS estimator. We investigate the strength of the TOM instrumental variable (IV) using the NAR data, which provides high-quality instruments of marketing durations.
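The sign reversal that simultaneity can produce is easy to reproduce on simulated data. The following is a minimal sketch and not the paper's estimation: the data-generating process, the parameter values, and the names `slope`, `iv_slope`, and `simulate` are all assumptions chosen for illustration. A single shock `e` raises TOM and lowers price, so OLS confounds it with the true positive TOM effect, while a relevant instrument `z` that is excluded from the price equation recovers that effect.

```python
import random

def slope(x, y):
    """Slope from a one-regressor OLS of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def iv_slope(z, x, y):
    """Just-identified IV slope, cov(z, y) / cov(z, x); equals 2SLS here."""
    n = len(z)
    mz, mx, my = sum(z) / n, sum(x) / n, sum(y) / n
    szy = sum((a - mz) * (b - my) for a, b in zip(z, y))
    szx = sum((a - mz) * (b - mx) for a, b in zip(z, x))
    return szy / szx

def simulate(n=20000, beta=0.05, pi=1.0, seed=0):
    """Hypothetical data: true effect of log TOM on log price is +beta."""
    rng = random.Random(seed)
    z, tom, price = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)        # instrument: moves TOM, excluded from price
        e = rng.gauss(0, 1)         # shock shared by both equations
        w = rng.gauss(0, 0.5)       # price-only noise
        t = pi * zi + e             # log TOM
        p = beta * t - 0.5 * e + w  # log price
        z.append(zi)
        tom.append(t)
        price.append(p)
    return z, tom, price

z, tom, price = simulate()
ols_beta = slope(tom, price)        # contaminated by the shared shock
iv_beta = iv_slope(z, tom, price)   # uses only instrument-driven variation
print(f"OLS slope: {ols_beta:.3f}   IV slope: {iv_beta:.3f}")
```

In this design the OLS slope converges to a negative value even though the true effect is positive. Lowering `pi` toward zero weakens the instrument and makes the IV estimate erratic in finite samples, which is the weak-instrument concern discussed above.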

After considering the many explanations, our tests demonstrate that weak instrumental variables account for the varied slope coefficients on TOM in price models. In OLS specifications and 2SLS models with weak instruments, TOM is negative. Conversely, we consistently find the expected positive relation when test statistics indicate a high-quality TOM IV. This positive finding does not change when we introduce potential omitted variable bias, with the inclusion of the unique measures from the NAR data, or with the consideration of the other factors such as atypicality or market tightness.

Of final note, when comparing the models that exhibit positive versus negative or insignificant TOM slope coefficients, we observe that the economic and statistical significances of other covariates do not change greatly. This fact provides encouragement to studies that find negative or insignificant TOM parameter estimates as our equations suggest that the alternative TOM slopes will probably not materially affect the inferences concerning other variables of interest.

Data Sample

To conduct this study, we utilize the answers to the NAR survey used to create the annual Profile of Home Buyers and Sellers report. The NAR mails out a survey with over 100 questions to a random sample of recent home buyers. NAR obtains the consumer names from Experian, which maintains an extensive database of home buyers derived from county records. Because the data come from Experian and not only NAR members, the sample includes homes sold by owners without the assistance of a Realtor®.

While the survey is sent to homebuyers, we primarily use the property and demographic information of owners of previous homes for our study. The nature of our investigation of TOM necessitates this focus. For instance, the amount of time buyers are in the market is recorded, but the marketing durations of purchased homes are not recorded and there is no information on sellers of the purchased home.

The trade-off in using transactions of owners with homes that have sold or are trying to sell is that we remove first-time home buyers.Footnote 3 We find that only about 1.6% of the NAR responders are first-time home buyers, which means that approximately 50 transactions are removed from our sample of over 3000.

Our full-panel sample consists of property-level transactions of single family homes and townhouses from the 2010 to 2012 annual surveys. The 2010 survey includes sales from 2009. We restrict our sample to the surveys beginning in 2010 for multiple reasons. Foremost, earlier questionnaires do not ask the questions required for the analysis. For example, the 2009 questionnaire does not ask whether a responder is a first-time seller or the number of bedrooms and bathrooms. We also desire to control for structural quality and loss aversion, which we can obtain using the original prices paid at purchase. Earlier surveys did not ask for this variable.

In constructing the data panel we remove observations with extreme or suspicious answers as well as those that lack relevant data. For instance, we drop six records that indicate a home age between 300 and 815 years and four observations indicating owner’s age less than 21 years old. We also remove 19 survey responses that indicate TOM in excess of 4 years. We further trim both ends of the transaction price distribution at the 1% level to focus on the more typical housing stock.
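These screening rules can be sketched as a single filter. This is an illustration under stated assumptions: the record keys (`home_age`, `owner_age`, `tom_weeks`, `price`) are hypothetical names, the 300-year cutoff is inferred from the dropped range reported above, 4 years is taken as 208 weeks, and the 1% trim is approximated by dropping the lowest and highest 1% of the sorted prices.

```python
def clean_sample(records, max_home_age=300, min_owner_age=21,
                 max_tom_weeks=4 * 52, trim_share=0.01):
    """Drop implausible answers, then trim both tails of the price distribution.

    `records` is a list of dicts; the key names are hypothetical.
    """
    kept = [
        r for r in records
        if r["home_age"] < max_home_age        # implausibly old structures
        and r["owner_age"] >= min_owner_age    # owners younger than 21
        and r["tom_weeks"] <= max_tom_weeks    # TOM in excess of 4 years
    ]
    kept.sort(key=lambda r: r["price"])
    k = int(len(kept) * trim_share)            # observations cut from each tail
    return kept[k:len(kept) - k]
```

The order of the two steps is also an assumption: here the price tails are computed after the implausible records are removed.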

A potential drawback of the dataset is the low response rate, which is approximately 10% for the sample period. Despite the low response rate, the dataset befits our analysis primarily due to the many hard-to-observe variables that can impact price-TOM preferences. Further, Genesove and Han (2012) show the distribution of the NAR survey responses is similar to that of other surveys that have higher response rates but lack the attributes important to our analysis. Additionally, the inclusion of the many economic and demographic attributes provides a linear approximation of a Heckman selection term that helps mitigate sample-selection bias. For instance, the propensity to respond to the survey may depend on responders’ demographic characteristics. Retired sellers may have lower opportunity costs of time while higher income owners have greater opportunity costs.

Table 1 reports the many sample attributes. The median sold residence is 25 years old with 3 bedrooms, 2 bathrooms, and 2000 square feet. The median original purchase price is $177,600 with a selling price of $220,000. The mean (median) TOM is 19.05 (10) weeks. The median period owners hold their properties is 8 years.

Table 1 Descriptive statistics (N = 3202)

We group the many additional characteristics as seller motivations, demographics, search costs, and structural atypicalities. As to their motivations, sellers provide multiple measures, one of which is urgency. Springer (1996) and Knight (2002) use keywords like “motivated” and “must sell” in the comments field of MLS datasets to proxy for seller urgency. Springer (1996) finds that motivated sellers experience a decrease in transaction prices and an increase in the TOM; however, the models do not control for the simultaneity of the two outcomes. Knight (2002) finds no effect on transaction prices. Keyword studies such as these are challenging because the precise level of sellers’ urgency is hard to measure as entering keywords in MLS comes at little cost. Instead of having to proxy for urgency, the NAR survey asks the sellers’ levels of urgency. Seventeen percent of sellers express high urgency, 44% indicate some urgency, and 39% answer as not being urgent.

Another general category of motivation is the sellers’ reasons for selling. Nineteen percent of the respondents indicate job relocation, 8% note a change in the family status such as a divorce or the birth of a child, 3% indicate cash constraints resulting in the property being too expensive to keep, and 4% of the sample indicate a desire to avoid foreclosure. In addition, short sales constitute another 3% of the sample. The reasons indicating the property is too expensive to keep and a desire to avoid foreclosure provide insight into the financing of the property. In addition to property taxes, insurance, and maintenance (the last being something that can be deferred), the major commitment for many homeowners is the monthly mortgage payment. The property being too expensive or the owners specifically noting they want to sell to avoid foreclosure can impact price-TOM preferences when the mortgage is a constraint or when the home value is close to the mortgage amount such as in Genesove and Mayer (1997, 2001).Footnote 4

Regarding demographics, a majority of the responders are white, speak English, and were born in the U.S. While 93% of the responders identify themselves as white, this is less than the almost 96% reported in Harding et al. (2003b). First-time sellers represent 38% of the sample.

The household composition is revealed through three factors: income, number of children, and number of earners. For anonymity, the NAR survey reports household income within one of sixteen levels, which we report in Table 1. Income is distributed unimodally across the levels with the largest percentage being incomes between $100,000 and $125,000. Our analysis examines each of these categories coded as a binary variable and also a continuous measure of log income using the midpoint of each category with the top-coded upper category set to $1.5 M. The results are invariant to the income measure so we report each category.
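The continuous income measure can be sketched as follows. Only the $100,000 to $125,000 bracket and the $1.5 M top code are taken from the text; the function name and the convention of passing `None` for the open-ended top bracket are assumptions.

```python
import math

TOP_CODE = 1_500_000  # value assigned to the top-coded category in the text

def log_income_midpoint(lower, upper=None):
    """Log of the bracket midpoint; the open-ended top bracket uses TOP_CODE."""
    if upper is None:
        return math.log(TOP_CODE)
    return math.log((lower + upper) / 2)

# Modal bracket reported in the text: midpoint of $112,500.
modal_log_income = log_income_midpoint(100_000, 125_000)
```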

The number of children in the household can affect the price-TOM tradeoff. For instance, Harding et al. (2003b) find that having school-age children can reduce bargaining power and thus impact prices, especially during the school year. The effect of the number of household earners on prices and liquidity is unclear. Multiple earners should exhibit higher household incomes and greater house prices, although the income measures may control for this. Alternatively, multiple earners may decrease the motivation to sell if one earner holds high-quality employment, although the urgency variables will probably capture at least some impact of this characteristic.

The sample has three general measures of search costs. Since moving further away from the current residence should increase search costs, the first is the separation distance between the sold and purchased residences using the ZIP code at each location. We analyze separation distance in two ways. The first is the natural log of the continuous variable using the actual distance. The second is to group distances into bins to investigate if there is any additional nonlinearity. The results and conclusions are invariant to the proxy so we report the slope coefficients for each bin.

Using the ZIP code of the sold property, we control for approximately 455 locational fixed effects at the three-digit ZIP code level. Fixed effects control for the market tightness and thinness. Locational fixed effects also may help control for local tax regimes, labor markets, and additional demographics.

The other measures of search costs are selling to a friend or acquaintance and additional marketing methods beyond MLS listing including holding an open house and using magazine, flyer, print, and television advertising. Each is coded as a binary variable.

We also control for atypicality following Harding et al. (2003a) to compute measures of unique structural characteristics. The controls represent approximately 1% of the sample for each type. We define a new home as 2 years old or less and an old home as 120 years old or more. A large home is greater than 5000 square feet and a small home is less than 900 square feet. A home has many bathrooms if there are 5 or more; many bedrooms if 6 or more.
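The atypicality thresholds translate directly into binary indicators. In this sketch the cutoffs come from the definitions above, while the function and key names are hypothetical.

```python
def atypical_flags(age_years, sqft, bathrooms, bedrooms):
    """Binary atypicality indicators using the thresholds stated in the text."""
    return {
        "new_home": int(age_years <= 2),
        "old_home": int(age_years >= 120),
        "large_home": int(sqft > 5000),
        "small_home": int(sqft < 900),
        "many_bathrooms": int(bathrooms >= 5),
        "many_bedrooms": int(bedrooms >= 6),
    }

# The median sold residence in Table 1 (25 years, 2000 sq ft, 2 baths, 3 beds)
# triggers none of the indicators.
median_home = atypical_flags(25, 2000, 2, 3)
```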

Harding et al. (2003a) also include the inverse Mills ratio (IMR) as an atypical feature. The IMR may measure price-liquidity preferences because atypical homes may experience thinner markets and possibly spend more time on the market than more typical properties before finding a buyer desiring the particular characteristics of the atypical home. Since the sample includes properties that have not sold in each survey year, we compute the IMR using the Heckman two-step procedure to correct for possible sample selection bias from modeling prices using only sold observations. Appendix 2 provides the first-stage probit model to compute the IMR.
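For reference, the IMR in a Heckman two-step procedure is the ratio of the standard normal density to the standard normal distribution function, evaluated at the fitted probit index of each sold observation. The sketch below shows only that second-step calculation; the probit index is assumed to come from a first stage such as the one in Appendix 2, and the function name is hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def inverse_mills_ratio(probit_index):
    """phi(index) / Phi(index) for an observation selected into the sample."""
    return _STD_NORMAL.pdf(probit_index) / _STD_NORMAL.cdf(probit_index)
```

The ratio falls as the index rises, so observations that were very likely to sell receive a smaller selection correction.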

Empirical Findings

All of our tests model transaction prices as the primary dependent variable. We are fitting a limited information simultaneous system using 2SLS where we instrument for the secondary dependent variable (TOM). Even though TOM is a duration variable, one must be careful to use a linear (linear in the parameters) model for TOM to keep within the standard 2SLS framework. One advantage of this framework is that consistency of 2SLS does not depend on the correct specification of the first stage regression (Kelejian 1971).Footnote 5 We do not investigate the alternative situation where TOM is the primary dependent variable and price is the secondary dependent variable for the reason that instrumented prices routinely show a positive relation when used in a TOM equation. If one went to a more full information simultaneous equation system, it might be possible to use the hazard nature of the TOM variable to improve efficiency. This opens up some future research possibilities, but lies beyond the scope of this study.
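One way to write the limited-information system described here is given below. The notation is an assumption introduced for illustration: $P_i$ is the transaction price, $X_i$ the included exogenous covariates and fixed effects, and $Z_i$ the instruments excluded from the price equation.

```latex
\begin{align}
\ln \mathit{TOM}_i &= \pi_0 + Z_i'\pi_1 + X_i'\delta + v_i
  && \text{(first-stage reduced form)} \\
\ln P_i &= \alpha + \beta\,\widehat{\ln \mathit{TOM}_i} + X_i'\gamma + \varepsilon_i
  && \text{(second-stage price equation)}
\end{align}
```

Here $\widehat{\ln \mathit{TOM}_i}$ is the fitted value from the first equation, and $\beta$ is the partial elasticity of price with respect to TOM. Both equations are linear in the parameters, which keeps the system within the standard 2SLS framework.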

We begin the empirical analysis by determining the naïve relation between prices and TOM using OLS. The first model considers independent variables typical to MLS and public-record datasets. To this specification, we then add the NAR data to determine if any are omitted variables that alter the TOM slope coefficient.

We next use a system of equations and 2SLS to control for simultaneity and investigate the strength of possible instruments of TOM. The first specification considers MLS-type variables, which, according to the test statistics, yield a weak TOM IV. The second model uses the NAR data to form a strong TOM IV based upon the test statistics. The OLS and weak-IV models exhibit negative or insignificant slope coefficients on TOM whereas the specification with a strong IV produces the expected positive relation between TOM and prices.

With identification of a strong IV, we next consider omitted variable bias by incrementally removing determinants that correlate with prices. We begin by withholding explanatory variables that are not typical to studies using MLS and public record data and end by removing all possible determinants except the TOM IV and the temporal and spatial fixed effects. In each case, TOM exhibits a positive slope coefficient.

The remainder of the paper considers the aforementioned other possible explanations found in the literature (e.g., severe overpricing). We include in each specification the strong TOM IV as well as all the NAR variables. In each of these tests, TOM is consistently positive.

OLS Models

The analysis begins with OLS models that use the natural log of uninstrumented TOM as an independent variable. Table 2 reports the results. In the first model, we restrict the explanatory variables to those typically found in MLS and public-records datasets. The second model introduces the economic and demographic information. In either case, TOM is inversely correlated with prices.

Table 2 Log transaction prices

Simultaneity and a System of Equations

We now control for the simultaneity between the transaction outcomes using 2SLS, which we do throughout the remainder of this article.Footnote 6 We first investigate the use of instruments available in typical MLS datasets. The challenge is that price and marketing durations are generally determined by identical variables. Turnbull and Dombrow (2006) propose a method to address this issue using local market competition measures. Since ours is a national study and the market competition variables have limited impact, we empirically find variables within our data that correlate with TOM but not prices. Using the MLS-type dataset, we find one instrument, which is detached single-family residences (SFR) versus townhouses. A rationale for the TOM relation is that detached SFR take longer to sell due to greater heterogeneity compared to townhouses.

Model (1) in Table 3 reports the first-stage reduced-form models of TOM. We include both the instruments and the included exogenous variables in Table 3 to show the additional determinants of TOM, which will be important in later investigation of omitted variables.

Table 3 Reduced form equations

Using the predicted value of TOM from the MLS-type Model (1) equation in Table 3, we model the structural price equation and report the results in Table 4. We note first that the parameter estimate on TOM in Table 4 is insignificant. We also observe that the endogeneity statistics reported at the bottom of the column indicate a weak IV.

Table 4 Log transaction prices (Structural equations)

We next consider the full information set in both the reduced form model in Table 3 and the structural equations in Table 4. In the first stage, we add variables that exhibit multivariate correlation with TOM but do not exhibit a relation with prices. These are selling to an acquaintance and various marketing methods. Selling to an acquaintance can reduce the marketing duration if an agreement is reached early or even before the property is listed to the general public. Marketing methods can impact TOM through additional publicity that can increase buyers’ rate of arrival. Somewhat unexpectedly, the results demonstrate that the marketing methods increase TOM. This is consistent with owners not finding initial matches and invoking additional publicity. Model (2) in Table 3 reports the updated reduced form equation.

Model (2) in Table 4 presents the results with the improved TOM IV as well as the additional economic and demographic measures as covariates in an unrestricted structural equation. The endogeneity statistics at the bottom of the column demonstrate that TOM is indeed endogenous with transaction prices and that the TOM IV is strong and not overidentified. The partial R² is 0.150 with an F statistic of 58.73. The partial elasticity of 0.043 on TOM is positive and significant.
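The two instrument-strength statistics are linked. Under the assumption of homoskedastic first-stage errors, the F statistic on the excluded instruments can be written in terms of the partial R² as below, where $q$ is the number of excluded instruments and $n - k$ is the residual degrees of freedom of the first stage. Neither value is reported in this section, so the formula is given as a reference and not as a reproduction of the reported figures.

```latex
\begin{equation}
F = \frac{R_p^{2} / q}{\left(1 - R_p^{2}\right) / (n - k)}
\end{equation}
```

A common rule of thumb treats a first-stage F below about 10 as a warning sign of weak instruments.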

There are a few additional relations of note in Model (2) in Table 4. The first is structural quality. Stigmatization is often cited when housing researchers find an inverse relation between transaction prices and TOM. However, the effect should be idiosyncratic for a few lemon properties and not systematic across the price distribution. Nonetheless, quality is clearly an important part of housing service flow and frequently omitted in empirical housing models.

We thus follow Genesove and Mayer (2001) to compute quality as the residuals from a hedonic model of values at the time of purchase. Quality is the portion of the previous transaction prices that the regression did not predict and the unobserved home qualities that partially determine the prices at the time of the original purchases. To the extent these qualities do not change significantly over time, the residuals are a relevant measure for their impact on future transaction prices.
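The residual-based quality measure can be sketched with a one-covariate hedonic regression. This is a deliberately reduced illustration: the hedonic model in the text presumably includes many more attributes, and the choice of log square footage as the single regressor, along with the function name, is an assumption.

```python
def hedonic_residuals(log_price, log_sqft):
    """Residuals from an OLS of log purchase price on log square footage.

    A positive residual marks a home that originally sold for more than its
    observed attribute predicts, which is read as unobserved quality.
    """
    n = len(log_price)
    mx = sum(log_sqft) / n
    my = sum(log_price) / n
    sxx = sum((x - mx) ** 2 for x in log_sqft)
    sxy = sum((x - mx) * (y - my) for x, y in zip(log_sqft, log_price))
    b = sxy / sxx      # slope
    a = my - b * mx    # intercept
    return [y - (a + b * x) for x, y in zip(log_sqft, log_price)]

# Hypothetical data: the middle home sold above the fitted line.
quality = hedonic_residuals([11.0, 12.0, 12.0], [7.0, 7.5, 8.0])
```

These residuals would then enter the later transaction-price equation as a covariate.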

Consistent with buyers preferring higher quality homes ceteris paribus, we find the slope coefficient is significant and positive in all of the price specifications. We also observe that TOM and quality are not correlated in the first-stage Model (2) in Table 3, which is empirical evidence that does not support the quality/stigmatization hypothesis.

Concerning other predictive variables, the results demonstrate that urgency, income, and race are price determinants. Sellers exhibiting greater urgency experience lower transaction prices, which is consistent with higher discount rates. There is a monotonic increase in transaction prices with higher incomes. This finding is consistent with households having greater (lower) incomes owning homes with more (less) service flow, which can include locational and neighborhood effects. The results also demonstrate that African American sellers realize a decrease in transaction prices while Asian sellers experience an increase.

Omitted Variables

The results thus far demonstrate that the parameter estimates on TOM switch signs between the OLS models in Table 2 and the 2SLS specification in Table 4. Because the IV specifications are joint tests of omitted and instrumental variables, we next isolate the omitted-variable hypothesis by removing significant determinants that are generally not observed in MLS samples and thus can potentially lead to biases.

There are TOM determinants in the reduced form equations in Table 3 that are also correlated with prices in the structural equation in Table 4. These are sellers’ incomes, ages, and races along with short sales and holding periods. If these determinants are withheld from the model, the coefficient estimate on TOM may be biased. To test for omitted-variable bias, we remove these variables from the previously unrestricted model while modeling the TOM IV using the high-quality instruments from Model (2) in Table 3.

Model (1) in Table 5 reports the initial findings. In addition to the other covariates displaying the same sign and significance as in Table 4, the results demonstrate that the TOM IV maintains a positive and statistically significant effect on transaction prices. The statistical tests at the bottom of the column indicate that TOM is endogenous with prices and the IV is not weak. Thus, with a strong IV, the TOM estimate is not impacted by the omitted correlated variables.

Table 5 Omit variables in price equations

Since withholding variables that impact both TOM and prices does not change the inferences, we investigate other specifications in which sellers may change their price-TOM preferences and possibly cause bias if a significant determinant is not included in the specification. While not found in our data, a higher degree of urgency can cause a seller to prefer an increase in the probability of sale. To accomplish this, a seller can reduce their list and reservation prices, which should decrease TOM. We therefore reconsider Model (1) without the urgency variables. We do not present the results because they are not materially altered. We also consider the omission of separation distance since sellers moving greater distances may prefer quicker sales to avoid property maintenance costs as an absentee owner as well as higher transaction costs due to travelling back to the previous location to complete a sales transaction. Again, we find the results are invariant to the removal of the separation distance measures and thus do not present them.

We next examine a specification that omits all of the additional economic and demographic information and uses only the information found in typical MLS datasets. This test differs from Model (1) in Table 4 in that it includes a strong TOM IV. Model (2) in Table 5 reports the results. The high-quality TOM measure maintains an endogenous, positive, and significant relation with prices. The IV is strong with a partial R² of 0.148 and an F statistic equal to 49.51.

Lastly, we consider an acutely restricted model that omits all covariates except the strong TOM IV and the spatial and temporal fixed effects. Model (3) in Table 5 shows that TOM still positively impacts transaction prices.

Severe Overpricing

At times the literature attributes an inverse correlation between prices and TOM to severe overpricing such as in Knight (2002). The argument is that if homeowners set their list prices much higher than the expected market value, the home will sit on the market until the homeowner reduces the asking price or withdraws the property from the market. While on the market sellers may reduce their reservation prices (Huang and Palmquist, 2001) and/or gain information about the market value and subsequently decrease their list prices. The reduced list prices should result in transaction prices near the expected market value or maybe even less than the expected value. Combining the extended TOM with prices close to expected market values, a negative relation can result between the transaction outcomes.

It is critical to note, however, that this is an asymmetric relation at the upper tail of the TOM distribution that does not negate the first-order positive relation between prices and TOM, and therefore should not yield an overall negative coefficient on TOM. It is asymmetric because homes that are underpriced for quick sale should not consistently spend more TOM than expected. Instead, the severe overpricing is potentially an omitted variable.

There is an empirical challenge in measuring severe overpricing due to additional functional-form simultaneity. The severe overpricing cannot simply be measured as the difference between list prices and transaction prices and included as an independent variable, which is sometimes seen in the extant literature and termed degree of overpricing. The first reason is that this difference does not consider just the severe overpricing but also typical overpricing by sellers allowing for negotiation with potential buyers. Moreover, since the other covariates are predicting transaction prices, the simple price difference is a close proxy for list prices and, thus, not appropriate to include as an explanatory variable in a transaction price specification. Similar issues exist in using predicted list or transaction prices as possible inputs to the measure of severe overpricing. Conceptually, the price model can include an IV for the severe overpricing in a system of equations, but we find the challenge is finding high-quality instruments that separately measure TOM and severe overpricing.

We thus address the additional simultaneity using two methods. The first is to include a quadratic term to better fit the tails of the TOM distribution. The second is to explore breakpoints in the TOM IV to isolate the severe overpricing. Table 6 reports the findings.

Table 6 Severe overpricing

In Model (1), we change from using the natural log of TOM to the level and add the TOM quadratic. The specifications include the full set of covariates as in Model (2) in Table 4 as well as the strong set of TOM instruments specified in Model (2) of Table 3. We use these unrestricted reduced-form and structural equations in all remaining models.

The combined slope coefficients in Model (1) demonstrate the positive first-order relation but at a decreasing rate. The parameter estimate on the quadratic narrowly misses the 5% significance threshold (p-value = 0.057). The combined relation indicates lower (higher) priced homes will experience shorter (longer) TOM, but to a point. For severely overpriced homes, TOM can become quite long and not result in a higher price. At the other end of the distribution, the TOM on extensively underpriced homes is truncated from below at zero and homeowners can expect to spend some amount of time marketing the property.
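The phrase "positive but at a decreasing rate" can be made precise. With TOM entering in levels and squared, and with notation that is assumed here for illustration, the marginal effect of TOM on log price and the point at which it reaches zero are:

```latex
\begin{align}
\ln P_i &= \alpha + \beta_1\,\mathit{TOM}_i + \beta_2\,\mathit{TOM}_i^{2} + X_i'\gamma + \varepsilon_i \\
\frac{\partial \ln P_i}{\partial \mathit{TOM}_i} &= \beta_1 + 2\beta_2\,\mathit{TOM}_i \\
\mathit{TOM}^{*} &= -\frac{\beta_1}{2\beta_2} \qquad (\beta_1 > 0,\ \beta_2 < 0)
\end{align}
```

Below $\mathit{TOM}^{*}$ additional time on the market is associated with a higher price; beyond it the fitted relation flattens and turns down, which is the region the severe overpricing argument concerns.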

In Model (2), we use the log of TOM and explore breakpoints. The first we find is at 52 weeks. The results demonstrate a positive partial elasticity on the first TOM segment and an insignificant parameter estimate on the segment that is greater than 52 weeks. In Model (3), we set the breakpoint at 65 weeks. The partial elasticity on the shorter TOM segment increases compared to Model (2). Consistent with the severe overpricing hypothesis, the slope coefficient on the segment greater than 62 weeks is an economically significant negative partial elasticity; however, the slope coefficient is not statistically significant.
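One common way to build such segments is a linear spline in log TOM with a single knot. The text does not state how the segments are constructed, so the form below is an assumption, as is the function name. The first term carries log TOM up to the breakpoint and the second carries only the excess beyond it, so each receives its own slope coefficient.

```python
import math

def log_tom_segments(tom_weeks, breakpoint_weeks=52):
    """Split log TOM into (below-knot, above-knot) linear spline terms.

    Requires tom_weeks > 0. The two terms always sum to log(tom_weeks).
    """
    log_tom = math.log(tom_weeks)
    log_break = math.log(breakpoint_weeks)
    below = min(log_tom, log_break)
    above = max(log_tom - log_break, 0.0)
    return below, above
```

A home that sells in 26 weeks contributes nothing to the second term, while one that takes 104 weeks contributes log(2) to it when the knot is at 52 weeks.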

Local Market Dynamics

A national study such as ours mitigates a type of sample selection bias since it does not concentrate within one local market. Alternatively, the total summation of TOM into one slope coefficient could suffer from aggregation across markets, regions, or states of nature. To mitigate this issue, each of the specifications includes ZIP code and annual fixed effects. We find that regressing prices on just the ZIP code fixed effects explains a creditable 23% of the price variation.

We next consider alternative measures of local market dynamics as possible omitted variables. There is an evolving literature examining prices and TOM based upon the state of the market related to tightness, which is the ratio of buyers to sellers. The central idea is that housing liquidity varies over time and different states of nature. In tight or hot markets the number of potential buyers actively seeking alternative housing service flows increases relative to sellers. This can result in rising house prices, higher sales volumes, and higher market liquidity. The greater liquidity should translate into lower TOM. The transaction outcomes are reversed in cold markets. Thus, the relation between prices and TOM is negative. Krainer (1999), Krainer (2001), Krainer and LeRoy (2002), and Novy-Marx (2009) provide theoretical models and calibrations while Carrillo and Pope (2012), Carrillo (2013), Gan (2013), and Carrillo, de Wit, and Larson (2015) contribute recent empirical studies.

Note, though, that the negative relation in this literature is a function of changes in prices and TOM or the comovement of prices and liquidity over time. Hot and cold markets are relative to another market geographically or to a previous time period within the same market. This is not the focus of the housing literature using hedonic price models with TOM as a possible determinant.

Search-theoretic models do, however, allow for varying levels of buyers’ arrival rates and economic shocks at local levels (Footnote 7). Therefore, to confirm that the TOM relation is not an aggregation across markets, we first include the market tightness index of Carrillo (2013) in our unrestricted structural equation.

The Carrillo (2013) index, denoted as θ, is a relative measure of the bargaining power between sellers and buyers. Since ours is a national study, we follow the estimation for aggregate data. We bin transactions at the 2-digit ZIP code level to have sufficient observations to calculate θ. To estimate the structural parameters, we compute per bin the mean log list prices, mean log transaction prices, the share of transactions that occurred at a price below the list price, and the mean number of days that a property stays on the market. We then calculate the structural parameters following the equations in Carrillo (2013). The final parameter computed is the bargaining measure, which ranges from 0 to 1 with higher (lower) values indicating greater bargaining power for sellers (buyers).
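
The per-bin inputs described above can be sketched as follows. This is a minimal illustration that computes only the four moments per 2-digit ZIP bin; it does not reproduce the structural equations of Carrillo (2013). The dictionary keys (`zip`, `list_price`, `sale_price`, `days_on_market`) and the function name are hypothetical.

```python
import math
from collections import defaultdict


def bin_moments(transactions):
    """Compute per-bin moments keyed by the first two ZIP code digits.

    Each transaction is a dict with hypothetical keys: 'zip' (string),
    'list_price', 'sale_price', and 'days_on_market'.
    """
    bins = defaultdict(list)
    for t in transactions:
        bins[t["zip"][:2]].append(t)

    moments = {}
    for prefix, rows in bins.items():
        n = len(rows)
        moments[prefix] = {
            "n": n,
            "mean_log_list": sum(math.log(r["list_price"]) for r in rows) / n,
            "mean_log_sale": sum(math.log(r["sale_price"]) for r in rows) / n,
            # Share of transactions that closed strictly below list price.
            "share_below_list": sum(r["sale_price"] < r["list_price"] for r in rows) / n,
            "mean_days_on_market": sum(r["days_on_market"] for r in rows) / n,
        }
    return moments
```

The resulting moments would then be passed to the structural equations to recover θ for each bin.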

The descriptive statistics on θ are included in Table 1. The average value is 0.33 with a minimum of 0.10 and a maximum of 0.84. Consistent with our priors concerning market conditions during the sample period, we find that the observations with the lowest θ are all transactions around the Chicago MSA, about 2% of the sample. Conversely, the highest θ markets include Fargo, North Dakota and many of the cities that suffered the most during the financial crisis from 2007 to 2009 and rebounded during the sample period. These include Los Angeles, Sacramento, and Fresno, California along with Albuquerque, New Mexico.

We employ θ in two ways. In the first, we use the continuous variable. In the second, we convert θ to binary variables to mitigate another possible endogeneity issue. Table 7 details the findings using the first method. In Models (1) and (2) we restrict the specification to only θ and a constant. As expected, the results demonstrate that the hotter (colder) markets experience an increase (decrease) in prices in Model (1) and a decrease (increase) in TOM in Model (2). In Model (3), we include the TOM IV along with the covariates but without the spatial fixed effects, because they measure similar treatments as θ and we find the two are highly collinear. Prices continue to exhibit a positive relation with θ in Model (3). Further, the partial elasticity on the TOM IV is positively and significantly correlated with transaction prices.

Table 7 Market tightness

The computation of θ, while nonlinear, includes the difference in list and transaction prices as well as the level of TOM. The fact that sellers’ bargaining power is a function of the market fundamentals that we are examining introduces the possibility of simultaneity between θ and the price and TOM disturbances. To mitigate the possible issue, we replace the continuous variable with two binary variables. We set Hot (Cold) equal to one when the value of the index places the observation in the upper (lower) quartile of the θ distribution and zero otherwise.
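
The conversion to binary variables can be sketched as below. The quartile interpolation method and the strict inequalities at the cut points are assumptions, since the text specifies only upper and lower quartile membership; the function name is hypothetical.

```python
import statistics


def hot_cold_flags(theta_values):
    """Return (hot, cold) indicator pairs based on quartiles of theta.

    hot = 1 if theta lies above the third quartile cut point;
    cold = 1 if theta lies below the first quartile cut point.
    The 'inclusive' interpolation method is an assumption.
    """
    q1, _median, q3 = statistics.quantiles(theta_values, n=4, method="inclusive")
    return [(int(t > q3), int(t < q1)) for t in theta_values]
```

Observations in the middle two quartiles receive zero on both indicators and serve as the omitted reference group.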

Table 8 reports the results. In Models (1) and (2) we again restrict the specification to only a constant term and the hot and cold measures to confirm the base relation. Hotter markets continue to experience an increase in prices and a decrease in TOM. Properties in cold markets do not initially experience a change in prices or TOM. This changes in Model (3), which is the unrestricted specification with all other covariates and the TOM IV. Now transactions in colder markets experience price decreases of approximately 5%. We note that the partial elasticity on TOM continues to be positive and significant.

Table 8 Hot and cold binaries

A different method to measure local market dynamics is to use aggregate demand and supply measures in place of θ and spatial fixed effects. Following Genesove and Han (2012), we include population and income at the MSA level as demand drivers. Homeowners with high incomes have a greater willingness to pay for consumption amenities so that improvements in these amenities will be accompanied by higher income people moving into the local area. Population can also impact housing demand through channels such as employment opportunities, consumption amenities, and zoning laws. On the supply side, we include population density following the literature such as Green, Malpezzi, and Mayo (2005), who find that density is a negative predictor of supply inelasticity.

We examine the macroeconomic variable levels as well as their first differences (Δ) computed as continuously-compounded percentages. We investigate the long-run population levels and changes using the year 2000 US Census values compared to the 2010 levels. This considers the housing stock. We also examine short-run changes using year 2008 compared to 2010 levels, which investigates the housing flow. We find the results and conclusions regarding the elasticity of TOM are invariant to the supply and demand variables. Table 9 reports the equations with Model (1) including the demand-side drivers and Model (2) adding population density levels. In both instances, the TOM IV continues to exhibit a positive and significant elasticity with transaction prices.
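
A continuously-compounded percentage change is 100 times the log ratio of the ending level to the starting level. The short sketch below illustrates the calculation; the function and argument names are hypothetical, and the levels passed in would be, for example, MSA population in 2000 and 2010.

```python
import math


def cc_pct_change(start_level, end_level):
    """Continuously-compounded percentage change between two levels."""
    if start_level <= 0 or end_level <= 0:
        raise ValueError("levels must be positive to take logs")
    return 100.0 * math.log(end_level / start_level)
```

A convenient property of this definition is that changes over consecutive periods add up to the change over the full period.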

Table 9 Demand and supply drivers

Loss Aversion

Empirical studies such as Genesove and Mayer (1997, 2001) and Hayunga and Pace (2016) find that sellers are loss averse and reluctant to realize a loss when the current transaction price is expected to be below the original price paid for their home. Sellers consequently set list prices higher than expected, which can impact marketing durations such that loss aversion may be a unique omitted variable in price equations. Since the original purchase price is observable in the NAR dataset, we are able to compute the proxy for loss aversion using the method laid out by Genesove and Mayer (1997, 2001).
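
The proxy can be sketched as a truncated log difference. The form below, the log of the original purchase price less the log of the expected selling price, floored at zero, is one reading of the Genesove and Mayer approach rather than a verified replication. The expected price is taken as given here (in practice it would come from a hedonic prediction), and the function and argument names are hypothetical.

```python
import math


def expected_loss(purchase_price, expected_price):
    """Log shortfall of the expected price relative to the purchase price.

    Returns zero when the expected selling price is at or above the
    original purchase price, so only prospective losses are measured.
    """
    if purchase_price <= 0 or expected_price <= 0:
        raise ValueError("prices must be positive to take logs")
    return max(0.0, math.log(purchase_price) - math.log(expected_price))
```

Because the measure is truncated at zero, sellers facing a prospective gain are pooled, and the coefficient is identified from variation among sellers facing a prospective loss.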

When expected losses are added to our standard unrestricted model, we observe in Table 10 a positive and significant correlation with transaction prices, which is consistent with Genesove and Mayer (2001) and Hayunga and Pace (2016). We also include the binary variables of hot and cold markets, which are significant in the expected direction. Again, the relation between TOM and prices continues to be positive and significant.

Table 10 Expected losses

Localized Markets

Real estate trades in local markets. While we account for this aspect using spatial fixed effects, these controls along with the atypicality variables may introduce two additional considerations. The first is that fixed effects across the entire US at the 3-digit ZIP code level can result in just a couple of observations per ZIP code control for our sample. Another concern with a national study is that the atypicality variables may not generalize across the entire country. For instance, slightly more than 1% of the sample consists of homes greater than 5000 square feet. While this is a large home in a major metropolitan area like New York City, it may be less atypical in a more rural area in, say, Texas.

Due to these considerations, our last test restricts the sample to homes in five large metropolitan areas in the northeastern US: Boston, New York City, Philadelphia, Baltimore, and Washington, D.C. With 559 observations, the sample is large enough to include a sufficient number of observations within each ZIP code area when we bin at the 2-digit level. We also observe that the atypicality variables still measure the extreme tails of the distributions for old, new, small, and large homes as well as those properties with many bathrooms and bedrooms.

The model in Table 11 reports the results from this test. In the specification, we include the same unrestricted Model (2) from Table 4 as well as the expected nominal loss variable. We do not include the hot and cold measures because they are multicollinear with the ZIP code controls. We find that the TOM coefficient of 0.048 is similar in economic magnitude to those in the previous equations; however, the slope coefficient is not significant at the 5% level, with a p-value of 0.06. We note that with fewer observations the standard error increases by approximately 50%, which explains the reduced statistical significance.
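
The effect of a larger standard error on significance can be illustrated with a normal approximation to the test statistic. In the sketch below only the coefficient of 0.048 comes from the text; the two standard errors are hypothetical values chosen so that the second is 50% larger than the first, and they are not taken from Table 11.

```python
from statistics import NormalDist


def two_sided_p_value(coefficient, standard_error):
    """Two-sided p-value for H0: coefficient = 0, normal approximation."""
    if standard_error <= 0:
        raise ValueError("standard error must be positive")
    z = abs(coefficient) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(z))


# Hypothetical standard errors; the second is 50% larger than the first.
base_se = 0.017
p_base = two_sided_p_value(0.048, base_se)
p_inflated = two_sided_p_value(0.048, 1.5 * base_se)
```

With the point estimate held fixed, the larger standard error alone moves the p-value from below to above the 5% threshold in this illustration.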

Table 11 Large Northeast US MSAs

Conclusion

Due to fixed location and heterogeneous products, buyers who desire new housing must search for a property with the service flow that meets their tastes and preferences. On the other side of the transaction, sellers do not know which buyers will be a good match for their property. Thus, search theory and the matching process are fundamental to real estate markets. Search-theoretic models show that the probability of sale simultaneously affects both the expected prices and marketing durations of properties, which leads to a positive relation between these outcomes in equilibrium. The empirical research, however, is mixed, with a number of models finding insignificant or negative relations. The negative relation is disconcerting because one conclusion is that underpriced homes systematically remain on the market longer than expected.

This study examines explanations for the divergence of the empirical and theoretical results. Our first finding is that, while recent articles have taken more care, many studies still fail to account for the joint determination of prices and marketing durations, which can cause biased results.

Studies that address simultaneity typically model a system of equations using two-stage least squares and instrumental variables. The presence of instrumental variables in a system does not guarantee that the signs on estimated parameters will agree with those predicted from theory. This can result from omitted variables or from weak instruments. Using a unique dataset that provides many variables omitted in other studies, we find that specifications that include a strong TOM instrumental variable produce the expected positive relation between TOM and prices. In contrast, models with a weak TOM instrumental variable generate negative or insignificant parameter estimates on TOM.

When comparing the equations that exhibit negative and insignificant versus positive slope coefficients on TOM we note that the parameter estimates on the other independent variables do not change dramatically. Our tests suggest that the negative or insignificant slopes probably will not materially affect the inferences concerning other variables of interest in price regressions that include TOM.