Introduction

This paper examines the extent of segmentation (or integration) in US regional housing markets. This topic has been the subject of a number of other studies, both for the US and the UK. The topic has taken on added urgency due to what some observers believe is housing’s pre-eminent role in the business cycle (Leamer 2007), and especially the very central role housing appeared to play in the 2007–09 recession and subsequent slow recovery.

Given the potential links between financial distress and losses in GDP and employment, some advocate that monetary policy be set partially as a function of indicators of financial risk, including values in the housing market (see Goodhart and Hofmann 2000). This is part of a larger debate over whether equity values should have an impact on monetary policy, and the debate has been growing in the wake of the 2007–09 recession, which followed an asset price boom and bust.

The possible inclusion of house prices as a monetary policy determinant highlights the importance of integration as an issue. For home prices to be employed as a determinant of monetary policy, it is important that values be highly integrated across the country. If, instead, there is a high level of segmentation across different regions, the use of housing as a signal for policymakers becomes more problematic. As an example, if housing were booming on the west coast but deeply depressed in Texas, Oklahoma and Louisiana, then a tight monetary policy might temper a west coast bubble but deepen a slump in oil-producing states.

The extent of regional integration thus has important policy implications. In addition, understanding the interaction between different regional housing markets may improve forecasts of home values, which is of course useful for investors as well as financial institutions, and also yield some information on the extent of potential diversification benefits from holding a (regionally) diversified portfolio of housing assets.

Given the issue’s importance, a number of studies have been undertaken. For the UK, MacDonald and Taylor (1993), Drake (1995) and Cook (2003) find different results. For the US, Pollakowski and Ray (1997), Baffoe-Bonnie (1998), Del Negro and Otrok (2007), Fadiga and Wang (2009) and Clark and Coggin (2009) apply a variety of techniques to the issue of integration, and present somewhat varied findings. The techniques employed in these studies do raise a number of statistical issues, to be discussed below. One particular issue which many previous studies overlook is that measures of integration which rely on some form of correlation miss important differences in the amplitude of house price movements across regions. For example, Fig. 1 displays two hypothetical housing markets which are perfectly correlated in their price movements but nonetheless exhibit substantial segmentation due to differences in amplitude. If monetary policy were set based on house prices, actual regional home prices would be sending very different signals, although correlation-based measures would miss this difference.

Fig. 1 Hypothetical price movements for two housing markets

Accordingly, we apply a new set of tools which yield measures of integration and account for differences in both correlation and amplitude. These methods were developed by Mink, Jacobs and De Haan (2012), who employed them to investigate output synchronization in the Euro zone. In addition to accounting for both correlation and amplitude, these methods also allow us to obtain time-varying estimates of integration. We also employ a sample which extends longer than those in previous studies, so we are able to capture both the peak of the bubble and the bust. To anticipate our results, we find, first, that the early years of the bubble (2001–05) were, contrary to some previous results in Del Negro and Otrok, years of rising segmentation rather than increasing integration in the United States overall. This finding bolsters Greenspan’s case, stated in 2005, that the early years of the bubble may have reflected “lots of local bubbles” rather than a national bubble. However, the later years of the bubble, as well as the bust, witnessed a very sharp increase in overall integration nationally. Thus the later froth and bust likely did reflect national (or perhaps global) factors. We then find that there is much variation in the degree of integration across US regions through most of the sample. This finding comports with that of Mink et al. (2012), whose results indicated that output in different US regions was less integrated than output in the Euro zone. As home prices likely reflect local income, it follows that housing values can exhibit a high degree of segmentation in the US. This latter finding of much variation in integration across the US makes the use of home prices as a monetary policy tool, advocated by some, problematic.

This paper proceeds as follows. The next section reviews the previous literature. Section 3 describes the data and methodology, section 4 discusses our results and section 5 concludes.

Previous Literature

The topic of regional house price integration has attracted much attention among observers of the United Kingdom; indeed, the topic has received more attention in the UK than in the US. Researchers have sought to determine which British region seems the main causal driver of price movements in other regions, knowledge of which would help improve house price forecasting. In addition, such knowledge of house price interactions helps uncover the degree of segmentation in the UK housing market. That is, some regions may be slower than others to adjust to shocks, indicating a greater dependence on local factors. Put differently, this points to the issue of convergence: how quickly do home values in different regions move together, if they move together at all?

Drake (1995) finds, upon employing a Kalman filter, that house prices in the North and Scotland exhibited much different behavior than those in regions closer to the south of the UK. Others have investigated whether a long run relationship exists among the regional home prices, often employing cointegration techniques. The reasoning behind examining cointegration is that, if two given regions are economically integrated, house prices will likely be cointegrated, as large house price differentials between different areas may present an arbitrage opportunity that will be exploited. Thus regional house prices in a national market that are integrated may not be able to move far from each other without encountering some error correction mechanism. On the other hand, housing in different regions may exhibit different prices based on amenities or agglomeration effects, and prices may well move far apart without an inherent tendency to come closer together, which would mean a lack of cointegration and some degree of segmentation. MacDonald and Taylor (1993) find only limited evidence of cointegrating relationships for the UK, indicating some segmentation. Alternatively, Cook (2003) allows for asymmetric adjustment between positive and negative deviations from the long run potential relationship, and finds greater evidence for such relationships, and therefore less evidence of segmentation.

Another strand of research is based on the idea that if housing units are not completely segmented, and price movements eventually converge (at least in direction, if not magnitude), then the ratio of a local house price index to a national home value index should be a stationary process, one that does not wander arbitrarily without a tendency to revert to a long-run mean. A stationary process will frequently move above and below its mean.

To determine convergence (stationarity of the ratio of local to national price indices), some researchers have applied unit root tests to the ratio. Peterson et al. (2002) employ standard unit root tests and fail to reject the null hypothesis of nonstationarity for the ratio, and thus conclude there is a lack of convergence, and hence little integration, among the UK regions.

On the other hand, Cook (2003) employs the Momentum Threshold Autoregressive (M-TAR) unit root test, which allows, under the alternative hypothesis, for asymmetric adjustment to boom versus bust cycles. Allowing for this asymmetry, Cook rejects the null of a unit root for the different British regions, and concludes that the different UK housing markets are highly convergent.

Holmes and Grimes (2008) examine the issue of convergence by employing principal components analysis. The authors find the first principal component of the 13 UK regions is stationary, and thus conclude that the national housing sector displays convergence.

For the United States research on regional price movements and integration has been more recent. Pollakowski and Ray (1997) examine the interaction of home values at the regional and state level. The authors investigate the same nine census regions examined in the present study. The authors do find that regional prices Granger-cause each other. This suggests that knowledge of regional prices can help in forecasting home values. Beyond this relationship, however, the authors find “neither a spatial pattern nor any other discernible pattern” (p. 114).

Baffoe-Bonnie (1998) examines the impact of macroeconomic variables such as employment and interest rates on national and regional housing markets, where the regions are defined as Northeast, Midwest, South and West. The author finds that these macroeconomic variables have different effects on different regions, which suggests some degree of segmentation. In a related paper, Del Negro and Otrok (2007) use dynamic factor analysis and find that most movements in home prices appear driven by local drivers, but that over the 2001–2005 period, values appeared more affected by national determinants. The authors cite then-Fed chief Alan Greenspan’s May 2005 statement that he did not perceive a national housing bubble, but that “it is not hard to see.. that there are a lot of local bubbles” (Greenspan, quoted by Del Negro and Otrok, p. 1962). Del Negro and Otrok’s finding that “the increase in house prices” over 2001–05 was a “national phenomenon” (p. 1962) would appear to cast doubt on Greenspan’s “lots of local bubbles” conjecture.

Fadiga and Wang (2009) employ a state space time series model for four regions: Northeast, Midwest, South and West. The authors find that home prices in the Northeast and West seem highly determined by a common source in the long run; the South and Midwest follow a different pattern. Clark and Coggin (2009) use a trend plus cycle time series model to investigate integration among the nine census regions, the same nine examined in the present study. The authors extract two super-regional factors, and find that these factors exhibit somewhat different patterns until the early and mid-1990s, when prices started rising sharply. The authors find the overall evidence for convergence mixed based on the results of error correction models.

There is thus no clear picture of the extent of convergence and segmentation among US regional house prices. This study applies a new methodology to the question.

Data and Methodology

As noted, the extent of integration is important for policy as well as forecasting purposes. Previous studies have yielded interesting results, but thus far no consensus has emerged. There are several issues with techniques employed in previous studies that could make inference problematic. First, it would be optimal to have a time-varying measure of integration, but some of the methods employed, such as unit root tests or cointegration studies, give only a single account of the extent of integration over the whole sample, by testing whether price differences between regions are stationary, mean-reverting, or respond to an error-correction mechanism. Such methods do not allow for understanding how integration may evolve over time. For instance, has integration increased or decreased over time, for certain regions or the nation as a whole? Does integration exhibit patterns relating to the business cycle, i.e. does it notably increase or decrease in response to recessions or booms? Such questions cannot be answered with standard unit root or cointegration tools.

There are techniques, such as smooth trend plus cycle, Kalman filter and state space models which do allow for time-varying estimates of integration and which have been employed in previous papers. However, such techniques rely on often very restrictive assumptions, such as the independence of the residuals from different components and the normality of residuals, which are likely to be violated, thus leading to unreliable results (see Bjornland and Brubakk 2005).

Another issue is that many techniques such as unit root or cointegration studies only take account of the correlation of price shocks. Unfortunately, correlation can give a very incomplete picture of integration. For instance, Fig. 1 displays two hypothetical house price series for two hypothetical regions. Although the two series are perfectly correlated, they display very different cyclical dynamics. The amplitude of region 2’s cycles is much greater than that of region 1.
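The point about correlation missing amplitude can be illustrated with a short numerical sketch. The two series below are hypothetical stand-ins for the regions in Fig. 1, not the data behind the figure; the sine shape, the 20-period cycle and the amplitude factor of 3 are all assumptions chosen for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)

# Hypothetical cycles: region 2 is region 1 scaled by an assumed factor of 3.
region1 = [math.sin(2 * math.pi * t / 20) for t in range(40)]
region2 = [3.0 * g for g in region1]

corr = pearson(region1, region2)            # correlation ignores the scaling
peak_ratio = max(region2) / max(region1)    # amplitude differs by the assumed factor
print(round(corr, 6), peak_ratio)
```

The correlation is one up to rounding, while the peaks differ by the assumed factor, which is the gap a correlation-only measure cannot see.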

A test or estimate of integration between these two markets using certain unit root or cointegration techniques employed in previous studies will indicate tight integration, and one could infer, erroneously, that monetary policy could be determined in part by such a “unified” housing market. The problem is that, when prices in both markets are rising and tight policy is appropriate in both, optimal Fed policy for region 2, where the swings are larger, would be much tighter than optimal policy for region 1. Similarly, in a downturn, while the Fed would rightly loosen policy, optimal policy would be much looser for region 2 than for region 1.

What method could detect such differences? Mink et al. (2012) have developed a new methodology, designed in their case to examine the integration of business cycles in the Euro-zone. This technique allows for measuring how integration varies period-by-period (rather than being measured as a single correlation coefficient over the entire sample period in question) and evaluating not just differences in the sign of series gaps, but also variation in the amplitude of cycles.

To achieve stationarity, the authors begin by de-trending output data (i.e. removing the stochastic trend) for 11 European countries using the Christiano-Fitzgerald (CF) filter to extract fluctuations. Business cycles are measured as deviations (fluctuations) from the trend, divided by the corresponding trend. Mink et al. denote the cycle for country i at period t as $g_i(t)$. The “reference” cycle (in the case of this paper, the national FHFA house price index) is denoted as $g_r(t)$. The authors then create a measure called synchronicity, calculated as follows:

$$ {\varphi}_{ir}(t)=\left({g}_i(t){g}_r(t)\right)/\left(\left|{g}_i(t){g}_r(t)\right|\right) $$
(1)

This synchronicity metric measures whether the signs of cycles in two different regions are the same. It takes a value of −1 when region i is rising, and region r is falling (and vice-versa) and a value of +1 when both are rising (or falling). A measure of overall synchronicity for the national housing market can be calculated as:

$$ \varphi (t)=\left(1/n\right){\displaystyle \sum_{i=1}^n\left[\left({g}_i(t){g}_r(t)\right)/\left|{g}_i(t){g}_r(t)\right|\right]} $$
(2)

This metric is equal to the measure in Eq. 1 averaged over the sample of the n regions. It is defined on a [−1 + 2/n, 1] scale, with a value of 1 indicating all regions have cycles of the same sign.
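As a minimal sketch of Eqs. 1 and 2, the functions below compute synchronicity for a single date. The function names and the gap values are hypothetical and chosen only for illustration; the sketch also assumes no gap is exactly zero, a case Eq. 1 leaves undefined.

```python
def synchronicity_pair(g_i, g_r):
    """Eq. 1: +1 if the two gaps share a sign, -1 if they differ."""
    product = g_i * g_r
    if product == 0:
        # Assumption: the text does not say how a zero gap is handled.
        raise ValueError("Eq. 1 is undefined when either gap is exactly zero")
    return product / abs(product)

def synchronicity(gaps, g_r):
    """Eq. 2: average of Eq. 1 over the n regional gaps at one date."""
    return sum(synchronicity_pair(g, g_r) for g in gaps) / len(gaps)

# Hypothetical gaps for n = 4 regions in one quarter; the reference gap is positive.
regional_gaps = [0.02, 0.01, -0.03, 0.05]
reference_gap = 0.015

phi_t = synchronicity(regional_gaps, reference_gap)
print(phi_t)  # three regions agree in sign and one disagrees: (3 - 1) / 4 = 0.5
```

Applying the function date by date to the filtered series would give the period-by-period measure that a single correlation coefficient cannot provide.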

While the synchronicity metric yields a measure of coherence that can vary each period (unlike a correlation coefficient which yields one single, invariant estimate over the entire sample period), it does not take account of differences in house price cycle amplitude. As Fig. 1 demonstrates, fluctuations can have the same signs, but there can still be very large differences in the amplitudes of fluctuations. Accordingly, Mink et al. have created another metric, called similarity, which accounts for such amplitude differences. Similarity is calculated as follows:

$$ {\gamma}_{ir}(t)=1-\left(\left|{g}_i(t)-{g}_r(t)\right|\right)/\left|{g}_i(t)\right| $$
(3)

This measure can then be averaged over the n regions as follows:

$$ \gamma (t)=1-\left({\displaystyle \sum_{i=1}^n\left|{g}_i(t)-{g}_r(t)\right|}\right)/\left({\displaystyle \sum_{i=1}^n\left|{g}_i(t)\right|}\right) $$
(4)

This measure is defined over [2 − n, 1]. A value of 1 indicates all regions are having an identical cycle.
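The aggregate similarity measure in Eq. 4 can be sketched in the same way. The function name and the gap values are hypothetical; the example mirrors the situation in Fig. 1, where all gaps share a sign but differ in size, and it assumes at least one regional gap is nonzero so that the denominator is positive.

```python
def similarity(gaps, g_r):
    """Eq. 4: one minus summed absolute gap differences over summed absolute gaps."""
    total = sum(abs(g) for g in gaps)
    if total == 0:
        # Assumption: the text does not cover the case where every gap is zero.
        raise ValueError("Eq. 4 is undefined when every regional gap is zero")
    return 1.0 - sum(abs(g - g_r) for g in gaps) / total

# Hypothetical gaps: both regions are above trend, but by different amounts.
regional_gaps = [1.0, 3.0]
reference_gap = 2.0

gamma_t = similarity(regional_gaps, reference_gap)
identical = similarity([2.0, 2.0], 2.0)
print(gamma_t, identical)  # 0.5 for the unequal amplitudes, 1.0 for identical cycles
```

In this example every gap has the same sign as the reference, so synchronicity would equal one, yet similarity is only 0.5; the difference is the amplitude information the second measure adds.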

We will apply these measures to regional house prices in the US to determine the extent of integration in the national housing market.

The data on regional house prices were obtained from the FHFA. They comprise the same nine regions as in the Pollakowski and Ray (1997) and Clark and Coggin (2009) studies. The regions are as follows. East North Central (ENC): Michigan, Wisconsin, Illinois, Indiana and Ohio; East South Central (ESC): Kentucky, Tennessee, Mississippi, Alabama; Middle Atlantic (MA): New York, New Jersey, Pennsylvania; Mountain Census Division (MT): Montana, Idaho, Wyoming, Nevada, Utah, Colorado, Arizona, New Mexico; New England (NE): Maine, New Hampshire, Vermont, Massachusetts, Rhode Island, Connecticut; Pacific Census Division (PAC): Alaska, Hawaii, California, Washington, Oregon; South Atlantic (SA): Delaware, Maryland, District of Columbia, Virginia, West Virginia, North Carolina, South Carolina, Georgia, Florida; West North Central (WNC): North Dakota, South Dakota, Minnesota, Nebraska, Iowa, Kansas, Missouri; and West South Central (WSC): Texas, Arkansas, Louisiana, Oklahoma.

Our data are quarterly, and run from 1975:1 through 2012:3. This is thus the first study on regional house price integration in the US to capture both the full 2000s housing bubble and the subsequent bust. All price indices are deflated with the US Consumer Price Index (CPI; 1983 = 100) to yield measures of real home values. All series are then de-seasonalized using the X-12-ARIMA method.
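The deflation step amounts to dividing each nominal index value by the CPI for the same quarter and rescaling by the CPI base of 100. A minimal sketch follows; the function name and the index values are hypothetical, and the seasonal adjustment step is not reproduced here.

```python
def deflate(nominal_index, cpi, base=100.0):
    """Convert a nominal index to real terms using a CPI with the given base value."""
    if len(nominal_index) != len(cpi):
        raise ValueError("the two series must cover the same quarters")
    return [base * p / c for p, c in zip(nominal_index, cpi)]

# Hypothetical quarterly values, not FHFA or CPI data.
nominal = [100.0, 150.0, 180.0]
cpi = [100.0, 120.0, 150.0]

real = deflate(nominal, cpi)
print(real)  # [100.0, 125.0, 120.0]
```

In the last hypothetical quarter the nominal index rises while the real index falls, because the price level rose faster than home values.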

As noted in previous studies of integration, especially those employing unit root or cointegration analysis, non-stationarity can complicate the analysis of home values. As discussed, rather than differencing the data, which could result in losing much information, or using cointegration analysis with its above-mentioned shortcomings, we follow Mink et al. (2012) and de-trend the data with the Christiano-Fitzgerald (CF) filter to extract the stationary component of home values. There are a number of other filters available, but they have known flaws (see Christiano and Fitzgerald 2003). Mink et al. (2012) cite Koopman and Azevedo (2008, p. 26), who state that the CF technique “is the best performing and most flexible linear filter available”. We will thus employ this method in our analysis.

For both synchronicity and similarity, we will employ the FHFA overall US home price index as the reference index. We will also follow Mink et al. and compute moving averages of the measures. In our case these will be four-year moving averages. Finally, while we can obtain measures of synchronicity and similarity through time and observe different measures of integration, we also follow Mink et al. and more formally test for structural breaks in our integration measures. Mink et al. did not actually discuss a rationale for using the moving averages. However, the moving average filter is frequently applied to macroeconomic variables such as output, and in this case outlier observations in output could distort the integration measures and make it difficult to uncover the trends over time in co-movement. The synchronicity measure, in particular, as it takes only the values negative one or positive one, may not yield much information about long term trends without being smoothed: five periods of positive one followed by two at minus one will be “jagged” and difficult to interpret. In addition, Cohen (2001) points out, in discussing the use of filters such as the Baxter-King and Christiano-Fitzgerald methods, that they may introduce some biases, and that “biases and timing shifts introduced by standard data transformations can be substantially neutralized by relying on moving averages” (p. 19). This is very relevant to the measures here, as both synchronicity and similarity are developed with the Christiano-Fitzgerald filter.
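The smoothing step can be sketched as a simple moving average. With quarterly data a four-year window is 16 observations; the text does not say whether the window is trailing or centered, so the trailing window below is an assumption, as is the function name. The example uses a window of 4 on the “five periods of positive one followed by two at minus one” pattern described above, to keep the arithmetic visible.

```python
def moving_average(series, window=16):
    """Trailing moving average; returns len(series) - window + 1 values."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and the series length")
    running = sum(series[:window])
    out = [running / window]
    for t in range(window, len(series)):
        # Slide the window forward: add the newest value, drop the oldest.
        running += series[t] - series[t - window]
        out.append(running / window)
    return out

# Five periods at +1 followed by two at -1, as in the text's example.
raw = [1, 1, 1, 1, 1, -1, -1]
smoothed = moving_average(raw, window=4)
print(smoothed)  # [1.0, 1.0, 0.5, 0.0]
```

The raw series jumps between the two extreme values, while the smoothed series declines gradually, which is easier to read as a trend in co-movement.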

As in Mink et al., we will regress each measure, both for the overall nation and for each region, on a constant and time trend, and perform an Andrews-Quandt break test. The Andrews-Quandt method allows for testing for the existence of “endogenous” breaks. This means that breaks are tested for without the researcher having to specify a particular date for the break, as in a traditional Chow test. The Chow approach is unreliable in this setting, as specifying a date based on knowledge of the series is a form of data mining and such tests have very poor size properties.
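The logic of searching over break dates can be sketched as follows. This is an illustrative sup-F calculation under stated assumptions, not the exact procedure behind Table 1: it lets both the constant and the trend change at the break, trims 15% of the sample at each end (a common but assumed choice), and uses a made-up series with a level shift. The sup-F statistic has a nonstandard distribution, so ordinary F tables do not apply, and the sketch does not compute critical values or p-values.

```python
import math

def ols_ssr(y, t):
    """Sum of squared residuals from regressing y on a constant and t."""
    n = len(y)
    mt, my = sum(t) / n, sum(y) / n
    stt = sum((a - mt) ** 2 for a in t)
    sty = sum((a - mt) * (b - my) for a, b in zip(t, y))
    slope = sty / stt
    intercept = my - slope * mt
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(t, y))

def sup_f_break(y, trim=0.15):
    """Return (break_index, sup_F): the candidate date with the largest Chow-type F."""
    n = len(y)
    t = list(range(n))
    k = 2  # parameters allowed to change at the break: constant and trend
    ssr_restricted = ols_ssr(y, t)
    start = max(int(trim * n), k + 1)
    stop = min(n - int(trim * n), n - k - 1)
    best_idx, best_f = None, float("-inf")
    for b in range(start, stop + 1):
        # Unrestricted model: separate constant and trend before and after b.
        ssr_u = ols_ssr(y[:b], t[:b]) + ols_ssr(y[b:], t[b:])
        f_stat = ((ssr_restricted - ssr_u) / k) / (ssr_u / (n - 2 * k))
        if f_stat > best_f:
            best_idx, best_f = b, f_stat
    return best_idx, best_f

# Hypothetical smoothed measure: a level shift at index 30 plus small wiggles.
series = [(0.2 if i < 30 else 0.8) + 0.05 * math.sin(1.7 * i) for i in range(60)]
break_idx, sup_f = sup_f_break(series)
print(break_idx, round(sup_f, 1))
```

Because the candidate date is chosen by the data rather than by the researcher, the search avoids the pre-specified date that makes the Chow test problematic; the price is the nonstandard distribution of the maximum.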

Results

Figure 2 displays the graphs of the 4-year moving averages for both overall synchronicity and similarity in the US. For both measures, integration starts out at high levels in the late 1970s, and fairly steadily falls into the 1990s. For synchronicity, segmentation appears to peak in 1994:2, and similarity has a trough 3 years later at 1997:3. After both synchronicity and similarity hit troughs, they do rise; however, both start to fall in the early 2000s, hitting “local” troughs at the end of 2004 (synchronicity) and beginning of 2005 (similarity). While this level of integration at the beginning of 2005 was higher than each measure’s “global” trough, in both cases the local trough translated into an integration level well below the average level over the entire sample. The early years of the housing bubble were thus a period of declining integration.

Fig. 2 US synchronicity and similarity

The results for similarity, in particular, are slightly different from those of Clark and Coggin (2009), who found that their two regional superfactors had “slightly different patterns of trends and cycles until the early to mid-1990s” (p. 264). Our results show that, nationally, integration seems to take off slightly later than Clark and Coggin observed, as our similarity measure continued to decline until mid-1997. Moreover, our results are at variance with those of Del Negro and Otrok (2007). The authors employed data spanning 1986–2005, and found that house prices were mainly driven by local factors, but that over the early bubble years of 2001–2005, the increase in house prices was a “national phenomenon”. It appears, based on our similarity and synchronicity measures, that the early stages of the housing bubble may have been driven by (to quote Del Negro and Otrok, who quoted Alan Greenspan in the spring of 2005) a lot of “local bubbles”.

After 2004, however, during the peak of the bubble and then the bust, there has been a very dramatic increase in both synchronicity and similarity. Both measures reach their highest levels of the sample by the end of 2012. In this sense, the house price movements of 2005–2012 more plausibly followed a common, national pattern.

While a visual inspection of Fig. 2 would suggest that the patterns for overall synchronicity and similarity seem fairly closely linked, results from the Andrews-Quandt test, displayed in Table 1, show that synchronicity exhibited a significant break in the first quarter of 1994, very close to the time of its overall trough, while similarity experienced a significant break in the fourth quarter of 2005, in the early stages of its steep increase. Thus, while house price correlations did begin rising in the mid-1990s, the most dramatic increase in the similarity of amplitudes began right in the middle of the bubble.

Table 1 Break dates for synchronicity, similarity

We now turn to the individual regional results. Figure 3 displays the graphs of the 4-year moving averages for both synchronicity and similarity in each of the nine census areas. The measures of synchronicity and similarity reveal patterns that are somewhat alike, but decidedly not identical. For instance, East South Central has steadily rising synchronicity with the rest of the nation after 2007, but similarity for this region actually falls at about this time, a pattern different not only from synchronicity in this area, but also from both synchronicity and similarity in other regions. In several of the census areas (New England, Pacific, South Atlantic, and West South Central), the troughs of synchronicity and similarity are several years apart.

Fig. 3 Regional synchronicity and similarity

As displayed, the different regions exhibit differing levels of integration. This is partly evidenced by Table 1, which displays the Andrews-Quandt tests on structural breaks for the different regions. The timing of breaks for synchronicity varies from 1985:3 for West North Central to 2007:1 for East North Central. For similarity the break dates range from 1983:4 for West South Central to 2006:4 for both East North Central and East South Central. In terms of the frequency of breaks, there were 18 in all for both measures, four of which were in the 1980s, eight in the 1990s, and six between 2000 and 2006; two of these latter breaks were after the end of the sample employed by Clark and Coggin (2009).

The varied patterns of integration among different regions are further manifested in Table 2, which ranks each region by the average value of synchronicity and similarity over the sample. The results for the two measures are not identical; however, some patterns do emerge. The three most integrated regions, although the ordering differs between the two measures, are East North Central, East South Central, and South Atlantic. Thus Midwestern and southern states appear to exhibit the highest levels of integration, that is, the least propensity to exhibit outlier behavior in terms of deviating from the overall national trend. In contrast, the least integrated, in terms of synchronicity, are Pacific Census, Mountain Census, and, least integrated of all, New England. For similarity, the least integrated regions are Middle Atlantic (New York, New Jersey, Pennsylvania), New England, and Pacific Census. Thus New England and the Pacific region are highly segmented by both measures. Middle Atlantic ranks sixth lowest in terms of integration by the synchronicity measure, and the Mountain Census region ranks sixth lowest in terms of similarity. To repeat, the Mountain Census region includes states such as Wyoming and Idaho, but also Nevada, Arizona and Colorado. It thus appears that the west coast and northeast coast of the nation tend to more frequently “go their own way” in terms of deviating from overall national home prices, but that the Midwestern and southern regions exhibit such outlier behavior much less frequently.

Table 2 Means and standard deviations for synchronicity and similarity

It is important to note that these differences in terms of synchronicity and similarity among the regions are palpable. The standard deviations for the two measures in each region are displayed in Table 2. What we can tell from the data is that, for synchronicity, the region with the highest average level (East North Central) has a mean value more than two of its own standard deviations higher than that of the region with the lowest synchronicity value (New England). In addition, four regions (East North Central, East South Central, South Atlantic, West North Central), plus the US, have a mean value of synchronicity at least one of their own standard deviations higher than the mean of New England.

The disparities among the regions in terms of similarity are even more pronounced than those for synchronicity. Five regions (East North Central, East South Central, South Atlantic, West North Central, West South Central), plus the overall US measure, have mean values more than two of their own standard deviations above the two lowest-scoring regions (New England and Pacific Census). And six of the nine regions (all the remaining regions except the Middle Atlantic), plus overall US similarity, have mean values at least one of their own standard deviations above the means of New England and Pacific Census.
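The comparison used in the last two paragraphs, a gap between two regional means expressed in units of the higher region’s own standard deviation, can be sketched as below. The function name and both series are hypothetical and are not the values behind Table 2; the use of the sample (n − 1) standard deviation is also an assumption, since the text does not specify it.

```python
import statistics

def gap_in_own_sd(series_a, series_b):
    """Difference in means, measured in standard deviations of series_a."""
    gap = statistics.mean(series_a) - statistics.mean(series_b)
    return gap / statistics.stdev(series_a)  # sample standard deviation (n - 1)

# Hypothetical smoothed similarity values for a high and a low scoring region.
high_region = [0.6, 0.8, 1.0, 0.8]
low_region = [0.2, 0.4, 0.6, 0.4]

ratio = gap_in_own_sd(high_region, low_region)
print(round(ratio, 3))
```

A ratio above two corresponds to the “more than two of their own standard deviations” comparison made in the text.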

The wide variation in integration patterns is perhaps not surprising in light of different regional economic behavior over the business cycle. Anecdotally, one could think of the mid-1980s, which were years of strong growth for much of the country, but difficult times in agricultural states, or the early 1990s recession, which hit California and New England much harder than other parts of the country, and of course the latest recession has hit some parts of the country much harder than others. More formally, Mink et al. (2012) found, using this technique, that the United States economy exhibited less integration over the business cycle than the different nations of the Euro zone. As home values presumably react more to local employment and income conditions rather than those halfway across the country, it is perhaps not surprising that house prices also exhibit a fair amount of segmentation.

Conclusion

Clark and Coggin (2009) note that if regional incomes converge, “this phenomenon, may, in turn, be driving convergence in regional house prices, at least in a relative sense” (p. 265). And indeed the authors cite works, such as Carvalho and Harvey (2005), indicating that in the US, regional incomes may well be converging. However, recent research by Mink et al. (2012) indicates that differences in cyclical fluctuations between regions within the United States remain larger than those between nations in the Euro zone. Indeed, Fig. 1 gives us an example of two markets, with the same mean price, which nonetheless exhibit very different fluctuations, perhaps quite representative of different US regions, given the discussion above regarding the very different health that different regional housing markets in the US often display at the same time.

And the results found through our synchronicity and similarity measures re-affirm that there has been substantial segmentation in the US through most of the sample period. Indeed, examining Fig. 2, it seems that there has been, since the beginning of the sample, a large decrease in integration extending into the middle to late 1990s, and really not much change in average integration levels from there until around 2005. The large surge in integration has only occurred during the late stages of the bubble and subsequent bust, in the midst of the largest bubble and bust since the FHFA began its home price indices. Under more “normal” circumstances, growing segmentation has more often been the trend. This finding thus makes the use of home prices, with their often low levels of integration across US regions, in US monetary policy problematic.