1 Introduction

Population density has been a research topic in geography, regional science, urban studies, and public policy for many years (see Fonseca and Wong 2000). While policy implications of urban population density are far-reaching, developing a better and more accurate depiction of urban density surfaces has continued to be a major challenge that began with the pioneering work of Clark (1951). Principal reasons underlying this type of modeling include acquiring a better understanding of spatial relations between the geographic distributions of urban population and jobs (e.g., the journey-to-work problem), and urban population and location rent. Accurately modeling urban population density is partly a technical issue, in terms of deriving the best mathematical equation to describe it, and partly a moving target issue, because an urban system, its environment, and its morphology are constantly changing. Consequently, a significant proportion of the literature has been dedicated to identifying the best equational form for an urban density function. For example, Batty and Kim (1992), and Martori and Suriñach (2002) evaluate different but somewhat simple mathematical functions that can be used to describe urban population density. But the most noticeably reconceptualization has been from a monocentric to a polycentric city form (see, e.g., Griffith 1981a, b; Gordon et al. 1986; Thurston and Yezer 1991; Hoch and Waddell 1993; Chen 1997; Baumont et al. 2004), addressing the deficiency of a single-center description that predicts relatively low-population densities in more peripheral locations where subcenters have emerged over time.

Although this more sophisticated conceptualization was a major step in advancing the modeling of population density from using simplistic mathematical functions to more complex model specifications, the approach is still constrained by the traditional framework of reducing the population density surface to a two-dimensional function without considering local spatial structure, and relies on a classical regression framework in the model-fitting procedures, ignoring the presence of spatial autorrelation. Other model specifications that account for spatial autocorrelation in the geographic distribution of urban population density (see Anselin and Can 1986; Griffith and Can 1995), somewhat of a very local form of polycentricity, has bolstered the modeling of population density. This effort began with attempts to describe urban population density via gradient or trend surface analysis (e.g., Hill 1973), an approach that focuses on global spatial autocorrelation map patterns. The primary purpose of this paper is to better integrate the polycentric reconceptualization with model specifications that account for spatial autocorrelation, and better capture the nature of inter-point separations in an urban area.

A related problem in modeling urban population density is to identify the location of a principal urban center [i.e., the central business district (CBD)], and to determine the number and locations of subcenters. This task may sound simple and straightforward for a monocentric city, but Alperovich and Deutsch (1992) show how misidentification can result in biased population density analysis results; intuitively speaking, the task is dramatically more difficult for polycentric cities. First, determining whether a city is monocentric or polycentric is complicated by the presence of candidate potential centers of different magnitudes according to population density throughout a city, as well as the presence of positive spatial autocorrelation. Even if a city is deemed to have multiple centers, identifying prominent ones still can be an onerous task. For a monocentric city, Alperovich and Deutsch (1994) and Griffith and Can (1995) independently show that the location of the single city center in question can be introduced as a parameter into an urban population density model, and then simultaneously estimated with other model parameters. A secondary purpose of this paper is to extend this parameterization to the case of a polycentric city.

The urban literature is replete with studies about city centers. Over the years this theme has become part of the focus of the spatial mismatch problem—relative locations of jobs and households impact upon unemployment (e.g., Holzer 1991; Preston and McLafferty 1999; Houston 2005)—and the affiliated issue of wasteful commuting—workers reside too far from their places of employment (e.g., Small and Song 1992; Horner 2002). These particular problems highlight that there are different types of city centers defined by different criteria. For instance, McMillen (2004) attempts to identify employment centers in 62 large United States (US) cities. In other words, the centers are central and non-CBDs, geographic foci of jobs. Utilizing a different criterion, Han (2005) identifies centers based upon property values. Similarly, Paez et al. (2001) use the local G statistic to identify centers with high-land values. In this paper, the local G statistic also is employed. The center of a city is based upon population density patterns, addressing the residential dimension of urban areas, and therefore tends to coincide geographically with an historical, monocentric CBD location. This modeling direction is in line with the classical work by Clark (1951), who depicts population density according to residential locations. Models of this type are empirically based or data driven, rather than built upon urban economic theories that extend the von Thünen model (Alonso 1964; Muth 1969; Mills 1970). Differences in the theoretical foundations of these two approaches are emphasized in Griffith (1999), who states that

map pattern analysis theory, cast in terms of two-dimensional stochastic processes theory or abstract mathematics, furnishes at least an interim step in theory building, and as such is one source of regional science theory. (p. 43)

This difference in perspective also is highlighted by D. Griffith and J. Paelinck (2006, submitted), who allude to some of the tensions between analysis criteria based upon theoretical economic derivations in contrast to statistical correctness and goodness-of-fit. Meanwhile, in keeping with Alperovich and Deutsch (1994) and Griffith and Can (1995), urban population density centers are included as parameters of a city’s model description.

Therefore, the overall goal of this paper is to present a spatial regression framework for modeling urban population density. In this framework, urban population density is modeled as a function of both global distance from polycentric urban centers (with a monocentric urban place being the default geographic landscape) as well as local effects of neighboring population densities surrounding each location in an urban area. This general approach follows the tradition of identifying the best equational form for an urban density function, as exemplified by Batty and Kim (1992). We adopt a statistical modeling approach in which the parameter estimates are derived from empirical population density data. The model takes into account local effects to partially overcome limitations of the traditional approach, which reduces the spatial structure to a two-dimensional function without incorporating local characteristics. Overlooking this local dependency can result in biased distance decay parameter estimates. The spatial autoregressive response (AR) model furnishes our basic model specification. While most models of urban population density use the Euclidean metric to measure global distances from centers, we adopt the more general Minkowskian metric (see, e.g., Getis et al. 2005, p. 165) to measure distances from centers. This framework allows a high degree of flexibility in accommodating different urban morphological characteristics. We also demonstrate a methodology for determining the numbers and locations of prominent population density centers in a city. Following the tradition of Alonso (1964) to some degree, our conceptual framework parallels multi-market von Thünen theory, and is demonstrated using data for the 20 largest metropolitan areas in the US according to the 2000 decennial census coupled with a block group geographic resolution. Comparisons are made between more simple monocentric and more sophisticated polycentric model results.

2 A conceptual framework

The work by Clark (1951), which uses a negative exponential function to characterize the relationship between a monotonic decline in population density from a city center with increasing distance from the center, furnishes the foundation of more recent work in modeling urban population density gradients. But as Batty and Kwang (1992) pointed out, Clark was not the first one to conceptualize the rapid decline of population density with increasing distance. The root of this concept dates back to von Thünen’s land rent model, which implies that land closer to an urban center generally has higher location rent and thus has a higher intensity of use, regardless of the land use type. The exponential decline depiction for a density gradient also is consistent with the density functions based upon urban economic theories (e.g., Muth 1969; Bussière and Snickars 1970; Mills 1970).

Following some of the early works modeling density gradients, some researchers attempted to identify determinants of these gradients (e.g., Alperovich 1983). But most studies have been dedicated to justifying the form and the specification of density gradient functions. For example, Zielinski (1979) evaluates different mathematical expressions that describe density gradients. Numerous studies approach the subject by applying the same density gradient model to different countries: Mills and Tan (1980) compare density gradient functions between developed and developing countries, whereas Edmonston et al. (1985) compare density gradient functions between US and Canadian cities. Meanwhile, Asabere and Owusu-banahene (1983) evaluate the applicability of the model to an African city. Nevertheless, while the negative exponential model formulation has been the standard form for describing urban population density, it has been challenged as alternatives have been suggested. Crampton (1991) reports results of experiments with different mathematical functional forms for cities in the developed world and Zheng (1991) suggests using a cubic-spline to model population density in the Tokyo metropolitan area. Batty and Kwang (1992) suggest that a negative power function may be more appropriate. In this paper, we posit a Minkowskian metric specification for which this negative power function is a special case. Given the prominence of polycentric cities, the utility of a monocentric-based density gradient model also is being challenged (e.g., Griffith 1981a).

2.1 Implications from von Thünen

The simple von Thünen geographic landscape contains a single center coupled with an assumption of isotropy. One extension is to introduce additional centers of varying sizes together with exponential rather than linear intensity decay with increasing distance. If intensity effects of multiple centers are additive—they also can be multiplicative—then the resulting cross-section curve may be portrayed as in Fig. 1a. The principal center has the highest peak whereas the most subordinate center has the shortest peak, and at any point along the horizontal axis, the vertical axis equals the sum of the three curves. Retaining an assumption of isotropy (i.e., Euclidean distance), this triplet of centers generates the two-dimensional contour map appearing in Fig. 1b. This map emphasizes that as distance separating the principal center and a subcenter decreases, population density gradients become increasingly indistinguishable. This map also highlights that as distance separating a principal center and one of its subcenters increases, three-dimensional mounds become increasingly distinguishable. The population density surface outcome of this kind of geographic landscape resembles an aerial view of part of the roof of the Denver International Airport (or a big top tent): peaks are represented by protrusions created in the material roof by poll supports, and inter-center gradients are represented by sags in the material roof between poll supports (see Fig. 2).

Fig. 1
figure 1

a Cross-sectional curves for a principal center and two subcenters. b Two-dimensional contour map for a principal center with two subcenters

Fig. 2
figure 2

Roof of the Denver International Airport and protrusions in a big top tent

3 Components of a general analytical model specification

Traditionally, because a monocentric city urban population density model involves the right-hand side of its equation being an exponentiation using the base e, each side of the equation is subjected to a logarithmic transformation (ln), resulting in ln(population density) being expressed as a linear function of distance from the corresponding CBD, which allows inferences to be based upon linear regression analysis coupled with a normal probability model:

$${\text{ln}}{\left({\frac{{{\text{population}}_{i}}}{{{\text{area}}_{i} }}} \right)} = \alpha - \gamma \hbox{distance}_{{\text{CBD}},\,i} + \varepsilon_{i} $$
(1)

for location i (i = 1, 2, ..., n, for the n locations contained in an urban place), where α is the intercept term, γ the distance decay parameter, distanceCBD,i the Euclidean distance separating location i from the CBD, and ε is an independent and identically distributed (iid) random error term; the normal probability model is attached to ɛ. But preserving the exponential form of the right-hand side of the equation allows inferences to be based upon non-linear regression analysis coupled with a normal probability model:

$$\frac{{{\text{population}}_{i} }}{{{\text{area}}_{i} }} = {\text{e}}^{\alpha} {\text{e}}^{ - \gamma {\rm distance}_{{\text{CBD}},\,i}} + \varepsilon_{i} .$$
(2)

This particular specification alludes to allowing population counts to be modeled with Poisson regression, where area used to calculate densities becomes ln(area) and then is included as an offset variable. Unfortunately, the introduction of an autoregressive term into this specification requires Markov Chain Monte Carlo maximum likelihood estimation of model parameters, a daunting undertaking. Investigating this modification of the model will be the topic of future research.

The preceding general framework not only collapses urban spatial structure into two-dimensional functions, ignoring the presence of neighborhood, and community structures, but estimation of the distance parameter also ignores that population density is highly spatially autocorrelated and thus the standard error of the distance parameter may be affected when estimated using a log-linear regression framework. Spatial dependence can be captured by including an autoregressive term on the right-hand side of an urban population density equation. Retaining the log-linear form of the equation results in the AR model specification

$${\text{ln}}{\left({\frac{{{\text{population}}_{i} }}{{{\text{area}}_{i} }}} \right)} = \alpha - \gamma \hbox{distance}_{{\text{CBD}},i} + \rho {\sum\limits_{j = 1}^n {w_{{ij}} {\text{ln}}{\left({\frac{{{\text{population}}_{j} }}{{{\text{area}}_{j} }}} \right)}} } + \varepsilon _{i} ,$$
(3)

where ρ is a spatial autocorrelation parameter, and w ij is the row-standardized entry in cell c ij derived from a binary geographic weights matrix C whose entries equal 1 if locations i and j are neighbors, and 0 otherwise (c ii  = 0). One advantage of this specification is that the autoregressive term in its right-hand side will tend to serve as a surrogate for missing variables, and will tend to make the error estimates behave more like iid ones. In other words, local variabilities or clusters not effectively captured by the traditional specification based only upon global distance effects are captured by the spatial autocorrelation term in this new specification.

The autoregressive structure of urban population density indicates that its accompanying geographic distribution is distorted by latent spatial dependency effects. In turn, this distortion, which means space is elastic rather than rigid in nature, invalidates a planar-based conceptualization of distance, resulting in the emergence of a breakdown in the application of Euclidean geometry concepts such as inter-point distances. Earlier model specifications of urban population density began with an assumption of Euclidean space; attempts to accommodate recognized distortions included the use of polynomials in Euclidean distance. But the elasticity of space in this context suggests that the more general Minkowskian metric is better suited for measuring inter-point distances. If Euclidean space prevails, then an estimated Minkowskian metric reduces to it; otherwise, the estimated Minkowskian metric in question better captures the curvilinear nature of inter-point distance relationships.

A Minkowskian metric can be defined as follows:

$${\text{distance}}_{{ij}} = {\left({1 \times {\left| {u_{i} - U_{k} } \right|}^{p} + 1 \times {\left| {v_{i} - V_{k} } \right|}^{p} } \right)}^{{1/p}} ,$$
(4)

where the distance is measured between location (u i , v i ) and center (U k , V k ), | | denotes absolute value, and p is a non-negative exponent; p = 2 yields the Euclidean metric, and p = 1 yields the Manhattan (i.e., city block) metric. More specifically, the unweighted Minkowskian distance formula in two-dimensional space (i.e., distance along each axis is weighted by 1; see Eq. 4) is a generalization of the commonly used Euclidean distance formula, allowing proximity between points on a two-dimensional surface to become non-Euclidean in nature. Although position and orientation continue to be of equal importance across the full range of p-values, by allowing space to become elastic, this generalization enables larger distances to be more or less emphasized, according to the value of p, capturing heterogeneous proximity in a landscape.

Duda et al. (2001) portray proximity in the common range of Minkowskian distances, 1 ≤  p ≤  ∞; Fig. 3 is adapted from their figure: the diamond identifies distance in Manhattan space (p = 1), the circle identifies distance in Euclidean space (p = 2), and the square identifies distance in the maximum (synonymously known as supremum or dominance) space (p = ∞). The impact of two coordinate axes, used for geocoding, on distance calculations is separable for Manhattan space, and becomes a function of increasing interdependence as p increases beyond 1 (i.e., interaction effects materialize and become increasingly pronounced). In other words, using the conic sections of Euclidean geometry (p = 2) as a baseline, as p increases beyond 2, more complicated loci of equi-distance points become possible. Of note is that 0 <  p <  1 characterizes cases in which the coordinate axis pairs are competitive in determining proximity, reminiscent of negative spatial autocorrelation. Therefore, replacing the Euclidean with the more general Minkowskian distance formula in conventional population density models potentially moves an analysis into the realm of non-Euclidean geometry, but only when the resulting estimate of p markedly deviates from 2.

Fig. 3
figure 3

Interpoint distances in Minkowskian space. The diamond demarcates distance in Manhattan space (p = 1); the circle demarcates distance in Euclidean space (p = 2); and, the square demarcates distance in the maximum space (p = ∞)

In order to accommodate Batty and Kwang’s (1992) suggest that a negative power function may be more appropriate than a negative exponential function for describing geographic variation in urban population density, the full distance expression is rewritten here as follows:

$$\hbox{Mode}\;1:{\left[ {1 + {\left({{\left| {u_{i} - U_{k} } \right|}^{p} + {\left| {v_{i} - V_{k} } \right|}^{p} } \right)}^{{1/p}} } \right]}^{\theta }, $$

where θ governs the role of a given distance metric. Consequently, p can only be interpreted conditional on θ. The addition of a 1 appears in this expression because the basic model structure is multiplicative in nature, and hence for zero distance an urban population density peak is preserved. Not surprisingly, parameters p and θ interact, which complicates non-linear estimation of the full set of parameters. This interaction can be minimized by recognizing selected limiting cases. As p goes to infinity, this distance expression converges on

$$\hbox{Mode}\;2: {\left[ {1 + \ln {\left({e^{{{\left| {u_{i} - U_{k} } \right|}}} + e^{{{\left| {v_{i} - V_{k} } \right|}}} } \right)}} \right]}^{\theta} .$$

As θ goes to 0, this distance expression converges on

$$\hbox{Model}\;3: \ln {\left[ {1 + {\left({{\left| {u_{i} - U_{k} } \right|}^{p} + {\left| {v_{i} - V_{k} } \right|}^{p} } \right)}^{1/p}} \right]},$$

which when exponentiated becomes the negative power function discussed by Batty and Kwang (1992). And, when p goes to infinity as θ goes to 0, this distance expression converges on

$$\hbox{Model}\;4: \ln {\left[ {1 + \ln {\left({e^{{{\left| {u_{i} - U_{k} } \right|}}} + e^{{{\left| {v_{i} - V_{{kj}} } \right|}}} } \right)}} \right]}.$$

In other words, the most general version of the distance function, Model 1, has three special cases that can be exploited to facilitate non-linear regression analysis.

These distance functions, coupled with a spatial autoregressive term and a polycentric model specification, render the following AR model description for urban population density, as a generalization of Eq. 3:

$$\ln {\left({\frac{{{\text{population}}_{i} }}{{{\text{area}}_{i} }}} \right)} = \rho {\sum\limits_{j = 1}^n {w_{{ij}} \ln {\left({\frac{{{\text{population}}_{j} }}{{{\text{area}}_{j} }}} \right)}} } + \ln {\left({{\sum\limits_{k = 1}^K {e^{{(1 - \rho)\alpha _{k} - \gamma _{k} \{ 1 + [(u_{i} - \overset{\lower0.5em\hbox{$\smash{\scriptscriptstyle\leftarrow}$}}{U} _{k})^{p} + (v_{i} - V_{k})^{p} ]^{{1/p}} \} ^{\theta } }} } }} \right)} + \varepsilon _{i}, $$
(5)

where the urban place has K centers, at locations (U k , V k ), that need to be estimated, and both the spatial autocorrelation and the nature of the distance metric are assumed to be constant throughout the urban landscape in question. This model specification allows for varying population density peaks (α k ) and varying distance decay rates (γ k ). The principal residential center, which should tend to be near the historical CBD of a city, should have the largest peak and the shallowest distance decay rate; importance of residential subcenters can be ascertained, to some degree, by their relative rankings with regard to these two parameters. Of note is that distance calculations were done for our analysis with polar coordinates in order to calculate great circle distances based upon (latitude, longitude) geocodings.

4 An empirical study of the 20 largest metropolitan areas in the US in 2000

To test how well our methodology works in describing monocentric and polycentric urban population density, we selected the 20 largest metropolitan areas in the US, according to population counts from the 2000 national decennial census (see Table 1). Figure 4 displays a map showing the locations of these selected urban places. Many are positioned in the northeast corridor, several are along the west coast, and some are in or near the Great Lakes industrial region of the country. The smallest of the 20 is San Juan in Puerto Rico. Overall, these 20 cities cover the coterminous US reasonable well (this excludes Alaska and Hawaii). We used the 2000 census data from Summary File 1A at the block group level of geographic resolution. The data are near-100% counts of the US population in 2000 (the undercount issue suggests 100% counts were not achieved). Block group data were selected because census tract level results appear to be too coarse (pertinent local spatial variability is obscured by geographic aggregation), while block level results have too many zeroes (large numbers of units house no residents; see Table 1 for the associated block group counts); block group data also allow some comparisons to be made with previous census geography analyses we have completed (Griffith et al. 2003). Because the purpose of this investigation is to describe urban population density gradients, we sought a balance between too much and too little smoothing across urban landscapes; too much smoothing tends to mask both subcenters and the role of spatial autocorrelation, whereas too little smoothing becomes overshadowed by detailed local variability that behaves more like dirtiness and noise. Results for the ln of population density appear in Table 1 (a translation parameter, d, has been added to each urban place): in most cases, ln(population density + d) comes reasonably close to mimicking a normal frequency distribution [according to the Kolmogorov–Smirnov (K–S) test statistics].

Table 1 Twenty largest US metropolitan regions (2000)
Fig. 4
figure 4

Locations of the 20 largest metropolitan areas in the US according to its 2000 national census

4.1 Preliminary spatial autoregressive monocentric findings

The starting point of our analysis is with an assumption of a single center for each urban place. This analysis helps reveal the nature of the distance function, as well as the location of the principal residential center, for each urban place. Cities such as Dallas exhibit an expected shallow distance decay effect, cities such as Detroit exhibit an expected sharp distance decay effect, and cities such as Washington, DC, exhibit a confused distance decay effect (because effects of the pair of centers located in downtown DC and downtown Baltimore essentially are being traded off against each other). Finally, pseudo-R 2 values (see Cameron and Windmeijer 1996, 1997; Mittlböck and Schemper 1999) help index the amount of geographic variability in ln(population density + d) being accounted for by global distance from a single center point. Results appearing in Table 2 are mostly comparable to monocentric results reported in the literature several decades ago.

Table 2 Single center population density model, with exponentially weighted great circle Minkowskian distance

Ultimately we incorporate a Minkowskian distance function in our population density equation. Impacts of this change in metric space are exemplified by comparing estimation results reported in Tables 2 and 3; a graphic comparison appears in Fig. 5. Generalizing the distance function has three conspicuous outcomes here. First, estimates of the maximum population density as well as the distance decay parameter tend to be altered. Second, the nature of the distance function may change [see New York city (NYC), DT, and SJ]; although two of these metropolitan areas are on the border of a different distance function, SJ is not (see Table 2). Third, the estimated urban centroids and the percentage of variance accounted for essentially are the same— \(\hat{\theta}\) and biases in parameter estimates essentially compensate for an incorrect Euclidean distance specification.

Table 3 Single center population density model, with exponentially weighted great circle Euclidean distance and p = 2
Fig. 5
figure 5

Graphical comparisons of Eq. 1 estimation results based upon Euclidean (vertical axis) and Minkowskian (horizontal axis) metrics. a Top left Population density at distance 0. b Top middle Distance decay parameter. c Top right Urban center longitude coordinate. d Bottom left Urban center latitude coordinate. e Bottom middle Distance function exponent. f Bottom right Multiple regression R 2 value

Varying values of p across urban areas indicate that the elasticity of distance can change from one urban place to another. The non-Euclidean nature of urban space arises not only from the inability of straight-line movement due to physical obstacles (e.g., buildings and fences), but also from the density and layout of a traffic network, the presence of one-way streets, the presence of limited access roads, geographic variation in road speed limits, and the extent and effectiveness of a place’s public mass transit system. Dallas and St. Louis appear to be closest to having an Euclidean metric space (see Table 2). Boston, Chicago, Houston, Minneapolis, and San Diego appear to be closest to having a Manhattan metric space (see Table 2). Cleveland, Miami, and San Juan appear to have competitive north–south and east–west distance axis corridors, which may arise from their truncation of the two-dimensional spheres of influence by major water bodies. Future research is needed to more fully understand what features of urban places co-vary with p.

Next, our analysis incorporates a spatial autoregressive term into the urban population density equation. Because of the very large numbers of block groups involved (see Table 1), Jacobian approximations (see, e.g., Griffith 2004) were employed in order to estimate the AR models. Estimation results for this expanded model specification appear in Table 4. Not surprisingly, all of the pseudo-R 2 values are noticeably greater than their R 2 counterparts reported in Table 2 (see Fig. 6). Detected spatial autocorrelation is moderate-to-strong and positive in all urban places. For all practical purposes, the estimated monocentric center estimates remain the same with and without a spatial autoregressive term in the model specification. But the presence of an autoregressive term tends to result in a small increase in peak estimates coupled with a dramatic shrinkage of distance decay parameter values. Except for some detail, the estimated Minkowskian metric tends to remain the same across the set of urban places (the sole exception is Atlanta), whereas the distance power transformation parameter estimate \((\hat{\theta})\) essentially remains unchanged, when an autoregressive term is present.

Table 4 Single center autoregressive population density model, with exponentially weighted great circle distance
Fig. 6
figure 6

Pseudo-R 2 values for the monocentric population density models describing the 20 metropolitan areas, with and without a spatial autoregressive parameter

The moderate-to-strong positive spatial autocorrelation detected across the 20 cities indicates that positive spatial externalities are helping to mold the geographic distributions of population density across each of the urban areas. In other words, forces at work (e.g., zoning, location rent, economies of scale in housing and infrastructure provision, congestion, and accessibility) tend to concentrate households in such a way that relative density increases in a few locations, while decreasing in many other locations.

5 Polycentric urban population density surfaces

The preceding analysis assumes that all 20 metropolitan areas have single centers—a monocentric city structure. But visual inspection of their corresponding population density maps suggests that a number of these cities have more than one center. Figure 7 shows two examples: the Washington, DC metropolitan area includes two centers, the city of Washington, DC, and the city of Baltimore; and the Cleveland metropolitan area appears to have three centers, two in the city itself and the suburban center of Akron. Therefore, a more analytical methodology is needed to determine if a metropolitan area has multiple centers in order to guide the modeling of urban population density gradients.

Fig. 7
figure 7

Examples of polycentric metropolitan areas: the Washington, DC MSA, and the Cleveland MSA

5.1 The use of LISA and Getis–Ord G statistics for identifying urban centers and subcenters

One may choose to use visual inspection to identify multiple centers, an approach that not only is subjective, but also is unreliable in terms of determining whether or not any given population density concentration is a significant cluster. In identifying employment subcenters, McMillen (2003) uses a locally weighted regression model to generate a density surface, and subcenters are designated as contiguous areal units with residuals higher than this surface. This approach assumes that some global pattern exists across an urban landscape, and subcenters are significant deviations from this global pattern. The approach we adopt here does not assume the presence of a global pattern and follows the recent literature in hot spots detection using local statistics. Conceptually, local statistics can detect hot spots, which are peaks of urban population density, and hence centers. One logical quantitative choice is local indicators of spatial associations (LISA; Anselin 1995). Han (2005) uses both the global Moran’s I and local Moran statistics to identify centers based upon condominium property values. Meanwhile, Feser et al. (2005) use the Getis–Ord local G statistic (Getis and Ord 1992) to identify industrial complexes. Our exploratory analyses highlight that the local Moran’s I can effectively identify urban clusters with similar population density levels, but fails to distinguished between high-density (i.e., hot spots) and low-density (i.e., cold spots) clusters, as local Moran’s I is a function of the similarity of neighboring values. However, we are interested only in high-density clusters. For example, the local Moran’s I identified a cluster in the central part of the Phoenix metropolitan area that actually consists of block groups with low-population density.

Our findings are not surprising, since a major strength of the local G statistic is its ability to differentiate between hot (high-density clusters) and cold (low-density clusters) spots. In our research context, we only need to identify clusters of high-population density—hot spots. Therefore, we computed the local G statistic for each block group based upon population density. Specifically,

$$G_{i} (d) = \frac{{{\sum\nolimits_j {w_{{ij}} (d)x_{i} } }}}{{{\sum\nolimits_j {x_{j} } }}},$$
(6)

where x i is the raw population density value of block group i, and w ij (d) is essentially the binary neighborhood function using the contiguity constraint in the queen’s case, and i can equal to j. Given the expected value and variance for each block group (Getis and Ord 1992), the local G statistic was standardized such that a z-score was calculated for each block group.

Theoretically, based upon a 2.5% level of significance, areas with local G statistic z-scores larger than 1.96 might be regarded as significant clusters; but this criterion identifies an enormous number of such clusters here. Using a more stringent significance level of 0.01, we classified all block groups with a z-score of 2.58 or higher as forming significant clusters. This adjustment helps counterbalance the impacts of multiple testing. By examining the distributions of these clusters, we determined if there are multiple centers in each metropolitan area, using coordinates of the peaks in these clusters as the starting values for estimating center locations. Figure 8 presents the six urban places (Washington DC, Cleveland, Atlanta, Detroit, San Francisco, and Los Angeles) that have the most conspicuous multiple population centers. These cities require special attention when determining the locations of their centers and subcenters, and when estimating their population density gradients.

Fig. 8
figure 8

Six polycentric metropolitan areas with their analytical principal centers and subcenters: areas with the highest saturation of red have a z-score of 2.58 or higher

5.2 Estimation results for the polycentric model specifications

Table 5 reports estimation results using multiple centers, which replace the single center results for six of the cities that are reported in Table 4. The model specification is an additive form of multiple center effects, although a multiplicative form also could be employed (see Heikkila et al. 1989). Although failure of the polycentric models to out-perform their monocentric model counterparts in terms of the pseudo-R 2 criterion seems counter-intuitive, this feature of the analysis highlights how well an autoregressive term can compensate for missing variables and/or specification error (i.e., the spatial autoregressive parameter estimate, \(\hat\rho,\) tends to be higher for monocentric specifications). In other words, while using a monocentric specification when modeling urban population density in a city with multiple centers introduces misspecification error into an analysis, a spatial autoregressive model specification still performs very well in terms of variance explained—because of the relative smoothness of a population density surface, geographic patterns attributable to the misspecification are highly positively spatially autocorrelated. This finding indicates that not only introducing an autoregressive term into, but also capturing the existence of multiple centers in a population density model specification makes a difference.

Table 5 Multiple center autoregressive population density model, with exponentially weighted great circle distance

But this somewhat counter-intuitive result is quite logical. The single-center model contains misspecification error by ignoring the presence of subcenters in a polycentric urban region. By including a spatial autoregressive term in the model specification, local variability in subcenters is modeled as local effects, which are very effectively captured by an autoregressive parameter. Although local centers are not explicitly recognized and incorporated into a single-center model, the local effects are modeled implicitly with the autoregressive term. As a result, multi-center models do not out-perform their single-center model counterparts. In other words, a single-center model with an autoregressive specification is very robust in dealing with complicated urban structure, as long as the structure is not spatially random. Thus, this robust modeling approach can be extended to model other urban phenomena besides population density that contain both a global trend and local effects.

6 Conclusion and implications

In conclusion, the time has come for urban residential modeling to employ polycentric specifications when they are applicable, while accounting for spatial autocorrelation. Sound descriptions of this type are needed when undertaking research on such higher-level problems as spatial mismatch and wasteful commuting. Results summarized here indicate that such estimation is feasible, even for very large numbers of areal units. Of note is that inclusion of an autoregressive terms does a very good job of compensating for misspecification error/missing variables. In addition, the global distance function needs to be written as a Minkowskian metric in order to capture the elasticity of space created by spatial dependency effects. Empirical evidence is offered here supporting the contention that the Minkownskian metric appears to better characterize geographic distributions of urban population density.

To some extent, the spatial autoregressive term compensates for missing variables, especially those affiliated with spatial externalities. It also accounts for spatial spillover effects: similar housing tends to concentrate in geographic space. Meanwhile, the Minkowskian metric indicates the degree of elasticity in a given geographic space, the degree of difficulty associated with movement between close and distant points in an urban landscape. Both of these features of a metropolitan region impact upon the behaviorally related distance decay parameter. Therefore, adjusting for the effects of spatial autocorrelation and the non-Euclidean nature of urban space removes estimation bias from distance decay estimates, leading to a better depiction and understanding of urban spatial structure.

The spatial regression models employed here offer better explanatory power than do models without a spatial autoregressive term, a finding that has important implications. The traditional models very much claim that population density is a function of distance from a city center. Thus, one should use distance from the city center to estimate the population density of a given location. But this logic seems counterintuitive and is inconsistent with common practice. If one wants to predict or estimate the population density of a given location, the common starting point is to examine neighboring locations to obtain an initial estimate. This is what the spatial autoregressive model does, which parallels spatial interpolation (e.g., kriging) that exploits redundant locational information. The local or neighboring density is a good predictor of population density, which is further impacted by the global variable of distance from a city center. Therefore, one important finding from the research summarized in this paper is a practical methodological framework to implement this common and realistic practice.

Analysis of the set of selected metropolitan areas reveals several interesting empirical regularities:

  1. 1.

    relatively few of the urban places display conspicuous urban population density polycentricity;

  2. 2.

    moderate-to-strong positive spatial autocorrelation characterizes urban population density across the set of 20 largest US metropolitan areas in 2000;

  3. 3.

    roughly two-thirds of the geographic variability in log-urban population density can be accounted for by spatial autocorrelation and global distance effects;

  4. 4.

    inclusion of an autoregressive term in an urban population density model changes the qualitative nature of the metric-distance function in only two of 20 cases (Atlanta, Miami; see Table 6); and,

  5. 5.

    the polycentric model specification changes the qualitative nature of the metric-distance function in only three of 20 cases (Los Angeles, Washington DC, Detroit; see Table 6).

Table 6 Minkowskian metric-distance function model

Of note is that the uncovered subcenters in the six identified polycentric urban places appear to be sensible (see Figs. 7, 8), and can be rank ordered according to their parameter estimates (see Table 5). For example, for Cleveland,

  • principal center: \(6.93{\text{e}}^{{ - 0.10d^{*}_{{ij}} }} \)

  • nearby subcenter: \(4.19{\text{e}}^{{ - 0.75d^{*}_{{ij}} }} \)

  • Akron subcenter: \(2.39{\text{e}}^{{ - 1.29d^{*}_{{ij}} }} \)

This geographic landscape presented particularly challenging estimation problems because the first subcenter is very close to the principal center (see Figs. 7, 8).

Even the more sophisticated polycentric model specifications render non-normal residuals, suggesting that perhaps a Poisson probability model should be employed. Planned future research includes capturing spatial autocorrelation effects with a spatial filter specification, which will allow a Poisson regression to be executed. This generalized linear model regression analysis will use population counts as the response variable, and include log-area as an offset variable.

Finally, one of the goals of this paper is to furnish an improved methodology for modeling urban population density using the empirical approach adopted by many researchers cited throughout this paper. Although our statistical approach is not explicitly built upon economic theories, the results we obtain are consistent with those derived from urban economic models. As mentioned previously, the difference in perspective is one of theory-driven and data-driven modeling. Our findings include that introducing multiple centers and spatial autoregressive terms can make a difference in urban landscape descriptions, that spatial autocorrelation and proximity vary across metropolitan areas, that distance decay covaries with the level of spatial autocorrelation and the nature of a distance metric, and hence its estimate may be biased in the absence of explicitly incorporating these features in a population density model specification, and that subcenters can be identified using the Getis–Ord local G statistic.