Abstract
This paper focuses on industrial location, assuming that entrepreneurs consider not only the advantages associated with a given municipality, but also those coming from nearby areas. Exploratory analysis reveals spatial patterns in the creation of manufacturing establishments and sheds light on the geographical scope over which agglomeration economies operate in industrial location. Spatial Probit models and standard Probit models with spatially lagged explanatory variables are estimated to test whether neighboring municipalities’ location decisions and characteristics, including agglomeration economies, matter in industrial location choices. Results show that neighboring municipalities’ location decisions and characteristics help to explain the location decisions of new establishments for 11 manufacturing industries in Spanish municipalities (NUTS V) over the period 1991–1995.
1 Introduction
Since Alfred Marshall’s pioneering Principles of Economics, a common theme in Urban and Regional Economics has been that the agglomeration of similar firms can boost firm productivity. Agglomeration economies are thus a key variable in the location decision process. Usually, only firms located in small areas, such as the city of Prato (Italy) or Silicon Valley (USA) (often referred to as Marshallian industrial districts), are supposed to reap the advantages of agglomeration economies. However, one can expect that spillovers and other advantages derived from agglomeration economies might also benefit plants locating in nearby areas, in addition to those in the same town or municipality (Ellison and Glaeser 1997). This issue is related to the so-called geographical scope of agglomeration economies, which are commonly assumed to attenuate with distance. From that perspective, the aim of this paper is to analyze whether the location decisions of manufacturing plants in Spanish municipalities are related to the location decisions taken in neighboring municipalities, and to give insight into the reasons for this agglomerative behavior. To do so, we apply Spatial Econometric techniques to study the location decisions of 11 industries in Spanish municipalities.
Firms may cluster for many reasons, such as history, random events, natural advantages, or agglomeration economies (Marshall 1890; Krugman 1991a, b; Ellison and Glaeser 1997; Ellison et al. 2010)Footnote 1. The most usual classification of agglomeration economies comprises urbanization economies, when the industrial mix is diverse and firms also benefit from the services and facilities of urban areas, and localization economies or Marshallian external economies, when the advantages of clustering derive from the same industry (Hoover 1948)Footnote 2. According to Marshall (1890), the sources of the so-called agglomeration economies are: shared input marketsFootnote 3; labor market poolingFootnote 4; and human capital and knowledge spilloversFootnote 5. A concept similar to localization economies is that of the so-called MAR externalities, named after Marshall (1890), Arrow (1962) and Romer (1986), which arise when the agglomeration of firms occurs in an oligopolistic environment (Glaeser et al. 1992).
Most analyses of Marshallian externalities have focused on the aforementioned sources of agglomeration economiesFootnote 6, and on the so-called industrial scope, which deals with the distinction between localization economies and urbanization economiesFootnote 7. However, as pointed out in Rosenthal and Strange (2004), less attention has been paid to the other dimensions over which agglomeration economies extend: the temporal scope and the geographic scope. The temporal scope is related to whether the effects of these economies are felt immediately or with a time lag, since there may be static agglomeration economies and dynamic agglomeration economies (see Glaeser et al. 1992; Henderson 1997). The geographic scope deals with the attenuation of the benefits of agglomeration with physical distance, since, ceteris paribus, when economic agents are closer there is more potential for interaction. This paper focuses on this geographical dimension of agglomeration economies, using data from Spanish municipalities.
Little work has been done on the geographic scope of agglomeration economies, and existing studies provide only limited evidence of benefits extending beyond town limits. Using US zip codes, Rosenthal and Strange (2003) show that the geographic scope of localization economies seems larger than that of urbanization economies. They found that employment outside the industry of focus had an inconsistent and frequently insignificant effect. For the Spanish municipalities, Viladecans-Marsal (2004), who limits her analysis to the most populous Spanish cities (over 15,000 inhabitants), found that urbanization economies influence location in most industries, while localization economies played a minor role, and that agglomeration effects only spilled over the city borders in three of the six manufacturing industries analyzed. Using similar techniques, but studying Catalan municipalities, Jofré-Montseny (2009) found evidence of the geographical scope of localization economies for the textile and wood and furniture industries, and of urbanization economies in medical, precision and optical instruments, chemical products, and metal products except for machinery industries.
On the other hand, Soest et al. (2006), working with zip code data from a Dutch province, conclude that agglomeration economies may well operate on a geographic scale that is smaller than a city, since they only found evidence for interurban externalities for manufacturing, which is analyzed as a single industry. Simmie (1998), Suárez-Villa and Alrod (1998), and Arita and McCann (2000) also cast doubt on the spatial extent of agglomeration.
According to Tobler’s first law of Geography, “everything is related to everything else, but near things are more related than distant things” (Tobler 1970)Footnote 8. That sentence is often used to explain the concept of spatial dependence or spatial autocorrelation, and to justify the need to check for spatial autocorrelation when dealing with spatial data and processes. There is spatial dependence or autocorrelation when the values of a variable in a certain location are related to the values of the same variable in neighboring locations. Surprisingly, spatial autocorrelation is seldom taken into consideration in industrial location decision analysis. Most of the studies referenced above are therefore based mainly on non-spatial regression analysisFootnote 9, which limits their findings. To properly capture the geographical scope of agglomeration economies, controls for spatial dependence should be usedFootnote 10. Spatial tools allow location decisions to be influenced by the decisions of firms in neighboring or nearby municipalities. Ignoring these influences can cause a variety of issues in an empirical analysis.
The aim of this paper is to analyze the extent of dependence in location decisions between neighboring municipalities. Instead of building or testing a comprehensive or sophisticated location decision model, we focus on the similarities or dissimilarities of those location decisions among neighboring municipalities.
We apply Spatial Econometrics (Spatial Probit models and non-spatial Probit models with spatially lagged explanatory variables) to estimate a simple location decision model, and Spatial Statistics techniques (BB Join Count statistics and Moran’s I statistic) to analyze the spatial allocation of new manufacturing establishments in Spanish municipalities. Both methods examine spatial dependence in location decisions. Our dataset comprises the continental Spanish municipalities and 11 industries.
This paper is organized as follows. Section 2 presents a simple location decision model, the methodology for both the exploratory and the confirmatory analysis, and the data. Results are shown in Sect. 3. Finally, the main conclusions of this research are set out in Sect. 4.
2 A simple location model, the statistical methodology, the spatial unit of analysis and the data
In this section, we introduce the model, the spatial econometrics and spatial statistics techniques that will be implemented in the next section, some considerations about the spatial unit of analysis, and the data.
2.1 Econometric specification
Usually, location models are constructed considering the location decision problem as one of “random” profit maximizationFootnote 11 (Figueiredo et al. 2002). Following McFadden (1974) and Carlton (1983), it is considered that if an entrepreneur, who previously decided to open a new establishment in manufacturing industry j, locates in municipality i it will produce a potential profit of \(\pi _{ij}\). Formally,
where \(X_i\) reflects internal characteristics of municipality i and \(\varepsilon _{ij}\) stands for a random variable, which is expected to be distributed independently. So, this entrepreneur will locate in municipality i if the potential profit is greater than in other municipalities, m, for instance, that is
where \(i \ne m\). This profit depends on a set of local characteristics, and it is usually expressed as a linear combination of these characteristics (Figueiredo et al. 2002). Thus, in our case this profit would also depend on the characteristics of the neighboring area
where the explanatory variables \(X_i\) and WX account for the local characteristics which impact on profits and for the relevant characteristics of the neighboring municipalities, respectively. W is a spatial weights matrix (SWM), where \(w_{ij}\) is set to 1 if municipality i and municipality j are considered neighbors, and to zero otherwise (in \(w_{ij}\), j indexes a second municipality rather than an industry). So, WX could be substituted by \(W\pi _{ij}\).
As it is not possible to observe \(\pi _{ij}\) (Ellison and Glaeser 1997), the dependent variable of location models is usually the number of new establishments or new firms created over a period of time, LOC. So, we may express LOC as a linear combination of independent variables from equation (3)
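One way to write the expressions described in this subsection is sketched below. It is consistent with the definitions given in the text, but it is not necessarily the authors' exact notation: the generic function \(f\) and the coefficient vectors \(\beta \) and \(\theta \) are assumed names introduced only for illustration.

\[ \pi _{ij} = f(X_i) + \varepsilon _{ij} \]

\[ \pi _{ij} > \pi _{mj}, \quad i \ne m \]

\[ \pi _{ij} = X_i \beta + WX\,\theta + \varepsilon _{ij} \]

\[ {\textit{LOC}}_{ij} = X_i \beta + WX\,\theta + \varepsilon _{ij} \]

In this form, a significant \(\theta \) would indicate that the characteristics of neighboring municipalities help to explain location decisions in municipality i.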
Location decision models are usually estimated using limited dependent variable models, i.e., Logit, Probit or Poisson specificationsFootnote 12. However, there are potentially a variety of unobserved (or difficult to quantify) influences that could cause location decisions to be spatially dependent. For instance, some areas may have better infrastructure or road networks that are conducive to manufacturing. If \({\textit{LOC}}_{ij}\) depends on what happens in neighboring municipalities, the assumption of an independently distributed \(\varepsilon _{ij}\) is too strong. Two popular tests of spatial dependence are described in Sect. 2.3. The existence of spatial autocorrelation invalidates the use of most usual statistical and econometric techniques, such as ordinary least squares, or the basic Logit or Probit modelsFootnote 13. If those models are used on spatially dependent data, biased or inefficient results will be obtained.
Spatial autocorrelation in data and processes may be treated in different ways. A simple approach is to try to remove it from the datasetFootnote 14, but this is often not sufficient. Alternatively, spatial controls can be included in the specification of the model. The two most common approaches to the latter method are the spatial autoregressive model (SAR) and the spatial error model (SEM)Footnote 15.
Three models will be estimated for each manufacturing industry: a standard Probit with spatially lagged explanatory variables, (PLEV), a Bayesian spatial autoregressive Probit, (SARP), and a Bayesian spatial error Probit, (SEMP).
As changes in explanatory variables for municipality i will have a direct impact on the location decisions of municipality i, as well as an indirect or spatial spillover impact on neighbors, following Lesage and Pace (2009) we will estimate total, direct and indirect effects of SARP models.
However, since the direct and indirect effects of SAR models are global (Lesage and Pace 2009) and location processes may be more localized, we will also estimate SEMP models with spatially lagged explanatory variables.
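The direct, indirect and total effects mentioned above can be illustrated with a minimal sketch for the linear (latent) part of a SAR model, following the summary measures in Lesage and Pace (2009): the effects of variable r are derived from the matrix \((I - \rho W)^{-1}\beta _r\). This is only an illustration under stated assumptions: the function name `sar_effects`, the toy four-unit weights matrix and the parameter values are hypothetical, W is assumed to be row-standardized, and the additional weighting by the normal density used for Probit effects is omitted.

```python
import numpy as np

def sar_effects(W, rho, beta_r):
    """Scalar summary effects of one explanatory variable in a linear SAR model.

    S = (I - rho * W)^{-1} * beta_r
    direct   = average of the diagonal of S (own-municipality impact)
    total    = average row sum of S
    indirect = total - direct (spillover to and from neighbors)
    """
    n = W.shape[0]
    S = np.linalg.inv(np.eye(n) - rho * W) * beta_r
    direct = np.trace(S) / n
    total = S.sum() / n
    return direct, total - direct, total

# Hypothetical example: four municipalities on a line, row-standardized weights.
W_rs = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 1.0, 0.0],
])
direct, indirect, total = sar_effects(W_rs, rho=0.5, beta_r=2.0)
print(f"direct={direct:.3f} indirect={indirect:.3f} total={total:.3f}")
```

Because the inverse \((I - \rho W)^{-1}\) is generally a full matrix, a change in one municipality propagates to all others, which is the sense in which these effects are global.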
As a dependent variable, we use \({\textit{LOC}}_{ij}\), a binary variable which is set to 1 if a location decision in industry j is implemented in municipality i over the period 1991–1995Footnote 16 and to 0 otherwise. We estimate an equation for each of the eleven manufacturing industries considered. The usual approach to this type of data would be a Probit or Logit modelFootnote 17. In the presence of spatial autocorrelation, however, standard Logit and Probit models are not very useful since \(\varepsilon \) does not follow a normal distribution. Most spatial econometric models with a continuous dependent variable are estimated by maximum likelihood. However, with a binary dependent variable, there is no closed form solution to Probit or Logit probabilities (Anselin 2002; Lesage and Pace 2009).
We therefore use an alternative approach, which employs Bayesian methods to control for spatial dependence (Lesage 1997, 2000; Smith and Lesage 2002). Although there are other, less popular alternativesFootnote 18, such as generalized method of moments (GMM) estimation (Pinkse and Slade 1988) or the EM (expectation maximization) approach for error models (McMillen 1995), Bayesian methods represent the most comprehensive approach, with a range of support and previous literature. This approach, proposed in Lesage (1997, 2000) and Smith and Lesage (2002), “is the most flexible of the spatially dependent models because it can incorporate spatial lag dependence and spatial error dependence in addition to general heteroskedasticity, of unknown form” (Fleming 2004, pp. 166–167).
The Bayesian approach used here has its foundations in a non-spatial paper by Albert and Chib (1993), who model the binary dependent variable y as an indicator of unobserved latent utility \(y^*\) (Lesage and Pace 2009). The relationship between y and \(y^*\) is as follows: \(y_i = 1\) if \(y_i^* \ge 0\), and \(y_i = 0\) if \(y_i^*< 0\). In the present application, when the net utility of locating in municipality i is non-negative (\(y_i^*\ge 0\)), \(y_i = 1\) and the firm selects i for its location. Albert and Chib (1993) recognized that \(p(\beta ,\sigma ^2 \vert y^*)=p(\beta , \sigma ^2 \vert y^*,y)\), since if you have \(y^*\) you have all the information needed to create y. This significantly simplifies the problem, because if \(y^*\) is added as an additional parameter to be estimated, then the joint conditional posterior distribution of \(\beta \) and \(\sigma ^2\) takes the same form as in a Bayesian regression with a continuous dependent variable (Lesage and Pace 2009; LeSage et al. 2011).
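The data-augmentation idea can be sketched for the simplest case. The code below is a minimal non-spatial, homoskedastic Probit sampler in the spirit of Albert and Chib (1993), not the spatial samplers estimated in this paper: it assumes a flat prior on \(\beta \), fixes \(\sigma ^2 = 1\), and draws the latent \(y^*\) from a truncated normal by inverting the normal CDF. The function name `albert_chib_probit` and its default settings are hypothetical.

```python
import numpy as np
from scipy.stats import norm

def albert_chib_probit(X, y, n_draws=1000, burn_in=250, seed=0):
    """Gibbs sampler for a non-spatial Probit via latent-variable augmentation.

    Alternates between
      1. y* | beta, y : N(X beta, 1) truncated to y* >= 0 if y = 1, y* < 0 if y = 0
      2. beta | y*    : N((X'X)^{-1} X'y*, (X'X)^{-1})   (flat prior, sigma^2 = 1)
    Returns the retained draws of beta, one row per draw.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    chol = np.linalg.cholesky(XtX_inv)  # chol @ chol.T equals (X'X)^{-1}
    beta = np.zeros(k)
    kept = np.empty((n_draws, k))
    for it in range(burn_in + n_draws):
        mu = X @ beta
        # Step 1: inverse-CDF draw from the truncated normal for each y*_i.
        p_neg = norm.cdf(-mu)            # P(y*_i < 0 | beta)
        u = rng.uniform(size=n)
        u = np.where(y == 1, p_neg + u * (1.0 - p_neg), u * p_neg)
        u = np.clip(u, 1e-12, 1.0 - 1e-12)  # guard against infinite quantiles
        y_star = mu + norm.ppf(u)
        # Step 2: with y* in hand, beta has a standard regression posterior.
        beta_hat = XtX_inv @ (X.T @ y_star)
        beta = beta_hat + chol @ rng.standard_normal(k)
        if it >= burn_in:
            kept[it - burn_in] = beta
    return kept
```

Given a design matrix `X` and a 0/1 vector `y`, `albert_chib_probit(X, y).mean(axis=0)` gives posterior means of the coefficients. The spatial versions add a further sampling step for \(\rho \) or \(\lambda \), and for the variance terms, within the same loop.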
Instead of having to numerically integrate over the conditional distributions, Albert and Chib’s (1993) contribution allows us to use Bayesian Markov chain Monte Carlo methods to sample each parameter from its conditional distribution. After numerous iterations of this sampling algorithm, a set of draws is produced that converges to the unconditional joint posterior distribution (full details are contained in Lesage and Pace (2009)). For instance, the conditional distributions of \(\rho \) in the SAR model, and \(\lambda \) in the SEM model, are as followsFootnote 19.
For the number of iterations, we use 10,000 draws along with a 2500 draw “burn in”, which is discarded, but used to better calibrate the initial parameter values. To determine whether this number of draws is sufficient, Raftery–Lewis convergence diagnostics are employed. Although we implement several tests of spatial dependence below, there is not a robust method of choosing between the SAR and SEM models in the context of a binary dependent variableFootnote 20. Consequently, both models are presented below.
2.2 Data sources and location determinants
Location models try to explain how certain variables may influence location decisions. Most empirical work groups these variables into categories such as supply factors, demand factors, external economies and diseconomies, etc. (Guimarães et al. 2004). Since our central focus is the spatial influence of neighboring municipalities, we do not carry out an extensive analysis of location determinantsFootnote 21. As explained later, this is also due to the lack of data for NUTS V in Spain with regard to location factors such as labor cost, land prices or taxesFootnote 22, etc. The location determinants we take into consideration are: human capital as a supply factor; municipality product as a demand factor; local external economies (localization and urbanization); and the role of neighboring municipalities’ location decisions and characteristics.
The human capital index, \({HC}_i\), is defined as the percentage of population with at least a secondary school degree in municipality i in 1991. The expected sign is positive since it reflects the skilled labor market. Municipality product in 1991, \({MP}_i\), reflects the volume of economic activity in the municipality, the potential market for new firms, so its expected sign is positive.
External economies are represented by the classic location quotient and by a diversity index.
The location quotient, \({LQ}_{ij}\) represents the advantages of geographical specialization of municipality i in industry j, that is, traditional localization economies, Marshallian externalities or MAR’s agglomeration economies in 1990. Its expected sign is positive. Since higher \({LQ}_{ij}\) may be caused both by a large number of small firms and by a small number of large firms, besides localization externalities it may also reflect the effects of concentration or internal returns of scale. It is defined as follows:
where \(E_{ij}\) accounts for total employment in manufacturing activity j in municipality i, \(E_i\) for total employment in municipality i, \(E_J\) for national employment in manufacturing activity j, and \(E_T\) for total national employment in all manufacturing activities.
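As an illustration, the classic location quotient implied by these definitions is the ratio of the local employment share to the national one, \((E_{ij}/E_i)/(E_J/E_T)\). The sketch below assumes this form and, for simplicity, computes all totals from a single municipality-by-industry employment matrix; the function name `location_quotient` and the toy figures are hypothetical.

```python
import numpy as np

def location_quotient(E):
    """Location quotients from an employment matrix E[i, j].

    Rows are municipalities, columns are manufacturing industries.
    LQ[i, j] = (E_ij / E_i) / (E_J / E_T), with totals taken from E itself.
    """
    E = np.asarray(E, dtype=float)
    E_i = E.sum(axis=1, keepdims=True)  # employment in municipality i
    E_J = E.sum(axis=0, keepdims=True)  # national employment in industry j
    E_T = E.sum()                       # total national employment
    return (E / E_i) / (E_J / E_T)

# Hypothetical example: two municipalities, two industries.
employment = np.array([[10.0, 30.0],
                       [30.0, 30.0]])
lq = location_quotient(employment)
print(lq)
```

A value above 1 indicates that the municipality is more specialized in the industry than the country as a whole.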
\({DI}_i\) is a manufacturing diversification index for municipality i in 1990. The expected sign of this variable is positive since manufacturing diversity may reflect the existence of inter-industrial external economies, such as the Jacobs type (Jacobs 1969; Glaeser et al. 1992), and also because the creation of new plants is biased toward more diversified cities (Duranton and Puga 2000). This index is based on the correction for differences in sectoral employment shares at the national level of the inverse of a Hirschman–Herfindahl index proposed in Duranton and Puga (2000):
where \(s_{ij}\) is the share of manufacturing industry j in manufacturing employment in municipality i, and \(s_j\) is the share of manufacturing industry j in total national manufacturing employment.
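A minimal sketch of such an index follows, assuming it takes the relative-diversity form of Duranton and Puga (2000), that is, the inverse of the summed absolute differences between local and national industry shares, \(1/\sum _j \vert s_{ij} - s_j\vert \). The function name `diversity_index` and the toy employment matrix are hypothetical.

```python
import numpy as np

def diversity_index(E):
    """Relative diversity index for each municipality.

    E[i, j] is manufacturing employment of industry j in municipality i.
    DI_i = 1 / sum_j |s_ij - s_j|, where s_ij is the local share of industry j
    and s_j is its national share. A municipality that mirrors the national
    industry mix exactly gets an infinite index.
    """
    E = np.asarray(E, dtype=float)
    s_ij = E / E.sum(axis=1, keepdims=True)  # local industry shares
    s_j = E.sum(axis=0) / E.sum()            # national industry shares
    gap = np.abs(s_ij - s_j).sum(axis=1)
    with np.errstate(divide="ignore"):
        return 1.0 / gap

# Hypothetical example: two municipalities, two industries.
employment_di = np.array([[10.0, 30.0],
                          [30.0, 30.0]])
di = diversity_index(employment_di)
print(di)
```

The index rises as the local industry mix approaches the national one, so higher values indicate a more diversified municipality.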
Finally, we consider the potential role of neighboring municipalities \({NM}_i\), that is, location decisions of neighboring municipalities and the characteristics of neighboring municipalities. It may be measured by the spatially lagged independent variables in a standard (non-spatial) Probit model and in spatial error models, (\({WHC}_i\), \({WLQ}_{ij}\), \({WDI}_i\) and \({WMP}_i\)), where W is an SWM, and by the spatially lagged dependent variable in a Spatial Autoregressive Probit modelFootnote 23, (\({WLOC}_i\)). While \({WHC}_i\) and \({WMP}_i\) account for the human capital and the potential market of neighboring municipalities, \({WLQ}_{ij}\) and \({WDI}_i\) represent the geographical scope of agglomeration economies that originate in neighboring municipalities. Location decisions of neighboring municipalities in industry j are represented by \({WLOC}_i\). That is, \({WLOC}_i \) measures part of the geographical scope of location decisions.
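To make the spatially lagged variables concrete, the sketch below builds a binary contiguity SWM and computes a lagged variable such as \({WHC}_i\). Row-standardizing W, so that the lag is the average value among neighbors, is an assumption made here for illustration; the text does not state which standardization is used. The function names and the four-municipality example are hypothetical.

```python
import numpy as np

def contiguity_matrix(n, neighbor_pairs):
    """Binary SWM: w[i, h] = 1 if municipalities i and h are neighbors."""
    W = np.zeros((n, n))
    for i, h in neighbor_pairs:
        W[i, h] = 1.0
        W[h, i] = 1.0  # contiguity is symmetric
    return W

def spatial_lag(W, x, row_standardize=True):
    """Spatially lagged variable Wx.

    With row standardization, each entry is the mean of x over the neighbors
    of the municipality; without it, the sum over neighbors.
    """
    W = np.asarray(W, dtype=float)
    x = np.asarray(x, dtype=float)
    if row_standardize:
        row_sums = W.sum(axis=1, keepdims=True)
        # Municipalities without neighbors keep a zero row.
        W = np.divide(W, row_sums, out=np.zeros_like(W), where=row_sums > 0)
    return W @ x

# Hypothetical example: four municipalities arranged on a line.
W_line4 = contiguity_matrix(4, [(0, 1), (1, 2), (2, 3)])
hc = np.array([10.0, 20.0, 30.0, 40.0])  # human capital index by municipality
whc = spatial_lag(W_line4, hc)
print(whc)
```

The same function applied to \({LQ}_{ij}\), \({DI}_i\), \({MP}_i\) or \({\textit{LOC}}_{ij}\) would give the corresponding lagged regressors.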
Therefore, location decisions may be explained as a function of local and neighboring municipalities variables, such as agglomeration economies, human capital, and potential market through the following expression:
As Ottaviano and Puga (1998) point out, the literature on economic geography identifies economic agglomeration at different levels of aggregation, from the small scale, e.g., a highly specialized industrial district such as the city of Prato in Italy, to the large scale agglomerations that cut across states, such as the US “Manufacturing Belt” or the European “Hot Banana.” Since the geographic scope of agglomeration economies does not seem to be very large, as described in the previous section, we focus on Spanish municipalities (NUTS V). This seems a sensible choice for studying both the location of new manufacturing plants and the geographical scope of agglomeration economies (as shown in Holl (2004a), Jofré-Montseny (2009) and Viladecans-Marsal (2001, 2003, 2004)), since the average size of Spanish municipalities is \(64 \hbox { km}^{2}\), which is 1/3 of the average size of the U.S. zip codes analyzed in Rosenthal and Strange (2003), and around 85 % of the municipalities consideredFootnote 24 are smaller than \(100\hbox { km}^{2}\).
Nevertheless, working with Spanish municipalities also imposes a hard data constraint, since most municipality data relate to socio-demographic characteristics and are not usually up to date, because they are often produced for the decennial census or for other purposes. We could try to overcome this scarcity by using data at higher levels of spatial aggregation, such as NUTS III, as done in Holl (2004a) to proxy municipal wages, labor force qualification, sector and industry specialization, and industry share. Unfortunately, as is widely known in spatial analysis but often ignored in location analysis, our analysis could then be distorted by the so-called Modifiable Areal Unit Problem (MAUP)Footnote 25. The MAUP is a potential source of error in spatial studies that use aggregate data sources; it consists of both a scale problem and an aggregation problem, and is related to the concept of ecological fallacy (Unwin 1996; Bailey and Gatrell 1995). Thus, as our target is not to fully explain location decisions or location determinants, but to test whether location decisions in a municipality are related to those taken in neighboring municipalities, we will only consider NUTS V data.
The data sources used in our analysis are the Registro de Establecimientos Industriales (Industrial Establishments Register, REI), the Censo de Población 1991 (1991 Population Census), the Censo de Locales 1990 (1990 Establishments Census), and Alañón (2002). REI dataFootnote 26 allow us to study the spatial allocation of new manufacturing establishments in Spanish municipalities for 11 industries at the two-digit level of CNAE-93 (the Spanish classification of economic activities). The industries considered are: food and tobacco; clothes and leather; wood and furniture; printing and paper; chemistry; other nonmetallic minerals; first transformation of metals; machinery; computer, office equipment, etc.; electric and electronic equipment; and transport equipment. We have data from 1980 to 1998Footnote 27. The 1991 Population Census and the 1990 Establishments Census are the last Spanish censuses whose municipality data are available for all municipalities. Census data allow us to build indicators for the advantages derived from human capital and agglomeration economies. Alañón-Pardo (2002) provides the gross domestic product of Spanish municipalities for 1991.
Due to the restrictions of the data sources referred to above, while the spatial exploratory analysis will cover the 1980–1998 period, the regression analysis will be limited to the 1991–1995 period.
2.3 The spatial statistics tools
In this section, we introduce the BB Join Count statistic and Moran’s I statistic that will be applied to study the spatial allocation of new manufacturing plants in Spanish municipalities.
The BB Join Count TestFootnote 28 for spatial autocorrelation or spatial dependence reflects whether binary variables are clustered or randomly distributed in space. The BB Join Count Test is defined as follows:
where LOC is a binary variable, which is set to 1 when a manufacturing establishment is created over a period of time, and to 0 otherwise. \(w_{ih}\) is the (i, h)-th element of a spatial weights matrix W, which reflects whether municipalities i and h share a common border, that is, whether they are neighbors. Thus BB reflects the number of times a municipality where there have been manufacturing births is contiguous to another municipality where there have been manufacturing births. A positive and significant z-value for this statistic indicates positive autocorrelation, that is, for a given manufacturing industry, establishment births are more spatially clustered than might be caused purely by chance (Anselin 1992).
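The count described above can be sketched as follows, assuming the standard form of the statistic, \(BB = \frac{1}{2}\sum _i \sum _h w_{ih}\,{\textit{LOC}}_i\,{\textit{LOC}}_h\), with a symmetric binary W. The z-value below comes from random permutations of LOC across municipalities, which is a substitute chosen for simplicity here and not necessarily the inference procedure used in the paper. All function names and the toy line-shaped map are hypothetical.

```python
import numpy as np

def line_contiguity(n):
    """Binary contiguity matrix for n municipalities arranged on a line."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0
    W[idx + 1, idx] = 1.0
    return W

def bb_join_count(loc, W):
    """Number of neighboring pairs in which both municipalities have LOC = 1."""
    loc = np.asarray(loc, dtype=float)
    return 0.5 * (loc @ W @ loc)  # each pair is counted twice in the double sum

def bb_permutation_z(loc, W, n_perm=999, seed=0):
    """Pseudo z-value of BB against random reshuffling of LOC over space."""
    rng = np.random.default_rng(seed)
    loc = np.asarray(loc, dtype=float)
    observed = bb_join_count(loc, W)
    sims = np.array([bb_join_count(rng.permutation(loc), W) for _ in range(n_perm)])
    return (observed - sims.mean()) / sims.std(ddof=1)

# Hypothetical example: births concentrated in the first ten of twenty municipalities.
W_line20 = line_contiguity(20)
loc_clustered = np.array([1] * 10 + [0] * 10)
print(bb_join_count(loc_clustered, W_line20), bb_permutation_z(loc_clustered, W_line20))
```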
Using a measure of spatial autocorrelation for a binary variable seems sensible, since we are interested in whether the location decision is implemented or not. However, it could be argued that, in our case, the measure could produce misleading results, since LOC is a binary variable which does not account for the number of establishments created. The BB statistic will be the same whether one or many new establishments are created in the municipality.
In order to avoid this criticism, we will also apply Moran’s I statisticFootnote 29, which is defined as follows:
where N is the number of observations; \(w_{ih}\) is as defined above; \(x_i\) and \(x_h\) are the numbers of new establishments of a given manufacturing activity which have been set up in municipalities i and h, respectively; and \(S_0\) is a scaling constant, \(S_0 =\sum \nolimits _i {\sum \nolimits _h {w_{ih} } } \). A positive and significant z-value for this statistic indicates positive spatial autocorrelation, that is, municipalities which have been chosen as locations for the new entries in a given manufacturing activity tend to be close to each other.
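With these definitions, the statistic can be sketched in its standard form, \(I = \frac{N}{S_0}\,\frac{\sum _i \sum _h w_{ih}(x_i - \bar{x})(x_h - \bar{x})}{\sum _i (x_i - \bar{x})^2}\), which is assumed here to match the expression used in the paper. The function name `morans_i` and the four-municipality example are hypothetical, and the inference step (the z-value) is omitted.

```python
import numpy as np

def morans_i(x, W):
    """Moran's I for values x and spatial weights matrix W."""
    x = np.asarray(x, dtype=float)
    W = np.asarray(W, dtype=float)
    z = x - x.mean()   # deviations from the mean
    n = x.size
    s0 = W.sum()       # scaling constant S_0
    return (n / s0) * (z @ W @ z) / (z @ z)

# Hypothetical example: four municipalities on a line, binary contiguity.
W_moran = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])
births_trend = np.array([1.0, 2.0, 3.0, 4.0])           # similar values are adjacent
births_alternating = np.array([1.0, -1.0, 1.0, -1.0])   # neighbors always differ
print(morans_i(births_trend, W_moran), morans_i(births_alternating, W_moran))
```

In the first pattern neighbors resemble each other and the statistic is positive; in the second every neighbor has the opposite value and the statistic is negative.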
If the BB Join Count statistic and Moran’s I statistic show that there is spatial autocorrelation in location decisions and in the creation of new manufacturing establishments, respectively, this does not necessarily mean that the spatial co-location is due to Marshallian agglomeration economies, since firms may cluster because of history, random events, natural advantages, etc., as noted in the introduction. So, if the location decisions and the establishment births are spatially autocorrelated, we will apply Moran’s I statistic to the location quotient of the 11 manufacturing industries considered. The location quotient, \({LQ}_{ij}\), represents advantages of geographical specialization, traditional localization economies, Marshallian externalities or MAR’s type agglomeration economies. If the location quotient, or municipality specialization in a given industry, is autocorrelated in space, then location decisions and establishment births may be autocorrelated in space in order to get the advantages derived from a specialized environment.
3 Results
3.1 Exploratory analysis results
In this section, we provide results on the spatial statistics tools applied to the location decisions (Table 1), on the creation of new manufacturing establishments (Table 2), and on the manufacturing industry specialization in the Spanish municipalities (Table 3)Footnote 30. These analyses correspond to the 1980–1998 period and involve 11 manufacturing industriesFootnote 31.
As can be seen in Table 1, which shows the BB Join Count Test on the location decisions in Spanish municipalities, location decisions are spatially autocorrelated in all the manufacturing industries considered, except for the computer and office equipment and the electric and electronic equipment industries in 1980 and 1981. That is, municipalities which have been chosen for the location of manufacturing establishments of a given industry tend to share a common border with other municipalities where there are manufacturing births for that industry, more often than could be caused purely by chance.
Looking at the number of births for every manufacturing industry in Table 2, results are very similar. Thus, both positive location decisions for a given industry and a given year, and the number of manufacturing births, are autocorrelated in space. These spatial patterns may be due to Marshallian agglomeration economies or to other reasons, as stated at the beginning of the introduction. In order to support the evidence for Marshallian agglomeration economies, Moran’s I statistic is applied to the level of municipality specialization in every manufacturing industry considered, which is measured through the location quotient, defined in expression 9. As shown in Table 3, except for the food industry, which is widely spread across the Spanish territory, specialized municipalities in a given industry tend to be neighbors. So, since municipality specialization in a given industry is autocorrelated in space, and so are location decisions and new manufacturing births, we may not reject that the benefits of locating in specialized municipalities are behind these spatial patterns.
3.2 Econometric results
In this section, as noted in Sect. 2.1, three models are estimated for each manufacturing industry: a standard Probit with spatially lagged explanatory variables, (PLEV), a Bayesian spatial autoregressive Probit, (SARP), and a Bayesian spatial error Probit with spatially lagged explanatory variables, (SEMP). The SARP and SEMP Bayesian models both allow for heteroskedasticity. Spatially lagged explanatory variables in PLEV models are built with a first-order contiguity SWM. As the PLEV results suggest that spatial effects do exist in location decisions, we extend the geographical scope of these effects. The Deviance Information Criterion (DIC) (Spiegelhalter et al. 2002) was used to select the SWM specification.
This criterion is commonly used in Bayesian analyses with competing models (LeSage et al. 2011), and is based on the model likelihood. The DIC provides a measure of fit, which adjusts for the complexity of a model. Formally, the DIC is defined as:
where D(\(\theta ) = -2\mathrm{LL}(\theta \)), or negative two times the log likelihood, and
where \(D({\bar{{{\varvec{\uptheta }}}}})\) is the deviance calculated using the mean of the parameters \({\bar{{{\varvec{\uptheta }}}}}\) obtained from the MCMC draws, and the average deviance (\(\bar{{D}}\)) is computed by taking the average of the deviance over the MCMC draws (Spiegelhalter et al. 2002). As can be seen in Table 4, multiple SWMs were examined, including nearest neighbors, NN, inverse distance, InvDist, and inverse distance squared, InvDistSQ. The 20 NN SWM and the InvDistSQ SWM for 10 km had the lowest DIC score for SEMP and SARP models, respectively (with the difference in DICs much greater than 7 in each case), providing strong evidence for the superiority of these models (LeSage et al. 2011). Note that the DIC in SARP models is lower than that in SEMP models.
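The computation can be sketched from a set of MCMC draws, assuming the standard definition of Spiegelhalter et al. (2002): \(\mathrm{DIC} = \bar{D} + p_D\), with the effective number of parameters \(p_D = \bar{D} - D(\bar{\theta })\). The non-spatial Probit log likelihood, the function names and the toy data below are hypothetical and serve only to make the example self-contained.

```python
import numpy as np
from scipy.stats import norm

def probit_loglik(beta, X, y):
    """Log likelihood of a non-spatial Probit model."""
    eta = X @ beta
    return np.sum(y * norm.logcdf(eta) + (1 - y) * norm.logcdf(-eta))

def dic(draws, loglik):
    """Deviance Information Criterion from posterior draws.

    draws  : array with one parameter vector per row
    loglik : function mapping a parameter vector to the log likelihood
    Returns (DIC, p_D).
    """
    draws = np.asarray(draws, dtype=float)
    deviances = np.array([-2.0 * loglik(theta) for theta in draws])
    d_bar = deviances.mean()                       # average deviance over draws
    d_at_mean = -2.0 * loglik(draws.mean(axis=0))  # deviance at the posterior mean
    p_d = d_bar - d_at_mean                        # effective number of parameters
    return d_bar + p_d, p_d

# Hypothetical toy data and two illustrative "draws" of (intercept, slope).
X_toy = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y_toy = np.array([0, 1, 1])
toy_draws = np.array([[0.0, 0.2], [0.4, 0.8]])
dic_value, p_d = dic(toy_draws, lambda b: probit_loglik(b, X_toy, y_toy))
print(dic_value, p_d)
```

When comparing SWM specifications, the same function would be evaluated on the draws from each competing model, and the specification with the lowest value would be preferred.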
To test for convergence of the MCMC routines, Raftery–Lewis convergence diagnostics (Lesage and Pace 2009) were used. Results indicate that convergence was achieved in fewer than 4000 draws for all models, with the majority converging at around 2000 draws.
The results of the econometric models are summarized in Tables 5, 6, 7, 8, 9 and 10. All non-spatially lagged explanatory variables, except for LQ in the Food and Tobacco industry in SARP and SEMP models, are significant and show the expected sign across all three models. According to these results, we cannot reject that population skills, manufacturing specialization (localization economies), market potential, and diversity (urbanization or Jacobs external economies) play an important role in location processes. Results for food industry in spatial Probit models are consistent with the lack of significance of Moran’s I for the location quotient in Table 3.
These results differ to a certain extent from the evidence shown in previous studies, such as Viladecans-Marsal (2004), where urbanization economies influence location in most sectors, but specialization only plays a minor role.
Looking at the spatially lagged explanatory variables in the PLEV models in Table 5, which account for the sources of agglomeration economies in neighboring municipalities, WLQ and WDI are always significant and show the expected sign, except for WLQ in Food and in First transformation of metals. WLQ is highly significant in all the other industries, which could reflect the positive effect of neighboring municipalities due to Marshallian agglomeration economies. As noted in Sect. 3, the insignificant Food results may be due to the fact that this industry is highly spread across Spain (footnote 32).
The high significance of the spatially lagged diversity indicator, WDI, stresses the key role of inter-industrial linkages at an interurban level. As suggested at the beginning of this paper, the results for the WLQ and WDI indicators also provide evidence on the geographical scope of agglomeration economies.
A striking result is the lack of significance of the spatially lagged Human Capital indicator, WHC, in most manufacturing activities. It could mean that commuting is not very important in Spain as a whole (excluding the biggest cities) or that commuters are not very skilled. Alternatively, its effect may already be captured by WLQ, since a qualified labor market is also a source of agglomeration economies.
The spatially lagged potential market indicator, WMP, is not significant in most manufacturing activities. Therefore, decision-makers seem to focus primarily on their internal market.
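To make the construction of a spatially lagged explanatory variable such as WLQ concrete, the sketch below builds an inverse-distance-squared SWM with a distance cutoff and multiplies it by a vector of location quotients. It is only a sketch under stated assumptions: the coordinates (taken to be planar, in kilometers), the LQ values and the function name `inverse_distance_sq_weights` are hypothetical, and row-standardization is assumed here rather than taken from the paper.

```python
import numpy as np


def inverse_distance_sq_weights(coords, cutoff_km):
    """Row-standardized inverse-distance-squared weights within a cutoff."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    W = np.zeros_like(dist)
    # Neighbors are all other units within the cutoff distance
    mask = (dist > 0) & (dist <= cutoff_km)
    W[mask] = 1.0 / dist[mask] ** 2
    row_sums = W.sum(axis=1, keepdims=True)
    # Units with no neighbor inside the cutoff keep an all-zero row
    np.divide(W, row_sums, out=W, where=row_sums > 0)
    return W


# Four hypothetical municipalities; the last one has no neighbor within 10 km
coords = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0), (50.0, 50.0)]
lq = np.array([1.0, 2.0, 0.5, 3.0])  # hypothetical location quotients

W = inverse_distance_sq_weights(coords, cutoff_km=10.0)
wlq = W @ lq  # spatially lagged location quotient (WLQ)
print(np.round(wlq, 3))
```

Each element of `wlq` is a weighted average of the location quotients of the neighboring municipalities, with closer neighbors receiving more weight.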
Moving on to the full spatial models in Tables 6 and 7, note that the spatial error and lag parameters, \(\lambda \) and \(\rho \), are significant in all models except computers and office equipment (SARP and SEMP models) and electric and electronic equipment and transport equipment (SEMP models). Computers and office equipment is a manufacturing industry that is highly clustered in certain areas and not very widespread in Spain. This agrees with the findings of the BB join count test. Also, if we use \(\rho \) as a measure of the spatial dependence present in the SARP model, computers and office equipment has the lowest coefficient, at 0.09. It also has the lowest \(\lambda \) coefficient in Table 6. The strongest spatial dependence is found in the food industry, whose spatial autoregressive coefficient \(\rho \) is 0.57, which is consistent with the fact that this industry is highly spread across Spain. As \(\lambda \) and \(\rho \) are highly significant in most of the manufacturing industries analyzed, we cannot reject that location decisions in neighboring municipalities matter in industrial location decisions.
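The Black–Black (BB) join count mentioned above can be sketched as follows. This is a minimal illustration, not the routine used in the exploratory analysis: it assumes a symmetric binary contiguity matrix, uses a random permutation approach for inference (an assumption on our part), and the toy chain of six units and the function names are hypothetical.

```python
import numpy as np


def bb_join_count(y, W):
    """Number of joins between two units that both have y = 1.

    W is assumed to be a symmetric binary contiguity matrix.
    """
    y = np.asarray(y, dtype=float)
    return 0.5 * float(y @ W @ y)


def bb_permutation_pvalue(y, W, n_perm=999, seed=0):
    """Pseudo p-value: share of random relabelings with BB >= observed."""
    rng = np.random.default_rng(seed)
    observed = bb_join_count(y, W)
    sims = np.array(
        [bb_join_count(rng.permutation(y), W) for _ in range(n_perm)]
    )
    return (np.sum(sims >= observed) + 1) / (n_perm + 1)


# Six hypothetical units on a line, each contiguous to its immediate neighbors
n = 6
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0

clustered = np.array([1, 1, 1, 0, 0, 0])
alternating = np.array([1, 0, 1, 0, 1, 0])

bb_clustered = bb_join_count(clustered, W)
bb_alternating = bb_join_count(alternating, W)
p_clustered = bb_permutation_pvalue(clustered, W)
print(bb_clustered, bb_alternating, p_clustered)
```

A BB count that is large relative to its permutation distribution indicates that positive location decisions tend to occur in contiguous units.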
The coefficient estimates from Tables 6 and 7 (footnote 33) are not easily compared to those in Table 5, since the impact of both the coefficient and its lag must be accounted for in the latter. Although some of the non-spatial (Table 5) coefficients are within the credible intervals for the spatial results, such as LQ for all estimates except machinery, there are many others that do not fall within the interval.
As stated in Sect. 2.1, because location processes may be more localized, our SEMP models include spatially lagged explanatory variables (Table 6). The results for these variables do not differ much from those of the PLEV models. WHC is not significant or presents a negative sign in most industries; WLQ is significant in all industries but food; and WMP and WDI are significant and show the expected sign in all industries.
As shown in Table 4, according to the DIC criterion the SAR models achieve a better fit than the SEM ones. Effect estimates for these models are shown in Tables 8, 9 and 10. As expected, direct effects (Table 9) are larger than indirect effects (Table 10) in all industries. All explanatory variables are significant, except LQ in the food industry. Location decisions in each municipality seem to be more influenced by changes in human capital (HC) and industrial diversity (DI) than by changes in the other variables.
The indirect effect of each explanatory variable, that is, its spatial spillover impact on neighboring municipalities, is shown in Table 10. These results are mostly consistent with those for the spatially lagged variables in the PLEV and SEM models. However, human capital is now significant and shows the expected sign in most industries. Changes in neighboring human capital and industrial diversity seem to have a larger impact on location decisions than changes in municipality product and industrial specialization.
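The distinction between direct and indirect effects can be illustrated with the scalar summary measures of Lesage and Pace (2009) for a linear SAR model, where the effects of variable k are summarized from the matrix \((I - \rho W)^{-1}\beta _{k}\). The sketch below is deliberately simplified: it covers the linear case only, whereas in the Probit case the effects are, as we understand it, additionally scaled by the normal density at the fitted latent index. The ring-shaped SWM, the values of \(\rho \) and \(\beta _{k}\), and the function name `sar_effects` are hypothetical.

```python
import numpy as np


def sar_effects(W, rho, beta_k):
    """Average direct, indirect and total effects in a linear SAR model."""
    n = W.shape[0]
    # Matrix of partial derivatives of E[y] with respect to variable k
    S = np.linalg.inv(np.eye(n) - rho * W) * beta_k
    direct = np.trace(S) / n   # average own-municipality effect
    total = S.sum() / n        # average row sum
    indirect = total - direct  # average spillover effect
    return direct, indirect, total


# Hypothetical ring of five municipalities, two neighbors each (row-standardized)
n = 5
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

rho, beta_k = 0.5, 1.0  # illustrative values, not estimates from the tables
direct, indirect, total = sar_effects(W, rho, beta_k)
print(f"direct={direct:.3f}, indirect={indirect:.3f}, total={total:.3f}")
```

With a row-standardized SWM the total effect equals \(\beta _{k}/(1-\rho )\), so a larger \(\rho \) implies larger spillovers for a given coefficient.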
These results highlight the importance of properly controlling for spatial dependence. Although past papers have used specifications similar to Table 5, that kind of model does not fully control for the error structure of spatial dependence. Although Viladecans-Marsal (2004) provides empirical evidence on the geographical scope of agglomeration economies in the biggest Spanish cities, her results differ, since agglomeration effects only spill over beyond the administrative borders in three of the six industries analyzed (footnote 34).
4 Conclusions
This paper focuses on the geographical scope of agglomeration economies in Spain, using data from municipalities, and specifically on the role of neighboring municipalities’ characteristics in location decisions. Exploratory analysis has shown that, for every manufacturing industry considered, births are spatially autocorrelated, regardless of whether we test positive location decisions or the number of births. That is, municipalities which have been chosen as a location for births in a given industry tend to be neighbors of municipalities which have also been chosen as a location for the same manufacturing industry. Spatial exploratory analysis of municipality specialization suggests that this spatial behavior may be due to the existence of Marshallian agglomeration economies that extend beyond municipality borders, because the location quotient is also spatially autocorrelated for every manufacturing industry. Therefore, the geographical scope of agglomeration economies may play a role in location decisions.
In order to test the role of the geographical scope of agglomeration economies in industrial location decisions, confirmatory analysis was carried out. A simple location model was outlined and estimated using Spatial Econometrics and Spatial Statistics techniques. Spatial variables are highly significant for most industries, so we cannot reject that the characteristics of neighboring municipalities matter in industrial location decisions. That is, what happens in a municipality depends not only on what happens inside that municipality, but also on what happens in its neighboring area. Interurban agglomeration economies due to industrial diversity seem to play a larger role in location decisions in neighboring municipalities than interurban agglomeration economies due to industrial specialization.
Policy makers in countries with a highly decentralized regional system, such as Spain, should bear in mind that these agglomeration economies can extend to or come from neighboring areas which belong to other regions. Therefore, inter-regional coordination is needed before implementing local or regional location incentives. This might be an important argument for defining industrial policy at the regional level, avoiding both the national basis, which is less efficient (Aghion et al. 2011), and the municipal basis. In fact, most of the variables determining location (population skill, manufacturing specialization, market potential and diversity) are mainly affected by policies of regional scope.
Future research should examine the distance, in kilometers, over which agglomeration economies extend in each industry. Longer time series and more disaggregated industrial datasets (3-digit level or higher) are needed to properly analyze the industrial, temporal and geographical scopes of agglomeration economies.
Finally, spatial autocorrelation should be taken into consideration when estimating location models, since spatial dependence invalidates the use of traditional estimation techniques.
Notes
See Rosenthal and Strange (2004) for a review on the nature and sources of agglomeration economies.
Hoover’s classification also included internal economies of scale.
“And presently subsidiary trades grow up in the neighborhood, supplying it with implements and materials, organizing its traffic, and in many ways conducing to the economy of its material (Marshall 1890)”.
“A localized industry gains a great advantage from the fact that it offers a constant market for skill. ... Employers are apt to resort to any place where they are likely to find a good choice of workers with the special skill which they require, while men seeking employment naturally go to places where there are many employers who need such skills as theirs (Marshall 1890)”.
“Great are the advantages which people following the same skilled trade get from near neighborhood to one another. The mysteries of the trade become no mysteries; but are as it were in the air ... if one man starts a new idea, it is taken up by others and combined with suggestions of their own; and thus it becomes the source of further new ideas (Marshall 1890)”.
See Holmes (1999) and Bartlesman et al. (1994) for evidence about shared input markets. Jaffee (1989), Acs et al. (1992), Jaffe et al. (1993), and Audretsch and Feldman (1996) provide evidence on the relevance of human capital and knowledge spillovers. Evidence on labor market pooling can be found in Baumgartner (1988), Diamond and Simon (1990), Moretti (2000), Costa and Kahn (2001).
See Miller (2004) for more information on Tobler’s law.
See Arauzo-Carod et al. (2010) for a review on methods and results of empirical studies in industrial location.
Some works follow this approach, such as Viladecans-Marsal (2004), Autant-Bernard (2006) and LeSage et al. (2011). While LeSage et al. (2011) address spatial autocorrelation by estimating a spatial autoregressive Probit model to study decisions to reopen after Hurricane Katrina, the other papers model spatial effects by including spatially lagged explanatory variables. However, these other papers do not fully control for spatial dependence through the error term or the likelihood function (Anselin 1988). Viladecans-Marsal (2004) uses an OLS IV estimator to analyze the role of agglomeration economies in the most populated Spanish municipalities. Autant-Bernard (2006) analyzes the location of R&D establishments in French NUTS 2 regions using a conditional logit model. However, neither of the latter two papers uses a full spatial econometric model, as we do here.
Called random, since it follows from the random utility framework. See Guimarães et al. (2004) for an extension of the random utility framework.
See Anselin (1988) for more information about spatial autocorrelation and Spatial Econometrics techniques.
By implementing robust estimation techniques, applying spatial filters or enlarging or improving the dataset, etc.
SAR models include a spatially lagged dependent variable, Wy, as one of the explanatory variables, that is \(y = \rho Wy + X\beta + \varepsilon \), where y is an \(n \times 1\) vector of observations on the dependent variable and Wy is an \(n \times 1\) vector of spatial lags for the dependent variable (where again, W is an SWM). The parameter \(\rho \) is the spatial autoregressive coefficient that indicates the strength of spatial dependence, X is an \(n \times k\) matrix of observations on the (exogenous) explanatory variables with an associated \(k \times 1\) vector of regression coefficients \(\beta \), and \(\varepsilon \) is an \(n \times 1\) vector of normally distributed (\(N(0, \sigma ^{2})\)) random error terms.
SEM models deal with spatial dependence through a spatially lagged error term, which uses a non-spherical error: \(y = X\beta + u\), where \(u = \lambda Wu + \varepsilon \), and \(\varepsilon \sim N(0, \sigma ^{2} I_{n})\). \(\lambda \) is a coefficient on the spatially correlated errors. See Anselin (1988) for additional details.
We choose that period because of the availability of data for the dependent and independent variables.
If we were interested not in location decisions but in the creation of new manufacturing establishments, there would be several ways to estimate spatial count data models. Kaiser and Cressie (1997) developed a Poisson auto-model which allows positive spatial dependencies in multivariate count data by specifying conditional distributions as truncated or Winsorized Poisson probability mass functions, and Poisson spatial interaction models are estimated in Lesage et al. (2007) and in Fischer and Griffith (2008) to analyze origin-destination patent citation data.
See Fleming (2004) for a more complete discussion on the advantages and disadvantages of different spatial Probit estimation techniques.
Following Lesage and Pace (2009), we employ a normal prior distribution for the \(\beta \) parameters, which are conditional on an inverse gamma distribution for \(\sigma ^{2}\). The spatial parameters, \(\lambda \) and \(\rho \), have uniform prior distributions.
Unlike the case of a continuous dependent variable, where Lagrange multiplier tests can be used to choose between the two models.
Local tax data are not available for small municipalities due to statistical secrecy, and, as argued before, we should not use NUTS III in order to avoid MAUP or ecological fallacy problems.
The economic interpretation of \(\lambda Wu\) in SEM models is not so straightforward.
In order to work with spatially continuous data, we consider 7906 municipalities, that is, we ignore the municipalities which belong to the Balearic Islands or to the Canary Islands.
The influence of MAUP on location analysis is addressed in Pablo-Martí and Muñoz-Yebra (2009).
All manufacturing establishments must be registered in the REI before starting up their activities. See Mompó and Monfort (1989) for a description of the REI.
During the nineties regional governments started managing REI delegations, and data about new establishments are neither provided in a timely fashion for all the regions nor in a friendly format to be processed.
As Anselin (1992) points out, binary variables take on only the values 1 and 0, and areal units with observations equal to 1 are often referred to as colored Black. The Black–Black (BB) join count is the number of times a Black unit is contiguous to another Black unit. See also Cliff and Ord (1980) for technical details.
As our dataset comprises 19 years and 11 manufacturing industries, due to length limitations, the descriptive analysis only includes the number of municipalities in which manufacturing establishments were created, the number of manufacturing establishments created per year, and the maximum number of establishments created in a given year (see “Appendix” section).
If we could disaggregate the Food industry, results would probably differ.
We must bear in mind that these studies were not carried out using the same methodology and do not use exactly the same dataset, thus full comparison is not possible.
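The SAR and SEM specifications described in the notes above can be illustrated by simulating their data-generating processes and then observing only the binary outcome, as in a Probit setting. This is a minimal sketch, not the estimation code: the ring-shaped SWM, the parameter values and the function names `simulate_sar` and `simulate_sem` are hypothetical, and unit error variance is assumed.

```python
import numpy as np


def simulate_sar(W, X, beta, rho, rng):
    """Latent SAR outcome: y = rho*W*y + X*beta + eps, solved for y."""
    n = W.shape[0]
    eps = rng.standard_normal(n)
    y = np.linalg.solve(np.eye(n) - rho * W, X @ beta + eps)
    return y, eps


def simulate_sem(W, X, beta, lam, rng):
    """Latent SEM outcome: y = X*beta + u, with u = lam*W*u + eps."""
    n = W.shape[0]
    eps = rng.standard_normal(n)
    u = np.linalg.solve(np.eye(n) - lam * W, eps)
    return X @ beta + u, eps


# Hypothetical ring of 50 units with two neighbors each (row-standardized)
n = 50
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

rng = np.random.default_rng(1)
X = np.column_stack([np.ones(n), rng.standard_normal(n)])
beta = np.array([-0.5, 1.0])
rho, lam = 0.4, 0.4  # illustrative values only

y_sar, eps_sar = simulate_sar(W, X, beta, rho, rng)
y_sem, eps_sem = simulate_sem(W, X, beta, lam, rng)

# Only the sign of the latent variable is observed (1 = location chosen)
d_sar = (y_sar > 0).astype(int)
d_sem = (y_sem > 0).astype(int)
print(d_sar.sum(), d_sem.sum())
```

In the SAR case a shock in one unit feeds into the outcomes of its neighbors through the term Wy, whereas in the SEM case the dependence enters only through the error term.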
References
Acs ZJ, Audretsch DB, Feldman MP (1992) Real effects of academic research: comment. Am Econ Rev 82(1):363–367
Aghion P, Bulanger JY, Cohen E (2011) Rethinking industrial policy. Bruegel, issue 2011/04 June
Alañón-Pardo A (2002) Estimación del valor añadido per cápita de los municipios españoles en 1991 mediante técnicas de econometría espacial. Ekonomiaz 51:172–194
Anselin L (1988) Spatial econometrics. Methods and models. Kluwer Academic, Dordrecht
Anselin L (1992) SpaceStat tutorial, a book for using SpaceStat in the analysis of spatial data. University of Illinois, Urbana-Champaign
Anselin L (2002) Under the Hood. Issues in the specification and interpretation of spatial regression models. Agric Econ 17:247–267
Arauzo-Carod J (2002) Determinants of industrial location. An application for Catalan municipalities. In: Working paper No. 138, Estudios sobre la Economía Española, FEDEA, Madrid
Arauzo-Carod JM, Liviano D, Manjón M (2010) Empirical studies in industrial location: an assessment of their methods and results. J Reg Sci 50(3):685–711
Arita T, McCann P (2000) Industrial alliances and firm location behaviour: some evidence from the US semiconductor industry. Appl Econ 32:1391–1403
Arrow KJ (1962) Economic welfare and the allocation of resources for invention. In: Nelson RR (ed) The rate and direction of inventive activity. Princeton-University Press, Princeton, pp 609–626
Audretsch D, Feldman M (1996) R&D spillovers and the geography of innovation and production. Am Econ Rev 86:630–640
Autant-Bernard C (2006) Where do firms choose to locate their R&D? a spatial conditional logit analysis on french data. Eur Plan Stud 14(9):1187–1208
Bailey TC, Gatrell AC (1995) Interactive spatial data analysis, 2nd edn. Longman, London
Bartlesman EJ, Caballero RJ, Lyons RK (1994) Customer and supplier-driven externalities. Am Econ Rev 84(4):1075–1084
Baumgartner JR (1988) Physicians’ services and the division of labor across local markets. J Polit Econ 96(5):948–982
Carlton D (1983) The location and employment choices of new firms: an econometric model with discrete and continuous endogenous variables. Rev Econ Stat 65:440–449
Cliff A, Ord K (1980) Spatial processes: models & applications. Pion, London
Diamond CA, Simon CJ (1990) Industrial specialization and the returns to labor. J Lab Econ 8(2):175–201
Costa DL, Kahn ME (2001) Power couples. Quart J Econ 116:1287–1315
Duranton G, Puga D (2000) Diversity and specialisation in cities, why, where and when does it matter? Urban Stud 37:533–555
Duranton G, Puga D (2005) From sectoral to functional urban specialization. J Urban Econ 57(2):343–370
Ellison G, Glaeser E (1997) Geographic concentration in US manufacturing industries, a dartboard approach. J Polit Econ 105:889–927
Ellison G, Glaeser EL, Kerr WR (2010) What causes industry agglomeration? Evidence from coagglomeration patterns. Am Econ Rev 100(3):1195–1213
Figueiredo O, Guimarães P, Woodward D (2002) Home-field advantage: location decisions of Portuguese entrepreneurs. J Urban Econ 52:341–361
Fischer MM, Griffith DA (2008) Modeling spatial autocorrelation in spatial interaction data: an application to patent citation data in the European Union. J Reg Sci 48(5):969–989
Fleming M (2004) Techniques for estimating spatially dependent discrete choice models. In: Anselin L, Florax R, Rey S (eds) Advances in spatial econometrics, methodology, tools and applications. Springer, Berlin, pp 145–167
Glaeser E, Kallal H, Scheinkman J, Shleifer A (1992) Growth in cities. J Polit Econ 100:1126–1152
Guimarães P, Figueiredo O, Woodward D (2000) Agglomeration and the location of foreign direct investment in Portugal. J Urban Econ 47:115–135
Guimarães P, Figueiredo O, Woodward D (2004) Industrial location modeling: extending the random utility framework. J Reg Sci 44:1–20
Hayter R (1997) The dynamics of industrial location, the factory, the firm and the production system. Wiley, New York
Henderson JV (1997) Externalities and industrial development. J Urban Econ 42:449–470
Henderson JV (2003) Marshall’s scale economies. J Urban Econ 53:1–28
Holl A (2004a) Manufacturing location and impacts of road transport infrastructure: empirical evidence from Spain. Reg Sci Urban Econ 34:341–364
Holl A (2004b) Transport infrastructure, agglomeration economies, and firm birth. Empirical evidence from Portugal. J Reg Sci 44:693–712
Holmes T (1999) Localization of industry and vertical disintegration. Rev Econ Stat 81:314–325
Hoover E (1948) The location of economic activity. McGraw-Hill, New York
Jacobs J (1969) The economy of cities. Vintage, New York
Jaffee AB (1989) Real effects of academic research. Am Econ Rev 79(5):957–970
Jaffe AB, Trajtenberg M, Henderson R (1993) Geographic localization of knowledge spillovers as evidenced by patent citations. Quart J Econ 108:577–598
Jofré-Montseny J (2009) The scope of agglomeration economies: evidence from Catalonia. Pap Reg Sci 88(3):575–590
Kaiser M, Cressie N (1997) Modeling Poisson variables with positive spatial dependence. Stat Probab Lett 35:423–432
Krugman P (1991a) Geography and trade. MIT Press, Cambridge
Krugman P (1991b) Increasing returns and economic geography. J Polit Econ 99:484–499
Lesage J, Pace RK (2009) Introduction to spatial econometrics. Chapman & Hall/CRC, Boca Raton
LeSage JP, Pace RK, Lam N, Campanella R, Liu X (2011) New Orleans business recovery in the aftermath of Hurricane Katrina. J R Stat Soc Ser A 174(4):1007–1027
Lesage JP, Fischer MM, Scherngell T (2007) Knowledge spillovers across Europe: evidence from a Poisson spatial interaction model with spatial effects. Pap Reg Sci 86(3):393–422
Lesage J (1997) Bayesian estimation of spatial autoregressive models. Int Reg Sci Rev 20:19–35
Lesage J (2000) Bayesian estimation of limited dependent variable spatial autoregressive models. Geogr Anal 32:19–35
Marshall A (1890) Principles of economics, ed of 1920. MacMillan, London
McFadden D (1974) Conditional logit analysis of qualitative choice behavior. In: Zarembka P (ed) Frontiers in econometrics. Academic Press, New York, pp 105–142
Mcmillen D (1995) Spatial effects in Probit models: a Monte Carlo investigation. In: Anselin L, Florax R (eds) New directions in spatial econometrics. Springer, Berlin, pp 189–228
Miller HJ (2004) Tobler’s first law and spatial analysis. Ann Assoc Am Geogr 94(2):284–289
Mompó A, Monfort V (1989) El registro industrial como fuente estadística regional: el caso de la Comunidad Valenciana. Econ Ind 268:129–140
Moretti E (2000) Estimating the social return to higher education. In: NBER Working Paper No 9018
Ottaviano G, Puga D (1998) Agglomeration in the global economy: a survey of the new economic geography. World Econ 21(6):707–731
Pablo-Martí F, Muñoz-Yebra C (2009) Localización empresarial y economías de aglomeración: el debate en torno a la agregación espacial. Investig Reg 15:139–166
Pinkse J, Slade M (1988) Contracting in space: an application of spatial statistics to discrete-choice models. J Econ 85:125–154
Romer P (1986) Increasing returns and long run growth. J Polit Econ 94:1002–1037
Rosenthal S, Strange W (2003) Geography, industrial organization and agglomeration. Rev Econ Stat 85:377–393
Rosenthal S, Strange W (2004) Evidence on the nature and sources of agglomeration economies. In: Henderson J, Thisse JF (eds) Handbook of urban and regional economics, vol 4. Elsevier, Amsterdam, pp 2119–2171
Simmie J (1998) Reasons for the development of ‘Islands of innovation’: evidence from Hertfordshire. Urban Stud 35:1261–1289
Smith T, Lesage J (2002) A Bayesian probit model with spatial dependencies. Webpage of Smith T at the University of Pennsylvania, School of Engineering and Applied Science. http://www.seas.upenn.edu/tesmith/sprobit.pdf. Accessed 15 July 2011
Spiegelhalter DJ, Best NG, Carlin BP, Van Der Linde A (2002) Bayesian measures of model complexity and fit. J R Stat Soc Ser B (Stat Methodol) 64:583–639. doi:10.1111/1467-9868.00353
Suárez-Villa L, Alrod W (1998) Operational strategy, R&D and intrametropolitan clustering in a polycentric structure. The advanced electronics industries of the Los Angeles Basin. Urban Stud 34:1343–1380
Tobler WR (1970) A computer movie simulating urban growth in the Detroit region. Econ Geogr 46:234–240
Unwin DJ (1996) GIS, spatial analysis and spatial statistics. Prog Hum Geogr 20(4):540–551
Van Soest D, Gerking S, Van Oort F (2006) Spatial impacts of agglomeration externalities. J Reg Sci 46:881–899
Viladecans-Marsal E (2001) La concentración territorial de las empresas industriales: un estudio sobre el tamaño de las empresas y su proximidad geográfica. Pap Econ Española 89–90:308–321
Viladecans-Marsal E (2003) Economías externas y localización del empleo industrial. Revista Econ Aplicada 31:5–32
Viladecans-Marsal E (2004) Agglomeration economies and industrial location: city-level evidence. J Econ Geogr 4:562–582
Acknowledgements
The authors would like to thank two anonymous referees and the editors of this journal for their useful comments and suggestions. The usual disclaimer applies.
Cite this article
Alañon-Pardo, A., Walsh, P.J. & Myro, R. Do neighboring municipalities matter in industrial location decisions? Empirical evidence from Spain. Empir Econ 55, 1145–1179 (2018). https://doi.org/10.1007/s00181-017-1307-5
DOI: https://doi.org/10.1007/s00181-017-1307-5