1 Introduction

Knowledge produced in universities diffuses to nearby firms through a variety of mechanisms involving tacit and codified channels, generating positive externalities (spillovers) that can potentially be of use for external agents who do not pay a price for it. Several well-known econometric analyses have paid considerable attention to the role of the externalities of university-generated knowledge across geographical areas (Anselin et al. 1997, 2000; Feldman and Florida 1994; Fischer and Varga 2003; Jaffe 1989; Varga 1998). The main finding of these studies is that knowledge spillovers from universities are localized and contribute to higher rates of corporate patents or innovations in geographically bound areas. However, most studies of localized university spillovers focus on the geographical coincidence between knowledge generated in universities and patents or innovation in private firms; only a few empirical papers address new business location as the core of their analysis.

In this article, we extend the empirical literature on this topic. We attempt to shed light on the connection between the presence of universities and the location of new business by focusing on high-technology sectors in Spain. We consider university spillovers produced by three main university outputs: knowledge-based graduates, who provide a specialized labor force to firms; research activities, which may be used by companies to create or improve their inventions and innovations; and technological knowledge, which can potentially be transferred to firms. Note that by separating the main university outputs we take into account both the traditional university mission (education and research) and the newer university mission of technology development.

Our general objective is to test the role of geographically bounded knowledge spillovers from universities on new business location. In particular, we focus on high-technology sectors in the Spanish case. More specifically, we address three research questions.

  1. Are university spillovers relevant in explaining new business location in high-technology sectors after controlling for other important location and cost factors?

  2. If so, what kind of university output is more influential in encouraging new business location near universities?

  3. What is the effect of any increase in academic output on the new business location?

These issues are important because new firm creation is considered an integral feature of economic growth and progress (Reynolds 1994). Some recent empirical studies have found significant effects of different aspects of business dynamics on regional economic growth (Audretsch and Keilbach 2004; Van Stel and Storey 2004) and on economic development and employment (Fritsch 2008; Arauzo et al. 2008). Therefore, a better understanding of the role and sources of university spillovers as a determining factor of new business location may help in designing industrial and regional policy to encourage regional growth.

To address these research questions, information on 604 companies and 63 universities was collected from different sources for our database. This dataset includes information about new business creation in high-technology sectors in Spain, and the main outputs (graduates, research, and patents) of all private and public Spanish universities. We classify the data into 36 geographical units according to the location of the new businesses and the universities. The data covers the period from 2001 to 2004 (144 observations). The methodology involves several econometric models. Our dependent variable is the number of new businesses established in the geographical area where the university is located (using Spanish provinces as the unit of analysis). The exogenous variables comprise two groups. The first contains the sources of potential university spillovers (number of graduates in scientific and technological disciplines, number of scientific papers in high-quality journals, and number of university patents granted). The second group considers traditional cost factors and location variables to capture other sources of spillovers. Several dummy variables are included to consider the temporal effects.

Our approach is novel in several respects. First, we shed light on the effects of university spillovers on new business formation, adding evidence to the meagre literature in the area. Second, while previous work has focused on university research and development (R&D) expenses or personnel, scientific research, and students as sources of spillovers, we also take into account the potential of the university to generate technological knowledge (as proxied by patents), an output not previously considered. Third, to the best of our knowledge, there is no existing research of this type at the provincial level in Spain.

The article is organized as follows. Section 2 summarizes the empirical findings of selected econometric studies on university spillovers and the relationship with new firm location. Section 3 presents the variables and the econometric specification. Section 4 describes the data. Section 5 provides the results. We briefly summarize the conclusions and discuss future research in Sect. 6.

2 Literature review and hypotheses

Our literature review is organized around three questions particularly relevant to our research: (i) what is known about the effects of university spillovers on firm location; (ii) what sources of university spillovers might affect the decision of location of new entrants; and (iii) what other factors are relevant in explaining new business location in high-technology sectors? Regarding the first question, Audretsch and his co-authors found compelling evidence suggesting (in the German case) that firms have a high propensity to locate near universities. Audretsch et al. (2004) focused on whether knowledge spillovers are homogeneous with respect to different scientific fields. They found that the locational decision is shaped not only by the output of universities (for instance, students and research), but also by the nature of that output (that is, the specialized nature of scientific knowledge). Audretsch and Lehmann (2005) concluded that universities in regions with greater knowledge capacity and higher knowledge output also generate a larger number of technology start-ups. Audretsch et al. (2005) also suggested that new knowledge- and technological-based firms have a higher propensity to locate close to universities, presumably to facilitate access to knowledge spillovers. However, they also argued that two factors shape the exact role of geographic proximity: the particular knowledge context and the specific spillover mechanism.

Other relevant works in different spatial contexts are as follows. Harhoff (1999) studied the effect of regional spillovers in two major West German industries for the period from 1989 to 1993. Using a sample of 656 observations (328 counties across two industries), he found that regional spillover effects exist and that they have a significant effect on new business formation in high-technology industries. Woodward et al. (2006) focused on the role of academic scientific and engineering research in the decision to locate new industrial plants in US counties. Their model uses research and development (R&D) expenditures in science and engineering in each county and several key determinants of business location decisions usually considered in the literature on firm location (cost and demand factors in addition to several proxies to capture agglomeration economies). Their results point to a potential relationship between local university R&D expenditures and the number of newly created high-technology plants by county.

Similarly, Abramovsky et al. (2007) provide evidence on the extent to which business sector R&D activity is located near high-quality university research departments in Great Britain. Their empirical approach relates the pattern of private-sector R&D establishment locations to the presence of relevant nearby university research departments. They found robust evidence for the co-location of business and university research after controlling for various sources of observed and unobserved heterogeneity across 111 geographic units. Kirchhoff et al. (2007) argue that small firms will tend to form near organizations from which they hope to acquire innovative inputs. Using a sample of new firm start-ups in the USA and 354 labor market areas (LMAs) as the unit of analysis, they found that university R&D significantly relates to new business formation, which supports the hypothesis of positive spillovers from universities. This literature highlights the importance of the geographical dimension and suggests that firms attempt to capture localized knowledge spillovers through their choice of location. Table 1 provides a summary of the main results.

Table 1 Econometric literature review of the effect of university spillovers on new firm location

At least some of these new firms could be entrepreneurial activities established by university professors, young researchers, etc. or by private founders as a path for the commercialization of university-based technologies (start-up/spinoff companies). However, it should be noted that the location behavior of this kind of firm may differ significantly because their already close link to the university suggests a preference for a location near the parent institution (Egeln et al. 2004). Several empirical studies focus on this particular group of firms. For instance, Bania et al. (1993) analyzed the importance of universities in explaining the rate of high-technology firm start-ups. They modeled firm start-ups during the period 1976–1978 for two industries located in 25 US metropolitan areas. Their results indicated a positive and significant relationship between university research and firm start-ups in the electrical and electronic equipment industries, while in the instruments and related industries the relationship was statistically insignificant. In several analyses, Shane (2001a, b) explored the role of proximity to the Massachusetts Institute of Technology (MIT) in new firm formation. Using data on new firms formed to exploit 1,397 patents assigned to MIT from 1980 to 1996, he concluded that universities create technological spillovers that can be exploited by the formation of new firms. Di Gregorio and Shane (2003) analyzed start-up firms from the university perspective (across 101 US universities over the period 1994–1998). They showed that the intellectual eminence of the university along with university policy increases the formation activity of new firms. In order to test the hypothesis that the entry of firms into biotechnology is determined by the geographic distribution of sought-after “star” researchers, Zucker et al. (1998) established a relationship involving star scientists and US biotechnology firms. Employing a dataset of 751 new companies from 1976 to 1989 across 183 functional economic US regions, they found that the location of star scientists predicts the entry of biotechnology firms.

In light of the above discussion, we propose the following general hypothesis.

Hypothesis 1

New business location in high-technology sectors positively relates to university spillovers within given geographical areas.

With regard to the second question, a strand of literature on university–industry links shows evidence for a variety of mechanisms facilitating university spillovers (for recent work, see Brenner 2007; Cohen et al. 2002; Martinelli et al. 2008; Perkmann and Walsh 2007; Schartinger et al. 2002). However, if we focus on new firm location, scientific publication and graduating students are the two main mechanisms involved in the transfer of knowledge through spillovers (Audretsch et al. 2005). Acs et al. (1999) provide two reasons why firms choose to locate a short distance from universities. First, university research is a source of significant innovation-generating knowledge that initially diffuses through personal contacts to adjacent firms. Since both basic and applied research may benefit in various ways, private enterprise induces firms to locate near an appropriate university. Second, universities are a source of highly qualified science and engineering graduates. This pool of trained workers explains firms clustering nearby. Added to this general, but compelling, explanation is another justification related to the technological capacity of the university. The idea is that there are many potentially marketable and economically valuable ideas in technological departments in universities and registered as patents. This is another incentive for firms to be located near universities.

In turn, firms can benefit from proximity to this source of technological knowledge in two ways. The first way is through licensing. Although patents are considered as codified knowledge, proximity can help firms to find university technologies. For instance, Thursby and Thursby (2003) discussed several ways in which firms identify universities’ technologies, one of which involves personal contacts between the firm’s R&D staff and university personnel. Second, the literature on patent citations as a flow of knowledge between inventors is consistent with the importance of proximity in fostering spillovers (Jaffe et al. 1998). Therefore, we propose to examine the relevance of the sources of spillovers by testing the following hypotheses.

Hypothesis 2

New business location in high-technology sectors positively relates to university technological knowledge as measured by patents within geographical areas.

Hypothesis 3

New business location in high-technology sectors positively relates to scientific research from universities within given geographical areas.

Hypothesis 4

New business location in high-technology sectors positively relates to university graduates within given geographical areas.

With respect to the third question, the empirical evidence shows the importance of university knowledge as one, but not the only, determining factor for new business formation near universities. According to the principles of the new economic geography and work on regional variations in entrepreneurship, studies of firm location in regions usually consider agglomeration effects, demand factors, and cost factors as the main determinants (Armington and Acs 2002; Cohen and Paul 2005; Crozet et al. 2004; Guimaraes et al. 2003; Kirchhoff et al. 2007; Koo 2007; Reynolds et al. 1995; Tamasy and Heron 2008). At a smaller scale, we may also take into account yet other determinants (such as taxes) (Rathelot and Sillard 2008). In the following section, we use this final strand of the literature to include several additional variables in our models that allow us to attain the best possible specification and avoid any bias arising from the omission of explanatory variables.

3 Methodology

This section establishes an econometric framework to study the extent to which spillovers stemming from three university sources (graduates, scientific research, and technological results) determine the number of new firms locating in high-technology sectors. Following the literature, we estimate a model with the number of new firms in high-technology sectors as the dependent variable, and measures of the sources of university spillovers and other variables as explanatory variables. That is, NBL_{jt} = f(SUS_{jt−2}, ORV_{jt−2}, Year_{jt}, u_{jt}) for j = 1,…,N, and t = 1,2,…,T, where NBL is the number of new businesses located in the jth geographical area, SUS is a set of explanatory variables capturing the sources of university spillovers, ORV is a set of other relevant variables, Year is a time effect, and u is the usual random term for unobserved effects. The number of new business locations in the jth geographical area then comprises a deterministic and a stochastic component. Note that we lag the explanatory variables by 2 years for the following reasons. First, the location of a new business takes time. Once the decision to start a new firm is made, the entrepreneur chooses among several different locations and there is an interlude between the location decision and the registration of the new company. We consider 2 years to be a reasonable period between the entrepreneur’s decision to start a new business and the registration of the new firm as a corporation with a definite location. Second, the 2-year lag helps to avoid reverse causality with the cost factor variables.
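The lag structure can be illustrated with a minimal sketch. The area label, the numbers, and the helper `build_lagged_rows` are hypothetical and serve only to show how an outcome observed in year t is paired with a regressor observed in year t − 2; they are not taken from the dataset.

```python
# Hypothetical toy panel keyed by (area, year). All values are made up.
graduates = {("A", 1999): 100, ("A", 2000): 110, ("A", 2001): 120, ("A", 2002): 130}
new_firms = {("A", 2001): 3, ("A", 2002): 4, ("A", 2003): 2, ("A", 2004): 5}


def build_lagged_rows(outcome, regressor, lag=2):
    """Pair each outcome observed in year t with the regressor from year t - lag."""
    rows = []
    for (area, year), y in sorted(outcome.items()):
        key = (area, year - lag)
        if key in regressor:  # skip observations that have no lagged regressor
            rows.append({"area": area, "year": year, "y": y, "x_lagged": regressor[key]})
    return rows


rows = build_lagged_rows(new_firms, graduates)
for row in rows:
    print(row)
```

With a 2-year lag, outcomes for 2001 to 2004 are matched with explanatory variables for 1999 to 2002, which is the pairing of periods used in the data section.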

3.1 Dependent variable

The literature uses several indicators to measure new business location. Following recent work by Di Gregorio and Shane (2003), Audretsch and Lehmann (2005), and Fritsch and Falck (2007), we specify the dependent variable as the number of new firms in high-technology sectors located in each geographical area in a specific year.

3.2 Explanatory variables

The group of explanatory variables includes three indicators to capture the effects of university outputs on new business location: (i) the number of graduates in the areas of science and technology, (ii) the number of scientific papers (excluding humanities and the social sciences) published in high-quality journals, and (iii) the number of technological results (university patents). However, as noted earlier, additional factors are likely to influence the entry of new firms. To take these other factors into account, we control for intra-industry spillovers (proxied by the number of firms in the same sector in the same geographical area), agglomeration spillovers (captured by the population density in each location), cost factors (average labor costs in high-technology sectors and taxes for each geographical area), and year effects. Table 2 provides details of the explanatory variables.

Table 2 Definition of explanatory variables

3.3 Specification and estimation strategy

The nature of the dependent variable (the number of new businesses in high-technology sectors established in a specific geographical area) suggests the formulation and estimation of a count model to detect the intensity of the new business location (either Poisson or negative binomial). As in earlier work (e.g., Audretsch and Lehmann 2005; Fritsch and Falck 2007), our baseline specification assumes that the dependent variable follows a Poisson distribution, where the number of events given the set of regressors, X, has a Poisson distribution with a density function:

$$ f(y_{i} |x_{i} ) = \frac{{\rm e}^{ - \mu_{i} } \mu_{i}^{y_{i}}}{y_{i}!}, \quad y_{i} = 0,1,2, \ldots $$

The conditional mean depends on the individual characteristics picked up in the previously defined regressors. Put differently:

$$ \mu_{i} = E(y_{i} |x_{i} ) = \exp (x_{i} \beta ), $$

where x = (lnUTK, lnUSK, lnUGRAD, lnFHT, WHT, POPD, TAX, YEAR02, YEAR03, YEAR04) is the vector of explanatory variables. As discussed earlier, both the sources of spillovers as well as the control variables are lagged by 2 years. The standard procedure for computing the estimators is the Newton–Raphson iterative method. Convergence is ensured because the logarithmic likelihood function is globally concave.
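As an illustration of the Newton–Raphson iteration mentioned above, the following is a minimal sketch for a Poisson regression with an intercept and a single regressor. It is not the estimation code behind the reported results; the function name `poisson_newton_raphson` and the toy data are assumptions made for the example.

```python
import math


def poisson_newton_raphson(x, y, tol=1e-10, max_iter=50):
    """Fit E[y|x] = exp(b0 + b1*x) by Newton-Raphson on the Poisson log-likelihood."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0  # start from the intercept-only fit
    for _ in range(max_iter):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector: sum over i of (y_i - mu_i) * (1, x_i)
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Negative Hessian: sum over i of mu_i * (1, x_i)(1, x_i)'
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (negative Hessian) * step = score
        s0 = (h11 * g0 - h01 * g1) / det
        s1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + s0, b1 + s1
        if abs(s0) + abs(s1) < tol:
            break
    return b0, b1


# Hypothetical toy data: counts that grow roughly exponentially in x.
x_toy = [0.0, 1.0, 2.0, 3.0, 4.0]
y_toy = [1, 2, 4, 7, 12]
b0_hat, b1_hat = poisson_newton_raphson(x_toy, y_toy)
print(b0_hat, b1_hat)
```

Because the negative Hessian is a weighted sum of outer products with positive weights, it is positive definite whenever the regressors are not collinear, which is the concavity property that makes the iteration well behaved.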

Our estimation strategy follows two steps. In the first step, we estimate the Poisson model, pooling the data and including dummy variables to capture any temporal effects. However, the Poisson model restricts the mean and variance of the dependent variable to be equal (a requirement that cannot always be met in practice), so this framework breaks down when the data are overdispersed, that is, when the variance of the dependent variable is greater than the mean. If the data show overdispersion, the standard errors of the Poisson model will be biased downwards, which gives spuriously high values for the t-statistics (Cameron and Trivedi 1986). To provide results that are as robust as possible, we consider two solutions to this problem. The first is to apply the Eicker–White correction to obtain robust values of the standard error in Poisson models. The second is the estimation of a negative binomial model that assumes that the variance is a quadratic function of the mean (the density function, logarithmic likelihood function, first-order conditions, etc. are similar, and are discussed in detail in Cameron and Trivedi 1998).
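The two variance assumptions can be written side by side. The expression for the negative binomial model below is the standard quadratic-variance (NB2) parameterization, which we assume corresponds to the model described above; the symbol α for the dispersion parameter is notation introduced here for illustration.

$$ \text{Poisson: } \operatorname{Var}(y_{i} |x_{i} ) = \mu_{i}, \qquad \text{negative binomial: } \operatorname{Var}(y_{i} |x_{i} ) = \mu_{i} + \alpha \mu_{i}^{2}, \quad \alpha \ge 0. $$

Setting α = 0 recovers the Poisson case, so a test of α = 0 in the negative binomial model is a test for overdispersion.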

In the second step, we relax the assumption of a correct specification and maintain the assumption that regressors are strictly exogenous, but consider a multiplicative constant-time unobserved effect:

$$ \mu_{it} = E(y_{it} |x_{it} ,\alpha_{i} ) = \alpha_{i} \exp (x_{it} \beta ) = \exp (\delta_{i} + x_{it} \beta ), \quad \delta_{i} = \ln \alpha_{i}. $$

The economic justification for this new specification relies on the possibility that several fixed features, such as differences in creativity or infrastructure across geographical areas, may affect new firm formation. Details of this model can be found in Cameron and Trivedi (1998, pp. 280–290) and Wooldridge (2002, pp. 668–678).

4 Data

We constructed a new dataset including information on 604 new firms in high-technology sectors across Spain. The principal outputs (graduates, research, and patents) are from 63 private and public Spanish universities, along with other variables included in our analysis. We chose high-technology sectors on the grounds of the following arguments. First, the determining factors of firm start-ups are likely to vary across industries (Audretsch and Fritsch 1999). Second, these firms are usually more R&D intensive. Therefore, and according to the literature on “absorptive capacity” (Cohen and Levinthal 1990), these companies are in the best position to capture external knowledge. We identify the high-technology firms with the classification used by the Organization for Economic Cooperation and Development (OECD).

We aggregate all data across the Spanish geographical areas for the period 2001 to 2004. We consider a geographical area as the zone of influence of university outputs; this may consist of a province where there is at least one university or a region when there is only one university for the whole region. This definition results in 36 geographical areas with at least one university (four are regions, the remainder are provinces). Consequently, our sample contains both provinces and regions, but the outcome is much better than considering regions alone because these geographical units are more homogeneous. Moreover, we include more observations: Spanish regions are fewer in number (only 17 in total) and in some cases are exceptionally heterogeneous (e.g., a large region such as Andalusia contains eight provinces).

The details of the data are as follows.

Dependent variable (new business location). Our dependent variable is the number of new businesses in high-technology sectors established in each geographical area of Spain from 2001 to 2004. A new business location is identified as a corporation that did not exist in the Central Mercantile Register (an official Spanish institution containing accounting information for all Spanish corporations) prior to a given year in a specific place (geographical area). We identify each new firm by the creation date (already available for each corporation at the Mercantile Register). Firms operating in other geographical areas and relocating to a new province are not included in our data if the creation date of the company was before 2001. Relocations of firms in those years (January 2001 to December 2004) are included as new corporations in the region to which they move. According to this criterion, we identify 604 new businesses in high-technology sectors in Spain from January 2001 to December 2004. The new businesses are classified in the university geographical areas according to the previous definition.

Although it is possible to obtain information on new firm formation for a longer period, we use the 4 years in our sample period due to the lack of data availability for some explanatory variables. The source of this variable is the Sistema Anual de Balances Ibéricos (SABI) database that includes information from the Central Mercantile Register. Table 3 provides the dependent variable frequencies; we include a final column showing the cumulative number of firms for the cumulative frequencies of observations (geographical areas).

Table 3 Dependent variable frequencies

Independent variables are as follows:

  • Sources of university spillovers (SUS). We consider three main outputs from 63 public and private universities in Spain from 1999 to 2002: the number of graduates in science and technology areas (UGRAD), the number of patents owned by universities (UTK), and the number of scientific papers in high-quality journals (USK). In this analysis, we consider international journals included in the Science Citation Index (SCI) as “high-quality journals” (other kinds of scientific output, such as publications in national journals, books, communications, book reviews, letters to the editor, replies, comments, abstracts, and similar writings are not included). The data considers simple or unweighted counts of articles for Spanish universities in scientific and engineering fields, available from the SCI research database published by Thomson Reuters. The data from each university is assigned to a geographical area according to its location. Table 4 lists the descriptive statistics and the sources of information.

  • Other relevant variables (ORV). Other variables in the analysis include labor costs (WHT) and taxes (TAX) along with other sources of spillovers (FHT) and agglomeration (POPD) (see Table 2), all for the period 1999–2002 and each geographical area. Table 4 shows the main statistics and sources.

Table 4 Descriptive statistics

5 Empirical results

Following our estimation strategy, two pooled Poisson models (including dummy variables to capture any temporal effects) were estimated as a first step (Table 5). Model I in Table 5 does not contain any variables representing the sources of university spillovers, whereas a second model (Model II) jointly incorporates all sources of university spillovers: patents, scientific publications, and graduates. A methodological problem is that the assumption of the equality of mean and variance in the Poisson model may not be reasonable. In order to cope with possible overdispersion in our estimated Poisson models, we ran several negative binomial regressions (not shown). The coefficient of overdispersion (alpha) is not statistically significant in any case, which supports the adequacy of our earlier Poisson specification. Further, the standard errors for each model in Table 5 are calculated using the heteroskedasticity-consistent Eicker–White correction, and in most cases these are more conservative than either the Newton or Berndt, Hall, Hall, and Hausman (BHHH) standard errors (this is a feasible way of avoiding downward-biased standard errors in Poisson models). We also ran additional regressions after removing some of the control variables, but we did not obtain any significant changes in the signs and magnitudes of the estimates. A multicollinearity analysis (correlation coefficients, variance inflation factor, and condition number) was also carried out and revealed no problems for our models.

Table 5 Results of Poisson models (1)

The results in Table 5 are the basis for responding to the three initial research questions. First, we test the relevance of university spillovers on new firm formation (hypothesis 1) by applying the likelihood-ratio test LR = −2(ln L_I − ln L_II), where L_I and L_II are the maximized likelihoods of Models I and II, to the joint significance of the variables lnUTK, lnUSK, and lnUGRAD in Model II against Model I (which does not include these variables). The value of the LR statistic, which involves three restrictions, is 13.7 (p = 0.0034), showing that the variables are jointly relevant. Consequently, the robust Poisson estimations favor hypothesis 1.
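The p-value of this test can be checked with a short sketch. It assumes that the LR statistic is compared with a chi-square distribution with three degrees of freedom (one per excluded variable); the function names are hypothetical, and only the value 13.7 comes from the text.

```python
import math


def lr_statistic(loglik_restricted, loglik_unrestricted):
    """LR = -2 (lnL_restricted - lnL_unrestricted)."""
    return -2.0 * (loglik_restricted - loglik_unrestricted)


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom.

    Closed form for k = 3: erfc(sqrt(x/2)) + sqrt(2x/pi) * exp(-x/2).
    """
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


p_value = chi2_sf_3df(13.7)  # LR statistic reported in the text
print(round(p_value, 4))
```

The result is close to 0.003, in line with the reported p = 0.0034 once rounding of the LR statistic is allowed for.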

Second, we identify which university output is more influential in encouraging new business location near universities (hypotheses 2, 3, and 4). We test these hypotheses by analyzing the individual significance of the estimated coefficients for each variable in Model II. The introduction of the variables reflecting the sources of spillovers provides a significant estimated coefficient only for the variable lnUGRAD. Therefore, taking into account the results for the robust estimation of the Poisson models in Table 5, only one source of spillovers—graduates—is statistically relevant in explaining the formation of new firms in high-technology sectors in Spain.

Third, the estimated coefficients for lnUTK, lnUSK, and lnUGRAD indicate the size of the effect of an increase in academic output on new business location. Given the logarithmic transformation and the exponential conditional mean in the Poisson model, these coefficients are elasticities: each gives the average percentage change in new firm formation for a 1% change in the explanatory variable. As the coefficients of lnUTK and lnUSK are not statistically significant, we focus on lnUGRAD, with an estimated coefficient of 0.39. Extrapolating this elasticity linearly, doubling the number of graduates would, for instance, lead to a 39.5% increase in the number of new firms in high-technology sectors.
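The elasticity reading holds exactly only for small changes. As a hedged illustration, the sketch below computes the change implied by the exponential conditional mean itself, holding all other regressors fixed; the helper `pct_change_in_mean` is hypothetical, and 0.39 is the coefficient on lnUGRAD reported in the text.

```python
def pct_change_in_mean(beta, pct_change_in_x):
    """Exact % change in exp(beta * ln x) when x rises by pct_change_in_x percent."""
    ratio = 1.0 + pct_change_in_x / 100.0
    return (ratio ** beta - 1.0) * 100.0


beta_grad = 0.39  # coefficient on lnUGRAD reported in the text

small = pct_change_in_mean(beta_grad, 1.0)    # 1% more graduates
large = pct_change_in_mean(beta_grad, 100.0)  # doubling the number of graduates
print(small, large)
```

For a 1% increase the implied change is very close to 0.39%, as the elasticity suggests. For a doubling, the exact expression 2^0.39 − 1 gives roughly 31%, somewhat below the linear extrapolation, so large changes are better evaluated with the exact formula.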

With respect to the other variables in the analysis, we found a significant effect in all models for lnFHT (the intra-industry spillovers proxied by the number of firms in the same sector in the same geographical area) and for POPD (agglomeration spillovers captured by the population density in each location). We did not find significant effects for the two cost variables: WHT (average labor costs in high-technology sectors) and TAX (local tax burden in each geographic area), although the negative signs for their coefficients in most of the models are as expected.

Finally, we went a step further by estimating a battery of additional Poisson models with unobserved fixed and random effects (as is well known, these models include unobserved effects: factors affecting y that are not systematically related to the observable explanatory variables whose effects are of interest). The summary of these estimations is as follows. First, according to the Hausman test, random-effects models are preferred to fixed-effects models. However, the estimates from the random-effects models are very similar to those obtained in the pooled models in Table 5. Second, in order to know which models are preferred, we performed a likelihood-ratio test of alpha = 0; this compares the panel estimator of the random-effects models with the pooled Poisson estimators. The results show that the random-effects models are not significantly different from the pooled models. Therefore, we found no evidence of any unobserved effects, meaning that the pooled models in Table 5 are correctly specified.

6 Conclusions

There is compelling evidence that university R&D generates spillovers with important effects on private innovation. However, their consequences for new business location are still an issue that requires much more research. This paper contributes to the growing empirical literature on the relation between university spillovers and firm location by focusing on the Spanish case and high-technology sectors. According to the recent literature, universities generate new knowledge by conducting their own research, producing technology, and educating students. These outputs involve tacit and codified knowledge that diffuses into the economy in a number of ways and thereby generates university spillovers. After controlling for cost factors, agglomeration characteristics, and time effects, our results show that the main source of university spillovers explaining new business location near universities is the number of graduates. We did not find any significant effect of university research activities or technological production on new firm location in high-technology sectors for the Spanish case.

From a policy point of view, these findings suggest that reinforcing the role of universities may encourage new business location through spillovers. Nevertheless, despite the attention that the new university mission (technological production) has received of late, the traditional mission of universities (education) is, according to our results, the only source of spillovers. Therefore, promoting education rather than the production of university technology may lead to different outcomes in terms of new business location.

In terms of the direction of future research, several points require clarification. First, we focus here on high-technology sectors, but this is a heterogeneous group; perhaps its disaggregation would give a different picture of economic activity. Second, the authors are currently involved in a broader extension of the methodology in this paper by considering European regions. Third, we could focus on indicators other than the number of new businesses. This could potentially include, for instance, the effects of university spillovers on firm performance.