1 Introduction

New and young firms play a leading role in US job creation (Haltiwanger et al. 2013), promote economic growth and innovation (Fritsch 2013), introduce new products and markets (Knight 2001), and drive technological evolution in regions (Fritsch and Mueller 2004). Given a wide variation in start-up rates across regions (Bosma et al. 2008; Reynolds et al. 2007), both scholars and policymakers strive to identify regional factors that help promote business formation. Recent research on the determinants of high-tech business entry increasingly relies on the knowledge spillover theory of entrepreneurship (KSTE), which contends that employees-turned-entrepreneurs start new companies in order to commercialize unused local knowledge generated by incumbent firms and universities (Acs et al. 2013, 2009; Plummer and Acs 2014). Ghio and co-authors (Ghio et al. 2015) document the growing influence of the KSTE among scholars and its spread to various fields of economics and management.

Empirical tests of the KSTE are predominantly based on investigations of total high-technology business entry as a function of regional knowledge generation (Lee et al. 2013; Plummer and Acs 2014; Qian and Acs 2013) with just a few analyses at a more detailed sector or industry level, which are mostly performed in countries other than the USA (Bonaccorsi et al. 2006; Lee et al. 2013). In the US context, there are virtually no empirical tests of the KSTE at a more detailed industry level (an exception is Tsvetkova 2015). The positive relationship that is typically found in more aggregate studies is interpreted as supporting evidence for the validity of the KSTE.

In this paper, we use an instrumental variable approach to perform a systematic test of the KSTE in US metropolitan statistical areas (MSAs). We develop a novel instrument to measure metropolitan knowledge production, which is based on the logic of the industry mix term from shift-share analysis. As a sensitivity test, we use alternative instruments recently proposed in the literature. Our analysis shows that the KSTE is inconsistent with the US empirical evidence in more disaggregated analyses of high-tech subsectors and a high-tech manufacturing industry. Specifically, the estimation results for total high-tech firm formation are consistent with the literature, i.e., greater knowledge generation leads to increased high-tech firm entry. The relationship also holds for business formation in the high-tech nongoods sector. For the high-tech goods sector, however, the results suggest that the KSTE is not the dominant feature at the MSA level. The negative relationship between metropolitan knowledge generation and business entry in the high-tech goods-producing sector and in the computer and electronic product manufacturing industry implies that other mechanisms are likely to be at play.

The contribution of this paper is twofold. First, it shows that the mechanism of endogenous firm formation is more complex than the KSTE, and it may not be generalizable across all industrial settings in the US MSAs. In the US high-tech goods-producing sector and in computer and electronics product manufacturing, which is one of the most innovative industries, for example, our results reveal effects that contradict the KSTE. Although our estimation is unable to explicate the mechanisms behind this finding, a review of international empirical evidence points to the role played by institutional arrangements and/or business practices (business culture) that vary across countries. Second, the paper offers an explanation of the relationship between knowledge generation and business entry using the US data and earlier research on the determinants of entrepreneurial intensity in a region.

The rest of the paper is organized as follows. The next section briefly reviews the KSTE and its empirical tests including more disaggregated analyses from other countries. We then perform systematic empirical tests in the US context. Section 3 describes our estimation approach, variables, and the data. Section 4 presents results and discussion followed by sensitivity checks in Section 5. Given that our empirical findings diverge from what would be expected according to the theory and existing international evidence, Section 6 proposes alternative theoretical explanations of the observed relationships that are consistent with the patterns revealed by the analysis. Section 7 summarizes our findings and outlines potential avenues for future research.

2 The KSTE perspective

The knowledge spillover theory of entrepreneurship (Acs et al. 2013; Acs et al. 2009) describes a mechanism that links regional knowledge generation to regional business entry. According to the theory, knowledge is produced by incumbent firms and universities (Plummer and Acs 2014), but in the process of market implementation, part of the newly generated knowledge is discarded by its creators. The unused or underutilized part of the knowledge stock of a region becomes the source of business formation opportunities (and in the case of knowledge being at least to a degree a public good, the same knowledge can be used by multiple actors). Employees or others who have access to the discarded knowledge may choose to quit their employer and set up a firm with the purpose of commercializing the unused idea.

The KSTE offers an elegant explanation of the importance of local knowledge for regional economic performance and growth. Arguably for this reason, it attracted much attention in the empirical and theoretical literature (Audretsch and Belitski 2013; Ghio et al. 2015; Qian and Acs 2013; Tsvetkova 2015). A number of theoretical extensions and theoretical modifications have been proposed, which in general suggest additional factors that are important for the ability of knowledge to lead to increased entrepreneurial entry. For example, Qian and Acs (2013) introduce absorptive capacity as an important consideration in the study of the knowledge—business entry nexus, whereas Audretsch and Belitski (2013) point to the role played by cultural diversity that is essential for the knowledge generation and its translation into firm formation. Plummer and Acs (2014) highlight localized competition, which can both promote and hinder start-ups in more knowledge-intensive environments.

The majority of the KSTE empirical tests are basically an investigation of the effects of knowledge production in a region on high-tech business entry, as the mechanism suggested by the KSTE is intuitively more applicable in high-tech settings. In many cases, the tests demonstrate a positive relationship, which is interpreted as confirming the validity of the KSTE. For instance in the US context, Qian and Acs (2013) use patenting intensity in MSAs as a measure of regional knowledge production and high-tech business entry as the dependent variable. Likewise, Plummer and Acs (2014) use the same metrics for their dependent and explanatory variables in a study of California and Colorado counties. There are virtually no detailed sector- or industry-level analyses in the USA. A rare example is an empirical test in the professional, scientific, and technical services industry by Tsvetkova (2015). The author finds that new knowledge generation per se is insufficient and MSAs need to possess sufficient entrepreneurial capital (already in place, unlike the KSTE view that knowledge would stimulate entrepreneurship) for the metropolitan knowledge generation to impact regional economic performance within the studied industry.

The evidence from other countries is mostly supportive of the KSTE but usually with qualifications. Lee et al. (2013) study the effects of patenting on firm formation in high-tech and low-tech manufacturing in Korea. Although the results generally support the KSTE, in the full model that accounts for both inter-regional and intra-regional effects, the latter (stipulated by the KSTE) are significant at the 10% only, while the positive effect of spatially lagged patenting intensity is significant at the conventional 5% level. Colombelli (2016) uses a stock of patent applications (with 15% annual depreciation rate) in Italian NUTS 3 regions as an explanatory variable in a model of innovative business entry. The start-ups included in the dependent variable calculation belong to several sectors, including both high-tech manufacturing and high-tech services. The base models support the hypothesized positive link between patent applications and new firm formation, although this effect disappears when measures of technological variety of the knowledge base and the degree of related and unrelated variety are included.

An associated body of work examines the effects of local human capital and university research intensity on start-up rates nearby. Bonaccorsi et al. (2013) find that Italian universities specializing in applied sciences and engineering have a particularly strong positive effect on provincial firm formation in service industries. Baptista and co-authors (Baptista and Mendonça 2010; Baptista et al. 2011) show a positive effect of universities on high-tech entrepreneurship in Portuguese municipalities, although the exact mechanism of such relationship is not explored. The authors highlight the importance of various factors associated with the presence of institutions of higher education, such as the number of graduates in different programs of study and others. Evidence from Germany suggests that a mere presence of universities in a region is a strong predictor of firm entry in high-technology and technologically advanced manufacturing industries (Fritsch and Aamoucke 2013).

Despite seeming consensus, Knoben et al. (2011) warn of limited applicability of the KSTE in an analysis of Dutch municipalities over the 1999–2006 period. The authors study employment growth in new companies as a function of average investment in R&D per worker, educational attainment, and the presence of a major university (including technical universities). The findings suggest that agglomeration economies are the strongest determinants, but when agglomeration is not properly accounted for in empirical testing, the results tend to point to the key role of knowledge production, which is also consistent with the findings in Fallah et al. (2014).

In sum, the KSTE is a good starting point to study the link between innovation and business entry; however, more detailed industry-level analyses are needed to assess applicability of the theory in various settings. The existing empirical tests in the US context tend to be general and focus on broad sectors, such as all high-tech industries. More detailed sectoral analyses performed in other countries do not provide unequivocal confirmation of the KSTE, while the selection of the sectors and industries in the literature appears to be ad hoc. To the best of our knowledge, there is no systematic investigation of the relationship between knowledge generation and firm formation across high-tech subsectors and industries with the analysis gradually moving to more knowledge-intensive settings in order to assess KSTE applicability across various industrial contexts. This paper fills this gap.

3 Empirical approach overview, variables, and data sources

3.1 Empirical background

The KSTE explains opportunity entrepreneurship in an innovative environment and is naturally more pertinent to urban areas. First, due to the market power of cities, innovations are more likely to be developed and commercialized in urban areas (Shearmur 2012), which are also more likely to be home to opportunity entrepreneurship (Low et al. 2005). Next, higher population and business densities facilitate information flows that are crucial for successful innovation and opportunity entrepreneurship. Finally, the knowledge spillover theory of entrepreneurship is an integral component of an entrepreneurial ecosystem perspective, which is an urban phenomenon (Audretsch and Belitski 2016) mostly driven by the differences in the levels of human capital, business cultures, presence of infrastructure, and other characteristics that are important for entrepreneurship in general and business entry in particular (Bosma and Sternberg 2014; Mumford 2016). We, therefore, use US MSAs (361 areas defined by the Office of Management and Budget (OMB) in November 2008) as the unit of analysis. Our data span 15 years between 1997 and 2011. The period is determined by data availability for the dependent variables (business entry). In contrast, the data for explanatory variables (knowledge generation proxies) are available for earlier years. This allows us to use lagged values of the explanatory variables in the analysis.

The validity of the central mechanism of the KSTE (business formation by employees-turned-entrepreneurs who commercialize an innovation) as a primary driver of aggregate metropolitan firm entry can be further verified by empirical analysis that goes from more general (less knowledge-based) to more knowledge-reliant sectors, subsectors, and industries. To this end, we start our empirical estimation with models that explain total start-up rates in US MSAs. No significant relationship between knowledge generation and business entry is expected here because only a few companies in the whole pool of entrants are likely to be the result of endogenous firm formation. We further focus our analysis on business entry in all high-tech industries, then separately on the high-tech nongoods-producing and high-tech goods-producing subsectors, and finally on computer and electronic product manufacturing. Such a research design allows testing the effect of knowledge on firm entry in increasingly innovation-dependent industries. If new start-ups are indeed predominantly driven by employees commercializing new ideas, the positive effect of knowledge is expected to become more pronounced as we move from less knowledge-intensive to more knowledge-intensive settings.

3.2 The dependent variables

The knowledge spillover theory of entrepreneurship explains business entry as a function of knowledge generation. The dependent variable in this study is metropolitan business entry, Births. The data come from the US Census Bureau Business Information Tracking Series (BITS) tables,Footnote 1 which provide the number of start-ups at a county level broken down by industry. Using the BITS data, we calculate five MSA-level dependent variables: total firm entry, firm entry in high-technology industries, firm entry in high-tech nongoods-producing industries, firm entry in high-tech goods-producing industries, and firm entry in the computer and electronic product manufacturing industry. The dependent variables are defined as the logarithm of the number of start-ups per 1000 metropolitan residents (the population data come from the US Census Bureau). The industrial delineation in the BITS data follows the classifications used at the time of the data release and ranges from Standard Industrial Classification (SIC) 1987 codes to North American Industry Classification System (NAICS) 2007.
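The definition of the dependent variables can be illustrated with a minimal sketch. The function name `log_entry_rate` is hypothetical, the natural logarithm is assumed (the text says only "logarithm"), and the treatment of MSA-years with zero start-ups is not described in the text, so the sketch simply rejects such inputs rather than guessing at a convention.

```python
import math


def log_entry_rate(startups: float, population: float) -> float:
    """Log of start-ups per 1000 residents (natural log is an assumption)."""
    if startups <= 0 or population <= 0:
        # The handling of zero-entry MSA-years is not stated in the text,
        # so this sketch refuses to compute a value for them.
        raise ValueError("startups and population must be positive")
    per_thousand = startups / population * 1000.0
    return math.log(per_thousand)
```

For example, an MSA with 500 start-ups and one million residents has 0.5 start-ups per 1000 residents, so the function returns the natural log of 0.5.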

We use a list of high-tech industries used by Plummer and Acs (2014) to calculate high-tech entry rates for years when SIC1987 classification was used and a list of high-tech industries provided by Fallah et al. (2014) for years when NAICS categories are used.Footnote 2 The paper by Fallah et al. uses the 2002 NAICS definition. We use standard concordances from the US Census Bureau website to bridge industries and define the high-tech sector when the BITS data rely on more recent NAICS definitions. The high-tech sector is further split into goods-producing industries (i.e., those in agriculture, mining, manufacturing, construction) and nongoods-producing industries (all other except government), and entry rates are calculated for both subsectors separately.

Our last dependent variable is the firm entry in computer and electronic product manufacturing (NAICS334) industry that includes NAICS 3341 (Computer and Peripheral Equipment Manufacturing), NAICS 3342 (Communications Equipment Manufacturing), NAICS 3343 (Audio and Video Equipment Manufacturing), NAICS 3344 (Semiconductor and Other Electronic Component Manufacturing), NAICS 3345 (Navigational, Measuring, Electromedical, and Control Instruments Manufacturing), and NAICS 3346 (Manufacturing and Reproducing Magnetic and Optical Media).Footnote 3 The NAICS334 industryFootnote 4 is important to the US economy for a number of reasons. It is consistently listed as one of the most innovative industries; it creates high-paid jobs and significantly contributes to the US economic output (Helper et al. 2012; Houseman et al. 2015; Tsvetkova et al. 2014). The success of individual computer and electronic manufacturing companies crucially depends on introduction of new products and technologies and on the ability to benefit from knowledge flows (BLS 2011), which implies that the KSTE is expected to hold in the empirical analysis in computer and electronic manufacturing in particular.

3.3 The explanatory variables

The pool of ideas available for market exploitation is the key explanatory variable in the knowledge spillover theory of entrepreneurship. Because it is practically impossible to measure the pool of unutilized ideas, existing studies often use various metrics of knowledge production in a region as proxies. Perhaps the most widely used measure is the patenting activity in a region (Camp 2005; Plummer and Acs 2014; Qian and Acs 2013). This measure has a number of well-known limitations. For example, it captures only a fraction of created knowledge ignoring most process innovations altogether and assigns equal economic value to all patents, which is clearly an unrealistic assumption (Feldman and Audretsch 1999; Griliches 1979; Pakes and Griliches 1980). Despite the limitations, however, patent count may be the best available approximation of regional innovative activity (Feser 2002; Griliches 1990), especially in urban areas (Acs et al. 2002). Patenting intensity as a measure of knowledge generation is especially applicable when testing the KSTE, since the theory suggests that “[e]ntrepreneurial opportunities … lie in the emergence of new knowledge as a result of R&D activity” (Qian and Acs 2013, p. 188).

We use the log of the population-adjusted number of utility patents granted to inventors residing in a metropolitan area (Patents) as the main explanatory variable. Patents are assigned to the MSA of residence of the first inventor in the year when the US PTO granted them. Since it takes several years on average to receive a patent for an invention, we lag Patents by 4 years in the empirical estimation.Footnote 5 It is hoped that such operationalization is able to better capture knowledge production in a region that can facilitate knowledge spillovers as opposed to the time when knowledge is protected by a patent with the goal of preventing knowledge sharing. The county-level patent count provided by the US PTO is the data source for the variable aggregated to the MSA level.

The specifics of the endogenous firm formation mechanism proposed by the KSTE entail a circular dependence of business entry on knowledge production and of knowledge production within firms on business entry. Such a recursive relationship is likely to lead to the endogeneity problem in empirical estimation. Measurement issues may also lead to endogeneity.

We, thus, rely on the instrumental variable approach as our main estimation strategy, which was previously used in empirical tests of the KSTE (e.g., Plummer and Acs 2014). To instrument for Patents, two variables are used as the instrument set. First is a measure of patenting activity (PatMix) that is based on the logic of the industry mix term from the shift-share analysis (the so-called Bartik’s (1991) instrument), which is widely used in economics and regional science (Betz et al. 2015; Blanchard and Katz 1992; Tsvetkova and Partridge 2016; Tsvetkova et al. 2017; Tsvetkova et al. 2018). The instrument is based on the national patenting activity across groups of manufacturing industries (technology classes) and each MSA’s manufacturing industrial composition.Footnote 6

PatMix is calculated as in (1) below

$$ PatMix={\sum}_{i=1}^n{S}_i\log \left({NP}_i\right) $$
(1)

where Si is the share of employment in manufacturing industries that correspond to technology class i in total manufacturing MSA employment and NPi stands for the national number of patents in technology class i as reported by the US PTOFootnote 7, and there are n technology classes. PatMix is calculated using employment data (aggregated to metropolitan level) from the Economic Modeling Specialists International (EMSI), a proprietary dataset that contains employment by four-digit NAICS industry codes for all US counties. The PatMix variable then reflects whether an MSA has a composition of industries that exhibit high or low patenting rates. Because national patenting rates by industry are exogenous to a given MSA and any lagged labor supply responses associated with industry composition should be limited after controlling for demographic factors such as education and racial composition, PatMix is expected to be exogenous. National patenting by industry and year comes from the US PTO report U.S. Patenting Trends by NAICS Industry Category Utility Patent Grants, Calendar Years 1963–2012.Footnote 8
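Equation (1) can be sketched in a few lines of code. The function name `pat_mix`, the dictionary-based inputs, and the use of the natural logarithm are assumptions made for illustration; the mapping from four-digit NAICS industries to technology classes is taken as already done.

```python
import math
from typing import Mapping


def pat_mix(mfg_employment: Mapping[str, float],
            national_patents: Mapping[str, float]) -> float:
    """Employment-share-weighted sum of log national patent counts.

    mfg_employment: MSA manufacturing employment by technology class.
    national_patents: national patent counts by the same classes.
    """
    total = sum(mfg_employment.values())
    if total <= 0:
        raise ValueError("total manufacturing employment must be positive")
    value = 0.0
    for tech_class, employment in mfg_employment.items():
        share = employment / total  # S_i in Eq. (1)
        value += share * math.log(national_patents[tech_class])  # log(NP_i)
    return value
```

Because the weights are local employment shares while the patent counts are national, two MSAs differ in this measure only through their industry composition, which is the shift-share logic described above.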

Following previous empirical tests of the KSTE (Plummer and Acs 2014), we include the second variable in the instrument set, the logged share of high-tech employment in nongoods-producing industries in an MSA (HighTechNGEmpShare)Footnote 9 calculated from EMSI data using the list of high-tech industries provided by Fallah et al. (2014). In previous studies, the share of high-tech employment was used (aside from being an instrument for knowledge production in empirical tests of the KSTE) as an approximation for the pool of people who generate local knowledge and the thickness of local input markets that has been shown to be generally important for entrepreneurship and high-tech entrepreneurship in particular (Bublitz et al. 2015; Dohse and Vaona 2014; Helsley and Strange 2011). In our case, in addition to the more traditional justifications, we use the share of high-tech employment in nongoods-producing services to capture some of the ideas that originate in services and those related to the process innovation, which are not captured by patenting measures that are primarily from manufacturing. An instrument set consisting of PatMix and HighTechNGEmpShare performs best in terms of model identification, although the estimation results do not change if we use the share of high-tech employment (HighTechEmpShare) or the share of high-tech employment in goods-producing industries (HighTechGEmpShare) instead of HighTechNGEmpShare.

3.4 Control variables

In addition to a measure of metropolitan knowledge generation, all models include a set of control variables that account for various factors relevant to business entry in metro economies. Most generally, these controls are intended to capture the area’s industry structure, the level of human capital, agglomeration economies, and economic conditions.

We approximate the industrial structure of an MSA by a measure of industrial diversity and a measure of diversity of high-tech industries. Although some researchers believe that specialization of a regional economy in one industry or a sector should lead to superior performance and knowledge generation in this sector (Audretsch 2003; van der Panne 2004; van Stel and Nieuwenhuijsen 2004), more recently (and with regard to the high-tech industries in particular), a consensus seems to emerge that a cross-fertilization of ideas in the spirit of Jacobs (1969) is more important for a vibrant innovative environment, idea flows, and, eventually, entrepreneurship (Feldman and Audretsch 1999; Frenken et al. 2007; van Stel and Nieuwenhuijsen 2004; Fallah et al. 2014). Bishop (2012) shows that the diversity of knowledge is an important driver of the UK high-tech business entry. In line with previous research, we approximate metropolitan industrial diversity by an entropy measure calculated at the four-digit NAICSFootnote 10 level and based on all industries present in an MSA. We supplement this measure by a separate entropy measure calculated only for high-tech industries to capture the diversity of the knowledge base.

More specifically, we calculate IndDiversity as in (2) below.

$$ IndDiversity={\sum}_{i=1}^n{S}_i\ln \left(\frac{1}{S_i}\right) $$
(2)

where Si stands for the share of a four-digit NAICS industry i employment in total metropolitan employment and there are n industries. The entropy index is zero if all employment is concentrated in one industry, and it is maximized if employment is distributed evenly among industries. The diversity of knowledge base is calculated in the same way but uses only four-digit NAICS high-tech industries as defined by Fallah and co-authors (Fallah et al. 2014). In this case, Si is the share of a high-tech four-digit NAICS industry and there are n high-tech industries.
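The entropy measure in Eq. (2) and its two boundary properties can be checked with a small sketch. The function name `entropy_index` is hypothetical; industries with zero employment are skipped on the usual convention that their contribution to the sum is zero, which is an assumption not stated in the text.

```python
import math
from typing import Iterable


def entropy_index(employment_by_industry: Iterable[float]) -> float:
    """Entropy diversity index: sum over industries of S_i * ln(1 / S_i)."""
    counts = [e for e in employment_by_industry if e > 0]
    total = sum(counts)
    if total <= 0:
        raise ValueError("total employment must be positive")
    index = 0.0
    for employment in counts:
        share = employment / total  # S_i in Eq. (2)
        index += share * math.log(1.0 / share)
    return index
```

With all employment in one industry the index is zero, and with employment spread evenly over n industries it reaches its maximum of ln(n). The same function applies to the high-tech diversity measure if only high-tech industries are passed in.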

To capture the industrial conditions that are relevant to the dependent variables, we introduce a measure of localized competition.Footnote 11 Plummer and Acs (2014) argue that localized competition (competition for ideas embedded in people) is important for business entry in the knowledge spillover theory of entrepreneurship. They contend that the degree of localized competition in this context may have suppressing effects on firm formation, as a larger number of firms would increase the likelihood of idea utilization (i.e., fewer unutilized or underutilized ideas) and may decrease chances of success via increased competition. On the other hand, more efficient utilization of ideas should stimulate knowledge production and promote a knowledge-rich environment that offers greater entrepreneurial opportunities. The variable LocalComp is our measure of localized competition. Following Plummer and Acs (2014), it is calculated from the EMSI database as the ratio of establishments to employees in an MSA divided by the same ratio for the whole economy.Footnote 12 We use total establishment and employment counts when fitting models of total MSA business entry; for the remaining models, which focus on high-tech, we use high-tech establishments and employment as discussed above.
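The localized competition measure is a location-quotient-style ratio, sketched below. The function and argument names are hypothetical, and "the whole economy" is read here as the national totals, which is an assumption.

```python
def local_comp(msa_establishments: float, msa_employment: float,
               national_establishments: float,
               national_employment: float) -> float:
    """Establishments per employee in an MSA relative to the national ratio."""
    if msa_employment <= 0 or national_employment <= 0:
        raise ValueError("employment counts must be positive")
    if national_establishments <= 0:
        raise ValueError("national establishment count must be positive")
    msa_ratio = msa_establishments / msa_employment
    national_ratio = national_establishments / national_employment
    return msa_ratio / national_ratio
```

A value above one indicates more establishments per employee than in the reference economy, that is, smaller firms on average and more intense local competition. For the high-tech models, the inputs would be high-tech establishment and employment counts.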

Another strand of literature suggests that creativity of well-educated people and the diversity of cultural backgrounds is important within the KSTE framework. As such, creativity and diversity should facilitate business formation by creating an environment favorable for information sharing and learning (Audretsch and Belitski 2013). To account for such a possibility, our models include a measure of employment concentration in professional services, Professionals. This variable is calculated from the EMSI data as the number of employed in NAICS52 (Finance and Insurance), NAICS54 (Professional, Scientific, and Technical Services), and NAICS55 (Management of Companies and Enterprises) per 1000 working people in an MSA. The share of foreign-born population in an MSA—Foreign—is an approximation for what Audretsch and Belitski (2013) call the melting pot index. The US Census Bureau is the data source for this variable.

Density supports knowledge spillovers and is an integral part of agglomeration (Griliches 1992; Koo 2005; López-Bazo et al. 2004). This study uses population density (PopDensity), calculated from the US Census Bureau data, to reflect this urban feature. The share of the adult population with a graduate or professional degree (GradProfShare) is aggregated from the US Census Bureau county-level data. The measure likely reflects the pool of potential “knowledge” entrepreneurs and indicates the availability of an educated workforce in a region.

Personal income growth (IncGrowth) and the unemployment rate (UnempRate) are included to account for metropolitan economic conditions. A growing per capita income should reflect widening opportunities for local companies, as an expanding local market should stimulate business entry (Armington and Acs 2002). The unemployment rate is a measure of economic hardships. Since the estimation period includes the Great Recession, these two measures help capture the degree to which each MSA was sensitive to the downward economic trends in the national economy and may proxy for labor-force availability.Footnote 13 The US Bureau of Economic Analysis (BEA) is the source for the former variable, while the latter variable comes from the US Bureau of Labor Statistics (BLS).

In addition to these variables, all models include MSA and year fixed effects to factor out constant metropolitan characteristics and the economy-wide cyclical trends that uniformly affect all MSAs. A concise summary of all continuous variables and their sources is in Appendix Table 5. Appendix Table 6 gives summary statistics of the variables used in the main specification. The VIF statistics are shown in Appendix Table 7.

3.5 Estimation approach

The regression-based test of endogeneity using our main instrument set indicates that knowledge production tends to be endogenous in all models dealing with the high-tech sector (overall, the two subsectors, and the NAICS334 industry).Footnote 14 Thus, these models are estimated using an IV approach (Baum et al. 2007; Schaffer 2012) to derive the within estimates. Factoring out location-specific unchanging traits is important for at least two reasons. First, the relationship between regional characteristics and firm formation may differ by location (Cheng and Li 2011), and, second, the within nature of the estimates produced by the fixed-effect models shows the expected changes in the outcome variable if characteristics of an MSA change and, thus, allows us to infer causal responses (along with the help of our IV approach).

The variable Patents is instrumented with PatMix and HighTechNGEmpShare (Eq. (4)) where all patent-related measures are lagged by 4 years. Births is modeled as a function of a fitted value of patenting activity in an MSA, a vector of control variables X and a set of MSA and year fixed effects (Eq. (3)). Equation (4) fits an instrument for the stock of knowledge that can be exploited by potential knowledge entrepreneurs, while Eq. (3) presents the core model.

$$ {Births}_{it}=\alpha +{\beta}_1{\widehat{lnPatents}}_{it}+{\boldsymbol{X}}_{it}{\boldsymbol{\beta}}_x+{\delta}_i+{\theta}_t+{\varepsilon}_{it} $$
(3)
$$ {\widehat{lnPatents}}_{it}=\alpha +{\beta}_1{PatMix}_{it}+{\beta}_2{HighTechNGEmpShare}_{it}+{\varepsilon}_{it} $$
(4)

where subscript i refers to an MSA, subscript t to a year, δi is an MSA fixed effect, θt is a time dummy (with 1997 serving as the baseline), and εit is a robust error term (when fitting an instrument, all variables that enter Eq. (3) also enter Eq. (4) by default; Appendix Table 8 shows first-stage estimation results). We estimate Eq. (3) separately for each of the five dependent variables (total start-up rates, start-up rates in the high-tech sector, start-up rates in the high-tech nongoods-producing sector, start-up rates in the high-tech goods-producing sector, and start-up rates in computer and electronic manufacturing).
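The two-step logic of Eqs. (3) and (4) can be sketched as a two-stage least squares estimator on within-transformed data. This is an illustration under stated assumptions, not the estimation routine used in the paper: the function names are hypothetical, the panel is assumed to be balanced (so that two-way demeaning removes the MSA and year effects exactly), the patent-related inputs are assumed to be lagged by the caller, and only point estimates are produced (robust standard errors and the identification tests are omitted).

```python
import numpy as np


def two_way_demean(a, unit, time):
    """Within transformation for a balanced panel: remove unit and time means."""
    a = np.asarray(a, dtype=float)
    if a.ndim == 1:
        a = a[:, None]
    out = a.copy()
    for ids in (np.asarray(unit), np.asarray(time)):
        for g in np.unique(ids):
            mask = ids == g
            out[mask] -= a[mask].mean(axis=0)
    # Add back the grand mean, which was subtracted twice.
    return out + a.mean(axis=0)


def fe_2sls(y, endog, instruments, controls, unit, time):
    """Point estimates from 2SLS with unit and time fixed effects.

    Returns the coefficient on the endogenous regressor first,
    followed by the coefficients on the controls.
    """
    y_w = two_way_demean(y, unit, time)
    x_w = two_way_demean(endog, unit, time)
    z_w = two_way_demean(instruments, unit, time)
    c_w = two_way_demean(controls, unit, time)

    # First stage (cf. Eq. (4)): endogenous regressor on instruments
    # and on all controls that enter the second stage.
    first_stage = np.hstack([z_w, c_w])
    gamma, *_ = np.linalg.lstsq(first_stage, x_w, rcond=None)
    x_hat = first_stage @ gamma

    # Second stage (cf. Eq. (3)): outcome on fitted values and controls.
    second_stage = np.hstack([x_hat, c_w])
    beta, *_ = np.linalg.lstsq(second_stage, y_w, rcond=None)
    return beta.ravel()
```

Running the function once per dependent variable mirrors the five separate estimations described above. In applied work, a dedicated IV routine that also reports first-stage strength and overidentification tests would be preferable to this hand-rolled version.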

4 Results and discussion

The main estimation results for all models are shown in Table 1 (variables used in a log form are indicated by the postscript (ln)). The endogeneity test (the P value of its χ2 (1)-distributed statistic is in the last row of Table 1) indicates a possible endogeneity problem in all high-tech business entry models and justifies the IV approach. The selected instruments are strong (the first-stage Kleibergen-Paap rk Wald F statistic is always above 25, whereas the Stock and Yogo (2005) weak identification test critical value (at 10%) is 19.93); the Hansen J test suggests that all models are identified and the exclusion restriction holds (P values of the overidentification test are above the cutoff value of 0.05).

Table 1 IV estimation results for firm entry

At least two important conclusions follow from Table 1. First and foremost, knowledge generation has heterogeneous effects on business formation depending on the (high-tech) sector and industry. For example, accounting for MSA fixed effects, metropolitan knowledge production is positively related to firm entry in high-technology industries 4 years later. This result is in line with the previous KSTE empirical tests and, if taken in isolation, would support the validity of the knowledge spillover theory of entrepreneurship. Our results, however, show that this finding is driven entirely by the positive relationship between knowledge and firm formation in the high-tech nongoods (services) producing industries. In the high-tech goods-producing industries, the relationship is negative, i.e., greater knowledge production in an MSA is related to lower business entry. The same is true for computer and electronic product manufacturing, one of the most innovative and productive US industries. One reason may be that high patent production indicates a different segment of manufacturing, such as more R&D facilities. Another may be that patents serve to monopolize certain goods or processes, which limits new firm entry, or to intensify competition in general, which discourages potential entrants. In addition to the potential role of patents in discouraging entry, incumbent companies are likely to hinder their employees' exit options, especially in relation to starting a new business. This is mostly made possible by noncompete agreements, whose scope and enforceability depend on state laws. Laws that increasingly favor employers in their ability to impose exit restrictions on employees have been shown to be associated with reduced firm entry in high-tech industries but also with increased investments by incumbents, which is likely to increase knowledge generation in these industries (Jeffers 2018). Related to this, as high-tech manufacturing becomes more knowledge-intensive, costs of starting a business are likely to be higher, making entry more difficult (Vatne 2017).

Second, there are clear differences in business formation drivers between the high-tech nongoods sector and the high-tech goods sector, suggesting very different regimes, most likely related to differences in entry barriers as well as the fact that high-tech service firms have entirely different input-output clienteles. Besides, firms manufacturing a high-tech product may want their facilities in lower cost locales. These differences also imply that a theory that predicts a dominant straightforward relationship between knowledge production and business entry is unlikely to explain firm formation across high-tech subsectors. According to Table 1, aside from divergent effects of knowledge on firm formation in high-tech goods- and high-tech nongoods-producing industries, the effects of high-tech diversity, localized competition, population density, and unemployment also differ. The diversity of high-tech industries in a metro area stimulates business entry in high-tech services (perhaps due to a larger base of high-tech service clients or a less-risky set of clients that are not tied to the health of one industry), but not in high-tech goods-producing subsectors. Localized competition, which empirically can be seen as the opposite of market power in an MSA, and population density have the same effect. Higher unemployment, on the other hand, is related to lower firm formation rates in high-tech industries in general and in high-tech services in particular, suggesting that general workforce availability is less important than overall economic conditions.

5 Sensitivity analysis

To test the sensitivity of the results reported in Sect. 4, we now use alternative estimation approaches and different instrument sets. We start with a 3SLS procedure, which was used as the main estimation approach in previous KSTE tests (Plummer and Acs 2014). The three-stage estimation procedure involves developing instrumented values for all endogenous variables, obtaining consistent estimates of the covariance matrix of error terms, and using the values from the first step and the consistent covariance matrix from the second step to perform a GLS-type estimation (see footnote 15). Thus, the 3SLS approach accounts for the possibility that the residuals are correlated in a particular MSA across firm entry types.
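For reference, the third (GLS-type) step can be written compactly in its standard textbook form. The notation below is an assumption introduced only for this illustration and does not appear elsewhere in the paper: y stacks the five firm entry equations one after another, Ẑ is the block-diagonal matrix of regressors with endogenous variables replaced by their first-step fitted values, Σ̂ is the cross-equation error covariance matrix estimated from the second-step residuals, and I is an identity matrix whose dimension equals the number of observations per equation.

$$ {\widehat{\boldsymbol{\delta}}}_{3 SLS}={\left[{\widehat{\boldsymbol{Z}}}^{\prime}\left({\widehat{\boldsymbol{\Sigma}}}^{-1}\otimes \boldsymbol{I}\right)\widehat{\boldsymbol{Z}}\right]}^{-1}{\widehat{\boldsymbol{Z}}}^{\prime}\left({\widehat{\boldsymbol{\Sigma}}}^{-1}\otimes \boldsymbol{I}\right)\boldsymbol{y} $$

If Σ̂ were diagonal, i.e., if residuals were uncorrelated across firm entry types, the weighting would drop out and the expression would reduce to equation-by-equation 2SLS.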

Table 2 reports 3SLS second-stage estimation results (first-stage results are shown in Appendix Table 9), which are practically identical to the ones reported in Table 1. More specifically, knowledge production measured by metropolitan patenting intensity and instrumented by an industry mix-like measure of expected patenting in an MSA and the share of employment in high-tech nongoods-producing industries is (1) not related to overall firm entry, (2) positively related to firm entry in high-tech industries in general and in high-tech nongoods-producing industries in particular, and (3) negatively related to business formation in high-tech goods-producing industries, specifically in computer and electronic product manufacturing.

Table 2 3SLS estimation results for firm entry

Next, we use the total share of high-tech employment and the share of high-tech employment in goods-producing industries as one component of the alternative instrument sets in place of HighTechNGEmpShare. Appendix Table 10 shows the results. Although fewer models are identified, the results are remarkably consistent across all specifications.

As a further test of sensitivity, we calculate predicted patenting based on the high-tech and low-tech employment composition of a locality (Regress_M) following the approach presented in Détang-Dessendre et al. (2016) and use it in the Lewbel (2012) IV procedure. In particular, we calculate high-tech and low-tech employment shares in all two-digit NAICS codes using the Fallah et al. (2014) classification of industries. We then use these shares lagged by 1 to 7 years and state fixed effects as explanatory variables in a negative binomial regression to estimate the number of utility patents in a given year. In the last step, estimated coefficients and actual employment shares are used to predict metropolitan patenting as described in Eqs. (5) and (6) below.

$$ {Patents}_{it}={\sum}_{n=1}^N{\beta}_n{EmpShare}_{in,t-p}+{\delta}_s+{\varepsilon}_{it} $$
(5)
$$ {Regress\_M}_{itp}={\sum}_{n=1}^N{\beta}_n{EmpShare}_{in,t-p} $$
(6)

where i denotes an MSA, t is the time period, p is the lag (p = 1,…,7), n is the high-tech or low-tech subset of each two-digit NAICS industry code, EmpShare is the share of subset n in total MSA employment, and δs is a state fixed effect.
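The prediction step in Eq. (6) can be sketched as follows. This is an illustration only: the coefficients, employment shares, subset labels, and MSA name are made up, and the sketch follows Eq. (6) literally as a linear combination of lagged shares, leaving aside how the negative binomial fit in Eq. (5) produces the coefficients.

```python
def regress_m(beta, emp_share, msa, year, lag):
    """Eq. (6): sum over subsets n of beta[n] times the employment share of n
    in the given MSA, taken `lag` years before `year`."""
    shares = emp_share[(msa, year - lag)]
    return sum(beta[n] * shares[n] for n in beta)


# Hypothetical coefficients for two employment subsets of one NAICS code.
beta = {"hightech_33": 3.0, "lowtech_33": -1.0}
# Hypothetical employment shares keyed by (MSA, year).
emp_share = {
    ("MSA_A", 2000): {"hightech_33": 0.2, "lowtech_33": 0.1},
    ("MSA_A", 2001): {"hightech_33": 0.3, "lowtech_33": 0.1},
}

# Predicted patenting for 2004 with a 4-year lag uses the 2000 shares.
predicted = regress_m(beta, emp_share, "MSA_A", 2004, 4)
print(round(predicted, 6))
```

Note that the state fixed effect and the error term from Eq. (5) are deliberately left out of the prediction, so that Regress_M varies only with the lagged employment composition.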

Table 3 presents second-stage estimation results using a combination of Regress_M (lagged by 4 years) and Lewbel-generated instruments. The specification tests at the bottom suggest that the instrument set is strong in the first stage and passes the exclusion restriction. Generally, the results presented in Table 3 are consistent with those shown in Tables 1 and 2 with two notable exceptions. Although the estimated coefficients on the knowledge production measure have the same signs as before, they are insignificant in all models but computer and electronic product manufacturing. In these cases, however, the Lewbel IV procedure does not detect an endogeneity problem, suggesting that a simple OLS approach is appropriate.
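For readers unfamiliar with Lewbel-generated instruments, the construction can be sketched in a few lines. In Lewbel (2012), an instrument is generated by multiplying a mean-centered exogenous variable by the residuals from a regression of the endogenous variable on the exogenous variables; identification rests on heteroskedasticity in those residuals. The sketch below assumes a single exogenous control and made-up data, and all names are hypothetical; it is not the code behind Table 3.

```python
def mean(values):
    return sum(values) / len(values)


def ols_residuals(y, x):
    """Residuals from a simple OLS of y on a constant and x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((a - x_bar) ** 2 for a in x)
    if sxx == 0:
        raise ValueError("regressor has no variation")
    slope = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    return [b - intercept - slope * a for a, b in zip(x, y)]


def lewbel_instrument(endogenous, exogenous):
    """Generated instrument: (Z - mean(Z)) times the first-stage residuals."""
    residuals = ols_residuals(endogenous, exogenous)
    z_bar = mean(exogenous)
    return [(z - z_bar) * e for z, e in zip(exogenous, residuals)]


# Made-up data: one exogenous control and the endogenous knowledge measure.
control = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ln_patents = [1.1, 1.8, 3.4, 3.5, 5.9, 5.2]

generated_iv = lewbel_instrument(ln_patents, control)
print([round(v, 4) for v in generated_iv])
```

The generated instrument sums to zero in the sample because OLS residuals are orthogonal to the regressor and the constant. It is informative only if the residual variance changes with the exogenous variable, which is why the approach is usually combined with conventional instruments such as Regress_M.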

Table 3 Lewbel procedure estimation results for firm entry

Given no evidence of endogeneity in four out of five models in Table 3, we re-estimate our models using a fixed-effect panel data approach with metropolitan patenting intensity lagged by 4 years as the explanatory variable. The results are reported in Table 4. The evidence on the effects of regional knowledge generation on business entry in the high-tech goods-producing sector and in NAICS334 is in line with Tables 1 and 2. In all other models, patenting intensity appears insignificant (see footnote 16). Since our main estimation results suggest no presence of endogeneity in the model of total firm formation, OLS results from Table 4 are preferred. In summary, the diverging effects of knowledge creation on firm entry, which depend on the subsector and industry, are confirmed in all estimation frameworks. In particular, the negative effect in the high-tech goods-producing sector and in computer and electronic product manufacturing is stable across all specifications. The lack of a statistical relationship between knowledge generation and total business entry is also stable. The evidence on the impacts in the high-tech sector and in high-tech services is inconclusive, but the results presented in Tables 1 and 2 combined with the findings in the existing literature suggest that positive effects are more likely.

Table 4 OLS fixed-effect estimation results for firm entry

6 Labor-market approach to industrial entry in the high-tech context

Given the results reported in Sects. 4 and 5, it appears that the KSTE is not applicable in the high-tech goods-producing sector and in computer and electronic product manufacturing, at least in the US MSAs during the period studied. Yet, this result potentially renews scholarly and policymakers’ interest in the relationship between knowledge generation and firm formation and in the mechanisms behind such a relationship. Our results combined with the findings in the existing literature confirm (with qualifications) that aggregate high-tech metropolitan business entry positively depends on knowledge production. In the high-tech goods-producing sector and in computer and electronic product manufacturing, on the contrary, the effects of patenting activity are always negative and significant. This latter result runs contrary to the predictions of the KSTE but is notably consistent across a number of estimation frameworks and specifications in our study.

The divergent effects of patenting on business entry documented in this research point to a need to develop a richer theoretical framework that accommodates the patterns of relationships revealed by the data. The KSTE is mostly seen as a regional theory, i.e., the stock of new (and unutilized) ideas produced in a region is the main determinant of regional business entry, while the personal motivation and ability of a potential entrepreneur to start a business are outside its scope. Yet, an individual and her decision to switch to entrepreneurship are what ultimately drive firm formation. In this sense, a labor perspective of entrepreneurship is highly relevant and, arguably, better positioned to explain the patterns revealed by our analysis.

We start with a labor market approach to industrial entry (Storey and Jones 1987), where a wholly new business entry in industry i (as opposed to a relocation from elsewhere) can be defined as a function of expected profitability (πi) and entry barriers (Xi) (see footnote 17). We define a system of equations, which describes business entry in high-tech nongoods-producing industries and in high-tech goods-producing industries as a function of expected profits and entry barriers. Combining the insights from the labor market approach to firm entry and the KSTE, we postulate that both arguments of the entry function depend on knowledge generation:

$$ {E}_{NG}=f\left({\pi}_{NG}(k),{X}_{NG}(k)\right)\kern0.5em \frac{\partial {E}_{NG}}{\partial {\pi}_{NG}}>0,\frac{\partial {E}_{NG}}{\partial {X}_{NG}}<0 $$
(7)
$$ {E}_G=f\left({\pi}_G(k),{X}_G(k)\right)\kern0.5em \frac{\partial {E}_G}{\partial {\pi}_G}>0,\frac{\partial {E}_G}{\partial {X}_G}<0 $$
(8)

Our empirical results suggest the following relationships between metropolitan knowledge generation and the main components of Eqs. (7) and (8).

$$ \frac{\partial {\pi}_{NG}}{\partial k}>0 $$
(9)
$$ \frac{\partial {X}_{NG}}{\partial k}<0 $$
(10)
$$ \frac{\partial {\pi}_G}{\partial k}<0 $$
(11)
$$ \frac{\partial {X}_G}{\partial k}>0 $$
(12)
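To make the link between these signs and the estimated effects explicit, one can apply the chain rule to Eqs. (7) and (8). This is an illustrative derivation that assumes f is differentiable and that knowledge generation k affects entry only through expected profits and entry barriers, taking the signs in (9)–(12) at face value:

$$ \frac{d{E}_{NG}}{dk}=\underset{>0}{\underbrace{\frac{\partial {E}_{NG}}{\partial {\pi}_{NG}}}}\;\underset{>0}{\underbrace{\frac{\partial {\pi}_{NG}}{\partial k}}}+\underset{<0}{\underbrace{\frac{\partial {E}_{NG}}{\partial {X}_{NG}}}}\;\underset{<0}{\underbrace{\frac{\partial {X}_{NG}}{\partial k}}}>0 $$

$$ \frac{d{E}_G}{dk}=\underset{>0}{\underbrace{\frac{\partial {E}_G}{\partial {\pi}_G}}}\;\underset{<0}{\underbrace{\frac{\partial {\pi}_G}{\partial k}}}+\underset{<0}{\underbrace{\frac{\partial {E}_G}{\partial {X}_G}}}\;\underset{>0}{\underbrace{\frac{\partial {X}_G}{\partial k}}}<0 $$

In each equation, both products share the same sign, so the total effect is unambiguous. The negative total effect in the goods-producing sector would also follow from the entry barrier channel alone if the profit channel were zero, whereas a positive profit channel would leave the sign ambiguous.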

We now consider alternative scenarios that could lead to the relationships described in (9)–(12), explicitly taking into consideration their spatial nature. The positive relationship between expected profits in the high-tech services sector and knowledge generation might come through at least two channels. First, knowledge discovery (measured by patenting intensity) is usually a lengthy and cost-intensive process, which is likely to indicate an expanding market that can attract entrants to high-tech services ready to cater to industries engaged in R&D of new products and technologies, as well as to their employees. Alternatively, as utility patents are granted for a discovery or an invention of a useful manufacturing process, machine, or material, it is plausible that new goods improving the quality, speed, and variety of high-tech services being offered are patented and implemented in practice, increasing the profitability of high-tech service firms. Such a mechanism, however, fails to explain the spatial nature of the empirically documented relationships because it is less likely to be locally bound.

Lower entry barriers in the high-tech nongoods-producing industries as a result of knowledge production are plausible in at least one scenario. In the spirit of the KSTE, if expanding R&D activities of incumbent firms (which result in increased metropolitan patenting intensity) require more resources and intermediate inputs from high-tech services, current employees might choose to seize this opportunity and start up a firm that would service their prior employer and its competitors if they require the same type of high-tech services. Since former employees are likely to possess extensive expertise and be well connected in the industry of their prior employment, the newly minted entrepreneurs would be in a favorable position to provide high-tech services that match their clients’ expectations in terms of quality, technicality, timing, and other important dimensions that often require insider experience. Besides, a business started in an upstream or downstream industry is less likely to be covered by the noncompete clause of the employment contract with the former employer.

The declining expected profits in the high-tech goods-producing sector as a result of increased knowledge generation are not as plausible because patents may more successfully guard a goods-producing firm’s intellectual property and larger entry barriers such as capital requirements may allow more pricing power. Thus, we disregard such a possibility here. Besides the pricing power arguments, knowledge production is likely to raise entry barriers in the high-tech goods-producing industries. As new products become more and more sophisticated, setting up a new firm that would bring these products to the market requires more financial resources, professionalization of the firm, and development of a diverse set of competences, which are likely to be beyond the ability of a newly established company and to make obtaining initial financing more challenging (Vatne 2017).

In addition, the more knowledge is being produced in a region, the more incentive incumbents would have to bind their employees with contracts that include noncompete clauses limiting the ability to switch employers and start new firms (at least locally). In regions and countries where noncompete clauses are not legally enforceable (e.g., in California) or where imposing such limitations is expensive for an employer (e.g., if incumbent firms are obliged to pay lost wages to former employees), the KSTE is likely to hold. Perhaps for this reason, European studies, even at a more disaggregated sectoral level, tend to conclude that the KSTE is able to explain firm entry. In other settings (e.g., US MSAs outside of California), however, institutional arrangements (enforceability of covenants not to compete) and common business practices (the prevalence of noncompete clauses) are likely to create legal barriers for potential entrepreneurs in their attempts to bring unused ideas to market. The KSTE ignores such possibilities, assuming unconstrained exit opportunities for employees, which is, in our view, a major limitation of the theory.

7 Conclusions

This paper presents a systematic empirical test of the knowledge spillover theory of entrepreneurship for US MSAs. Using 1997–2011 data, we estimate the effects of knowledge production on total business entry and on business entry in high-tech industries overall, high-tech nongoods-producing industries, high-tech goods-producing industries, and computer and electronic product manufacturing. While the effects of metropolitan knowledge generation on total business entry are insignificant, its effects in the high-tech subsectors are statistically significant (in most specifications) but divergent. In the models explaining total high-tech firm formation and firm formation in high-tech services, the effect of knowledge is positive in our main specification (in line with the previous empirical tests of the KSTE). On the contrary, in the high-tech goods-producing sector, and in the computer and electronic product manufacturing industry in particular, increased knowledge generation suppresses firm entry. Overall, the drivers of firm formation are likely to differ between high-tech goods- and nongoods-producing sectors. We develop a simplified theoretical framework and present likely explanations for the observed relationships.

Our theoretical explorations coupled with empirical results suggest that the following mechanisms are more likely to be behind the relationships documented in this study. First, knowledge generation measured by patenting intensity is likely to expand market opportunities in high-tech services (via higher expected profits and potentially via lower entry barriers for entrepreneurs who used to be employed in knowledge-generating industries), thus explaining the positive effect found in the main analysis here and in previous studies in comparable contexts. In the high-tech goods-producing industries, the decreased entry as a result of more intensive patenting is likely to stem from intensified competition with more technologically advanced incumbents, increasing costs of starting a new business able to compete against existing firms, and potentially more restrictive noncompete clauses of employment contracts. The results also show that the increased business formation in high-tech services as a result of knowledge generation is large enough to counterbalance decreased firm entry in high-tech goods-producing industries in a region (where significant effects are detected). Further empirical work is needed to test the hypothesized mechanisms in order to gain a deeper understanding of the processes that drive the impacts of regional knowledge generation on business formation across high-tech industries. In particular, sorting out the relative roles of labor market effects versus knowledge spillovers, as well as legal barriers such as liberal patenting laws and noncompete agreements that reduce workforce mobility and limit new competition, is critical in providing more reliable policy advice. Indeed, such policies may be so protective of intellectual property rights that an unintended consequence might be net reductions in innovation and economic growth.