1 Introduction

New firm formation and its role in regional economic development have been studied extensively in recent years (cf. van Praag and Versloot 2007; Fritsch and Falck 2007; Andersson and Koster 2011). The theoretical foundation for this interest can be traced back to Schumpeter (1934, 1943). He popularized the term “creative destruction” to describe the transformation that accompanies innovation, in which new, creative firms challenge incumbents, some of which are destroyed or forced to increase productivity by the increased competition (Christensen 2013; Wennekers and Thurik 1999; Fritsch and Mueller 2004; Robinson et al. 2006; Carreira and Teixeira 2010; Baptista and Preto 2011; Andersson et al. 2012; Bos and Stam 2014). Consequently, new firm formation is generally assumed to stimulate economic growth, both nationally and regionally (Wennekers and Thurik 1999; Aghion et al. 2004, 2005; van Stel et al. 2005; Acemoglu et al. 2006, 2007; Aghion and Griffith 2008; Dejardin 2011).

This being said, some contributions find that the net effect of new firms on growth may be negative in the short term, before turning positive with a significant impact on growth for as long as 10 years (Acs and Mueller 2008; Fritsch and Mueller 2008; Andersson and Noseleit 2011; Fritsch 2011; Andersson et al. 2012). A lagged positive effect from entry on growth seems to occur at the regional level as well (cf. Johnson 2004; Carree and Thurik 2008; Dejardin 2011), although these results appear somewhat inconclusive (Braunerhjelm 2011). This may in part be due to something Schumpeter (1934, 1939) himself acknowledged: that the large majority of new firms are not true innovators. Rather, they can be seen as a type of “turbulence” (Audretsch and Fritsch 1999; Santarelli and Vivarelli 2002, 2007; Brown et al. 2006; Nightingale and Coad 2014; Henrekson and Sanandaji 2014). Most entrants die young; generally 50 % or less survive the first 5 years (Geroski 1995; OECD 2003; Bartelsman et al. 2005; Delmar and Wennberg 2010), but conditional on survival, new firms can be expected to grow faster than their mature counterparts (Wagner 1994; Haltiwanger et al. 2010).

This motivates the present study of Swedish limited liability firms, in which I distinguish between regular entrants and surviving entrants, the latter defined as firms that survive for at least 2 years post-entry. The distinction makes it possible to assess whether different municipal and industry characteristics are conducive to surviving entrants as compared to all entrants.

Sweden usually gets a low rank in international comparisons as regards rates of self-employment, new firm formation and entrepreneurship (Delmar and Davidsson 2000) although the trend seems to have improved in recent years (Braunerhjelm et al. 2013; Amoros and Bosma 2013). Several features of the Swedish policy environment potentially discourage the formation of new firms, notably the high ratio of taxes to GDP, high marginal taxes on labor and capital income earned by entrepreneurs (cf. Andersson and Klepper 2013; Sørensen 2010; Stenkula et al. 2014; Du Rietz et al. 2013, 2014), and strict employee security provisions and wage policies (Davis and Henrekson 1999; Skedinger 2008, 2012). Nonetheless, in a recent study, Andersson and Klepper (2013) do not find any large differences in new firm formation in Sweden compared to Denmark, Brazil and the USA.

Results from prior studies examining conditions for entry in Sweden suggest that a broad set of conditions affects the entry decision. For example, Nyström (2008) examines the relationship between the municipal institutional environment and new firm formation, finding that positive attitudes toward private enterprises and right-wing political rule have a positive effect on entry, while a large governmental sector has a negative effect. The findings of Daunfeldt et al. (2010) meanwhile suggest that the level of market concentration in the local industry as well as relative purchasing power within the municipality greatly affect entry of Swedish retail firms. Andersson and Koster (2011) in turn study persistence in start-ups at the regional level, finding that regions with high start-up rates exhibit stronger persistence. Furthermore, Eliasson and Westlund (2013) test the relationship between self-employment entry and variables connected to demography and education, labor-market status and regional attributes, and find that the same factors influence entry in Swedish urban and rural areas.

Both municipal and industrial conditions that may affect regular entry as well as surviving entry are therefore tested in this paper. I performed the analysis using an extensive dataset covering all limited liability firms in Sweden during 2000–2008, both at the level of Sweden’s 290 municipalities and at the level of industries within municipalities (based on the 1-, 3-, and 5-digit NACE classifications). This makes it possible to take both municipal- and industry-specific factors into account and to observe whether effects differ depending on the level of aggregation.

I took three sets of entry determinants into account. First, a set of general municipal characteristics, which may be associated with the conducive dimension described by Stenholm et al. (2013), such as the presence of a skilled labor force, accessibility of suppliers and customers, and proximity of high-quality universities. Secondly, I followed Nyström (2008) in considering a set of institutional variables of both a formal and informal nature, since institutions are considered the “rules of the game” and can be expected to impact new firm formation (cf. North 1990). Lastly, a set of variables related more intimately to industrial organization is included, since such variables are important parts of the firm’s immediate ecology (Metcalfe 2009).

This paper treats the choice to make entry as a discrete event, which implies the use of count data models. Such models have previously been used extensively to model firm location decisions (see e.g., Guimarães et al. 2003; Brixy and Grotz 2006; Koo and Cho 2011; Manjón-Antolín and Arauzo-Carod 2011; Mota and Brandão 2011), but less so for Swedish data (Daunfeldt et al. 2006, 2010, 2013 are exceptions but they only consider the retail and wholesale sectors of the economy). In a count framework, both positive and zero occurrence are natural outcomes (Hammer and Landau 1981; Lambert 1992; Hall 2004; Karazsia and van Dulmen 2008). Of four count data models considered, the negative binomial regression model consistently accounted best for the features of the data and had the greatest predictive power. It is therefore used in the estimations in this paper.
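A common reason for a negative binomial model to outperform a Poisson model is overdispersion, i.e., a variance of the counts that exceeds their mean. The following is a minimal sketch of a first diagnostic for this, not the model comparison carried out in this paper. It assumes the NB2 variance function Var(y) = mu + alpha * mu^2 and uses a simple method-of-moments estimate of alpha; the function name and the entry counts are hypothetical.

```python
from statistics import mean, variance


def overdispersion_alpha(counts):
    """Method-of-moments estimate of the NB2 dispersion parameter alpha.

    Assumes Var(y) = mu + alpha * mu**2. A Poisson model corresponds to
    alpha = 0, so a clearly positive estimate signals overdispersion.
    """
    mu = mean(counts)
    if mu == 0:
        return 0.0
    var = variance(counts)  # sample variance (n - 1 in the denominator)
    return max(0.0, (var - mu) / mu ** 2)


# Hypothetical entry counts for ten municipality-industry cells (not real data).
example_counts = [0, 0, 0, 1, 0, 2, 0, 0, 7, 10]
alpha_hat = overdispersion_alpha(example_counts)
print(f"mean = {mean(example_counts)}, alpha estimate = {alpha_hat:.2f}")
```

Many zeros combined with a few large counts, as in the illustrative data, yield a variance well above the mean and hence a positive alpha estimate.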

Results are robust to different specifications and levels of aggregation, suggesting that some municipal and industrial variables have a considerable effect on both regular and surviving entry. A standard deviation (SD) increase in municipal median income increases expected entry by 15–30 %, while an SD increase in the share of population with a 3-year university education increases it by 16–45 %. In addition, SD increases in industry concentration or in minimum efficient scale more than halve expected entry, while more industrial liquidity has a clear-cut positive effect. The effects from institutional variables, e.g., taxes, government ideology and the business climate, are small by comparison. This is in line with Stenholm et al. (2013) who found that for the formation of innovative, high-growth new ventures, the regulative environment matters little compared to the conducive dimension, noting in particular the importance of knowledge spillovers and the capital necessary for high-impact entrepreneurship.
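Percentage effects of this kind are typically obtained from a count model with a log link, where the expected count is exp(x'beta), so that a one-SD increase in a regressor multiplies the expected count by exp(beta * SD). The sketch below illustrates that conversion under this assumption; the function name, coefficient and SD values are hypothetical and are not estimates from this paper.

```python
import math


def pct_change_per_sd(beta, sd):
    """Percent change in the expected count for a one-SD increase in x.

    Assumes a log-link count model, E[y | x] = exp(x'beta), so a one-SD
    increase multiplies the expected count by exp(beta * sd).
    """
    return (math.exp(beta * sd) - 1.0) * 100.0


# Hypothetical coefficient and standard deviation (illustration only).
print(round(pct_change_per_sd(beta=0.01, sd=15.0), 1))

# Expected entry is more than halved whenever beta * sd < -ln(2).
print(pct_change_per_sd(beta=-math.log(2), sd=1.0))
```

Under this reading, an effect that "more than halves" expected entry corresponds to a product of coefficient and SD below roughly -0.69.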

The results were quite similar regardless of whether regular entry or surviving entry served as dependent variable. One exception was population size, which had a strong influence on entrants overall, but a much weaker effect on entrants that went on to survive for at least 2 years. By contrast, the importance of the level of education appears stronger for surviving entrants than for regular entrants, pointing to the importance of human capital.

2 Theory and hypotheses

Firm entry and exit rates are both considerable (Storey 1994; Brown et al. 2006). For example, the average annual rates of entry and exit in Sweden during 1990–1993 were 0.133 and 0.125 (Davidsson et al. 1996, p 50). But differences in entry and exit rates are substantial across industries (Beesley and Hamilton 1984; Dunne et al. 1988) and across regions (Keeble and Walker 1994). A rather large empirical literature examines the factors affecting the utility of, and propensity for, self-employment (cf. Le 1999; Georgellis and Wall 2005; Vejsiu 2011).

Much early research regarding the determinants of entry concerned industry conditions (Bain 1956; Yip 1982). Many studies have also examined the regional determinants of entry (Acs and Storey 2004), some focused on Sweden (Davidsson et al. 1994; Karlsson and Nyström 2011; Nyström 2006, 2007, 2008). Davidsson et al. (1994) studied Sweden’s eighty local labor-market regions during 1984–1989, finding regional characteristics that largely explained differences in entry. Yet regional determinants of new firm formation typically differ across industries (Fritsch 1997; Audretsch and Fritsch 1999; Berglund and Brännäs 2001; Armington and Acs 2002). Hence, it is important to study these conditions together in order to distinguish between industrial and regional effects (Santarelli and Vivarelli 2007), as many of the determinants of start-up rates are highly intertwined.

Notably, there appears to be persistence in start-up rates at the regional level (Andersson and Koster 2011; Fritsch and Mueller 2007). Many factors that influence start-up activities change slowly, for example, education level, market size, industry size and agglomeration economies (Verheul et al. 2002). In addition, regions with high start-up rates for an extended period of time may develop a positive climate toward entrepreneurship involving both formal and informal institutions, with a positive influence on subsequent start-ups (Wagner and Sternberg 2002; Fölster 2002; Andersson and Koster 2011). This points to the importance of examining regional, political and industrial factors together.

Three sets of factors are thus taken into account in this paper: first, a set of standard variables in the literature on the geography of new firm formation; second, a set of political economy variables; third, a set of industrial organization characteristics pertaining to each industry within the municipality. In the following sections, I discuss the hypotheses related to each of these three sets of factors.

2.1 Standard geography variables

In a recent cross-country study, Stenholm et al. (2013, p. 182) found that a set of framework conditions conducive to new innovations and a knowledge-driven economy are particularly important for the emergence of so-called high-impact firms (cf. Acs et al. 2009; Acs 2010; Stenholm 2011; Wong et al. 2005). Among them are the presence of a skilled labor force, accessibility of suppliers and customers, and proximity of high-quality universities (cf. Bruno and Tyebjee 1982; Lee et al. 2004; van de Ven 1993). In such an environment, entrepreneurial intentions are amplified by the interplay between innovations, skills and resources. Arguably, the relevance of such variables should be visible also at the local level.

Generally, education and human capital have been found to have an important role in fostering entry, increasing the survival of new firms and improving their post-entry performance (Bates 1990; Gimeno et al. 1997; Acs et al. 2007). Human resources and universities have been found to be important for entrepreneurship (Evans and Leighton 1990; Kirchhoff et al. 2007), and local access to knowledge and human capital appears particularly important for entry by knowledge-based firms (Baptista and Mendonça 2010; see also Audretsch et al. 2005; Baptista et al. 2011).

While firms requiring high-skilled workers may locate in municipalities where they are available, well-educated people may also be more prone to start new firms, as education generates higher levels of entrepreneurial ability (Lucas 1978; Evans and Leighton 1990; Van Praag and Cramer 2001). Well-educated people may also be better prepared to identify market opportunities and have more growth-oriented aspirations (Davidsson 1991; Cassar 2006, 2007; Stam et al. 2009). Evidence suggests that higher education in a local area significantly increases the supply of entrepreneurs (Doms et al. 2010; Glaeser et al. 2010).

Nonetheless, Delmar et al. (2003) found that people who are well-educated in science and technology did not become self-employed to a greater extent than others in Sweden, and Daunfeldt et al. (2006) found little effect from education—measured as the fraction of people in the municipality that had at least enrolled at university—on entry in wholesale and retail industries. This may be because higher education comes with a higher opportunity cost; while good for managerial capabilities, it also improves prospects for salaried employment (Vivarelli 2013, p. 1470).

The local labor market can also be expected to play an important role, as attested by the fact that a majority of founders usually have been previously employed in the same geographical area and sector (Shane 2000; Klepper 2001; Helfat and Lieberman 2002; Stam 2007). Other studies suggest that job losses and unemployment foster entry (Storey 1994; Audretsch and Vivarelli 1995, 1996; Armington and Acs 2002; Acs 2006; Santarelli et al. 2009). Individuals may be pushed into self-employment by a lack of work opportunities (Lewis 1954; Earle and Sakova 2000). High unemployment may thus cause higher entry rates (Storey 1994; Armington and Acs 2002; Acs 2006). Unemployment may also indicate an increase of available labor resources that an entrepreneur can take advantage of (Koo and Cho 2011). Grek et al. (2011) found a negative relationship between employment rate and firm formation in Sweden. On the other hand, Vejsiu (2011) found that periods of high unemployment are characterized by higher self-employment among employees, but lower self-employment among those already unemployed in Sweden.

Many studies nonetheless found a negative or statistically insignificant relationship between unemployment and new firm formation (Garofoli 1994; Guesnier 1994; Reynolds 1994; Sutaria and Hicks 2004), nor is there evidence for a larger share of entrepreneurs among the unemployed (Armington and Acs 2002; Fritsch and Mueller 2007). In fact, while time series analysis generally finds a positive relationship between unemployment and entry, cross-sectional analysis usually finds a negative relationship (Carree 2002).

Some studies have found more entry where demand was high (Davidsson et al. 1994; Guesnier 1994), which suggests that higher income in the municipality should have a positive effect (Daunfeldt et al. 2006). It may, however, deter entry in industries sensitive to labor costs (Nyström 2006).

At the country level, higher income has been found to offer more opportunities for growth, and more resources necessary for entrepreneurship (Reynolds et al. 2002; Bowen and de Clercq 2007; Bosma et al. 2010). Self-employment has also been positively associated with wealth measures (Evans and Jovanovic 1989; Evans and Leighton 1989; Meyer 1990; Holtz-Eakin et al. 1994; Lindh and Ohlsson 1996; Blanchflower and Oswald 1998).

Another aspect of demand is size of the population. Spatial agglomerations within industries have been argued to create localization advantages in terms of spillovers and cooperation between firms. The benefit of urbanization, meanwhile, comes from lower transport costs, and proximity to suppliers and customers (Marshall 1920; Krugman 1992; Aghion and Howitt 1992). Density generates high accessibility between people, which may stimulate creativity by making face-to-face interaction less costly and time consuming (Gordon and Ikeda 2011; Boschma and Lambooy 1999; Westlund et al. 2013). There may, however, be a downside due to increasing land prices and congestion costs (Pellenbarg et al. 2002). Yet entry has been found to be substantially more common in more densely populated regions (Audretsch and Fritsch 1994b; Guesnier 1994; Daunfeldt 2002), but neither Nyström (2006) nor Daunfeldt et al. (2006) found any effect from population size on entry in Sweden.

2.2 Political economy variables

The importance of the institutional framework for new firm formation can in part be understood in relation to Baumol’s (1990, 1993, 2002) theory of productive and unproductive entrepreneurship (cf. Boettke 2001; Boettke and Coyne 2003; Ovaska and Sobel 2005). In this view, the underlying supply of entrepreneurs is roughly consistent across countries or regions. What differs is how the supply is channeled. At the country level, institutions providing secure property rights, rule of law, contract enforcement and constraints on wealth transfers by government are expected to increase the relative return to productive entrepreneurship (cf. Sobel 2008, pp. 643–644). Evidence suggests that such institutions also promote new firm formation (cf. Sobel 2008; Hall and Sobel 2008; Stenholm et al. 2013).

Much of the institutional setting in Sweden is determined at the national level and hence can be expected to vary little across municipalities, as regards, e.g., property rights protection and rule of law. Sweden’s municipalities are the country’s smallest political entities with self-governing power. They have autonomy concerning policies such as income taxes, spending and regulation. The autonomy differs depending on the policy area, however.

As mentioned, Sweden has among the highest ratios of taxes to GDP, and high marginal tax rates on labor and capital income earned by entrepreneurs (Du Rietz et al. 2013, 2014). The high tax burden for entrepreneurs has been identified as a reason for the later development of an active venture capital market compared to the USA (Lerner and Tag 2013). While the national government sets capital gains and payroll tax rates, Swedish municipalities, as noted, set their own income tax rates, and a large part of entrepreneurial income is taxed as wage income, not capital gains (Henrekson 2005; Stenkula 2012). However, the variation in these tax rates is rather low: the SD during the period 2000–2008 was only 1.1 % (see Table 4).

Though it is easy to find theoretical arguments that go both ways, the literature provides only weak empirical support for the idea that high income taxes are inimical to enterprise (Robson and Wren 1999). While a change in tax policy favoring the self-employed relative to others should increase self-employment (Knight 1921; Kihlstrom and Laffont 1979; Appelbaum and Katz 1986), it is less obvious why a general tax reduction should do so (Fölster 2002). Redistributive taxation may even serve as an insurance mechanism that actually stimulates risk-taking and self-employment (Sinn 1996).

But welfare-state arrangements generally insure employees, not the investments and efforts of entrepreneurs (Ilmakunnas and Kanniainen 2001; Henrekson 2005). Tax-financed welfare expenditures may also reduce individual savings necessary for entrepreneurial ventures, while services that otherwise could be provided by the entrepreneur are instead provided by the government (Fölster 2002). Another reason to expect a positive relationship between taxes and entry is that self-employment may facilitate tax evasion (Long 1982; Blau 1987; Robson and Wren 1999). Engström and Holmlund (2009) estimate that Swedish households with at least one self-employed member underreport their total incomes by around 30 %.

A positive relationship between tax rates and some measure of entrepreneurial activity has been found in many time series studies dealing with problems of co-integration (Parker 1996; Cowling and Mitchell 2002; Robson 1998). The Swedish evidence is mixed. No statistically significant effect was found for average municipal tax rate on new firm formation in Sweden’s local labor markets during 1984–1989 (Davidsson et al. 1994). A strong negative connection has been found between average income tax burden and the share of self-employed, in a 20-year panel of 24 Swedish counties (Fölster 2002), but Stenkula (2012) found little effect from income tax rates on the rate of self-employment. These studies were, however, undertaken at a higher level of aggregation than the municipality level.

The interplay between economic and socio-cultural factors is potentially important for entrepreneurial actions (Krueger et al. 2000). This points to the importance of informal institutions, i.e., the norms, values and cultural meanings that affect human behavior (Veciana and Urbano 2008). Notably, there are locally embedded values and attitudes toward entrepreneurship that may influence both the rate and level of entrepreneurship activity in a region (Westlund and Bolton 2003; Fritsch and Wyrwich 2013). The importance of the local culture is further suggested by the fact that regional rates of new firm formation are highly persistent (Andersson and Noseleit 2011). The regional culture may therefore influence perceived entrepreneurial opportunities more than the political environment (Mai and Gan 2007; Owen-Smith and Powell 2008).

Already Schumpeter (1934, p. 86) emphasized the importance of the “reaction of the social environment against one who wishes to do something new.” Individuals should be more likely to start a firm if they perceive that popular opinion is favorable to entrepreneurship and businesses, pointing to the importance of societal legitimization (Etzioni 1987; Kostova 1997; Veciana and Urbano 2008; McCloskey 2010). Values and beliefs (concerning e.g., acceptance of capitalism, and beliefs concerning the societal contributions and sacrifices of entrepreneurs) have been found to affect new firm formation rates, in a study of three matched pairs of regions in Sweden (Davidsson and Wiklund 1997). But the effect was small, with no specific aspect of regional culture consistently appearing as determinant.

The business climate is furthermore shaped by government regulations. Regulatory processes can promote or hinder entrepreneurship by shaping the level of risk involved in the formation and start of a business, and entrepreneurial behavior is influenced by how the rules are adopted and enforced (Baumol and Strom 2007; Busenitz et al. 2000; Verheul et al. 2002). While regulations may be instrumental to protect the public against externalities and other market failures (Pigou 1924), there is a risk that they benefit decision makers and insiders (Tullock 1967; Stigler 1971; Appelbaum and Katz 1986). For example, incumbent firms may prefer regulation that keeps out competition. In general, contract enforcement regulation that affects the efficiency of the legal system tends to improve the possibilities for entry and enhance innovation (Djankov et al. 2010; La Porta et al. 2008; Aidis et al. 2009). Regulation as such has been shown to affect entrepreneurship, the size of start-ups and regional development (Ciccone and Papaioannou 2007; Ardagna and Lusardi 2010a, b). Administrative costs in complying with regulation are higher for small firms than for large ones (OECD 2001), and the small business sector has been found to be larger in economies in which business start-up costs are lower (Ayyagari et al. 2007). Since most entrants are small, stricter rules may thus have a negative effect on entry. On the other hand, cross-country evidence suggests that the regulatory environment, while critical for the formation of new firms, matters relatively little for the formation of innovative, high-growth new ventures (Stenholm et al. 2013).

Entry can also be expected to be related to the political ideology of the party or coalition governing the municipality. While ideology may be important, e.g., for spending and taxation policy, it can also be an indication of attitudes to self-employment and entrepreneurship, as left-of-center parties may have less favorable views of entrepreneurship than do right-of-center parties (cf. Reynolds 1994; Nyström 2008). Nevertheless, municipal politicians are often pragmatic, and the Social Democratic party has been instrumental in market liberalization at the central level in Sweden (Bergh 2009).

Reynolds et al. (1994) measured the extent of socialist voting in recent elections in five European countries, finding no statistically significant effect from this variable on Swedish firm formation rates, except when manufacturing was considered separately. Daunfeldt et al. (2006) found no statistically significant effect from the presence of a right-leaning government on entry in municipal wholesale and retail industries. However, both Nyström (2008) and Daunfeldt et al. (2010) found right wing political rule to have a positive effect on entry. It should be pointed out that there may be a potential source of endogeneity here, as already existing entrepreneurs tend to vote for right-of-center parties.

2.3 Industrial organization variables

It is important to note that industrial organization variables will be evaluated at the level of industries within municipalities in this paper. Agglomeration effects can manifest themselves as localization economies, i.e., potential entrants locate where many other firms have located, to form a cluster. Nyström (2006) and Daunfeldt et al. (2006) found positive effects from localization economies on entry.

In the view of Mansfield (1962), a queue of well-informed entrepreneurs waits outside the market. The triggering factor for whether they enter or not is considered to be the expected level of profit (Orr 1974; Nakosteen and Zimmer 1987; Khemani and Shapiro 1986; Vivarelli 2013), though the size of the effect is usually found to be rather small, possibly because markets are already near equilibrium. Various proposed entry barriers, such as the industry minimum efficient scale of production, may for the same reason work more as barriers to survival than to entry (Siegfried and Evans 1994; Geroski 1995). Notably, many studies find a positive relationship between start-up size and survival (Audretsch and Mahmood 1995; Mata et al. 1995; Agarwal and Audretsch 2001). Small, new firms can be expected to have a larger scale disadvantage in industries with a larger minimum efficient scale, forcing them to grow quickly or exit (Strotmann 2007). Previous work emphasizes the importance of an environment dominated by smaller and independent firms for promoting entry (cf. Glaeser et al. 2010; Glaeser and Kerr 2009).

Higher concentration of firms in an industry may also deter entry, since a smaller number of market participants have a greater probability of overcoming collective action problems and acting together to deter entry (Orr 1974; Chappell et al. 1990; Geroski 1995).

Large average expenditures on R&D by incumbents in the industry may also increase the costs of entry, though the empirical evidence is ambiguous (Siegfried and Evans 1994; Arauzo-Carod 2005). If the spillovers from R&D are localized, such spillovers could even have a positive effect (Jaffe et al. 1993).

Finally, access to capital may be an important constraint to opening a new business (Koo and Cho 2011; Feldman 2001; Gompers and Lerner 2001). Liquidity constraints are more severe for small firms (Fazzari et al. 1988), which are thus less likely to obtain new capital at market interest rates. Cressy (2006) developed a theoretical model of firm growth, showing that firms often die young because financial resources are lacking. But difficulties in obtaining external finance may be a symptom rather than a cause of the problem facing the firm (Santarelli and Vivarelli 2007). In fact, asymmetric information and entrepreneurial overoptimism could make overlending to low-quality firms possible (De Meza 2002).

3 Data and descriptive statistics

The self-employed in Sweden can incorporate their business, turning it into a limited liability firm (aktiebolag), which has a legal personality and is treated as a separate tax subject, i.e., corporate income tax is levied on the net return. All limited liability firms are required to submit annual reports to the Patent and Registration Office (PRV), including, e.g., number of employees, wages and profits.

All industry-specific information in this study came from PAR, a Swedish consulting firm that gathers information from PRV, for use primarily by decision makers in Swedish commercial life. The data cover all Swedish limited liability firms active at some point during 1997–2010, yielding 3,831,854 firm-years for 503,958 firms.

The panel contains both continuous incumbents and firms that entered or exited during the period. Since the last 2 years saw a marked drop in the number of firms, the study period was cut off at 2008. Firm activities are specified by branch of industry down to the 5-digit level according to the European Union’s NACE classification system. This permitted the choice to study entry at four levels of aggregation: at the municipality level, and at the 1-, 3- and 5-digit level within the municipality.

Municipality-specific data came from three additional sources: first, Kfakta, a database compiled by the department of political science at Lund University that contains many regional and municipal variables from 1974 onward; secondly, the Swedish Confederation of Enterprises’ (SCE) business conditions database, which consists of survey data on local business conditions from the year 2000 onward and thereby restricted the starting point of the study to this year; lastly, Statistics Sweden’s (SCB) online public databases.

Entry is measured as the first time a firm’s organization number appears in the PAR database. Additional information on firm start dates is utilized to make sure what is observed is a true entrant rather than a firm that changed its organizational form or organization number. To distinguish between regular and surviving entrants, I observe when a firm’s organization number disappears from the database and use additional information on closure date to make sure it is a real exit. An entrant is then defined as a surviving entrant if it survives at least 2 years post-entry.
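The classification rule described above can be sketched as follows. This is an illustration under stated assumptions rather than the procedure applied to the PAR data: the function and argument names are hypothetical, firms already present in the first panel year are skipped because they cannot be told apart from incumbents, "surviving" is read as still being observed at least 2 calendar years after the entry year, and the cross-checks against registered start and closure dates are omitted.

```python
from collections import defaultdict


def classify_entrants(firm_years, first_panel_year, last_panel_year, min_survival=2):
    """Return {org_number: (entry_year, survived)} from (org_number, year) records.

    survived is True or False when the outcome is observable and None when
    the panel ends before the survival window has elapsed.
    """
    years_by_firm = defaultdict(set)
    for org_number, year in firm_years:
        years_by_firm[org_number].add(year)

    result = {}
    for org_number, years in years_by_firm.items():
        entry_year = min(years)
        if entry_year <= first_panel_year:
            continue  # assumption: left-censored firms are treated as incumbents
        if entry_year + min_survival > last_panel_year:
            survived = None  # right-censored: survival cannot be observed
        else:
            survived = max(years) >= entry_year + min_survival
        result[org_number] = (entry_year, survived)
    return result


# Hypothetical firm-year records (illustration only).
records = [("A", 1997), ("A", 1998),
           ("B", 2000), ("B", 2001), ("B", 2002),
           ("C", 2001), ("C", 2002),
           ("D", 2010)]
print(classify_entrants(records, first_panel_year=1997, last_panel_year=2010))
```

The right-censoring branch reflects why surviving entry can only be observed for a shorter period than entry itself.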

Table 1 shows annual entry in the period 2000–2008 and surviving entry in 2000–2006. The mean annual entry rate was 0.076. For comparison, Davidsson et al. (1996) found a mean entry rate of 0.133 in the recession years 1990–1993, while Nyström (2006) reported an entry rate between 0.097 and 0.113 during 1997–2001. The lower entry rate reported here is probably due to the fact that limited liability firms have minimum capital requirements. The mean annual surviving entry rate was 0.059, suggesting that about 78 % (0.059/0.076) of firms survive at least 2 years post-entry.

Table 1 Annual total entry and entry rate in the economy, 2000–2008, and annual surviving entry and surviving entry rate in the economy, 2000–2006

Variations in entry rates 2000–2008 and surviving entry rates 2000–2006 across 1-digit industries are shown in Table 2. Electricity and construction, and transport, communication and finance, had the highest entry rates, while low-tech manufacturing had the lowest. In absolute terms, however, real estate and computer services had by far the greatest number of new firms. For surviving entrants, by contrast, rates were highest for education, health, social work, real estate and computer service.

Table 2 Annual mean entry and entry rate, 2000–2008, and annual mean surviving entry and surviving entry rate, 2000–2006, by 1-digit (NACE) industry level

Table 3 shows differences in entry rates across Sweden’s 21 counties 2000–2008, ranging from 0.056 in Kalmar to 0.101 in Blekinge. These are also the counties with the smallest and largest shares of surviving entrants 2000–2006. In absolute terms, the counties with major metropolitan regions (Stockholm, Västra Götaland, Skåne) had by far the most new firms, and this pattern remains when considering surviving entrants. There are larger differences in entry rates across all 290 municipalities, where entry rates range from 0.00 to 0.28 between 2000 and 2008, and surviving entry rates range from 0.00 to 0.22 between 2000 and 2006.

Table 3 Entry and entry rate by county, 2000–2008, and surviving entry and surviving entry rate by county, 2000–2006

Descriptive statistics of the variables used in the empirical analysis are shown in Table 4. Descriptive statistics are presented at the municipality level, and at the 1-, 3- and 5-digit industry levels within each municipality. Operationalizations are based on the theoretical overview in Sect. 2.

Table 4 Descriptive statistics of variables, 2000–2008

Following many previous studies of business location decisions, both sets of entry variables used in the empirical analysis were measured as count data (see Arauzo-Carod 2008; Arauzo-Carod et al. 2010 for surveys of this literature). Hence, the first set, entry, was measured as the total number of new firms at each level of aggregation in a given year. The second set, surviving entry, was measured as the total number of new firms that survived for at least 2 years.

The standard geography variables were all measured at the municipality level. Education is measured as the percentage of people aged 16–74 with at least 3 years of post-secondary education. Unemployment is the percentage aged 18–64 who were openly unemployed. Income is median income in 1,000’s of SEK, while population is in 1,000’s of inhabitants.

Turning to political economy variables, tax is the sum of the municipal and county council taxes (total municipal tax) in percentage points, while the spending variable is measured as municipal government expenditures per inhabitant in 1,000’s of SEK. Business climate was operationalized using a question from the survey conducted by the Confederation of Swedish Enterprises. The question asks local business leaders to rate the municipal business climate on a scale from 1 (poor) to 6 (good). Missing values in 2001 were replaced with the means of values in adjacent years (Greene 2008). It should be noted that such perceptions by incumbents are not necessarily equivalent to good entry conditions. To proxy for political ideology, a dummy variable was given the value one if there is a right-leaning majority in the municipal government.
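The replacement of missing 2001 business climate values can be illustrated with a minimal sketch. It assumes a hypothetical mapping from year to rating for a single municipality; the function name and the example values are illustrative assumptions and are not taken from the survey data.

```python
from typing import Dict, Optional


def fill_from_adjacent_years(
    scores: Dict[int, Optional[float]], year: int
) -> Dict[int, Optional[float]]:
    """Replace a missing score in `year` with the mean of the two adjacent years.

    The entry is left untouched if it is already observed or if either
    neighbouring year is itself missing.
    """
    filled = dict(scores)
    if filled.get(year) is not None:
        return filled
    before, after = filled.get(year - 1), filled.get(year + 1)
    if before is not None and after is not None:
        filled[year] = (before + after) / 2.0
    return filled


# Hypothetical ratings on the 1 (poor) to 6 (good) scale for one municipality.
ratings = {2000: 3.5, 2001: None, 2002: 4.0}
filled_ratings = fill_from_adjacent_years(ratings, 2001)
```

The sketch returns a copy, so the original series with its gap remains available for comparison.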

Finally, the industry-specific variables were measured at the level of industries within each municipality. The number of incumbent firms was included to capture localization economies (Nyström 2006). Expected industry profitability was measured as mean equity returns. Minimum efficient scale was measured as net sales of the median firm, a common empirical approximation (Sutton 1991). Concentration was measured using a Herfindahl index, i.e., the sum of squares of firms’ shares of local industry revenue, \(s_{1jm}^{2}+s_{2jm}^{2}+\cdots +s_{kjm}^{2}\), where \(k\) is the number of firms in industry \(j\) in municipality \(m\). R&D intensity was measured as average expenditure for research and development. Lastly, liquidity was measured as the average ratio of firm liquidity to revenue.
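As an illustration of the concentration measure, the following sketch computes a Herfindahl index from firm revenues in one industry-municipality cell. The function name and the revenue figures are hypothetical and serve only to show the arithmetic.

```python
from typing import Sequence


def herfindahl(revenues: Sequence[float]) -> float:
    """Sum of squared revenue shares of the firms in one industry-municipality cell."""
    total = sum(revenues)
    if total <= 0:
        raise ValueError("total revenue must be positive")
    return sum((r / total) ** 2 for r in revenues)


# Hypothetical revenues for the k firms in industry j and municipality m.
hhi_equal = herfindahl([10.0, 10.0, 10.0, 10.0])  # four equally sized firms
hhi_monopoly = herfindahl([25.0])                 # a single firm
```

The index ranges from \(1/k\) when all \(k\) firms have equal shares to one when a single firm accounts for all revenue.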

Table 5 shows that partial correlations were quite large between some independent variables at the municipal 1-digit NACE level, just as for other levels of industry aggregation (not shown). This suggests a potential problem of multicollinearity resulting in non-significant estimates, although the large population size may remedy this (Long 1997, p. 54).

Table 5 Partial correlations between independent variables at the municipal 1-digit NACE level

4 Choice of model

As Mota and Brandão (2011, p. 4) point out, research on firms’ locational decisions usually appeals to discrete-choice models, notably the conditional logit, that rely on the random utility maximization framework of McFadden (1974). This methodology was first applied to location decisions by Carlton (1983) and popularized in the subsequent decades (see e.g., Bartik 1985; Coughlin et al. 1991; Head et al. 1999; Figueiredo et al. 2002). Guimarães et al. (2003) introduced the notion of modeling the location choice by means of a poisson model, which can equivalently model the coefficients of the conditional logit model. While the conditional logit model assumes that the odds of choosing an alternative are independent of other alternatives, the poisson lifts this independence of irrelevant alternatives (IIA) assumption (e.g., Arauzo-Carod and Manjón-Antolín 2004; Barbosa et al. 2004; Basile 2004; Guimaraes et al. 2004; Holl 2004b; Arauzo-Carod 2005). Other count data models can be seen as extensions of the poisson.

As the aggregation of individual entry decisions yields a discrete outcome at each point in time, I base the econometric methodology in the paper on panel count data models (Brixy and Grotz 2006; Daunfeldt et al. 2006, 2010; Koo and Cho 2011; Cockburn and MacGarvie 2011). A “count” refers to the number of specified events that occur in a given period. By definition, such data only consist of non-negative integers. They are usually highly skewed with a preponderance of zeros, thereby violating fundamental assumptions of many multivariate statistical techniques, particularly normality of residuals (Atkins and Gallop 2007; Tabachnick et al. 2007).

Count data models are seen as advantageous when using typical location data, because they can deal with the so-called zero problem; in this case, the situation in which a large number of territorial units see no new firms. This is typical when the territorial units are small (Alañón-Pardo and Arauzo-Carod 2013; cf. Cameron and Trivedi 1998, 2001). Since the dependent variable is the number of firms located in each territorial unit, it is useful to know not only how many times a unit has been chosen by new firms, but also which units have not been chosen by any firm. Units with no entrants are relevant because values of independent variables in these locations explain why they have not been chosen by entrants (Chappell et al. 1990; Arauzo-Carod and Viladecans-Marsal 2009, p. 550). Both positive and zero occurrences are in other words natural outcomes of the specification (Hammer and Landau 1981; Lambert 1992; Hall 2004; Karazsia and van Dulmen 2008). This also allows for more disaggregation, which has been lacking in previous studies (Johnson 2004).Footnote 2

Since Guimarães et al. (2003), most research on firms’ locational choice takes the poisson model as the starting point (Mota and Brandão 2011; see e.g., Arauzo-Carod and Manjón-Antolín 2004; Barbosa et al. 2004; Basile 2004; Holl 2004b; Gabe and Bell 2004; Cieślik 2005; Guimaraes et al. 2004; Arauzo-Carod 2005, 2008; Autant-Bernard et al. 2006; Jofre-Monseny and Sole-Olle 2010; Jofre-Monseny et al. 2011; Alañón-Pardo and Arauzo-Carod 2013). The poisson model captures the discrete and non-negative nature of the data and allows inference to be drawn on the probability of event occurrence (Barbosa et al. 2004, p. 469).

The poisson model deals with the zero problem mentioned above, but only provided that two assumptions are correct (Arauzo-Carod and Viladecans-Marsal 2009, p. 551). The first assumption is that of equidispersion, i.e., that the conditional variance of the dependent variable is equal to the conditional mean. In practice, this assumption is often not valid, which may be due to unobserved heterogeneity in the mean function (Chappell et al. 1990; Anand and Kogut 1997), as firm entries cluster in bigger areas (Alañón-Pardo and Arauzo-Carod 2013). This causes a type of heteroskedasticity which yields downward biased estimates of the standard errors (SEs) (Cameron and Trivedi 1986, p. 31; Winkelmann and Zimmermann 1995; Kennedy 2003, p. 279f; Mota and Brandão 2011). In the data at hand, the variance in entry and surviving entry was larger than the mean at all levels of aggregation (Table 4, above). It therefore appears that the equidispersion assumption is not valid.
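A rough descriptive check of equidispersion compares the variance of the counts with their mean, as sketched below. Note that this uses unconditional moments, whereas the assumption concerns the conditional variance and mean, so the ratio is only a first indication. The function name and the counts are hypothetical.

```python
from statistics import mean, pvariance
from typing import Sequence


def dispersion_ratio(counts: Sequence[int]) -> float:
    """Variance-to-mean ratio; values above one indicate overdispersion."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean count is zero; ratio undefined")
    return pvariance(counts) / m


# Hypothetical entry counts across territorial units in one year.
entry_counts = [0, 0, 0, 1, 0, 2, 0, 0, 15, 0]
ratio = dispersion_ratio(entry_counts)
```

A few units with many entrants, as in the hypothetical example, are enough to push the ratio well above one.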

The second assumption for the poisson model is that of no excess zeros. As mentioned, poisson models can deal with the existence of some observations with value zero, but not with an excessive number (Arauzo-Carod and Viladecans-Marsal 2009; Cameron and Trivedi 1998). Figure 1 shows histograms of the annual occurrence across municipalities of entry 2000–2008 and surviving entry 2000–2006. As can be seen, zero was the most frequent occurrence. The preponderance of zeros is even more marked at the level of industries within municipalities. This suggests that the second assumption of no excess zeros may also be invalid for our data.
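One simple way to gauge whether zeros are excessive is to compare the observed share of zeros with the share a poisson distribution with the same mean would imply, \(e^{-\bar{y}}\). The sketch below is a descriptive illustration under that assumption, using hypothetical counts; it is not the procedure used for the histograms in this paper.

```python
import math
from typing import Sequence, Tuple


def zero_shares(counts: Sequence[int]) -> Tuple[float, float]:
    """Return (observed share of zeros, share implied by a poisson with the same mean)."""
    n = len(counts)
    if n == 0:
        raise ValueError("counts must be non-empty")
    observed = sum(1 for c in counts if c == 0) / n
    implied = math.exp(-sum(counts) / n)
    return observed, implied


# Hypothetical entry counts across territorial units in one year.
observed, implied = zero_shares([0, 0, 0, 1, 0, 2, 0, 0, 15, 0])
```

An observed share far above the implied share points to more zeros than a poisson process with that mean would generate.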

Fig. 1 Histograms of the occurrence of entry in municipalities, 2000–2008, and the occurrence of surviving entry in municipalities, 2000–2006

The descriptive statistics hence display signs of both overdispersion and excess zeros. It is likely that both problems stem from the existence of unobserved heterogeneity in the mean parameter (Mullahy 1997; Arauzo-Carod et al. 2010, p. 700). This makes it necessary to consider other count data models that can deal with these shortcomings.

The negative binomial model is an obvious candidate. It handles overdispersion by increasing the conditional variance without changing the conditional mean. It does so by lifting the assumption of independence of observations by adding a parameter reflecting unobserved (between-subject) heterogeneity (Arauzo-Carod 2008), i.e., when different groups of observations display different variance patterns but are homogenous within groups (Greene 2008, p. 911). The heterogeneity can hence be interpreted as a location-specific random effect (Arauzo-Carod et al. 2010, p. 700). At the same time, the negative binomial model can handle the excess zero problem by allowing for a higher probability of a zero count and a longer tail than the poisson. Its use in the previous business location literature has been frequent (see e.g., Gabe and Bell 2004; Barbosa et al. 2004; Holl 2004a; Autant-Bernard et al. 2006; Arauzo-Carod and Viladecans-Marsal 2009; Alañón-Pardo and Arauzo-Carod 2013; Mota and Brandão 2011).

Yet the negative binomial model also assumes that every location has a nonzero probability of a positive occurrence, hence assuming the same process for zero and nonzero counts. There may, however, be a qualitative difference between transition from zero events to the first occurrence, for example, if a two-stage process governs the firm’s decision to enter the market, analogous to location decisions of foreign investors (List 2001; Dohse and Schertler 2003). This points to the possibility of instead making a discrete representation of the unobserved locational heterogeneity (Arauzo-Carod et al. 2010).

The zero-inflated poisson and the zero-inflated negative binomial models allow zeros to be generated by two processes, assuming two groups of latent (i.e., unobserved) observations. Observations in the first group have no probability of entry, e.g., because they are banned by environmental regulations. Observations in the second group have a nonzero probability of entry, since in principle there is nothing that prevents entry from happening, although it may not happen for example because the location is small or remote (Arauzo-Carod et al. 2010; Kennedy 2003, pp. 279–280). Zeros can hence arise for two reasons. This binary form of heterogeneity is commonly parameterized using the logistic or normal cumulative distribution functions. Depending on the choice of cdf, a logit or probit model for the probability of zero entrants is mapped onto the original count model. The resulting specification critically differs from the parent count model in that it does not yield the same probabilities for zero and positive outcomes (Mullahy 1986; Greene 1994).
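For the poisson case, the two-process structure can be written out explicitly. The notation is introduced here for illustration only: \(\pi _{i}\) denotes the probability that observation \(i\) belongs to the group with no probability of entry, and \(\lambda _{i}\) is the poisson mean in the other group.

$$\begin{aligned} \Pr (y_{i}=0)&=\pi _{i}+(1-\pi _{i})e^{-\lambda _{i}},\\ \Pr (y_{i}=k)&=(1-\pi _{i})\frac{e^{-\lambda _{i}}\lambda _{i}^{k}}{k!},\quad k=1,2,\ldots \end{aligned}$$

The first line shows the two sources of zeros: membership in the first group, and a zero draw from the count process in the second group. With a logistic cdf, \(\pi _{i}\) would be modeled as \(\mathrm{exp}(z'_{i}\gamma )/(1+\mathrm{exp}(z'_{i}\gamma ))\), where \(z_{i}\) and \(\gamma \) are hypothetical covariates and parameters of the zero-generating process.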

The zero-inflated poisson model uses the standard poisson model in this manner, while the zero-inflated negative binomial model uses the standard negative binomial model. In the zero-inflated poisson model, this entails adding a stochastic error term to the specification for the conditional mean. This splitting mechanism changes the mean structure by increasing the conditional variance and the probability of zero counts (Lambert 1992; Greene 1994). Meanwhile, one can view the zero-inflated negative binomial model as adding unobserved heterogeneity to the poisson equation for the counts in the latent group. Zero-inflated models have previously been used to study business location decisions by, e.g., Gabe (2003), Arauzo-Carod (2008) and Manjón-Antolín and Arauzo-Carod (2011).

The longitudinal structure of the data at hand can be used as a complementary way of dealing with unobserved heterogeneity. The fixed effects estimator for poisson and negative binomial models developed by Hausman et al. (1984) is consistent regardless of the correlation between the individual effects and covariates and has been used extensively in previous industry location studies (see e.g., Papke 1991; Becker and Henderson 2000; List and McHone 2000; Holl 2004a, b, c; Manjón-Antolín and Arauzo-Carod 2011).

While the validity of the standard poisson model seems questionable based on what is known of the dataset, the choice between the three other count data models is not as clear-cut pre-estimation. As mentioned, the negative binomial model assumes between-subject heterogeneity, the zero-inflated poisson model assumes different probability models for zero and nonzero counts, while the zero-inflated negative binomial model assumes both between-subject heterogeneity and different probability models (Arauzo-Carod 2008, p. 201).

Post-estimation, it becomes easier to assess the models. Figure 2 in the appendix plots residuals from regressions at the 5-digit industry level. It shows that the poisson model makes its largest misprediction at count zero, which, given what is known about the model’s difficulty in handling excess zeros, is not surprising. The three other models see their largest mispredictions at count one. All three models underpredict this occurrence. However, the underprediction is small for the negative binomial model relative to the zero-inflated models. This casts doubt on a binary form of heterogeneity and different processes for zero and nonzero counts. Rather, overdispersion and excess zeros seem to reflect between-subject heterogeneity. Likelihood-ratio and Vuong tests also confirm that the negative binomial model is preferable to the other three models.Footnote 3 For this reason, I will present results in the remainder of the paper using this model.

Fig. 2 Residuals from the four models at the 5-digit NACE level

The negative binomial generalizes the poisson by introducing an individual unobserved effect into the conditional mean. A gamma density is usually assumed. In the formulation suggested by Cameron and Trivedi (1986), the model has a variance function that is a simple multiple of the mean. Familiar panel data approaches have a fairly straightforward extension in the count data setting (Hausman et al. 1984). The fixed effect is built into the model as an individual specific \(\theta _{i}\). Thus,

$$\begin{aligned} \Pr \left( y_{i1},y_{i2},\ldots ,y_{iT_{i}}\,\middle |\,\sum _{t=1}^{T_{i}}y_{it}\right) =\frac{\varGamma \left( 1+\sum _{t=1}^{T_{i}}y_{it}\right) \varGamma \left( \sum _{t=1}^{T_{i}}\lambda _{it}\right) }{\varGamma \left( \sum _{t=1}^{T_{i}}y_{it}+\sum _{t=1}^{T_{i}}\lambda _{it}\right) } \prod _{t=1}^{T_{i}}\frac{\varGamma \left( y_{it}+\lambda _{it}\right) }{\varGamma (1+y_{it})\varGamma (\lambda _{it})}, \end{aligned}$$

where \(\lambda _{it}=\mathrm{exp}(x'_{it}\beta )\), \(E(y_{it}|x_{it})=\theta _{i}\mathrm{exp}(x'_{it}\beta )\) and \(Var(y_{it}|x_{it})=\theta _{i}\mathrm{exp}(x'_{it}\beta )(1+\theta _{i})\).
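The conditional probability above depends only on the counts and on \(\lambda _{it}\), so the individual effect \(\theta _{i}\) does not need to be estimated. A minimal sketch of its logarithm for one unit is given below, using log-gamma functions for numerical stability. The function name and the example values are hypothetical, and the sketch is not the estimation routine used in this paper.

```python
import math
from typing import Sequence


def conditional_loglik(y: Sequence[int], lam: Sequence[float]) -> float:
    """Log of the conditional probability of (y_i1, ..., y_iT) given their sum,
    for one unit i, with lam[t] = exp(x_it' beta) > 0."""
    if len(y) != len(lam):
        raise ValueError("y and lam must have the same length")
    sum_y, sum_lam = sum(y), sum(lam)
    out = math.lgamma(1 + sum_y) + math.lgamma(sum_lam) - math.lgamma(sum_y + sum_lam)
    for y_t, lam_t in zip(y, lam):
        out += math.lgamma(y_t + lam_t) - math.lgamma(1 + y_t) - math.lgamma(lam_t)
    return out


# Hypothetical two-period example: the two ways of splitting a total of one
# entrant across the periods have conditional probabilities that sum to one.
lam = [2.0, 1.0]
p_first = math.exp(conditional_loglik([1, 0], lam))
p_second = math.exp(conditional_loglik([0, 1], lam))
```

Summing this function over units gives the conditional log-likelihood that would be maximized with respect to \(\beta \).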

Taking this as a point of departure, the regression estimated at the municipality level can be expressed as

$$\begin{aligned} \mathrm{entry}_{mt}=\theta _{m}\mathrm{exp}(\mathbf {x}'_{mt-1}\mathbf {\beta })=\theta _{m}\mathrm{exp}(\beta _{0}+\beta _{1}'Y_{mt-1}+\beta _{2}'T_{t}+\varepsilon _{mt}) \end{aligned}$$
(1)

where \(\mathrm{entry}_{mt}\) is a discrete count variable measuring the number of entrants in municipality \(m\,(m=1,\ldots , 289)\) in year \(t\,(t=2000,\ldots , 2008)\). \(Y_{mt-1}\) is a vector containing the municipal-specific covariates, while \(T_{t}\) are time-specific fixed effects. \(\beta _{0}\) is a constant term, while \(\beta _{1}\) and \(\beta _{2}\) are parameter vectors to be estimated, and \(\varepsilon _{mt}\) is the random error term.

For estimation of regressions at the 1-, 3- and 5-digit industry level within municipalities, a slightly different specification is needed,

$$\begin{aligned} \mathrm{entry}_{mjt}=\theta _{mj}\mathrm{exp}(\mathbf {x}'_{mjt-1}\mathbf {\beta })=\theta _{mj}\mathrm{exp}(\alpha _{0}+\alpha _{1}'X_{mjt-1}+\alpha _{2}'Y_{mt-1}+\alpha _{3}'T_{t}+\varepsilon _{mjt}) \end{aligned}$$
(2)

where \(\mathrm{entry}_{mjt}\) measures the number of entrants in municipality \(m\,(m=1,\ldots , 289)\), in industry \(j\), where \(j\) corresponds to either the 1-digit \((j=1,\ldots ,10)\), 3-digit \((j=1,\ldots ,370)\) or 5-digit \((j=1,\ldots ,1{,}428)\) level, in year \(t\,(t=2000,\ldots {}, 2008)\). The vector \(X_{mjt-1}\) contains industry-specific covariates, \(Y_{mt-1}\) is the vector of municipality-specific covariates, and \(T_{t}\) are time-specific fixed effects. \(\alpha _{0}\) is a constant term, while \(\alpha _{1}\), \(\alpha _{2}\) and \(\alpha _{3}\) are parameter vectors to be estimated, and \(\varepsilon _{mjt}\) is the error term.

Checking whether results are sensitive to different levels of aggregation is important. The relevant market size may for example differ between industries and firms. The relevant market for most hairdressers probably is the municipality, or even a geographical location within the municipality. Meanwhile, the relevant competition for some firms might be firms from a broad section of knowledge industries, while for others it may be just a few firms producing very similar products. Employing different levels of aggregation is a means of controlling for this, although hardly foolproof since each firm and industry is to some extent unique in this respect. This is important to keep in mind when interpreting the results from the estimations. All regressors are furthermore lagged by 1 year to avoid simultaneity. Longer lag structures were avoided so as to not lose more time periods than necessary (Ilmakunnas and Topi 1999).Footnote 4
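The 1-year lag of the regressors, and the reason each additional lag costs a time period, can be sketched as follows. The data layout (dictionaries keyed by unit and year), the function name and the values are hypothetical assumptions for illustration.

```python
from typing import Dict, Hashable, List, Tuple

Key = Tuple[Hashable, int]  # (unit identifier, year)


def pair_with_lagged_regressors(
    entry: Dict[Key, int], x: Dict[Key, float]
) -> List[Tuple[Hashable, int, int, float]]:
    """Match each (unit, year) outcome with the regressor value from year - 1.

    Rows whose lagged regressor is unavailable are dropped, which is why
    every additional lag costs one time period.
    """
    rows = []
    for (unit, year), count in sorted(entry.items()):
        lagged = x.get((unit, year - 1))
        if lagged is not None:
            rows.append((unit, year, count, lagged))
    return rows


# Hypothetical municipality "A" observed over three years.
entry = {("A", 2000): 4, ("A", 2001): 6, ("A", 2002): 5}
income = {("A", 2000): 210.0, ("A", 2001): 215.0, ("A", 2002): 221.0}
rows = pair_with_lagged_regressors(entry, income)
```

In this example the first year drops out because no regressor value from the preceding year is available.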

5 Results

Table 6 presents the results from estimations of the negative binomial model with entry at the four levels of aggregation as the dependent variable. Regressions include time- and municipality-specific fixed effects, and SEs are clustered at municipality level. Z statistics and Wald \(\chi ^{2}\) tests are reported. Column (i) shows results when the dependent variable is total entry in municipality \(m\) at time \(t\). Columns (ii)–(iv) show results with entry in the municipality at the 1-, 3- and 5-digit industry levels as dependent variable.Footnote 5

Table 6 Incidence-rate ratios and \(z\) statistics from regressions with entry as dependent variable, time- and municipality-specific fixed effects, SEs clustered at municipality level, 2000–2008

Since count data are highly non-normal, the effect of each variable on the outcome depends on the level of all other variables. To facilitate the interpretation of the effects, results are presented as incidence-rate ratios which are the factor by which the dependent variable can be expected to change for a SD increase in the explanatory variable. A value above one corresponds to a positive effect, while a value below one corresponds to a negative effect (Long and Freese 2006).
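The conversion from a coefficient to an incidence-rate ratio for a SD increase can be sketched as below, assuming a log-link count model in which the coefficient refers to the unstandardized regressor. The function names and the numerical values are hypothetical and are not taken from Table 6.

```python
import math


def irr_per_sd(beta: float, sd: float) -> float:
    """Factor change in the expected count for a one-SD increase in a regressor,
    given a log-link coefficient `beta` on the unstandardized regressor."""
    return math.exp(beta * sd)


def percent_change(irr: float) -> float:
    """Convert an incidence-rate ratio into a percentage change."""
    return (irr - 1.0) * 100.0


# Hypothetical coefficient and standard deviation, not taken from Table 6.
ratio = irr_per_sd(beta=0.05, sd=3.0)
change = percent_change(ratio)
```

A ratio of 1.16, for instance, corresponds to 16 % more expected entry, while a ratio of 0.5 corresponds to a halving.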

A SD increase in the share of the population with at least 3 years of post-secondary education was associated with an increase in entry of 16–45 %, although the effect at the 5-digit level is not statistically significant. And while unemployment seems to have had little association with entry, a SD increase in municipal median income was associated with an increase of 15–30 %.

While the population variable was statistically significant only at the 3- and 5-digit levels, the size of the association was large. 5-digit industries had eleven times more entry in municipalities with a SD higher population. This points to the importance of urbanization economies for entry.

The political economy variables appear to have been less important. A higher municipal tax rate was associated with more entry, possibly because of a higher relative payoff from entrepreneurial activities, or because entry facilitates tax evasion (Robson and Wren 1999). The effect was not always statistically significant, nor very dramatic; a tax higher by one SD (about one percentage point) corresponded to about 4 % higher entry. The association of entry with municipal spending was negative but also small and never statistically significant.

The business climate variable was only statistically significant at the 3-digit level, where a perceived 1 SD better business climate was associated with 3 % more entry. At the 3- and 5-digit levels, industries in municipalities with a right-leaning government should see 5–6 % more entry than those in municipalities with other governments, in line with the findings of Daunfeldt et al. (2010).

Among the industrial organization variables, localization economies seem important. A SD more firms in an industry at the 3- and 5-digit levels was associated with about twice as many entrants.

Profits had the expected positive association only at the 1-digit level, casting doubt on the importance of this variable for entry, in line with previous evidence (Geroski 1995). As mentioned, this may be because most industries are already close to equilibrium, so that while profits are important for entry decisions, the data at hand could not capture the relation.

With a 1 SD higher minimum efficient scale, there was 27–60 % less entry at the 5- and 1-digit levels. The effect was not statistically significant at the 3-digit level but still negative and large. Likewise, a SD higher Herfindahl index was associated with 37 % less entry at the 1-digit level, and more than 50 % less entry at the 3- and 5-digit levels.

The effect from R&D intensity is ambiguous, as the direction is negative at the 1-digit level and positive at the other levels. Industries with 1 SD more liquidity had 8 % more entry at the 3- and 5-digit levels.

Table 7 presents results from negative binomial regressions with surviving entry as dependent variable at the four levels of aggregation. The results are similar to those in Table 6, with a few interesting exceptions. The effect from population is now considerably smaller and never statistically significant, suggesting that while urbanization may be conducive to entry in general, it had little importance for firm survival. But education is now always significant and the effect is larger except at the 3-digit level.

Table 7 Incidence-rate ratios and \(z\) statistics from regressions with surviving entry as dependent variable, time- and municipality-specific fixed effects, SEs clustered at municipality level, 2000–2006

Table 9 in the “Appendix” shows results when the dependent variable is regressed against each regressor one at a time. The direction of the effects is the same in all cases, with the exception of R&D intensity, which is now also negative and significant at the 3- and 5-digit levels. The results are also similar in Table 10, which shows some alternative specifications of the model at the 3-digit level where regressors with a pairwise correlation higher than 0.3 are excluded in turn. This suggests that multicollinearity is of little concern.

6 Summary

Previous evidence suggests that while the short-run effect of start-ups on growth is negative, the long-run effect may be positive and significant for as long as 10 years. This lagged positive effect also seems to occur for start-ups at the regional level (Johnson and Parker 1996; Dejardin 2011; Carree and Thurik 2008). At the same time, while most firms die within the first few years after entry, those that do survive can be expected to grow faster than their mature counterparts (Wagner 1994; Haltiwanger et al. 2010).

This points to the importance of distinguishing between regular and surviving entrants when considering regional and industrial determinants of new firm formation. This study, using a dataset covering all limited liability firms in Sweden during 2000–2008, makes such a distinction, by considering the municipal and industrial determinants conducive to regular entrants and surviving entrants, the latter defined as firms that survive for at least 2 years. The dataset made it possible to trace not only in what municipality a new firm was established, but also in what industry, down to the 5-digit level.

A negative binomial regression model was estimated, since this model accounted for overdispersion in the dependent variable and had the greatest predictive power of the four count data models considered. Results suggest that a SD increase in municipal median income increases expected entry by 15–30 %, in line with the literature suggesting that higher incomes offer more resources for entrepreneurship (cf. Reynolds et al. 2002; Bosma et al. 2010). Meanwhile, a SD increase in the share of the population with at least 3 years of post-secondary education is associated with an increase in entry of 16–45 %, supporting evidence that higher education in a local area increases the supply of entrepreneurs (Doms et al. 2010; Glaeser et al. 2010). Results also suggested that SD increases in industry concentration or in minimum efficient scale could more than halve the expected number of entrants, in line with previous evidence (cf. Geroski 1995; Glaeser et al. 2010; Glaeser and Kerr 2009). By comparison, the effects from taxes, government ideology and business climate were all small. This may be because the variation of these variables is generally rather small, suggesting that results should be more obvious in cross-country comparisons. On the other hand, Stenholm et al. (2013) suggest that while the regulatory environment matters for the formation of new firms, it matters less for the formation of innovative, high-growth ventures.

Interesting differences appeared as regards the determinants of regular and surviving entrants. Notably, population size had a strong positive association with entry, but the effect seemed to be less important for surviving entry. This could be taken to imply that the downsides of agglomeration, e.g., higher land prices and congestion costs (Pellenbarg et al. 2002), mainly affect such firms. By contrast, the importance of the level of education appears stronger for surviving entrants than for regular entrants, pointing to the importance of human capital. This is in line with previous evidence pointing to the importance of human capital for high-growth ventures (cf. Stenholm et al. 2013).

While this study distinguished between entrants in general and those that survived for at least 2 years, further studies could do more in this respect. For example, it would be interesting to consider gazelle entrants (Henrekson and Johansson 2010), i.e., firms that grow quickly post-entry. Do the conditions favoring gazelle entry differ from those favoring entry in general?