1 Introduction

Since the pioneering works of Schumpeter (1934, 1942), entrepreneurship has been regarded as a main driver of development in market economies. There is vast empirical evidence supporting a positive relationship between entrepreneurial activity and growth (Callejón and Segarra, 1999; Audretsch and Keilbach, 2004; Carree and Thurik, 2003). This evidence has motivated policymakers in both developed and developing countries to implement policies and programs aimed at stimulating entrepreneurship and business creation (Gilbert et al., 2004).

The importance of entrepreneurship for development has renewed interest in understanding the factors that promote entrepreneurial activity. Large-scale international efforts, such as the Global Entrepreneurship Monitor (Reynolds et al., 2000) and the Global Entrepreneurship and Development Index (Szerb & Acs, 2012), are examples of such intellectual interest. Among the vast taxonomy of factors identified by these and other research programs, our interest is framed in the discussion about individual versus place-based drivers of self-employment, an important aspect of entrepreneurship (Parker, 2009) that remains fairly under-researched from a regional point of view. The concern for regional drivers of self-employment is motivated by the observation in many countries that subnational differences in entrepreneurial activity are sharp and persistent (Parker, 2005a, b; Fritsch and Mueller, 2007; Andersson and Koster, 2011).

While there is a rich literature focusing on the importance of individual traits for self-employment and entrepreneurship (Durand and Shea, 1974; McClelland, 1965; Blanchflower and Oswald, 1998), another strand emphasizes the conditions of the local contexts in which potential entrepreneurs operate (Glaeser et al., 2010; Naudé et al., 2008; Audretsch and Fritsch, 1994). Between the two traditions, there is a gap in linking both scales of analysis in conceptual frameworks addressing the selection into self-employment and in empirically testing their relative importance. In particular, most applied studies are spatially neutral and ignore the fundamental tradeoff between the returns to self-employment and salaried work, as remarked by occupational choice theories (see Parker, 2009). Nevertheless, the literature suggests that there are both place-specific and individual characteristics that may simultaneously affect wages and self-employed earnings in the same direction, thus exerting an a priori ambiguous effect on the propensity to choose self-employment. For example, formal education likely influences productivity in both the managerial (Hundley, 2001) and labor (Paredes, 2013) functions. An example of a location variable is market access, which has been found to raise business profits (Sato et al., 2012) and, at the same time, average wage rates (Hanson, 2005).

In sum, there is a need to move toward more structured, economically grounded, and spatially sensitive approaches to the drivers of self-employment. In this paper, we go a step further in disentangling this hierarchical relationship and in exploiting individual and place heterogeneity to assess both kinds of factors conditioning occupational choices, using a structural and multilevel approach.

We ask whether territorial conditions help explain the selection into self-employment in Chile once individual attributes are taken into account. There are not many analyses of spatial drivers of self-employment in the country. Spatial differences in self-employment rates are, however, large, as in most countries. Figure 1 depicts the municipal rates of self-employment outside agriculture according to the place of work, built from data from the national population census of 2002. There are municipalities with self-employment rates that double the rates of others. Our starting hypothesis is that place exerts an influence above and beyond individual characteristics, since it conditions, to a great extent, the performance of businesses and the productivity of workers, thus shaping the expectations on relative returns to alternative occupations.

Fig. 1 Rates of self-employment outside agriculture in Chilean municipalities, 2002. Source: Authors based on the 2002 National Population Census

To test for the importance of individual and place-based drivers of self-employment in Chile, we follow an essentially empirical, open-ended approach, which is nurtured by different strands of the entrepreneurship and the spatial economics literatures. We extend the standard occupational choice framework by conditioning the agent’s decision variables on individual attributes, but also on regional characteristics according to mainstream location theories. By doing so, this paper articulates, in a unified empirical setting, two strands of the literature on entrepreneurship: the determinants of (e.g., Blanchflower and Oswald, 1998; Evans and Leighton, 1989) and the returns to entrepreneurship and self-employment (e.g., Hamilton, 2000; Hundley, 2001).

Along with being one of the few studies linking the occupational choice and the spatial economics literatures, this article offers some methodological innovations. In the first place, the empirical strategy entails the estimation of a structural (in a statistical sense) model composed of a set of reduced-form earnings equations and a probabilistic model of selection into self-employment (see Parker, 2005b) depending, among other factors, on expected earning differentials. This approach allows disentangling the net partial effects of factors conditioning occupational choices, exerted both directly on the propensity to be in self-employment and indirectly through expected wages and self-employed earnings. Following a recursive estimation method, we are able to build a research design that offers a way around the impossibility of observing individuals in both states (self-employment and salaried work). Finally, we follow a multilevel econometric strategy that allows us to better account for sources of variation at different hierarchical scales.

The results indicate that while individual-level variables account for a large share of the variation of wages, self-employed earnings, and the probability of being in self-employment, there is still a non-negligible remaining variation due to places, which is largely explained by factors set out by economic theories of location. In addition, we verify that the mechanisms driving the selection into self-employment differ markedly between employers and own-account self-employed. While theoretical expectations based on occupational choice theories are largely met for the former group, the results for the own-account self-employed point to selection based on individual traits and territorial conditions pushing people into low-productivity self-employment.

The paper is organized in five sections. Section 2 presents the conceptual framework and the empirical specification. Section 3 describes the variables and data sources. Section 4 presents and discusses the results and the final section concludes.

2 Modeling framework

2.1 The occupational choice

At any point in time, individual i in region r will choose self-employment as long as the utility of independent work \( \left({U}^e\right) \) is greater than that of salaried work \( \left({U}^w\right) \) (Blanchflower and Oswald, 1998). This unobserved utility differential can be summarized by means of a latent variable \( \left(I\right) \), such that the decision rule can be modeled through the following indicator function (e.g., Goetz and Rupasingha, 2009):

$$ {I}_{ir}=\left\{\begin{array}{ll}1& \mathrm{if}\ {U^e}_{ir}-{U^w}_{ir}>0\\ {}0& \mathrm{otherwise}\end{array}\right., $$
(1)

with I = 1 indicating that the individual is self-employed.

Following the random utility framework (McFadden, 1974), utilities derived from alternative occupational choices are expressed as a sum of a linear deterministic (V) and a stochastic (ε) component:

$$ {U^e}_{ir}={V^e}_{ir}+{\varepsilon^e}_{ir}, $$
(2)
$$ {U^w}_{ir}={V^w}_{ir}+{\varepsilon^w}_{ir}. $$
(3)

According to occupational choice theories (Lucas, 1978; Parker, 2005a, b), individuals choose among alternative occupations based on expected earnings, such that the deterministic components of (2) and (3) depend on expected wages \( \left({\tilde{w}}_{ir}\right) \) and self-employment earnings \( \left({\tilde{\pi}}_{ir}\right) \). However, the literature also indicates that the propensity to choose self-employment may vary according to a broad range of individual traits and regional conditions, so we include a vector of individual attributes characterizing the individual in either occupation \( \left({\mathbf{y}}_i\right) \) and a vector of place characteristics conditioning the individual’s utility \( \left({\mathbf{p}}_r\right) \).Footnote 1

$$ {V^e}_{ir}={\delta}_0^e+{\delta}_1{\tilde{\pi}}_{ir}+{\boldsymbol{\updelta}}_2^e{\mathbf{y}}_i+{\boldsymbol{\updelta}}_3^e{\mathbf{p}}_r $$
(4)
$$ {V^w}_{ir}={\delta}_0^w+{\delta}_1{\tilde{w}}_{ir}+{\boldsymbol{\updelta}}_2^w{\mathbf{y}}_i+{\boldsymbol{\updelta}}_3^w{\mathbf{p}}_r $$
(5)

Given (4) and (5), the indicator function (1) takes the form

$$ {I}_{i r}^{\ast }={\delta^I}_0+{\delta^I}_1\left({\tilde{\pi}}_{i r}-{\tilde{w}}_{i r}\right)+{{\boldsymbol{\updelta}}^I}_2{\mathbf{y}}_i+{{\boldsymbol{\updelta}}^I}_3{\mathbf{p}}_r+\left({\varepsilon^e}_{i r}-{\varepsilon^w}_{i r}\right). $$
(6)
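The step from Eqs. (4) and (5) to Eq. (6) can be made explicit. As a sketch that only restates the definitions above, with the deterministic components written in terms of expected earnings, subtracting (3) from (2) gives

$$ {U^e}_{ir}-{U^w}_{ir}=\left({\delta}_0^e-{\delta}_0^w\right)+{\delta}_1\left({\tilde{\pi}}_{ir}-{\tilde{w}}_{ir}\right)+\left({\boldsymbol{\updelta}}_2^e-{\boldsymbol{\updelta}}_2^w\right){\mathbf{y}}_i+\left({\boldsymbol{\updelta}}_3^e-{\boldsymbol{\updelta}}_3^w\right){\mathbf{p}}_r+\left({\varepsilon^e}_{ir}-{\varepsilon^w}_{ir}\right), $$

so that, in the notation of Eq. (6), \( {\delta^I}_0={\delta}_0^e-{\delta}_0^w \), \( {\delta^I}_1={\delta}_1 \), \( {{\boldsymbol{\updelta}}^I}_2={\boldsymbol{\updelta}}_2^e-{\boldsymbol{\updelta}}_2^w \), and \( {{\boldsymbol{\updelta}}^I}_3={\boldsymbol{\updelta}}_3^e-{\boldsymbol{\updelta}}_3^w \). Two points follow directly: only the differences between occupation-specific coefficients enter the selection equation, and it is the common coefficient \( {\delta}_1 \) on earnings in (4) and (5) that lets the two earnings terms collapse into a single differential.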

The expectations formation mechanism implied by Eq. (6) is that, at any moment, the individual assesses the potential return to each occupation based on the actual earnings observed for individuals with similar characteristics located in the same regional economy. Thus, it resembles the selection mechanism of other models in the literature (e.g., Parker, 2005a, b). Yet, people may also choose self-employment based on personality traits that shape individual preferences toward self-employment, such as orientation to control and achievement or a desire for labor independence (Blanchflower and Oswald, 1998; Durand and Shea, 1974). These varied motivations are captured by variables \( {\mathbf{y}}_i \) in Eq. (6). In addition, regional conditions may influence attitudes toward self-employment, such as risk in local labor markets (Low and Weiler, 2012) or the local entrepreneurial culture (Beugelsdijk, 2007). These regional conditions are captured by variables \( {\mathbf{p}}_r \) in Eq. (6).

Assuming a logistic distribution for \( {\overline{\varepsilon}}_{ir}={\varepsilon^e}_{ir}-{\varepsilon^w}_{ir} \), linearity in parameters implies a logit representation of the indicator function (6):

$$ P\left({I}_{ir}=1|{\tilde{\pi}}_{ir},{\tilde{w}}_{ir}\right)=\frac{1}{1+ \exp \left(-\left[{\delta^I}_0+{\delta^I}_1\left({\tilde{\pi}}_{ir}-{\tilde{w}}_{ir}\right)+{{\boldsymbol{\updelta}}^I}_2{\mathbf{y}}_i+{{\boldsymbol{\updelta}}^I}_3{\mathbf{p}}_r\right]\right)} $$
(7)

Equation (7) is estimated recursively following a two-stage procedure. In the first stage, expected wages and self-employed earnings are predicted, and in the second stage, the selection equation is estimated with the predicted earning differentials. We are thus able to build expected earnings in each occupation for the individuals in the sample and test for the hypothetical tradeoff between expected self-employed earnings and wages involved in the decision of becoming an independent worker.
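The logic of the two-stage procedure can be illustrated with a deliberately simplified sketch. Everything in it is an assumption made for illustration: the data are simulated, schooling is the only regressor, the earnings equations are single-level least squares rather than the multilevel models used in the paper, and selection is driven only by observables. The differential is the only regressor in the second stage because, with a single covariate, the predicted differential is an exact linear function of schooling.

```python
import math
import random
from statistics import mean


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_ols(x, y):
    """Least squares of y on a constant and a single regressor x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - slope * mx, slope


def fit_logit(x, y, iterations=25):
    """Newton-Raphson for a logit with a constant and a single regressor x."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = sigmoid(b0 + b1 * xi)
            v = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += v
            h01 += v * xi
            h11 += v * xi * xi
        det = h00 * h11 - h01 * h01
        if det < 1e-12:  # Hessian is numerically singular: stop updating
            break
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


def simulate(n=2000, seed=0):
    """Hypothetical sample: (schooling, self-employed flag, log earnings).
    Earnings are observed only in the occupation actually chosen."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        s = rng.randint(6, 18)
        mean_w = 1.0 + 0.10 * s   # assumed expected log wage
        mean_pi = 1.6 + 0.05 * s  # assumed expected log self-employed earnings
        self_emp = 1 if rng.random() < sigmoid(4.0 * (mean_pi - mean_w)) else 0
        observed = (mean_pi if self_emp else mean_w) + rng.gauss(0.0, 0.3)
        rows.append((s, self_emp, observed))
    return rows


def two_stage(rows):
    # Stage 1: one earnings equation per occupation, each fitted on the
    # subsample observed in that occupation.
    a_w, b_w = fit_ols([s for s, i, _ in rows if i == 0],
                       [y for _, i, y in rows if i == 0])
    a_pi, b_pi = fit_ols([s for s, i, _ in rows if i == 1],
                         [y for _, i, y in rows if i == 1])
    # Predict earnings in BOTH occupations for every individual.
    differential = [(a_pi + b_pi * s) - (a_w + b_w * s) for s, _, _ in rows]
    # Stage 2: logit of the occupation indicator on the predicted differential.
    delta_0, delta_1 = fit_logit(differential, [i for _, i, _ in rows])
    return {"wage_slope": b_w, "profit_slope": b_pi,
            "differential": differential,
            "delta_0": delta_0, "delta_1": delta_1}


result = two_stage(simulate())
print(round(result["delta_1"], 2))  # a positive value supports the tradeoff
```

In a richer specification, variables that enter the earnings equations but not the selection equation are what keep the differential from being collinear with the other second-stage regressors. Standard errors from a naive second stage also ignore that the differential is itself estimated.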

To specify the two earning equations predicting \( \pi \) and \( w \), we first assume that each individual is equipped with a range of either intrinsic or acquired attributes that we generically call human capital. For analytic purposes, human capital is split into a set of skills conditioning labor productivity \( \left({\mathbf{h}}^w\right) \) and a set of attributes defining the ability to manage businesses \( \left({\mathbf{h}}^e\right) \), the latter usually regarded as managerial talent (Lucas, 1978) or entrepreneurial ability (Parker, 2005a, b). It is worth noting, as remarked by Lucas (1978), that this distinction is somewhat artificial and that, in practice, \( {\mathbf{h}}^w \) and \( {\mathbf{h}}^e \) may well have several elements in common.

Thus, following the Mincerian tradition (Mincer, 1974), the expected wage the agent can get as a salaried worker is modeled as a function of her stock of human capital. Wage rates also vary across regions due to location externalities influencing labor productivity, which include well-known channels such as urbanization and localization economies (Ciccone and Hall, 1996; Rosenthal and Strange, 2001). At the same time, the local context affects wages through spatial price differentials, due to the level of amenities (Roback, 1982) or the distance to markets (Hanson, 2005). Such place-specific variables are summarized in vector \( {\mathbf{z}}_r \), and therefore, the expected wage equation is specified as

$$ {\tilde{w}}_{ir}= w\left({\mathbf{h}}_{ir}^w,{\mathbf{z}}_r\right) $$
(8)

On the other hand, productivity in self-employment depends positively on the managerial skill of the entrepreneur \( \left({\mathbf{h}}^e\right) \) (Lucas, 1978; Parker, 2005a, b), but we also include here the same regional variables capturing the effects of location on workers’ productivity \( \left({\mathbf{z}}_r\right) \). Finally, we consider the fixed costs of setting up and running a business \( \left({\mathbf{F}}_r\right) \), which also vary across locations due to, for instance, differences in cost structures across industries and in transaction costs in factor and product markets (Glaeser et al., 2010). Thus, the self-employed earning equation is defined as

$$ {\tilde{\pi}}_{ir}=\pi \left({\mathbf{z}}_r,{\mathbf{h}}_{ir}^e,{\mathbf{F}}_r\right). $$
(9)

Equations (7), (8), and (9) motivate, through a simple framework, an exploration of the role of individual and location-specific factors in conditioning occupational choices. The framework above makes it clear that individual and place-based factors affect the propensity to self-employment both directly (captured by parameters \( {{\boldsymbol{\updelta}}^I}_2 \) in Eq. (7)) and indirectly by conditioning the expected earnings in Eqs. (8) and (9) and, through this differential, the propensity to be in self-employment (mediated by parameter \( {\delta^I}_1 \) in Eq. (7)). Our modeling framework should be seen as a partial equilibrium approach that emphasizes the equilibrium condition of free entry into self-employment based on earning differentials (Eq. 6). This is the mechanism highlighted by occupational choice theories that we want to test with the Chilean data. This strategy resembles the approaches followed by previous studies such as Rees and Shah (1986) and Earle and Sakova (2000).

2.2 Empirical implementation

The model above considers individual attributes and characteristics of places as regressors. Combining variables at two hierarchical levels in linear OLS regression models with only one error term leads to misspecification problems and affects the estimated standard errors (Bullen et al., 1997). Therefore, we follow a multilevel modeling strategy. The use of multilevel estimation methods allows us to incorporate these different sources of variation and to capture the interaction between factors at different scales, so that we can assess the relative contribution of both sets of variables in explaining the variation of the dependent variables (Paredes, 2013).

The empirical counterpart of the model is the following recursive system of hierarchical equations, capturing variation at the level of individuals (level 1) and regions (level 2):

$$ {w}_{ir}={\beta}_{0 r}^w+\sum_l{\beta}_{l ir}^w{h}_{l ir}^w+{\mu}_{ir}^w,\kern0.5em {\beta}_{0 r}^w={\beta}_{00}^w+\sum_m{\alpha}_{m r}^w{z}_{m r}+{u}_r^w, $$
(10)

(Wage equation)

$$ {\pi}_{ir}={\beta}_{0 r}^{\pi}+\sum_n{\beta}_{n ir}^{\pi}{h}_{n ir}^{\pi}+{\mu}_{ir}^{\pi},\kern0.5em {\beta}_{0 r}^{\pi}={\beta}_{00}^{\pi}+\sum_q{\alpha}_{q r}^{\pi}{z}_{q r}+\sum_p{\gamma}_{p r}{F}_{p r}+{u}_r^{\pi}, $$
(11)

(Self-employment earnings equation)

$$ P\left({I}_{i r}=1|{\widehat{w}}_{i r},{\widehat{\pi}}_{i r},{\mathbf{y}}_i,{\mathbf{p}}_r\right)=\frac{1}{1+ \exp \left[-\left({\delta}_{0 r}+{\delta}_1\left({\widehat{\pi}}_{i r}-{\widehat{w}}_{i r}\right)+{\boldsymbol{\updelta}}_2{\mathbf{y}}_i\right)\right]},{\delta}_{0 r}={\delta}_{00}+{\boldsymbol{\updelta}}_3{\mathbf{p}}_r+{u}_r^P $$
(12)

(Selection equation)

Each of the three dependent variables is conditioned on covariates at the level of the individual (level 1) and the spatial unit (level 2). The hierarchical nature of the problem requires capturing the unobservable effects at levels 1 and 2, so two error terms are considered: \( {\mu}_{ir} \) (level-1 error) and \( {u}_r \) (level-2 error). The variances of both errors are estimated through likelihood-based techniques. Level-2 errors are assumed to be uncorrelated across regions and independent of the level-2 regressors. Level-1 errors are assumed to be independent across regions and individuals, and independent of the individual-level covariates and of the level-2 errors. Both errors have zero expectation, with \( \mathrm{Var}\left(\mu \right)={\sigma}^2 \) and \( \mathrm{Var}(u)={\tau}^2 \), respectively.

Once the regression parameters are obtained, the variance terms and the spatial intercepts are estimated, so that we can predict wages and self-employed earnings. This strategy therefore yields a conditional expected outcome for each individual in each of the alternative occupations. Both predicted variables are incorporated into Eq. (12) to evaluate how the earnings differential affects the probability of being in self-employment.

A fundamental assumption behind our hierarchical approach is that the variance of the level-2 error, \( \left({u}_r\right) \), is significantly different from zero. If that is not the case, a multilevel approach is unwarranted. The intraclass correlation coefficient (ICC or ρ) is used to assess the importance of place as a source of variation:

$$ ICC=\frac{\tau^2}{\tau^2+{\sigma}^2}. $$
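As a rough illustration of the ICC, the sketch below computes it from given variance components and also from raw grouped data. Note the swap in technique: the variance components here come from a one-way ANOVA (method-of-moments) estimator for balanced groups, a simple stand-in for the likelihood-based estimation described above. The function names and the toy data are hypothetical.

```python
from statistics import mean


def icc(tau2, sigma2):
    """Share of total variance attributable to the level-2 units."""
    return tau2 / (tau2 + sigma2)


def anova_variance_components(groups):
    """Method-of-moments estimates (tau2, sigma2) for balanced groups.

    `groups` is a list of equally sized lists, one per level-2 unit.
    This is NOT the likelihood-based estimator; it is a simpler stand-in.
    """
    n = len(groups[0])   # observations per group
    j = len(groups)      # number of groups
    grand = mean(v for g in groups for v in g)
    ms_between = n * sum((mean(g) - grand) ** 2 for g in groups) / (j - 1)
    ms_within = sum((v - mean(g)) ** 2 for g in groups for v in g) / (j * (n - 1))
    tau2 = max((ms_between - ms_within) / n, 0.0)  # truncate at zero
    return tau2, ms_within


# Toy data: three hypothetical comunas with two workers each.
tau2, sigma2 = anova_variance_components([[1, 3], [5, 7], [9, 11]])
print(tau2, sigma2, round(icc(tau2, sigma2), 3))
```

An ICC near zero means that knowing the comuna says little about an individual outcome, which is why a level-2 variance that is indistinguishable from zero makes the multilevel structure unnecessary.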

Finally, to calculate the direct and indirect effects of each regressor on the probability of being self-employed, the partial effect can be decomposed by differentiating the reduced form of Eq. (12) with respect to the variable of interest \( \left({x}_j\right) \), such that the partial effect is the sum of two terms. For the case of continuous regressors,

$$ \mathrm{PE}\left({x}_j\right)=\frac{\partial P\left( I=1\right)}{\partial {x}_j}\equiv \frac{\partial P\left(\left(\widehat{\pi}-\widehat{w}\right),\mathbf{x}\right)}{\partial {x}_j}+\frac{\partial P\left(\left(\widehat{\pi}-\widehat{w}\right),\mathbf{x}\right)}{\partial \left(\widehat{\pi}-\widehat{w}\right)}\frac{\partial \left(\widehat{\pi}-\widehat{w}\right)}{\partial {x}_j}. $$
(13)

The first partial derivative on the right-hand side of (13) captures the direct effect of the variable on the probability of being in self-employment (which can be interpreted as the effect of the variable on the preferences and/or attitudes toward self-employment). The second term is the indirect effect, a pecuniary effect exerted through expected earning differentials. A similar reasoning can be applied to binary regressors. Each partial effect is computed for each observation and then evaluated as the average partial effect (APE) across individuals in the sample (Rabe-Hesketh and Skrondal, 2008).
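For a logit with earnings equations that are linear in the regressor, the two terms of (13) take a simple closed form: the logistic density \( P(1-P) \) multiplies the selection coefficient of the regressor (direct effect) and the product of the differential's coefficient and the gap between the regressor's coefficients in the two earnings equations (indirect effect). The sketch below averages both terms over individuals. All argument names and the numbers in the usage lines are hypothetical, not estimates from the paper.

```python
import math


def logistic(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def average_partial_effects(index_values, delta_j, delta_1, b_pi_j, b_w_j):
    """Average direct and indirect partial effects of a continuous regressor.

    index_values: fitted linear index of the selection equation, one value
                  per individual.
    delta_j:      coefficient of the regressor in the selection equation.
    delta_1:      coefficient of the predicted earnings differential.
    b_pi_j, b_w_j: coefficients of the regressor in the self-employed
                   earnings and wage equations (assumed linear).
    """
    direct = indirect = 0.0
    for z in index_values:
        p = logistic(z)
        density = p * (1.0 - p)  # derivative of the logistic CDF
        direct += density * delta_j
        indirect += density * delta_1 * (b_pi_j - b_w_j)
    n = len(index_values)
    return direct / n, indirect / n


# Illustrative values: a regressor that raises wages more than profits.
direct, indirect = average_partial_effects(
    [-1.0, 0.0, 1.0], delta_j=0.2, delta_1=2.0, b_pi_j=0.05, b_w_j=0.10)
print(direct, indirect, direct + indirect)
```

With these illustrative values the direct effect is positive and the indirect effect negative, which is the kind of offsetting pattern that makes the total effect of a variable such as schooling ambiguous a priori.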

3 Variables and data

Individual-level data are taken from the 2009 Chilean National Socioeconomic Survey (CASEN), which, at the time we started this research, was the only round that recorded the working location of household members. Location is recorded at the level of comunas (municipalities), the lowest administrative units existing in Chile. The sample includes employed persons between 15 and 64 years old, working outside the natural resources primary sector (non-farm workers) and declaring their habitual place of work in one of the Chilean comunas.

For the selection equation, the dependent variable takes the value of one for self-employed and zero for salaried workers. We define self-employment aggregating both forms of independent work in the CASEN survey, i.e., employers and own-account self-employed. Later, we will separate the sample between these two subgroups. Salaried workers are defined as waged employees in the private sector. Wages and self-employed earnings are built from net earnings declared for the main occupation by salaried and independent workers respectively, as recorded by CASEN.Footnote 2 As usual in Mincer wage models, both wages and self-employed earnings are expressed in logs when estimating the equations. To remove outliers, as in Paredes (2013), we exclude the lower and upper 1% of wages and self-employed earnings. These filters yielded a total sample of 46,472 individuals.
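The outlier filter and log transformation described above can be sketched as follows. This is only one reasonable reading of "the lower and upper 1%": the percentile convention (the default of `statistics.quantiles`) and the function name `trim_and_log` are assumptions, and in practice the filter would be applied separately to wages and to self-employed earnings.

```python
import math
from statistics import quantiles


def trim_and_log(earnings, tail_pct=1):
    """Drop values below the `tail_pct` percentile or above the
    (100 - tail_pct) percentile, then return natural logs of the rest."""
    cuts = quantiles(earnings, n=100)  # 99 percentile cut points
    low, high = cuts[tail_pct - 1], cuts[-tail_pct]
    # Logs require strictly positive earnings.
    return [math.log(v) for v in earnings if low <= v <= high and v > 0]


# Hypothetical earnings: 1, 2, ..., 1000 monetary units.
sample = [float(v) for v in range(1, 1001)]
log_earnings = trim_and_log(sample)
print(len(sample), len(log_earnings))
```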

For the vector of individual variables in the wage equation \( \left({\mathbf{h}}^w\right) \), we include schooling and labor experience variables. We also include the square of experience to capture non-linear effects. Schooling variables were also included in the selection equation to control for a potential positive effect of education on attitudes toward self-employment (Lazear, 2005). In this latter equation, experience controls for the fact that the probability of departing from self-employment decreases with duration in self-employment (Evans and Leighton, 1989). We also consider a female dummy to account for the well-known gender wage gaps in the country (Ñopo, 2006; Montenegro, 2001) and, in the selection equation, for the lower propensity of women to choose self-employment (Langowitz and Minniti, 2007). Finally, based on Hundley (2001), we include two dummies capturing demographic conditions that may affect productivity in both occupations and that may act as a disincentive to start a business: the presence at home of children below 6 years old and of disabled persons. We also include dummies for economic sectors following the CASEN classification (1-digit ISIC) to capture productivity differences across industries and, in the selection equation, sector-specific entry barriers.

Regarding the variables proxying entrepreneurial or managerial talent \( \left({\mathbf{h}}^e\right) \), we included the same Mincerian human capital variables as in the wage equation. We also included a dummy taking the value of one if a parent was self-employed (Blanchflower and Oswald, 1998), as role models would contribute to the development of individuals’ entrepreneurial skills as well as positive attitudes toward entrepreneurship (Andersson and Koster, 2011). We also built a variable proxying for collateral capacity, given the importance of access to financial resources for initiating and upscaling a business (Blanchflower and Oswald, 1998). Finally, we include a binary variable of business networks, with an expected positive effect on business performance (Bosma et al., 2004).

Place-specific variables were built at the level of comunas. In terms of location variables influencing workers’ productivity and therefore wages (and marginal production costs) \( \left({\mathbf{z}}_r\right) \), we follow Paredes (2013) in considering the New Economic Geography (NEG) (Krugman, 1991) and the spatial equilibrium framework (Roback, 1982). NEG models predict higher wages in areas with high market potential, because interregional demand linkages increase the demand for traded goods and therefore the demand for labor in that sector. Here, we proxy market access with the metric developed by Harris (1954).Footnote 3
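A Harris-type market potential is commonly written as a distance-weighted sum of the economic size of other locations. The sketch below is a minimal version under stated assumptions: the sum excludes the own comuna, and `sizes` and `dist` are hypothetical inputs. The size measure, the distance definition, and the treatment of the own market actually used in the paper are not restated here and may differ.

```python
def harris_market_potential(sizes, dist):
    """Market potential of each location r: sum over s != r of size_s / d_rs.

    sizes: economic size of each location (e.g., population or income).
    dist:  symmetric matrix of pairwise distances, dist[r][s] > 0 for r != s.
    """
    n = len(sizes)
    return [
        sum(sizes[s] / dist[r][s] for s in range(n) if s != r)
        for r in range(n)
    ]


# Three hypothetical comunas.
sizes = [100.0, 50.0, 10.0]
dist = [
    [0.0, 10.0, 20.0],
    [10.0, 0.0, 5.0],
    [20.0, 5.0, 0.0],
]
print(harris_market_potential(sizes, dist))
```

In this toy example the smallest location has the highest market potential because it sits close to the two larger markets, which is the sense in which the index measures access rather than own size.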

Roback’s (1982) spatial equilibrium approach considers the effects of amenities on the localization of firms and workers according to the spatial equalization of production costs and indirect utilities, both depending on wages and land rents. Regions with few amenities should compensate for a lower quality of life with higher wages and lower housing prices. In the selection equation, amenities could be positively correlated with the propensity to self-employment, as amenities are potential attractors of entrepreneurial people (Glaeser et al., 2010). The variables measuring amenities include climatic and geographic variables, as well as housing prices. We also include urbanization (Jacobs) externalities, measured by a sectoral diversification index, as diversity may positively affect the performance of and the propensity to self-employment by easing localized knowledge spillovers (van der Panne, 2004). We also include average schooling in the comuna to account for human capital externalities increasing the productivity of workers and the self-employed (Hanson, 2005; Glaeser et al., 2010). In addition, greater human capital may contribute to shaping preferences toward entrepreneurship, by promoting a diverse and creative environment (Beugelsdijk, 2007). Similarly, we include a measure of start-up activity in order to capture the potential effects of entrepreneurial and creative environments on workers’ and entrepreneurs’ preferences and productivity (Audretsch and Keilbach, 2004).

In terms of regional variables conditioning local fixed costs \( \left({\mathbf{F}}_r\right) \), we follow Glaeser et al. (2010) in including a labor-intensity variable, since a higher intensity would proxy for lower overhead costs of running firms. We also include this variable in the selection equation, as it may capture competition in local labor markets (van der Panne, 2004). We also include a measure of the costs of accessing financial services, since these costs have been shown to reduce start-up rates in other developing countries (Naudé et al., 2008).

Finally, we include other variables directly conditioning the probability in the selection equation. These variables are represented by vectors \( {\mathbf{y}}_i \) (for individual-level variables) and \( {\mathbf{p}}_r \) (for territorial-level variables) in Eq. (12) and include (i) a dummy for persons in mid-labor career (between 35 and 45 years old) with an expected positive coefficient (see Reynolds et al., 1995); (ii) risk in local labor markets, since risk aversion strongly hinders self-employment (van Praag and Cramer, 2001); and (iii) the unemployment rate in the comuna.

The Appendix presents a summary of the variables included in the different equations, the sources, and measurement issues.

3.1 Descriptive statistics

The share of self-employed in the sample is 29%, which is close to estimates of total entrepreneurial activity rates in the country (24.3%) by the Global Entrepreneurship Monitor (GEM) Chile Project (Amorós and Acha, 2014), but it is larger than figures reported for the USA (around 17%, see Goetz and Rupasingha, 2009). Among the self-employed, 93% lived in the same place in 2004. This is consistent with findings by Michelacci and Silva (2007) with regard to the prevalence of businesses managed by local entrepreneurs in the USA and Italy. However, one observes similar figures for salaried workers, which is indicative of a low mobility of the Chilean labor force in general (Soto and Torche, 2004).

Table 1 reports the descriptive statistics of individual variables for salaried workers (top) and self-employed individuals (bottom) in this sample. Figures in the table are largely consistent with self-employed profiles in the literature. First, this group shows more persistence in the same job (9 years on average). This level of persistence is remarkable, as Evans and Leighton (1989) report a rate of exit from self-employment of around a half in the first 7 years. The self-employed in the sample are more likely to have self-employed parents and have more collateral capacity, as also reported by Blanchflower and Oswald (1998). They also tend to participate more in social and business organizations, although participation is extremely low in both groups (1% for the self-employed and 0.2% for salaried workers). Contrary to Hamilton’s (2000) findings, they are slightly less educated on average. They are also more oriented toward retail and certain services (small shops, restaurants, etc.) and less toward manufacturing. They also show a higher share of disabled persons, which may be indicative of the importance of home-based self-employment (e.g., Benhabib et al., 1991). The figures in the table also indicate that, in a country with low labor participation of women (Contreras and Plaza, 2010), the share of women is slightly higher among the self-employed. As reported by Hamilton (2000), the self-employed in our sample also show greater apparent experience; in other words, they are older on average.

Table 1 Descriptive statistics—individual-level variables

What is distinctive in the Chilean case is the distribution of earnings. Figure 2 illustrates the earnings distributions for salaried workers and the self-employed. Earnings of the self-employed are larger on average and considerably more dispersed. However, unlike studies in developed countries that point to a few “superstar” entrepreneurs driving average earnings up (Hamilton, 2000), self-employed earnings in Chile tend to be higher than wages along most of the distribution. Wages, and therefore the opportunity costs of self-employment, are, in general, very low in Chile.

Fig. 2 Distribution of hourly earnings, salaried workers, and self-employed (logs)

Table 2 summarizes the descriptive statistics of the municipal variables. The data show a large heterogeneity of contextual factors, in terms of both “first” and “second nature” geography. While there are a few agglomerated locations with higher levels of economic diversification and service provision, most Chilean comunas are mainly rural, sparsely populated, and lacking business support services (Olfert et al., 2014). There is a large variation in factors such as market potential or the labor intensity index (both with coefficients of variation above 100%). The climatic and topographic variables also illustrate the sharp contrasts, such as the extreme north-south precipitation and temperature gradients.

Table 2 Descriptive statistics—comuna-level variables

4 Results

4.1 Wages and self-employed earnings

As a first exploratory analysis, we fitted pure variance component specifications for the wage and self-employed earning equations, i.e., including no regressors and only the municipal random effect. The results (unreported) indicate that the municipality accounts for quite a small share of total wage variation (\( {ICC}_w \) = 6.6%). By contrast, the relative importance of place variation is larger in the case of self-employed earnings (\( {ICC}_{\pi } \) = 9.6%). Despite the low values of both ICCs, likelihood ratio tests confirmed the statistical significance of the municipal random effect in both cases.

Table 3 shows the results of the recursive estimation of the system of Eqs. (10), (11), and (12) for the full specifications, i.e., including individual and municipality-level regressors. We first focus on the wage equation in column (1) of Table 3. Our results largely confirm the findings by Paredes (2013) and other studies in the country (Ñopo, 2006; Montenegro, 2001) with respect to the importance of human capital (schooling and labor experience) in conditioning wages, as well as the gender wage gap.Footnote 4 The estimated “Mincer rate of return” for an additional year of education is around 7%, and experience shows a positive and diminishing marginal return.
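Two pieces of arithmetic sit behind this reading of a log-earnings equation. A schooling coefficient is only approximately the proportional return to an extra year, the exact change being \( \exp \left(\beta \right)-1 \), and a positive linear term with a negative quadratic term in experience implies a marginal return that declines and reaches zero at a turning point. The coefficients below (0.07 for schooling, 0.03 and -0.0005 for experience and its square) are illustrative assumptions, not the estimates in Table 3.

```python
import math


def percent_return(beta):
    """Exact percentage change in earnings for a one-unit increase in a
    regressor with coefficient `beta` in a log-earnings equation."""
    return 100.0 * (math.exp(beta) - 1.0)


def marginal_return_to_experience(b_exp, b_exp_sq, years):
    """Derivative of b_exp * x + b_exp_sq * x**2 with respect to x."""
    return b_exp + 2.0 * b_exp_sq * years


def experience_turning_point(b_exp, b_exp_sq):
    """Years of experience at which the marginal return reaches zero."""
    return -b_exp / (2.0 * b_exp_sq)


print(round(percent_return(0.07), 2))
print(marginal_return_to_experience(0.03, -0.0005, 10))
print(experience_turning_point(0.03, -0.0005))
```

For coefficients of this size the approximation error is small (a coefficient of 0.07 corresponds to an exact return of roughly 7.25%), so quoting the coefficient as a percentage return is a common and harmless shorthand.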

Table 3 Estimation results—Eqs. (10), (11), and (12)

Regarding the municipal-level variables, amenity-related variables showed mixed results. The coefficient for the minimum temperature in the coldest month is negative and marginally significant, which is consistent with the tradeoff between income and quality of life in Roback’s (1982) framework. The positive coefficient for the municipal housing price index suggests, as pointed out by spatial equilibrium models, that firms located in places with higher housing prices must compensate workers for the greater costs of living, ceteris paribus. Sectoral diversity shows a non-significant correlation with wages, a result that is in line with previous findings by Modrego et al. (2015) with respect to local innovation in the country. The market potential variable showed, against theoretical expectations, a negative and non-significant correlation with wages, which confirms findings by other cross-sectional studies in Chile (Paredes, 2013).Footnote 5 A territorial variable strongly conditioning wages is the level of schooling in the comuna. This result is in line with the idea of productivity-enhancing localized knowledge spillovers (e.g., Ciccone and Hall, 1996; Rosenthal and Strange, 2001).

The two-level model provides an almost complete explanation of the municipal random effect, with an ICC that now amounts to only 2.1%. The addition of level 2 variables yielded a reduction of around 55% in the regional variance component with respect to the model with individual-level regressors only (unreported). This means that, jointly, the set of level 2 regressors included explains a significant part of the territorial component of wage variation.
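The reduction figure quoted above is a proportional change in the level 2 variance component between two nested specifications. A minimal sketch of that arithmetic follows; the two variance values are hypothetical and were picked only so that the reduction equals 55%.

```python
def variance_reduction(var_level1_only: float, var_full: float) -> float:
    """Proportional reduction in the level-2 (municipal) variance component
    when level-2 regressors are added to a model with level-1 regressors only."""
    if var_level1_only <= 0:
        raise ValueError("baseline variance component must be positive")
    return (var_level1_only - var_full) / var_level1_only


# Hypothetical variance components (NOT the paper's estimates).
reduction = variance_reduction(var_level1_only=0.020, var_full=0.009)
print(f"Reduction in municipal variance: {reduction:.0%}")
```

Note that this statistic compares variance components, not ICCs, so it can differ from the proportional change in the ICC when the individual-level residual variance also changes between specifications.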

Column (2) of Table 3 reports the results for the self-employed earnings equation. As expected, Mincerian human capital variables behave in a similar way compared to the wage equation, indicating that, in general, human capital is also associated with greater returns to self-employment. However, the estimates reveal a lower marginal return to schooling and experience in self-employment compared with salaried work. Again, self-employed women earn less than men do, even after controlling for other observables, which confirms, in the context of a developing country, the findings by Hundley (2001). The gender penalty is actually larger in self-employment.Footnote 6 The variables proxying more specific entrepreneurial human capital also behaved largely as expected, given results reported in the literature. Having a self-employed parent and collateral capacity are both positively and significantly correlated with higher entrepreneurial earnings, while the opposite is true for having disabled persons at home (Hundley, 2001). The effect of networks is positive but not significant, although, despite the very low participation levels, we cannot rule out that this variable is to some extent endogenous (to the extent that participation is a consequence of being self-employed).Footnote 7

The coefficient of the municipal market potential is, again, negative and not significant, which contradicts the idea of profit-enhancing effects of proximity to large markets. Economic diversity is also not significant, challenging the relevance of the Jacobs externalities hypothesis for the Chilean case. However, the available sectoral classification (19 sectors) is perhaps too broad to fully capture this sort of externality. With regard to variables measuring the local costs of doing business, the inverse access to banks index showed a coefficient that is not significant, suggesting that greater costs of accessing financial services are perhaps less restrictive in Chile compared to other developing countries (see Naudé et al., 2008). The negative sign for the labor intensity index was unexpected, given the findings by Glaeser et al. (2010) for the USA. Finally, the variables capturing human capital externalities show the expected positive and statistically significant coefficients. That is the case of the average schooling in the municipality and local entrepreneurial activity. In particular, average schooling in the comuna shows a larger coefficient for the self-employed, suggesting the importance of knowledge externalities for thriving entrepreneurial environments.

The two-level specification yields an ICC of 2.7%, small and similar to that calculated for wages. The full model’s ICC represents a reduction of around 61% compared to the specification including only level 1 variables (unreported). This result, again, supports the idea that, jointly, spatial economic theories of location and entrepreneurship do a good job in accounting for the spatial variation of self-employed earnings in Chile.

4.2 Selection into self-employment

The pure variance-component model (i.e., without regressors, not reported) indicates that location explains only 6.7% of the variation of the probability of being in self-employment. However, the LR test reveals that the multilevel model fits the data better than a standard logit regression.
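For a binary outcome, the ICC cannot be read off an observed residual variance. One common convention, the latent-variable approach, fixes the individual-level variance at that of the standard logistic distribution, π²/3. Whether the 6.7% reported above was computed this way is an assumption on our part; the sketch below only illustrates that convention and backs out the municipal variance it would imply.

```python
import math

# Variance of the standard logistic distribution, used as the level-1
# variance under the latent-variable convention (an assumption here).
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def logit_icc(var_municipal: float) -> float:
    """Latent-variable ICC for a two-level random-intercept logit."""
    if var_municipal < 0:
        raise ValueError("variance must be non-negative")
    return var_municipal / (var_municipal + LOGISTIC_RESIDUAL_VARIANCE)


def implied_municipal_variance(icc: float) -> float:
    """Random-intercept variance that would produce a given latent-variable ICC."""
    if not 0 <= icc < 1:
        raise ValueError("ICC must lie in [0, 1)")
    return icc * LOGISTIC_RESIDUAL_VARIANCE / (1 - icc)


implied = implied_municipal_variance(0.067)
print(f"Implied municipal variance: {implied:.3f}")
print(f"Round-trip ICC: {logit_icc(implied):.3f}")
```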

Estimated wage and earnings equations were used to predict earnings differentials. The ratio of predicted self-employed earnings to predicted wages in the sample ranges from 0.82 to 4.18, with a median of 1.66. Column (3) of Table 3 summarizes the results for the selection equation including relevant level 1 and level 2 covariates. Since this equation uses an estimated regressor, we report bootstrap standard errors (see for instance Redding and Venables, 2004). The selection equation yielded a negative parameter for the predicted earnings differential. The estimate indicates that a one-point increase in the log difference between expected self-employed earnings and expected wages reduces the odds of being observed as self-employed by nearly a quarter. Although this coefficient varies across comunas,Footnote 8 it is imprecisely estimated and we cannot reject that the mean partial correlation is zero (p value = 0.15). This result is inconsistent with the selection mechanism based on returns to alternative occupations emphasized by economic theories of occupational choice. The lack of significance of the earnings differential is robust to an alternative measure, built with the observed (instead of the predicted) earnings for the actual occupation.Footnote 9
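The paragraph above reports the differential both as a ratio of predicted earnings and as a log difference. Assuming the ratio is computed in levels (an assumption, since the construction is not spelled out in this section), the two are linked by a natural logarithm. The short sketch below converts the reported minimum, median, and maximum ratios to the log scale on which the selection coefficient is interpreted.

```python
import math


def log_differential(earnings_ratio: float) -> float:
    """Log difference ln(self-employed earnings) - ln(wage) implied by a ratio in levels."""
    if earnings_ratio <= 0:
        raise ValueError("earnings ratio must be positive")
    return math.log(earnings_ratio)


# Ratios reported in the text; the conversion assumes they are ratios in levels.
for label, ratio in [("minimum", 0.82), ("median", 1.66), ("maximum", 4.18)]:
    print(f"{label}: ratio {ratio:.2f} -> log differential {log_differential(ratio):+.3f}")
```

On this scale, a "one-point increase" in the log difference is a large change: it exceeds the gap between the median and the minimum of the sample.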

Other estimated coefficients help build an explanation for this result. One example is the positive and significant coefficients for the gender and disabled dummy variables, two conditions highlighted as negatively associated with the propensity to be in self-employment in developed countries (Parker, 2005b; Hundley, 2001).Footnote 10 Moreover, the negative and significant coefficients for market potential and start-up activity support views that discard the hypothesis of entrepreneurial environments offering higher profit opportunities (see Glaeser et al., 2010). In addition, some control variables that were a priori important proved to be not significant, for instance individual schooling and the mid labor career variable.

In sum, the fact that certain conditions identified as drivers (or constraints) of entrepreneurship in the literature show a counterintuitive (or not significant) correlation in this context suggests that, to a large extent, self-employment in Chile could be more related to what is sometimes called necessity entrepreneurship (Wennekers et al., 2005; Acs and Amorós, 2008). By this, we refer to self-employment mainly driven by push factors (Reynolds et al., 1995; Choi and Pan, 2006), such as limited human or financial capital, limited entrepreneurial skills, low wages (and therefore low opportunity costs of self-employment), and/or job rationing.

4.3 Selection into self-employment by type of self-employment: employers and own-account self-employedFootnote 11

In order to identify potentially different drivers of self-employment, we fitted Eqs. (11) and (12) for two groups of self-employed: (i) employers (those hiring workers) and (ii) own-account self-employed. Among the self-employed in the sample, the majority (89%) are own-accounts. Table 4 presents the descriptive statistics of individual variables for both groups. Employers’ mean earnings are more than double those of own-accounts. Figure 3 depicts the earnings distribution for the two groups and makes clear the stochastic dominance of the employers’ earnings distribution. Table 4 indicates that employers are more educated (by 2 years), younger (less “apparently experienced”), less likely to be women, have lower levels of disability at home, and are more likely to have collateral and networks. They also show more persistence, and their parents are more likely to be self-employed. Employers tend to participate relatively more in sectors like financial services, and own-accounts in retail and construction. Clearly, we are dealing with two distinct groups.

Table 4 Descriptive statistics—individual-level variables by type of self-employment
Fig. 3 Distribution of hourly earnings, employers and own-account self-employed (logs)

Table 5 presents employment transition shares between 2001 and 2006 taken from the panel CASEN 2006 survey. Due to sampling issues, this instrument is not suited to explore the effect of place heterogeneity and it is only used to illustrate basic aspects of self-employment dynamics in the country.Footnote 12 Among the 7496 individuals aged at least 15 in 2001 and at most 65 in 2006, employers were the least stable group in a period of moderate growth. At the same time, very few own-accounts were able to upscale to employers and, instead, many moved to salaried work (just like the unemployed and inactive). Salaried workers, in turn, tended to remain in that condition. Overall, the table portrays what seems to be a risk-averse labor force, showing some preference for salaried work.

Table 5 Transitions between occupational categories: 2001–2006 (%)

The estimation results for the two subsamples are summarized in Table 6. Column (1) reports the results for the earnings equation for employers and column (3) for the own-account self-employed. In general, the results are very similar. The only significant difference is in the coefficient of schooling. Not surprisingly, the return to an additional year of education is larger for employers (8% against 5%).

Table 6 Estimation results—Eqs. (11) and (12) for employers and own-account self-employed

Columns (2) and (4) of Table 6 report the results for the selection equation. Unlike the earnings equations, comparison of the two sets of estimates points to remarkable differences in the mechanisms driving the selection into self-employment. The expected positive and significant coefficient for the predicted earnings differential was obtained only for the group of employers. In this case, the point estimate is 0.943, which, in terms of odds ratios, means that a one-point increase in the log difference of earnings multiplies the odds of being observed as self-employed by about 2.6. Rees and Shah (1986) previously reported a positive effect of earnings differentials on the probability of self-employment for the U.K. By contrast, for the own-accounts, the earnings differential variable has a negative and significant coefficient.Footnote 13 This result is in line with findings by Earle and Zakova (2000) for six transition economies of Eastern Europe. Again, the main conclusions hold if one uses the alternative earnings differential.Footnote 14
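The conversion from the logit point estimate to the odds ratio quoted above is a simple exponentiation. The sketch below reproduces it for the employers' coefficient of 0.943 and, purely for illustration, shows what such an odds ratio would mean for a probability; the baseline probability of 10% is hypothetical and not taken from the paper.

```python
import math


def odds_ratio(coefficient: float, change: float = 1.0) -> float:
    """Multiplicative change in the odds for a given change in a logit regressor."""
    return math.exp(coefficient * change)


def shifted_probability(baseline_prob: float, ratio: float) -> float:
    """Probability obtained after multiplying the baseline odds by an odds ratio."""
    if not 0 < baseline_prob < 1:
        raise ValueError("baseline probability must lie strictly between 0 and 1")
    new_odds = baseline_prob / (1 - baseline_prob) * ratio
    return new_odds / (1 + new_odds)


employers_or = odds_ratio(0.943)  # point estimate reported for employers
print(f"Odds ratio: {employers_or:.2f}")

# Hypothetical baseline probability, used only to illustrate the conversion.
print(f"Probability after the change: {shifted_probability(0.10, employers_or):.3f}")
```

Because the odds ratio acts on odds rather than probabilities, the implied change in probability depends on the baseline, which is one reason the paper later reports average partial effects.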

Looking at other covariates, individual schooling shows the expected positive and significant coefficient only for the subsample of employers. In the case of the own-accounts, the negative coefficient is likely indicative of low educational attainment constraining access to good jobs and pushing people into low-productivity self-employment. The probability of being in self-employment is not different for women in the employers’ group once relevant observables are controlled for, but it is significantly larger for own-account women. The disability variable behaves similarly. With regard to municipality-level variables, the results are qualitatively similar, except for the start-up activity variable (and to a lesser extent the diversity index), which likely indicates a negative effect of deeper, dynamic labor markets (or of greater competition in product markets) on the probability of being observed as an own-account.

4.4 Direct, indirect, and net effects of individual and place-level drivers of self-employment

Figure 4 illustrates the estimated net average partial effects (APEs), with their 95% confidence intervals, for employers (upper panel) and own-account self-employed (lower panel). Table 7 reports the results of the decomposition of APEs (Eq. 13).

Fig. 4 Average partial effects (APEs) of individual and place-based variables on the probability of being in the state of self-employment

Table 7 Decomposition of average partial effects (APE): direct and indirect effects

In the case of employers, the APEs indicate that when both direct and indirect effects are taken into account, having a self-employed parent is an individual condition that significantly raises the probability of being observed as an employer (by 3 percentage points). This is the result of a combined positive effect on both employer earnings (indirect effect) and preferences for self-employment (direct effect) (Table 7). Equally, and consistent with Blanchflower and Oswald (1998), having collateral increases the probability by roughly the same amount. The APEs of schooling and experience are also positive and significant, but in this case, this is the result of a negative indirect effect (due to a larger rate of return in salaried work) that is offset by the positive direct effect. A remarkable result is that, based on the estimated APE, we cannot reject that women are equally likely to be observed as employers when other observables are controlled for. The indirect effect is significantly negative, due to the greater gender earnings gap in self-employment. However, this effect is somewhat compensated by a slightly positive direct effect.
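The logic of direct and indirect effects discussed above can be illustrated with a stylized single-level logit. This is a sketch under our own simplifying assumptions, not a reproduction of Eq. (13): it ignores the municipal random effects and random coefficients, and every parameter name and value below is hypothetical. The covariate enters the selection equation directly (`gamma_x`) and indirectly through the predicted log earnings differential, whose coefficient is `delta` and whose sensitivity to the covariate is the gap between its return in self-employment (`beta_pi_x`) and in salaried work (`beta_w_x`).

```python
import math


def logistic(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def ape_decomposition(linear_indices, gamma_x, delta, beta_pi_x, beta_w_x):
    """Average partial effect of a continuous covariate in a stylized logit,
    split into a direct part and an indirect part working through the
    predicted log earnings differential. Returns (direct, indirect, net)."""
    # Logit density p(1 - p) evaluated at each individual's linear index.
    densities = [logistic(z) * (1.0 - logistic(z)) for z in linear_indices]
    mean_density = sum(densities) / len(densities)
    direct = mean_density * gamma_x
    indirect = mean_density * delta * (beta_pi_x - beta_w_x)
    return direct, indirect, direct + indirect


# Hypothetical inputs: a small sample of linear indices and made-up coefficients
# in which the return to the covariate is higher in salaried work.
indices = [-3.0, -2.5, -2.0, -1.5, -1.0]
direct, indirect, net = ape_decomposition(
    indices, gamma_x=0.10, delta=0.9, beta_pi_x=0.06, beta_w_x=0.07
)
print(f"direct={direct:.4f} indirect={indirect:.4f} net={net:.4f}")
```

With these made-up values the indirect effect is negative and the direct effect positive, the same sign pattern described above for schooling among employers, and the net effect is their sum.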

Regarding comuna-level variables, average schooling shows a significant positive APE for employers. An additional year of education in the comuna increases the probability of observing an individual as an employer by 0.3 percentage points. This is the result of both positive direct and indirect effects. By contrast, market potential and the unemployment rate show a negative APE, which in both cases is the sum of negative direct and indirect effects. The result for the market potential variable is in accordance with Naudé et al. (2008) for South Africa and likely reflects a combined effect of market crowding (due to greater product-market competition) and more (and better) job opportunities for qualified people in agglomerations. The negative effect for the unemployment variable likely reflects the deterring effect of stagnant environments on both employers’ earnings and expectations toward entrepreneurship (Choi and Pan, 2006).

There are some notable differences in the case of the own-account self-employed. First, individual schooling shows a negative (and significant) APE, with an additional year of education reducing the probability of being in own-account self-employment by 0.2 percentage points. Also, women are almost 5 percentage points more likely to be observed as own-accounts, as are individuals having disabled persons at home (2 percentage points). According to findings in the literature (see Parker, 2005b; Hundley, 2001), these results are a likely indication that, for this group, low levels of educational attainment, as well as cultural and institutional barriers, are constraining their participation in the salaried labor market.

With regard to comuna-level variables, results are similar, except for the slightly negative (although significant) APE for start-up activity in the comuna and the significant positive APE for the risk index in the case of the own-accounts (both being not significant for employers). The first result possibly indicates that, for the own-accounts, higher entrepreneurial activity means greater opportunities for getting a desired salaried job. Our interpretation of the second result is that the own-accounts are too skill- and asset-constrained to cope with the inherent risks of self-employment by choosing occupations.

Summarizing, the empirical evidence indicates that, at least in the Chilean case, pull factors emphasized by economic theories of occupational choice better fit the reality of employers than that of the own-account self-employed. Based on the results, only the former group fits the notion of opportunity entrepreneurship (Wennekers et al., 2005; Acs and Amorós, 2008), or at least self-employment induced by pull drivers, such as perceived earnings differentials. By contrast, own-accounts seem to be pushed into such condition due to low skills, asset constraints, low opportunity costs of self-employment and/or job rationing (see for instance Lederman et al., 2014).

Overall, the results show a complex picture about the drivers of self-employment in Chile, where a dualistic theory of the labor market seems to apply for the own-account self-employed. On the contrary, employers seem to fit with the alternative view of self-employment as a purposeful occupational choice by individuals with greater skills, assets, and opportunities.Footnote 15

5 Conclusions

This study has inquired into the effect of a broad array of individual and place-related factors on the selection into self-employment in Chile. By proposing a structural and multilevel approach, we have been able to shed some light not only on the drivers of independent work in the country, but also on the mechanisms through which they work.

We verified, first, the importance of individual attributes and, to a lesser extent, of territorial characteristics in explaining the variation of workers’ wages, self-employed earnings and the probability of being in self-employment. While individual traits explain most of the variation of these outcomes, territorial conditions still explain a non-negligible share of variation (around 7 to 10%). In addition, several municipality-level variables significantly condition earnings and the probability of observing an individual as self-employed.

Second, our research strategy has allowed testing the selection based on the returns to alternative occupations suggested by economic models of occupational choice. Although the theory fits well in the case of employers, that is not the case for the own-account self-employed.

Third, the analysis disentangled the net partial effects of both individual and place characteristics on the probability of being in self-employment, taking into account their potentially compensating indirect effects through the opportunity cost of self-employment (wages). As reported for developed countries, having a self-employed parent and collateral are two factors positively and significantly correlated with the probability of being in self-employment. By contrast, and against expectations, municipal market potential has a negative and significant effect. We are unable to find marked gender effects in the probability of being observed as an employer, although there is a significant positive correlation between being female and being in own-account self-employment.

An important caveat arises at this point. The static approach followed here cannot incorporate relevant dynamic aspects of entrepreneurship that may affect the results. Since only the current occupational status is observed, we cannot identify situations such as failed business owners or transitions between own-account self-employment and the employer status.Footnote 16 Such limitations could be addressed using longitudinal data, which are unavailable at a detailed geographical scale in Chile. However, as remarked by Blanchflower and Oswald (1998), cross-sectional analysis of current occupational status still permits drawing some conclusions. First, it portrays the situation of the policy-relevant persistent self-employed. As previously noted, average permanency in self-employment is long in this sample (around 9 years). Second, we can delve into the long-lasting effect of some individual attributes by correlating the cross-sectional probability of being observed as self-employed with several variables that respond either to “predetermined” decisions (such as individual’s schooling or parents’ employment status during individual’s childhood) or to exogenous conditions (such as gender or disability at home). Third, we can assess the lasting effect of slowly changing territorial conditions (such as labor diversity, market potential, or relative housing prices) (see Andersson and Koster, 2011).

Although our results cannot be taken as fully conclusive, due to these various shortcomings obscuring causal relationships, the bulk of the evidence points to a coherent, nuanced view of the Chilean labor market. While employers seem to actually choose self-employment, the own-accounts seem to respond according to the dualistic view of the labor market, where they are largely pushed into that condition. Overall, our results are broadly consistent with conclusions from studies in other middle-income countries (Earle and Zakova, 2000).

The results provide some guidelines for entrepreneurship support policies in Chile. An immediate conclusion is that programs focused on strengthening capacities of individual entrepreneurs remain well warranted. However, there is a need for differentiated approaches tailored to different forms of entrepreneurship, which respond to different incentives and motivations and serve different policy purposes. On the other hand, since location variables have modest explanatory power and yield small marginal changes in the probability of being in self-employment, one may well wonder whether there is a role for place-based entrepreneurship support policies in Chile.

We believe there is. First, several municipality-level variables (such as average schooling or the unemployment rate) significantly condition wages, employers’ earnings, and the probability of choosing self-employment. Second, the empirical model cannot capture to what extent the spatial distribution of individual traits is determined by place characteristics. Third, human capital sorts spatially in response to economic geography factors (Behrens and Robert-Nicoud, 2015). Yet, since the Chilean labor force is not very mobile, place characteristics may at least be acting as barriers to interregional mobility. In sum, our estimates could be considered a lower bound of the true effect of place heterogeneity.

In any case, the design and evaluation of entrepreneurship support initiatives should bear in mind that strengthening individual entrepreneurial skills and regional entrepreneurial environments will likely create, simultaneously, conditions for more and better jobs, particularly for individuals constrained in their labor opportunities and in places with thin labor markets. Therefore, such policies may not yield the expected results of immediately triggering local entrepreneurial activity.