Introduction

The relation between macroeconomic variables (such as unemployment or economic growth rates) and self-employment has been a traditional source of controversy among economists, caused by an ambiguous prediction provided by the theory (Thurik et al. 2008). On the one hand, the ‘recession-push’ theory supports the idea that unemployment reduces the opportunities of gaining paid-employment and the expected gains from job search, which “pushes” people into self-employment.Footnote 1 Therefore, this theory suggests the existence of a positive relationship between self-employment and unemployment, that is, an opposite relation between the business cycle and the self-employment rate. On the other hand, the ‘prosperity-pull’ hypothesis represents an opposite interpretation of this relationship: at times of high unemployment firms face a lower market demand. This reduces self-employment incomes and possibly also the availability of capital, while the risk of bankruptcy increases. Thus, individuals are “pulled” out of self-employment. At the same time, self-employment may become riskier because if the venture fails, it is less likely that the self-employed worker can find a job in paid-employment. As a result, a negative relationship between self-employment and unemployment is suggested.

Empirical evidence should be a natural way to solve a controversy of these characteristics. However, evidence has not provided unambiguous results. In this sense, most microeconometric studiesFootnote 2 appear to support a “prosperity-pull” hypothesis, whereas macroeconometric analysesFootnote 3 usually generate ambiguous results or weak evidence in favour of the “recession-push” hypothesis.Footnote 4 In this paper we will argue that the mixed set of results in earlier studies is in part due to problems of accurately measuring unemployment. Therefore we develop a new method which uses employment rates instead of unemployment rates to test the “recession push” hypothesis. We apply our new method using a data base for Spain over the period 1976–2004.

There are at least five disadvantages of testing the recession-push hypothesis using unemployment rates. First, it is difficult to distinguish between who is really unemployed and who is out of the labour force. This separation should ideally be made according to who wants a job and who does not. However, official statistics have great difficulty accurately measuring this separation as only individuals fulfilling some criteria of actively searching for a job are entered as unemployed.Footnote 5 Hence, many ‘hidden’ unemployed are not included in the official unemployment statistics, causing a bias of which the magnitude cannot be traced (Congregado 2008).

A second and related drawback of previous studies is the explicit or implicit assumption of a one-to-one relationship between unemployment and self-employment. This is a questionable assumption as participation rates vary in a way that need not be stationary. The extent of underestimating the number of unemployed may be higher in recession periods, i.e. in times when unemployment rates are already high. In other words, in recession periods relatively many unemployed may ‘escape’ to the status of inactivity by leaving the labour force. This makes it even harder to make a correct assessment of the relation between unemployment and self-employment.

Third, apart from the difficulties of measuring the ‘hidden’ unemployed, one might argue that it may also be theoretically more accurate to use employment rates instead of unemployment rates. When, in the occupational choice process, an individual considers becoming self-employed, the expected wage income associated with paid employment will be the main opportunity cost of the self-employment option. Hence, in order to evaluate the value of this expected wage income, the employment rate is a much more direct measure than the unemployment rate. In particular, when employment is low, it is likely that the demand for labour is low, and hence wages, or the opportunity costs of self-employment, are low. Of course, high unemployment rates are often associated with low employment rates, but using unemployment is a more indirect way of evaluating the opportunity cost of self-employment. Again, as argued above, using unemployment is particularly troublesome when there are strong movements between the unemployment and inactivity statuses.

Fourth, empirical estimates of the self-employment/unemployment relationship invariably confound the above two effects, capturing a “net” effect of the recession-push and the prosperity-pull effects. In addition, reverse causality is also at play in the sense that a higher number of self-employed individuals may bring down unemployment by means of entrepreneurial activities (Audretsch et al. 2002; Thurik et al. 2008).

Fifth and finally, results in some previous studies are conditioned by the investigation of linear relationships, not controlling for non-linearity. If it is the case that in different phases of the business cycle different types of effects prevail, results from linear models could be hiding either of the two effects.

Given that the available empirical evidence proved unable to solve this controversy, we should explore new empirical approaches or take into account some additional explanatory mechanisms in order to understand and interpret the why and wherefore of the lack of uniformity shown by the empirical evidence. In this sense we explore the recession-push hypothesis from a different perspective: omitting deliberately the use of unemployment rates to avoid the related measurement problems, but, alternatively, analyzing the relationship between paid-employment and self-employment while allowing the employment rate to have an impact as well. As the complement of employment in the adult population is the sum of the (official) unemployed and the inactive population, we basically investigate the interactions between paid-employment, self-employment, and unemployment in a broader sense (including the ‘hidden’ unemployed). To do this we will employ a vector error correction model (VECM).

In addition, we will investigate both linear and non-linear (cointegration) relationships. In particular, we allow the strength of the ‘recession-push’ effect to vary according to the employment rate. It is conceivable that the pressure to start their own business is stronger for individuals in a situation of low employment compared to a situation of high employment, as it is harder for individuals to find paid-employment in times of recession. The basic idea behind our approach is the following. The traditional approach for testing the existence of a recession-push (refugee) effect consists of analyzing the relationship between unemployment and self-employment rates, using linear cointegration techniques. Contrary to earlier studies, we will test whether or not the relationship is time-dependent (in particular, dependent on the business cycle). If the statistical test indicates that the relation is not time-dependent, linear cointegration techniques are sufficient. Otherwise, non-linear techniques should be used.

To carry out this task, we extend earlier analyses in two ways: i) analyzing the relationship between self-employment and paid-employment rates in a VECM linear model, where the error correction term can be interpreted as the employment rate, and given the relationship between the employment and unemployment rates, interpreting the self-employment adjustment process when a shock occurs; ii) testing the possible existence of a nonlinear relationship, as a way to verify if the long-term relationship is time-varying.

In sum, this paper aims at investigating the interactions between paid-employment, self-employment and unemployment (in a broad sense) in the framework of a VECM, using Spanish quarterly data during the period 1976:3–2004:4. Our approach allows us to solve methodological and measurement problems associated with the use of unemployment rates for testing the recession-push hypothesis. In addition, in an attempt to explore the robustness of the results obtained by means of the traditional approach, i.e. analysis of a linear VECM, we will test if the relationships under investigation are time-dependent, by means of a threshold cointegration model.

The paper is organized as follows: The empirical methodology is outlined in the “Econometric methodology” section, the empirical tests are performed in the “Modelling non-linearity” section, while the main conclusions are summarized in the “Conclusions” section.

Econometric methodology

As mentioned above, before employing non-linear econometric methodology we estimate a linear VECM using the maximum likelihood technique. The data used in the empirical analysis are quarterly observations drawn from the Labour Force Survey (LFS) produced by the Spanish National Institute of Statistics (INE). The sample period ranges from 1976:3 to 2004:4, where self-employment is defined, adopting the ICSE-1993 criteria,Footnote 6 as the sum of employers, own-account workers and members of producers’ cooperatives.

The benchmark linear model is a finite-order vector autoregression model (VAR) of the following form:

$$ {x_t} = c + \sum\limits_{{i = 1}}^k {{A_i}{x_{{t - i}}} + {\varepsilon_t}} $$
(1)

In the above model, \( {x_t} = \left[ {{w_t},{s_t}} \right]\prime \) is a vector of non-stationary variables containing the paid-employment rate \( ({w_t}) \) and the self-employment rate \( ({s_t}) \), \( {A_i} \) is a 2 × 2 matrix of parameters, and \( {\varepsilon_t} \) is a 2 × 1 vector of residuals.Footnote 7 Cointegration requires that all the variables have the same order of integration. Before estimating a linear cointegration model we have tested for the order of integration of the paid-employment and the self-employment series. To this end, we have used the modified version of the Dickey-Fuller and Phillips-Perron tests proposed by Ng and Perron (2001). According to these results, \( {s_t} \) and \( {w_t} \) are I(1). We refer to Appendix A, Table 4, for more details.
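As a rough illustration of what a unit-root check involves, the sketch below computes the t-statistic of a plain Dickey-Fuller regression without constant or lagged differences. This is not the Ng and Perron (2001) procedure used here; the function names `df_tstat` and `simulate_ar1` and the simulated series are assumptions made purely for illustration.

```python
import math
import random

def df_tstat(x):
    """t-statistic of rho in the regression dx_t = rho * x_{t-1} + e_t."""
    lagged = x[:-1]
    diffs = [x[t] - x[t - 1] for t in range(1, len(x))]
    sxx = sum(v * v for v in lagged)
    rho = sum(v * d for v, d in zip(lagged, diffs)) / sxx
    resid = [d - rho * v for v, d in zip(lagged, diffs)]
    s2 = sum(r * r for r in resid) / (len(diffs) - 1)  # residual variance
    return rho / math.sqrt(s2 / sxx)

def simulate_ar1(phi, n, seed):
    """Simulated series x_t = phi * x_{t-1} + e_t with standard normal shocks."""
    rng = random.Random(seed)
    x = [0.0]
    for _ in range(n):
        x.append(phi * x[-1] + rng.gauss(0.0, 1.0))
    return x

# Hypothetical data: a stationary AR(1) and a random walk (unit root).
t_stationary = df_tstat(simulate_ar1(0.5, 500, seed=1))
t_random_walk = df_tstat(simulate_ar1(1.0, 500, seed=1))
print(t_stationary, t_random_walk)
```

A strongly negative statistic speaks against a unit root. The critical values are non-standard (Dickey-Fuller tables rather than the normal distribution), and for series with a non-zero mean, such as employment rates, a constant would normally be included in the regression.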

In order to characterize the long run dynamic adjustments, we can rewrite the equilibrium VAR model as a vector error correction model (VECM). The VAR(k) model can be rewritten in its VECM representation by subtracting \( {x_{{t - 1}}} \) from the left and right hand sides:

$$ \begin{aligned} \Delta {x_t} &= c + \left( {{A_1} - I} \right){x_{{t - 1}}} + {A_2}{x_{{t - 2}}} + \ldots + {A_k}{x_{{t - k}}} + {\varepsilon_t} \\ &= c + \left( {{A_1} - I} \right){x_{{t - 1}}} - \left( {{A_1} - I} \right){x_{{t - 2}}} + \left( {{A_1} - I} \right){x_{{t - 2}}} + {A_2}{x_{{t - 2}}} + \ldots + {A_k}{x_{{t - k}}} + {\varepsilon_t} \\ &= c + \underbrace{{\left( {{A_1} - I} \right){x_{{t - 1}}} - \left( {{A_1} - I} \right){x_{{t - 2}}}}}_{{\left( {{A_1} - I} \right)\Delta {x_{{t - 1}}}}} + \left( {{A_1} - I} \right){x_{{t - 2}}} + {A_2}{x_{{t - 2}}} + \ldots + {A_k}{x_{{t - k}}} + {\varepsilon_t} \\ &= c + \left( {{A_1} - I} \right)\Delta {x_{{t - 1}}} + \left( {{A_1} + {A_2} - I} \right)\Delta {x_{{t - 2}}} + \ldots + \left( {{A_1} + \ldots + {A_k} - I} \right){x_{{t - k}}} + {\varepsilon_t} \end{aligned} $$

Hence,

$$ \Delta {x_t} = c + \sum\limits_{{i = 1}}^{{k - 1}} {{\Gamma_i}\Delta {x_{{t - i}}} + \Pi {x_{{t - k}}} + {\varepsilon_t}} $$
(2)

where \( {\Gamma_i} = - \left( {I - \sum\limits_{{j = 1}}^{i} {{A_j}} } \right) \) for i = 1, …, k − 1, and \( \Pi = - \left( {I - \sum\limits_{{i = 1}}^k {{A_i}} } \right) \).
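The rewriting of (1) as (2) can be checked numerically. The sketch below takes the case k = 2, for which Γ1 = A1 − I and Π = A1 + A2 − I, and confirms that the VAR and VECM forms give the same Δx_t. The coefficient matrices, starting values and helper names are hypothetical, not estimates from this paper, and the shock ε_t is set to zero.

```python
def mat_vec(m, v):
    """Product of a matrix (list of rows) and a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

def mat_add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a[0]))] for i in range(len(a))]

def mat_sub(a, b):
    return [[a[i][j] - b[i][j] for j in range(len(a[0]))] for i in range(len(a))]

def vec_add(*vectors):
    return [sum(parts) for parts in zip(*vectors)]

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]

# Hypothetical VAR(2) coefficients and lagged observations [w, s].
A1 = [[0.6, 0.1], [0.2, 0.5]]
A2 = [[0.3, -0.1], [-0.1, 0.4]]
c = [0.5, 0.2]
x_lag1 = [40.0, 8.0]
x_lag2 = [39.5, 8.2]

# VAR form: x_t = c + A1 x_{t-1} + A2 x_{t-2}, then take the difference.
x_t = vec_add(c, mat_vec(A1, x_lag1), mat_vec(A2, x_lag2))
dx_var = [x_t[i] - x_lag1[i] for i in range(2)]

# VECM form: dx_t = c + Gamma_1 dx_{t-1} + Pi x_{t-2}.
gamma1 = mat_sub(A1, IDENTITY)
pi = mat_sub(mat_add(A1, A2), IDENTITY)
dx_lag1 = [x_lag1[i] - x_lag2[i] for i in range(2)]
dx_vecm = vec_add(c, mat_vec(gamma1, dx_lag1), mat_vec(pi, x_lag2))

print(dx_var, dx_vecm)
```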

Another decomposition of (1) is given by:

$$ \Delta {x_t} = c + \sum\limits_{{i = 1}}^{{k - 1}} {\Gamma_i^{*}\Delta {x_{{t - i}}} + \Pi {x_{{t - 1}}} + {\varepsilon_t}} $$
(2’)

where \( \Gamma_i^{*} = - \sum\limits_{{j = i + 1}}^{k} {{A_j}} \) for i = 1, …, k − 1, and \( \Pi = - \left( {{I} - \sum\limits_{{i = 1}}^k {{A_i}} } \right) \).
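For instance, in the illustrative case k = 2, representation (2’) follows from (1) by adding and subtracting \( {A_2}{x_{{t - 1}}} \):

$$ \begin{aligned} \Delta {x_t} &= c + \left( {{A_1} - I} \right){x_{{t - 1}}} + {A_2}{x_{{t - 2}}} + {\varepsilon_t} \\ &= c + \left( {{A_1} + {A_2} - I} \right){x_{{t - 1}}} - {A_2}\left( {{x_{{t - 1}}} - {x_{{t - 2}}}} \right) + {\varepsilon_t} \\ &= c + \Gamma_1^{*}\Delta {x_{{t - 1}}} + \Pi {x_{{t - 1}}} + {\varepsilon_t} \end{aligned} $$

so that \( \Gamma_1^{*} = - {A_2} \) and \( \Pi = {A_1} + {A_2} - I \).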

The matrix Π is usually decomposed as:

$$ \Pi = \alpha {\beta^{\prime}} $$
(3)

where α and β are n × r matrices containing the adjustment coefficients and the cointegrating vector, respectively, n is the number of variables, and r is the number of cointegrating relationships (one, in our case). The symbol Δ in Eq. 2 is the first difference operator. In this form all terms in Eq. 2 are stationary, that is, integrated of order zero, denoted I(0).

The lagged residuals from the cointegrating vector, \( {\beta^{\prime}}{x_{{t - 1}}} \), act as an error correction term. This term captures the extent of disequilibrium for the system of variables with respect to the long-run relation between all variables in the system. The α parameters on the error correction terms in each individual equation indicate the speed of adjustment of this variable back to its long-run value. A significant error correction term (i.e. a significant α parameter) implies long-run causality from the explanatory variables to the dependent variable under consideration.

In our application the system can be written as:

$$ \left[ {\begin{array}{*{20}c} {\Delta {w_t}} \\ {\Delta {s_t}} \\ \end{array} } \right] = \Gamma (L)\left[ {\begin{array}{*{20}c} {\Delta {w_{{t - i}}}} \\ {\Delta {s_{{t - i}}}} \\ \end{array} } \right] + \left[ {\begin{array}{*{20}c} {{\alpha_1}} \\ {{\alpha_2}} \\ \end{array} } \right]\left( {{w_{{t - 1}}} - \beta {s_{{t - 1}}}} \right) + \left[ {\begin{array}{*{20}c} {\varepsilon_t^w} \\ {\varepsilon_t^s} \\ \end{array} } \right] $$
(4)

where \( {\alpha_1} \) and \( {\alpha_2} \) indicate the speed of adjustment of each variable back to its long-run value.
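To see how the adjustment coefficients govern the return to equilibrium, the sketch below iterates the error-correction part of (4) alone, dropping the short-run terms Γ(L) and the shocks. Under that simplification the gap w − βs shrinks by the factor 1 + α1 − βα2 each period. The values of `alpha_w`, `alpha_s`, `beta` and the starting levels are hypothetical, chosen only so that the gap closes.

```python
def simulate_adjustment(w, s, alpha_w, alpha_s, beta, periods):
    """Iterate dw = alpha_w * gap and ds = alpha_s * gap, with gap = w - beta * s."""
    gaps = [w - beta * s]
    for _ in range(periods):
        gap = w - beta * s
        w += alpha_w * gap  # paid-employment reacts to last period's gap
        s += alpha_s * gap  # self-employment reacts to last period's gap
        gaps.append(w - beta * s)
    return w, s, gaps

# Hypothetical values: the per-period factor is 1 - 0.1 - 4 * 0.05 = 0.7.
w_end, s_end, gaps = simulate_adjustment(
    w=36.0, s=8.0, alpha_w=-0.1, alpha_s=0.05, beta=4.0, periods=5
)
print(gaps)
```

Setting both coefficients to zero leaves the gap unchanged, which corresponds to the case where neither variable responds to the disequilibrium.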

We estimated this model using the maximum likelihood procedure developed by Johansen (1988, 1991). Based on statistical tests, we include two lags for the right hand side variables of our model (see Table 5 in Appendix A for details). Importantly, we tested that β does not significantly differ from −1 (see Appendix C), hence when estimating the VECM, we fix the value of β to −1. This implies the error-correction term equals \( {w_{{t - 1}}} + {s_{{t - 1}}} \), i.e. the error correction term is equal to the employment rate (see also Appendix B). This is convenient for interpretation. We also tested for cointegration, by applying the Johansen reduced rank regression approach. The result of this test is that the null hypothesis of no cointegration (r = 0) is rejected at the 5% level. Further tests revealed that the number of cointegrating relations is equal to one (r = 1). Again, more details can be found in Appendix A (Table 6).

The estimation results of the linear VECM are reported in Table 1. Both in the wage-employment equation and in the self-employment equation the error-correction terms are not significant. As the α’s are not statistically different from zero, both rates are said to be long-run weakly exogenous with respect to the long-run equilibrium.

Table 1 Linear VECM estimates—paid-employment and self-employment

However, the non-significance of the α parameters could be due to the presence of nonlinearity in the relation, i.e. the relation could be time-dependent. In particular the relation could vary according to different stages of the business cycle. In the next section we will account for nonlinearity by applying a two-regime threshold cointegration model, proposed by Hansen and Seo (2002).

Modelling non-linearity

We account for non-linearity by applying a threshold cointegration method. The concept of threshold cointegration characterizes a discrete adjustment, in which the system adjusts towards the long-run equilibrium only when the error-correction term is above (or, alternatively, below) a critical threshold.

Hansen and Seo (2002) provide a vector error-correction model (VECM) with one cointegrating relationship between two variables and a threshold effect in the error-correction term. As an extension of model (4), a two-regime threshold cointegration model takes the form

$$ \begin{aligned} & {\left[ {\begin{array}{*{20}c} {{\Delta s_{t} }} \\ {{\Delta w_{t} }} \\ \end{array} } \right]} = \Gamma {\left( L \right)}{\left[ {\begin{array}{*{20}c} {{\Delta s_{{t - 1}} }} \\ {{\Delta w_{{t - 1}} }} \\ \end{array} } \right]} + {\left[ {\begin{array}{*{20}c} {{\alpha _{{11}} }} \\ {{\alpha _{{21}} }} \\ \end{array} } \right]}{\left( {w_{{t - 1}} - \beta s_{{t - 1}} } \right)} + {\left[ {\begin{array}{*{20}c} {{\varepsilon ^{s}_{t} }} \\ {{\varepsilon ^{w}_{t} }} \\ \end{array} } \right]} \quad {\text{with}} \quad {\left( {w_{{t - 1}} - \beta s_{{t - 1}} } \right)} \leqslant \gamma \\ & {\left[ {\begin{array}{*{20}c} {{\Delta s_{t} }} \\ {{\Delta w_{t} }} \\ \end{array} } \right]} = \Gamma '{\left( L \right)}{\left[ {\begin{array}{*{20}c} {{\Delta s_{{t - 1}} }} \\ {{\Delta w_{{t - 1}} }} \\ \end{array} } \right]} + {\left[ {\begin{array}{*{20}c} {{\alpha ^{'}_{{11}} }} \\ {{\alpha ^{'}_{{21}} }} \\ \end{array} } \right]}{\left( {w_{{t - 1}} - \beta s_{{t - 1}} } \right)} + {\left[ {\begin{array}{*{20}c} {{v^{s}_{t} }} \\ {{v^{w}_{t} }} \\ \end{array} } \right]} \quad {\text{with}} \quad {\left( {w_{{t - 1}} - \beta s_{{t - 1}} } \right)} > \gamma \\ \end{aligned} $$
(5)
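The regime switch in (5) amounts to a piecewise-linear adjustment term. A minimal sketch follows, assuming β fixed at −1 (so that the error-correction term is w + s) and purely hypothetical values for the threshold and the adjustment coefficients. Intercepts, short-run terms and shocks are omitted, and `error_correction_response` is an illustrative helper rather than part of any estimation routine.

```python
def error_correction_response(w_lag, s_lag, gamma, alpha_low, alpha_high, beta=-1.0):
    """Adjustment term alpha * (w_{t-1} - beta * s_{t-1}), with alpha set by the regime."""
    ect = w_lag - beta * s_lag  # with beta = -1 this is w + s
    alpha = alpha_low if ect <= gamma else alpha_high
    return alpha * ect

# Hypothetical numbers: adjustment is active only in the low regime.
low = error_correction_response(32.0, 6.0, gamma=40.0, alpha_low=0.05, alpha_high=0.0)
high = error_correction_response(36.0, 8.0, gamma=40.0, alpha_low=0.05, alpha_high=0.0)
print(low, high)
```

In the first call the error-correction term (38) lies below the threshold, so the low-regime coefficient applies. In the second call (44) it lies above, and the response is zero.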

Hansen and Seo (2002) proposed a heteroskedastic-consistent LM test where the null hypothesis of linear cointegration (i.e., there is no threshold effect) is tested against the alternative of threshold cointegration. The test assumes a fixed value of β (−1, in our case). Application of the test for our model reveals that the null hypothesis of linear cointegration is indeed rejected in favour of threshold cointegration. We refer to Appendix A for details (see Table 7).

The estimated threshold is \( \hat{\gamma } = 38.82 \), with the error correction term defined as \( {w_t} + {s_t} = {e_t} \) (i.e., the employment rate). Hence, the first regime would occur when the employment rate is below 38.82%. This is the relatively unusual regime, including 13% of the observations (namely, 1984:4 to 1986:3; 1987:1; and 1993:4 to 1994:4). By contrast, the usual regime (with 87% of the observations) would occur when the employment rate is above 38.82%.
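Classifying observations into the two regimes is a simple comparison against the threshold. The sketch below uses a short, made-up series of employment rates (not the Spanish LFS data) together with the threshold estimate reported above; `regime_shares` is a hypothetical helper.

```python
THRESHOLD = 38.82  # estimated threshold for the employment rate, in percent

def regime_shares(employment_rates, threshold=THRESHOLD):
    """Share of observations at or below the threshold, and share above it."""
    n = len(employment_rates)
    low = sum(1 for e in employment_rates if e <= threshold)
    return low / n, (n - low) / n

# Made-up employment rates, for illustration only.
sample = [44.1, 41.3, 38.5, 37.9, 39.4, 42.0, 45.2, 38.8]
low_share, high_share = regime_shares(sample)
print(low_share, high_share)
```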

The estimated two-regime threshold VECM is reported in Tables 2 and 3, where significant error-correction effects appear in the first regime (the estimated α parameters are significant) but not in the second regime.

Table 2 Threshold VECM estimates (Hansen & Seo approach)
Table 3 Threshold VECM estimates (Hansen & Seo approach)

For the self-employment rate equation, the adjustment coefficient (α) is significantly different from zero when the employment rate is below 38.82%, meaning that a value of the gap below 38.82% in one quarter produces upward pressure on the self-employment rate in the subsequent quarter to restore the long-run equilibrium. By contrast, when the employment rate is above 38.82%, the error-correction term in the self-employment rate equation is not significant. As α is not statistically different from zero, the self-employment rate is said to be long-run weakly exogenous with respect to the long-run equilibrium in this second regime.

The economic interpretation of the above findings is as follows. When the employment rate is very low, or, put differently, the unemployment rate in a broad sense (i.e. including the inactive) is very high, this phenomenon in itself causes upward pressure on the self-employment rate. Alternative income options are less numerous, hence more people start their own businesses. Importantly though, we do not observe this phenomenon when the employment rate is above the estimated threshold value. These results suggest that the recession-push hypothesis is only valid when economic circumstances are poor, i.e. when employment rates are (very) low.

As regards the paid-employment rate equation, the adjustment coefficient (α) is significantly different from zero when the employment rate is below 38.82%, and the effect is negative. We interpret this as a signal that in times of economic recession it is hard to find a job in paid-employment. Hence, the mere phenomenon of low employment causes even more individuals to lose their wage jobs and some of them may be inclined to start their own business, witness the positive α in the self-employment equation in regime 1.

Besides some degree of path-dependency in the wage-employment equation, we note one other interesting result from the table. In both regimes there is a significant positive (causal) effect of the self-employment rate on the paid employment rate, which seems to be quite substantial. This might imply that—at the aggregate level—self-employed individuals create jobs by their entrepreneurial activities. This finding of an ‘entrepreneurial’ effect for Spain is consistent with findings at the international level reported by Thurik et al. (2008).

Figure 1 plots the error-correction effect, i.e., the estimated response of (changes in) the paid-employment rate \( (\Delta {w_t}) \) and the self-employment rate \( (\Delta {s_t}) \) to the discrepancy between them (i.e. to the employment rate) in the previous period, holding the other variables constant. As we can see, for a “high” employment rate (i.e. above the threshold, greater than 38.82%), the response of both series (paid-employment rate and self-employment rate) is nearly zero. However, for a “low” employment rate (i.e. lower than 38.82%), the effect on paid-employment is negative while the effect on self-employment is positive. The latter finding is consistent with the recession-push hypothesis, which can be seen to be only valid for low employment rates.

Fig. 1 Response of self-employment and paid-employment rates to error correction (w + s)

In sum, according to our results, the null hypothesis of linear cointegration is rejected in favour of a two-regime threshold cointegration model. Consequently, a system of two regimes would seem to characterize the discontinuous adjustment of the self-employment rate towards a long-run equilibrium, with the threshold parameter (the employment rate) estimated at 38.82 percentage points. Therefore, we have a cointegrating relationship only when the employment rate is below 38.82%. This first regime, the relatively unusual regime in the Spanish economy (with 12.61% of the observations), coincides with many of the higher unemployment levels during the period, as we can see in Fig. 2. This figure shows the unemployment rate \( ({u_t}) \), defined as the unemployment to population (aged 16+) ratio, based on the official unemployment data, defining the unemployment threshold as the difference between the active population to population (aged 16+) ratio and the employment ratio. Although in general high unemployment rates correspond to low employment rates and vice versa (which one would expect), the figure shows that the classification of regimes might nevertheless be quite different depending on whether the threshold is computed in terms of employment or in terms of unemployment.Footnote 8 This illustrates the importance of defining the threshold in terms of employment.

Fig. 2 Spanish Labour Market indicators. 1976:2–2004:4. Source: Spanish Labour Force Survey. Instituto Nacional de Estadística

Conclusions

The relationship between unemployment and self-employment has been studied extensively. Due to its complex, multifaceted nature, various scholars have found a large array of different results, so that the exact nature of the relation is still not clear. An important element of the relation is captured by the recession-push hypothesis, which states that in times of high unemployment individuals are pushed into self-employment for lack of alternative sources of income such as paid employment. We make two contributions to this literature. First, we argue that official unemployment rates may not capture the ‘true’ rate of unemployment as they do not include ‘hidden’ unemployed who are out of the labour force. Therefore, we propose a new method where the ‘recession-push’ effect relates not only to the (official) unemployed but also to the inactive population. Second, we argue that the magnitude of the recession-push effect is non-linear in the business cycle, i.e. the effect is disproportionately stronger when economic circumstances are worse. We account for this possibility by introducing a two-regime threshold cointegration model. Estimating our model with quarterly data for Spain over the period 1976–2004, we find that the recession-push hypothesis is only valid when the employment rate (the number of employed individuals as a percentage of the total population of 16 years and older) is lower than 39%.

Our paper contributes to a better understanding of the relation between self-employment and unemployment. We have shown that the relation varies with the business cycle, operationalised as the employment rate. Our results raise the question why unemployed individuals are more inclined to start their own business when employment levels are low, compared to situations of high employment. An obvious factor to start a business in times of recession would be the lower job offer arrival rate, resulting in too high search costs for finding a paid job. However, we may also think of a second possible reason. If one imagines a situation where members of the labour force (employed and unemployed individuals) support not only children but also adult (inactive) members of the population (e.g. elderly), then, in times of low employment, the average number of people to be supported (e.g. inactive family members) by an unemployed individual is higher compared to a situation of a flourishing economy (i.e. high employment). This is because the ratio of inactive versus active members of the population is likely to be higher when employment is lower. Hence, on average an unemployment benefit would have to be divided between more people. This might increase pressure for the unemployed to find a (higher) income through self-employment, particularly because the unemployment benefit is only received for a fixed period of time, after which one receives a benefit that is typically lower. This pressure may be felt especially hard when the employment rate is (extremely) low.

Given the current international credit crunch, the regime of low employment, although found only for 13% of the observations during the period 1976–2004, may be particularly relevant in present times. Therefore an important avenue for future research is to investigate the decision processes at the micro level that lead individuals to start businesses in times of recession. In addition, future work could not only fruitfully apply the methodology used in this article to a broader range of countries, but should also look for differences between different types of self-employment by decomposing the aggregate self-employment rate into its constituent parts (employers, own-account workers and members of producers’ cooperatives) in order to determine whether the recession-push effect is being driven by one or more of these elements.