1 Introduction

Previous literature has documented the complex net cost puzzle of training apprentices with firm-specific skills. Firms may be reluctant to train apprentices, since training is costly and time-intensive, and the return on investment is uncertain (Acemoglu and Pischke 1998, 1999a, b; Dionisius et al. 2009; Muehlemann et al. 2010). Nonetheless, facts and figures indicate that many firms are training apprentices (Eichhorst et al. 2015). In 2015, 10.3 million students (47.3%) in the EU-28 followed a vocational education and training (VET) program (Eurostat Database 2017).Footnote 1 The literature identifies at least two reasons why firms train apprentices despite the associated costs and risks. First, firms invest in the human capital of students in order to reveal the competences and abilities of the trained apprentices (Lerman 2014; Muehlemann 2016). At the end of the training period, good apprentices can be retained, and firms can earn a return on the specific skills they have trained. Moreover, training can result in a better match quality between the applicant and the job, which yields lower turnover rates (Muehlemann 2016). This recruiting strategy of firms is called ‘investment strategy.’ Second, firms may wish to reduce the wage bill by choosing to replace low-skilled workers with apprentices. If firms wish to sustain productivity and gross profit, it is then implicitly assumed that apprentices can perform as well as low-skilled workers on the job. This recruiting strategy of firms is referred to as ‘substitution strategy.’Footnote 2

Evidence on the (lack of) profitability of both types of recruiting strategies has been presented by Mohrenweiser and Zwick (2009). On the one hand, the authors find that an increase in the share of apprentices relative to the share of low-skilled workers has a negative impact on the performance of manufacturing firms in Germany. On the other hand, the authors estimate a positive impact in the short run of training apprentices in the commercial and trade, and crafts and construction sectors. It is then argued that employers are willing to bear a net cost when: (1) apprentices are more likely to stay in the training company after the training period; (2) the skills learned at the training company are firm specific; and (3) it is difficult to find skilled applicants on the job market.

Another explanation for the negative impact of apprenticeship training on firm-level productivity in the manufacturing sector in Germany can be found in Dionisius et al. (2009). These authors compare the net cost associated with apprenticeship training in Germany with Switzerland and conclude that it is not profitable to train apprentices in Germany, while it is profitable in Switzerland. It is then argued that in the German manufacturing sector the tasks allocated to apprentices are less productive than in Switzerland. Therefore, the practice of allocating more productive tasks to apprentices (in Switzerland) appears to be better for firm performance. However, Dionisius et al. (ibid.) indicate other factors that can explain differences in the net cost between Germany and Switzerland, such as relative wages and the different organization and legislation of the VET system. For example, apprentices in Switzerland can earn higher wages than in Germany, which can partly be explained by the allocated productive tasks within a Swiss firm. Swiss firms are also forced to employ apprentices in a cost-efficient way owing to the relatively low employment protection legislation (EPL) in Switzerland as compared to Germany. The relatively low EPL can stimulate Swiss firms to apply substitution recruiting strategies more frequently than German firms (see also Muehlemann et al. 2010).

This paper contributes to the previous literature in at least three ways. First, while previous literature provides several studies on Germany and Switzerland or other continental European countries (Steedman 2001; Muehlemann 2016), we estimate the effect of apprenticeship training on firm performance in Hungary for the first time. Similar to other European countries, increasing apprenticeship training is a special priority for the Hungarian government (Kis et al. 2008). In order to achieve higher enrollment rates in VET, the government provides all Hungarian firms with financial incentives (Sect. 2).

Second, we apply our identification strategies to a unique administrative matched employer–employee dataset covering the period of 2003–2011 in Hungary. These large-scale administrative data constitute the second important contribution to the previous literature, because: (1) the previous literature uses relatively old (1997–2002) survey data on apprenticeship training in firms (e.g., Mohrenweiser and Zwick 2009) or (2) the previous literature can only provide evidence for Germany or Switzerland using the large-scale cost–benefit survey in Germany conducted since the 1970s or the Swiss survey conducted since 2000 (Muehlemann 2016; Muehlemann and Wolter 2014). Further, owing to the relevant Hungarian legislation, we can precisely identify apprentices from vocational training schools, and their total days worked at the firm, in the microdata.Footnote 3 We observe the full working history of all individuals (including apprentices, but also regular employees) working at the training establishment. Owing to this information, we are able to distinguish between inexperienced apprentices (< 1 year of experience) and experienced apprentices (> 1 year of experience). In line with Bajgar and Criscuolo (2016), we argue that productivity may depend on the specific phase in the apprenticeship training program, given that experienced apprentices at the firm are more likely to perform productive tasks. In addition, owing to detailed administrative firm balance sheet data, it is possible to directly estimate the productivity effects of apprenticeship training on firms.

Third, there are relatively few studies that estimate the causal impact of (on-the-job) apprenticeship training on firm performance (Muehlemann and Wolter 2014; Bajgar and Criscuolo 2016). In line with the literature on the impact of (apprenticeship) training on firm performance, for example, Griliches and Mairesse (1995), Dearden et al. (2006), Mohrenweiser and Zwick (2009) and Konings and Vanormelingen (2015), we explore different identification strategies in order to retrieve a causal estimate. These strategies include fixed effects and system GMM. In particular, system GMM has proven to be superior to other techniques (e.g., matching models in Dionisius et al. 2009) that are prone to omitted variable bias, endogeneity and simultaneity problems, in particular, in the occurrence of an aggregate activity shock like the 2008 financial crisis (Sect. 3).

The net cost puzzle for Hungary is complex. The results indicate that apprentices were trained in increasing numbers over the period of 2003 to 2011 in the sectors of manufacturing; construction; wholesale, retail and repair; and hotels and restaurants. However, both inexperienced and experienced apprentices are associated with lower productivity for all of the four sectors. At the same time, we observe that retention rates are relatively low, at least compared to Germany,Footnote 4 with less than 10% in 2011.

This paper proceeds as follows. In Sect. 2, we describe the Hungarian VET system and legislation. Section 3 discusses the empirical strategy. Data and descriptive statistics are presented in Sect. 4 and the results in Sect. 5. A discussion of the results is given in Sect. 6. Section 7 concludes.

2 The Hungarian VET system

Until the change from the socialist system to a market economy, most training workshops were run by enterprises in Hungary. After the system change in 1989, training was transferred to the model workshops of schools in most cases (partly because of the closure of firms and the economic crisis during the transition). However, since the 2000s there has been a clearly identifiable and increasing trend in the involvement of enterprises in apprentice training. The reputation of vocational education and training is low in Hungary; most people prefer general education. Job prospects of people with vocational certificates are not appealing. According to the Institute for Economic and Enterprise Research (Gazdaság- és Vállalkozáselemzési Intézete, GVI) of the Hungarian Chamber of Commerce and Industry (Magyar Kereskedelmi és Iparkamara, MKIK), in 2010, 24.1% of young people were unemployed 9 months after receiving their vocational qualifications and an additional 32.4% continued their studies (either to train for another occupation or to obtain higher qualifications), while only 38.8% were employed (CEDEFOP 2011).

Compulsory education in Hungary starts at the age of five and ends at the age of 18.Footnote 5 Students who successfully finish (primary and) lower secondary education can choose between three different upper secondary education tracks, generally at the age of 14. Two of these provide vocational education and training (VET): vocational secondary schools (szakközépiskola, SZKI), which give access to higher education, and vocational schools (szakiskola, SZI), which do not.

About two-thirds of all students at this age indeed choose VET programs. In the 2009–2010 academic year, in total 128,674 students (26.8%) chose education or training offered in SZI and 177,020 studentsFootnote 6 (36.8%) chose education or training offered in SZKI. The option for students who wish to go outside the VET system is general education in the so-called grammar schools. In the 2009–2010 academic year, in total 175,259 students (36.4%) enrolled in grammar schools, which provide a secondary education diploma (ISCED 3A) (CEDEFOP 2011, Table 7, page 32).Footnote 7

VET programs offered in vocational schools (SZI) consist of general and prevocational education and training in the first 2 years (‘general education grades’) and are followed by one to three (generally two) years of practical training (‘VET grades,’ ISCED 2C or ISCED 3C) to obtain a vocational qualification.Footnote 8 This vocational qualification is not sufficient for accessing higher education. Generally, students enter the labor market in the years after leaving the training, or have to follow additional (usually three-year) programs to gain access to higher education. Vocational secondary schools (SZKI) offer 4 years of general and prevocational training (ISCED 3A) that leads to a secondary education diploma. The 4 years before obtaining this diploma are also referred to as ‘general education grades.’ With the diploma, students can choose adult education or training (i.e., advanced tertiary vocational programs and bachelor and master programs) or go for 1-year to 3-year (post-secondary non-tertiary) vocational education and training. (Detailed information can be retrieved from CEDEFOP 2011, p. 28.)

Figure 1 provides a summary of the Hungarian VET system. The analysis in this paper solely focusses on students who choose vocational schools (SZI). Vocational schools (SZI) may offer school-based vocational education and employer-provided training at the workplace to students (sometimes in tandem). SZI students can choose either type of vocational training. As a result, in any one class, one can find a combination of students who follow either school-based vocational education or employer-provided training at the workplace. There is no clear rule in the Hungarian legislation for VET that allocates students between school-based training and on-the-job training.Footnote 9 In fact, relatively little is known about the allocation mechanisms. Horn (2016) found that, after taking local labor market conditions into account, individual background characteristics do not determine the decision whether one participates in school-based training or employer-provided training at the workplace.

Fig. 1 Simplified visualization of the VET system in Hungary. Source: Authors’ own design based on CEDEFOP (2011)

School-based training involves the school in the organization of the training. Usually, vocational teachers teach in model workshops, which can be, but are not necessarily, in the school building. For employer-provided training at the workplace, the student and/or the school engages a private firm. In this case, a special tri-party contract (‘tanulószerződés’) is required between the firm, the student and the school. Training firms are required to compensate apprentices for their work, although the amount of this compensation is very small. The required payment is 20% of the minimum wage per month (50 euros in 2011). However, the government only regulates the minimum wage an apprentice can earn, so firms are allowed to offer higher (or varied) wages to apprentices.

Unfortunately, not much is known about what kind of tasks apprentices do within firms. The Hungarian system is highly decentralized, so we expect heterogeneity in apprenticeship training across firms. Based on anecdotal evidence, some firms provide formal training while others provide internship-type work placement (see also Horn 2016).

The Hungarian government provides financial incentives to firms in order to encourage the training of apprentices coming from vocational training schools. This is organized as follows. All employers are required to pay a compulsory VET tax (szakképzési hozzájárulás) to the government, which is 1.5% of the total payroll of the firm. However, employers are also allowed to spend their VET tax on training apprentices themselves, offering direct support to a VET institution, or training their own employees (Kis et al. 2008).Footnote 10 This legislative environment aims to increase the number of firms training apprentices. Similar pay-or-train schemes can be found in other developed countries, for example, Canada (Dostie 2015), France (Bassanini et al. 2005) or Denmark (CEDEFOP 2012, p. 54). However, there is not much evidence on the effectiveness of these types of schemes in increasing the overall level of training in firms (Dostie 2015).

3 Empirical strategy

Similar to other studies on the impact of training, for example, Aubert and Crepon (2006), Göbel and Zwick (2009), Mohrenweiser and Zwick (2009), and Konings and Vanormelingen (2015), we estimate a Cobb–Douglas production function that includes indicators for human capital and physical capital and also accounts for the firms’ state of technology. In particular, we estimate:

$$\begin{aligned} \ln (\pi_{jt} ) & = \beta_{0} + \beta_{1} \ln (\pi_{jt - 1} ) + \beta_{2} {\text{share}}\;{\text{man}}_{jt} + \beta_{3} {\text{share}}\;{\text{cons}}_{jt} + \beta_{4} {\text{share}}\;{\text{retail}}_{jt} + \beta_{5} {\text{share}}\;{\text{hotels}}_{jt} \\ & \quad + \,\beta_{6} {\text{share}}\;{\text{skill}}\left( m \right)_{jt} + \beta_{7} {\text{share}}\;{\text{skill}}\left( h \right)_{jt} + \beta_{8} \ln \left( {{\text{depr}}_{jt} } \right) + \sum \gamma_{k} X_{kjt} + \mu_{j} + \varepsilon_{jt} \\ \end{aligned}$$
(1)

where the outcome measure \(\ln (\pi_{jt} )\) is the logarithm of the value of total production (value added) per full-time equivalent worker of firm \(j\) in year \(t\). Since we use shares of days worked in our specification and apprentices generally work only part time, we divide the total number of days worked in a given firm by the number of workdays in the year to obtain the number of full-time equivalent workers.
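The full-time equivalent conversion described here can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names and the default of 250 workdays per year are assumptions introduced for the example, since the text does not report the number of workdays used.

```python
import math

# Hypothetical default; the paper does not report the number of workdays used.
WORKDAYS_PER_YEAR = 250


def fte_workers(total_days_worked, workdays_in_year=WORKDAYS_PER_YEAR):
    """Full-time equivalent workers: total days worked at the firm / workdays in the year."""
    if workdays_in_year <= 0:
        raise ValueError("workdays_in_year must be positive")
    return total_days_worked / workdays_in_year


def log_productivity(value_added, total_days_worked, workdays_in_year=WORKDAYS_PER_YEAR):
    """Log of value added per full-time equivalent worker, i.e. ln(pi_jt)."""
    fte = fte_workers(total_days_worked, workdays_in_year)
    if value_added <= 0 or fte <= 0:
        raise ValueError("value_added and total_days_worked must be positive")
    return math.log(value_added / fte)
```

Under the assumed 250 workdays, a firm whose employees and apprentices together worked 2,500 days in a year would count as having 10 full-time equivalent workers, regardless of how many of them worked part time.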

In order to increase homogeneity across firms, we solely focus on four sectors (industries) that employ apprentices heavily in Hungary, namely manufacturing; construction; wholesale, retail and repair; and hotels and restaurants (see Sect. 4 for more details).

With respect to the specification of human capital, we adhere to the previous works of Dearden et al. (2006) and Mohrenweiser and Zwick (2009). These authors compare the value of marginal productivity (VMP) of apprentices with the VMP of semi- and low-skilled employees working at a firm. The ratio of apprentices to semi- and unskilled employees is usually expressed as a percentage; however, the firm-level data also allow us to express these indicators in days worked at the firm within a particular year. In this way, we can be more accurate regarding the actual productivity of employees.

The variables \({\text{share}}\;{\text{man}}_{jt}\); \({\text{share}}\;{\text{cons}}_{jt}\); \({\text{share}}\;{\text{retail}}_{jt}\); and \({\text{share}}\;{\text{hotels}}_{jt}\) denote the percentage share of days worked by apprentices in the sectors manufacturing; construction; wholesale, retail and repair; and hotels and restaurants, respectively. The reference category is then the share of days worked by regular employees at the firm. We also include \({\text{share}}\;{\text{skill}}\left( m \right)_{jt}\) and \({\text{share}}\;{\text{skill}}\left( h \right)_{jt}\) in order to compare the share of medium-skilled (or semi-skilled) and high-skilled employees with the share of low-skilled apprentices. This way the reference category becomes the share of days worked by low-skilled regular employees at the firm. These control variables can be particularly interesting for statistical inference on replacing cost-intensive semi- and low-skilled labor with apprentices at the time of a negative shock to gross profit. This will be discussed in the next paragraph.
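The construction of the share regressors with an omitted reference category can be sketched as below. The group labels and the function name are hypothetical and chosen only for illustration; shares are returned as fractions, so they would be multiplied by 100 to obtain percentage shares.

```python
def day_shares(days_by_group, reference="low_skilled"):
    """Return (all shares, regressors) from days worked per worker group in one firm-year.

    The reference group is dropped from the regressors, so the remaining
    coefficients are read relative to it.
    """
    total = sum(days_by_group.values())
    if total <= 0:
        raise ValueError("total days worked must be positive")
    if reference not in days_by_group:
        raise KeyError(f"reference group {reference!r} not found")
    shares = {group: days / total for group, days in days_by_group.items()}
    regressors = {group: share for group, share in shares.items() if group != reference}
    return shares, regressors


# Hypothetical firm-year: days worked by each group.
example_days = {
    "apprentice": 50,
    "low_skilled": 150,
    "medium_skilled": 600,
    "high_skilled": 200,
}
example_shares, example_regressors = day_shares(example_days)
```

Because the shares sum to one, including all of them alongside a constant would make the regressors perfectly collinear, which is why one category has to be left out.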

Directly comparing the VMP of apprentices with that of ‘regular’ employees can hamper statistical inference owing to omitted variable bias and self-selection bias (Bajgar and Criscuolo 2016). Regression (1) acknowledges unobserved firm-specific effects \(\mu_{j}\). Correlational estimates are most likely biased due to the firms’ endogenous recruiting process (Cabus and Somers 2017). For example, firms could have altered the human capital mix in response to the financial crisis of 2008 and the Great Recession, in essence an exogenous aggregate activity shock on the labor market that also affects firms’ performance. Hungary immediately felt the consequences of the economic contraction on the labor market (Eurostat 2016). As a result, firms may have replaced cost-intensive semi- and low-skilled labor with apprentices, while a negative shock to revenue (value of total productivity, VTP) can also be observed. One may then wrongly conclude that increasing the percentage share of apprentices in the firm reduces VTP. Because of these endogeneity issues (or ‘simultaneity of events’), naive estimates tend to over- or underestimate the ‘true’ effect of employing apprentices at the firm on VTP.

We deal with the aforementioned issues in several ways. First, we estimate a Cobb–Douglas production function that controls for trends and time-invariant (unobserved) information with respect to the shares of days worked by apprentices over time (\(\tau_{t}\)); the industry wherein the firm operates (\(\rho_{j}\)); and the firm \(\left( {\nu_{j} } \right)\). In doing so, we estimate the effect of the change in the percentage share of (or days worked by) apprentices between time \(t - 1\) and \(t\) on the change in VTP. This corresponds to a first-difference model (Nickell 1981):

$$\begin{aligned} \Delta \ln (\pi_{jt} ) & = \beta_{1} \Delta \ln (\pi_{jt - 1} ) + \beta_{2} \Delta {\text{share}}\;{\text{man}}_{jt} + \beta_{3} \Delta {\text{share}}\;{\text{cons}}_{jt} + \beta_{4} \Delta {\text{share}}\;{\text{retail}}_{jt} + \beta_{5} \Delta {\text{share}}\;{\text{hotels}}_{jt} \\ & \quad + \,\beta_{6} \Delta {\text{share}}\;{\text{skill}}\left( m \right)_{jt} + \beta_{7} \Delta {\text{share}}\;{\text{skill}}\left( h \right)_{jt} + \beta_{8} \Delta \ln \left( {{\text{depr}}_{jt} } \right) + \sum \gamma_{k} \Delta X_{kjt} + \Delta \varepsilon_{jt} \\ \end{aligned}$$
(2)

The first-difference model deletes the constant from regression (1) and the firm-specific effects \(\mu_{j}\). Further, for the empirical application it meets the requirements of small T (few time periods) large N (many firms) (Baum 2013; see Sect. 4, Data). According to Arellano and Bond (1991), more efficient estimates can be constructed by using the generalized method of moments (GMM). By controlling for firm-specific effects in a first-difference model, one may construct instruments from the lagged level of the corresponding variables. However, it is also acknowledged that using the lagged level of first differences does not solve the issue of endogeneity (i.e., the lagged level may be correlated with past and current realizations of the error term; Baum 2013). Therefore, Arellano and Bover (1995) and further Blundell and Bond (1998) suggest that the change in percentage share of (or days worked by) apprentices should be instrumented by the fourth lag of the corresponding variable. The authors argue that, in this manner, contemporary exogenous aggregate activity shocks have no effect on the lagged variables.
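To make the differencing step explicit, regression (1) can be written compactly and subtracted from its own one-period lag. The shorthand \(W_{jt}\) for the stacked regressors (the shares, log depreciation and the controls) and \(\theta\) for their coefficients is introduced here only for illustration and is not part of the original specification:

$$\begin{aligned} \ln (\pi_{jt} ) & = \beta_{0} + \beta_{1} \ln (\pi_{jt - 1} ) + W_{jt}^{\prime } \theta + \mu_{j} + \varepsilon_{jt} \\ \ln (\pi_{jt - 1} ) & = \beta_{0} + \beta_{1} \ln (\pi_{jt - 2} ) + W_{jt - 1}^{\prime } \theta + \mu_{j} + \varepsilon_{jt - 1} \\ \Delta \ln (\pi_{jt} ) & = \beta_{1} \Delta \ln (\pi_{jt - 1} ) + \Delta W_{jt}^{\prime } \theta + \Delta \varepsilon_{jt} \\ \end{aligned}$$

Both \(\beta_{0}\) and \(\mu_{j}\) cancel because they do not vary over time. The differenced lag \(\Delta \ln (\pi_{jt - 1} )\) still contains \(\varepsilon_{jt - 1}\), which also enters \(\Delta \varepsilon_{jt}\), so the differenced regressor remains correlated with the differenced error and instruments are needed.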

The set of instruments for each included ‘share’ in regression (2) can be represented by the following matrix notation:

$$Z_{j} = \left( {\begin{array}{*{20}c} . & . \\ {{\text{share}}_{j1} } & . \\ {{\text{share}}_{j2} } & {{\text{share}}_{j1} } \\ \vdots & \vdots \\ {{\text{share}}_{jT - 2} } & {{\text{share}}_{jT - 3} } \\ \end{array} } \right)$$

In short, the first observation is lost in applying the first-difference transformation. Then, the lagged share of apprentices \(\left( {t - 1} \right)\) is instrumented by the change in the share of apprentices between time periods \(\left( {t - 2} \right)\) and \(\left( {t - 3} \right)\), so that we need at least four subsequent periods of observation (from time \(t\) to time \(t - 3\)) of one firm. Consequently, we estimate a dynamic panel data model and, according to Blundell and Bond (1998), the optimal moment (or instrument) conditions are identified by applying GMM. We apply the same empirical strategy to the share of days worked by apprentices in the four industries, namely manufacturing; construction; wholesale, retail and repair; and hotels and restaurants. Taking all these equations together, this empirical strategy is often referred to as system GMM. The results of the system GMM are then compared to those of fixed effects models that include no instruments.
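The layout of the instrument matrix \(Z_{j}\) for a single firm can be sketched as follows. This only builds the first differences and the lagged levels shown in the matrix; it is not a GMM estimator, and the function names and example values are hypothetical. In practice a dedicated dynamic panel routine would be used for estimation.

```python
def first_differences(levels):
    """Return x_t - x_{t-1} for t = 2..T; the first observation is lost."""
    return [levels[t] - levels[t - 1] for t in range(1, len(levels))]


def lagged_level_instruments(levels, lags=(2, 3)):
    """Rows of Z_j for periods t = 2..T (1-indexed), holding the levels at t-2 and t-3.

    Lags that fall before the first observed period are returned as None,
    mirroring the '.' entries in Z_j.
    """
    n_periods = len(levels)
    rows = []
    for t in range(2, n_periods + 1):      # 1-indexed period of the differenced equation
        row = []
        for k in lags:
            s = t - k                      # 1-indexed period of the instrument
            row.append(levels[s - 1] if s >= 1 else None)
        rows.append(row)
    return rows


def has_enough_periods(levels, minimum=4):
    """At least four periods are needed for a row of Z_j without missing entries."""
    return len(levels) >= minimum


# Hypothetical apprentice shares (in percent) of one firm over five years.
shares = [2.0, 3.0, 5.0, 4.0, 6.0]
differences = first_differences(shares)
instruments = lagged_level_instruments(shares)
```

With five periods, the first complete row of instruments appears in the fourth period, which is consistent with the requirement of at least four subsequent observations per firm.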

With respect to the specification of physical capital, we include an indicator for the logarithm of yearly depreciation per full-time equivalent worker denoted by \(\ln \left( {{\text{depr}}_{jt} } \right)\). This variable is included in order to control for differences between firms with respect to (optimal) size and economies of scale. Hence, it is acknowledged that smaller firms will have a more rapid decline in marginal revenue from employing an additional worker (i.e., an apprentice or a semi- or low-skilled worker).

The vector \(X_{kjt}\) denotes a set of control variables. In particular, we control for the firm workforce composition in terms of share of days worked by prime-aged (25–55) and older workers (55+) and the share of days worked by people in a second job. We also include other control variables measured at the level of the firm, namely the firm size; year of the firm’s entry into the dataset; whether the firm is exporting production and/or services to foreign countries; and an indicator for foreign ownership.

Finally, a standard Cobb–Douglas production function takes into account the firms’ state of technology by including a fixed parameter (often denoted by: A). Higher technological advances are associated with capital-intensive production, more high-skilled and fewer low-skilled workers (Griliches 1969; Autor et al. 2003). We control for these potential differences across firms and across industries in several ways. First, in the short run, the state of technology can be considered constant over time. A first-difference model then controls for technological advances. Second, in case there should be some variation over time, we control for capital expansion by including the variable of yearly depreciation. Third, by using the fourth lag as an instrument for change in the percentage share of apprentices, one avoids the effects of contemporary capital expansion on (endogenous) hiring decisions of the firm (Blundell and Bond 1998). To conclude, we estimate a first-difference instrumental variable specification that also includes a constant parameter. It is argued that the constant of the regression captures the variation that cannot be explained by the aforementioned variables.

4 Data

We use unique Hungarian administrative data that match employer information to employee information. The employee data contain monthly observations of basic demographic information and the complete work and social transfer history of all individuals between January 2003 and December 2011. Furthermore, we have yearly balance sheet data on private firms that can be matched to the employees in the data.

In the data, we can identify apprentices working with a special student contract required for an apprenticeship at a private firm (Sect. 2). Unfortunately, the dataset does not contain information on educational attainment for all individuals, but only for individuals who were registered as unemployed between 2003 and 2011, and for people who studied in or after 2009. In fact, information on educational attainment for students with a contract at a private firm is only necessary for the identification of students in tertiary education, who are, on average, not comparable to students in vocational training schools. In order to overcome this data limitation, we only considered apprentices who were at most 19 years old at the end of their apprenticeship. By definition, this restriction rules out apprentices in tertiary education. Furthermore, we also excluded people working as apprentices in the educational sector, because we believe that firms in the educational sector specialize in apprenticeship training for private vocational training schools. These firms provide private, school-based practical training.

Based on KIR-STAT (2011; Table a05t24), officially, there were 36,960 students in total who were younger than 19 years old, who had a student contract with a private firm, and who were completing practical training in these private firms in 2011. In our final dataset, which is 50% of the whole population aged 5–74 in 2003, we identified 16,824 apprentices in 2011. Ultimately, we were able to identify over 91% of all apprentices from vocational training schools in Hungary in 2011.
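The coverage figure can be reproduced from the numbers above, assuming, as the text implies but does not state explicitly, that a 50% sample should contain about half of the 36,960 officially registered apprentices:

$$\begin{aligned} 0.5 \times 36{,}960 & = 18{,}480 \\ \frac{16{,}824}{18{,}480} & \approx 0.910 \\ \end{aligned}$$

that is, roughly 91% of the apprentices expected in the sample.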

To estimate the effect of apprenticeship training on firm performance and to allow for dynamic panel data estimation techniques, we constructed a yearly firm panel dataset for each year between 2003 and 2011. We calculated the share of apprentices and days worked by apprentices in every firm in each year. We counted all employees and apprentices who worked at the firm for at least 1 month in a given year. We also calculated the shares of low-skilled, medium-skilled and high-skilled workers in the firms,Footnote 11 and the share of young workers (younger than 25 years oldFootnote 12), prime-aged workers (25–55 years old) and older workers (older than 55 years old).

We limited the final dataset to firms with a minimum of five employees (on average) in order to reduce measurement error from using worker shares. Furthermore, because of the requirements for using the two-step system GMM estimator (Sect. 3), we restricted the final dataset to firms with at least four observations in the dataset and with no gaps longer than three years between observations.
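These sample restrictions can be expressed as a simple filter. The sketch below is illustrative only: the function name and arguments are hypothetical, and a gap is taken to mean the number of missing years between two consecutive observations of the same firm, which is one possible reading of the restriction.

```python
def keep_firm(years_observed, avg_employees,
              min_employees=5, min_observations=4, max_gap_years=3):
    """Return True if a firm passes the size, panel-length and gap restrictions."""
    years = sorted(set(years_observed))
    if avg_employees < min_employees:
        return False
    if len(years) < min_observations:
        return False
    # Assumption: gap = number of missing years between consecutive observations.
    gaps = [later - earlier - 1 for earlier, later in zip(years, years[1:])]
    return all(gap <= max_gap_years for gap in gaps)
```

For instance, a firm observed in 2003, 2007, 2008 and 2009 has three missing years after 2003 and is kept under this reading, whereas a firm observed in 2003, 2008, 2009 and 2010 has four missing years and is dropped.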

A final restriction deals with the industries wherein apprentices work. It can be seen in Fig. 2 that there are only four broadly defined industries that employ a significant number of apprentices, namely manufacturing; construction; wholesale, retail and repair; and hotels and restaurants. Altogether, these four sectors cover 93% of all apprentices from vocational training schools employed in private firms. Therefore, the data are restricted to these four important sectors.

Fig. 2 Total number of apprentices by industry. Source: Compiled by the authors

Table 4 (“Appendix”) presents the total number of observations and firms in each stage in our data selection process. The final dataset consists of 287,012 observations for 42,745 different firms.

Table 1 presents the descriptive statistics. It is clear that a lot of firms in Hungary do not train apprentices. In Sect. 5, we also take into account the firms without apprentices in our sample, because they are important when we include (industry-specific) yearly trends in our estimations. As a robustness check, however, we run our estimations on the subsample of only firms with apprentices, and the results are very similar to the ones presented in the paper.

Table 1 Descriptive statistics

We present the descriptive statistics separately for the sample of firms with at least one apprentice between 2003 and 2011 (‘apprentice sample’) and for firms who did not hire apprentices at all during this period (‘non-apprentice sample’). Based on t-tests, the mean of the different variables differs between these two subsamples in almost all cases (see last column for p-values).

Regarding the skill and age compositions of the firms, Table 1 shows that 12.8% of employees are defined as low skilled (incl. 0.6% of apprentices), 63.5% are medium skilled and 23.6% are high skilled. The share of low-skilled and high-skilled workers is lower in our ‘apprentice sample,’ while the share of medium-skilled workers is higher. Table 1 also includes a variable called ‘second job’ that indicates the share of people in second jobs. About 1.9% of employees had a second job. This share is a bit lower in our ‘apprentice sample’ (1.3%). The largest share (82.4%) of employees is prime aged, while 9.8% are youngsters and 7.6% are aged 55 and over. The age composition of our ‘apprentice sample’ is different, and the share of young workers is much higher (19.4%) in these firms.

With regard to the information on firms, the final dataset comprises 11.2% foreign-owned firms, with another 28.2% of firms exporting goods or services to foreign countries. The share of foreign-owned firms is lower in our ‘apprentice sample,’ namely only 8.8%. Based on our data, it appears that foreign firms are less likely to hire apprentices than domestic ones. In addition, the share of exporting firms is also lower in the ‘apprentice sample.’

In total, 90.1% of firms are small and medium enterprises (up to 50 employees); 8.6% of firms are large (between 51 and 300 employees); and 1.3% of firms are very large (more than 300 employees). In comparison, our ‘apprentice sample’ contains only 78.4% small and medium enterprises and 17.4% large firms, while 4.3% of the sample represent very large enterprises. Large, but not very large, firms are more likely to hire apprentices.

The average productivity per full-time equivalent worker of these firms is equal to HUF 6.8 million (EUR 21,430).Footnote 13 The value of yearly depreciation of physical capital is equal to HUF 32.5 million (EUR 102,286). In our ‘apprentice sample,’ the productivity per full-time equivalent worker is lower and yearly depreciation is higher than the whole-sample average. It should be noted that there is a fairly large standard deviation in all of these averages.

To sum up, based on the descriptive statistics, we can state that our ‘apprentice sample’ contains fewer foreign and exporting firms than the whole sample average. In addition, these firms tend to be larger, but on average with lower performance. The employee skill composition is also different. The share of medium-skilled workers is higher and so is the share of young employees.

Figure 3 shows the change in the average share of apprentices separately for the different industries. Although the average share of apprentices in the whole sample is very low (0.6%), the mean apprentice share in firms with at least one apprentice is much higher (5.3%). Additionally, we are able to observe a steady and significant increase in all four of the included sectors. The average share of apprentices rose from approximately 2 to 4% in 2003 to approximately 9 to 17% by 2011.

Fig. 3 Share of apprentices by industries in firms with at least one apprentice. Source: Compiled by the authors

Figure 4 shows the evolution of mean retention rate by industries. Here, the trend seems to be similar to the previous figure until 2008. However, it is worth mentioning that these retention rates are very low, especially compared to the numbers shown in other studies for different countries. Moreover, after the financial crisis in 2008, these low retention rates began to further decline.

Fig. 4 Mean retention rates by industries in firms with at least one apprentice. Source: Compiled by the authors

To explore these changes in more detail, we present Figs. 7 and 8 (“Appendix”), which show the change in the share of apprentices and the retention rates, respectively, by firm size and ownership categories.Footnote 14 Based on Fig. 7, the increase was higher in domestic firms (there are also few small and medium firms hiring apprentices in our sample) and the share of apprentices is higher in smaller firms. Figure 8 shows that the main difference between retention rates is based on firm size rather than ownership. Although the general pattern is similar in all subgroups (except small foreign firms, due to the small number of observations), the figure shows that large firms are more likely to keep the apprentices they train.

5 Results

5.1 OLS

The baseline OLS results without controlling for fixed effects and with a full set of control variables are included in “Appendix” (Table 5). The share of days worked by low-skilled regular workers is the reference category. Consequently, the estimates for apprentice share across industries in Table 5 (and all other tables below) should be interpreted as substituting away from low-skilled regular workers toward training/employing apprentices. We find negative and significant correlations between the share of days worked by apprentices and productivity in all four industries. We can therefore conclude that substituting away from low-skilled labor toward apprentices is associated with a decline in firm performance. Further, the control variables indicate that exporting firms and firms with foreign ownership have higher productivity. There is also a positive association between the amount of depreciation and productivity (both measured per full-time equivalent worker). This indicates that larger firms are generally more productive.

5.2 Fixed effects

As presented in Sect. 4, only a selected number of firms offer training to students; moreover, the selection (of students) into training may be endogenous.Footnote 15 As a first step to overcome selection issues, we use a fixed effects model that controls for firms’ unobserved time-invariant characteristics. Table 2 presents the results. We find significant and negative correlations in all four sectors between the share of days worked by apprentices and firm productivity per full-time equivalent worker. For example, the coefficients in Table 2 indicate that a one percentage point increase in the share of days worked by apprentices (compared to the share of days worked by low-skilled regular employees) in the manufacturing sector is associated with a decrease of 0.743% in productivity per full-time equivalent worker.Footnote 16 The estimated coefficients are lower than in our OLS specification, but still indicate significant negative correlations between hiring apprentices and firm performance in all four sectors.

Table 2 Summary of results using fixed effects

With regard to the control variables, we still find positive associations between depreciation, exporting and productivity. However, in the fixed effects models, the positive association between foreign ownership and productivity is no longer significant. Since foreign ownership is a time-invariant variable, its significance seems to disappear when the fixed effects estimation method is used.
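The logic of the fixed effects step can be illustrated with a minimal sketch of the within (demeaning) estimator. Everything below is an assumption made for illustration: the data are synthetic, there is a single regressor, and the names `within_estimator`, `pooled_ols_slope`, `share` and `log_prod` are hypothetical. It is not the specification or the data used in this paper.

```python
import numpy as np


def within_estimator(y, x, firm_ids):
    """Fixed effects (within) slope for one regressor: demean by firm, then OLS."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    firm_ids = np.asarray(firm_ids)
    y_dm = np.empty_like(y)
    x_dm = np.empty_like(x)
    for firm in np.unique(firm_ids):
        mask = firm_ids == firm
        # Subtracting the firm mean removes any time-invariant firm effect.
        y_dm[mask] = y[mask] - y[mask].mean()
        x_dm[mask] = x[mask] - x[mask].mean()
    return float(x_dm @ y_dm / (x_dm @ x_dm))


def pooled_ols_slope(y, x):
    """Pooled OLS slope that ignores firm effects (for comparison only)."""
    x_c = x - x.mean()
    y_c = y - y.mean()
    return float(x_c @ y_c / (x_c @ x_c))


# Synthetic panel (hypothetical numbers): 200 firms observed for 9 years.
rng = np.random.default_rng(0)
n_firms, n_years = 200, 9
firm_ids = np.repeat(np.arange(n_firms), n_years)
firm_effect = rng.normal(0.0, 1.0, n_firms)

# Assumption: firms with a low unobserved effect use more apprentices.
share = 5.0 - 2.0 * firm_effect[firm_ids] + rng.normal(0.0, 1.0, n_firms * n_years)
true_beta = -0.007
log_prod = (true_beta * share + firm_effect[firm_ids]
            + rng.normal(0.0, 0.02, n_firms * n_years))

beta_fe = within_estimator(log_prod, share, firm_ids)
beta_ols = pooled_ols_slope(log_prod, share)
print(f"pooled OLS: {beta_ols:.4f}, fixed effects: {beta_fe:.4f}")
```

In this constructed example the pooled OLS slope is far more negative than the true value because the regressor is correlated with the firm effect, while the within estimator recovers a value close to `true_beta`. Whether the same mechanism explains the gap between the OLS and fixed effects estimates reported here cannot be inferred from the sketch.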

5.3 System GMM

In order to further control for potential problems with simultaneity, likely present in times of economic crisis, this section presents the results from using system GMM with finite sample correction (Windmeijer 2005). For application, we used the command xtabond2 in the statistical software package Stata (Roodman 2006). Additionally, we imposed common factor restrictions using a minimum distance estimator in order to obtain a single coefficient for all covariates in the dynamic model (Blundell and Bond 1998).Footnote 17
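As a sketch of what a common factor restriction does, consider a static model with a single regressor and an AR(1) disturbance. The symbols $y_{it}$, $x_{it}$, $\beta$, $\rho$, $\eta_i$, $\pi_k$ and $\hat{V}$ are illustrative assumptions and are not taken from the specification in Sect. 3, which contains several covariates.

```latex
% Static model with firm effect and AR(1) disturbance (illustrative notation)
\begin{align}
  y_{it} &= \beta x_{it} + \eta_i + v_{it}, \qquad v_{it} = \rho\, v_{i,t-1} + e_{it} \\
% Quasi-differencing (subtract \rho times the lagged equation) gives the dynamic form
  y_{it} &= \rho\, y_{i,t-1} + \beta x_{it} - \rho\beta\, x_{i,t-1} + (1-\rho)\,\eta_i + e_{it} \\
% Unrestricted dynamic model estimated in the first step
  y_{it} &= \pi_1 y_{i,t-1} + \pi_2 x_{it} + \pi_3 x_{i,t-1} + \eta_i^{*} + e_{it},
  \qquad \text{with restriction } \pi_3 = -\pi_1 \pi_2 \\
% Minimum distance step: recover (\rho, \beta) from the unrestricted estimates
  (\hat{\rho}, \hat{\beta}) &= \arg\min_{\rho,\beta}\;
  \bigl(\hat{\pi} - g(\rho,\beta)\bigr)^{\prime}\, \hat{V}^{-1}\, \bigl(\hat{\pi} - g(\rho,\beta)\bigr),
  \qquad g(\rho,\beta) = (\rho,\; \beta,\; -\rho\beta)^{\prime}
\end{align}
```

Under this restriction, the current and lagged coefficients of a covariate collapse into a single long-run coefficient $\beta$, which is what allows one coefficient per covariate to be reported.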

The results from Table 3 indicate that a one percentage point increase in the share of days worked by apprentices (compared to the share of days worked by low-skilled regular workers) decreases the firms’ productivity per full-time equivalent worker in all four industries by 0.89 to 1.74%.
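To give a sense of magnitude, the short sketch below applies the reported range of effects to the whole-sample mean productivity of HUF 6.8 million per full-time equivalent worker. This is illustrative arithmetic only: the function name `productivity_change` is hypothetical, and using the whole-sample mean as the base is an assumption, since productivity in the ‘apprentice sample’ is lower.

```python
def productivity_change(base_productivity, pct_effect):
    """Change in productivity implied by a percentage effect applied to a base level."""
    return base_productivity * pct_effect / 100.0


# Whole-sample mean productivity per full-time equivalent worker (HUF).
mean_productivity_huf = 6.8e6

# Reported range of effects for a one percentage point higher apprentice share.
smallest_effect = productivity_change(mean_productivity_huf, -0.89)
largest_effect = productivity_change(mean_productivity_huf, -1.74)

print(f"Implied change: HUF {largest_effect:,.0f} to HUF {smallest_effect:,.0f}")
```

Evaluated at this base, a one percentage point higher apprentice share corresponds to roughly HUF 60,500 to HUF 118,300 less output per full-time equivalent worker per year.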

Table 3 Summary results using system GMM

We should mention that the Hansen test of over-identifying restrictions yields p-values far below 0.05. However, in line with Roodman (2006, 2009), we argue that these p-values are mainly due to the fact that we have a large amount of data. Indeed, we have many more observations than Mohrenweiser and Zwick (2009), who also used system GMM in order to control for simultaneity. To test for this, we estimated our results on a 5% random sample of the data. When doing so, the estimated coefficients remained similar, although with much larger standard errors, and the Hansen test statistics turned out to be very similar to the ones reported in Mohrenweiser and Zwick (2009). Furthermore, Roodman (2006, 2009) argues that the Hansen test is prone to weakness, certainly when using many instruments on a small dataset. Therefore, the tests on instrument validity should be interpreted with caution. To conclude, we argue that system GMM can still be considered the first-best option besides fixed effects or first-difference models in terms of estimating potentially causal effects (Sect. 3).

5.4 Experience at the firm

Since we have the full work history for the workers in our dataset, we are able to distinguish between inexperienced apprentices (< 1 year of experience at the firm) and more experienced apprentices (> 1 year of experience at the firm).Footnote 18 It is expected that experienced apprentices in particular can perform skilled tasks during their apprenticeship and, thereby, are able to replace regular low-skilled workers, but for a lower wage. Moreover, dismissal costs for apprentices in Hungary are low compared to those for low-skilled employees. The contract between the firm, the school and the apprentice automatically ends at the end of the learning trajectory. In contrast, in order to dismiss a low-skilled employee, employers have to state a reason in writing. A minimum notice period, usually 30 days, then applies. Depending on their tenure in the firm, workers might be entitled to severance payment (several months’ wages, depending on tenure). Furthermore, there are special regulations in place for collective dismissals (Labour Code 1992). Some of the industries where apprenticeships are common also have unions for low-skilled workers. This increases dismissal costs even further. In sum, dismissal costs raise the attractiveness of hiring apprentices (instead of low-skilled employees) in periods of economic contraction.

Table 8 (“Appendix”) presents the results by the experience level of the apprentices. The results using system GMM estimation indicate no significant difference between experienced and inexperienced apprentices in the sectors of construction and wholesale and retail (i.e., both types of apprentices have significant negative coefficients in these two sectors with overlapping confidence intervals). The manufacturing and hotels and restaurants sectors do have statistically significant differences between the coefficients estimated for experienced and inexperienced apprentices; however, these differences are small in magnitude. On the whole, the results seem to suggest that less experienced apprentices are more productive than more experienced apprentices.

Based on these findings, we can hypothesize that (1) Hungarian firms do not allocate (more) skilled tasks to more experienced apprentices during the training period or (2) experienced apprentices are employed for more hours on the job than inexperienced apprentices and, therefore, have larger negative effects on productivity. Qualitative research should further explore the validity of these arguments. However, overall we are able to conclude that more experienced apprentices do not increase the firms’ productivity per full-time equivalent worker.

5.5 Firm size and ownership

Earnings and mobility patterns may substantially differ by firm size or organizational type (Bougheas and Georgellis 2004). The majority of the firms in our data are small and medium enterprises, and additionally, domestic firms tend to train more apprentices than foreign-owned firms (Table 12 in “Appendix”). Therefore, we estimated our previous specifications separately for different subgroups of firms by sizeFootnote 19 and ownership.Footnote 20 We find that our results are driven by domestic firms (Tables 9, 10 in “Appendix”). Both the fixed effects and the system GMM estimations produced insignificant coefficients for foreign-owned enterprises (Table 11). However, it should be noted that there are only a few foreign firms with apprentices in the dataset, and our sample of foreign firms with apprentices differs significantly from its domestic counterpart. Based on Table 12, we can see that foreign firms that train apprentices are mainly bigger exporting firms. In addition (also because of their bigger size), the share of apprentices is significantly lower and the share of low-skilled employees is higher than in the sample of domestic firms with apprentices. It might be the case that these few special firms provide very different training quality and/or are able to use apprentices differently than the domestic ones. Unfortunately, our data do not make it possible to say anything about training quality and task allocation during training, so further evidence is needed to explain this heterogeneity of the results. The estimated coefficients for domestic firms correspond to our results for the whole sample presented in Tables 2 and 3.

With regard to firm size, the estimated coefficients are more comparable across sectors, both using fixed effects (Table 9) and system GMM estimators (Table 10). It is only for the hotels and restaurants sector that the estimated coefficient loses its significance for large firms in both cases.

6 Discussion

Based on our findings, we can confirm that firms did hire apprentices despite the negative (short-term) effects on productivity. This would point to an investment recruiting strategy, which implies that firms are willing to bear training costs in order to retain the best apprentices at the firm. However, if firms use an investment training strategy, we expect them to retain a high percentage of apprentices after the training period. In contrast, the observed retention rates are quite low (less than 25% at the peak), especially compared to the ones in other countries with similar workplace-based vocational training. Furthermore, we observe that firm behavior changed over our time period. Figure 5 shows that firms increasingly seem to substitute away from low-skilled regular workers toward cheaper apprentices. In other words, it might be attractive to hire apprentices if their relative wages are lower than their relative productivity in comparison with low-skilled workers. The figure shows that firms hired an increasing number of apprentices between 2003 and 2011, while the retention rate declined from over 20% in 2007 to 10% in 2011. Since the share of students in vocational schools was relatively stable over time (see CEDEFOP 2011, Fig. 4, p. 31), we argue that the increase in apprenticeship contracts is most likely due to a shift from school-based training to employer-based training. Relatively high shares of apprentices employed in firms in combination with low retention rates point toward substitution recruiting strategies, which became more widely used in response to the financial crisis. However, a substitution recruiting strategy is only plausible if apprentice wages are low enough, in combination with low dismissal costs, to offset the productivity loss relative to low-skilled workers. Our data do not include information on hourly wages, so we cannot substantiate this claim.

Fig. 5 Substitute away from low-skilled labor toward hiring apprentices. Source: Compiled by the authors

Horn (2016) puts forward more evidence on substitution recruiting strategies. The author claims that students graduating from VET face similar employment chances regardless of the establishment of practical training and regardless of having opted for school-based training or employer-provided workplace training. Previous evidence already indicated that higher (lower) hiring costs of regular workers on the external labor market can explain why firms increasingly (decreasingly) train apprentices (Blatter et al. 2015; Hara 2017; Schumann 2017). In order to investigate this further, one would ideally need information on hours worked per day by apprentices and by low- and semi-skilled workers in order to compare wages across these different types of workers. Relatively low labor costs in comparison with productivity could indeed drive the substitution recruiting strategy in response to the financial crisis. However, we do not have information on hours worked per day in the data. Estimates of relative wage costs can then be biased by part-time contracts, which are heavily used for apprentices, while this is not necessarily the case for low-skilled employees. Therefore, we cannot explore the relationship between relative labor costs and productivity any further. Instead, in order to investigate the substitution recruiting strategy, we show the change over time in separation rates (on the left) and hiring rates (on the right) of low-skilled regular workers in firms that also offered training (or had at least one apprentice) in Fig. 6. A spike in the separation rate can be observed in 2008, and a plunge in the hiring rate in 2009. At the same time, we observe a steady increase in the number of hired apprentices between 2008 and 2011. These patterns support our argument that in reaction to an aggregate activity shock, firms increasingly use apprentices as substitutes for low-skilled regular workers.

Fig. 6 Separation and hiring rate of low-skilled workers in firms with apprentices. Source: Compiled by the authors

All in all, based on the results and patterns in our data, the net cost puzzle still remains complex in the case of Hungary. Our regression results indicate that a larger share of days worked by apprentices means lower productivity per full-time equivalent worker at the firms in all four sectors. There seems to be no, or only a small, difference between the effects of experienced and inexperienced apprentices. Therefore, one could argue that firms do not allocate skilled tasks to apprentices later in their training period. The quality of training, however, remains a black box that would need further investigation. Moreover, in spite of the negative effect on productivity, retention rates are generally low. There are some differences among firms, especially arising from firm size (larger firms retain a higher share of their apprentices), but even the largest retention rates are much smaller than similar figures in, for example, Germany.

7 Conclusion

This paper provides evidence on the effectiveness of apprenticeship training for Hungarian firms in terms of productivity per full-time equivalent worker. Using different empirical methods that facilitate a causal interpretation of our results, we conclude that Hungarian firms that train apprentices face decreased productivity in all four examined sectors. The estimated effects are significant and robust across different specifications. However, further research is needed in order to investigate the complex net cost puzzle in Hungary. First, it is suggested that the short-term benefits of apprenticeship training depend on whether firms assign (more) skilled tasks to apprentices during the course of their training period. Unfortunately, from the administrative data that we have, it is impossible to retrieve additional information on the quality of the training process. This limitation calls for further research in this area. Second, the skill supply of apprentices may (additionally) explain observed retention patterns, as suggested by Muehlemann (2016). It may be the case that the quality of the pool of apprentices in Hungarian vocational training is poor, so that firms are only able to retain a small number of them. This is an aspect that would be well worth exploring.