1 Introduction

Education is one of the most important long-term determinants of countries’ development paths and growth trajectories (e.g., Sala-i-Martin et al. 2004). We still know relatively little about the origins of public education across the world, however. In particular, why and under what circumstances did public education first originate in non-democratic regimes? [Footnote 1] What determines when political and economic elites demand investment in public education?

Much of the work in political economy has concentrated on spending patterns in democracies (Boix 1998; Busemeyer 2007; Persson and Tabellini 2003; Ansell 2008; Iversen and Stephens 2008) or on differences between regime types (Ansell 2008; Baum and Lake 2003; Boix 2003; Acemoglu et al. 2013; Stasavage 2005). [Footnote 2] We know much less about the origins of spending on education in non-democracies, even though this is where the first investments generally occurred. Despite the observed variation, scholars often assume that non-democracies have little interest in investing in public goods (e.g., Boix 2003; Bueno de Mesquita et al. 2005). To enhance our understanding of elite demands for education and investments in non-democracies, I empirically investigate a theory of when self-interested non-democratic elites prefer higher levels of public spending.

I make use of a theoretical model developed by Galor and Moav (2006) and argue that differences in factor endowments by political elites can lead to differential preferences over government spending. Economic elites who own capital that directly benefits from higher government spending on public services demand investment in these public goods. Depending on the type of capital they own, higher government investment in education can increase elites’ return on capital ownership. For example, the return on public education spending is high for capital owners when skilled workers are capital enhancing. Therefore, while government spending may directly benefit the poor masses, economic elites in this context have an incentive to push for increased public goods spending, even if they bear part of the costs via taxation. On the other hand, owners of capital that is limited in its complementarity to government spending are likely to oppose such investments.

To investigate the theoretical argument, I have collected new data on economic and political characteristics in Prussian cities in the latter half of the 19th century. This period, in Prussia and Germany more generally, is marked by profound economic change (Pierenkemper and Tilly 2004), state development, and fiscal development, notably the introduction of the general income tax (e.g., Mares and Queralt 2015, 2018; Hollenbach 2018). Most importantly, these data allow me to calculate measures of income inequality and investment in public education that are not available for other subnational, or even national, administrative units in this period. The data come from a census of all Prussian cities with more than 25,000 inhabitants at the time (Silbergleit 1908) and allow me to directly test the argument. In addition to data availability, using data at the municipal level has several advantages from a research design perspective. First, the design allows me to control for several confounding factors, such as external war and the political system. Second, the Prussian case enables me to undertake a straightforward test of the proposed argument. The design of the Prussian electoral system, explained in more detail below, guaranteed an extreme overrepresentation of the economic elites. This system linked economic and political power, especially at the local level. Whereas many theoretical arguments in political science and economics assume congruence between political and economic elites, in reality this link is often tenuous. In the Prussian setting, by contrast, the political and economic elites strongly overlap, which allows for a systematic investigation of the theoretical argument.

As I show in the empirical analysis, in line with the theoretical argument, areas with high levels of industrial employment made larger investments in public education. As I expand upon below, industrial employment captures the political power of industrialists or, put differently, the economic interest of the political elite. In the first part of the empirical section, I show that these findings hold in standard regression analysis, even when controlling for a large number of possible confounders. The results are robust to controlling for the occurrence of political protests, the political power of the working class, average taxation, income inequality, and province-level fixed effects, as well as to modeling spatial dependence. In the second part of the empirical analysis, I attempt to estimate the causal effect of industrial employment on education investments more precisely. I introduce a new instrumental variable for industrial employment based on underlying rock strata that led to the development of coal beds. Using spatial instrumental variable estimation (Betz et al. 2018), I show that the estimated causal effect of industrial employment on public investments in education is substantively important and precisely estimated. [Footnote 3]

Although the theoretical argument in this paper is largely based on Galor and Moav’s (2006) theoretical model, the paper makes several important contributions to the literature. First, to my knowledge, this is the first rigorous empirical test of the general theoretical argument outside of the English case. Second, I introduce new data at the city level in Prussia in the 19th century. As discussed above, the unit of analysis has significant advantages over cross-national data from a research design perspective. Lastly, I undertake spatial instrumental variable analysis that ought to increase confidence in the results.

2 Public spending in autocracies

Throughout history, the vast majority of citizens have lived in non-democratic societies. Only since 1991 has democracy been the most prevalent political system in the world; even in 2015, 41 percent of the world’s population lived under non-democratic regimes. [Footnote 4] Moreover, a non-democratic government is, in essence, the original regime type, since all modern states were once under non-democratic rule. Nevertheless, our understanding of politics and what explains differences in public policy in non-democracies is quite limited.

In contrast to a vast literature on the differences between democracies and autocracies, much less research has attempted to explain what determines the differences in fiscal and other public policies within non-democratic regimes.

Figure 1 shows the empirical densities for total government expenditure as a percentage of GDP, separated by regime type. [Footnote 5] The left plot shows the density of observations for years before the turn of the 20th century. The right plot shows the densities for democratic and non-democratic country-years from 1901 to 2011. The plots are notable for two reasons. First, across both periods, the level of government spending that is observed in autocracies covers almost the whole range of observed values in democracies. The only exceptions are OECD countries with spending levels above 50% of GDP in the later part of the 20th century. We do, therefore, observe a large variation in spending levels within autocracies, which has often gone unexplained. Second, before the turn of the century, the average level of government spending is slightly higher in non-democracies (8.91%) than in democracies (7.98%). This is also the case for the period from 1900 until 1925. The differences in spending levels between democracies and non-democracies only developed in the latter part of the 20th century. For many countries, levels of government spending began to rise before a transition to democracy. And, as Paglayan (2018) shows, the provision of primary education in many countries increased long before democratization. [Footnote 6] In the second period plotted, 1901–2011, the average level of spending in democracies is substantially larger. Nevertheless, observations in non-democracies over that period vary significantly, ranging from 1.23% to 49.36% of GDP.

Fig. 1

The left density plot shows densities for government expenditure separated by regime type (Boix et al. 2013) for available expenditure data from 1800–1900 (Mauro et al. 2015). The right plot shows the densities for the years 1901–2011. While democracies do on average spend more in both time periods, it is clear that the level of government spending varies tremendously across non-democracies. Moreover, as the left plot exemplifies, in the 19th century the differences between non-democracies and democracies were much less pronounced. Empirically, it is clear that not all elites in non-democracies oppose government spending.

As Fig. 1 indicates, non-democratic regimes exhibit large differences in how much governments spend. This variation raises the question of when political elites in non-democracies push for higher government spending. While our understanding of the differences among autocracies is limited, one common explanation is based on the level of institutionalization among them (Escribà-Folch 2009; Gehlbach and Keefer 2011; Jensen et al. 2013; Boix and Svolik 2013). Another strand of the literature contends that as the size of the politically pivotal share of the population (or selectorate) increases, governments spend more on public versus private goods and vice versa (Bueno de Mesquita et al. 2005).

While other scholars have investigated the provision of public goods such as education in authoritarian regimes, much of the focus has been on how the political power of the poor or inequality affects their provision. For example, Go and Lindert (2010) find that the American North strongly outperformed the South in school enrollment rates in the 19th century, most likely due to higher local autonomy and voting power of the poor. Galor et al. (2009) and Kourtellos et al. (2013) show that higher land inequality is associated with a delay in the expansion of primary schooling, both in the US context in the 20th century and cross-nationally. Cinnirella and Hornung (2016) use data on Prussian counties in the 19th century to show an initial negative relationship between land inequality and primary school enrollment that becomes weaker as labor coercion decreases. Contrary to prevalent theories, however, Cinnirella and Hornung (2016) find that land concentration does not affect the supply of education, but instead peasants’ demand for primary education. Paglayan (2018), on the other hand, argues that mass education in autocracies may serve the autocrat by increasing state consolidation and indoctrinating the population, an idea that is often mentioned when it comes to early Prussian education (Wittmütz 2007).

In contrast, I propose a theory based on Galor and Moav (2006) about when political elites have economic incentives to invest in public education and demand government spending, even if it benefits the politically less powerful masses. I contend that under the right circumstances, capitalist elites have an interest in utilizing the state to increase the provision of public education. Independent of the institutional structure and size of the ruling coalition, the economic activities of political elites matter, and can induce different levels of government spending.

This idea builds heavily on Galor and Moav (2006) and is similar in mechanism to the argument in Lizzeri and Persico (2004). Lizzeri and Persico (2004) contend that the franchise extension in England was not a consequence of political pressure from the disenfranchised. Instead, liberal elites realized that increasing the number of poor voters was in their interests. More poor voters would raise the likelihood of a political majority for the liberal elites’ preferred policies, i.e., more public goods spending. Thus, in Lizzeri and Persico’s (2004) view, the expansion of the franchise in England was not driven by the masses’ redistributive pressures (“or threat of revolution”) but instead by intra-elite conflict over public vs. private goods spending. Urban elites demanded more investment in (health) infrastructure and foresaw that increasing the pool of voters would allow them to pursue these policies against the opposition of landed elites.

In a similar vein, Galor and Moav (2006) argue that the demise of class conflicts in the 19th and 20th centuries in England was not due to the higher redistribution associated with democratization, but instead because industrialists in the second phase of industrialization demanded increased investment in public goods. “The capitalists found it beneficial to support publicly financed education, enhancing the participation of the working class in the process of human and physical capital accumulation, leading to a widening of the middle class and to the eventual demise of the capitalist-workers class structure” (Galor and Moav 2006, 1). Brown (1988, 1989) shows that cities in more democratic countries (UK, USA) lagged in their investments in sanitation compared to cities with smaller ruling coalitions in Prussia. Brown (1989) contends that as workers became more valuable, investment in public health became more profitable for the wealthy, since it significantly reduced their workers’ sick days and increased life expectancy. In a similar vein to the argument made here, Bourguignon and Verdier (2000) have argued that positive externalities can become large enough for non-democratic elites to invest in education. In their model, enfranchisement is linked to education, and thus public education can lead to a loss of political power for the elites. Nevertheless, if positive externalities of public education are large enough, the benefits can outweigh the possible costs (Bourguignon and Verdier 2000).

As in Galor and Moav (2006), I contend that when the capital-skill complementarity is high, economic elites can directly benefit from government investment in skill formation. When the returns to elites’ capital depend on human as well as physical capital, higher government spending on health and education directly benefits these elites by increasing their return on capital. As Galor and Moav (2006) show in their formal theoretical model, once the return to additional investment in physical capital is smaller than the marginal return to spending on the public education of workers, capital owners will prefer higher taxes to finance public spending on education. Importantly, however, the effect of public spending on individual-level returns depends on the complementarity between physical and human capital. In cases where complementarity is high, capital owners are likely to demand higher levels of spending, even if the spending is financed through higher taxes and therefore increases their own tax payments. For example, when public education increases labor productivity, and thus the returns to capital in the industrial sector, to a greater degree than it increases wages, owners of factories benefit directly from the more productive workforce. [Footnote 7]

If the supply of skilled labor is low, but the demand is rising, increased public investment in public education can become profitable for elites, conditional on the type of capital they own. First, as discussed in the previous paragraph, it raises the productivity of the workforce for industries that require skilled labor. Second, it increases the supply of skilled workers for capital owners, thereby lowering the upward pressure on wages. Similarly, public spending on health care or sanitation raises the life expectancy of workers and reduces the number of sick days, thereby promoting their reliability and longevity (Brown 1989). The supply of public education is especially profitable if the beneficiaries are poor and lack access to credit. In this case, e.g., a setting of high inequality, education will be under-supplied without public investment, given that private investments are limited (Benabou 2002; Galor and Moav 2004).

In contrast, owners of capital with low skill complementarity have little interest in publicly financed education and other public goods. When labor supply is high and capital owners demand low-skilled labor, such as in agriculture, there are fewer benefits of public investment. In such situations, labor is easily replaceable and public education provides no value for capitalist elites. Landed elites may have an interest in opposing public financing of education for two reasons. First, higher spending is likely to be financed by higher taxes and is thus costly for individual landowners. Second, higher education of workers may raise their mobility as well as their wage demands, thereby directly increasing costs for agricultural elites (Galor et al. 2009). Similarly, according to Lizzeri and Persico (2004), elites in rural and less dense areas were less concerned about the public provision of sanitation since they were less affected by the illnesses of the poor.

In the Prussian case that is investigated here, Tilly (1966, 484f) documents the demand by industrialists to increase government spending that would “generate external economies and make private investment, for example in metalworking enterprise, more profitable” (emphasis added).

Businesses benefited strongly from government investment (especially at the local level). Industrialists were fundamentally affected by the availability of skilled labor and sufficient infrastructure and thus government spending and investment. As Becker et al. (2011a, 2011b) argue, even the most basic and menial tasks in factories required some level of literacy and math skills, which would be provided in early public schools. Furthermore, basic education enabled faster adoption and development of new technologies. Becker et al. (2011a) show that industrial development in Prussia benefited greatly from early educational investments in schooling.

In the late 19th century, as German industrialization was catching up with Britain, a large part of faster economic growth was due to the higher education of German workers (Pierenkemper and Tilly 2004; Tipton 1996). “German workers were becoming better paid, and they were also becoming better workers. The German states had an unmatched record in the nineteenth century for investment in human capital” (Tipton 1996, 76). Tilly (1991, 179f.) describes the public investments in science and education as one important change in the second phase of German industrialization: “This ‘second phase’ - some have called it ‘high industrialization’ - describes the period to 1914 and encompasses a number of important changes: [...]; the development of scientific knowledge as a factor of production and its encouragement by government institutions; and the absolute and relative growth of very large industrial enterprises.”

Similar to the work by Engerman and Sokoloff (2002), I contend that factor endowments are an essential part of the story, as they at least partly determine economic activity. Given an abundance of land and a high supply of unskilled labor, economic elites (or owners of large estates) have little reason to push for higher government spending. Owners of industrial capital, however, who lack adequate labor supply and require a more educated workforce can benefit directly from the state providing these public goods. Industrial elites, therefore, benefit from government spending on health and education, as it increases the return on their private investments. Ergo, these capital owners have incentives to demand higher levels of public spending on education and other productive public goods. Galor et al. (2009) point out that a conflict exists between large landowners who prefer abundant and cheap unskilled labor and elites who benefit from increasing the productivity of the workforce. I, therefore, expect non-democratic polities in which industrial elites hold political power to invest in public education.

3 Research design & case selection

In this paper, I use a unique and extraordinarily rich data set with observations from Prussian cities in the 19th and early 20th centuries to investigate the argument made above at the local level. Cities as administrative units were part of the Prussian central state and the German Reich. As Pierenkemper and Tilly (2004, 143) succinctly describe, “local government supplied most of the infrastructure and public services, [...], upon which daily life and indeed the very functioning of the economy itself depended.” Moreover, as discussed in more detail below, the case of Prussia allows for a direct investigation of the argument by linking economic and political power.

Using these local level data in the empirical test has several advantages. First, the use of the city census guarantees a level of comparability concerning density, size, and political organization. Second, by explaining subnational/local level differences in education investment, the research design allows me to control for several confounding factors, such as the political system, trade policy, or the threat of war. The political system is very similar across the sample of cities, making it unlikely that differences would cause changes in spending levels. Similarly, trade and defense policies are decided at higher levels of government, i.e., Prussia or the German Reich. Observations in the data set should, therefore, not differ significantly on these policies, allowing for a cleaner investigation of how local demands for domestic spending differ. Lastly, using only city-level data minimizes introducing rural-urban differences, which were quite pronounced at this time, especially when it comes to the provision of education (Hühner 1998).

Even though Prussia theoretically enacted compulsory schooling under Friedrich Wilhelm I (Frederick William I) in 1717, schools were supplied by the King, which led to a very slow increase in schooling and often underqualified teachers (Hühner 1998, 27). The Prussian central state continued to enact laws governing education throughout the 18th and 19th centuries and attempted to tighten compulsory schooling laws. De facto, however, schools were a responsibility of municipalities, especially when it came to financing. Cities, rural communities (Gutsbezirke), or even local manorial lords were the administrative units that were responsible for funding local schools. Schools were financed via school fees, local taxes, or directly by local estates. Only after 1888 was state assistance to school financing allowed, yet the level of financial support was relatively minor and much more significant in rural areas than in the cities studied here (Hühner 1998, 32f.).

To investigate the theory laid out above, I make use of the variation in education investments across municipalities by using city-level data. The data set includes all “large Prussian cities” with over 25,000 inhabitants and is based on a Prussian city census from 1907 (Silbergleit 1908). Figure 2 shows the units of analysis, 110 Prussian cities, and their distribution across Prussia. County (Kreis) borders are marked in black (1882 county borders), and cities are depicted as gray dots. Darker shading and larger point sizes represent larger populations in 1907. The largest and darkest point shows Berlin. While the majority of observations are clearly concentrated in the western, more industrial part of the country, a number of observations are located in the more agrarian, eastern parts of Prussia. The local-level observations provide a unique opportunity to investigate the circumstances under which economic elites were in favor of providing public services to the general public.

Fig. 2

The plot shows the location of all cities (observations) in the data set and their respective populations in 1907. Counties are plotted with their 1882 borders (MPIDR 1975). Figure A.1 in the Appendix shows the location of cities around Berlin and in the Ruhr Area in more detail.

3.1 Local non-democratic politics

During the period studied, the political system across cities in Prussia was quite similar. Prussia held regular elections for the lower house as well as to elect members to the parliament of the German Reich. Voters in cities also elected city council members. While elections were common and all male citizens above 24 had the right to vote, neither Prussia nor the slightly more democratic German Reich are considered to be democracies at the time according to measures commonly used in political science (Boix et al. 2013; Marshall et al. 2016; Coppedge et al. 2018).

Several features of the political system at the local level led to the enormous political power of economic elites, which is fundamental for the empirical investigation of the theoretical argument. As I discuss in more detail in the following, the system ensured that economic elites dominated politics and that industrial elites did so in industrial areas.

A particularly undemocratic institution in Prussian elections was the Dreiklassenwahlrecht (three-class franchise). All eligible voters (male citizens above 24 years) were ordered by the size of their tax payments and then split into three groups. The first group contained the richest taxpayers, who together paid one-third of the local tax revenue. The second group contained the next-richest taxpayers, again responsible for one-third of the local tax revenue. The last group contained all other male citizens. Thus, the richest citizens paying for one-third of the tax revenue also had a third of the voting power, no matter the number of voters in the group. In many cases, the top group was a tiny fraction of the population. The three-class franchise, employed in the vast majority of cities, very effectively tied political power in the electoral district to economic power. As Pierenkemper and Tilly (2004, 143) summarize, local governments were “largely in the hands of local elites and local economic interests, who operated via a few civil servants and the quasi-parliamentary bodies elected on the basis of an extremely narrow suffrage.”
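The allocation rule described above can be illustrated with a minimal sketch. The function names, the tax figures, and the boundary rule (a voter whose payment crosses a one-third threshold is placed in the higher class) are assumptions made for illustration only; actual Prussian practice at class boundaries may have differed.

```python
from typing import List


def three_class_split(tax_payments: List[int]) -> List[int]:
    """Return the voting class (1, 2, or 3) of each voter, in input order."""
    total = sum(tax_payments)
    # Rank voters from the highest to the lowest tax payment.
    order = sorted(range(len(tax_payments)),
                   key=lambda i: tax_payments[i], reverse=True)
    classes = [3] * len(tax_payments)
    cumulative = 0  # taxes paid by all voters ranked above the current one
    for i in order:
        # Assumed boundary rule: a voter belongs to a class if the taxes of
        # everyone ranked above them have not yet filled that class's third.
        if 3 * cumulative < total:
            classes[i] = 1
        elif 3 * cumulative < 2 * total:
            classes[i] = 2
        cumulative += tax_payments[i]
    return classes


def vote_weight(classes: List[int]) -> List[float]:
    """Each class controls one third of the seats, shared equally by its members."""
    sizes = {c: classes.count(c) for c in set(classes)}
    return [(1 / 3) / sizes[c] for c in classes]


# Hypothetical city with eleven voters and one dominant taxpayer.
taxes = [1000, 300, 200, 100, 100, 50, 50, 50, 50, 50, 50]
classes = three_class_split(taxes)
weights = vote_weight(classes)
print(classes)
print(weights[0] / weights[-1])  # weight of the top voter relative to a class-3 voter
```

In this hypothetical city, the single largest taxpayer forms the first class alone, which mirrors the Elbing example in which one shipyard owner elected a third of the council.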

In addition to the franchise, other characteristics of the political system sustained the political power of economic elites. Prussian elections, for example, were not held under secret ballot. This enabled employers to pressure poorer voters, prohibiting a free choice (Thier 1999; Hallerberg 2002). The rules governing city administration in Prussia also included the property owner privilege (Hausbesitzerprivileg), which specified that 50% of the members of the municipal parliament would have to be property owners (owners of houses) (Hühner 1998).

In combination, these political institutions were not only profoundly anti-democratic, but they also ensured industrial elites were effectively holding power in areas with heavy industry. Entrepreneurs clearly understood the beneficial effect of the voting rules and pushed to keep the three-class franchise (Jaeger 1967). Industrialists and other entrepreneurs were strongly represented in the Prussian lower house, yet their representation at the local level, where electoral districts were smaller, was even stronger. In the Ruhr area, Prussia’s most industrial region, the top two electoral classes elected mainly industrialists, bankers, and traders to the city councils. As an extreme example, in the city of Essen, the Krupp family by itself selected one third of city council members from 1886 to 1894 (Jaeger 1967, 87). Similarly, in Elbing, a shipyard owner was the only voter in the top class and thus elected 20 of the 60 city council members (Jaeger 1967, 262). In 1898, in 15 cities in the Rhineland, the top class included less than 1% of all voters (Jaeger 1967). Spoerer (2004, 189) suggests that the political rules made it essential for industrialists to live close to their firm’s location. The political power would enable them to strongly influence local spending decisions which could directly benefit their firms.

3.2 Data & measurement

The vast majority of variables used in the empirical analysis are newly collected from the 1907 city census of all Prussian cities with more than 25,000 inhabitants (Silbergleit 1908). The relevant variables were transcribed for all cities listed in the census. I then geo-coded all cities, where possible. For variables that are not available at the city level, I use the respective city’s geo-location to merge county-level or electoral district data. Table B.1 in the Appendix provides summary statistics and the source for all variables used in the analyses. [Footnote 8]

To investigate investment in education at the city level, I create two dependent variables that measure public investment in education. First, I calculate the cost of schooling per capita for each city, to which I refer as school expenditure in the following. I then take the natural log of the calculated per capita cost. The second dependent variable is a measure of school enrollment. I calculate the share of 5 to 15-year-olds in a given city who attend the Volksschule. [Footnote 9]
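The two dependent variables can be written out as a short sketch. The function and argument names are hypothetical, and the input figures are invented for illustration; they are not taken from the census.

```python
import math


def log_school_expenditure(total_school_cost: float, population: int) -> float:
    """Natural log of schooling cost per capita."""
    return math.log(total_school_cost / population)


def enrollment_share(volksschule_pupils: int, children_5_to_15: int) -> float:
    """Share of 5 to 15-year-olds attending the Volksschule."""
    return volksschule_pupils / children_5_to_15


# Hypothetical city: 500,000 marks of school costs and 50,000 inhabitants.
print(log_school_expenditure(500_000, 50_000))
# Hypothetical city: 8,000 Volksschule pupils among 10,000 children aged 5 to 15.
print(enrollment_share(8_000, 10_000))
```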

In the theoretical section above, I argue that political elites in non-democracies push for investment in public education when they own capital that is complemented by human capital. Specifically, I contend that during the period studied, owners of industrial capital had an interest in the state providing public education. To operationalize the political influence of owners of industrial capital, I use the share of industrial employment. [Footnote 10] For this variable to be a good proxy, two conditions have to hold. First, as the share of industrial employment increases in a given administrative unit, the share of capital income based on industry in the same administrative unit also has to increase. Second, the increasing share of capital income ought to be directly translated into political power over spending decisions. The particularities of the Prussian political system effectively ensure that both of these conditions hold. As discussed in more detail above, the Prussian political system directly linked economic power to political power, especially at the local level. Where industry was economically important, industrial elites effectively controlled the selection of political decision-makers at the city level and directly influenced policy. Moreover, decisions over school spending were generally made directly at the city level (Hühner 1998).

As one example of the appropriateness of the measure, the reader may consider a comparison between the cities of Dortmund and Muenster, for which Krabbe (1985) provides data on the share of industrial elites in the city council. Dortmund was highly industrialized, with a high share of workers employed in industry, trades, and mining and very few workers employed in agriculture. On the main independent variable, share of industrial employment, Dortmund shows a value of 13.3% for 1882, just below the sample 90th percentile. Muenster, on the other hand, was much less industrial, with only 1.9% of workers employed in industry in 1882, just above the 10th percentile in the sample. Similarly, on the alternative measure of industrial employment in 1895, Dortmund’s value is above the sample 75th percentile, while Muenster’s is below the 25th percentile. At the same time, the share of industrialists in Dortmund’s city council was 28% in the period 1870–1890, with traders and bankers comprising another 30%. In Muenster, on the other hand, only 6% of the council members were industrialists in the period 1875–1885, and 27% were traders and bankers (Krabbe 1985, 142, 147). While this comparison, of course, does not prove the adequacy of the measure across the whole sample, it provides some evidence that, for the two cases available, the proxy reflects what it is intended to measure.

In addition to the primary independent variable, the main empirical models include a large set of control variables. First, I control for city size, i.e., logged population. I also create a measure of income inequality at the city level, measured as a Gini coefficient based on the number of city inhabitants in different income groups.Footnote 11

Unfortunately, data on total city income, such as GDP, is not available. In an attempt to control for city income levels I add a variable measuring the average taxes paid by city residents. While not perfect, this variable should capture income levels. I also use the data on income groups and create an average city income, assuming that each resident earns the average of their income group.
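
The grouped-income measures described above can be illustrated with a short sketch. It is one common way to compute a Gini coefficient and a mean from grouped data, not necessarily the exact procedure behind the paper's variables; the function names, group counts, and group incomes are hypothetical, and the sketch assumes that every resident earns the representative income of their group.

```python
import numpy as np


def grouped_mean_income(counts, incomes):
    """Average income when each resident earns their group's income."""
    counts = np.asarray(counts, dtype=float)
    incomes = np.asarray(incomes, dtype=float)
    return (counts * incomes).sum() / counts.sum()


def grouped_gini(counts, incomes):
    """Gini coefficient from the number of residents per income group."""
    counts = np.asarray(counts, dtype=float)
    incomes = np.asarray(incomes, dtype=float)
    total = counts.sum()
    mean_income = grouped_mean_income(counts, incomes)
    # Absolute income gap between every pair of groups, weighted by the
    # number of resident pairs falling into that pair of groups.
    gaps = np.abs(incomes[:, None] - incomes[None, :])
    pair_counts = counts[:, None] * counts[None, :]
    return (pair_counts * gaps).sum() / (2.0 * total**2 * mean_income)


# Hypothetical city with three income groups (illustrative values only).
example_counts = [600, 300, 100]
example_incomes = [500.0, 1500.0, 4000.0]
print(grouped_mean_income(example_counts, example_incomes))
print(grouped_gini(example_counts, example_incomes))
```

Because within-group variation is ignored, a measure of this kind understates inequality whenever incomes differ inside a group.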

In addition, a competing theoretical argument might be that it is easier for protesters to organize in very urban areas with industrial production, especially if factories further enhance the ability for collective action. Moreover, more industrial areas are likely to be early hotbeds of future socialist movements. I add two additional controls to account for the possibility that industrial employment proxies for the political power of the working class. First, I add a control for the number of protest events that occurred in the 19th century. Tilly (1980, 1990) originally compiled data at the city level based on newspaper articles. While incomplete, it is as comprehensive as possible for the period covered and should include major protest events (Tilly 1980). I geocode the city-level data where possible and create a count of protest events within a 15 km radius around each city in the sample. Second, I add data on social democratic (SPD) vote share in the German Reich parliamentary elections of 1893. Unfortunately, electoral results at the city level are not available, but because of the more democratic electoral rules in the German Reich compared to Prussian elections, these results are more likely to reflect the actual strength of SPD support. Any city-level electoral results would likely underestimate the strength of the labor movement due to voter suppression and the franchise rules.Footnote 12 The electoral results for the German Reich are merged based on the geo-coded coordinates of the cities and a map of the electoral districts (Ziblatt 2009; Ziblatt and Blossom 2011).
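
The radius-based protest count can be sketched as follows. This is a minimal illustration under stated assumptions: distances are great-circle distances computed with the haversine formula, and the coordinates and function names are hypothetical. The distance calculation actually used for the paper's variable is not described in this section.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def count_events_within(city, events, radius_km=15.0):
    """Number of geocoded events within radius_km of a city (lat, lon)."""
    return sum(
        1 for lat, lon in events
        if haversine_km(city[0], city[1], lat, lon) <= radius_km
    )


# Hypothetical city and event coordinates (illustrative values only).
city = (51.5, 7.5)
events = [(51.5, 7.5), (51.6, 7.5), (52.5, 7.5)]
print(count_events_within(city, events))
```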

Next, I calculate a measure of land concentration in the cities’ surrounding counties, as land inequality is likely to be correlated with industrial development and has been shown to affect school enrollment (Cinnirella and Hornung 2016). I also include additional controls for geographic and economic factors. I add covariates for longitude, logged rainfall in millimeters, and the logged area of the surrounding county. Additionally, I add a proxy for migration, i.e., the share of the city population that was born outside the city, and an indicator for whether the city was in an area under French control after the French Revolution (Acemoglu et al. 2011). Lastly, models include province fixed effects and indicators for slightly different executions of the franchise rules.Footnote 13

Unfortunately, not all measures are available for the same point in time. The measure of industrial employment, the primary independent variable, is only available for 1882. I, therefore, use all other variables measured at the time point closest to 1882, which generally means 1893. The enrollment rate is unfortunately only available for 1905/06. School expenditure per capita is available for both 1895 and 1905; the results do not change substantially depending on which year is used. For consistency, I present results for both dependent variables measured in 1905 in the main body of the paper.Footnote 14

4 Empirical analysis

As a first step in the empirical analysis, I estimate ordinary least squares (OLS) models to show the association between the independent variable of interest, the share of industrial employment, and the two measures of educational provision at the city level. Specifically, I estimate the following model:

$$ y_{i} = \alpha +\boldsymbol{\beta}_{k} \boldsymbol{X}_{i,k} + \gamma \text{ industrial}_{i} + \epsilon_{i}, \quad i = 1,...,n $$
(1)

where $y_{i}$ is the measure of educational investment in city $i$, $\alpha$ is the common intercept, $\boldsymbol{X}_{i,k}$ is the matrix of control variables, including indicators for a city’s province, and $\boldsymbol{\beta}_{k}$ is a vector of the associated coefficients. $\text{industrial}_{i}$ is the share of industrial employment in city $i$ and $\gamma$ is the estimated association between the share of industrial employment and the outcome of interest. Lastly, $\epsilon_{i}$ is the iid error term. Given that some cities in the sample are in the same county and therefore have the same industrial employment share, standard errors are generally clustered at the county level.
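
A minimal sketch of this kind of estimation, OLS with standard errors clustered by county, is given below. It uses simulated data and the plain cluster-robust sandwich estimator without any small-sample correction, which is an assumption; statistical packages typically apply a correction, and the paper's software and exact variance estimator are not specified here. All variable names and parameter values are hypothetical.

```python
import numpy as np


def ols_cluster(y, X, clusters):
    """OLS coefficients with cluster-robust (sandwich) standard errors."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    clusters = np.asarray(clusters)
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    cov = np.zeros((X.shape[1], X.shape[1]))
    for g in np.unique(clusters):
        in_g = clusters == g
        # Sum the scores within a cluster before forming the outer product,
        # which allows arbitrary error correlation inside each county.
        score = X[in_g].T @ resid[in_g]
        influence = XtX_inv @ score
        cov += np.outer(influence, influence)
    return beta, np.sqrt(np.diag(cov))


# Simulated example: industrial employment varies only at the county level.
rng = np.random.default_rng(0)
n_cities, n_counties = 200, 80
county = rng.integers(0, n_counties, size=n_cities)
industrial = rng.uniform(0.0, 0.15, size=n_counties)[county]
control = rng.normal(size=n_cities)
y = 1.0 + 2.5 * industrial + 0.5 * control + rng.normal(scale=0.05, size=n_cities)

X = np.column_stack([np.ones(n_cities), industrial, control])
beta, se = ols_cluster(y, X, county)
print(beta)  # order: intercept, industrial employment, control
print(se)
```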

Table 1 shows the estimated relationship between industrial employment and the dependent variables based on standard OLS regressions with different sets of control variables. Columns one and four show the coefficients for industrial employment ($\gamma$ above) from bivariate regressions with logged per capita school expenditure and school enrollment as the outcomes, respectively. Columns two and five show the coefficients for industrial employment when I include a limited set of control variables: income inequality, average income, logged population, and tax payments per capita. These are the controls that are measured most closely in time to the independent variable of interest and where the danger of post-treatment bias is minimized. Columns three and six show the estimated regression coefficients on industrial employment when I add further controls for the number of protest events, longitude, share of population born in the city (migration), logged rainfall in millimeters, logged county area, land inequality, a dummy for French presence, and SPD vote share in the elections to the German Reich’s parliament in 1893. Aside from the bivariate models (columns one and four), all models include indicators for the cities’ provinces, i.e., province fixed effects, and indicators for the system that is used to create the three-class franchise.

Table 1 OLS model of expenditure and enrollment on industrial employment

Based on the theoretical argument, we would expect industrial employment to have a positive association with school expenditure and school enrollment (γ in Eq. 1 above). As Table 1 shows, the estimated association between industrial employment and per capita school expenditure is indeed positive and quite large. Depending on the set of controls included, the estimated coefficient ranges from 2.44 to 2.96. Importantly, across the different models and including an extensive set of control variables, including province fixed effects, the coefficient is quite stable and the 95% confidence interval does not include zero.Footnote 15 For the most conservative model with the full set of controls, the estimated coefficient means that a one standard deviation increase in industrial employment from its mean (0.058) is associated with an expected increase in logged per capita expenditure from 2.19 to 2.32 or a 6 percent increase.
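
As a back-of-the-envelope reading of these magnitudes, the reported figures can be unpacked as follows. The calculation uses only the rounded numbers given in the text. It assumes that 2.44, the lower end of the reported range, is the coefficient from the model with the full set of controls, and that the logarithm is the natural logarithm. The implied standard deviation and the implied change in unlogged expenditure are therefore approximations, not estimates taken from Table 1.

$$
\begin{aligned}
\Delta \log(\text{expenditure}) &= 2.32 - 2.19 = 0.13, \\
\frac{0.13}{2.19} &\approx 0.06 \quad \text{(the six percent increase, measured on the log scale)}, \\
\text{implied s.d. of industrial employment} &\approx \frac{0.13}{2.44} \approx 0.053, \\
e^{0.13} - 1 &\approx 0.14 \quad \text{(approximate proportional change in unlogged per capita expenditure)}.
\end{aligned}
$$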

Similarly, when regressing school enrollment on industrial employment, the estimated coefficient for industrial employment is quite large and precisely estimated, especially in the bivariate model. For the association with enrollment, the coefficient for industrial employment is more sensitive to the included set of covariates. In the three models, the coefficient ranges from 1.18 in the bivariate model to 0.4 in the model with all controls included. Its 95% confidence interval, however, does not cover zero for any of the estimates. Using the most conservative estimate again, here an increase in industrial employment from its mean by one standard deviation is associated with an expected increase in enrollment from 63.6% to 65.7%, i.e., a two percentage point or three percent increase.

The estimated results remain statistically significant at the 5% level and in the expected direction when I use an alternative measure of industrial employment in 1895 based on Galloway’s (2007) data or when I estimate the model with the logged absolute number of industrial workers in 1882 (Tables C.2 and C.3 in the Appendix). One possible problem with the analysis is that industrial employment is measured at the county level, whereas the unit of analysis is the city. Also, 28 cities in the data come from counties with more than one city (i.e., these observations have the same value on industrial employment). As an additional robustness check, I therefore include indicator variables for these particular cities in the OLS regression models with the full set of controls. The results remain effectively the same as those presented above.Footnote 16

4.1 Spatial autoregressive models

The results above show a correlation between the provision of public education at the city level and the share of industrial employment, proxying for elite capital ownership. One additional concern with the data is the potential for spatial spillovers and spatial dependence. For example, industrial employment in one county/city may increase the demand for education spending in neighboring cities. Similarly, investment in education in one city may allow for free-riding by elites and less investment in nearby cities. Overall, spatial dependence in the main variable of interest, industrial employment, and the outcome variables could lead to biased estimates. Using Moran’s I test, I am able to reject the null hypothesis of no spatial correlation in the residuals for the OLS models without province fixed effects. To account for possible spatial dependence in the data, I estimate spatial autoregressive models by including a spatial lag for the dependent variable. The model is estimated via generalized spatial two-stage least squares (GS2SLS) (Drukker et al. 2013) and can be written as:

$$ y_{i} = \lambda \sum\limits_{j \neq i} w_{i,j} y_{j} + \alpha +\boldsymbol{\beta}_{k} \boldsymbol{X}_{i,k} + \gamma \text{ industrial}_{i} + \epsilon_{i}, \quad i = 1,...,n $$
(2)

Here $y_{j}$ is the value of the dependent variable in each other city $j$, and $w_{i,j}$ determines the neighboring structure used to create a weighted average of the dependent variable across the other cities. In this case, the spatial weights matrix is based on the inverse distance between cities and then row-standardized. This creates a weighted average of neighbors, where closer cities are weighted more heavily. Table D.1 in the Appendix presents the results from the spatial autoregressive models for both dependent variables. The results show evidence of possible spatial dependence, though when province fixed effects are included the spatial correlation parameter ($\lambda$ above) decreases substantially. Specifically, the estimated coefficients in the bivariate models decrease once the spatial lag is included, whereas the estimated coefficients in the models with province fixed effects are quite stable.
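
The construction of a row-standardized inverse-distance weights matrix, the resulting spatial lag, and Moran's I can be sketched as follows. This is an illustrative sketch under stated assumptions: coordinates are treated as planar (for example, projected kilometers), the coordinates and values are hypothetical, and Moran's I is computed for a generic vector, whereas the test reported in the paper is applied to regression residuals.

```python
import numpy as np


def inverse_distance_weights(coords):
    """Row-standardized inverse-distance weights with a zero diagonal."""
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    W = np.zeros_like(dist)
    off_diagonal = ~np.eye(len(coords), dtype=bool)
    W[off_diagonal] = 1.0 / dist[off_diagonal]  # closer cities weigh more
    return W / W.sum(axis=1, keepdims=True)     # each row sums to one


def morans_i(values, W):
    """Moran's I statistic for a vector of values and a weights matrix."""
    z = np.asarray(values, dtype=float)
    z = z - z.mean()
    return (len(z) / W.sum()) * (z @ W @ z) / (z @ z)


# Hypothetical planar coordinates and outcome values for four cities.
coords = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
outcome = np.array([0.0, 0.0, 1.0, 1.0])

W = inverse_distance_weights(coords)
spatial_lag = W @ outcome  # weighted average of the other cities' outcomes
print(spatial_lag)
print(morans_i(outcome, W))
```

With a row-standardized matrix, the spatial lag of a city is a weighted average of the outcome in all other cities, which is the term multiplied by the spatial correlation parameter in Eq. 2.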

Importantly, the substantive results of the spatial autoregressive models are quite similar to those of the standard OLS regression. As Table D.1 in the Appendix shows, the estimated coefficients for industrial employment are positive and statistically significant at conventional levels for both school expenditure and enrollment as the dependent variable. Again, in line with previous results, the spatial autoregressive model provides correlational evidence for the theoretical argument made above. Cities with more industrial employment are associated with higher investments in education. Even though the data exhibit spatial dependence, controlling for its presence does not change the conclusion.

5 Causal identification using an instrumental variable

Despite the strong results from the OLS models and the spatial autoregressive models, concerns remain with regard to establishing the hypothesized relationship, let alone causality. The main threats to the results presented above come from omitted variable bias, reverse causality, or measurement error. To better identify the potential causal effect of industrial capital ownership on educational inputs, I estimate an instrumental variables model treating industrial employment as the potentially endogenous variable.

To instrument for industrial employment, I use an exogenous geographic variable – the location of Carboniferous rock strata. These rock strata developed during the Carboniferous era (more than 300 million years ago) and are likely to result in the presence of coal mining areas. Carboniferous (literally “coal bearing”) rock strata were mapped by the Federal Institute for Geosciences and Natural Resources in Germany (Asch 2005). As Fernihough and O’Rourke (2014) show, these Carboniferous areas are highly correlated with later coal discoveries.

The use of rock strata as an instrument for industrial employment works through coal being one of the most critical natural resources during industrialization (especially the second phase) and a significant driver of economic progress (Fernihough and O’Rourke 2014). Indeed, the industrial take-off in Europe would have been impossible without the vast coal deposits in England (Pomeranz 2002; Wrigley 2010; Gutberlet 2013). The availability of raw materials is imperative to industrial development and manufacturing, especially at a time when transport costs were still very high. Proximity to coal mines, therefore, ought to be relevant to industry location. I expect distance to Carboniferous areas to be negatively correlated with industrial employment. Specifically, I use the natural log of a city’s distance to the closest Carboniferous rock strata as an instrument for industrial employment.

Figure 3 shows the bivariate relationship between the potentially endogenous variable of interest, industrial employment, and the instrument, logged distance to the closest Carboniferous rock strata. As expected, there is a robust negative relationship between the two variables: the $R^{2}$ for the bivariate regression is 0.42. Table E.1 in the Appendix shows the results when regressing industrial employment on the instrument (logged distance to the closest Carboniferous area) and the limited set of control variables. As one can see, the estimated coefficient of the instrument is negative and statistically significant. The robust F-statistic for the first stage in the standard two-stage least squares model is 16.41 with the full set of controls and 17.64 with the limited set of control variables. Based on the available evidence, the instrument is quite strong in predicting industrial employment.

Fig. 3 The plot shows the relationship between the potentially endogenous variable (industrial employment) and the instrument used (logged distance to closest Carboniferous area)

A second necessary assumption for the IV estimation to be valid is the exclusion restriction, i.e., that the instrument is independent of any other determinants of the outcome but the endogenous variable of interest. Mathematically, the exclusion restriction is generally expressed as $\text{Cov}(\epsilon_{i}, Z_{i}) = 0$, where $Z_{i}$ is the instrument and $\epsilon_{i}$ is the unobserved error term in the second stage (Angrist and Pischke 2009). While the exclusion restriction is not testable, from a theoretical perspective, it seems highly unlikely that rock strata directly influence the political processes, not least because they precede these by millions of years. A possible concern, however, is that other indirect paths exist outside of industrial capital ownership, by which coal-bearing rock strata, or coal deposits, could affect educational investments. Two main avenues come to mind. First, it could be that industrial areas are richer and are thereby investing more in education. Second, aside from income, it could be possible that it is easier to collect taxes from industrial vs. agricultural capital.Footnote 17 To block the potential path from coal to educational investment through income or taxation, the instrumental variable models include covariates for average income, logged population, as well as average tax payments. It is difficult to imagine other potential ways in which the location of coal deposits would change educational investments. Nevertheless, I also provide the instrumental variable results estimated with the full set of controls, including province fixed effects.
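
The logic of the instrumental variable approach can be illustrated with a small simulation. The sketch below is a standard (non-spatial) two-stage least squares estimator written out by hand on simulated data; it is not the paper's estimation code, and every name and parameter value is hypothetical. In the simulation, an unobserved confounder raises both industrial employment and the outcome, so OLS overstates the effect, while the instrument (a stand-in for logged distance to Carboniferous strata) satisfies the exclusion restriction by construction.

```python
import numpy as np


def two_stage_least_squares(y, endog, exog, instrument):
    """Manual 2SLS with one endogenous regressor and one instrument.

    Returns (second-stage coefficients, first-stage coefficients). The
    endogenous regressor's coefficient is the last second-stage entry.
    Standard errors from a naive second stage would be incorrect, so
    none are reported here.
    """
    Z = np.column_stack([exog, instrument])
    first_stage, *_ = np.linalg.lstsq(Z, endog, rcond=None)
    endog_hat = Z @ first_stage
    X_hat = np.column_stack([exog, endog_hat])
    second_stage, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return second_stage, first_stage


rng = np.random.default_rng(1)
n = 20_000
log_distance = rng.normal(size=n)   # simulated instrument
confounder = rng.normal(size=n)     # unobserved by the analyst
industrial = (0.1 - 0.03 * log_distance + 0.02 * confounder
              + 0.01 * rng.normal(size=n))
y = 2.0 + 2.5 * industrial + 0.5 * confounder + 0.1 * rng.normal(size=n)

const = np.ones((n, 1))
ols_coefs, *_ = np.linalg.lstsq(np.column_stack([const, industrial]), y, rcond=None)
iv_coefs, first_stage = two_stage_least_squares(y, industrial, const, log_distance)

print(ols_coefs[-1])    # biased upward by the confounder
print(iv_coefs[-1])     # should be close to the true value of 2.5
print(first_stage[-1])  # negative first-stage slope on the instrument
```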

Spatial dependence, however, is again a concern, in particular because of the geographic nature of the instrument. As Betz et al. (2018) show, spatial dependence in the outcome can lead to significant bias in standard two-stage least squares models, especially if the instrument also exhibits spatial clustering. Given that the instrument is based on distance to rock strata, spatial correlation is highly likely. Based on the particularities of the dependent variable and the instrument, I therefore estimate a spatial two-stage least squares (s-2sls) model. The main difference is that the model also estimates a spatial lag of the dependent variable, which in turn is instrumented by spatial lags of the regressors (Drukker et al. 2013; Betz et al. 2018). The s-2sls model nests the standard 2sls estimation without (potentially falsely) assuming zero spatial dependence. When no spatial dependence is present, the s-2sls estimate is effectively the same as the standard 2sls estimate (Betz et al. 2018). The spatial weights matrix used in the estimation is the same as above, based on the inverse distance between the cities. Standard errors are adjusted for potential heteroscedasticity.

Table 2 shows the results for the spatial IV regressions and the estimated parameters of interest. The full model results are presented in Table E.2 in the Appendix. Two things stand out. First, across all models, the results are quite similar to the OLS regression results. Second, the instrumented coefficients are slightly larger than their OLS counterparts. This holds both in the models with logged per capita expenditure and in the models with enrollment as the dependent variable, where the coefficients on industrial employment in the spatial two-stage least squares models are very similar to the OLS results but slightly larger.Footnote 18

Table 2 s-2sls estimates: expenditure and enrollment on industrial employment (Instrumented)

Based on the spatial IV model, the estimated effects are therefore slightly larger than those reported above. For both dependent variables of interest, the s-2sls results suggest a causal effect of industrial employment on education similar in magnitude to those estimated in the OLS models.

Lastly, as Table E.3 in the Appendix shows, the results are effectively identical when standard two-stage least squares models are estimated.

6 Conclusion

When do political elites invest in the provision of public goods? How can differences in public spending within non-democracies be explained? In this paper, I use data from Prussian cities at the end of the 19th century to investigate these questions. I argue that economic elites have an interest in higher government spending on public services if it increases their return on capital. Specifically, when the complementarity between physical capital and human capital is high, capital owners have strong interests in getting the state to invest in the provision of human capital. I argue that this was the case for owners of industrial capital in 19th century Prussia.

I use data from a census of Prussian cities to investigate the theoretical argument. To do so, I collected data on educational investment and other economic and political characteristics in 110 Prussian cities. Using standard regression techniques and spatial autoregressive models, I show that industrial employment is robustly associated with higher local spending on education. At the same time, industrial employment is also associated with higher enrollment rates in the local Volksschule. Moreover, using distance to Carboniferous rock strata as an instrument for industrial capital ownership, I provide evidence of a causal effect of industry on educational investment during this time. Lastly, I undertake a bounding exercise to show that these results are unlikely to be the artifact of omitted variable bias.

While this paper shows the effect of industry location on educational investment, several potential avenues for further research stand out. First, the effect of different types of capital ownership on other public goods could be investigated. For example, the relationship with other budget items, such as policing and health spending, might be of interest. Further, is the investment in public goods related to inequality and the potential repression of politically disenfranchised groups? Moreover, it is possible that the demand for public spending motivates the development of tax capacity. As revenue must precede spending, new demands for public spending create pressures for higher taxation and the development of fiscal capacity. In this sense, public education spending could provide elites with the motivation to increase the fiscal capacity of the state. Future research ought to further investigate the interplay of elite capital ownership, public goods investments, and fiscal capacity development.

Lastly, while this paper is primarily focused on elite interests and their political influence in a non-democratic setting, the findings should have implications for democracies as well. Investment in education may be a cross-cutting cleavage in that some economic elites have interests aligned with the masses to fund public education, whereas other elites and voters may oppose such investments. A mechanism similar to the one outlined above may, therefore, have different implications for democratic polities, which ought to be investigated in the future.