1 Introduction

An ageing population is considered one of the major challenges for developed economies. To deal with population change, its causes have to be understood. One major cause of population change, the existence of the welfare state, has received comparatively little attention in the recent academic debate. This paper tries to fill this gap by analysing the link between social security and fertility in a theoretical model and by testing the model’s implications with historical data.

To most economists, it is clear that social insurance provision as well as social insurance contributions trigger changes in behaviour, for example in the labour supply decision (Becker 1965; Feldstein 1974) or labour migration (e.g. Borjas 1999), also in the historical context (e.g. Khoudor-Cásteras 2008). This link between social security and individual behaviour has been postulated as the so-called social security hypothesis (Feldstein 1974), which states that the individual provision for the major risks of life—sickness, accidents and poverty—declines whenever the state provides insurance against these risks. Therefore, it may seem surprising that the link between social security and other changes in individual behaviour, such as fertility, has received less attention both in the seminal economic literature (e.g. Becker 1960, 1988, 1991) and in broader discussions on the fertility decline in advanced economies.Footnote 1

In the public finance literature, it is well-established that the link between fertility and the public provision of pension insurance can be considered a special case of the social security hypothesis (Bental 1989; Prinz 1990; Cigno 1993; Cigno and Rosati 1996; Sinn 2004; Fenge and Meier 2005; Cigno and Werding 2007; Cremer et al. 2008). Recently, the link between pensions and fertility has received increasing attention, also in the historical context of the first demographic transition (Guinnane 2011). However, testing this link is more difficult than testing for example labour market effects of social insurance, since social security and in particular pension systems have been in place for over a century in most advanced welfare states, and cross-country variation is rare (Ehrlich and Zhong 1998 and Boldrin et al. 2015 are exceptions). Moreover, there are only few exogenous changes within a given pension system (Cigno and Rosati 1992; Cigno et al. 2003; Billari and Galasso 2009). In addition, social insurance also affects the savings decision, which complicates the analysis even further.

This paper makes two major contributions to understanding the impact of social security on fertility and thus to a better understanding of the causes of population ageing. First, we further develop microeconomic overlapping generations models in the spirit of Cigno (1993) to illustrate that the effect of pension insurance on fertility is ex ante ambiguous and determined by the extent to which pension insurance affects lifetime income. From the model we derive testable hypotheses on this link between pensions and fertility. Second, we use the historical data set on the introduction of pension insurance in Imperial Germany developed by Scheubel (2013) to test the hypotheses derived from the model. We quantify the effect of pension insurance on fertility using cross-regional variation in 23 provinces of Imperial Germany, for which we construct a pseudo panel of two periods. As this implies that the total number of observations in our study is only 46, we show that our results hold for different proxies and that they do not depend on the specific construction of the pseudo panel.

Imperial Germany was the first European country to enact an irreversible transition into a welfare state. Company-level pension insurance for workers in certain professions was already common during the mid-nineteenth century (Jopp 2013). By the end of the nineteenth century, pension insurance had become statutory for almost all workers at the national level (Scheubel 2013). The authorities collected information on several key variables of the system from its inception. We use these data, which offer a unique opportunity for analysis because they cover the period in which social security was introduced in Imperial Germany.

The model used in this paper is a simple overlapping generations (OLG) model which combines three options to provide for old age: private savings, an intra-family transfer from children to parents and a public pension system. One of the key assumptions in the model is the reduction of labour supply whenever a household decides to have children. This assumption implies that there is an opportunity cost of having children in terms of foregone lifetime income. Since a higher contribution to the pension system reduces the net wage, it also reduces this opportunity cost, which, ceteris paribus, has a positive effect on the birth rate. At the same time, whenever the return from contributions to the pension system is lower than the return from the other possibilities of saving for old age, in other words, when there is an implicit tax in the pension system, a higher contribution rate also implies lower disposable income, which has a negative effect on fertility. In equilibrium, these effects are traded off against each other. The internal rate of return of the pension system determines the size of the overall effect. In our empirical analysis, we show that a higher enrolment rate in pension insurance is associated with a lower birth rate.

This paper thus provides a theoretical underpinning and an empirical confirmation of the negative relationship between statutory old-age insurance or more broadly statutory social insurance and fertility. The effect amounts to a total reduction of approximately 0.5 marital births per 1000 between 1895 and 1907. Since we also test for the other determinants of the first demographic transition which have been identified in the literature (e.g. Richards 1977; Galloway et al. 1994; Brown and Guinnane 2007), we can compare the impact of pension insurance to other factors. For example, the impact of a 1 % increase in pension insurance coverage is approximately equivalent to 10 times the impact of a 1 % increase in education (proxied by the share of recruits with at least basic schooling) and to half the impact of 1 additional person per building (which is our proxy for urbanisation).

Considering that the impact of social security on people’s lives has increased rather than decreased since the early nineteenth century, the impact of social security on current levels of fertility is likely to be even larger. Therefore, the impact of social security on the current ageing problem should not be underestimated. In particular in the context of strained public finances and a widespread need for structural reforms, reevaluating the design of the welfare state seems a promising area of development.

Section 2 provides institutional details on social policy in late nineteenth century Germany. Section 3 then presents the theoretical model and Section 4 derives the identification strategy from the theoretical framework, provides information on the data set as well as considerations on econometric issues. Section 5 presents a descriptive analysis and multivariate results as well as sensitivity analyses. Section 6 concludes.

2 Institutional background

The introduction of comprehensive social insurance in Germany took place between 1883 and 1891. Health insurance was introduced in 1883 and accident insurance in 1884. The law on pension insurance was adopted in 1889 and came into force in 1891.

The pension system of 1891 combined funded and pay as you go (PAYG) elements: it was neither a pure PAYG scheme nor a fully funded scheme (Scheubel 2013). While the system was based on current contributions financing current pensions, it was also supposed to accumulate a capital stock. However, there was a general fear that the government would touch the capital stock, not least because some regions appeared to run into financing difficulties. In addition, the 1891 set-up was not perceived as socially fair. As a consequence, the pension system became a fully fledged PAYG system when the law was revised in 1899, with the revision coming into effect in 1900.Footnote 2

The pension system was mandatory only for parts of the population (Scheubel 2013). For workers in specific occupational categories with an annual income below 2000 Reichsmark pension insurance was mandatory; for people in other occupations it was voluntary (Verhandlungen des Reichstages 1888).Footnote 3 As a consequence, about 20–25 % of the population were covered by pension insurance.

Pension insurance provided so-called invalidity pensions and old-age pensions. Invalidity pensions were provided if a worker was unable to work because of physical conditions; old-age pensions were provided if a worker was unable to work because of age. A worker had to prove that they met one of these conditions by either reaching the age of 70 or by proving that they were not able to earn at least the average day labourer’s wage.Footnote 4 Both invalidity pension and old-age pension were designed as a supplementary income. As average life expectancy for a boy born in Prussia between 1865 and 1867 was 32.5 years (Marschalck 1982) and average life expectancy for a child born between 1881 and 1890 in Imperial Germany was 42.3 years (Marschalck 1982), the invalidity pension was by far the more important supplementary income. Since both types of pension were paid when workers were unable to earn their income due to invalidity (either age-related or not related to age), we interpret the distinction between invalidity pensions and old-age pensions as mainly semantic. This interpretation is in line with the common understanding of ageing in the historical context, when not only biological age was considered as a qualifying condition for a pension, but also physical deficiencies (Bourdieu and Kesztenbaum 2007). In other words, the invalidity pension was the relevant pension for a worker to be considered as ‘old’ at the time.

Contribution rates only differed between wage categories, which implies that workers paid contributions according to income, but the link between income and contribution was not direct. Between 1891 and 1900, there were four wage categories. A fifth category was introduced with the revision of the law in 1899, which divided the previous category IV in two new categories.

The average old-age pension in Imperial Germany was 21.88 % of the average annual wage in rail track supervision and maintenance, and the average invalidity pension was 21.36 % of the average annual wage in that sector (Lotz 1905).Footnote 5

The pension system was administered in a decentralised way by regional authorities, the so-called Regional Insurance Agencies (Landesversicherungsanstalten). These Regional Insurance Agencies (RIAs) already administered the health insurance system and enjoyed discretion, within certain limits, in setting contribution rates and in approving pension applications.

3 Theoretical analysis of the pension system and fertility

Microeconomic theories of fertility choice were developed by Becker and others (Becker 1960, 1965, 1988, 1991, 1992; Schultz 1969; Barro and Becker 1986, 1988, 1989; Easterlin 1975; Becker and Tomes 1976; Cigno and Ermisch 1989). These approaches to an (economic) theory of fertility are often referred to as the demand model of fertility, because children are modelled as a consumption good and fertility is considered as the demand for children. In equilibrium, the marginal benefit of an additional child has to be equal to the marginal cost of rearing the child.

More recently, the microeconomic theories were related to economic growth (Barro and Becker 1989; Becker et al. 2010; Becker 1992). This provided the missing link between the microeconomic theories and the macroeconomic view on the fertility decline that was adopted by its early observers. The impact of institutions on fertility has also become a focus of economic research (e.g. McNicoll 1980; Becker and Murphy 1988; Smith 1989; Guinnane and Ogilvie 2008; Fenge and Meier 2009; Fenge and von Weizsäcker 2010). It has, however, not been discussed extensively in the context of the demographic transition in nineteenth-century Europe (see Guinnane 2011 for a review).

Our model, which provides a framework for analysing the impact of pension insurance on the first demographic transition, is also linked to the literature on the social security hypothesis. This hypothesis postulates that institutions have an impact on behaviour, such as labour supply reactions, and has also been linked to old-age provision (Caldwell 1978; Willis 1979; Bental 1989). With our model, we investigate how the introduction or expansion of public pension schemes can reduce fertility. The model combines three options to provide for old age in a simple two-period overlapping generations setting. The first option is to voluntarily accumulate savings in order to form a capital stock from which private pensions can be drawn during the retirement period. The second option is an intra-family transfer that children give to their parents because they derive utility from the well-being of their parents (altruistic preferences). The third option is to contribute to a PAYG pension systemFootnote 6 which provides a public pension in the old-age period. The decisions about fertility, savings and the intra-family transfer are endogenous. We discuss several channels through which the introduction or extension of a pension system may affect the decisions of a generation.

The set-up and results of our model are consistent with earlier findings in the literature. Boldrin et al. (2015) calculate the quantitative effects of the public provision of old-age pensions in calibrated models based on the old-age security motive for children. They find that in such models there is a sizeable negative effect on fertility, which is consistent with our empirical results. Moreover, they show that improved access to capital markets reduces the incentives for childbearing. Our model is consistent with these findings, as illustrated in Appendix A. Ehrlich and Kim (2007) also find that social security contributions and benefits generate incentives to reduce fertility, but in contrast to our model, they analyse a PAYG pension system in which the number of children affects neither labour supply nor wage income. They show that these effects cannot be fully compensated by inter-temporal or intergenerational transfers within families. Puhakka and Viren (2012) show that Beveridgean PAYG pension systems (i.e. with lump-sum contributions and pensions) also reduce fertility. In contrast, Hirazawa et al. (2014) show that in a model with specific log-linear utility functions the effect of a Bismarckian PAYG pension scheme on fertility vanishes.

3.1 The model

We consider the impact of a pension system on fertility, savings and intra-family transfers in a two-period overlapping generations model which is similar to Fenge and Meier (2005). In period t, the size of the working population is \(N_{t}\). By convention, we denote the working generation in period t as generation t. The growth of population is given by the factor \(\frac {N_{t+1}}{N_{t}}=1+\overline {n_{t+1}}\). We analyse the decisions of a household on the number of children \(n_{t}\), savings \(s_{t}\) and the intra-family transfer \(b_{t}\) in period t. Note that the number of children of an atomistic household has no effect on population growth. In other words, the fact that more children imply more contributors to the pension system is not internalised by the household, which leads to the well-known positive externality in PAYG pension systems, also known as the fiscal externality of children. The number of children in a family and the growth rate of the population only coincide in equilibrium if all households are identical.

In the first period, the labour supply of the household depends on the number of children. Children reduce the time available for labour.Footnote 7 Normalising total time to unity, working time is given by \(1-f(n_{t})\) with \(f^{\prime }(n_{t})>0\) and \(f^{\prime \prime }(n_{t})\geq 0\). Hence, the time needed for rearing children \(f(n_{t})\) increases with the number of children.Footnote 8 The wage rate is \(w_{t}\). The household pays contributions from wage income at the rate \(\tau\) into the pension system. We assume the contribution rate to be constant, which corresponds to the historical set-up which only had four wage categories. The direct cost of raising a child is \(\pi_{t}\). Furthermore, we consider an intra-family old-age provision from the children to the parents. Each grown-up child pays a transfer \(b_{t}\) in her working period to the parents in retirement.Footnote 9 Young children participate in consumption \(c_{t}\) in the first period, which is determined by the following budget constraint:

$$ c_{t}=w_{t}(1-f(n_{t}))(1-\tau )-s_{t}-\pi_{t}n_{t}-b_{t}. $$
(1)

In the second period, the household retires and consumes \(z_{t+1}\). Old-age consumption can be financed via the pension \(p_{t+1}\), the returns on savings with interest factor \(1+r_{t+1}=R_{t+1}\) and the intra-family transfer \(b_{t+1}\) paid by the children. The budget constraint in the second period is:

$$ z_{t+1}=p_{t+1}+R_{t+1}s_{t}+b_{t+1}n_{t}. $$
(2)

The old-age consumption of the parental generation t−1,

$$ z_{t}=p_{t}+R_{t}s_{t-1}+b_{t}n_{t-1}, $$
(3)

enters the utility of the children’s generation t. Since \(s_{t-1}\), \(n_{t-1}\) and \(p_{t}\) are determined in the past, the only determinant of \(z_{t}\) in period t is the intra-family transfer \(b_{t}\) paid by the household of generation t. The utility of the household depends on own consumption in both periods, on the old-age consumption of the parents and on the individual number of children. The function \(U(c_{t},z_{t+1},z_{t},n_{t})\) is increasing in all four arguments, strictly concave and additively separable, so that all cross-partial derivatives vanish: \(U_{cz}=U_{cn}=U_{z_{t+1}z_{t}}=U_{zn}=0\).

Since fertility enters the utility function, having children is induced by a consumption motive. The consumption motive is a way of modelling the intrinsic motivation for having children. Furthermore, children provide a transfer to their parents in old-age, which constitutes an investment motive for children. This investment motive is important to create a model set-up which corresponds to the set-up of pension insurance in Imperial Germany. During the first 10 years, the pension system set-up could be considered partially funded, such that we expect behavioural effects via the reduced importance of the transfer channel mainly between 1891 and 1900. We also present theoretical results on the behavioural effect of the transfer channel.

The household determines the number of children and savings by maximising utility subject to the budget constraints (1), (2) and (3). Substituting these constraints for the consumption variables in the utility function results in a maximisation problem of a function depending on \(n_{t}\), \(s_{t}\) and \(b_{t}\):

$$\begin{array}{@{}rcl@{}} \max_{n_{t},s_{t},b_{t}}V(n_{t},s_{t},b_{t}) &=&U(w_{t}(1-f(n_{t}))(1-\tau )-s_{t}-\pi_{t}n_{t}-b_{t}, \\ &&p_{t+1}+R_{t+1}s_{t}+b_{t+1}n_{t},p_{t}+R_{t}s_{t-1}+b_{t}n_{t-1},n_{t}). \end{array} $$
(4)

Hence, we can write the first-order conditions of the maximisation problem as:

$$ V_{n}=-U_{c_{t}}((1-\tau )w_{t}f^{\prime }(n_{t})+\pi_{t})+U_{z_{t+1}}\left( \frac{\partial p_{t+1}}{\partial n_{t}} +b_{t+1}\right) +U_{n_{t}}=0, $$
(5)
$$ V_{s}=-U_{c_{t}}+U_{z_{t+1}}R_{t+1}=0, $$
(6)

and

$$ V_{b}=-U_{c_{t}}+U_{z_{t}}n_{t-1}=0. $$
(7)

The second-order conditions for a maximum are satisfied (see Appendix A).

In the following, we analyse the impact of increasing the contribution rate \(\tau\) in period t on fertility \(n_{t}\) in a PAYG pension system. In order to focus the paper on this key effect, we present the other effects on savings \(s_{t}\) and the intra-family transfer \(b_{t}\) in Appendix A. Furthermore, Appendix A also presents the model results for an economy without a functioning capital market.

3.2 The effect of a Bismarckian pay as you go pension system on fertility

In a PAYG system, pensions of generation t are financed by the contributions of generation t+1. If the PAYG pension is of the Bismarckian type, the individual pension is identical to the average pension weighted by an individual factor which relates the individual pension contribution payment of a household of generation t to the generation’s average:

$$ p_{t+1}^{BIS}=(1+\overline{n_{t+1}})\tau w_{t+1}(1-\overline{f(n_{t+1})}) \frac{\tau w_{t}\left( 1-f(n_{t})\right) }{\tau w_{t}\left( 1-\overline{ f(n_{t})}\right) }, $$
(8)

where \((1-\overline {f(n_{t})})\) denotes the average labour supply of generation t and the growth factor of the population, \(1+\overline {n_{t+1}} =\frac {N_{t+1}}{N_{t}}\), is equal to the average number of children of generation t. If the individual contribution, \(\tau w_{t}(1-f(n_{t}))\), is above the average contribution, \(\tau w_{t}\left (1-\overline {f(n_{t}) }\right ) \), the individual pension, \(p_{t+1}^{BIS}\), is higher than the average pension, \((1+\overline {n_{t+1}})\tau w_{t+1}(1-\overline {f(n_{t+1})} ) \), by the same proportion. Since the wage rate and the contribution rate are identical for all households, we may write the proportionality factor as \( \frac {1-f(n_{t})}{1-\overline {f(n_{t})}}\) and call it the Bismarck factor. In equilibrium, the average population growth factor is identical to individual fertility: \(\overline {n_{t}}=n_{t}\) and, hence, average labour supply is identical to individual labour supply: \(1-\overline {f(n_{t})} =1-f(n_{t})\) in the case of homogeneous households.

In the Bismarckian case, a higher number of children reduces the pension claims proportional to the payroll growth factor \((1+\overline {n_{t+1}}) \frac {w_{t+1}}{w_{t}}\frac {1-\overline {f(n_{t+1})}}{1-\overline {f(n_{t})}}\):

$$ \frac{\partial p_{t+1}^{BIS}}{\partial n_{t}}=-(1+\overline{n_{t+1}})\tau w_{t}f^{\prime }(n_{t})\frac{w_{t+1}}{w_{t}}\frac{1-\overline{f(n_{t+1})}}{1- \overline{f(n_{t})}}<0. $$
(9)

We assume that individuals take this effect into account when deciding on fertility. In a Bismarckian system pensions are proportional to individual wage income. If raising children reduces working time, it should be obvious for rational individuals that raising children also reduces pensions.

Second period consumption is given by:

$$ z_{t+1}=(1+\overline{n_{t+1}})\tau w_{t+1}(1-\overline{f(n_{t+1})})\frac{ 1-f(n_{t})}{1-\overline{f(n_{t})}}+R_{t+1}s_{t}+b_{t+1}n_{t} $$
(10)

and the intertemporal budget by:

$$\begin{array}{@{}rcl@{}} R_{t+1}c_{t}\,+\,z_{t+1} \!&=&\!R_{t+1}\left[ \left( 1\,-\,\tau \right) w_{t}(1\,-\,f(n_{t}))\,-\,\pi_{t}n_{t}\,-\,b_{t}\right] \\ &&+(1+\overline{n_{t+1}})\frac{w_{t+1}(1\,-\,\overline{f(n_{t+1})})}{w_{t}(1\,-\, \overline{f(n_{t})})}\tau w_{t}(1\,-\,f(n_{t}))\,+\,b_{t+1}n_{t}. \end{array} $$
(11)

The marginal price of children in present value terms of period t+1 is:

$$\begin{array}{@{}rcl@{}} {\Pi}_{t+1}^{BIS} &=&R_{t+1}((1-\tau )w_{t}f^{\prime }(n_{t})+\pi_{t}) \\ &&+(1+\overline{n_{t+1}})\frac{w_{t+1}(1-\overline{f(n_{t+1})})}{w_{t}(1- \overline{f(n_{t})})}\tau w_{t}f^{\prime }(n_{t})-b_{t+1}. \end{array} $$
(12)

If this marginal price is positive, there is an interior solution to the fertility decision. We assume it to be positive in the following.
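The role of this assumption can be seen from a short rearrangement of the first-order conditions. The following is only a sketch that combines Eqs. 5, 6, 9 and 12 and adds no new assumption. Substituting \(U_{c_{t}}=R_{t+1}U_{z_{t+1}}\) from Eq. 6 into Eq. 5 and inserting the derivative from Eq. 9 gives:

$$ U_{n_{t}}=U_{z_{t+1}}\left[ R_{t+1}((1-\tau )w_{t}f^{\prime }(n_{t})+\pi_{t})-\frac{\partial p_{t+1}^{BIS}}{\partial n_{t}}-b_{t+1}\right] =U_{z_{t+1}}{\Pi}_{t+1}^{BIS}. $$

Since utility is increasing in all arguments, both \(U_{n_{t}}\) and \(U_{z_{t+1}}\) are positive, so this condition can hold with equality only if \({\Pi}_{t+1}^{BIS}>0\).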

Moreover, we denote the internal rate of return of contributions to the PAYG pension system in equilibrium by:

$$ {\Omega}_{t+1}\equiv p_{t+1}/\tau w_{t}\left( 1-f\left( n_{t}\right) \right). $$
(13)

If contribution rates are constant, as we assume, this is equal to the payroll growth factor:

$$ {\Omega}_{t+1}=\left( 1+\overline{n_{t+1}}\right) \frac{w_{t+1}}{w_{t}}\frac{ 1-\overline{f(n_{t+1})}}{1-\overline{f(n_{t})}}. $$
(14)
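As a brief check, Eq. 14 follows from inserting the Bismarckian pension of Eq. 8 into the definition in Eq. 13. The individual contribution cancels against the numerator of the Bismarck factor, and the constant contribution rate cancels as well:

$$ {\Omega}_{t+1}=\frac{p_{t+1}^{BIS}}{\tau w_{t}\left( 1-f(n_{t})\right) }=\frac{(1+\overline{n_{t+1}})\tau w_{t+1}(1-\overline{f(n_{t+1})})}{\tau w_{t}\left( 1-\overline{f(n_{t})}\right) }=\left( 1+\overline{n_{t+1}}\right) \frac{w_{t+1}}{w_{t}}\frac{1-\overline{f(n_{t+1})}}{1-\overline{f(n_{t})}}. $$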

Now, we consider the fertility decision in a PAYG pension system of the Bismarckian type. The fertility effect is given by:

$$ \frac{\partial n}{\partial \tau }=-\frac{\left\vert \begin{array}{ccc} V_{n\tau } & V_{ns} & V_{nb} \\ V_{s\tau } & V_{ss} & V_{sb} \\ V_{b\tau } & V_{bs} & V_{bb} \end{array} \right\vert }{\left\vert \begin{array}{ccc} V_{nn} & V_{ns} & V_{nb} \\ V_{sn} & V_{ss} & V_{sb} \\ V_{bn} & V_{bs} & V_{bb} \end{array} \right\vert }. $$
(15)

Due to the second-order conditions for a maximum, the denominator is negative, as shown in Appendix A. In order to determine the sign of the numerator of Eq. 15, we need the cross-derivatives of V with respect to the choice variables and the contribution rate:

$$\begin{array}{@{}rcl@{}} V_{n\tau } \!&=\!&w_{t}f^{\prime }(n_{t})U_{z}(R_{t+1}\!-{\Omega}_{t+1})+w_{t}(1\,-\,f(n_{t})) \\ &&\left[ U_{cc}((1\,-\,\tau )w_{t}f^{\prime }(n_{t})+\pi_{t})\!+U_{zz}\left( b_{t+1}\,-\,{\Omega}_{t+1}\tau w_{t}f^{\prime }(n_{t})\right) {\Omega}_{t+1}\right] , \end{array} $$
(16)
$$ V_{s\tau }=w_{t}(1-f(n_{t}))[U_{cc}+U_{zz}{\Omega}_{t+1}R_{t+1}]<0, $$
(17)

and

$$ V_{b\tau }=U_{cc}w_{t}(1-f(n_{t}))<0. $$
(18)

The numerator of Eq. 15 can be calculated as:

$$\begin{array}{@{}rcl@{}} \left\vert \begin{array}{ccc} V_{n\tau } & V_{ns} & V_{nb} \\ V_{s\tau } & V_{ss} & V_{sb} \\ V_{b\tau } & V_{bs} & V_{bb} \end{array} \right\vert \!&=&\!(R_{t+1}\,-\,{\Omega}_{t+1})\left[ w_{t}f^{\prime }(n_{t})U_{z_{t+1}}\right. \\ &&(n_{t-1}^{2}U_{c_{t}c_{t}}U_{z_{t}z_{t}}\,+\,R_{t+1}^{2}U_{c_{t}c_{t}}U_{z_{t+1}z_{t+1}}\!+n_{t-1}^{2}R_{t+1}^{2}U_{z_{t}z_{t}}U_{z_{t+1}z_{t+1}}) \\ &&+w_{t}(1\,-\,f(n_{t}))U_{c_{t}c_{t}}U_{z_{t}z_{t}}U_{z_{t+1}z_{t+1}}n_{t-1}^{2} \\ &&\left. \left( R_{t+1}((1\,-\,\tau )w_{t}f^{\prime }(n_{t})\,+\,\pi_{t})\,-\,\left( b_{t+1}\!-{\Omega}_{t+1}\tau w_{t}f^{\prime }(n_{t})\right) \right) \right]. \end{array} $$
(19)

The sign of the numerator is ambiguous and we have to consider the separate effects in turn. Using Eq. 14, the marginal price of children from Eq. 12 can be written as \(R_{t+1}((1-\tau )w_{t}f^{\prime }(n_{t})+\pi_{t})-\left( b_{t+1}-{\Omega}_{t+1}\tau w_{t}f^{\prime }(n_{t})\right)\), which is positive by assumption.

The price effect

The first summand on the RHS of Eq. 19 is the effect of the contribution rate via the marginal price of a child. It is positive for the following reason. Raising the contribution rate reduces the opportunity cost of having children in terms of foregone lifetime income. A higher contribution rate reduces the net wage income in the first period so that the opportunity cost of a child is reduced by \(w_{t}f^{\prime }(n_{t})\). Moreover, a higher contribution rate raises the pension entitlement in the second period. This implies that the reduction of the Bismarck pension due to another child increases. This increase of the opportunity cost of a child in the second period is expressed by \(\frac {{\Omega }_{t+1}}{R_{t+1}}w_{t}f^{\prime }(n_{t})\) in present values of period t. Therefore, a higher contribution rate lowers the opportunity cost of having a child in the first period, but raises the opportunity cost of having a child in the second period in terms of pension entitlements. If \(R_{t+1}>{\Omega }_{t+1}\), the total opportunity cost falls. Partial differentiation of Eq. 12 with respect to \(\tau\) shows that the price of a child decreases with a higher contribution rate:

$$ \frac{\partial {\Pi}_{t+1}^{BIS}}{\partial \tau }=-\left( R_{t+1}-{\Omega}_{t+1}\right) w_{t}f^{\prime }(n_{t})<0. $$
(20)

Since children become relatively cheaper as providers for old-age, the number of children increases.
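The derivative in Eq. 20 can be traced in one step. As a sketch, treating \(n_{t}\), \(R_{t+1}\) and \({\Omega}_{t+1}\) as given and using Eq. 14 to write the price of a child from Eq. 12 as

$$ {\Pi}_{t+1}^{BIS}=R_{t+1}((1-\tau )w_{t}f^{\prime }(n_{t})+\pi_{t})+{\Omega}_{t+1}\tau w_{t}f^{\prime }(n_{t})-b_{t+1}, $$

only two terms depend on \(\tau\):

$$ \frac{\partial {\Pi}_{t+1}^{BIS}}{\partial \tau }=\underbrace{-R_{t+1}w_{t}f^{\prime }(n_{t})}_{\text{lower net wage in period }t}+\underbrace{{\Omega}_{t+1}w_{t}f^{\prime }(n_{t})}_{\text{larger pension loss in period }t+1}. $$

Given \(f^{\prime }(n_{t})>0\), the sum is negative exactly when \(R_{t+1}>{\Omega}_{t+1}\).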

The income effect

The second summand on the RHS of Eq. 19 is the effect of the contribution rate via a change in lifetime income. This income effect reduces fertility. By using the definition of the payroll growth factor (14), the lifetime budget constraint (11) can be written as:

$$ R_{t+1}c_{t}+z_{t+1}=w_{t}(1-f(n_{t}))\left[ R_{t+1}-\tau \left( R_{t+1}-{\Omega}_{t+1}\right) \right] -\left( R_{t+1}\pi_{t}-b_{t+1}\right) n_{t}-R_{t+1}b_{t}. $$
(21)

Differentiating the RHS of Eq. 21 with respect to \(\tau\) shows that a higher contribution rate reduces lifetime income by

$$\left( R_{t+1}-{\Omega}_{t+1}\right) w_{t}(1-f(n_{t})). $$

Lifetime income is reduced because the PAYG pension system imposes an implicit tax on wage income if \(R_{t+1}>{\Omega }_{t+1}\) (e.g. Barro and Becker 1988; Fenge and Werding 2004; Sinn 2000). In this case, compulsory contributions to the pension system mean a loss in lifetime income, as investing the same amount in the capital market instead would yield a higher rate of return. The implicit wage tax rate can be written as \(\tau (R_{t+1}-{\Omega }_{t+1})>0\). A higher contribution rate raises this implicit tax and reduces lifetime income. This reduction of lifetime income is partially compensated by decreasing the number of children. The compensation per child not born is equivalent to the price of a child \({\Pi }_{t+1}^{BIS}=R_{t+1}((1-\tau )w_{t}f^{\prime }(n_{t})+\pi _{t})-\left (b_{t+1}-{\Omega }_{t+1}\tau w_{t}f^{\prime }(n_{t})\right ) >0\). Hence, due to the income effect, fertility decreases with rising contribution rates. Footnote 10
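The two channels can be illustrated numerically. The following is a minimal sketch under stated assumptions, not part of the paper's analysis: all parameter values are hypothetical, the time cost is assumed to be linear, f(n) = phi * n, and the names `Params`, `child_price`, `lifetime_income` and `slope` are introduced here only for illustration. The transfer paid to the own parents is left out because it does not depend on the contribution rate.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Params:
    """Hypothetical values chosen only for illustration (not estimates)."""
    R: float = 1.5        # interest factor R_{t+1}
    omega: float = 1.2    # internal rate of return Omega_{t+1}, assumed below R
    w: float = 1.0        # wage rate w_t
    phi: float = 0.1      # assumed linear time cost f(n) = phi * n, so f'(n) = phi
    pi: float = 0.05      # direct cost per child pi_t
    b_next: float = 0.02  # transfer b_{t+1} received per child


def implicit_tax_rate(tau: float, p: Params) -> float:
    """Implicit wage tax rate tau * (R - Omega)."""
    return tau * (p.R - p.omega)


def child_price(tau: float, p: Params) -> float:
    """Marginal price of a child: Eq. 12 with Eq. 14 substituted."""
    return (p.R * ((1 - tau) * p.w * p.phi + p.pi)
            + p.omega * tau * p.w * p.phi - p.b_next)


def lifetime_income(tau: float, n: float, p: Params) -> float:
    """Wage-related and child-related terms of the lifetime budget (Eq. 21)."""
    labour = 1 - p.phi * n
    return (p.w * labour * (p.R - tau * (p.R - p.omega))
            - (p.R * p.pi - p.b_next) * n)


def slope(func, tau: float, h: float = 1e-6) -> float:
    """Central finite difference of func with respect to tau."""
    return (func(tau + h) - func(tau - h)) / (2 * h)


p = Params()
# Price effect: change in the price of a child per unit of tau (cf. Eq. 20).
price_slope = slope(lambda t: child_price(t, p), 0.10)
# Income effect: change in lifetime income per unit of tau, for n = 2 children.
income_slope = slope(lambda t: lifetime_income(t, 2.0, p), 0.10)
print(price_slope, income_slope)
```

With these hypothetical values both slopes are negative: a higher contribution rate makes a child cheaper and at the same time lowers lifetime income. Which of the two dominates the fertility response depends on preferences, which this sketch deliberately does not model.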

The total effect on fertility

The total effect on fertility is negative if the income effect is larger than the price effect and vice versa. The scale of both effects depends on the factor \((R_{t+1}-{\Omega }_{t+1})\), with the internal rate of return assumed to be lower than the capital market interest rate. Hence, the total fertility effect is larger the smaller the internal rate of return of the pension system \({\Omega }_{t+1}\equiv p_{t+1}/\tau w_{t}(1-f(n_{t}))\). We can state:

Proposition 1

(Fertility effect) If the internal rate of return is lower than the capital market interest rate, the introduction or expansion of a pay as you go public pension scheme of the Bismarck type sets incentives to reduce (increase) the number of children if the income effect is higher (lower) than the price effect on fertility. Furthermore, the fertility effect is stronger the smaller the internal rate of return of the pension system.

From the theoretical result, we can derive the following procedure for the empirical investigation. The fertility effect, as the combination of the price effect and the income effect, is ambiguous. If the income effect is larger than the price effect, fertility declines with an introduction or extension of the PAYG pension system. Since the theoretical model does not provide a definite answer to how fertility responds to the PAYG system, we analyse the fertility effect empirically in order to establish which partial effect prevails.

We can summarise the findings in our main hypotheses:

Hypothesis 1: Total Fertility effect in a Bismarckian pay-as-you-go pension system

Under the condition \(R_{t+1}>{\Omega }_{t+1}\), if the PAYG pension system is introduced and the income effect is higher than the price effect, then fertility declines.

Hypothesis 2: Price and Income Effect

Assume \(R_{t+1}>{\Omega }_{t+1}\). Then a rising contribution rate of the PAYG pension system has two effects. The opportunity cost of a child decreases, which has a positive effect on fertility (price effect). Lifetime income decreases due to a higher implicit tax, which has a negative effect on fertility (income effect).

Hypothesis 3: Magnitude of the Fertility effect

The smaller the internal rate of return \({\Omega }_{t+1}\), the stronger are all three effects and, in particular, the fertility effect.

4 Data, identification strategy and econometric considerations

4.1 The data set

To test our hypotheses, we use regional historical data from the time of the introduction of the first comprehensive pension system and the first demographic transition. The introduction of the first comprehensive pension system in Germany towards the end of the nineteenth century is well-suited for an analysis of the impact of pension insurance on fertility because there is well-documented regional variation in key variables of the pension system which we can use for identification. In addition, fertility developments have also been well-documented for most German provinces.

The data on fertility, population and a set of control variables is taken from the Imperial Annual Yearbook of Statistics (Statistisches Jahrbuch für das Deutsche Reich).Footnote 11

The data on the pension system, first collected by Kaschke and Sniegs (2001), is taken from the Annual Reports of the RIAs. As they were largely autonomous in administering the pension system, it should not be surprising that each RIA also collected detailed statistics on how it managed the pension system.

As the states and provinces recorded in the Annual Yearbook of Statistics did not fully overlap with the set-up of the RIAs, we use the matched data set developed by Scheubel (2013). While the Annual Yearbook of Statistics provides information at the state and province level, of which there were 44 in total, some RIAs covered more than one state or province, and one state could also be covered by more than one RIA (for example large states such as the Kingdom of Bavaria). In the matched data set, regions, provinces and RIAs are matched based on their geographical location, which implies that observations for those states or provinces which are covered by one RIA are averaged and observations for those RIAs which cover a part of the same state or province are also averaged. As a consequence, the combined data set consists of 24 cross-sectional observations, of which, however, we drop one outlier, as detailed below. Figure 1 shows the regional entities in the harmonised data set.
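
The averaging step can be pictured with a small Python sketch. It assumes a simple unweighted mean within each matched entity, and the function name, unit labels and values are hypothetical placeholders rather than entries from the matched data set of Scheubel (2013).

```python
from collections import defaultdict
from statistics import mean


def average_by_entity(values, entity_of):
    """Average unit-level observations within each matched regional entity.

    values:    {unit: observation}, e.g. one birth rate per state or province
    entity_of: {unit: matched entity} (hypothetical mapping)
    Assumption: an unweighted mean; the source only states that values are averaged.
    """
    groups = defaultdict(list)
    for unit, value in values.items():
        groups[entity_of[unit]].append(value)
    return {entity: mean(vals) for entity, vals in groups.items()}


# Hypothetical example: provinces P1 and P2 are covered by the same RIA.
birth_rates = {"P1": 30.0, "P2": 34.0, "P3": 28.0}
mapping = {"P1": "E1", "P2": "E1", "P3": "E2"}
matched = average_by_entity(birth_rates, mapping)
```

The same function would apply in the opposite direction, with RIAs as units and the shared state or province as the entity.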

Fig. 1 Regions in Imperial Germany

4.2 Identification strategy

4.2.1 Mechanism

To identify an effect of social insurance on fertility, we follow our theoretical model in looking at the coverage of pension insurance. Our model suggests that the overall effect \(\frac {\partial n}{\partial \tau }\) is composed of a price effect related to the reduction in disposable income caused by contributing to the pension system and an income effect related to lower lifetime income caused by the implicit tax in the pension system. The sign of the total effect is determined by the larger of the two effects, and its size is determined by the internal rate of return of the pension system.

As the sustained decline in birth rates pictured in the left panel of Fig. 2 only appeared across all provinces long after the onset of industrialisation, we hypothesise that the overall effect of pension insurance on fertility is negative. In the framework of our model, this would mean that the income effect dominates the price effect. The fertility decline seems to have started fully only in the 1890s when pension insurance was introduced, and it gathered pace after 1899 when more pay-as-you-go elements were introduced. The challenge for our identification strategy is to choose appropriate proxies for the overall effect \(\frac { \partial n}{\partial \tau }\) at the regional level which help us to identify the overall effect of pension insurance by using regional variation.

Fig. 2 Marital birth rates and pension application approval rates. Notes: Marital birth rates for all regional entities in the data set, expressed per thousand. Rate of approved pension applications for all regional entities in the data set, expressed in %. To illustrate trends, region names are suppressed

4.2.2 Choice of proxies

Our choice of proxies is determined by the functioning of Bismarck’s pension system and by data availability. RIAs collected a battery of variables on the functioning of the pension system from which we calculate five proxies which we believe help us best to gauge the effect of pension insurance and the behavioural mechanisms underlying this effect.

As we are first and foremost interested in the total effect of pension insurance, \(\frac {\partial n}{\partial \tau }\), we use the coverage of compulsory insurance, i.e. the share of the population insured, as a proxy. Unfortunately, the contribution rate τ did not vary much across RIAs,Footnote 12 and no change in contribution rates was foreseen for at least the first 10 years. We therefore use one of the main particularities of Bismarck’s pension system to proxy the effect of τ: insurance was compulsory, but only for a certain group of people. Only those who, based on their job description, were considered workers were required to contribute to pension insurance. As the share of people classified as workers differed across regions, the share of people who had to participate in pension insurance also differed across regions. Consequently, only a certain share of the population experienced the decline in lifetime income which was caused by having to pay contributions to pension insurance, and which in the terms of our model corresponds to raising contribution rates from 0 to τ for the part of the population covered by pension insurance.

While compulsory insurance should proxy well the overall effect of pension insurance on lifetime income, we also use this variable weighted with the likelihood of receiving a pension. The decentralised set-up of the pension system introduced some uncertainty with regard to receiving a pension, which should have intensified the effect of compulsory insurance on lifetime income. As the RIAs enjoyed some discretion in the maintenance of the system’s administration, particularly with regard to the eligibility criteria for receiving a pension (Kaschke and Sniegs 2001),Footnote 13 prospective pensioners had to apply for a pension. They had to prove that they had paid contributions for at least the minimum period and that they were unable to earn a subsistence-level income. The latter criterion in particular involved considerable judgement by the RIA official dealing with the application. The right panel of Fig. 2 shows that there was considerable variation in this approval rate across provinces. As the approval rate can hence be considered as the probability of receiving a pension, we weight the share insured with the approval rate to create a refined proxy for the overall effect of pension insurance.

To test whether the mechanism suggested by our model framework is present in the data, we also construct proxies for the income effect and the price effect. To proxy the income effect, we construct a measure of the implicit tax in the pension system. The implicit tax is normally given by the difference between contributions and discounted future pension. We calculate the implicit tax as the difference between the average pension contribution per insured (i.e. \(\tau(1-f(n))w\)) and the average pension (i.e. \(p_{t}\)). To proxy the price effect, we choose a variable which captures particularly the reduction in first-period disposable income. As the average contribution per insured is inversely related to disposable income, we consider it a good proxy.

Finally, to further examine whether the mechanism suggested by our model framework is visible in the data, we construct a proxy for the internal rate of return that helps us to investigate whether the magnitude of the overall effect is affected by the internal rate of return. We proxy the internal rate of return by dividing the average future pension level by the average contribution, but we also use the ratio between current pension level and current average contribution as a proxy.
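
To make the construction of the five proxies concrete, the following Python sketch computes them for a single region and year. It is only an illustration under stated assumptions: the class, the field names and all numbers are hypothetical and are not taken from the RIA reports, and the implicit tax is the simple undiscounted difference described above.

```python
from dataclasses import dataclass


@dataclass
class RegionYear:
    """Hypothetical raw pension-system figures for one region and year."""
    insured: float           # persons in compulsory pension insurance
    population: float        # total population
    approval_rate: float     # approved / submitted pension applications, in [0, 1]
    avg_contribution: float  # average contribution per insured
    avg_pension: float       # average pension paid


def pension_proxies(obs: RegionYear) -> dict:
    """Return the five proxies described in the text for one observation."""
    share_insured = obs.insured / obs.population
    return {
        # total effect: coverage of compulsory insurance
        "share_insured": share_insured,
        # refined total effect: coverage weighted with the approval rate
        "weighted_share_insured": share_insured * obs.approval_rate,
        # income effect: average contribution minus average pension
        "implicit_tax": obs.avg_contribution - obs.avg_pension,
        # price effect: average contribution per insured
        "avg_contribution": obs.avg_contribution,
        # internal rate of return: pension level relative to contribution
        "internal_rate_of_return": obs.avg_pension / obs.avg_contribution,
    }


# Arbitrary illustrative numbers, not historical values.
example = pension_proxies(RegionYear(210.0, 1000.0, 0.8, 12.0, 9.0))
```

With these placeholder inputs, the share insured is 0.21 and the weighted share insured is 0.168. Whether the ratio uses the current or the future pension level only changes which value is passed as `avg_pension`.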

4.2.3 Identifying assumptions

As RIAs collected the data independently, we can base our identification of an effect on cross-regional variation on three main identifying assumptions. First, the effect of pension insurance needs to be observable to the population in order to instigate an immediate change in the fertility rate. Second, pension insurance (and the variables which we choose as proxies) needs to be independent of the fertility decline which took place during the second half of the nineteenth century to distinguish the effects of pension insurance. Third, the variables we would like to measure and the respective proxies we construct need to reflect regional differences in the pension system rather than other regional differences. We look at these assumptions in turn.

First, there are several reasons that lead us to assume that people in the street were able to observe that the pension system was working. If young couples perceived that the pension system was working reliably and that pensions were sufficient to make up for savings or transfers from children, they could be induced to have fewer children. Since the pension system operated locally, it is likely that people in the street could form an opinion on the coverage of pension insurance because either they were insured themselves or because they observed their fellow citizens’ participation in the pension system. Participation was observable, because weekly contributions were paid at the post office and so-called Klebemarken, a form of stamps, had to be collected in a book similar to a collector’s album. Consequently, even those not insured could observe participation rates as well as the level of average contributions. Moreover, the average pension level should have been clear from the beginning. The system entailed some transitional arrangements, which meant that pensions were paid as of 1892 to some groups of the population which were too old or otherwise unable to earn enough to accumulate the number of stamps needed for applying for a pension. In fact, the number of pensioners was particularly large in 1892. It is fair to assume that even those unrelated to a pensioner would be able to hear about the pension level from the pensioner’s co-workers. As applications for and payment of pensions were administered locally, it is also probable that people were informed about the approval rate of pension applications and, based on this, formed an opinion about the probability of receiving a pension.

Second, it is essential that the variation in the share insured, the average contribution, the average pension and the approval rate is not caused by the fertility decline itself. In particular, should changes in the birth rate drive differences in one of these key pension system variables, it would not be possible to causally relate the cross-regional differences in those variables to differences in fertility. One particular concern in this regard may be the fact that the share insured is related to having an occupation which required compulsory insurance. As insurance was intended for the ‘working class’, it would be difficult to relate any changes in fertility to pension insurance if the ‘working class’ had a fertility rate which was significantly different from other parts of the population. For example, it is likely that rural workers displayed a higher birth rate as children contributed to household income. However, the ‘working class’ as defined in the law on pension insurance included workers from all sectors of the economy, including sectors for which we would expect a lower fertility rate. For example, particularly the workers in mining were likely to have a lower birth rate since miners’ associations provided pension insurance long before the introduction of comprehensive health insurance at the union level (Jopp 2013). To further ensure that our proxy ‘share insured’ does not pick up any fertility development which might be particular to the ‘working class’, we add a variable which measures the share of workers and which has been developed by Scheubel (2013) based on the job description in the Annual Yearbook of Statistics (based on the 1895 and 1907 occupational census). If the fertility rate was different among workers, this variable should pick up this difference. 
Similarly, it is unlikely that the discretion which RIA officials exercised in approving pension applications was related to the birth rate of a particular group of people rather than to individual motives. One exception to this may have been the discrimination against Slav minorities which had been widespread in provinces with a large Slav population (Kaschke and Sniegs 2001), should fertility have been different among Slavs. This cannot be ruled out as Knodel (1974) found higher fertility rates in regions with a larger number of Poles and Galloway et al. (1994) find a significant positive effect of a large Slav population on fertility. We discuss the implications for our analysis when presenting our descriptive results below.

Third, when estimating the effect of pension insurance on fertility, we need to make sure that the effect is not confounded by other developments, such as industrialisation. It is obvious that pre-pension system differences between the states or provinces, such as the number of the elderly, the degree of migration or the level of industrialisation would affect both the birth rate and also the pension system indicators. For example, a high number of elderly ceteris paribus should result both in a lower birth rate and in a higher number of approved pensions and thus lower pensions. A higher level of industrialisation should result in more working women and thus lower fertility while it would also imply that more people would be insured as there were more workers in the industrialising areas. However, these factors only constitute a problem for identification if we cannot control for them. Hence, we have added an extensive set of proxies which lead us to assume that once controlling for the confounding factors, the variation in key pension system variables across provinces is indeed exogenous.

4.2.4 Choice of the dependent variable

We choose the crude marital birth rate (CMBR) as the main measure of fertility.Footnote 14 The fertility indices that are more widespread in the analysis of fertility in a non-historic context require information on the age of the female population, which in our data set is only available for the years 1871, 1885 and 1890. The CMBR, by contrast, is easy to compute and available as a long time series. Moreover, the CMBR is well-suited to analyse main cross-jurisdictional developments in fertility since it maps broadly the same developments as other fertility indices.Footnote 15 We also use other fertility measures to check the robustness of our model in Section 5.3.3.

4.2.5 Choice of control variables

The third identifying assumption in particular rests on an appropriate selection of control variables. We choose the variables describing both current and future consumption to reflect earlier empirical studies on the determinants of the first demographic transition.Footnote 16 The factors that have previously been found to be the main determinants of the first demographic transition (Guinnane 2011 gives a comprehensive overview; other studies are Galloway et al. 1994, 1998; Richards 1977; Brown and Guinnane 2007; and in particular Knodel 1974 for Germany) are consistent with a consumption-based model of fertility like the one we use as a motivation for our study. This should not be surprising given the fact that modern fertility theory (e.g. Becker 1960, 1965, 1988, 1991; Schultz 1969; Barro and Becker 1986, 1988, 1989; Easterlin 1975; Becker and Thomes 1976; Cigno and Ermisch 1989) has emerged from earlier, mostly empirical studies on the determinants of fertility, also in the historical context (e.g. the Princeton Fertility Project, refer to Coale 1965; Coale and Watkins 1986).

The determinants of the first demographic transition include a general (child) mortality decline which increased returns to child quality (since more children survived, the investment in their education became more valuable), which has been found to be associated with a smaller family size. Innovation in contraception and the changed availability of contraception (which was spread by urbanisation and better communication) improved the success of attempts to control fertility. As compulsory schooling laws or laws banning child labour were introduced, the direct costs of children who previously contributed to the household income rose. One factor frequently mentioned in the literature, the higher opportunity cost of children due to increased labour market participation of women, is one of the main mechanisms in our model. Finally, the introduction of comprehensive social insurance reduced the value of children as an insurance against risk.

Table 1 details how we have proxied these developments with the variables available in our data. We provide summary statistics for all variables in Table 2.

Table 1 Availability of variables
Table 2 Summary statistics

4.3 Econometric considerations

4.3.1 Estimation approach

While we would prefer estimating a panel model, not all variables have been collected by the Imperial Statistical Office for all years. The data collected for Imperial Germany by the Imperial Statistical Office is, for example, not as detailed as Prussian data, which has been used for similar analyses before (Becker and Wößmann 2009; Becker et al. 2010, 2011a; Hornung 2014). One of the reasons for the different level of detail is that information had to be harmonised for all parts of Imperial Germany, not all of which collected data as detailed as the data collected by the Prussian Statistical Office.Footnote 17 The CMBR is available for almost all years. However, variables on the demographic structure or the share of the population working in the primary, secondary or tertiary sector have only been collected during a population or occupational census.Footnote 18 In addition, not every variable has been recorded during every population census. For example, information on age structure was not collected after 1890. Occupational information was only collected in the 1871, 1882, 1895 and 1907 occupational censuses. Unfortunately, this also affects some of the pension system variables. While the level of pensions and the approval rate have been collected in almost every year and for every RIA since 1891, the share insured, which is based on occupational information, has only been collected during the occupational censuses of 1895 and 1907. Hence, our main proxy is only available for 1895 and 1907.

The main complication for our empirical specification thus arises from the fact that there is no year during which all variables are available. As our identification strategy builds on variation between jurisdictions, we consider it essential that we are able to control for province-specific effects. Given that our main proxy is available for two points in time, 1895 and 1907, we aim to use panel techniques on a panel of t=2. However, not all control variables are available for 1895 and 1907. While other authors have imputed or extrapolated values if they were missing for some variables for some years (e.g. Becker et al. 2010, 2011a, b), the possibility to do so is limited if a variable is only available for t=2 and n=23. Hence, we resort to a solution used in previous studies (e.g. Galloway et al. 1994, 1998).

In particular, we construct a panel of two periods, r and s. As our main proxy is available for 1895 and 1907, r=1895 and s=1907. For variables which are not available in r and s but for two other years, we use the first year for cross section r and the second year for cross section s. For example, information on the age structure of the population is only available for 1871, 1885 and 1890. Hence, for all proxies based on the age structure, such as the old age dependency ratio, r=1885 and s=1890. Table 1 gives an account of the years of availability for each variable we use in the model and also lists the years which we use for the construction of the two cross sections. While the data set resulting from this approach is not a clear pseudo panel consisting of two pooled cross sections because most variables are included for the same observations for two different points in time, we think of it as a pseudo panel since we do include some control variables for different points in time.

Using our pseudo panel can introduce biases if r and s are very different. We adjust those variables not expressed in percentage terms to the population size in the year from which they are taken to make the numbers comparable. Moreover, we provide several robustness checks regarding the selection of years.
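
The construction of the two cross sections can be sketched in a few lines of Python. This is an illustration under assumptions, not the code used for the paper: the variable names, the year mapping (a stand-in for the full list in Table 1) and the example values are hypothetical.

```python
# Years used for cross sections r and s, per variable.
# Hypothetical subset; the full mapping is given in Table 1.
YEARS = {
    "share_insured": (1895, 1907),
    "old_age_dependency": (1885, 1890),
}


def per_thousand(count, population):
    """Scale a raw count by the population of the year it was recorded in."""
    return 1000.0 * count / population


def build_pseudo_panel(raw, years=YEARS):
    """Stack two cross sections r and s from variables observed in different years.

    raw[variable][(region, year)] -> value
    Returns one row per (region, period), with period in {"r", "s"}.
    """
    regions = sorted({region for series in raw.values() for region, _ in series})
    panel = []
    for region in regions:
        for idx, period in enumerate(("r", "s")):
            row = {"region": region, "period": period}
            for var, var_years in years.items():
                row[var] = raw[var][(region, var_years[idx])]
            panel.append(row)
    return panel


# Arbitrary illustrative values for two hypothetical regions A and B.
raw = {
    "share_insured": {("A", 1895): 20.0, ("A", 1907): 22.0,
                      ("B", 1895): 18.0, ("B", 1907): 19.0},
    "old_age_dependency": {("A", 1885): 8.0, ("A", 1890): 8.5,
                           ("B", 1885): 7.0, ("B", 1890): 7.2},
}
panel = build_pseudo_panel(raw)
```

Each region then contributes exactly two rows, and variables recorded as counts would be passed through `per_thousand` with the population of their own reference year before stacking.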

After constructing the pseudo panel, we run a panel estimator on the two cross-sections to allow us to account for province-specific effects. To control for those unobserved province-specific effects, we use a fixed effects estimator, similar to e.g. Galloway et al. 1994, who also constructed their pseudo panel in a similar way. To account for the invariant region-specific effects, we use standard errors adjusted for some forms of serial correlation. In addition, as errors can be correlated across adjacent provinces (spatial correlation), we also use standard errors which are robust to some forms of spatial correlation.
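
For intuition, a fixed effects estimator on a balanced two-period panel can be computed by hand. The sketch below is a simplified illustration with a single regressor and a period dummy; it uses the fact that, with only two periods, region fixed effects are numerically equivalent to regressing changes on changes with an intercept. The function name and the data are hypothetical, and the adjustments to the standard errors for serial and spatial correlation are not covered.

```python
def two_period_fe(n_r, n_s, tau_r, tau_s):
    """Fixed-effects slope and period effect for a two-period panel, one regressor.

    n_r, n_s:     outcome per region in periods r and s
    tau_r, tau_s: regressor per region in periods r and s
    With T = 2, region fixed effects plus a period dummy are equivalent to an
    OLS regression of the change in n on the change in tau with an intercept.
    """
    dn = [s - r for r, s in zip(n_r, n_s)]
    dtau = [s - r for r, s in zip(tau_r, tau_s)]
    k = len(dn)
    mean_dn = sum(dn) / k
    mean_dtau = sum(dtau) / k
    sxy = sum((x - mean_dtau) * (y - mean_dn) for x, y in zip(dtau, dn))
    sxx = sum((x - mean_dtau) ** 2 for x in dtau)
    beta = sxy / sxx
    period_effect = mean_dn - beta * mean_dtau  # intercept of the differenced model
    return beta, period_effect


# Artificial data generated with region effects (30, 35, 33),
# a period effect of -2 and a slope of -0.5.
beta, period_effect = two_period_fe(
    n_r=[20.0, 24.0, 24.0], n_s=[17.5, 21.0, 22.0],
    tau_r=[20.0, 22.0, 18.0], tau_s=[21.0, 24.0, 18.0],
)
```

On this artificial example the function recovers the slope and the period effect used to generate the data, although the region effects differ across regions.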

4.3.2 Model specifications

In line with our identifying assumptions and corresponding to our theoretical considerations, we estimate a model in which the share insured and the main determinants of the fertility decision enter our econometric model additively. Our empirical specification reads:

$$ n_{i,t} = y_{0} + T_{t} + \beta_{\tau}\tau_{i,t} + \boldsymbol{x _{i,t}}\boldsymbol{\beta}_{x} + \boldsymbol{z_{i,t}} \boldsymbol{\beta}_{z} + \alpha_{i} + \varepsilon_{i,t}. $$
(22)

Note that this specification corresponds in spirit to our model (e.g. Eq. 4), assuming that the fertility decision is determined by the four main elements of the utility function: the pleasure of having children and the pleasure of supporting the elderly, the impact on current consumption, and the impact on future consumption. The measure \(n_{i,t}\) refers to the crude marital birth rate (CMBR) (or, in our sensitivity analysis, to the Marital Fertility Index, MFI, and the Total Fertility Index, TFI) in jurisdiction \(i\) in year \(t\). \(T_{t}\) is a time-specific effect, i.e. in most specifications a dummy for the year 1907. We capture the overall effect of pension insurance by the share insured, which we label \(\tau_{i,t}\). Hence, the coefficient \(\beta_{\tau}\) should give an estimate of the total effect, i.e. of \(\frac {\partial n}{\partial \tau }\) for the case when the share insured is raised from 0 to \(\tau\). \(\boldsymbol{x_{i,t}}\) is a vector of demographic variables which affect the pleasure of having children, the pleasure of supporting the elderly, as well as current and future consumption (including the consumption of the elderly who are part of the household); \(\boldsymbol{z_{i,t}}\) is a vector of variables related to industrialisation which affect current and future consumption; \(\alpha_{i}\) refers to time-invariant region-specific effects and \(\varepsilon_{i,t}\) is an i.i.d. error term.
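
Because the panel has only the two periods r and s, it may help to write specification (22) in differences. Subtracting the equation for period r from the equation for period s, under the assumption that the coefficients are the same in both periods, removes the constant and the region-specific effect:

$$ n_{i,s} - n_{i,r} = (T_{s} - T_{r}) + \beta_{\tau}(\tau_{i,s} - \tau_{i,r}) + (\boldsymbol{x_{i,s}} - \boldsymbol{x_{i,r}})\boldsymbol{\beta}_{x} + (\boldsymbol{z_{i,s}} - \boldsymbol{z_{i,r}})\boldsymbol{\beta}_{z} + (\varepsilon_{i,s} - \varepsilon_{i,r}). $$

In this form, \(\beta_{\tau}\) relates the change in the birth rate within a region to the change in its share insured, and the difference in the time effects acts as a common intercept.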

To test the robustness of our results, we reproduce specification (22) with a different proxy for the overall effect of pension insurance, the share insured weighted with the approval rate, as shown in Eq. 23. The approval rate is denoted by \(a\) and the weighted share insured is denoted by \(\tau^{w}\):

$$ n_{i,t} = y_{0} + T_{t} + \beta_{\tau}\tau_{i,t} + \beta_{\tau^{w}}\tau^{w}_{i,t} + \beta_{a} a_{i,t} + \boldsymbol{x_{i,t}}\boldsymbol{\beta}_{x} + \boldsymbol{z_{i,t}}\boldsymbol{\beta}_{z} + \alpha_{i} + \varepsilon_{i,t}. $$
(23)

To test the relevance of the income and price effects, we estimate a model with the implicit tax, denoted by \(t_{i,t}\), and the average contribution, denoted by \(c_{i,t}\), as additional explanatory variables:

$$ n_{i,t} = y_{0} + T_{t} + \beta_{\tau}\tau_{i,t} + \beta_{t}t_{i,t} + \beta_{c}c_{i,t}+ \boldsymbol{x_{i,t}}\boldsymbol{\beta} _{x} + \boldsymbol{z_{i,t}}\boldsymbol{\beta}_{z} + \alpha_{i} + \varepsilon_{i,t}. $$
(24)

Finally, to evaluate the impact of the internal rate of return, we add the proxy for it, denoted by \(\Omega\), as well as an interaction term with the share insured, denoted by \(\Omega^{\tau}\), to the model:

$$ n_{i,t} = y_{0} + T_{t} + \beta_{\tau}\tau_{i,t} + \beta_{\Omega}{\Omega}_{i,t} + \beta_{{\Omega}^{\tau}}{\Omega}^{\tau}_{i,t}+ \boldsymbol{x_{i,t} }\boldsymbol{\beta}_{x} + \boldsymbol{z_{i,t}} \boldsymbol{\beta}_{z} + \alpha_{i} + \varepsilon_{i,t}. $$
(25)

5 Results

5.1 Descriptive analysis

A sustained fertility decline started in Imperial Germany only during the 1890s, which is also when the pension system was introduced (refer to Fig. 2). The decline became particularly pronounced around 1900 when the pension system was turned from a partially funded system into a full pay-as-you-go system. In our sample, the crude marital birth rate fell from more than 33 births per thousand in 1895 to less than 30 births per thousand in 1907.

This decline in birth rates is correlated with the change in the share insured. While the average share of the population which was insured in pension insurance only rose marginally from 21.3 to 21.6 % between 1895 and 1907, this small difference hides substantial increases in some provinces. The left panel of Fig. 3 shows that the change in the share insured between 1895 and 1907 ranged between −4 and 3 percentage points, but in most provinces the share insured rose by between 1 and 2 percentage points. Figure 3 also highlights the negative cross-regional correlation between the change in the share insured and the change in the crude marital birth rate and hence supports our hypothesis that the pension system has contributed to the decline in birth rates.

Fig. 3 Changes in marital birth rates and the share insured. Birth rates expressed per thousand. Change in share insured expressed in percentage points. OP = Ostpreußen; WP = Westpreußen; BG = Brandenburg; PM = Pommern; PS = Posen; SL = Schlesien; SA = Sachsen-Anhalt; SH = Schleswig-Holstein; HV = Hannover; WF = Westfalen; HN = Hessen-Nassau; RL = Rheinland; BY = Bayern; PZ = Pfalz; KS = Königreich Sachsen; WB = Württemberg; BA = Baden; HE = Hessen; MB = Mecklenburg; TH = Thüringen; OL = Oldenburg; BR = Braunschweig; HA = Hansestädte; EL = Elsaß-Lothringen

The correlation between the change in the share insured and the change in crude marital birth rates would be even stronger if we disregarded the most rural provinces. The Kingdom of Bavaria (Bayern) is the most obvious outlier in this respect, being largely rural and one of the largest provinces. Also most provinces in the Eastern part of Prussia were largely rural: Posen, Westpreußen and Ostpreußen.

In addition, the left panel of Fig. 3 shows that the change in birth rates was not significantly different in those East Prussian provinces with large Slav minorities. The provinces with large Slav minorities are highlighted. While the change in birth rates in the provinces with large Slav minorities has been at the lower end of the range, they are not significantly below the range of the other provinces.

To lend further support to our hypothesis that the introduction of pension insurance contributed to the fertility decline in Imperial Germany, we plot the change in the weighted share insured against the change in marital birth rates in the right panel of Fig. 3. The panel shows that the negative correlation persists when the share insured is weighted with the approval rate.

However, the right panel of Fig. 3 also shows that in terms of weighted share insured, the East Prussian provinces with large Slav minorities differed significantly from other provinces. While the change in the birth rate was not significantly lower than in other provinces, the change in the weighted share insured was at the lower end of the range in those provinces with large Slav minorities. This supports the observation by Kaschke and Sniegs (2001) that RIA officials in those provinces with large Slav minorities discriminated against Slavs when deciding on the approval of a pension application. We acknowledge this in our multivariate analysis by excluding Ostpreußen as the most obvious outlier and by checking the robustness of our results with regard to excluding the provinces with large Slav minorities altogether.

5.2 Multivariate analysis

Our multivariate analysis indicates that the negative relationship between the share insured and the birth rate persists when controlling for other determinants of the first demographic transition. Table 3 shows four specifications to illustrate that the overall effect of pension insurance on fertility was negative. Two additional specifications confirm that our model framework is applicable to the data.

Table 3 Determinants of the first demographic transition

Column (1) is equivalent to the left panel of Fig. 3 and confirms the significant correlation between the change in the CMBR and the change in the share insured. The coefficient indicates that an increase in enrolment in pension insurance by 1 percentage point is associated with an average reduction of fertility by approximately 0.54 marital births per thousand. This is quite substantial considering that the standard deviation of the share insured in our sample is 3 %, the standard deviation of marital births per thousand is 4.3, and marital births per thousand only fell from 33.38 in 1895 to 29.79 in 1907.
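
The orders of magnitude can be checked with a short back-of-the-envelope calculation in Python. It uses only the figures quoted above and assumes that the standard deviation of the share insured is measured in percentage points. The ratios are a reading aid, not a decomposition of the fertility decline.

```python
coef_share_insured = -0.54  # births per thousand per percentage point, column (1)
sd_share_insured = 3.0      # standard deviation of the share insured (assumed pp)
sd_cmbr = 4.3               # standard deviation of marital births per thousand
cmbr_1895, cmbr_1907 = 33.38, 29.79

# Change in the birth rate associated with a one-standard-deviation
# higher share insured.
effect_one_sd = coef_share_insured * sd_share_insured
observed_decline = cmbr_1907 - cmbr_1895

print(round(effect_one_sd, 2))                     # -1.62
print(round(effect_one_sd / sd_cmbr, 2))           # -0.38
print(round(effect_one_sd / observed_decline, 2))  # 0.45
```

A one-standard-deviation higher share insured thus corresponds to roughly 1.6 fewer marital births per thousand, which is a little more than a third of the standard deviation of the birth rate and a little less than half of the average decline observed between 1895 and 1907.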

Columns (2) and (3) show that our results continue to hold if we add control variables, the coefficients on which are in line with standard demographic transition theory. Column (2) adds basic demographic information. As the consumption value of children typically rises with marriage, we expect that 1 marriage per thousand leads to approximately 1 more birth per thousand. By contrast, the need to care for the elderly, which we proxy by the old age dependency ratio, reduces disposable income and may thus lead a couple to have fewer children. Finally, Catholicism has been associated with higher birth rates. We look at the number of Protestant inhabitants in a province relative to the number of Catholic inhabitants. Correspondingly, we expect this proxy to have a negative effect on fertility.

Adding demographic information in column (2) confirms that the negative effect of pension insurance persists. However, contrary to our expectation, the coefficient on marriages is not significant in specification (2). This may be related to unobserved correlation with variables which are not included in the specification in column (2), since the coefficient turns significant and is of the expected magnitude in our sensitivity analyses in Table 4. Similarly, neither the old age dependency ratio nor the share of Protestants is significant, which also suggests potential omitted variable bias from other determinants of the first demographic transition which are not included in column (2).

Table 4 Sensitivity: policy effects

Hence, we add in column (3) proxies for industrialisation which have been found to be key determinants of the first demographic transition. These include a measure of the share of workers developed by Scheubel (2013), as discussed above, the gender imbalances ratio, which is a proxy for migration (refer to Table 1), the share of recruits with at least basic schooling to measure the diffusion of education, the share of revenues in contribution category I relative to the other categories to proxy the share of working women,Footnote 19 the lagged number of persons per building to proxy urbanisation, the share of the population working in trade to proxy the diffusion of knowledge and an index measuring the average harvest per hectare based on data for five different types of crops (refer to Table 1).Footnote 20 Adding information on industrialisation in column (3) confirms the negative effect of pension insurance on fertility in addition to confirming the main determinants of the first demographic transition. While the share of workers in a province is not significant,Footnote 21 the gender imbalances ratio has a significantly negative effect on the birth rate; an increase in the gender imbalances ratio by 10 % is associated with a decrease of 1.7 births per 1000.

The proxy for urbanisation indicates a negative and significant impact on the birth rate which is fairly consistent also across other specifications. It suggests that an additional person in a building is associated with a reduction of approximately 1 marital birth per thousand.

The proxy for the diffusion of knowledge is highly significant in specification (3) and suggests that an additional 1 % of the population working in trade is associated with 0.5 fewer births per thousand. While this variable is not significant in all specifications in Table 3, its negative effect is confirmed in the sensitivity analysis in Table 4.

It is reasonable to assume that not only pension insurance changed people’s behaviour, but that in fact the major game changer was the whole package of social insurance introduced at the time. Therefore, it would make sense to assume that other insurance like health care coverage should also have an effect on fertility. Hence, we also add a measure of health care coverage in column (3): the share of the population covered by the previously introduced health insurance. In fact, the coefficient suggests that health insurance coverage has a positive effect on births. This may be related to health insurance reducing the mortality of both mothers and children.

That being said, the insight we gain from column (3) is an important one: our model confirms previous findings from the demographic transition literature, but it also shows that pension insurance had a significant additional impact. For example, column (3) implies that the effect of an increase of the share insured by 1 % is approximately equivalent to an increase in the gender imbalances ratio by 3 %.
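As a rough consistency check, and assuming the effects are linear over this range, the two figures quoted for column (3) can be combined. An increase in the gender imbalances ratio by 10 % is associated with 1.7 fewer births per thousand, so an increase by 3 % corresponds to

\[ \frac{1.7}{10} \times 3 \approx 0.5 \ \text{births per thousand,} \]

which is the order of magnitude that this comparison implies for a 1 % increase in the share insured.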

In column (4), we confirm the negative effect of pension insurance using a different proxy, which shows that the significant negative coefficient on the share insured reflects pension insurance rather than, e.g., some particular characteristics of the group of insured. In particular, we add the weighted share insured, which is the share insured weighted by the approval rate. As we also have to add the approval rate itself, the total marginal effect of the share insured can be derived as \(\beta_{\tau} + \beta_{\tau^{w}} \cdot a\), where \(a\) denotes the approval rate. If all pension applications were approved, i.e. if \(a = 1\), the total effect of the share insured would amount to a reduction in the birth rate of 1.05 per thousand. At the average approval rate, the total marginal effect of the share insured is equivalent in magnitude to the effect of the unweighted share insured in column (3).
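The way a total marginal effect is assembled from a main coefficient and an interaction coefficient can be sketched in a few lines of Python. This is only an illustration: the function name and the two coefficient values are hypothetical placeholders and are not the estimates reported in Table 3.

```python
def total_marginal_effect(beta_main: float, beta_interaction: float, moderator: float) -> float:
    """Marginal effect of x when the regression contains x and x * moderator.

    For y = ... + beta_main * x + beta_interaction * (x * moderator) + ...,
    the derivative with respect to x is beta_main + beta_interaction * moderator.
    """
    return beta_main + beta_interaction * moderator


# Hypothetical coefficients for illustration only (NOT taken from the paper).
BETA_TAU = -0.1      # coefficient on the share insured
BETA_TAU_W = -0.5    # coefficient on the approval-weighted share insured

# Evaluate the effect at several approval rates a between 0 and 1.
for a in (0.0, 0.5, 1.0):
    effect = total_marginal_effect(BETA_TAU, BETA_TAU_W, a)
    print(f"approval rate {a:.1f}: marginal effect {effect:+.2f}")
```

The same function applies to any specification with a single interaction term, provided the moderator is evaluated at the value of interest (for example its sample mean).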

In addition, we illustrate in column (5) that the underlying behavioural mechanisms are in line with the framework of our model. To the basic specification from column (3) we add two variables which we consider the best available proxies for an income effect and a price effect. In line with our expectations, adding the two proxies reduces the coefficient on the share insured, and both proxies are significant and of the expected sign. However, the coefficient on the price effect proxy is larger in magnitude than the coefficient on the income effect proxy, and the coefficient on the share insured is only halved and remains significant. Hence, we consider specification (5) a confirmation of our model framework, subject to two caveats. First, the model framework may have its limitations in explaining the dynamics of Bismarck’s pension system, which after all was a mixed system for the first 10 years of its existence. Second, the proxies may be imperfect and may pick up unobserved differences in the pension system which we cannot measure. For example, it cannot be ruled out fully that a higher contribution per insured picks up a higher wage level in a province.

In column (6), we test an additional element of our model which tells us that the overall effect of pension insurance should be stronger the lower the internal rate of return of the pension system (hypothesis 3). We add the proxy for the internal rate of return as well as an interaction with the share insured to our baseline model. Both the share insured and the interaction term are significant in this specification. Again, the coefficient on the share insured can be calculated as \(\beta_{\tau} + \beta_{\Omega\tau}\,\Omega\). At the average level of the internal rate of return, in our sample 103.721, the total effect of the share insured on the birth rate would be equivalent to a reduction by 0.4 births per thousand at a 1 % increase in the share insured, which corresponds to the coefficient in our baseline specification.

5.3 Sensitivity

5.3.1 Estimation approach

While it may seem straightforward to use a fixed effects estimator with standard errors adjusted for serial correlation for the case presented in this paper, we illustrate a comparison with a simple OLS model and with a model in first differences in the supplementary Appendix BB.Footnote 22
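To make the difference between the three estimators concrete, the following minimal sketch computes a pooled OLS slope, a fixed effects (within) slope and a first-difference slope for a single regressor. Everything in it is an assumption for illustration: the data are a synthetic, balanced two-period toy panel with no period effects, the function names are hypothetical, and standard errors are not computed. It is not the specification used in the paper.

```python
from collections import defaultdict


def slope_through_origin(x, y):
    """OLS slope of y on x without an intercept."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def _rows_by_unit(units):
    """Map each unit to the row indices that belong to it, in input order."""
    rows = defaultdict(list)
    for i, unit in enumerate(units):
        rows[unit].append(i)
    return rows


def pooled_ols_slope(x, y):
    """Slope from pooled OLS with a common intercept (ignores unit effects)."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    return slope_through_origin([a - mean_x for a in x], [b - mean_y for b in y])


def fixed_effects_slope(units, x, y):
    """Within estimator: demean x and y by unit, then regress."""
    xd, yd = list(x), list(y)
    for idx in _rows_by_unit(units).values():
        mean_x = sum(x[i] for i in idx) / len(idx)
        mean_y = sum(y[i] for i in idx) / len(idx)
        for i in idx:
            xd[i], yd[i] = x[i] - mean_x, y[i] - mean_y
    return slope_through_origin(xd, yd)


def first_difference_slope(units, x, y):
    """First-difference estimator; assumes exactly two time-ordered rows per unit."""
    groups = _rows_by_unit(units).values()
    dx = [x[idx[1]] - x[idx[0]] for idx in groups]
    dy = [y[idx[1]] - y[idx[0]] for idx in groups]
    return slope_through_origin(dx, dy)


# Synthetic toy panel: y = alpha_i - 0.5 * x, with alpha_i rising in x across units.
units = ["A", "A", "B", "B", "C", "C"]
x = [10.0, 14.0, 20.0, 26.0, 30.0, 32.0]
y = [35.0, 33.0, 40.0, 37.0, 45.0, 44.0]

print("pooled OLS:       ", pooled_ols_slope(x, y))
print("fixed effects:    ", fixed_effects_slope(units, x, y))
print("first differences:", first_difference_slope(units, x, y))
```

In this toy panel the unit effects are correlated with the regressor, so pooled OLS yields a positive slope although the within-unit relationship is negative, while the fixed effects and first-difference slopes both recover it. With richer specifications (period effects, unbalanced panels, more than two periods) the estimators generally differ.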

Assuming that the province-specific unobserved effects are well captured in a fixed effects model, the model may nevertheless not sufficiently control for spatial correlation. For example, if the decline in birth rates is correlated across adjacent provinces, this will lead to a correlation between the province-specific effects \(\alpha_{i}\) and the error term \(\varepsilon_{i,t}\). One option to deal with this potential endogeneity issue is to introduce a spatial lag and adjust the standard errors accordingly (e.g. Anselin 1988). Another option is to correct standard errors using non-parametric techniques (e.g. Driscoll and Kraay 1998; Conley and Molinari 2007). However, given the small sample size and the limited effective time dimension (\(T=2\)), these methods cannot be applied to our main specification. At the same time, when running the basic model only with the variables which are available for more than just a few periods (such as marriages, agricultural productivity, education, share of contributions in category I), implementing a Driscoll and Kraay (1998) adjustment of the standard errors gives coefficients of broadly the same magnitude as our small-sample model.

5.3.2 Other policy changes

Since the late 1890s and the early 1900s were a time of industrial change, but also of cultural and political changes, it is important to rule out that we measure effects other than the one of pension insurance. There are three major changes which are of particular interest. First, the pension system was reformed in 1899; the law came into effect in 1900. This change turned the previously mixed system into a full pay as you go system. A new contribution category was introduced. In addition, a new financial equalisation scheme between RIAs was introduced. Second, in 1903, there was a major amendment to child labour laws (Boentert 2007) which rendered children more costly in the sense that stricter child labour laws reduced the scope for current consumption as children went to school instead of contributing to household income.Footnote 23 Third, in 1903 and particularly in 1904, the earlier introduction of a financial equalisation scheme between RIAs prompted the Federal Insurance Agency to conduct a review of RIAs’ code of conduct (Kaschke and Sniegs 2001) which may have led to more restrictive approval practices and thus have lowered the probability of receiving a pension.

To test whether the coefficients differ markedly when we do not use 1907 as the reference year, we run our baseline specification (column (3) in Table 3) comparing the year 1895 to the key years 1899, 1900, 1903 and 1904. As a placebo check, we also compare the year 1904 to the year 1907. For this sensitivity analysis, too, we construct a panel of \(T=2\) from pooled cross sections. As some control variables are only available for 2 years, we have to use the same observations for all pseudo panels for these control variables, as indicated in the notes to Table 4. In particular, this applies to the share insured, the share of Protestants, the share of workers, and the share working in trade.
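The construction of these two-period pseudo panels can be sketched as follows. The year pairs are those named in the text; the record layout, the province labels and the function names are hypothetical, and the sketch does not reproduce the reuse of controls that are only observed in 2 years.

```python
BASE_YEAR = 1895
KEY_YEARS = (1899, 1900, 1903, 1904)
PLACEBO_PAIR = (1904, 1907)


def comparison_pairs():
    """Year pairs in column order: base year against each key year, then the placebo pair."""
    return [(BASE_YEAR, year) for year in KEY_YEARS] + [PLACEBO_PAIR]


def pseudo_panel(records, pair):
    """Keep only the rows observed in one of the two years of the pair."""
    return [row for row in records if row["year"] in pair]


# Hypothetical pooled cross sections: one row per province and year.
records = [
    {"province": province, "year": year}
    for province in ("A", "B")
    for year in (1895, 1899, 1900, 1903, 1904, 1907)
]

for column, pair in enumerate(comparison_pairs(), start=1):
    panel = pseudo_panel(records, pair)
    print(f"column ({column}): years {pair[0]} vs {pair[1]}, {len(panel)} rows")
```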

Column (1), which compares years 1895 and 1899, indicates a significant negative effect of the share insured on the birth rate. However, the coefficient is not as large as in column (3) in Table 3. This gives some support to the view that at least half of the effect we have seen in column (3) of Table 3 is driven by the introduction of pension insurance.

The coefficient on the share insured narrowly misses significance in column (2), which compares years 1895 and 1900, while it is marginally significant and comparable in magnitude to the baseline specification when we compare years 1895 and 1903 in column (3). This effect persists when comparing years 1895 and 1904 in column (4). While the significant coefficient in column (3) could be interpreted as child labour laws having an effect on household disposable income and thus reinforcing the impact of compulsory pension insurance, the evidence from column (4) suggests that the review of the code of conduct of RIAs in 1903/1904 may have raised awareness about the pension system among the population and hence may have intensified any behavioural reaction. The placebo comparison in column (5), which compares years 1904 and 1907, confirms that the main effect we show in our baseline model is driven by the years before 1904.

The magnitude of the control variables in Table 4 is broadly in line with our baseline specification. Similar to the baseline specification, marriages have a positive effect on the birth rate while a higher relative share of Protestants is associated with a lower birth rate. Migration, education, female labour force participation and a high share of people working in trade also reduce the birth rate. The coefficient on the share of workers is positive in some of the specifications in Table 4 which is in line with our initial hypothesis. While the proxy for urbanisation, the number of people per building, has the expected negative effect in most specifications, we suspect that the positive effect in column (1) may be related to the lag being too small for the years used in that specification.Footnote 24 While the coefficient on the crop yield index is consistent with the baseline model in the specification in which it is significant, we relate the inconclusive behaviour of this variable in Table 4 to the fact that we had to extrapolate some values, particularly around 1900.

5.3.3 Measuring fertility

While we have already discussed that the CMBR is a meaningful measure of fertility, especially in the historical context, we show that other measures of fertility give comparable results for the years 1885 and 1890, for which we can compute these alternative measures. Typical fertility indices used in cross-country studies are the total fertility rate (TFR), or the Total Fertility Index (TFI) and the Marital Fertility Index (MFI) developed by Coale (1965, 1969), which are slightly more sophisticated as they take into account natural fertility.Footnote 25
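For readers unfamiliar with Coale's indices, the following sketch shows the usual construction: observed births are divided by the births that would occur if women in each five-year age group (15 to 49) experienced a natural fertility schedule. The schedule below is the Hutterite schedule as commonly cited for these indices and should be verified against Coale's tables before use; the population counts and the function name are hypothetical.

```python
# Hutterite marital fertility rates (births per woman per year) for the age
# groups 15-19, 20-24, ..., 45-49, as commonly cited for Coale's indices.
HUTTERITE_RATES = (0.300, 0.550, 0.502, 0.447, 0.406, 0.222, 0.061)


def coale_index(births, women_by_age_group):
    """Observed births relative to births expected under the natural fertility schedule."""
    if len(women_by_age_group) != len(HUTTERITE_RATES):
        raise ValueError("expected one count per five-year age group from 15 to 49")
    expected = sum(w * f for w, f in zip(women_by_age_group, HUTTERITE_RATES))
    return births / expected


# Hypothetical counts for one province (NOT data from the paper).
all_women = [1000] * 7          # women per age group
married_women = [100, 400, 700, 800, 800, 750, 700]

tfi = coale_index(births=900, women_by_age_group=all_women)          # overall index
mfi = coale_index(births=850, women_by_age_group=married_women)      # marital index
print(f"TFI = {tfi:.3f}, MFI = {mfi:.3f}")
```

The overall index uses all births and all women, while the marital index uses marital births and married women only. Both require the age distribution of women, which is why they can only be computed for years with suitable census information.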

One caveat to looking at other measures of fertility is that we cannot include the share insured and the proxy for the share of working women from our baseline specification. As pension insurance had not yet been introduced, these variables are not available for the years 1885 and 1890. However, a regression using only the fertility determinants available for 1885 and 1890 is broadly in line with our baseline model and helps to illustrate that the use of the CMBR instead of more sophisticated fertility indicators yields reliable results.

Table 5 shows such a specification which compares years 1885 and 1890. Column (1) shows the CMBR and column (2) the crude birth rate (CBR). Column (3) shows the corresponding MFI and column (4) the corresponding TFI.

Table 5 Sensitivity: measuring fertility

It is obvious that models (1) and (2) as well as models (3) and (4) are comparable in terms of the variables which they confirm as important determinants of fertility. Therefore, we conclude that using the CMBR in our model gives results that do not need to be qualified by the fact that we cannot control for the age structure of mothers.

To show that the similarity is not driven by the specification of the model being too inflexible, we additionally provide evidence that the model identifies different determinants in case the dependent variable measures something different: the column denoted Illeg. rate shows the model predictions when using the share of non-marital births as dependent variable. The results in this column differ from the other columns where we expect them to, e.g. marriages reduce the illegitimacy rate, but the illegitimacy rate is not affected if a large share of wives and husbands is separated due to migration.

6 Conclusions

Our paper provides a theoretical underpinning and an empirical confirmation of the negative relationship between statutory old-age insurance and fertility. We thereby provide further evidence on a well-known theoretical concept in public economics, the social security hypothesis. In addition, we use a historical data set to show that a negative relationship between pensions and fertility can already be observed for late nineteenth century Germany. More broadly, our analysis is a confirmation of the fact that people react to institutional incentives.

The theoretical model, adding to the literature on overlapping generations models, highlights the effects of compulsory contributions to a pension system on fertility when labour supply is endogenous. Our crucial assumption is that labour supply is reduced when a household has children, which translates into an opportunity cost of having children in terms of forgone lifetime income. This gives rise to two counterbalancing effects in equilibrium: (i) a higher contribution rate to the pension system reduces this opportunity cost, leading to a positive effect on fertility (which we name the price effect) and (ii) a higher contribution rate to the pension system lowers lifetime income to the extent that there is an implicit tax in the pension system, leading to a negative effect on fertility (which we name the income effect). While the sign of the overall effect is determined by the larger of the two, the size of the overall effect is determined by the internal rate of return of the pension system.

Our empirical results confirm that a higher enrolment rate in pension insurance leads to a lower fertility rate. We use a historical data set which covers the introduction of the Bismarckian pension system at the end of the nineteenth century in Imperial Germany. This data set allows us to exploit cross-jurisdictional variation in the regional enrolment rate for identification.

The results are robust even when controlling for other determinants of the first demographic transition, confirming the residual effect of pension insurance on the fertility decline. When controlling for those determinants, an increase of the share insured by 1 % translates into a total reduction of approximately 0.5 marital births per thousand. This corresponds to a contribution of 15 % of the total decline in birth rates between 1895 and 1907.

Because our analysis only covers the time span 1895–1907, we cannot account for the longer term impact of pension insurance on people’s behaviour. After all, behavioural change mostly takes place gradually. It should, however, not be surprising that nowadays most individuals do not consider old-age provision as a motive for having children. The state had assumed this task long ago. Given that the direct effect of pensions on fertility amounted to almost 15 % of the overall decline between 1895 and 1907, the contribution of statutory pension insurance to the overall decline of fertility up to the current date must be even larger.