1 Introduction

Whether and how agricultural productivity influences long run development is an important question in the literature on growth and development (e.g. Schultz 1953; Lewis 1954; Rostow 1960). Theoretically, increased agricultural productivity would stimulate the transition to a modern, industrial economy in the context of a closed economy. Yet, in open economies increased agricultural productivity would reinforce specialization in agriculture and delay the development of a modern economy (Matsuyama 1992; Galor and Mountford 2008).

In this paper, we provide new evidence on this question by considering the introduction of clover into Denmark,Footnote 1 a country which has often been referred to as a case of “development through agriculture” (Lampe and Sharp forthcoming). Our main analysis considers the effect of the introduction of clover on the populations of market towns, which relied on the surrounding local market for their food supply and had a monopoly on the local agricultural trade until trade was liberalized in 1857. For these reasons, the market towns and their local markets arguably had strong similarities with a closed economy, at least until they lost their local monopolies. Our analysis also examines the extent to which the effect persists during the late nineteenth and early twentieth centuries and some of the channels through which clover affected the agricultural sector.Footnote 2

Our main analysis finds that the introduction of clover had a positive impact on market town populations at least until 1870, and the estimated effects—although less precise—are similar for later years. Additional analyses show that clover increased the local number of cows, grain yields and human capital. Therefore, as the introduction of clover increased cow herds, it is also likely to have facilitated the later take-off of a modern dairy sector. The modern sector utilized centrifuges and steam power, and arguably played a central role in the Danish economic take-off (Henriksen et al. 2012; Jensen et al. 2018) as emphasized by Danish economic historians (Lampe and Sharp forthcoming). Clover may increase agricultural productivity in two principal ways. First, clover serves to increase nitrogen supply in the soil, which increases crop yields (e.g. Kjærgaard 1995). In fact, the supply of nitrogen governs the yields of crops, such as wheat, barley, and rye, when they have enough water.Footnote 3 Second, clover provides excellent animal fodder, which allows for a larger cattle population and an increased production of milk and butter.

Thus, our evidence suggests that increased agricultural productivity was a positive force for long run development in an economy that liberalized and became more open. Our evidence also indicates that the introduction of clover is indeed one of the “success stories of agriculture as the basis for the beginning of the development process” (World Bank 2007). More broadly, the paper also contributes to the understanding of the degree to which the agricultural revolution that started in the Netherlands and England had an impact on European economic development.

There are two reasons why we interpret the results as supporting the notion that clover introduction increased per capita incomes. First, market town populations are strongly related to urbanization rates for years in which data are available. Second, cross-sectional evidence shows that clover played a role in the establishment of cooperative creameries, which constituted the dairy sector. Denmark was likely in a Post-Malthusian Regime for most of the period (Klemp and Møller 2016), and it experienced increased growth at the end of the nineteenth century as part of its economic take-off.Footnote 4

For most of the period we analyze (until 1857), our units of analysis, the Danish market towns, had local monopolies on the trade in agricultural goods. We think that this might explain in part why we observe local effects, but interestingly we note that the effects persist after the 1857 reform. We further note that Denmark’s decisive move towards free trade occurred in 1864 when a tariff reform was passed (see Henriksen et al. 2012). Yet, Hansen (1984) notes that tariffs on finished goods were reduced in 1838 and that the so-called Sound Toll was abolished in 1857, meaning that the overall economy was becoming more open.

To identify the effect of clover adoption, we exploit the widespread adoption of this crop in Denmark as a historical experiment using a differences-in-differences estimation strategy.Footnote 5 The existence of detailed historical data for Denmark provides a unique opportunity to investigate the shock to agricultural productivity as caused by the adoption of clover. Moreover, we exploit plausibly exogenous variation in alfalfa soil suitability in an instrumental variables approach to deal with the fact that clover adoption is likely to be endogenous.

Specifically, we use data from Kjærgaard (1991, 1995) on the areas that had adopted clover in 1805 as a proxy of soil suitability for growing clover.Footnote 6 By this year, clover had spread widely in Denmark. Since these data indicate adoption, reverse causality concerns naturally arise, as early adoption may be caused by increased demand from the growing urban populations. To address this, we instrument clover adoption by soil suitability for growing alfalfa—like clover also a legume—which according to the historical narrative had its breakthrough after the period studied. Thus, our IV estimates compare areas that adopted clover due to exogenous soil suitability for growing alfalfa with non-adopters before and after widespread adoption. The breakthrough of clover in Denmark has been dated to the early nineteenth century.Footnote 7

The late breakthrough of alfalfa, combined with Kjærgaard’s data, allows us to investigate the causal effect of clover on urbanization across Denmark. We show that our results are robust to adding controls for soil suitability and other potential confounders, excluding the capital Copenhagen and other large cities, varying the size of local markets, and using alternative functional forms. Moreover, there are no discernible pre-trends in areas with more suitable soil for growing alfalfa. Using the technique of Conley et al. (2012), we also demonstrate that any direct effect of alfalfa suitability would have to account for more than 60 percent of the observed reduced form to explain the observed effect.

Moreover, we supplement our estimations on town panel data with cross-sectional evidence from the 1838 agricultural census, combined with data on cooperative creameries and human capital, to obtain evidence on the channels through which clover affected long run development.

Our instrumental variables estimation shows a positive effect of clover on urban populations, whereas OLS estimates are indicative of reverse causality bias. Moreover, the cross-sectional evidence from the first Danish agricultural census in 1838 suggests that clover increased grain yields, the number of cows, the number of cooperative creameries and the number of folk high schools, the latter being our measure of human capital. Using mediating regressions, we find suggestive evidence that human capital accumulation was a mediating factor for the increase in urban populations. We also find that our underlying variation in alfalfa suitability is not related to pea suitability. Therefore, our results cannot be explained by increased use of peas, which themselves fix nitrogen, although much less efficiently than clover. Finally, we also show that our results are unlikely to be driven by general productivity increases on soils historically suitable for cereal production.

The rest of the paper is organized as follows. Section 2 provides a brief literature review of the previous empirical research. Section 3 provides the historical background and discusses the advantages and breakthrough of clover in more detail. Section 4 describes the empirical strategy as well as the data. Section 5 presents the results. Section 6 concludes.Footnote 8

2 Literature review of previous empirical research

2.1 Previous research on the effect of clover

In related research, Allen (2008) carries out simulations that attribute more than 50 percent of the rise in crop yields in England between 1300 and 1800 to nitrogen-fixing legumes such as clover, in what is labeled the nitrogen hypothesis. Yet, he does not evaluate the effect econometrically.Footnote 9 Some econometric evidence is offered by Allen (1992), who runs regressions using a data set of 35 Oxfordshire farms and finds little relationship between the share of land used for leguminous crops and yields for wheat and barley. Yet, he does not resolve the endogeneity problem. By contrast, we proceed by testing the effect of clover using a differences-in-differences approach. This is done by interacting a cross-sectional measure of soil suitability for growing alfalfa (the first difference) with the timing of the widespread adoption of clover, which took place after 1800 (the second difference).
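The interaction just described can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the estimation carried out in the paper: the toy panel `PANEL`, the helper names `post`, `interaction` and `did_slope`, and all numbers are hypothetical, and an actual specification would include town and year fixed effects and further controls.

```python
# Hypothetical toy panel: (town, year, log_population, alfalfa_share).
# All numbers are invented for illustration; none come from the paper.
PANEL = [
    ("A", 1787, 7.0, 0.8), ("A", 1834, 7.5, 0.8),
    ("B", 1787, 7.0, 0.2), ("B", 1834, 7.2, 0.2),
]

CUTOFF_YEAR = 1800  # widespread clover adoption took place after 1800


def post(year, cutoff=CUTOFF_YEAR):
    """Second difference: 1 for census years after the cutoff, else 0."""
    return 1 if year > cutoff else 0


def interaction(alfalfa_share, year):
    """Regressor of interest: suitability (first difference) x post (second)."""
    return alfalfa_share * post(year)


def did_slope(panel):
    """Two-town, two-period DiD: extra growth per unit of suitability."""
    by_town = {}
    for town, year, log_pop, share in panel:
        # Store pre-period outcome under key 0 and post-period under key 1.
        by_town.setdefault(town, {"share": share})[post(year)] = log_pop
    hi, lo = sorted(by_town.values(), key=lambda t: -t["share"])
    growth_gap = (hi[1] - hi[0]) - (lo[1] - lo[0])
    return growth_gap / (hi["share"] - lo["share"])
```

In this toy setting the interaction term is zero for every pre-1800 observation, so only the differential change after 1800 identifies the slope.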

2.2 Previous empirical research on the impact of agricultural productivity on long run development

Previous research has investigated the impact of the introduction of the potato (Nunn and Qian 2011), the introduction of the heavy plow (Andersen et al. 2016), the introduction of maize (Chen and Kung 2016), the introduction of genetically modified soy beans (Bustos et al. 2016), the Green Revolution (Gollin et al. 2016), agricultural reforms in China (Marden 2016) and in Italy (Carillo 2018). Some of this research focuses on developments in the past and some on the present (see also Ashraf and Galor (2011) for the effect of agricultural productivity on development until 1500).

One advantage of our study compared to that of Nunn and Qian (2011) is that the data that we utilize overcome some of the weaknesses of the well-known Bairoch et al. (1988) data set as used by Nunn and Qian (2011). First, the Bairoch et al. (1988) data consider only (1) towns that reached 5000 inhabitants at least once before 1800 and (2) towns for which the 5000 inhabitants criterion could not be ruled out. As pointed out by Mumford (1974), focusing only on cities that grow large is equivalent to regarding only adults as human, and by doing so we miss the variation from smaller places that never reached the 5000 inhabitants threshold within the period. By focusing on market towns that had similar rights and possibly grew from levels lower than 5000 inhabitants, we alleviate this problem. Second, the Danish data also have better coverage over time, as Bairoch et al. (1988) only provide data every 100 years from 1000 to 1700 and every 50 years from 1700 to 1850.Footnote 10 Third, the Danish data are census data and do not rely on any interpolation. By contrast, Bairoch et al. (1988) interpolated data for non-census years.

Moreover, some confounding factors can be ruled out by focusing on the Danish case. Denmark does (and did) not have coal deposits, and this excludes the possibility that proximity to coal may confound the results.Footnote 11 Also, while the potato was introduced in Denmark in the period studied, which we control for, this was not the case for other new world crops such as maize.Footnote 12,Footnote 13 Further, focusing on Denmark allows us to control for the effect of concurrent institutional changes, which is difficult in studies that exploit variations in city size across borders.Footnote 14 During the period studied, Denmark was subject to serfdom from 1733 to 1800, and we control for this by including common time effects. The common time effects arguably also capture other common institutional changes that were taking place in this period.

Like the paper on the heavy plow, we use Danish data, but we observe that Andersen et al. (2016) lack data on urban populations and contemporary agricultural outputs. In this way, their analysis is suggestive of an effect of agricultural productivity on development, but they cannot relate this to the increased grain yields in the historical period (AD 1000–1300). Moreover, the introduction of clover likely had effects on the development of a modern dairy sector that used more advanced technology and contributed to long run economic development. The heavy plow is likely to have worked mainly by raising grain yields.

Chen and Kung (2016) show that the introduction of maize increased the population in China, but did not, unlike the potato, have an impact on urbanization rates. We cover a longer period, though we note that their study has urbanization rates. Moreover, we consider a different shock that does not relate to the introduction of new grains but rather impacted both grain yields and the dairy sector. Carillo (2018) considers the Battle for Grain (BFG) policy pursued in the 1920s in Italy, which, like the shock studied by Chen and Kung (2016), mainly relates to grain productivity. The BFG included a package of higher yielding wheat, agricultural education and subsidies for purchasing, e.g., tractors. Carillo (2018) exploits the BFG to investigate the effect of agricultural productivity on structural change using data from the late nineteenth century to the early twenty-first century. He also considers whether the increases in productivity led to increased human capital, as also done in the present study.

Some modern studies are close to ours. Bustos et al. (2016) investigate the impact of genetically modified soy beans using data from Brazil. Gollin et al. (2016) show that the adoption of high yielding varieties during the “Green Revolution” increased GDP per capita. Marden (2016) presents evidence from the reform-era in China suggesting that agricultural productivity drove the non-agricultural sector. While these studies certainly have merit, all of them study a shorter period than the current study.

3 Historical background

3.1 Advantages of clover

One advantage of introducing clover as a crop is that it serves to increase nitrogen supply in the soil. According to Cooke (1967: p. 3), nitrogen is in a class of its own, for in most agriculture its supply governs the yields of crops that have enough water. For the historical period under consideration, Kjærgaard (1995: p. 4) concludes that the main way to increase the supply of nitrogen was to increase the cultivation of leguminous crops. In northern Europe, this was done mainly by introducing clover, which is considerably better at fixing nitrogen than many other crops, such as peas, which were already grown. Kjærgaard (1991: p. 111) shows that clover adds about three times as much nitrogen as, for example, peas. Allen (2008: p. 186) notes that experimental data posit a proportional relationship between crop yields and nitrogen input.

A second advantage is that clover served as fodder for cattle, whereby milk and butter production could be increased. Moreover, as pointed out by Mokyr (2009: p. 180), better-fed animals produce more fertilizer. He also points out that the empirical relationship between clover and soil fertility was known at the time. This is corroborated by Marshall (1929: p. 46), who notes that the relationship was known by eighteenth century writers, such as Stephen Switzer and Robert Maxwell, who posited that clover draws nitrogen from the air and “gives it to the land.”

Finally, we note that alfalfa has similar advantages but as will be described in Sect. 4, it did not have its breakthrough in Denmark in the period studied. We return to evaluating whether and how clover affected specific measures of agricultural productivity in Sect. 5.5.Footnote 15

3.2 The breakthrough of clover

According to Falbe-Hansen (1887: p. 136), it was not until the 1820s and 1830s that clover was introduced on many farms in Denmark. Kjærgaard (1991, 1995) provides a short historiography of clover in Denmark. Importantly, he provides maps of its geographic diffusion in 1775, 1785, 1795, and 1805. It should also be noted that Kjærgaard’s maps only indicate clover adoption and not the intensity. Clover seems to have been grown in a single town in 1732, but Kjærgaard (1995: p. 6) stresses that it was still regarded as a new crop and was only grown experimentally. His maps suggest that only a few places had adopted clover by 1775 (see panel A of Fig. 1) and it was not until 1805 that it had spread to a significant proportion of the country (see panel B of Fig. 1). The map for 1805 shows limited diffusion in the western part of the country, whereas adoption in the eastern part was much more widespread. These data indicate a breakthrough around 1805, yet we note that given the data availability for urban populations, we cannot distinguish 1805 from the 1830s since the population census years were 1801 and 1834 (see Sect. 4.1).

Fig. 1 Clover adoption in Denmark in 1775 and 1805. Notes: Each town is shown by a dot. Green areas indicate clover adoption. Source: Kjærgaard (1991, 1995) (Color figure online)

We finally note that clover would often be introduced on Danish fields that used a crop rotation system known as Koppelwirtschaft invented in Holstein (Bjørn 1988: pp. 35–37). This system required that the cultivated area was divided into at least seven fields. With, for example, 11 fields, the fields would be used in the following way: (1) fallow, (2) wheat or rye, (3) barley, (4) rye, (5) barley, (6) oats with clover, (7) clover for hay, (8) clover for hay and grazing, and (9–11) grazing. As shown by the example, the principal advantages of clover could be exploited. First, crops were planted along with clover to increase nitrogen in the soil. Second, clover was used as grazing for cattle. The eleven-field system might not be feasible on smaller farms, and in this case Bjørn (pp. 38–39) notes that five fields were used with grains being planted along with clover.
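The eleven-field example can be written out as a small Python sketch. It assumes, purely for illustration, that every field advances one step in the sequence each year; the source lists the sequence but does not spell out the timing, and the names `ROTATION`, `field_use` and `clover_share` are hypothetical.

```python
# Sequence of uses in the eleven-field example, steps (1) to (11).
ROTATION = [
    "fallow", "wheat or rye", "barley", "rye", "barley",
    "oats with clover", "clover for hay", "clover for hay and grazing",
    "grazing", "grazing", "grazing",
]


def field_use(field, year):
    """Use of a field (0-based index) in a given year.

    Assumption: each field moves one step forward in ROTATION per year.
    """
    return ROTATION[(field + year) % len(ROTATION)]


def clover_share():
    """Share of fields with clover in any given year of the example."""
    return sum("clover" in use for use in ROTATION) / len(ROTATION)
```

Under this assumption, three of the eleven fields carry clover in any given year of the example.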

3.3 Historical narrative for England

Kerridge (1967: pp. 280–288) describes some of the diffusion of clover in England from 1650 onwards, but does not provide evidence of how important it was. As in the case of Denmark, alfalfa does not seem to have been important initially (p. 288). Overton (1996: p. 110) observes that clover accounted for about 3 percent of the arable acreage at the beginning of the eighteenth century and had increased to 30 percent around 1830. He also cites seventeenth century writers, who had observed that “after the three or four first years of clovering, it will so frame the earth, that it will be very fit to corn again”, which he interprets to mean that cereals following clover will have higher yields. Overton (p. 117) also states that the Norfolk four-course system could have been responsible for unprecedented changes in both crop and livestock productivity and output.Footnote 16 While nitrogen fixation increased yields, farmers were more interested in the fodder provided for livestock (p. 121).

Thus, the account of Overton (1996) suggests that both channels mentioned in the introduction were present. In a similar spirit to Chorley (1981), Allen (2008) investigates the impact on grain yields and shows that the effects would be very slow when legumes, such as beans and peas, were introduced in the crop rotation. With the Norfolk system, the effects are larger because of the nitrogen fixing property of clover, but since it included turnips, this would tend to reduce its positive effects on grain yields. The effects also take a long time to materialize fully, but Allen’s simulation suggests that there would be effects after 30–50 years.

3.4 Early historical clover experiments

While the chemistry of nitrogen fixation was not fully understood in the nineteenth century, there was a lively debate about its importance. As noted by Kjærgaard (1995), Justus Liebig long insisted that nitrogen was not of importance.Footnote 17 The importance of nitrogen was maintained by the British soil chemists Lawes and Gilbert, and they were shown to be right. Historical and modern farming experiments strongly suggest the existence of nitrogen fixation with effects showing up over varying time periods, though the full effect takes longer to materialize, as shown in the work by Allen (2008).

The Rothamsted experiments conducted on a farm in England in the historical period suggest that clover in the rotation increases the yields of wheat relatively quickly, as data were reported for 8-year periods (Hall 1905: Chapter 10). This suggests that nitrogen fixation also happens over short time periods. Moreover, in the 1830s, Boussingault provided experimental evidence that legumes contributed nitrogen to the soil (Aulie 1970). Boussingault made experiments on fields in 1838 and found that the soil on which he planted clover had an increase in nitrogen, whereas the part on which he planted wheat showed none. All the studies suggest that the effects could very well materialize within 30–50 years.

4 Data and empirical strategy

4.1 Data

Our data include a measure of development, a measure of clover adoption, a measure of alfalfa soil suitability as well as other measures used in the analysis. We will discuss each in turn.

Outcome and variables of interest

Our measure of development is market town population, which is a measure of urbanization. There are several reasons why higher agricultural productivity could affect town development.Footnote 18 First, it is arguably the case that only societies with a certain level of agricultural productivity can sustain urban centers (Acemoglu et al. 2005).Footnote 19 Thus, the evolution of urban populations is one of the best proxies for economic development historically (Acemoglu et al. 2005; Cantoni 2015).

Second, higher agricultural productivity may spur rural–urban migration if it serves to lower the demand for labor in the agricultural sector as stressed by Nunn and Qian (2011).

Third, Gollin et al. (2007) emphasize that agricultural productivity needs to reach some critical level before resources can be moved into industry, which historically would often be in towns.

Fourth, in case Malthusian effects are present, larger populations would affect the degree of specialization, which could increase urban populations. In line with a Malthusian model, Ashraf and Galor (2011) demonstrate that higher productivity leads to higher population densities at the national level. While this result could be in line with models of worker migration, the authors show evidence suggesting that this is not the case. In our setting, we also use a productivity shock to estimate the impact on populations, though these are urban populations. One interpretation is that with larger populations overall, this could call for more division of labor and an increase in urban occupations along Smithian lines (Galor 2011).Footnote 20 Yet, given that some of the small towns also had agricultural production, part of the response could be Malthusian. Nevertheless, as the towns were far from self-sufficient, this seems less likely.Footnote 21

Evidence by Klemp and Møller (2016) suggests that Denmark was in the Post-Malthusian regime in the period 1824–1890, which strongly overlaps with our period. This regime is characterized by the presence of the Malthusian positive link between income and population, but income is not stagnating as in the Malthusian regime (Galor and Weil 2000). Moreover, Denmark also had its economic take-off in this period as mentioned in the introduction. We also find similar results when we use only the largest towns and cities in the data set. Focusing on market towns and their local markets we cannot compute urbanization ratios since the level of analysis is the town. Even so, when we use data for the whole country using the subnational unit known as herred (plural: herreder), evidence from the Danish census for different years shows that the share of the urban population is positively and significantly correlated with market town population. The correlation coefficient between the share of market town dwellers and log market town populations is 0.78 and significant at the 1 percent level in 1834. In line with the arguments stated above, Jedwab and Vollrath (2015) show that urbanization and GDP per capita have been positively correlated in the cross-section across time periods. Still, they also consider cases in which urbanization can occur without economic growth. We consider an agricultural revolution, which is one of the cases in which urbanization is believed to be positively associated with economic development (Jedwab and Vollrath 2015: p. 14).

Market town populations

The census data for market town populations were compiled by “den digitale byport” (the digital town gate).Footnote 22 The market towns had local monopolies on carrying out trade, craftsmanship, and other business activities until the Trade Act of 1857 (Christensen and Mikkelsen 2006; Degn 1987). This had been the case since the Market Town Act of 1422, which meant that trade was transferred to market places and fairs, and thus peasants were prohibited from selling or buying goods in the countryside. When these privileges were lost in 1857, it was arguably a common shock to all market towns, and we therefore do not believe that it poses a threat to identification. Yet, effects might be weakened, as the local effects may depend on the privileges of the market towns.

We use a balanced panel of 56 market towns from 1672 to 1901, which allows us to have the longest possible pre-period prior to adoption.Footnote 23 Moreover, we use data on the Kingdom of Denmark to keep focus on an area with homogeneous institutions. The areas of present-day Denmark near the German land border were part of duchies that were under the rule of the Danish king but had their own institutions. Further, these areas were no longer under Danish rule after the Danish-Prussian war of 1864.Footnote 24 The population data are available at irregular time periods. The first complete data for market town populations are from 1672 as originally compiled by Degn (1987). The next data are for 1769 and are taken from the first countrywide census. Census data are further available for the following years: 1787, 1801, 1834, 1840, 1845, 1855, 1860, 1870, 1880, 1890, and 1901. We use all these years in our estimations. Table A1 in Appendix B gives the population for the 56 towns in 1672 and in 1901. As shown, all the market towns had reached at least 500 inhabitants in 1901, and only two towns had populations below 1000 inhabitants. This also means that some market towns had their populations doubled many times in the period studied.

Figure 2 shows the change in the average town population among the 56 towns. There is a positive change in the average population throughout the period, but the most pronounced increases seem to take place after 1845.

Fig. 2 Average market town population based on 56 market towns, 1672–1901. Source: “Den digitale byport”, http://dendigitalebyport.byhistorie.dk/koebstaeder/

To construct our measures of clover adoption and alfalfa suitability, we use the fact that the market towns had local markets. The radius of the local markets varied historically from 1 Danish mile (approximately 7.5 km) to 4 Danish miles. In many cases, the radius of local markets was 2 miles.Footnote 25 Thus, we calculate the share of soil adopting clover or suitable for alfalfa in a circle (buffer) with a 15 km radius as a benchmark. The share is calculated from the land mass within this 15 km circle around each town center. This is what we use to construct, for example, our clover measure in Eqs. (1) and (2) of Sect. 4.2. We examine the robustness of this choice in the empirical analysis by constructing measures for buffers with radii of 5, 10 and 20 km to check the sensitivity of the results to the distance from a market town.
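The buffer calculation can be illustrated with a short Python sketch. It is a simplified stand-in, not the GIS procedure used in the paper: it assumes equally sized grid cells with planar coordinates in kilometres, and the names `DANISH_MILE_KM`, `buffer_share` and the cell tuples are hypothetical.

```python
import math

DANISH_MILE_KM = 7.5  # approximate length of one Danish mile, as in the text
BENCHMARK_RADIUS_KM = 2 * DANISH_MILE_KM  # the 15 km benchmark buffer


def buffer_share(cells, center, radius_km=BENCHMARK_RADIUS_KM):
    """Share of land cells within radius_km of center that are flagged.

    cells: iterable of (x_km, y_km, is_land, flagged) tuples, where
    flagged marks, e.g., clover adoption or alfalfa suitability.
    Sea cells are ignored, so the share refers to land mass only.
    Returns None when the buffer contains no land cells.
    """
    cx, cy = center
    land = 0
    flagged_land = 0
    for x, y, is_land, flagged in cells:
        if not is_land or math.hypot(x - cx, y - cy) > radius_km:
            continue
        land += 1
        if flagged:
            flagged_land += 1
    return flagged_land / land if land else None
```

Calling `buffer_share` with `radius_km` set to 5, 10 or 20 mirrors the alternative buffers used as robustness checks.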

Clover adoption, 1805

We use Kjærgaard’s map shown in panel B of Fig. 1 for the geographic distribution of clover adoption in 1805, which was collected on the basis of all “eighteenth century manor archives and a number of other sources” (Kjærgaard 1995: p. 6).Footnote 26 Kjærgaard constructed maps of clover adoption by parish, which is the smallest geographic unit for Denmark. The extraordinary level of detail of Kjærgaard’s map allows us to make very local estimates of how much of the area around a market town grew clover. In practice, we calculate our clover measure from a digitized version of Kjærgaard’s map using the buffers described above. We measure the extent of clover adoption by using data on the areas that had adopted this crop by 1805. The underlying parish data are made into a raster file making it possible to calculate the share of land with clover adoption within the 15 km buffer with the market town as the center. According to Fig. 1, clover had not been adopted in many areas prior to 1805. The 1805 distribution is probably close to an “equilibrium distribution”, and will include many areas with good suitability for growing clover. Even so, it is likely to be endogenous in our differences-in-differences setup, and we therefore implement an instrumental variables strategy based on soil suitability for growing alfalfa, as will be explained next. We also estimate a model using Kjærgaard’s data for four available periods between 1775 and 1805 to include information on time varying clover adoption. The adoption measure exclusively indicates the extensive margin of clover adoption, but Kjærgaard (1995) notes that clover accounted for 30–50 percent in the crop rotation on the island of Funen, which has the most clover grown according to his maps. We mentioned above that historical examples of crop rotation also support that clover was in use.

Figure 3 combines market town population in 1672 with the share of clover adoption in 1805. Interpreting the 1805 clover adoption as an “equilibrium distribution”, we see no sign that this reflects effects of pre-trends in the sense of more clover intensive areas having larger market towns initially in the sample period. In Figure A1 in Appendix B, we impose the growth rates of town populations for the period 1834–1901 on a similar map to provide an initial impression of the spatial distribution of development.

Fig. 3 Clover adoption in Denmark in 1805 and market town size in 1672. Notes: Green areas had adopted clover by 1805. Red circles indicate the size of market towns in 1672. Blue circles show 15 km buffers around each of the 56 market towns. Source: Kjærgaard (1991, 1995) (Color figure online)

Alfalfa suitable soil

We use a raster file from the Food and Agriculture Organization (FAO). Clover and alfalfa both belong to the same family of legumes, and according to the growing instructions made by Danish supplier “Hunsballe”, both clover and alfalfa grow better on clay and clayey soils.Footnote 27 This suggests that suitability for growing alfalfa also captures suitability for growing clover. The best soils for growing alfalfa have medium suitability, and we use soils with at least medium suitability to construct our measure of alfalfa suitability.

The alfalfa measure is based on the share of land with at least medium suitability for alfalfa within a buffer around each market town, with the same radius as the one applied to the clover measure. These soils are indicated in green in Fig. 4 (see Figures A2–A4 for alternative maps).

Fig. 4 Soil suitability for growing alfalfa. Notes: Each town is shown by a dot. Green area indicates medium or better suitability for alfalfa. Source: FAO raster data on soil suitability for alfalfa (Color figure online)

The suitability measure is constructed according to the following procedure as summarized by Nunn and Qian (2011, pp. 609–610), which holds for any crop. In a first step, for each cell FAO identified the days of the year when the moisture and thermal requirements of the crop are met. Using this information, FAO determined the exact starting and ending dates of the length of the growing period for each grid cell. This allows FAO to determine whether the cell is suitable for growing the crop. In a second step, constraint-free crop yields were determined, and the yield in each grid cell was measured as a percentage of this benchmark. Finally, additional constraints that exist in each cell were identified. In the end, the procedure determines the percentage of maximum obtainable yield of each of the cells. This leads to a classification. Cells with attainable yields of 80 percent or more of the maximum potential yield are classified as “Very Suitable”. Cells that attain 60–80 percent of maximum yields are classified as “Suitable”, 40–60 percent as “Medium Suitable”, and 20–40 percent as “Marginally Suitable”. We use the data under the assumption of rain-fed agriculture and low input level. The spatial resolution of the underlying raster file is 0.5 × 0.5 degrees.
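The classification step can be sketched in Python as follows. The thresholds are those quoted above; the treatment of boundary values (for example, assigning exactly 60 percent to “Suitable”), the label “Not Suitable” for cells below 20 percent, and the function names are assumptions made for illustration and are not taken from the FAO documentation.

```python
def classify_suitability(pct_of_max_yield):
    """Map attainable yield (percent of maximum) to a suitability class.

    Boundary handling and the lowest label are illustrative assumptions.
    """
    if pct_of_max_yield >= 80:
        return "Very Suitable"
    if pct_of_max_yield >= 60:
        return "Suitable"
    if pct_of_max_yield >= 40:
        return "Medium Suitable"
    if pct_of_max_yield >= 20:
        return "Marginally Suitable"
    return "Not Suitable"  # assumed label for cells below 20 percent


def share_at_least_medium(cell_yields):
    """Share of cells classified as medium suitable or better."""
    good = {"Very Suitable", "Suitable", "Medium Suitable"}
    hits = sum(classify_suitability(p) in good for p in cell_yields)
    return hits / len(cell_yields)
```

A measure of this kind, computed over the land cells in a town's buffer, corresponds to the share of land with at least medium suitability described in the text.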

Given that alfalfa and clover require similar soils, it is plausible that soils with medium or better suitability for growing alfalfa are also the soils more suitable for growing clover. This is backed up by three pieces of evidence. First, we find a strong relationship when we use instrumental variables estimation in the differences-in-differences setup reported in Sect. 5. Second, when we correlate the share around a town or in a herred that grew clover in 1805 with the share of medium suitable soil for growing alfalfa, we find strong and significant correlations. The simple correlation coefficients are 0.53 and 0.44 for the town panel and the herred cross-section, respectively. Comparing the maps does reveal that there are areas adopting clover without much alfalfa suitable soil, yet we are ultimately exploiting conditional relationships, and to illustrate those, we show partial plots for the first stage regressions in Figure A5 for both the town panel and the cross-sectional herred data. While the relationships are not perfect, they are positive and strongly significant. We also note that we can use different buffer zones and still obtain similar results. Finally, we point to evidence from the 1886 agricultural census of Prussia, which shared a border with the Kingdom of Denmark, available from the Prussian Economic History Database (Becker et al. 2014). Prussian farmers grew alfalfa, perhaps due to better climatic conditions, and in fact yields for clover and alfalfa are strongly correlated; see Table A2 and Figures A6 and A7 in Appendix B. Therefore, the relevance criterion for our proposed instrument is plausibly satisfied.Footnote 28

Alfalfa may have been grown in small amounts over the period considered, but as stressed by Kjærgaard (1995) clover was the legume par excellence in Denmark. Kjærgaard (1991: p. 71) notes that alfalfa was not widely adopted for climatic reasons. As mentioned in footnote 28, alfalfa may have been less productive in coastal climates and for this reason, clover might have been preferred by Danish farmers.

Brøndegaard (1978–1980) also stresses the importance of clover and that alfalfa was not widely adopted in the nineteenth century. We compared the articles on the two crops in his encyclopedia “Folk and Flora.” There it is stated that in 1805, the introduction of clover had led to improved productivity and was the foremost improvement to agriculture at the time. Moreover, it is observed in line with the empirical evidence that clover is thriving on clayey soils. It is said not to deplete the soil and wheat yields improve after the use of clover. Furthermore, red clover is recognized as a valuable fodder crop (p. 217). The same text mentions examples of attempts with the use of alfalfa as a fodder crop, but it did not have its breakthrough until after 1900, and it is speculated that alfalfa did not fit well into the crop rotation of the time. This is also corroborated by handbooks on agriculture edited by Madsen-Mygdal (1912, 1938). Madsen-Mygdal (1912: p. 452) explains that Danish farmers had made many unsuccessful attempts at growing alfalfa, and therefore they viewed the crop as too uncertain. He concludes that many farmers believed that alfalfa was unlikely to ever gain importance for this reason. In the 1938 version of the same handbook, he reiterates that many farmers held the view that alfalfa was too uncertain under Danish conditions (Madsen-Mygdal 1938). This suggests that since alfalfa was not widely adopted in the period studied, the suitability for growing alfalfa did not affect market town populations directly through alfalfa adoption.Footnote 29 If the main channel through which alfalfa suitability could have direct effects is actual adoption, local alfalfa soil suitability plausibly satisfies the exclusion restriction.

4.2 Empirical strategy

We approach our investigation by estimating models that build on the logic of differences-in-differences estimation. We also investigate the cross-sectional effects of clover on measures of agricultural productivity and human capital so as to provide evidence on channels through which clover affects town population, but we postpone the discussion of this to Sect. 5.5.

In practice, we interact clover adoption or alfalfa suitability, which vary across towns, with a time dummy equal to one from 1834 onwards, that is, after the breakthrough of clover. The interaction of clover adoption with this dummy is the right-hand-side variable of interest in a differences-in-differences model for urban populations. We begin by estimating the following specification for the natural logarithm of the population (ln pop) of town i at time t:

$$ \ln pop_{it} = \beta^{clover} \ln \left( {1 + clover_{i,\, 1805} } \right) I_{t}^{post} + \mathop \sum \limits_{R} \lambda_{R} I_{i}^{R} + \mathop \sum \limits_{j = 1769}^{1901} p_{j} I_{t}^{j} + \varvec{x^{\prime}\gamma } + \varepsilon_{it}, $$
(1)

where \( \beta^{clover} \) is the parameter of interest that measures whether there are effects of the breakthrough of clover adoption. \( I_{t}^{post} \) is equal to one from 1834 and zero otherwise. Based on the historical narrative, we assume that the widespread adoption happened after 1801 and the first population census after this year was made in 1834. \( I_{t}^{j} \) represents time dummies and \( I_{i}^{R} \) represents market town dummies (i.e., time and town fixed effects). \( clover_{i,1805} \) is the share of land on which clover was grown in 1805 around each market town (i.e., land adopting clover as reflected in panel B of Fig. 1 out of total land mass in some radius from the center of the market town).Footnote 30 \( \varvec{x} \) is a vector of control variables to be discussed below and \( \varvec{\gamma} \) the associated coefficients. Summary statistics for all variables are given in Table A3.
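To make the structure of Eq. (1) concrete, the following is a minimal sketch on a synthetic panel. It assumes a balanced panel and omits the control vector, in which case removing town and year means from both sides reproduces the fixed effects estimate of \( \beta^{clover} \). All names and numbers are illustrative and do not correspond to our data or estimation code.

```python
import numpy as np


def demean_two_way(a):
    """Remove town (row) and year (column) means from a balanced N x T panel."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()


def did_coefficient(ln_pop, clover_share, post):
    """OLS estimate of beta on ln(1 + clover_i) * post_t
    with town and year fixed effects (no further controls)."""
    interaction = np.outer(np.log1p(clover_share), post)
    y = demean_two_way(np.asarray(ln_pop, dtype=float))
    x = demean_two_way(interaction)
    return float((x * y).sum() / (x * x).sum())


# Synthetic, noiseless panel: 4 towns, 5 census years, true beta = 0.2
clover = np.array([0.0, 0.1, 0.3, 0.6])      # hypothetical 1805 clover shares
post = np.array([0, 0, 0, 1, 1])             # one from the first post-period census
town_fe = np.array([6.0, 6.5, 7.0, 7.2])[:, None]
year_fe = np.array([0.0, 0.1, 0.2, 0.4, 0.6])[None, :]
ln_pop = town_fe + year_fe + 0.2 * np.outer(np.log1p(clover), post)

beta_hat = did_coefficient(ln_pop, clover, post)
```

Because the synthetic data contain no noise, the sketch recovers the coefficient used to generate them. With an unbalanced panel or additional controls, a regression routine with explicit dummies would be needed instead of simple demeaning.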

The potential bias of the differences-in-differences estimator can best be understood by considering the simple two period version (Wooldridge 2002). For our case, this is \( (urban_{t + 1}^{clover} - urban_{t}^{clover} ) - (urban_{t + 1}^{no\,clover} - urban_{t}^{no\,clover} ) \), where the superscript clover indicates clover adoption in 1805. It is, for example, possible that places that had adopted clover by 1805 had large urban populations prior to 1805 and that clover mainly helped to sustain the larger urban populations. This would tend to introduce a negative bias from such pre-trends, as the first term would be small. Alternatively, if there is a positive differential trend in the urban population in clover adopting towns, we would obtain a positive bias. We accordingly estimate versions of the models where we instrument \( clover_{i, 1805} \) by \( alfalfa_{i} \), which similarly denotes the share of land around the market town with soil suitable for growing another legume, namely alfalfa. As we noted above, obtaining better alfalfa and clover yields requires similar soil conditions, which arguably fulfills the relevance criterion for an instrument.

The alfalfa suitability measure is plausibly exogenous since the historical record shows that alfalfa had its breakthrough in Denmark long after the period studied, for which reason we use it as an instrument for \( clover_{i, 1805} \). To be more precise, we instrument \( \ln\left( {1 + clover_{i, 1805} } \right) I_{t}^{post} \) by \( \ln\left( {1 + alfalfa_{i} } \right) I_{t}^{post} \). Alfalfa suitability is also required to have no direct effects on urban populations to fulfill the exclusion restriction. This could fail if, for example, alfalfa was widely adopted in the surrounding areas of market towns. Yet, as we have explained, alfalfa was not widely adopted in the period, making this unlikely. Moreover, we control for the soil suitability for other crops, such as potatoes and barley, and other potential confounders in some specifications so as to add credibility to the exclusion restriction.
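The instrumenting step can be sketched as follows. With a single endogenous regressor and a single instrument, the instrumental variables estimate equals the reduced form coefficient divided by the first stage coefficient. The sketch assumes a balanced panel without further controls and uses synthetic numbers; the function and variable names are hypothetical and are not our estimation code.

```python
import numpy as np


def demean_two_way(a):
    """Remove town (row) and year (column) means from a balanced N x T panel."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()


def iv_did(ln_pop, clover_share, alfalfa_share, post):
    """Instrument ln(1 + clover_i) * post_t by ln(1 + alfalfa_i) * post_t,
    with town and year fixed effects (just-identified case)."""
    y = demean_two_way(np.asarray(ln_pop, dtype=float))
    x = demean_two_way(np.outer(np.log1p(clover_share), post))
    z = demean_two_way(np.outer(np.log1p(alfalfa_share), post))
    first_stage = float((z * x).sum() / (z * z).sum())
    reduced_form = float((z * y).sum() / (z * z).sum())
    return {
        "first_stage": first_stage,
        "reduced_form": reduced_form,
        "iv": reduced_form / first_stage,
    }


# Synthetic, noiseless panel: true effect of clover is 0.3
clover = np.array([0.0, 0.1, 0.3, 0.6])    # hypothetical clover shares
alfalfa = np.array([0.1, 0.3, 0.5, 0.9])   # hypothetical alfalfa suitability shares
post = np.array([0, 0, 0, 1, 1])
town_fe = np.array([6.0, 6.5, 7.0, 7.2])[:, None]
year_fe = np.array([0.0, 0.1, 0.2, 0.4, 0.6])[None, :]
ln_pop = town_fe + year_fe + 0.3 * np.outer(np.log1p(clover), post)

result = iv_did(ln_pop, clover, alfalfa, post)
```

The ratio form also makes clear why a weak first stage is problematic: a first stage coefficient close to zero inflates the ratio and its sampling variability.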

We also estimate flexible specifications for the reduced form for alfalfa suitability (and clover adoption) which allows for separate coefficients for different points of time (t):

$$ \ln {\text{pop}}_{it} = \mathop \sum \limits_{j = 1769}^{1901} \beta_{j}^{clover} ln\left( {1 + alfalfa_{i} } \right)I_{t}^{j} + \mathop \sum \limits_{R} \lambda_{R} I_{i}^{R} + \mathop \sum \limits_{j = 1769}^{1901} p_{j} I_{t}^{j} + x^{\prime}\gamma + \varepsilon_{it} . $$
(2)

This model allows us to evaluate when this plausibly exogenous soil suitability as an indicator of the effect of clover began having an effect on urban populations. In other words, this allows for a test of whether our assumption about the breakthrough of clover after 1801, based on the historical narrative, is plausible.
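A minimal sketch of the flexible specification in Eq. (2) is given below. It interacts the suitability measure with a dummy for each year except a base year and estimates all coefficients jointly after removing town and year means. The sketch assumes a balanced panel without controls, treats 1672 as the omitted base year, and uses an illustrative subset of census years with made-up coefficients; none of the names or numbers come from our estimation code.

```python
import numpy as np


def flexible_coefficients(ln_pop, suitability, years, base_year):
    """Year-specific coefficients on ln(1 + suitability_i), relative to base_year,
    with town and year fixed effects on a balanced N x T panel."""
    ln_pop = np.asarray(ln_pop, dtype=float)
    s = np.log1p(np.asarray(suitability, dtype=float))
    n_towns, n_years = ln_pop.shape

    def demean(a):
        return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

    y = demean(ln_pop).ravel()
    columns, kept_years = [], []
    for j, year in enumerate(years):
        if year == base_year:
            continue  # omitted category
        d = np.zeros((n_towns, n_years))
        d[:, j] = s  # suitability interacted with the dummy for this year
        columns.append(demean(d).ravel())
        kept_years.append(year)
    coefs, *_ = np.linalg.lstsq(np.column_stack(columns), y, rcond=None)
    return dict(zip(kept_years, coefs.tolist()))


# Synthetic, noiseless example with an assumed coefficient path
years = [1672, 1769, 1787, 1801, 1834, 1860]
true_path = np.array([0.0, 0.05, 0.02, 0.03, 0.15, 0.20])
alfalfa = np.array([0.1, 0.3, 0.5, 0.9])
town_fe = np.array([6.0, 6.5, 7.0, 7.2])[:, None]
year_fe = np.linspace(0.0, 0.5, len(years))[None, :]
ln_pop = town_fe + year_fe + np.outer(np.log1p(alfalfa), true_path)

estimates = flexible_coefficients(ln_pop, alfalfa, years, base_year=1672)
```

Coefficients for years before the assumed breakthrough that are close to zero would indicate the absence of pre-trends, which is the check the flexible model is meant to provide.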

We include town and period fixed effects to avoid confounding the effects of our variables of interest with town-specific and time-specific common factors.Footnote 31 Town fixed effects will capture spatial heterogeneity in terms of direct fixed time-invariant effects of, for instance, soil quality and location. In contrast, our focus is on time-varying effects that arise due to the interaction between soil quality and the breakthrough of clover in a differences-in-differences setup.

We include time fixed effects as we are interested in the differential impact of clover on development such that any effect we estimate is relative to a common shock and aggregate changes. Controlling for common shocks is also important since the period studied was a time of institutional changes at the country level. For example, eighteenth century Denmark had serfdom, which tied rural workers to the countryside, and this may have hampered town development (e.g., Christensen 1945). Serfdom was a common shock to the whole country, and it is accordingly important to control for it.

We also control for potato suitability using data from the FAO, as the potato was adopted in the same period as clover (Falbe-Hansen 1887). Most soils of Denmark have at least a medium suitability for growing potatoes, and we therefore use the share of land with good suitability or higher within a buffer following Andersen et al. (2016).Footnote 32 In some specifications, we also control for year fixed effects specific to the regions that did not have serfdom throughout the period to capture differences in institutional trajectories. If all soils suitable for growing, for instance, plow positive crops, such as barley, were experiencing productivity changes in general,Footnote 33 alfalfa suitability would be correlated with the error term. Therefore, we control for other measures of agricultural productivity, such as overall yields and (historical) barley yields or (modern) barley suitability, in some specifications. We will return to this below in Sect. 5.3, where we also apply the technique of Conley et al. (2012) to investigate the exclusion restriction. It is also possible that we capture other nitrogen fixing crops such as peas. We will investigate this in Sect. 5.5.

5 Results

5.1 Results from the non-flexible models

The results for the non-flexible model are shown in Table 1 based on a benchmark case of clover adopting land shares within a 15 km radius or buffer. As our baseline, we report standard errors corrected for town-specific clustering. In Table A4 in Appendix B, we demonstrate that results are robust to using Conley t-statistics for which standard errors are corrected for spatial correlation in the error term. These Conley t-statistics allow for direct dependence between towns that are within 2 degrees, or roughly 222 km, of each other. We find that results are in general stronger with this type of standard error and therefore regard clustering corrected standard errors as the more conservative benchmark.Footnote 34

Table 1 The introduction of clover and its effect on market town population

The first part of Table 1 presents the basic (partial) correlations. In column (1), in which we control for fixed effects for towns and years, we find a negative and insignificant coefficient for clover. Once we add controls for the potato as in column (2), the sign changes and the coefficient is significant at the 10 percent level. In column (3), we add fixed year effects specific for the areas of Funen and Jutland to capture institutional heterogeneity. While all the country had serfdom (known as “stavnsbaand”) from 1733 to 1800, the eastern islands of Zealand, Falster, Lolland, and Møn had a version of serfdom known as “vornedskab” until 1701. Therefore, the eastern islands had serfdom in all observed periods, whereas this is not true for the western part of the Kingdom on Funen and in Jutland. The effect of this on the estimated coefficient is that it reduces the point estimate and makes the coefficient insignificant.

Once we run the same models with the alfalfa measure in columns (4)–(6), we obtain significance at the 10 percent level or better. As we will discuss in Sect. 5.3, the flexible estimations for clover adoption and the results of columns (1)–(3) in Table 1 are consistent with the presence of reverse causality in which those places that had large urban populations in the beginning of the period also exhibited higher pressure for adopting clover in the differences-in-differences estimation setup. The flexible estimations in Sect. 5.3 show that urban populations were large prior to the breakthrough of clover in the towns surrounded by areas that adopted clover and that any effect after adoption will be more difficult to measure. For the flexible models with the alfalfa measure, we do not find any indications of such pre-trends.

Next, we turn to the instrumental variables estimates. As we argued above, alfalfa did not have its breakthrough in Denmark until the twentieth century due to climatic conditions as well as insufficient knowledge about growing methods. Yet, suitability for growing clover and suitability for growing alfalfa are likely to be linked since these crops tend to grow on similar types of soils and belong to the same group of crops.

Columns (7) and (8) in Table 1 report the instrumental variables estimates for the effects of clover adoption instrumented with alfalfa suitability on market town population. First stage results are reported in columns (1) and (2) in Table A6 in Appendix B. We begin by noting that the F-statistics are large. The value is about 28 for the models in columns (7) and (8). This is larger than the usual rule of thumb of 10, suggesting that the instrument is strong.Footnote 35 The instrumental variables estimates in columns (7) and (8) are larger than their OLS counterparts in columns (1) and (2). Moreover, they are significant at the 5 percent level, whereas the OLS estimates are smaller and insignificant when we control for Funen/Jutland specific year effects. This suggests that the instrumental variables estimates effectively deal with the reverse causality mentioned above. The approach based on Conley et al. (2012), which investigates the extent to which the exclusion restriction may be violated, further strengthens the case that we are in fact dealing effectively with reverse causality and omitted factors; see Sect. 5.3.

We use the instrumental variables model in column (7) to calculate the counterfactual contribution from clover to market town population growth from 1672 to 1901 (i.e., we use the estimated model to remove the effects of clover in 1901). These calculations suggest that clover can account for 7.7 percent of the increase in town populations from 1672 to 1901. This effect is modest, yet not trivial.Footnote 36

Altogether, these results lend support to what Allen (2009) refers to as the standard or established model in which agricultural productivity drives urban populations. He argues for the case of England that causation in the opposite direction was more important. He emphasizes the role of the expansion of London and it is possible that the capital of Denmark, Copenhagen, played a similar role. If this is the case, dropping Copenhagen should make a large difference to the result. Yet, results are essentially unchanged when dropping Copenhagen, as shown in Table A7 in Appendix B; see also Sect. 5.4.

5.2 Additional control variables and alternative measures

As we discussed above, one concern is that the alfalfa instrument may pick up other factors, for example, an increase in agricultural productivity in growing plow positive crops such as barley. We address this issue in Table 2, in which we add many control variables beyond the one for potatoes interacted with a dummy equal to 1 from 1834 and 0 otherwise. We use the potato interaction as our baseline, but also investigate what happens when we interact the other control variables with a dummy equal to one from 1834. In terms of variables measuring agricultural productivity, we add caloric yields after 1500 as calculated by Galor and Özak (2016), a historical measure of barley suitability taken from Andersen et al. (2016),Footnote 37 and an indicator for whether an area was using a grass field system in 1682 taken from Jensen et al. (2018).

Table 2 The introduction of clover and its effect on market town population: additional control variables

The caloric yield measure proxies for agricultural productivity of all crops and the barley measure proxies for a cereal that is known to be plow positive. The grass field measure is included as clover was regarded to be a grass substitute (Overton 1985; Jensen 1998). We have added population in 1672, which could capture historical agricultural productivity, but also serves to control for potential convergence. The latter part of the nineteenth century was a period of railway expansion, and to counter the possibility that we are capturing this, we control for the roll-out of railways.Footnote 38

Columns (1), (3), and (5) in Table 2 present our baseline results, and in columns (2), (4), and (6), we show the effect of adding control variables. We notice that none of the agricultural productivity variables are significant in the population equation. Moreover, the coefficient on clover is significant at the 5 percent level in both the OLS and IV estimations. The same is true for the alfalfa measure in the reduced form. The estimated effect and significance increase when we add these measures suggesting that controlling for more variables, perhaps surprisingly, adds more precision. We therefore conclude that the alfalfa instrument does not capture general productivity increases on soils with high caloric yields, soils that are suitable for growing cereals or soils that were used for growing grass—the main competitor to clover. In Sects. 5.3 and 5.5, using flexible specifications and cross-section evidence, we further investigate this. Adding the suitability measures also rules out that we are simply capturing the fact that towns were established in locations where agricultural productivity for existing crops was naturally high.

We have also estimated a version of the model in which we control flexibly for all the variables, and we find that the result is almost identical; see column (1) of Table 3.

Table 3 The introduction of clover and its effect on market town population: flexible control variables, gradual adoption and varying buffer sizes

As an alternative way of testing the relation between clover adoption and market town populations, we exploit Kjærgaard’s maps and construct the adoption measure for each of the years 1775, 1785, 1795, and 1805 and run a new model based on the OLS estimator allowing for possible effects of pre-trends in the differences-in-differences setup. Time dummies are interacted with the clover adoption measure for the closest year for which we have an observable market town population. The results are shown in columns (2) and (3) of Table 3. We observe that the coefficients on the 1775, 1785, and 1795 clover measure interacted with the dummy for the corresponding years are insignificant. For years from 1834 onwards (i.e., after 1805), the coefficient is larger than for other years and significant at the 10 percent level. We notice that the t-value increases from 0.20 to 1.74 between 1769 and 1834. If we add the grass field variable, we get a similar result, but the significance increases; see column (3). The fact that coefficients in earlier periods are insignificant suggests an absence of discernible pre-trends prior to the breakthrough of clover. Yet, it should be noted that the coefficient for 1769 is large and that the coefficients then decrease in subsequent years. This could indicate, if anything, a negative pre-existing trend, which in general would work against us.

The rest of Table 3 investigates whether our definition of local markets matters for the results focusing on alfalfa. Columns (4) and (5) show that calculating shares of land suitable for alfalfa for smaller radiuses of 5 or 10 km tends to reduce the size of the effect marginally, although the precision is also reduced. With 5 km buffers, the estimate is not significant at conventional levels. Still, we notice that 5 km is very conservative and that the significance is not far from achieving the 10 percent level. Column (7) demonstrates that effects get larger with a 20 km buffer, though the precision is similar to the benchmark of 15 km. We conclude that results are not driven by the choice of the local market size. We show in Table A10 in Appendix B that the same holds true for the instrumental variables estimates.

5.3 Flexible estimations and alternative explanations

In this section, we present the results from flexible models as specified by Eq. (2). We also use the flexible models to investigate whether our results simply capture some general development associated with clay soils or initial population density, which would affect the interpretation as well as the plausibility of the exclusion restriction for alfalfa suitability. To further investigate the plausibility of the exclusion restriction, we use the technique developed by Conley et al. (2012).

Figure 5 shows the point estimates (along with 90 percent confidence intervals) from estimating flexible models for market town population with the alfalfa measure interacted with time dummies of the observable years, respectively. To supplement the figure, Table 4 reports the results in the first row, which shows point estimates as well as t-statistics.

Fig. 5

Source: FAO raster data and own calculations

The flexible estimations for alfalfa. Notes: The figure shows years on the first axis and the flexibly estimated coefficients on alfalfa on the second axis. The model controls for fixed effects for town and year as well as for the potato flexibly. Buffer size is 15 km. The vertical line indicates 1801, which is the last year of our pre-period.

Table 4 Flexibly estimated effects on market town populations of alfalfa, clover, clay and initial market population in 1672

When we use the alfalfa measure, which is plausibly exogenous since the historical record shows that alfalfa had its breakthrough in Denmark long after the period studied, we observe systematic significance from 1834 onwards. This is largely in line with the historical narrative on the breakthrough of clover with alfalfa and clover being grown on similar exogenously given clay soils. The coefficient on the alfalfa measure increases during the period 1787–1901 (with a small dip in 1880).Footnote 39 Relating to the potential biases of the differences-in-differences estimator presented in Sect. 4.2, we do not observe any discernible pre-trends prior to 1834, as the coefficients are insignificant and the pattern is a decrease in the coefficient from 1769 to 1787 followed by a small increase to 1801. The coefficient for 1769 is relatively large but insignificant at conventional levels. Even so, a coefficient of this size may lead us to suspect a positive pre-existing trend. If this was the case, we would expect the coefficients for 1787 and 1801 to be larger than the 1769 coefficient. In fact, they are smaller and not statistically distinguishable from the 1672 level. Given that there is a decrease in the coefficients, there may be a negative pre-existing trend. If that was indeed the case, it would work against us. To highlight that the coefficients tend to decrease from 1769 to 1801, we show in Figure A8 in Appendix B the coefficients for the pre-period and impose a linear trend for the coefficients. Moreover, in Figures A9 and A10, we control for the population levels in 1672 and 1769, which makes the imposed trend for the pre-period flatter. As shown in row 1 of Table 4, urban populations are significantly larger from 1834 on soils with higher alfalfa suitability. We also note that while the coefficients of flexible alfalfa from the 1880s onwards are like those in other years in Table 4, they are insignificant. This could be because these years are less comparable with 1801 and before. Yet, we note that all three estimates are close to being significant at the 10 percent level.

In contrast, when we use clover in row 2 of Table 4, the coefficients for 1769, 1787, and 1801 all exhibit significance, for which reason we suspect endogeneity bias. The early effects of clover can plausibly be related to reverse causality, as the clover measure used represents the 1805 clover distribution. This may suggest that the areas adopting clover had higher urban population pressure prior to adoption and that pre-trends are important. The effects in the final years are more difficult to explain, but they might suggest that population pressure took off in the adopting areas. The pattern of coefficients for clover is consistent with the smaller effects we find in Table 1 and the direction of the bias discussed in Sect. 4.2. The relative change between the pre- and post-period is modest, suggesting that a differences-in-differences estimation would yield small effects. Still, as clover adoption is endogenous, the alfalfa measure is the one that is more likely to capture the causal effect.

Yet, there may be alternative explanations of the pattern observed for alfalfa. These will be investigated next.

Clay soils and initial productivity

While our use of alfalfa suitability measures as sources of exogenous variation, with controls for historical barley suitability, goes some way towards establishing that the results represent a causal effect, concerns may remain. For example, we noted above that alfalfa and clover both tend to grow better on clay soils. Thus, a concern would be that we capture the effect of other variables that make clay soils more productive.

Mokyr (1990: p. 59) notes that new plows with a curved mouldboard were introduced in the Netherlands in the seventeenth century. The new plows had iron mouldboards, as wood is difficult to shape into the desired form. They also tended not to have wheels, and fewer draft animals were needed. Further, the Rotherham plow or swing plow with an iron mouldboard was patented in England in 1730. It was first introduced in Denmark in 1770, though its general diffusion was slow. According to Falbe-Hansen (1887: pp. 137–138), the swing plow replaced the old wheel plow with a wooden mouldboard gradually over time but also led to improvements of the old wheel plow. He notes that in 1820, there were 10 wheeled plows for every swing plow. Kjærgaard (1991: p. 111) argues that the diffusion was faster and that plows with iron mouldboards were used on no less than two-thirds of Danish farms at the end of the eighteenth century.

As noted by Christensen (1996: p. 640), contemporaries stressed that the functions of the plow were to turn the soil—which incorporates, for example, animal manure into the soil—and to cut the roots of weeds. Both the older heavy plows and the swing plows could carry out these functions. They were arguably more important on clay soils, where turning the soil was important for effective weed control because these soils offer more resistance than lighter soils (Andersen et al. 2016). This suggests that the clay soils may have become more productive because of new plows as well as the cultivation of clover. We address this concern by running a regression for market town population on the alfalfa variable and the share of clay soil interacted with time dummies reported in Fig. 6 and Table 4. If our reduced form (non-flexible) results for alfalfa are simply capturing that the clay soils become more productive for a variety of reasons, we should expect to obtain a similar pattern as for alfalfa. In fact, the patterns of the coefficients of clay soils and alfalfa soils in Fig. 6 are markedly different.

Fig. 6

Sources: Alfalfa soil and clay soil are obtained from FAO raster data and town population in 1672 is obtained from “den digitale byport” (the digital town gate) and own calculations

Clay soils, alfalfa suitability and initial population. Notes: The figure shows years on the first axis and the flexible estimates on the second axis. The lines show the coefficients on the alfalfa, clay and 1672 population variables estimated for each year available. The model controls for fixed effects for town and year as well as the potato flexibly. Buffer size is 15 km for measures of alfalfa, clover, barley, and clay. The vertical line indicates 1801, which is the last year of our pre-period.

The results in Table 4 using the share of clay soil suggest that it is unlikely that the results are driven by a general productivity effect of new plows on clay soils. While the coefficients for clay soils are positive and rising in some years until 1860, they actually decrease dramatically from this period onward. Moreover, the t-statistics reported in Table 4 are never significant at any conventional level. This suggests that, if anything, it was the clay soils that were suitable for legumes that got more productive. This is also reflected in Fig. 6, which plots the estimated coefficients for clay soils and alfalfa (interacted with time dummies). Andersen et al. (2016) provide evidence that the older heavy plow was important for urbanization in clay soil areas in Denmark and Europe in general. Our results suggest that the new plows of the eighteenth century were perhaps not as big improvements as believed by some contemporaries. In fact, plowing contests carried out from 1770 to 1820 also suggest that heavy plows could still compete with swing plows. For example, this was the case in the 1803 tests carried out by the Danish veterinary school (Christensen 1996: p. 632). We have also added a control for barley suitability in the instrumental variables models above [see column (6) in Table 2]. As pointed out above, barley is a plow positive crop and for this reason, and since the clay soils do not exhibit any significant relationship with population, we do not believe that our results are explained by the adoption of new plows.

Finally, as an alternative test of general productivity increases, we consider the initial population in 1672, which would capture historical agricultural productivity. Yet, we see that when we estimate the impact of the measure flexibly interacting the population in 1672 with time dummies, we fail to find a pattern like the one for alfalfa in the first row of Table 4 and as shown in Fig. 6.

Violation of exclusion restriction

Figure A11 in Appendix B presents estimates of the effect of clover adoption on town populations using the modified Instrumental-Variable approach of Conley et al. (2012). We apply the union of confidence interval (UCI) approach to evaluate the direct effect of alfalfa suitability, which we denote by γ, on town populations imposing that the support of the direct effect, γ, is [0, δ], with δ > 0. Thus, we impose the restriction that the direct effect is positive, which is needed to explain the estimated effect of clover adoption. We apply the UCI approach since Conley et al. (2012: p. 260) state that “the interval estimates across different γ provide a conservative (in terms of coverage) interval estimate for β”. For our case the coefficient of interest is \( \beta^{clover} \).

Figure A11 shows that our estimate of \( \beta^{clover} \) becomes insignificant at the 10 percent level when the direct effect equals 0.128. Thus, the violation of the exclusion restriction (the direct effect of alfalfa suitability on town population) needs to be about 62 percent (0.128/0.205) of the overall reduced form effect to render our IV results insignificant [see column (4) in Table 1]. Given that the direct effect of alfalfa suitability should come from either alfalfa adoption or other soil types (which themselves have little explanatory power), we believe that it is possible, but not plausible, that our results are produced by a violation of the exclusion restriction.
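The logic of the UCI approach can be sketched in a few lines. The example below is only an illustration on simulated data, not the code behind Figure A11: the function names, the simulated coefficients, and the omission of controls and fixed effects are all assumptions made for the sketch. For each candidate direct effect γ0 on a grid over [0, δ], the instrument's assumed direct effect is subtracted from the outcome, the just-identified IV model is re-estimated, and the resulting 90 percent intervals are combined into their union.

```python
import numpy as np


def iv_estimate(y, x, z):
    """Just-identified IV of y on x (plus intercept), instrumenting x by z.

    Returns the slope on x and its homoskedastic standard error.
    """
    n = y.shape[0]
    X = np.column_stack([np.ones(n), x])
    Z = np.column_stack([np.ones(n), z])
    ZX_inv = np.linalg.inv(Z.T @ X)
    coef = ZX_inv @ (Z.T @ y)                      # (Z'X)^{-1} Z'y
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - 2)
    cov = sigma2 * ZX_inv @ (Z.T @ Z) @ ZX_inv.T
    return coef[1], np.sqrt(cov[1, 1])


def union_of_confidence_intervals(y, x, z, delta, grid_size=51, crit=1.645):
    """Union of two-sided intervals over assumed direct effects in [0, delta].

    crit=1.645 gives 90 percent intervals (10 percent significance level).
    """
    lowers, uppers = [], []
    for gamma0 in np.linspace(0.0, delta, grid_size):
        # Remove the assumed direct effect of the instrument, then re-estimate.
        beta_hat, se = iv_estimate(y - gamma0 * z, x, z)
        lowers.append(beta_hat - crit * se)
        uppers.append(beta_hat + crit * se)
    return float(min(lowers)), float(max(uppers))


def simulate_iv_data(n=5000, seed=1):
    """Hypothetical data: the true effect of x is 1 and the true direct effect is 0."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                          # stand-in for the instrument
    u = rng.normal(size=n)                          # unobserved confounder
    x = 0.8 * z + 0.5 * u + 0.5 * rng.normal(size=n)
    y = 1.0 * x + u + 0.5 * rng.normal(size=n)
    return y, x, z


y, x, z = simulate_iv_data()
for delta in (0.0, 0.5, 1.0):
    lower, upper = union_of_confidence_intervals(y, x, z, delta)
    print(f"delta={delta:.1f}: union interval = [{lower:.3f}, {upper:.3f}]")
```

Raising δ until the lower bound of the union first reaches zero gives the size of the direct effect at which the estimate stops being significant, which is the kind of threshold reported above.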

Trade

The period from around 1830 to the late 1870s is in general known as the grain sales period (Olsen 1961). Skrubbeltrang (1934–1935) provides data showing that grain exports were much larger than imports and were on the rise at least from the 1830s. Thus, Denmark was a net exporter of grains. Hansen (1984: p. 104) shows that about 60 percent of grain exports went to Norway and England from the 1820s. In the 1840s, 40 percent went to England. Given this, one might suspect that trade is a driving force in our results. Still, while grain exports were on the rise, this is also true for our pre-period: Hansen (1984: pp. 68–69) shows that both grain prices and sales were increasing until 1807. The coefficients on alfalfa are lower in both 1787 and 1801 than in 1769, so it seems unlikely that trade explains our results.

From the 1880s, exports were shifting towards animal products (Jensen et al. 2018), and if our results are driven by increased grain yields due to nitrogen fixation, this may explain the reduced significance as of 1880 in row 1 of Table 4. The loss of privilege of the market towns in 1857 is also a likely factor, though it is puzzling that it would work with a 20-year lag. Another possibility is the 1864 move towards free trade (Henriksen et al. 2012), which may have affected grain producers adversely, but again the effect comes with a long time lag. The year 1864 also marks the loss of Schleswig–Holstein and the loss of Hamburg as the main port for shipping to England.

Henriksen et al. (2012) maintain that the shift to dairying happened gradually, yet its dominance in agricultural production was established by the end of our study period. Moreover, we find below that clover adoption had already affected the local number of cattle by the late 1830s, when we do observe effects.

To test more directly for the impact of trade as a confounder, we have included railway access in Table 2 and find that this variable had the expected impact, yet it does not change our result. A few towns also built new ports in the period, and when we control for these, our main result remains unchanged; see Table A12 in Appendix B.

5.4 Additional robustness checks

While our results suggest a causal impact of clover on local town development, a number of questions remain. For example, we investigate the extent to which the relationship is driven by larger cities. We have already noted that dropping Copenhagen does not change the results, but to investigate this further, we exclude the three largest cities in modern times in Table A13. While precision is reduced, the estimated coefficients are similar.Footnote 40 Finally, we look at what happens when we only use the cities in Bairoch et al. (1988) and find that the coefficients are much larger than for our baseline. This could indicate that effects depend on population levels, but could also point to a selection effect.Footnote 41 Since we have shown in Table 2 that the control for initial population is not significant, we do not believe the results are mainly driven by a few large cities.

We next consider the degree to which the effect persists into the twentieth century, and in Table A14 in Appendix B we extend the sample to 1925. While the effect is less precisely estimated, the coefficient is still positive and significant at the 10 percent level. Even so, we note that flexible estimates suggest that the effect is insignificant for the years in the 1900s; see Figure A12 in Appendix B. Also, when we look only at the period after 1901, the coefficient is positive but far from significant. This suggests that the effect was slowly dying out in the first quarter of the twentieth century.

We have further checked that the results do not depend on how the variables are coded. For example, we have examined whether we can estimate the models without log transforming clover; see Table A15. We have also tried replacing clover adoption by dummy variables, as reported in Table A16. First, we have used a pure dummy measure of having any medium suitable soil for growing alfalfa. When we do so, we obtain similar results, but the significance is reduced to the 10 percent level. When we consider areas that have a share of at least 25 percent suitable soil in the buffer, we find stronger results and the significance increases. Even so, we prefer to use the share measure, which is continuous between 0 and 1 and is what is normally used in the literature.

In Table A17, we replace the caloric yield measure by the measure of overall suitability from Zabel et al. (2014) and find our main conclusion to be unchanged.Footnote 42

We have also implemented a dynamic panel model of the type:

$$ \ln \text{pop}_{it} = \left( 1 + \alpha \right) \ln \text{pop}_{it - 30} + \beta^{clover} \ln\left( 1 + clover_{i,1805} \right) I_{t}^{post} + \sum_{R} \lambda_{R} I_{i}^{R} + \sum_{j = 1801}^{1890} p_{j} I_{t}^{j} + \mathbf{x}^{\prime} \boldsymbol{\gamma} + \varepsilon_{it} . $$
(3)

By using this model, we can control for potential convergence in an alternative way to using initial population times year dummies. Yet, to be able to interpret the lags properly, we need to restrict our attention to observations for which we have relatively similar distances between observations. This leads us to focus on a model with (approximate) 30-year lags and we use data for 1769, 1801, 1834, 1860, and 1890. We also note that this is a dynamic panel model, which is difficult to estimate consistently.
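To see why Eq. (3) speaks to convergence, it may help to write it in growth form. The rearrangement below is added purely as an illustration and is not an additional estimating equation; it follows from subtracting the lagged log population from both sides of Eq. (3):

$$ \ln \text{pop}_{it} - \ln \text{pop}_{it - 30} = \alpha \ln \text{pop}_{it - 30} + \beta^{clover} \ln\left( 1 + clover_{i,1805} \right) I_{t}^{post} + \sum_{R} \lambda_{R} I_{i}^{R} + \sum_{j = 1801}^{1890} p_{j} I_{t}^{j} + \mathbf{x}^{\prime} \boldsymbol{\gamma} + \varepsilon_{it} . $$

The left-hand side is the (approximate) 30-year growth rate of the town population. A negative α, that is, a coefficient on the lagged dependent variable below one, therefore means that towns that were larger 30 years earlier grow more slowly, other things equal. Under the additional assumption that −1 < α < 0 and that all other terms are held fixed, a permanent increase in the clover term raises growth only temporarily, while the implied long-run effect on the log population level is \( \beta^{clover} \ln\left( 1 + clover_{i,1805} \right)/( - \alpha ) \).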

In Table A18, we present three different models estimated by OLS and IV, respectively. The first three columns are estimated by OLS: without a lagged dependent variable in column (1), with a lagged dependent variable but without town fixed effects in column (2), and with a lagged dependent variable and town fixed effects in column (3). The final three columns show the corresponding IV estimates. As in our other results, the IV estimates are all positive and significant. The full model in column (6) suggests that there is convergence in town populations. Yet, there is a positive effect of clover. It is well known that Nickell bias may be an issue with this type of model. Fortunately, Angrist and Pischke (2009) show that the models in columns (4) and (5) bound the effect of clover. As both are positive and significant, we conclude that the results are not driven by convergence in town populations. Whether the effects of clover are on levels or growth is unclear from these estimations, but column (6) suggests that there are temporary effects on growth given the evidence of conditional convergence.

5.5 Channel regressions

In this section, we probe into the channels through which clover increased agricultural productivity and long run development and provide suggestive evidence on mediating variables. We carry out cross-section regressions in which we investigate whether clover adoption in 1805 predicts (1) cow density; (2) crop yields of barley, rye, and wheat (calculated as harvest per seed); (3) crop yields for potato; (4) cooperative creamery density; (5) human capital. The logic is to test whether clover adoption in 1805 affected these outcomes. In practice, we estimate the following model using the instrumental variables estimator:

$$ y_{i} = \alpha + \beta \, clover_{i,1805} + \mathbf{X}_{i}^{\prime} \boldsymbol{\gamma} + \varepsilon_{i} . $$
(4)

The outcomes denoted by \( y_{i} \) are number of cows per square kilometer from the agricultural census of 1838; rye, barley, wheat and potato yields also from the agricultural census of 1838; and cooperative creameries per square kilometer from Jensen et al. (2018). We introduce human capital measures below. \( \beta \) captures the effect of clover. The control variables are denoted by the (column) vector \( {\mathbf{X}}_{i} \). They include the historical measure of suitability for growing barley as well as suitability for growing potatoes, both calculated as shares. \( \varepsilon_{i} \) is the usual error term. To take endogeneity of clover adoption into account, we instrument by alfalfa suitability. The data are cross-sectional for the historical subnational unit known as herred, as mentioned above and shown in Figure A13 in Appendix B. Since the data are cross-sectional, we cannot include fixed effects for each herred, so to take out the level effect of historical suitability we add our historical measure of barley suitability introduced above (see also Appendix A) as well as the measure for potato suitability. The idea here is that we do not want to confound the effects of clover with persistent suitability for growing cereals or potatoes. By including these variables, we add credibility to the exclusion restriction.
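As a rough illustration of how a just-identified instrumental variables model of the form of Eq. (4) can be estimated, the sketch below uses simulated data rather than the herred-level data. The function names, the variable names, and every number in the simulation are assumptions made for the example; the standard errors are the simple homoskedastic ones and may differ from those used in Table 5. With one instrument and one endogenous regressor, two-stage least squares coincides with the simple IV estimator used here.

```python
import numpy as np


def two_stage_least_squares(y, endog, instrument, controls):
    """Just-identified IV estimate with homoskedastic standard errors.

    Coefficients are ordered as [constant, endogenous regressor, controls...].
    """
    n = y.shape[0]
    const = np.ones(n)
    X = np.column_stack([const, endog, controls])        # structural regressors
    Z = np.column_stack([const, instrument, controls])   # instrument replaces endog
    ZX_inv = np.linalg.inv(Z.T @ X)
    coef = ZX_inv @ (Z.T @ y)                             # (Z'X)^{-1} Z'y
    resid = y - X @ coef
    sigma2 = resid @ resid / (n - X.shape[1])
    cov = sigma2 * ZX_inv @ (Z.T @ Z) @ ZX_inv.T
    return coef, np.sqrt(np.diag(cov))


def ols(y, regressors):
    """OLS coefficients with an intercept, for comparison with the IV estimate."""
    X = np.column_stack([np.ones(y.shape[0]), regressors])
    return np.linalg.lstsq(X, y, rcond=None)[0]


def simulate(n=5000, beta=2.0, seed=0):
    """Hypothetical cross-section in which clover adoption is endogenous."""
    rng = np.random.default_rng(seed)
    alfalfa = rng.normal(size=n)       # stand-in for alfalfa suitability
    barley = rng.normal(size=n)        # stand-in for barley suitability
    potato = rng.normal(size=n)        # stand-in for potato suitability
    unobserved = rng.normal(size=n)    # drives both adoption and the outcome
    clover = (0.8 * alfalfa + 0.3 * barley + 0.5 * unobserved
              + 0.5 * rng.normal(size=n))
    outcome = (1.0 + beta * clover + 0.5 * barley - 0.3 * potato
               + unobserved + 0.5 * rng.normal(size=n))
    controls = np.column_stack([barley, potato])
    return outcome, clover, alfalfa, controls


y, clover, alfalfa, controls = simulate()
iv_coef, iv_se = two_stage_least_squares(y, clover, alfalfa, controls)
ols_coef = ols(y, np.column_stack([clover, controls]))
print(f"OLS: {ols_coef[1]:.3f}  IV: {iv_coef[1]:.3f} (se {iv_se[1]:.3f})")
```

In this simulated setting the unobserved factor raises both adoption and the outcome, so OLS overstates the effect while the IV estimate stays close to the value used to generate the data. The direction of the OLS bias in the actual data need not be the same.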

As mentioned above, one of the possible channels concerns yields of barley, rye, and wheat and relates to the nitrogen hypothesis of Chorley (1981) and Allen (2008). Therefore, we test whether cereal yields increased because of clover, which would corroborate the nitrogen hypothesis. The second mechanism is that clover, by serving as fodder, contributed to a greater number of cows (per square kilometer). This in turn would lead to increased production of milk, butter, and meat. These are the underlying channels through which, in “the standard model” of Allen (2009), agricultural productivity drives the urban populations analyzed previously.

Column (1) in Table 5 shows the first stage results. They indicate that alfalfa suitability and clover adoption are strongly related. Column (2) of Table 5 provides evidence that clover led to more cows, as a herred with higher shares of the area with clover had a significantly higher number of cows per square kilometer. In columns (3)–(5), we investigate the effect on yields for rye, barley, and wheat.Footnote 43 The instrumental variables estimates are all large and significant at least at the 10 percent level (for results based on estimation by OLS, see Table A19 in Appendix B). Thus, the data are consistent with the nitrogen hypothesis.

Table 5 The effect of clover adoption on agricultural productivity and human capital

We also note that increasing the number of cows locally may have been important for the take-off of the cooperative creameries, which arose at the end of the nineteenth century; see also Henriksen (1999) and Jensen et al. (2018). The cooperative creameries took advantage of industrial technologies in the late nineteenth century and are often regarded as being pivotal for the Danish economic take-off. In this way, clover contributed to the take-off of the Danish economy. In fact, when we run an instrumental variables model with cooperative creameries per square kilometer in 1890 as the outcome, clover is positively and significantly related to this variable; see column (6) in Table 5.

As discussed in Carillo (2018), it is possible that agricultural productivity also affects human capital accumulation. Education could influence farmers’ ability to understand and evaluate new inputs, and it is likely that the returns to education increase when the technological environment is changing.

Since standard measures of human capital, such as literacy and enrollment rates, do not vary by region in the available data, we test the impact of the adoption of clover on human capital formation as captured by the number of folk high schools per square kilometer in 1905.Footnote 44 Folk high schools were targeted at young adults (aged 16–25) from the countryside and would typically teach hygiene in the production of milk, cultivation of plants and more general knowledge about democracy and how to participate in society (Jensen et al. 2018). Information on the schools in existence in 1905 is available in Statistics Denmark (1907).Footnote 45

In column (7) of Table 5, we show that clover adoption influenced folk high school density. The coefficient on the clover share is positive and statistically significant at the 1 percent level. The coefficient on clover in the equation for agricultural school density is positive, but small and insignificant (not reported). The insignificant result could suggest that clover did not stimulate the very specific education at the agricultural schools, but it should be kept in mind that there were only 14 schools of this type, which may reduce statistical power. Still, the results support the view that clover adoption had an impact on human capital via the folk high schools.

We do not find evidence in favor of Kjærgaard’s contention that clover was important for potato yields (not reported). In fact, visual inspection of the data suggests that the areas that could grow alfalfa well were less suited for growing potatoes. We have further tested whether a larger clover area is consistent with better opportunity for the honey bee as proposed by Kjærgaard (1995). We find that, in fact, there is such an effect. This leads us to conclude that clover helped turn Denmark into a “land of milk and honey” as suggested by P.E. Lüders.Footnote 46

To further investigate the importance of the variables in Table 5 in explaining the result for urban populations, we have treated these variables as mediating variables. We believe that doing so could be suggestive of the importance of the different channels.

For all the variables, we have matched the market towns to the herred in which they are located. We then interact the variables from the 1838 census with a dummy which is one after 1834 and zero otherwise.Footnote 47 The cooperative creameries were not in existence until after 1880, so we exploit that they could not influence urban populations before 1890 and use an interaction with a dummy which is one from 1890. For folk high schools, we use a density measure that exploits the timing of the establishment of the schools.

The results are shown in Table 6. Regarding cow density and cooperative creameries, we find that these exert a zero or negative direct influence; see columns 1 and 5. The estimated effect of clover becomes larger, but it should be noted that confidence intervals strongly overlap for estimates with and without these mediating variables. For grains a similar picture emerges, though in this case, the mediating variables are significant in some cases; see columns 2 to 4. Yet, we note that for barley yields, which are available for the full sample, the coefficient is insignificant. The coefficient on the human capital measure is positive and significant, and including it reduces the estimated effect of clover and lowers its significance to the 10 percent level; see column 6. This evidence suggests that the effect of clover on urban populations was mediated by its influence on human capital. Yet, it should be kept in mind that one challenge to the interpretation of the results in Table 6 is that the mediating variables could themselves be endogenous to urban population.

Table 6 The introduction of clover and its effect on market town population: mediating variables

It is possible that the expansion of clover was accompanied by an expansion of traditional nitrogen fixing crops such as peas. To examine whether our results can in part be explained by this, we test whether alfalfa suitability is associated with higher yields of peas using the data from the Danish agricultural census of 1838. This would reveal whether alfalfa suitability simply captures the suitability for growing peas. In Table 7, we provide evidence on this using data at the herred level. In column (1), we find a positive, yet insignificant, correlation. In columns (2) and (3) we control for the suitability of potatoes and barley, and the coefficient becomes negative. When we run an instrumental variables regression like those in Table 5, we find that the coefficient is negative and insignificant for clover. All these results suggest that our identifying variation is unrelated to yields and suitability of peas, and instead captures suitability for alfalfa and clover.

Table 7 Pea yields and alfalfa suitability

6 Conclusion

In this paper, we have evaluated the impact of clover on Danish market town populations. We present evidence suggesting that clover mattered for development and that exogenous changes to agricultural productivity may contribute to long-run development as captured by urban population and urbanization rates. This effect is present despite the movement of the economy to a more liberal trade regime, and is unlikely to be driven by institutional heterogeneity, the expansion of railways, general productivity in agriculture, or other potential confounders.

We also investigated potential channels. Our results suggest that clover did affect both agricultural productivity and human capital accumulation, and also indicate that the effect of the adoption of clover on urban populations was mediated by its impact on human capital formation.

One may reflect upon whether the results translate to other contexts. As noted above, clover has also been deemed important for the case of England as suggested by Allen’s estimates and the ones by Chorley (1981) for northwest Europe. This indicates some generality of our results, although the English system of crop rotation was slightly different from the one used in Denmark. Moreover, alfalfa may have played some role in the German case as suggested by the Prussian Economic History Database. Nonetheless, the results indicate the importance of legumes, and it seems plausible that soil conditions and knowledge about growing methods determined which legumes were adopted.